Our latest contribution to the 10th IEEE International Conference on Cloud Computing (CLOUD 2017) is available online and can be accessed here.
The prosperity of cloud computing offers common infrastructures to a wide range of applications. Understanding these applications’ workload behaviors is a prerequisite for designing, managing, and optimizing cloud systems. Given the heterogeneity and diversity of cloud workloads, cloud benchmarks must, for the sake of fair evaluation, be able to accurately replicate workload behaviors in cloud systems, including both the usage of cloud resources and the microarchitectural behaviors beyond the virtualization layer. Furthermore, workloads spanning long durations are usually required to achieve representative evaluations. Hence, the more challenging issue is to significantly reduce the evaluation duration while still preserving workload characteristics.
This paper presents our efforts towards generating cloud workloads of diverse behaviors and reducible durations. Our benchmark tool, CloudMix, employs a repository of reducible workload blocks (RWBs) as a high-level abstraction of workload behaviors, covering the usage of the two most important cloud resources (CPU and memory) and the microarchitectural operations paired with them. CloudMix further introduces an efficient methodology for combining RWBs to synthesize and replicate the diverse cloud workloads found in real-world traces. The effectiveness of CloudMix is demonstrated by generating a variety of reducible workloads according to a Google cluster trace and by applying these workloads to job-scheduling optimization on Hadoop YARN. The evaluation results show that: (i) when workload durations are reduced by a factor of 100, the replication errors of workload behaviors are smaller than 2.08%; (ii) when fast evaluations (workload durations reduced by 10 to 100 times) are used to recommend the optimal setting for YARN job scheduling, the performance degradation of the recommended setting is just 0.69% compared to the actual optimal setting.
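To make the RWB idea more concrete, here is a minimal, hypothetical sketch of what a reducible workload block and a trace-driven synthesis step might look like in Python. The class and function names are illustrative assumptions of mine, not CloudMix’s actual API.

```python
# Hypothetical sketch of a reducible workload block (RWB); names are
# illustrative, not CloudMix's real interfaces.
from dataclasses import dataclass

@dataclass(frozen=True)
class RWB:
    cpu_usage: float   # normalized CPU utilization of the block
    mem_usage: float   # normalized memory utilization of the block
    duration_s: float  # execution time of the block in seconds

    def reduce(self, factor: float) -> "RWB":
        """Shrink the block's duration while keeping its CPU/memory
        behavior, which is what makes the block 'reducible'."""
        return RWB(self.cpu_usage, self.mem_usage, self.duration_s / factor)

def synthesize(trace, repository, factor=1.0):
    """For each (cpu, mem) sample in a trace, pick the repository block
    that best matches it, optionally time-reduced by `factor`."""
    return [min(repository,
                key=lambda b: abs(b.cpu_usage - cpu) + abs(b.mem_usage - mem)
               ).reduce(factor)
            for cpu, mem in trace]
```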
Concurrency and Computation: Practice and Experience has just published our latest work on parallelisation strategies in the context of the Cherenkov Telescope Array project. This is a result of an ongoing collaboration with CIEMAT (Spain) and INAF (Italy) and it can be accessed here.
In this work, a signal-extraction algorithm belonging to the Cherenkov Telescope Array’s real-time-analysis pipeline has been parallelised using SSE, POSIX Threads and CUDA. Because of the observatory’s constraints, the online analysis has to be conducted on site, on hardware located at the telescopes, which compels a search for efficient computing solutions to handle the huge amount of measured data. This work is part of a series of studies that benchmark several algorithms of the real-time-analysis pipeline on different architectures to gain insight into the suitability and performance of each platform.
The Computer Physics Communications journal has just made available online our latest work on SaaS+PaaS architectures for service-driven computing. This is again the result of our collaboration with the Institute of Computing Technology from the Chinese Academy of Sciences and it can be accessed here.
Markov Chain Monte Carlo (MCMC) methods are widely used in the simulation and modelling of materials, producing applications that require great amounts of computational resources. Cloud computing represents a seamless source of these resources in the form of HPC. However, resource over-consumption can be an important drawback, especially if the cloud provisioning process is not appropriately optimized. In the present contribution we propose a two-level solution that, on the one hand, takes advantage of approximate computing to reduce the resource demand and, on the other, uses admission-control policies to guarantee an optimal provision to running applications.
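As a rough illustration of how approximate computing can cut the resource demand of MCMC, the toy sketch below stops a Metropolis sampler once its running estimate stabilizes within a tolerance. The target, stopping rule, and parameters are my own simplified example, not the paper’s implementation.

```python
# Toy Metropolis sampler with an approximate-computing cutoff: stop
# early once the running mean stabilizes, trading accuracy for compute.
import math
import random

def metropolis_mean(log_p, x0=0.0, step=1.0, tol=1e-4, max_iters=100_000):
    x, total, prev_mean = x0, 0.0, 0.0
    for i in range(1, max_iters + 1):
        cand = x + random.gauss(0.0, step)
        # Standard Metropolis acceptance test in log space.
        if math.log(random.random() or 1e-300) < log_p(cand) - log_p(x):
            x = cand
        total += x
        mean = total / i
        # Approximate-computing cutoff: accept the estimate early once
        # successive running means differ by less than `tol`.
        if i > 1_000 and abs(mean - prev_mean) < tol:
            return mean, i
        prev_mean = mean
    return prev_mean, max_iters

# Example: unnormalized standard normal target, so E[x] ~ 0.
estimate, iterations = metropolis_mean(lambda x: -0.5 * x * x)
```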
The first results of an ongoing collaboration with the Forest Genetics and Ecophysiology Research Group from the Technical University of Madrid have just been published online by the Tree Genetics & Genomes journal. The paper can be accessed here.
Direct sequencing of RNA (RNA-seq) using next-generation sequencing platforms has enabled a growing number of gene expression studies focused on forest trees over the last 5 years. Bioinformatic analyses derived from RNA-seq of forest trees are particularly challenging, because the massive genome length (~20.1 Gbp for loblolly pine) and the absence of annotated reference genomes require specific bioinformatic pipelines to obtain sound biological results. In the present manuscript, we review common bioinformatic challenges that researchers need to consider when analyzing RNA-seq data from forest tree species, in the light of the experience acquired from recent studies. Furthermore, we list bioinformatic pipelines and data-processing software available to overcome RNA-seq limitations. Finally, we discuss the impact of novel computational solutions, such as the cloud computing paradigm, which makes RNA-seq analysis feasible even for small research centers with limited resources.
The International Journal of Parallel Programming has just made available online our latest work on approximate request processing in cloud online services. This is the result of our collaboration with the Institute of Computing Technology from the Chinese Academy of Sciences and it can be accessed here.
Despite the importance of providing quick responsiveness to user requests in online services, such request processing is very resource-expensive when dealing with large-scale service datasets, and the resulting resource demands often exceed the service providers’ budget when services are deployed on a cloud, where resources are charged in monetary terms. Producing approximate results during request processing is a feasible solution to this problem, trading result correctness (e.g. prediction or query accuracy) for response-time reduction. However, existing techniques in this area either use parts of the dataset or skip expensive computations to produce approximate results, thus incurring large losses in result correctness on a tight resource budget. In this paper, we propose Synopsis-based Approximate Request Processing (SARP), a framework that produces approximate results with small correctness losses even when using small amounts of resources. To achieve this, SARP conducts computations over synopses, which aggregate the statistical information of the entire service dataset at different approximation levels, based on two key ideas: (i) offline synopsis management, which generates and maintains a set of synopses that represent the aggregated information of the dataset at different approximation levels; (ii) online synopsis selection, which considers both the current resource allocation and the workload status so as to select the synopsis with the maximal length that can be processed within the required response time. We demonstrate the effectiveness of our approach by testing the recommendation services of e-commerce sites using a large, real-world dataset. Using prediction accuracy as the result-correctness metric, the results demonstrate that: (i) SARP achieves significant response-time reductions with very small correctness losses compared to exact processing; (ii) for the same processing time, SARP shows a considerable reduction in correctness loss compared to existing approximation techniques.
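The sketch below illustrates the two ideas in miniature, assuming uniform sampling as the synopsis type and a simple linear cost model for the online check; the function names and parameters are assumptions of mine, not SARP’s real interface.

```python
# Minimal sketch of synopsis-based approximation: build summaries of
# the dataset offline at several approximation levels, then pick the
# largest one that fits the response-time budget online.
import random

def build_synopses(dataset, levels=(0.01, 0.05, 0.1, 0.5, 1.0)):
    """Offline synopsis management: one uniform sample per level."""
    return {lvl: random.sample(dataset, max(1, int(len(dataset) * lvl)))
            for lvl in levels}

def select_synopsis(synopses, time_budget_s, secs_per_item):
    """Online selection: the longest synopsis that can be processed
    within the budget, given the current per-item processing cost."""
    feasible = [s for s in synopses.values()
                if len(s) * secs_per_item <= time_budget_s]
    return (max(feasible, key=len) if feasible
            else min(synopses.values(), key=len))  # degrade gracefully
```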
Springer has finally made available our latest paper in collaboration with the Spanish State Meteorological Agency. It can be accessed here.
The Weather Research & Forecasting (WRF) Model is a high-performance computing application used by many meteorological agencies worldwide. Its execution may benefit from the cloud computing paradigm, and from public cloud infrastructures in particular, but only if the execution parameters are chosen wisely. A cost-optimal infrastructure can be instantiated for a given deadline, and a performance-optimal infrastructure can be instantiated for a given budget. With this in mind, we provide the optimal parameters for executing WRF on a public cloud infrastructure such as Amazon Web Services.
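As an illustration of this deadline/budget duality, the sketch below picks the cheapest configuration meeting a deadline and the fastest one within a budget. The configurations, prices, and runtimes are invented placeholders, not the paper’s measurements.

```python
# Illustrative deadline/budget selection over hypothetical AWS-style
# configurations; all numbers below are made up.
import math

CONFIGS = [  # (name, price_usd_per_hour, measured_runtime_hours)
    ("c4.xlarge x 4", 0.80, 6.0),
    ("c4.2xlarge x 4", 1.60, 3.4),
    ("c4.4xlarge x 4", 3.20, 2.0),
]

def cost(price, runtime):
    return price * math.ceil(runtime)  # billed per started hour

def cheapest_within_deadline(deadline_h):
    ok = [(name, cost(p, r)) for name, p, r in CONFIGS if r <= deadline_h]
    return min(ok, key=lambda c: c[1]) if ok else None

def fastest_within_budget(budget_usd):
    ok = [(name, r) for name, p, r in CONFIGS if cost(p, r) <= budget_usd]
    return min(ok, key=lambda c: c[1]) if ok else None
```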
IEEE Xplore has published the result of one of our latest collaborations with the Institute of Computing Technology from the Chinese Academy of Sciences. This particular work was presented at the 44th International Conference on Parallel Processing (ICPP 2015), which took place in Beijing (China) in September. The paper can be accessed here.
Modern latency-critical online services often rely on composing results from a large number of server components. Hence the tail latency (e.g. the 99th percentile of response time), rather than the average, of these components determines the overall service performance. When hosted in a cloud environment, the components of a service are typically co-located with short batch jobs to increase machine utilization, sharing and contending for resources such as caches and I/O bandwidth.
The highly dynamic nature of batch jobs, in terms of their workload types and input sizes, causes continuously changing performance interference to individual components, leading to latency variability and high tail latency. However, existing techniques either ignore such fine-grained component latency variability when managing service performance, or rely on executing redundant requests to reduce the tail latency, which further degrades service performance as load gets heavier.
In this paper, we propose PCS, a predictive and component-level scheduling framework to reduce tail latency for large-scale, parallel online services. It uses an analytical performance model to simultaneously predict the component latency and the overall service performance on different nodes. Based on the predicted performance, the scheduler identifies straggling components and conducts near-optimal component-node allocations to adapt to the changing performance interference from batch jobs. We demonstrate that, using realistic workloads, the proposed scheduler reduces the component tail latency by an average of 67.05% and the average overall service latency by 64.16% compared with state-of-the-art techniques for reducing tail latency.
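A toy version of that scheduling loop might look as follows, where `predict_latency` stands in for the paper’s analytical performance model and the straggler test is simplified to a fixed tail-latency target; all names are illustrative.

```python
# Sketch of predictive, component-level scheduling: predict each
# component's latency on every node, flag stragglers against a tail
# target, and reassign them to their best-predicted node.
from collections import namedtuple

Component = namedtuple("Component", ["name", "node"])

def schedule(components, nodes, predict_latency, tail_target_ms):
    placement = {}
    for comp in components:
        preds = {node: predict_latency(comp, node) for node in nodes}
        if preds[comp.node] > tail_target_ms:            # straggler
            placement[comp] = min(preds, key=preds.get)  # reallocate
        else:
            placement[comp] = comp.node                  # keep in place
    return placement
```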
Scientific Programming (Hindawi, JCR:0.559) has just announced the call for papers for a Special Issue on cloud-based simulations and data analysis in which I’m participating as Guest Editor together with Dr. Fabrizio Messina (University of Catania, Catania, Italy) and Dr. Lars Braubach (Nordakademie, Elmshorn, Germany).
In many areas, commercial as well as scientific, the generation and storage of large amounts of data have become essential. Manufacturing and engineering companies use cloud-based high-performance computing technologies and simulation techniques to model, simulate, and predict the behavior of complicated models, involving both the preliminary analysis of existing data and the generation of data during the simulations. Given such large amounts of data, the question arises: how can they be processed efficiently? Cloud computing holds the promise of providing elastic computational resources that adapt to concrete application needs, and is thus a promising base technology for such processing techniques. But cloud computing itself is not the complete solution: in order to exploit its underlying power, novel algorithms and techniques have to be conceived.
In this special issue we invite original contributions providing novel ideas towards simulation and data processing in the context of cloud computing approaches. The aim of this special issue is to assemble visions, ideas, experiences, and research achievements in these areas.
Potential topics include, but are not limited to:
- Techniques for cloud-based simulations
- Computational Intelligence for cloud-based simulations
- Service composition for cloud-based simulations
- Computational Intelligence for data analysis
- Software architectures for cloud-based simulations
- Cloud-based data mining
- Big data analytics for predictive modeling
Authors can submit their manuscripts via the Manuscript Tracking System at http://mts.hindawi.com/submit/journals/sp/csda/.
Manuscript Due: Friday, 29 April 2016
First Round of Reviews: Friday, 22 July 2016
Publication Date: Friday, 16 September 2016
I’m very happy to announce that I’m serving as Guest Editor at the Computers Journal for a Special Issue on “High Performance Computing for Big Data”. The deadline for submissions is March 31st, 2016.
Big Data is currently one of the hottest topics in computing research. This is because of:
- the numerous challenges involved, which include (but are not limited to) the capture, search, storage, sharing, transfer, representation, and privacy of the data;
- and the wide spectrum of areas covered, ranging from Bioinformatics to Space Science, each a research challenge in itself.
New technologies and algorithms have emerged from Big Data to efficiently manage and process great quantities of data within reasonable elapsed times. However, there are computing barriers that cannot be crossed without the proper resources.
The many ways in which High Performance Computing can be delivered to face Big Data challenges offer a wide spectrum of research opportunities. From FPGAs to cloud computing, technologies and algorithms can be taken to a whole different level, fostering incredible insights from massive information repositories.
The papers accepted for publication in this Special Issue cover both fundamental issues and new concepts related to the application of High Performance Computing to the Big Data area.
Since March, Future Generation Computer Systems has made available online our paper entitled “A multi-dimensional job scheduling”. This work is the result of a collaboration with the research group led by Prof. Lucio Grandinetti (University of Calabria, Italy) and it can be accessed here.
With the advent of new computing technologies, such as cloud computing and contemporary parallel processing systems, the building blocks of computing systems have become multi-dimensional. Traditional scheduling systems based on single-resource optimization, such as processors alone, fail to provide near-optimal solutions. The efficient use of new computing systems depends on the efficient use of several resource dimensions; thus, scheduling systems have to make full use of all resources. In this paper, we address the problem of multi-resource scheduling via multi-capacity bin packing. We propose the application of multi-capacity-aware resource scheduling at the host-selection and queuing layers of a scheduling system. The experimental results demonstrate performance improvements of scheduling in terms of wait-time and slowdown metrics.
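For illustration, one common way to realize multi-capacity-aware host selection is a dot-product heuristic over resource vectors, sketched below; this is a standard bin-packing technique I am using as an example, not necessarily the paper’s exact algorithm.

```python
# Multi-capacity host selection: treat free capacities and job demands
# as vectors and place the job on the feasible host whose free capacity
# aligns best with the demand (dot-product heuristic).

def fits(free, demand):
    return all(f >= d for f, d in zip(free, demand))

def pick_host(hosts, demand):
    """hosts: {name: [free_cpu, free_mem, ...]}; demand: same dimensions."""
    feasible = {h: free for h, free in hosts.items() if fits(free, demand)}
    if not feasible:
        return None  # the job waits in the queue
    return max(feasible,
               key=lambda h: sum(f * d for f, d in zip(feasible[h], demand)))

hosts = {"h1": [8, 16], "h2": [4, 64]}
print(pick_host(hosts, [2, 32]))  # memory-heavy job -> "h2"
```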