GWpilot: Enabling multi-level scheduling in distributed infrastructures with GridWay and pilot jobs

Current systems based on pilot jobs are not exploiting all the scheduling advantages that the technique offers, or they lack compatibility or adaptability. To overcome the limitations or drawbacks of existing approaches, this study presents a different general-purpose pilot system, GWpilot. This system provides individual users or institutions with an easier-to-use, easier-to-install, scalable, extensible, flexible and adjustable framework to efficiently run legacy applications. The framework is based on the GridWay meta-scheduler and incorporates its powerful features, such as standard interfaces, fair-share policies, ranking, migration, accounting and compatibility with diverse infrastructures. GWpilot goes beyond establishing simple network overlays to overcome the waiting times in remote queues or to improve the reliability of task production. It properly tackles the characterisation problem in current infrastructures, allowing users to arbitrarily incorporate customised monitoring of resources and their running applications into the system. This functionality allows the new framework to implement innovative scheduling algorithms that meet the computational needs of a wide range of calculations faster and more efficiently. The system can also be easily stacked under other software layers, such as self-schedulers. The advanced techniques included by default in the framework result in significant performance improvements even when very short tasks are scheduled.
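To illustrate the pilot-job "pull" model the abstract builds on, here is a minimal in-process sketch. All names (`TaskQueue`, `Pilot`, the attributes dictionary) are hypothetical illustrations, not the actual GWpilot API:

```python
# Minimal sketch of the pilot-job pull model: agents (pilots) run on remote
# resources and pull real tasks from a user-level queue at runtime, reporting
# their own characterisation back to the scheduler. Hypothetical names only.
import queue

class TaskQueue:
    """Central user-level queue from which pilots pull tasks."""
    def __init__(self, tasks):
        self._q = queue.Queue()
        for t in tasks:
            self._q.put(t)

    def pull(self):
        try:
            return self._q.get_nowait()
        except queue.Empty:
            return None

class Pilot:
    """Job agent on a remote resource. Its attributes stand in for the
    customised monitoring (e.g. free memory) the framework can exploit."""
    def __init__(self, name, attributes):
        self.name = name
        self.attributes = attributes
        self.done = []

    def run(self, task_queue):
        while True:
            task = task_queue.pull()   # late binding: task chosen at runtime
            if task is None:
                break                  # no more work: pilot terminates
            self.done.append(task())

tq = TaskQueue([lambda i=i: i * i for i in range(4)])
pilot = Pilot("pilot-1", {"free_mem_mb": 2048})
pilot.run(tq)
print(pilot.done)  # [0, 1, 4, 9]
```

The key property is that the resource is acquired first and the work is bound to it later, which is what lets a scheduler above apply its own policies.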

More information in the article:

A.J. Rubio-Montero, E. Huedo, F. Castejón, R. Mayo-García, GWpilot: Enabling multi-level scheduling in distributed infrastructures with GridWay and pilot jobs, Future Generation Computer Systems, Volume 45, April 2015, Pages 25-52, ISSN 0167-739X,

Distributed scheduling and data sharing in late-binding overlays

Pull-based late-binding overlays are used in some of today’s largest computational grids. Job agents are submitted to resources with the duty of retrieving real workload from a central queue at runtime. This helps overcome the problems of these complex environments: heterogeneity, imprecise status information and relatively high failure rates. In addition, the late job assignment allows dynamic adaptation to changes in grid conditions or user priorities. However, as the scale grows, the central assignment queue may become a bottleneck for the whole system. This article presents a distributed scheduling architecture for late-binding overlays, which addresses this issue by letting execution nodes build a distributed hash table and delegating job matching and assignment to them. This reduces the load on the central server and makes the system much more scalable and robust. Scalability makes fine-grained scheduling possible and enables new functionalities, like the implementation of a distributed data cache on the execution nodes, which helps alleviate the commonly congested grid storage services.
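The core idea of delegating matching to the execution nodes can be sketched with a ring-style distributed hash table: each job hashes to a position on the ring, and the successor node takes responsibility for it. This is an illustrative simplification, not the paper's exact protocol:

```python
# Sketch of DHT-based job assignment: execution nodes form a hash ring and
# each node matches the jobs whose hash falls in its arc, so no central
# queue mediates the assignment. Node and job names are hypothetical.
import hashlib
from bisect import bisect

def _h(key):
    """Stable 32-bit hash for placing keys on the ring."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16) % 2**32

class Ring:
    def __init__(self, nodes):
        self._ring = sorted((_h(n), n) for n in nodes)

    def responsible_node(self, job_id):
        """The node whose hash follows the job's hash on the ring
        (wrapping around) matches and assigns the job."""
        keys = [k for k, _ in self._ring]
        idx = bisect(keys, _h(job_id)) % len(self._ring)
        return self._ring[idx][1]

ring = Ring(["node-a", "node-b", "node-c"])
assignment = {j: ring.responsible_node(j) for j in ["job-1", "job-2", "job-3"]}
print(assignment)
```

Because responsibility is a pure function of the hashes, any node can compute it locally, which is what removes the central server from the critical path.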

More information in the article:

Delgado Peris, A.; Hernandez, J.M.; Huedo, E., “Distributed scheduling and data sharing in late-binding overlays,” in 2014 International Conference on High Performance Computing & Simulation (HPCS), pp. 129-136, 21-25 July 2014.

A framework for building hypercubes using MapReduce

The European Space Agency’s Gaia mission will create the largest and most precise three-dimensional chart of our galaxy (the Milky Way), by providing unprecedented position, parallax, proper motion, and radial velocity measurements for about one billion stars. The resulting catalog will be made available to the scientific community and will be analyzed in many different ways, including the production of a variety of statistics. The latter will often entail the generation of multidimensional histograms and hypercubes as part of the precomputed statistics for each data release, or for scientific analysis involving either the final data products or the raw data coming from the satellite instruments.

In this paper we present and analyze a generic framework that allows the hypercube generation to be easily done within a MapReduce infrastructure, providing all the advantages of the new Big Data analysis paradigm but without dealing with any specific interface to the lower level distributed system implementation (Hadoop). Furthermore, we show how executing the framework for different data storage model configurations (i.e. row or column oriented) and compression techniques can considerably improve the response time of this type of workload for the currently available simulated data of the mission.
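The hypercube generation described above is, at its core, a map/reduce aggregation: map each record to a binned cell key, then sum counts per cell. The sketch below simulates that pattern in-process; on Hadoop the same `map`/`reduce` logic would run distributed. The bin sizes and record layout are illustrative assumptions, not the paper's framework API:

```python
# Sketch of building a multidimensional histogram (hypercube) with the
# map/reduce pattern: map projects a record onto a binned cell key and
# emits a count, reduce sums the counts that share a key.
from collections import defaultdict

def map_record(record, bin_size=10.0):
    """Map: emit ((bin_x, bin_y), 1) for a two-column record."""
    mag, color = record
    cell = (int(mag // bin_size), int(color // bin_size))
    yield cell, 1

def reduce_cells(pairs):
    """Reduce: sum the counts sharing a cell key into the hypercube."""
    cube = defaultdict(int)
    for cell, count in pairs:
        cube[cell] += count
    return dict(cube)

records = [(12.3, 4.1), (17.9, 4.8), (25.0, 14.2)]
pairs = (p for r in records for p in map_record(r))
print(reduce_cells(pairs))  # {(1, 0): 2, (2, 1): 1}
```

Since the reduce step is a plain associative sum, the storage layout (row vs. column oriented) and compression can change freely underneath without touching this logic, which is the configurability the paper benchmarks.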

In addition, we put forward the advantages and shortcomings of the deployment of the framework on a public cloud provider, benchmark against other popular solutions available (that are not always the best for such ad-hoc applications), and describe some user experiences with the framework, which was employed for a number of dedicated astronomical data analysis techniques workshops.

More information in the article:

D. Tapiador, W. O’Mullane, A.G.A. Brown, X. Luri, E. Huedo, P. Osuna, A framework for building hypercubes using MapReduce, Computer Physics Communications, Volume 185, Issue 5, May 2014, Pages 1429-1438, ISSN 0010-4655.

DSA-Research Group at UCM Joins €3.6 Million Consortium Tasked With Enabling Federated Cloud Networking

Challenging research in the flagship European Project on SDN, NFV and Cloud

Madrid, Spain – 5 February 2015 – DSA-Research (UCM) today announced it has joined a consortium of leading research organisations and universities from the U.K., Germany, Spain, Belgium, Israel and Italy focused on developing innovative techniques to federate cloud network resources and on deriving the integrated management cloud layer that enables an efficient and secure deployment of federated cloud applications.

The BEACON project will deliver a homogeneous virtualization layer, on top of heterogeneous underlying physical networks, computing and storage infrastructures, enabling the automated federation of applications across different clouds and datacentres. Senior member of DSA-Research and Scientific Coordinator of the project, Eduardo Huedo, said:

“BEACON will provide innovative techniques to federate cloud network resources and an integrated management cloud layer to enable the efficient and secure deployment of multi-cloud applications, which aggregate compute, storage and network resources from distributed infrastructures. This brings up new challenges for the research done at DSA-Research, like virtual networks spanning multiple datacentres, automated high-availability across datacentres, or datacentre location-aware elasticity. The technology developed as a result of this research will be contributed to the OpenNebula cloud management platform.”

DSA-Research (UCM) is joined on the project by Flexiant (U.K.), CETIC (Belgium), OpenNebula Systems (Spain), IBM Israel (Israel), Universita di Messina (Italy) and Lufthansa Systems (Germany).


BEACON is a collaborative research project co-funded under the ICT theme of the HORIZON 2020 Research Programme of the European Union.

For more information visit

About DSA-Research (UCM)

The DSA (Distributed Systems Architecture) Research Group at Complutense University of Madrid conducts research in advanced distributed computing and virtualization technologies for large-scale infrastructures and resource provisioning platforms.

The group founded the OpenNebula open-source project, a widely used technology to build IaaS cloud infrastructures; co-founded the OGF Open Cloud Computing Interface (OCCI) Working Group; and participates in the main European projects in cloud computing, such as RESERVOIR, the flagship European research initiative in virtualized infrastructures and cloud computing, as well as BonFIRE, 4CaaSt, StratusLab, PANACEA and CloudCatalyst. The results of this research have been published in several leading publications on virtualization and cloud computing, and members of the group serve on the programme committees of the most important workshops and conferences in the field. The group also founded the Spanish Initiative in Grid Middleware and the Working Group on SOI and Grids of INES, the Spanish Technology Platform on Software and Services, and is involved in NESSI.

For more information visit

Performance evaluation of a signal extraction algorithm for the Cherenkov Telescope Array’s Real Time Analysis pipeline

The IEEE Xplore Digital Library has made available another of our latest conference papers. This time it was presented at the IEEE International Conference on Cluster Computing 2014, which took place in Madrid last September.

IEEE Cluster 2014

The work was presented in the form of a poster entitled “Performance evaluation of a signal extraction algorithm for the Cherenkov Telescope Array’s Real Time Analysis pipeline” and the paper can be accessed here.


In this paper, several versions of a signal extraction algorithm, pertaining to the entry stage of the Cherenkov Telescope Array’s Real Time Analysis pipeline, were implemented and optimized using SSE2, POSIX threads and CUDA. The results of this proof of concept gave us insight into the suitability of each platform, and the performance each one can deliver, for carrying out this particular task.
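A typical signal extraction step of this kind scans each pixel trace for the fixed-width window with the maximum summed charge. The sketch below shows that sliding-window idea in plain Python; the paper's exact algorithm and its SSE2/pthreads/CUDA implementations are not reproduced here, and the window width and trace values are illustrative:

```python
# Sketch of per-pixel signal extraction: find the fixed-width window with
# the maximum integrated charge using an O(1) rolling-sum update.

def extract_signal(trace, width=3):
    """Return (best_start, charge) of the sliding window with max sum."""
    charge = sum(trace[:width])
    best_start, best = 0, charge
    for i in range(1, len(trace) - width + 1):
        charge += trace[i + width - 1] - trace[i - 1]  # roll the window
        if charge > best:
            best_start, best = i, charge
    return best_start, best

trace = [1, 2, 9, 12, 7, 3, 1, 0]   # ADC samples of one pixel
print(extract_signal(trace))  # (2, 28)
```

Because every pixel is independent, this loop parallelizes naturally across threads or GPU blocks, which is why comparing SSE2, pthreads and CUDA versions is a meaningful exercise.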

This work constitutes a first step in the “cloudification” of this application and represents the first publication of my PhD student Juan José Rodríguez-Vázquez in this context.


J.L. Vázquez-Poletti

A Multi-Capacity Queuing Mechanism in Multi-Dimensional Resource Scheduling

Springer has published a volume of its Lecture Notes in Computer Science series with our paper entitled “A Multi-Capacity Queuing Mechanism in Multi-Dimensional Resource Scheduling”. This contribution was presented at the International Workshop on Adaptive Resource Management and Scheduling for Cloud Computing, held in conjunction with the ACM Symposium on Principles of Distributed Computing, which took place in Paris, France, on July 15th.

The volume can be accessed here and the paper is the result of an ongoing collaboration with the research group led by Prof. Lucio Grandinetti (University of Calabria, Italy).

With the advent of new computing technologies, such as cloud computing and contemporary parallel processing systems, the building blocks of computing systems have become multi-dimensional. Traditional scheduling algorithms that optimize a single resource, such as the processor, fail to provide near-optimal solutions. The efficient use of new computing systems depends on the efficient use of all resource dimensions, so scheduling algorithms have to fully use all resources. In this paper, we propose a queuing mechanism based on a multi-resource scheduling technique. For that, we model multi-resource scheduling as a multi-capacity bin-packing problem at the queue level, reordering the queue to improve the packing and, as a result, the scheduling metrics. The experimental results demonstrate performance improvements in terms of wait-time and slowdown metrics.
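The queue-level reordering idea can be sketched as a greedy multi-capacity first-fit pass: jobs whose resource vectors (e.g. cores and memory) fit the remaining capacity in every dimension are pulled forward. This is a purely illustrative stand-in, not the paper's exact algorithm:

```python
# Sketch of multi-capacity bin packing at the queue level: repeatedly pick
# the first pending job that fits the remaining capacity in all resource
# dimensions, so the head of the queue packs the node well.

def reorder_queue(jobs, capacity):
    """Return the queue with greedily packed jobs first, rest in FIFO order."""
    remaining = list(capacity)
    pending = list(jobs)
    packed = []
    progress = True
    while progress:
        progress = False
        for job in pending:
            if all(d <= r for d, r in zip(job["demand"], remaining)):
                remaining = [r - d for d, r in zip(job["demand"], remaining)]
                packed.append(job)
                pending.remove(job)
                progress = True
                break
    return packed + pending

jobs = [
    {"id": 1, "demand": (8, 16)},   # (cores, GB RAM): too big for the node
    {"id": 2, "demand": (2, 4)},
    {"id": 3, "demand": (4, 8)},
]
order = reorder_queue(jobs, capacity=(6, 12))
print([j["id"] for j in order])  # [2, 3, 1]
```

A strict single-resource FIFO would stall behind job 1; checking all dimensions lets jobs 2 and 3 fill the node first, which is exactly the wait-time improvement the abstract reports.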


J.L. Vázquez-Poletti

A Model to Calculate Amazon EC2 Instance Performance in Frost Prediction Applications

Last week the First HPCLATAM – CLCAR Joint Conference took place in Valparaiso, Chile. There, a joint work with Prof. Carlos García Garino’s research group (Universidad Nacional de Cuyo, Argentina) was presented. This work, entitled “A Model to Calculate Amazon EC2 Instance Performance in Frost Prediction Applications”, has been published by Springer through its Communications in Computer and Information Science series.


Frosts are one of the main causes of economic losses in the Province of Mendoza, Argentina. Although it is a phenomenon that happens every year, frosts can be predicted using Agricultural Monitoring Systems (AMS). AMS provide information to start and stop frost defense systems and thus reduce economic losses. In recent years, the emergence of infrastructures called Sensor Clouds improved AMS in several aspects such as scalability, reliability and fault tolerance. Sensor Clouds use Wireless Sensor Networks (WSN) to collect data in the field and Cloud Computing to store and process these data. Currently, Cloud providers like Amazon offer different instances to store and process data in a profitable way. Moreover, the variety of offered instances raises the need for tools to determine which instance type is the most appropriate, in terms of execution time and economic cost, for running agro-meteorological applications. In this paper we present a model targeted at estimating the execution time and economic cost of Amazon EC2 instances for frost prediction applications.
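The kind of estimate such a model produces can be sketched as: runtime scaled by an instance's relative performance, and cost derived from the hourly price under per-hour billing. The speedups and prices below are illustrative placeholders, not Amazon's actual prices or the paper's fitted parameters:

```python
# Sketch of a time/cost model for choosing an EC2 instance type: estimate
# runtime from a baseline divided by the instance's relative speedup, then
# cost from classic per-hour billing. All figures are hypothetical.
import math

def estimate(base_hours, instance):
    """Return (estimated_hours, estimated_cost_usd) for one instance type."""
    hours = base_hours / instance["speedup"]
    billed = math.ceil(hours)              # partial hours billed as full
    return hours, billed * instance["usd_per_hour"]

instances = {
    "m1.small": {"speedup": 1.0, "usd_per_hour": 0.06},
    "m1.large": {"speedup": 4.0, "usd_per_hour": 0.24},
}
for name, spec in instances.items():
    hours, cost = estimate(base_hours=10.0, instance=spec)
    print(f"{name}: {hours:.1f} h, ${cost:.2f}")
```

Note the trade-off the model exposes: the faster instance finishes 4x sooner but, with hour rounding, can cost more per run, so "most appropriate" depends on whether the frost-prediction deadline or the budget dominates.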


J.L. Vázquez-Poletti


Research stay at the Chinese Academy of Sciences

In the past month I had the pleasure and the honor to be hosted again by the Chinese Academy of Sciences in Beijing. This was three years after the previous invitation.

Chinese Academy of Sciences

During this period I gave talks on cloud computing at the following institutions:

Talk at Tsinghua University

The talk introduced the basics of cloud computing and presented real use cases of applications pertaining to emergent areas, such as Bioinformatics and Space Exploration, in which I have been involved over the past years.

Also, several meetings were held to pursue collaboration opportunities. As a result, some initial joint work was started between our research group, ICMSEC and ICT.

Talk and meeting at ICT-CAS

Summarizing, this period has been very productive. The new opportunities that have arisen are a good example of how cloud computing is a hot technology.

J.L. Vázquez-Poletti

Spot Price prediction for Cloud Computing using Neural Networks

The International Journal of Computing has made available our paper entitled “Spot Price prediction for Cloud Computing using Neural Networks”. This work is the result of a collaboration with the research groups led by Prof. Lucio Grandinetti (University of Calabria, Italy) and Associate Prof. Volodymyr O. Turchenko (Ternopil National Economic University, Ukraine).


Advances in service-oriented architectures, virtualization, high-speed networks, and cloud computing have resulted in attractive pay-as-you-go services. Job scheduling on such systems results in commodity bidding for computing time. Amazon institutionalizes this bidding in its Elastic Compute Cloud (EC2) environment. Similar bidding methods exist for other cloud computing vendors, as well as for multi-cloud and cluster computing brokers such as SpotCloud. Commodity bidding for computing has resulted in complex spot price models with ad-hoc strategies to provide demand for excess capacity. In this paper we discuss vendors who provide spot pricing and bidding, and present predictive models for future short-term and middle-term spot price prediction based on neural networks, giving users high confidence in future prices and aiding bidding on commodity computing.
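The windowed prediction setup such models rely on can be shown with a toy single-neuron "network": a linear autoregressive model trained by gradient descent on the last k prices. The paper uses proper neural networks; this sketch, with synthetic prices and made-up hyperparameters, only illustrates the framing of the problem:

```python
# Toy spot-price predictor: turn a price series into (last-k, next) samples
# and fit a single linear neuron with stochastic gradient descent.
# Synthetic data and hyperparameters are illustrative assumptions.
import random

def make_windows(prices, k=3):
    """(last k prices, next price) training samples from a series."""
    return [(prices[i:i + k], prices[i + k]) for i in range(len(prices) - k)]

def train(samples, k=3, epochs=500, lr=0.01):
    w, b = [0.0] * k, 0.0
    for _ in range(epochs):
        for x, y in samples:
            pred = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = pred - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]  # SGD step
            b -= lr * err
    return w, b

random.seed(0)
prices = [0.10 + 0.02 * random.random() for _ in range(50)]  # synthetic $/h
w, b = train(make_windows(prices))
forecast = sum(wi * xi for wi, xi in zip(w, prices[-3:])) + b
print(f"next-hour spot price forecast: ${forecast:.3f}")
```

Replacing the linear neuron with a multi-layer network changes only the `train` step; the sliding-window construction of inputs and targets stays the same.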


J.L. Vázquez-Poletti

Chapter in the Handbook of Research on Architectural Trends in Service-Driven Computing

At the end of June, the Handbook of Research on Architectural Trends in Service-Driven Computing was released by IGI Global. This publication, divided into two volumes, explores, delineates, and discusses recent advances in architectural methodologies and development techniques in service-driven computing. The handbook is an inclusive reference source for organizations, researchers, students, enterprise and integration architects, practitioners, software developers, and software engineering professionals engaged in the research, development, and integration of the next generation of computing.


We contributed to this publication with its 28th chapter, entitled “Admission Control in the Cloud: Algorithms for SLA-Based Service Model”.

Cloud Computing is a paradigm that allows the flexible and on-demand provisioning of computing resources. For this reason, many institutions have moved their systems to the Cloud, and in particular to public infrastructures. Unfortunately, an increase in the demand for Cloud services results in resource shortages affecting both providers and consumers. With this factor in mind, Cloud service providers need Admission Control algorithms in order to make good business decisions on the types of requests to be fulfilled. At the same time, Cloud providers want to maximize the net income derived from provisioning the accepted service requests and to minimize the impact of un-provisioned resources. This chapter introduces and compares Admission Control algorithms and proposes a service model that allows the definition of Service Level Agreements (SLAs).
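The admission control decision the abstract describes can be sketched as a knapsack-style choice: admit the subset of SLA requests that maximizes net income without exceeding capacity. Below, a simple greedy income-per-resource heuristic stands in for the chapter's algorithms, which are not reproduced here; request names and figures are hypothetical:

```python
# Sketch of SLA-based admission control: rank requests by income density
# (income per VM-unit) and admit them while free capacity remains, so
# reserved resources back each accepted SLA.

def admit(requests, capacity):
    """Greedily admit requests by income per VM; return admitted ids."""
    ranked = sorted(requests, key=lambda r: r["income"] / r["vms"], reverse=True)
    admitted, used = [], 0
    for req in ranked:
        if used + req["vms"] <= capacity:   # enough free resources: accept
            admitted.append(req["id"])
            used += req["vms"]              # reserve capacity for the SLA
    return admitted

requests = [
    {"id": "gold",   "vms": 4, "income": 40.0},
    {"id": "silver", "vms": 4, "income": 24.0},
    {"id": "bronze", "vms": 2, "income": 6.0},
]
print(admit(requests, capacity=8))  # ['gold', 'silver']
```

The greedy density rule is not always optimal (bin-packing effects can make a lower-density mix pay more), which is precisely why the chapter compares several Admission Control algorithms rather than fixing one.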

Title: Handbook of Research on Architectural Trends in Service-Driven Computing
Editors: Raja Ramanathan and Kirtana Raja
Pub. date: June 2014
Pages: 759
Volume: 23 of Advances in Parallel Computing
ISBN13: 9781466661783
J.L. Vázquez-Poletti