Article ID Journal Published Year Pages File Type
425062 Future Generation Computer Systems 2013 15 Pages PDF
Abstract

Scheduling many tasks in environments of millions of unreliable nodes is a challenging problem. To our knowledge, no work in the literature has proposed a solution that also supports many policies with very different objectives. In this paper, we present a decentralized scheduling model that overcomes these problems. A hierarchical network overlay supports a scalable resource discovery and allocation scheme. It uses aggregated information to route tasks to the most suitable execution nodes, and is easily extensible to provide very different scheduling policies. For this paper, we implemented a policy that just allocates tasks to idle nodes, a policy that minimizes the global makespan and a policy that fulfills deadline requirements. With thorough simulation tests, we conclude that our model allocates any number of tasks to several million nodes in just a few seconds, with very low overhead and high resilience. Meanwhile, policies with different objectives implemented on our model perform almost as well as their centralized counterpart.

► We present a distributed scheduling model for large-scale platforms. ► Execution node availability information is aggregated in a tree-like overlay. ► Tasks are allocated to nodes with logarithmic cost with the number of tasks. ► Many policies are supported, with performance close to a centralized model. ► The model recovers from the simultaneous failure of a certain number of nodes.

Related Topics
Physical Sciences and Engineering Computer Science Computational Theory and Mathematics
Authors
, ,