Article ID Journal Published Year Pages File Type
4950528 Future Generation Computer Systems 2017 9 Pages PDF
Abstract
In this paper we study MapReduce scheduling on n parallel machines (servers) with different speeds v1≥v2≥⋯≥vn. Each job contains two kinds of tasks: map tasks and reduce tasks. The job's reduce tasks can only be processed after finishing all its map tasks. We assume that the map tasks are parallelized, i.e., it can be arbitrarily split and parts of the same task can be processed on different machines in parallel, while the reduce tasks are non-parallelizable. We consider both the offline and online scheduling problems. On offline version, if the reduce tasks are non-preemptive, we design an approximation algorithm whose worst case ratio is at most max{1+Δ2−1n,Δ}, where Δ=v1vn is the ratio of the fastest speed to the slowest speed. If the reduce tasks are preemptive, we provide an approximation algorithm with worst case ratio of 2. On online version where jobs arriving over time, we design two heuristics for non-preemptive and preemptive reduce tasks respectively. In experiment, we verify the advantage of our algorithms comparing with the state-of-the-art.
Related Topics
Physical Sciences and Engineering Computer Science Computational Theory and Mathematics
Authors
, , , ,