Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
524636 | 868790 | 2015 | 16-page PDF | Free download |
• We evaluated four scheduling strategies for multi-CPU and multi-GPU architectures.
• We designed a framework with performance models for task and transfer prediction.
• Work stealing is efficient with task annotations and data locality heuristics.
• The HEFT cost model performs better on very regular computations.
In this paper, we present a comparison of scheduling strategies for heterogeneous multi-CPU and multi-GPU architectures. We designed and evaluated four scheduling strategies on top of the XKaapi runtime: work stealing, data-aware work stealing, locality-aware work stealing, and Heterogeneous Earliest-Finish-Time (HEFT). On a heterogeneous architecture with 12 CPUs and 8 GPUs, we analysed our scheduling strategies with four benchmarks: a BLAS-1 AXPY vector operation, a Jacobi 2D iterative computation, and two linear algebra algorithms, Cholesky and LU. We conclude that work stealing may be efficient if task annotations are given along with a data locality strategy. Furthermore, our experimental results suggest that HEFT scheduling performs better on applications with very regular computations and low data locality.
Journal: Parallel Computing - Volume 44, May 2015, Pages 37–52
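The abstract mentions performance models for task and transfer prediction feeding a HEFT strategy. As a rough illustration only, the core earliest-finish-time placement step might be sketched as below. This is not the XKaapi implementation: the `Worker` class, the `exec_model` and `transfer_cost` tables, and the function names are all hypothetical, tasks are treated as independent and already ordered (full HEFT also ranks tasks along the dependency graph), and each missing input block is assumed to cost a fixed transfer time per worker kind.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Hypothetical processing unit; not an XKaapi type."""
    name: str
    kind: str                      # "cpu" or "gpu"
    available_at: float = 0.0      # time at which the worker becomes idle
    resident: set = field(default_factory=set)  # data blocks already local


def estimated_finish(worker, task_kind, inputs, exec_model, transfer_cost):
    """Predicted finish time: idle time + missing-data transfers + execution."""
    missing = [block for block in inputs if block not in worker.resident]
    transfer = transfer_cost[worker.kind] * len(missing)
    return worker.available_at + transfer + exec_model[(task_kind, worker.kind)]


def schedule_eft(tasks, workers, exec_model, transfer_cost):
    """Place each task on the worker with the earliest predicted finish time.

    `tasks` is an ordered list of (name, kind, input_blocks) tuples.
    Assumption: tasks are independent, so no dependency wait is modelled.
    """
    placement = []
    for name, kind, inputs in tasks:
        best = min(
            workers,
            key=lambda w: estimated_finish(w, kind, inputs, exec_model, transfer_cost),
        )
        finish = estimated_finish(best, kind, inputs, exec_model, transfer_cost)
        best.available_at = finish       # worker is busy until the task ends
        best.resident.update(inputs)     # inputs now live in its memory
        placement.append((name, best.name, finish))
    return placement


if __name__ == "__main__":
    # Illustrative numbers only; they are not measurements from the paper.
    model = {("gemm", "cpu"): 10.0, ("gemm", "gpu"): 1.0,
             ("potrf", "cpu"): 2.0, ("potrf", "gpu"): 3.0}
    cost = {"cpu": 0.0, "gpu": 0.5}
    units = [Worker("cpu0", "cpu"), Worker("gpu0", "gpu")]
    work = [("t0", "potrf", ["A"]), ("t1", "gemm", ["A", "B"])]
    for entry in schedule_eft(work, units, model, cost):
        print(entry)
```

Because the sketch charges a transfer only for blocks not yet resident, a worker that already holds a task's inputs is favoured on later tasks, which is one simple way a cost model can account for data placement.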