| Article code | Journal code | Publication year | English article | Full-text version |
|---|---|---|---|---|
| 523861 | 868508 | 2016 | 12-page PDF | Free download |
• Local search algorithm that improves on task mapping algorithms for stencil patterns.
• Algorithm shown to reduce total running time and running time variability.
• Improvement shown to depend on the allocation algorithm used.
• Number of swaps made shown to be reasonable in practice.
We present a local search strategy to improve the coordinate-based mapping of a parallel job’s tasks to the MPI ranks of its parallel allocation in order to reduce network congestion and the job’s communication time. The goal is to reduce the number of network hops between communicating pairs of ranks. Our target is applications with a nearest-neighbor stencil communication pattern running on mesh systems with non-contiguous processor allocation, such as Cray XE and XK Systems. Using the miniGhost mini-app, which models the shock physics application CTH, we demonstrate that our strategy reduces application running time while also reducing the runtime variability. We further show that mapping quality can vary based on the selected allocation algorithm, even between allocation algorithms of similar apparent quality.
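The general idea of a swap-based local search over a task-to-rank mapping can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a generic pairwise-swap hill climb, and it assumes a 2D stencil, Manhattan distance as the hop count, and hypothetical helper names (`hops`, `total_hops`, `stencil_edges`, `local_search`) chosen purely for illustration.

```python
from itertools import combinations

def hops(a, b):
    """Manhattan distance between two mesh coordinates (assumed hop metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def total_hops(mapping, edges, coords):
    """Sum of hops over all communicating task pairs.

    mapping[t] is the rank assigned to task t;
    coords[r] is the mesh coordinate of the node hosting rank r.
    """
    return sum(hops(coords[mapping[u]], coords[mapping[v]]) for u, v in edges)

def stencil_edges(nx, ny):
    """Nearest-neighbor task pairs of an nx-by-ny task grid (2D stencil)."""
    edges = []
    for i in range(nx):
        for j in range(ny):
            t = i * ny + j
            if i + 1 < nx:
                edges.append((t, t + ny))  # neighbor in the first dimension
            if j + 1 < ny:
                edges.append((t, t + 1))   # neighbor in the second dimension
    return edges

def local_search(mapping, edges, coords):
    """Keep any pairwise swap that lowers total hops; stop when none does."""
    mapping = list(mapping)
    best = total_hops(mapping, edges, coords)
    swaps = 0
    improved = True
    while improved:
        improved = False
        for a, b in combinations(range(len(mapping)), 2):
            mapping[a], mapping[b] = mapping[b], mapping[a]
            cost = total_hops(mapping, edges, coords)
            if cost < best:
                best, swaps, improved = cost, swaps + 1, True
            else:
                # Undo a swap that does not help.
                mapping[a], mapping[b] = mapping[b], mapping[a]
    return mapping, best, swaps

# Toy non-contiguous allocation: four ranks on scattered mesh nodes.
coords = [(0, 0), (3, 1), (0, 1), (3, 0)]
edges = stencil_edges(2, 2)
initial = [0, 1, 2, 3]
mapping, cost, swaps = local_search(initial, edges, coords)
print(total_hops(initial, edges, coords), cost, swaps)
```

Recomputing the full cost after every trial swap keeps the sketch short; a practical version would likely update only the edges touching the two swapped tasks and would start from a coordinate-based mapping rather than the identity.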
Journal: Parallel Computing - Volume 51, January 2016, Pages 67–78
