کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
6873231 1440631 2018 16 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Highly scalable SFC-based dynamic load balancing and its application to atmospheric modeling
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نظریه محاسباتی و ریاضیات
پیش نمایش صفحه اول مقاله
Highly scalable SFC-based dynamic load balancing and its application to atmospheric modeling
چکیده انگلیسی
Load balance is one of the major challenges for efficient supercomputing, especially for applications that exhibit workload variations. Various dynamic load balancing and workload partitioning methods have been developed to handle this issue by migrating workload between nodes periodically during the runtime. However, on today's top HPC systems-and even more so on future exascale systems-runtime performance and scalability of these methods becomes a concern, due to the costs exceeding the benefits of dynamic load balancing. In this work, we focus on methods based on space-filling curves (SFC), a well-established and comparably fast approach for workload partitioning. SFCs reduce the partitioning problem from n dimensions to one dimension. The remaining task, the so-called 1D partitioning problem or chains-on-chains partitioning problem, is to decompose a 1D workload array into consecutive, balanced partitions. While published parallel heuristics for this problem cannot reliably deliver the required workload balance, especially at large scale, exact algorithms are infeasible due to their sequential nature. We therefore propose a hierarchical method that combines a heuristic and an exact algorithm and allows to trade-off between these two approaches. We compare load balance, execution time, application communication, and task migration of the algorithms using real-life workload data from two different applications on two different HPC systems. The hierarchical method provides a significant speed-up compared to exact algorithms and yet achieves nearly the optimal load balance. On a Blue Gene/Q system, it is able to partition 2.6 million tasks for 524 288 processes with over 99% of the optimal balance in 23.4 ms only, while a published fast exact algorithm requires 6.4 s. We also provide a comparison to parallel load balancing methods implemented in the Zoltan library and present results from applying our methods to COSMO-SPECS+FD4, a detailed atmospheric simulation model that requires frequent dynamic load balancing to run efficiently at large scale.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Future Generation Computer Systems - Volume 82, May 2018, Pages 575-590
نویسندگان
, ,