Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
523856 | 868508 | 2016 | 14-page PDF | Free download |
• We adapt Ford–Fulkerson’s algorithm to find multiple disjoint paths to move data.
• We realize multi-path data movement by introducing intermediate nodes to the paths.
• We leverage routing policies to reduce the number of intermediate nodes.
• We implement pipelined data movement with the Parallel Active Message Interface.
In situ analysis has been proposed as a promising way to gain insights faster and to reduce the amount of data written to storage. A critical challenge is that the reduced dataset typically resides on a subset of the nodes and still needs to be written out to storage. Data coupling in multiphysics codes exhibits a similarly sparse pattern, in which data movement occurs among only a subset of nodes. We evaluate the performance of data movement for sparse data patterns on the IBM Blue Gene/Q supercomputing system “Mira” and identify performance bottlenecks. We propose a multipath data movement algorithm for sparse data patterns, based on an adaptation of a maximum flow algorithm combined with breadth-first search, that fully exploits all the underlying data paths and I/O nodes to improve data movement. We demonstrate the efficacy of our solutions through a set of microbenchmarks and application benchmarks on Mira, scaling up to 131,072 compute cores. The results show that our approach achieves up to 5× improvement in achievable throughput compared with the default mechanisms.
Journal: Parallel Computing - Volume 51, January 2016, Pages 3–16
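The abstract's core idea, a maximum flow algorithm driven by breadth-first search to find multiple disjoint paths, can be illustrated with a generic sketch. The code below is a textbook Ford–Fulkerson variant with BFS augmentation (Edmonds–Karp) on unit-capacity links, not the authors' adaptation: the function name `edge_disjoint_paths`, the arc-list input format, and the toy topology are assumptions made for illustration, and nothing here models the Blue Gene/Q torus, routing policies, or I/O nodes.

```python
from collections import defaultdict, deque


def edge_disjoint_paths(arcs, source, sink):
    """Return a maximum set of edge-disjoint source->sink paths.

    arcs: iterable of directed (u, v) links, each treated as unit capacity.
    Hypothetical helper for illustration only.
    """
    original = defaultdict(int)   # capacity before any flow is pushed
    residual = defaultdict(int)   # remaining capacity per directed arc
    neighbors = defaultdict(list)
    for u, v in arcs:
        original[(u, v)] += 1
        residual[(u, v)] += 1
        # Reverse arcs must be reachable so flow can be undone later.
        for a, b in ((u, v), (v, u)):
            if b not in neighbors[a]:
                neighbors[a].append(b)

    def find_augmenting_path():
        # Breadth-first search over arcs that still have residual capacity.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    if v == sink:
                        return parent
                    queue.append(v)
        return None

    count = 0
    while (parent := find_augmenting_path()) is not None:
        v = sink
        while parent[v] is not None:   # push one unit along the found path
            u = parent[v]
            residual[(u, v)] -= 1
            residual[(v, u)] += 1
            v = u
        count += 1

    # Net flow on each arc, then peel off one path per unit of flow.
    flow = {arc: original.get(arc, 0) - residual[arc] for arc in list(residual)}
    paths = []
    for _ in range(count):
        path = [source]
        while path[-1] != sink:
            u = path[-1]
            v = next(w for w in neighbors[u] if flow.get((u, w), 0) > 0)
            flow[(u, v)] -= 1
            if v in path:
                del path[path.index(v) + 1:]   # drop any loop in the walk
            else:
                path.append(v)
        paths.append(path)
    return paths


# Toy topology (assumed): two independent routes from "s" to "t".
demo_arcs = [("s", "a"), ("s", "b"), ("a", "b"), ("a", "t"), ("b", "t")]
demo_paths = edge_disjoint_paths(demo_arcs, "s", "t")
```

In this sketch the interior nodes of each returned path play the role that intermediate nodes play in the highlights: data from one source can be split across the paths and forwarded concurrently, since no two paths share a link.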