کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
432372 | 688869 | 2013 | 13 صفحه PDF | دانلود رایگان |
• Proposed a Three-Level Parallelization Scheme for Molecular Dynamics.
• Employed hierarchical optimizations to address the bottlenecks of MD with TLPS.
• Evaluated MD simulation with TLPS and optimizations on TH-1A.
Heterogeneous systems with nodes containing more than one type of computation units, e.g., central processing units (CPUs) and graphics processing units (GPUs), are becoming popular because of their low cost and high performance. In this paper, we have developed a Three-Level Parallelization Scheme (TLPS) for molecular dynamics (MD) simulation on heterogeneous systems. The scheme exploits multi-level parallelism combining (1) inter-node parallelism using spatial decomposition via message passing, (2) intra-node parallelism using spatial decomposition via dynamically scheduled multi-threading, and (3) intra-chip parallelism using multi-threading and short vector extension in CPUs, and employing multiple CUDA threads in GPUs. By using a hierarchy of parallelism with optimizations such as communication hiding intra-node, and memory optimizations in both CPUs and GPUs, we have implemented and evaluated a MD simulation on a petascale heterogeneous supercomputer TH-1A. The results show that MD simulations can be efficiently parallelized with our TLPS scheme and can benefit from the optimizations.
Journal: Journal of Parallel and Distributed Computing - Volume 73, Issue 12, December 2013, Pages 1592–1604