کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
429360 | 687477 | 2011 | 6 صفحه PDF | دانلود رایگان |
The highly scalable parallel tree code PEPC for rapid computation of long-range (1/r) Coulomb forces is presented. It can be used as a library for applications involving electrostatics or Newtonian gravity in 3D. The code is based on the hashed oct-tree algorithm, in which particle coordinates are projected onto a space-filling curve prior to sorting and construction of multipole moments. However, standard particle sorting techniques can ultimately limit the scalability of such algorithms for thousands of cores, a bottleneck which can be alleviated by a recursive sort scheme specially adapted to the Morton curve. More serious limitations of the original locally essential tree concept of Salmon and Warren, which ultimately lead to a failure in memory scaling, are identified and analyzed rigorously. Benchmarks for the code on the IBM Blue Gene/P Jugene are presented which demonstrate scaling for multi-million particle systems on up to 8192 cores.
► We describe the parallel Barnes–Hut tree code PEPC for solving the N-body problem.
► The algorithm scales very well up to 4096 MPI tasks on a IBM Blue Gene/P system.
► The code allows for rapid parallel computation of more than 100 million particles.
► A new parallel sorting approach is the key aspect of the load balancing scheme.
► Efficiency analysis of the communication process reveals optimization opportunities.
Journal: Journal of Computational Science - Volume 2, Issue 2, May 2011, Pages 138–143