کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
484091 703253 2016 10 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Acceleration and Parallelization of ZENO/Walk-on-Spheres
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر علوم کامپیوتر (عمومی)
پیش نمایش صفحه اول مقاله
Acceleration and Parallelization of ZENO/Walk-on-Spheres
چکیده انگلیسی

This paper describes our on-going work to accelerate ZENO, a software tool based on Monte Carlo methods (MCMs), used for computing material properties at nanoscale. ZENO employs three main algorithms: (1) Walk on Spheres (WoS), (2) interior sampling, and (3) surface sampling. We have accelerated the first two algorithms. For the sake of brevity, the paper will discuss our work on the first one only as it is the most commonly used and the acceleration techniques were similar in both cases.WoS is a Brownian motion MCM for solving a class of partial differential equations (PDEs). It provides a stochastic solution to a PDE by estimating the probability that a random walk, which started at infinity, will hit the surface of the material under consideration. WoS is highly effective when the problem's geometry is additive, as this greatly reduces the number of walk steps needed to achieve accurate results. The walks start on the surface of an enclosing sphere and can make much larger jumps than in a direct simulation of Brownian motion. Our current implementation represents the molecular structure of nanomaterials as a union of possibly overlapping spheres. The core processing bottleneck in WoS is a Computational Geometry one, as the algorithm repeatedly determines the distance from query point to the material surface in each step of the random walk.In this paper, we present results from benchmarking spatial data structures, including several open-source implementations of k-D trees, for accelerating WoS algorithmically. The paper also presents results from our multicore and cluster parallel implementation to show that it exhibits linear strong scaling with the number of cores and compute nodes; this implementation delivers up to 4 orders of magnitude speedup compared to the original FORTRAN code when run on 8 nodes (each with dual 6-core Intel Xeon CPUs) with 24 threads per node.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Procedia Computer Science - Volume 80, 2016, Pages 269–278
نویسندگان
, , , ,