Article code: 7155954
Journal code: 1462640
Publication year: 2018
English article: 30-page PDF
Full-text version: Free download
English title of the ISI article
GPU-accelerated 3-D Finite Volume Particle Method
Related topics
Engineering and Basic Sciences > Other Engineering Disciplines > Computational Mechanics
English abstract
In previous works (Jahanbakhsh et al., CMAME 298 (2016): 80-107; Jahanbakhsh et al., CMAME 317 (2017): 102-127; see [1] and [2]), the authors introduced SPHEROS, a 3-D particle-based solver based on the Finite Volume Particle Method (FVPM) featuring a spherical top-hat kernel. In the present research, the authors present algorithms and optimization procedures that significantly accelerate computations by taking advantage of the computational power of Graphics Processing Units (GPUs). The new accelerated solver, GPU-SPHEROS, has been developed in CUDA and runs entirely on the GPU. All the parallel algorithms and data structures have been designed specifically for the GPU many-core architecture. A roofline model has been used to assess the performance of the kernels and to apply appropriate optimization strategies. In particular, the neighbor search algorithm, accounting for almost a third of the overall compute time, features an efficient Space-Filling Curve (SFC) as well as an optimized octree construction procedure. The memory-bound interaction vector computation, accounting for almost two thirds of the overall compute time, features fixed-size memory pre-allocation and an efficient data ordering to reduce memory transactions and the cost of dynamic memory operations, i.e., allocation and deallocation. As a case study, numerical simulation results of water jet deviation by the rotating buckets of a Pelton turbine are presented and compared to available experimental data. For that case, a speedup of almost a factor of six has been achieved on a single NVIDIA® Tesla™ P100-SXM2-16 GB GPU with GP100 Pascal architecture compared to a dual-CPU node equipped with two Broadwell Intel® Xeon® E5-2690 v4 CPUs with 28 physical cores in total.
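The abstract mentions using a roofline model to classify kernels such as the memory-bound interaction vector computation. As a minimal sketch of that idea (not the authors' implementation), the standard roofline bound takes attainable performance as the smaller of the compute peak and the product of arithmetic intensity and memory bandwidth. The function names and the hardware figures below are hypothetical placeholders chosen for illustration, not measured values for any particular GPU.

```python
def attainable_gflops(intensity, peak_gflops, bandwidth_gbs):
    """Roofline bound in GFLOP/s.

    intensity     -- arithmetic intensity of the kernel, in FLOP per byte
    peak_gflops   -- peak compute throughput of the device, in GFLOP/s
    bandwidth_gbs -- peak memory bandwidth of the device, in GB/s
    """
    # The memory roof grows linearly with intensity; the compute roof is flat.
    return min(peak_gflops, intensity * bandwidth_gbs)


def is_memory_bound(intensity, peak_gflops, bandwidth_gbs):
    """A kernel is memory-bound when it sits left of the ridge point."""
    ridge_point = peak_gflops / bandwidth_gbs  # FLOP per byte
    return intensity < ridge_point


# Hypothetical device figures (assumptions for illustration only).
PEAK_GFLOPS = 5000.0
BANDWIDTH_GBS = 700.0

if __name__ == "__main__":
    # Two hypothetical kernels with low and high arithmetic intensity.
    for name, ai in [("low-intensity kernel", 0.5), ("high-intensity kernel", 20.0)]:
        bound = attainable_gflops(ai, PEAK_GFLOPS, BANDWIDTH_GBS)
        kind = "memory-bound" if is_memory_bound(ai, PEAK_GFLOPS, BANDWIDTH_GBS) else "compute-bound"
        print(f"{name}: AI = {ai} FLOP/byte, bound = {bound:.1f} GFLOP/s, {kind}")
```

Under this model, a memory-bound kernel gains more from reducing memory transactions (for example through better data ordering) than from reducing arithmetic, which is consistent with the optimization strategy the abstract describes.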
Publisher
Database: Elsevier - ScienceDirect
Journal: Computers & Fluids - Volume 171, 30 July 2018, Pages 79-93
Authors
, , , , ,