کد مقاله کد نشریه سال انتشار مقاله انگلیسی ترجمه فارسی نسخه تمام متن
4967048 1365153 2018 19 صفحه PDF سفارش دهید دانلود کنید
عنوان انگلیسی مقاله
A GPU-accelerated semi-implicit fractional-step method for numerical solutions of incompressible Navier-Stokes equations
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نرم افزارهای علوم کامپیوتر
پیش نمایش صفحه اول مقاله
A GPU-accelerated semi-implicit fractional-step method for numerical solutions of incompressible Navier-Stokes equations
چکیده انگلیسی

Utility of the computational power of Graphics Processing Units (GPUs) is elaborated for solutions of incompressible Navier-Stokes equations which are integrated using a semi-implicit fractional-step method. The Alternating Direction Implicit (ADI) and the Fourier-transform-based direct solution methods used in the semi-implicit fractional-step method take advantage of multiple tridiagonal matrices whose inversion is known as the major bottleneck for acceleration on a typical multi-core machine. A novel implementation of the semi-implicit fractional-step method designed for GPU acceleration of the incompressible Navier-Stokes equations is presented. Aspects of the programing model of Compute Unified Device Architecture (CUDA), which are critical to the bandwidth-bound nature of the present method are discussed in detail. A data layout for efficient use of CUDA libraries is proposed for acceleration of tridiagonal matrix inversion and fast Fourier transform. OpenMP is employed for concurrent collection of turbulence statistics on a CPU while the Navier-Stokes equations are computed on a GPU. Performance of the present method using CUDA is assessed by comparing the speed of solving three tridiagonal matrices using ADI with the speed of solving one heptadiagonal matrix using a conjugate gradient method. An overall speedup of 20 times is achieved using a Tesla K40 GPU in comparison with a single-core Xeon E5-2660 v3 CPU in simulations of turbulent boundary-layer flow over a flat plate conducted on over 134 million grids. Enhanced performance of 48 times speedup is reached for the same problem using a Tesla P100 GPU.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Journal of Computational Physics - Volume 352, 1 January 2018, Pages 246-264
نویسندگان
, , ,