کد مقاله کد نشریه سال انتشار مقاله انگلیسی ترجمه فارسی نسخه تمام متن
4967048 1365153 2018 19 صفحه PDF ندارد دانلود کنید
عنوان انگلیسی مقاله
A GPU-accelerated semi-implicit fractional-step method for numerical solutions of incompressible Navier–Stokes equations
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نرم افزارهای علوم کامپیوتر
پیش نمایش صفحه اول مقاله
A GPU-accelerated semi-implicit fractional-step method for numerical solutions of incompressible Navier–Stokes equations
چکیده انگلیسی

Utility of the computational power of Graphics Processing Units (GPUs) is elaborated for solutions of incompressible Navier–Stokes equations which are integrated using a semi-implicit fractional-step method. The Alternating Direction Implicit (ADI) and the Fourier-transform-based direct solution methods used in the semi-implicit fractional-step method take advantage of multiple tridiagonal matrices whose inversion is known as the major bottleneck for acceleration on a typical multi-core machine. A novel implementation of the semi-implicit fractional-step method designed for GPU acceleration of the incompressible Navier–Stokes equations is presented. Aspects of the programing model of Compute Unified Device Architecture (CUDA), which are critical to the bandwidth-bound nature of the present method are discussed in detail. A data layout for efficient use of CUDA libraries is proposed for acceleration of tridiagonal matrix inversion and fast Fourier transform. OpenMP is employed for concurrent collection of turbulence statistics on a CPU while the Navier–Stokes equations are computed on a GPU. Performance of the present method using CUDA is assessed by comparing the speed of solving three tridiagonal matrices using ADI with the speed of solving one heptadiagonal matrix using a conjugate gradient method. An overall speedup of 20 times is achieved using a Tesla K40 GPU in comparison with a single-core Xeon E5-2660 v3 CPU in simulations of turbulent boundary-layer flow over a flat plate conducted on over 134 million grids. Enhanced performance of 48 times speedup is reached for the same problem using a Tesla P100 GPU.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Journal of Computational Physics - Volume 352, 1 January 2018, Pages 246-264
نویسندگان
, , ,