A comparison of CPU and GPU implementations for solving the Convection Diffusion equation using the local Modified SOR method

Article code	Journal code	Publication year	English article	Full-text version
10358587	868543	2014	13-page PDF	Free download

Keywords

AVX; Iterative methods; GPU computing; CUDA

Related topics

Engineering and Basic Sciences, Computer Engineering, Computer Science Software

Abstract

In this paper we study a parallel form of the SOR method for the numerical solution of the Convection Diffusion equation, suitable for GPUs using CUDA. To exploit the parallelism offered by GPUs we consider the fine-grained parallelism model. This is achieved by considering the local relaxation version of SOR. More specifically, we use SOR with red-black ordering and two sets of parameters, $\omega_{1,ij}$ and $\omega_{2,ij}$, for the 5-point stencil. The parameter $\omega_{1,ij}$ is associated with each red (i + j even) grid point (i, j), whereas the parameter $\omega_{2,ij}$ is associated with each black (i + j odd) grid point (i, j). The use of a parameter for each grid point avoids the global communication required in the adaptive determination of the best value of $\omega$ and also increases the convergence rate of the SOR method (Varga, 1962) [38] and (Young, 1971) [41]. We present our strategy and the results of our effort to exploit the computational capabilities of GPUs under the CUDA environment. Additionally, two parallel CPU programs using manual SSE2 (Streaming SIMD Extensions 2) and AVX (Advanced Vector Extensions) vectorization were developed as performance references. The optimizations applied to the GPU version were also considered for the CPU version. Significant performance improvement was achieved with all three developed GPU kernels, which differ in the degree of recomputation and therefore in the ratio of flops per element access.
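
The red-black scheme with per-point relaxation factors described in the abstract can be illustrated with a small serial sketch. It is not the authors' CUDA, SSE2 or AVX code. It assumes the model problem u_xx + u_yy - f*u_x - g*u_y = 0 with constant f and g, discretised by central differences on a uniform grid, and it fills the two parameter sets with constant placeholder values (1.3 and 1.4) because the paper's formulas for $\omega_{1,ij}$ and $\omega_{2,ij}$ are not reproduced on this page. All function names are hypothetical.

```python
def central_difference_coefficients(f, g, h):
    """West, east, south, north weights of the 5-point stencil.

    Assumed model problem: u_xx + u_yy - f*u_x - g*u_y = 0, central differences.
    """
    return ((1 + f * h / 2) / 4, (1 - f * h / 2) / 4,
            (1 + g * h / 2) / 4, (1 - g * h / 2) / 4)


def red_black_local_sor(u, coef, omega_red, omega_black, sweeps):
    """Run `sweeps` red-black SOR iterations in place on the interior of u.

    u           : (n+2) x (n+2) list of lists; boundary entries stay fixed
    omega_red   : per-point factors used where i + j is even (red points)
    omega_black : per-point factors used where i + j is odd (black points)
    """
    west, east, south, north = coef
    n = len(u) - 2
    for _ in range(sweeps):
        # Colour 0 = red (i + j even), colour 1 = black (i + j odd).
        # Points of one colour depend only on the other colour, so each
        # half-sweep could update all of its points in parallel.
        for colour, omega in ((0, omega_red), (1, omega_black)):
            for i in range(1, n + 1):
                # First interior j with (i + j) % 2 == colour.
                j_start = 1 if (i + 1) % 2 == colour else 2
                for j in range(j_start, n + 1, 2):
                    stencil = (west * u[i - 1][j] + east * u[i + 1][j]
                               + south * u[i][j - 1] + north * u[i][j + 1])
                    w = omega[i][j]
                    u[i][j] = (1.0 - w) * u[i][j] + w * stencil
    return u


def demo(n=8, f=2.0, g=-2.0, sweeps=500):
    """Solve with boundary data u = x + y and return the maximum error.

    With f = -g, u = x + y also satisfies the discrete equations exactly,
    so the error measures only how far the iteration is from convergence.
    """
    h = 1.0 / (n + 1)
    exact = [[(i + j) * h for j in range(n + 2)] for i in range(n + 2)]
    u = [[exact[i][j] if i in (0, n + 1) or j in (0, n + 1) else 0.0
          for j in range(n + 2)] for i in range(n + 2)]
    # Placeholder relaxation factors (an assumption, not the paper's values).
    omega_red = [[1.3] * (n + 2) for _ in range(n + 2)]
    omega_black = [[1.4] * (n + 2) for _ in range(n + 2)]
    red_black_local_sor(u, central_difference_coefficients(f, g, h),
                        omega_red, omega_black, sweeps)
    return max(abs(u[i][j] - exact[i][j])
               for i in range(n + 2) for j in range(n + 2))
```

Calling `demo()` returns the maximum deviation from the linear solution after 500 sweeps. Because every point of one colour reads only neighbours of the other colour, each half-sweep maps naturally onto one GPU thread per grid point, which is the fine-grained parallelism the abstract refers to.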

Publisher

Database: Elsevier - ScienceDirect
Journal: Parallel Computing - Volume 40, Issue 7, July 2014, Pages 173-185

Authors

Yiannis Cotronis, Elias Konstantinidis, Maria A. Louka, Nikolaos M. Missirlis
