کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
6922457 | 865031 | 2016 | 10 صفحه PDF | دانلود رایگان |
عنوان انگلیسی مقاله ISI
A comparison of native GPU computing versus OpenACC for implementing flow-routing algorithms in hydrological applications
دانلود مقاله + سفارش ترجمه
دانلود مقاله ISI انگلیسی
رایگان برای ایرانیان
موضوعات مرتبط
مهندسی و علوم پایه
مهندسی کامپیوتر
نرم افزارهای علوم کامپیوتر
پیش نمایش صفحه اول مقاله

چکیده انگلیسی
In recent years GPU computing has gained wide acceptance as a simple low-cost solution for speeding up computationally expensive processing in many scientific and engineering applications. However, in most cases accelerating a traditional CPU implementation for a GPU is a non-trivial task that requires a thorough refactorization of the code and specific optimizations that depend on the architecture of the device. OpenACC is a promising technology that aims at reducing the effort required to accelerate C/C++/Fortran code on an attached multicore device. Virtually with this technology the CPU code only has to be augmented with a few compiler directives to identify the areas to be accelerated and the way in which data has to be moved between the CPU and GPU. Its potential benefits are multiple: better code readability, less development time, lower risk of errors and less dependency on the underlying architecture and future evolution of the GPU technology. Our aim with this work is to evaluate the pros and cons of using OpenACC against native GPU implementations in computationally expensive hydrological applications, using the classic D8 algorithm of O'Callaghan and Mark for river network extraction as case-study. We implemented the flow accumulation step of this algorithm in CPU, using OpenACC and two different CUDA versions, comparing the length and complexity of the code and its performance with different datasets. We advance that although OpenACC can not match the performance of a CUDA optimized implementation (Ã3.5 slower in average), it provides a significant performance improvement against a CPU implementation (Ã2-6) with by far a simpler code and less implementation effort.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Computers & Geosciences - Volume 87, February 2016, Pages 91-100
Journal: Computers & Geosciences - Volume 87, February 2016, Pages 91-100
نویسندگان
Antonio J. Rueda, José M. Noguera, Adrián Luque,