Article code | Journal code | Publication year | English article | Full-text version
524645 | 868800 | 2013 | 17-page PDF | Free download
English title of the ISI article
An application-centric evaluation of OpenCL on multi-core CPUs
Related topics
Engineering and basic sciences, computer engineering, computer science software
English abstract


• We provide a detailed analysis of five OpenCL applications on three multi-core CPUs.
• We identify three main reasons for the performance gaps relative to the reference OpenMP implementations.
• We present efficient performance tuning solutions, and quantify their impacts.
• We show that generic tuning leads to OpenCL performing on par with or better than OpenMP.
• The potential users of our findings are OpenCL application and compiler developers.

Although designed as a cross-platform parallel programming model, OpenCL remains mainly used for GPU programming. Nevertheless, a large number of applications are parallelized, implemented, and eventually optimized in OpenCL. Thus, in this paper, we focus on the potential that these parallel applications have to exploit the performance of multi-core CPUs. Specifically, we analyze a method for systematically reusing and adapting OpenCL code from GPUs to CPUs. We claim that this work is a necessary step toward enabling inter-platform performance portability in OpenCL.

Our method is based on iterative tuning: given an application, we choose a reasonable OpenMP implementation as a performance reference and systematically tune the OpenCL code to reach or exceed this threshold. In the process, we identify the factors that significantly impact the performance of the OpenCL code. We apply this method to five different applications, selected from the Rodinia benchmark suite (which provides equivalent OpenMP and OpenCL implementations), and conduct a series of thorough evaluations with different datasets on three different multi-core platforms. We find that OpenCL performance on CPUs is affected by typical, hard-coded GPU optimizations (unsuitable for multi-core CPUs), by the fine-grained parallelism of the model, and by immature OpenCL compilers. Systematically fixing these issues allowed OpenCL to match or exceed OpenMP's performance, showing that it can be a good option for programming multi-core CPUs.

Publisher
Database: Elsevier - ScienceDirect
Journal: Parallel Computing - Volume 39, Issue 12, December 2013, Pages 834–850