Article code: 4951567
Journal code: 1441478
Publication year: 2017
English article: 29 pages, PDF
Full text: free download
English title of the ISI article
A quantitative roofline model for GPU kernel performance estimation using micro-benchmarks and hardware metric profiling
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
English abstract
The execution time of a kernel on a GPU is typically difficult to predict, as it depends on a wide range of factors. Performance can be limited by memory transfer, compute throughput, or other latencies. In this paper, we improve on the roofline model by following a quantitative approach and present a completely automated GPU performance prediction technique. The model uses micro-benchmarking and profiling in a “black box” fashion, so no inspection of source or binary code is required. It combines the resulting parameters to characterize the performance-limiting factor and to estimate execution time. In addition, we propose the quadrant-split visual representation, which captures the characteristics of multiple processors in relation to a particular kernel. We performed experiments on a stencil computation (red/black SOR), SGEMM, and a total of 28 kernels of the Rodinia benchmark suite, using six CUDA GPUs, and observed an average absolute prediction error of 27.66%. Furthermore, the performance model was also examined on an AMD GPU through the HIP programming environment. Prediction errors were comparable despite the significant architectural differences between the two vendors' GPUs.
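As a rough illustration of the idea the abstract builds on, the following minimal Python sketch applies the classical roofline bound: a kernel's time is estimated as the larger of its compute time and its memory-transfer time. This is not the paper's quantitative model, which also draws on micro-benchmarks and profiled hardware metrics. The names `Device`, `roofline_estimate`, and `absolute_percentage_error`, as well as all numeric values, are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical device description (values are assumptions, not measurements)."""
    peak_gflops: float  # peak compute throughput in GFLOP/s
    peak_gbps: float    # peak memory bandwidth in GB/s


def roofline_estimate(flops: float, bytes_moved: float, dev: Device):
    """Classical roofline bound: return (estimated seconds, limiting factor)."""
    compute_time = flops / (dev.peak_gflops * 1e9)
    memory_time = bytes_moved / (dev.peak_gbps * 1e9)
    if compute_time >= memory_time:
        return compute_time, "compute"
    return memory_time, "memory"


def absolute_percentage_error(predicted: float, measured: float) -> float:
    """Absolute prediction error as a percentage of the measured time."""
    return abs(predicted - measured) / measured * 100.0


# Hypothetical GPU and kernel: 1 GFLOP of work touching 4 GB of data.
gpu = Device(peak_gflops=4000.0, peak_gbps=200.0)
est_time, limiter = roofline_estimate(flops=1e9, bytes_moved=4e9, dev=gpu)
print(f"estimated time: {est_time:.4f} s, limited by {limiter}")
```

In this sketch the limiting factor is simply whichever of the two times is larger; latencies other than memory transfer and compute throughput, which the abstract lists as a third possible limiter, are not modeled here.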
Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Parallel and Distributed Computing - Volume 107, September 2017, Pages 37-56