کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
460551 696393 2014 11 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Improving branch divergence performance on GPGPU with a new PDOM stack and multi-level warp scheduling
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر شبکه های کامپیوتری و ارتباطات
پیش نمایش صفحه اول مقاله
Improving branch divergence performance on GPGPU with a new PDOM stack and multi-level warp scheduling
چکیده انگلیسی

General-purpose graphics processing unit (GPGPU) plays an important role in massive parallel computing nowadays. A GPGPU core typically holds thousands of threads, where hardware threads are organized into warps. With the single instruction multiple thread (SIMT) pipeline, GPGPU can achieve high performance. But threads taking different branches in the same warp violate SIMD style and cause branch divergence. To support this, a hardware stack is used to sequentially execute all branches. Hence branch divergence leads to performance degradation. This article represents the PDOM (post dominator) stack as a binary tree, and each leaf corresponds to a branch target. We propose a new PDOM stack called PDOM-ASI, which can schedule all the tree leaves. The new stack can hide more long operation latencies with more schedulable warps without the problem of warp over-subdivision. Besides, a multi-level warp scheduling policy is proposed, which lets part of the warps run ahead and creates more opportunities to hide the latencies. The simulation results show that our policies achieve 10.5% performance improvements over baseline policies with only 1.33% hardware area overhead.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Journal of Systems Architecture - Volume 60, Issue 5, May 2014, Pages 420–430
نویسندگان
, , , ,