Article ID: 6873226
Journal: Future Generation Computer Systems
Published Year: 2018
Pages: 8
File Type: PDF
Abstract
An efficient thread scheduling method is a promising way to alleviate the problems and to boost performance. From the hardware perspective, instructions are executed by warps, each made up of a fixed number of threads. We therefore propose a novel warp scheduling scheme that maintains data locality and relieves cache pollution and thrashing. First, to make full use of temporal locality, we place the unordered warps into a supervised warp queue and issue them from oldest to youngest. Second, to exploit spatial locality and to hide computation unit stalls, we put forward a new insertion method called LPI (Locality Protected Insertion), which reorders warps in the supervised warp queue so that long-latency warps are better hidden by warps performing short-latency work such as ALU operations and on-chip accesses. Over a wide variety of applications, the new scheduling method improves performance by up to 10.1%, and by 2.2% on average, over the baseline loose round-robin scheduling.
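The difference between the baseline and the oldest-first issue order described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the paper's scheduler: the `Warp` fields and both function names are hypothetical, readiness is reduced to a single flag, and the LPI reordering step is not modelled because the abstract does not specify its rule.

```python
from dataclasses import dataclass


@dataclass
class Warp:
    wid: int      # warp identifier; also its slot index in this toy model (assumption)
    age: int      # cycle at which the warp was assigned; smaller means older
    ready: bool   # True if the warp can issue an instruction this cycle


def pick_loose_round_robin(warps, last_issued):
    """Baseline sketch: scan from the slot after last_issued, skipping stalled warps."""
    n = len(warps)
    for step in range(1, n + 1):
        candidate = warps[(last_issued + step) % n]
        if candidate.ready:
            return candidate.wid
    return None  # every warp is stalled


def pick_oldest_first(warps):
    """Queue-style sketch: issue the oldest warp that is ready."""
    ready = [w for w in warps if w.ready]
    if not ready:
        return None  # every warp is stalled
    return min(ready, key=lambda w: w.age).wid


# Hypothetical warp pool: warp 1 is stalled, warp 3 is the oldest.
warps = [
    Warp(wid=0, age=30, ready=True),
    Warp(wid=1, age=10, ready=False),
    Warp(wid=2, age=20, ready=True),
    Warp(wid=3, age=5, ready=True),
]

print(pick_loose_round_robin(warps, last_issued=0))  # next ready slot after slot 0
print(pick_oldest_first(warps))                      # oldest ready warp
```

In this pool, loose round-robin moves on to the next ready slot regardless of age, whereas the oldest-first policy keeps returning to the same old warps while they remain ready, which is the behavior the abstract associates with preserving temporal locality.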
Related Topics
Physical Sciences and Engineering Computer Science Computational Theory and Mathematics