Thread scheduling and memory coalescing for dynamic vectorization of SPMD workloads

Article code	Journal code	Publication year	English article
523902	868525	2014	11-page PDF

Keywords

SMT; MMT; SIMD; High performance; Computer architecture; Parallelism

Related topics

Engineering and Basic Sciences, Computer Engineering, Computer Science Software

Abstract

• We propose a new thread synchronization heuristic called Min-SP/PC.
• Min-SP/PC handles function calls better than previous algorithms.
• Many instructions in SPMD programs are identical across threads.
• Many memory accesses are either uniform or affine across threads.

Simultaneous Multi-Threading (SMT) is a hardware model in which different threads share the same processing unit. This model is a compromise between high parallelism and low hardware cost. Minimal Multi-Threading (MMT) is a recently proposed architecture that shares instruction decoding and execution between threads running the same program in an SMT processor, thereby generalizing the approach followed by Graphics Processing Units to general-purpose processors. In this paper we propose new ways to expose redundancies in the MMT execution model. First, we propose and evaluate a new thread reconvergence heuristic that handles function calls better than previous approaches. Our heuristic inspects only the program counter and the stack frame to reconverge threads; hence, it is amenable to efficient and inexpensive hardware implementation. Second, we demonstrate that this heuristic reveals substantial regularity in inter-thread memory access patterns. We validate our results on data-parallel applications from the PARSEC and SPLASH suites. Our new reconvergence heuristic increases the throughput of our MMT model by 7% compared with a previous, substantially more complex approach due to Long et al. Moreover, it gives us an effective way to increase regularity in memory accesses. We have observed that over 70% of simultaneous memory accesses are either the same for all the threads or affine expressions of the thread identifier. This observation motivates the design of newly proposed hardware that benefits from regularity in inter-thread memory accesses.
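
The two ideas in the abstract can be illustrated with a small software sketch. It is not the authors' hardware design. It assumes that Min-SP/PC gives priority to the threads with the lowest stack pointer (the deepest call nesting on a downward-growing stack) and breaks ties by the lowest program counter; the abstract only says that the heuristic inspects the program counter and the stack frame. The names ThreadState, select_min_sp_pc and classify_access are hypothetical, and the access classifier assumes that the list index is the thread identifier.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ThreadState:
    tid: int  # thread identifier
    pc: int   # program counter
    sp: int   # stack pointer (assumed to grow downward)


def select_min_sp_pc(threads: List[ThreadState]) -> Tuple[Tuple[int, int], List[int]]:
    """Pick the (sp, pc) pair to run next and the threads that share it.

    Assumed ordering: lowest SP first, then lowest PC. Threads with the
    same (sp, pc) pair can execute the selected instruction together.
    """
    if not threads:
        raise ValueError("no runnable threads")
    key = min((t.sp, t.pc) for t in threads)
    group = [t.tid for t in threads if (t.sp, t.pc) == key]
    return key, group


def classify_access(addresses: List[int]) -> str:
    """Classify one simultaneous memory access across threads.

    addresses[i] is the address issued by thread i. The access is
    'uniform' if all addresses are equal, 'affine' if
    addresses[i] == base + i * stride for a fixed stride, and
    'irregular' otherwise.
    """
    if not addresses:
        raise ValueError("no addresses")
    if len(set(addresses)) == 1:
        return "uniform"
    base = addresses[0]
    stride = addresses[1] - addresses[0]
    if all(a == base + i * stride for i, a in enumerate(addresses)):
        return "affine"
    return "irregular"
```

In this sketch, threads that have entered a function (lower stack pointer) are scheduled before threads still in the caller, which gives the lagging threads a chance to reach the same (sp, pc) pair and rejoin the group. A uniform or affine access can be described by a base address and a stride rather than by one address per thread, which is the kind of regularity the abstract refers to.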

Publisher

Database: Elsevier - ScienceDirect
Journal: Parallel Computing - Volume 40, Issue 9, October 2014, Pages 548–558

Authors

Teo Milanez, Sylvain Collange, Fernando Magno Quintão Pereira, Wagner Meira Jr., Renato Ferreira
