Generating data transfers for distributed GPU parallel programs

Article code	Journal code	Publication year	English article	Full-text version
432377	688869	2013	12-page PDF	Free download

Keywords

Parallel execution; data transfer; distributed memory; GPU (graphics processing unit)

Related topics

Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics

Abstract

• We automatically generate heterogeneous communications for distributed-memory architectures.
• Communication generation is based on static compiler analysis and runtime decisions.
• Accurate heterogeneous communications are generated for regular applications.
• Heterogeneous communications deal with accelerator-based GPU data transfers and message-passing for transfers between CPUs.

Nowadays, high-performance applications exploit multi-level architectures, due to the presence of hardware accelerators such as GPUs inside each computing node. Data transfers occur at two different levels: inside the computing node, between the CPU and the accelerators, and between computing nodes. We consider the case where intra-node parallelism is handled with HMPP compiler directives and message-passing programming with MPI is used for the inter-node communications. Programming such a heterogeneous architecture this way is costly and error-prone. In this paper, we specifically demonstrate the transformation of HMPP programs designed to exploit a single computing node equipped with a GPU into a heterogeneous HMPP + MPI program exploiting multiple GPUs located on different computing nodes. The STEP tool focuses on generating communications, combining powerful static analyses with runtime execution to reduce the volume of communications. Our source-to-source transformation is implemented inside the PIPS workbench. We detail the generated source program of the Jacobi kernel and show that the execution times and speedups are encouraging. Finally, we give some directions for the improvement of the tool.
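To make the two levels of data transfer concrete, here is a minimal sketch of a 1D Jacobi iteration split across several simulated nodes. It is not the code STEP generates and it uses neither HMPP nor MPI: plain Python lists stand in for per-node arrays, and list assignments stand in for the messages. The names `jacobi_step` and `distributed_jacobi` are hypothetical, and the comments only mark where, under these assumptions, CPU/GPU transfers and inter-node messages would sit.

```python
def jacobi_step(u):
    """One Jacobi sweep on a 1D array; the first and last cells stay fixed."""
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1])
    return new


def distributed_jacobi(u, nparts, steps):
    """Simulate `nparts` nodes, each owning a block of interior cells
    plus one halo cell on each side. Requires nparts <= len(u) - 2."""
    interior = len(u) - 2
    # bounds[p] is the first global index owned by node p.
    bounds = [1 + (interior * p) // nparts for p in range(nparts + 1)]
    # Local array of node p: [left halo] + owned cells + [right halo].
    local = [u[bounds[p] - 1:bounds[p + 1] + 1] for p in range(nparts)]

    for _ in range(steps):
        # Compute phase: in the setting of the abstract, this sweep would
        # run on each node's GPU.
        local = [jacobi_step(block) for block in local]

        # Communication phase: only block edges move. On a real system the
        # edge cells would first be copied from GPU to host, then sent to
        # the neighbouring node, then copied back to that node's GPU.
        for p in range(nparts - 1):
            local[p][-1] = local[p + 1][1]   # right halo <- neighbour's first owned cell
            local[p + 1][0] = local[p][-2]   # left halo  <- neighbour's last owned cell

    # Gather the owned cells back into one global array.
    result = list(u)
    for p in range(nparts):
        result[bounds[p]:bounds[p + 1]] = local[p][1:-1]
    return result


# Usage: the partitioned run should match the single-array run.
u0 = [0.0] * 9 + [1.0]
reference = u0
for _ in range(5):
    reference = jacobi_step(reference)
print(distributed_jacobi(u0, 3, 5) == reference)
```

Only two cells per neighbouring pair are exchanged at each iteration, which illustrates why limiting transfers to the data actually needed by other nodes reduces the communication volume.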

Publisher

Database: Elsevier - ScienceDirect
Journal: Journal of Parallel and Distributed Computing - Volume 73, Issue 12, December 2013, Pages 1649–1660

Authors

F. Silber-Chaussumier, A. Muller, R. Habel
