Article code: 523876
Journal code: 868516
Publication year: 2015
English article: 28 pages (PDF)
Full-text version: free download
English title of the ISI article
On the design of a new dynamic credit-based end-to-end flow control mechanism for HPC clusters
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Science Software
English abstract


• Adaptation of the credit-based flow-control mechanism to the EXTOLL interconnect.
• In-depth evaluation of the previous flow control including the use of piggybacking.
• Design of a lightweight dynamic credit-based flow-control mechanism for clusters.
• Thorough performance evaluation of the new dynamic flow control.
• Analysis of the memory footprint of both flow control versions.

High-performance computing (HPC) applications usually rely on messaging libraries such as MPI, GASNet, or OpenSHMEM to exchange data among processes in large-scale clusters. These libraries in turn use specialized low-level network layers to extract as much performance as possible from hardware interconnects such as InfiniBand or 40 Gb Ethernet. EXTOLL is an emerging network targeted at high-performance clusters.

Specialized low-level network layers require some kind of flow control to prevent buffer overflows at the receiver side. In this paper we present a new end-to-end flow control mechanism that dynamically adapts, at execution time, the buffer resources used by a process according to the communication pattern of the parallel application and the varying activity among communicating peers. Tests carried out on a 64-node, 1024-core EXTOLL cluster show that the new dynamic flow control mechanism incurs very low overhead while achieving very high buffer efficiency: overall buffer resources are reduced by 4× with respect to the buffers required by a static flow control protocol that achieves similarly low overhead.
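
To make the idea of credit-based end-to-end flow control concrete, here is a minimal Python sketch. It is not the mechanism from the paper and not EXTOLL's API: the class names, the `max_buffers` bound, and the rule "grant one extra buffer whenever the sender reports a stall" are all assumptions chosen for illustration. It shows only the basic invariant (a sender may transmit only while it holds a credit, so the receiver's buffers cannot overflow) and one simple way the per-peer buffer count could grow at run time.

```python
from collections import deque


class Receiver:
    """Owns the receive buffers for one peer; one credit equals one free buffer."""

    def __init__(self, initial_buffers, max_buffers):
        self.capacity = initial_buffers  # buffers currently reserved for this peer
        self.max_buffers = max_buffers   # hypothetical upper bound on growth
        self.queue = deque()             # messages waiting to be consumed

    def deliver(self, msg):
        # With correct credit accounting this check never fails.
        assert len(self.queue) < self.capacity, "receive buffer overflow"
        self.queue.append(msg)

    def consume(self, sender_was_stalled):
        """Consume one message and return the number of credits to send back."""
        self.queue.popleft()
        credits = 1  # the buffer that was just freed
        # Illustrative growth rule (an assumption, not the paper's policy):
        # if the sender ran out of credits, reserve one more buffer for it.
        if sender_was_stalled and self.capacity < self.max_buffers:
            self.capacity += 1
            credits += 1
        return credits


class Sender:
    """Sends only while it holds at least one credit."""

    def __init__(self, credits):
        self.credits = credits
        self.stalled = False

    def try_send(self, receiver, msg):
        if self.credits == 0:
            self.stalled = True  # remember that we had to wait
            return False
        self.credits -= 1
        receiver.deliver(msg)
        return True

    def add_credits(self, n):
        self.credits += n
        self.stalled = False
```

In this sketch the sender's credits plus the receiver's queued messages always equal the receiver's current capacity, which is why `deliver` can never overflow. A static protocol corresponds to never growing `capacity`; a real dynamic scheme would also need a way to shrink it for idle peers.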

Publisher
Database: Elsevier - ScienceDirect
Journal: Parallel Computing - Volume 46, July 2015, Pages 32–59