Article ID: 460863
Journal: Journal of Systems Architecture
Published Year: 2006
Pages: 12 Pages
File Type: PDF
Abstract

This paper concerns parallel computations in which communication is based on Remote Direct Memory Access (RDMA), which provides low-level, unbuffered access to the distributed memory of computational nodes. Fine-grain computation involves very frequent transmissions of small messages. To execute such transmissions efficiently over RDMA, a special memory infrastructure, rotating buffers (RB), is proposed. The organization of the buffers is adjusted to the program's needs in advance, before program execution. It allows intensive use of all communication resources available in the system, relying on additional synchronization between the processes involved. The method is illustrated with a typical fine-grain problem, the discrete Fast Fourier Transform (FFT). The “Transpose Algorithm” for the FFT has been implemented with RDMA rotating buffers, and its efficiency is compared with that of a solution based on the standard message-passing library MPI.
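
The abstract does not describe how the rotating buffers are laid out, so the following is only a minimal single-process sketch of one plausible reading: a fixed ring of receive slots, sized before the run, into which a sender writes directly, with a per-slot "full" flag standing in for the additional synchronization. The class `RotatingBuffers`, its methods `remote_write` and `consume`, and the slot counts are hypothetical names chosen for illustration. No real RDMA library is used, and this is not the paper's implementation.

```python
class RotatingBuffers:
    """Toy model of a receiver-side ring of fixed-size slots (hypothetical)."""

    def __init__(self, num_slots, slot_size):
        # Slots are allocated once, up front, mirroring the idea that the
        # buffer organization is fixed before program execution.
        self.slots = [bytearray(slot_size) for _ in range(num_slots)]
        self.lengths = [0] * num_slots
        self.full = [False] * num_slots  # stands in for the extra synchronization
        self.write_idx = 0  # next slot the sender targets
        self.read_idx = 0   # next slot the receiver consumes

    def remote_write(self, payload):
        """Model a direct write into the next slot, with no intermediate copy.

        Returns False if that slot has not been consumed yet, in which
        case the sender would have to wait and retry.
        """
        i = self.write_idx
        if len(payload) > len(self.slots[i]):
            raise ValueError("payload larger than slot")
        if self.full[i]:
            return False
        self.slots[i][:len(payload)] = payload
        self.lengths[i] = len(payload)
        self.full[i] = True
        self.write_idx = (i + 1) % len(self.slots)  # rotate to the next slot
        return True

    def consume(self):
        """Return the oldest unread message, or None if no slot is full."""
        i = self.read_idx
        if not self.full[i]:
            return None
        data = bytes(self.slots[i][:self.lengths[i]])
        self.full[i] = False  # releases the slot back to the sender
        self.read_idx = (i + 1) % len(self.slots)
        return data
```

With two slots, a sender can post two small messages back to back before it must wait for the receiver to free a slot. This is the kind of overlap between sending and consuming that a ring of pre-registered buffers is usually meant to provide for frequent small transfers.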
