Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
460863 | Journal of Systems Architecture | 2006 | 12 Pages | |
The paper concerns parallel computations with communication based on Remote Direct Memory Access (RDMA), which provides low-level, unbuffered access to the distributed memory of computational nodes. Fine-grain computation involves very frequent transmissions of small messages. To execute such transmissions efficiently with RDMA communication, a special memory infrastructure, rotating buffers (RB), is proposed. Their organization is adjusted to the program's needs in advance, before program execution. With additional synchronization between the involved processes, this allows intensive use of all communication resources available in the system. The proposed method is illustrated with a typical fine-grain problem, the discrete Fast Fourier Transform (FFT). “The Transpose Algorithm” of FFT has been implemented with the RDMA rotating buffers, and its efficiency is compared with that of a solution based on the standard message-passing library MPI.
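
For readers unfamiliar with the transpose formulation of the FFT named in the abstract, the following is a minimal single-process sketch in Python of one common variant (row transforms, twiddle multiplication, transpose, row transforms). It is an illustration under stated assumptions, not the paper's implementation: it uses no RDMA, no rotating buffers, and no MPI, and the names `dft`, `transpose`, and `transpose_fft` are hypothetical. The local transform is a naive DFT chosen for brevity. In a distributed setting, the transpose step is where the all-to-all exchange of many small messages would take place.

```python
import cmath


def dft(x):
    """Naive O(n^2) DFT; stands in for the local per-row FFT and serves as a reference."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def transpose(m):
    """Matrix transpose; in a parallel program this is the communication step."""
    return [list(col) for col in zip(*m)]


def transpose_fft(x, rows, cols):
    """DFT of x (length rows*cols) computed via the transpose formulation."""
    n = rows * cols
    if len(x) != n:
        raise ValueError("len(x) must equal rows * cols")
    # Row a holds x[a], x[a + rows], x[a + 2*rows], ... (cols elements).
    grid = [x[a::rows] for a in range(rows)]
    # Step 1: independent transforms along each row (local work, no communication).
    grid = [dft(row) for row in grid]
    # Step 2: multiply element (a, q) by the twiddle factor exp(-2*pi*i*a*q/n).
    grid = [
        [grid[a][q] * cmath.exp(-2j * cmath.pi * a * q / n) for q in range(cols)]
        for a in range(rows)
    ]
    # Step 3: transpose so that each former column becomes a row.
    grid = transpose(grid)
    # Step 4: independent transforms along each new row (local work again).
    grid = [dft(row) for row in grid]
    # Gather: output index k = cols * p + q is found at grid[q][p].
    out = [0j] * n
    for q in range(cols):
        for p in range(rows):
            out[cols * p + q] = grid[q][p]
    return out
```

Calling `transpose_fft(x, 2, 4)` on an 8-element list should agree with `dft(x)` up to floating-point rounding. Only the transpose in step 3 moves data between rows, which is why the efficiency of that exchange dominates the communication cost of this formulation.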