Article code: 432987
Journal code: 689190
Publication year: 2016
English article: 15-page PDF
Full-text version: free download
English title of the ISI article
Embedding the optimal all-to-all personalized exchange on multistage interconnection networks++
Persian translation of the title
تعبیه کل مبادله شخصی به شبکه های اتصال چند مرحله ای ++ به طور مطلوب
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
English abstract


• Embedding the optimal all-to-all personalized exchange (ATAPE) on MINs++ is proposed for fast on-chip processing.
• A static ATAPE function (D = S XOR (C + order) mod N) and an f-in-1 dynamic ATAPE function (D = ρ[(S + C + order) mod N]) are embedded for full (S, D)-routing on MINs++.
• Existing ATAPE approaches rely on switch-/stage-control routing on MINs using either S- or D-routing.
• The correctness of the proposed ATAPE functions on d-nary-switch MINs++ and on a crossbar of MINs++ is proven.
• Experimental results for N = 64 to 16384 processors confirm a significant speedup.
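
The two destination functions in the highlights can be checked with a small simulation. The following is a minimal sketch, not the authors' implementation. It assumes that N is a power of two, that `order` is a fixed integer offset, and that ρ is an arbitrary permutation of 0..N-1; the abstract defines neither `order` nor ρ, so both readings are assumptions. The helper names `static_dest`, `dynamic_dest`, and `is_atape_schedule` are hypothetical.

```python
def static_dest(s, c, n, order=0):
    """Static function: D = S XOR ((C + order) mod N).

    Assumes n is a power of two and 0 <= s < n, so the XOR stays in range.
    """
    return s ^ ((c + order) % n)


def dynamic_dest(s, c, n, rho, order=0):
    """Dynamic function: D = rho[(S + C + order) mod N].

    rho is assumed to be a permutation of 0..n-1, given as a list.
    """
    return rho[(s + c + order) % n]


def is_atape_schedule(dest, n):
    """Check that dest(s, c) describes an all-to-all personalized exchange.

    Two conditions must hold for counter values c = 0..n-1:
    - in every round c, the sources map onto distinct destinations;
    - over all rounds, every source reaches every destination exactly once.
    """
    full = set(range(n))
    rounds_ok = all({dest(s, c) for s in range(n)} == full for c in range(n))
    sources_ok = all({dest(s, c) for c in range(n)} == full for s in range(n))
    return rounds_ok and sources_ok
```

For example, `is_atape_schedule(lambda s, c: static_dest(s, c, 8), 8)` checks the static function for N = 8. The sketch models only which destination each source addresses in each round; it does not model the switches, stages, or pipelining of a MIN++.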

All-to-all personalized exchange (ATAPE) is a communication process used to speed up parallel and distributed computing. Recently, ATAPE algorithms have been successfully applied to multistage interconnection networks (MINs), including baseline and butterfly networks. However, routing in those algorithms relies on switch patterns for stage control from the sources (S), which is only a half-routing solution because it cannot perform full self-routing with the (S, D) protocol on all MINs. In this paper, we first propose a full-routing solution that realizes ATAPE on a class of d-nary-switch MINs++ (i.e., baseline++, butterfly++, etc.). Our ATAPE can be embedded on-chip effectively not only for (S, D) self-routing but also for stage-/switch-control routing. Two embedded ATAPE functions, combined with multistage pipelining, are proposed in optimal O(N + log₂ N): (1) a (default) static function D = S XOR (C + order) mod N, and (2) an (optional) f-in-1 dynamic function D = ρ[(S + C + order) mod N], with the counter C incrementing from 0 to N − 1. Second, we introduce a crossbar of MINs++ with fewer delay stages to achieve the ultimate ATAPE embedding. Finally, experimental results of applying ATAPE on such MINs++ are reported, including ATAPE-based N×N matrix transposition in O(N + log₂ N), which yields a significant speedup.
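
The ATAPE-based matrix transposition mentioned in the abstract can be illustrated as pure data movement. This is a hedged sketch under stated assumptions, not the paper's algorithm: processor S is assumed to hold row S of the matrix, N is assumed to be a power of two, and the function name `atape_transpose` is hypothetical. In round C, source S sends element A[S][D] to the destination D given by the static function, and D stores it in column S of its new row.

```python
def atape_transpose(matrix, order=0):
    """Transpose an N x N matrix by simulating N rounds of the static ATAPE.

    Assumption: processor s initially holds row s; after the exchange,
    processor d holds row d of the transposed matrix.
    """
    n = len(matrix)
    if n == 0 or n & (n - 1) or any(len(row) != n for row in matrix):
        raise ValueError("expected a square matrix whose size is a power of two")

    result = [[None] * n for _ in range(n)]
    for c in range(n):                      # counter C = 0 .. N-1
        for s in range(n):                  # all sources send in the same round
            d = s ^ ((c + order) % n)       # static function D = S XOR (C + order) mod N
            result[d][s] = matrix[s][d]     # personalized message from s to d
    return result
```

Each round is a permutation of the processors, so no two sources address the same destination at once, and after N rounds every element has moved exactly once. The sketch runs the rounds sequentially and does not reproduce the pipelining behind the O(N + log₂ N) bound.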

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Parallel and Distributed Computing - Volume 88, February 2016, Pages 16–30
Authors
, ,