We describe a parallel library written with message-passing (MPI) calls that allows algorithms to be expressed in the MapReduce paradigm. This means the calling program does not need to include explicit parallel code, but instead provides “map” and “reduce” functions that operate independently on elements of a data set distributed across processors. The library performs the needed data movement between processors. We describe how typical MapReduce functionality can be implemented in an MPI context, and also in an out-of-core manner for data sets that do not fit within the aggregate memory of a parallel machine. Our motivation for creating this library was to enable graph algorithms to be written as MapReduce operations, allowing the processing of terabyte-scale data sets on traditional MPI-based clusters. We outline MapReduce versions of several such algorithms: vertex ranking via PageRank, triangle finding, connected-component identification, Luby’s algorithm for maximal independent sets, and single-source shortest-path calculation. To test the algorithms on arbitrarily large artificial graphs, we generate randomized R-MAT matrices in parallel; a MapReduce version of this operation is also described. Performance and scalability results for the various algorithms are presented for graphs of varying size on a distributed-memory cluster. In some cases, we compare the results with those of non-MapReduce algorithms, different machines, and different MapReduce software, namely Hadoop. Our open-source library is written in C++, is callable from C++, C, Fortran, or scripting languages such as Python, and can run on any parallel platform that supports MPI.
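To make the calling pattern concrete, the following is a minimal word-count-style sketch of how a program might drive such a library: the user supplies the map and reduce callbacks, and the library handles distribution and key aggregation. The class and callback names (MapReduce, KeyValue, map/collate/reduce) are modeled on MR-MPI's published examples, but the signatures shown here are illustrative and should be checked against the library's own documentation.

```cpp
// Illustrative sketch of the MR-MPI calling pattern; names and
// signatures follow the library's documented examples but are
// not guaranteed to match the actual headers.
#include <cstring>
#include <mpi.h>
#include "mapreduce.h"
#include "keyvalue.h"
using namespace MAPREDUCE_NS;

// Map callback: each map task emits (key, value) pairs into kv.
void emit_words(int itask, KeyValue *kv, void *ptr) {
  const char *words[] = {"graph", "edge", "graph"};
  for (int i = 0; i < 3; i++)
    kv->add((char *) words[i], (int) strlen(words[i]) + 1, NULL, 0);
}

// Reduce callback: invoked once per unique key after collate(),
// with all of that key's values concatenated in multivalue.
void count(char *key, int keybytes, char *multivalue,
           int nvalues, int *valuebytes, KeyValue *kv, void *ptr) {
  kv->add(key, keybytes, (char *) &nvalues, sizeof(int));
}

int main(int narg, char **args) {
  MPI_Init(&narg, &args);
  MapReduce *mr = new MapReduce(MPI_COMM_WORLD);
  mr->map(100, emit_words, NULL);  // 100 map tasks spread across ranks
  mr->collate(NULL);               // aggregate duplicate keys across procs
  mr->reduce(count, NULL);         // one reduce call per unique key
  delete mr;
  MPI_Finalize();
}
```

Note that the only explicitly parallel calls in this sketch are MPI_Init and MPI_Finalize; all interprocessor data movement happens inside collate(), which is the point of the library's design.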
Research highlights
► We show how MapReduce operations can be performed on top of the message-passing interface (MPI) in parallel and out-of-core, and describe our open-source implementation, called the MR-MPI library.
► We describe MapReduce algorithms for several graph computations: generation of random R-MAT graphs (or matrices), vertex ranking via PageRank, triangle and connected-component finding, maximal independent set identification, and single-source shortest-path calculation (a sketch of the R-MAT generation kernel follows these highlights).
► We give performance data for running these algorithms on graphs with up to billions of edges.
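As a concrete illustration of the generation step highlighted above, the standard R-MAT procedure places each edge by recursively choosing one quadrant of the adjacency matrix with probabilities (a, b, c, d) at every level, which makes it embarrassingly parallel: each map task can generate its share of edges from its own random stream. A self-contained sketch of that kernel follows; the parameter values and function names are illustrative and not taken from the paper.

```cpp
// Illustrative R-MAT edge generator (not the paper's code).
// Each call picks one (row, col) edge in a 2^levels x 2^levels
// adjacency matrix by recursively selecting a quadrant with
// probabilities a, b, c, d (d = 1 - a - b - c).
#include <cstdint>
#include <cstdio>
#include <random>
#include <utility>

std::pair<uint64_t, uint64_t> rmat_edge(int levels, std::mt19937_64 &rng) {
  const double a = 0.57, b = 0.19, c = 0.19;  // typical R-MAT parameters
  std::uniform_real_distribution<double> uni(0.0, 1.0);
  uint64_t row = 0, col = 0;
  for (int bit = 0; bit < levels; bit++) {
    double r = uni(rng);
    row <<= 1; col <<= 1;
    if (r < a) { /* upper-left quadrant: both bits stay 0 */ }
    else if (r < a + b) col |= 1;       // upper-right quadrant
    else if (r < a + b + c) row |= 1;   // lower-left quadrant
    else { row |= 1; col |= 1; }        // lower-right quadrant
  }
  return {row, col};
}

int main() {
  std::mt19937_64 rng(12345);  // in a parallel run, seed per map task
  for (int i = 0; i < 5; i++) {
    auto [u, v] = rmat_edge(20, rng);  // graph with 2^20 vertices
    std::printf("%llu %llu\n", (unsigned long long) u,
                (unsigned long long) v);
  }
}
```

In a MapReduce formulation, each map task would run this kernel for its assigned batch of edges and emit (edge, NULL) pairs; a collate and reduce can then remove any duplicate edges.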
Journal: Parallel Computing - Volume 37, Issue 9, September 2011, Pages 610–632