Article code: 424666
Journal code: 685619
Publication year: 2013
English article: 8 pages, PDF (full text)
English title of the ISI article
Performance comparison under failures of MPI and MapReduce: An analytical approach
Related topics
Engineering and Basic Sciences; Computer Engineering; Computational Theory and Mathematics
Abstract


• Analytical models are proposed to quantify the impact of failures.
• A numerical study is conducted on both MPI and MapReduce applications under failures.
• The impact of different parameters on the failure-prone performance is investigated.
• Extensive experiments are carried out to examine the accuracy of the proposed models.

MPI has been the de facto standard of parallel programming for decades. In recent years there has been increasing concern about the reliability of MPI applications, partially due to the inefficiency of parallel checkpointing. MapReduce is a newer programming model, originally introduced to handle massive data processing. Numerous recent efforts have transformed classical MPI-based scientific applications to MapReduce, motivated by its ease of programming, automatic parallelism, and fault tolerance. However, the stricter synchronization primitive supported by MapReduce also imposes considerable overhead. While the failure-free performance of MPI and MapReduce has been compared, little work exists on comparing the two programming models under failures. In this paper, we propose an analytical approach to quantifying and comparing the capabilities of the two programming models to tolerate failures. We also carry out extensive numerical analysis to study the impact of different parameters on fault tolerance. This work can be used by the HPC community when making critical decisions. For example, it helps algorithm designers answer questions such as: at which scale should we give up MPI and use MapReduce as the programming model for better performance in the presence of failures?

Publisher
Database: Elsevier - ScienceDirect
Journal: Future Generation Computer Systems - Volume 29, Issue 7, September 2013, Pages 1808–1815