Article code: 425272
Journal code: 685710
Publication year: 2014
English article: 11 pages, PDF
English title of the ISI article
Benchmarking MapReduce implementations under different application scenarios
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
English abstract


• We compare the performance of various MapReduce frameworks using several applications.
• We examine data-intensive, CPU-intensive, memory-intensive, and iterative application scenarios.
• We evaluate performance on faulty, heterogeneous, and load-imbalanced clusters.
• We map the performance results to the design decisions of each MapReduce framework.
• We present the strengths of all the implementations featured in this study.

The MapReduce paradigm provides a scalable model for large-scale, data-intensive computing and the associated fault tolerance. Data volumes generated and processed by scientific applications are growing rapidly. Several MapReduce implementations, with varying degrees of conformance to the key tenets of the model, are available today, and each is optimized for specific features. To make the right decisions, HPC application and middleware developers must therefore understand the complex dependencies between MapReduce features and their applications. We present a set of benchmarks for quantifying, comparing, and contrasting the performance of MapReduce implementations under a wide range of representative use cases. To demonstrate the utility of the benchmarks and to provide a snapshot of the current implementation landscape, we report the performance of three MapReduce implementations and draw conclusions about their current performance characteristics. The three implementations are the widely used Hadoop implementation; Twister, which has been widely discussed in the literature in the context of scientific applications; and LEMO-MR, our own implementation.

Publisher
Database: Elsevier - ScienceDirect
Journal: Future Generation Computer Systems - Volume 36, July 2014, Pages 389–399