Article code | Journal code | Publication year | English article | Full-text version
461009 | 696520 | 2016 | 12-page PDF | Free download
English title of the ISI article
Adding data analytics capabilities to scaled-out object store
Persian translation of the title
Adding data analytics capabilities to large-scale object storage
Keywords
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Networks and Communications
English abstract
This work focuses on enabling effective data analytics on scaled-out object storage systems. Typically, applications perform MapReduce computations by first copying large amounts of data to a separate compute cluster (i.e., a Hadoop cluster). However, this approach is inefficient given that storage systems can host hundreds of petabytes of data: network bandwidth is easily saturated, and overall energy consumption increases during large-scale data transfers. Instead of moving data between remote clusters, we propose implementing a data analytics layer on an object-based storage cluster to perform in-place MapReduce computation on existing data. The analytics layer is tied to the underlying object store and uses its data redundancy and distribution policies across the cluster. We implemented this approach with the Ceph object storage system and Hadoop, and evaluated it with various benchmarks. The evaluations show that initial data copy performance improves by up to 96% and MapReduce performance improves by up to 20% compared to the stock Hadoop implementation.
Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Systems and Software - Volume 121, November 2016, Pages 16-27
Authors