Article code: 6873072
Journal code: 1440627
Publication year: 2018
English article: 8 pages, PDF
Full-text version: free download
English title of the ISI article
PP: Popularity-based Proactive Data Recovery for HDFS RAID systems
Keywords
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
English abstract
Clustered storage systems such as HDFS have become a popular platform for handling Big Data, but they are susceptible to failures and must perform data recovery whenever a failure is detected. Existing recovery schemes in HDFS are passive, which reduces the processing efficiency of MapReduce jobs and degrades the performance of Hadoop storage systems. To address this problem, this paper proposes a novel scheme called PP (Popularity-based Proactive Data Recovery) that significantly boosts the recovery efficiency of HDFS RAID systems. PP tracks popular data and, when a node fails in a Hadoop system, immediately recovers the missing popular data, so that the failure's effect on the performance of MapReduce jobs, and the variation in that performance, are minimal. A lightweight prototype implementation of PP and extensive experiments demonstrate that, compared with Hadoop's default locality-first scheduling and with degraded-first scheduling, PP significantly reduces both the recovery time and the execution time of MapReduce jobs.
Publisher
Database: Elsevier - ScienceDirect
Journal: Future Generation Computer Systems - Volume 86, September 2018, Pages 1146-1153
Authors