Article code: 6883449
Journal code: 1444173
Publication year: 2018
English article: 17 pages, PDF
Full-text version: free download
English title of the ISI article
Performance optimization of MapReduce-based Apriori algorithm on Hadoop cluster
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Networks and Communications
English abstract
Many techniques have been proposed to implement the Apriori algorithm on the MapReduce framework, but only a few have focused on performance improvement. The FPC (Fixed Passes Combined-counting) and DPC (Dynamic Passes Combined-counting) algorithms combine multiple passes of Apriori in a single MapReduce phase to reduce the execution time. In this paper, we propose improved MapReduce-based Apriori algorithms, VFPC (Variable Size based Fixed Passes Combined-counting) and ETDPC (Elapsed Time based Dynamic Passes Combined-counting), over FPC and DPC. Further, we optimize the multi-pass phases of these algorithms by skipping the pruning step in some passes, and propose the Optimized-VFPC and Optimized-ETDPC algorithms. Quantitative analysis reveals that the cost of counting the additional unpruned candidates produced by skipped pruning is less significant than the reduction in computation cost that skipping brings. Experimental results show that VFPC and ETDPC are more robust and flexible than FPC and DPC, whereas their optimized versions are more efficient in terms of execution time.
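The idea of combining several Apriori passes into one counting phase, with or without the pruning step, can be illustrated with a minimal single-machine sketch. This is not the authors' implementation and does not use Hadoop: the function names (`join`, `prune`, `count`, `combined_pass`), the `skip_pruning` flag, and the toy transactions are all assumptions made for illustration, and the flag skips pruning in every combined pass rather than in selected ones.

```python
from itertools import combinations

def join(prev_itemsets, k):
    """Join step: merge pairs of sorted (k-1)-itemsets sharing their first k-2 items."""
    prev = sorted(prev_itemsets)
    out = set()
    for i in range(len(prev)):
        for j in range(i + 1, len(prev)):
            a, b = prev[i], prev[j]
            if a[:k - 2] == b[:k - 2]:
                out.add(a + (b[-1],))
    return out

def prune(candidates, prev_itemsets, k):
    """Prune step: keep a k-candidate only if all its (k-1)-subsets are in prev_itemsets."""
    prev = set(prev_itemsets)
    return {c for c in candidates
            if all(sub in prev for sub in combinations(c, k - 1))}

def count(transactions, candidates):
    """One scan over the data, counting the support of every candidate."""
    counts = {c: 0 for c in candidates}
    for t in transactions:
        items = set(t)
        for c in candidates:
            if items.issuperset(c):
                counts[c] += 1
    return counts

def combined_pass(transactions, frequent_prev, k, n_passes, min_count, skip_pruning):
    """Generate candidates of sizes k .. k+n_passes-1 and count them in one scan.

    Later levels are generated from the previous level's candidates,
    because their supports are not known until the combined scan ends.
    """
    levels, prev = [], set(frequent_prev)
    for size in range(k, k + n_passes):
        cands = join(prev, size)
        if not skip_pruning:
            cands = prune(cands, prev, size)
        levels.append(cands)
        prev = cands
    all_cands = set().union(*levels)
    counts = count(transactions, all_cands)
    frequent = {c for c, n in counts.items() if n >= min_count}
    return frequent, len(all_cands)

# Hypothetical toy data; each transaction is a sorted tuple of items.
transactions = [("a", "b", "c"), ("a", "b", "c"), ("a", "d"),
                ("a", "d"), ("b", "d"), ("c", "d")]
min_count = 2
items = {(i,) for t in transactions for i in t}
l1 = {c for c, n in count(transactions, items).items() if n >= min_count}
l2 = {c for c, n in count(transactions, join(l1, 2)).items() if n >= min_count}

pruned = combined_pass(transactions, l2, 3, 2, min_count, skip_pruning=False)
skipped = combined_pass(transactions, l2, 3, 2, min_count, skip_pruning=True)
print(pruned, skipped)
```

On this toy data both variants return the same frequent itemsets; the unpruned variant simply counts more candidates (4 instead of 1). Whether the saved pruning work outweighs the extra counting is the trade-off the abstract's quantitative analysis addresses, and it depends on the data set and the cluster.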
Publisher
Database: Elsevier - ScienceDirect
Journal: Computers & Electrical Engineering - Volume 67, April 2018, Pages 348-364