Article code: 6861353
Journal code: 1439248
Publication year: 2018
English article: 20-page PDF
Full-text version: Free download
English title of the ISI article
Rim: A reusable iterative model for big data
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract
In big data environments, iterative computing is widely used in applications such as data mining, machine learning, and graph analysis. Many iterative computing models have been proposed to execute iterative algorithms on big data efficiently. However, re-iterating over the entire dataset is inefficient when only part of it has changed, for example when some data is added or removed. This paper presents Rim, a Reusable Iterative computing Model that calculates the new iterative results from the updated dataset and the original iterative results, avoiding re-iteration on the entire dataset. We state the conditions under which Rim applies, mathematically prove its accuracy and performance advantages, and describe its application to three typical iterative algorithms: PageRank, K-means, and Descendant-query. Finally, we implement Rim in Spark and evaluate its performance on different test cases and iterative algorithms. For PageRank, K-means, and Descendant-query, experiments show that our approach is on average 1.34×, 2.51×, and 3.17× faster, respectively, than re-iteration on massive datasets.
Publisher
Database: Elsevier - ScienceDirect
Journal: Knowledge-Based Systems - Volume 153, 1 August 2018, Pages 105-116