Article code: 424522
Journal code: 685587
Publication year: 2016
English article: 11 pages, PDF
English title of the ISI article
Distributed volunteer computing for solving ensemble learning problems
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
Abstract


• We present Mining@Home, a framework for distributed data mining.
• Mining@Home combines the benefits of P2P protocols with those of the volunteer computing paradigm.
• The framework is used to discover classifiers by applying the “bagging” technique on real data.
• We present performance results showing the efficiency and scalability of our approach.
• Performance results are obtained through real experiments, simulation and analytical assessment.

The volunteer computing paradigm, combined with the tailored use of peer-to-peer communication, has recently proven capable of solving a wide range of data-intensive problems in distributed scenarios. The Mining@Home framework is based on these paradigms and has been implemented to run a wide range of distributed data mining applications. The efficiency and scalability of the architecture can be fully exploited when the overall task can be partitioned into distinct jobs that may be executed in parallel and when input data can be reused, which naturally leads to the use of data cachers. This paper explores the opportunities offered by Mining@Home for discovering classifiers through the bagging approach: multiple learners compute models from the same input data, so as to extract a final model with high statistical accuracy. The analysis focuses on experiments performed in a real distributed environment, complemented by simulation (to evaluate very large environments) and by an analytical investigation based on the iso-efficiency methodology. An extensive set of experiments made it possible to analyze a number of heterogeneous scenarios with different problem sizes, which helps to improve performance by appropriately tuning the number of workers and the number of interconnected domains.
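
The bagging idea described in the abstract can be illustrated with a minimal sketch. It is not Mining@Home code: the function names (`train_stump`, `bagging_job`, `bagging`), the decision-stump base learner, the toy data set, and the local thread pool standing in for volunteer workers are all assumptions made for illustration. Each job draws a bootstrap sample from the same input data and trains one model, and the final prediction is a majority vote over the models.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def train_stump(sample):
    """Fit a one-feature threshold classifier (decision stump) on (x, label) pairs."""
    best = None
    for threshold, _ in sample:
        # Try both orientations: (label at or below threshold, label above it).
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= threshold else right) != y for x, y in sample)
            if best is None or errors < best[0]:
                best = (errors, threshold, left, right)
    _, threshold, left, right = best
    return lambda x: left if x <= threshold else right


def bagging_job(data, seed):
    """One independent job: draw a bootstrap sample and train a model on it."""
    rng = random.Random(seed)
    sample = [rng.choice(data) for _ in data]  # sampling with replacement
    return train_stump(sample)


def bagging(data, n_models=15, n_workers=4):
    """Run the jobs in parallel (threads stand in for workers) and vote."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        models = list(pool.map(lambda seed: bagging_job(data, seed), range(n_models)))

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict


# Hypothetical toy data: the label is 1 when x > 5, else 0.
data = [(x, int(x > 5)) for x in range(11)]
predict = bagging(data)
print(predict(0), predict(10))
```

Because every job reads the same `data`, the input can be fetched once and reused across jobs, which is the property the abstract associates with data cachers.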

Publisher
Database: Elsevier - ScienceDirect
Journal: Future Generation Computer Systems - Volume 54, January 2016, Pages 68–78