Article code: 523985
Journal code: 868538
Publication year: 2011
English article: 15-page PDF
Full-text version: free download
English title of the ISI article
A generic parallel processing model for facilitating data mining and integration
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Science Software
English abstract

To facilitate data mining and integration (DMI) processes in a generic way, we investigate a parallel pipeline streaming model. We model a DMI task as a streaming data-flow graph: a directed acyclic graph (DAG) of Processing Elements (PEs). The composition mechanism links PEs via data streams, which may be held in memory, buffered on disk, or carried as inter-computer data flows. This makes it possible to build arbitrary DAGs with pipelining and with both data and task parallelism, which provides room for performance enhancement. We have applied this approach to a real DMI case in the life sciences and implemented a prototype. To demonstrate the feasibility of the modelled DMI task and assess the efficiency of the prototype, we have also built a performance evaluation model. The experimental evaluation results show that, in this case study, a linear speedup was achieved as the number of distributed computing nodes increased.
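
The streaming data-flow idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' prototype: the names `Stream`, `run_pe`, and `run_example`, the end-of-stream sentinel, and the toy PEs (`source`, `square`, `double`, `combine`) are all assumptions made for illustration. Each PE runs in its own thread and PEs are linked by bounded in-memory queues, so a downstream PE starts work before its upstream PE finishes (pipelining), and the two middle PEs run side by side (task parallelism). Data parallelism, disk-buffered streams, and inter-computer streams are not shown.

```python
import queue
import threading

_END = object()  # hypothetical end-of-stream marker used only by this sketch


class Stream:
    """In-memory data stream between two PEs; the bounded queue gives back-pressure."""

    def __init__(self, maxsize=8):
        self._q = queue.Queue(maxsize)

    def put(self, item):
        self._q.put(item)

    def close(self):
        self._q.put(_END)

    def __iter__(self):
        # Yield items until the producer closes the stream.
        while True:
            item = self._q.get()
            if item is _END:
                return
            yield item


def run_pe(func, inputs, outputs):
    """Start one PE in a thread; every item it yields is copied to each output stream."""

    def body():
        try:
            for item in func(*inputs):
                for out in outputs:
                    out.put(item)
        finally:
            for out in outputs:
                out.close()

    thread = threading.Thread(target=body)
    thread.start()
    return thread


def run_example(n=5):
    """Diamond-shaped DAG: source -> (square, double) -> combine -> collected result."""
    a, b, c, d, e = (Stream() for _ in range(5))

    def source():
        yield from range(1, n + 1)

    def square(xs):
        for x in xs:
            yield x * x

    def double(xs):
        for x in xs:
            yield 2 * x

    def combine(xs, ys):
        # Pair up the two branches item by item and add them.
        for x, y in zip(xs, ys):
            yield x + y

    threads = [
        run_pe(source, [], [a, b]),
        run_pe(square, [a], [c]),
        run_pe(double, [b], [d]),
        run_pe(combine, [c, d], [e]),
    ]
    result = list(e)  # the calling thread acts as the sink
    for t in threads:
        t.join()
    return result


if __name__ == "__main__":
    print(run_example())  # [3, 8, 15, 24, 35]
```

Because of CPython's global interpreter lock, threads here show the structure of the model rather than a real speedup; a process-based or distributed variant would be needed for CPU-bound PEs.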

Research highlights
► A generic parallel pipeline streaming model facilitates data mining and integration.
► A prototype of a real use case demonstrates the feasibility of the proposed model.
► A performance evaluation model assesses the efficiency of the prototype.

Publisher
Database: Elsevier - ScienceDirect
Journal: Parallel Computing - Volume 37, Issue 3, March 2011, Pages 157–171