A generic parallel processing model for facilitating data mining and integration

Article code	Journal code	Publication year	English article	Full-text version
523985	868538	2011	15-page PDF	Free download

Keywords

Life sciences; Parallelism; Workflow

Related topics

Engineering and Basic Sciences; Computer Engineering; Computer Science Software

Abstract

To facilitate data mining and integration (DMI) processes in a generic way, we investigate a parallel pipeline streaming model. We model a DMI task as a streaming data-flow graph: a directed acyclic graph (DAG) of Processing Elements (PEs). The composition mechanism links PEs via data streams, which may be held in memory, buffered on disk, or carried as inter-computer data flows. This makes it possible to build arbitrary DAGs with pipelining and both data and task parallelism, which provide room for performance enhancement. We have applied this approach to a real DMI case in the life sciences and implemented a prototype. To demonstrate the feasibility of the modelled DMI task and assess the efficiency of the prototype, we have also built a performance evaluation model. The experimental evaluation results show that, in this case study, a linear speedup is achieved as the number of distributed computing nodes increases.

Research highlights
► A generic parallel pipeline streaming model facilitates data mining and integration.
► A prototype of a real use case demonstrates the feasibility of the proposed model.
► A performance evaluation model assesses the efficiency of the prototype.
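
As a rough illustration of the streaming data-flow graph described in the abstract, the sketch below links three kinds of Processing Elements through bounded in-memory streams. It is not the authors' prototype: the names `source`, `worker`, `sink` and `run_pipeline`, the `_END` sentinel, and the choice of Python threads and `queue.Queue` are all assumptions made for this example. It shows pipelining (all stages run concurrently) and data parallelism (several identical workers share one input stream); task parallelism and disk or inter-computer streams are left out.

```python
import queue
import threading

_END = object()  # hypothetical sentinel marking the end of a stream


def source(items, outbox, n_consumers):
    """Source PE: push every item onto the stream, then one sentinel per consumer."""
    for item in items:
        outbox.put(item)
    for _ in range(n_consumers):
        outbox.put(_END)


def worker(func, inbox, outbox):
    """Transform PE: apply func to each incoming item until the sentinel arrives."""
    while True:
        item = inbox.get()
        if item is _END:
            break
        outbox.put(func(item))
    outbox.put(_END)  # tell the downstream PE that this worker has finished


def sink(inbox, n_producers):
    """Sink PE: collect items until every upstream producer has sent its sentinel."""
    results = []
    remaining = n_producers
    while remaining:
        item = inbox.get()
        if item is _END:
            remaining -= 1
        else:
            results.append(item)
    return results


def run_pipeline(items, func, n_workers=2):
    """Wire source -> n_workers parallel workers -> sink using bounded streams."""
    stream_in = queue.Queue(maxsize=8)   # bounded in-memory stream (back-pressure)
    stream_out = queue.Queue(maxsize=8)
    threads = [threading.Thread(target=source, args=(items, stream_in, n_workers))]
    threads += [
        threading.Thread(target=worker, args=(func, stream_in, stream_out))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    results = sink(stream_out, n_workers)  # the sink runs in the calling thread
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Output order depends on thread scheduling, so sort before printing.
    print(sorted(run_pipeline(range(10), lambda x: x * x, n_workers=3)))
```

Because CPython threads share one interpreter lock, this sketch illustrates the graph structure rather than any real speedup; a distributed deployment such as the one evaluated in the paper would place PEs on separate computing nodes.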

Publisher

Database: Elsevier - ScienceDirect
Journal: Parallel Computing - Volume 37, Issue 3, March 2011, Pages 157–171

Authors

Liangxiu Han, Chee Sun Liew, Jano van Hemert, Malcolm Atkinson
