Article code: 425672
Journal code: 685814
Publication year: 2014
English article: 16 pages, PDF
Full-text version: free download
English title of the ISI article
Modeling and optimizing large-scale data flows
Persian translation of the title
مدلسازی و بهینه سازی جریانهای اطلاعاتی در مقیاس بزرگ
Keywords
Data-intensive research, workflows, modeling, optimization
Related topics
Engineering and Basic Sciences > Computer Engineering > Computational Theory and Mathematics
English abstract


• We examine advanced support for modeling and optimizing data mining and integration processes.
• We describe the overall process from a dynamic model to a static model for optimization.
• We elaborate on the performance implications of our optimization approach.
• We discuss the implemented GUI based on meta-modeling concepts and required annotation steps.

Modern scientific collaborations require large-scale integration of various processes. Higher-level dataflow languages are used on top of parallel and distributed dataflow systems to enable faster development of data-intensive workflow programs, easier optimization, and more maintainable code. In this paper, we present the rationale, design, and application of advanced support for modeling and optimizing data flows for data mining and integration processes. The optimization research and development is based on pre-execution dataflow modeling and on extending the registry of process activities with advanced annotations. Additionally, the overall process from a dynamic model to a static model, which serves as input for the optimization algorithms, is described. This novel approach is implemented within an advanced graphical user interface, called the Process Designer, to support semi-automatic optimization, as well as within a dataflow execution platform, called the Gateway. It can be adapted to any dataflow language implementation. The Process Designer architecture, based on modern (meta-)modeling concepts, naturally supports validated transformations between external textual and internal graphical representations of the targeted dataflow language, and in this way significantly increases the productivity and robustness of the implementation processes.

Publisher
Database: Elsevier - ScienceDirect
Journal: Future Generation Computer Systems - Volume 31, February 2014, Pages 12–27