Article code: 432656
Journal code: 689006
Publication year: 2016
English article: 15 pages, PDF
English title of the ISI article
Data-aware task scheduling for all-to-all comparison problems in heterogeneous distributed systems
Keywords
Distributed computing, all-to-all comparison, data distribution, task scheduling
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
English abstract


• Abstraction of all-to-all comparison computing pattern with big data sets.
• Formulation of all-to-all comparisons in distributed systems as a constrained optimization.
• Data-aware task scheduling approach for solving the all-to-all comparison problem.
• Metaheuristic data pre-scheduling and dynamic task scheduling in the approach.

Solving large-scale all-to-all comparison problems using distributed computing is increasingly significant for various applications. Previous efforts to implement distributed all-to-all comparison frameworks have treated the two phases of data distribution and comparison task scheduling separately. This leads to high storage demands as well as poor data locality for the comparison tasks, thus creating a need to redistribute the data at runtime. Furthermore, most previous methods have been developed for homogeneous computing environments, so their overall performance is degraded even further when they are used in heterogeneous distributed systems. To tackle these challenges, this paper presents a data-aware task scheduling approach for solving all-to-all comparison problems in heterogeneous distributed systems. The approach formulates the requirements for data distribution and comparison task scheduling simultaneously as a constrained optimization problem. Then, metaheuristic data pre-scheduling and dynamic task scheduling strategies are developed along with an algorithmic implementation to solve the problem. The approach provides perfect data locality for all comparison tasks, avoiding rearrangement of data at runtime. It achieves load balancing among heterogeneous computing nodes, thus reducing the overall computation time. It also reduces data storage requirements across the network. The effectiveness of the approach is demonstrated through experimental studies.
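
To make the problem in the abstract concrete, the following is a minimal Python sketch of assigning all pairwise comparisons to nodes of different speeds while keeping both inputs of every comparison on the node that runs it. It is not the paper's metaheuristic pre-scheduling or its dynamic scheduler; it is a simple greedy heuristic chosen for illustration. The function name `assign_comparisons`, the `node_speeds` parameter (relative tasks per unit time), and the assumption that every comparison costs the same are all hypothetical.

```python
from itertools import combinations


def assign_comparisons(num_items, node_speeds):
    """Greedily assign every pair (i, j) of items to exactly one node.

    Illustrative heuristic only (an assumption, not the paper's algorithm).
    Returns (tasks, storage): tasks[k] is the list of pairs run on node k,
    and storage[k] is the set of item indices node k must hold locally.
    """
    n_nodes = len(node_speeds)
    tasks = [[] for _ in range(n_nodes)]
    storage = [set() for _ in range(n_nodes)]

    for i, j in combinations(range(num_items), 2):

        def cost(k):
            # Primary key: estimated finish time if node k takes one more
            # task (all tasks are assumed to cost the same).
            finish = (len(tasks[k]) + 1) / node_speeds[k]
            # Tie-breaker: how many of the two items node k does not hold yet.
            new_items = len({i, j} - storage[k])
            return (finish, new_items)

        best = min(range(n_nodes), key=cost)
        tasks[best].append((i, j))
        # Placing both items on the chosen node gives the task full locality.
        storage[best].update((i, j))

    return tasks, storage


if __name__ == "__main__":
    tasks, storage = assign_comparisons(6, [1.0, 2.0])
    for k, (t, s) in enumerate(zip(tasks, storage)):
        print(f"node {k}: {len(t)} tasks, stores items {sorted(s)}")
```

Because data placement is derived from the task assignment, no item has to be moved at runtime. A greedy pass like this balances load by node speed but makes no attempt to minimize total storage, which is the part the abstract attributes to the constrained optimization formulation.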

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Parallel and Distributed Computing - Volumes 93–94, July 2016, Pages 87–101