Article code: 461263 | Journal code: 696581 | Publication year: 2011 | English article: 9-page PDF | Full text: free download
English title of the ISI article
Boosting adaptivity of fault-tolerant scheduling for real-time tasks with service requirements on clusters
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Networks and Communications
English abstract

Thanks to their excellent extensibility and usability, computer clusters have become the dominant platform for parallel computing. Fault tolerance is mandatory for safety-critical applications running on clusters. In this paper we propose a service-aware and adaptive fault-tolerant scheduling algorithm using overlapping technologies (SAO for short) that can tolerate one node’s permanent failure at any time instant for real-time tasks with service requirements in heterogeneous clusters. SAO adopts the primary/backup model and considers timing constraints, service requirements, and system resource utilization. To improve system resource utilization, we employ backup-backup (BB for short) and primary-backup (PB for short) overlapping technologies and analyze the overlapping constraints. In addition, SAO achieves high system adaptivity by dynamically adjusting the service levels of tasks based on system load. Furthermore, to improve resource utilization and schedulability, SAO makes backup copies adopt a passive execution scheme or, where that is not possible, reduces the overlapping execution time of a task's primary and backup copies as much as possible. In simulation experiments against a baseline algorithm, SAWO (a service-aware and adaptive fault-tolerant scheduling algorithm without overlapping technologies), and an existing algorithm, DYFARS, SAO achieves an average improvement of 51.25% in performability.


► A service-aware and adaptive fault-tolerant scheduling algorithm SAO was proposed.
► SAO adopts the primary/backup model and overlapping technologies.
► SAO can tolerate a node’s permanent failure at any time instant.
► SAO improves the adaptivity of real-time fault-tolerant systems.
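
The primary/backup idea in the abstract can be illustrated with a minimal sketch. This is not the SAO algorithm from the paper; it only shows, under simplifying assumptions, how one might classify a backup copy as passive (it starts only after the primary finishes) or active (it overlaps the primary in time) and measure that overlap. The names `Copy`, `backup_mode`, `overlap_time`, and `is_valid_pair` are hypothetical and do not appear in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One scheduled copy of a task (hypothetical structure for illustration)."""
    node: int      # index of the cluster node the copy runs on
    start: float   # scheduled start time
    finish: float  # scheduled finish time


def overlap_time(primary: Copy, backup: Copy) -> float:
    """Length of the time interval in which both copies would execute."""
    return max(0.0, min(primary.finish, backup.finish)
               - max(primary.start, backup.start))


def backup_mode(primary: Copy, backup: Copy) -> str:
    """Assumed rule: a backup is passive if it starts no earlier than the
    primary's finish time, so it runs only if the primary fails."""
    return "passive" if backup.start >= primary.finish else "active"


def is_valid_pair(primary: Copy, backup: Copy, deadline: float) -> bool:
    """Assumed constraints: the copies sit on different nodes (so a single
    node failure cannot destroy both) and both finish by the deadline."""
    return (primary.node != backup.node
            and primary.finish <= deadline
            and backup.finish <= deadline)


primary = Copy(node=0, start=0.0, finish=4.0)
late_backup = Copy(node=1, start=4.0, finish=8.0)
early_backup = Copy(node=1, start=2.0, finish=6.0)

print(backup_mode(primary, late_backup), overlap_time(primary, late_backup))
print(backup_mode(primary, early_backup), overlap_time(primary, early_backup))
```

With a deadline of 8, `late_backup` can run passively, whereas a deadline of 6 would force the backup to start earlier and overlap the primary, which is the situation where a scheduler would try to keep the overlap as small as possible.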

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Systems and Software - Volume 84, Issue 10, October 2011, Pages 1708–1716