Article code: 5019264
Journal code: 1468201
Publication year: 2018
English article: 10-page PDF
English title of the ISI article
Heterogeneous 1-out-of-N warm standby systems with online checkpointing
Persian translation of the title
سیستم های آماده به کار گرم ناهمگن 1-out-of-N با بازرسی آنلاین
Keywords
Online checkpointing; Warm standby; Mission cost; Mission reliability; Mission time; Optimization; Timeliness; Sequencing
Related topics
Engineering and Basic Sciences, Other Engineering Disciplines, Mechanical Engineering
English abstract


- Warm standby systems with data checkpointing are considered.
- Checkpoints are performed in parallel with the primary mission task.
- A numerical method is suggested to evaluate mission metrics.
- Checkpoint distribution and element activation sequencing are optimized.

As a common practice in computing-related applications, checkpointing is used to facilitate effective system recovery when failures occur. Checkpoints are performed to save data associated with the completed portion of a mission task. In the case of a failure, through rollback and data retrieval, the system can resume the mission task from the last successful checkpoint instead of from the very beginning of the mission, saving time and cost. This paper models and optimizes 1-out-of-N: G warm standby systems subject to uneven online checkpointing, where checkpoints can be performed in parallel with execution of the primary mission task to improve the efficiency of computing elements. Both data checkpointing and retrieval take dynamic time, depending on the amount of work completed. System elements can be heterogeneous in their time-to-failure distribution, performance, and level of readiness to take over the mission task during the warm standby mode. A numerical method is first suggested to evaluate mission performance indices, including mission success probability, expected mission completion time, and expected mission operation cost. Examples are provided to demonstrate the influence of the mission deadline and the element resource sharing parameter (i.e., CPU time distribution between the checkpointing procedure and the primary mission task) on the mission performance metrics. The optimal checkpoint distribution and optimal element activation sequencing problems are considered for different combinations of optimization objectives and constraints. A co-optimization problem is further addressed, which aims to find the optimal combination of checkpoint distribution and element activation sequence. Example optimization solutions illustrate the tradeoff among the three mission requirements (reliability, completion time, operation cost) for warm standby systems with online checkpoints.
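
The rollback idea described in the abstract can be illustrated with a deliberately simplified sketch. It is not the numerical method of the paper: it assumes deterministic failure points, equal element performance, and zero checkpointing and retrieval time, and the names `last_checkpoint`, `run_mission`, and `work_until_failure` are hypothetical. It only shows how resuming from the last completed checkpoint reduces the total work spent when elements are activated one after another.

```python
from bisect import bisect_right


def last_checkpoint(checkpoints, progress):
    """Return the largest checkpoint position <= progress (0.0 if none).

    `checkpoints` is a sorted list of mission progress values at which
    data is saved. A failure exactly at a checkpoint position is treated
    as occurring after that checkpoint completed (a simplifying assumption).
    """
    i = bisect_right(checkpoints, progress)
    return checkpoints[i - 1] if i > 0 else 0.0


def run_mission(total_work, checkpoints, work_until_failure):
    """Simulate sequential element activation with rollback.

    `work_until_failure` lists, in activation order, how much work each
    element performs before failing (float('inf') if it never fails).
    Returns (completed, work_spent), where work_spent counts all work
    performed, including work that was lost and had to be redone.
    """
    progress = 0.0  # work preserved so far (position of the resume point)
    spent = 0.0     # total work performed by all activated elements
    for capacity in work_until_failure:
        remaining = total_work - progress
        if capacity >= remaining:
            # This element finishes the mission before it would fail.
            return True, spent + remaining
        # The element fails; roll back to the last saved checkpoint.
        spent += capacity
        progress = last_checkpoint(checkpoints, progress + capacity)
    return False, spent  # every element failed before completion


if __name__ == "__main__":
    failures = [60, 30, float("inf")]
    print(run_mission(100, [25, 50, 75], failures))  # with checkpoints
    print(run_mission(100, [], failures))            # without checkpoints
```

In this toy setting, checkpoints at 25, 50, and 75 units let the third element finish after 115 units of total work, whereas without checkpoints every failure restarts the mission and 190 units are needed. The paper's model additionally accounts for checkpoint and retrieval durations, CPU sharing, warm standby readiness, and random failure times, none of which are represented here.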

Publisher
Database: Elsevier - ScienceDirect
Journal: Reliability Engineering & System Safety - Volume 169, January 2018, Pages 127-136