Article code: 425964
Journal code: 685971
Publication year: 2012
English article: 19-page PDF
Full text: free download
English title of the ISI article
HOPE: A Hybrid Optimistic checkpointing and selective Pessimistic mEssage logging protocol for large scale distributed systems
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
Abstract

Future generation supercomputers will be message-passing distributed systems consisting of hundreds of thousands of processors. As the size of the system grows, the failure rate increases. Hence, for such large-scale systems to be successful and deployable, scalable checkpointing and recovery protocols need to be implemented. Existing checkpointing and rollback recovery protocols used for providing fault tolerance in distributed systems do not scale to such large systems. In this paper, we address this important and timely issue and propose a scalable group-based Hybrid Optimistic checkpointing and selective Pessimistic mEssage logging (HOPE) protocol. Performance evaluation indicates that our protocol takes a balanced approach to lowering checkpointing and message-logging overhead and enhances scalability.


► A scalable checkpointing and recovery algorithm is presented.
► Application-dependent checkpointing and recovery algorithms are scalable.
► Hybrid checkpointing and recovery algorithms are scalable.

Publisher
Database: Elsevier - ScienceDirect
Journal: Future Generation Computer Systems - Volume 28, Issue 8, October 2012, Pages 1217–1235