Article ID: 428993
Journal: Information Processing Letters
Published Year: 2012
Pages: 6
File Type: PDF
Abstract

This paper proposes a checkpointing protocol based on loose synchronization. The protocol lets processes take checkpoints at different frequencies, so that each process can control its own rollback distance. In traditional asynchronous and quasi-synchronous checkpointing protocols, checkpoints that are not up to date may be used for recovery, which makes the rollback distance difficult to control. In the proposed protocol, the checkpoint cycle of each process is dynamically adjusted using a pessimistic scheme so that strict 1-rollback is achieved; that is, one of the last two checkpoints of each process can be used for recovery.

► We propose a pessimistic multi-cycle checkpointing protocol that allows processes to take checkpoints with different frequencies.
► We have proven that one of the last two checkpoints of each process can be utilized during recovery.
► We present a mathematical analysis of the checkpointing overhead.
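
The strict 1-rollback property stated in the abstract can be made concrete with a small check. The sketch below is only an illustration of that property, not the protocol itself: the function names, the process labels, and the representation of a checkpoint history as a list of sequence numbers are all assumptions introduced here. It does not verify that a recovery line is consistent; it only tests whether each process recovers from one of its last two checkpoints.

```python
from typing import Dict, List


def rollback_distance(history: List[int], chosen: int) -> int:
    """Return how many checkpoints lie after `chosen` in `history`.

    `history` is a hypothetical per-process list of checkpoint sequence
    numbers, oldest first. A distance of 0 means the latest checkpoint
    is used; 1 means the one before it.
    """
    return len(history) - 1 - history.index(chosen)


def satisfies_strict_1_rollback(
    histories: Dict[str, List[int]], recovery_line: Dict[str, int]
) -> bool:
    """Check that every process recovers from one of its last two checkpoints.

    This tests only the distance bound named in the abstract; it says
    nothing about whether the chosen checkpoints are mutually consistent.
    """
    return all(
        rollback_distance(histories[p], recovery_line[p]) <= 1
        for p in histories
    )


# Illustrative data (assumed): processes checkpoint at different frequencies.
histories = {"p0": [1, 2, 3, 4], "p1": [1, 2], "p2": [1, 2, 3]}

good_line = {"p0": 4, "p1": 1, "p2": 3}  # each within the last two
bad_line = {"p0": 2, "p1": 2, "p2": 3}   # p0 falls back two checkpoints

result_good = satisfies_strict_1_rollback(histories, good_line)
result_bad = satisfies_strict_1_rollback(histories, bad_line)
```

With these assumed histories, `good_line` stays within the bound for every process, while `bad_line` violates it because `p0` discards two checkpoints.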

Related Topics
Physical Sciences and Engineering > Computer Science > Computational Theory and Mathematics