Article ID: 1133751
Journal: Computers & Industrial Engineering
Published Year: 2015
Pages: 12
File Type: PDF
Abstract

• We analyze a novel retrial queue with checkpointing and rollback recovery.
• We provide stability conditions and a steady-state analysis, and apply mean value analysis.
• An application of the main model in a Time-Division Duplexing system is studied.
• We investigate fault tolerance along with power saving and delay.

In this paper we analyze a retrial queue that can be used to model fault-tolerant systems with checkpointing and rollback recovery. We assume that the service time of each job is decomposed into N modules, and that a checkpoint is established at the end of each module. Checkpointing and rollback recovery consists, basically, of periodically saving the state of the system on a secure device so that, upon recovery from a system failure, the system can resume the computation from the most recent checkpoint rather than from the beginning. Upon the successful service completion of a job, the server activates a timer and remains awake. If the timer expires without a request, the server departs for a vacation. Upon returning from the vacation, the server activates the timer again. Furthermore, both idle and vacation periods can be interrupted by the server in order to perform secondary jobs. Applications of this model can be found in power saving for mobile devices in a half-duplex communication system operating in a wireless environment, and in long-running software applications. We derive the stability condition and carry out a steady-state analysis. We also apply a mean value analysis to obtain useful performance measures, and prove that the model satisfies the stochastic decomposition property. Useful energy metrics are determined, and constrained optimization problems are formulated and used to obtain extensive numerical results.
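
The checkpointing and rollback mechanism described above can be illustrated with a minimal simulation sketch. It covers only the service of a single job and does not model retrials, the timer, vacations, or secondary jobs. The exponential time to failure, the fixed recovery time, and all function and parameter names are assumptions made for illustration; they are not taken from the paper.

```python
import random


def service_time_with_checkpoints(module_times, failure_rate, recovery_time, rng):
    """Time to finish one job when a checkpoint is saved after every module.

    Assumptions of this sketch (not from the paper): failures arrive after an
    exponentially distributed time, and each recovery takes a fixed time.
    """
    total = 0.0
    for work in module_times:
        while True:
            # Draw the time until the next failure for this attempt.
            if failure_rate > 0:
                time_to_failure = rng.expovariate(failure_rate)
            else:
                time_to_failure = float("inf")
            if time_to_failure >= work:
                # Module completed; its checkpoint is now the rollback point.
                total += work
                break
            # Failure mid-module: the partial work is lost, the system
            # recovers, and only the current module is repeated.
            total += time_to_failure + recovery_time
    return total


def service_time_without_checkpoints(module_times, failure_rate, recovery_time, rng):
    """Same job with no intermediate checkpoints: a failure restarts everything."""
    whole_job = [sum(module_times)]
    return service_time_with_checkpoints(whole_job, failure_rate, recovery_time, rng)


if __name__ == "__main__":
    rng = random.Random(2015)
    modules = [1.0, 1.0, 1.0, 1.0]  # N = 4 modules of equal length (hypothetical)
    runs = 5000
    with_cp = sum(service_time_with_checkpoints(modules, 0.5, 0.1, rng)
                  for _ in range(runs)) / runs
    without_cp = sum(service_time_without_checkpoints(modules, 0.5, 0.1, rng)
                     for _ in range(runs)) / runs
    print(f"mean service time with checkpoints:    {with_cp:.2f}")
    print(f"mean service time without checkpoints: {without_cp:.2f}")
```

Under these assumptions, rolling back to the most recent checkpoint bounds the work lost per failure by the length of one module, so the average service time should come out noticeably shorter than when every failure restarts the job from the beginning.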

Related Topics
Physical Sciences and Engineering > Engineering > Industrial and Manufacturing Engineering