Article ID Journal Published Year Pages File Type
461181 Journal of Systems and Software 2011 10 Pages PDF
Abstract

This work addresses the leader election problem in partially synchronous distributed systems where processes can crash and recover. More precisely, it focuses on implementing the Omega failure detector class, which provides a leader election functionality, in the crash–recovery failure model. The concepts of communication efficiency and near-efficiency for an algorithm implementing Omega are defined. Depending on the use or not of stable storage, the property satisfied by unstable processes, i.e., those that crash and recover infinitely often, varies. Two algorithms implementing Omega are presented. In the first algorithm, which is communication-efficient and uses stable storage, eventually and permanently unstable processes agree on the leader with correct processes. In the second algorithm, which is near-communication-efficient and does not use stable storage, processes start their execution with no leader in order to avoid the disagreement among unstable processes, that will agree on the leader with correct processes after receiving a first message from the leader.

• We implement Omega in crash–recovery partially synchronous distributed systems. • A first algorithm uses stable storage and is communication-efficient. • A second algorithm does not use stable storage and is near-communication-efficient. • The algorithms differ on the degree of agreement of unstable processes.

Related Topics
Physical Sciences and Engineering Computer Science Computer Networks and Communications
Authors
, , ,