Communication-efficient leader election in crash

Article ID	Journal	Published Year	Pages	File Type
461181	Journal of Systems and Software	2011	10 Pages	PDF

Abstract

This work addresses the leader election problem in partially synchronous distributed systems where processes can crash and recover. More precisely, it focuses on implementing the Omega failure detector class, which provides a leader election functionality, in the crash–recovery failure model. The concepts of communication efficiency and near-efficiency for an algorithm implementing Omega are defined. Depending on the use or not of stable storage, the property satisfied by unstable processes, i.e., those that crash and recover infinitely often, varies. Two algorithms implementing Omega are presented. In the first algorithm, which is communication-efficient and uses stable storage, eventually and permanently unstable processes agree on the leader with correct processes. In the second algorithm, which is near-communication-efficient and does not use stable storage, processes start their execution with no leader in order to avoid the disagreement among unstable processes, that will agree on the leader with correct processes after receiving a first message from the leader.

• We implement Omega in crash–recovery partially synchronous distributed systems. • A first algorithm uses stable storage and is communication-efficient. • A second algorithm does not use stable storage and is near-communication-efficient. • The algorithms differ on the degree of agreement of unstable processes.

Keywords

Consensus Leader election