Generic Platform for Failure Recovery in Survivable Trees

Authors

  • V. Dynda

DOI:

https://doi.org/10.14311/780

Keywords:

fault tolerance, failure recovery, tree restoration, distributed algorithms

Abstract

Failure recovery is a fundamental task of dependable systems, needed to achieve fault-tolerant communications, smooth operation of system components and a comfortable user interface. Tree topologies are fragile, yet they are quite popular structures in computer systems. The term survivable tree denotes the capability of a tree network to deliver messages even in the presence of failures. In this paper, we analyze the characteristics of large-scale overlay survivable trees and identify the requirements for general-purpose failure recovery mechanisms in such an environment. We outline a generic failure recovery platform for preplanned tree restoration which meets those requirements, and we focus primarily on its completeness and correctness properties. The platform is based on bypass rings; it uses a bypass routing algorithm to ensure completeness and a specialized leader election to guarantee correctness. The platform supports multiple, on-line and on-the-fly recovery, and provides an optional level of fault tolerance, protection selectivity and optimization capability. It is independent of the protected tree type (regarding traffic direction, number of sources, etc.) and forms a basis for application-specific fragment reconnection.

Published

2005-01-06

How to Cite

Dynda, V. (2005). Generic Platform for Failure Recovery in Survivable Trees. Acta Polytechnica, 45(6). https://doi.org/10.14311/780

Section

Articles