Fault Tolerance in Decentralized Systems

Brian RANDELL

Summary:

In a decentralised system the problems of fault tolerance, and in particular error recovery, vary greatly depending on the design assumptions. For example, in a distributed database system, if one disregards the possibility of undetected invalid inputs or outputs, the errors that have to be recovered from will affect only the database, and backward error recovery will be feasible and should suffice. Such a system typically supports a set of activities that compete for access to a shared database but are otherwise essentially independent of each other. In such circumstances, conventional database transaction processing and distributed protocols enable backward recovery to be provided very effectively. In more general systems, however, the multiple activities will often not simply be competing against each other, but will at times be attempting to co-operate with each other in pursuit of some common goal. Moreover, the activities in decentralised systems typically involve not just computers, but also external entities that are not capable of backward error recovery. Such additional complications make the task of error recovery more challenging, and indeed more interesting. This paper provides a brief analysis of the consequences of various such complications, and outlines some recent work on advanced error recovery techniques that they have motivated.

Publication
IEICE TRANSACTIONS on Communications Vol.E83-B No.5 pp.903-907
Publication Date
2000/05/25
Type of Manuscript
Special Section INVITED PAPER (IEICE/IEEE Joint Special Issue on Autonomous Decentralized Systems)