IEICE TRANSACTIONS on Information

A Novel Replication Technique for Detecting and Masking Failures for Parallel Software: Active Parallel Replication

Adel CHERIF, Masato SUZUKI, Takuya KATAYAMA

Summary:

We present a novel replication technique for parallel applications in which instances of the replicated application are active on different groups of processors, called replicas. The technique is based on the FTAG (Fault Tolerant Attribute Grammar) computation model, a functional, attribute-based model. It implements "active parallel replication": all replicas are active, and each concurrently computes a different piece of the application's parallel code. In our model, replicas cooperate not only to detect and mask failures but also to perform the parallel computation. The replication mechanisms are supported by the FTAG run-time system and are fully application-transparent. Several novel mechanisms for checkpointing and recovery are developed. During rollback recovery, only the part of the computation that was detected as faulty is discarded. The replication technique takes full advantage of parallel computing to reduce overall computation time.
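The idea in the summary can be illustrated with a small toy sketch. It is not the FTAG run-time system and makes several assumptions not stated in the abstract: the computation is split into independent named pieces, replicas are simulated sequentially in one process, failures are injected through a hypothetical `faulty` set, and a failed piece is retried on the next replica. All names (`active_parallel_run`, `run_on_replica`, `ReplicaFailure`) are hypothetical. The sketch only shows that each replica is assigned a different piece and that a detected failure discards that piece alone.

```python
from typing import Callable, Dict, List, Set, Tuple

# A piece of the computation: (name, zero-argument function). Hypothetical type.
Task = Tuple[str, Callable[[], int]]


class ReplicaFailure(Exception):
    """Raised when a (simulated) replica fails while computing a piece."""


def run_on_replica(replica: int, name: str, fn: Callable[[], int],
                   faulty: Set[Tuple[int, str]]) -> int:
    # Simulated fault injection: (replica, piece) pairs in `faulty` always fail.
    if (replica, name) in faulty:
        raise ReplicaFailure(f"replica {replica} failed on {name}")
    return fn()


def active_parallel_run(tasks: List[Task], n_replicas: int,
                        faulty: Set[Tuple[int, str]]
                        ) -> Tuple[Dict[str, int], List[str]]:
    """Assign each piece to a different replica (round-robin) and, on a
    detected failure, recompute only that piece on another replica."""
    results: Dict[str, int] = {}
    recomputed: List[str] = []
    for i, (name, fn) in enumerate(tasks):
        primary = i % n_replicas
        try:
            results[name] = run_on_replica(primary, name, fn, faulty)
        except ReplicaFailure:
            # Only the failed piece is discarded; earlier results are kept.
            recomputed.append(name)
            for offset in range(1, n_replicas):
                backup = (primary + offset) % n_replicas
                try:
                    results[name] = run_on_replica(backup, name, fn, faulty)
                    break
                except ReplicaFailure:
                    continue
            else:
                raise RuntimeError(f"all replicas failed on {name}")
    return results, recomputed


tasks: List[Task] = [("a", lambda: 1 + 2), ("b", lambda: 3 * 4), ("c", lambda: 10 - 5)]
results, recomputed = active_parallel_run(tasks, n_replicas=2, faulty={(1, "b")})
```

In this run, piece "b" fails on its primary replica and is recomputed on the other one, while the results for "a" and "c" are left untouched. A real system would run the replicas concurrently and would need a failure detector rather than an injected exception.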

Publication
IEICE TRANSACTIONS on Information Vol.E80-D No.9 pp.886-892
Publication Date
1997/09/25
Type of Manuscript
Special Section PAPER (Special Issue on Architectures, Algorithms and Networks for Massively Parallel Computing)
Category
Fault Tolerance
