Error Models and Fault-Secure Scheduling in Multiprocessor Systems

Koji HASHIMOTO; Tatsuhiro TSUCHIYA; Tohru KIKUNO

IEICE TRANSACTIONS on Information

Summary:

A schedule for a parallel program is said to be 1-fault-secure if a system that uses the schedule can either produce correct output for the program or detect the presence of any faults in a single processor. Although several fault-secure scheduling algorithms have been proposed, all of them apply only to a class of tree-structured task graphs with a uniform computation cost. Moreover, they assume a stringent error model, called the redeemable error model, that considers extremely unlikely cases. In this paper, we first propose two new plausible error models that restrict the manner of error propagation. We then present three fault-secure scheduling algorithms, one for each of the three models (the redeemable error model and the two new ones). Unlike previous algorithms, the proposed algorithms can handle any task graph with arbitrary computation and communication costs. Through experiments, we evaluate these algorithms and study the impact of the error models on the lengths of fault-secure schedules.

Publication: IEICE TRANSACTIONS on Information Vol.E84-D No.5 pp.635-650

Publication Date: 2001/05/01

Type of Manuscript: PAPER

Category: Fault Tolerance

Cite this

Koji HASHIMOTO, Tatsuhiro TSUCHIYA, Tohru KIKUNO, "Error Models and Fault-Secure Scheduling in Multiprocessor Systems" in IEICE TRANSACTIONS on Information, vol. E84-D, no. 5, pp. 635-650, May 2001.
URL: https://global.ieice.org/en_transactions/information/10.1587/e84-d_5_635/_p

@ARTICLE{e84-d_5_635,
author={Koji HASHIMOTO and Tatsuhiro TSUCHIYA and Tohru KIKUNO},
journal={IEICE TRANSACTIONS on Information},
title={Error Models and Fault-Secure Scheduling in Multiprocessor Systems},
year={2001},
volume={E84-D},
number={5},
pages={635-650},
abstract={A schedule for a parallel program is said to be 1-fault-secure if a system that uses the schedule can either produce correct output for the program or detect the presence of any faults in a single processor. Although several fault-secure scheduling algorithms have been proposed, they can all only be applied to a class of tree-structured task graphs with a uniform computation cost. Besides, they assume a stringent error model, called the redeemable error model, that considers extremely unlikely cases. In this paper, we first propose two new plausible error models which restrict the manner of error propagation. Then we present three fault-secure scheduling algorithms, one for each of the three models. Unlike previous algorithms, the proposed algorithms can deal with any task graphs with arbitrary computation and communication costs. Through experiments, we evaluate these algorithms and study the impact of the error models on the lengths of fault-secure schedules.},
month={May},
}

TY - JOUR
TI - Error Models and Fault-Secure Scheduling in Multiprocessor Systems
T2 - IEICE TRANSACTIONS on Information
SP - 635
EP - 650
AU - Koji HASHIMOTO
AU - Tatsuhiro TSUCHIYA
AU - Tohru KIKUNO
PY - 2001
DO -
JO - IEICE TRANSACTIONS on Information
SN -
VL - E84-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2001
AB - A schedule for a parallel program is said to be 1-fault-secure if a system that uses the schedule can either produce correct output for the program or detect the presence of any faults in a single processor. Although several fault-secure scheduling algorithms have been proposed, they can all only be applied to a class of tree-structured task graphs with a uniform computation cost. Besides, they assume a stringent error model, called the redeemable error model, that considers extremely unlikely cases. In this paper, we first propose two new plausible error models which restrict the manner of error propagation. Then we present three fault-secure scheduling algorithms, one for each of the three models. Unlike previous algorithms, the proposed algorithms can deal with any task graphs with arbitrary computation and communication costs. Through experiments, we evaluate these algorithms and study the impact of the error models on the lengths of fault-secure schedules.
ER -
