The search functionality is under construction.

The search functionality is under construction.

In this paper, we propose a new scheduling algorithm to achieve fault tolerance in multiprocessor systems. This algorithm first partitions a parallel program into subsets of tasks, based on the notion of height of a task graph. For each subset, the algorithm then duplicates and schedules the tasks in the subset successively. We prove that schedules obtained by the proposed algorithm can tolerate a single processor failure and show that the computational complexity of the algorithm is *O*(|*V*|^{4}) where *V* is the set of nodes of a task graph. We conduct simulations by applying the algorithm to two kinds of practical task graphs (Gaussian elimination and LU-decomposition). The results of this experiment show that fault tolerance can be achieved at the cost of small degree of time redundancy, and that performance in the case of a processor failure is improved compared to a previous algorithm.

- Publication
- IEICE TRANSACTIONS on Information Vol.E85-D No.3 pp.525-534

- Publication Date
- 2002/03/01

- Publicized

- Online ISSN

- DOI

- Type of Manuscript
- PAPER

- Category
- Fault Tolerance

The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.

Copy

Koji HASHIMOTO, Tatsuhiro TSUCHIYA, Tohru KIKUNO, "Effective Scheduling of Duplicated Tasks for Fault Tolerance in Multiprocessor Systems" in IEICE TRANSACTIONS on Information,
vol. E85-D, no. 3, pp. 525-534, March 2002, doi: .

Abstract: In this paper, we propose a new scheduling algorithm to achieve fault tolerance in multiprocessor systems. This algorithm first partitions a parallel program into subsets of tasks, based on the notion of height of a task graph. For each subset, the algorithm then duplicates and schedules the tasks in the subset successively. We prove that schedules obtained by the proposed algorithm can tolerate a single processor failure and show that the computational complexity of the algorithm is *O*(|*V*|^{4}) where *V* is the set of nodes of a task graph. We conduct simulations by applying the algorithm to two kinds of practical task graphs (Gaussian elimination and LU-decomposition). The results of this experiment show that fault tolerance can be achieved at the cost of small degree of time redundancy, and that performance in the case of a processor failure is improved compared to a previous algorithm.

URL: https://global.ieice.org/en_transactions/information/10.1587/e85-d_3_525/_p

Copy

@ARTICLE{e85-d_3_525,

author={Koji HASHIMOTO, Tatsuhiro TSUCHIYA, Tohru KIKUNO, },

journal={IEICE TRANSACTIONS on Information},

title={Effective Scheduling of Duplicated Tasks for Fault Tolerance in Multiprocessor Systems},

year={2002},

volume={E85-D},

number={3},

pages={525-534},

abstract={In this paper, we propose a new scheduling algorithm to achieve fault tolerance in multiprocessor systems. This algorithm first partitions a parallel program into subsets of tasks, based on the notion of height of a task graph. For each subset, the algorithm then duplicates and schedules the tasks in the subset successively. We prove that schedules obtained by the proposed algorithm can tolerate a single processor failure and show that the computational complexity of the algorithm is *O*(|*V*|^{4}) where *V* is the set of nodes of a task graph. We conduct simulations by applying the algorithm to two kinds of practical task graphs (Gaussian elimination and LU-decomposition). The results of this experiment show that fault tolerance can be achieved at the cost of small degree of time redundancy, and that performance in the case of a processor failure is improved compared to a previous algorithm.},

keywords={},

doi={},

ISSN={},

month={March},}

Copy

TY - JOUR

TI - Effective Scheduling of Duplicated Tasks for Fault Tolerance in Multiprocessor Systems

T2 - IEICE TRANSACTIONS on Information

SP - 525

EP - 534

AU - Koji HASHIMOTO

AU - Tatsuhiro TSUCHIYA

AU - Tohru KIKUNO

PY - 2002

DO -

JO - IEICE TRANSACTIONS on Information

SN -

VL - E85-D

IS - 3

JA - IEICE TRANSACTIONS on Information

Y1 - March 2002

AB - In this paper, we propose a new scheduling algorithm to achieve fault tolerance in multiprocessor systems. This algorithm first partitions a parallel program into subsets of tasks, based on the notion of height of a task graph. For each subset, the algorithm then duplicates and schedules the tasks in the subset successively. We prove that schedules obtained by the proposed algorithm can tolerate a single processor failure and show that the computational complexity of the algorithm is *O*(|*V*|^{4}) where *V* is the set of nodes of a task graph. We conduct simulations by applying the algorithm to two kinds of practical task graphs (Gaussian elimination and LU-decomposition). The results of this experiment show that fault tolerance can be achieved at the cost of small degree of time redundancy, and that performance in the case of a processor failure is improved compared to a previous algorithm.

ER -