1-2hit |
Ryo SUZUKI Mamoru OHARA Masayuki ARAI Satoshi FUKUMOTO Kazuhiko IWASAKI
This paper discusses hybrid state saving for applications in which processes should create checkpoints at constant intervals and can hold a finite number of checkpoints. We propose a reclamation technique for checkpoint space, that provides effective checkpoint time arrangements for a rollback distance distribution. Numerical examples show that when we cannot use the optimal checkpoint interval due to the system requirements, the proposed technique can achieve lower expected overhead compared to the conventional technique without considering the form of the rollback distance distribution.
Determining consistent global checkpoints of a distributed computation has applications in the areas such as rollback recovery, distributed debugging, output commit and others. Netzer and Xu introduced the notion of zigzag paths and presented necessary and sufficient conditions for a set of checkpoints to be part of a consistent global checkpoint. This result also reveals that determining the existence of zigzag paths between checkpoints is crucial for determining consistent global checkpoints. Recent research also reveals that determining zigzag paths on-line is not possible. In this paper, we present an off-line method for determining the existence of zigzag paths between checkpoints.