Junjun ZHENG Hiroyuki OKAMURA Tadashi DOHI
In this paper, we present non-Markovian availability models that capture the behavior of an operational software system undergoing aperiodic time-based software rejuvenation and checkpointing. Two availability models with rejuvenation are considered, which differ in the procedure followed after the rollback recovery operation completes. We then investigate whether there exists an optimal rejuvenation schedule that maximizes the steady-state system availability. The availability is derived by means of the phase expansion technique, because the resulting models are not simple stochastic models such as semi-Markov processes or Markov regenerative processes and are therefore hard to solve with common approaches such as the Laplace-Stieltjes transform and embedded Markov chain techniques. Numerical experiments are conducted to determine the optimal rejuvenation trigger timing that maximizes the steady-state system availability for each model, and to compare the two models.
Tadashi DOHI Hiroaki SUZUKI Kishor S. TRIVEDI
Software rejuvenation is a preventive and proactive solution that is particularly useful for counteracting the phenomenon of software aging. In this paper, we consider both the periodic and non-periodic software rejuvenation policies under different dependability measures. As is well known, the steady-state system availability is the probability that the software system is operating in the steady state and, at the same time, is often regarded as the mean up rate in the system operation period. We show that the mean up rate should be defined as the mean value of up rate, but not as the mean up time per mean operation time. We derive numerically the optimal software rejuvenation policies which maximize the steady-state system availability and the mean up rate, respectively, for each periodic or non-periodic model. Numerical examples show that the real mean up rate is always smaller than the system availability in the steady state and that the availability overestimates the ratio of operative time of the software system.
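The distinction between the two measures can be made concrete with a small simulation. The following is a minimal sketch, not the paper's model: it assumes independent, exponentially distributed up and down times with hypothetical means (`mean_up`, `mean_down`), and estimates both the ratio of mean up time to mean cycle time and the mean of the per-cycle up rate.

```python
import random

def simulate_measures(n_cycles=100_000, mean_up=100.0, mean_down=5.0, seed=1):
    """Estimate two dependability measures from simulated up/down cycles.

    Assumption (not from the paper): up and down times are independent
    and exponentially distributed with the given means.
    """
    rng = random.Random(seed)
    total_up = 0.0
    total_cycle = 0.0
    sum_ratio = 0.0
    for _ in range(n_cycles):
        up = rng.expovariate(1.0 / mean_up)
        down = rng.expovariate(1.0 / mean_down)
        total_up += up
        total_cycle += up + down
        sum_ratio += up / (up + down)
    availability = total_up / total_cycle   # E[U] / E[U + D]
    mean_up_rate = sum_ratio / n_cycles     # E[U / (U + D)]
    return availability, mean_up_rate

if __name__ == "__main__":
    a, m = simulate_measures()
    print(f"mean up time / mean cycle time: {a:.4f}")
    print(f"mean of per-cycle up rate:      {m:.4f}")
```

Under these assumed distributions the two estimates differ noticeably, which is the point of the sketch; the ordering for other distributions is not established by this example.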
Hiroyuki OKAMURA Satoshi MIYAHARA Tadashi DOHI
Long-running software systems are known to experience a phenomenon called software aging, in which the accumulation of errors during the execution of software leads to performance degradation and eventually results in failure. To counteract this phenomenon, a proactive fault management approach called software rejuvenation is particularly useful. It essentially involves gracefully terminating an application or a system and restarting it in a clean internal state. In this paper, we evaluate the dependability performance of a communication network system with software rejuvenation under the assumption that requests arrive according to a Markov-modulated Poisson process (MMPP). Three dependability measures, namely the steady-state availability, the loss probability of requests, and the mean response time of tasks, are derived through the hidden Markovian analysis based on the time-based software rejuvenation scheme. In numerical examples, we investigate the sensitivity of the dependability measures to some model parameters.
In recent years, considerable attention has been devoted to continuously running software systems whose performance degrades gradually over time. Software aging often affects the performance of a software system and eventually causes it to fail. A novel approach to handling transient software failures due to software aging is called software rejuvenation, which can be regarded as a preventive and proactive solution that is particularly useful for counteracting the aging phenomenon. In this paper, we focus on a high-assurance software system with fault tolerance and preventive rejuvenation, and analyze the stochastic behavior of such a highly critical software system. More precisely, we consider a fault-tolerant software system with a two-version redundant structure and a random rejuvenation schedule, and quantitatively evaluate dependability measures such as the steady-state system availability and MTTF based on the familiar Markovian analysis. In numerical examples, we examine how the system dependability measures depend on two fault-tolerance techniques: design diversity and environment diversity.
Hiroyuki OKAMURA Satoshi MIYAHARA Tadashi DOHI Shunji OSAKI
Software rejuvenation is one of the most effective preventive maintenance techniques for operational software systems with high assurance requirements. In this paper, we propose a workload-based software rejuvenation scheme for a server-type software system, and develop stochastic models to determine the optimal software rejuvenation schedules for some dependability measures. In numerical examples, we quantitatively evaluate the performance of the workload-based software rejuvenation scheme and compare it with the time-based rejuvenation scheme.
Hiroyuki OKAMURA Jungang GUAN Chao LUO Tadashi DOHI
This paper considers how to evaluate the resiliency of a virtualized system with software rejuvenation. Software rejuvenation is a proactive technique to prevent failures caused by aging phenomena such as resource exhaustion. In particular, following Ghosh et al. (2010), we compute a quantitative criterion for evaluating the resiliency of a system by using continuous-time Markov chains (CTMCs). In addition, in order to convert general state-based models to CTMCs, we employ the PH (phase-type) expansion technique. In numerical examples, we investigate the resiliency of the virtualized system with software rejuvenation under two different rejuvenation policies.
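As an illustration of the kind of CTMC computation involved, the sketch below solves for the steady-state distribution of a small rejuvenation model and reads off the availability. The four-state structure (robust, failure-probable, failed, rejuvenating), the function names, and all rate values are assumptions chosen for illustration; the sketch covers neither the paper's virtualized-system model nor its resiliency criterion.

```python
def steady_state(Q):
    """Solve pi * Q = 0 with sum(pi) = 1 for an irreducible CTMC generator Q."""
    n = len(Q)
    # Row i of A is the balance equation of state i (transpose of Q);
    # the last balance equation is replaced by the normalization condition.
    A = [[Q[j][i] for j in range(n)] for i in range(n)]
    A[-1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    pi = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * pi[j] for j in range(i + 1, n))
        pi[i] = (b[i] - s) / A[i][i]
    return pi

def rejuvenation_generator(aging, fail, rejuv, repair, restart):
    """Hypothetical model. States: 0 robust, 1 failure-probable,
    2 failed, 3 rejuvenating."""
    return [
        [-aging, aging, 0.0, 0.0],
        [0.0, -(fail + rejuv), fail, rejuv],
        [repair, 0.0, -repair, 0.0],
        [restart, 0.0, 0.0, -restart],
    ]

if __name__ == "__main__":
    # Illustrative rates per hour (assumed values).
    Q = rejuvenation_generator(1 / 240, 1 / 720, 1 / 100, 2.0, 6.0)
    pi = steady_state(Q)
    print("steady-state availability:", pi[0] + pi[1])
```

A generally distributed sojourn time could be approximated by splitting a state into several exponential phases before calling `steady_state`, which is the idea behind PH expansion; the sketch leaves that step out.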
Masanori ODAGIRI Tadashi DOHI Naoto KAIO Shunji OSAKI
This article considers a hybrid data backup model for a file system, which combines a conventional magnetic disk (MD) with a write-once, read-many optical disk (OD). Since OD has recently become a lower-cost and longer-life medium than ordinary MD, this kind of backup configuration is now recognized as important. We mathematically formulate the hybrid data backup model and obtain the closed-form average cost rate when the system failure time and the recovery time follow exponential distributions. Numerical calculations are carried out to obtain the optimal backup policy, which consists of two backup sizes, from the main memory to MD and from MD to OD, and minimizes the average cost rate. In numerical examples, the dependence of the optimal backup policy on the failure and recovery mechanisms is examined.
Hiroki FUJIO Hiroyuki OKAMURA Tadashi DOHI
Software rejuvenation is a proactive fault management technique for operational software systems that age due to error conditions accruing with time and/or load, and is important for high-assurance system design. In this paper, fine-grained shock models are developed to determine the optimal rejuvenation policies that maximize the system availability. We introduce three kinds of rejuvenation schemes and calculate the optimal software rejuvenation schedule maximizing the system availability for each scheme. The stochastic models with three rejuvenation policies are extensions of Bobbio et al. (1998, 2001) and represent the failure phenomenon due to the exhaustion of software resources caused by memory leaks, fragmentation, etc. Numerical examples are devoted to comparing the three control schemes quantitatively.
Network survivability is defined as the ability of a network to remain connected under failures and/or attacks. In this paper, we propose two stochastic models, a binomial model and a negative binomial model, to quantify the network survivability, and compare them with the existing Poisson model. We give mathematical formulae for the approximate network survivability of the respective models and use them to carry out a sensitivity analysis on model parameters. Through numerical examples, it is shown that the network survivability can change drastically when the number of network nodes is relatively small under a severe attack mode called the Black hole attack.
Recently, the wavelet-based estimation method has gradually become popular as a new tool for software reliability assessment. The wavelet transform possesses both spatial and temporal resolution, which makes the wavelet-based estimation method powerful in extracting necessary information from observed software fault data from global and local points of view at the same time. This enables us to estimate software reliability measures with higher accuracy. However, existing work has focused only on point estimation with the wavelet-based approach, where the underlying stochastic process describing the software fault-detection phenomenon is modeled by a non-homogeneous Poisson process. In this paper, we propose an interval estimation method for the wavelet-based approach, aiming to take account of the uncertainty that is left out of consideration in point estimation. More specifically, we employ the simulation-based bootstrap method and derive confidence intervals of software reliability measures such as the software intensity function and the expected cumulative number of software faults. To this end, we extend the well-known thinning algorithm to generate multiple sample data sets from one set of software-fault count data. The results of numerical analysis with real software fault data show that our proposal is a decision support method that enables practitioners to make flexible decisions in software development project management.
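For readers unfamiliar with thinning, the sketch below shows the basic algorithm for sampling event times from a non-homogeneous Poisson process, followed by a simple percentile interval over repeated samples. It is the classical thinning procedure, not the extension proposed in the paper, and the exponentially decaying intensity with parameters `A` and `B` is a hypothetical choice standing in for an estimated intensity function.

```python
import math
import random

def nhpp_thinning(intensity, lam_max, t_end, rng):
    """Sample event times of an NHPP on (0, t_end] by thinning.

    Requires intensity(t) <= lam_max for all t in (0, t_end].
    """
    t = 0.0
    events = []
    while True:
        # Candidates come from a homogeneous Poisson process of rate lam_max.
        t += rng.expovariate(lam_max)
        if t > t_end:
            return events
        # Keep each candidate with probability intensity(t) / lam_max.
        if rng.random() * lam_max <= intensity(t):
            events.append(t)

# Hypothetical intensity (an assumption): A * B * exp(-B * t), bounded by A * B.
A, B = 100.0, 0.1

def intensity(t):
    return A * B * math.exp(-B * t)

def count_interval(n_samples=300, t_end=20.0, level=0.95, seed=7):
    """Percentile interval and mean of the number of events up to t_end."""
    rng = random.Random(seed)
    counts = sorted(len(nhpp_thinning(intensity, A * B, t_end, rng))
                    for _ in range(n_samples))
    lo = counts[int((1 - level) / 2 * n_samples)]
    hi = counts[int((1 + level) / 2 * n_samples) - 1]
    return lo, hi, sum(counts) / n_samples

if __name__ == "__main__":
    print(count_interval())
```

The sample mean of the counts should be close to the integral of the assumed intensity over (0, t_end], which gives a quick sanity check on the sampler.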
Chen LI Junjun ZHENG Hiroyuki OKAMURA Tadashi DOHI
Utilization data (a kind of incomplete data) is defined as the fraction of a fixed period in which the system is busy. In computer systems, utilization data such as CPU utilization is very common and easily observable. It is therefore of more practical significance to estimate the parameters of transaction-based systems from utilization data than from inter-arrival times or waiting times. In our previous work [7], a novel parameter estimation method using utilization data for an Mt/M/1/K queueing system was presented to estimate the parameters of a non-homogeneous Poisson process (NHPP). Since the NHPP is classified as a simple counting process, it may not fit actual arrival streams very well. As a generalization of the NHPP, the Markovian arrival process (MAP) takes account of the dependency between consecutive arrivals and is often used to model complex, bursty, and correlated traffic streams. In this paper, we concentrate on the parameter estimation of an MAP/M/1/K queueing system using utilization data. In particular, the parameters are estimated by the maximum likelihood estimation (MLE) method. Numerical experiments on real utilization data validate the proposed approach and evaluate the effective traffic intensity of the arrival stream of the MAP/M/1/K queueing system. In addition, three kinds of utilization datasets are created from a simulation to assess the effects of the observation time interval on both estimation accuracy and computational cost. The numerical results show that the MAP-based approach outperforms the existing method in terms of both estimation accuracy and computational cost.
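The definition of utilization data given at the start of this abstract can be sketched as follows. The function name, the representation of busy periods as non-overlapping `(start, end)` pairs, and the sample values are assumptions for illustration; the sketch only shows how utilization values arise from busy periods and does not touch the estimation method.

```python
def utilization(busy_intervals, period, n_windows):
    """Fraction of each fixed-length window in which the system is busy.

    Assumption: busy_intervals holds non-overlapping (start, end) pairs
    measured from time 0; window k covers [k * period, (k + 1) * period).
    """
    busy_time = [0.0] * n_windows
    for start, end in busy_intervals:
        for k in range(n_windows):
            lo, hi = k * period, (k + 1) * period
            overlap = min(end, hi) - max(start, lo)
            if overlap > 0:
                busy_time[k] += overlap
    return [b / period for b in busy_time]

if __name__ == "__main__":
    # Two busy periods observed over three unit-length windows.
    print(utilization([(0.5, 1.5), (2.0, 2.25)], period=1.0, n_windows=3))
```

Only the per-window fractions are kept, so the individual arrival and departure times are lost, which is why utilization data counts as incomplete data.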