Xuefeng WU Jie LI Hisao KAMEDA
In this paper, we present an analytic model for studying the reliability of several important disk array organizations proposed in the literature. These organizations combine two options for data layout (regular RAID-5 and block designs) with three alternatives for sparing (hot sparing, distributed sparing, and parity sparing). Uncorrectable bit errors have a large effect on reliability but are ignored in traditional reliability analyses of disk arrays. Our model considers both disk failures and uncorrectable bit errors. Reliability is measured in terms of MTTDL (Mean Time To Data Loss). We derive a unified MTTDL formula for these disk array organizations and use the analytic model to compare their MTTDLs. Numerical experiments show that data losses caused by uncorrectable bit errors may dominate the data losses of disk array systems, even though traditionally only data losses caused by disk failures are considered. Taking uncorrectable bit errors into account therefore gives a more realistic picture of the reliability of disk array systems.
Xuefeng WU Jie LI Hisao KAMEDA
UNcorrectable Bit Errors (UNBEs) are important when considering the reliability of Redundant Arrays of Inexpensive Disks (RAID). However, they have been ignored or not studied in detail in existing reliability analyses of RAID. In this paper, we present an analytic model for studying the reliability of declustered-parity RAID that takes UNBEs into account. Using the model, we obtain optimistic and pessimistic estimates of the probability that data loss occurs due to an UNBE during data reconstruction after a disk failure (we call this DB data loss). We then obtain optimistic and pessimistic estimates of the Mean Time To Data Loss (MTTDL) that account for both DB data loss and the data loss caused by double independent disk failures (we call this DD data loss). Finally, we study by numerical analysis how the MTTDL depends on the number of units in a parity stripe, the rebuild time of a failed disk, and the write fraction of data accesses.
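
To make the distinction between DD and DB data loss concrete, the following is a minimal sketch and not the model from either paper. It assumes a single non-declustered parity group, independent exponentially distributed disk failures, a constant bit error rate, and a rebuild that reads every bit of the surviving disks. It ignores declustering, sparing, the write fraction, and the optimistic and pessimistic bounds. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math

def p_db_loss(bits_read, bit_error_rate):
    """Probability of at least one uncorrectable bit error while reading
    `bits_read` bits, i.e. 1 - (1 - ber)^bits, computed stably."""
    return -math.expm1(bits_read * math.log1p(-bit_error_rate))

def p_dd_loss(n_disks, mttf_hours, rebuild_hours):
    """First-order probability that one of the n-1 surviving disks fails
    during the rebuild window (valid when rebuild_hours << mttf_hours)."""
    return (n_disks - 1) * rebuild_hours / mttf_hours

def mttdl_hours(n_disks, mttf_hours, rebuild_hours, disk_bits, bit_error_rate):
    """Approximate MTTDL: mean time to a first disk failure divided by the
    probability that the following rebuild ends in DD or DB data loss."""
    p_dd = p_dd_loss(n_disks, mttf_hours, rebuild_hours)
    p_db = p_db_loss((n_disks - 1) * disk_bits, bit_error_rate)
    p_loss = p_dd + (1.0 - p_dd) * p_db  # loss from either cause
    return (mttf_hours / n_disks) / p_loss

if __name__ == "__main__":
    # Hypothetical parameters: 10 disks, 1e6 h MTTF, 24 h rebuild,
    # 8e12 bits per disk, one uncorrectable error per 1e14 bits read.
    n, mttf, rebuild, bits, ber = 10, 1.0e6, 24.0, 8.0e12, 1.0e-14
    print("P(DD):", p_dd_loss(n, mttf, rebuild))
    print("P(DB):", p_db_loss((n - 1) * bits, ber))
    print("MTTDL, disk failures only (h):", mttdl_hours(n, mttf, rebuild, bits, 0.0))
    print("MTTDL, with bit errors (h):   ", mttdl_hours(n, mttf, rebuild, bits, ber))
```

With the bit error rate set to zero, the sketch reduces to the familiar disk-failure-only approximation MTTF^2 / (N (N-1) MTTR). A nonzero bit error rate can only lower the estimate, which is the qualitative effect both abstracts describe.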