Keyword Search Result

[Keyword] erasure coding(2hit)

  • Reliability and Failure Impact Analysis of Distributed Storage Systems with Dynamic Refuging

    Hiroaki AKUTSU  Kazunori UEDA  Takeru CHIBA  Tomohiro KAWAGUCHI  Norio SHIMOZONO  

     
    PAPER-Data Engineering, Web Information Systems

    Publicized: 2016/06/17
    Vol: E99-D No: 9
    Page(s): 2259-2268

    In recent data centers, large-scale storage systems storing big data comprise thousands of large-capacity drives. Our goal is to establish a method for building highly reliable storage systems using more than a thousand low-cost large-capacity drives. Some large-scale storage systems protect data by erasure coding to prevent data loss. As the redundancy level of erasure coding increases, the probability of data loss decreases, but the overhead of normal data write operations and the additional storage needed for coding both increase. We therefore need to achieve high reliability at the lowest possible redundancy level. There are two concerns regarding reliability in large-scale storage systems: (i) as the number of drives increases, systems are more subject to multiple drive failures and (ii) distributing stripes among many drives can shorten the rebuild time but increases the risk of data loss due to multiple drive failures. If data loss occurs because of multiple drive failures, it affects many users of the storage system. These concerns were not addressed in prior quantitative reliability studies based on realistic settings. In this work, we analyze the reliability of large-scale storage systems with distributed stripes, focusing on an effective rebuild method which we call Dynamic Refuging. Dynamic Refuging rebuilds failed blocks starting from those with the lowest redundancy and strategically selects blocks to read for repairing lost data. We modeled the dynamic change in the amount of storage at each redundancy level caused by multiple drive failures, and performed reliability analysis with Monte Carlo simulation using realistic drive failure characteristics. We showed a failure impact model and a method for localizing the failure. When stripes with redundancy level 3 were sufficiently distributed and rebuilt by Dynamic Refuging, the proposed technique turned out to scale well, and the probability of data loss decreased by two orders of magnitude for systems with a thousand drives compared to normal RAID. The appropriate setting of a stripe distribution level could localize the failure.
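
The "lowest redundancy first" rebuild order described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `rebuild_order`, the stripe map, and the tie-breaking rule are hypothetical, and the strategic selection of blocks to read during repair is not modeled here.

```python
# Hypothetical sketch of a "lowest remaining redundancy first" rebuild queue.
# Names and data layout are assumptions for illustration only.

def rebuild_order(lost_blocks, redundancy):
    """Return the ids of repairable stripes, most endangered first.

    lost_blocks: dict mapping stripe id -> number of blocks currently lost.
    redundancy:  number of block losses a stripe can tolerate (e.g. 3).
    """
    repairable = {
        stripe: lost
        for stripe, lost in lost_blocks.items()
        # Skip intact stripes and stripes that have already lost data.
        if 0 < lost <= redundancy
    }
    # Remaining redundancy = tolerated losses minus losses so far.
    # Sort ascending so stripes closest to data loss are rebuilt first;
    # the stripe id is only a deterministic tie-breaker.
    return sorted(repairable, key=lambda s: (redundancy - repairable[s], s))


order = rebuild_order({"a": 1, "b": 3, "c": 0, "d": 2, "e": 4}, redundancy=3)
print(order)
```

In this example stripe "b" has no redundancy left and is queued first, while "c" (intact) and "e" (already beyond recovery at redundancy level 3) are left out of the queue.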

  • Throughput Capacity Study for MANETs with Erasure Coding and Packet Replication

    Bin YANG  Yin CHEN  Guilin CHEN  Xiaohong JIANG  

     
    PAPER-Network

    Vol: E98-B No: 8
    Page(s): 1537-1552

    Throughput capacity is of great importance for the design and performance optimization of mobile ad hoc networks (MANETs). We study the exact per node throughput capacity of MANETs under a general 2HR-(g, x, f) routing scheme which combines erasure coding and packet replication techniques. Under this scheme, a source node first encodes a group of g packets into x (x ≥ g) distinct coded packets, and then replicates each of the coded packets to at most f relay nodes which help to forward them to the destination node. All original packets can be recovered once the destination node receives any g distinct coded packets of the group. To study the throughput capacity, we first construct two absorbing Markov chain models to depict the complicated packet delivery process under the routing scheme. Based on these Markov models, an analytical expression of the throughput capacity is derived. Extensive simulation and numerical results are provided to verify the accuracy of theoretical results on throughput capacity and to illustrate how system parameters will affect the throughput capacity in MANETs. Interestingly, we find that the replication of coded packets can improve the throughput capacity when the parameter x is relatively small.
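
The property that any g distinct coded packets out of x recover the original group can be illustrated with a small sketch. The abstract does not say which erasure code the scheme uses, so the Reed-Solomon-style polynomial code over the prime field GF(257) below is an assumption, as are the names `encode` and `decode`. Replication to at most f relay nodes is not modeled.

```python
# Hypothetical (g, x) erasure code: polynomial evaluation over GF(257).
# The choice of code and all names are assumptions for illustration.
P = 257  # prime modulus; source symbols must lie in 0..256


def encode(symbols, x):
    """Encode g source symbols into x coded symbols (x >= g).

    The source symbols are used as polynomial coefficients (lowest degree
    first); coded symbol k is the pair (k, polynomial evaluated at k).
    """
    g = len(symbols)
    if not g <= x < P:
        raise ValueError("need g <= x < P")
    coded = []
    for point in range(1, x + 1):
        value = 0
        for coeff in reversed(symbols):  # Horner's rule
            value = (value * point + coeff) % P
        coded.append((point, value))
    return coded


def decode(received, g):
    """Recover the g source symbols from any g distinct coded symbols."""
    points = received[:g]
    if len({pt for pt, _ in points}) < g:
        raise ValueError("need g distinct coded symbols")
    coeffs = [0] * g
    for i, (xi, yi) in enumerate(points):
        basis = [1]  # numerator of the Lagrange basis, lowest degree first
        denom = 1
        for j, (xj, _) in enumerate(points):
            if j == i:
                continue
            # Multiply the basis polynomial by (t - xj).
            nxt = [0] * (len(basis) + 1)
            for k, c in enumerate(basis):
                nxt[k] = (nxt[k] - xj * c) % P
                nxt[k + 1] = (nxt[k + 1] + c) % P
            basis = nxt
            denom = (denom * (xi - xj)) % P
        # Modular inverse via three-argument pow (Python 3.8+).
        scale = yi * pow(denom, -1, P) % P
        for k in range(g):
            coeffs[k] = (coeffs[k] + scale * basis[k]) % P
    return coeffs


group = [10, 200, 33]            # g = 3 source symbols
coded = encode(group, x=6)       # x = 6 coded symbols
subset = [coded[5], coded[1], coded[3]]  # any 3 distinct coded symbols
print(decode(subset, g=3) == group)
```

Interpolation works because a polynomial of degree below g is uniquely determined by its values at g distinct points, which is why the order and choice of the received coded symbols do not matter.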