The search functionality is under construction.

Keyword Search Result

[Keyword] fault(493hit)

241-260hit(493hit)

  • Technique to Diagnose Open Defects that Takes Coupling Effects into Consideration

    Yasuo SATO  Iwao YAMAZAKI  Hiroki YAMANAKA  Toshio IKEDA  Masahiro TAKAKURA  Kazuhiko IWASAKI  

     
    PAPER-Dependable Computing

    Vol: E87-D No:9  Page(s): 2179-2185

    Although open defects are hard to diagnose because they are unstable, we developed a technique to diagnose completely open defects. We applied a new "segment model" that takes into consideration the coupling effects that neighboring nodes have on a defective node. The technique focuses not only on the behavior of the defective node, but also on the behavior of other nodes affecting it. We explain the theoretical treatment of our model and present experimental results obtained from an actual chip.

  • The Reliability Performance of Wireless Sensor Networks Configured by Power-Law and Other Forms of Stochastic Node Placement

    Mika ISHIZUKA  Masaki AIDA  

     
    PAPER-Sensor Network

    Vol: E87-B No:9  Page(s): 2511-2520

    Sensor nodes are prone to failure and have limited power capacity, so the evaluation of fault tolerance and the creation of technology for improved tolerance are among the most important issues for wireless sensor networks. The placement of sensor nodes is also important, since this affects the availability of nodes within sensing range of a target in a given location and of routes to the base station. However, there has been little research on the placement of sensor nodes. Furthermore, all research to date has been based on deterministic node placement, which is not suitable when a great many sensor nodes are to be placed over a large area. In such a situation, we require stochastic node placement, where the sensor-positions are in accord with a probability density function. In this paper, we examine how fault tolerance can be improved by stochastic node placement that produces scale-free characteristics, that is, where the degree of the nodes follows a power law.

  • A Nested Invocation Suppression Mechanism for Active Replication Fault-Tolerant CORBA

    Deron LIANG  Chen-Liang FANG  Chyouhwa CHEN  

     
    PAPER-Dependable Computing

    Vol: E87-D No:8  Page(s): 2070-2077

    Active replication is a common approach to building highly available and reliable distributed software applications. The redundant nested invocation (RNI) problem arises when servers in a replicated group issue nested invocations to other server groups in response to a client invocation. Automatic suppression of RNI is always a desirable solution, yet it is usually a difficult design issue. If the system has multithreading support, the difficulties of implementation increase dramatically. Intuitively, designing a deterministic thread execution control mechanism is a possible approach. Unfortunately, some modern operating systems implement threads at the kernel level for execution fairness. In the kernel thread case, modifying thread control implies modifying the operating system kernel. This approach sacrifices system portability, which is one of the important requirements of CORBA and other middleware. In this work, we propose a mechanism to perform the auto-suppression of redundant nested invocations in an active replication fault-tolerant (FT) CORBA system. Besides the mechanism design, we discuss the correctness semantics of the design and its correctness proof.

  • Comparing Software Rejuvenation Policies under Different Dependability Measures

    Tadashi DOHI  Hiroaki SUZUKI  Kishor S. TRIVEDI  

     
    PAPER-Dependable Computing

    Vol: E87-D No:8  Page(s): 2078-2085

    Software rejuvenation is a preventive and proactive solution that is particularly useful for counteracting the phenomenon of software aging. In this paper, we consider both the periodic and non-periodic software rejuvenation policies under different dependability measures. As is well known, the steady-state system availability is the probability that the software system is operating in the steady state and, at the same time, is often regarded as the mean up rate in the system operation period. We show that the mean up rate should be defined as the mean value of up rate, but not as the mean up time per mean operation time. We derive numerically the optimal software rejuvenation policies which maximize the steady-state system availability and the mean up rate, respectively, for each periodic or non-periodic model. Numerical examples show that the real mean up rate is always smaller than the system availability in the steady state and that the availability overestimates the ratio of operative time of the software system.
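
    The two definitions contrasted in the abstract can be illustrated on a small data set. The sketch below reads "mean up time per mean operation time" as a ratio of means and "mean value of up rate" as a mean of per-cycle ratios. The cycle data and function names are hypothetical, and the code does not reproduce the paper's rejuvenation models.

```python
from typing import List, Tuple

# (up time, down time) of one operation cycle; hypothetical data layout
Cycle = Tuple[float, float]


def ratio_of_means(cycles: List[Cycle]) -> float:
    """Mean up time divided by mean operation time."""
    total_up = sum(up for up, _ in cycles)
    total_time = sum(up + down for up, down in cycles)
    return total_up / total_time


def mean_of_ratios(cycles: List[Cycle]) -> float:
    """Mean of the per-cycle up rate up / (up + down)."""
    return sum(up / (up + down) for up, down in cycles) / len(cycles)


# Two made-up cycles: a long healthy one and a short one.
cycles = [(9.0, 1.0), (1.0, 1.0)]
print(ratio_of_means(cycles))  # 10/12, about 0.833
print(mean_of_ratios(cycles))  # (0.9 + 0.5) / 2, about 0.7
```

    On this toy data the ratio of means is larger than the mean of ratios, which is the direction of the gap the abstract reports for its own models.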

  • MPICH-GF: Transparent Checkpointing and Rollback-Recovery for Grid-Enabled MPI Processes

    Namyoon WOO  Hyungsoo JUNG  Heon Young YEOM  Taesoon PARK  Hyungwoo PARK  

     
    PAPER-Distributed, Grid and P2P Computing

    Vol: E87-D No:7  Page(s): 1820-1828

    Fault tolerance is an essential feature of distributed systems, where the possibility of a failure increases with the growth of the system. In spite of extensive research over two decades, fault-tolerance systems have not succeeded in practical use, owing to the high overhead and poor usability of previous systems. In this paper, we propose MPICH-GF, a user-transparent checkpointing system for grid-enabled MPICH. Our objectives are to fill the gap between the theory and the practice of fault-tolerance systems, and to provide a checkpointing-recovery system for grids. To build a fault-tolerant MPICH version, we have designed task migration, dynamic process management, and atomic message transfer. MPICH-GF requires no modification of application source code, and it affects the MPICH communication characteristics as little as possible. The features of MPICH-GF are that it supports the direct message transfer mode and that all of the implementation has been done at the lower layer, that is, the abstract device level. We have evaluated MPICH-GF using NPB applications on Globus middleware.

  • Ω Line Problem in Optimistic Log-Based Rollback Recovery Protocol

    MaengSoon BAIK  SungJin CHOI  ChongSun HWANG  JoonMin GIL  ChanYeol PARK  HeonChang YOO  

     
    PAPER-Distributed, Grid and P2P Computing

    Vol: E87-D No:7  Page(s): 1834-1842

    Optimistic log-based rollback recovery protocols have been regarded as an attractive fault-tolerant solution for distributed systems based on the message-passing paradigm, owing to their low overhead in failure-free time. These protocols are based on the Piecewise Deterministic (PWD) assumption model. They assume, however, that all logged non-deterministic events in a consistent global recovery line must be deterministically replayed at recovery time. In this paper, we present the impossibility of deterministically replaying logged non-deterministic events in a consistent global recovery line as the Ω Line Problem, which arises from the asynchronous properties of distributed systems: no bound on the relative speeds of processes, no bound on message transmission delays, and no global time source. In addition, we propose a new optimistic log-based rollback recovery protocol, which guarantees the deterministic replaying of all logged non-deterministic events belonging to a consistent global recovery line and solves the Ω Line Problem at recovery time.

  • Defect Level Prediction Using Multi-Model Fault Coverage

    Shyue-Kung LU  

     
    PAPER-Dependable Computing

    Vol: E87-D No:6  Page(s): 1488-1495

    As we enter the deep submicron era, the cost of maintaining the quality of shipped products increases significantly. Unfortunately, even 100% coverage of the widely used single stuck-at faults cannot guarantee that the defect level of the shipped chips is low enough. This is because the stuck-at fault model does not cover all catastrophic defects. Moreover, it is difficult to estimate the difference between stuck-at fault coverage and defect coverage. Multiple fault models or test techniques are usually adopted in the test process, each having its corresponding fault coverage. However, the relationship between the defect level and those individual fault coverages remains to be explored. In this paper, we first propose the concept of multi-model fault coverage (MFC) instead of the fault coverage based on a single fault model. The multi-model fault coverage for nonequiprobable faults is presented, and the multi-model fault coverage for equiprobable faults is shown to be a special case of that for nonequiprobable faults. The relationship between defect level, fabrication yield, and multi-model fault coverage is then derived. We also analyze the defect level error between the predicted defect level and the physical defect level. An algorithm is also proposed for estimating the number of fault models required in order to achieve sufficient accuracy. Experimental results show that multi-model fault coverage can be used to predict the defect level more precisely. As the number of fault models increases, the defect level error decreases significantly. Our approach is efficient for product quality prediction, especially for deep submicron devices.
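
    As background for the relationship between defect level, yield, and fault coverage, a commonly cited single-model estimate is the Williams-Brown formula, DL = 1 - Y^(1 - T), where Y is the fabrication yield and T the fault coverage. The abstract does not state the multi-model formula, so the sketch below covers only this classic single-model case; the function name is an assumption made for illustration.

```python
def defect_level(fab_yield: float, coverage: float) -> float:
    """Williams-Brown single-model estimate: DL = 1 - Y ** (1 - T).

    fab_yield -- fraction of good chips, in (0, 1]
    coverage  -- fault coverage of the test set, in [0, 1]
    Note: this is the classic single-model relation, not the paper's
    multi-model fault coverage (MFC) derivation.
    """
    if not 0.0 < fab_yield <= 1.0:
        raise ValueError("fab_yield must be in (0, 1]")
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be in [0, 1]")
    return 1.0 - fab_yield ** (1.0 - coverage)


# With 50% yield, raising coverage lowers the predicted defect level.
for t in (0.0, 0.9, 0.99, 1.0):
    print(t, defect_level(0.5, t))
```

    Under this relation the defect level reaches zero only at 100% coverage of the assumed fault model, which is why defects outside the model, as the abstract notes, leave a residual defect level.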

  • Performance Analysis of Robust Hierarchical Mobile IPv6 for Fault-Tolerant Mobile Services

    Sangheon PACK  Taewan YOU  Yanghee CHOI  

     
    PAPER-Mobility Management

    Vol: E87-B No:5  Page(s): 1158-1165

    In mobile multimedia environments, it is very important to minimize the handoff latency due to mobility. Hierarchical Mobile IPv6 (HMIPv6) can be an efficient approach to reducing handoff latency: it uses a mobility agent called the Mobility Anchor Point (MAP) to localize the registration process. However, the MAP can be a single point of failure or a performance bottleneck. To provide mobile users with satisfactory quality of service and fault-tolerant service, it is necessary to cope with the failure of mobility agents. In earlier work, we proposed Robust Hierarchical Mobile IPv6 (RH-MIPv6), an enhanced HMIPv6 for fault-tolerant mobile services. In RH-MIPv6, an MN configures two regional CoAs and registers them with two MAPs during binding update procedures. When a MAP fails, MNs serviced by the faulty MAP (i.e., the primary MAP) can be served by a failure-free MAP (i.e., the secondary MAP) through the failure detection and recovery schemes of RH-MIPv6. In this paper, we compare RH-MIPv6 and HMIPv6 with respect to several performance factors such as MAP unavailability, MAP reliability, packet loss rate, and MAP blocking probability. To do this, we use a semi-Markov chain and an M/G/C/C queuing model. Numerical results indicate that RH-MIPv6 outperforms HMIPv6 for all performance factors, especially when the failure rate is high.
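
    For an M/G/C/C loss system, the blocking probability is given by the Erlang B formula and depends on the service time distribution only through its mean. The sketch below computes it with the standard numerically stable recursion. How the paper maps MAP capacity and load onto the parameters is not stated in the abstract, so the parameter names are assumptions.

```python
def erlang_b(servers: int, offered_load: float) -> float:
    """Blocking probability of an M/G/c/c loss system (Erlang B).

    servers      -- number of servers c (hypothetically, MAP capacity)
    offered_load -- arrival rate times mean service time, in Erlangs
    Uses the recursion B(0) = 1, B(k) = a*B(k-1) / (k + a*B(k-1)).
    """
    if servers < 0 or offered_load < 0:
        raise ValueError("servers and offered_load must be non-negative")
    blocking = 1.0
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


print(erlang_b(1, 1.0))  # a / (1 + a) with a = 1
print(erlang_b(2, 1.0))
```

    The recursion avoids the large factorials and powers of the closed-form expression, so it stays accurate for large server counts.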

  • Intelligent versus Random Software Testing

    Juichi TAKAHASHI  

     
    PAPER-Metrics, Test, and Maintenance

    Vol: E87-D No:4  Page(s): 849-854

    The comparison of intelligent and random testing for data input is still under discussion. Little is known, moreover, about testing the whole software or about empirical testing methodology when random testing is used. This study examines not only data input testing but also the operation of the software (called transitions), in order to test whole GUI software by intelligent and random testing. Our methodology is to investigate the efficiency of random and intelligent testing using the Chinese postman problem. In general, random testing is considered straightforward but not efficient, whereas Chinese postman problem testing is complicated but efficient. The comparison between random and intelligent testing should yield further recommendations for software testing methodology.

  • An Efficient Centralized Algorithm Ensuring Consistent Recovery in Causal Message Logging with Independent Checkpointing

    JinHo AHN  SungGi MIN  

     
    LETTER-Dependable Computing

    Vol: E87-D No:4  Page(s): 1039-1043

    Causal message logging has desirable features such as no cascading rollback, fast output commit and asynchronous logging, but it needs a consistent recovery algorithm to tolerate concurrent failures. For this purpose, Elnozahy proposed a centralized recovery algorithm with two practical benefits: reducing the number of stable storage accesses and imposing no restriction on the execution of live processes during recovery. However, when combined with independent checkpointing, the algorithm may force the system into an inconsistent state when processes fail concurrently. In this paper, we identify these inconsistent cases and then present a recovery algorithm that retains the two benefits and ensures system consistency when integrated with any kind of checkpointing protocol. Moreover, our algorithm requires no additional messages compared with Elnozahy's algorithm.

  • A Design for Testability Technique for Low Power Delay Fault Testing

    James Chien-Mo LI  

     
    PAPER

    Vol: E87-C No:4  Page(s): 621-628

    This paper presents a Quiet-Noisy scan technique for low power delay fault testing. The novel scan cell design provides both quiet and noisy scan modes. The toggling of scan cell outputs is suppressed in the quiet scan mode, so power is saved. Two-pattern tests are applied in the noisy scan mode, so delay fault testing is possible. The experimental data show that the Quiet-Noisy scan technique effectively reduces the test power to 56% of that of the regular scan. The transition fault coverage is improved by 19.7% compared to an existing toggle suppression low power technique. The presented technique requires minimal changes to the existing MUX-scan Design For Testability (DFT) methodology and needs virtually no computation. The penalties are area overhead, speed degradation, and one extra control in test mode.

  • A Framework for Network Fault Management Using Software Agents

    Edidiong Uyai EKAETTE  Behrouz Homayoun FAR  

     
    PAPER-System

    Vol: E87-D No:4  Page(s): 947-958

    This paper proposes a framework for distributed network management by incorporating fault and performance management metrics in a hierarchical decision making model. The goal of this research is to automate the fault management process. The fault management system is organized as a three level information processing model. Correlation results from each level are provided as evidence to the next level. Causal and temporal relationships between monitored variables are captured using Dynamic Bayesian Networks. As evidence is gathered, the probability of the presence of a fault is either strengthened or weakened. The proposed model is used for proactive fault detection as well as fault isolation purposes. A prototype implementing the ideas is presented.
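
    The way evidence strengthens or weakens the probability of a fault can be illustrated with a single Bayes update. This is a deliberately reduced sketch: the framework uses Dynamic Bayesian Networks across three levels, whereas the code below handles one binary fault hypothesis and one observation, with made-up likelihood values and a hypothetical function name.

```python
def update_fault_belief(prior: float,
                        p_obs_given_fault: float,
                        p_obs_given_ok: float) -> float:
    """Posterior P(fault | observation) from Bayes' rule."""
    numerator = p_obs_given_fault * prior
    evidence = numerator + p_obs_given_ok * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("observation has zero probability under both hypotheses")
    return numerator / evidence


prior = 0.1  # assumed initial belief that a fault is present

# An observation that is more likely under a fault strengthens the belief.
strengthened = update_fault_belief(prior, 0.8, 0.2)

# An observation that is more likely without a fault weakens it.
weakened = update_fault_belief(prior, 0.2, 0.8)

print(strengthened, weakened)
```

    Feeding each posterior back in as the next prior gives a simple sequential update, which is the flavor of evidence accumulation the abstract describes, without the temporal structure of a full Dynamic Bayesian Network.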

  • A Simple Design of Time-Efficient Firing Squad Synchronization Algorithms with Fault-Tolerance

    Hiroshi UMEO  

     
    PAPER

    Vol: E87-D No:3  Page(s): 733-739

    In this paper we study a classical firing squad synchronization problem on a model of fault-tolerant cellular automata that may have some defective cells. Several fault-tolerant time-efficient synchronization algorithms are developed based on a simple freezing-thawing technique. It is shown that, under some constraints on the distribution of defective cells, any cellular array of length n with p defective cell segments can be synchronized in 2n - 2 + p steps.

  • Generation of Test Sequences with Low Power Dissipation for Sequential Circuits

    Yoshinobu HIGAMI  Shin-ya KOBAYASHI  Yuzo TAKAMATSU  

     
    PAPER-Test Generation and Compaction

    Vol: E87-D No:3  Page(s): 530-536

    When LSIs that are designed and manufactured for low power dissipation are tested, test vectors that make the power dissipation low should be applied. If test vectors that cause high power dissipation are applied, incorrect test results are obtained or circuits under test are permanently damaged. In this paper, we propose a method to generate test sequences with low power dissipation for sequential circuits. We assume test sequences generated by an ATPG tool are given, and modify them while keeping the original stuck-at fault coverages. The test sequence is modified by inverting the values of primary inputs of every test vector one by one. In order to keep the original fault coverage, fault simulation is conducted whenever one value of primary inputs is inverted. We introduce heuristics that perform fault simulation for a subset of faults during the modification of test vectors. This helps reduce the power dissipation of the modified test sequence. If the fault coverage by the modified test sequence is lower than that by the original test sequence, we generate a new short test sequence and add it to the modified test sequence.
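
    The modification loop described above, inverting one primary input value at a time and re-running fault simulation, can be sketched as a greedy search. Everything concrete here is an assumption made for illustration: the power proxy counts input toggles between consecutive vectors, a flip is kept only if coverage is preserved and the toggle count drops, and `toy_coverage` is a stand-in for a real fault simulator. The paper's heuristics and its step of appending a new short sequence are not reproduced.

```python
from typing import Callable, List

Vector = List[int]


def toggle_count(seq: List[Vector]) -> int:
    """Power proxy (assumption): input bit toggles between consecutive vectors."""
    return sum(a != b
               for prev, cur in zip(seq, seq[1:])
               for a, b in zip(prev, cur))


def reduce_toggles(seq: List[Vector],
                   coverage: Callable[[List[Vector]], int]) -> List[Vector]:
    """Greedily invert single input values while keeping fault coverage."""
    best = [vec[:] for vec in seq]  # work on a copy
    target = coverage(best)         # coverage of the original sequence
    for t in range(len(best)):
        for i in range(len(best[t])):
            before = toggle_count(best)
            best[t][i] ^= 1         # invert one primary input value
            if coverage(best) < target or toggle_count(best) >= before:
                best[t][i] ^= 1     # revert: coverage lost or no power gain
    return best


def toy_coverage(seq: List[Vector]) -> int:
    """Hypothetical fault simulator: fault k is detected if input k is ever 1."""
    return sum(any(vec[k] for vec in seq) for k in range(len(seq[0])))


original = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
modified = reduce_toggles(original, toy_coverage)
print(toggle_count(original), toggle_count(modified))
print(toy_coverage(original), toy_coverage(modified))
```

    Calling the fault simulator after every single inversion is what makes the approach expensive, which matches the abstract's motivation for simulating only a subset of faults during modification.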

  • Identification and Frequency Estimation of Feedback Bridging Faults Generating Logical Oscillation in CMOS Circuits

    Masaki HASHIZUME  Hiroyuki YOTSUYANAGI  Takeomi TAMESADA  

     
    PAPER-Fault Detection

    Vol: E87-D No:3  Page(s): 571-579

    When a feedback bridging fault occurs in a combinational circuit and is activated, logical oscillation may occur in the circuit. In this paper, electrical conditions are proposed for identifying whether a feedback bridging fault causes logical oscillation, together with a method for estimating the oscillation frequency. Both are based on piecewise linearized models and do not require circuit simulation of large circuits. They are evaluated experimentally. In the experiments, all of the feedback bridging faults causing logical oscillation are identified. Also, the proposed estimation method derives oscillation frequencies larger than the ones obtained by SPICE simulation. The results suggest that the methods can be used for identifying such bridging faults and estimating the oscillation frequencies.

  • The Fault-Tolerant Early Bird Problem

    Bjorn FAY  Martin KUTRIB  

     
    PAPER

    Vol: E87-D No:3  Page(s): 687-693

    The capabilities of reliable computations in one-dimensional cellular automata are investigated by means of the Early Bird Problem. The problem is typical for situations in massively parallel systems where a global behavior must be achieved by only local interactions between the single elements. The cells that cause the misoperations are assumed to behave as follows. They run a self-diagnosis once before the actual computation. The result is stored locally such that the working state of a cell becomes visible to its neighbors. A non-working (defective) cell cannot modify information but is able to transmit it unchanged with unit speed. We present an O(n log (n) log (n))-time fault-tolerant solution of the Early Bird Problem.

  • Layout-Based Detection Technique of Line Pairs with Bridging Fault Using IDDQ

    Masaru SANADA  

     
    PAPER-Fault Detection

    Vol: E87-D No:3  Page(s): 557-563

    Abnormal IDDQ (quiescent power supply current) is a signal that indicates the existence of physical damage, including bridging between circuit lines. Using this signal, a CAD-based technique for detecting line pairs with bridging faults (LBFs) has been developed to enhance the manufacturing yield of advanced logic LSIs with scaled-down structures and multiple metal layers. The proposed technique progressively narrows down the suspect LBFs using logic information and layout structure. Because it runs quickly, the technique is applied to draw wafer-level distribution charts of bridging fault locations, whose features are fed back to the manufacturing process and layout design.

  • Test Sequence Generation for Test Time Reduction of IDDQ Testing

    Hiroyuki YOTSUYANAGI  Masaki HASHIZUME  Takeomi TAMESADA  

     
    PAPER-Test Generation and Compaction

    Vol: E87-D No:3  Page(s): 537-543

    In this paper, test time reduction for IDDQ testing is discussed. Although IDDQ testing is known to be effective for detecting faults in CMOS circuits, its test time is longer than that of logic testing, since the supply current is measured after a circuit has reached its quiescent state. It is shown by simulation that the test time of an IDDQ test mostly depends on switching current. A procedure to modify test vectors and a procedure to arrange test vectors are presented for reducing the test time of IDDQ testing. A test sequence is modified such that switching current quickly disappears. The procedure utilizes a unit delay model to estimate the time of the last transition of a logic value from L to H in a circuit. Experimental results for benchmark circuits show the effectiveness of the procedure.

  • Analog Circuit Test Using Transfer Function Coefficient Estimates

    Zhen GUO  Jacob SAVIR  

     
    LETTER

    Vol: E87-D No:3  Page(s): 642-646

    Coefficient-based test (CBT) is introduced for detecting parametric faults in analog circuits. The method uses pseudo Monte-Carlo simulation and system identification tools to determine whether a given circuit under test (CUT) is faulty.

  • Analysis and Testing of Bridging Faults in CMOS Synchronous Sequential Circuits

    Yukiya MIURA  

     
    PAPER-Fault Detection

    Vol: E87-D No:3  Page(s): 564-570

    In this paper, we analyze behaviors of bridging faults in CMOS synchronous sequential circuits based on transient analysis. From analysis results, we expose dynamic and analog behaviors of the circuit caused by the bridging faults, which are oscillation, asynchronous sequential behavior, IDDT failure and IDDQ failure as well as logic error. In order to detect this kind of fault, we show that not only IDDQ testing but also IDDT testing and logic testing which guarantees correct state transitions are required.
