The search functionality is under construction.
The search functionality is under construction.

Keyword Search Result

[Keyword] fault(494hit)

381-400hit(494hit)

  • A Study on Stability Analysis of Discrete Event Dynamic Systems

    Kwang-Hyun CHO  Jong-Tae LIM  

     
    PAPER-Automata,Languages and Theory of Computing

      Vol:
    E80-D No:12
      Page(s):
    1149-1154

    In supervisory control, discrete event dynamic systems (DEDSs) are modeled by finite-state automata, and their behaviors described by the associated formal languages; control is exercised by a supervisor, whose control action is to enable or disable the controllable events. In this paper we present a general stability concept for DEDSs, stability in the sense of Lyapunov with resiliency, by incorporating Lyapunov stability concepts with the concept of stability in the sense of error recovery. We also provide algorithms for verifying stability and obtaining a domain of attraction. Relations between the notion of stability and the notion of fault-tolerance are addressed.

  • A CAD-Based Approach to Fault Diagnosis of CMOS LSI with Single Fault Using Abnormal Iddq

    Masaru SANADA  

     
    PAPER

      Vol:
    E80-A No:10
      Page(s):
    1945-1954

    A CAD-based faulty portion diagnosis technique for CMOS-LSI with single fault using abnormal Iddq has been developed to indicate the presence of physical damage in a circuit. This method of progressively reducing the faulty portion, works by extracting the inner logic state of each block from logic simulation, and by deriving test vector numbers with abnomal Iddq. To easily perform fault diagnosis, the hierarchical circuit structure is divided into primitive blocks including simple logic gates. The diagnosis technique employs the comparative operation of each primitive block to determine whether one and the same inner logic state with abnormal Iddq exists in the inner logic state with normal Iddq or not. The former block is regarded as normal block and the latter block is regarded as faulty block. Faulty portion of the faulty block can be localized easily by using input logic state simulation. Experimental results on real faulty LSI with 100k gates demonstrated rapid diagnosis times of within ten hours and reliable extraction of the faulty portion.

  • An Efficiently Reconfigurable Architecture for Mesh-Arrays with PE and Link Faults

    Tadayoshi HORITA  Itsuo TAKANAMI  

     
    PAPER-Fault Tolerance

      Vol:
    E80-D No:9
      Page(s):
    879-885

    The authors previously proposed a reconfigurable architecture called the "XL-scheme" in order to cope with processor element (PE) faults as well as link faults. However, they described an algorithm for compensating only for link faults. They determined the potential ability to tolerate faults of the XL-scheme for simultaneous faults of links and PEs, and left a reconstruction algorithm for simultaneous PE and link faults to be studied in the future. This paper briefly explains the XL-scheme and gives a reconstruction algorithm for simultaneous PE and link faults. The algorithm first replaces faulty PEs with healthy ones and then replaces faulty links with healthy ones. We then compute the reliabilities of the mesh-arrays with simultaneous PE and link faults by simulation. We compare the reliability of the XL-scheme with that of the one-and-half track switch model. It is seen that the former is much larger than the latter. Furthermore, we show the result for processing time.

  • Further Research on Systematical Information Modeling

    Demin WU  Wei LU  

     
    PAPER-Communication Networks and Services

      Vol:
    E80-B No:9
      Page(s):
    1283-1289

    A new scheme based on hierarchical information organization and situation awareness to support network manager in failure localization is proposed. This paper integrates the situation theory for the needs of fault management to model the states and events. As the result, the proposed information model includes four fault management viewpoints to support situational, functional, logical and physical analysis within the respective networks. Object-oriented analysis is applied to construct the information. The correlation of network situation is derived by description logic. The proposed classification algorithm is applied to solve the situation awareness problem. By using this proposal the correlation performance is enhanced to logarithmic order.

  • A Novel Replication Technique for Detecting and Masking Failures for Parallel Software: Active Parallel Replication

    Adel CHERIF  Masato SUZUKI  Takuya KATAYAMA  

     
    PAPER-Fault Tolerance

      Vol:
    E80-D No:9
      Page(s):
    886-892

    We present a novel replication technique for parallel applications where instances of the replicated application are active on different group of processors called replicas. The replication technique is based on the FTAG (Fault Tolerant Attribute Grammar) computation model. FTAG is a functional and attribute based model. The developed replication technique implements "active parallel replication," that is, all replicas are active and compute concurrently a different piece of the application parallel code. In our model replicas cooperate not only to detect and mask failures but also to perform parallel computation. The replication mechanisms are supported by FTAG run time system and are fully application-transparent. Different novel mechanisms for checkpointing and recovery are developed. In our model during rollback recovery only that part of the computation that was detected faulty is discarded. The replication technique takes full advantage of parallel computing to reduce overall computation time.

  • Fault-Tolerant Cube-Connected Cycles Architectures Capable of Quick Broadcasting by Using Spare Circuits

    Nobuo TSUDA  

     
    PAPER-Fault Tolerance

      Vol:
    E80-D No:9
      Page(s):
    871-878

    The construction of fault-tolerant processor arrays with interconnections of cube-connected cycles (CCCs) by using an advanced spare-connection scheme for k-out-of-n redundancies called "generalized additional bypass linking" is described. The connection scheme uses bypass links with wired OR connections to spare processing elements (PEs) without external switches, and can reconfigure complete arrays by tolerating faulty portions in these PEs and links. The spare connections are designed as a node-coloring problem of a CCC graph with a minimum distance of 3: the chromatic numbers corresponding to the number of spare PE connections were evaluated theoretically. The proposed scheme can be used for constructing various k-out-of-n configurations capable of quick broadcasting by using spare circuits, and is superior to conventional schemes in terms of extra PE connections and reconfiguration control. In particular, it allows construction of optimal r-fault-tolerant configurations that provide r spare PEs and r extra connections per PE for CCCs with 4x PEs (x: integer) in each cycle.

  • On Dynamic Fault Tolerance for WSI Networks

    Toshinori YAMADA  Tomohiro NISHIMURA  Shuichi UENO  

     
    LETTER-Graphs and Networks

      Vol:
    E80-A No:8
      Page(s):
    1529-1530

    The finite reconfigurability and local reconfigurability of graphs were proposed by Sha and Steiglitz [1], [2] in connection with a problem of on-line reconfiguraion of WSI networks for run-time faults. It is shown in [2] that a t-locally-reconfigurable graph for a 2-dimensional N-vertex array AN can be constructed from AN by adding O(N) vertices and edges. We show that Ω(N) vertices must be added to an N-vertex graph GN in order to construct a t-locally-reconfigurable graph for GN, which means that the number of added vertices for the above mentioned t-locally-reconfigurable graph for AN is optimal to within a constant factor. We also show that a t-finitely-reconfigurable graph for an N-vertex graph GN can be constructed from GN by adding t vertices and tN + t (t+1)/2 edges.

  • A Comparison of Correlated Failures for Software Using Community Error Recovery and Software Breeding

    Kazuyuki SHIMA  Ken-ichi MATSUMOTO  Koji TORII  

     
    PAPER-Fault Tolerant Computing

      Vol:
    E80-D No:7
      Page(s):
    717-725

    We present a comparison of correlated failures for multiversion software using community error recovery (CER) and software breeding (SB). In CER, errors are detected and recovered at checkpoints which are inserted in all the versions of the software. SB is analogous to the breeding of plants and animals. In SB, versions consist of loadable modules, and a driver exchanges the modules between versions to detect and eliminate faulty modules. We formulate reliability models to estimate the probability of failure for software using either CER or SB. Our reliability models assume failures in the checkpoints in CER and the driver in SB. We use beta-binomial distribution for modeling correlated failures of versions, because much of the evidence suggests that the assumption that failures in versions occur independently is not always true. Our comparison indicates that multiversion software using SB is more reliable than that using CER when the probability of failure in the checkpoints in CER or the driver in SB is 10-7.

  • Data-Driven Fault Management for TINA Applications

    Hiroshi ISHII  Hiroaki NISHIKAWA  Yuji INOUE  

     
    PAPER-Distribute MGNT

      Vol:
    E80-B No:6
      Page(s):
    907-914

    This paper describes the effectiveness of stream-oriented data-driven scheme for achieving autonomous fault management of hyper-distributed systems such as networks based on the Telecommunications Information Networking Architecture (TINA). TINA, whose specifications are in the finalizing phase within TINA-Consortium, is aiming at achieving interoperability and reusability of telecom applications software and independent of underlying technologies. However, to actually implement TINA network, it is essential to consider the technology constraints. Especially autonomous fault management at run-time is crucial for distributed network environment because centralized control using global information is very difficult. So far many works have been done on so-called off-line management but runtime management of service failure seems immature. This paper proposes introduction of stream-oriented data-driven processors to the autonomous fault management at runtime in TINA based distributed network environment. It examines the features of distributed network applications and technology requirements to achieve fault management of those distributed applications such as effective multiprocessing of surveillance, testing, reconfiguration in addition to ordinary processing.

  • I-PROT: ISDN Protocol Fault Detection System

    Hikaru SUZUKI  Narumi TAKAHASHI  

     
    PAPER-Protocol

      Vol:
    E80-B No:6
      Page(s):
    888-893

    This paper discribes the ISDN PROtocol Testing system (I-PROT). The system consists of translation & distribution function block, layer-2 fault surveillance function block, layer-3 fault surveillance function block, cause detection function block, and HMI. The system receives data from protocol monitors and detects the error recovery sequences, (we call "quasi-normal sequences"), as well as the sequences that do not follow the protocol specifications, (we call "abnormal sequences"). In the layer-3 fault surveillance function block, we use the protocol specification database whose records are converted from the state transition rules and added the judgment which classify the rules into the "normal" and "quasi-normal." We also show the classification method which is applicable to all connection-oriented protocol specifications. In the layer-2 fault surveillance function block, we explain the another easy detecting method. In the cause function block, we describe the partial pattern matching method to relate the protocol fault to the real cause of the fault. We built the prototype of the I-PROT and examine the turn around time (TAT) performance. As a result of the examination, we find the TAT of the I-PROT is directly proportional to the number of the frames analyzed by the system, and the system can reduce the load of the conventional manual analysis by the maintenance personnel.

  • Spare Allocation and Compensation-Path Finding for Reconfiguring WSI Processor Arrays Having Single-Track Switches

    Takao OZAWA  Takeshi YAMAGUCHI  

     
    LETTER

      Vol:
    E80-A No:6
      Page(s):
    1072-1075

    In contrast to previous algorithms for reconfiguring processor arrays under the assumption that spare rows and columns are placed on the perimeter of the array or on fixed positions, our new algorithm employs movable and partitionable spare rows and columns. The objective of moving and partitioning spare rows and/or columns is the elimination of faulty processors each of which is blocked in all directions to spare processors. The results of our computer simulation indicate that reconfigurability can significantly be improved.

  • Formal Verification of Totally Self-Checking Properties of Combinational Circuits

    Kazuo KAWAKUBO  Koji TANAKA  Hiromi HIRAISHI  

     
    PAPER-Verification

      Vol:
    E80-D No:1
      Page(s):
    57-62

    In this paper we propose a method of formal verification of totally self-checking (TSC) properties of combinational circuits using logic function manipulation. We show that the problem of verification of TSC properties can be transformed to a satisfiability problem of decision functions formed from characteristic functions of a circuit's output code words. Then the problem can be solved using binary decision diagrams (BDD). Experimental results show the effectiveness of the proposed method.

  • A Learning Algorithm for Fault Tolerant Feedforward Neural Networks

    Nait Charif HAMMADI  Hideo ITO  

     
    PAPER-Redundancy Techniques

      Vol:
    E80-D No:1
      Page(s):
    21-27

    A new learning algorithm is proposed to enhance fault tolerance ability of the feedforward neural networks. The algorithm focuses on the links (weights) that may cause errors at the output when they are open faults. The relevances of the synaptic weights to the output error (i.e. the sensitivity of the output error to the weight fault) are estimated in each training cycle of the standard backpropagation using the Taylor expansion of the output around fault-free weights. Then the weight giving the maximum relevance is decreased. The approach taken by the algorithm described in this paper is to prevent the weights from having large relevances. The simulation results indicate that the network trained with the proposed algorithm do have significantly better fault tolerance than the network trained with the standard backpropagation algorithm. The simulation results show that the fault tolerance and the generalization abilities are improved.

  • Dependable Bus Arbitraion by Alternating Competition with Checkers

    Kazuo TOKITO  Takashi MATSUBARA  Yoshiaki KOGA  

     
    PAPER-Testing/Checking

      Vol:
    E80-D No:1
      Page(s):
    44-50

    A fault in multi-processing system arbitration circuits result in incorrect arbitration or abnormal operation of the system. A highly reliable system requires dependable arbitration in order to operate properly. Previously, we proposed alternate competing arbitration suitable for highly reliable systems. In this paper, we propose a method for improvement of fault detection and location using additional checkers. This method is effective to maintain reliability of the system.

  • A Method of Multiple Fault Diagnosis in Sequential Circuits by Sensitizing Sequence Pairs

    Nobuhiro YANAGIDA  Hiroshi TAKAHASHI  Yuzo TAKAMATSU  

     
    PAPER-Testing/Checking

      Vol:
    E80-D No:1
      Page(s):
    28-37

    This paper presents a method of multiple fault diagnosis in sequential circuits by input-sequence pairs having sensitizing input pairs. We, first, introduce an input-sequence pair having sensitizing input pairs to diagnose multiple faults in a sequential circuit represented by a combinational array model. We call such input-sequence pair the sensitizing sequence pair in this paper. Next, we describe a diagnostic method for multiple faults in sequential circuits by the sensitizing sequence pair. From a relation between a sensitizing path generated by a sensitizing sequence pair and a subcircuit, the proposed method deduces the suspected faults for the subcircuits, one by one, based on the responses observed at primary outputs without probing any internal line. Experimental results show that our diagnostic method identifies fault locations within small numbers of suspected faults.

  • A Fault Simulation Method for Crosstalk Faults in Synchronous Sequential Circuits

    Noriyoshi ITAZAKI  Yasutaka IDOMOTO  Kozo KINOSHITA  

     
    PAPER-Testing/Checking

      Vol:
    E80-D No:1
      Page(s):
    38-43

    With the scale-down of VLSI chip size and the reduction of switching time of logic gates, crosstalk faults become an important problem in testing of VLSI. For synchronous sequential circuits, the crosstalk pulses on data lines will be considered to be harmless, because they can be invalidated by a clocking phase. However, crosstalk pulses generated on clock lines or reset lines will cause an erroneous operation. In this work, we have analyzed a crosstalk fault scheme, and developed a fault simulator based on the scheme. Throughout this work, we considered the crosstalk fault as unexpected strong capacitive coupling between one data line and one clock line. Since we must consider timing in addition to a logic value, the unit delay model is used in our fault simulation. Our experiments on some benchmark circuits show that fault activation rates and fault detection rates vary widely depending on circuit characteristics. Fault detection rates of up to 80% are obtained from our simulation with test vectors generated at random.

  • Quad-Processor Redundancy for a RISC-Based Fault Tolerant Computer

    Shinichiro YAMAGUCHI  Tetsuaki NAKAMIKAWA  Naoto MIYAZAKI  Yuuichirou MORITA  Yoshihiro MIYAZAKI  Sakou ISHIKAWA  

     
    PAPER-Redundancy Techniques

      Vol:
    E80-D No:1
      Page(s):
    15-20

    The fault tolerant computer (FTC) is applied as a communication or database server in the information service and computer aided process control fields. User requires of the FTC are to provide the current level of performance and software transparency needing no additional dedicated program for fault tolerance. To meet these requirements, we propose quadprocessor redundancy (QPR) architecture that combines dualRISC based duplicated CPUs integrating main memories, and duplicated I/O subsystems by using some additional hardware. Duplicated CPUs run under the instruction level synchronization (lock step operation) , and the duplicated I/O subsystems are managed by an operating system. When a fault is detected, the faulty CPU is isolated by hardware. User program is continuously executed by the remaining CPU. We applied the QPR to our UNIX servers, and achieved satisfactory levels of performance.

  • A Systematic Design of Fault Tolerant Systolic Arrays Based on Triple Modular Redundancy in Time-Processor Space

    Mineo KANEKO  Hiroyuki MIYAUCHI  

     
    PAPER-Fault Tolerant Computing

      Vol:
    E79-D No:12
      Page(s):
    1676-1689

    A systematic procedure to configure faulttolerant systolic arrays based on Triplicated Triple Modular Redundancy is proposed. The design procedure consists of the triplication of the dependence graph which is formed from a target regular algorithm and the transformation onto physical time-processor domain. The resultant systolic arrays tolerate failures not only on processing elements but also on communication links. While it needs sophisticated connection scheme between processing elements to guarantee the fault-tolerance on communication links, the link complexity is possibly reduced by optimizing redundant operation scheme. Unconstrained and constrained link minimization problems are introduced, and the possibility and the constraints required for link complexity reduction are investigated.

  • Non-Regenerative Stochastic Petri Nets: Modeling and Analysis

    Qun JIN  Yoneo YANO  Yoshio SUGASAWA  

     
    PAPER

      Vol:
    E79-A No:11
      Page(s):
    1781-1790

    We develop a new class of stochastic Petri net: non-regenerative stochastic Petri net (NRSPN), which allows the firing time of its transitions with arbitrary distributions, and can automatically generate a bounded reachability graph that is equivalent to a generalization of the Markov renewal process in which some of the states may not constitute regeneration points. Thus, it can model and analyze behavior of a system whose states include some non-regeneration points. We show how to model a system by the NRSPN, and how to obtain numerical solutions for the NRSPN model. The probabilistic behavior of the modeled system can be clarified with the reliability measures such as the steady-state probability, the expected numbers of visits to each state per unit time, availability, unavailability and mean time between system failure. Finally, to demonstrate the modeling ability and analysis power of the NRSPN model, we present an example for a fault-tolerant system using the NRSPN and give numerical results for specific distributions.

  • Independent Spanning Trees of Product Graphs and Their Construction

    Koji OBOKATA  Yukihiro IWASAKI  Feng BAO  Yoshihide IGARASHI  

     
    PAPER-Graphs and Networks

      Vol:
    E79-A No:11
      Page(s):
    1894-1903

    A graph G is called an n-channel graph at vertex r if there are n independent spanning trees rooted at r. A graph G is called an n-channel graph if G is an n-channel graph at every vertex. Independent spanning trees of a graph play an important role in fault-tolerant broadcasting in the graph. In this paper we show that if G1 is an n1-channel graph and G2 is an n2-channel graph, then G1G2 is an (n1 + n2)-channel graph. We prove this fact by a construction of n1+n2 independent spanning trees of G1G2 from n1 independent spanning trees of G1 and n2 independent spanning trees of G2. As an application we describe a fault-tolerant broadcasting scheme along independent spanning trees.

381-400hit(494hit)