The search functionality is under construction.

Keyword Search Result

[Keyword] fault(493hit)

321-340hit(493hit)

  • Crash Recovery for Distributed Mobile Computing Systems

    Tong-Ying Tony JUANG  

     
    PAPER-Mobile Information Network and Personal Communications

      Vol:
    E84-A No:2
      Page(s):
    668-674

    One major recent breakthrough in the communications community is the extension of networking from wired to wireless networks. This has made it possible to create a mobile distributed computing environment and has brought several new challenges in distributed protocol design. Wireless networks differ from wired networks in some fundamental ways that need special attention, such as lower communication bandwidth, limited electrical power due to battery capacity, and mobility of processes. These new issues make traditional recovery algorithms unsuitable. In this paper, we propose an efficient algorithm with O(n_r) message complexity, where n_r is the total number of mobile hosts (MHs) related to the failed MH. In addition, these MHs only need to roll back once and can immediately resume their operation without waiting for any coordination message from other MHs. During normal operation, an application message needs O(1) additional information when it is transmitted between MHs and mobile support stations (MSSs). Each MSS must keep an n_total_h × n_cell_h dependency matrix, where n_total_h is the total number of MHs in the system and n_cell_h is the total number of MHs in its cell. Finally, one related issue of resending lost messages is also considered.

  • Fault-Tolerant Routing Algorithms for Hypercube Interconnection Networks

    Keiichi KANEKO  Hideo ITO  

     
    PAPER-Fault Tolerance

      Vol:
    E84-D No:1
      Page(s):
    121-128

    Many researchers have used hypercube interconnection networks, because of their good properties, to construct parallel processing systems. However, as the number of processors increases, the probability of faulty nodes occurring also increases. Hence, for hypercube interconnection networks which have faulty nodes, several efficient dynamic routing algorithms have been proposed which allow each node to hold status information of its neighbor nodes. In this paper, we propose an improved version of the algorithm proposed by Chiu and Wu by introducing the notion of full reachability. A fully reachable node is a node that can reach all nonfaulty nodes which have Hamming distance l from the node via paths of length l. In addition, we further improve the algorithm by classifying the possibilities of detours with respect to each Hamming distance between current and target nodes. We propose an initialization procedure which makes use of an equivalent condition to perform this classification efficiently. Moreover, we conduct a simulation to measure the improvement ratio and to compare our algorithms with others. The simulation results show that the algorithms are effective when they are applied to low-dimensional hypercube interconnection networks.
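
    The definition of a fully reachable node can be illustrated with a brute-force check. This is only a sketch under stated assumptions: it takes global knowledge of the faulty set and reads the definition as applying to every Hamming distance l, whereas the paper's routing algorithm works from neighbor status information. The function names are hypothetical.

```python
def hamming(a, b):
    """Hamming distance between two node addresses."""
    return bin(a ^ b).count("1")


def minimal_path_reachable(n, faulty, src):
    """Nonfaulty nodes of an n-cube reachable from src by fault-free shortest paths.

    Illustrative brute force: nodes are visited in order of increasing
    Hamming distance from src, and a node is reachable if at least one
    neighbor that is one step closer to src is reachable.
    """
    if src in faulty:
        return set()
    reachable = {src}
    others = sorted(
        (w for w in range(1 << n) if w != src),
        key=lambda w: hamming(w, src),
    )
    for w in others:
        if w in faulty:
            continue
        diff = w ^ src
        for bit in range(n):
            # Flip one differing bit back toward src to get a predecessor.
            if (diff >> bit) & 1 and (w ^ (1 << bit)) in reachable:
                reachable.add(w)
                break
    return reachable


def is_fully_reachable(n, faulty, src):
    """True if src reaches every nonfaulty node at distance l by a path of length l."""
    nonfaulty = set(range(1 << n)) - set(faulty)
    return minimal_path_reachable(n, faulty, src) == nonfaulty
```

    For example, in a 3-cube with faulty nodes 001 and 010, node 000 is not fully reachable because both shortest paths to 011 are blocked, while node 111 still is.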

  • RAM BIST

    Jacob SAVIR  

     
    PAPER-Integrated Electronics

      Vol:
    E84-C No:1
      Page(s):
    102-107

    This paper describes a random access memory (RAM, sometimes also called an array) test scheme that has the following attributes: (1) Can be used in both built-in mode and off chip/module mode. (2) Can be used to test and diagnose naked arrays. (3) Fault diagnosis is simple and is "free" for some faults during test. (4) Is never subject to aliasing. (5) Depending upon the test length, it can detect many kinds of failures, like stuck-cells, decoder faults, shorts, pattern-sensitive, etc. (6) If used as built-in feature, it does not slow down the normal operation of the array. (7) Does not require storage of correct responses. A single response bit always indicates whether a fault has been detected. Thus, the storage requirement for the implementation of the test scheme is zero. (8) If used as a built-in feature, the hardware overhead is very low.

  • Fault-Tolerant Robust Supervisor for Timed Discrete Event Systems: A Case Study on Spot Welding Processes

    Seong-Jin PARK  Jong-Tae LIM  

     
    LETTER-Theory of Automata, Formal Language Theory

      Vol:
    E83-D No:12
      Page(s):
    2178-2182

    In this paper we develop a robust control theory to achieve fault-tolerant behaviors of timed discrete event systems (DESs) with model uncertainty represented as a set of some possible models. To demonstrate the effectiveness of the proposed theory, we provide a case study of a resistance spot welding process.

  • Intrinsic Evolution for Synthesis of Fault-Recoverable Circuit

    Tae-Suh PARK  Chong-Ho LEE  Duck-Jin CHUNG  

     
    PAPER-Co-design and High-level Synthesis

      Vol:
    E83-A No:12
      Page(s):
    2488-2497

    This paper presents an evolutionary technique to build and maintain fault-recoverable digital circuits. Because the synthesis of a circuit by a genetic algorithm progresses according to the circuit's behavioral objectives and its interactions with the environment, knowledge of the architecture and of the placement and routing processes is not a major concern of the proposed method. The evolutionary behavior of the circuit also protects the circuit against stuck-at faults by continuously modifying the neighboring circuit blocks accordingly. This is done without prior knowledge of where and how the faults occur, owing to the evolutionary nature of the method. Thus, the overhead circuit blocks for fault diagnosis and redundancy are minimized with this design. The fault-recoverable evolvable hardware circuits are synthesized to build a few combinational logic circuits by evolution, and the fault recovery capabilities are demonstrated on a reconfigurable FPGA.

  • On a Weight Limit Approach for Enhancing Fault Tolerance of Feedforward Neural Networks

    Naotake KAMIURA  Teijiro ISOKAWA  Yutaka HATA  Nobuyuki MATSUI  Kazuharu YAMATO  

     
    PAPER-Fault Tolerance

      Vol:
    E83-D No:11
      Page(s):
    1931-1939

    To enhance the fault tolerance of feedforward neural networks (NNs for short) implemented in hardware, we discuss a learning algorithm that converges without adding extra neurons or a large amount of extra learning time and cycles. Our algorithm, modified from the standard backpropagation algorithm (SBPA for short), limits the synaptic weights of neurons to a range during the learning phase. The upper and lower bounds of the weights are calculated from their average and standard deviation. Our algorithm then resets any weight beyond the calculated range to the upper or lower bound. Since this decreases the standard deviation of the weights, it is useful in enhancing fault tolerance. We apply NNs trained with our algorithm and with other algorithms to a character recognition problem. It is shown that our algorithm is superior to the others in reliability, extra learning time and/or extra learning cycles. In addition, we clarify that our algorithm never degrades the generalization ability of NNs although it forces the weights into the calculated range.
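
    The weight-limiting step can be sketched as follows. The abstract states only that the bounds come from the average and standard deviation of the weights, so the form mean ± k·std and the multiplier k are assumptions made for illustration, as is the function name.

```python
import statistics


def limit_weights(weights, k=1.0):
    """Clamp each weight into [mean - k*std, mean + k*std].

    Assumption: the bound formula and the multiplier k are illustrative;
    the source abstract does not give the exact expression.
    """
    mean = statistics.fmean(weights)
    std = statistics.pstdev(weights)
    lower = mean - k * std
    upper = mean + k * std
    # Any weight outside the range is reset to the nearest bound.
    return [min(max(w, lower), upper) for w in weights]
```

    In a training loop, a step like this would run after each backpropagation update. Because values move toward the mean and never away from it, the standard deviation of the limited weights cannot exceed that of the original weights.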

  • Fault-Tolerant and Self-Stabilizing Protocols Using an Unreliable Failure Detector

    Hiroyoshi MATSUI  Michiko INOUE  Toshimitsu MASUZAWA  Hideo FUJIWARA  

     
    PAPER-Algorithms

      Vol:
    E83-D No:10
      Page(s):
    1831-1840

    We investigate the possibility of fault-tolerant and self-stabilizing protocols (ftss protocols) using an unreliable failure detector. Our main contributions are (1) to introduce k-accuracy of an unreliable failure detector, (2) to show that k-accuracy of a failure detector is necessary for any ftss k-group consensus protocol, and (3) to present three ftss k-group consensus protocols using a k-accurate and weakly complete failure detector under the read/write daemon on complete networks and on (n-k+1)-connected networks, and under the central daemon on complete networks.

  • Fault Tolerant Crossconnect and Wavelength Routing in All-Optical Networks

    Chuan-Ching SUE  Sy-Yen KUO  Yennun HUANG  

     
    PAPER

      Vol:
    E83-B No:10
      Page(s):
    2278-2293

    This paper proposes a fault tolerant optical crossconnect (FTOXC) which can tolerate link, channel, and internal optical switch failures via spare optical channels, extra input/output (I/O) ports for an optical switch, and associated wavelength converters. It also proposes a fault tolerant wavelength routing algorithm (FTWRA) which is used in the normal and the restored state. The FTOXC and FTWRA can be applied to any all-optical network and can recover many types of failures. FTOXC can configure the number of working and spare channels in each output link based on the traffic demand. Two formulations in this paper can be used to determine the optimal settings of channels. A global optimal setting of working and spare channels in each link can be found by formulating the problem as an integer linear program (ILP). In addition, the number of working and spare channels in each link can be dynamically adjusted according to the traffic loads and the system reliability requirements. The tradeoff between these two conflicting objectives is analyzed by the Markov decision process (MDP).

  • Efficient Test Generation Using Redundancy Identification

    Sangyoon HAN  Sungho KANG  

     
    LETTER-Fault Tolerance

      Vol:
    E83-D No:9
      Page(s):
    1814-1815

    To accomplish efficient test pattern generation, an isomorphism identification algorithm and a pseudo dominator identification algorithm are developed, which are used to identify redundant faults efficiently. Results show that test pattern generation using these algorithms is very efficient.

  • An FPGA Implementation of a Self-Reconfigurable System for the 1 1/2 Track-Switch 2-D Mesh Array with PE Faults

    Tadayoshi HORITA  Itsuo TAKANAMI  

     
    LETTER-Fault Tolerance

      Vol:
    E83-D No:8
      Page(s):
    1701-1705

    We gave in [1] the software and hardware algorithms for reconfiguring 1 1/2-track switch 2-D mesh arrays with faults of processing elements, avoiding them. This paper shows an implementation of the hardware algorithm using an FPGA device, and by the logical simulation confirms the correctness of the behavior and evaluates reconfiguration time. From the result it is found that a self-repairable system is realizable and the system is useful for the run-time as well as fabrication-time reconfiguration because it requires no host computer to execute the reconfiguration algorithm and the reconfiguration time is very short.

  • Efficient Techniques for Adaptive Independent Checkpointing in Distributed Systems

    Cheng-Min LIN  Chyi-Ren DOW  

     
    PAPER-Fault Tolerance

      Vol:
    E83-D No:8
      Page(s):
    1642-1653

    This work presents two novel algorithms to prevent rollback propagation for independent checkpointing: an efficient adaptive independent checkpointing algorithm and an optimized adaptive independent checkpointing algorithm. The last opportunity strategy, which yields better performance than the conservation strategy, is also employed to prevent useless checkpoints for both causal rewinding paths and non-causal rewinding paths. The two methods proposed herein are domino effect-free and require only a limited amount of control information. They also take fewer unnecessary adaptive checkpoints than other algorithms. Furthermore, experimental results indicate that the checkpoint overhead of our techniques is lower than that of the coordinated checkpointing and domino effect-free algorithms for service-providing applications.

  • A Reconfiguration Algorithm for Memory Arrays Containing Faulty Spares

    Keiichi HANDA  Kazuhito HARUKI  

     
    PAPER

      Vol:
    E83-A No:6
      Page(s):
    1123-1130

    Reconfiguration of memory arrays using spare lines is known to be an NP-complete problem. In this paper, we present an algorithm that reconfigures a memory array without any faults by using spare lines effectively even if they contain faulty elements. First, the reconfiguration problem is transformed to an equivalent covering problem in which faulty elements are covered by imaginary fault-free spare lines. Next, the covering problem is heuristically solved by using the Dulmage-Mendelsohn decomposition. The experiments for recently designed memory arrays show that the proposed algorithm is fast and practical.

  • Data-Driven Implementation of Highly Efficient TCP/IP Handler to Access the TINA Network

    Hiroshi ISHII  Hiroaki NISHIKAWA  Yuji INOUE  

     
    PAPER-Software Platform

      Vol:
    E83-B No:6
      Page(s):
    1355-1362

    This paper discusses and clarifies the effectiveness of a data-driven implementation of a protocol handling system for accessing the TINA (Telecommunications Information Networking Architecture) network and the Internet. TINA is a networking architecture that achieves networking services and management ubiquitously for users and networks. Many TINA-related ACTS (Advanced Communication Technologies and Services) projects have been organized in Europe. In Japan, The TINA Trial (TTT), which aimed to achieve ATM network management and services based on TINA architectures, was conducted by NTT and several manufacturers from April 1997 to April 1999. In these studies and trials, much effort was devoted to the development of software based on the service architecture and network architecture being standardized in TINA-C (TINA Consortium). In order to achieve a TINA environment universally on both the customer and network sides, we have to consider how to deploy the TINA environment onto the user side and how to use access transmission capacity as efficiently as possible. Recent technology can easily download applications and environments from the network side to the user side by using, e.g., Java. In accessing the network, there are several possible bottlenecks in information exchange on the customer side, such as PC processing capability, access protocol handling capability, and intra-house wiring bandwidth. In parallel with the TINA software architecture study, the authors have been studying versatile requirements for the hardware platform of the TINA network. In those studies, we have clarified that the stream-oriented data-driven processors the authors have been studying and developing have high reliability and high multiprocessing and multimedia information processing capability. Based on these studies, this paper first shows, through mathematical and emulation studies, that a von Neumann-based protocol handler is ineffective for multiprocessing. Then, we show by an emulation study that our data-driven protocol handling can effectively realize access protocol handling. Next, we describe the result of the first step of implementing data-driven TCP/IP protocol handling. This result shows that our TCP/IP hub based on a data-driven processor is applicable not only to the TINA/CORBA network but also to normal Internet access. Finally, we show a possible customer premises network configuration that resolves the bottleneck in accessing the TINA network through ATM access.

  • Monte Carlo Simulation for Analysis of Sequential Failure Logic

    Wei LONG  Yoshinobu SATO  Hua ZHANG  

     
    PAPER

      Vol:
    E83-A No:5
      Page(s):
    812-817

    The Monte Carlo simulation is applied to fault tree analyses of the sequential failure logic. In order to make the validity of the technique clear, case studies for estimation of the statistically expected numbers of system failures during (0, t] are conducted for two types of systems using the multiple integration method as well as the Monte Carlo simulation. Results from these two methods are compared. This validates the Monte Carlo simulation in solving the sequential failure logic with respectably small deviation rates for those cases.
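
    The comparison between Monte Carlo simulation and an integration result can be illustrated on a much simpler case than the systems studied in the paper. The sketch below assumes two non-repairable components A and B with exponential lifetimes, and a system failure that occurs only if A fails before B and both fail within (0, t]. The rates, function names and closed-form expression belong to this simplified example, not to the paper.

```python
import math
import random


def simulate_sequential_failure(rate_a, rate_b, t, trials, seed=0):
    """Monte Carlo estimate of P(T_A < T_B <= t) for exponential lifetimes."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        t_a = rng.expovariate(rate_a)
        t_b = rng.expovariate(rate_b)
        # The sequence matters: A must fail first, then B, all by time t.
        if t_a < t_b <= t:
            hits += 1
    return hits / trials


def exact_sequential_failure(rate_a, rate_b, t):
    """Closed form of the same probability.

    Obtained by integrating b*exp(-b*y) * (1 - exp(-a*y)) over y in (0, t].
    """
    total = rate_a + rate_b
    return (1 - math.exp(-rate_b * t)) - rate_b / total * (1 - math.exp(-total * t))
```

    Calling both functions with the same rates and horizon and comparing the results gives a small-scale version of the validation described above.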

  • Mobile Agent-Based Transactions in Open Environments

    Flavio Morais de ASSIS SILVA  Radu POPESCU-ZELETIN  

     
    PAPER-Mobile Agents

      Vol:
    E83-B No:5
      Page(s):
    973-987

    This paper describes a transaction model for open environments based on mobile agents. Mobile agent-based transactions combine mobility and the execution of control flows with transactional semantics. The model presented represents an approach for providing reliability and correctness of the execution of distributed activities, which fulfills important requirements of applications in open environments. The presented transaction model is based on a protocol for providing fault tolerance when executing mobile agent-based activities. This protocol is outlined in this paper. With this protocol, if an agent executing an activity at an agency (a logical "place" in a distributed agent environment) becomes unreachable for a long time, the execution of the activity can be recovered and continued at another agency. The fault tolerance approach supports "multi-agent activities," i.e., activities where some parts are spawned to execute and migrate asynchronously in relation to other parts. The described transaction model, called the basic (agent-based) transaction model, is an open nested transaction model. By being based on the presented fault tolerance mechanism, subtransactions can be executed asynchronously in relation to their parent transactions and agent-based transactions can explore alternatives in the event of agent unavailability. The model fulfills requirements for supporting the autonomy of organizations in a distributed agent environment.

  • On Reconfiguration Latency in Fault-Tolerant Systems

    Hagbae KIM  Sangmoon LEE  Taewha HONG  

     
    LETTER-Fault Tolerance

      Vol:
    E83-D No:5
      Page(s):
    1181-1182

    The reconfiguration latency is defined as the time taken for reconfiguring a system upon failure detection or mode change. We evaluate it quantitatively for backup sparing, which is one of the most popular reconfiguration methods, by investigating the effects of key parameters.

  • Duplicated Hash Routing: A Robust Algorithm for a Distributed WWW Cache System

    Eiji KAWAI  Kadohito OSUGA  Ken-ichi CHINEN  Suguru YAMAGUCHI  

     
    PAPER

      Vol:
    E83-D No:5
      Page(s):
    1039-1047

    Hash routing is an algorithm for a distributed WWW caching system that achieves a high hit rate by preventing overlaps of objects between caches. However, one of the drawbacks of hash routing is its lack of robustness against failure. Because the WWW has become a vital service on the Internet, the fault tolerance of systems that provide the WWW service has become important. In this paper, we propose a duplicated hash routing algorithm, an extension of hash routing. Our algorithm introduces minimum redundancy to maintain system performance when some caching nodes crash. In addition, we optionally allow each node to cache objects requested by its local clients (local caching), which may waste cache capacity of the system but can cut down the network traffic between caching nodes. We evaluate various aspects of the system performance such as hit rates, error rates and network traffic by simulations and compare them with those of other algorithms. The results show that our algorithm achieves both high fault tolerance and high performance with low system overhead.
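
    The idea of keeping a duplicate of each object on a second cache can be sketched as follows. The abstract does not say how the duplicate node is chosen, so this sketch substitutes rendezvous (highest-random-weight) hashing as a stand-in. The node names, the `copies` parameter and all function names are hypothetical.

```python
import hashlib


def _score(node, url):
    """Deterministic pseudo-random score for a (node, URL) pair."""
    digest = hashlib.sha256(f"{node}|{url}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def replica_nodes(url, nodes, copies=2):
    """Nodes responsible for url, primary first, then duplicates.

    Assumption: rendezvous hashing stands in for the paper's hash function.
    """
    ranked = sorted(nodes, key=lambda node: _score(node, url), reverse=True)
    return ranked[:copies]


def route(url, nodes, alive, copies=2):
    """Return the first live responsible node, or None if all copies are down."""
    for node in replica_nodes(url, nodes, copies):
        if node in alive:
            return node
    return None
```

    With `copies=2`, a request for a URL goes to its primary cache while that node is alive and falls back to the single duplicate when the primary has crashed. Plain hash routing corresponds to `copies=1`, where a crash of the responsible node leaves no cache to serve the object.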

  • Fault Diagnosis Technique for Yield Enhancement of Logic LSI Using IDDQ

    Masaru SANADA  Hiromu FUJIOKA  

     
    PAPER

      Vol:
    E83-A No:5
      Page(s):
    842-850

    Abnormal IDDQ (Quiescent VDD supply current) indicates the existence of physical damage in a circuit. Using this phenomenon, a CAD-based fault diagnosis technology has been developed to analyze the manufacturing yield of logic LSI. This method to detect the fatal defect fragments in several abnormalities identified with wafer inspection apparatus includes a way to separate various leakage faults, and to define the diagnosis area encircling the abnormal portions. The proposed technique progressively narrows the faulty area by using logic simulation to extract the logic states of the diagnosis area, and by locating test vectors related to abnormal IDDQ. The fundamental diagnosis way employs the comparative operation of each circuit element to determine whether the same logic state with abnormal IDDQ exists in normal logic state or not.

  • Defect and Fault Tolerance SRAM-Based FPGAs by Shifting the Configuration Data

    Abderrahim DOUMAR  Hideo ITO  

     
    PAPER-Fault Tolerance

      Vol:
    E83-D No:5
      Page(s):
    1104-1115

    The homogeneous structure of field programmable gate arrays (FPGAs) suggests that defect tolerance can be achieved by shifting the configuration data inside the FPGA. This paper proposes a new approach for tolerating defects in the FPGA's configurable logic blocks (CLBs). Defects affecting the FPGA's interconnection resources can also be tolerated with a high probability. This method is suited to manufacturers, since the yield of the chip is considerably improved, especially for large sizes. On the other hand, defect-free chips can be used as either maximum size, ordinary array chips or fault tolerant chips. In the fault tolerant chips, users will be able to achieve fault tolerance directly by only shifting the design data automatically, without changing the physical design of the running application, without loading other configuration data from outside the FPGA chip, and without the intervention of the company. For tolerating defective resources, the use of spare CLBs is required. In this paper, two possibilities for distributing the spare resources (king-shifting and horse-allocation) are introduced and compared.

  • Distributed Software Agents for Network Fault Management

    Hassan HAJJI  Behrouz Homayoun FAR  

     
    PAPER-Application

      Vol:
    E83-D No:4
      Page(s):
    735-746

    This paper discusses a framework for automating fault management using distributed software agents. The management function is distributed among multiple agents that can carry out advanced reasoning activities on the network domain. Network domain modeling using Bayesian networks is introduced. The agent detects, correlates and selectively seeks to derive a clear explanation of the alarms generated in its domain. Depending on the network's degree of automation, the agent can even carry out local recovery actions. The ideas of the paper are implemented in software for inference in Bayesian networks. We identify the potential for learning in the agent model, and present the class of problems to be addressed.
