The search functionality is under construction.

Keyword Search Result

[Keyword] Asynchronous circuits(22hit)

1-20hit(22hit)

  • Optimization Methods during RTL Conversion from Synchronous RTL Models to Asynchronous RTL Models

    Shogo SEMBA  Hiroshi SAITO  Masato TATSUOKA  Katsuya FUJIMURA  

     
    PAPER

      Vol:
    E103-A No:12
      Page(s):
    1417-1426

    In this paper, we propose four optimization methods during the Register Transfer Level (RTL) conversion from synchronous RTL models into asynchronous RTL models. The modularization of data-path resources and the use of appropriate D flip-flops reduce the circuit area. Fixing the control signal of the multiplexers and inserting latches for the data-path resources reduce the dynamic power consumption. In the experiment, we evaluated the effect of the proposed optimization methods. The combination of all optimization methods could reduce the energy consumption by 21.9% on average compared to the ones without the proposed optimization methods.

  • A Design Method for Designing Asynchronous Circuits on Commercial FPGAs Using Placement Constraints

    Tatsuki OTAKE  Hiroshi SAITO  

     
    PAPER

      Vol:
    E103-A No:12
      Page(s):
    1427-1436

    In this paper, we propose a design method to design asynchronous circuits with bundled-data implementation on commercial Field Programmable Gate Arrays using placement constraints. The proposed method uses two types of placement constraints to reduce the number of delay adjustments to fix timing violations and to improve the performance of the bundled-data implementation. We also propose a floorplan algorithm to reduce the control-path delays specific to the bundled-data implementation. Using the proposed method, we could design the asynchronous circuits whose performance is close to and energy consumption is small compared to the synchronous counterparts with less delay adjustment.

  • Conversion from Synchronous RTL Models to Asynchronous RTL Models

    Shogo SEMBA  Hiroshi SAITO  

     
    PAPER

      Vol:
    E102-A No:7
      Page(s):
    904-913

    In this paper, to make asynchronous circuit design easy, we propose a conversion method from synchronous Register Transfer Level (RTL) models to asynchronous RTL models with bundled-data implementation. The proposed method consists of the generation of an intermediate representation from a given synchronous RTL model and the generation of an asynchronous RTL model from the intermediate representation. This allows us to deal with different representation styles of synchronous RTL models. We use the eXtensible Markup Language (XML) as the intermediate representation. In addition to the asynchronous RTL model, the proposed method generates a simulation model when the target implementation is a Field Programmable Gate Array and a set of non-optimization constraints for the control circuit used in logic synthesis and layout synthesis. In the experiment, we demonstrate that the proposed method can convert synchronous RTL models specified manually and obtained by a high-level synthesis tool to asynchronous ones.

  • Novel Implementation Method of Multiple-Way Asynchronous Arbiters

    Masashi IMAI  Tomohiro YONEDA  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E98-A No:7
      Page(s):
    1519-1528

    Multiple-way (N-way) asynchronous arbitration is an important issue in asynchronous system design. In this paper, novel implementation methods of N-way asynchronous arbiters are presented. We first present N-way rectangle mesh arbiters using 2-way mutual exclusion elements. Then, N-way token-ring arbiters based on the non-return-to-zero signaling is also presented. The former can issue grant signals with the same percentage for all the arriving request signals while the latency is proportional to the number of inputs. The latter can achieve low latency and low energy arbitration for a heavy workload environment and a large number of inputs. In this paper, we compare their performances using the 28nm FD-SOI process technologies qualitatively and quantitatively.

  • A Low EMI Circuit Design with Asynchronous Multi-Frequency Clocking

    Jeong-Gun LEE  

     
    BRIEF PAPER-Electronic Circuits

      Vol:
    E97-C No:12
      Page(s):
    1158-1161

    In this paper, we propose a new design technique called extit{asynchronous multi-frequency clocking} for suppressing EMI at a chip design level by combining two independent EMI-suppressing approaches: extit{multi-frequency clocking} and extit{asynchronous circuit design} techniques. To show the effectiveness of our approach, a five-stage pipelined asynchronous MIPS with multi-frequency clocking has been implemented on a commercial Xilinx FPGA device. Our approach shows 11.05 dB and 5.88 dB reductions of peak EM radiation in the prototyped implementation when compared to conventional synchronous and bundled-data asynchronous circuit counterparts, respectively.

  • Asynchronous Stochastic Decoding of LDPC Codes: Algorithm and Simulation Model

    Naoya ONIZAWA  Warren J. GROSS  Takahiro HANYU  Vincent C. GAUDET  

     
    PAPER-VLSI Architecture

      Vol:
    E97-D No:9
      Page(s):
    2286-2295

    Stochastic decoding provides ultra-low-complexity hardware for high-throughput parallel low-density parity-check (LDPC) decoders. Asynchronous stochastic decoding was proposed to demonstrate the possibility of low power dissipation and high throughput in stochastic decoders, but decoding might stop before convergence due to “lock-up”, causing error floors that also occur in synchronous stochastic decoding. In this paper, we introduce a wire-delay dependent (WDD) scheduling algorithm for asynchronous stochastic decoding in order to reduce the error floors. Instead of assigning the same delay to all computation nodes in the previous work, different computation delay is assigned to each computation node depending on its wire length. The variation of update timing increases switching activities to decrease the possibility of the “lock-up”, lowering the error floors. In addition, the WDD scheduling algorithm is simplified for the hardware implementation in order to eliminate time-averaging and multiplication functions used in the original WDD scheduling algorithm. BER performance using a regular (1024, 512) (3,6) LDPC code is simulated based on our timing model that has computation and wire delay estimated under ASPLA 90nm CMOS technology. It is demonstrated that the proposed asynchronous decoder achieves a 6.4-9.8× smaller latency than that of the synchronous decoder with a 0.25-0.3 dB coding gain.

  • High-Throughput Partially Parallel Inter-Chip Link Architecture for Asynchronous Multi-Chip NoCs

    Naoya ONIZAWA  Akira MOCHIZUKI  Hirokatsu SHIRAHAMA  Masashi IMAI  Tomohiro YONEDA  Takahiro HANYU  

     
    PAPER-Dependable Computing

      Vol:
    E97-D No:6
      Page(s):
    1546-1556

    This paper introduces a partially parallel inter-chip link architecture for asynchronous multi-chip Network-on-Chips (NoCs). The multi-chip NoCs that operate as a large NoC have been recently proposed for very large systems, such as automotive applications. Inter-chip links are key elements to realize high-performance multi-chip NoCs using a limited number of I/Os. The proposed asynchronous link based on level-encoded dual-rail (LEDR) encoding transmits several bits in parallel that are received by detecting the phase information of the LEDR signals at each serial link. It employs a burst-mode data transmission that eliminates a per-bit handshake for a high-speed operation, but the elimination may cause data-transmission errors due to cross-talk and power-supply noises. For triggering data retransmission, errors are detected from the embedded phase information; error-detection codes are not used. The throughput is theoretically modelled and is optimized by considering the bit-error rate (BER) of the link. Using delay parameters estimated for a 0.13 µm CMOS technology, the throughput of 8.82 Gbps is achieved by using 10 I/Os, which is 90.5% higher than that of a link using 9 I/Os without an error-detection method operating under negligible low BER (<10-20).

  • An ASIC Design Support Tool Set for Non-pipelined Asynchronous Circuits with Bundled-Data Implementation

    Minoru IIZUKA  Naohiro HAMADA  Hiroshi SAITO  

     
    PAPER

      Vol:
    E96-C No:4
      Page(s):
    482-491

    This paper proposes an ASIC design support tool set for non-pipelined asynchronous circuits with bundled-data implementation. This tool set consists of seven tools to automate design processes of bundled-data implementation such as the generation of design constraints, timing verification, and delay adjustment considering a given latency constraint. With the proposed design flow which combines the proposed tool set and commercial CAD tools, most of design processes from an RTL model is fully automated. In the experiments, to show the effectiveness of energy consumption in bundled-data implementation compared to synchronous counterpart, this paper synthesizes several circuits with a latency constraint which is generated from the synchronous counterpart with the minimum clock cycle time.

  • Long-Range Asynchronous On-Chip Link Based on Multiple-Valued Single-Track Signaling

    Naoya ONIZAWA  Atsushi MATSUMOTO  Takahiro HANYU  

     
    PAPER-Circuit Theory

      Vol:
    E95-A No:6
      Page(s):
    1018-1029

    We have developed a long-range asynchronous on-chip data-transmission link based on multiple-valued single-track signaling for a highly reliable asynchronous Network-on-Chip. In the proposed signaling, 1-bit data with control information is represented by using a one-digit multi-level signal, so serial data can be transmitted asynchronously using only a single wire. The small number of wires alleviates the routing complexity of wiring long-range interconnects. The use of current-mode signaling makes it possible to transmit data at high speed without buffers or repeaters over a long interconnect wire because of the low-voltage swing of signaling, and it leads to low-latency data transmission. We achieve a latency of 0.45 ns, a throughput of 1.25 Gbps, and energy dissipation of 0.58 pJ/bit with a 10-mm interconnect wire under a 0.13 µm CMOS technology. This represents an 85% decrease in latency, a 150% increase in throughput, and a 90% decrease in energy dissipation compared to a conventional serial asynchronous data-transmission link.

  • Integration of Behavioral Synthesis and Floorplanning for Asynchronous Circuits with Bundled-Data Implementation

    Naohiro HAMADA  Hiroshi SAITO  

     
    PAPER

      Vol:
    E95-C No:4
      Page(s):
    506-515

    In this paper, we propose a synthesis method for asynchronous circuits with bundled-data implementation. The proposed method iteratively applies behavioral synthesis and floorplanning to obtain a near optimum circuit in the term of latency under given design constraints. To improve latency, behavioral synthesis and floorplanning are carried out so that the delay of the control circuit is minimized and the addition of delay elements to satisfy timing constraints is minimized. We evaluate the effectiveness of the proposed method in terms of latency, area, and the number of timing violations while synthesizing several benchmarks. Experimental results show that the proposed method synthesizes faster circuits compared to the circuit synthesized without the proposed method. Also, the proposed method is effective to reduce the number of timing violations.

  • Asynchronous Pipeline Controller Based on Early Acknowledgement Protocol

    Chammika MANNAKKARA  Tomohiro YONEDA  

     
    PAPER-Computer System

      Vol:
    E93-D No:8
      Page(s):
    2145-2161

    A new pipeline controller based on the Early Acknowledgement (EA) protocol is proposed for bundled-data asynchronous circuits. The EA protocol indicates acknowledgement by the falling edge of the acknowledgement signal in contrast to the 4-phase protocol, which indicates it on the rising edge. Thus, it can hide the overhead caused by the resetting period of the handshake cycle. Since we have designed our controller assuming several timing constraints, we first analyze the timing constraints under which our controller correctly works and then discuss their appropriateness. The performance of the controller is compared both analytically and experimentally with those of two other pipeline controllers, namely, a very high-speed 2-phase controller and an ordinary 4-phase controller. Our controller performs better than a 4-phase controller when pipeline has processing elements. We have obtained interesting results in the case of a non-linear pipeline with a Conditional Branch (CB) operation. Our controller has slightly better performance even compared to 2-phase controller in the case of a pipeline with processing elements. Its superiority lies in the EA protocol, which employs return-to-zero control signals like the 4-phase protocol. Hence, our controller for CB operation is simple in construction just like the 4-phase controller. A 2-phase controller for the same operation needs to have a slightly complicated mechanism to handle the 2-phase operation because of the non-return-to-zero control signals, and this results in a performance overhead.

  • Highly Reliable Multiple-Valued One-Phase Signalling for an Asynchronous On-Chip Communication Link

    Naoya ONIZAWA  Takahiro HANYU  

     
    PAPER-Multiple-Valued VLSI Technology

      Vol:
    E93-D No:8
      Page(s):
    2089-2099

    This paper presents highly reliable multiple-valued one-phase signalling for an asynchronous on-chip communication link under process, supply-voltage and temperature variations. New multiple-valued dual-rail encoding, where each code is represented by the minimum set of three values, makes it possible to perform asynchronous communication between modules with just two wires. Since an appropriate current level is individually assigned to the logic value, a sufficient dynamic range between adjacent current signals can be maintained in the proposed multiple-valued current-mode (MVCM) circuit, which improves the robustness against the process variation. Moreover, as the supply-voltage and the temperature variations in smaller dimensions of circuit elements are dominated as the common-mode variation, a local reference voltage signal according to the variations can be adaptively generated to compensate characteristic change of the MVCM-circuit component. As a result, the proposed asynchronous on-chip communication link is correctly operated in the operation range from 1.1 V to 1.4 V of the supply voltage and that from -50 to 75 under the process variation of 3σ. In fact, it is demonstrated by HSPICE simulation in a 0.13-µm CMOS process that the throughput of the proposed circuit is enhanced to 435% in comparison with that of the conventional 4-phase asynchronous communication circuit under a comparable energy dissipation.

  • Ultra Low Power Delay Element with Post-Chip Adjustable Ability

    Jung-Lin YANG  Chih-Wei CHAO  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E92-A No:12
      Page(s):
    3381-3389

    Our paper proposes a low power delay element with many other valuable characteristics for asynchronous circuits in the bundled-data implementation. Delay elements are frequently utilized to interact with asynchronous environment for revealing the current status of the bundled-data asynchronous circuits. Thus, a notable portion of the total energy is consumed by the delay elements for this kind of designs. Moreover, constructing a specific delay on a chip is a difficult task for recent CMOS technology. An extreme low power asymmetrical delay element with post-chip adjustment feature was developed mainly for solving these issues. Our initial intention was to develop a programmable delay element for asynchronous data path components. The proposed delay element is also suitable for many other applications requiring low power constraint. In addition to the programmability, the delay element also demonstrated efficiently characteristics such as good tolerance to process and temperature variations on the delay. Our delay element is equivalent to approximately the average power of a 4-stage inverter chain. A large delay can be obtained by cascaded scheme with nearly zero handshaking overhead. All arguments were cautiously verified by the post-layout simulation setup using TSMC 0.35 µm and 0.18 µm technologies under all extreme corners.

  • A Conservative Framework for Safety-Failure Checking

    Frederic BEAL  Tomohiro YONEDA  Chris J. MYERS  

     
    PAPER-Verification and Timing Analysis

      Vol:
    E91-D No:3
      Page(s):
    642-654

    We present a new framework for checking safety failures. The approach is based on the conservative inference of the internal states of a system by the observation of the interaction with its environment. It is based on two similar mechanisms : forward implication, which performs the analysis of the consequences of an input applied to the system, and backward implication, that performs the same task for an output transition. While being a very simple approach, it is general and we believe it can yield efficient algorithms in different safety-failure checking problems. As a case study, we have applied this framework to an existing problem, the hazard checking in (speed-independent) asynchronous circuits. Our new methodology yields an efficient algorithm that performs better or as well as all existing algorithms, while being more general than the fastest one.

  • Scheduling Methods for Asynchronous Circuits with Bundled-Data Implementations Based on the Approximation of Start Times

    Hiroshi SAITO  Naohiro HAMADA  Nattha JINDAPETCH  Tomohiro YONEDA  Chris MYERS  Takashi NANYA  

     
    PAPER-System Level Design

      Vol:
    E90-A No:12
      Page(s):
    2790-2799

    This paper proposes new scheduling methods for asynchronous circuits with bundled-data implementations. Since operations in asynchronous circuits start after the completion of a previous operation, this method approximates the set of start times for each operation using the delay of the resources. Next, this method decides on control steps from the approximated sets of start times, which are used in scheduling algorithms. This paper extends two scheduling algorithms used for synchronous circuits so that the approximated sets of start times and the decided control steps are used. Finally, this paper shows the effectiveness of our proposed methods by comparing scheduling results with ones obtained by the original two scheduling algorithms.

  • A Cost-Effective Handshake Protocol and Its Implementation for Bundled-Data Asynchronous Circuits

    Masakazu SHIMIZU  Koki ABE  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E89-A No:1
      Page(s):
    280-287

    We propose and implement a four-phase handshake protocol for bundled-data asynchronous circuits with consideration given to power consumption and area. A key aspect is that our protocol uses three phases for generating the matched delay to signal the completion of the data-path stage operation whereas conventional methods use only one phase. A comparison with other protocols at 0.18 µm process showed that our protocol realized lower power consumption than any other protocol at cycle times of 1.2 ns or more. The area of the delay generator required for a given data-path delay was less than half that of other protocols. The overhead of the timing generator was the same as or less than that of other protocols.

  • Partial Order Reduction for Timed Circuit Verification Based on Level Oriented Model

    Tomoya KITAI  Yusuke OGURO  Tomohiro YONEDA  Eric MERCER  Chris MYERS  

     
    PAPER-Verification and Dependability Analysis

      Vol:
    E86-D No:12
      Page(s):
    2601-2611

    Using a level oriented model for verification of asynchronous circuits helps users to easily construct formal models with high readability or to naturally model data-path circuits. On the other hand, in order to use such a model for larger circuit, some technique to avoid the state explosion problem is essential. This paper first defines a level oriented formal model based on time Petri nets, and then proposes a partial order reduction algorithm that prunes unnecessary state generation while guaranteeing the correctness of the verification.

  • Eliminating Isochronic-Fork Constraints in Quasi-Delay-Insensitive Circuits

    Nattha SRETASEREEKUL  Takashi NANYA  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E86-A No:4
      Page(s):
    900-907

    The Quasi-Delay-Insensitive (QDI) model assumes that all the forks are isochronic. The isochronic-fork assumption requires uniform wire delays and uniform switching thresholds of the gates associated with the forking branches. This paper presents a method for determining such forks that do not have to satisfy the isochronic fork requirements, and presents experimental results that show many isochronic forks assumed for existing QDI circuits do not actually have to be "isochronic" or can be even ignored.

  • High-Level Test Generation for Asynchronous Circuits from Signal Transition Graph

    Eunjung OH  Soo-Hyun KIM  Dong-Ik LEE  Ho-Yong CHOI  

     
    PAPER-Test Generation

      Vol:
    E85-A No:12
      Page(s):
    2674-2683

    In this paper, we have proposed an efficient high-level test generation method for asynchronous circuits. The test generation is based on specification level, especially on Signal Transition Graph (STG), which is a kind of specification method for asynchronous circuits. We define a high-level fault model, called a single State Transition Fault (STF) model on STG. Test patterns for STFs are generated based on Stable State Graph (SSG), which can be derived from STG directly. The state space explored in test generation is greatly reduced and hence the test generation cost is small in terms of execution time. To enhance the fault coverage at gate-level, we have also proposed an extended STF (ESTF) model with additional gate-level information. Experimental results show that the generated test for STFs achieves high fault coverage with low cost for single stuck-at faults of its corresponding synthesized gate-level circuit. The generated test for ESTFs attains higher fault coverage with same benchmark in cost of longer execution time. Further, we have also proposed a 3-phase test generation based on the above proposed methods. An effective test generation is implemented by 3-phase: 1) test generation for STFs, 2) test generation for ESTFs, and 3) test generation using an asynchronous product machine traversal method. Experimental results also show that the proposed 3-phase test generation achieves higher fault coverage in cost of longer execution time.

  • Framework of Timed Trace Theoretic Verification Revisited

    Bin ZHOU  Tomohiro YONEDA  Chris MYERS  

     
    PAPER-Verification

      Vol:
    E85-D No:10
      Page(s):
    1595-1604

    This paper develops a framework to support trace theoretic verification of timed circuits and systems. A theoretical foundation for classifying timed traces as either successes or failures is developed. The concept of the semimirror is introduced to allow conformance checking thus supporting hierarchical verification of timed circuits and systems. Finally, we relate our framework to those previously proposed for timing verification.

1-20hit(22hit)