
Keyword Search Result

[Keyword] network-on-chip(36hit)

21-36hit(36hit)

  • A Locality-Aware Hybrid NoC Configuration Algorithm Utilizing the Communication Volume among IP Cores

    Seungju LEE  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER-VLSI Design Technology and CAD

    Vol: E95-A No: 9 Page(s): 1538-1549

    Network-on-chip (NoC) architectures have emerged as a promising solution to the lack of scalability in multi-processor systems-on-chips (MPSoCs). With the explosive growth in the usage of multimedia applications, NoCs are expected to serve as multimedia servers supporting multi-class services. In this paper, we propose a configuration algorithm for a hybrid bus-NoC architecture, together with simulation results. Our target architecture, called busmesh NoC (BMNoC), is a generalized version of a hybrid NoC with local buses. In our BMNoC configuration algorithm, cores with heavy communication volume between them are mapped to a cluster node (CN) and connected by a local bus. CNs communicate with each other via edge switches (ESes) and mesh routers (MRs). With this hierarchical communication network, the proposed algorithm improves latency compared with conventional methods. Results for several realistic applications demonstrate better performance than earlier studies and the feasibility of the proposed algorithm.
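The clustering idea in this abstract, where cores with heavy mutual communication share a cluster node, can be illustrated with a small greedy sketch. This is not the authors' BMNoC algorithm: the function name `cluster_cores`, the `max_size` capacity limit, and the heaviest-pair-first merge rule are all assumptions made for illustration.

```python
def cluster_cores(cores, volume, max_size):
    """Greedily group cores so that heavily communicating pairs share a cluster.

    cores    -- iterable of core names
    volume   -- dict mapping (core_a, core_b) to communication volume
    max_size -- hypothetical limit on cores per cluster node
    """
    # Every core starts in its own single-member cluster.
    cluster_of = {c: {c} for c in cores}
    # Visit core pairs from heaviest to lightest communication volume.
    for (a, b), _ in sorted(volume.items(), key=lambda kv: kv[1], reverse=True):
        ca, cb = cluster_of[a], cluster_of[b]
        if ca is cb or len(ca) + len(cb) > max_size:
            continue  # already together, or the merged cluster would be too big
        merged = ca | cb
        for c in merged:
            cluster_of[c] = merged
    unique = {frozenset(s) for s in cluster_of.values()}
    return sorted(sorted(s) for s in unique)


clusters = cluster_cores(
    ["a", "b", "c", "d"],
    {("a", "b"): 100, ("c", "d"): 80, ("b", "c"): 10},
    max_size=2,
)
```

In this toy input the two heavy pairs end up in separate clusters, and the light `b`-`c` traffic would cross the mesh between them.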

  • Long-Range Asynchronous On-Chip Link Based on Multiple-Valued Single-Track Signaling

    Naoya ONIZAWA  Atsushi MATSUMOTO  Takahiro HANYU  

     
    PAPER-Circuit Theory

    Vol: E95-A No: 6 Page(s): 1018-1029

    We have developed a long-range asynchronous on-chip data-transmission link based on multiple-valued single-track signaling for a highly reliable asynchronous Network-on-Chip. In the proposed signaling, 1-bit data with control information is represented by using a one-digit multi-level signal, so serial data can be transmitted asynchronously using only a single wire. The small number of wires alleviates the routing complexity of wiring long-range interconnects. The use of current-mode signaling makes it possible to transmit data at high speed without buffers or repeaters over a long interconnect wire because of the low-voltage swing of signaling, and it leads to low-latency data transmission. We achieve a latency of 0.45 ns, a throughput of 1.25 Gbps, and energy dissipation of 0.58 pJ/bit with a 10-mm interconnect wire under a 0.13 µm CMOS technology. This represents an 85% decrease in latency, a 150% increase in throughput, and a 90% decrease in energy dissipation compared to a conventional serial asynchronous data-transmission link.
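The relative improvements quoted in this abstract imply approximate figures for the conventional link. The short calculation below back-computes them. The helper name `implied_baseline` is hypothetical, and because the quoted percentages are presumably rounded, the results are estimates rather than values reported by the authors.

```python
def implied_baseline(value, percent_change):
    """Return the baseline b such that value = b * (1 + percent_change / 100)."""
    return value / (1 + percent_change / 100)


# Figures taken from the abstract: 85% lower latency, 150% higher
# throughput, 90% lower energy than the conventional link.
latency_ns = implied_baseline(0.45, -85)        # roughly 3 ns
throughput_gbps = implied_baseline(1.25, 150)   # 0.5 Gbps
energy_pj_per_bit = implied_baseline(0.58, -90) # roughly 5.8 pJ/bit
```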

  • Support Efficient and Fault-Tolerant Multicast in Bufferless Network-on-Chip

    Chaochao FENG  Zhonghai LU  Axel JANTSCH  Minxuan ZHANG  Xianju YANG  

     
    PAPER-Computer System

    Vol: E95-D No: 4 Page(s): 1052-1061

    In this paper, we propose three Deflection-Routing-based Multicast (DRM) schemes for a bufferless NoC. The DRM scheme without packet replication (DRM_noPR) sends a multicast packet along a non-deterministic path. The DRM schemes with adaptive packet replication (DRM_PR_src and DRM_PR_all) replicate multicast packets at the source or at intermediate nodes, according to the destination position and the state of the output ports, to reduce the average multicast latency. We also provide fault-tolerance support in these schemes through a reinforcement-learning-based method that reconfigures the routing table to tolerate permanently faulty links in the network. Simulation results show that, under three synthetic traffic patterns, the DRM_PR_all scheme achieves on average 41%, 43% and 37% lower latency than the DRM_noPR scheme and 27%, 29% and 25% lower latency than the DRM_PR_src scheme, respectively. In addition, all three fault-tolerant DRM schemes show acceptable performance degradation at various link fault rates without any packet loss.
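For readers unfamiliar with deflection routing in a bufferless router, the sketch below shows the basic port-assignment step for unicast flits only. It does not model the DRM multicast replication or the reinforcement-learning reconfiguration from the abstract. The function name `assign_ports`, the oldest-first priority, and the flit tuple layout are assumptions chosen for illustration.

```python
def assign_ports(flits, ports=("N", "E", "S", "W")):
    """Assign every incoming flit to a distinct output port in one cycle.

    flits -- list of (flit_id, age, preferred_ports) tuples; preferred_ports
             lists the productive ports in order of preference.
    A bufferless router cannot hold a flit, so a flit whose preferred ports
    are all taken is deflected to any remaining free port.
    """
    if len(flits) > len(ports):
        raise ValueError("more flits than output ports")
    free = list(ports)
    assignment = {}
    # Assumed priority rule: older flits choose first.
    for flit_id, _age, preferred in sorted(flits, key=lambda f: -f[1]):
        choice = next((p for p in preferred if p in free), None)
        if choice is None:
            choice = free[0]  # deflection: take the first free port
        free.remove(choice)
        assignment[flit_id] = choice
    return assignment


result = assign_ports([("a", 5, ["E"]), ("b", 3, ["E"])])
```

Here both flits want port `E`; the older flit `a` gets it and flit `b` is deflected.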

  • A Process-Variation-Adaptive Network-on-Chip with Variable-Cycle Routers and Variable-Cycle Pipeline Adaptive Routing

    Yohei NAKATA  Hiroshi KAWAGUCHI  Masahiko YOSHIMOTO  

     
    PAPER

    Vol: E95-C No: 4 Page(s): 523-533

    As process technology is scaled down, a typical system on a chip (SoC) becomes denser. In scaled process technology, process variation becomes greater and increasingly affects the SoC circuits. Moreover, the process variation strongly affects network-on-chips (NoCs) that have a synchronous network across the chip. Therefore, its network frequency is degraded. We propose a process-variation-adaptive NoC with a variation-adaptive variable-cycle router (VAVCR). The proposed VAVCR can configure its cycle latency adaptively on a processor core basis, corresponding to the process variation. It can increase the network frequency, which is limited by the process variation in a conventional router. Furthermore, we propose a variable-cycle pipeline adaptive routing (VCPAR) method with VAVCR; the proposed VCPAR can reduce packet latency and has tolerance to network congestion. The total execution time reduction of the proposed VAVCR with VCPAR is 15.7%, on average, for five task graphs.

  • Hybrid Wired/Wireless On-Chip Network Design for Application-Specific SoC

    Shouyi YIN  Yang HU  Zhen ZHANG  Leibo LIU  Shaojun WEI  

     
    PAPER

    Vol: E95-C No: 4 Page(s): 495-505

    The hybrid wired/wireless on-chip network is a promising communication architecture for multi-/many-core SoCs. For application-specific SoC design, it is important to design a dedicated on-chip network architecture that matches the characteristics of the application. In this paper, we propose a heuristic wireless link allocation algorithm for creating a hybrid on-chip network architecture. The algorithm can eliminate performance bottlenecks by replacing multi-hop wired paths with high-bandwidth, single-hop, long-range wireless links. Simulation results show that the hybrid on-chip network designed by our algorithm significantly improves both communication delay and energy consumption.

  • A Scalable and Reconfigurable Fault-Tolerant Distributed Routing Algorithm for NoCs

    Zewen SHI  Xiaoyang ZENG  Zhiyi YU  

     
    PAPER-Computer System

    Vol: E94-D No: 7 Page(s): 1386-1397

    Manufacturing defects in the deep sub-micron VLSI process and aging-induced device problems over the lifecycle are inevitable, so fault-tolerant routing algorithms are important for providing the required communication in NoCs in spite of failures. The proposed algorithm, referred to as scalable and reconfigurable fault-tolerant distributed routing (RFDR), partitions the system into nine regions using the concept of divide-and-conquer. It is a distributed algorithm: each router guarantees fault tolerance within its own region, and the system can still operate with multiple fault areas. The proposed RFDR has excellent scalability, with hardware cost remaining constant regardless of system size. It is also completely reconfigurable when new nodes fail. Simulations under various synthetic traffic patterns show better performance than the Extended-XY routing algorithm. Moreover, there is almost no hardware overhead compared to Logic-Based Distributed Routing (LBDR), while the fault-tolerance capacity is enhanced. Hardware cost is reduced by 37% compared to the Reconfigurable Distributed Scalable Predictable Interconnect Network (R-DSPIN), which supports only a single fault region.

  • A New Multiple-Round Dimension-Order Routing for Networks-on-Chip

    Binzhang FU  Yinhe HAN  Huawei LI  Xiaowei LI  

     
    PAPER-Computer System

    Vol: E94-D No: 4 Page(s): 809-821

    The Network-on-Chip (NoC) is limited by reliability constraints, which motivates the use of fault-tolerant routing. Generally, there are two main design objectives: tolerating more faults and achieving high network performance. To this end, we propose a new multiple-round dimension-order routing (NMR-DOR). Unlike existing solutions, in addition to the intermediate nodes between virtual channels (VCs), some turn-legal intermediate nodes inside each VC are also utilized. Hence, more faults are tolerated by these newly introduced intermediate nodes without adding extra VCs. Furthermore, unlike previous solutions in which some VCs are prioritized, the NMR-DOR provides a more flexible way to distribute packets evenly among different VCs. Extensive simulations show that the NMR-DOR, at best, recovers more than 90% of the node pairs that are unreachable because of faults in previous solutions, and significantly reduces packet latency compared with existing solutions.
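As background for this abstract, plain single-round dimension-order (XY) routing on a 2-D mesh can be sketched as follows. This is only the fault-free baseline that multiple-round schemes build on, not NMR-DOR itself; the function name `xy_route` and the coordinate convention are assumptions.

```python
def xy_route(src, dst):
    """Return the list of (x, y) nodes visited by XY dimension-order routing.

    The packet first travels along the X dimension until its column matches
    the destination, then along the Y dimension.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


route = xy_route((0, 0), (2, 1))
```

A multiple-round scheme chains several such segments through intermediate nodes so that a packet can detour around a faulty link that a single XY path would hit.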

  • Combined Use of Rising and Falling Edge Triggered Clocks for Peak Current Reduction in IP-Based SoC/NoC Designs

    Tsung-Yi WU  Tzi-Wei KAO  How-Rern LIN  

     
    PAPER-High-Level Synthesis and System-Level Design

    Vol: E93-A No: 12 Page(s): 2581-2589

    In a typical SoC (System-on-Chip) design, a huge peak current often occurs near the time of an active clock edge because of the aggregate switching of a large number of transistors. The number of transistors switching together can be reduced if the SoC design uses a clock scheme with mixed rising and falling triggering edges rather than one with purely rising (or purely falling) triggering edges. In this paper, we propose a clock-triggering-edge assignment technique and algorithms that assign either a rising or a falling triggering edge to each clock of each IP core of a given IP-based SoC/NoC (Network-on-Chip) design. The goal of the algorithms is to reduce the peak current of the design. Our proposed technique has been implemented as a software system. The system can use an LP technique to find an optimal or suboptimal solution within several seconds. The system can also use an ILP technique to find an optimal solution, but the ILP technique is not suitable for complex designs. Experimental results show that our algorithms can reduce peak currents by up to 56.3%.
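The edge-assignment problem in this abstract can be illustrated with a deliberately simplified model. The sketch assumes that each IP core contributes one fixed peak current at its active clock edge and that rising-edge and falling-edge peaks never overlap in time. It uses exhaustive search instead of the LP/ILP formulations mentioned in the abstract, and `best_edge_assignment` is a hypothetical name.

```python
from itertools import product


def best_edge_assignment(peak_currents):
    """Try every rising/falling assignment and keep the lowest overall peak.

    peak_currents -- one assumed peak-current value per IP core
    Returns (peak, edges), where edges[i] is "rise" or "fall" for core i.
    Exhaustive search is exponential, so this only suits tiny examples.
    """
    best = None
    for edges in product(("rise", "fall"), repeat=len(peak_currents)):
        rise = sum(c for c, e in zip(peak_currents, edges) if e == "rise")
        fall = sum(c for c, e in zip(peak_currents, edges) if e == "fall")
        peak = max(rise, fall)
        if best is None or peak < best[0]:
            best = (peak, edges)
    return best


peak, edges = best_edge_assignment([5, 3, 2])
```

With all three cores on the rising edge the peak would be 10; splitting them evenly across the two edges halves it in this toy case.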

  • Highly Reliable Multiple-Valued One-Phase Signalling for an Asynchronous On-Chip Communication Link

    Naoya ONIZAWA  Takahiro HANYU  

     
    PAPER-Multiple-Valued VLSI Technology

    Vol: E93-D No: 8 Page(s): 2089-2099

    This paper presents highly reliable multiple-valued one-phase signalling for an asynchronous on-chip communication link under process, supply-voltage and temperature variations. A new multiple-valued dual-rail encoding, in which each code is represented by the minimum set of three values, makes it possible to perform asynchronous communication between modules with just two wires. Since an appropriate current level is individually assigned to each logic value, a sufficient dynamic range between adjacent current signals can be maintained in the proposed multiple-valued current-mode (MVCM) circuit, which improves robustness against process variation. Moreover, because supply-voltage and temperature variations in circuit elements of smaller dimensions act predominantly as common-mode variation, a local reference voltage signal that tracks the variations can be adaptively generated to compensate for characteristic changes of the MVCM-circuit components. As a result, the proposed asynchronous on-chip communication link operates correctly over a supply-voltage range of 1.1 V to 1.4 V and a temperature range of -50 °C to 75 °C under a process variation of 3σ. HSPICE simulation in a 0.13-µm CMOS process demonstrates that the throughput of the proposed circuit is enhanced to 435% of that of the conventional 4-phase asynchronous communication circuit at a comparable energy dissipation.

  • Worst-Case Flit and Packet Delay Bounds in Wormhole Networks on Chip

    Yue QIAN  Zhonghai LU  Wenhua DOU  

     
    PAPER-Embedded, Real-Time and Reconfigurable Systems

    Vol: E92-A No: 12 Page(s): 3211-3220

    We investigate per-flow flit and packet worst-case delay bounds in on-chip wormhole networks. Such investigation is essential in order to provide guarantees under worst-case conditions in cost-constrained systems, as required by many hard real-time embedded applications. We first propose analysis models for flow control, link and buffer sharing. Based on these analysis models, we obtain an open-ended service analysis model capturing the combined effect of flow control, link and buffer sharing. With the service analysis model, we compute equivalent service curves for individual flows, and then derive their flit and packet delay bounds. Our experimental results verify that our analytical bounds are correct and tight.

  • Study-Based Error Recovery Scheme for Networks-on-Chip

    Depeng JIN  Shijun LIN  Li SU  Lieguang ZENG  

     
    LETTER-VLSI Systems

    Vol: E92-D No: 11 Page(s): 2272-2274

    Motivated by the different error characteristics of each path, we propose a study-based error recovery scheme for Networks-on-Chip (NoC). In this scheme, two study processes are first executed to obtain the error characteristics of every link; then, according to the study results and a selection rule that we derive, the scheme selects the better error recovery scheme for each path. Simulation results show that, compared with the traditional simple retransmission scheme and the hybrid single-error-correction, multi-error-retransmission scheme, this scheme greatly improves throughput and reduces energy consumption with little area increase.

  • Analyzing Credit-Based Router-to-Router Flow Control for On-Chip Networks

    Yue QIAN  Zhonghai LU  Wenhua DOU  Qiang DOU  

     
    PAPER

    Vol: E92-C No: 10 Page(s): 1276-1283

    Credit-based router-to-router flow control is one of the main link-level flow control mechanisms proposed for Networks on Chip (NoCs). Based on network calculus, we analyze its performance and optimal buffer size. To model the feedback control behavior due to credits, we introduce a virtual network service element called a flow controller. We then derive its service curve and, from it, the system service curve. In addition, we state and prove a theorem that determines the optimal buffer size guaranteeing the maximum system service curve. Moreover, assuming the latency-rate server model for routers, we give closed-form formulas to calculate the flit delay bound and the optimal buffer size. Our experiments with real on-chip traffic traces validate that our analysis is correct: the delay bounds are tight and the optimal buffer size is exact.
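The paper's own closed-form formulas are not reproduced in this listing. As a point of reference, the textbook network-calculus bounds for a single latency-rate server can be sketched as below, assuming a token-bucket constrained flow with arrival curve sigma + rho*t and a service curve rate*max(0, t - latency). The sketch ignores the credit feedback loop analyzed in the paper, and the function and parameter names are assumptions.

```python
def latency_rate_bounds(sigma, rho, rate, latency):
    """Textbook delay and backlog bounds for one latency-rate server.

    sigma   -- burst size of the flow (flits)
    rho     -- sustained arrival rate of the flow (flits per cycle)
    rate    -- guaranteed service rate of the server (flits per cycle)
    latency -- service latency of the server (cycles)
    """
    if rho > rate:
        raise ValueError("arrival rate exceeds service rate; bounds are unbounded")
    delay = latency + sigma / rate    # maximum horizontal distance between curves
    backlog = sigma + rho * latency   # maximum vertical distance between curves
    return delay, backlog


delay, backlog = latency_rate_bounds(sigma=8, rho=0.5, rate=1.0, latency=3)
```

The backlog bound also indicates how much buffering such a server needs to avoid loss, which is why delay and buffer size tend to be derived together in this kind of analysis.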

  • A Link Removal Methodology for Application-Specific Networks-on-Chip on FPGAs

    Daihan WANG  Hiroki MATSUTANI  Michihiro KOIBUCHI  Hideharu AMANO  

     
    PAPER-VLSI Systems

    Vol: E92-D No: 4 Page(s): 575-583

    The regular 2-D mesh topology has been used for most Networks-on-Chip (NoCs) on FPGAs. Because some applications generate spatially biased traffic, some links have low utilization, which makes customization by removing links efficient. In this paper, a link removal strategy that customizes the routers in an NoC is proposed for reconfigurable systems in order to minimize the required amount of hardware. Based on pre-analyzed traffic information, links that carry only a small amount of communication are removed to reduce the hardware cost while maintaining adequate performance. Two policies are proposed to avoid deadlocks, and they outperform up*/down* routing, a representative deadlock-free routing for irregular topologies. In the case of the image recognition application susan, the proposed method can save 30% of the hardware amount without performance degradation.
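The core idea of traffic-driven link removal can be sketched as a greedy pruning pass that keeps the network connected. This is an illustration under simplifying assumptions, not the paper's method: links are treated as bidirectional, traffic is not rerouted after a removal, and the deadlock-avoidance policies from the abstract are not modeled. The names `prune_links`, `is_connected`, and `threshold` are hypothetical.

```python
from collections import deque


def is_connected(nodes, links):
    """Breadth-first check that every node is reachable over the given links."""
    nodes = list(nodes)
    if not nodes:
        return True
    adjacent = {n: set() for n in nodes}
    for a, b in links:
        adjacent[a].add(b)
        adjacent[b].add(a)
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        current = queue.popleft()
        for neighbor in adjacent[current]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) == len(nodes)


def prune_links(nodes, traffic, threshold):
    """Drop links whose pre-analyzed traffic is below threshold.

    traffic -- dict mapping (node_a, node_b) to the measured traffic amount
    A link is only removed if the remaining network stays connected.
    """
    kept = set(traffic)
    for link in sorted(traffic, key=traffic.get):  # least-used links first
        if traffic[link] >= threshold:
            break
        if is_connected(nodes, kept - {link}):
            kept.remove(link)
    return kept


mesh_traffic = {(0, 1): 10, (1, 3): 10, (2, 3): 10, (0, 2): 1}
remaining = prune_links([0, 1, 2, 3], mesh_traffic, threshold=5)
```

In this 2x2 example the lightly used link between nodes 0 and 2 is removed, and node 2 stays reachable through node 3.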

  • Pre-Allocation Based Flow Control Scheme for Networks-On-Chip

    Shijun LIN  Li SU  Haibo SU  Depeng JIN  Lieguang ZENG  

     
    LETTER-VLSI Systems

    Vol: E92-D No: 3 Page(s): 538-540

    Based on the traffic predictability characteristic of Networks-on-Chip (NoC), we propose a pre-allocation based flow control scheme to improve the performance of NoC. In this scheme, routes are pre-allocated and the injection rates of all routes are regulated at the traffic sources according to the average available bandwidths of the links. As a result, the number of packets in the network decreases, the congestion probability is reduced, and the communication performance improves. Simulation results show that, compared with the switch-to-switch flow control scheme, this scheme greatly increases throughput and reduces average latency with little area and energy overhead.
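One way to picture source-side rate regulation over pre-allocated routes is the sketch below. It assumes that every link's bandwidth is split equally among the routes crossing it and that each route injects at the rate of its most constrained link. The abstract does not specify the actual regulation rule, so the equal-share policy and the names `injection_rates`, `routes`, and `link_bandwidth` are hypothetical.

```python
from collections import Counter


def injection_rates(routes, link_bandwidth):
    """Compute a per-route injection rate from pre-allocated routes.

    routes         -- dict mapping route name to the list of links it uses
    link_bandwidth -- dict mapping link name to its available bandwidth
    """
    # Count how many routes share each link.
    sharing = Counter(link for links in routes.values() for link in links)
    # Each route is limited by its tightest equal share along the path.
    return {
        name: min(link_bandwidth[link] / sharing[link] for link in links)
        for name, links in routes.items()
    }


rates = injection_rates(
    {"A": ["l1", "l2"], "B": ["l2"]},
    {"l1": 1.0, "l2": 1.0},
)
```

Both routes cross link `l2`, so each source is limited to half of that link's bandwidth even though route `A` has `l1` to itself.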

  • Implementation of a High-Speed Asynchronous Data-Transfer Chip Based on Multiple-Valued Current-Signal Multiplexing

    Tomohiro TAKAHASHI  Takahiro HANYU  

     
    PAPER

    Vol: E89-C No: 11 Page(s): 1598-1604

    This paper presents an asynchronous multiple-valued current-mode data-transfer controller chip based on a 1-phase dual-rail encoding technique. The proposed encoding technique enables "one-way delay" asynchronous data transfer because request and acknowledge signals can be transmitted simultaneously and valid states are detected by calculating the sum of dual-rail codewords. Since a key component, a current-to-voltage conversion circuit in the valid-state detector, is tuned to obtain a voltage range sufficient to improve the switching speed of a comparator, signal detection can be performed quickly in spite of using 6-level signals. HSPICE simulation in a 0.18-µm CMOS process shows that the throughput of the proposed circuit based on the 1-phase dual-rail scheme reaches 435 Mbps/wire, which is 2.9 times faster than that of a CMOS circuit based on a conventional 4-phase dual-rail scheme. A test chip has been fabricated, and the asynchronous data-transfer behavior of the proposed scheme has been confirmed.

  • Design a Switch Wrapper for SNA On-Chip-Network

    Jiho CHANG  Jongsu YI  JunSeong KIM  

     
    PAPER

    Vol: E89-A No: 6 Page(s): 1615-1621

    In this paper we present the design of a switch wrapper as a component of SNA (SoC Network Architecture), an on-chip network that is more efficient than a shared bus architecture in an SoC. The SNA uses crossbar routers to meet the increasing demand for communication bandwidth within a single chip. A switch wrapper for SNA is located between a crossbar router and IPs, connecting them together. It carries out a mode of routing to assist crossbar routers and executes protocol conversions to provide compatibility for IP reuse. A switch wrapper consists of a direct router, two AHB-SNP converters, a controller and two optional interface socket modules. We implemented an SNP switch wrapper in VHDL and confirmed its functionality using ModelSim simulation. We also synthesized it for a Xilinx Virtex2 device to determine resource requirements: the switch wrapper occupies a reasonable area of about 900 gates, considering that a single SNA crossbar router costs about 20,000 gates.
