IEICE global.ieice.org Site

Keyword Search Result

[Keyword] network-on-chip(36hit)

21-36hit(36hit)

A Locality-Aware Hybrid NoC Configuration Algorithm Utilizing the Communication Volume among IP Cores
Seungju LEE Masao YANAGISAWA Nozomu TOGAWA

PAPER-VLSI Design Technology and CAD

Vol: E95-A No: 9
Page(s): 1538-1549
Network-on-chip (NoC) architectures have emerged as a promising solution to the lack of scalability in multi-processor systems-on-chip (MPSoCs). With the explosive growth in the use of multimedia applications, NoCs are expected to serve as multimedia servers supporting multi-class services. In this paper, we propose a configuration algorithm for a hybrid bus-NoC architecture, together with simulation results. Our target architecture, called busmesh NoC (BMNoC), is a generalized version of a hybrid NoC with local buses. In our BMNoC configuration algorithm, cores with a heavy communication volume between them are mapped to a cluster node (CN) and connected by a local bus. CNs communicate with each other via edge switches (ESes) and mesh routers (MRs). With this hierarchical communication network, our proposed algorithm improves latency compared with conventional methods. Results for several realistic applications show better performance than earlier studies and demonstrate the feasibility of our proposed algorithm.
Long-Range Asynchronous On-Chip Link Based on Multiple-Valued Single-Track Signaling
Naoya ONIZAWA Atsushi MATSUMOTO Takahiro HANYU

PAPER-Circuit Theory

Vol: E95-A No: 6
Page(s): 1018-1029
We have developed a long-range asynchronous on-chip data-transmission link based on multiple-valued single-track signaling for a highly reliable asynchronous Network-on-Chip. In the proposed signaling, 1-bit data with control information is represented by using a one-digit multi-level signal, so serial data can be transmitted asynchronously using only a single wire. The small number of wires alleviates the routing complexity of wiring long-range interconnects. The use of current-mode signaling makes it possible to transmit data at high speed without buffers or repeaters over a long interconnect wire because of the low-voltage swing of signaling, and it leads to low-latency data transmission. We achieve a latency of 0.45 ns, a throughput of 1.25 Gbps, and energy dissipation of 0.58 pJ/bit with a 10-mm interconnect wire under a 0.13 µm CMOS technology. This represents an 85% decrease in latency, a 150% increase in throughput, and a 90% decrease in energy dissipation compared to a conventional serial asynchronous data-transmission link.
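The percentage figures above can be turned back into the baseline values they imply. The short sketch below does that arithmetic, assuming each percentage is measured relative to the conventional serial link; the baseline numbers it produces are not stated in the abstract, and the helper name `implied_baseline` is introduced here for illustration only.

```python
def implied_baseline(value, percent_change):
    """Return the baseline that yields `value` after a change of `percent_change` percent.

    A decrease is passed as a negative percentage, an increase as a positive one.
    """
    return value / (1.0 + percent_change / 100.0)


# Reported figures for the proposed link (taken from the abstract).
latency_ns = 0.45
throughput_gbps = 1.25
energy_pj_per_bit = 0.58

# Implied figures for the conventional link (derived, not reported in the abstract).
base_latency_ns = implied_baseline(latency_ns, -85.0)            # about 3 ns
base_throughput_gbps = implied_baseline(throughput_gbps, 150.0)  # about 0.5 Gbps
base_energy_pj = implied_baseline(energy_pj_per_bit, -90.0)      # about 5.8 pJ/bit

print(round(base_latency_ns, 2), round(base_throughput_gbps, 2), round(base_energy_pj, 2))
```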
Support Efficient and Fault-Tolerant Multicast in Bufferless Network-on-Chip
Chaochao FENG Zhonghai LU Axel JANTSCH Minxuan ZHANG Xianju YANG

PAPER-Computer System

Vol: E95-D No: 4
Page(s): 1052-1061
In this paper, we propose three Deflection-Routing-based Multicast (DRM) schemes for a bufferless NoC. The DRM scheme without packet replication (DRM_noPR) sends a multicast packet along a non-deterministic path. The DRM schemes with adaptive packet replication (DRM_PR_src and DRM_PR_all) replicate multicast packets at the source or intermediate node, according to the destination position and the state of the output ports, to reduce the average multicast latency. We also provide fault-tolerance support in these schemes through a reinforcement-learning-based method that reconfigures the routing table to tolerate permanently faulty links in the network. Simulation results show that the DRM_PR_all scheme achieves 41%, 43% and 37% lower latency on average than the DRM_noPR scheme, and 27%, 29% and 25% lower latency on average than the DRM_PR_src scheme, under three synthetic traffic patterns, respectively. In addition, all three fault-tolerant DRM schemes show acceptable performance degradation at various link fault rates without losing any packets.
A Process-Variation-Adaptive Network-on-Chip with Variable-Cycle Routers and Variable-Cycle Pipeline Adaptive Routing
Yohei NAKATA Hiroshi KAWAGUCHI Masahiko YOSHIMOTO

PAPER

Vol: E95-C No: 4
Page(s): 523-533
As process technology is scaled down, a typical system on a chip (SoC) becomes denser. In scaled process technology, process variation becomes greater and increasingly affects the SoC circuits. Moreover, the process variation strongly affects network-on-chips (NoCs) that have a synchronous network across the chip. Therefore, its network frequency is degraded. We propose a process-variation-adaptive NoC with a variation-adaptive variable-cycle router (VAVCR). The proposed VAVCR can configure its cycle latency adaptively on a processor core basis, corresponding to the process variation. It can increase the network frequency, which is limited by the process variation in a conventional router. Furthermore, we propose a variable-cycle pipeline adaptive routing (VCPAR) method with VAVCR; the proposed VCPAR can reduce packet latency and has tolerance to network congestion. The total execution time reduction of the proposed VAVCR with VCPAR is 15.7%, on average, for five task graphs.
Hybrid Wired/Wireless On-Chip Network Design for Application-Specific SoC
Shouyi YIN Yang HU Zhen ZHANG Leibo LIU Shaojun WEI

PAPER

Vol: E95-C No: 4
Page(s): 495-505
Hybrid wired/wireless on-chip network is a promising communication architecture for multi-/many-core SoC. For application-specific SoC design, it is important to design a dedicated on-chip network architecture according to the application-specific nature. In this paper, we propose a heuristic wireless link allocation algorithm for creating hybrid on-chip network architecture. The algorithm can eliminate the performance bottleneck by replacing multi-hop wired paths by high-bandwidth single-hop long-range wireless links. The simulation results show that the hybrid on-chip network designed by our algorithm improves the performance in terms of both communication delay and energy consumption significantly.
A Scalable and Reconfigurable Fault-Tolerant Distributed Routing Algorithm for NoCs
Zewen SHI Xiaoyang ZENG Zhiyi YU

PAPER-Computer System

Vol: E94-D No: 7
Page(s): 1386-1397
Manufacturing defects in deep sub-micron VLSI processes and aging-induced device problems during the lifecycle are inevitable, so fault-tolerant routing algorithms are important for providing the required communication in NoCs in spite of failures. The proposed algorithm, referred to as scalable and reconfigurable fault-tolerant distributed routing (RFDR), partitions the system into nine regions using the concept of divide-and-conquer. It is a distributed algorithm: each router guarantees fault tolerance within its own region, and the system can still be sustained with multiple fault areas. The proposed RFDR has excellent scalability, with hardware cost remaining constant regardless of system size. It is also completely reconfigurable when new nodes fail. Simulations under various synthetic traffic patterns show better performance than the Extended-XY routing algorithm. Moreover, there is almost no hardware overhead compared to Logic-Based Distributed Routing (LBDR), while the fault-tolerance capacity is enhanced. Hardware cost is reduced by 37% compared to Reconfigurable Distributed Scalable Predictable Interconnect Network (R-DSPIN), which supports only a single fault region.
A New Multiple-Round Dimension-Order Routing for Networks-on-Chip
Binzhang FU Yinhe HAN Huawei LI Xiaowei LI

PAPER-Computer System

Vol: E94-D No: 4
Page(s): 809-821
The Network-on-Chip (NoC) is limited by reliability constraints, which motivates the use of fault-tolerant routing. Generally, there are two main design objectives: tolerating more faults and achieving high network performance. To this end, we propose a new multiple-round dimension-order routing (NMR-DOR). Unlike existing solutions, in addition to the intermediate nodes between virtual channels (VCs), some turn-legal intermediate nodes inside each VC are also utilized. Hence, more faults are tolerated through these newly introduced intermediate nodes without adding extra VCs. Furthermore, unlike previous solutions in which some VCs are prioritized, the NMR-DOR provides a more flexible way to distribute packets evenly among different VCs. Extensive simulations show that the NMR-DOR recovers, at best, more than 90% of the node pairs that are unreachable because of faults in previous solutions, and significantly reduces packet latency compared with existing solutions.
Combined Use of Rising and Falling Edge Triggered Clocks for Peak Current Reduction in IP-Based SoC/NoC Designs
Tsung-Yi WU Tzi-Wei KAO How-Rern LIN

PAPER-High-Level Synthesis and System-Level Design

Vol: E93-A No: 12
Page(s): 2581-2589
In a typical SoC (System-on-Chip) design, a huge peak current often occurs near the time of an active clock edge because of aggregate switching of a large number of transistors. The number of aggregate switching transistors can be lessened if the SoC design can use a clock scheme of mixed rising and falling triggering edges rather than one of pure rising (falling) triggering edges. In this paper, we propose a clock-triggering-edge assignment technique and algorithms that can assign either a rising triggering edge or a falling triggering edge to each clock of each IP core of a given IP-based SoC/NoC (Network-on-Chip) design. The goal of the algorithms is to reduce the peak current of the design. Our proposed technique has been implemented as a software system. The system can use an LP technique to find an optimal or suboptimal solution within several seconds. The system also can use an ILP technique to find an optimal solution, but the ILP technique is not suitable to be used to solve a complex design. Experimental results show that our algorithms can reduce peak currents up to 56.3%.
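The idea of mixing rising and falling triggering edges can be illustrated with a toy model. The sketch below assumes that every clock contributes a single peak current at its active edge, that all rising-edge clocks switch together, and that all falling-edge clocks switch together half a period later, so the design's peak is the larger of the two group sums. It uses exhaustive search rather than the LP and ILP techniques mentioned in the abstract, and the function name and current values are hypothetical.

```python
from itertools import product


def assign_edges(peaks):
    """Exhaustively pick a rising/falling edge per clock to minimize the peak current.

    `peaks` holds one peak-current value per clock (hypothetical units: mA).
    Returns (best_peak, edges), where edges[i] is "rising" or "falling".
    """
    best = None
    for edges in product(("rising", "falling"), repeat=len(peaks)):
        rising = sum(p for p, e in zip(peaks, edges) if e == "rising")
        falling = sum(p for p, e in zip(peaks, edges) if e == "falling")
        peak = max(rising, falling)  # the two groups switch at different times
        if best is None or peak < best[0]:
            best = (peak, edges)
    return best


peaks_ma = [40, 30, 20, 10]       # made-up per-clock peak currents
all_rising_peak = sum(peaks_ma)   # every clock on the rising edge: 100
best_peak, best_edges = assign_edges(peaks_ma)
print(all_rising_peak, best_peak, best_edges)
```

Exhaustive search grows as 2^n in the number of clocks, so it is only practical for very small examples.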
Highly Reliable Multiple-Valued One-Phase Signalling for an Asynchronous On-Chip Communication Link
Naoya ONIZAWA Takahiro HANYU

PAPER-Multiple-Valued VLSI Technology

Vol: E93-D No: 8
Page(s): 2089-2099
This paper presents highly reliable multiple-valued one-phase signalling for an asynchronous on-chip communication link under process, supply-voltage and temperature variations. A new multiple-valued dual-rail encoding, where each code is represented by the minimum set of three values, makes it possible to perform asynchronous communication between modules with just two wires. Since an appropriate current level is individually assigned to each logic value, a sufficient dynamic range between adjacent current signals can be maintained in the proposed multiple-valued current-mode (MVCM) circuit, which improves robustness against process variation. Moreover, since supply-voltage and temperature variations in circuit elements of smaller dimensions act predominantly as common-mode variation, a local reference voltage signal that tracks these variations can be adaptively generated to compensate for the characteristic change of the MVCM-circuit components. As a result, the proposed asynchronous on-chip communication link operates correctly over a supply-voltage range from 1.1 V to 1.4 V and a temperature range from -50 °C to 75 °C under a process variation of 3σ. HSPICE simulation in a 0.13-µm CMOS process demonstrates that the throughput of the proposed circuit is enhanced to 435% in comparison with that of the conventional 4-phase asynchronous communication circuit at comparable energy dissipation.
Worst-Case Flit and Packet Delay Bounds in Wormhole Networks on Chip
Yue QIAN Zhonghai LU Wenhua DOU

PAPER-Embedded, Real-Time and Reconfigurable Systems

Vol: E92-A No: 12
Page(s): 3211-3220
We investigate per-flow flit and packet worst-case delay bounds in on-chip wormhole networks. Such investigation is essential in order to provide guarantees under worst-case conditions in cost-constrained systems, as required by many hard real-time embedded applications. We first propose analysis models for flow control, link and buffer sharing. Based on these analysis models, we obtain an open-ended service analysis model capturing the combined effect of flow control, link and buffer sharing. With the service analysis model, we compute equivalent service curves for individual flows, and then derive their flit and packet delay bounds. Our experimental results verify that our analytical bounds are correct and tight.
Study-Based Error Recovery Scheme for Networks-on-Chip
Depeng JIN Shijun LIN Li SU Lieguang ZENG

LETTER-VLSI Systems

Vol: E92-D No: 11
Page(s): 2272-2274
Motivated by the different error characteristics of each path, we propose a study-based error recovery scheme for Networks-on-Chip (NoC). In this scheme, two study processes are first executed to obtain the error characteristics of every link; then, according to the study results and a selection rule that we derive, the scheme selects the better error recovery scheme for every path. Simulation results show that, compared with the traditional simple retransmission scheme and the hybrid single-error-correction, multi-error-retransmission scheme, this scheme greatly improves throughput and cuts energy consumption with little area increase.
Analyzing Credit-Based Router-to-Router Flow Control for On-Chip Networks
Yue QIAN Zhonghai LU Wenhua DOU Qiang DOU

PAPER

Vol: E92-C No: 10
Page(s): 1276-1283
Credit-based router-to-router flow control is one main link-level flow control mechanism proposed for Networks on Chip (NoCs). Based on network calculus, we analyze its performance and optimal buffer size. To model the feedback control behavior due to credits, we introduce a virtual network service element called flow controller. Then we derive its service curve, and further the system service curve. In addition, we give and prove a theorem that determines the optimal buffer size guaranteeing the maximum system service curve. Moreover, assuming the latency-rate server model for routers, we give closed-form formulas to calculate the flit delay bound and optimal buffer size. Our experiments with real on-chip traffic traces validate that our analysis is correct; delay bounds are tight and the optimal buffer size is exact.
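For readers unfamiliar with the latency-rate server model, the sketch below evaluates the standard textbook network-calculus bounds for a single latency-rate server fed by a token-bucket-constrained flow: delay bound D = T + σ/R and backlog bound B = σ + ρT, valid when ρ ≤ R. It does not model the credit feedback loop and is not the closed-form result of the paper; the parameter values and function names are assumptions chosen for illustration.

```python
def delay_bound(sigma, rho, rate, latency):
    """Worst-case delay for a (sigma, rho) token-bucket flow through a latency-rate server.

    sigma: burst size (flits), rho: sustained arrival rate (flits/cycle),
    rate: service rate R (flits/cycle), latency: service latency T (cycles).
    """
    if rho > rate:
        raise ValueError("arrival rate exceeds service rate; the bound is infinite")
    return latency + sigma / rate


def backlog_bound(sigma, rho, rate, latency):
    """Worst-case backlog (flits) under the same assumptions."""
    if rho > rate:
        raise ValueError("arrival rate exceeds service rate; the bound is infinite")
    return sigma + rho * latency


# Hypothetical flow and router parameters.
d = delay_bound(sigma=4, rho=0.25, rate=0.5, latency=3)    # 3 + 4 / 0.5 cycles
b = backlog_bound(sigma=4, rho=0.25, rate=0.5, latency=3)  # 4 + 0.25 * 3 flits
print(d, b)
```

The backlog bound gives a rough feel for how much buffering such a flow needs at one server, which is the kind of quantity the optimal buffer size in the abstract refers to.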
A Link Removal Methodology for Application-Specific Networks-on-Chip on FPGAs
Daihan WANG Hiroki MATSUTANI Michihiro KOIBUCHI Hideharu AMANO

PAPER-VLSI Systems

Vol: E92-D No: 4
Page(s): 575-583
The regular 2-D mesh topology has been used for most Networks-on-Chip (NoCs) on FPGAs. Spatially biased traffic generated by some applications leaves some links with low utilization, which makes a customization method that removes links more efficient. In this paper, a link removal strategy that customizes the routers in an NoC is proposed for reconfigurable systems in order to minimize the required hardware amount. Based on pre-analyzed traffic information, links carrying a small amount of communication are removed to reduce the hardware cost while maintaining adequate performance. Two policies are proposed to avoid deadlocks, and they outperform up*/down* routing, a representative deadlock-free routing for irregular topologies. For the image recognition application susan, the proposed method saves 30% of the hardware amount without performance degradation.
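A rough sketch of traffic-driven link removal follows. It assumes a greedy rule (visit links from lightest to heaviest, drop a link if its traffic is below a threshold and the network stays connected) and ignores rerouting of the removed traffic and the deadlock-avoidance policies described in the abstract. The traffic numbers, threshold and function names are hypothetical and do not come from the paper.

```python
def is_connected(nodes, links):
    """Depth-first search check that every node is reachable over undirected links."""
    adjacency = {n: set() for n in nodes}
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    start = nodes[0]
    seen, stack = {start}, [start]
    while stack:
        current = stack.pop()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == len(nodes)


def remove_light_links(nodes, traffic, threshold):
    """Drop links whose pre-analyzed traffic is below `threshold`, keeping connectivity."""
    kept = dict(traffic)
    for link in sorted(traffic, key=traffic.get):  # lightest links first
        if traffic[link] >= threshold:
            break
        remaining = [l for l in kept if l != link]
        if is_connected(nodes, remaining):
            del kept[link]
    return kept


# Hypothetical 2x2 mesh with made-up traffic volumes per link.
nodes = [0, 1, 2, 3]
traffic = {(0, 1): 100, (0, 2): 80, (2, 3): 5, (1, 3): 2}
kept = remove_light_links(nodes, traffic, threshold=10)
print(sorted(kept))
```

In this example the link (1, 3) is removed, while (2, 3) is kept despite its low traffic because removing it would disconnect node 3.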
Pre-Allocation Based Flow Control Scheme for Networks-On-Chip
Shijun LIN Li SU Haibo SU Depeng JIN Lieguang ZENG

LETTER-VLSI Systems

Vol: E92-D No: 3
Page(s): 538-540
Based on the traffic predictability characteristic of Networks-on-Chip (NoC), we propose a pre-allocation based flow control scheme to improve the performance of NoC. In this scheme, routes are pre-allocated and the injection rates of all routes are regulated at the traffic sources according to the average available bandwidths in the links. Then, the number of packets in the network is decreased and thus, the congestion probability is reduced and the communication performance is improved. Simulation results show that this scheme greatly increases the throughput and cuts down the average latency with little area and energy overhead, compared with the switch-to-switch flow control scheme.
Implementation of a High-Speed Asynchronous Data-Transfer Chip Based on Multiple-Valued Current-Signal Multiplexing
Tomohiro TAKAHASHI Takahiro HANYU

PAPER

Vol: E89-C No: 11
Page(s): 1598-1604
This paper presents an asynchronous multiple-valued current-mode data-transfer controller chip based on a 1-phase dual-rail encoding technique. The proposed encoding technique enables "one-way delay" asynchronous data transfer because request and acknowledge signals can be transmitted simultaneously and valid states are detected by calculating the sum of dual-rail codewords. Since a key component, a current-to-voltage conversion circuit in a valid-state detector, is tuned so as to obtain a sufficient voltage range to improve switching speed of a comparator, signal detection can be performed quickly in spite of using 6-level signals. It is evaluated using HSPICE simulation with a 0.18-µm CMOS that the throughput of the proposed circuit based on the 1-phase dual-rail scheme attains 435 Mbps/wire which is 2.9 times faster than that of a CMOS circuit based on a conventional 4-phase dual-rail scheme. The test chip is fabricated, and the asynchronous data-transfer behavior of the proposed scheme is confirmed.
Design a Switch Wrapper for SNA On-Chip-Network
Jiho CHANG Jongsu YI JunSeong KIM

PAPER

Vol: E89-A No: 6
Page(s): 1615-1621
In this paper we present the design of a switch wrapper as a component of SNA (SoC Network Architecture), an on-chip network that is efficient compared to a shared bus architecture in an SoC. The SNA uses crossbar routers to meet the increasing demand for communication bandwidth within a single chip. A switch wrapper for SNA is located between a crossbar router and IPs, connecting them together. It carries out a mode of routing to assist crossbar routers and executes protocol conversions to provide compatibility for IP reuse. A switch wrapper consists of a direct router, two AHB-SNP converters, a controller and two optional interface socket modules. We implemented an SNP switch wrapper in VHDL and confirmed its functionality using ModelSim simulation. We also synthesized it for a Xilinx Virtex2 device to determine resource requirements: the switch wrapper appears to occupy a reasonable area, about 900 gates, considering that a single SNA crossbar router costs about 20,000 gates.
