Keyword Search Result

[Keyword] router(91hit)

1-20hit(91hit)

  • Hybrid, Asymmetric and Reconfigurable Input Unit Designs for Energy-Efficient On-Chip Networks

    Xiaoman LIU  Yujie GAO  Yuan HE  Xiaohan YUE  Haiyan JIANG  Xibo WANG  

     
    PAPER

      Publicized:
    2023/04/10
      Vol:
    E106-C No:10
      Page(s):
    570-579

    The complexity and scale of Networks-on-Chip (NoCs) are growing as more processing elements and memory devices are implemented on chips. However, under strict power budgets, it is also critical to lower the power consumption of NoCs for the sake of energy efficiency. In this paper, we therefore present three novel input unit designs for on-chip routers that aim to reduce their power consumption while preserving network performance. The key idea behind our designs is to organize the buffers in the input units with the characteristics of the network traffic in mind. According to our observations, only a small portion of the network traffic consists of long packets (composed of multiple flits), so it is reasonable to implement hybrid, asymmetric and reconfigurable buffers that mainly target short packets (having only a single flit), resulting in lower power consumption and area overhead. Evaluations show that our hybrid, asymmetric and reconfigurable input unit designs achieve an average reduction in energy consumption per flit of 45%, 52.3% and 56.2%, respectively, at 93.6% (for hybrid designs) and 66.3% (for asymmetric and reconfigurable designs) of the original router area. Meanwhile, we observe only minor degradation in network latency (ranging from 18.4% to 1.5%, on average) with our proposals.

  • A Compression Router for Low-Latency Network-on-Chip

    Naoya NIWA  Yoshiya SHIKAMA  Hideharu AMANO  Michihiro KOIBUCHI  

     
    PAPER-Computer System

      Publicized:
    2022/11/08
      Vol:
    E106-D No:2
      Page(s):
    170-180

    Networks-on-Chip (NoCs) are important components of scalable many-core processors. Because the performance of parallel applications is usually sensitive to NoC latency, reducing it is a primary requirement. In this study, a compression router that hides the (de)compression-operation delay is proposed. The compression router (de)compresses the contents of an incoming packet before switch arbitration is completed, thus shortening the packet length without a latency penalty and reducing the network injection-and-ejection latency. Evaluation results show that, with a compression ratio of 1.8 on the NoC, the compression router improves parallel application performance (conjugate gradients (CG), fast Fourier transform (FT), integer sort (IS), and traveling salesman problem (TSP)) by up to 33% and the effective network throughput by up to 63%. The cost is an increase in router area of 0.22 mm² and an increase in energy consumption by a factor of 1.6 compared to a conventional virtual-channel router. Another finding is that off-loading the decompressor onto a network interface decreases the compression-router area by 57% at the expense of a moderate increase in communication latency.

  • DORR: A DOR-Based Non-Blocking Optical Router for 3D Photonic Network-on-Chips

    Meaad FADHEL  Huaxi GU  Wenting WEI  

     
    PAPER-Computer System

      Publicized:
    2021/01/27
      Vol:
    E104-D No:5
      Page(s):
    688-696

    Recently, researchers have paid more attention to designing optical routers, since they are essential building blocks of all photonic interconnection architectures. Improving them could therefore lead directly to an improvement in the overall performance of the network. Optical routers suffer from increased insertion loss and crosstalk, which raise the power consumed as the network scales. In this paper, we propose a new 7×7 non-blocking optical router based on the Dimension Order Routing (DOR) algorithm. Moreover, we develop a method that ensures the least number of MicroRing Resonators (MRRs) in an optical router. By reducing the number of these optical devices, the proposed optical router can decrease the crosstalk and insertion loss of the network. The router is evaluated and compared to Ye's router and the optimized crossbar for a 3D mesh network that uses the XYZ routing algorithm. Unlike many other proposals, this paper evaluates optical routers not only from a router-level perspective but also under overall network-level conditions. The evaluations show that our optical router reduces the worst-case network insertion loss by almost 8.7%, 46.39%, 39.3%, and 41.4% compared to Ye's router, optimized crossbar, optimized universal OR, and Optimized VOTEX, respectively. Moreover, it decreases the worst-case Optical Signal-to-Noise Ratio (OSNR) by almost 27.92%, 88%, 77%, and 69.6% compared to Ye's router, optimized crossbar, optimized universal OR, and Optimized VOTEX, respectively. It also reduces the power consumption by 3.22%, 23.99%, 19.12%, and 20.18% compared to Ye's router, optimized crossbar, optimized universal OR, and Optimized VOTEX, respectively.
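
    The XYZ (dimension-order) routing mentioned in this abstract can be illustrated with a minimal sketch. It assumes nodes of a 3D mesh are addressed by integer (x, y, z) coordinates; the function name `xyz_route` is hypothetical and nothing here models the optical router itself.

```python
def xyz_route(src, dst):
    """Return the list of nodes visited from src to dst in a 3D mesh.

    Dimension-order routing: fully resolve the X offset first,
    then Y, then Z. Coordinates are assumed to be integer tuples.
    """
    cur = list(src)
    path = [tuple(cur)]
    for dim in range(3):  # 0 = X, 1 = Y, 2 = Z
        step = 1 if dst[dim] > cur[dim] else -1
        while cur[dim] != dst[dim]:
            cur[dim] += step
            path.append(tuple(cur))
    return path


if __name__ == "__main__":
    # Two hops along X, then one hop along Y.
    print(xyz_route((0, 0, 0), (2, 1, 0)))
```

    Because every packet resolves the dimensions in the same fixed order, some turn combinations never occur at a router, which is the kind of property a router design can exploit to drop unneeded switching elements.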

  • RPC: An Approach for Reducing Compulsory Misses in Packet Processing Cache

    Hayato YAMAKI  Hiroaki NISHI  Shinobu MIWA  Hiroki HONDA  

     
    PAPER-Information Network

      Publicized:
    2020/09/07
      Vol:
    E103-D No:12
      Page(s):
    2590-2599

    We propose a technique to reduce compulsory misses of packet processing cache (PPC), which largely affects both throughput and energy of core routers. Rather than prefetching data, our technique called response prediction cache (RPC) speculatively stores predicted data in PPC without additional access to the low-throughput and power-consuming memory (i.e., TCAM). RPC predicts the data related to a response flow at the arrival of the corresponding request flow, based on the request-response model of internet communications. Our experimental results with 11 real-network traces show that RPC can reduce the PPC miss rate by 13.4% in upstream and 47.6% in downstream on average when we suppose three-layer PPC. Moreover, we extend RPC to adaptive RPC (A-RPC) that selects the use of RPC in each direction within a core router for further improvement in PPC misses. Finally, we show that A-RPC can achieve 1.38x table-lookup throughput with 74% energy consumption per packet, when compared to conventional PPC.

  • Flex-LIONS: A Silicon Photonic Bandwidth-Reconfigurable Optical Switch Fabric Open Access

    Roberto PROIETTI  Xian XIAO  Marjan FARIBORZ  Pouya FOTOUHI  Yu ZHANG  S. J. Ben YOO  

     
    INVITED PAPER

      Publicized:
    2020/05/14
      Vol:
    E103-B No:11
      Page(s):
    1190-1198

    This paper summarizes our recent studies on architecture, photonic integration, system validation and networking performance analysis of a flexible low-latency interconnect optical network switch (Flex-LIONS) for datacenter and high-performance computing (HPC) applications. Flex-LIONS leverages the all-to-all wavelength routing property in arrayed waveguide grating routers (AWGRs) combined with microring resonator (MRR)-based add/drop filtering and multi-wavelength spatial switching to enable topology and bandwidth reconfigurability to adapt the interconnection to different traffic profiles. By exploiting the multiple free spectral ranges of AWGRs, it is also possible to provide reconfiguration while maintaining minimum-diameter all-to-all interconnectivity. We report experimental results on the design, fabrication, and system testing of 8×8 silicon photonic (SiPh) Flex-LIONS chips demonstrating error-free all-to-all communication and reconfiguration exploiting different free spectral ranges (FSR0 and FSR1, respectively). After reconfiguration in FSR1, the bandwidth between the selected pair of nodes is increased from 50Gb/s to 125Gb/s while all-to-all interconnectivity at 25Gb/s is maintained using FSR0. Finally, we investigate the use of Flex-LIONS in two different networking scenarios. First, networking simulations for a 256-node datacenter inter-rack communication scenario show the potential latency and energy benefits when using Flex-LIONS for optical reconfiguration based on different traffic profiles (a legacy fat-tree architecture is used for comparison). Second, we demonstrate the benefits of leveraging two FSRs in an 8-node 64-core computing system to provide reconfiguration for the hotspot nodes while maintaining minimum-diameter all-to-all interconnectivity.

  • An Intelligent and Decentralized Content Diffusion System in Smart-Router Networks

    Hanxing XUE  Jiali YOU  Jinlin WANG  

     
    PAPER-Network

      Publicized:
    2019/02/12
      Vol:
    E102-B No:8
      Page(s):
    1595-1606

    Smart-routers have developed greatly in recent years as one of the representative products of IoT and the smart home. Unlike traditional routers, they have storage and processing capacity. Moreover, smart-routers in the same location or ISP have better link conditions and can provide high-quality service to each other. Therefore, for content-oriented services, how to construct the overlay network and efficiently deploy replicas of popular content in a smart-router network is critical. The performance of existing centralized models is limited by the bottleneck of a single point's performance. To improve the stability and scalability of the system through the capability of smart-routers, we propose a novel intelligent and decentralized content diffusion system for smart-router networks. In this system, content is quickly and autonomously diffused through the network in accordance with a specified requirement on the coverage rate among neighbors. Furthermore, we design a heuristic node selection algorithm (MIG) and a replacement algorithm (MCL) to assist the diffusion of content. Specifically, a system based on MIG selects the neighbor with the maximum information gain to cache the replica, and a system based on MCL replaces the replica with the least loss of coverage rate gain. Simulation experiments show that, for the same coverage rate requirement, MIG reduces the number of replicas by at least 20.2% compared with other algorithms. Compared with other replacement algorithms, MCL achieves the best successful service rate, that is, the proportion of service that can be provided by neighbors. A system based on MIG and MCL can provide stable service with the lowest bandwidth and storage cost.

  • Waffle: A New Photonic Plasmonic Router for Optical Network on Chip

    Chao TANG  Huaxi GU  Kun WANG  

     
    LETTER-Computer System

      Publicized:
    2018/05/29
      Vol:
    E101-D No:9
      Page(s):
    2401-2403

    Optical interconnect is a promising candidate for network on chip. As the key element in the network on chip, the routers greatly affect the performance of the whole system. In this letter, we propose a new router architecture, Waffle, based on compact 2×2 hybrid photonic-plasmonic switching elements. An optimized architecture, Waffle-XY, is also designed for networks that employ the XY routing algorithm. Both Waffle and Waffle-XY are strictly non-blocking architectures and can be employed in the popular mesh-like networks. Theoretical analysis shows that Waffle and Waffle-XY perform better than several representative routers.

  • Compact CAR: Low-Overhead Cache Replacement Policy for an ICN Router

    Atsushi OOKA  Suyong EUM  Shingo ATA  Masayuki MURATA  

     
    PAPER-Network System

      Publicized:
    2017/12/18
      Vol:
    E101-B No:6
      Page(s):
    1366-1378

    Information-centric networking (ICN) has gained attention from network research communities due to its ability to efficiently disseminate content. The in-network caching function in ICN plays an important role in realizing the design goals. However, many in-network caching researchers have focused on where to cache rather than how to cache: the former is known as content deployment in the network and the latter is known as cache replacement in an ICN router. Although cache replacement has been intensively researched in the context of web-caching and content delivery networks, the conventional approaches cannot be directly applied to ICN due to the fine granularity of chunks in ICN, which eventually changes the access patterns. In this paper, we argue that ICN requires a novel cache replacement algorithm to fulfill the requirements set for high performance ICN routers. Our solution, Compact CLOCK with Adaptive Replacement (Compact CAR), is a novel cache replacement algorithm that satisfies the requirements. The evaluation result shows that the consumption of cache memory required to achieve a desired performance can be reduced by 90% compared to conventional approaches such as FIFO and CLOCK.
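
    For orientation, the sketch below shows plain CLOCK, one of the conventional baselines this abstract names. It is not Compact CAR, whose details the abstract does not give. The class name `ClockCache` and the choice to insert new entries with a cleared reference bit are assumptions; other CLOCK variants differ on that point.

```python
class ClockCache:
    """Plain CLOCK replacement: one reference bit per slot plus a hand."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.slots = []   # each slot is [key, reference_bit]
        self.index = {}   # key -> position in self.slots
        self.hand = 0     # next slot the clock hand will inspect

    def access(self, key):
        """Return True on a hit; on a miss, insert key and return False."""
        pos = self.index.get(key)
        if pos is not None:
            self.slots[pos][1] = 1          # mark as recently used
            return True
        if len(self.slots) < self.capacity:
            self.index[key] = len(self.slots)
            self.slots.append([key, 0])
            return False
        # Sweep: give referenced entries a second chance by clearing
        # their bit, and stop at the first entry whose bit is already 0.
        while self.slots[self.hand][1] == 1:
            self.slots[self.hand][1] = 0
            self.hand = (self.hand + 1) % self.capacity
        victim = self.slots[self.hand][0]
        del self.index[victim]
        self.slots[self.hand] = [key, 0]
        self.index[key] = self.hand
        self.hand = (self.hand + 1) % self.capacity
        return False


if __name__ == "__main__":
    cache = ClockCache(2)
    print([cache.access(k) for k in ["a", "b", "a", "c", "a"]])
```

    CLOCK approximates LRU with a single bit per entry, which is why it is a common low-overhead baseline for router caches.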

  • The Performance Evaluation of a 3D Torus Network Using Partial Link-Sharing Method in NoC Router Buffer

    Naohisa FUKASE  Yasuyuki MIURA  Shigeyoshi WATANABE  M.M. HAFIZUR RAHMAN  

     
    PAPER-Computer System

      Publicized:
    2017/06/30
      Vol:
    E100-D No:10
      Page(s):
    2478-2492

    A high-performance network-on-chip (NoC) router that uses minimal hardware resources to minimize the layout area is essential for NoC design. In this paper, we propose a memory sharing method for a wormhole-routed NoC architecture to alleviate the area overhead of a NoC router. In the proposed method, a memory is shared by multiple physical links by using a multi-port memory. Specifically, we propose a partial link-sharing method and evaluate the communication performance achieved with it. The results reveal that the communication performance of the proposed method is higher than that of the conventional method, and that the progress ratio of the 3D-torus network is higher than that of the 2D-torus network. The improvement in communication performance with the partial link-sharing method is achieved with only a slight increase in hardware cost.

  • Up-Stream Dispatching of Power by Density of Power Packet

    Shinya NAWATA  Ryo TAKAHASHI  Takashi HIKIHARA  

     
    LETTER-Systems and Control

      Vol:
    E99-A No:12
      Page(s):
    2581-2584

    A power packet is a unit of electric power transferred by a pulse with an information tag. This letter discusses up-stream dispatching of the power required at loads to sources through density modulation of power packets. Here, power is adjusted at a proposed router which dispatches power packets according to the tags. The scheme is analyzed by the averaging method and verified numerically.

  • Intra-AS Performance Analysis of Distributed Mobility Management Schemes

    Oshani ERUNIKA  Kunitake KANEKO  Fumio TERAOKA  

     
    PAPER-Information Network

      Publicized:
    2015/05/12
      Vol:
    E98-D No:8
      Page(s):
    1477-1492

    Distributed Mobility Management (DMM) defines Internet Protocol (IP) mobility which does not depend on centralized manipulation. DMM leads to the abatement of non-optimal routing, a single point of failure, and scalability problems appearing in centralized Mobility Management (MM). Because most DMM schemes are still in the proposal phase and no standard exists, the proposed schemes need to be investigated thoroughly to confirm their capabilities and thereby determine the best candidate practice for DMM. This paper examines five novel DMM proposals discussed in the Internet Engineering Task Force (IETF) using router-level Internet Service Provider (ISP) topologies of Sprint (USA), Tiscali (Europe), Telstra (AUS), and Exodus (USA), as user mobility within an ISP network is considered the most realistic and recurrent user movement in the modern scope. Results reflect behavioral differences of schemes depending on the network. ISPs closer to the Internet core with a high density of Points of Presence (PoPs), such as Sprint, show poorer outcomes when centralized anchors/controllers are employed, while Proxy Mobile IP (PMIP) based enhancements offer higher reliability. In contrast, smaller ISPs that reside farther away from the Internet core yield better performance with SDN-Based and Address Delegation schemes. Although the PMIP-Based DMM schemes perform better during handover, this advantage is offset by higher latency in the data plane. In contrast, the Address Delegation and SDN-Based schemes have excessive cost and latency in performing handover due to routing table updates, but perform better in the data plane, suggesting that a control/data plane split may best address optimal routing.

  • High-Speed Design of Conflictless Name Lookup and Efficient Selective Cache on CCN Router

    Atsushi OOKA  Shingo ATA  Kazunari INOUE  Masayuki MURATA  

     
    PAPER-Network

      Vol:
    E98-B No:4
      Page(s):
    607-620

    Content-centric networking (CCN) is an innovative network architecture that is being considered as a successor to the Internet. In recent years, CCN has received increasing attention from all over the world because its novel technologies (e.g., caching, multicast, aggregating requests) and communication based on names that act as addresses for content have the potential to resolve various problems facing the Internet. Implementing these technologies, however, requires routers with performance far superior to that offered by today's Internet routers. Although many researchers have proposed various router components, such as caching and name lookup mechanisms, there are few router-level designs incorporating all the necessary components. The design and evaluation of a complete router is the primary contribution of this paper. We provide a concrete hardware design for a router model that uses three basic tables, namely the forwarding information base (FIB), pending interest table (PIT), and content store (CS), and incorporates two entities that we propose. One of these entities is the name lookup entity, which looks up a name address within a few cycles from content-addressable memory by use of a Bloom filter; the other is the interest count entity, which counts interest packets that require certain content and selects content worth caching. Our contributions are (1) presenting a proper algorithm for looking up and matching name addresses in CCN communication, (2) proposing a method to process CCN packets in a way that achieves high throughput and very low latency, and (3) demonstrating feasible performance and cost on the basis of a concrete hardware design using distributed content-addressable memory.
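
    The Bloom filter mentioned in this abstract can be sketched in software as follows. This is only a generic illustration of the data structure, not the paper's hardware design. The class name `BloomFilter`, the filter size, the number of hash functions, and the use of salted SHA-256 as the hash family are all assumptions.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter: no false negatives, tunable false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, name):
        # Derive num_hashes bit positions by salting one strong hash.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{name}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, name):
        for p in self._positions(name):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, name):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(name))


if __name__ == "__main__":
    names = BloomFilter()
    names.add("/example/video/segment1")   # hypothetical content name
    print(names.might_contain("/example/video/segment1"))
```

    A filter like this lets a lookup path skip the expensive exact-match memory whenever the answer is "definitely absent", at the cost of occasional false positives.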

  • RONoC: A Reconfigurable Architecture for Application-Specific Optical Network-on-Chip

    Huaxi GU  Zheng CHEN  Yintang YANG  Hui DING  

     
    LETTER-Computer System

      Vol:
    E97-D No:1
      Page(s):
    142-145

    Optical Network-on-Chip (ONoC) is a promising emerging technology, which can solve the bottlenecks faced by electrical on-chip interconnection. However, the existing ONoC proposals are mostly built on fixed topologies, which are not flexible enough to support various applications. To make full use of the limited resources and provide a more efficient approach to resource allocation, RONoC (Reconfigurable Optical Network-on-Chip) is proposed in this letter. The topology can be reconfigured to meet the requirements of different applications. An 8×8 nonblocking router is also designed, together with the communication mechanism. The simulation results show that the saturation load of RONoC is twice that of mesh, and the energy consumption is 25% lower than that of mesh.

  • A Router-Aided Hierarchical P2P Traffic Localization Based on Variable Additional Delay Insertion

    Hiep HOANG-VAN  Yuki SHINOZAKI  Takumi MIYOSHI  Olivier FOURMAUX  

     
    PAPER

      Vol:
    E97-B No:1
      Page(s):
    29-39

    Most peer-to-peer (P2P) systems build their own overlay networks for implementing peer selection strategies without taking into account the locality on the underlay network. As a result, a large quantity of traffic crossing internet service providers (ISPs) or autonomous systems (ASes) is generated on the Internet. Controlling the P2P traffic is therefore becoming a big challenge for the ISPs. To control the cost of the cross-ISP/AS traffic, ISPs often throttle or even block P2P applications in their networks. In this paper, we propose a router-aided approach for localizing the P2P traffic hierarchically; it features the insertion of additional delay into each P2P packet based on the geographical location of its destination. Compared to the existing approaches that solve the problem on the application layer, our proposed method does not require dedicated servers, cooperation between ISPs and P2P users, or modification of existing P2P application software. Therefore, the proposal can be easily utilized by all types of P2P applications. Experiments on P2P streaming applications indicate that our hierarchical traffic localization method not only significantly reduces the inter-domain traffic but also maintains good performance of P2P applications.

  • Fanout Set Partition Scheme for QoS-Guaranteed Multicast Transmission

    Kyungmin KIM  Seokhwan KONG  Jaiyong LEE  

     
    PAPER-Network

      Vol:
    E96-B No:12
      Page(s):
    3080-3090

    Increasing demand for multicast transmission necessitates service-specific and precise quality-of-service (QoS) control. Since existing works provided limited methodologies such as best path selection, their ability is restricted by the given topology and the congestion status of the network. This paper proposes a fanout set partition (FSP) scheme to realize QoS-guaranteed multicast transmission. The FSP scheme adjusts the delay of the multicast flow by dividing its fanout set into smaller subsets. Since it is carried out based on the service requirement, service-specific QoS control is implemented. Mathematical analysis investigates the trade-offs, and the performance evaluation results show significant improvements under various traffic conditions.

  • A 250 Msps, 0.5 W eDRAM-Based Search Engine Dedicated Low Power FIB Application

    Hisashi IWAMOTO  Yuji YANO  Yasuto KURODA  Koji YAMAMOTO  Kazunari INOUE  Ikuo OKA  

     
    PAPER-Integrated Electronics

      Vol:
    E96-C No:8
      Page(s):
    1076-1082

    Ternary content addressable memory (TCAM) is a popular LSI for use in high-throughput forwarding engines on routers. However, the unique structure of TCAM consumes a huge amount of power, which restricts its ability to handle large lookup tables in IP routers. In this paper, we propose a commodity-memory-based hardware architecture for the forwarding information base (FIB) application that solves the substantial problems of power and density. The proposed architecture is examined using a test chip fabricated with 40 nm embedded DRAM (eDRAM) technology. The measured power is much lower than that of conventional TCAM-based designs, and the energy metric reaches 0.01 fJ/bit/search. The power consumption is almost 0.5 W at 250 Msps and 8M entries.

  • A Node Design and a Framework for Development and Experimentation for an Information-Centric Network Open Access

    George PARISIS  Dirk TROSSEN  Hitoshi ASAEDA  

     
    INVITED PAPER

      Vol:
    E96-B No:7
      Page(s):
    1650-1660

    Information-centric networking has been touted as an alternative to the current Internet architecture. Our work addresses a crucial part of such a proposal, namely the design of a network node within an information-centric networking architecture. Special attention is given to providing a platform for development and experimentation in an emerging network research area; an area that questions many starting points of the current Internet. In this paper, we describe the service model exposed to applications and provide background on the operation of the platform. For illustration, we present current efforts in deployment and experimentation, along with demo applications.

  • WHIT: A More Efficient Hybrid Method for Single-Packet IP Traceback Using Walsh Matrix and Router Degree Distribution

    Yulong WANG  Ji REN  

     
    PAPER-Internet

      Vol:
    E96-B No:7
      Page(s):
    1896-1907

    Single-packet attacks can be tracked with logging-based IP traceback approaches, whereas DDoS attacks can be tracked with marking-based approaches. However, both approaches have their limits. Logging-based approaches incur heavy overhead for packet-digest storage as well as time overhead for both path recording and recovery. Marking-based approaches incur little traceback overhead but are unable to track single packets. Simply deploying both approaches in the same network to deal with single-packet and DDoS attacks is not an efficient solution due to the heavy traceback overhead. Recent studies suggest that hybrid approaches are more efficient as they consume less router memory to store packet digests and require fewer attack packets to recover attack paths. Thus, the hybrid single-packet traceback approach is more promising for efficiently tracking both single-packet and DDoS attacks. The major challenge lies in reducing storage and time overhead while maintaining single-packet traceback capability. In this paper, we present a new hybrid approach that efficiently tracks single-packet attacks by means of a novel path fragment encoding scheme using the orthogonality of the Walsh matrix and the degree distribution characteristic of router-level topologies. Compared to HIT (Hybrid IP Traceback), which, to the best of our knowledge, is the most efficient hybrid approach for single-packet traceback, our approach has three advantages. First, it reduces the overhead by 2/3 in both storage and time for recording packet paths. Second, the time overhead for recovering packet paths is also reduced by a calculable amount. Finally, our approach generates no more than 2/3 of the false-positive paths generated by HIT.
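
    The orthogonality property this abstract relies on can be checked with a small sketch. It builds a Hadamard matrix by the Sylvester construction, which has the same rows as the Walsh matrix of that order in a different row order, and verifies that distinct rows have a zero dot product. It does not reproduce the WHIT encoding scheme, which the abstract does not describe; the function names are hypothetical.

```python
def hadamard(order):
    """Sylvester construction of a +1/-1 Hadamard matrix.

    order must be a power of two. The rows equal those of the Walsh
    matrix of the same order, up to a reordering of the rows.
    """
    if order < 1 or order & (order - 1):
        raise ValueError("order must be a power of two")
    h = [[1]]
    while len(h) < order:
        # [[H, H], [H, -H]] doubles the order at each step.
        h = [row + row for row in h] + \
            [row + [-x for x in row] for row in h]
    return h


def dot(a, b):
    """Dot product of two equal-length integer rows."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    h = hadamard(4)
    for i, row in enumerate(h):
        print(row, [dot(row, other) for other in h])
```

    Each row has a dot product of `order` with itself and 0 with every other row, so a receiver can test which row was used by correlating against each candidate.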

  • A Low-Power Packet Memory Architecture with a Latency-Aware Packet Mapping Method

    Hyuk-Jun LEE  Seung-Chul KIM  Eui-Young CHUNG  

     
    LETTER-Computer System

      Vol:
    E96-D No:4
      Page(s):
    963-966

    A packet memory stores packets in Internet routers and typically requires RTT×C of buffer space, e.g. several GBytes, where RTT is the average round-trip time of a TCP flow and C is the bandwidth of the router's output link. It is implemented with DRAM parts that are accessed in parallel to achieve the required bandwidth. These parts consume significant power in a router whose scalability is heavily limited by power and heat problems. Previous work shows that the packet memory size can be reduced to RTT×C/√N, where N is the number of long-lived TCP flows. In this paper, we propose a novel packet memory architecture which splits the packet memory into on-chip and off-chip packet memories. We also propose a low-power packet mapping method for this architecture by estimating the latency of packets and mapping packets with small latencies to the on-chip memory. The experimental results show that our proposed architecture and mapping method reduce the dynamic power consumption of the off-chip memory by as much as 94.1% with only 50% of the packet buffer size suggested by the previous work in realistic scenarios.
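
    A short worked example makes the buffer-size figures concrete. It assumes the reduced size takes the commonly cited form RTT×C/√N. The RTT, link rate, and flow count below are illustrative values, not figures from the paper, and the function name `buffer_bytes` is hypothetical.

```python
import math


def buffer_bytes(rtt_s, link_bps, flows=1):
    """Buffer size in bytes under the RTT*C/sqrt(N) rule.

    rtt_s    : average round-trip time in seconds
    link_bps : output link bandwidth in bits per second
    flows    : number of long-lived TCP flows (1 gives plain RTT*C)
    """
    return rtt_s * link_bps / 8 / math.sqrt(flows)


if __name__ == "__main__":
    # Illustrative assumptions: 250 ms RTT on a 40 Gb/s link.
    full = buffer_bytes(0.25, 40e9)
    reduced = buffer_bytes(0.25, 40e9, flows=10_000)
    print(f"RTT*C         : {full / 1e9:.2f} GB")
    print(f"RTT*C/sqrt(N) : {reduced / 1e6:.1f} MB")
```

    With these assumed values the classic rule asks for 1.25 GB, while 10,000 long-lived flows bring the requirement down to 12.5 MB, a size at which part of the buffer could plausibly sit on-chip.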

  • Router Power Reduction through Dynamic Performance Control Based on Traffic Predictions

    Hiroyuki ITO  Hiroshi HASEGAWA  Ken-ichi SATO  

     
    PAPER-Energy in Electronics Communications

      Vol:
    E95-B No:10
      Page(s):
    3130-3138

    We investigate the possibility of reducing router power consumption through dynamic router performance control. The proposed algorithm employs a typical low pass filter and, therefore, is simple enough to implement in each related element in a router. Numerical experiments using several real Internet traffic data sets show the degree of reduction in power consumption that can be achieved by using the proposed dynamic performance control algorithm. Detailed analysis clarifies the relationships among various parameter values that include packet loss ratios and the degree of power savings. We also propose a simple method based on the leaky bucket model, which can instantaneously estimate the packet loss ratio. It is shown that this simple method yields a good approximation of the results obtained by exact packet-by-packet simulation. The simple method easily enables us to derive appropriate parameter values for the control algorithm for given traffic that may differ in different segments of the Internet.
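
    The two ingredients this abstract names, a low-pass filter for traffic prediction and a leaky-bucket estimate of packet loss, can be sketched as below. The abstract does not specify the filter order, the bucket parameters, or the time granularity, so the first-order exponential filter, the per-interval fluid model, and every name and default value here are assumptions.

```python
def low_pass(samples, alpha=0.2):
    """First-order low-pass (exponential smoothing) of a traffic series."""
    if not samples:
        return []
    smoothed = []
    y = samples[0]
    for x in samples:
        y = alpha * x + (1 - alpha) * y
        smoothed.append(y)
    return smoothed


def leaky_bucket_loss(arrivals, service_rate, bucket_size):
    """Estimate the loss ratio with a fluid leaky-bucket model.

    arrivals     : traffic volume arriving in each interval
    service_rate : volume the router can drain per interval
    bucket_size  : buffer capacity; any overflow counts as loss
    """
    level = lost = total = 0.0
    for a in arrivals:
        total += a
        level += a
        if level > bucket_size:
            lost += level - bucket_size
            level = bucket_size
        level = max(0.0, level - service_rate)
    return lost / total if total else 0.0


if __name__ == "__main__":
    traffic = [10, 10, 10]
    print(low_pass(traffic))
    print(leaky_bucket_loss(traffic, service_rate=5, bucket_size=10))
```

    In a scheme of this kind, the smoothed series would drive the choice of `service_rate` (the router's performance level), and the leaky-bucket estimate would indicate whether that choice keeps the loss ratio acceptable without running a packet-by-packet simulation.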
