The search functionality is under construction.

Author Search Result

[Author] Hiroaki NISHI(17hit)

1-17hit
  • Signal Transmission and Coding Architecture for Next-Generation Ethernet

    Hidehiro TOYODA  Hiroaki NISHI  Shinji NISHIMURA  Hisaaki KANAI  Katsuyoshi HARASAWA  

     
    PAPER

      Vol:
    E86-D No:11
      Page(s):
    2317-2324

    The first practical approach to 100-Gigabit Ethernet, i.e., Ethernet with a throughput of 100-Gb/s, is proposed for use in the next generation of LANs for GRID computing and large-capacity data centers. New structures, including a coding architecture, de-skewing method and high-speed packaging techniques, are introduced to the PHY layer to obtain the required data rate. Our form of 100-Gigabit Ethernet uses 10-Gb/s 10-channel CWDM or parallel-optical links. The coding architecture is formed of 64B/66B codes, modified for the CWDM and parallel links. In the de-skewing of the parallel signals, specially designed IDLE characters are used to compensate for skewing of data in the respective signal lanes. Advanced packaging techniques, which suppress the propagation loss and reflection of the 10-Gb/s lanes to obtain high-speed, good integrity and low-noise signaling, are proposed and evaluated. The proposed architectural features make this 100-Gigabit Ethernet concept practical for next-generation LANs.

  • 100-Gb/s Physical-Layer Architecture for Next-Generation Ethernet

    Hidehiro TOYODA  Shinji NISHIMURA  Michitaka OKUNO  Kouji FUKUDA  Kouji NAKAHARA  Hiroaki NISHI  

     
    PAPER

      Vol:
    E89-B No:3
      Page(s):
    696-703

    A high-speed physical-layer architecture for Ethernet is described that supports 100-Gb/s throughput and 40-km transmission, making it well suited for next-generation metro-area and intrabuilding networks. Its links comprise 1210-Gb/s synchronized parallel optical lanes. Ethernet data frames are transmitted by coarse wavelength division multiplexing link and bundled optical fibers. Ten of the lanes convey 640-bit data synchronously (64 bits10 lanes). One conveys forward error correction code ((132 b, 140 b) Hamming code), providing highly reliable (BER < 10-12) data transmission, and the other conveys parity data, enabling fault-lane recovery. A newly developed 64B/66B code-sequence-based deskewing mechanism is used that provides low-latency compensation for the lane-to-lane skew, which is less than 88 ns. Testing of this physical-layer architecture in a field programmable gate array circuit demonstrated that it can provide 100-Gb/s data communication with a 590 k gate circuit, which is small enough for implementation in a single LSI circuit.

  • A High-Speed, Highly-Reliable Network Switch for Parallel Computing System Using Optical Interconnection

    Shinji NISHIMURA  Tomohiro KUDOH  Hiroaki NISHI  Koji TASHO  Katsuyoshi HARASAWA  Shigeto AKUTSU  Shuji FUKUDA  Yasutaka SHIKICHI  

     
    PAPER-Optical Interconnection Systems

      Vol:
    E84-C No:3
      Page(s):
    288-294

    RHiNET-2/SW is a network switch for the RHiNET-2 parallel computing system. RHiNET-2/SW enables high-speed and long-distance data transmission between PC nodes for parallel computing. In RHiNET-2/SW, a one-chip CMOS switch-LSI and eight pairs of 800-Mbit/s 12-channel parallel optical interconnection modules are mounted into a single compact board. This switch allows high-speed 8-Gbit/s/port parallel optical data transmission over a distance of up to 100 m, and the aggregate throughput is 64 Gbit/s/board. The CMOS-ASIC switching LSI enables high-throughput (64 Gbit/s) packet switching with a single chip. The parallel optical interconnection modules enable high-speed and low-latency data transmission over a long distance. The structure and layout of the printed circuit board is optimized for high-speed, high-density device implementation to overcome electrical problems such as signal propagation-loss and crosstalk. All of the electrical interfaces are composed of high-speed CMOS-LVDS logic (800 Mbit/s/pin). We evaluated the reliability of the optical I/O port through long-term data transmission. No errors were detected during 50 hours of continuous data transmission at a data rate of 800 Mbit/s 10 bits (BER: < 2.44 10-14). This test result shows that RHiNET-2/SW can provide high-throughput, long-transmission-length, and highly reliable data transmission in a practical parallel computing system.

  • SoC Architecture Synthesis Methodology Based on High-Level IPs

    Michiaki MURAOKA  Hiroaki NISHI  Rafael K. MORIZAWA  Hideaki YOKOTA  Yoichi ONISHI  

     
    PAPER-System Level Design

      Vol:
    E87-A No:12
      Page(s):
    3057-3067

    We propose a sophisticated synthesis methodology for SoC (System-on-Chip) architectures from the system level specification based on reusable high-level IPs named as Virtual Cores (VCores), in this paper. This synthesis methodology generates an initial architecture that consists of a CPU, buses, IPs, peripherals, I/Os and an RTOS (Real Time Operating System), as well as making tradeoffs to the architecture, between hardware and software on assigned software VCores and hardware VCores. The results of an architecture level design experiment, using the proposed methodology, shows that the partial automation of the architecture synthesis process, allied with design reuse, accelerates the architecture design, therefore, reducing the time required to design an architecture of SoC.

  • Cache-Based Network Processor Architecture: Evaluation with Real Network Traffic

    Michitaka OKUNO  Shinji NISHIMURA  Shin-ichi ISHIDA  Hiroaki NISHI  

     
    PAPER

      Vol:
    E89-C No:11
      Page(s):
    1620-1628

    A novel cache-based network processor (NP) architecture that can catch up with next generation 100-Gbps packet-processing throughput by exploiting a nature of network traffic is proposed, and the prototype is evaluated with real network traffic traces. This architecture consists of several small processing units (PUs) and a bit-stream manipulation hardware called a burst-stream path (BSP) that has a special cache mechanism called a process-learning cache (PLC) and a cache-miss handler (CMH). The PLC memorizes a packet-processing method with all table-lookup results, and applies it to subsequent packets that have the same information in their header. To avoid packet-processing blocking, the CMH handles cache-miss packets while registration processing is performed at the PLC. The combination of the PLC and CMH enables most packets to skip the execution at the PUs, which dissipate huge power in conventional NPs. We evaluated an FPGA-based prototype with real core network traffic traces of a WIDE backbone router. From the experimental results, we observed a special case where the packet of minimum size appeared in large quantities, and the cache-based NP was able to achieve 100% throughput with only the 10%-throughput PUs due to the existence of very high temporal locality of network traffic. From the whole results, the cache-based NP would be able to achieve 100-Gbps throughput by using 10- to 40-Gbps throughput PUs. The power consumption of the cache-based NP, which consists of 40-Gbps throughput PUs, is estimated to be only 44.7% that of a conventional NP.

  • Data-Driven Implementation of Highly Efficient TCP/IP Handler to Access the TINA Network

    Hiroshi ISHII  Hiroaki NISHIKAWA  Yuji INOUE  

     
    PAPER-Software Platform

      Vol:
    E83-B No:6
      Page(s):
    1355-1362

    This paper discusses and clarifies effectiveness of data-driven implementation of protocol handling system to access TINA (Telecommunications Information Networking Architecture) network and internet. TINA is a networking architecture that achieves networking services and management ubiquitously for users and networks. Many TINA related ACTS (Advanced Communication Technologies and Services) projects have been organized in Europe. In Japan, The TINA Trial (TTT) to achieve ATM network management and services based on TINA architectures was done by NTT and several manufactures from April 1997 to April 1999. In these studies and trials, much effort is devoted to development of software based on service architecture and network architecture being standardized in TINA-C (TINA Consortium). In order to achieve TINA environment universally in customers and network sides, we have to consider how to deploy TINA environment onto user side and how to use access transmission capacity as efficiently as possible. Recent technology can easily achieve application and environment downloading from the network side to user side by use of e. g. , JAVA. In accessing the network, there are several possible bottlenecks in information exchange in customer side such as PC processing capability, access protocol handling capability, intra-house wiring bandwidth. Authors, in parallel with TINA software architecture study, have been studying versatile requirements for hardware platform of TINA network. In those studies, we have clarified that the stream-oriented data-driven processor authors have been studying and developing have high reliability, high multiprocessing and multimedia information processing capability. Based on these studies, this paper first shows Von Neumann-based protocol handler is ineffective in case of multiprocessing through mathematical and emulation studies. Then, we show our data-driven protocol handling can effectively realize access protocol handling by emulation study. Then, we describe a result of first step of implementation of data-driven TCP/IP protocol handling. This result proves our TCP/IP hub based on data-driven processor is applicable not only for TINA/CORBA network but normal internet access. Finally, we show a possible customer premises network configuration which resolves bottleneck to access TINA network through ATM access.

  • Experimental Verification of SDN/NFV in Integrated mmWave Access and Mesh Backhaul Networks Open Access

    Makoto NAKAMURA  Hiroaki NISHIUCHI  Jin NAKAZATO  Konstantin KOSLOWSKI  Julian DAUBE  Ricardo SANTOS  Gia Khanh TRAN  Kei SAKAGUCHI  

     
    PAPER-Network

      Pubricized:
    2020/09/29
      Vol:
    E104-B No:3
      Page(s):
    217-228

    In this paper, a Proof-of-Concept (PoC) architecture is constructed, and the effectiveness of mmWave overlay heterogeneous network (HetNet) with mesh backhaul utilizing route-multiplexing and Multi-access Edge Computing (MEC) utilizing prefetching algorithm is verified by measuring the throughput and the download time of real contents. The architecture can cope with the intensive mobile data traffic since data delivery utilizes multiple backhaul routes based on the mesh topology, i.e. route-multiplexing mechanism. On the other hand, MEC deploys the network edge contents requested in advance by nearby User Equipment (UE) based on pre-registered context information such as location, destination, demand application, etc. to the network edge, which is called prefetching algorithm. Therefore, mmWave access can be fully exploited even with capacity-limited backhaul networks by introducing the proposed algorithm. These technologies solve the problems in conventional mmWave HetNet to reduce mobile data traffic on backhaul networks to cloud networks. In addition, the proposed architecture is realized by introducing wireless Software Defined Network (SDN) and Network Function Virtualization (NFV). In our architecture, the network is dynamically controlled via wide-coverage microwave band links by which UE's context information is collected for optimizing the network resources and controlling network infrastructures to establish backhaul routes and MEC servers. In this paper, we develop the hardware equipment and middleware systems, and introduce these algorithms which are used as a driver of IEEE802.11ad and open source software. For 5G and beyond, the architecture integrated in mmWave backhaul, MEC and SDN/NFV will support some scenarios and use cases.

  • RPC: An Approach for Reducing Compulsory Misses in Packet Processing Cache

    Hayato YAMAKI  Hiroaki NISHI  Shinobu MIWA  Hiroki HONDA  

     
    PAPER-Information Network

      Pubricized:
    2020/09/07
      Vol:
    E103-D No:12
      Page(s):
    2590-2599

    We propose a technique to reduce compulsory misses of packet processing cache (PPC), which largely affects both throughput and energy of core routers. Rather than prefetching data, our technique called response prediction cache (RPC) speculatively stores predicted data in PPC without additional access to the low-throughput and power-consuming memory (i.e., TCAM). RPC predicts the data related to a response flow at the arrival of the corresponding request flow, based on the request-response model of internet communications. Our experimental results with 11 real-network traces show that RPC can reduce the PPC miss rate by 13.4% in upstream and 47.6% in downstream on average when we suppose three-layer PPC. Moreover, we extend RPC to adaptive RPC (A-RPC) that selects the use of RPC in each direction within a core router for further improvement in PPC misses. Finally, we show that A-RPC can achieve 1.38x table-lookup throughput with 74% energy consumption per packet, when compared to conventional PPC.

  • Designing Coplanar Superconducting Lumped-Element Bandpass Filters Using a Mechanical Tuning Method

    Shigeki HONTSU  Kazuyuki AGEMURA  Hiroaki NISHIKAWA  Masanobu KUSUNOKI  

     
    PAPER

      Vol:
    E89-C No:2
      Page(s):
    151-155

    A coplanar type lumped-element 6-pole microwave Chebyshev bandpass filter (BPF) of center frequency (f0) 2.0 GHz and fractional bandwidth (FBW) 1.0 % was designed. For the design method, theory of direct coupled resonator filters using K-inverters was employed. Coplanar type lumped-element BPFs are composed of a meander-line L and interdigital C elements. The frequency response was simulated and analyzed using an electromagnetic field simulator (Sonnet-EM). Further, the changes in f0 and FBW of the BPF were also realized by the mechanical tuning method.

  • Data-Driven Fault Management for TINA Applications

    Hiroshi ISHII  Hiroaki NISHIKAWA  Yuji INOUE  

     
    PAPER-Distribute MGNT

      Vol:
    E80-B No:6
      Page(s):
    907-914

    This paper describes the effectiveness of stream-oriented data-driven scheme for achieving autonomous fault management of hyper-distributed systems such as networks based on the Telecommunications Information Networking Architecture (TINA). TINA, whose specifications are in the finalizing phase within TINA-Consortium, is aiming at achieving interoperability and reusability of telecom applications software and independent of underlying technologies. However, to actually implement TINA network, it is essential to consider the technology constraints. Especially autonomous fault management at run-time is crucial for distributed network environment because centralized control using global information is very difficult. So far many works have been done on so-called off-line management but runtime management of service failure seems immature. This paper proposes introduction of stream-oriented data-driven processors to the autonomous fault management at runtime in TINA based distributed network environment. It examines the features of distributed network applications and technology requirements to achieve fault management of those distributed applications such as effective multiprocessing of surveillance, testing, reconfiguration in addition to ordinary processing.

  • Low-Power Network-Packet-Processing Architecture Using Process-Learning Cache for High-End Backbone Router

    Michitaka OKUNO  Shin-ichi ISHIDA  Hiroaki NISHI  

     
    PAPER-Digital

      Vol:
    E88-C No:4
      Page(s):
    536-543

    A novel cache-based packet-processing-engine (PPE) architecture that achieves low-power consumption and high packet-processing throughput by exploiting the nature of network traffic is proposed. This architecture consists of a processing-unit array and a bit-stream manipulation path called a burst stream path (BSP) that has a special cache mechanism called a process-learning cache (PLC). Network packets, which have the same information in their header, appear repeatedly over a short time. By exploiting that nature, the PLC memorizes the packet-processing method with all results (i. e. , table lookups), and applies it to other packets. The PLC enables most packets to skip the execution at the processing-unit array, which consumes high power. As a practical implementation of the cache-based PPE architecture, P-Gear was designed. In particular, P-Gear was compared with a conventional PPE in terms of silicon die size and power consumption. According to this comparison, in the case of current 0.13-µm CMOS process technology, P-Gear can achieve 100-Gbps (gigabit per second) packet-processing throughput with only 36.5% of the die size and 32.8% of the power consumption required by the conventional PPE. Configurations of both architectures for the 1- to 100-Gbps throughput range were also analyzed. In the throughput range of 10-Gbps or more, P-Gear can achieve the target throughput in a smaller die size than the conventional PPE. And for the whole throughput range, P-Gear can achieve a target throughput at lower power than the conventional PPE.

  • The RDT Router Chip: A Versatile Router for Supporting a Distributed Shared Memory

    Hiroaki NISHI  Ken-ichiro ANJO  Tomohiro KUDOH  Hideharu AMANO  

     
    PAPER-Interconnection Networks

      Vol:
    E80-D No:9
      Page(s):
    854-862

    JUMP-1 is currently under development by seven Japanese universities to establish techniques for building an efficient distributed shared memory on a massively parallel processor. It provides a coherent cache with reduced hierarchical bit-map directory scheme to achieve cost effective and high performance management. Messages for coherent cache are transferred through a fat tree on the RDT (Recursive Diagonal Torus) interconnection network. RDT router supports versatile functions including multicast and acknowledge combining for the reduced hierarchical bit-map directory scheme. By using 0.5µm BiCMOS SOG technology, it can transfer all packets synchronized with a unique CPU clock (50MHz). Long coaxial cables (4m at maximum) are directly driven with the ECL interface of this chip. Using the dual port RAM, packet buffers allow to push and pull a flit of the packet simultaneously.

  • Architecture and Evaluation of a Third-Generation RHiNET Switch for High-Performance Parallel Computing

    Hiroaki NISHI  Shinji NISHIMURA  Katsuyoshi HARASAWA  Tomohiro KUDOH  Hideharu AMANO  

     
    PAPER

      Vol:
    E86-D No:10
      Page(s):
    1987-1995

    RHiNET-3/SW is the third-generation switch used in the RHiNET-3 system. It provides both low-latency processing and flexible connection due to its use of a credit-based flow-control mechanism, topology-free routing, and deadlock-free routing. The aggregate throughput of RHiNET-3/SW is 80 Gbps, and the latency is 140 ns. RHiNET-3/SW also provides a hop-by-hop retransmission mechanism. Simulation demonstrated that the effective throughput at a node in a 64-node torus RHiNET-3 system is equivalent to the effective throughput of a 64-bit 33-MHz PCI bus and that the performance of RHiNET-3/SW almost equals or exceeds the best performance of RHiNET-2/SW, the second-generation switch. Although credit-based flow control requires 26% more gates than rate-based flow control to manage the virtual channels (VCs), it requires less VC memory than rate-based flow control. Moreover, its use in a network system reduces latency and increases the maximum throughput compared to rate-based flow control.

  • Design Philosophy of a Networking-Oriented Data-Driven Processor: CUE

    Hiroaki NISHIKAWA  

     
    INVITED PAPER

      Vol:
    E89-C No:3
      Page(s):
    221-229

    To realize a secure networking infrastructure, the author is carrying out CUE (Coordinating Users' requirements and Engineering constraints) project with a network carrier and a VLSI manufacture. Since CUE-series data-driven processors developed in the project were specifically designed to be an embedded programmable component as well as a multi-processor element, particular design considerations were taken to achieve real-time multiprocessing capabilities essentially needed in multi-media communication environment. A novel data-driven paradigm is first introduced with special emphasis on VLSI-oriented parallel processing architectures. Data-driven protocol handlings on CUE-p and CUE-v1 are then discussed for their real-time multiprocessing capability without any runtime overheads. The emulation facility RESCUE (Real-time Execution System for CUE-series data-driven processors) was also built to develop scalable chip multi-processors in self-evolutional manner. Based on emulation results, the latest version named CUE-v2 was realized as a hybrid processor enabling simultaneous processing of data-driven and control-driven threads to achieve higher performance for inline processing and to avoid any bottlenecks in sequential parts of real-time programs frequently encountered in actual time-sensitive applications. Effectiveness of the data-driven chip multi-processor architecture will finally be addressed for lower power consumption and scalability to realize future VLSI processors in the sub-100 nm era.

  • Time Synchronization Technique Using EPON for Next-Generation Power Grids

    Yuichi NAKAMURA  Andy HARVATH  Hiroaki NISHI  

     
    PAPER

      Vol:
    E99-B No:4
      Page(s):
    859-866

    Changing attitudes toward energy security and energy conservation have led to the introduction of distributed power systems such as photovoltaic, gas-cogeneration, biomass, water, and wind power generators. The mass installation of distributed energy generators often causes instability in the voltage and frequency of the power grid. Moreover, the power quality of distributed power grids can become degraded when system faults or the activation of highly loaded machines cause rapid changes in power load. To avoid such problems and maintain an acceptable power quality, it is important to detect the source of these rapid changes. To address these issues, next-generation power grids that can detect the fault location have been proposed. Fault location demands accurate time synchronization. Conventional techniques use the Global Positioning System (GPS) and/or IEEE 1588v2 for time synchronization. However, both methods have drawbacks — GPS cannot be used in indoor situations, and the installation cost of IEEE 1588v2 devices is high. In this paper, a time synchronization technique using the broadcast function of an Ethernet Passive Optical Network (EPON) system is proposed. Experiments show that the proposed technique is low-cost and useful for smart grid applications that use time synchronization in EPON-based next-generation power grids.

  • Novel Method to Watermark Anonymized Data for Data Publishing

    Yuichi NAKAMURA  Yoshimichi NAKATSUKA  Hiroaki NISHI  

     
    PAPER-Privacy

      Pubricized:
    2017/05/18
      Vol:
    E100-D No:8
      Page(s):
    1671-1679

    In this study, an anonymization infrastructure for the secondary use of data is proposed. The proposed infrastructure can publish data that includes privacy information while preserving the privacy by using anonymization techniques. The infrastructure considers a situation where ill-motivated users redistribute the data without authorization. Therefore, we propose a watermarking method for anonymized data to solve this problem. The proposed method is implemented, and the proposed method's tolerance against attacks is evaluated.

  • Self-Organized Link State Aware Routing for Multiple Mobile Agents in Wireless Network

    Akihiro ODA  Hiroaki NISHI  

     
    PAPER

      Vol:
    E93-B No:8
      Page(s):
    2012-2021

    Recently, the importance of data sharing structures in autonomous distributed networks has been increasing. A wireless sensor network is used for managing distributed data. This type of distributed network requires effective information exchanging methods for data sharing. To reduce the traffic of broadcasted messages, reduction of the amount of redundant information is indispensable. In order to reduce packet loss in mobile ad-hoc networks, QoS-sensitive routing algorithm have been frequently discussed. The topology of a wireless network is likely to change frequently according to the movement of mobile nodes, radio disturbance, or fading due to the continuous changes in the environment. Therefore, a packet routing algorithm should guarantee QoS by using some quality indicators of the wireless network. In this paper, a novel information exchanging algorithm developed using a hash function and a Boolean operation is proposed. This algorithm achieves efficient information exchanges by reducing the overhead of broadcasting messages, and it can guarantee QoS in a wireless network environment. It can be applied to a routing algorithm in a mobile ad-hoc network. In the proposed routing algorithm, a routing table is constructed by using the received signal strength indicator (RSSI), and the neighborhood information is periodically broadcasted depending on this table. The proposed hash-based routing entry management by using an extended MAC address can eliminate the overhead of message flooding. An analysis of the collision of hash values contributes to the determination of the length of the hash values, which is minimally required. Based on the verification of a mathematical theory, an optimum hash function for determining the length of hash values can be given. Simulations are carried out to evaluate the effectiveness of the proposed algorithm and to validate the theory in a general wireless network routing algorithm.