Author Search Result

[Author] Junichiro KADOMOTO(4hit)

  • A Principal Factor of Performance in Decoupled Front-End

    Yuya DEGAWA  Toru KOIZUMI  Tomoki NAKAMURA  Ryota SHIOYA  Junichiro KADOMOTO  Hidetsugu IRIE  Shuichi SAKAI  

     
    PAPER

      Publicized:
    2023/06/30
      Vol:
    E106-D No:12
      Page(s):
    1960-1968

    One of the performance bottlenecks of a processor is the front-end that supplies instructions. Various techniques, such as cache replacement algorithms and hardware prefetching, have been investigated to facilitate smooth instruction supply at the front-end and to improve processor performance. In these approaches, one of the most important factors has been the reduction in the number of instruction cache misses. By using the number of instruction cache misses or derived factors, previous studies have explained the performance improvements achieved by their proposed methods. However, we found that the number of instruction cache misses does not always explain performance changes well in modern processors. This is because the front-end in modern processors handles subsequent instruction cache misses in overlap with earlier ones. Based on this observation, we propose a novel factor: the number of miss regions. We define a region as a sequence of instructions from one branch misprediction to the next, while we define a miss region as a region that contains one or more instruction cache misses. At the boundary of each region, the pipeline is flushed owing to a branch misprediction. Thus, cache misses after this boundary are not handled in overlap with cache misses before the boundary. As a result, the number of miss regions is equal to the number of cache misses that are processed without overlap. In this paper, we demonstrate that the number of miss regions can well explain the variation in performance through mathematical models and simulation results. The results show that the model explains cycles per instruction with an average error of 1.0% and maximum error of 4.1% when applying an existing prefetcher to the instruction cache. The idea of miss regions highlights that instruction cache misses and branch mispredictions interact with each other in processors with a decoupled front-end. 
    We hope that considering this interaction will motivate the development of fast performance estimation methods and new microarchitectural methods.
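
The definition of a miss region in the abstract above can be illustrated with a minimal sketch. It assumes a simplified front-end trace given as a flat list of event labels; the label strings and the function name `count_miss_regions` are hypothetical and do not come from the paper, whose simulator and models are not described here.

```python
from typing import Iterable

# Hypothetical event labels for a front-end trace (assumptions, not from the paper).
MISPREDICT = "branch_mispredict"
ICACHE_MISS = "icache_miss"


def count_miss_regions(events: Iterable[str]) -> int:
    """Count regions (spans between branch mispredictions) that contain
    at least one instruction cache miss."""
    miss_regions = 0
    region_has_miss = False
    for event in events:
        if event == ICACHE_MISS:
            region_has_miss = True
        elif event == MISPREDICT:
            # A misprediction flushes the pipeline and closes the current region.
            if region_has_miss:
                miss_regions += 1
            region_has_miss = False
    # The trailing region has no closing misprediction; count it if it missed.
    if region_has_miss:
        miss_regions += 1
    return miss_regions


# Three cache misses, but the first two share one region.
trace = [ICACHE_MISS, ICACHE_MISS, MISPREDICT, "other", MISPREDICT, ICACHE_MISS]
print(count_miss_regions(trace))
```

In this toy trace the two misses in the first region count once, which mirrors the abstract's point that misses within the same region are handled in overlap.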

  • FPGA-based Garbling Accelerator with Parallel Pipeline Processing

    Rin OISHI  Junichiro KADOMOTO  Hidetsugu IRIE  Shuichi SAKAI  

     
    PAPER

      Publicized:
    2023/08/02
      Vol:
    E106-D No:12
      Page(s):
    1988-1996

    As more and more programs handle personal information, the demand for secure handling of data is increasing. The protocol that satisfies this demand is called secure function evaluation (SFE) and has attracted much attention from a privacy protection perspective. In two-party SFE, two mutually untrustworthy parties compute an arbitrary function on their respective secret inputs without disclosing any information other than the output of the function. For example, it is possible to execute a program while protecting private information, such as genomic information. The garbled circuit (GC), a method of program obfuscation in which the program is divided into gates and the output is calculated using a symmetric key cipher for each gate, is an efficient method for this purpose. However, GC is computationally expensive and has a significant overhead even with an accelerator. We focus on hardware acceleration because GC is limited to certain types of calculations, such as encryption and XOR. In this paper, we propose an architecture that accelerates garbling by running multiple garbling engines simultaneously, building on the latest FPGA-based GC accelerator. In this architecture, managers are introduced to perform multiple rows of pipeline processing simultaneously. We also propose an optimized implementation of RAM for this FPGA accelerator. As a result, it achieves an average performance improvement of 26% in garbling the same set of programs, compared to the state-of-the-art (SOTA) garbling accelerator.
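
The per-gate garbling step mentioned in the abstract above can be sketched for a single AND gate. This is a textbook-style illustration only, not the scheme used by the paper's accelerator: SHA-256 stands in for the symmetric key cipher, rows are recognised by an all-zero tag, and every name here (`garble_and_gate`, `evaluate`, `LABEL_LEN`) is hypothetical.

```python
import hashlib
import random
import secrets

LABEL_LEN = 16          # bytes per wire label (assumed size)
TAG = bytes(LABEL_LEN)  # all-zero padding that marks a correct decryption


def _pad(key_a: bytes, key_b: bytes) -> bytes:
    # SHA-256 stands in for the symmetric cipher; it yields a 32-byte pad.
    return hashlib.sha256(key_a + key_b).digest()


def _xor(x: bytes, y: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(x, y))


def garble_and_gate():
    """Return random wire labels and a shuffled 4-row garbled table for AND."""
    labels = {
        wire: (secrets.token_bytes(LABEL_LEN), secrets.token_bytes(LABEL_LEN))
        for wire in ("a", "b", "out")
    }
    table = []
    for bit_a in (0, 1):
        for bit_b in (0, 1):
            out_label = labels["out"][bit_a & bit_b]
            pad = _pad(labels["a"][bit_a], labels["b"][bit_b])
            table.append(_xor(pad, out_label + TAG))
    # Shuffle so that row position does not reveal the input bits.
    random.SystemRandom().shuffle(table)
    return labels, table


def evaluate(table, label_a: bytes, label_b: bytes) -> bytes:
    """Recover the output label from one label per input wire."""
    pad = _pad(label_a, label_b)
    for row in table:
        plain = _xor(pad, row)
        if plain[LABEL_LEN:] == TAG:
            return plain[:LABEL_LEN]
    raise ValueError("no row decrypted correctly")


labels, table = garble_and_gate()
result = evaluate(table, labels["a"][1], labels["b"][1])
print(result == labels["out"][1])
```

The evaluator learns only one output label and never the underlying bits, which is why each gate costs several cipher calls and why the workload suits dedicated hardware.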

  • A Study of Physical Design Guidelines in ThruChip Inductive Coupling Channel

    Li-Chung HSU  Junichiro KADOMOTO  So HASEGAWA  Atsutake KOSUGE  Yasuhiro TAKE  Tadahiro KURODA  

     
    PAPER-Physical Level Design

      Vol:
    E98-A No:12
      Page(s):
    2584-2591

    ThruChip interface (TCI) is an emerging wireless interface in three-dimensional (3-D) integrated circuit (IC) technology. However, the TCI physical design guidelines remain unclear. In this paper, a ThruChip test chip is designed and fabricated to explore design guidelines. Three inductive coupling interface physical design scenarios (baseline, power mesh, and dummy metal fill) are deployed in the test chip. In the baseline scenario, the test chip measurement results show that thinning the chip or enlarging the coil dimensions can further reduce TCI power. The power mesh scenario shows that the eddy current on the power mesh can dramatically reduce the magnetic pulse signal and thus possibly cause TCI to fail. A power mesh splitting method is proposed to effectively suppress the eddy current impact while minimizing the impact on the power mesh structure. The simulation results show that the proposed method can recover 77% of the coupling coefficient loss while introducing only an additional 0.5% IR drop. In the dummy metal fill scenario, dummy metal fill enclosed within TCI coils has no impact on TCI transmission and can thus be ignored.

  • Analysis and Evaluation of Electromagnetic Interference between ThruChip Interface and LC-VCO

    Junichiro KADOMOTO  So HASEGAWA  Yusuke KIUCHI  Atsutake KOSUGE  Tadahiro KURODA  

     
    BRIEF PAPER

      Vol:
    E99-C No:6
      Page(s):
    659-662

    This paper presents an analysis and a simple design guideline for a ThruChip Interface (TCI) located near an LC-VCO, which is used in high-speed SoCs. The electromagnetic interference (EMI) from TCI channels to the LC-VCO is analyzed and evaluated. The accuracy of the analysis and design guidelines is verified through test-chip measurements.