Author Search Result

[Author] Tomoki NAKAMURA (3 hits)

  • A Principal Factor of Performance in Decoupled Front-End

    Yuya DEGAWA  Toru KOIZUMI  Tomoki NAKAMURA  Ryota SHIOYA  Junichiro KADOMOTO  Hidetsugu IRIE  Shuichi SAKAI  

     
    PAPER

      Publicized: 2023/06/30
      Vol: E106-D No:12
      Page(s): 1960-1968

    One of the performance bottlenecks of a processor is the front-end that supplies instructions. Various techniques, such as cache replacement algorithms and hardware prefetching, have been investigated to facilitate smooth instruction supply at the front-end and to improve processor performance. In these approaches, one of the most important factors has been the reduction in the number of instruction cache misses. Previous studies have used the number of instruction cache misses, or factors derived from it, to explain the performance improvements achieved by their proposed methods. However, we found that the number of instruction cache misses does not always explain performance changes well in modern processors. This is because the front-end in modern processors handles subsequent instruction cache misses in overlap with earlier ones. Based on this observation, we propose a novel factor: the number of miss regions. We define a region as a sequence of instructions from one branch misprediction to the next, and a miss region as a region that contains one or more instruction cache misses. At the boundary of each region, the pipeline is flushed owing to a branch misprediction. Thus, cache misses after this boundary are not handled in overlap with cache misses before the boundary. As a result, the number of miss regions is equal to the number of cache misses that are processed without overlap. In this paper, we demonstrate through mathematical models and simulation results that the number of miss regions explains the variation in performance well. The results show that the model explains cycles per instruction with an average error of 1.0% and a maximum error of 4.1% when applying an existing prefetcher to the instruction cache. The idea of miss regions highlights that instruction cache misses and branch mispredictions interact with each other in processors with a decoupled front-end. We hope that considering this interaction will motivate the development of fast performance estimation methods and new microarchitectural methods.
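The definition of a miss region given in the abstract above can be illustrated with a small counting sketch. The trace format here is hypothetical (a list of event strings in program order, with "miss" for an instruction cache miss and "mispredict" for a branch misprediction); it is not the simulator or the performance model used in the paper, only a direct reading of the two definitions.

```python
from typing import Iterable, Tuple


def count_miss_regions(events: Iterable[str]) -> Tuple[int, int]:
    """Return (instruction cache misses, miss regions) for a trace.

    Assumed (hypothetical) trace format: each event is "miss",
    "mispredict", or any other string, which is ignored.
    A branch misprediction closes the current region.
    """
    misses = 0
    miss_regions = 0
    region_has_miss = False
    for event in events:
        if event == "miss":
            misses += 1
            # Only the first miss in a region makes it a miss region;
            # later misses in the same region are assumed to overlap.
            if not region_has_miss:
                miss_regions += 1
                region_has_miss = True
        elif event == "mispredict":
            # Pipeline flush: the next miss starts a new region.
            region_has_miss = False
    return misses, miss_regions


trace = ["miss", "miss", "mispredict", "other", "mispredict",
         "miss", "other", "miss", "miss"]
total_misses, regions = count_miss_regions(trace)
print(total_misses, regions)  # prints: 5 2
```

In this toy trace, five misses collapse into two miss regions, which is the kind of gap between the two counts that the abstract argues matters for performance.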

  • Wafer-Level Characteristic Variation Modeling Considering Systematic Discontinuous Effects

    Takuma NAGAO  Tomoki NAKAMURA  Masuo KAJIYAMA  Makoto EIKI  Michiko INOUE  Michihiro SHINTANI  

     
    PAPER

      Publicized: 2023/07/19
      Vol: E107-A No:1
      Page(s): 96-104

    Statistical wafer-level characteristic variation modeling offers an attractive method for reducing the measurement cost in large-scale integrated (LSI) circuit testing while maintaining test quality. In this method, the performance of unmeasured LSI circuits fabricated on a wafer is statistically predicted based on a few measured LSI circuits. Conventional statistical methods model spatially smooth variations in the wafers. However, actual wafers can exhibit discontinuous variations that are systematically caused by the manufacturing environment, such as shot dependence. In this paper, we propose a modeling method that considers discontinuous variations in wafer characteristics by applying the knowledge of manufacturing engineers to a model estimated using Gaussian process regression. In the proposed method, the process variation is decomposed into systematic discontinuous and global components to improve estimation accuracy. An evaluation performed using an industrial production test dataset indicates that the proposed method effectively reduces the estimation error for an entire wafer by over 36% compared with conventional methods.
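One way to picture the decomposition into a global component and a systematic discontinuous component is a Gaussian process whose kernel is the sum of a smooth spatial kernel and a kernel that only correlates dies sharing the same intra-shot position. The sketch below assumes exactly that; the additive kernel, the 1-D "wafer", the two-position shot pattern, and all names and parameter values are illustrative assumptions, not the model or data from the paper.

```python
import math


def rbf(a, b, length=3.0):
    """Smooth kernel for the global (spatially continuous) component."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def same_shot(g1, g2, scale=1.0):
    """Kernel correlating only dies at the same intra-shot position."""
    return scale if g1 == g2 else 0.0


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def gp_fit_predict(train, test, use_shot_kernel, noise=0.01):
    """GP posterior mean. train: (x, g, y) tuples; test: (x, g) tuples."""
    def k(p, q):
        v = rbf(p[0], q[0])
        if use_shot_kernel:
            v += same_shot(p[1], q[1])
        return v

    mean = sum(y for _, _, y in train) / len(train)
    K = [[k(p, q) + (noise if i == j else 0.0)
          for j, q in enumerate(train)] for i, p in enumerate(train)]
    alpha = solve(K, [y - mean for _, _, y in train])
    return [mean + sum(k(t, p) * a for p, a in zip(train, alpha))
            for t in test]


def true_value(x, g):
    """Synthetic characteristic: smooth trend plus a shot-dependent step."""
    return math.sin(x / 5.0) + (0.5 if g == 0 else -0.5)


# 40 dies on a 1-D "wafer"; g is a made-up intra-shot position index.
dies = [(float(i), i % 2) for i in range(40)]
train = [(x, g, true_value(x, g)) for i, (x, g) in enumerate(dies) if i % 3 == 0]
test = [(x, g) for i, (x, g) in enumerate(dies) if i % 3 != 0]


def rmse(pred):
    err = [(p - true_value(x, g)) ** 2 for p, (x, g) in zip(pred, test)]
    return math.sqrt(sum(err) / len(err))


rmse_smooth = rmse(gp_fit_predict(train, test, use_shot_kernel=False))
rmse_decomposed = rmse(gp_fit_predict(train, test, use_shot_kernel=True))
print(rmse_smooth, rmse_decomposed)
```

On this synthetic data the smooth-only model has to explain the shot-dependent step as a spatial oscillation, so its error on unmeasured dies should be noticeably larger than that of the decomposed model. The numbers say nothing about the 36% figure reported in the abstract.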

  • Consensus-Based Distributed Exp3 Policy Over Directed Time-Varying Networks (Open Access)

    Tomoki NAKAMURA  Naoki HAYASHI  Masahiro INUIGUCHI  

     
    PAPER

      Publicized: 2023/10/16
      Vol: E107-A No:5
      Page(s): 799-805

    In this paper, we consider distributed decision-making over directed time-varying multi-agent systems. We consider an adversarial bandit problem in which a group of agents chooses an option from among multiple arms to maximize the total reward. In the proposed method, each agent cooperatively searches for the optimal arm with the highest reward using a consensus-based distributed Exp3 policy. To this end, each agent exchanges its estimate of the reward of each arm and its weight for exploitation with nearby agents on the network. To unify the explored information about the arms, each agent mixes the estimates and weights of nearby agents with its own values using consensus dynamics. Then, each agent updates its probability distribution over the arms by combining the Hedge algorithm with uniform search. We show that sublinearity of the pseudo-regret can be achieved by appropriately setting the parameters of the distributed Exp3 policy.
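The loop described in the abstract (mix with neighbors, then sample from a Hedge distribution blended with uniform exploration) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the update rule from the paper: each agent keeps a single vector `z` of cumulative importance-weighted reward estimates (standing in for both the exchanged estimates and weights), the mixing matrices are hand-picked row-stochastic matrices for two alternating directed graphs, and `eta`, `gamma`, and the reward function are arbitrary illustrative choices.

```python
import math
import random


def softmax(z):
    """Numerically stable exponential weights (Hedge distribution)."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def arm_probabilities(z, gamma):
    """Hedge distribution mixed with uniform exploration."""
    k = len(z)
    return [(1.0 - gamma) * q + gamma / k for q in softmax(z)]


def run_distributed_exp3(reward_fn, mixing, n_agents, n_arms, rounds,
                         eta=0.05, gamma=0.1, seed=0):
    """Return each agent's final arm distribution.

    mixing: list of row-stochastic matrices, cycled over time to mimic a
    time-varying directed network (row i holds agent i's in-weights).
    reward_fn(t, agent, arm) must return a reward in [0, 1].
    """
    rng = random.Random(seed)
    z = [[0.0] * n_arms for _ in range(n_agents)]
    for t in range(rounds):
        A = mixing[t % len(mixing)]
        # Consensus step: each agent averages the vectors it receives.
        mixed = [[sum(A[i][j] * z[j][a] for j in range(n_agents))
                  for a in range(n_arms)] for i in range(n_agents)]
        for i in range(n_agents):
            p = arm_probabilities(mixed[i], gamma)
            arm = rng.choices(range(n_arms), weights=p)[0]
            reward = reward_fn(t, i, arm)
            # Importance-weighted estimate for the pulled arm only.
            mixed[i][arm] += eta * reward / p[arm]
        z = mixed
    return [arm_probabilities(zi, gamma) for zi in z]


# Two directed graphs whose union is strongly connected (assumed example).
mixing = [
    [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]],
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.0, 0.5]],
]
final_probs = run_distributed_exp3(
    reward_fn=lambda t, agent, arm: 1.0 if arm == 2 else 0.0,
    mixing=mixing, n_agents=3, n_arms=3, rounds=300)
for p in final_probs:
    print([round(v, 3) for v in p])
```

With this fixed reward function every agent should end up concentrating its probability on arm 2 while keeping at least `gamma / n_arms` on each arm. The sketch does not measure pseudo-regret and is not evidence for the sublinearity result stated in the abstract.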