The search functionality is under construction.
The search functionality is under construction.

Author Search Result

[Author] Ushio JIMBO(3hit)

1-3hit
  • Applying Razor Flip-Flops to SRAM Read Circuits

    Ushio JIMBO  Junji YAMADA  Ryota SHIOYA  Masahiro GOSHIMA  

     
    PAPER

      Vol:
    E100-C No:3
      Page(s):
    245-258

    Timing fault detection techniques address the problems caused by increased variations on a chip, especially with dynamic voltage and frequency scaling (DVFS). The Razor flip-flop (FF) is a timing fault detection technique that employs double sampling by the main and shadow FFs. In order for the Razor FF to correctly detect a timing fault, not the main FF but the shadow FF must sample the correct value. The application of Razor FFs to static logic relaxes the timing constraints; however, the naive application of Razor FFs to dynamic precharged logic such as SRAM read circuits is not effective. This is because the SRAM precharge cannot start before the shadow FF samples the value; otherwise, the transition of the bitline of the SRAM stops and the value sampled by the shadow FF will be incorrect. Therefore, the detect period cannot overlap the precharge period. This paper proposes a novel application of Razor FFs to SRAM read circuits. Our proposal employs a conditional precharge according to the value of a bitline sampled by the main FF. This enables the detect period to overlap the precharge period, thereby relaxing the timing constraints. The additional circuit required by this method is simple and only needed around the sense amplifier, and there is no need for a clock delayed from the system clock. Consequently, the area overhead of the proposed circuit is negligible. This paper presents SPICE simulations of the proposed circuit. Our proposal reduces the minimum cycle time by 51.5% at a supply voltage of 1.1 V and the minimum voltage by 31.8% at cycle time of 412.5 ps.

  • Design of a Register Cache System with an Open Source Process Design Kit for 45nm Technology

    Junji YAMADA  Ushio JIMBO  Ryota SHIOYA  Masahiro GOSHIMA  Shuichi SAKAI  

     
    PAPER

      Vol:
    E100-C No:3
      Page(s):
    232-244

    An 8-issue superscalar core generally requires a 24-port RAM for the register file. The area and energy consumption of a multiported RAM increase in proportional to the square of the number of ports. A register cache can reduce the area and energy consumption of the register file. However, earlier register cache systems suffer from lower IPC caused by register cache misses. Thus, we proposed the Non-Latency-Oriented Register Cache System (NORCS) to solve the IPC problem with a modified pipeline. We evaluated NORCS mainly from the viewpoint of microarchitecture in the original article, and showed that NORCS maintains almost the same IPC as conventional register files. Researchers in NVIDIA adopted the same idea for their GPUs. However, the evaluation was not sufficient from the viewpoint of LSI design. In the original article, we used CACTI to evaluate the area and energy consumption. CACTI is a design space exploration tool for cache design, and adopts some rough approximations. Therefore, this paper shows design of NORCS with FreePDK45, an open source process design kit for 45nm technology. We performed manual layout of the memory cells and arrays of NORCS, and executed SPICE simulation with RC parasitics extracted from the layout. The results show that, from a full-port register file, an 8-entry NORCS achieves a 75.2% and 48.2% reduction in area and energy consumption, respectively. The results also include the latency which we did not present in our original article. The latencies of critical path is 307ps and 318ps for an 8-entry NORCS and a conventional multiported register file, respectively, when the same two cycles are allocated to register file read.

  • Skewed Multistaged Multibanked Register File for Area and Energy Efficiency

    Junji YAMADA  Ushio JIMBO  Ryota SHIOYA  Masahiro GOSHIMA  Shuichi SAKAI  

     
    PAPER-Computer System

      Pubricized:
    2017/01/11
      Vol:
    E100-D No:4
      Page(s):
    822-837

    The region that includes the register file is a hot spot in high-performance cores that limits the clock frequency. Although multibanking drastically reduces the area and energy consumption of the register files of superscalar processor cores, it suffers from low IPC due to bank conflicts. Our skewed multistaging drastically reduces not the bank conflict probability but the pipeline disturbance probability by the second stage. The evaluation results show that, compared with NORCS, which is the latest research on a register file for area and energy efficiency, a proposed register file with 18 banks achieves a 39.9% and 66.4% reduction in circuit area and in energy consumption, while maintaining a relative IPC of 97.5%.