The search functionality is under construction.

Keyword Search Result

[Keyword] register renaming (4 hits)

1-4 of 4 hits
  • Improvement of Renamed Trace Cache through the Reduction of Dependent Path Length for High Energy Efficiency

    Ryota SHIOYA  Hideki ANDO  

     
    PAPER-Computer System

      Publicized:
    2015/12/04
      Vol:
    E99-D No:3
      Page(s):
    630-640

    Out-of-order superscalar processors rename register numbers to remove false dependencies between instructions. The renaming logic is a high-cost module in a superscalar processor and consumes considerable energy. A renamed trace cache (RTC) was proposed to reduce the energy consumption of the renaming logic. An RTC caches and reuses renamed operands, so register renaming can be omitted on RTC hits. However, conventional RTCs suffer from several problems in performance, energy consumption, and hardware overhead. We propose a semi-global renamed trace cache (SGRTC) that caches only those renamed operands that are a short distance from producers outside traces, which solves the problems of conventional RTCs. Evaluation results show that the SGRTC achieves 64% lower energy consumption for renaming with a 0.2% performance overhead compared to a conventional processor.

  • Physical Register Sharing through Value Similarity Detection

    In Pyo HONG  Ha Young JEONG  Yong Surk LEE  

     
    LETTER-Computer Systems

      Vol:
    E89-D No:10
      Page(s):
    2678-2681

    Modern processors have large instruction windows to improve performance. They usually adopt register renaming, in which every active instruction with a valid destination needs a physical register. As instruction windows get larger, however, bigger physical register files are required. To solve this problem, we propose a physical register sharing technique that shares a physical register among multiple instructions based on value similarity. As a result, we achieve a performance improvement without increasing the size of the physical register file. In addition, the proposed technique can also be used to reduce the timing, complexity, and area overhead of the physical register file.

  • A Low-Cost Recovery Mechanism for Processors with Large Instruction Windows

    In Pyo HONG  Byung In MOON  Yong Surk LEE  

     
    LETTER-VLSI Systems

      Vol:
    E89-D No:6
      Page(s):
    1967-1970

    The latest processors employ large instruction windows and longer pipelines to achieve higher performance. Although current branch predictors show high accuracy, the misprediction penalty grows in proportion to the number of pipeline stages and the pipeline width. This negative effect also occurs in the case of exceptions or interrupts. Therefore, it is important to recover the processor state quickly and restart processing immediately. In this letter, we propose a low-cost recovery mechanism for processors with large instruction windows.

  • Dynamic Fast Issue (DFI) Mechanism for Dynamic Scheduled Processors

    Abderazek BEN ABDALLAH  Mudar SAREM  Masahiro SOWA  

     
    PAPER-VLSI Architecture

      Vol:
    E83-A No:12
      Page(s):
    2417-2425

    Superscalar processors can achieve increased performance by issuing instructions out of order (OoO) with respect to the original instruction stream. Implementing an OoO instruction scheme requires a hardware mechanism to prevent incorrectly executed instructions from updating register values. In addition, performance decreases if data dependencies, a branch, or a trap appear among instructions. To this end, we propose a new mechanism, named Dynamic Fast Issue (DFI), to issue instructions in an OoO fashion to multiple parallel functional units without considerable hardware complexity. The above system, which will be implemented in our Superscalar Functional Assignments Register Microprocessor (FARM), solves data dependencies and supports precise interrupts and branch prediction, which are the main problems associated with the dynamic scheduling of instructions in superscalar machines. Results are written only once (write-once), directly into the register file (RF). To ensure that results are written in order to their appropriate output registers, a record of instruction order and state is maintained by a status buffer (STB). A 64-entry integrated register file is implemented to hold both renamed and logical registers. To recover the processor state from an interrupt or a branch misprediction, the STB and a recovery list table (RLT) are implemented. Novel aspects of the above system architecture, as well as the principle underlying this process and the constraints that must be met, are presented. Performance is evaluated using a full-pipeline-level architectural simulator and the SPECint95 benchmark programs.
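
All four abstracts above rely on register renaming, in which each instruction's destination receives a fresh physical register so that false (write-after-read and write-after-write) dependencies disappear while true dependencies are preserved. The following is a minimal generic sketch of that idea only; it does not reproduce the RTC, SGRTC, register sharing, recovery, or DFI mechanisms of any listed paper. The function name `rename`, the `(dest, src1, src2)` instruction format, and the register counts are assumptions made for illustration, and the sketch never reclaims physical registers.

```python
from collections import deque


def rename(instructions, num_logical=4, num_physical=16):
    """Rename (dest, src1, src2) logical-register triples to physical registers.

    Hypothetical sketch: physical registers are never freed, so the free
    list must be large enough for the whole instruction sequence.
    """
    # Initially, logical register i maps to physical register i.
    map_table = {i: i for i in range(num_logical)}
    # Remaining physical registers are available for allocation.
    free_list = deque(range(num_logical, num_physical))

    renamed = []
    for dest, src1, src2 in instructions:
        # Sources read the current mapping, preserving true dependencies.
        p_src1 = map_table[src1]
        p_src2 = map_table[src2]
        # The destination gets a fresh physical register, which removes
        # write-after-read and write-after-write dependencies.
        p_dest = free_list.popleft()
        map_table[dest] = p_dest
        renamed.append((p_dest, p_src1, p_src2))
    return renamed


program = [
    (1, 2, 3),  # r1 = r2 op r3
    (0, 1, 2),  # r0 = r1 op r2  (true dependency on r1)
    (1, 3, 3),  # r1 = r3 op r3  (reuses r1: false dependencies on the two above)
]

for renamed_instruction in rename(program):
    print(renamed_instruction)
```

In this sketch the third instruction writes a different physical register than the first, so it no longer has to wait for the second instruction to read the old value of `r1`.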