IEICE global.ieice.org Site

Author Search Result

[Author] Ki ANDO(24hit)

21-24hit(24hit)

MLP-Aware Dynamic Instruction Window Resizing in Superscalar Processors for Adaptively Exploiting Available Parallelism
Yuya KORA Kyohei YAMAGUCHI Hideki ANDO

PAPER-Computer System

Pubricized:
2014/09/22
Vol:
E97-D No:12
Page(s):
3110-3123
Single-thread performance has not improved much over the past few years, despite an ever increasing transistor budget. One of the reasons for this is that there is a speed gap between the processor and main memory, known as the memory wall. A promising method to overcome this memory wall is aggressive out-of-order execution by extensively enlarging the instruction window resources to exploit memory-level parallelism (MLP). However, simply enlarging the window resources lengthens the clock cycle time. Although pipelining the resources solves this problem, it in turn prevents instruction-level parallelism (ILP) from being exploited because issuing instructions requires multiple clock cycles. This paper proposed a dynamic scheme that adaptively resizes the instruction window based on the predicted available parallelism, either ILP or MLP. Specifically, if the scheme predicts that MLP is available during execution, the instruction window is enlarged and the window resources are pipelined, thereby exploiting MLP. Conversely, if the scheme predicts that less MLP is available, that is, ILP is exploitable for improved performance, the instruction window is shrunk and the window resources are de-pipelined, thereby exploiting ILP. Our evaluation results using the SPEC2006 benchmark programs show that the proposed scheme achieves nearly the best performance possible with fixed-size resources. On average, our scheme realizes a performance improvement of 21% over that of a conventional processor, with additional cost of only 6% of the area of the conventional processor core or 3% of that of the entire processor chip. The evaluation results also show 8% better energy efficiency in terms of 1/EDP (energy-delay product).
Reducing Energy Consumption of Wakeup Logic through Double-Stage Tag Comparison
Yasutaka MATSUDA Ryota SHIOYA Hideki ANDO

PAPER-Computer System

Pubricized:
2021/11/02
Vol:
E105-D No:2
Page(s):
320-332
The high energy consumption of current processors causes several problems, including a limited clock frequency, short battery lifetime, and reduced device reliability. It is therefore important to reduce the energy consumption of the processor. Among resources in a processor, the issue queue (IQ) is a large consumer of energy, much of which is consumed by the wakeup logic. Within the wakeup logic, the tag comparison that checks source operand readiness consumes a significant amount of energy. This paper proposes an energy reduction scheme for tag comparison, called double-stage tag comparison. This scheme first compares the lower bits of the tag and then, only if these match, compares the higher bits. Because the energy consumption of tag comparison is roughly proportional to the total number of bits compared, energy is saved by reducing this number. However, this sequential comparison increases the delay of the IQ, thereby increasing the clock cycle time. Although this can be avoided by allocating an extra cycle to the issue operation, this in turn degrades the IPC. To avoid IPC degradation, we reconfigure a small number of entries in the IQ, where several oldest instructions that are likely to have an adverse effect on performance reside, to a single stage for tag comparison. Our evaluation results for SPEC2017 benchmark programs show that the double-stage tag comparison achieves on average a 21% reduction in the energy consumed by the wakeup logic (15% when including the overhead) with only 3.0% performance degradation.
A Nearly Perfect Total-Field/Scattered-Field Boundary for the One-Dimensional CIP Method
Yoshiaki ANDO Hiroyuki SAITO Masashi HAYAKAWA

PAPER-Electromagnetic Theory

Vol:
E91-C No:10
Page(s):
1677-1683
A total-field/scattered-field (TF/SF) boundary which is commonly used in the finite-difference time-domain (FDTD) method to illuminate scatterers by plane waves, is developed for use in the constrained interpolation profile (CIP) method. By taking the numerical dispersion into account, the nearly perfect TF/SF boundary can be achieved, which allows us to calculate incident fields containing high frequency components without fictitious scattered fields. First of all, we formulate the TF/SF boundary in the CIP scheme. The numerical dispersion relation is then reviewed. Finally the numerical dispersion is implemented in the TF/SF boundary to estimate deformed incident fields. The performance of the nearly perfect TF/SF boundary is examined by measuring leaked fields in the SF region, and the proposed method drastically diminish the leakage compared with the simple TF/SF boundary.
Automatic Communication Synthesis with Hardware Sharing for Multi-Processor SoC Design
Yuki ANDO Seiya SHIBATA Shinya HONDA Hiroyuki TOMIYAMA Hiroaki TAKADA

PAPER-High-Level Synthesis and System-Level Design

Vol:
E93-A No:12
Page(s):
2509-2516
We present a hardware sharing method for design space exploration of multi-processor embedded systems. In our prior work, we had developed a system-level design tool named SystemBuilder which automatically synthesizes target implementation of a system from a functional description. In this work, we have extended SystemBuilder so that it can automatically synthesize an area-efficient implementation which shares a hardware module among different applications. With SystemBuilder, designers only need to enable an option in order to share a hardware module. The designers, therefore, can easily explore a design space including hardware sharing in short time. A case study shows the effectiveness of the hardware sharing on design space exploration.

21-24hit(24hit)

Author Search Result

[Author] Ki ANDO(24hit)

MLP-Aware Dynamic Instruction Window Resizing in Superscalar Processors for Adaptively Exploiting Available Parallelism

Reducing Energy Consumption of Wakeup Logic through Double-Stage Tag Comparison

A Nearly Perfect Total-Field/Scattered-Field Boundary for the One-Dimensional CIP Method

Automatic Communication Synthesis with Hardware Sharing for Multi-Processor SoC Design

Latest Issue

Links

Call for Papers

Submit to IEICE Trans.

Transactions NEWS

Popular articles