Author Search Result

[Author] Vasily G. MOSHNYAGA (13 hits)

1-13hit
  • A Novel Computationally Adaptive Hardware Algorithm for Video Motion Estimation

    Vasily G. MOSHNYAGA  

     
    PAPER-Imaging Circuits and Algorithms

    Vol: E82-C  No: 9  Page(s): 1749-1754

    A new hardware algorithm for block-matching video motion estimation is presented. The algorithm works in full-search fashion but, unlike the Full-Search Block Matching Algorithm (FSBMA), it dynamically adjusts the number of computations to variable picture content. Owing to an incorporated mechanism of data-driven thresholding, the proposed algorithm performs four times fewer operations than the FSBMA while maintaining the same quality of results. Its hardware implementation is simple and compact. A supporting hardware design and simulation results on benchmarks are outlined.
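
    As a rough illustration of how a full search can adapt its workload to the data, the sketch below abandons a candidate displacement as soon as its partial error reaches the best error found so far. This is a common early-termination technique chosen for illustration; the abstract does not specify the paper's thresholding mechanism, so the function names, the per-row check, and the synthetic frames are all assumptions.

```python
def sad_with_cutoff(cur, ref, bx, by, dx, dy, n, cutoff):
    """Sum of absolute differences (SAD) for one candidate displacement.

    The running sum is compared with `cutoff` after each row; once it
    reaches the cutoff the candidate cannot win, so the rest is skipped.
    Returns (sad_so_far, pixel_operations_performed).
    """
    total = 0
    ops = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
            ops += 1
        if total >= cutoff:
            break
    return total, ops


def full_search(cur, ref, bx, by, n, r, use_cutoff=True):
    """Visit every displacement in [-r, r] x [-r, r] for an n x n block.

    Returns (best_vector, best_sad, total_pixel_operations).
    """
    height, width = len(ref), len(ref[0])
    best_sad = float("inf")
    best_vec = (0, 0)
    total_ops = 0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or by + dy + n > height:
                continue
            if bx + dx < 0 or bx + dx + n > width:
                continue
            cutoff = best_sad if use_cutoff else float("inf")
            sad, ops = sad_with_cutoff(cur, ref, bx, by, dx, dy, n, cutoff)
            total_ops += ops
            if sad < best_sad:
                best_sad, best_vec = sad, (dx, dy)
    return best_vec, best_sad, total_ops


def pixel(x, y):
    """Hypothetical synthetic image content used only for this demo."""
    return (7 * x + 13 * y) % 31


SIZE = 8
ref = [[pixel(x, y) for x in range(SIZE)] for y in range(SIZE)]
# The current frame is the reference content shifted by one pixel in x.
cur = [[pixel(x + 1, y) for x in range(SIZE)] for y in range(SIZE)]

exhaustive = full_search(cur, ref, bx=2, by=2, n=4, r=2, use_cutoff=False)
adaptive = full_search(cur, ref, bx=2, by=2, n=4, r=2, use_cutoff=True)
print("exhaustive:", exhaustive)
print("adaptive:  ", adaptive)
```

    Because an abandoned candidate already has an error at least as large as the current best, both variants return the same vector and error; only the operation count differs.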

  • Multiplier Energy Reduction by Dynamic Voltage Variation

    Vasily G. MOSHNYAGA  Tomoyuki YAMANAKA  

     
    PAPER-VLSI Circuit

    Vol: E88-A  No: 12  Page(s): 3548-3553

    Design of portable battery-operated multimedia devices requires energy-efficient multiplication circuits. This paper proposes a novel architectural technique to reduce the power consumption of digital multipliers. Unlike related approaches, which focus on reducing multiplier transition activity, we concentrate on dynamic reduction of the supply voltage. Two implementation schemes capable of dynamically adjusting a double voltage supply to input data variation are presented. Simulations show that these schemes reduce the energy consumption of a 16×16-bit multiplier by 34% and 29% at peak and by 10% and 7% on average, with area overheads of 15% and 4%, respectively, while maintaining the performance of a traditional multiplier.

  • Instruction Encoding for Reducing Power Consumption of I-ROMs Based on Execution Locality

    Koji INOUE  Vasily G. MOSHNYAGA  Kazuaki MURAKAMI  

     
    PAPER

    Vol: E86-A  No: 4  Page(s): 799-805

    In this paper, we propose an instruction encoding scheme to reduce the power consumption of instruction ROMs. The power consumption of an instruction ROM depends strongly on the switching activity of its bit-lines because of their large load capacitance. In our approach, the binary patterns assigned as op-codes are determined by instruction frequency in order to reduce the number of bit-line discharges. Simulation results show that our approach reduces bit-line switching by 40% relative to a conventional organization.
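
    The idea of frequency-driven op-code assignment can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it assumes that reading a 1 bit discharges a bit-line (the real polarity depends on the ROM design), and the instruction names and frequency counts are hypothetical.

```python
def popcount(value):
    """Number of 1 bits in a non-negative integer."""
    return bin(value).count("1")


def assign_opcodes(freq, width):
    """Give the most frequent instructions the op-codes with the fewest 1 bits.

    `freq` maps instruction name -> execution count (hypothetical profile).
    `width` is the op-code width in bits.
    """
    if len(freq) > 2 ** width:
        raise ValueError("op-code field too narrow for the instruction set")
    # Cheapest codes first: fewest 1 bits, ties broken by numeric value.
    codes = sorted(range(2 ** width), key=lambda c: (popcount(c), c))
    # Most frequent instructions first, ties broken by name.
    ranked = sorted(freq, key=lambda name: (-freq[name], name))
    return dict(zip(ranked, codes))


def mean_discharges(freq, encoding):
    """Average number of discharging op-code bits per executed instruction."""
    total = sum(freq.values())
    return sum(freq[i] * popcount(encoding[i]) for i in freq) / total


# Hypothetical execution profile for a four-instruction machine.
profile = {"load": 40, "add": 30, "store": 20, "branch": 10}
tuned = assign_opcodes(profile, width=2)
# A deliberately poor baseline that gives the hottest instruction the costliest code.
baseline = {"load": 0b11, "add": 0b10, "store": 0b01, "branch": 0b00}

print(tuned)
print(mean_discharges(profile, tuned), mean_discharges(profile, baseline))
```

    The comparison against the baseline only shows the direction of the effect on this toy profile; it says nothing about the 40% figure reported in the abstract.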

  • A Floorplan Based Methodology for Data-Path Synthesis of Sub-Micron ASICs

    Vasily G. MOSHNYAGA  Keikichi TAMARU  

     
    PAPER-High-Level Synthesis

    Vol: E79-D  No: 10  Page(s): 1389-1395

    As IC fabrication technology enters the deep sub-micron region with device feature sizes <0.35µm, interconnect becomes the dominant factor in the design of high-speed Application Specific Integrated Circuits (ASICs). This paper proposes a novel methodology for automated data-path synthesis of such circuits and outlines algorithms to support it. In contrast to other approaches, we formulate interconnect area/delay optimizations as high-level synthesis transformations and use them during synthesis to minimize the impact of wiring on circuit characteristics. Experiments with FIR filter implementations show that such a formulation, jointly with "on the fly" module generation and performance-driven floorplanning, provides more than a 30% reduction in wiring delay for deep sub-micron designs.

  • Issue Queue Energy Reduction through Dynamic Voltage Scaling

    Vasily G. MOSHNYAGA  

     
    PAPER-Low-Power Technologies

    Vol: E85-C  No: 2  Page(s): 272-278

    With increased size and issue width, the instruction issue queue has become one of the most energy-consuming units in today's superscalar microprocessors. This paper presents a novel architectural technique to reduce the energy dissipation of an adaptive issue queue, whose functionality is dynamically adjusted at runtime to match the changing computational demands of the instruction stream. In contrast to existing schemes, the technique exploits a new degree of freedom in queue design, namely the voltage per access. Since the loading capacitance operated in the adaptive queue varies over time, the clock cycle budget is used inefficiently. We propose to trade off the unused cycle time against supply voltage, lowering the voltage level when the queue functionality is reduced and raising it when resources in the queue are activated. Experiments show that the approach can save up to 39% of the issue queue energy without large performance or area overhead.

  • Trends in High-Performance, Low-Power Cache Memory Architectures

    Koji INOUE  Vasily G. MOSHNYAGA  Kazuaki MURAKAMI  

     
    PAPER-High-Performance Technologies

    Vol: E85-C  No: 2  Page(s): 304-314

    One of the uncompromising requirements of portable computing is energy efficiency, because it directly affects battery life. On the other hand, portable computing will target more demanding applications, such as moving pictures, so higher performance is still required. Cache memories have been employed as one of the most important components of computer systems. In this paper, we briefly survey architectural techniques for high-performance, low-power cache memories.

  • Reducing Cache Energy Dissipation by Using Dual Voltage Supply

    Vasily G. MOSHNYAGA  Hiroshi TSUJI  

     
    PAPER-Optimization of Power and Timing

    Vol: E84-A  No: 11  Page(s): 2762-2768

    Owing to their large capacitance and enormous access rate, caches dissipate about a third of the total energy consumed by today's processors. In this paper, we present a new architectural technique to reduce energy consumption in caches. Unlike previous approaches, which have focused on lowering cache capacitance and the number of accesses, our method exploits a new degree of freedom in cache design, namely the voltage per access. Since in modern block-buffered caches the loading capacitance operated on a block hit is much less than that operated on a miss, the given clock cycle time is inefficiently utilized during a hit. We propose to trade off this unused time against the supply voltage, lowering the voltage level on a hit and raising it on a miss. Experiments show that the approach can halve the cache energy dissipation without large performance or area overhead.
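
    The trade-off can be illustrated with the standard first-order model in which dynamic switching energy scales as capacitance times the square of the supply voltage. The sketch below is a back-of-the-envelope calculation under that model only; the hit rate, capacitances, and voltages are made-up unitless inputs, not values from the paper.

```python
def dynamic_energy(capacitance, voltage):
    """First-order dynamic switching energy, proportional to C * V^2."""
    return capacitance * voltage ** 2


def average_access_energy(hit_rate, c_hit, c_miss, v_hit, v_miss):
    """Expected energy per access when hits and misses use different voltages."""
    hit_part = hit_rate * dynamic_energy(c_hit, v_hit)
    miss_part = (1.0 - hit_rate) * dynamic_energy(c_miss, v_miss)
    return hit_part + miss_part


# Hypothetical, unitless inputs chosen only to make the arithmetic easy to follow.
HIT_RATE = 0.9
C_HIT, C_MISS = 1.0, 4.0   # a miss switches more capacitance than a block hit
V_HIGH, V_LOW = 2.0, 1.0

single_supply = average_access_energy(HIT_RATE, C_HIT, C_MISS, V_HIGH, V_HIGH)
dual_supply = average_access_energy(HIT_RATE, C_HIT, C_MISS, V_LOW, V_HIGH)

print(single_supply, dual_supply)
```

    The model ignores leakage, voltage-switching overhead, and the delay constraint that determines how far the hit voltage can actually be lowered.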

  • A Language for Designing Module Generators

    Vasily G. MOSHNYAGA  Keikichi TAMARU  Hiroto YASUURA  

     
    PAPER-Hardware Design Languages

    Vol: E76-D  No: 9  Page(s): 1066-1074

    A new applicative design language is proposed for developing generators of data-path modules from hardware algorithms. The language includes a set of primitives that represent placement operations, parameterized cells, and routing patterns, and a set of transformation rules specifying modifications of the module topology without changing its functionality. Using the language, a hardware algorithm designer can easily define both the topological and geometrical specifications of module generation directly at the functional level without becoming engaged in layout details. A sketch of the language and an example of module design with the language are presented.

  • Omitting Cache Look-up for High-Performance, Low-Power Microprocessors

    Koji INOUE  Vasily G. MOSHNYAGA  Kazuaki MURAKAMI  

     
    PAPER-Low-Power Technologies

    Vol: E85-C  No: 2  Page(s): 279-287

    In this paper, we propose a novel architecture for low-power direct-mapped instruction caches, called the "history-based tag-comparison (HBTC) cache." The cache attempts to reuse tag-comparison results to avoid unnecessary tag checks. Execution footprints are recorded in an extended BTB (Branch Target Buffer). In our evaluation, the energy for tag comparison is reduced by more than 90% in many applications.

  • Register-Transfer Module Selection for Sub-Micron ASIC Design

    Vasily G. MOSHNYAGA  Yutaka MORI  Keikichi TAMARU  

     
    LETTER

    Vol: E78-D  No: 3  Page(s): 252-255

    In order to shorten the time-to-market, Application-Specific Integrated Circuits (ASICs) are designed from a library of pre-defined layout implementations for register-transfer modules such as multipliers, adders, RAM, ROM, etc. Current approaches to selecting implementations from the library usually deal with their timing-area estimates and do not consider the delay of the intermodule wiring. However, as sub-micron design rules are utilized for IC fabrication, wiring delay becomes comparable to the functional unit delay and can no longer be ignored even in register-transfer synthesis. In this paper, we propose an algorithm that combines module selection with performance-driven module placement and reduces the impact of wiring on sub-micron ASIC performance. The algorithm not only efficiently exploits multiple module realizations in the design library, but also finds the module placement that minimizes wiring delay. Experimental results on several benchmarks show that, by considering both module and wiring issues, a reduction of more than 30% in total circuit delay can be achieved.

  • FPGA Design of User Monitoring System for Display Power Control

    Tomoaki ANDO  Vasily G. MOSHNYAGA  Koji HASHIMOTO  

     
    PAPER-High-Level Synthesis and System-Level Design

    Vol: E95-A  No: 12  Page(s): 2364-2372

    This paper introduces a new FPGA design of a user-monitoring system for power management of a PC display. From the camera readings, the system detects whether the user is looking at the screen and produces signals to control the display backlight. The system provides over 88% eye detection accuracy at an image processing rate of 8 f/s. We describe a new eye-tracking algorithm and hardware and present the results of their experimental evaluation in a prototype display power management system.

  • Reduction of Background Computations in Block-Matching Motion Estimation

    Vasily G. MOSHNYAGA  Koichi MASUNAGA  

     
    PAPER-Video/Image Coding

    Vol: E87-A  No: 3  Page(s): 539-546

    A new algorithm and architecture to eliminate redundant operations in block-matching (BM) motion estimation are proposed. The key step of this work is to use binary matching to identify image regions with static background content and then exclude these regions from the actual motion estimation. According to experiments, the approach maintains the highest PSNR while performing half the computations of the adaptive BM, or 1/8 of the computations required by the full-search BM. An implementation scheme is outlined.

  • Quantitative Evaluation of State-Preserving Leakage Reduction Algorithm for L1 Data Caches

    Reiko KOMIYA  Koji INOUE  Vasily G. MOSHNYAGA  Kazuaki MURAKAMI  

     
    PAPER

    Vol: E88-A  No: 4  Page(s): 862-868

    As transistor feature sizes and threshold voltages shrink, leakage energy consumption has become an unavoidable issue for high-performance microprocessor designs. Since on-chip caches are major contributors to leakage, a number of researchers have proposed efficient leakage reduction techniques. However, it is still not clear 1) what kinds of algorithms can be considered and 2) how much impact they have on energy and performance. To answer these questions, we explore run-time cache management algorithms and evaluate the energy-performance efficiency of several alternatives.