The search functionality is under construction.

Author Search Result

[Author] Kimiyoshi USAMI(14hit)

1-14hit
  • A Fine-Grained Power Gating Control on Linux Monitoring Power Consumption of Processor Functional Units

    Atsushi KOSHIBA  Motoki WADA  Ryuichi SAKAMOTO  Mikiko SATO  Tsubasa KOSAKA  Kimiyoshi USAMI  Hideharu AMANO  Masaaki KONDO  Hiroshi NAKAMURA  Mitaro NAMIKI  

     
    PAPER

      Vol:
    E98-C No:7
      Page(s):
    559-568

    The authors have been researching on reducing the power consumption of microprocessors, and developed a low-power processor called “Geyser” by applying power gating (PG) function to the individual functional units of the processor. PG function on Geyser reduces the power consumption of functional units by shutting off the power voltage of idle units. However, the energy overhead of switching the supply voltage for units on and off causes power increases. The amount of the energy overhead varies with the behavior of each functional unit which is influenced by running application, and also with the core temperature. It is therefore necessary to switch the PG function itself on or off according to the state of the processor at runtime to reduce power consumption more effectively. In this paper, the authors propose a PG control method to take the power overhead into account by the operating system (OS). In the proposed method, for achieving much power reduction, the OS calculates the power consumption of each functional unit periodically and inhibits the PG function of the unit whose energy overhead is judged too high. The method was implemented in the Linux process scheduler and evaluated. The results show that the average power consumption of the functional units is reduced by up to 17.2%.

  • A Perpetuum Mobile 32bit CPU on 65nm SOTB CMOS Technology with Reverse-Body-Bias Assisted Sleep Mode

    Koichiro ISHIBASHI  Nobuyuki SUGII  Shiro KAMOHARA  Kimiyoshi USAMI  Hideharu AMANO  Kazutoshi KOBAYASHI  Cong-Kha PHAM  

     
    PAPER

      Vol:
    E98-C No:7
      Page(s):
    536-543

    A 32bit CPU, which can operate more than 15 years with 220mAH Li battery, or eternally operate with an energy harvester of in-door light is presented. The CPU was fabricated by using 65nm SOTB CMOS technology (Silicon on Thin Buried oxide) where gate length is 60nm and BOX layer thickness is 10nm. The threshold voltage was designed to be as low as 0.19V so that the CPU operates at over threshold region, even at lower supply voltages down to 0.22V. Large reverse body bias up to -2.5V can be applied to bodies of SOTB devices without increasing gate induced drain leak current to reduce the sleep current of the CPU. It operated at 14MHz and 0.35V with the lowest energy of 13.4 pJ/cycle. The sleep current of 0.14µA at 0.35V with the body bias voltage of -2.5V was obtained. These characteristics are suitable for such new applications as energy harvesting sensor network systems, and long lasting wearable computers.

  • An Energy-Efficient Floorplan Driven High-Level Synthesis Algorithm for Multiple Clock Domains Design

    Shin-ya ABE  Youhua SHI  Kimiyoshi USAMI  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER

      Vol:
    E98-A No:7
      Page(s):
    1376-1391

    In this paper, we first propose an HDR-mcd architecture, which integrates periodically all-in-phase based multiple clock domains and multi-cycle interconnect communication into high-level synthesis. In HDR-mcd, an entire chip is divided into several huddles. Huddles can realize synchronization between different clock domains in which interconnection delay should be considered during high-level synthesis. Next, we propose a high-level synthesis algorithm for HDR-mcd, which can reduce energy consumption by optimizing configuration and placement of huddles. Experimental results show that the proposed method achieves 32.5% energy-saving compared with the existing single clock domain based methods.

  • Dynamic Sleep Control for Finite-State-Machines to Reduce Active Leakage Power

    Kimiyoshi USAMI  Hiroshi YOSHIOKA  

     
    PAPER-Logic Synthesis

      Vol:
    E87-A No:12
      Page(s):
    3116-3123

    Leakage power is predicted to become dominant in the total operation power as the transistor technology gets advanced. Even in the current technology, dramatic increase of leakage power at elevated temperature is a big problem. Burn-in testing, which is typically performed at 125, is facing at difficulties such as throughput degradation or thermal runaway due to increase of leakage power. Reducing leakage power at operation time is essential to solve these problems. We propose a novel approach to make use of an enable signal of a gated-clock technique for reducing active leakage power. A sleep transistor is provided between combinational logic circuits and the ground, and is controlled by the enable signal. When state transitions do not occur in Finite-State-Machines (FSM's), the enable signal becomes low and the state flip-flops keep the data. At the same time, the sleep transistor is turned off so that combinational logic gates are electrically disconnected from the ground to reduce leakage. Simulation results have shown that the proposed scheme reduces active leakage power by 30-60% in 0.18 µm technology. The total power was reduced by 20% at the maximum at 125. It was also found that performance degradation was tolerable for burn-in testing.

  • Fine-Grained Run-Tume Power Gating through Co-optimization of Circuit, Architecture, and System Software Design Open Access

    Hiroshi NAKAMURA  Weihan WANG  Yuya OHTA  Kimiyoshi USAMI  Hideharu AMANO  Masaaki KONDO  Mitaro NAMIKI  

     
    INVITED PAPER

      Vol:
    E96-C No:4
      Page(s):
    404-412

    Power consumption has recently emerged as a first class design constraint in system LSI designs. Specially, leakage power has occupied a large part of the total power consumption. Therefore, reduction of leakage power is indispensable for efficient design of high-performance system LSIs. Since 2006, we have carried out a research project called “Innovative Power Control for Ultra Low-Power and High-Performance System LSIs”, supported by Japan Science and Technology Agency as a CREST research program. One of the major objectives of this project is reducing the leakage power consumption of system LSIs by innovative power control through tight cooperation and co-optimization of circuit technology, architecture, and system software designs. In this project, we focused on power gating as a circuit technique for reducing leakage power. Temporal granularity is one of the most important issue in power gating. Thus, we have developed a series of Geysers as proof-of-concept CPUs which provide several mechanisms of fine-grained run-time power gating. In this paper, we describe their concept and design, and explain why co-optimization of different design layers are important. Then, three kinds of power gating implementations and their evaluation are presented from the view point of power saving and temporal granularity.

  • FOREWORD

    Kimiyoshi USAMI  

     
    FOREWORD

      Vol:
    E96-A No:12
      Page(s):
    2457-2457
  • Multi-Voltage Variable Pipeline Routers with the Same Clock Frequency for Low-Power Network-on-Chips Systems

    Akram BEN AHMED  Hiroki MATSUTANI  Michihiro KOIBUCHI  Kimiyoshi USAMI  Hideharu AMANO  

     
    PAPER

      Vol:
    E99-C No:8
      Page(s):
    909-917

    In this paper, the Multi-voltage (multi-Vdd) variable pipeline router is proposed to reduce the power consumption of Network-on-Chips (NoCs) designed for Chip Multi-processors (CMPs). The multi-Vdd variable pipeline router adjusts its pipeline depth (i.e., communication latency) and supply voltage level in response to the applied workload. Unlike Dynamic Voltage and Frequency Scaling (DVFS) routers, the operating frequency remains the same for all routers throughout the CMP; thus, omitting the need to synchronize neighboring routers working at different frequencies. Two types of router architectures are presented: a Coarse-Grained Variable Pipeline (CG-VP) router that changes the voltage supplied to the entire router, and a Fine-Grained Variable Pipeline (FG-VP) router that uses a finer power partition. The evaluation results showed that the CG-VP and FG-VP routers achieve a 22.9% and 35.3% power reduction on average with 14% and 23% area overhead in comparison with a baseline router without variable pipelines, respectively. Thanks to the adopted look-ahead mechanism to switch the supply voltage, the performance overhead is only 4.4%.

  • An Operating System Guided Fine-Grained Power Gating Control Based on Runtime Characteristics of Applications

    Atsushi KOSHIBA  Mikiko SATO  Kimiyoshi USAMI  Hideharu AMANO  Ryuichi SAKAMOTO  Masaaki KONDO  Hiroshi NAKAMURA  Mitaro NAMIKI  

     
    PAPER

      Vol:
    E99-C No:8
      Page(s):
    926-935

    Fine-grained power gating (FGPG) is a power-saving technique by switching off circuit blocks while the blocks are idle. Although FGPG can reduce power consumption without compromising computational performance, switching the power supply on and off causes energy overhead. To prevent power increase caused by the energy overhead, in our prior research we proposed an FGPG control method of the operating system(OS) based on pre-analyzing applications' power usage. However, modern computing systems have a wide variety of use cases and run many types of application; this makes it difficult to analyze the behavior of all these applications in advance. This paper therefore proposes a new FGPG control method without profiling application programs in advance. In the new proposed method, the OS monitors a circuit's idle interval periodically while application programs are running. The OS enables FGPG only if the interval time is long enough to reduce the power consumption. The experimental results in this paper show that the proposed method reduces power consumption by 9.8% on average and up to 17.2% at 25°C. The results also show that the proposed method achieves almost the same power-saving efficiency as the previous profile-based method.

  • Selective Multi-Threshold Technique for High-Performance and Low-Standby Applications

    Kimiyoshi USAMI  Naoyuki KAWABE  Masayuki KOIZUMI  Katsuhiro SETA  Toshiyuki FURUSAWA  

     
    PAPER-Optimization of Power and Timing

      Vol:
    E85-A No:12
      Page(s):
    2667-2673

    In portable applications such as W-CDMA cell phones, high performance and low standby leakage are both required. We propose an automated design technique to selectively use multi-threshold CMOS (MTCMOS) in a cell-by-cell fashion. MT cells consisting of low-Vth transistors and high-Vth sleep transistors are newly introduced. MT cells are assigned to critical paths to speed up, while High-Vth cells are assigned to non-critical paths to reduce leakage. Compared to the conventional MTCMOS, the gate delay is not affected by the discharge patterns of other gates because there is no virtual ground to be shared. We applied this technique to a test chip of a DSP core for W-CDMA baseband LSI. The worst path-delay was improved by 14% over the single high-Vth design without increasing standby leakage at 10% area overhead.

  • Floorplan Driven Architecture and High-Level Synthesis Algorithm for Dynamic Multiple Supply Voltages

    Shin-ya ABE  Youhua SHI  Kimiyoshi USAMI  Masao YANAGISAWA  Nozomu TOGAWA  

     
    PAPER-High-Level Synthesis and System-Level Design

      Vol:
    E96-A No:12
      Page(s):
    2597-2611

    In this paper, we propose an adaptive voltage huddle-based distributed-register architecture (AVHDR architecture), which integrates dynamic multiple supply voltages and interconnection delay into high-level synthesis. In AVHDR architecture, voltages can be dynamically assigned for energy reduction. In other words, low supply voltages are assigned to non-critical operations, and leakage power is cut off by turning off the power supply to the sleeping functional units. Next, an AVHDR-based high-level synthesis algorithm is proposed. Our algorithm is based on iterative improvement of scheduling/binding and floorplanning. In the iteration process, the modules in each huddle can be placed close to each other and the corresponding AVHDR architecture can be generated and optimized with floorplanning information. Experimental results show that on average our algorithm achieves 43.9% energy-saving compared with conventional algorithms.

  • Energy Efficient Approximate Storing of Image Data for MTJ Based Non-Volatile Flip-Flops and MRAM

    Yoshinori ONO  Kimiyoshi USAMI  

     
    PAPER

      Pubricized:
    2021/01/06
      Vol:
    E104-C No:7
      Page(s):
    338-349

    A non-volatile memory (NVM) employing MTJ has a lot of strong points such as read/write performance, best endurance and operating-voltage compatibility with standard CMOS. However, it consumes a lot of energy when writing the data. This becomes an obstacle when applying to battery-operated mobile devices. To solve this problem, we propose an approach to augment the capability of the precision scaling technique for the write operation in NVM. Precision scaling is an approximate computing technique to reduce the bit width of data (i.e. precision) for energy reduction. When writing image data to NVM with the precision scaling, the write energy and the image quality are changed according to the write time and the target bit range. We propose an energy-efficient approximate storing scheme for non-volatile flip-flops and a magnetic random-access memory (MRAM) that allows us to write the data by optimizing the bit positions to split the data and the write time for each bit range. By using the statistical model, we obtained optimal values for the write time and the targeted bit range under the trade-off between the write energy reduction and image quality degradation. Simulation results have demonstrated that by using these optimal values the write energy can be reduced up to 50% while maintaining the acceptable image quality. We also investigated the relationship between the input images and the output image quality when using this approach in detail. In addition, we evaluated the energy benefits when applying our approach to nine types of image processing including linear filters and edge detectors. Results showed that the write energy is reduced by further 12.5% at the maximum.

  • Sleep Transistor Sizing Method Using Accurate Delay Estimation Considering Input Vector Pattern and Non-linear Current Model

    Seidai TAKEDA  Kyundong KIM  Hiroshi NAKAMURA  Kimiyoshi USAMI  

     
    PAPER-Physical Level Design

      Vol:
    E94-A No:12
      Page(s):
    2499-2509

    Beyond deep sub-micron era, Power Gating (PG) is one of the most effective techniques to reduce leakage power of circuits. The most important issue of PG circuit design is how to decide the width of sleep transistor. Smaller total sleep transistor width provides smaller leakage power in standby mode, however, insufficient sleep transistor insertion suffers from significant performance degradation. In this paper, we present an accurate and fast gate-level delay estimation method for PG circuits and a novel sleep transistor sizing method utilizing our delay estimation for module-based PG circuits. This method achieves high accuracy within acceptable computation time utilizing accurate discharge current estimation based on delayed logic simulations with limited input vector patterns and by realizing precise current characteristics for logic gates and sleep transistors. Experimental results show that our delay estimation successfully achieves high accuracy and avoids overestimation and underestimation seen in conventional method. Also, our sleep transistor sizing method on average successfully reduces the width of sleep transistors by 40% when compared to conventional methods within an acceptable computation time.

  • Energy-Efficient Standard Cell Memory with Optimized Body-Bias Separation in Silicon-on-Thin-BOX (SOTB)

    Yusuke YOSHIDA  Kimiyoshi USAMI  

     
    PAPER

      Vol:
    E100-A No:12
      Page(s):
    2785-2796

    This paper describes a design of energy-efficient Standard Cell Memory (SCM) using Silicon-on-Thin-BOX (SOTB). We present automatic place and routing (P&R) methodology for optimal body-bias separation (BBS) for SCM, which enables to apply different body bias voltages to latches and to other peripheral circuits within SCM. Capability of SOTB to effectively reduce leakage by body biasing is fully exploited in BBS. Simulation results demonstrated that our approach allows us to design SCM with 40% smaller energy dissipation at the energy minimum voltage as compared to the conventional design flow. For the process and temperature variations, Adaptive Body Bias (ABB) for SCM with our BBS provided 70% smaller leakage energy than ABB for the conventional SCM, while achieving the same clock frequency.

  • Delay Modeling and Critical-Path Delay Calculation for MTCMOS Circuits

    Naoaki OHKUBO  Kimiyoshi USAMI  

     
    PAPER-Simulation and Verification

      Vol:
    E89-A No:12
      Page(s):
    3482-3490

    One of the critical issues in MTCMOS design is how to estimate a circuit delay quickly. In MTCMOS circuit, voltage on virtual ground fluctuates due to a discharge current of a logic cell. This event affects to the cell delay and makes static timing analysis (STA) difficult. In this paper, we propose a delay modeling and static STA methodology targeting at MTCMOS circuits. In the proposed method, we prepare a delay look-up table (LUT) consisting of the input slew, the output load capacitance, the virtual ground length, and a power-switch size. Using this LUT, we compute a circuit delay for each logic cell by applying the linear interpolation. This technique enables to calculate the cell delay considering the delay increase by the voltage fluctuation of virtual ground line. Experimental results show that the proposed methodology enables to estimate the cell delay and the critical path delay within 8% errors compared with SPICE simulation.