The search functionality is under construction.

Author Search Result

[Author] Makoto ISHIKAWA(7hit)

1-7hit
  • A Low-Power Embedded RISC Microprocessor with an Integrated DSP for Mobile Applications

    Tetsuya YAMADA  Makoto ISHIKAWA  Yuji OGATA  Takanobu TSUNODA  Takahiro IRITA  Saneaki TAMAKI  Kunihiko NISHIYAMA  Tatsuya KAMEI  Ken TATEZAWA  Fumio ARAKAWA  Takuichiro NAKAZAWA  Toshihiro HATTORI  Kunio UCHIYAMA  

     
    INVITED PAPER

    Vol: E85-C  No: 2  Page(s): 253-262

    A 32-bit embedded RISC microprocessor core integrating a DSP has been developed using a 0.18-µm five-layer-metal CMOS technology. The integrated DSP has a single MAC and exploits CPU resources to reduce hardware; it occupies only 0.5 mm². The processor core includes a large 128 kB on-chip SRAM called U-memory. A large-capacity on-chip memory reduces traffic to external memory, which is effective for both low-power and high-performance operation. To achieve low power dissipation for U-memory access, the active ratio of U-memory accesses is reduced. The critical path is the load path from the U-memory, and we optimized this path across the whole chip. The chip draws 0.79 mA/MHz when executing Dhrystone 1.1 at 108 MHz, which makes it suitable for mobile applications.

  • CPU Model-Based Mechatronics/Hardware/Software Co-design Technology for Real-Time Embedded Control Systems

    Makoto ISHIKAWA  George SAIKALIS  Shigeru OHO  

     
    PAPER-VLSI Design Technology

    Vol: E90-C  No: 10  Page(s): 1992-2001

    We review practical case studies of a method for developing highly reliable real-time embedded control systems using CPU model-based hardware/software co-simulation. Our approach enables us to fully simulate a virtual mechanical control system, including a mechatronics plant, microcontroller hardware, and object-code-level software. This full virtual system approach simulates control system behavior, especially that of the microcontroller hardware and software. It enables design space exploration of the microarchitecture, control design validation, robustness evaluation of the system, and software optimization before component design. It also avoids potential problems. The advantage of this work is that it comprises all the components of a typical control system, enabling designers to analyze effects across domains, for example, mechanical analysis of behavior due to differences in controller microarchitecture. To further improve system design, evaluation, and analysis, we implemented an integrated behavior analyzer in the development environment. This analyzer can graphically display processor behavior during the simulation, such as task-level CPU load, interrupt statistics, and the software variable transition chart, without affecting the simulation results. It also provides useful information on the system behavior. This virtual system analysis does not require software modification, does not change the control timing, and does not require any processing power from the target microcontroller. Therefore, this method is suitable for real-time embedded control system design, in particular automotive control system design, which requires a high level of reliability, robustness, quality, and safety. In this study, a Renesas SH-2A microcontroller model was developed on the CoMET™ platform from VaST Systems Technology. An electronic throttle control (ETC) system and an engine control system were chosen to prove this concept. The electronic throttle body (ETB) model on the Saber® simulator from Synopsys® and the engine model on the MATLAB®/Simulink® simulator from MathWorks can be simulated with the SH-2A model using a newly developed co-simulation interface between MATLAB®/Simulink® and CoMET™. Although the SH-2A chip was still being developed while the project was under way, we were able to complete the OSEK OS development, control software design, and verification of the entire system using the virtual environment. After a working sample chip was released at a later stage of the project, we found that this software could run on both the actual ETC system and the actual engine control system without critical problems. This demonstrates that our models and simulation environment are sufficiently credible and trustworthy.

  • A 4500 MIPS/W, 86 µA Resume-Standby, 11 µA Ultra-Standby Application Processor for 3G Cellular Phones

    Makoto ISHIKAWA  Tatsuya KAMEI  Yuki KONDO  Masanao YAMAOKA  Yasuhisa SHIMAZAKI  Motokazu OZAWA  Saneaki TAMAKI  Mikio FURUYAMA  Tadashi HOSHI  Fumio ARAKAWA  Osamu NISHII  Kenji HIROSE  Shinichi YOSHIOKA  Toshihiro HATTORI  

     
    PAPER-Digital

    Vol: E88-C  No: 4  Page(s): 528-535

    We have developed an application processor optimized for 3G cellular phones. It provides high energy efficiency by using various low-power techniques. For low active power consumption, we use a hierarchical clock gating technique with a static clock gating controlled by software and a two-level dynamic clock gating controlled by hardware. This technique reduces clock power consumption by 35%. We also apply a pointer-based pipeline to the CPU core, which reduces the pipeline latch power by 25%. The processor contains 256 kB of on-chip user RAM (URAM) to reduce the external memory access power. The URAM read buffer (URB) enables high-throughput, low-latency access to the URAM while keeping the CPU clock frequency high, because the URAM read data is transferred to the URB in 256-bit widths at half the frequency of the CPU. The average miss penalty is 3.5 cycles at the CPU clock frequency, the hit rate is 89%, and the energy used for URAM reads is 8% less than what it would be for URAM without a URB. These techniques reduce the power consumption of the CPU core and achieve 4500 MIPS/W at a 1.0 V power supply (Dhrystone 2.1). To meet the low-leakage requirements, we use internal power switches and provide resume-standby (R-standby) and ultra-standby (U-standby) modes. Signals across a power boundary are transmitted through µI/O circuits to prevent invalid signal transmission. In the R-standby mode, the power supply to almost all of the CPU core area except the URAM is cut off, and the URAM is set to a retention mode. In the U-standby mode, the power supply to the URAM is also turned off to further reduce leakage current. The leakage currents in the R-standby and U-standby modes are only 86 and 11 µA, respectively. For quick recovery from the R-standby mode, the boot address register (BAR) and the control register contents needed immediately after wake-up are saved by hardware into backup latches. The other contents are saved by software into the URAM. It takes 2.8 ms to fully recover from R-standby.
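
As a rough illustration of what the URB figures imply, the sketch below combines the quoted 89% hit rate and 3.5-cycle average miss penalty into an expected extra cost per URAM read. It assumes a simple model that the abstract does not state: a hit adds no extra cycles and every miss costs the average penalty. The function name is hypothetical.

```python
# Figures quoted in the abstract.
HIT_RATE = 0.89             # URB hit rate
MISS_PENALTY_CYCLES = 3.5   # average miss penalty, in CPU clock cycles


def expected_miss_cycles_per_read(hit_rate: float, miss_penalty: float) -> float:
    """Expected extra cycles per read: miss probability times miss penalty.

    Assumption (not from the abstract): hits add no extra cycles.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return (1.0 - hit_rate) * miss_penalty


extra = expected_miss_cycles_per_read(HIT_RATE, MISS_PENALTY_CYCLES)
print(f"{extra:.3f} extra cycles per URAM read on average")
```

Under this simplified model, the buffer misses cost well under half a CPU cycle per read on average.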

  • An Embedded Processor Core for Consumer Appliances with 2.8GFLOPS and 36 M Polygons/s FPU

    Fumio ARAKAWA  Motokazu OZAWA  Osamu NISHII  Toshihiro HATTORI  Takeshi YOSHINAGA  Tomoichi HAYASHI  Yoshikazu KIYOSHIGE  Takashi OKADA  Masakazu NISHIBORI  Tomoyuki KODAMA  Tatsuya KAMEI  Makoto ISHIKAWA  

     
    PAPER-System Level Design

    Vol: E87-A  No: 12  Page(s): 3068-3074

    A SuperH™ embedded processor core implemented in a 130-nm CMOS process running at 400 MHz achieved 720 MIPS and 2.8 GFLOPS at a power of 250 mW in worst-case conditions. It has a dual-issue seven-stage pipeline architecture but maintains the 1.8 MIPS/MHz of the previous five-stage processor. The processor meets the requirements of a wide range of applications, and is suitable for digital appliances aimed at the consumer market, such as cellular phones, digital still/video cameras, and car navigation systems.
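
The per-MHz figure in this abstract can be checked against the headline numbers with a few lines of arithmetic. The sketch below is only a consistency check on the quoted values; the per-cycle FLOP count and the MIPS/W value are derived here and are not stated in the abstract, and the function names are hypothetical.

```python
# Figures quoted in the abstract.
CLOCK_MHZ = 400.0
MIPS = 720.0
GFLOPS = 2.8
POWER_MW = 250.0  # worst-case conditions


def per_mhz(rate: float, clock_mhz: float) -> float:
    """Throughput normalized by clock frequency."""
    return rate / clock_mhz


def mips_per_watt(mips: float, power_mw: float) -> float:
    """Energy efficiency in MIPS/W, with power given in milliwatts."""
    return mips / (power_mw / 1000.0)


mips_mhz = per_mhz(MIPS, CLOCK_MHZ)                      # matches the quoted 1.8 MIPS/MHz
flops_per_cycle = per_mhz(GFLOPS * 1000.0, CLOCK_MHZ)    # derived: FLOPs per clock cycle
efficiency = mips_per_watt(MIPS, POWER_MW)               # derived: MIPS/W at worst-case power

print(mips_mhz, flops_per_cycle, efficiency)
```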

  • Thresholding Based Image Segmentation Aided by Kleene Algebra

    Makoto ISHIKAWA  Naotake KAMIURA  Yutaka HATA  

     
    PAPER-Probability and Kleene Algebra

    Vol: E82-D  No: 5  Page(s): 962-967

    This paper proposes a thresholding-based segmentation method aided by Kleene algebra. For a given image containing several regions of interest (ROIs) with coherent intensity levels, assume that each ROI can be segmented by applying a thresholding technique. Three segmented states are then derived for every ROI: Shortage, denoted by the logic value 0; Correct, denoted by 1; and Excess, denoted by 2. The segmented states of all ROIs in the image can then be expressed in a ternary logic system. Our goal is to find the "Correct (1)" state for every ROI. First, a procedure based on unate functions, which are a model of Kleene algebra, is proposed. However, this method is not complete in some cases: the ratio of correct segmentation is about 70% for three- and four-ROI segmentation. For the failed cases, Brzozowski operations, which are defined on De Morgan algebra, make it possible to find all "Correct" states. Finally, we apply these procedures to segmentation problems for a human brain MR image and a foot CT image. As a result, we can find all "1" states for the ROIs, i.e., we can correctly segment the ROIs.
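
To make the ternary encoding concrete, the sketch below represents the three segmented states as the values 0, 1, and 2 and defines the standard three-valued Kleene operations (AND as minimum, OR as maximum, NOT as 2 minus the value). This is not the paper's unate-function or Brzozowski-based procedure, which the abstract does not spell out; the class and function names are hypothetical.

```python
from enum import IntEnum
from itertools import product


class SegState(IntEnum):
    """Segmented state of one ROI, using the abstract's ternary encoding."""
    SHORTAGE = 0
    CORRECT = 1
    EXCESS = 2


def k_and(a: SegState, b: SegState) -> SegState:
    """Kleene AND on {0, 1, 2}: the minimum."""
    return SegState(min(a, b))


def k_or(a: SegState, b: SegState) -> SegState:
    """Kleene OR on {0, 1, 2}: the maximum."""
    return SegState(max(a, b))


def k_not(a: SegState) -> SegState:
    """Kleene NOT on {0, 1, 2}: swaps 0 and 2 and fixes the middle value 1."""
    return SegState(2 - a)


def all_correct(states) -> bool:
    """The stated goal: every ROI is in the Correct (1) state."""
    return all(s == SegState.CORRECT for s in states)


# Kleene's law, x AND NOT x <= y OR NOT y, checked over all value pairs.
kleene_law_holds = all(
    k_and(x, k_not(x)) <= k_or(y, k_not(y))
    for x, y in product(SegState, repeat=2)
)

print(kleene_law_holds)
print(all_correct([SegState.CORRECT, SegState.EXCESS]))
```

The middle value 1 is the fixed point of negation, which is why "Correct" sits between "Shortage" and "Excess" in this encoding.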

  • A 45-nm 37.3 GOPS/W Heterogeneous Multi-Core SOC with 16/32 Bit Instruction-Set General-Purpose Core

    Osamu NISHII  Yoichi YUYAMA  Masayuki ITO  Yoshikazu KIYOSHIGE  Yusuke NITTA  Makoto ISHIKAWA  Tetsuya YAMADA  Junichi MIYAKOSHI  Yasutaka WADA  Keiji KIMURA  Hironori KASAHARA  Hideo MAEJIMA  

     
    PAPER-Integrated Electronics

    Vol: E94-C  No: 4  Page(s): 663-669

    We built a 12.4 mm × 12.4 mm, 45-nm CMOS chip that integrates eight 648-MHz general-purpose cores, two matrix processor (MX-2) cores, four flexible engine (FE) cores, and media IP (VPU5) to establish a heterogeneous multi-core chip architecture. The IPC (instructions per cycle) performance of the general-purpose core was enhanced by adding 32-bit instructions to the existing 16-bit fixed-length instruction set and executing up to two 32-bit instructions per cycle. Considering the next five to seven years of embedded LSIs and the increasing number of access masters within an LSI, we predict that the memory usage of a single core will not exceed the 32-bit physical address space (i.e., 4 GB), but that chip-total memory usage will exceed 4 GB. Based on this prediction, the physical address was expanded from 32 bits to 40 bits. The fabricated chip was tested, and parallel operation of eight general-purpose cores, four FE cores, and eight data transfer units (DTUs) was obtained on AAC (Advanced Audio Coding) encode processing.
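
The address-width argument in this abstract comes down to powers of two. The short sketch below computes the byte-addressable range for 32-bit and 40-bit physical addresses, treating the abstract's "4 GB" as 2^32 bytes; the function name is hypothetical.

```python
GIB = 1 << 30  # bytes per GiB


def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given physical address width."""
    if address_bits < 0:
        raise ValueError("address_bits must be non-negative")
    return 1 << address_bits


space_32 = addressable_bytes(32) // GIB   # 32-bit space, in GiB
space_40 = addressable_bytes(40) // GIB   # 40-bit space, in GiB
growth = addressable_bytes(40) // addressable_bytes(32)

print(space_32, space_40, growth)
```

Each extra address bit doubles the range, so the eight added bits enlarge the physical address space by a factor of 256.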

  • Reducing Consuming Clock Power Optimization of a 90 nm Embedded Processor Core

    Tetsuya YAMADA  Masahide ABE  Yusuke NITTA  Kenji OGURA  Manabu KUSAOKE  Makoto ISHIKAWA  Motokazu OZAWA  Kiwamu TAKADA  Fumio ARAKAWA  Osamu NISHII  Toshihiro HATTORI  

     
    PAPER-Low Power Techniques

    Vol: E89-C  No: 3  Page(s): 287-294

    A low-power SuperH™ embedded processor core, the SH-X2, has been designed in 90-nm CMOS technology. Power consumption was reduced by three means: hierarchical fine-grained clock gating, which reduces the power consumption of the flip-flops and the clock tree; synthesis and layout that support the implementation of the clock gating; and multi-level power evaluations for RTL refinement. With this clock gating and RTL refinement, the power consumption of the clock tree and the flip-flops was reduced by 35% and 59%, respectively, including the effects of process shrinking. As a result, the SH-X2 achieved 6,000 MIPS/W using a Renesas low-power process with a lowered voltage. Its performance-power efficiency was 25% better than that of the 130-nm-process SH-X.