Author Search Result

[Author] Hiroshi MAKINO (14 hits)

Hits 1-14
  • A 2.6-ns 64-b Fast and Small CMOS Adder

    Hiroyuki MORINAKA  Hiroshi MAKINO  Yasunobu NAKASE  Hiroaki SUZUKI  Koichiro MASHIKO  Tadashi SUMI  

     
    PAPER

    Vol: E79-C No: 4
    Page(s): 530-537

    We present a 64-b adder having a 2.6-ns delay time at 3.3 V power supply within 0.27 mm² using 0.5-µm CMOS technology. We derived our adder design from architectural level considerations. The considerations include not only the gate intrinsic delay but also the wiring delay and the gate capacitance delay. As a result, a 64-b adder (56-b Carry Look-ahead Adder (CLA) + 8-b Carry Select Adder (CSA)) was designed. In this design, a new carry select scheme called Modified Carry Select (MCS) is also proposed.
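
As a rough illustration of the carry-select idea named in the abstract above, the following Python sketch models a conventional carry-select block, not the paper's Modified Carry Select (MCS) scheme, which the abstract does not describe. Placing the 8-b select block on the upper bits, and the function names `ripple_add` and `carry_select_add`, are assumptions made only for this example.

```python
def ripple_add(a, b, carry_in, width):
    """Add two width-bit integers and return (sum mod 2**width, carry_out)."""
    total = a + b + carry_in
    mask = (1 << width) - 1
    return total & mask, total >> width


def carry_select_add(a, b, width=64, select_width=8):
    """Conventional carry-select on the upper select_width bits (assumed split).

    Inputs a and b are expected to fit in `width` bits.
    """
    low_width = width - select_width
    low_mask = (1 << low_width) - 1

    # Lower block: a plain addition stands in for the 56-b CLA here.
    low_sum, low_carry = ripple_add(a & low_mask, b & low_mask, 0, low_width)

    # Upper block: compute both candidate results before the carry is known.
    a_hi, b_hi = a >> low_width, b >> low_width
    sum0, cout0 = ripple_add(a_hi, b_hi, 0, select_width)
    sum1, cout1 = ripple_add(a_hi, b_hi, 1, select_width)

    # The carry out of the lower block selects one of the two candidates.
    hi_sum, carry_out = (sum1, cout1) if low_carry else (sum0, cout0)
    return (hi_sum << low_width) | low_sum, carry_out
```

In hardware the benefit is that the two upper-block sums are formed in parallel with the lower-block carry, so only a multiplexer delay is added once that carry arrives; the software model reproduces the logic, not the timing.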

  • The Challenge of Collaboration among Academies and Asia Pacific for ITS R&D

    Hiroshi MAKINO  Shunsuke KAMIJO  

     
    INVITED PAPER

    Vol: E98-A No: 1
    Page(s): 259-266

    ITS R&D spans a wide variety of research areas, such as mechanical engineering, road engineering, traffic engineering, information and communication engineering, and electrical engineering. Although initiatives across these engineering fields are essential for solving the problems of practical social systems, collaboration among them is difficult. Based on the joint research of the Japan Society of Civil Engineers and the Institute of Electrical Engineers carried out at the time of the Great East Japan Earthquake, this paper discusses the necessity of collaboration among academies on ITS R&D. International collaboration is also important for ITS R&D. Asian countries could share the same problems and solutions, since many megacities are located in Asia and suffer from heavy traffic. Therefore, we need to discuss common solutions to our problems.

  • A Field Programmable Sequencer and Memory with Middle Grained Programmability Optimized for MCU Peripherals

    Yoshifumi KAWAMURA  Naoya OKADA  Yoshio MATSUDA  Tetsuya MATSUMURA  Hiroshi MAKINO  Kazutami ARIMOTO  

     
    PAPER-VLSI Design Technology and CAD

    Vol: E99-A No: 5
    Page(s): 917-928

    A Field Programmable Sequencer and Memory (FPSM), which is a programmable unit exclusively optimized for peripherals on a microcontroller unit, is proposed. The FPSM functions not only as the peripherals but also as the standard built-in memory. The FPSM provides easier programmability with a smaller area overhead, especially when compared with the FPGA. The FPSM is implemented on the FPGA, and the programmability and performance for basic peripherals such as the 8-bit counter and 8-bit-accuracy Pulse Width Modulation are emulated on the FPGA. Furthermore, the FPSM core with a 4-Kbit SRAM is fabricated in 0.18 µm 5-metal CMOS process technology. The FPSM occupies half the area of the FPGA, and its power consumption is less than one-fifth.

  • Novel VLIW Code Compaction Method for a 3D Geometry Processor

    Hiroaki SUZUKI  Hiroyuki KAWAI  Hiroshi MAKINO  Yoshio MATSUDA  

     
    PAPER-Digital Signal Processing

    Vol: E84-A No: 11
    Page(s): 2885-2893

    A VLIW (Very Long Instruction Word) architecture with a new code compaction method has been proposed. For a 3D-geometry processor, we consider two types of 2-issue VLIW architectures, the floating-point execution accelerating VLIW (FP-VLIW) and the data-move enhancing VLIW (MV-VLIW) architectures, as expansions of a Single-Streaming Single Instruction, Multiple Data (SS-SIMD) architecture. To solve the code bloat problem which is common to VLIW architectures, the proposed method makes it possible to compact original codes into the VLIW codes by software tools and decompact the VLIW codes by a simple hardware decompactor composed of an instruction swap circuit on a chip. Speeds and code densities of the two VLIWs with the code compaction are compared to the SS-SIMD with the same instruction set and the same building blocks. The FP-VLIW shows the fastest speed performance in the evaluation results of the viewperf CDRS-03 benchmark programs. It is 36% faster than the SS-SIMD used as reference. The proposed compaction method keeps the 95% code density of the SS-SIMD. One test program shows that the code density of the MV-VLIW is higher than that of the SS-SIMD. This result demonstrates that the merit of compacting nops can be greater than the VLIW penalty. The FP-VLIW architecture with the code compaction achieves 1.36 times the speed performance without significant code-density deterioration.

  • Signal Integrity Design and Analysis for a 400 MHz RISC Microcontroller

    Akira YAMADA  Yasuhiro NUNOMURA  Hiroaki SUZUKI  Hisakazu SATO  Niichi ITOH  Tetsuya KAGEMOTO  Hironobu ITO  Takashi KURAFUJI  Nobuharu YOSHIOKA  Jingo NAKANISHI  Hiromi NOTANI  Rei AKIYAMA  Atsushi IWABU  Tadao YAMANAKA  Hidehiro TAKATA  Takeshi SHIBAGAKI  Takahiko ARAKAWA  Hiroshi MAKINO  Osamu TOMISAWA  Shuhei IWADE  

     
    PAPER-Design Methods and Implementation

    Vol: E86-C No: 4
    Page(s): 635-642

    A high-speed 32-bit RISC microcontroller has been developed. In order to realize high-speed operation with minimal hardware resources, we have developed new design and analysis methods for clock distribution, bus-line layout, and IR-drop analysis. As a result, high-speed operation of 400 MHz has been achieved with power dissipation of 0.96 W at 1.8 V.

  • A Low Standby Current DSP Core Using Improved ABC-MT-CMOS with Charge Pump Circuit

    Hiromi NOTANI  Masayuki KOYAMA  Ryuji MANO  Hiroshi MAKINO  Yoshio MATSUDA  Osamu TOMISAWA  Shuhei IWADE  

     
    PAPER-Circuit Design

    Vol: E86-C No: 4
    Page(s): 597-603

    A 64-bit 100-MHz multimedia DSP core has been designed using 0.15-µm CMOS technology. An improved Auto-Backgate-Controlled MT-CMOS (ABC-MT-CMOS) circuit with a charge pump is adopted to suppress the standby leakage current. The dynamic active current of the whole chip was simulated to optimize the size of a switch for power supply control. The DSP core chip, which integrates 300-kgate logic, 64-kbyte SRAM, and a charge pump circuit, has 8-µA standby leakage current. The reduction rate is 1/250.

  • A Floating-Point Divider Using Redundant Binary Circuits and an Asynchronous Clock Scheme

    Hiroaki SUZUKI  Hiroshi MAKINO  Koichiro MASHIKO  

     
    PAPER-Electronic Circuits

    Vol: E82-C No: 1
    Page(s): 105-110

    This paper describes a new floating-point divider (FDIV), in which the key features of redundant binary circuits and an asynchronous clock scheme reduce the delay time and area penalty. The redundant binary representation of +1 = (1, 0), 0 = (0, 0), -1 = (0, 1) is applied to all the mantissa division circuits. The simple and unified representation reduces circuit delay for the quotient determination. Additionally, the local clock generator circuit for the asynchronous clock scheme eliminates clock margin overhead. The generator circuit guarantees the worst delay-time operation by the feedback loop of the replica delay paths via a C-element. The internal iterative operation by the asynchronous scheme and the modified redundant-binary addition/subtraction circuit keep the area small. The architecture design avoids extra calculation time for the post processes, whose main role is to produce the floating-point status flags. The FDIV core using the proposed technologies operates at 42.1 ns with 0.35 µm CMOS technology and triple metal interconnections. The small core of 13.5 k transistors is laid out in a 730 µm × 910 µm area.
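
The digit encoding quoted in the abstract above, +1 = (1, 0), 0 = (0, 0), -1 = (0, 1), can be illustrated with a small Python sketch. It only shows how a string of such digit pairs maps to an integer value; it does not model the divider circuits. The helper names `rb_to_int` and `int_to_rb`, and the MSB-first ordering, are assumptions made for this example.

```python
# Digit encoding quoted in the abstract: +1 = (1, 0), 0 = (0, 0), -1 = (0, 1).
RB_ENCODE = {1: (1, 0), 0: (0, 0), -1: (0, 1)}
RB_DECODE = {pair: digit for digit, pair in RB_ENCODE.items()}


def rb_to_int(pairs):
    """Return the integer value of MSB-first redundant binary digit pairs."""
    value = 0
    for pair in pairs:
        # Each digit is -1, 0 or +1 and carries the usual power-of-two weight.
        value = 2 * value + RB_DECODE[pair]
    return value


def int_to_rb(value, width):
    """Return one possible encoding: binary digits of |value|, sign applied per digit.

    Assumes abs(value) < 2**width. Other digit strings can encode the same value.
    """
    sign = -1 if value < 0 else 1
    magnitude = abs(value)
    digits = [sign * ((magnitude >> i) & 1) for i in range(width - 1, -1, -1)]
    return [RB_ENCODE[d] for d in digits]
```

The representation is redundant because several digit strings share a value: both (+1, 0, -1) and (0, +1, +1) evaluate to 3.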

  • Realistic Scaling Scenario for Sub-100 nm Embedded SRAM Based on 3-Dimensional Interconnect Simulation

    Yasumasa TSUKAMOTO  Tatsuya KUNIKIYO  Koji NII  Hiroshi MAKINO  Shuhei IWADE  Kiyoshi ISHIKAWA  Yasuo INOUE  Norihiko KOTANI  

     
    PAPER

    Vol: E86-C No: 3
    Page(s): 439-446

    It is still an open problem to elucidate the scaling merits of an embedded SRAM with Low Operating Power (LOP) MOSFETs fabricated in 50, 70 and 100 nm CMOS technology nodes. Taking into account a realistic SRAM cell layout, we evaluated the parasitic capacitance of the bit line (BL) as well as the word line (WL) in each generation. By means of a 3-Dimensional (3D) interconnect simulator (Raphael), we focused on the scaling merit through a comparison of the simulated SRAM BL delay for each CMOS technology node. In this paper, we propose two kinds of original interconnect structure which modify ITRS (International Technology Roadmap for Semiconductors), and make it clear that the original interconnect structures with reduced gate overlap capacitance guarantee the scaling merits of SRAM cells fabricated with LOP MOSFETs in 50 and 70 nm CMOS technology nodes.

  • A Design of High-Speed 4-2 Compressor for Fast Multiplier

    Hiroshi MAKINO  Hiroaki SUZUKI  Hiroyuki MORINAKA  Yasunobu NAKASE  Hirofumi SHINOHARA  Koichiro MASHIKO  Tadashi SUMI  Yasutaka HORIBA  

     
    PAPER

    Vol: E79-C No: 4
    Page(s): 538-548

    This paper describes the design of a high-speed 4-2 compressor for fast multipliers. Through the survey of the six kinds of representative conventional 4-2 compressors (RBA 1-3 and NBA 1-3) in both the redundant binary (RB) and the normal binary (NB) schemes, we extracted two problems that degrade the operating speed. The first is the use of multi-input complex gates and the second is the existence of transmission gates (TG) at the input and/or output stages. To solve these problems, we propose high-speed 4-2 compressors using the RB scheme, which we call the high-speed redundant binary adders (HSRBAs). Six kinds of HSRBAs, HSRBA 1-6, were derived by making the Boolean equations suitable for high-speed CMOS circuits. Among them, HSRBA2, HSRBA4 and HSRBA6 have no multi-input complex gate and input/output TG, and perform at a delay time of 0.89 ns, which is the fastest of all 4-2 compressors. We investigated the logical relation between HSRBAs and conventional 4-2 compressors by analyzing the Boolean equations for each circuit. This investigation shows that all the conventional redundant binary adders RBA1-3 have the same logic structures as HSRBA2. We also showed that the conventional normal binary adders NBA1-3 have the same logic structures as HSRBA1, HSRBA3 and HSRBA5, respectively. This implies all 4-2 compressors can be derived from the same equation regardless of RB or NB. We applied the HSRBA2 to a 54×54-bit multiplier using 0.5-µm CMOS technology. The multiplication time at the supply voltage of 3.3 V was 8.8 ns. This is the fastest 54×54-bit multiplier with 0.5-µm CMOS so far, and 83% of the speed improvement is due to the high-speed 4-2 compressor.
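
For readers unfamiliar with the term, the Python sketch below models the textbook normal-binary 4-2 compressor built from two cascaded full adders. It is not one of the HSRBA circuits from the paper, whose Boolean equations are not given in the abstract. The function names are assumptions made for this example.

```python
from itertools import product


def full_adder(a, b, c):
    """One-bit full adder: return (sum, carry)."""
    s = a ^ b ^ c
    carry = (a & b) | (b & c) | (a & c)
    return s, carry


def compressor_4_2(x1, x2, x3, x4, cin):
    """Textbook 4-2 compressor made of two full adders.

    Returns (sum, carry, cout), satisfying
    x1 + x2 + x3 + x4 + cin == sum + 2 * (carry + cout).
    """
    s1, cout = full_adder(x1, x2, x3)  # cout does not depend on cin
    total, carry = full_adder(s1, x4, cin)
    return total, carry, cout


def check_all_inputs():
    """Exhaustively verify the arithmetic identity over all 32 input patterns."""
    for bits in product((0, 1), repeat=5):
        s, carry, cout = compressor_4_2(*bits)
        if sum(bits) != s + 2 * (carry + cout):
            return False
    return True
```

Because `cout` is independent of `cin`, a row of such cells reduces four partial-product rows to two without a carry rippling along the row, which is why the cell delay matters so much in a multiplier tree.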

  • A Wide Range 1.0-3.6 V 200 Mbps, Push-Pull Output Buffer Using Parasitic Bipolar Transistors

    Takahiro SHIMADA  Hiromi NOTANI  Yasunobu NAKASE  Hiroshi MAKINO  Shuhei IWADE  

     
    PAPER

    Vol: E87-C No: 4
    Page(s): 571-577

    We proposed a push-pull output buffer that maintains the data transmission rate for lower supply voltages. It operates at an internal supply voltage (VDD) of 0.7-1.6 V and an interface supply voltage (VDDX) of 1.0-3.6 V. In low VDDX operation, the output buffer utilizes parasitic bipolar transistors instead of MOS transistors to maintain drivability. Furthermore, forward body bias (FBB) control is provided for the level converter in low VDD operation. We fabricated a test chip with a standard 0.15 µm CMOS process. Measurement results indicate that the proposed output buffer achieves 200 Mbps operation at VDD of 0.7 V and VDDX of 1.0 V.

  • A 286 MHz 64-b Floating Point Multiplier with Enhanced CG Operation

    Hiroshi MAKINO  Hiroaki SUZUKI  Hiroyuki MORINAKA  Yasunobu NAKASE  Koichiro MASHIKO  Tadashi SUMI  

     
    PAPER-Logic

    Vol: E79-C No: 7
    Page(s): 915-924

    This paper presents a high-speed 64-b floating point (FP) multiplier that has a useful function for computer graphics (CG). The critical path delay is minimized by using high-speed logic gates and limiting the stage number of series transmission gates (TG's). The high-speed redundant binary architecture is applied to the multiplication of significands. This FP multiplier has a special function of "CG multiplication" that directly multiplies pixel data by FP data. This multiplier was fabricated in 0.5 µm CMOS technology with triple-level metal interconnection. The active area size is 4.2 × 5.1 mm². The operating cycle time is 3.5 ns at the supply voltage of 3.3 V, which corresponds to the frequency of 286 MHz. Implementation of CG multiplication increases the transistor count by only 4%. Also, CG multiplication has no effect on the delay in the critical path.

  • A Low-Power Microcontroller with Body-Tied SOI Technology

    Hisakazu SATO  Yasuhiro NUNOMURA  Niichi ITOH  Koji NII  Kanako YOSHIDA  Hironobu ITO  Jingo NAKANISHI  Hidehiro TAKATA  Yasunobu NAKASE  Hiroshi MAKINO  Akira YAMADA  Takahiko ARAKAWA  Toru SHIMIZU  Yuichi HIRANO  Takashi IPPOSHI  Shuhei IWADE  

     
    PAPER

    Vol: E87-C No: 4
    Page(s): 563-570

    A low-power microcontroller has been developed with 0.10 µm bulk compatible body-tied SOI technology. For this work, only two new masks are required. For the other layers, existing masks of a prior work developed with 0.18 µm bulk CMOS technology can be applied without any changes. With the SOI technology, high-speed operation of over 600 MHz has been achieved at a supply voltage of 1.2 V, which is 1.5 times faster than the prior work. Also, a five times improvement in the power-delay product has been achieved at a supply voltage of 0.8 V. Moreover, the compatibility of the SOI technology with bulk CMOS has been verified, because all circuit blocks of the chip, including logic, memory, analog circuit, and PLL, are completely functional, even though only two new masks are used.

  • A Large-Scale, Flip-Flop RAM Imitating a Logic LSI for Fast Development of Process Technology

    Masako FUJII  Koji NII  Hiroshi MAKINO  Shigeki OHBAYASHI  Motoshige IGARASHI  Takeshi KAWAMURA  Miho YOKOTA  Nobuhiro TSUDA  Tomoaki YOSHIZAWA  Toshikazu TSUTSUI  Naohiko TAKESHITA  Naofumi MURATA  Tomohiro TANAKA  Takanari FUJIWARA  Kyoko ASAHINA  Masakazu OKADA  Kazuo TOMITA  Masahiko TAKEUCHI  Shigehisa YAMAMOTO  Hiromitsu SUGIMOTO  Hirofumi SHINOHARA  

     
    PAPER

    Vol: E91-C No: 8
    Page(s): 1338-1347

    We propose a new large-scale logic test element group (TEG), called a flip-flop RAM (FF-RAM), to improve the total process quality before and during initial mass production. It is designed to be as convenient as an SRAM for measurement and to imitate a logic LSI. We implemented a 10-Mgate FF-RAM using our 65-nm CMOS process. The FF-RAM enables us to make fail-bit maps (FBM) of logic cells because of its cell array structure as an SRAM. An FF-RAM has an additional structure to detect the open and short failure of upper metal layers. The test results show that it can detect failure locations and layers effortlessly using FBMs. We measured and analyzed it for both the cell arrays and the upper metal layers. The results provided many important clues to improve our processes. We also measured the neutron-induced soft error rate (SER) of FF-RAM, which is becoming a serious problem as transistors become smaller. We compared the results of the neutron-induced soft error rate to those of previous generations: 180 nm, 130 nm, and 90 nm. Because of this TEG, we can considerably shorten the development period for advanced CMOS technology.

  • An Application of Air-Bridge Metal Interconnections to High Speed GaAs LSI's

    Minoru NODA  Hiroshi MATSUOKA  Norio HIGASHISAKA  Masaaki SHIMADA  Hiroshi MAKINO  Shuichi MATSUE  Yasuo MITSUI  Kazuo NISHITANI  Akiharu TADA  

     
    PAPER

    Vol: E75-C No: 10
    Page(s): 1146-1153

    Air-bridge metal interconnection technology is used for upper level power supply line interconnections in GaAs LSI's to reduce the signal propagation delay time. This technology reduces both parasitic capacitance between the signal line and the power supply line, and propagation delay in the signal line to about 10% and about 50%, respectively, compared to conventional 3-level interconnections without air-bridges. Under standard load conditions (FI=FO=2, length of load line=2 mm), the air-bridge technique leads to gate propagation delays which are about 60% of those in conventional interconnections. We fabricated 2.1-k gate Gate Arrays and 4-kb SRAM's using the air-bridge structure to interconnect power supply lines. For a Gate Array with 0.7 µm gate Buried P-layer Lightly Doped Drain (BPLDD) FET's, the typical gate propagation delay under standard load conditions was about 110 ps with a dissipation power of 1.4 mW/gate. SRAM's with 0.5 µm gate BPLDD's had typical access time (tacc) of 1.5 ns with a dissipation power of 700 mW/chip.