The search functionality is under construction.
The search functionality is under construction.

IEICE TRANSACTIONS on Electronics

  • Impact Factor

    0.63

  • Eigenfactor

    0.002

  • article influence

    0.1

  • Cite Score

    1.3

Advance publication (published online immediately after acceptance)

Volume E87-C No.4  (Publication Date:2004/04/01)

    Special Section on Low-Power System LSI, IP and Related Technologies
  • FOREWORD

    Masahiko YOSHIMOTO  

     
    FOREWORD

      Page(s):
    427-428
  • Perspectives of Low-Power VLSI's

    Takayasu SAKURAI  

     
    INVITED PAPER

      Page(s):
    429-436

    The paper covers techniques to cope with ever-increasing leakage power as well as dynamic power of CMOS VLSI's. The techniques to be presented range from software, system, circuit to device level. The novel trend is to look into the cooperative approaches between disciplines such as software-circuit cooperation and circuit-technology cooperation. The biggest challenge that System-on-a-Chip designers should meet in the future is the fact that transistors go more and more leaky in digital and memory circuits as generation advances. The topics to break through this stringent problem are described. Approaches to lower power at the system level are also discussed. The paper touches on new applications and markets which will be open up by the low-power VLSI's.

  • Ultralow-Voltage MTCMOS/SOI Circuits for Batteryless Mobile System

    Takakuni DOUSEKI  Masashi YONEMARU  Eiji IKUTA  Akira MATSUZAWA  Atsushi KAMEYAMA  Shunsuke BABA  Tohru MOGAMI  Hakaru KYURAGI  

     
    INVITED PAPER

      Page(s):
    437-447

    This paper describes an ultralow-power multi-threshold (MT) CMOS/SOI circuit technique that mainly uses fully-depleted MOSFETs. The MTCMOS/SOI circuit, which combines fully-depleted low- and medium-Vth CMOS/SOI logic gates and high-Vth power-switch transistors, makes it possible to lower the supply voltage to 0.5 V and reduce the power dissipation of LSIs to the 1-mW level. We overview some MTCMOS/SOI digital and analog components, such as a CPU, memory, analog/RF circuit and DC-DC converter for an ultralow-power mobile system. The validity of the ultralow-voltage MTCMOS/SOI circuits is confirmed by the demonstration of a self-powered 300-MHz-band short-range wireless system. A 1-V SAW oscillator and a switched-capacitor-type DC-DC converter in the transmitter makes possible self-powered transmission by the heat from a hand. In the receiver, a 0.5-V digital controller composed of a 8-bit CPU, 256-kbit SRAM, and ROM also make self-powered operation under illumination possible.

  • A Single-Chip JPEG2000 Encode Processor Capable of Compressing D1-Images at 30 frames/s without Tile Division

    Hideki YAMAUCHI  Shigeyuki OKADA  Kazuhiko TAKETA  Tatsushi OHYAMA  

     
    PAPER

      Page(s):
    448-456

    A VLSI-specific wavelet processing technique has been developed and implemented as a processor in accordance with the JPEG2000 specification. This proposed procedure of discrete wavelet transforms uses an altered calculation equations and makes use of intermediate results through wavelet calculation. The implementation of the proposed procedure is capable of realizing a highly efficient DWT for large size images in spite of using low hardware costs and a small size buffering memory. In order to obtain fast EBCOT processing, three types of parallel processing are introduced in the EBCOT architecture. The processor performs compression of 720480 pixels images with the speed of 30 frames per second (fps) at a required operating frequency as low as 32 MHz or lower. Furthermore, it need not divide an image into tiles so that the problem of deterioration of image quality due to tile division does not occur. A prototype of this processor has been fabricated in a 0.25-µm 5-layer CMOS process. The chip is 10.210.4 mm2 in size and consumes 2.0 W when supplied with 2.5 V and 32 MHz.

  • A Feed-Forward Dynamic Voltage Control Algorithm for Low Power MPEG4 on Multi-Regulated Voltage CPU

    Hideo OHIRA  Kentaro KAWAKAMI  Miwako KANAMORI  Yasuhiro MORITA  Masayuki MIYAMA  Masahiko YOSHIMOTO  

     
    PAPER

      Page(s):
    457-465

    In this paper, we describe a feed-forward dynamic voltage/clock-frequency control method enabling low power MPEG4 on multi-regulated voltage CPU with combining the characteristics of the CPU and the video encoding processing. This method theoretically achieves minimum low power consumption which is close to the hardware-level power consumption. Required processing performance for MPEG4 visual encoding totally depends on the activity of the sequence, and high motion sequence requires high performance and low motion sequence requires low performance. If required performance is predictable, lower power consumption can be achieved with controlling the adequate voltage and clock-frequency dynamically at every frame. The proposed method in this paper is predicting the required processing performance of a future frame using our unique feed-forward analysis method and controlling a voltage and frequency dynamically at every frame along with the forward analysis value. The simulation results indicate that the proposed feed-forward analysis method adequately predicts the required processing performance of every future frame, and enables to minimize power consumption on software basis MPEG4 visual encoding processing. In the case that CPU has Frequency-Voltage characteristics of 1.8 V @400 MHz to 1.0 V @189 MHz, the proposed method reduces the power consumption approximately 37% at high motion sequences or 65% at low motion sequences comparing with the conventional software video encoding method.

  • VLSI-Oriented Motion Estimation Using a Steepest Descent Method in Mobile Video Coding

    Masayuki MIYAMA  Junichi MIYAKOSHI  Kousuke IMAMURA  Hideo HASHIMOTO  Masahiko YOSHIMOTO  

     
    PAPER

      Page(s):
    466-474

    This paper describes a VLSI-oriented motion estimation algorithm using a steepest descent method (SDM) applied to MPEG-4 visual communication with a mobile terminal. The SDM algorithm is optimized for QCIF or CIF resolution video and VLSI implementation. The SDM combined with a subblock search method is developed to enhance picture quality. Simulation results show that a mean PSNR drop of the SDM algorithm processing QCIF 15 fps resolution video in comparison with a full search algorithm is -0.17 dB. Power consumption of a VLSI based on the SDM algorithm assuming 0.18 µm CMOS technology is estimated at 2 mW. The VLSI attains higher picture quality than that based on the other fast motion estimation algorithm, and is applicable to mobile video applications.

  • A 160 mW, 80 nA Standby, MPEG-4 Audiovisual LSI with 16 Mbit Embedded DRAM and a 5 GOPS Post Filtering Unit

    Hideho ARAKIDA  Masafumi TAKAHASHI  Yoshiro TSUBOI  Tsuyoshi NISHIKAWA  Hideaki YAMAMOTO  Toshihide FUJIYOSHI  Yoshiyuki KITASHO  Yasuyuki UEDA  Tetsuya FUJITA  

     
    PAPER

      Page(s):
    475-481

    We present a single-chip MPEG-4 audiovisual LSI in a 0.13 µm CMOS, 5-layer metal technology with 16 Mbit embedded DRAM, which integrates four 16 bit RISC and dedicated hardware accelerators including a 5 GOPS post filtering unit. It consumes 160 mW at 125 MHz and dissipates 80 nA in the standby mode. The chip is the world first LSI handling MPEG-4 CIF video encoding at 15 frames/sec and audio/speech encoding simultaneously.

  • A Single Chip H.32X Multimedia Communication Processor with CIF 30 fr/s MPEG-4/H.26X Bi-directional Codec

    Noriyuki MINEGISHI  Ken-ichi ASANO  Keisuke OKADA  Masahiko YOSHIMOTO  

     
    PAPER

      Page(s):
    482-490

    A single chip processor suitable for various multimedia communication products has been developed. This chip achieves real-time bi-directional encoding/decoding for CIF resolution video at a frame rate of 30 fr/s, and meets such standards, as H.320 and H.324. The chip is composed of a video-processing unit for MPEG-4 and H.26X standards, a DSP unit for speech codec and multiplex processes, and a RISC unit for managing the whole chip. By heterogeneous multiple processor architecture, careful study of task sharing for each processing unit and bus configuration, a single chip solution can be achieved with reasonable operation speed and low-power consumption suitable for consumer products. Moreover, by applying an original video processing unit architecture, this chip achieves real-time bi-directional encoding/decoding for CIF-resolution video at a frame rate of 30 fr/s. An original video bus was developed to provide high performance and low-power consumption while sharing one external memory which is necessary for various video processes and graphics functions. This shared memory also has the effect of minimizing die size and I/O ports. This chip has been fabricated with 4-metal 0.18 µm CMOS technology to produce a chip area of 10.510.5 mm2 with 1.2 W power dissipation including I/O power, at 1.8 V for internal supply and 3.3 V for I/O power supply.

  • A 1.5 V, 200 MHz, 400 MIPS, 188 µA/MHz and 1.2 V, 300 MHz, 600 MIPS, 169 µA/MHz Digital Signal Processor Core for 3G Wireless Applications

    Hiroshi TAKAHASHI  Shigeshi ABIKO  Kenichi TASHIRO  Kaoru AWAKA  Yutaka TOYONOH  Rimon IKENO  Shigetoshi MURAMATSU  Yasumasa IKEZAKI  Tsuyoshi TANAKA  Akihiro TAKEGAMA  Hiroshi KIMIZUKA  Hidehiko NITTA  Miki KOJIMA  Masaharu SUZUKI  James Lowell LARIMER  

     
    PAPER

      Page(s):
    491-501

    A new high-speed and low-power digital signal processor (DSP) core, C55x, was developed for next generation applications such as 3G cellular phone, PDA, digital still camera (DSC), audio, video, embedded modem, DVD, and so on. To support such MIPS-rich applications, a packet size of an instruction fetch increased from 16-bit to 32-bit comparing with the world's most popular C54x DSP core, while maintaining complete software compatibility with the legacy DSP code. An on-chip instruction buffer queue (IBQ) automatically unpacks the packets and issues multiple instructions in parallel for the efficient use of circuit resources. The efficiency of the parallelism has been further improved by additional hardwares such as second 1717-bit MAC, a 16-bit ALU, and three temporary registers that can be used for simple computations. Four 40-bit accumulators make it possible to execute more operation per cycle with dramatically reduced overall power consumption. These new architecture allows two times efficiency of instruction per cycle (IPC) than the previous DSP core on typical applications at the same MHz. The new DSP core was designed for TI's two 130 nm technologies, one with high-VT for low-leakage and middle-performance operation at 1.5 V, and the other with low-VT for high-performance and low-VDD operation at 1.2 V, to provide best choices for any applications with a single layout data base. With the low-leakage process, the DSP core operates at over 200 MHz with 188 µA/MHz (at 75% Dual MAC + 25% ADD) active power and less than 1.63 µA standby current. The high-performance process provides it with 300 MHz with 169 µA/MHz active power and less than 680 µA standby current. The new core was designed by a semi-custom approach (ASIC + custom library) using 5-level Cu metal system with low-k dielectric material of fluorosilicate glass (FSG), and about one million transistors are contained in the core. The total balance of its power, performance, area, and leakage current (PPAL) is well suitable to most of next generation applications. In this paper, we will discuss features of the new DSP core, including circuit design techniques for high-speed and low-power, and present an example product.

  • A 100 MHz 7.84 mm2 31.7 msec 439 mW 512-Point 2-Dimensional FFT Single-Chip Processor

    Naoto MIYAMOTO  Leo KARNAN  Kazuyuki MARUO  Koji KOTANI  Tadahiro OHMI  

     
    PAPER

      Page(s):
    502-509

    A single-chip 512-point FFT processor is presented. This processor is based on the cached-memory architecture (CMA) with the resource-saving multi-datapath radix-23 computation element. The 2-stage CMA, including a pair of single-port SRAMs, is also introduced to speedup the execution time of the 2-dimensional FFTs. Using the above techniques, we have designed an FFT processor core which integrates 552,000 transistors within an area of 2.82.8 mm2 with CMOS 0.35 µm triple-layer-metal process. This processor can execute a 512-point, 36-bit-complex fixed-point data format, 1-dimensonal FFT in 23.2 µsec and a 2-dimensional one in only 23.8 msec at 133 MHz operation. The power consumption of this processor is 439.6 mW at 3.3 V, 100 MHz operation.

  • A Low Power Programmable Turbo Decoder Macro Using the SOVA Algorithm

    Hirohisa GAMBE  Kazuhisa OHBUCHI  Teruo ISHIHARA  Takaaki ZAKOJI  Kiyomichi ARAKI  

     
    PAPER

      Page(s):
    510-519

    Turbo codes are of particular use in applications of wireless communication systems, where various types of communication are required and the data rate must be changed, depending on the situation. In such applications, adaptation of turbo coding specifications is required in terms of coding block size, data speed, parity bit arrangement or configuration of a convolutional coder, as well as the need for real time processing. We present new ideas to provide these capabilities for a low power decoder circuit by focusing on the configuration of a convolutional decoding algorithm, which occupies a significant proportion of the hardware circuit. We utilize the Soft Output Viterbi Algorithm (SOVA) for the base algorithm, produced by adding the concept of a soft output to the Viterbi Algorithm (VA). The Maximum A Posteriori (MAP) algorithm and its simplified version of MAX-LOG-MAP are also widely known. MAP is recognized as a means of achieving very good bit error rate (BER) characteristics. On the other hand SOVA has been regarded as a method which can be simply implemented with less computational resources, but at a cost of higher degradation. However, in many of recent systems we combine turbo coding with some other method such as Automatic Repeat Request (ARQ) to maintain a good error correction performance and we only have to pay attention to the performance in the range of low carrier-to-noise ratio (CNR), where SOVA has fairly satisfactory BER characteristics. This makes the SOVA approach attractive for a low power programmable IP macro solution, when the fundamental advantage of SOVA is fully utilized in the implementation of an LSI circuit. We discuss the processing algorithm and circuit configuration and show that about 40% reduction in power consumption can be achieved. It is also shown that the IP macro can handle 1.5 Mbps information decoding at 100 MHz clock rate.

  • A Physical Synthesis Methodology for Multi-Threshold-Voltage Design in Low-Power Embedded Processor

    Toshihiro HATTORI  Kenji OGURA  

     
    PAPER

      Page(s):
    520-526

    In low-power embedded processors design, stand-by power constraint is also important with average power and operation frequency. Multi-threshold-voltage cells are used in the design and the ratio of low-Vth cells should be controlled. On the other hand, physical synthesis flow is indispensable to achieve high performance and short design time. This paper proposes a physical synthesis methodology under the restriction of maximum low-Vth cell ratio. The experimental results show that our method can achieve only 4 MHz slower logic within 5% margin of the target low-Vth ratio. We have applied this design flow in an application processor design and the designed processor demonstrates 360 MIPS at 200 MHz only with 80 mW at 1.0 V, namely 4500 MIPS/W and 4.2 mA leakage current without any power-cut mode.

  • Design of a Fast Asynchronous Embedded CISC Microprocessor, A8051

    Je-Hoon LEE  YoungHwan KIM  Kyoung-Rok CHO  

     
    PAPER

      Page(s):
    527-534

    In this paper, we design and implement a fast asynchronous embedded CISC microprocessor, A8051, introducing well-tuned pipeline architecture and enhanced control schemes. This work shows an asynchronous design methodology for a CISC type processor, handling the complicated control structure and various instructions. We tuned the proposed architecture to the 5-stage pipeline, reducing the number of idle stages. For the work, we regrouped the instructions based on the number of the machine cycles identified. A8051 has three enhanced control features to improve the system performance: multi-looping control of the pipeline stage, variable length instruction register to get a multiple word instruction in a time, and branch prediction accelerating. The proposed A8051 was synthesized to a gate level design using a 0.35 µm CMOS standard cell library. Simulation results indicate that A8051 provides about 18 times higher speed than the traditional Intel 8051 and about 5 times higher speed than the previously designed asynchronous 8051. In power consumption, core of A8051 shows 15 times higher MIPS/Watt than the synchronous H8051.

  • Selective-Sets Resizable Cache Memory Design for High-Performance and Low-Power CPU Core

    Takashi KURAFUJI  Yasunobu NAKASE  Hidehiro TAKATA  Yukinaga IMAMURA  Rei AKIYAMA  Tadao YAMANAKA  Atsushi IWABU  Shutarou YASUDA  Toshitsugu MIWA  Yasuhiro NUNOMURA  Niichi ITOH  Tetsuya KAGEMOTO  Nobuharu YOSHIOKA  Takeshi SHIBAGAKI  Hiroyuki KONDO  Masayuki KOYAMA  Takahiko ARAKAWA  Shuhei IWADE  

     
    PAPER

      Page(s):
    535-542

    We apply a selective-sets resizable cache and a complete hierarchy SRAM for the high-performance and low-power RISC CPU core. The selective-sets resizable cache can change the cache memory size by varying the number of cache sets. It reduces the leakage current by 23% with slight degradation of the worst case operating speed from 213 MHz to 210 MHz. The complete hierarchy SRAM enables the partial swing operation not only in the bit lines, but also in the global signal lines. It reduces the current consumption of the memory by 4.6%, and attains the high-speed access of 1.4 ns in the typical case.

  • A Novel Static Prediction Scheme for Filter Cache Structures

    Kugan VIVEKANANDARAJAH  Thambipillai SRIKANTHAN  Christopher T. CLARKE  Saurav BHATTACHARYYA  

     
    PAPER

      Page(s):
    543-548

    Energy dissipation in cache memories is becoming a major design issue for embedded microprocessors. Predictive filter cache based instruction cache hierarchy has been shown to effectively reduce the energy-delay product. In this paper, a simplified pattern prediction algorithm is proposed for the filter cache hierarchy. The prediction scheme relies on the static nature of the hit or miss pattern of the instruction access streams. The static patterns are maintained in a small 32x1-bit wide Static Pattern Table (SPT). Our investigations show that the proposed prediction algorithm is superior to that based on Next Fetch Prediction Table (NFPT) for all the benchmarks simulated. With the proposed approach, energy delay product reduction of up to 6.79% was evident when compared with that using NFPT. Moreover, since the prediction scheme is based on the static assignment of patterns, it lends well for area and power efficient implementation than that employs dynamic pattern prediction although it is marginally inferior (i.e. 0.69%) in term of energy delay product.

  • Electric-Energy Generation through Variable-Capacitive Resonator for Power-Free LSI

    Masayuki MIYAZAKI  Hidetoshi TANAKA  Goichi ONO  Tomohiro NAGANO  Norio OHKUBO  Takayuki KAWAHARA  

     
    PAPER

      Page(s):
    549-555

    A vibration-to-electric energy converter as a power generator through a variable-resonating capacitor is theoretically and experimentally demonstrated as a potential on-chip battery. The converter is constructed from three components: a mechanical-variable capacitor, a charge-transporter circuit and a timing-capture control circuit. An optimum design methodology is theoretically described to maximize the efficiency of the vibration-to-electric energy conversion. The energy-conversion efficiency is analyzed based on the following three factors: the mechanical-energy to electric-energy conversion loss, the parasitic elements loss in the charge-transporter circuit and the timing error in the timing-capture circuit. Through the mechanical-energy conversion analysis, the optimum condition for the resonance is found. The parasitic elements in the charge-transporter circuit and the timing management of the capture circuit dominate the output energy efficiency. These analyses enable the optimum design of the energy-conversion system. The converter is fabricated experimentally. The practical measured power is 0.12 µW, and the conversion efficiency is 21%. This efficiency is calculated from a 43% mechanical-energy conversion loss and a 63% charge-transportation loss. The timing-capture circuit is manually controlled in this experiment, so that the timing error is not considered in the efficiency. From our result, a new system LSI application with an embedded power source can be explored for the ubiquitous computing world.

  • A Full-CMOS Single Chip Bluetooth LSI with 1.5 MHz-IF Receiver and Direct Modulation Transmitter

    Fumitoshi HATORI  Hiroki ISHIKURO  Mototsugu HAMADA  Ken-ichi AGAWA  Shouhei KOUSAI  Hiroyuki KOBAYASHI  Duc Minh NGUYEN  

     
    PAPER

      Page(s):
    556-562

    This paper describes a full-CMOS single-chip Bluetooth LSI fabricated using a 0.18 µm CMOS, triple-well, quad-metal technology. The chip integrates radio and baseband, which is compliant with Bluetooth Core Specification version 1.1. A direct modulation transmitter and a low-IF receiver architecture are employed for the low-power and low-cost implementation. To reduce the power consumption of the digital blocks, it uses a clock gating technique during the active modes and a power manager during the low power modes. The maximum power consumption is 75 mW for the transmission, 120 mW for the reception and 30 µW for the low power mode operation. These values are low enough for mobile applications. Sensitivity of -80 dBm has been achieved and the transmitter can deliver up to 4 dBm.

  • A Low-Power Microcontroller with Body-Tied SOI Technology

    Hisakazu SATO  Yasuhiro NUNOMURA  Niichi ITOH  Koji NII  Kanako YOSHIDA  Hironobu ITO  Jingo NAKANISHI  Hidehiro TAKATA  Yasunobu NAKASE  Hiroshi MAKINO  Akira YAMADA  Takahiko ARAKAWA  Toru SHIMIZU  Yuichi HIRANO  Takashi IPPOSHI  Shuhei IWADE  

     
    PAPER

      Page(s):
    563-570

    A low-power microcontroller has been developed with 0.10 µm bulk compatible body-tied SOI technology. For this work, only two new masks are required. For the other layers, existing masks of a prior work developed with 0.18 µm bulk CMOS technology can be applied without any changes. With the SOI technology, the high-speed operation of over 600 MHz has been achieved at a supply voltage of 1.2 V, which is 1.5 times faster than prior work. Also, a five times improvement in the power-delay product has been achieved at a supply voltage 0.8 V. Moreover, the compatibility of the SOI technology with bulk CMOS has been verified, because all circuit blocks of the chip, including logic, memory, analog circuit, and PLL, are completely functional, even though only two new masks are used.

  • A Wide Range 1.0-3.6 V 200 Mbps, Push-Pull Output Buffer Using Parasitic Bipolar Transistors

    Takahiro SHIMADA  Hiromi NOTANI  Yasunobu NAKASE  Hiroshi MAKINO  Shuhei IWADE  

     
    PAPER

      Page(s):
    571-577

    We proposed a push-pull output buffer that maintains the data transmission rate for lower supply voltages. It operates at an internal supply voltage (VDD) of 0.7-1.6 V and an interface supply voltage (VDDX) of 1.0-3.6 V. In low VDDX operation, the output buffer utilizes parasitic bipolar transistors instead of MOS transistors to maintain drivability. Furthermore forward body bias (FBB) control is provided for the level converter in low VDD operation. We fabricated a test chip with a standard 0.15 µm CMOS process. Measurement results indicate that the proposed output buffer achieves 200 Mbps operation at VDD of 0.7 V and VDDX of 1.0 V.

  • +3 V/-3 V Operation 1.2 Gbps Write Driver for Hard Disk Drives

    Yasuyuki OKUMA  Kenji MAIO  Hiroyasu YOSHIZAWA  

     
    PAPER

      Page(s):
    578-581

    This paper describes low voltage write driver with pulse adding circuit. The presented write driver is constructed from the main switch circuit with impedance matching and pulse adding circuits and a timing generator. The main switch circuit is voltage type driver with matching resisters for flexible lines between a write driver and a write head. For 1.2 Gbps operation, the flexible lines have to be treated as transmission lines. Furthermore, to achieve steep rise/fall edge, the pulse adding circuits to generate double of supply voltage, +3.3/-3 V, at rise/fall edge have been developed. The write driver was implemented using 0.35 µm BiCMOS process. The die size is 1.2 mm0.6 mm and the measured results achieved tr/tf of less than 0.25 ns, tp of 0.5 ns and Ip of 73 mA.

  • Low-Power Multiple-Valued Current-Mode Logic Using Substrate Bias Control

    Akira MOCHIZUKI  Takahiro HANYU  

     
    PAPER

      Page(s):
    582-588

    A new multiple-valued current-mode (MVCM) logic circuit using substrate bias control is proposed for low-power VLSI systems at higher clock frequency. Since a multi-level threshold value is represented as a threshold voltage of an MOS transistor, a voltage comparator is realized by a single MOS transistor. As a result, two basic components, a comparator and an output generator in the MVCM logic circuit can be merged into a single MOS differential-pair circuit where the threshold voltages of MOS transistors are controlled by substrate biasing. Moreover, the leakage current is also reduced using substrate bias control. As a typical example of an arithmetic circuit, a radix-2 signed-digit full adder using the proposed circuit is implemented in a 0.18- µm CMOS technology. Its dynamic and static power dissipations are reduced to about 79 percent and 14 percent, respectively, in comparison with those of the corresponding binary CMOS implementation at the supply voltage of 1.8 V and the clock frequency of 500 MHz.

  • µI/O Architecture: A Power-Aware Interconnect Circuit Design for SoC and SiP

    Yusuke KANNO  Hiroyuki MIZUNO  Nobuhiro OODAIRA  Yoshihiko YASU  Kazumasa YANAGISAWA  

     
    PAPER

      Page(s):
    589-597

    A power-aware interconnect circuit design--called µI/O architecture--has been developed to provide low-cost system solutions for System-on-Chip (SoC) and System-in-Package (SiP) technologies. The µI/O architecture provides a common interface throughout the module enabling hierarchical I/O design for SoC and SiP. The hierarchical I/O design allows the driver size to be optimized without increasing design complexity. Moreover, it includes a signal-level converter for integrating wide-voltage-range circuit blocks and a signal wall function for turning off each block independently--without invalid signal transmission--by using an internal power switch.

  • A Power Reduction Scheme for Data Buses by Dynamic Detection of Active Bits

    Masanori MUROYAMA  Akihiko HYODO  Takanori OKUMA  Hiroto YASUURA  

     
    PAPER

      Page(s):
    598-605

    To transfer a small number, we inherently need a small number of bits. However all bit lines on a data bus change their status and redundant power is consumed. To reduce the redundant power consumption, we introduce a concept named active bit. In this paper, we propose a power reduction scheme for data buses using active bits. Suppressing switching activity of inactive bits, we can reduce redundant power consumption. We propose various power reduction techniques using active bits and the implementation methods. Experimental results illustrate up to 54.2% switching activity reduction.

  • Memory Data Organization for Low-Energy Address Buses

    Hiroyuki TOMIYAMA  Hiroaki TAKADA  Nikil D. DUTT  

     
    PAPER

      Page(s):
    606-612

    Energy consumption has become one of the most critical constraints in the design of portable multimedia systems. For media applications, address buses between processor and data memory consume a considerable amount of energy due to their large capacitance and frequent accesses. This paper studies impacts of memory data organization on the address bus energy. Our experiments show that the address bus activity is significantly reduced by 50% through exploring memory data organization and encoding address buses.

  • Circuit Partition and Reordering Technique for Low Power IP Design

    Kun-Lin TSAI  Shanq-Jang RUAN  Chun-Ming HUANG  Edwin NAROSKA  Feipei LAI  

     
    PAPER

      Page(s):
    613-620

    Circuit partition, retiming and state reordering techniques are effective in reducing power consumption of circuits. In this paper, we propose a partition architecture and a methodology to reduce power consumption when designing low power IP, named PRC (Partition and Reordering Circuit). The circuit reordering synthesis flow consists of three phases: first, evenly partition the circuit based on the Shannon expansion; secondly encode the output vectors of each partition to build an equivalent functional logic. Finally, apply reordering algorithm to reorganize the logic function to reduce power consumption and decrease area cost. The validity of our architecture is proven by applying it to MCNC benchmark with simulation environment.

  • A Design for Testability Technique for Low Power Delay Fault Testing

    James Chien-Mo LI  

     
    PAPER

      Page(s):
    621-628

    This paper presents a Quiet-Noisy scan technique for low power delay fault testing. The novel scan cell design provides both the quiet and noisy scan modes. The toggling of scan cell outputs is suppressed in the quiet scan mode so the power is saved. Two-pattern tests are applied in the noisy scan mode so the delay fault testing is possible. The experimental data shows that the Quiet-Noisy scan technique effectively reduces the test power to 56% of that of the regular scan. The transition fault coverage is improved by 19.7% compared to an existing toggle suppression low power technique. The presented technique requires very minimal changes in the existing MUX-scan Design For Testability (DFT) methodology and needs virtually no computation. The penalties are area overhead, speed degradation, and one extra control in test mode.

  • Pipelined Wake-Up Scheme to Reduce Power Line Noise for Block-Wise Shutdown of Low-Power VLSI Systems

    Jin-Hyeok CHOI  Yong-Ju KIM  Jae-Kyung WEE  Seongsoo LEE  

     
    LETTER

      Page(s):
    629-633

    Block-wise shutdown of idle functional blocks in VLSI systems is a promising approach to reduce power consumption. Especially, multi-threshold voltage CMOS (MTCMOS) is widely accepted to save leakage power during idle time. As operating frequency increases, it requires short wake-up time to use the shutdown block in time. However, short wake-up time of a large block causes large current surge during wake-up process. This often leads to system malfunction due to severe power line noise. This is one of the serious problems for practical implementation of MTCMOS block-wise shutdown. This letter proposes an effective wake-up scheme for block-wise shutdown of low-power VLSI systems. It exploits pipelined wake-up strategy that reduces current surge during wake-up process. In this letter, the proposed scheme was analyzed and simulated from the viewpoint of power distribution network. To verify its validity, it was applied to a multiplier block in Compact Flash controller chip on a test board. According to the simulation results of equivalent R, L, and C modeling, the proposed scheme achieved significant improvement over conventional concurrent shutdown schemes.

  • Regular Section
  • A Low Power and Small Area Analog Adaptive Line Equalizer for 100 Mb/s Data Rate on UTP Cable

    Kwisung YOO  Hoon LEE  Gunhee HAN  

     
    PAPER-Electronic Circuits

      Page(s):
    634-639

    The cable length in wired serial data communication is limited because the limited bandwidth of a long cable introduces ISI (Inter Symbol Interference). A line equalizer can be used at the receiver to extend the channel bandwidth. This paper proposes a low-power and small-area analog adaptive line equalizer for 100-Mb/s operation on UTP (Unshielded Twisted Pair) cable up to 100 m. The proposed adaptive line equalizer is fabricated with 0.35-µm CMOS process, consumes 19 mW and occupies only 0.07 mm2 Measurement results show that the prototype can operate at data rate of 100 Mb/s on a 100-m cable and 155 Mb/s on a 50-m cable.

  • A Low-Power Edge-Triggered and Logic-Embedded Flip-Flop Using Complementary Pass Transistor Circuit

    Ki-Tae PARK  Tomokatsu MIZUKUSA  Hyo-Sig WON  Hiroyuki KURINO  Mitsumasa KOYANAGI  

     
    LETTER-Electronic Circuits

      Page(s):
    640-644

    A new low power edge-triggered and logic embedded flip-flop based on complementary pass transistor circuit is proposed. This flip-flop provides small clock load, short propagation delay, single-phase clock scheme and small layout area. The flip-flop can reduce 35.2% power consumption while improving 24.7% propagation delay in comparison to conventional transmission-gate master-slave flip-flop in a standard 0.35 µm CMOS technology at 1.5 V power supply. In addition, logic functions can be embedded in the flip-flop. In 2-inputs multiplexer and flip-flop circuit, the proposed circuit can reduce 28.0% power consumption and improve 20.3% propagation delay compared to conventional circuit.

  • A Power-Down Circuit Scheme Using Data-Preserving Complementary Pass Transistor Flip-Flop for Low-Power High-Performance Multi-Threshold CMOS LSI

    Ki-Tae PARK  Tomokatsu MIZUKUSA  Hyo-Sig WON  Kyu-Myung CHOI  Jeong-Taek KONG  Hiroyuki KURINO  Mitsumasa KOYANAGI  

     
    LETTER-Electronic Circuits

      Page(s):
    645-648

    A new power-down circuit scheme using data-preserving complementary pass transistor flip-flop circuit for low-power, high-performance Multi-Threshold voltage CMOS (MTCMOS) LSI is presented. The proposed circuit can preserve a stored data during power-down period while maintaining low leakage current without any extra circuit and complex timing design. The flip-flop provides 24% improved delay and 30% less silicon area compared to conventional MTCMOS flip-flop circuit. A 16-bits DSP processor core using the proposed circuit and 0.18 µ m CMOS technology was designed. The DSP chip was successfully operated at 120 MHz, 1.65 V and its total leakage current in power-down mode was four orders smaller than conventional DSP chip.