
IEICE TRANSACTIONS on Electronics

  • Impact Factor

    0.63

  • Eigenfactor

    0.002

  • article influence

    0.1

  • Cite Score

    1.3


Volume E84-C No.2  (Publication Date:2001/02/01)

    Special Issue on Low-Power High-Performance VLSI Processors and Technologies
  • FOREWORD

    Gensuke GOTO  

     
    FOREWORD

      Page(s):
    129-130
  • Trends in High-Performance, Low-Power Processor Architectures

    Kazuaki MURAKAMI  Hidetaka MAGOSHI  

     
    PAPER

      Page(s):
    131-138

    This paper briefly surveys architectural technologies of recent and future high-performance, low-power processors that improve performance and power/energy consumption simultaneously. Achieving both high performance and low power at the same time imposes many challenges on processor design, and therefore offers many opportunities for devising new technologies. The paper also tries to provide some insight into future technology directions.

  • Embedded Processor Core with 64-Bit Architecture and Its System-On-Chip Integration for Digital Consumer Products

    Kunio UCHIYAMA  Fumio ARAKAWA  Yasuhiko SAITO  Koki NOGUCHI  Atsushi HASEGAWA  Shinichi YOSHIOKA  Naohiko IRIE  Takeshi KITAHARA  Mark DEBBAGE  Andy STURGES  

     
    PAPER

      Page(s):
    139-149

    A 64-bit architecture for an embedded processor targeted for next-generation digital consumer products has been developed. It has dual-mode instruction sets and is optimized for high multimedia performance, provided by SIMD/floating-point vector instructions in 32-bit length ISA, and small code size, provided by a conventional 16-bit length ISA. Large register files (64 × 64 b and 64 × 32 b), a split-branch mechanism, and virtual cache are also adopted in the architecture. A 714 MIPS/9.6 GOPS/400 MHz processor core with the 64-bit architecture and a system LSI containing the core are developed using 0.15-µm technology. The LSI includes a 3.2 GB/sec high-bandwidth on-chip bus, a high-speed DRAM interface, a SRAM/Flash/ROM/Multiplexed-bus interface, and a 66 MHz PCI interface that provide the performance required for next-generation multimedia applications.

  • A 350 MHz 5.6 GOPS/1.4 GFLOPS 4-Way VLIW Embedded Microprocessor

    Hiroshi OKANO  Atsuhiro SUGA  Hideo MIYAKE  Yoshimasa TAKEBE  Yasuki NAKAMURA  Hiromasa TAKAHASHI  

     
    PAPER

      Page(s):
    150-156

    A 5.6 GOPS/1.4 GFLOPS 350 MHz, four-way very long instruction word (VLIW) microprocessor is developed for embedded applications in a 0.18 µm five-layer-metal CMOS process. This processor features a two-way integer pipeline and two-way floating/media pipelines. Each floating pipeline and media pipeline has two-parallel and four-parallel single instruction multiple-data (SIMD) mechanisms, respectively. The processor has separate instruction and data caches, each 16 KB in size and four-way set-associative. The data cache employs a non-blocking technique and can process two load instructions in parallel. The processor had about a 50% clock net power reduction compared with one without power optimization. 6.7 million transistors are integrated in an area of 7.5 mm × 7.5 mm. Since all circuit blocks were developed using logic synthesis, the processor is easy to adapt to system-on-a-chip (SoC) applications.

  • A Low Power Media Processor Core Performable CIF30 fr/s MPEG4/H26x Video Codec

    Hideo OHIRA  Toshihisa KAMEMARU  Hirokazu SUZUKI  Ken-ichi ASANO  Masahiko YOSHIMOTO  

     
    PAPER

      Page(s):
    157-165

    An architectural design of a media processor core optimized for MPEG4/H26x video codec targeted for use in mobile multimedia terminals is presented. The architecture consists of a maximum 6.4 GOPS SIMD (Single Instruction Multiple Data) processor, a RISC processor, a VLC processor, and an intelligent DMA controller. The unique SIMD processor completes 2-D DCT processing in 132 clock cycles, or block matching (16 by 16 pixels) in 24 clock cycles. The VLC processor completes 8 by 8 block run-level coding in an average of 10 clock cycles in the case of low bit-rates. The transpose registers in the SIMD processor, the data sub-sampling technique in the DMA, and the data-sliding technique between PEs (Processor Elements) in the SIMD processor eliminate a large amount of cycle loss for data handling, and extract the highest level of performance. Through the use of the above architecture and the lower power approach, CIF 30 frames/s MPEG4 Simple Profile video codec @ 100 MHz can be achieved. Estimated dissipation is as low as 280 mW. 300 kgates and 16 kBytes of four-port SRAM are contained on a 12 mm² area by using 0.18 µm process technology. The combination of the RISC processor and SIMD processor can also operate MPEG4 core profile (shape coding) that requires flexibility and performance.

  • A Dynamically Configurable Multi-Format PSK Demodulator for Digital HDTV Using Broadcasting-Satellite

    Eiji ARITA  Takashi FUJIWARA  Kin-ichiro NISHIYAMA  Akiko MAENO  Yasuo MATSUNAMI  Masahiko NAKAMURA  Hirohisa MACHIDA  Shuji MURAKAMI  Hiroyuki NAKAYAMA  Masahiko YOSHIMOTO  

     
    PAPER

      Page(s):
    166-174

    A complete single-chip multi-format Phase Shift Keying (PSK) demodulator ULSI for Japanese BS digital broadcasting is reported. The carrier recovery system shows a pull-in range of up to +/-5 MHz. The clock recovery system cancels the poor group delay characteristic and the orthogonality degradation caused by the analog front end, and improves the BER performance by 0.2 dB. Thus the requirement on the analog front end is relaxed. A digital PLL ensures minimum program clock reference jitter in the output data stream, which simplifies jitter management in the succeeding MPEG2 system decoder. It integrates two 8-bit 60 MHz ADCs, a 58 MHz VCO, 1 Mbit SRAM and the 450 K-gate FEC-demodulator core. Implementation of 1 Mbit de-interleaver RAM facilitates the use of a low cost receiver. The 8.8 million transistor chip occupies 72 mm² in a 0.25 µm triple-metal CMOS technology.

  • Fully Digital Preambleless 40 Mbps QPSK Receiver for Burst Transmission

    Seung-Geun KIM  Wooncheol HWANG  Youngsun KIM  Youngkou LEE  Sungsoo CHOI  Kiseon KIM  

     
    PAPER

      Page(s):
    175-182

    We present a case of design and implementation of a high-speed burst QPSK (Quaternary Phase Shift Keying) receiver. Since PSK modulation carries its information through the phase, the baseband digital receiver can recover the transmitted symbol from the received phase. The implemented receiver estimates symbol time and frequency offset using sampled data over 32 symbols without transmitted symbol information, and embedded RAM is used to delay the received phase over the estimation time. The receiver is implemented using about 92,000 gates of the Samsung KG75 SOG library, which uses 0.65 µm CMOS technology. The fabricated chip test result shows that the receiver operates at a 40 MHz clock rate on 5.6 V, which is equivalent to the 40 Mbps data rate.
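
    The idea that a PSK receiver can recover a symbol from the received phase alone can be sketched in Python. This is only an illustration of a phase-based decision, not the receiver described in the paper: the constellation (four points at 45°, 135°, 225° and 315°), the symbol numbering, and the function names `qpsk_modulate` and `qpsk_decide` are all assumptions, and no timing or frequency-offset estimation is modeled.

```python
import cmath
import math


def qpsk_modulate(symbol: int) -> complex:
    """Map a symbol index 0..3 to a unit-amplitude point (assumed constellation)."""
    return cmath.exp(1j * (math.pi / 4 + symbol * math.pi / 2))


def qpsk_decide(sample: complex) -> int:
    """Recover the symbol index from the phase of a received baseband sample."""
    phase = cmath.phase(sample) % (2 * math.pi)  # fold phase into [0, 2*pi)
    # Each symbol owns one quadrant of width pi/2; the final % 4 guards the wrap-around.
    return int(phase // (math.pi / 2)) % 4


# A residual phase error smaller than pi/4 does not change the decision.
received = qpsk_modulate(2) * cmath.exp(0.3j)
print(qpsk_decide(received))
```

    Because the decision uses only the phase, the amplitude of the sample does not matter in this sketch; a real burst receiver would first have to correct the symbol timing and frequency offset.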

  • A High-Performance Videophone Chip with Dual Multimedia VLIW Processor Cores

    Jeong-Min KIM  Yun-Su SHIN  In-Gu HWANG  Kwang-Sun LEE  Sang-Il HAN  Sang-Gyu PARK  Soo-Ik CHAE  

     
    PAPER

      Page(s):
    183-192

    A chip is described that integrates two multimedia VLIW processor cores with a hardware streaming engine. It can implement a real-time videophone, or an MPEG4 codec. Each processor core has identical resources, and shares the memory and system I/O interface units. With its symmetric structure, applications can be executed on either processor without constraints. To accelerate multimedia-specific applications, the architecture of this processor has several features. It merges the features of a RISC and a DSP, its instruction set is extended to accelerate both video and audio applications, and it supports an efficient embedded memory system, to reduce both the bandwidth and the latency for multimedia applications needing frequent memory accesses. The chip will be a 100 mm² die that contains 700 K logic gates, 60 KB RAM, and 16 KB ROM, in a 0.25-µm CMOS standard cell technology. At 65 MHz operating frequency, it can process H.263 video coding at CIF 15 frames/sec, and G.723.1 audio coding with an 80% processing time allocation.

  • A Low-Power High-Performance Vector-Pipeline DSP for Low-Rate Videophones

    Kazutoshi KOBAYASHI  Makoto EGUCHI  Takuya IWAHASHI  Takehide SHIBAYAMA  Xiang LI  Kosuke TAKAI  Hidetoshi ONODERA  

     
    PAPER

      Page(s):
    193-201

    We propose a vector-pipeline processor VP-DSP for low-rate videophones which can encode and decode 10 frames/sec. of QCIF through a 29.2 kbps low-rate line. We have already fabricated a VP-DSP LSI by a 0.35 µm CMOS process. The area of the VP-DSP core is 4.26 mm². It works properly at 25 MHz/1.6 V with a power consumption of 49 mW. Its peak performance is up to 400 MOPS, 8.2 GOPS/W.
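
    The efficiency figure quoted in this abstract can be checked against the other numbers it gives. The short Python calculation below assumes that the 8.2 GOPS/W figure is simply the 400 MOPS peak performance divided by the 49 mW power consumption; the abstract does not state how it was derived, and the variable names are illustrative.

```python
peak_mops = 400.0  # peak performance quoted in the abstract, in MOPS
power_mw = 49.0    # power consumption quoted in the abstract, in mW

# MOPS per mW is numerically the same as GOPS per W (both scale by 1000).
gops_per_watt = peak_mops / power_mw
print(f"{gops_per_watt:.1f} GOPS/W")
```

    Under that assumption the result rounds to the 8.2 GOPS/W stated in the abstract.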

  • An Embedded Software Scheme for a Real-Time Single-Chip MPEG-2 Encoder System with a VLIW Media Processor Core

    Hiroshi SEGAWA  Yoshinori MATSUURA  Satoshi KUMAKI  Tetsuya MATSUMURA  Stefan SCOTZNIOVSKY  Shu MURAYAMA  Tetsuro WADA  Ayako HARADA  Eiji OHARA  Ken-ichi ASANO  Toyohiko YOSHIDA  Yasutaka HORIBA  

     
    PAPER

      Page(s):
    202-211

    This paper describes an embedded software scheme for a single-chip MPEG-2 encoder that executes concurrent video, audio, and system encoding in real-time. The software features a scalable module structure, which is hierarchically composed and has expandable plug-in modules. For increased applicability, several task-modules are prepared for the respective video, audio, and system processing. In addition, an effective task management scheme that features polling and interrupt-based task switching has been proposed in order to achieve real-time operation. The software having these features and including all task-modules is implemented on a single media-processor D30V on a single chip MPEG-2 video, audio, and system encoder. This encoder realizes real-time MPEG-2 video encoding, Dolby Digital or MPEG-1 audio encoding, and system encoding that generates TS or PS over 50 Mbps for various applications. Assuming a DVD or DTV encoder system, the software is reconstructed with less than 56.6-kbytes of instruction and 145.6 MIPS performance. The single media-processor with 64-kbytes of instruction RAM and 162 MIPS performance, running at a clock rate of 162 MHz, can successfully accomplish a real-time operation with the proposed embedded software.

  • Low Power Current-Cut Switched-Current Matched Filter for CDMA

    Kenji TOGURA  Hiroyuki NAKASE  Koji KUBOTA  Kazuya MASU  Kazuo TSUBOUCHI  

     
    PAPER

      Page(s):
    212-219

    We have proposed a current-cut switched-current matched filter (CC-SIMF) for direct-sequence code-division multiple-access (DS-CDMA). The 256-chip CC-SIMF can achieve low power consumption of less than 10 mW under high-speed operation of more than 16 Mcps. To reduce the current transfer error accumulation, we propose a parallel SIMF configuration. A 128-chip SIMF using a 0.8-µm Complementary Metal Oxide Semiconductor (CMOS) process has been designed and fabricated. Optimization of the current memory cell structure has been described. The correlation operation at 16 Mcps has been obtained using a 128-chip orthogonal m-sequence. The code phase separation performance for path diversity has been clearly observed. The power consumption has been significantly reduced using the current-cut method.

  • A 1-GHz Portable Digital Delay-Locked Loop with Infinite Phase Capture Ranges

    Koichiro MINAMI  Masayuki MIZUNO  Hiroshi YAMAGUCHI  Toshihiko NAKANO  Yusuke MATSUSHIMA  Yoshikazu SUMI  Takanori SATO  Hisashi YAMASHIDA  Masakazu YAMASHINA  

     
    PAPER

      Page(s):
    220-228

    This paper describes a 1-GHz portable digital delay-locked loop (DLL) with 0.15-µm CMOS technology. There are three factors contributing to jitter in digital DLLs. One is supply-noise induced jitter, another is jitter caused by delay time resolution and phase step in the delay line, and the third is jitter caused by the sensitivity of the phase detector. In order to achieve a low jitter digital DLL, we have developed a master-slave architecture that achieves infinite phase capture ranges and low latency, a delay line that improves the delay time resolution, a phase step suppression technique and a dynamic phase detector with increased sensitivity. These techniques were used to fabricate a digital DLL with improved jitter performance. Measured results showed that the DLL successfully achieves 29-ps peak-to-peak jitter with a quiet supply and 0.2-ps/mV supply sensitivity.

  • A Cascade ALU Architecture for Asynchronous Super-Scalar Processors

    Motokazu OZAWA  Masashi IMAI  Yoichiro UENO  Hiroshi NAKAMURA  Takashi NANYA  

     
    PAPER

      Page(s):
    229-237

    Wire delays, instead of gate delays, are moving into dominance in modern VLSI design. Current synchronous processors have the critical path not in the ALU function but in the cache access. Since the cache performance enhancement is limited by the memory access delay, which mainly consists of wire delays, a reduction in gate delays may no longer imply any enhancement in processor performance. To solve this problem, this paper presents a novel architecture, called the Cascade ALU. The Cascade ALU allows super-scalar processors with future technologies to move the critical path into the ALU part. Therefore the Cascade ALU can enjoy the expected progress in future device speed. Since the delay of the Cascade ALU varies depending on the executed instructions, an asynchronous system is shown to be suitable for implementing the Cascade ALU. Because an asynchronous system may have a large handshake overhead, this paper also presents an asynchronous Fine Grain Pipeline technique that hides the handshake overhead. Finally, this paper presents results of performance and area evaluation for an asynchronous implementation of the Cascade ALU. The results show that the Cascade ALU architecture has good performance scalability with the reduction of the ALU latency and imposes little area penalty compared with current synchronous processors.

  • Regular Section
  • Modal-Matching Analysis of Loss in Bent Graded-Index Optical Slab Waveguides

    Maria MIRIANASHVILI  Kazuo ONO  Masashi HOTTA  

     
    PAPER-Electromagnetic Theory

      Page(s):
    238-242

    Loss analysis in bent graded-index optical slab waveguides is given using the modal-matching method. The conformal mapping replaces the curved structure by an equivalent straight waveguide with a modified index profile. For this planar waveguide structure, the normal modes are calculated using a multilayer approximation method. The wave incident on the bend is expanded initially into a finite set of normal modes of the equivalent straight structure, and the transverse fields are matched across the junction. The numerical results show the loss formation in the graded-index waveguides and its dependence on the effective index of the corresponding straight waveguide.

  • Behaviors of Negative Resistances and Its Influences on VCO Design

    Yao-Huang KAO  Tzung-Hsiung WU  

     
    PAPER-Microwaves, Millimeter-Waves

      Page(s):
    243-248

    The features of the negative resistance in common-source and common-gate FET configurations for wideband VCOs are studied. They are also explained by the simplified three-capacitor model. A design procedure is then developed. The results are applied to the design of a wideband oscillator in the several-gigahertz region.

  • 1.0 V Operation Power Heterojunction FET for Digital Cellular Phones

    Takehiko KATO  Yasunori BITO  Naotaka IWATA  

     
    PAPER-Microwaves, Millimeter-Waves

      Page(s):
    249-252

    This paper describes 1.0 V operation power performance of a double doped AlGaAs/InGaAs/AlGaAs heterojunction FET for personal digital cellular phones. The developed FET with a multilayer cap consisting of a highly Si-doped GaAs, an undoped GaAs and a highly Si-doped AlGaAs exhibited an on-resistance of 1.3 Ωmm and a maximum drain current of 620 mA/mm. A 28 mm gate-width device, operating with a drain bias voltage of 1.0 V, demonstrated an output power of 1.0 W, a power-added efficiency of 59% and an associated gain of 13.7 dB at an adjacent channel leakage power at 50 kHz off-center frequency of -48 dBc with a 950 MHz π/4-shifted quadrature phase shift keying signal.

  • Dynamic Floating Body Control SOI CMOS for Power Managed Multimedia ULSIs

    Fukashi MORISHITA  Kazutami ARIMOTO  Kazuyasu FUJISHIMA  Hideyuki OZAKI  Tsutomu YOSHIHARA  

     
    PAPER-Integrated Electronics

      Page(s):
    253-259

    A novel body potential-controlling technique for floating SOI CMOS circuits is proposed and verified in this study. High-speed operation is realized with a small chip size by using body-floating SOI transistors. The use of this technique allows the threshold voltage of the body-floating transistors to be varied transitionally. Therefore, the standby current of SOI CMOS logic is reduced to less than 1/50th of that required by the non-controlled operation of the body potential, and the logic operates at a high speed during the active period. There is no speed penalty for the recovery operation from the standby mode. This technique supports sub-1 V operation, which will be required by future battery-operated devices with wide-range covering.

  • A 200 V CMOS SOI IC with Field-Plate Trench Isolation for EL Displays

    Kazunori KAWAMOTO  Hitoshi YAMAGUCHI  Hiroaki HIMI  Seiji FUJINO  Isao SHIRAKAWA  

     
    PAPER-Integrated Electronics

      Page(s):
    260-266

    EL (Electroluminescent) displays have been applied to automobiles, as their images are very clear and bright. High voltage, high integration and low power dissipation ICs are needed to drive these devices. To meet this need, high voltage CMOS ICs using SOI (Silicon On Insulator) substrates are chosen as the driving devices. In this paper, a high-density, high-voltage isolation structure between the output CMOS devices is proposed. Conventional trench dielectric isolation shows degradation of the breakdown voltage when the distance from trench to source is short. In this work, the authors clarify the electric field distribution near the isolation, and offer a novel structure of "Field-plate Trench Isolation," which relaxes the electric field on the silicon surface by shifting a part of the electric field into the surface oxide. Finally, high-voltage and high-density operation of a 200-volt, 32-channel EL display driver for an automotive display panel is confirmed.

  • Ray Tracing Analysis of Large-Scale Random Rough Surface Scattering and Delay Spread

    Kwang-Yeol YOON  Mitsuo TATEIBA  Kazunori UCHIDA  

     
    LETTER-Electromagnetic Theory

      Page(s):
    267-270

    We have discussed a ray tracing method to estimate the scattering characteristics of a random rough surface. It has been shown from the traced rays that the diffracted rays dominate over the reflected rays. For the field evaluation, we have used the Fresnel function for the diffraction coefficient and Fresnel's reflection coefficients. Numerical examples have been carried out for the scattering characteristics of an ocean wave-like rough surface and the delay spread characteristics of a building-like surface. In the present work we have demonstrated that the ray tracing method is effective for numerical analysis of rough surface scattering.

  • Optically-Fed Radio Access Point Module for a Fibre-Radio Downlink System

    Seiji FUKUSHIMA  Hideki FUKANO  Kaoru YOSHINO  Yutaka MATSUOKA  Seiko MITACHI  Kiyoto TAKAHATA  

     
    LETTER-Microwaves, Millimeter-Waves

      Page(s):
    271-273

    A compact optically-fed radio access point module was developed that consists of a uni-traveling-carrier refracting-facet photodiode, a patch antenna, and an optical input interface. The output power from the photodiode was 1.4 dBm at a frequency of 5.88 GHz without any bias voltage.

  • A CMOS DC Voltage Doubler with Nonoverlapping Switching Control

    Shi-Ho KIM  Jorgo TSOUHLARAKIS  Jan Van HOUDT  Herman MAES  

     
    LETTER-Electronic Circuits

      Page(s):
    274-277

    A new CMOS DC voltage doubler with nonoverlapping switching control is proposed, in order to eliminate the dynamic current loss during switching as well as the threshold voltage drop of the serial switches. The simulated results at 1.5 V show that the maximum power efficiency is improved by about 30%, whereas the efficiency in the low output current region is more than 5 times that of the conventional voltage doublers. The proposed CMOS DC voltage doubler can be used as a VPP generator for low voltage DRAMs.