Akira MOCHIZUKI Daisuke NISHINOHARA Takahiro HANYU
A new circuit technique based on pass-gate logic with dynamic supply-voltage and clock-frequency control is proposed for a low-power motion-vector detection VLSI processor. Since the pass-gate logic style offers the potential advantages of a small equivalent stray capacitance and a small number of short-circuit paths, its circuit implementation makes it possible to reduce the power dissipation while maintaining high-speed switching capability. When the calculation result is obtained partway through the calculation steps, additional power saving is achieved by combining the pass-gate logic circuitry with a mechanism that dynamically scales down the supply voltage and the clock frequency while maintaining the calculation throughput. As a typical example, a sum of absolute differences (SAD) unit in a motion-vector detection VLSI processor is implemented, and its efficiency in power saving is demonstrated.
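The abstract does not say how a result comes to be "obtained partway through the calculation steps". One common interpretation, assumed here, is early termination: the SAD accumulation for a candidate block stops as soon as the partial sum can no longer beat the best SAD found so far. The following Python sketch illustrates that idea only; the function name, block values, and candidate vectors are hypothetical, and the sketch does not model the voltage or frequency scaling itself.

```python
def sad_with_early_exit(current_block, candidate_block, best_so_far):
    """Accumulate |a - b| pixel by pixel and stop once the partial sum
    can no longer beat best_so_far. Returns (partial_or_full_sad, steps_used)."""
    total = 0
    steps = 0
    for a, b in zip(current_block, candidate_block):
        total += abs(a - b)
        steps += 1
        if total >= best_so_far:
            break  # result already known: this candidate cannot win
    return total, steps


# Hypothetical 4-pixel blocks keyed by candidate motion vector.
current = [10, 20, 30, 40]
candidates = {
    (0, 0): [10, 22, 29, 40],
    (1, 0): [90, 80, 70, 60],
    (0, 1): [10, 20, 30, 41],
}

best_vector, best_sad = None, float("inf")
steps_log = {}
for vector, block in candidates.items():
    sad, steps = sad_with_early_exit(current, block, best_sad)
    steps_log[vector] = steps
    if sad < best_sad:
        best_vector, best_sad = vector, sad

print(best_vector, best_sad, steps_log)
```

Under this assumption, the steps skipped for a rejected candidate are slack that a controller could spend on a slower clock and lower supply voltage instead of idling.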
This report describes a concrete method for realizing adiabatic charging reversible logic. First, we investigate the stabilization properties of a charge recycle regenerator using a switched capacitor circuit by SPICE simulation and an analytical method. In the N-step case, we proved that a step waveform is spontaneously generated. Next, for combinational logic, we propose an adiabatic charging binary decision diagram logic gate (AC-BDD) that uses this regenerator. The AC-BDD uses pass transistor logic based on a BDD, which is suitable for adiabatic logic. 8-bit AC-BDD multipliers were fabricated, and it is shown that power consumption is reduced to 15% of that of a CMOS design using the same design rules at 1 V and 1 MHz. Finally, we propose clocked energy reversible logic (CERL) that maintains the CMOS architecture for CMOS compatibility. CERL can reduce the clocked energy, which is used for charging the clock load capacitance, to 10% of that of CMOS by using a power clock from the charge recycle regenerator.
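As background for why a step waveform saves energy, the textbook result for ideal stepwise charging can be checked numerically: charging a capacitance C to a voltage V in N equal steps dissipates C·V²/(2N), compared with C·V²/2 for a single step. The Python sketch below assumes ideal step supplies and full settling at every step; it is not a model of the regenerator described in the abstract, and the component values are hypothetical.

```python
def stepwise_charging_loss(capacitance, v_final, n_steps):
    """Energy dissipated in the charging path when a capacitor is charged
    from 0 to v_final in n_steps equal voltage steps (full settling assumed)."""
    step = v_final / n_steps
    loss = 0.0
    v_cap = 0.0
    for k in range(1, n_steps + 1):
        v_supply = k * step
        delta_q = capacitance * (v_supply - v_cap)        # charge delivered
        energy_from_supply = v_supply * delta_q            # drawn at v_supply
        energy_stored = 0.5 * capacitance * (v_supply**2 - v_cap**2)
        loss += energy_from_supply - energy_stored         # dissipated as heat
        v_cap = v_supply
    return loss


C = 1e-12   # 1 pF, hypothetical load
V = 1.0     # 1 V swing

loss_1 = stepwise_charging_loss(C, V, 1)
loss_10 = stepwise_charging_loss(C, V, 10)
print(loss_1, loss_10, loss_10 / loss_1)
```

The measured 15% and 10% figures in the abstract come from the fabricated circuits and are not derived from this idealized formula.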
Masayuki SHIRANE Yoichi HASHIMOTO Hirohito YAMADA Hiroyuki YOKOYAMA
A compact and stable optical sampling measurement system with a temporal resolution of 2 ps has been developed. External-cavity mode-locked laser-diode (EC-MLLD) modules, which directly generate coherent 2-ps optical pulses, were used as the optical sampling pulse sources. Real-time measurement of the recovery dynamics in semiconductor saturable absorber devices has been achieved by optical sampling combined with the pump-probe method. An EC-MLLD module was also utilized for simple sub-harmonic all-optical clock recovery based on the synchronization of the mode-locking operation by optical-pulse injection. Optical sampling measurement of 160-Gbit/s return-to-zero signals incorporating all-optical clock recovery has been demonstrated.
An adaptive 4-state phase-frequency detector (PFD) for a clock and data recovery (CDR) PLL operating on non-return-to-zero (NRZ) data is presented. The PLL achieves false-lock-free operation with rapid frequency capture and a wide bit-rate capture range. Variable bit-rate operation is achieved by adaptive control of the data delay. The circuitry and overall architecture are described in detail. A z-domain analysis is also presented.
Enjian BAI Zhihua NIU Guozhen XIAO
In their paper, G. Gong and S.Q. Jiang construct a new pseudo-random sequence generator by using two ternary linear feedback shift registers (LFSRs). The new generator is called an editing generator, which is a combined model of the clock-controlled generator and the shrinking generator. For a special case (both the base sequence and the control sequence are m-sequences of degree n), the period, linear complexity, symbol distribution and security analysis are discussed in the same article. In this paper, we extend the randomness results for the edited sequence to more general cases, in which the base sequence and the control sequence are not restricted to the same length. For four special cases of this generator, the randomness of the edited sequence is discussed in detail. It is shown that for all four cases the editing generator has good properties, such as large periods, high linear complexities, a large ratio of linear complexity per symbol, and a small bias in the occurrences of symbols. All these properties are necessary for resisting attacks based on the Berlekamp-Massey algorithm.
Myeong-Hoon OH Seok-Jae PARK Dong-Ik LEE Ho-Yong CHOI
In this paper, we propose an advanced structure for the interface circuit, called a wrapper, for Globally Asynchronous Locally Synchronous (GALS) systems. The proposed wrapper is composed of a sender module and a receiver module. The sender module carries out data transfers efficiently by decoupling the dependency between an external handshake protocol and an internal clock. The decoupling allows the external handshake protocol and the internal clock to run concurrently and hence allows the wrapper to show better performance. We have designed our wrapper at the transistor level with 0.35-µm technology. When we compare our decoupled wrapper with two conventional wrappers based on a pausible clocking scheme, our simulation results show performance improvements of about 8-13% and 13-56%, respectively.
Chung-Seok (Andy) SEO Abhijit CHATTERJEE
A new approach to optical clock distribution utilizing optical waveguide interconnect technology is introduced. In this paper, we develop a new algorithm for design and optimization of embedded optical clock distribution networks for printed wiring boards. The optimization approach takes into account bending and propagation losses of optical waveguides. Less than 26.1 psec in signal timing skew is obtained for a signal flight time of 614.38 psec. About 15% reduction in optical power consumption is also obtained over clock nets routed with existing (optical) methods.
Young-Soo SOHN Seung-Jun BAE Hong-June PARK Soo-In CHO
A CMOS DFE (decision feedback equalization) receiver with clock-data skew compensation was implemented for the SSTL (stub-series terminated logic) SDRAM interface. The receiver consists of a 2-way interleaving DFE input buffer for ISI reduction and a ×2 over-sampling phase detector for finding the optimum sampling clock position. The measurement results at 1.2 Gbps operation showed that equalization increases the voltage margin by about 20% and decreases the time jitter in the recovered sampling clock by about 40% in an SSTL channel with a 2-pF, 4-stub load. Active chip area and power consumption are 300 × 1000 µm² and 142 mW, respectively, with a 2.5 V, 0.25 µm CMOS process.
Takaki YOSHIDA Masafumi WATARI
As semiconductor manufacturing technology advances, power dissipation and noise in scan testing have become critical problems. Our studies on practical LSI manufacturing show that power supply voltage drop causes testing problems during shift operations in scan testing. In this paper, we present a new testing method named MD-SCAN (Multi-Duty SCAN) which solves power supply voltage drop problems, as well as its experimental results applied to practical LSI chips.
Yasuo SATO Motoyuki SATO Koki TSUTSUMIDA Kazumi HATAYAMA Kazuyuki NOMOTO
We analyze the timing design methodology for testing chips using a multiple-clock domain scheme. We especially focus on the layout design of the design-for-test (DFT) circuits and the clock network. First, we demonstrate the built-in-self-testing (BIST) scheme for multiple-clock domains. Then, we discuss the layout method that achieves a low clock-skew between different clock domains with a small modification of the original user logic layout. Finally, we evaluate the fault coverage of our large ASIC chips designed using our new methodology. The short design period and high fault coverage of our methodology are confirmed using actual industrial designs. We introduce a viable approach for industrial designs because designers don't have to pay much attention to DFT. Our approach also provides designers with an easy method for LSI debugging and diagnostics.
This paper proposes a fast multi-cycle path detection method for large sequential circuits. The proposed method is based on ATPG techniques, especially on implication techniques, to use circuit structures and multi-cycle path conditions directly. The method also checks whether or not a multi-cycle path may be invalidated by static hazards at the inputs of flip-flops. Then we explain how to apply the proposed algorithm to real industrial designs. Experimental results show that our method is much faster than conventional ones and that it is efficient enough to handle large industrial designs.
Kenichi ICHINO Ko-ichi WATANABE Masayuki ARAI Satoshi FUKUMOTO Kazuhiko IWASAKI
We propose a technique of selecting seeds for the LFSR-based test pattern generators that are used in VLSI BISTs. By setting the computed seed as an initial value, target fault coverage, for example 100%, can be accomplished with minimum test length. We can also maximize fault coverage for a given test length. Our method can be used for both test-per-clock and test-per-scan BISTs. The procedure is based on vector representations over GF(2^m), where m is the number of LFSR stages. The results indicate that test lengths derived through selected seeds are about sixty percent shorter than those derived by simple seeds, i.e., 0001, for a given fault coverage. We also show that seeds obtained through this technique accomplish higher fault coverage than the conventional selection procedure. In terms of the c7552 benchmark, taking a test-per-scan architecture with a 20-bit LFSR as an example, the number of undetected faults can be decreased from 304 to 227 for 10,000 LFSR patterns using our proposed technique.
Naoki IMASAKI Ambalavanar THARUMARAJAH Shinsuke TAMURA Toshiaki TANAKA
This paper proposes a simulation framework suitable for holonic manufacturing systems (HMS), based on the concept of distributed self-simulation. An HMS is a distributed system that comprises autonomous and cooperative elements called holons, for flexible and agile manufacturing. The simulation framework proposed here capitalizes on this distributed nature: each holon functions like an independent simulator with self-simulation capabilities to maintain its own clock, handle events, detect inter-holon state inconsistencies, and perform rollback actions. This paper discusses the detailed architecture and design issues of such a simulator and reports on the results of a prototype.
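The combination of a local clock, event handling, and rollback can be sketched with a minimal optimistic simulator element. The Python class below is an illustration under stated assumptions, not the architecture of the paper: the class and method names are hypothetical, an "inconsistency" is taken to mean an event arriving with a timestamp earlier than the local clock, and cancellation of messages already sent to other holons is omitted.

```python
class Holon:
    """A self-simulating element with its own clock and a rollback log."""

    def __init__(self, name):
        self.name = name
        self.clock = 0
        self.state = 0
        self.history = []   # (timestamp, value, state_before, clock_before)
        self.rollbacks = 0

    def handle(self, timestamp, value):
        """Process an event; roll back first if it arrives in the local past."""
        if timestamp < self.clock:
            self._rollback_and_replay(timestamp, value)
        else:
            self._apply(timestamp, value)

    def _apply(self, timestamp, value):
        self.history.append((timestamp, value, self.state, self.clock))
        self.state = self.state * 2 + value   # order-dependent toy update
        self.clock = timestamp

    def _rollback_and_replay(self, timestamp, value):
        self.rollbacks += 1
        undone = []
        # Undo every event that was processed with a later timestamp.
        while self.history and self.history[-1][0] > timestamp:
            ts, val, state_before, clock_before = self.history.pop()
            self.state, self.clock = state_before, clock_before
            undone.append((ts, val))
        self._apply(timestamp, value)          # the late event, now in order
        for ts, val in reversed(undone):       # replay the undone events
            self._apply(ts, val)


holon = Holon("machine-1")
holon.handle(1, 1)
holon.handle(5, 3)
holon.handle(3, 2)   # late event from another holon triggers a rollback
print(holon.clock, holon.state, holon.rollbacks)
```

Because the toy state update depends on event order, the final state matches what strictly timestamp-ordered processing would produce only if the rollback and replay are performed.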
Stefano SANTI Riccardo ROVATTI Gianluca SETTI
We investigate the statistical features of both random- and chaos-based FM timing signals to ascertain their applicability to digital circuits and systems. To this end, we consider both single- and two-phase logic and characterize the random variable representing, respectively, the time lag between two subsequent rising edges or between two consecutive zero-crossing points of the modulated timing signal. In particular, we determine its probability density and compute its mean value and variance for cases that are relevant for reducing electromagnetic emissions. Finally, we address the possible performance degradation in a digital system driven by a modulated timing signal and, to cope with it, give some guidelines for the proper choice of the statistical properties of the modulating signals.
Takashi HASHIMOTO Shunichi KUROMARU Masayoshi TOUJIMA Yasuo KOHASHI Masatoshi MATSUO Toshihiro MORIIWA Masahiro OHASHI Tsuyoshi NAKAMURA Mana HAMADA Yuji SUGISAWA Miki KUROMARU Tomonori YONEZAWA Satoshi KAJITA Takahiro KONDO Hiroki OTSUKI Kohkichi HASHIMOTO Hiromasa NAKAJIMA Taro FUKUNAGA Hiroaki TOIDA Yasuo IIZUKA Hitoshi FUJIMOTO Junji MICHIYAMA
A low-power MPEG-4 video codec LSI with the capability for core profile decoding is presented. A 16-b DSP with a vector pipeline architecture and a 32-b arithmetic unit, eight dedicated hardware engines to accelerate MPEG-4 SP@L1 codec, CP@L1 decoding and post video processing, 20-Mb embedded DRAM, and three peripheral blocks are integrated together on a single chip. MPEG-4 SP@L1 codec, CP@L1 decoding and post video processing are realized with a hybrid architecture consisting of a programmable DSP and dedicated hardware engines at a low operating frequency. To reduce the power consumption, a clock gating technique is fully adopted in each hardware block and embedded DRAM is employed. The chip is implemented using 0.18-µm quad-metal CMOS technology, and its die area is 8.8 mm × 8.6 mm. The power consumption is 90 mW for SP@L1 codec operation and 110 mW for CP@L1 decoding.
Jae-Wook LEE Cheon-O LEE Woo-Young CHOI
A new clock and data recovery (CDR) circuit is realized for data communication systems requiring GHz-range clock signals. High-frequency jitter is one of the major performance-limiting factors in CDR, particularly when NRZ data patterns are used. A novel phase detector is able to suppress this noise, and stable clock generation is achieved. Furthermore, optical characteristics for fast locking are achieved with the adaptive delay cell in the phase detector. The circuit is designed in a 0.25 µm CMOS fabrication process, and its performance is verified by measurement results.
Tsutomu YOSHIMURA Kimio UEDA Jun TAKASOH Harufusa KONDOH
In this paper, we present a 10 Gbase Ethernet Transceiver that is suitable for 10 Gb/s Ethernet applications. The 10 Gbase Ethernet Transceiver LSI, which contains the high-speed interface and fully integrated IEEE 802.3ae-compliant logic, is fabricated in a 0.18 µm SOI/CMOS process and dissipates 2.9 W at a 1.8 V supply. By combining a monolithic approach with an advanced CMOS process, this 10 GbE transceiver realizes a low-power, low-cost and compact solution for the exponentially increasing needs of broadband network applications.
Peilin LIU Li JIANG Hiroshi NAKAYAMA Toshiyuki YOSHITAKE Hiroshi KOMAZAKI Yasuhiro WATANABE Hisakatsu ARAKI Kiyonori MORIOKA Shinhaeng LEE Hajime KUBOSAWA Yukio OTOBE
We have developed a low-power, high-performance MPEG-4 codec LSI for mobile video applications. This codec LSI is capable of up to CIF 30-fps encoding, making it suitable for various visual applications. The measured power consumption of the codec core was 9 mW for QCIF 15-fps codec operation and 38 mW for CIF 30-fps encoding. To provide an error-robust MPEG-4 codec, we implemented an error-resilience function in the LSI. We describe the techniques that have enabled low power consumption and high performance and discuss our test results.
Jae-Seung HWANG Chul-Soo PARK Chang-Soo PARK
We propose a simple technique for reducing the jitter of the output clock generated in the clock recovery circuit (CRC) for burst-mode data transmission. By using this technique, the proposed CRC based on the gated oscillator (GO) can recover the output clock with low jitter even when long runs of consecutive identical bits are encountered in the system. The circuit is composed only of digital logic devices and can recover the input data without error for runs of up to 1,000 consecutive identical bits.
Kang-Yoon LEE Deog-Kyoon JEONG
A simple phase detector that reduces the pattern-dependent jitter in a clock recovery circuit is developed in this paper. The developed phase detector automatically aligns the recovered clock to the center of the data eye, while producing no ripple on the control voltage in the locked condition of the PLL-based clock recovery circuit. The UP and DOWN signals are separately generated so that they are aligned in the locked condition. Thus, no explicit transient waveforms exist at the output of the phase detector. The elimination of high-frequency ripple improves the jitter characteristics of the clock recovery circuit. The delay unit used in our phase detector requires no accurate control of the delay time. This feature eliminates the need for a DLL to generate a precise delay time, which reduces the power consumption and area of the phase detector. The simulation shows that the RMS timing jitter is reduced by a factor of more than four compared with the conventional scheme: the RMS jitter is 32 ps for the proposed phase detector and 133 ps for the phase detector in the conventional scheme. In the conventional scheme, even when lock is achieved, the phase detector produces a triwave transient on the control voltage of the VCO, which depends on the data pattern. In the proposed phase detector, no such transient waveforms exist. The proposed phase detector can be incorporated in high-performance clock recovery circuits for data communication systems.
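The "more than four" claim follows directly from the two simulated RMS jitter values quoted in the abstract. A short Python check of the arithmetic (variable names are illustrative):

```python
conventional_rms_ps = 133.0   # RMS jitter, conventional phase detector
proposed_rms_ps = 32.0        # RMS jitter, proposed phase detector

reduction_factor = conventional_rms_ps / proposed_rms_ps
relative_reduction = 1.0 - proposed_rms_ps / conventional_rms_ps

print(f"reduction factor: {reduction_factor:.2f}")
print(f"relative reduction: {relative_reduction:.1%}")
```

The ratio is about 4.16, which corresponds to removing roughly three quarters of the RMS jitter.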