Hang Liu Fei Wu
Keiji GOTO Toru KAWANO Ryohei NAKAMURA
Takahiro SASAKI Yukihiro KAMIYA
Xiang XIONG Wen LI Xiaohua TAN Yusheng HU
Anton WIDARTA
Hiroshi OKADA Mao FUKINAKA Yoshiki AKIRA
Shun-ichiro Ohmi
Tohgo HOSODA Kazuyuki SAITO
Shohei Matsuhara Kazuyuki Saito Tomoyuki Tajima Aditya Rakhmadi Yoshiki Watanabe Nobuyoshi Takeshita
Koji Abe Mikiya Kuzutani Satoki Furuya Jose A. Piedra-Lorenzana Takeshi Hizawa Yasuhiko Ishikawa
Yihan ZHU Takashi OHSAWA
Shengbao YU Fanze MENG Yihan SHEN Yuzhu HAO Haigen ZHOU
Ryo KUMAGAI Ryosuke SUGA Tomoki UWANO
Jun SONODA Kazusa NAKAMICHI
Kaiji Owaki Yusuke Kanda Hideaki Kimura
Takuya FUJIMOTO
Yuji Wada
Fuyuki Kihara Chihiro Matsui Ken Takeuchi
Keito YUASA Michihiro IDE Sena KATO Kenichi OKADA Atsushi SHIRANE
Tomoo Ushio Yuuki Wada Syo Yoshida
Futoshi KUROKI
Jun FURUTA Shotaro SUGITANI Ryuichi NAKAJIMA Takafumi ITO Kazutoshi KOBAYASHI
Yuya Ichikawa Ayumu Yamada Naoko Misawa Chihiro Matsui Ken Takeuchi
Ayumu Yamada Zhiyuan Huang Naoko Misawa Chihiro Matsui Ken Takeuchi
Yoshinori ITOTAGAWA Koma ATSUMI Hikaru SEBE Daisuke KANEMOTO Tetsuya HIROSE
Hikaru SEBE Daisuke KANEMOTO Tetsuya HIROSE
Zhibo CAO Pengfei HAN Hongming LYU
Takuya SAKAMOTO Itsuki IWATA Toshiki MINAMI Takuya MATSUMOTO
Koji YAMANAKA Kazuhiro IYOMASA Takumi SUGITANI Eigo KUWATA Shintaro SHINJO
Minoru MIZUTANI Takashi OHIRA
Katsumi KAWAI Naoki SHINOHARA Tomohiko MITANI
Baku TAKAHARA Tomohiko MITANI Naoki SHINOHARA
Akihiko ISHIWATA Yasumasa NAKA Masaya TAMURA
Atsushi Fukuda Hiroto Yamamoto Junya Matsudaira Sumire Aoki Yasunori Suzuki
Ting DING Jiandong ZHU Jing YANG Xingmeng JIANG Chengcheng LIU
Fan Liu Zhewang Ma Masataka Ohira Dongchun Qiao Guosheng Pu Masaru Ichikawa
Ludovico MINATI
Minoru Fujishima
Hyunuk AHN Akito IGUCHI Keita MORIMOTO Yasuhide TSUJI
Kensei ITAYA Ryosuke OZAKI Tsuneki YAMASAKI
Akira KAWAHARA Jun SHIBAYAMA Kazuhiro FUJITA Junji YAMAUCHI Hisamatsu NAKANO
Seiya Kishimoto Ryoya Ogino Kenta Arase Shinichiro Ohnuki
Yasuo OHTERA
Tomohiro Kumaki Akihiko Hirata Tubasa Saijo Yuma Kawamoto Tadao Nagatsuma Osamu Kagaya
Haonan CHEN Akito IGUCHI Yasuhide TSUJI
Keiji GOTO Toru KAWANO Munetoshi IWAKIRI Tsubasa KAWAKAMI Kazuki NAKAZAWA
Scaling of CMOS integrated circuits is becoming difficult, mainly because of the rapid increase in power dissipation. How will semiconductor technology and the industry develop? This paper discusses challenges and opportunities in system LSI from three perspectives: the transistor level (physics), the IC level (electronics), and the business level (economics).
To realize a secure networking infrastructure, the author is carrying out the CUE (Coordinating Users' requirements and Engineering constraints) project with a network carrier and a VLSI manufacturer. Because the CUE-series data-driven processors developed in the project were specifically designed to serve as embedded programmable components as well as multi-processor elements, particular design considerations were taken to achieve the real-time multiprocessing capabilities essential in multimedia communication environments. A novel data-driven paradigm is first introduced with special emphasis on VLSI-oriented parallel processing architectures. Data-driven protocol handling on CUE-p and CUE-v1 is then discussed in terms of its real-time multiprocessing capability without any runtime overhead. The emulation facility RESCUE (Real-time Execution System for CUE-series data-driven processors) was also built to develop scalable chip multi-processors in a self-evolutional manner. Based on emulation results, the latest version, named CUE-v2, was realized as a hybrid processor that processes data-driven and control-driven threads simultaneously, both to achieve higher performance for inline processing and to avoid bottlenecks in the sequential parts of real-time programs that are frequently encountered in actual time-sensitive applications. Finally, the effectiveness of the data-driven chip multi-processor architecture is addressed in terms of lower power consumption and scalability for realizing future VLSI processors in the sub-100 nm era.
Noriyuki MINEGISHI Junichi MIYAKOSHI Yuki KURODA Tadayoshi KATAGIRI Yuki FUKUYAMA Ryo YAMAMOTO Masayuki MIYAMA Kousuke IMAMURA Hideo HASHIMOTO Masahiko YOSHIMOTO
An optical flow processor architecture is proposed. It offers accuracy and image-size scalability for video segmentation extraction. The Hierarchical Optical flow Estimation (HOE) algorithm [1] is optimized to provide a bit-length and iteration number appropriate for VLSI implementation. The proposed processor architecture provides the following features. First, an algorithm-oriented data-path is introduced to execute all necessary processes of optical flow derivation while minimizing hardware cost. The data-path is designed using a 4-SIMD architecture, which enables high-throughput operation. Thereby, it achieves real-time optical flow derivation with 100% pixel density. Second, it has a scalable architecture for higher accuracy and higher resolution. A third feature is the CMOS-process-compatible on-chip 2-port DRAM for die-area reduction. The proposed processor handles CIF images at 30 fr/s with a 189 MHz clock frequency. Its estimated core size is 6.02
Jumpei UCHIDA Nozomu TOGAWA Masao YANAGISAWA Tatsuo OHTSUKI
Elliptic curve cryptosystems are expected to be the next standard of public-key cryptosystems. The security level of an elliptic curve cryptosystem depends on the difficulty of the discrete logarithm problem on elliptic curves. The security level of an elliptic curve cryptosystem with a 160-bit public key is equivalent to that of an RSA system with a 1024-bit public key. We propose an elliptic curve cryptosystem LSI architecture embedding word-based Montgomery multipliers. Montgomery multiplication is an efficient method for finite field multiplication. We can design a scalable architecture for an elliptic curve cryptosystem by selecting the structure of the word-based Montgomery multipliers. Experimental results demonstrate the effectiveness and efficiency of the proposed architecture. In the hardware evaluation using a 0.18 µm CMOS library, the high-speed design using 126 Kgates with 20
Koichiro ISHIBASHI Tetsuya FUJIMOTO Takahiro YAMASHITA Hiroyuki OKADA Yukio ARIMA Yasuyuki HASHIMOTO Kohji SAKATA Isao MINEMATSU Yasuo ITOH Haruki TODA Motoi ICHIHASHI Yoshihide KOMATSU Masato HAGIWARA Toshiro TSUKADA
Circuit techniques for realizing low-voltage and low-power SoCs for 90-nm CMOS technology and beyond are described. The proposed SAFBB (self-adjusted forward body bias) technique, an ATC (Asymmetric Three-transistor Cell) DRAM, and an ADC using an offset-canceling comparator deal with leakage and variability issues in these technologies. A 32-bit adder using SAFBB attained 353 µA at 400-MHz operation with a 0.5-V supply voltage, and a 1-Mb memory array using ATC DRAM cells achieved 1.5 mA at 50 MHz and 0.5 V. The 4-bit ADC attained 2 Gsample/s operation at a supply voltage of 0.9 V.
Yukihito OOWAKI Shinichiro SHIRATAKE Toshihide FUJIYOSHI Mototsugu HAMADA Fumitoshi HATORI Masami MURAKATA Masafumi TAKAHASHI
The module-wise dynamic voltage and frequency scaling (MDVFS) scheme is applied to a single-chip H.264/MPEG-4 audio/visual codec LSI. The power consumption of the target module with controlled supply voltage and frequency is reduced by 40% in comparison with operation without voltage or frequency scaling. The chip consumes 63 mW when decoding QVGA H.264 video at 15 fps and MPEG-4 AAC LC audio simultaneously. The LSI keeps operating continuously even during the voltage transition of the target module by introducing a newly developed dynamic de-skewing system (DDS), which watches and controls the clock edge of the target module.
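The reason joint voltage and frequency scaling saves power can be illustrated with the first-order CMOS switching-power model P = C·VDD²·f. The sketch below is only an illustration under that textbook assumption; the function names and the 80% / 75% operating point are hypothetical and are not taken from the paper.

```python
def dynamic_power(c_eff, vdd, freq):
    """First-order CMOS switching power: P = C_eff * VDD^2 * f."""
    return c_eff * vdd ** 2 * freq


def relative_power(v_scale, f_scale):
    """Power of a scaled module relative to its unscaled operation.

    v_scale and f_scale are the scaled supply voltage and clock frequency
    expressed as fractions of their nominal values.
    """
    return dynamic_power(1.0, v_scale, f_scale) / dynamic_power(1.0, 1.0, 1.0)


if __name__ == "__main__":
    # Hypothetical operating point: 80% of nominal VDD, 75% of nominal clock.
    print(f"relative dynamic power: {relative_power(0.8, 0.75):.2f}")
```

Because voltage enters quadratically, lowering VDD together with the clock reduces power more than lowering the clock alone, which is the motivation for scaling both per module.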
Nobuaki KOBAYASHI Tomomi EI Tadayoshi ENOMOTO
To drastically reduce the dynamic power (PAT) and the leakage power (PST) of CMOS MPEG4/H.264 motion estimation (ME) circuits, several power reduction techniques were developed: circuit architectures that reduce the supply voltage (VDD) and the number of logic gates of not only the whole circuit but also the critical path, a fast motion estimation algorithm, and a leakage current reduction circuit. A 0.18-µm CMOS ME circuit has been fabricated by adopting those techniques. At a clock frequency of 160 MHz and a VDD of 1.25 V, PAT decreased to 75.9 µW, which was 5.35% of that of a conventional ME circuit. PST also decreased to 0.82 nW, which was 3.93% of that of the conventional ME circuit.
Canh Quang TRAN Hiroshi KAWAGUCHI Takayasu SAKURAI
A low-power FPGA design approach is proposed based on a fine-grain VDD control scheme called micro-VDD-hopping. Four configurable logic blocks (CLBs) are grouped into one block in which VDD is shared. In the micro-VDD-hopping scheme, VDD in each block is switched between VDDH (high VDD) and VDDL (low VDD) spatially and temporally in order to achieve lower power without degrading performance. A low-power level shifter that has less contention is also proposed for low-swing inter-block signals. The FPGA incorporates the Zigzag power-gating scheme, in which special care has been taken to cope with a sneak leakage-path problem. A test chip was fabricated using a 0.35-µm CMOS technology, together with a conventional fixed-VDD FPGA for comparison. Measurement results show that dynamic power in the proposed scheme can be reduced by 86% when the operating frequency is half of the maximum. Simulation using a 90-nm CMOS technology shows that leakage power can be reduced by 97% when the proposed method is used. The area overhead of the proposed FPGA is 2%.
Tetsuya YAMADA Masahide ABE Yusuke NITTA Kenji OGURA Manabu KUSAOKE Makoto ISHIKAWA Motokazu OZAWA Kiwamu TAKADA Fumio ARAKAWA Osamu NISHII Toshihiro HATTORI
A low-power SuperH™ embedded processor core, the SH-X2, has been designed in 90-nm CMOS technology. The power consumption was reduced by using hierarchical fine-grained clock gating to reduce the power consumption of the flip-flops and the clock-tree, synthesis and a layout that supports the implementation of the clock gating, and several-level power evaluations for RTL refinement. With this clock gating and RTL refinement, the power consumption of the clock-tree and the flip-flops was reduced by 35% and 59%, respectively, including the process shrinking effects. As a result, the SH-X2 achieved 6,000 MIPS/W using a Renesas low-power process with a lowered voltage. Its performance-power efficiency was 25% better than that of a 130-nm-process SH-X.
In deep sub-micrometer CMOS processes, owing to the thin gate oxide and small subthreshold voltage, the leakage current becomes more and more serious. The leakage current affects phase-locked loops (PLLs). In this paper, compensation circuits are presented to reduce the leakage current in the charge pump circuit and in the MOS capacitor used as the loop filter. The proposed circuit has been fabricated in a 0.13-µm CMOS process. The power consumption is 3 mW and the die area is 0.27
Hirotaka TAMURA Masaya KIBUNE Hisakatsu YAMAGUCHI Kouichi KANDA Kohtaroh GOTOH Hideki ISHIDA Junji OGAWA
The paper provides an overview of the circuit techniques for CMOS high-speed I/Os, focusing on the design issues in sub-100 nm standard CMOS. First, we describe the evolution of CMOS high-speed I/O since it appeared in the mid-1990s. In our view, the surge in I/O bandwidth experienced from the mid-1990s to the present was driven by the continuous improvement of CMOS IC performance. As a result, CMOS high-speed I/O has covered data rates ranging from 2.5 Gb/s to 10 Gb/s, and is now heading for 40 Gb/s and beyond. To meet the speed requirements, an optimum choice of the transceiver architecture and its building blocks is crucial. We pick the most critical building blocks, such as the decision circuit and the multiplexers, and give a detailed explanation of their designs. We describe the low-voltage operation of the high-speed I/O with a view to reducing the power consumption. An example of a 90-nm CMOS 2.5 Gb/s transceiver operating off a 0.8 V power supply is described. Operability at 0.8 V ensures that the circuits will not become obsolete, even below the 60 nm process node.
Kouichi YAMAGUCHI Muneo FUKAISHI
This paper describes a BIST circuit for testing SoC integrated multi-channel serializer/deserializer (SerDes) macros. A newly developed packet-based PRBS generator enables the BIST to perform at-speed testing of asynchronous data transfers. In addition, a new technique for chained alignment checks between adjacent channels helps achieve a channel-count-independent architecture for verification of multi-channel alignment between SerDes macros. Fabricated in a 0.13-µm CMOS process and operating at > 500 MHz, the BIST has successfully verified all SerDes functions in at-speed testing of 5-Gbps
Daisuke MIZOGUCHI Noriyuki MIURA Takayasu SAKURAI Tadahiro KURODA
A wireless interface for stacked chips in System-in-a-Package is presented. The interface utilizes inductive coupling between metal inductors. S21 parameters of the inductive coupling are measured between chips stacked face-up for the first time. Calculations from a theoretical model agree well with the measurements. A transceiver circuit for Non-Return-to-Zero signaling is developed to reduce power dissipation. The transceiver is implemented in a test chip fabricated in 0.35 µm CMOS, and the chips are stacked face-up. The chips communicate through the transceiver at 1.2 Gb/s/ch with 46 mW power dissipation at 3.3 V over a 300 µm distance. A scaling scenario is derived based on the theoretical model and measurement results. It indicates that, if the communication distance is reduced to 13 µm in 70 nm CMOS, 34 Tbps/mm² will be obtained.
Yoichi YUYAMA Akira TSUCHIYA Kazutoshi KOBAYASHI Hidetoshi ONODERA
In this paper, we propose alternate self-shielding to remove critical transitions of on-chip global interconnects. Our proposed method alternates shield and signal wires cycle by cycle. The conventional self-shielding methods need additional wires to remove critical transitions by encoding. The proposed alternate self-shielding, however, requires no additional wires. We evaluate our method by simulating signal transmission with a circuit simulator. The results show that the bit rate of the proposed method is 10% to 75% higher than that of the other methods.
Masahiro NOMURA Taku OHSAWA Koichi TAKEDA Yoetsu NAKAZAWA Yoshinori HIROTA Yasuhiko HAGIHARA Naoki NISHI
This paper describes a newly developed automatic direction control scheme for bi-directional bus repeaters that uses dynamic collaborative driving techniques. Repeater directions are rapidly determined by detecting the direction of control signal propagation through an additional control signal line that is driven by dynamic collaborative drivers. Application to an on-chip peripheral bus reduces control circuit transistor counts by about 75% and the number of control signal lines by about 50% without loss of speed. Experimental results for a 0.18-µm CMOS implementation indicate that the proposed scheme is four times faster than a conventional scheme with no bi-directional bus repeaters.
As technology scaling approaches the nano-scale region, variability in device performance becomes a major issue in the design of integrated circuits. Besides the growing amount of variability, the statistical nature of the variability is changing as technology generations progress. In the past, die-to-die variability, which is well managed by the worst-case design technique, dominated over within-die variability. At present and in the future, the amount of within-die variability is increasing, and it poses a challenge to design methodology. This paper first shows measured results of variability in three different processes of 0.35, 0.18, and 0.13 µm technologies, and explains the above-mentioned trend of variability. An example of modeling for the within-die variability is explained. The impact of within-die random variability on circuit performance is demonstrated using a simple numerical example. It shows that a circuit that is designed optimally under the assumption of deterministic delay is now most susceptible to random fluctuation in delay, which clearly indicates the requirement for a statistical design methodology.
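The last point can be illustrated with a small Monte Carlo experiment. This is a sketch under stated assumptions and is not the numerical example from the paper: a deterministically optimal circuit is modeled as n_paths paths that all have the same nominal delay, each path receives an independent Gaussian delay fluctuation, and the circuit delay is the slowest path. The function name and all parameter values are hypothetical.

```python
import random


def mean_critical_delay(n_paths, nominal=1.0, sigma=0.05, trials=2000, seed=0):
    """Average circuit delay when n_paths equally critical paths fluctuate.

    Each path delay is drawn independently from N(nominal, sigma); the
    circuit delay in one trial is the maximum over all paths.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    total = 0.0
    for _ in range(trials):
        total += max(rng.gauss(nominal, sigma) for _ in range(n_paths))
    return total / trials


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, round(mean_critical_delay(n), 3))
```

With a single critical path the average delay stays near the nominal value, but as more paths are balanced to the same nominal delay, the expected maximum moves upward. Under this simple model, that is why a design tuned for deterministic delay is the one most exposed to within-die random variation.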
Yasuo SATO Shuji HAMADA Toshiyuki MAEDA Atsuo TAKATORI Seiji KAJIHARA
In this paper we introduce a statistical quality model for delay testing that reflects fabrication process quality, design delay margin, and test timing accuracy. The model provides a measure that predicts the level of chip defects that cause delay failures, including marginal small delays. We can therefore use the model to make test vectors that are effective in terms of both testing cost and chip quality. The results of experiments using ISCAS89 benchmark data and some large industrial design data reflect various characteristics of our statistical delay quality model.
Yuuichirou IKEDA Masaya SUMITA Makoto NAGATA
We have developed a 32-bit, 32-word register file with 9 read ports and 7 write ports. This register file has several circuits and techniques for reducing the impact of process variation, which is pronounced in recent process technologies, as well as voltage variation and temperature variation, together known as PVT variation. We describe these circuits and techniques in detail, and confirm their effects by simulation and measurement of the test chip.
Toru NAKURA Makoto IKEDA Kunihiro ASADA
This paper demonstrates a feedforward active substrate noise cancelling technique using a power supply di/dt detector. Since the substrate is usually tied with the ground line with a low impedance, the substrate noise is closely related to the ground bounce which is proportional to the di/dt when inductance is dominant on the ground line impedance. Our active cancelling detects the di/dt of the power supply, and injects an anti-phase current into the substrate so that the di/dt-proportional substrate noise is cancelled out. Our first trial shows that 34% substrate noise reduction is achieved on our test circuit, and the theoretical analysis shows that the optimized canceller design will enhance the substrate noise suppression ratio up to 56%.
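The relation the abstract relies on, ground bounce v = L·di/dt followed by injection of an anti-phase signal, can be sketched numerically. The model below is idealized and assumes a perfectly anti-phase injection that matches a fraction `gain` of the noise. All names, the waveform, and the component values are hypothetical. The gain of 0.34 is chosen only to mirror the reported 34% reduction, not derived from the circuit.

```python
import math


def didt(samples, dt):
    """Finite-difference estimate of di/dt for a sampled current waveform."""
    return [(b - a) / dt for a, b in zip(samples, samples[1:])]


def substrate_noise(current, dt, inductance):
    """Ground-bounce noise v = L * di/dt (inductance-dominated ground line)."""
    return [inductance * d for d in didt(current, dt)]


def cancelled_noise(current, dt, inductance, gain):
    """Residual noise after adding an anti-phase signal of relative size `gain`."""
    noise = substrate_noise(current, dt, inductance)
    injected = [-gain * v for v in noise]  # anti-phase cancelling component
    return [v + c for v, c in zip(noise, injected)]


def peak(values):
    """Peak absolute amplitude of a waveform."""
    return max(abs(v) for v in values)


if __name__ == "__main__":
    dt = 0.1e-9  # 0.1 ns sample step (hypothetical)
    # Hypothetical 10 mA, 100 MHz supply-current ripple.
    current = [10e-3 * math.sin(2 * math.pi * 100e6 * k * dt) for k in range(200)]
    before = peak(substrate_noise(current, dt, 1e-9))
    after = peak(cancelled_noise(current, dt, 1e-9, gain=0.34))
    print(f"noise reduction: {1 - after / before:.0%}")
```

In this idealized model the suppression ratio simply equals the matching gain. A real canceller is limited by detector bandwidth, delay, and the accuracy of the injected amplitude.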
Mohamed ABBAS Makoto IKEDA Kunihiro ASADA
In this paper we present an on-chip noise detection circuit. In contrast with previous work on on-chip noise measurement, this detector does not assume specific noise properties such as periodicity. The detector is able to continuously capture a 10-nanosecond time window from the measured signal with a resolution of 100 picoseconds. The required bandwidth of the output channel can be adjusted freely; therefore, the user can avoid the effect of on-chip parasitics and the need for sophisticated off-chip monitoring tools. The detector is equipped with an on-chip programmable voltage divider, which enables accurate measurement of both high- and low-swing fluctuations. Therefore, the detector is suitable for measuring non-periodic/single-event noise for the purpose of reliability evaluation and performance modeling. The detector is implemented in a test chip using Hitachi 0.18 µm technology.
Makoto SUGIHARA Taiga TAKATA Kenta NAKAMURA Ryoichi INANAMI Hiroaki HAYASHI Katsumi KISHIMOTO Tetsuya HASEBE Yukihiro KAWANO Yusuke MATSUNAGA Kazuaki MURAKAMI Katsuya OKUMURA
We propose a cell library development methodology for throughput enhancement of character projection equipment. First, an ILP (Integer Linear Programming)-based cell selection is proposed for equipment in which both the CP (Character Projection) and VSB (Variable Shaped Beam) methods are available, in order to minimize the number of electron beam (EB) shots, that is, the time to fabricate chips. Second, the influence of cell directions on the area and delay time of chips is examined. The examination helps to reduce the number of EB shots with little deterioration of area and delay time, because unnecessary cell directions can be removed. Finally, a case study is presented in which the numbers of EB shots are shown for several cases.
Yoshihide KOMATSU Yukio ARIMA Koichiro ISHIBASHI
This paper describes a soft error hardened latch (SEH-Latch) scheme that has an error correction function in the fine process. The storage node of the latch is separated into three electrodes, and a soft error on one node is corrected by the other two nodes despite a large and long-lasting influx of radiation-induced charges. To achieve this, we designed two types of SEH-Latch circuits and a standard latch circuit using 130-nm 2-well, 3-well, and also 90-nm 2-well CMOS processes. The proposed circuit demonstrated immunity that was two orders of magnitude higher in an irradiation test using alpha particles, and immunity that was one order of magnitude higher under neutron irradiation. We also demonstrated forward body bias control, which improves alpha-ray immunity by 26% for a standard latch and achieves a 44-fold improvement in the proposed latch.
Danardono Dwi ANTONO Kenichi INAGAKI Hiroshi KAWAGUCHI Takayasu SAKURAI
This paper discusses propagation delay error, transient response, and power consumption distribution due to inductive effects in optimally buffered on-chip interconnects. Inductive effects are said to be important to consider in deep submicron (DSM) VLSI design. However, this study shows that the effect decreases and can be neglected at future technology nodes under such conditions.
Chin-Jui LAI Ching-Her LEE Chung-I G. HSU Jean-Fu KIANG
A mode-matching technique in conjunction with the Floquet theorem is proposed to analyze the propagation characteristics of periodic circular surface waveguides. The circular waveguides are coated outside with a multilayered dielectric and have a ground plane with periodic corrugation of arbitrary profile. Three different ground corrugation profiles are examined to demonstrate the influences of the corrugation shape, depth, and width, dielectric thickness, and relative permittivity on bandstop characteristics.
A novel class of microstrip bandpass filter is configured using impedance transformers and an improved stepped impedance resonator (SIR). This SIR is composed of a central narrow strip section with an aperture on the ground plane and two wide strip sections at the two sides. This low-high-low SIR has a promising capability of achieving an extremely large ratio of the first two resonant frequencies for the design of a bandpass filter with an ultra-broad stopband. The two quarter-wavelength transformers with low and high impedances, referred to as impedance and admittance inverters, are modeled and utilized as alternative types of inductive and capacitive coupling elements with very tight coupling for wideband filter design. After extensive investigation of the two transformers and the proposed SIR, two novel bandpass filters are constructed, designed, and implemented. Two sets of predicted and measured frequency responses over a wide frequency range both quantitatively exhibit several attractive features, such as an ultra-broad stopband with deep rejection and a broadened dominant passband with low insertion loss.
Hyun Bae LEE Kyoungho LEE Hae Kang JUNG Hong June PARK
The electrical parameters (8
Kang-Yoon LEE Hyunchul KU Young Beom KIM
This paper presents a fast switching CMOS frequency synthesizer with a new coarse tuning method for PHS applications. To achieve the fast lock-time and the low phase noise performance, an efficient bandwidth control scheme is proposed. To change the bandwidth, the charge pump current and the loop filter zero resistor should be changed. Charge pump up/down current mismatches are compensated with the current mismatch compensation block. The proposed coarse tuning method selects the optimal tuning capacitances of the LC-VCO to optimize the phase noise and the lock-time. The measured lock-time is about 20 µs and the phase noise is -121 dBc/
Hideaki TAKADA Shiro SUYAMA Munekazu DATE
We clarify the effective range of distance between the front and rear images of the depth-fused 3-D (DFD) visual illusion. The DFD visual illusion is perceived when two images with many edges in the front and rear frontal-parallel planes at different depths are overlapped from the viewpoint of an observer. We evaluated how the fusion of the DFD visual illusion depended on the difference in distance between the front and rear images when the distance between the two images was changed. Subjective tests clarified the cases where DFD can be applied.
A new approach to formulating the mixed-path propagation of surface waves is presented, based on two main ingredients: the decomposition of electromagnetic fields and the introduction of equivalent electric (magnetic) currents adopted for convenience. The present method can be extended to obtain the corresponding results for arbitrary incident wave excitation.
Kenjiro MATSUOKA Kazushi SAEKI Eiji TERAOKA Minoru YAMADA Yuji KUWAMURA
Properties of the quantum noise and the optical feedback noise in blue-violet InGaN semiconductor lasers were measured in detail. We confirmed that the quantum noise in the blue-violet laser is higher than that in the near-infrared laser. This is an intrinsic property based on the principles of quantum mechanics, and is a serious issue when applying the laser to optical disks at low power consumption. The feedback noise was classified into two types, a "low frequency type" and a "flat type", based on the frequency spectrum of the noise. This classification was the same as that for near-infrared lasers.
A simple low-power, low-phase-noise LC QVCO (Quadrature Voltage Controlled Oscillator) topology is proposed. The topology minimizes phase noise by eliminating the contributions from the tail current source and coupling transistors. With no more than 3.36 mW power consumption from a 1.2 V power supply, the VCO achieves -124 dBc/Hz phase noise performance at 1 MHz offset from the 2.85 GHz carrier frequency.
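For context, oscillator results like these are often compared with the widely used figure of merit FoM = L(Δf) - 20·log10(f0/Δf) + 10·log10(P / 1 mW). The abstract does not state a FoM, so the sketch below is only an illustrative calculation from the quoted numbers; the function name is hypothetical.

```python
import math


def vco_fom(phase_noise_dbc_hz, f_carrier_hz, f_offset_hz, power_mw):
    """Common oscillator figure of merit in dBc/Hz (more negative is better).

    FoM = L(df) - 20*log10(f0/df) + 10*log10(P / 1 mW)
    """
    return (phase_noise_dbc_hz
            - 20 * math.log10(f_carrier_hz / f_offset_hz)
            + 10 * math.log10(power_mw / 1.0))


if __name__ == "__main__":
    # Values quoted in the abstract: -124 dBc/Hz at 1 MHz offset from
    # 2.85 GHz, with 3.36 mW taken as the upper bound on power.
    print(f"FoM = {vco_fom(-124.0, 2.85e9, 1e6, 3.36):.1f} dBc/Hz")
```

With these inputs the formula gives roughly -187.8 dBc/Hz. Because 3.36 mW is quoted as an upper bound on power, the value is conservative under this definition.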