Hang Liu Fei Wu
Keiji GOTO Toru KAWANO Ryohei NAKAMURA
Takahiro SASAKI Yukihiro KAMIYA
Xiang XIONG Wen LI Xiaohua TAN Yusheng HU
Anton WIDARTA
Hiroshi OKADA Mao FUKINAKA Yoshiki AKIRA
Shun-ichiro Ohmi
Tohgo HOSODA Kazuyuki SAITO
Shohei Matsuhara Kazuyuki Saito Tomoyuki Tajima Aditya Rakhmadi Yoshiki Watanabe Nobuyoshi Takeshita
Koji Abe Mikiya Kuzutani Satoki Furuya Jose A. Piedra-Lorenzana Takeshi Hizawa Yasuhiko Ishikawa
Yihan ZHU Takashi OHSAWA
Shengbao YU Fanze MENG Yihan SHEN Yuzhu HAO Haigen ZHOU
Ryo KUMAGAI Ryosuke SUGA Tomoki UWANO
Jun SONODA Kazusa NAKAMICHI
Kaiji Owaki Yusuke Kanda Hideaki Kimura
Takuya FUJIMOTO
Yuji Wada
Fuyuki Kihara Chihiro Matsui Ken Takeuchi
Keito YUASA Michihiro IDE Sena KATO Kenichi OKADA Atsushi SHIRANE
Tomoo Ushio Yuuki Wada Syo Yoshida
Futoshi KUROKI
Jun FURUTA Shotaro SUGITANI Ryuichi NAKAJIMA Takafumi ITO Kazutoshi KOBAYASHI
Yuya Ichikawa Ayumu Yamada Naoko Misawa Chihiro Matsui Ken Takeuchi
Ayumu Yamada Zhiyuan Huang Naoko Misawa Chihiro Matsui Ken Takeuchi
Yoshinori ITOTAGAWA Koma ATSUMI Hikaru SEBE Daisuke KANEMOTO Tetsuya HIROSE
Hikaru SEBE Daisuke KANEMOTO Tetsuya HIROSE
Zhibo CAO Pengfei HAN Hongming LYU
Takuya SAKAMOTO Itsuki IWATA Toshiki MINAMI Takuya MATSUMOTO
Koji YAMANAKA Kazuhiro IYOMASA Takumi SUGITANI Eigo KUWATA Shintaro SHINJO
Minoru MIZUTANI Takashi OHIRA
Katsumi KAWAI Naoki SHINOHARA Tomohiko MITANI
Baku TAKAHARA Tomohiko MITANI Naoki SHINOHARA
Akihiko ISHIWATA Yasumasa NAKA Masaya TAMURA
Atsushi Fukuda Hiroto Yamamoto Junya Matsudaira Sumire Aoki Yasunori Suzuki
Ting DING Jiandong ZHU Jing YANG Xingmeng JIANG Chengcheng LIU
Fan Liu Zhewang Ma Masataka Ohira Dongchun Qiao Guosheng Pu Masaru Ichikawa
Ludovico MINATI
Minoru Fujishima
Hyunuk AHN Akito IGUCHI Keita MORIMOTO Yasuhide TSUJI
Kensei ITAYA Ryosuke OZAKI Tsuneki YAMASAKI
Akira KAWAHARA Jun SHIBAYAMA Kazuhiro FUJITA Junji YAMAUCHI Hisamatsu NAKANO
Seiya Kishimoto Ryoya Ogino Kenta Arase Shinichiro Ohnuki
Yasuo OHTERA
Tomohiro Kumaki Akihiko Hirata Tubasa Saijo Yuma Kawamoto Tadao Nagatsuma Osamu Kagaya
Haonan CHEN Akito IGUCHI Yasuhide TSUJI
Keiji GOTO Toru KAWANO Munetoshi IWAKIRI Tsubasa KAWAKAMI Kazuki NAKAZAWA
Today, practical semiconductor products are an integral part of our lives and the infrastructure of society, and this trend will continue in the future. New areas of application will expand into medical, environmental, and agriculture (food)-related fields in addition to the conventional information and communication technology (ICT)-related field. Low-cost semiconductor devices with advanced functions have thus far been realized by miniaturization. However, we are now approaching the physical limit of miniaturization, and also, the investment required for new semiconductor manufacturing facilities has become huge. Under such circumstances, we propose an approach based on semiconductor devices called microcube chips and ideas of semiconductor development, i.e., agile integration and "inch-fab." Our approach is expected to contribute to expanding the range of companies that can fabricate semiconductor devices to include small-size companies, exploring new applications of semiconductor devices, and providing a wide variety of semiconductor devices at a low cost from the semiconductor industry.
Toru SHIMIZU Kazutami ARIMOTO Osamu NISHII Sugako OTANI Hiroyuki KONDO
Various low-power technologies have been developed and applied to LSIs at the device and circuit design levels. Many more CPU cores, as well as function IPs, are now integrated on a single-chip LSI. Therefore, not only device and circuit low-power technologies but also software power control technologies are becoming more important for reducing the active power of application systems. This paper gives an overview of low-power technologies and defines a power management platform as a combination of hardware functions and a software programming interface. It then discusses the importance of the power management platform and the direction of its development.
Tianruo ZHANG Chen LIU Minghui WANG Satoshi GOTO
This paper proposes a region-of-interest (ROI) based H.264 encoder and the VLSI architecture of its ROI detection algorithm. In an ROI-based video coding system, the pre-processing unit that detects ROIs should add only a small computational overhead because of the low-power requirement. The macroblocks (MBs) in ROIs are detected sequentially, in the same order as H.264 encoding, so that the ROI detector and the H.264 encoder can be pipelined at the MB level. ROI detection is performed in a novel estimation-and-verification process with an ROI contour template. The proposed architecture can be configured to detect either a single ROI or multiple ROIs in each frame, and the throughput of the single detection mode is 5.5 times that of the multiple detection mode. In the single and multiple detection modes, 98.01% and 97.89% of the MBs in ROIs are detected, respectively. The hardware cost of the proposed architecture is only 4.68 k gates. The detection speed is 753 fps for CIF-format video at an operating frequency of 200 MHz in multiple detection mode, with a power consumption of 0.47 mW. Compared with previous fast ROI detection algorithms for video coding applications, the proposed architecture obtains more accurate and smaller ROIs. Therefore, ROI-based optimization of computational complexity and compression efficiency can be implemented more efficiently in an H.264 encoder.
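As a rough sanity check of the throughput figure above, the average cycle budget per macroblock can be estimated. This is only a sketch: it assumes a CIF frame of 352×288 luma samples divided into 16×16 macroblocks (standard values that the abstract does not state), and the function name is illustrative.

```python
def cycles_per_macroblock(clock_hz, fps, width, height, mb_size=16):
    """Average clock cycles available per macroblock at a given frame rate."""
    mbs_per_frame = (width // mb_size) * (height // mb_size)
    return clock_hz / (fps * mbs_per_frame)

# Assumed CIF geometry: 352x288 -> 22 x 18 = 396 macroblocks per frame.
budget = cycles_per_macroblock(clock_hz=200e6, fps=753, width=352, height=288)
print(f"{budget:.1f} cycles per macroblock")
```

Under these assumptions, the detector has roughly 670 clock cycles per macroblock in multiple detection mode.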
Yibo FAN Xiaoyang ZENG Satoshi GOTO
Integer Motion Estimation (IME) costs much computation in H.264/AVC video encoder. 2-D SAD tree IME architecture provides very high performance for encoder, and it has been used by many video codec designs. This paper proposes an optimized hardware design of 2-D SAD tree IME. Firstly, a new hardware architecture is proposed to reduce on-chip memory size. Secondly, a new search pattern is proposed to fully use memory bandwidth and reduce external memory access. Thirdly, the data-path is redesigned, and the performance is greatly improved. In order to compare with other IME designs, an IME design support D1 size, 30 fps with search range [
Gang HE Dajiang ZHOU Jinjia ZHOU Tianruo ZHANG Satoshi GOTO
Intra coding in H.264/AVC significantly enhances video compression efficiency. However, due to the high data dependency of intra prediction in H.264, the applicability of both pipelining and parallel processing techniques is limited. Moreover, it is difficult to achieve high hardware utilization and throughput because of the long block/MB-level reconstruction loops. This paper proposes a high-performance intra prediction architecture that supports the H.264/AVC high profile. The proposed MB/block co-reordering avoids data dependency and improves pipeline utilization. Therefore, the timing constraint of real-time 4096
Yiqing HUANG Xiaocong JIN Jin ZHOU Jia SU Takeshi IKENAGA
One high profile intra predictor generation engine is proposed in this paper. Firstly, hardware level algorithm optimization for intra 8
Jinjia ZHOU Dajiang ZHOU Gang HE Satoshi GOTO
In this paper, we present a cache based motion compensation (MC) architecture for Quad-HD H.264/AVC video decoder. With the significantly increased throughput requirement, VLSI design for MC is greatly challenged by the huge area cost and power consumption. Moreover, the long memory system latency leads to performance drop of the MC pipeline. To solve these problems, three optimization schemes are proposed in this work. Firstly, a high-performance interpolator based on Horizontal-Vertical Expansion and Luma-Chroma Parallelism (HVE-LCP) is proposed to efficiently increase the processing throughput to at least over 4 times as the previous designs. Secondly, an efficient cache memory organization scheme (4S×4) is adopted to improve the on-chip memory utilization, which contributes to memory area saving of 25% and memory power saving of 39
Kosuke MIZUNO Hiroki NOGUCHI Guangji HE Yosuke TERACHI Tetsuya KAMINO Tsuyoshi FUJINAGA Shintaro IZUMI Yasuo ARIKI Hiroshi KAWAGUCHI Masahiko YOSHIMOTO
This paper describes a SIFT (Scale Invariant Feature Transform) descriptor generation engine that features a VLSI-oriented SIFT algorithm, a three-stage pipelined architecture, and novel systolic array architectures for Gaussian filtering and key-point extraction. An ROI-based scheme is employed for the VLSI-oriented algorithm. The novel systolic array architecture drastically reduces the number of operation cycles and memory accesses. The cycle count of the Gaussian filtering module is reduced by 82% compared with the SIMD architecture. The numbers of memory accesses of the Gaussian filtering module and the key-point extraction module are reduced by 99.8% and 66%, respectively, compared with the results obtained assuming the SIMD architecture. The proposed schemes provide processing capability for HDTV resolution video (1920
Hiroki NOGUCHI Kazuo MIURA Tsuyoshi FUJINAGA Takanobu SUGAHARA Hiroshi KAWAGUCHI Masahiko YOSHIMOTO
We propose a low-memory-bandwidth, high-efficiency VLSI architecture for 60-k word real-time continuous speech recognition. Our architecture includes a cache architecture using the locality of speech recognition, beam pruning using a dynamic threshold, two-stage language model searching, a parallel Gaussian Mixture Model (GMM) architecture based on the mixture level and frame level, a parallel Viterbi architecture, and pipeline operation between Viterbi transition and GMM processing. Results show that our architecture achieves 88.24% required frequency reduction (66.74 MHz) and 84.04% memory bandwidth reduction (549.91 MB/s) for real-time 60-k word continuous speech recognition.
Xi ZHANG Chongmin LI Zhenyu LIU Haixia WANG Dongsheng WANG Takeshi IKENAGA
Previous research shows that the LRU replacement policy is inefficient when applications exhibit a distant re-reference interval. The RRIP policy was recently proposed to improve performance for such workloads. However, the lack of access recency information in RRIP prevents the replacement policy from making accurate predictions. To enhance the robustness of RRIP for recency-friendly workloads, we propose a Dynamic Adaptive Insertion and Re-reference Prediction (DAI-RRP) policy, which evicts data based on both the re-reference prediction value and the access recency information. DAI-RRP adaptively adjusts the insertion position and prediction value for different access patterns, which makes the policy robust across different workloads and different phases. Simulation results show that DAI-RRP outperforms LRU and RRIP. For a single-core processor with a 1 MB 16-way set-associative last-level cache (LLC), DAI-RRP reduces CPI over LRU and Dynamic RRIP by an average of 8.1% and 2.7%, respectively. Evaluations on a quad-core CMP with a 4 MB shared LLC show that DAI-RRP outperforms LRU and Dynamic RRIP (DRRIP) on the weighted speedup metric by an average of 8.1% and 15.7%, respectively. Furthermore, compared with LRU, DAI-RRP requires similar hardware for a 16-way cache and even less hardware for a high-associativity cache. In summary, the proposed policy is practical and can be easily integrated into existing hardware approximations of LRU.
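For readers unfamiliar with the baseline that DAI-RRP builds on, the following is a minimal sketch of static RRIP with hit promotion for a single cache set. It is not the DAI-RRP policy itself, whose insertion and recency rules the abstract does not specify. The class name, the 2-bit prediction value, and the tiny 2-way example are assumptions chosen for illustration.

```python
class SRRIPSet:
    """One cache set managed by static RRIP with hit promotion (baseline only)."""

    def __init__(self, ways, rrpv_bits=2):
        self.max_rrpv = (1 << rrpv_bits) - 1   # "distant" re-reference prediction
        self.tags = [None] * ways
        self.rrpv = [self.max_rrpv] * ways     # empty ways are evicted first

    def access(self, tag):
        """Return True on a hit, False on a miss (the line is filled on a miss)."""
        if tag in self.tags:
            self.rrpv[self.tags.index(tag)] = 0    # hit: predict near-immediate reuse
            return True
        # Miss: age every line until at least one reaches the distant value.
        while self.max_rrpv not in self.rrpv:
            self.rrpv = [v + 1 for v in self.rrpv]
        victim = self.rrpv.index(self.max_rrpv)
        self.tags[victim] = tag
        self.rrpv[victim] = self.max_rrpv - 1      # insert with a "long" interval
        return False


# Hypothetical 2-way set and access trace.
s = SRRIPSet(ways=2)
hits = [s.access(t) for t in "ABACA"]
print(hits, s.tags)
```

In this trace the reused line A survives the miss on C, while B, which was never re-referenced, is evicted. RRIP bases that decision on a re-reference prediction rather than on pure recency.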
Reliability issues such as soft errors and NBTI (negative bias temperature instability) have become a matter of concern as integrated circuits continue to shrink. It is becoming increasingly important to take reliability requirements into account even for consumer products. This paper presents a dynamic continuous signature monitoring (DCSM) technique for highly reliable computer systems. The DCSM technique dynamically generates reference signatures as well as runtime ones while a program executes. Unlike conventional static continuous signature monitoring techniques, it stores the generated signatures in a signature table, a small storage circuit in the microprocessor, which saves the program or data memory space that would otherwise hold the signatures. Our experiments showed that the DCSM technique protected 1.4-100.0% of executed instructions, depending on the size of the signature table.
Tetsuya IIZUKA Jaehyun JEONG Toru NAKURA Makoto IKEDA Kunihiro ASADA
This paper proposes an all-digital process variability monitor which utilizes a simple buffer ring with a pulse counter. The proposed circuit monitors the process variability according to a count number of a single pulse which propagates on the buffer ring and a fixed logic level after the pulse vanishes. The proposed circuit has been fabricated in 65 nm CMOS process and the measurement results demonstrate that we can monitor the PMOS and NMOS variabilities independently using the proposed monitoring circuit. The proposed monitoring technique is suitable not only for the on-chip process variability monitoring but also for the in-field monitoring of aging effects such as negative/positive bias instability (NBTI/PBTI).
Yoji BANDO Satoshi TAKAYA Toru OHKAWA Toshiharu TAKARAMOTO Toshio YAMADA Masaaki SOUDA Shigetaka KUMASHIRO Tohru MOGAMI Makoto NAGATA
A continuous-time waveform monitoring technique for quality on-chip power noise measurements features matched probing performance among a variety of voltage domains of interest in a VLSI circuit, covering digital Vdd, analog Vdd, as well as at Vss, and multiple probing capability at various locations on power planes. A calibration flow eliminates the offset as well as gain errors among probing channels. The consistency of waveforms acquired by the proposed continuous-time monitoring and sampled-time precise digitization techniques is ensured. A 90-nm CMOS on-chip monitor prototype demonstrates dynamic power supply noise measurements with
In this paper, a Stochastic Non-Homogeneous ARnoldi (SNHAR) method is proposed for the analysis of the on-chip power grid networks in the presence of process variations. In SNHAR method, the polynomial chaos based stochastic method is employed to handle the variations of power grids. Different from the existing StoEKS method which uses extended Krylov Subspace (EKS) method to compute the coefficients of the polynomial chaos, a computation-efficient and numerically stable Non-Homogeneous ARnoldi (NHAR) method is employed in SNHAR method to compute the coefficients of the polynomial chaos. Compared with EKS method, NHAR method has superior numerical stability and can achieve remarkably higher accuracy with even lower computational cost. As a result, SNHAR can capture the stochastic characteristics of the on-chip power grid networks with higher accuracy, but even lower computational cost than StoEKS.
Jinmyoung KIM Toru NAKURA Hidehiro TAKATA Koichiro ISHIBASHI Makoto IKEDA Kunihiro ASADA
This paper presents an on-chip resonant supply noise canceller that utilizes the parasitic capacitance of sleep blocks. The test chip was fabricated in a 0.18 µm CMOS process, and measurement results show 43.3% and 12.5% supply noise reduction for abrupt supply voltage switching and abrupt wake-up of a sleep block, respectively. The proposed method requires a 1.5% area overhead for four 100 k-gate blocks, which makes it 7.1 times more efficient in noise reduction than a conventional decap for the same power supply noise, while achieving a 47% improvement in settling time. These results make fast power-mode switching possible for dynamic voltage scaling and power gating.
Yuji KUNITAKE Toshinori SATO Hiroto YASUURA
Negative Bias Temperature Instability (NBTI) is one of the major reliability problems in advanced technologies. NBTI causes a threshold voltage shift in a PMOS transistor. When the PMOS transistor is negatively biased, the threshold voltage shifts in the negative direction. On the other hand, the threshold voltage recovers if the PMOS transistor is positively biased. In an SRAM cell, NBTI degrades the threshold voltage of the load PMOS transistors. The degradation affects the Static Noise Margin (SNM), which is a measure of the read stability of a 6-T SRAM cell. In this paper, we discuss the relationship between NBTI degradation in an SRAM cell and the dynamic stress and recovery conditions. There are two important characteristics. One is the stress probability, which is defined as the fraction of time that the PMOS transistor is negatively biased. The other is the stress and recovery cycle, which is defined as the switching interval of an SRAM value. According to our observations, in order to mitigate NBTI degradation, the stress probability should be small and the stress and recovery cycle should be shorter than 10 msec. Based on these observations, we propose a novel cell-flipping technique, which brings the stress probability close to 50%. In addition, we show the results of case studies that apply the cell-flipping technique to register files and cache memories.
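The effect of cell flipping on the stress probability can be illustrated with a small sketch. It rests on several assumptions not taken from the abstract: time is discretized into equal steps, one load PMOS is stressed whenever the physically stored bit is 1 and the other whenever it is 0, and flipping inverts the stored value every `flip_period` steps. The function name and the example trace are hypothetical.

```python
def stress_probabilities(bits, flip_period=None):
    """Fraction of time steps during which each load PMOS of a cell is stressed.

    bits        -- logical value held by the cell at each time step (0 or 1)
    flip_period -- if given, the physical value is inverted in every other
                   window of this many steps (assumed flipping schedule)
    """
    stressed_steps = 0
    for t, bit in enumerate(bits):
        inverted = flip_period is not None and (t // flip_period) % 2 == 1
        stored = bit ^ 1 if inverted else bit
        stressed_steps += stored           # assumed: this PMOS is stressed when stored == 1
    p = stressed_steps / len(bits)
    return p, 1.0 - p                      # the complementary PMOS sees 1 - p


# Hypothetical cell that logically holds 1 for 90% of the time.
trace = [1] * 900 + [0] * 100
without_flip = stress_probabilities(trace)
with_flip = stress_probabilities(trace, flip_period=50)
print(without_flip, with_flip)
```

Without flipping, one transistor is stressed 90% of the time. With periodic flipping, both transistors see a stress probability of 50% in this example, which is the balancing effect the abstract describes.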
Tadayoshi ENOMOTO Nobuaki KOBAYASHI
We developed and applied a new circuit, called the “Self-controllable Voltage Level (SVL)” circuit, to achieve expanded “read” and “write” margins and low leakage power in a 90-nm, 2-kbit, six-transistor CMOS SRAM. At a threshold voltage fluctuation of 6σ, the minimum supply voltage of the newly developed (dvlp.) SRAM for “write” operation was significantly reduced to 0.11 V, less than half that of an equivalent conventional (conv.) SRAM. The standby leakage power of the dvlp. SRAM was only 1.17 µW, which is 4.64% of that of the conv. SRAM at a supply voltage of 1.0 V. Moreover, the maximum operating clock frequency of the dvlp. SRAM was 138 MHz, which is 15% higher than that (120 MHz) of the conv. SRAM at VMM of 0.4 V. The area overhead was 0.81% of the area of the conv. SRAM.
Teruyoshi HATANAKA Mitsue TAKAHASHI Shigeki SAKAI Ken TAKEUCHI
This paper presents an improvement of the memory cell reliability by the memory cell VTH optimization of the ferroelectric (Fe)-NAND flash memory. The effects of the memory cell VTH on the reliability of the Fe-NAND flash memory are experimentally analyzed for the first time. The reliability is evaluated by the measured VTH shift due to the read disturb, program disturb and data retention. Three types of Fe-NAND flash memory cells, a positive, zero and negative VTH memory cell, are defined on the basis of the memory cell VTH. The middle of VTH of programmed and erased states is 1 V, 0 V and -0.3 V in a positive, zero and negative VTH memory cell, respectively. The VTH shifts of the positive, zero and negative VTH memory cells show similar characteristics in the program/erase and the VPASS and VPGM disturbs because the external electric field is so high that the internal depolarization field does not affect the VTH shift. On the other hand, in the data retention, the VTH shifts of the three types of VTH memory cells show different characteristics. The reliability of the Fe-NAND flash memory is best optimized in the zero VTH memory cell. In the proposed zero VTH Fe-NAND flash memory cell scheme, the measured VTH shift due to the read disturb, program disturb and data retention decreases by 32%, 24% and 10%, respectively, compared with the conventional positive VTH Fe-NAND flash memory cell scheme. In contrast, in the negative VTH memory cell, the VTH shift during the data retention is 0.49 V, which is unacceptably large because of the depolarization field. The conventional positive VTH memory cell suffers from severe read and program disturbs. The measured results are drastically different from those of the conventional floating-gate NAND flash memory cell, where the negative VTH memory cell is most suitable in terms of reliability.
Masahiro IIDA Masahiro KOGA Kazuki INOUE Motoki AMAGASAKI Yoshinobu ICHIDA Mitsuro SAJI Jun IIDA Toshinori SUEYOSHI
An advantage of an RLD (reconfigurable logic device) such as an FPGA (field programmable gate array) is that it can be customized after being manufactured. With aggressive technology scaling, device density is increasing, and power consumption has become a serious problem accordingly. In SoCs for embedded systems, power gating is one of the major power reduction techniques. However, it is difficult to adopt in SRAM-based RLDs because of the high overhead and the volatility of SRAM. In this paper, we describe a TEG (test element group) chip of reconfigurable logic based on FeRAM (ferroelectric random access memory) technology. FeRAM gives reconfigurable logic devices the advantage of genuine power gating. The chip employs an island-style routing architecture and uses a variable grain logic cell as a logic block. An NV-FF (non-volatile flip-flop), which contains FeRAM, an FF, and power-gating control circuits, is used for both the configuration memories and the FFs in a logic block. The NV-FF can transfer data between the FeRAM and the FF automatically when the power source is turned off/on. Thus chip-level power gating is possible. The hibernate/restore time is less than 1 ms. The chip has 18
Yoshimitsu TAKAMATSU Ryuichi FUJIMOTO Tsuyoshi SEKINE Takaya YASUDA Mitsumasa NAKAMURA Takuya HIRAKAWA Masato ISHII Motohiko HAYASHI Hiroya ITO Yoko WADA Teruo IMAYAMA Tatsuro OOMOTO Yosuke OGASAWARA Masaki NISHIKAWA Yoshihiro YOSHIDA Kenji YOSHIOKA Shigehito SAIGUSA Hiroshi YOSHIDA Nobuyuki ITOH
This paper presents a single-chip RF tuner/OFDM demodulator for a mobile digital TV application called “1-segment broadcasting.” To achieve required performances for the single-chip receiver, a tunable technique for a low-noise amplifier (LNA) and spurious suppression techniques are proposed in this paper. Firstly, to receive all channels from 470 MHz to 770 MHz and to relax distortion characteristics of following circuit blocks such as an RF variable-gain amplifier and a mixer, a tunable technique for the LNA is proposed. Then, to improve the sensitivity, spurious signal suppression techniques are also proposed. The single-chip receiver using the proposed techniques is fabricated in 90 nm CMOS technology and total die size is 3.26 mm
Mohiuddin HAFIZ Nobuo SASAKI Takamaro KIKKAWA
A differential-input non-coherent BPSK receiver for UWB-IR communication, based on threshold detection, is presented in this paper. The chip can recover BPSK-modulated Gaussian monocycle pulses (GMP), along with their first derivative, at a data rate of 500 Mb/s. No clock reception is required, as the receiver recovers data based on the relative phase of the two simultaneously received inputs. While retrieving the data, it consumes 63 mW from a supply voltage of 1.8 V. A shunt-peaked narrow-band amplifier, matched to the input antenna, is used to amplify the received GMP. Wireless data have been successfully recovered using a pair of horn antennas at a distance of 6 cm. The chip, developed in a 180 nm CMOS technology, occupies a die area of 3.4 mm². The receiver is suitable for non-coherent (self-synchronized) UWB-IR communication.
Jiangtao SUN Qing LIU Yong-Ju SUH Takayuki SHIBATA Toshihiko YOSHIMASU
A balanced push-push frequency doubler has been demonstrated in 0.25-µm SOI (Silicon on Insulator) SiGe BiCMOS technology operating from 22 GHz to 29 GHz with high fundamental frequency suppression and high conversion gain. A series LC resonator circuit is connected in parallel with the differential outputs of the doubler core circuit. The LC resonator is effective to improve the fundamental frequency suppression. In addition, the LC resonator works as a matching circuit between the output of the doubler core and the input of the output buffer amplifier, which increases the conversion gain of the whole circuit. A measured fundamental frequency suppression of greater than 46 dBc is achieved at an input power of -10 dBm in the output frequency band of 22-29 GHz. Moreover, maximum fundamental frequency suppression of 66 dBc is achieved at an input frequency of 13 GHz and an input power of -10 dBm. The frequency doubler works at a supply voltage of 3.3 V.
Hiroaki KATSURAI Hideki KAMITSUNA Hiroshi KOIZUMI Jun TERADA Yusuke OHTOMO Tsugumichi SHIBATA
As a future passive optical network (PON) system, the 10 Gigabit Ethernet PON (10G-EPON) has been standardized in IEEE 802.3av. As conventional Gigabit Ethernet PON (GE-PON) systems have already been widely deployed, 1G/10G co-existence technologies are strongly required for the next system. A gated voltage-controlled-oscillator (G-VCO)-based 10-Gb/s burst-mode clock and data recovery (CDR) circuit is presented for a 1G/10G co-existence PON system. It employs two new circuits to improve jitter transfer and provide tolerance to 1G/10G operation. An injection-controlled jitter-reduction circuit reduces output-clock jitter by 7 dB from 200-MHz input data jitter while keeping a short lock time of 20 ns. A frequency-variation compensation circuit reduces frequency mismatch among the three VCOs on the chip and offers large tolerance to consecutive identical digits. With the compensation, the proposed CDR circuit can employ multi VCOs, which provide tolerance to the 1G/10G co-existence situation. It achieves error-free (bit-error rate < 10⁻¹²) operation for 10-G bursts following bursts of other rates, obviously including 1G bursts. It also provides tolerance to a 256-bit sequence without a transition in the data, which is more than enough tolerance for 65-bit CIDs in the 64B/66B code of 10 Gigabit Ethernet.
Ryuichi FUJIMOTO Kyoya TAKANO Mizuki MOTOYOSHI Uroschanit YODPRASIT Minoru FUJISHIMA
Device modeling techniques for high-frequency circuits operating at over 100 GHz are presented. We have proposed the bond-based design as an accurate high-frequency circuit design method. Because layout parasitic extractions (LPE) are not required in the bond-based design, it can be applied to high-frequency circuit design at over 100 GHz. However, customized device models are indispensable for the bond-based design. In this paper, device modeling techniques for high-frequency circuit design using the bond-based design are proposed. Customized device models for MOSFETs, transmission lines and pads are introduced. By using the customized device models, the difference between the simulated and measured gains of an amplifier is reduced to less than 0.6 dB at 120 GHz.
Po-Hung CHEN Koichi ISHIDA Xin ZHANG Yasuyuki OKUMA Yoshikatsu RYU Makoto TAKAMIYA Takayasu SAKURAI
In this paper, a 0.18-V input three-stage charge pump circuit applying forward body bias is proposed for energy harvesting applications. In the developed charge pump, all the MOSFETs are forward body biased by using the inter-stage/output voltages. By applying the proposed charge pump as the startup circuit in a boost converter, the kick-up input voltage of the boost converter is reduced to 0.18 V. To verify the circuit characteristics, the conventional zero body bias charge pump and the proposed forward body bias charge pump were fabricated in a 65 nm CMOS process. The measured output current of the proposed charge pump at a 0.18-V input voltage is increased by 170% compared with the conventional one at an output voltage of 0.5 V. In addition, the boost converter successfully boosts the 0.18-V input to an output higher than 0.65 V.
Yimeng ZHANG Leona OKAMURA Tsutomu YOSHIHARA
A novel charge-recovery logic structure called Pulse Boost Logic (PBL) is proposed in this paper. PBL is a high-speed, low-energy-dissipation charge-recovery logic with a dual-rail evaluation tree structure. It is driven by a 2-phase non-overlapping clock and requires no DC power supply. PBL belongs to the boost logic family, which includes boost logic, enhanced boost logic and subthreshold boost logic. In this paper, PBL is compared with other charge-recovery logic technologies. To demonstrate the performance of the PBL structure, a 4-bit pipelined multiplier is designed and fabricated in a 0.18 µm CMOS process technology. The simulation results indicate that the 4-bit multiplier can work at a frequency of 1.8 GHz, while the test chip was measured at an operating frequency of 161 MHz, where the power dissipation is 772 µW.
Koichi YAMAGUCHI Masayuki MIZUNO
A dicode partial-response signaling system over an inductively coupled channel has been developed to achieve data rates higher than the self-resonant frequencies of the inductors. The developed system operates at data rates five times higher than conventional systems with the same inductor. Current-mode equalization in the transmitter, designed in 90-nm CMOS, successfully reshapes waveforms to obtain dicode signals at the receiver. For 5-Gb/s signaling through coupled inductors with a 120-µm diameter and a 120-µm distance, a 20-mV eye opening was observed. The power consumption of the transmitter was 58 mW at 5-Gb/s operation.
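As background, dicode is the textbook 1 − D partial-response format, in which each received symbol is the difference between the current and the previous bit. The sketch below shows only this ideal symbol-level mapping and its inversion by a running sum. The function names and bit pattern are illustrative, and nothing here models the transmitter equalizer or the inductive channel of the paper.

```python
def dicode(bits, initial=0):
    """Ideal 1 - D mapping: y[n] = x[n] - x[n-1], giving symbols in {-1, 0, +1}."""
    prev = initial
    symbols = []
    for b in bits:
        symbols.append(b - prev)
        prev = b
    return symbols


def recover(symbols, initial=0):
    """Invert the 1 - D mapping with a running sum (assumes error-free symbols)."""
    state = initial
    bits = []
    for s in symbols:
        state += s
        bits.append(state)
    return bits


data = [1, 1, 0, 1, 0, 0, 1]          # hypothetical bit pattern
symbols = dicode(data)
print(symbols, recover(symbols))
```

The three-level output has no DC content: a run of identical bits produces zeros, which suits a channel that cannot pass DC.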
Koichi YAMAGUCHI Masayuki MIZUNO
Duobinary signaling, which allows a controlled amount of ISI to reduce signaling bandwidth by 2/3, has been introduced into asymmetric multi-chip communications such as DRAM or display interfaces. A 2× oversampled equalization has been developed to realize Duobinary signaling. Symbol-rate clock recovery from the Duobinary signal has been developed to reduce the power consumption of receivers. A Duobinary transmitter test chip was fabricated in a 90-nm CMOS process. A 3.5 dB increase in eye height and a 1.5 times increase in eye width were observed.
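The "controlled amount of ISI" can be made concrete with the textbook 1 + D duobinary mapping combined with an XOR precoder, which lets the receiver decide each bit from a single three-level symbol. This is a generic illustration, not the oversampled equalizer or clock recovery of the paper. The function names and bit pattern are assumptions.

```python
def duobinary_encode(bits):
    """Precode with p[n] = b[n] XOR p[n-1], then form y[n] = p[n] + p[n-1]."""
    prev = 0                      # assumed initial precoder state
    symbols = []
    for b in bits:
        p = b ^ prev
        symbols.append(p + prev)  # three levels: 0, 1, 2
        prev = p
    return symbols


def duobinary_decode(symbols):
    """With the precoder above, each bit is simply the symbol modulo 2."""
    return [y % 2 for y in symbols]


data = [1, 0, 1, 1, 0]            # hypothetical bit pattern
levels = duobinary_encode(data)
print(levels, duobinary_decode(levels))
```

Because the decision is memoryless after precoding, a single symbol error does not propagate to later bits.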
Nguyen Ngoc MAI KHANH Masahiro SASAKI Kunihiro ASADA
In this paper, we present a 0.18-µm CMOS fully integrated X-band shock wave generator (SWG) with an on-chip dipole antenna and a digitally programmable delay circuit (DPDC) for pulse beam-formability in short-range and hand-held microwave active imaging applications. The chip includes an SWG, a 5-bit DPDC and an on-chip wide-band meandering dipole antenna. The output pulse of the SWG is sent to the on-chip meandering dipole antenna through an integrated transformer. The SWG operates based on damping conditions to produce a 0.4-V peak-to-peak (p-p) pulse amplitude at the antenna input terminals in HSPICE simulation. The DPDC is designed to adjust the delays of the shock-wave outputs for the purpose of steering beams in antenna array systems. The wide-band dipole antenna element, designed in a meandering shape, is located in the top metal of a 5-metal-layer 0.18-µm CMOS chip. In simulation with Momentum of ADS 2009, the minimum value of the antenna's return loss, S11, and the antenna's bandwidth (BW) are -19.37 dB and 25.3 GHz, respectively. The measured return loss of a stand-alone integrated meandering dipole is from -26 dB to -10 dB over a frequency range of 7.5-12 GHz. In measurements of the SWG with the integrated antenna, using a 20-dB standard gain horn antenna placed at a distance of 38 mm from the chip's surface, a 1.1-mVp-p shock wave with a 9-11-GHz frequency response is received. A measured pulse delay resolution of 3 ps is also obtained. These results show that the proposed circuit is suitable for a fully integrated pulse beam-forming system.
Sarang KAZEMINIA Morteza MOUSAZADEH Kayrollah HADIDI Abdollah KHOEI
This paper presents a high-speed single-stage latched comparator that is scheduled in time for both amplification and latch operations. Its small active area and simple switching strategy, together with its low power consumption at high comparison rates, make the proposed comparator suitable for repeated use in high-speed flash A/D converters. A strategy for kickback noise elimination and gain enhancement is also introduced. A low-power holding read-out circuit is presented. Post-layout simulation results confirm a 500 MS/s comparison rate with 5 mV resolution for a 1.6 V peak-to-peak input signal range and 600 µW power consumption from a 3.3 V power supply, using the TSMC model of a 0.35 µm CMOS technology. The total active area of the proposed comparator and read-out circuit is about 300 µm².
Fatemeh ABRISHAMIAN Katsumi MORISHITA
The adjustable range on post-fabrication resonance wavelength trimming of long-period fiber gratings was broadened toward the blue side, and the mechanisms of the resonance wavelength shifts caused by heating were investigated. It can be concluded that the glass structure relaxes more slowly than the residual stress with decreasing heating temperature and the blue shift caused by the residual stress relaxation appears more strongly at the early stage of heating. The blue shift of 41 nm was obtained by heating a long-period grating at 600
Yan-Ru TSENG Tzuen-Hsi HUANG Shang-Hsun WU
This paper presents a 7 GHz differential current-reused voltage-controlled oscillator (CR-VCO) with low power consumption and low phase noise using 0.18-µm CMOS technology. The output power of this CR-VCO is enhanced by utilizing a trifilar-transformer-feedback technique. The lower phase noise is achieved by the more symmetric voltage swings resulting from the improved balance of switching current. At a 1.5-V DC supply voltage, the power dissipation is only 3.4 mW. The total tuning range is 1.4 GHz (17.9%) as the tuning voltage ranges from 0 V to 1.8 V. The optimum phase noise is around -117.3 dBc/Hz at a frequency offset of 1 MHz from the center frequency of 7.07 GHz. The corresponding output power is around -6.8 dBm. For the proposed CR-VCO, the calculated figures-of-merit, FOM and FOM_T, are -188.9 and -193.9 dBc/Hz, respectively.
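The reported figures of merit can be approximately reproduced from the other numbers in the abstract. The sketch below assumes the commonly used definitions FOM = L(Δf) − 20·log10(f0/Δf) + 10·log10(P/1 mW) and FOM_T = FOM − 20·log10(TR/10%). The abstract does not state which definitions the authors used, so the formulas and function names are assumptions.

```python
import math


def vco_fom(phase_noise_dbc_hz, f0_hz, offset_hz, power_mw):
    """Assumed FOM definition: phase noise normalized to frequency and DC power."""
    return (phase_noise_dbc_hz
            - 20 * math.log10(f0_hz / offset_hz)
            + 10 * math.log10(power_mw))


def vco_fom_t(fom, tuning_range_percent):
    """Assumed FOM_T definition: FOM additionally normalized to a 10% tuning range."""
    return fom - 20 * math.log10(tuning_range_percent / 10)


# Values taken from the abstract above.
fom = vco_fom(-117.3, 7.07e9, 1e6, 3.4)
fom_t = vco_fom_t(fom, 17.9)
print(f"FOM = {fom:.1f} dBc/Hz, FOM_T = {fom_t:.1f} dBc/Hz")
```

Under these assumed definitions the results land within about 0.2 dB of the reported -188.9 and -193.9 dBc/Hz. The small gap may come from rounding of the quoted inputs.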
Shingo MANDAI Toru NAKURA Tetsuya IIZUKA Makoto IKEDA Kunihiro ASADA
We introduce a 16 × cascaded time difference amplifier (TDA) using a differential logic delay cell with 0.18 µm CMOS process. By employing the differential logic delay cell in the delay chain instead of the CMOS logic delay cell, less than 8% TD gain offset with
Osamu NISHII Yoichi YUYAMA Masayuki ITO Yoshikazu KIYOSHIGE Yusuke NITTA Makoto ISHIKAWA Tetsuya YAMADA Junichi MIYAKOSHI Yasutaka WADA Keiji KIMURA Hironori KASAHARA Hideo MAEJIMA
We built a 12.4 mm
Guo-Ming SUNG Ying-Tsu LAI Chien-Lin LU
This paper presents a resistor-compensation technique for a CMOS bandgap and current reference, which utilizes various high positive temperature coefficient (TC) resistors, a two-stage operational transconductance amplifier (OTA) and a simplified start-up circuit in the 0.35-µm CMOS process. In the proposed bandgap and current reference, numerous compensated resistors, which have a high positive temperature coefficient (TC), are added to the parasitic n-p-n and p-n-p bipolar junction transistor devices, to generate a temperature-independent voltage reference and current reference. The measurements verify a current reference of 735.6 nA, the voltage reference of 888.1 mV, and the power consumption of 91.28 µW at a supply voltage of 3.3 V. The voltage TC is 49 ppm/