Yoshiya KOMATSU Masanori HARIYAMA Michitaka KAMEYAMA
This paper presents a novel architecture of an asynchronous FPGA for handshake-component-based design. The handshake-component-based design is suitable for large-scale, complex asynchronous circuit because of its understandability. This paper proposes an area-efficient architecture of an FPGA that is suitable for handshake-component-based asynchronous circuit. Moreover, the Four-Phase Dual-Rail encoding is employed to construct circuits robust to delay variation because the data paths are programmable in FPGA. The FPGA based on the proposed architecture is implemented in a 65 nm process. Its evaluation results show that the proposed FPGA can implement handshake components efficiently.
Kai LIAO XiaoXin CUI Nan LIAO KaiSheng MA
With the technology scaling down, leakage power becomes an important part of total power consumption. The relatively large leakage current weakens the energy recovery capability of adiabatic circuits and reduces its superiority, compared with static CMOS circuits in the field of low-power design. In this paper, we rebuild three types of adiabatic circuits (2N2N2P, IPAL and DCPAL) based on FinFET devices to obtain a large leakage power reduction by rationally utilizing the different operating modes of FinFET devices (SG, LP, and IG). A 16-bit adiabatic adder has been investigated to demonstrate the advantages of FinFET adiabatic circuits. The Predictive Technology Model (PTM) is used for 32-nm bulk MOSFET and FinFET devices and all of the simulations are based on HSPICE. The results evince the proposed FinFET adiabatic circuits have a considerable reduction (more than 60% for SG mode FinFET and more than 80% for LP mode FinFET) of power consumption compared with the bulk MOSFET ones. Furthermore, the FinFET adiabatic circuits also have higher limiting frequency of clock source and better noise immunity.
Kazuteru NAMBA Nobuhide TAKASHINA Hideo ITO
Small delay defects can cause serious issues such as very short lifetime in the recent VLSI devices. Delay measurement is useful to detect small delay defects in manufacturing testing. This paper presents a design for delay measurement to detect small delay defects on global routing resources, such as double, hex and long lines, in a Xilinx Virtex 4 based FPGA. This paper also shows a measurement method using the proposed design. The proposed measurement method is based on an existing one for SoC using delay value measurement circuit (DVMC). The proposed measurement modifies the construction of configurable logic blocks (CLBs) and utilizes an on-chip DVMC newly added. The number of configurations required by the proposed measurement is 60, which is comparable to that required by stuck-at fault testing for global routing resources in FPGAs. The area overhead is low for general FPGAs, in which the area of routing resources is much larger than that of the other elements such as CLBs. The area of every modified CLB is 7% larger than an original CLB, and the area of the on-chip DVMC is 22% as large as that of an original CLB. For recent FPGAs, we can estimate that the area overhead is approximately 2% or less of the FPGAs.
Bongsub SONG Kyunghoon KIM Junan LEE Kwangsoo KIM Younglok KIM Jinwook BURM
A complete 4-level pulse amplitude modulation (4-PAM) serial link transceiver including a wide frequency range clock generator and clock data recovery (CDR) is proposed in this paper. A dual-loop architecture, consisting of a frequency locked loop (FLL) and a phase locked loop (PLL), is employed for the wide frequency range clocks. The generated clocks from the FLL (clock generator) and the PLL (CDR) are utilized for a transmitter clock and a receiver clock, respectively. Both FLL and PLL employ the identical voltage controlled oscillators consisting of ring-type delay-cells. To improve the frequency tuning range of the VCO, deep triode PMOS loads are utilized for each delay-cell, since the turn-on resistance of the deep triode PMOS varies substantially by the gate-voltage. As a result, fabricated in a 0.13-µm CMOS process, the proposed 4-PAM transceiver operates from 1.5 Gb/s to 9.7 Gb/s with a bit error rate of 10-12. At the maximum data-rate, the entire power dissipation of the transceiver is 254 mW, and the measured jitter of the recovered clock is 1.61 psrms.
There is a relentless push for cost and size reduction in optical transmitters and receivers for fiber-optic links. Monolithically integrated optical chips in InP and Si may be a way to leap ahead of this trend. We discuss uses of integration technology to accomplish various telecommunications functions.
Manabu KOBAYASHI Hiroshi NINOMIYA Shigeyoshi WATANABE
I. O'Connor et al. have proposed a dynamically reconfigurable dynamic logic circuit (DRDLC) to generate some logic functions by using the double-gate (DG) carbon nanotube (CNT) FETs which have the ambipolar property [1]. This DRDLC consists of seven transistors to generate 14 logic functions which do not include the XOR and XNOR functions. On the other hand, K. Jabeur et al. have proposed a DRDLC to generate the whole set of 16 logic functions including XOR and XNOR by adding 4 or 8 transistors to O'Connor's circuit [5]. In this letter, we propose a DRDLC, which consists of only seven transistors, to generate the whole set of 16 logic functions by using DG-CNTFETs. Finally, we show that the number of transistors can be reduced compared to the conventional DRDLC to generate 16 logic functions.
Kota ASAKA Atsushi KANDA Akira OHKI Takeshi KUROSAKI Ryoko YOSHIMURA Hiroaki SANJOH Toshio ITO Makoto NAKAMURA Mikio YONEYAMA
By using impedance (Z) matching circuits in a low-cost transistor outline (TO) CAN package for a 10 Gb/s transmitter, we achieve a cost-effective and small bidirectional optical subassembly (BOSA) with excellent optical transmission waveforms and a > 40% mask margin over a wide temperature range (-10 to 85). We describe a design for Z matching circuits and simulation results, and discuss the advantage of the cost-effective compensation technique.
Masanori FURUTA Ippei AKITA Junya MATSUNO Tetsuro ITAKURA
This paper presents a 7-bit 1.5-GS/s time-interleaved (TI) SAR ADC. The scheme achieves better isolation between sub-ADCs thanks to embedding a track-and-hold (T/H) amplifier and reference voltage buffer in each sub-ADC. The proposed dynamic T/H circuit enables high-speed, low-power operation. The prototype is fabricated in a 65-nm CMOS technology. The total active area is 0.14,mm2 and the ADC consumes 36 mW from a 1.2-V supply. The measured results show the peak spurious-free dynamic range (SFDR) and signal-to-noise-and-distortion ratio (SNDR) are 52.4 dB and 39.6 dB, respectively, and an figure of Merit (FoM) of 300 fJ/conv. is achieved.
Kazuhiro GOI Kenji ODA Hiroyuki KUSAKA Akira OKA Yoshihiro TERADA Kensuke OGAWA Tsung-Yang LIOW Xiaoguang TU Guo-Qiang LO Dim-Lee KWONG
20-Gbps non return-to-zero (NRZ) – binary phase shift keying (BPSK) using the silicon Mach-Zehnder modulator is demonstrated and characterized. Measurement of a constellation diagram confirms successful modulation of 20-Gbps BPSK with the silicon modulator. Transmission performance is characterized in the measurement of bit-error-rate in accumulated dispersion range from -347 ps/nm to +334 ps/nm using SMF and a dispersion compensating fiber module. Optical signal-to-noise ratio required for bit-error-rate of 10-3 is 10.1 dB at back-to-back condition. It is 1.2-dB difference from simulated value. Obtained dispersion tolerance less than 2-dB power penalty for bit-error-rate of 10-3 is -220 ps/nm to +230 ps/nm. The symmetric dispersion tolerance indicates chirp-free modulation. Frequency chirp inherent in the modulation mechanism of the silicon MZM is also discussed with the simulation. The effect caused by the frequency chirp is limited to 3% shift in the chromatic dispersion range of 2 dB power penalty for BER 10-3. The effect inherent in the silicon modulation mechanism is confirmed to be very limited and not to cause any significant degradation in the transmission performance.
Substrate coupling of radio frequency (RF) components is represented by equivalent circuits unifying a resistive mesh network with lumped capacitors in connection with the backside of device models. Two-port S-parameter test structures are used to characterize the strength of substrate coupling of resistors, capacitors, inductors, and MOSFETs in a 65 nm CMOS technology with different geometries and dimensions. The consistency is finely demonstrated between simulation with the equivalent circuits and measurements of the test structures, with the deviation of typically less than 3 dB for passive and 6 dB for active components, in the transmission properties for the frequency range of interest up to 8 GHz.
Satoshi TAKAYA Yoji BANDO Toru OHKAWA Toshiharu TAKARAMOTO Toshio YAMADA Masaaki SOUDA Shigetaka KUMASHIRO Tohru MOGAMI Makoto NAGATA
The response of differential pairs against low-frequency substrate voltage variation is captured in a combined transistor and substrate network models. The model generation is regularized for variation of transistor geometries including channel sizes, fingering and folding, and the placements of guard bands. The expansion of the models for full-chip substrate noise analysis is also discussed. The substrate sensitivity of differential pairs is evaluated through on-chip substrate coupling measurements in a 90 nm CMOS technology with more than 64 different geometries and operating conditions. The trends and strengths of substrate sensitivity are shown to be well consistent between simulation and measurements.
Fei LI Masaya MIYAHARA Akira MATSUZAWA
Recent attempts to directly combine CMOS pixel readout chips with modern gas detectors open the possibility to fully take advantage of gas detectors. Those conventional readout LSIs designed for hybrid semiconductor detectors show some issues when applied to gas detectors. Several new proposed readout LSIs can improve the time and the charge measurement precision. However, the widely used basic charge sensitive amplifier (CSA) has an almost fixed dynamic range. There is a trade-off between the charge measurement resolution and the detectable input charge range. This paper presents a method to apply the folding integration technique to a basic CSA. As a result, the detectable input charge dynamic range is expanded while maintaining all the key merits of a basic CSA. Although folding integration technique has already been successfully applied in CMOS image sensors, the working conditions and the signal characteristics are quite different for pixel readout LSIs for gas particle detectors. The related issues of the folding CSA for pixel readout LSIs, including the charge error due to finite gain of the preamplifier, the calibration method of charge error, and the dynamic range expanding efficiency, are addressed and analyzed. As a design example, this paper also demonstrates the application of the folding integration technique to a Qpix readout chip. This improves the charge measurement resolution and expands the detectable input dynamic range while maintaining all the key features. Calculations with SPICE simulations show that the dynamic range can be improved by 12 dB while the charge measurement resolution is improved by 10 times. The charge error during the folding operation can be corrected to less than 0.5%, which is sufficient for large input charge measurement.
Kiichi NIITSU Naohiro HARIGAI Takahiro J. YAMAGUCHI Haruo KOBAYASHI
This paper describes a high-speed, robust, scalable, and low-cost feed-forward time amplifier that uses phase detectors and variable delay lines. The amplifier works by detecting the time difference between two rising input edges with a phase detector and adjusting the delay of the variable delay line accordingly. A test chip was designed and fabricated in 65 nm CMOS. The measured resulting performance indicates that it is possible to amplify time difference while maintaining high-speed operation.
Tong WANG Toshiya MITOMO Naoko ONO Shigehito SAIGUSA Osamu WATANABE
A four-stage power amplifier (PA) with 10 GHz 1-dB bandwidth (56–66 GHz) is presented. The broadband performance is achieved owing to π-section interstage matching network. Three-stage-current-reuse topology is proposed to enhance efficiency. The amplifier has been fabricated in 65 nm digital CMOS. 18 dB power gain and 9.6 dBm saturated power (Psat) are achieved at 60 GHz. The PA consumes current of 50 mA at 1.2 V supply voltage, and has a peak power-added efficiency (PAE) of 13.6%. To the best of the authors' knowledge, this work shows the highest PAE among the reported CMOS PAs that covers the worldwide 9 GHz ISM millimeter-wave band with less-than-1.2 V supply voltage.
In this paper, a high performance current latch sense amplifier (CLSA) with vertical MOSFET is proposed, and its performances are investigated. The proposed CLSA with the vertical MOSFET realizes a 11% faster sensing time with about 3% smaller current consumption relative to the conventional CLSA with the planar MOSFET. Moreover, the proposed CLSA with the vertical MOSFET achieves an 1.11 dB increased voltage gain G(f) relative to the conventional CLSA with the planar MOSFET. Furthermore, the proposed CLSA realizes up to about 1.7% larger yield than the conventional CLSA, and its circuit area is 42% smaller than the conventional CLSA.
Hiroshi NAKAMURA Weihan WANG Yuya OHTA Kimiyoshi USAMI Hideharu AMANO Masaaki KONDO Mitaro NAMIKI
Power consumption has recently emerged as a first class design constraint in system LSI designs. Specially, leakage power has occupied a large part of the total power consumption. Therefore, reduction of leakage power is indispensable for efficient design of high-performance system LSIs. Since 2006, we have carried out a research project called “Innovative Power Control for Ultra Low-Power and High-Performance System LSIs”, supported by Japan Science and Technology Agency as a CREST research program. One of the major objectives of this project is reducing the leakage power consumption of system LSIs by innovative power control through tight cooperation and co-optimization of circuit technology, architecture, and system software designs. In this project, we focused on power gating as a circuit technique for reducing leakage power. Temporal granularity is one of the most important issue in power gating. Thus, we have developed a series of Geysers as proof-of-concept CPUs which provide several mechanisms of fine-grained run-time power gating. In this paper, we describe their concept and design, and explain why co-optimization of different design layers are important. Then, three kinds of power gating implementations and their evaluation are presented from the view point of power saving and temporal granularity.
Minoru IIZUKA Naohiro HAMADA Hiroshi SAITO
This paper proposes an ASIC design support tool set for non-pipelined asynchronous circuits with bundled-data implementation. This tool set consists of seven tools to automate design processes of bundled-data implementation such as the generation of design constraints, timing verification, and delay adjustment considering a given latency constraint. With the proposed design flow which combines the proposed tool set and commercial CAD tools, most of design processes from an RTL model is fully automated. In the experiments, to show the effectiveness of energy consumption in bundled-data implementation compared to synchronous counterpart, this paper synthesizes several circuits with a latency constraint which is generated from the synchronous counterpart with the minimum clock cycle time.
Tatsuki HYODO Gaku ASAKURA Kiwamu TSUKADA Masashi KATO
This letter proposes an analog active noise control (ANC) circuit with an all-pass filter (APF). To improve performance of the previously reported analog ANC circuit, we inserted an APF to the circuit in order to fit phases of a noise and an electrical signal in the circuit. As a result, we confirmed improvement of the noise canceling effect of the analog ANC circuit.
Toshihiro KONISHI Keisuke OKUNO Shintaro IZUMI Masahiko YOSHIMOTO Hiroshi KAWAGUCHI
We present a small-area second-order all-digital time-to-digital converter (TDC) with two frequency shift oscillators (FSOs) comprising inverter chains and dynamic flipflops featuring low jitter. The proposed FSOs can maintain their phase states through continuous oscillation, unlike conventional gated ring oscillators (GROs) that are affected by transistor leakage. Our proposed FSOTDC is more robust and is eligible for all-digital TDC architectures in recent leaky processes. Low-jitter dynamic flipflops are adopted as a quantization noise propagator (QNP). A frequency mismatch occurring between the two FSOs can be canceled out using a least mean squares (LMS) filter so that second-order noise shaping is possible. In a standard 65-nm CMOS process, an SNDR of 61 dB is achievable at an input bandwidth of 500 kHz and a sampling rate of 16 MHz, where the respective area and power are 700 µm2 and 281 µW.
Takeshi OKUMOTO Kumpei YOSHIKAWA Makoto NAGATA
An effective supply voltage monitor evaluates dynamic variation of (Vdd-Vss) within power rails of integrated circuits on a die. The monitor occupies an area of as small as 10.8 14.5 µm2 and is followed by backend digitizing circuits, both using 3.3 V thick oxide transistors in a 65 nm CMOS technology for covering all power domains from core circuits to peripheral I/O rings. A prototype demonstrates capturing of effective supply voltage waveforms in digital (shift registers) as well as in analog (4 bit Flash ADC) circuits.