1-6hit |
Yuncheng ZHANG Bangan LIU Teruki SOMEYA Rui WU Junjun QIU Atsushi SHIRANE Kenichi OKADA
This paper presents a fully integrated yet compact receiver front-end for Sub-GHz applications such as Internet-of-Things (IoT). The low noise amplifier (LNA) matching network leverages an inductance boosting technique. A relatively small on-chip inductor with a compact area achieves impedance matching in such a low frequency. Moreover, a passive-mixer-first mode bypasses the LNA to extend the receiver dynamic-range. The passive mixer provides matching to the 50Ω antenna interface to eliminate the need for additional passive components. Therefore, the receiver can be fully-integrated without any off-chip matching components. The flipped-voltage-follower (FVF) cell is adopted in the low pass filter (LPF) and the variable gain amplifier (VGA) for its high linearity and low power consumption. Fabricated in 65nm LP CMOS process, the proposed receiver front-end occupies 0.37mm2 core area, with a tolerable input power ranging from -91.5dBm to -1dBm for 500kbps GMSK signal at 924MHz frequency. The power consumption is 1mW power under a 1.2V supply.
Heming SUN Dajiang ZHOU Peilin LIU Satoshi GOTO
In this paper, we present an area-efficient 4/8/16/32-point inverse discrete cosine transform (IDCT) architecture for a HEVC decoder. Compared with previous work, this work reduces the hardware cost from two aspects. First, we reduce the logical costs of 1D IDCT by proposing a reordered parallel-in serial-out (RPISO) scheme. By using the RPISO scheme, we can reduce the required calculations for butterfly inputs in each cycle. Secondly, we reduce the area of transpose architecture by proposing a cyclic data mapping scheme that can achieve 100% I/O utilization of each SRAM. To design a fully pipelined 2D IDCT architecture, we propose a pipelining schedule for row and column transform. The results show that the normalized area by maximum throughput for the logical IDCT part can be reduced by 25%, and the memory area can be reduced by 62%. The maximum throughput reaches 1248 Mpixels/s, which can support real-time decoding of a 4K × 2K 60fps video sequence.
Indika U. K. BOGODA APPUHAMYLAGE Shunsuke OKURA Toru IDO Kenji TANIGUCHI
This paper proposes an area efficient, low power, fractional CMOS bandgap reference (BGR) utilizing switched-current and current-memory techniques. The proposed circuit uses only one parasitic bipolar transistor and built-in current source to generate reference voltage. Therefore significant area and power reduction is achieved, and bipolar transistor device mismatch is eliminated. In addition, output reference voltage can be set to almost any value. The proposed circuit is designed and simulated in 0.18 µm CMOS process, and simulation results are presented. With a 1.6 V supply, the reference produces an output of about 628.5 mV, and simulated results show that the temperature coefficient of output is less than 13.8 ppm/ in the temperature range from 0 to 100. The average current consumption is about 8.5 µA in the above temperature range. The core circuit, including current source, opamp, current mirrors and switched capacitor filters, occupies less than 0.0064 mm2 (80 µm×80 µm).
Joonhee LEE Sungjun KIM Sehyung JEON Woojae LEE SeongHwan CHO
This letter presents an ultra low-jitter clock generator that employs an area-efficient LC-VCO. In order to fully utilize the area of the on-chip inductor, the loop filter of a phase locked loop (PLL) is located underneath the inductor. A prototype chip implemented in 0.13 µm CMOS process achieves 105 MHz to 225 MHz of clock frequency while consuming 4.2 mW from 1.2 V supply. The measured rms jitter and normalized rms jitter of the proposed clock generator are 2.8 ps and 0.031% at 105 MHz, respectively.
Hiroaki YAMAOKA Makoto IKEDA Kunihiro ASADA
This paper presents a new high-speed and area-efficient dual-rail PLA. The proposed circuit includes three schemes: 1) a divided column scheme (DCS), 2) a programmable sense-amplifier activation scheme (PSAS), and 3) an interdigitated column scheme (ICS). In the DCS, a column circuit of a PLA is divided and each circuit operates in parallel. This enhances the performance of the PLA, and this scheme becomes more effective as input data bandwidth increases. The PSAS is used to generate an activation pulse for sense amplifiers in the PLA. In this scheme, the proposed delay generators enable to minimize a timing margin depending on process variations and operating conditions. The ICS is used to enhance the area-efficiency of the PLA, where a method of physical compaction is employed. This scheme is effective for circuits which have the regularity in logic function such as arithmetic circuits. As applications of the proposed PLA, a comparator, a priority encoder, and an incrementor for 128-bit data processing were designed. The proposed circuit design schemes achieved a 22.2% delay reduction and a 37.5% area reduction on average over the conventional high-speed and low-power PLA in a 0.13-µm CMOS technology with a supply voltage of 1.2 V.
Hiroaki YAMAOKA Hiroaki YOSHIDA Makoto IKEDA Kunihiro ASADA
This paper describes an area-efficient dual-rail array logic architecture, a logic-cell-embedded PLA (LCPLA), which has 2-input logic cells in the structure. The 2-input logic cells composed of pass-transistors can realize any 2-input Boolean function and are embedded in a dual-rail PLA. The logic cells can be designed by connecting some local wires and do not require additional transistors over logic cells of the conventional dual-rail PLA. By using the logic cells, some classes of logic functions can be implemented efficiently, so that high-speed and low-power operations are also achieved. The advantages over the conventional PLAs and standard-cell-based designs were demonstrated by using benchmark circuits, and the LCPLA is shown to be effective to reduce the number of product terms. In a structure with a 64-bit input and a 1-bit output including 220 product terms, the LCPLA achieved an area reduction by 35% compared to the conventional high-speed dual-rail PLA, and the power-delay product was reduced by 74% and 46% compared to the conventional high-speed single-rail PLA and the conventional high-speed dual-rail PLA, respectively. A test chip of this configuration was fabricated using a 0.35-µm, 3-metal-layer CMOS technology, and was verified with a functional test using a logic tester and an electron-beam tester at frequencies of up to 100 MHz with a supply voltage of 3.3 V.