Tatsuya SAKANUSHI Jie HU Kou YAMADA
The simple repetitive control system proposed by Yamada et al. is a type of servomechanism for periodic reference inputs. This system follows a periodic reference input with a small steady-state error, even if there is periodic disturbance or uncertainty in the plant. In addition, simple repetitive control systems ensure that transfer functions from the periodic reference input to the output and from the disturbance to the output have finite numbers of poles. Yamada et al. clarified the parameterization of all stabilizing simple repetitive controllers. Recently, Yamada et al. proposed the parameterization of all stabilizing two-degrees-of-freedom (TDOF) simple repetitive controllers that can specify the input-output characteristic and the disturbance attenuation characteristic separately. However, when using the method of Yamada et al., it is complex to specify the low-pass filter in the internal model for the periodic reference input that specifies the frequency characteristics. This paper extends the results of Yamada et al. and proposes the parameterization of all stabilizing TDOF simple repetitive controllers with specified frequency characteristics in which the low-pass filter can be specified beforehand.
Teerachot SIRIBURANON Takahiro SATO Ahmed MUSA Wei DENG Kenichi OKADA Akira MATSUZAWA
This paper presents a 20 GHz push-push VCO realized by a 10 GHz super-harmonic coupled quadrature oscillator for a quadrature 60 GHz frequency synthesizer. The output nodes are peaked by a tunable second harmonic resonator. The proposed VCO is implemented in a 65 nm CMOS process. It achieves a tuning range of 3.5 GHz, from 16.1 GHz to 19.6 GHz, with a phase noise of -106 dBc/Hz at a 1 MHz offset. The core oscillators consume 10.3 mW, and an FoM of -181.3 dBc/Hz is achieved.
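As a rough cross-check of the figures quoted above, the conventional oscillator figure of merit can be evaluated from the phase noise, carrier frequency, offset and power. The sketch below assumes the widely used definition FoM = L(Δf) - 20·log10(f0/Δf) + 10·log10(P/1 mW); the abstract does not say which carrier frequency the reported FoM refers to, so both ends of the tuning range are evaluated. The function name `vco_fom` is illustrative only.

```python
import math

def vco_fom(phase_noise_dbc_hz, f_carrier_hz, f_offset_hz, power_mw):
    """Conventional oscillator figure of merit in dBc/Hz (assumed definition)."""
    return (phase_noise_dbc_hz
            - 20.0 * math.log10(f_carrier_hz / f_offset_hz)
            + 10.0 * math.log10(power_mw / 1.0))

# Evaluate at both ends of the reported tuning range (assumption: the same
# -106 dBc/Hz phase noise at 1 MHz offset applies across the range).
for f_ghz in (16.1, 19.6):
    fom = vco_fom(-106.0, f_ghz * 1e9, 1e6, 10.3)
    print(f"{f_ghz} GHz: FoM = {fom:.1f} dBc/Hz")
```

Under these assumptions the reported -181.3 dBc/Hz lies between the values obtained at the two band edges.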
Yasuhiro SUGIMOTO Kazuma SAKATOH
Circuit techniques to enhance the linearity of input-voltage-to-current (V/I) conversion and to increase the output impedance of a current source by compensating for the low intrinsic gain of a transistor were introduced to realize a high-frequency operational transconductance amplifier (OTA) for a low supply voltage using sub-100-nm CMOS processes. Applying these techniques, a MOS 7th-order Gm-C linear-phase low-pass filter (LPF) was realized using a 65 nm CMOS process. A simplified biquad LPF that can serve as a component of a 7th-order LPF was newly developed by replacing OTAs with resistors. As a result, the -3 dB frequency bandwidth, group delay ripple, 3rd-order distortion, and 3rd-order input intercept point (IIP3) were 200 MHz, 2.2%, ≤ -55 dB with a 100 MHz input, and +10.3 dBm, respectively, all with a ± 0.1 Vp-p input signal at each input terminal in the pseudodifferential configuration. The LPF including an output buffer dissipated 60 mW in the case of a 1.2 V supply. Wide spurious-free dynamic range (SFDR) characteristics were confirmed up to high frequencies.
Fei LI Masaya MIYAHARA Akira MATSUZAWA
Recent attempts to directly combine CMOS pixel readout chips with modern gas detectors open the possibility of taking full advantage of gas detectors. Conventional readout LSIs designed for hybrid semiconductor detectors show some issues when applied to gas detectors. Several newly proposed readout LSIs can improve the time and charge measurement precision. However, the widely used basic charge sensitive amplifier (CSA) has an almost fixed dynamic range, so there is a trade-off between the charge measurement resolution and the detectable input charge range. This paper presents a method to apply the folding integration technique to a basic CSA. As a result, the detectable input charge dynamic range is expanded while maintaining all the key merits of a basic CSA. Although the folding integration technique has already been successfully applied in CMOS image sensors, the working conditions and the signal characteristics are quite different for pixel readout LSIs for gas particle detectors. The related issues of the folding CSA for pixel readout LSIs, including the charge error due to the finite gain of the preamplifier, the calibration method for the charge error, and the dynamic range expansion efficiency, are addressed and analyzed. As a design example, this paper also demonstrates the application of the folding integration technique to a Qpix readout chip. This improves the charge measurement resolution and expands the detectable input dynamic range while maintaining all the key features. Calculations with SPICE simulations show that the dynamic range can be improved by 12 dB while the charge measurement resolution is improved by a factor of 10. The charge error during the folding operation can be corrected to less than 0.5%, which is sufficient for large input charge measurement.
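The idea of folding integration can be illustrated with a small behavioral model. This is only a sketch under assumptions, not the circuit described above: each time the integrated charge reaches a fold threshold, a fixed charge packet is subtracted and a counter is incremented, so the total charge is recovered from the fold count plus the residual. The names `q_fold` and `n_max_folds`, and the choice of three folds, are hypothetical.

```python
import math

def folding_integrate(q_in, q_fold, n_max_folds):
    """Return (fold_count, residual) for an input charge q_in.

    Behavioral model: every q_fold of integrated charge is removed as one
    packet and counted, up to n_max_folds packets.
    """
    folds = min(int(q_in // q_fold), n_max_folds)
    residual = q_in - folds * q_fold
    return folds, residual

def reconstruct(folds, residual, q_fold):
    """Recover the total input charge from the counter and the residual."""
    return folds * q_fold + residual

def range_gain_db(n_max_folds):
    """Full scale grows from q_fold to (n_max_folds + 1) * q_fold."""
    return 20.0 * math.log10(n_max_folds + 1)

folds, residual = folding_integrate(35, 10, 3)   # charges in arbitrary units
print(folds, residual, reconstruct(folds, residual, 10))
print(f"range gain with 3 folds: {range_gain_db(3):.2f} dB")
```

In this simplified model, three folds extend the full-scale charge by a factor of four, which is about 12 dB when expressed as 20·log10; whether the reported 12 dB corresponds to this fold count is an assumption.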
Norifumi KAMIYA Yoichi HASHIMOTO Masahiro SHIGIHARA
In this paper, we present a novel class of long quasi-cyclic low-density parity-check (QC-LDPC) codes. Each of the codes in this class has a structure formed by concatenating single-parity-check codes and QC-LDPC codes of shorter lengths, which allows for efficient, high-throughput encoder/decoder implementations. Using a code in this class, we design a forward error correction (FEC) scheme for optical transmission systems and present its high-throughput encoder/decoder architecture. In order to demonstrate its feasibility, we implement the architecture on a field programmable gate array (FPGA) platform. We show by both FPGA-based simulations and measurements of an optical transmission system that the FEC scheme achieves excellent error performance and that the constraint on its structure causes no significant performance degradation while making an efficient, high-throughput implementation feasible.
Ryota SEKIMOTO Akira SHIKATA Kentaro YOSHIOKA Tadahiro KURODA Hiroki ISHIKURO
An ultra-low-power, low-voltage successive-approximation-register (SAR) analog-to-digital converter (ADC) with a timing-optimized asynchronous clock generator is presented. By calibrating the delay of the clock generator, the DAC settling wait time is adaptively optimized to counter device mismatch. This technique improved the maximum sampling frequency by 40% while keeping the ENOB around 7 bits at a 0.4 V analog and 0.7 V digital power supply voltage. The dependency of the delay time on the power supply has a small effect on the accuracy of conversion. A 9% decrease in supply voltage degrades the ENOB by only 0.1 bit, and the proposed calibration can give delay margins for high voltage swing. The prototype ADC, fabricated in a 40 nm CMOS process, achieved a figure of merit (FoM) of 8.75 fJ/conversion-step at 2.048 MS/s with a 0.6 V analog and 0.7 V digital power supply voltage. The ADC can operate from 50 S/s to 8 MS/s while keeping the ENOB over 7.5 bits.
Takao KIHARA Tomohiro SANO Masakazu MIZOKAMI Yoshikazu FURUTA Mitsuhiko HOKAZONO Takaya MARUYAMA Tetsuya HEIMA Hisayasu SATO
We present a multiband LTE SAW-less CMOS transmitter with source-follower-driven passive mixers, envelope-tracked RF-programmable gain amplifiers (RF-PGAs), and Marchand Baluns. A driver stage for passive mixers is realized by a source follower, which enables a quadrature modulator (QMOD) to achieve low noise performance at a 1.2 V supply and contributes to a small-area and low-power transmitter. An envelope-tracking technique is adopted to improve the linearity of RF-PGAs and obtain a better Evolved Universal Terrestrial Radio Access Adjacent Channel Leakage power Ratio (E-UTRA ACLR). The Marchand balun covers more frequency bands than a transformer and is more suitable for multiband operation. The proposed transmitter, which also includes digital-to-analog converters and a phase-locked loop, is implemented in a 65-nm CMOS process. The implemented transmitter achieves E-UTRA ACLR of less than -42 dBc and RX-band noise of less than -158 dBc/Hz in the frequency range of 700 MHz–2.6 GHz. These performances are good enough for multiband LTE and SAW-less operation.
Lei SUN Zhenyu LIU Takeshi IKENAGA
Scalable Video Coding (SVC) is an extension of H.264/AVC that aims to provide the ability to adapt to heterogeneous networks or requirements. It offers great flexibility for bitstream adaptation in multi-point applications such as videoconferencing. However, transcoding between SVC and AVC is necessary due to the existence of legacy AVC-based systems. The straightforward re-encoding method incurs a high computational cost, and delay-sensitive applications like videoconferencing require a much faster transcoding scheme. This paper proposes an ultra-low-delay SVC-to-AVC MGS (Medium-Grain quality Scalability) transcoder for videoconferencing applications. Transcoding is performed purely in the frequency domain with partial decoding/encoding in order to achieve a significant speed-up. Three fast frequency-domain transcoding methods are proposed for macroblocks with different coding modes in non-KEY pictures. KEY pictures are transcoded by reusing the base layer motion data, and error propagation is constrained between KEY pictures. Simulation results show that the proposed transcoder achieves an average speed-up of 38.5 times compared with the re-encoding method, while introducing a coding quality loss of merely 0.71 dB BDPSNR for videoconferencing sequences.
Suyue LI Jian XIONG Peng CHENG Lin GUI Youyun XU
One major challenge to implement orthogonal frequency division multiplexing (OFDM) systems over doubly selective channels is the non-negligible intercarrier interference (ICI), which significantly degrades the system performance. Existing solutions to cope with ICI include zero-forcing (ZF), minimum mean square error (MMSE) and other linear or nonlinear equalization methods. However, these schemes fail to achieve a satisfactory tradeoff between performance and computational complexity. To address this problem, in this paper we propose two novel nonlinear ICI cancellation techniques, which are referred to as parallel interference cancelation (PIC) and hybrid interference cancelation (HIC). Taking advantage of the special structure of basis expansion model (BEM) based channel matrices, our proposed schemes enjoy low computational complexity and are capable of cancelling ICI effectively. Moreover, since the proposed schemes can flexibly select different basis functions and be independent of the channel statistics, they are applicable to practical OFDM based systems such as DVB-T2 over doubly selective channels. Theoretical analysis and simulation results both confirm their performance-complexity advantages in comparison with some existing methods.
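To make the notion of parallel interference cancellation concrete, the following is a minimal textbook-style sketch, not the BEM-based scheme proposed above. It assumes a real-valued channel matrix `H` whose off-diagonal entries model ICI, BPSK symbols, and hard decisions; all subcarrier estimates are updated simultaneously from the previous iteration's decisions. The function name `pic_detect` and the example matrix are hypothetical.

```python
def pic_detect(H, y, iterations=3):
    """Generic parallel interference cancellation for BPSK symbols (+1/-1)."""
    n = len(y)

    def slicer(v):
        # Hard decision onto the BPSK alphabet.
        return 1.0 if v >= 0 else -1.0

    # Initial estimate: one-tap equalization that ignores ICI.
    x = [slicer(y[k] / H[k][k]) for k in range(n)]
    for _ in range(iterations):
        # Every subcarrier subtracts the ICI rebuilt from the previous
        # estimate; all updates use the same old vector (parallel update).
        x = [
            slicer((y[k] - sum(H[k][j] * x[j] for j in range(n) if j != k))
                   / H[k][k])
            for k in range(n)
        ]
    return x

# Banded example: unit diagonal, 0.3 leakage to adjacent subcarriers.
H = [[1.0, 0.3, 0.0, 0.0],
     [0.3, 1.0, 0.3, 0.0],
     [0.0, 0.3, 1.0, 0.3],
     [0.0, 0.0, 0.3, 1.0]]
x_true = [1.0, -1.0, -1.0, 1.0]
y = [sum(H[k][j] * x_true[j] for j in range(4)) for k in range(4)]
print(pic_detect(H, y))
```

Each iteration costs one pass over the nonzero entries of `H`, so a banded matrix keeps the per-iteration work linear in the number of subcarriers; the complexity figures of the proposed schemes themselves are not reproduced here.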
Hao ZHANG Mengshu HUANG Yimeng ZHANG Tsutomu YOSHIHARA
This paper proposes a novel approach for implementing an ultra-low-power voltage reference using the structure of self-cascode MOSFET, operating in the subthreshold region with a self-biased body effect. The difference between the two gate-source voltages in the structure enables the voltage reference circuit to produce a low output voltage below the threshold voltage. The circuit is designed with only MOSFETs and fabricated in standard 0.18-µm CMOS technology. Measurements show that the reference voltage is about 107.5 mV, and the temperature coefficient is about 40 ppm/°C over a range from -20°C to 80°C. The voltage line sensitivity is 0.017%/V. The minimum supply voltage is 0.85 V, and the supply current is approximately 24 nA at 80°C. The occupied chip area is around 0.028 mm².
Muchen LI Jinjia ZHOU Dajiang ZHOU Xiao PENG Satoshi GOTO
As the successor to the H.264/AVC video compression standard, High Efficiency Video Coding (HEVC) will play an important role in the video coding area. In the deblocking filter part, HEVC inherits the basic properties of H.264/AVC and adds some new features. Based on this variation, this paper introduces a novel dual-mode deblocking filter architecture that supports both the HEVC and H.264/AVC standards. For the HEVC standard, the proposed symmetric unified-cross unit (SUCU) based filtering scheme greatly reduces the design complexity. As a result, processing a 16×16 block needs 24 clock cycles. For the H.264/AVC standard, it takes 48 clock cycles for a 16×16 macroblock (MB). In the synthesis result, the proposed architecture occupies 41.6k equivalent gates at a frequency of 200 MHz in the SMIC 65 nm library, which satisfies the throughput requirement of super hi-vision (SHV) at 60 fps. With the filter reuse scheme, the universal design for the two standards saves 30% of the gate count in the filter part compared with dedicated designs. In addition, the total power consumption can be reduced by 57.2% with a skipping mode when the edges need not be filtered.
Xing LIU Daiyuan PENG Xianhua NIU Fang LIU
In order to evaluate the goodness of a frequency hopping (FH) sequence design, the periodic Hamming correlation function is used as an important measure. However, the aperiodic Hamming correlation of FH sequences matters in real applications, yet it has received little attention in the literature compared with the periodic Hamming correlation. In this paper, new aperiodic Hamming correlation lower bounds for FH sequences are established with respect to the size of the frequency slot set, the sequence length, the family size, the maximum aperiodic Hamming autocorrelation and the maximum aperiodic Hamming crosscorrelation. The new aperiodic bounds are tighter than the Peng-Fan bounds. In addition, the new bounds include the second powers of the maximum aperiodic Hamming autocorrelation and the maximum aperiodic Hamming crosscorrelation, whereas the Peng-Fan bounds do not. For a given sequence length, family size and frequency slot set size, the values of the maximum aperiodic Hamming autocorrelation and the maximum aperiodic Hamming crosscorrelation lie inside an ellipse given by the new aperiodic bounds.
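The difference between the two correlation measures can be shown with a short sketch. It assumes the usual definitions: the periodic Hamming correlation counts frequency coincidences with the shifted sequence wrapped around cyclically, while the aperiodic version counts only the overlapping part without wrap-around. The example sequences are arbitrary and the function names are illustrative.

```python
def periodic_hamming(x, y, tau):
    """Coincidences between x and y cyclically shifted by tau."""
    length = len(x)
    return sum(1 for i in range(length) if x[i] == y[(i + tau) % length])

def aperiodic_hamming(x, y, tau):
    """Coincidences over the overlapping part only (no wrap-around)."""
    length = len(x)
    return sum(1 for i in range(length - tau) if x[i] == y[i + tau])

def max_aperiodic_autocorrelation(x):
    """Maximum aperiodic Hamming autocorrelation over nonzero shifts."""
    return max(aperiodic_hamming(x, x, tau) for tau in range(1, len(x)))

x = [0, 1, 2, 0, 3]          # hop pattern over frequency slots {0, 1, 2, 3}
y = [1, 2, 0, 3, 0]          # x shifted cyclically by one position
print(periodic_hamming(x, y, 4), aperiodic_hamming(x, y, 4))
print(max_aperiodic_autocorrelation(x))
```

For the same shift, the aperiodic value never exceeds the periodic one, since it counts a subset of the same position pairs.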
Shan-Chun KUO Hong-Yuan JHENG Fan-Chieh CHENG Shanq-Jang RUAN
In this letter, an inverse discrete cosine transform (IDCT) design with significant energy and area reduction is presented for an energy-efficient watermarking mechanism based on DS-CDMA. Exploiting the converged set of input data values as a precomputation concept, the proposed one-dimensional IDCT is multiplierless hardware that differs from the Loeffler architecture and offers low complexity and low power consumption. The experimental results show that our design can reduce energy consumption by 85.2% and area by 58.6%. Various spectrum and spatial attacks are also tested to corroborate the robustness.
Xiaopeng JIAO Jianjun MU Rong SUN
Turbo equalization is an iterative equalization and decoding technique that can achieve impressive performance gains for communication systems. In this letter, we investigate the turbo equalization method for the decoding of the Davey-MacKay (DM) construction over IDS-AWGN channels, that is, a cascade of an insertion, deletion, substitution (IDS) channel and an additive white Gaussian noise (AWGN) channel. The inner decoder for the DM construction can be seen as a maximum a posteriori (MAP) detector. It receives the beliefs generated by the outer LDPC decoder when turbo equalization is used. Two decoding schemes with different kinds of inner decoders, namely a hard-input inner decoder and a soft-input inner decoder, are investigated. Simulation results show that significant performance gains are obtained for both decoders with respect to the insertion/deletion probability at different SNR values.
Min-Chul SUN Sang Wan KIM Garam KIM Hyun Woo KIM Hyungjin KIM Byung-Gook PARK
A novel tunneling field-effect transistor (TFET) featuring the sigma-shape embedded SiGe sources and recessed channel is proposed. The gate facing the source effectively focuses the E-field at the tip of the source and eliminates the gradual turn-on issue of planar TFETs. The fabrication scheme modified from the state-of-the-art 45 nm/32 nm CMOS technology flows provides a unique benefit in the co-integrability and the control of ID-VGS characteristics. The feasibility is verified with TCAD process simulation of the device with 14 nm of the gate dimension. The device simulation shows 5-order change in the drain current with a gate bias change less than 300 mV.
Kosuke MIZUNO Kenta TAKAGI Yosuke TERACHI Shintaro IZUMI Hiroshi KAWAGUCHI Masahiko YOSHIMOTO
This paper describes a Histogram of Oriented Gradients (HOG) feature extraction accelerator that features a VLSI-oriented HOG algorithm with early classification in Support Vector Machine (SVM) classification, dual core architecture for parallel feature extraction and multiple object detection, and detection-window-size scalable architecture with reconfigurable MAC array for processing objects of several shapes. To achieve low-power consumption for mobile applications, early classification reduces the amount of computations in SVM classification efficiently with no accuracy degradation. The dual core architecture enables parallel feature extraction in one frame for high-speed or low-power computing and detection of multiple objects simultaneously with low power consumption by HOG feature sharing. Objects of several shapes, a vertically long object, a horizontally long object, and a square object, can be detected because of cooperation between the two cores. The proposed methods provide processing capability for HDTV resolution video (1920×1080 pixels) at 30 frames per second (fps). The test chip, which has been fabricated using 65 nm CMOS technology, occupies 4.2×2.1 mm² containing 502 Kgates and 1.22 Mbit on-chip SRAMs. The simulated data show 99.5 mW power consumption at 42.9 MHz and 1.1 V.
In order to reduce the dynamic energy dissipation in CMOS LSIs, it is effective to reduce the frequency of value changes of the signals. In this paper, a data expression with the valid digit and lower digit overflow information is proposed to suppress unnecessary signal changes in integer functional units and registers of general purpose processors. Experimental results show that the proposed method reduces the energy dissipation by 9.8% for benchmark programs.
Benjamin DEVLIN Makoto IKEDA Kunihiro ASADA
In this paper we show that self-synchronous circuits can provide robust operation in both soft-error-prone and low-voltage operating environments. Self-synchronous circuits are shown to be self-checking, where a soft error will either cause a detectable error or halt operation of the circuit. A watchdog circuit is proposed to autonomously detect dual-rail '11' errors and prevent propagation, with measurements in 65 nm CMOS showing seamless operation from 1.6 V to 0.37 V. Compared to a system without the watchdog circuit, size and energy-per-operation are increased by 6.9% and 16%, respectively, while error tolerance to noise is improved by 83% and 40% at 1.2 V and 0.4 V, respectively. A circuit that uses the dual-pipeline circuit style as redundancy against permanent faults is also presented, and 40 nm CMOS measurement results show correct operation with throughput of 1.2 GHz and 810 MHz at 1.1 V before and after disabling a faulty pipeline stage, respectively.
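The dual-rail '11' condition mentioned above can be illustrated with a small behavioral sketch. It assumes a common dual-rail convention in which each bit is carried on a (true, false) wire pair: (0, 0) is the spacer, (1, 0) is logic 1, (0, 1) is logic 0, and (1, 1) is an illegal state. The actual rail assignment and watchdog implementation of the circuit are not given in the abstract, so the names below are hypothetical.

```python
def decode_dual_rail(t, f):
    """Decode one dual-rail pair under the assumed convention."""
    if (t, f) == (0, 0):
        return "spacer"     # no data present yet
    if (t, f) == (1, 0):
        return 1
    if (t, f) == (0, 1):
        return 0
    return "error"          # (1, 1): both rails asserted

def watchdog(word):
    """Return True if any rail pair in the word is in the illegal '11' state."""
    return any(t == 1 and f == 1 for t, f in word)

good_word = [(1, 0), (0, 1), (0, 0)]
bad_word = [(1, 0), (1, 1), (0, 1)]    # a soft error raised the second rail
print(watchdog(good_word), watchdog(bad_word))
```

A flagged word can then be blocked before it propagates to the next stage, which is the role the abstract attributes to the watchdog.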
Changyong PAN Linglong DAI Zhixing YANG
Time domain synchronous orthogonal frequency division multiplexing (TDS-OFDM) has higher spectral efficiency than the standard cyclic prefix OFDM (CP-OFDM) because it replaces the random CP with a known training sequence (TS), which can also be used for synchronization and channel estimation. However, TDS-OFDM suffers from performance loss over fading channels because iterative interference cancellation has to be used to remove the mutual interference between the TS and the useful data. To solve this problem, a novel TS-based OFDM transmission scheme, referred to as unified time-frequency OFDM (UTF-OFDM), is proposed, in which the time-domain TS and the frequency-domain pilots are carefully designed to naturally avoid the interference from the TS to the data without any reconstruction. The proposed UTF-OFDM based flexible frame structure supports effective channel estimation and reliable channel equalization, while imposing a significantly lower complexity than the TDS-OFDM system at the cost of a slightly reduced spectral efficiency. Simulation results demonstrate that the proposed UTF-OFDM substantially outperforms the existing TDS-OFDM in terms of the system's achievable bit error rate.
Hyuk-Jun LEE Seung-Chul KIM Eui-Young CHUNG
A packet memory stores packets in internet routers and typically requires a buffer space of RTT×C, e.g. several GBytes, where RTT is the average round-trip time of a TCP flow and C is the bandwidth of the router's output link. It is implemented with DRAM parts that are accessed in parallel to achieve the required bandwidth. They consume significant power in a router, whose scalability is heavily limited by power and heat problems. Previous work shows that the packet memory size can be reduced to RTT×C/√N, where N is the number of long-lived TCP flows. In this paper, we propose a novel packet memory architecture that splits the packet memory into on-chip and off-chip packet memories. We also propose a low-power packet mapping method for this architecture that estimates the latency of packets and maps packets with small latencies to the on-chip memory. The experimental results show that our proposed architecture and mapping method reduce the dynamic power consumption of the off-chip memory by as much as 94.1% with only 50% of the packet buffer size suggested by the previous work in realistic scenarios.
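The buffer sizing rules mentioned above can be put into numbers with a short sketch. It assumes the classic bandwidth-delay product RTT×C and the commonly cited reduction by a factor of √N for N long-lived TCP flows; the RTT, link rate and flow count used in the example are illustrative values, not figures from the paper.

```python
import math

def buffer_bytes(rtt_s, link_bps, n_flows=1):
    """Buffer size in bytes: RTT x C, divided by sqrt(N) for N flows."""
    return rtt_s * link_bps / 8.0 / math.sqrt(n_flows)

rtt = 0.25            # assumed average round-trip time in seconds
link = 40e9           # assumed output link bandwidth in bit/s
flows = 10_000        # assumed number of long-lived TCP flows

print(f"RTT x C         : {buffer_bytes(rtt, link) / 1e9:.2f} GB")
print(f"RTT x C / sqrt(N): {buffer_bytes(rtt, link, flows) / 1e6:.1f} MB")
```

With these assumed values the requirement drops from the gigabyte range to the megabyte range, which is what makes a partly on-chip packet memory plausible.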