Ryuichi FUJIMOTO Kyoya TAKANO Mizuki MOTOYOSHI Uroschanit YODPRASIT Minoru FUJISHIMA
Device modeling techniques for high-frequency circuits operating at over 100 GHz are presented. We have proposed the bond-based design as an accurate high-frequency circuit design method. Because layout parasitic extractions (LPE) are not required in the bond-based design, it can be applied to high-frequency circuit design at over 100 GHz. However, customized device models are indispensable for the bond-based design. In this paper, device modeling techniques for high-frequency circuit design using the bond-based design are proposed. Customized device models for MOSFETs, transmission lines and pads are introduced. By using the customized device models, the difference between the simulated and measured gains of an amplifier is reduced to less than 0.6 dB at 120 GHz.
Osamu NISHII Yoichi YUYAMA Masayuki ITO Yoshikazu KIYOSHIGE Yusuke NITTA Makoto ISHIKAWA Tetsuya YAMADA Junichi MIYAKOSHI Yasutaka WADA Keiji KIMURA Hironori KASAHARA Hideo MAEJIMA
We built a 12.4 mm × 12.4 mm, 45-nm CMOS chip that integrates eight 648-MHz general purpose cores, two matrix processor (MX-2) cores, four flexible engine (FE) cores and media IP (VPU5) to establish a heterogeneous multi-core chip architecture. The IPC (instructions per cycle) performance of the general purpose core was enhanced by adding 32-bit instructions to the existing 16-bit fixed-length instruction set and executing up to two 32-bit instructions per cycle. Considering the next five to seven years of embedded LSIs and the increasing number of access masters within an LSI, we predict that the memory usage of a single core will not exceed the 32-bit physical address space (i.e. 4 GB), but chip-total memory usage will exceed 4 GB. Based on this prediction, the physical address was expanded from 32 bits to 40 bits. The fabricated chip was tested, and parallel operation of eight general purpose cores, four FE cores and eight data transfer units (DTU) was achieved on AAC (Advanced Audio Coding) encode processing.
Yiqing HUANG Xiaocong JIN Jin ZHOU Jia SU Takeshi IKENAGA
A high profile intra predictor generation engine is proposed in this paper. Firstly, hardware level algorithm optimization for the intra 8×8 (I8MB) mode is introduced. The original candidate pixels for generating prediction samples of I8MB are replaced with boundary pixels of intra 4×4 (I4MB) blocks. Based on this adoption, full data reuse between predictors of I4MB and filtered samples of I8MB can be achieved with almost no quality loss. Secondly, a lossless two-4×4-block based parallel predictor generation flow is proposed. The original predictor generation flow is optimized from 16 stages to 10 stages for I4MB and intra 16×16 (I16MB), which saves 37.5% of the processing cycles. For I8MB, a similar methodology with a different processing order of 4×4 scaled blocks is introduced. Thirdly, fully utilized hardwired engines for I4MB, I16MB and I8MB are proposed. Except for the DC (direct current) and plane modes, full data reuse among all intra modes of high profile can be achieved. Fourthly, for the DC mode, a combined predictor generation process is introduced, and predictor generation of I16MB's DC mode is merged into the process of I4MB's DC mode. Moreover, by configuring the proposed hardwired engines, predictor generation of I16MB's plane mode and the chrominance plane mode can be accomplished with only 50% of the cycles of the original design. In total, compared with the original full-mode design and the latest dynamic mode reused design, the proposed predictor generation engine achieves 89.5% and 73.2% savings in processing cycles, respectively. Synthesized with TSMC 0.18 µm technology under worst-case operating conditions (1.62 V, 125°C), at 380 MHz and with 37.2 k gates, the proposed design can handle real-time high profile intra predictor generation of Super Hi-Vision 4k×4k@60 fps. The maximum operating frequency of our design under the worst-case condition is 468 MHz.
Jinjia ZHOU Dajiang ZHOU Gang HE Satoshi GOTO
In this paper, we present a cache based motion compensation (MC) architecture for a Quad-HD H.264/AVC video decoder. With the significantly increased throughput requirement, VLSI design for MC is greatly challenged by the huge area cost and power consumption. Moreover, the long memory system latency leads to a performance drop in the MC pipeline. To solve these problems, three optimization schemes are proposed in this work. Firstly, a high-performance interpolator based on Horizontal-Vertical Expansion and Luma-Chroma Parallelism (HVE-LCP) is proposed to increase the processing throughput to more than 4 times that of previous designs. Secondly, an efficient cache memory organization scheme (4S×4) is adopted to improve the on-chip memory utilization, which contributes to a memory area saving of 25% and a memory power saving of 39 to 49%. Finally, by employing a Split Task Queue (STQ) architecture, the cache system is capable of tolerating a much longer latency of the memory system. Consequently, the cache idle time is reduced by 90%, which contributes to reducing the overall processing time by 24 to 40%. When implemented with the SMIC 90 nm process, this design costs a logic gate count of 108.8 k and an on-chip memory of 3.1 kB. The proposed MC architecture can support real-time processing of 3840×2160@60 fps at less than 166 MHz.
Koichi YAMAGUCHI Masayuki MIZUNO
Duobinary signaling has been introduced into asymmetric multi-chip communications such as DRAM or display interfaces; it allows a controlled amount of ISI to reduce the signaling bandwidth by 2/3. A ×2 oversampled equalization has been developed to realize Duobinary signaling. Symbol-rate clock recovery from the Duobinary signal has been developed to reduce the power consumption of receivers. A Duobinary transmitter test chip was fabricated with a 90-nm CMOS process. A 3.5 dB increase in eye height and a 1.5 times increase in eye width were observed.
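The controlled-ISI idea can be illustrated with a textbook duobinary encoder. The following is a minimal sketch, assuming the standard scheme in which each transmitted level is the sum of the current and previous symbols, with a differential precoder so that the receiver can decide each bit without memory. The function names are hypothetical, and the sketch does not model the oversampled equalizer or the clock recovery of the test chip.

```python
def precode(bits):
    """Differential precoder: p[n] = bits[n] XOR p[n-1], with p[-1] assumed 0."""
    out, prev = [], 0
    for b in bits:
        prev ^= b
        out.append(prev)
    return out


def duobinary_encode(precoded):
    """Map bits to +/-1 levels and add each level to its predecessor.

    The sum is the controlled inter-symbol interference; the result is a
    three-level signal taking values in {-2, 0, +2}.
    """
    prev = -1  # level of the assumed initial precoder state (bit 0)
    out = []
    for b in precoded:
        level = 2 * b - 1
        out.append(level + prev)
        prev = level
    return out


def duobinary_decode(symbols):
    """With precoding, a 0 level means the data bit was 1; +/-2 means 0."""
    return [1 if s == 0 else 0 for s in symbols]


if __name__ == "__main__":
    data = [1, 0, 1, 1, 0, 0, 1]
    line = duobinary_encode(precode(data))
    print(line)
    print(duobinary_decode(line) == data)
```

The precoder is what makes symbol-by-symbol detection possible: without it, one decision error at the receiver would propagate to the following bits.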
We propose a new majority voting scheme for identifying downlink primary scrambling code, where two voting processes with different coherent correlation intervals (CCIs) are simultaneously performed. A false alarm probability and a threshold adjustment for the proposed scheme are investigated, and it is shown by computer simulations that the proposed scheme can perform well over a wide range of frequency offsets.
In this letter a simplified Jury's table for real polynomials is extended to complex polynomials. Then it is shown that the extended table contains information on the root distribution of complex polynomials with respect to the unit circle in the complex plane. The result given in this letter is distinct from the recent one in that root counting is performed in a different way.
Jong-Ok KIM Peter DAVIS Tetsuro UEDA Sadao OBANA
In this paper, we address adaptive link switching over heterogeneous wireless access networks including IEEE 802.11. When an IEEE 802.11 link is congested, the transmission link of a terminal with multi-RATs (radio access technologies) is switched to another radio access system. To this end, we propose link-level metrics of LC (link cost) and AC (access cost) for quantifying TCP congestion over IEEE 802.11 networks. The proposed metrics can be easily measured at a local wireless terminal, which enables each multi-RAT terminal to work in a distributed way. Through various indoor and outdoor experiments using a test-bed system, we verify that the proposed link-level metrics are good indicators of TCP traffic congestion. Experimental results show that, when used for transmission link switching, the proposed metrics can detect the occurrence of congestion quickly and avoid TCP throughput degradation of neighboring terminals.
Tadayoshi ENOMOTO Nobuaki KOBAYASHI
We developed and applied a new circuit, called the “Self-controllable Voltage Level (SVL)” circuit, to achieve expanded “read” and “write” margins and low leakage power in a 90-nm, 2-kbit, six-transistor CMOS SRAM. At the threshold voltage fluctuation of 6σ, the minimum supply voltage of the newly developed (dvlp.) SRAM for “write” operation was significantly reduced to 0.11 V, less than half that of an equivalent conventional (conv.) SRAM. The standby leakage power of the dvlp. SRAM was only 1.17 µW, which is 4.64% of that of the conv. SRAM at a supply voltage of 1.0 V. Moreover, the maximum operating clock frequency of the dvlp. SRAM was 138 MHz, which is 15% higher than that (120 MHz) of the conv. SRAM at VMM of 0.4 V. The area overhead was 0.81% of that of the conv. SRAM.
In this paper, we show the recent progress of photonic network technologies for the new generation network (NWGN). The NWGN is based on new design concepts that look beyond the next generation network (NGN) and the Internet. The NWGN will maintain the sustainability of our prosperous civilization and help resolve various social issues and problems by the use of information and communication technologies. In order to realize the NWGN, many novel technologies in the physical layer are required, in addition to technologies in the network control layer. Examples of cutting-edge physical layer technologies required to realize the NWGN include a terabit/s/port or greater ultra-wideband optical packet switching system, a modulation-format-free optical packet switching (OPS) node, a hybrid optoelectronic packet switching node, a packet-based reconfigurable optical add/drop multiplexer (ROADM) system, an optical packet and circuit integrated node system, and optical buffering technologies.
This paper introduces a practical color filter array (CFA) interpolation technique. Among the many technologies proposed in this field, the inter-color methods that exploit correlation between color planes generally outperform the intra-color approaches. We have found that the filtering direction, e.g., horizontal or vertical, is among the most decisive factors for the performance of the CFA interpolation. However, most of the state-of-the-art technologies are not flexible enough in determining the filtering direction. For example, filtering only in the upper direction is not usually supported. In this context, we propose an inter-color CFA interpolation using a local map called unified geometry map (UGM). In this method, the filtering direction is determined based on the similarity of the local map data. Thus, it provides more choices of the filtering directions, which enhances the probability of finding the most appropriate direction. It is confirmed through simulations that the proposal outperforms the state-of-the-art algorithms in terms of objective quality measures. In addition, the proposed scheme is as inexpensive as the conventional methods with regard to resource consumption.
Guo-Ming SUNG Ying-Tsu LAI Chien-Lin LU
This paper presents a resistor-compensation technique for a CMOS bandgap and current reference, which utilizes various high positive temperature coefficient (TC) resistors, a two-stage operational transconductance amplifier (OTA) and a simplified start-up circuit in the 0.35-µm CMOS process. In the proposed bandgap and current reference, numerous compensated resistors, which have a high positive TC, are added to the parasitic n-p-n and p-n-p bipolar junction transistor devices to generate a temperature-independent voltage reference and current reference. The measurements verify a current reference of 735.6 nA, a voltage reference of 888.1 mV, and a power consumption of 91.28 µW at a supply voltage of 3.3 V. The voltage TC is 49 ppm/°C in the temperature range from 0 to 100°C and 12.8 ppm/°C from 30 to 100°C. The current TC is 119.2 ppm/°C at temperatures of 0 to 100°C. Measurement results also demonstrate a stable voltage reference at high temperature (> 30°C), and a constant current reference at low temperature (< 70°C).
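Temperature coefficients such as those quoted above are commonly computed with the box method. The abstract does not state which definition the authors used, so the formula below (spread of the reference divided by its mean and by the temperature span, scaled to ppm) is an assumption, and the sample readings are hypothetical values rather than measured data.

```python
def temperature_coefficient_ppm(values, temperatures):
    """Box-method TC in ppm per degree.

    TC = (max - min) / (mean * (T_max - T_min)) * 1e6
    `values` are reference readings (volts or amperes) taken at the
    corresponding entries of `temperatures`.
    """
    if len(values) != len(temperatures) or len(values) < 2:
        raise ValueError("need at least two matching readings")
    span = max(values) - min(values)
    mean = sum(values) / len(values)
    delta_t = max(temperatures) - min(temperatures)
    if delta_t == 0 or mean == 0:
        raise ValueError("temperature span and mean must be non-zero")
    return span / (mean * delta_t) * 1e6


if __name__ == "__main__":
    # Hypothetical readings of a 1 V reference over a 100-degree sweep.
    temps = [0.0, 50.0, 100.0]
    volts = [0.999, 1.000, 1.001]
    print(temperature_coefficient_ppm(volts, temps))
```

Because the box method uses only the extreme values, narrowing the temperature range can lower the reported TC, which is one way a figure for a sub-range can be smaller than the figure for the full sweep.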
Junqi ZHANG Lina NI Chen XIE Ying TAN Zheng TANG
This paper presents an adaptive magnification transformation based particle swarm optimizer (AMT-PSO) that provides an adaptive search strategy for each particle along the search process. Magnification transformation is a simple but very powerful mechanism, which is inspired by using a convex lens to see things more clearly. The essence of this transformation is to set a magnifier around an area we are interested in, so that we can inspect the area of interest more carefully and precisely. An evolutionary factor, which utilizes the information of population distribution in the particle swarm, is used as an index to adaptively tune the magnification scale factor for each particle in each dimension. Furthermore, a perturbation-based elitist learning strategy is utilized to help the swarm's best particle escape the local optimum and explore the potentially better space. The AMT-PSO is evaluated on 15 unimodal and multimodal benchmark functions. The effects of the adaptive magnification transformation mechanism and the elitist learning strategy in AMT-PSO are studied. Results show that the adaptive magnification transformation mechanism provides the main contribution to the proposed AMT-PSO in terms of convergence speed and solution accuracy on four categories of benchmark test functions.
Huakang LI Jie HUANG Qunfei ZHAO
In this paper, we propose a method for robot self-position identification by active sound localization. This method can be used for autonomous security robots working in room environments. A system using an AIBO robot equipped with two microphones and a wireless network is constructed and used for position identification experiments. Differences in arrival time to the robot's microphones are used as localization cues. To overcome the ambiguity of front-back confusion, a three-head-position measurement method is proposed. The position of the robot can be identified by the intersection of circles restricted using the azimuth differences among different sound beacon pairs. By localizing three or four loudspeakers as sound beacons positioned at known locations, the robot can identify its position with an average error of 7 cm in a 2.5 × 3.0 m² working space in the horizontal plane. We propose adjusting the arrival time differences (ATDs) to reduce the errors caused when the sound beacons are mounted high. A robot navigation experiment was conducted to demonstrate the effectiveness of the proposed position-identification system.
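The way an arrival time difference yields an azimuth, and why two microphones alone leave a front-back ambiguity, can be sketched with the usual far-field relation sin(θ) = c·ATD / d. This is a minimal illustration under assumed conditions (plane-wave source, sound speed of 343 m/s, a hypothetical microphone spacing), not the authors' three-head-position procedure.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def azimuth_candidates(atd_s, mic_spacing_m, c=SPEED_OF_SOUND_M_S):
    """Return the two azimuths (radians) consistent with one ATD.

    The angle is measured from the direction straight ahead of the
    microphone pair. A single ATD cannot distinguish a source in front
    (theta) from its mirror image behind (pi - theta).
    """
    ratio = c * atd_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against measurement noise
    front = math.asin(ratio)
    back = math.pi - front
    return front, back


if __name__ == "__main__":
    spacing = 0.10  # hypothetical 10 cm microphone spacing
    atd = 0.5 * spacing / SPEED_OF_SOUND_M_S
    front, back = azimuth_candidates(atd, spacing)
    print(math.degrees(front), math.degrees(back))
```

Repeating the measurement at a different head orientation changes the ATD differently for the front and back candidates, which is the kind of extra information a multi-position measurement can exploit.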
Yu TAMURA Toru IDO Kenji TANIGUCHI
A dynamic dither gain control technique for multi-level delta-sigma Digital-to-Analog Converters (DACs) using multi-stage Dynamic Element Matching (DEM) with a second order loop filter is proposed. The proposed technique provides improvement on the mismatch shaping performance through dynamic control of delta-sigma modulator dither gain. A large dither gain, which suppresses DEM operation dependency on input signal, is applied to delta-sigma modulator, when DEM loop filter output is greater than a designed reference. The design example using the proposed technique on a third order 17-level delta-sigma modulator with 3-stage cascaded DEM is shown in this paper. Simulation result with 1% analog segment mismatch shows over 10 dB improvement of THD+N performance under -50 dB amplitude input signal, compared to the case without the proposed technique.
Akira FUJIMAKI Isao NAKANISHI Shigeyuki MIYAJIMA Kohei ARAI Yukio AKITA Takekazu ISHIDA
We propose a neutron diffractometer system based on MgB2 thin film detectors and an SFQ signal processor. Small dimensions of MgB2 thin film detectors and high processing capability of the single flux quantum (SFQ) circuits enable us to handle several thousand or more detectors in a cryocooler, leading to a very compact system. In addition, the system can provide many diffraction patterns for different kinetic energies simultaneously. Kinetic energy is determined for individual neutrons by means of the time-of-flight method by using SFQ time-to-digital converters (TDCs). Digital outputs of the TDCs are multiplexed in time domain and sent to room-temperature electronics with reduced number of cables. A dual-input SFQ signal processor including TDCs and a multiplexer has been successfully demonstrated with a time resolution of 20 ns and power consumption of 400 µW. These values show high feasibility of the neutron diffraction system proposed here.
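The time-of-flight step can be sketched with the non-relativistic relation E = ½·m·(L/t)². The flight path length used below is a hypothetical value, since the abstract does not give the instrument geometry; only the 20 ns time resolution is taken from the text. The relative energy resolution follows from differentiating E with respect to t, giving ΔE/E = 2·Δt/t.

```python
NEUTRON_MASS_KG = 1.67492749804e-27  # CODATA neutron mass
EV_IN_JOULES = 1.602176634e-19       # exact by SI definition


def kinetic_energy_milli_ev(flight_path_m, flight_time_s):
    """Non-relativistic neutron kinetic energy in meV from time of flight."""
    velocity = flight_path_m / flight_time_s
    energy_j = 0.5 * NEUTRON_MASS_KG * velocity ** 2
    return energy_j / EV_IN_JOULES * 1e3


def relative_energy_resolution(flight_time_s, time_resolution_s):
    """dE/E = 2 * dt / t, from E being proportional to 1 / t**2."""
    return 2.0 * time_resolution_s / flight_time_s


if __name__ == "__main__":
    path_m = 22.0    # hypothetical flight path
    tof_s = 0.010    # gives 2200 m/s, a thermal neutron
    print(kinetic_energy_milli_ev(path_m, tof_s))
    print(relative_energy_resolution(tof_s, 20e-9))
```

For flight times in the millisecond range, a 20 ns timing resolution contributes a relative energy uncertainty on the order of parts per million, so other factors such as path length uncertainty would be expected to dominate in this simplified picture.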
Takeshi YUASA Yukihiro TAHARA Naofumi YONEDA Hideyuki OH-HASHI
A millimeter-wave termination that is tolerant to the resistance error of the embedded resistive film in a multi-layered LTCC substrate has been developed. The tolerance to the resistance error is accomplished using two bifurcated strip lines overlapping with the resistive film, whose lengths are different from each other. It has been experimentally demonstrated that the proposed termination configuration is effective in enhancing the tolerance to the resistance error of the embedded resistive film in the LTCC substrate.
Keisuke KUROIWA Masataka MORIYA Tadayuki KOBAYASHI Yoshinao MIZUGAKI
Although larger scale integration enhances the practicability of superconducting Josephson circuits, several technical problems begin to emerge during its progress. One of the problems is the increase of current through a ground plane (ground current). Excess ground current produces additional magnetic field and reduces operation margins of the circuits, because superconducting Josephson devices are very sensitive to magnetic field. In this paper, we evaluate current distribution in a superconducting ground plane by means of both experiments and numerical calculation. We also verify two methods for suppressing the ground current. One is a slot structure in the ground plane, and the other is alignment of the current-extraction point. Suppression of the ground current is quantitatively evaluated.
Hiroyuki AKAIKE Naoto NAITO Yuki NAGAI Akira FUJIMAKI
We describe the fabrication processes and electrical characteristics of two types of NbN junctions. One is a self-shunted NbN/NbNx/AlN/NbN Josephson junction, which is expected to improve the density of integrated circuits; the other is an underdamped NbN/AlNx/NbN tunnel junction with radical-nitride AlNx barriers, which has highly controllable junction characteristics. In the former, the junction characteristics were changed from underdamped to overdamped by varying the thickness of the NbNx layer. Overdamped junctions with a 6-nm-thick NbNx film exhibited a characteristic voltage of Vc = 0.8 mV and a critical current density of Jc = 22 A/cm² at 4.2 K. In the junctions with radical-nitride AlNx barriers, Jc could be controlled in the range 0.01-3 kA/cm² by varying the process conditions, and good uniformity of the junction characteristics was obtained.
Bandwidth is an extremely valuable and scarce resource in multimedia networks. Therefore, efficient bandwidth management is necessary in order to provide high Quality of Service (QoS) to users. In this paper, a new QoS-aware bandwidth allocation algorithm is proposed for the efficient use of available bandwidth. By using the multi-objective optimization technique and Talmud allocation rule, the bandwidth is adaptively controlled to maximize network efficiency while ensuring QoS provisioning. In addition, we adopt the online feedback strategy to dynamically respond to current network conditions. With a simulation study, we demonstrate that the proposed algorithm can adaptively approximate an optimized solution under widely diverse traffic load intensities.
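The Talmud allocation rule mentioned above can be sketched in its classical form (Aumann and Maschler's formulation of the bankruptcy problem). The mapping to networking is an assumption made for illustration: the requested bandwidths play the role of claims and the link capacity plays the role of the estate. The abstract does not describe how the rule is combined with the multi-objective optimization or the online feedback, so neither is modeled here.

```python
def constrained_equal_awards(estate, claims):
    """Give each claimant min(claim, lam), with lam set so awards sum to estate.

    Assumes 0 <= estate <= sum(claims).
    """
    awards = [0.0] * len(claims)
    remaining = float(estate)
    order = sorted(range(len(claims)), key=lambda i: claims[i])
    for pos, i in enumerate(order):
        # Equal split of what is left among claimants not yet served.
        share = remaining / (len(claims) - pos)
        awards[i] = min(float(claims[i]), share)
        remaining -= awards[i]
    return awards


def talmud_allocation(estate, claims):
    """Classical Talmud rule.

    If the estate covers at most half of the total claims, split it by
    constrained equal awards on the half-claims. Otherwise split the
    shortfall by constrained equal awards on the half-claims and subtract
    it from each full claim.
    """
    total = sum(claims)
    if not 0 <= estate <= total:
        raise ValueError("estate must lie between 0 and the total claims")
    half_claims = [c / 2 for c in claims]
    if estate <= total / 2:
        return constrained_equal_awards(estate, half_claims)
    losses = constrained_equal_awards(total - estate, half_claims)
    return [c - loss for c, loss in zip(claims, losses)]


if __name__ == "__main__":
    requests = [100, 200, 300]  # hypothetical bandwidth requests
    for capacity in (100, 200, 300, 400):
        print(capacity, talmud_allocation(capacity, requests))
```

When capacity is scarce the rule behaves like an equal split, protecting small requests; when capacity is close to the total demand it spreads the shortfall equally instead, so no flow loses more than half of its request until every flow has reached that point.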