Aravind THARAYIL NARAYANAN Wei DENG Dongsheng YANG Rui WU Kenichi OKADA Akira MATSUZAWA
An all-digital, fully synthesizable, PVT-tolerant clock data recovery (CDR) architecture for wireline chip-to-chip interconnects is presented. The proposed architecture enables co-synthesis of the CDR with the digital core. By eliminating the resource-hungry manual layout and interfacing steps that conventional CDR topologies require, it can drastically shorten the design process and the time-to-market. In addition, the architecture allows the majority of the sub-systems to be reused, which eases migration to different process nodes. The CDR is also equipped with a self-calibration scheme to ensure tolerance to PVT variation. The proposed fully synthesizable CDR was implemented in 28 nm FDSOI. The system achieves a maximum data rate of 10.06 Gbps while consuming 16.1 mW from a 1 V power supply.
Xiaowei DENG Takahiro HANYU Michitaka KAMEYAMA
The investigation of device functions required from the systems point of view will be important for the development of the next generation of VLSI devices and systems. In this paper, a super pass transistor (SPT) model is presented as a quantum device candidate for future VLSI systems based on multiple-valued logic. A possible quantum device structure for the SPT model is also described, which employs the concepts of a lateral-resonant-tunneling quantum-dot transistor and a heterostructure field-effect transistor. Since it has the powerful capability of detecting multiple signal levels, the SPT will be useful for the implementation of highly compact multiple-valued VLSI systems. To exploit the functionality of the SPT, a super pass gate (SP-gate) corresponding to a single SPT is proposed as a multiple-valued universal logic module. The mathematical properties of the SP-gate are discussed. A design method for a multiple-valued SP-gate network is presented. An application of SP-gates to a multiple-valued image processing system is also demonstrated. The SP-gate network for the multiple-valued image processing system is evaluated in comparison with the corresponding NMOS implementation in terms of the number of transistors, interconnections and cascaded transistor stages. The size of a generalized series-parallel SP-gate network is also evaluated in comparison with a functionally equivalent multiple-valued series-parallel MOS pass transistor network. The results show that highly compact multiple-valued VLSI systems can be achieved if the SPT-model can be realized by an actual quantum device.
Teerachot SIRIBURANON Takahiro SATO Ahmed MUSA Wei DENG Kenichi OKADA Akira MATSUZAWA
This paper presents a 20 GHz push-push VCO realized by a 10 GHz super-harmonic coupled quadrature oscillator for a quadrature 60 GHz frequency synthesizer. The output nodes are peaked by a tunable second harmonic resonator. The proposed VCO is implemented in 65 nm CMOS process. It achieves a tuning range of 3.5 GHz from 16.1 GHz to 19.6 GHz with a phase noise of -106 dBc/Hz at 1 MHz offset. The power consumption of the core oscillators is 10.3 mW and an FoM of -181.3 dBc/Hz is achieved.
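The FoM quoted above can be sanity-checked against the conventional VCO figure of merit, FoM = L(Δf) - 20·log10(f0/Δf) + 10·log10(P/1 mW). The abstract does not say which definition or which oscillation frequency was used, so the sketch below assumes this common definition and evaluates it at both ends of the reported tuning range. The function name `vco_fom` is illustrative.

```python
import math

def vco_fom(phase_noise_dbc_hz, f_osc_hz, f_offset_hz, power_mw):
    """Conventional VCO figure of merit in dBc/Hz (assumed definition)."""
    return (phase_noise_dbc_hz
            - 20.0 * math.log10(f_osc_hz / f_offset_hz)
            + 10.0 * math.log10(power_mw))

# Phase noise, offset and power come from the abstract;
# the evaluation frequencies (band edges) are an assumption.
fom_low_edge = vco_fom(-106.0, 16.1e9, 1e6, 10.3)
fom_high_edge = vco_fom(-106.0, 19.6e9, 1e6, 10.3)
print(f"FoM at 16.1 GHz: {fom_low_edge:.1f} dBc/Hz")
print(f"FoM at 19.6 GHz: {fom_high_edge:.1f} dBc/Hz")
```

Under these assumptions the reported -181.3 dBc/Hz lies between the two band-edge values, which is consistent with an FoM taken at a frequency inside the tuning range.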
Ziwei DENG Yilin HOU Xina CHENG Takeshi IKENAGA
3D ball tracking is of great significance in ping-pong game analysis, with applications such as TV content and tactical analysis, some of which require real-time implementation. This paper proposes a particle filter for multi-view ball tracking on a CPU-GPU platform, comprising four proposals. The multi-peak estimation and the ball-like observation model are proposed in the algorithm design. The multi-peak estimation aims to obtain a precise ball position when the particles' likelihood distribution has multiple peaks under complex circumstances. The ball-like observation model, with four different likelihood evaluations, uses the ball's unique features to evaluate each particle's similarity to the target. In the GPU implementation, the double-queue structure and the vectorized data combination are proposed. The double-queue structure aims to achieve task parallelism between some data-independent tasks. The vectorized data combination reduces the memory access time by combining 3 different image data into 1 vector. Experiments are based on ping-pong videos of an official match, recorded by 4 cameras located at the 4 corners of the court. The tracking success rate reaches 99.59% on CPU. With GPU acceleration, the time consumption is 8.8 ms/frame, a speedup by a factor of 98 over the CPU version.
Dongsheng YANG Tomohiro UENO Wei DENG Yuki TERASHIMA Kengo NAKATA Aravind Tharayil NARAYANAN Rui WU Kenichi OKADA Akira MATSUZAWA
A fully synthesizable all-digital phase-locked loop (AD-PLL) with a stochastic time-to-digital converter (STDC) is proposed in this paper. The whole AD-PLL circuit uses only standard cells from a digital library, so its layout can be automatically synthesized by a commercial place-and-route (P&R) tool with a foundry-provided standard-cell library. No manual layout or process modification is required anywhere in the AD-PLL design. To solve the delay mismatch issue in the delay-line-based time-to-digital converter (TDC), an STDC employing only standard D flip-flops (DFFs) is presented, which mitigates the sensitivity to layout mismatch resulting from automatic P&R. The key idea of the stochastic TDC is to exploit the layout uncertainty due to automatic P&R, which follows a Gaussian distribution according to statistical theory. Moreover, the fully synthesized STDC can achieve a finer resolution than the conventional TDC. Implemented in a 28 nm fully depleted silicon on insulator (FDSOI) technology, the fully synthesized PLL consumes only 480 µW from a 1.0 V power supply while operating at 0.9 GHz. It achieves a figure of merit (FoM) of -231.1 dB with 4.0 ps RMS jitter while occupying a chip area of only 0.0055 mm².
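The stochastic-TDC idea can be illustrated with a small behavioral model. This is a minimal sketch under stated assumptions, not the circuit from the paper: each flip-flop is modeled as an arbiter with a fixed Gaussian timing offset, and the number of flip-flops that fire is mapped back to a time difference through the inverse Gaussian CDF. The flip-flop count, the offset spread, and all function names are hypothetical.

```python
import random
from statistics import NormalDist

def make_offsets(n_dff, sigma_ps, seed=0):
    """Fixed random timing offset per flip-flop, standing in for P&R mismatch."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma_ps) for _ in range(n_dff)]

def stdc_count(dt_ps, offsets):
    """Number of flip-flops whose offset is exceeded by the input time difference."""
    return sum(1 for off in offsets if dt_ps > off)

def estimate_dt(count, n_dff, sigma_ps):
    """Map the count back to a time difference via the inverse Gaussian CDF."""
    # Clamp the fraction away from 0 and 1, where the inverse CDF is undefined.
    p = min(max(count / n_dff, 0.5 / n_dff), 1.0 - 0.5 / n_dff)
    return sigma_ps * NormalDist().inv_cdf(p)

N_DFF, SIGMA_PS = 1024, 10.0  # hypothetical values, not from the paper
offsets = make_offsets(N_DFF, SIGMA_PS)
for dt in (-5.0, 0.0, 5.0):
    count = stdc_count(dt, offsets)
    print(dt, count, round(estimate_dt(count, N_DFF, SIGMA_PS), 2))
```

In this model the count rises monotonically with the input time difference, and averaging over many mismatched flip-flops is what lets the count resolve differences much smaller than the spread of any single offset.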
Rui WU Wei DENG Shinji SATO Takuichi HIRANO Ning LI Takeshi INOUE Hitoshi SAKANE Kenichi OKADA Akira MATSUZAWA
A 60-GHz CMOS transmitter with on-chip antenna for high-speed short-range wireless interconnections is presented. The radiation gain of the on-chip antenna is doubled using helium-3 ion irradiation technique. The transmitter core is composed of a resistive-feedback RF amplifier, a double-balanced passive mixer, and an injection-locked oscillator. The wideband and power-saving design of the transmitter core guarantees the low-power and high-data-rate characteristic. The transmitter fabricated in a 65-nm CMOS process achieves 5-Gb/s data rate with an EVM performance of -12 dB for BPSK modulation at a distance of 1 mm. The whole transmitter consumes 17 mW from a 1.2-V supply and occupies a core area of 0.64 mm² including the on-chip antenna. The gain-enhanced antenna together with the wideband and power-saving design of the transmitter provides a low-power low-cost full on-chip solution for the short-range high-data-rate wireless communication.
Wei DENG Kenichi OKADA Akira MATSUZAWA
This paper investigates a clock frequency generator for ultra-low-voltage, sub-picosecond-jitter clock generation in future 0.5-V LSI and power-aware LSI. To explore a possible solution for ultra-low-voltage applications, a 0.5 V clock frequency generator is proposed and implemented. The measured performance, namely sub-1-ps jitter, a 50 MHz-to-6.4 GHz frequency tuning range in 2 bands, and sub-1-mW DC power consumption (PDC), demonstrates that the generator is a viable replacement for ring oscillators in low-voltage, low-jitter clock generators.
Hanli LIU Teerachot SIRIBURANON Kengo NAKATA Wei DENG Ju Ho SON Dae Young LEE Kenichi OKADA Akira MATSUZAWA
This paper presents a 27.5-29.6 GHz fractional-N frequency synthesizer that uses reference and frequency doublers to achieve low in-band and out-of-band phase noise for 5G mobile communications. Considering the baseband carrier recovery circuit helps estimate the phase noise requirement for high-order modulation schemes. The push-push amplifier and 28 GHz balun help achieve differential signals with low out-of-band phase noise while consuming low power. A charge pump with gated offset and a reference doubler help reduce PD noise, resulting in low in-band phase noise, while a sampling loop filter helps reduce spurs. The proposed synthesizer has been implemented in 65 nm CMOS technology, achieving in-band and out-of-band phase noise of -78 dBc/Hz and -126 dBc/Hz, respectively. It consumes a total power of only 33 mW. The jitter-power figure of merit (FOM) is -231 dB, which is the best among state-of-the-art >20 GHz fractional-N PLLs using a low reference clock (<200 MHz). The measured reference spurs are less than -80 dBc.
Teerachot SIRIBURANON Wei DENG Kenichi OKADA Akira MATSUZAWA
This paper presents a constant-current-controlled class-C VCO using a self-adjusting replica bias circuit. The proposed class-C VCO is better suited to real-life applications because it maintains a constant current, which makes its phase noise performance more robust to variation in the gate bias of the cross-coupled pair than a traditional approach, without the amplitude modulation issue. The proposed VCO is implemented in a 180 nm CMOS process. It achieves a tuning range of 4.8-4.9 GHz with a phase noise of -121 dBc/Hz at 1 MHz offset. The power consumption of the core oscillators is 4.8 mW and an FoM of -189 dBc/Hz is achieved.
Yilin HOU Ziwei DENG Xina CHENG Takeshi IKENAGA
In real-time 3D ball tracking for sports analysis, complex algorithms that ensure accuracy can be time-consuming. Particle-filter-based algorithms have large potential for acceleration, since the per-particle computation can be parallelized on a heterogeneous CPU-GPU platform. Still, for the target multi-view 3D ball tracking algorithm, challenges exist: 1) the serial flow of the steps in the algorithm; 2) repeated processing across the multiple views; 3) the low degree of parallelism in the reweight and resampling steps due to sequential processing. On the CPU-GPU platform, this paper proposes the double stream system flow, the view priority based threads allocation, and the binary search oriented reweight. The double stream system flow assigns tasks with no data dependency between them to different streams in each frame's processing, achieving parallelism at the system structure level. The view priority based threads allocation manages threads in the multi-view observation task: the number of threads is the number of views multiplied by the number of particles, and assigning them with view priority helps both memory access and computation achieve parallelism. The binary search oriented reweight reduces the time complexity by avoiding generation of the cumulative distribution function and uses an unordered array to implement a binary search. The experiment is based on videos of the final game of an official volleyball match (2014 Inter-High School Games of Men's Volleyball, held in Tokyo Metropolitan Gymnasium in Aug. 2014); the test sequences were taken by a multiple-view system of 4 cameras located at the four corners of the court. The success rate reaches 99.23%, the same as the target algorithm, while the time consumption is reduced from 75.1 ms/frame in the CPU environment to 3.05 ms/frame in the proposed system, a 24.62 times speedup; it also achieves a 2.33 times speedup compared with a basic GPU implementation.
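For context, a common textbook form of the particle-filter resampling step builds a cumulative distribution over the particle weights and binary-searches it once per draw. The sketch below shows only that conventional baseline, which the abstract says the proposed reweight avoids; it is not the paper's method, whose details the abstract does not give. The function name `resample` and the example values are illustrative.

```python
import bisect
import itertools
import random

def resample(particles, weights, rng):
    """Multinomial resampling: cumulative sum of weights plus binary search."""
    cdf = list(itertools.accumulate(weights))  # sequential prefix sum
    total = cdf[-1]
    last = len(particles) - 1
    resampled = []
    for _ in particles:
        u = rng.random() * total
        # First index whose cumulative weight exceeds u, found in O(log n).
        idx = bisect.bisect_right(cdf, u)
        resampled.append(particles[min(idx, last)])  # guard against rounding
    return resampled

rng = random.Random(0)
particles = ["p0", "p1", "p2", "p3"]
weights = [0.05, 0.05, 0.85, 0.05]  # hypothetical likelihoods
print(resample(particles, weights, rng))
```

The prefix sum in the first line of the function is inherently sequential, which illustrates why the reweight and resampling steps are a natural target when moving a particle filter to a GPU.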
Qi ZHAO Hongwei DENG Hongbo ZHAO
The Earth's ionosphere can hinder radio propagation through two serious effects: group delay and phase advance. Ionospheric irregularities are particularly troublesome since they make the amplitude and phase of radio signals fluctuate rapidly, a phenomenon known as ionospheric scintillation. Severe ionospheric scintillation can cause loss of phase lock, which degrades the positioning accuracy and affects the performance of navigation systems. Based on the phase screen model, this paper presents a novel power spectrum model of phase scintillation and a model of amplitude scintillation. Preliminary results show that, as scintillation intensity increases, the random phase and amplitude fluctuations become stronger, in agreement with observations. Simulations of the scintillation effects on the acquisition of Beidou signals predict the acquisition probability. In addition, acquisition probabilities of GPS and Beidou signals under different scintillation intensities are presented. At the same SNR, the acquisition probability decreases as the scintillation intensity increases. The simulation results show that scintillation can degrade the acquisition performance of the Beidou navigation system. A comparison of the Beidou and GPS simulations indicates that the code length and code rate of the satellite signals affect the acquisition performance of a navigation system.