1-3hit |
Asuka MAKI Daisuke MIYASHITA Shinichi SASAKI Kengo NAKATA Fumihiko TACHIBANA Tomoya SUZUKI Jun DEGUCHI Ryuichi FUJIMOTO
Many studies of deep neural networks have reported inference accelerators for improved energy efficiency. We propose methods for further improving energy efficiency while maintaining recognition accuracy, which were developed by the co-design of a filter-by-filter quantization scheme with variable bit precision and a hardware architecture that fully supports it. Filter-wise quantization reduces the average bit precision of weights, so execution times and energy consumption for inference are reduced in proportion to the total number of computations multiplied by the average bit precision of weights. The hardware utilization is also improved by a bit-parallel architecture suitable for granularly quantized bit precision of weights. We implement the proposed architecture on an FPGA and demonstrate that the execution cycles are reduced to 1/5.3 for ResNet-50 on ImageNet in comparison with a conventional method, while maintaining recognition accuracy.
Dongsheng YANG Tomohiro UENO Wei DENG Yuki TERASHIMA Kengo NAKATA Aravind Tharayil NARAYANAN Rui WU Kenichi OKADA Akira MATSUZAWA
A fully synthesizable all-digital phase-locked loop (AD-PLL) with a stochastic time-to-digital converter (STDC) is proposed in this paper. The whole AD-PLL circuit design is based on only standard cells from digital library, thus the layout of this AD-PLL can be automatically synthesized by a commercial place-and-route (P&R) tool with a foundry-provided standard-cell library. No manual layout and process modification is required in the whole AD-PLL design. In order to solve the delay mismatch issue in the delay-line-based time-to-digital converter (TDC), an STDC employing only standard D flip-flop (DFF) is presented to mitigate the sensitivity to layout mismatch resulted from automatic P&R. For the stochastic TDC, the key idea is to utilize the layout uncertainty due to automatic P&R which follows Gaussian distribution according to statistics theory. Moreover, the fully synthesized STDC can achieve a finer resolution compared to the conventional TDC. Implemented in a 28nm fully depleted silicon on insulator (FDSOI) technology, the fully synthesized PLL consumes only 480µW under 1.0V power supply while operating at 0.9GHz. It achieves a figure of merit (FoM) of -231.1dB with 4.0ps RMS jitter while occupying 0.0055mm2 chip area only.
Hanli LIU Teerachot SIRIBURANON Kengo NAKATA Wei DENG Ju Ho SON Dae Young LEE Kenichi OKADA Akira MATSUZAWA
This paper presents a 27.5-29.6GHz fractional-N frequency synthesizer using reference and frequency doublers to achieve low in-band and out-of-band phase-noise for 5G mobile communications. A consideration of the baseband carrier recovery circuit helps estimate phase noise requirement for high modulation scheme. The push-push amplifier and 28GHz balun help achieving differential signals with low out-of-band phase noise while consuming low power. A charge pump with gated offset as well as reference doubler help reducing PD noise resulting in low in-band phase noise while sampling loop filter helps reduce spurs. The proposed synthesizer has been implemented in 65nm CMOS technology achieving an in-band and out-of-band phase noise of -78dBc/Hz and -126dBc/Hz, respectively. It consumes only a total power of 33mW. The jitter-power figure-of-merit (FOM) is -231dB which is the highest among the state of the art >20GHz fractional-N PLLs using a low reference clock (<200MHz). The measured reference spurs are less than -80dBc.