Xiao XU Tsuyoshi SUGIURA Toshihiko YOSHIMASU
This paper presents two ultra-low voltage and high performance VCO ICs with two novel transformer-based harmonic tuned tanks. The first proposed harmonic tuned tank effectively shapes the pseudo-square drain-node voltage waveform for close-in phase noise reduction. To compensate the voltage drop caused by the transformer, an improved second tank is proposed. It not only has tuned harmonic impedance but also provides a voltage gain to enlarge the output voltage swing over supply voltage limitation. The VCO with second tank exhibits over 3 dB better phase noise performance in 1/f2 region among all tuning range. The two VCO ICs are designed, fabricated and measured on wafer in 45-nm SOI CMOS technology. With only 0.3 V supply voltage, the proposed two VCO ICs exhibit best phase noise of -123.3 and -127.2 dBc/Hz at 10 MHz offset and related FoMs of -191.7 and -192.2 dBc/Hz, respectively. The frequency tuning ranges of them are from 14.05 to 15.14 GHz and from 14.23 to 15.68 GHz, respectively.
Khilda AFIFAH Nicodimus RETDIAN
Hum noise such as power line interference is one of the critical problems in the biomedical signal acquisition. Various techniques have been proposed to suppress power line interference. However, some of the techniques require more components and power consumption. The notch depth in the conventional N-path notch filter circuits needs a higher number of paths and switches off-resistance. It makes the conventional N-path notch filter less of efficiency to suppress hum noise. This work proposed the new N-path notch filter to hum noise suppression in biomedical signal acquisition. The new N-path notch filter achieved notch depth above 40dB with sampling frequency 50Hz and 60Hz. Although the proposed circuits use less number of path and switches off-resistance. The proposed circuit has been verified using artificial ECG signal contaminated by hum noise at frequency 50Hz and 60Hz. The output of N-path notch filter achieved a noise-free signal even if the sampling frequency changes.
Akira TSUCHIYA Akitaka HIRATSUKA Kenji TANAKA Hiroyuki FUKUYAMA Naoki MIURA Hideyuki NOSAKA Hidetoshi ONODERA
This paper presents a design of CMOS transimpedance amplifier (TIA) and peaking inductor for high speed, low power and small area. To realize high density integration of optical I/O, area reduction is an important figure as well as bandwidth, power and so on. To determine design parameters of multi-stage inverter-type TIA (INV-TIA) with peaking inductors, we derive a simplified model of the bandwidth and the energy per bit. Multi-layered on-chip inductors are designed for area-effective inductive peaking. A 5-stage INV-TIA with 3 peaking inductors is fabricated in a 65-nm CMOS. By using multi-layered inductors, 0.02 mm2 area is achieved. Measurement results show 45 Gb/s operation with 49 dBΩ transimpedance gain and 4.4 mW power consumption. The TIA achieves 98 fJ/bit energy efficiency.
Jonathan MOJOO Yu ZHAO Muthu Subash KAVITHA Junichi MIYAO Takio KURITA
The task of image annotation is becoming enormously important for efficient image retrieval from the web and other large databases. However, huge semantic information and complex dependency of labels on an image make the task challenging. Hence determining the semantic similarity between multiple labels on an image is useful to understand any incomplete label assignment for image retrieval. This work proposes a novel method to solve the problem of multi-label image annotation by unifying two different types of Laplacian regularization terms in deep convolutional neural network (CNN) for robust annotation performance. The unified Laplacian regularization model is implemented to address the missing labels efficiently by generating the contextual similarity between labels both internally and externally through their semantic similarities, which is the main contribution of this study. Specifically, we generate similarity matrices between labels internally by using Hayashi's quantification method-type III and externally by using the word2vec method. The generated similarity matrices from the two different methods are then combined as a Laplacian regularization term, which is used as the new objective function of the deep CNN. The Regularization term implemented in this study is able to address the multi-label annotation problem, enabling a more effectively trained neural network. Experimental results on public benchmark datasets reveal that the proposed unified regularization model with deep CNN produces significantly better results than the baseline CNN without regularization and other state-of-the-art methods for predicting missing labels.
Hao XIAO Yanming FAN Fen GE Zhang ZHANG Xin CHENG
Optical navigation (OPNAV) is the use of the on-board imaging data to provide a direct measurement of the image coordinates of the target as navigation information. Among the optical observables in deep-space, the edge of the celestial body is an important feature that can be utilized for locating the planet centroid. However, traditional edge detection algorithms like Canny algorithm cannot be applied directly for OPNAV due to the noise edges caused by surface markings. Moreover, due to the constrained computation and energy capacity on-board, light-weight image-processing algorithms with less computational complexity are desirable for real-time processing. Thus, to fast and accurately extract the edge of the celestial body from high-resolution satellite imageries, this paper presents an algorithm-hardware co-design of real-time edge detection for OPNAV. First, a light-weight edge detection algorithm is proposed to efficiently detect the edge of the celestial body while suppressing the noise edges caused by surface markings. Then, we further present an FPGA implementation of the proposed algorithm with an optimized real-time performance and resource efficiency. Experimental results show that, compared with the traditional edge detection algorithms, our proposed one enables more accurate celestial body edge detection, while simplifying the hardware implementation.
Yasuhiro MOCHIDA Takayuki NAKACHI Takahiro YAMAGUCHI
High frame rate (HFR) video is attracting strong interest since it is considered as a next step toward providing Ultra-High Definition video service. For instance, the Association of Radio Industries and Businesses (ARIB) standard, the latest broadcasting standard in Japan, defines a 120 fps broadcasting format. The standard stipulates temporally scalable coding and hierarchical transmission by MPEG Media Transport (MMT), in which the base layer and the enhancement layer are transmitted over different paths for flexible distribution. We have developed the first ever MMT transmitter/receiver module for 4K/120fps temporally scalable video. The module is equipped with a newly proposed encapsulation method of temporally scalable bitstreams with correct boundaries. It is also designed to be tolerant to severe network constraints, including packet loss, arrival timing offset, and delay jitter. We conducted a hierarchical transmission experiment for 4K/120fps temporally scalable video. The experiment demonstrated that the MMT module was successfully fabricated and capable of dealing with severe network constraints. Consequently, the module has excellent potential as a means to support HFR video distribution in various network situations.
The recent development of semiconductor technology has led to downsized, large-scaled and low-power VLSI systems. However, the incidence of soft errors has increased. Soft errors are temporary events caused by striking of α-rays and high energy neutron radiation. Since the scale of VLSI has become smaller in recent development, it is necessary to consider the occurrence of not only single node upset (SNU) but also double node upset (DNU). The existing High-performance, Low-cost, and DNU Tolerant Latch design (HLDTL) does not completely tolerate DNU. This paper presents a new design of a DNU tolerant latch to resolve this issue by adding some transistors to the HLDTL latch.
Junjun ZHENG Hiroyuki OKAMURA Tadashi DOHI
In this paper, we present non-Markovian availability models for capturing the dynamics of system behavior of an operational software system that undergoes aperiodic time-based software rejuvenation and checkpointing. Two availability models with rejuvenation are considered taking account of the procedure after the completion of rollback recovery operation. We further proceed to investigate whether there exists the optimal rejuvenation schedule that maximizes the steady-state system availability, which is derived by means of the phase expansion technique, since the resulting models are not the trivial stochastic models such as semi-Markov process and Markov regenerative process, so that it is hard to solve them by using the common approaches like Laplace-Stieltjes transform and embedded Markov chain techniques. The numerical experiments are conducted to determine the optimal rejuvenation trigger timing maximizing the steady-state system availability for each availability model, and to compare both two models.
Shoichiro TAKEDA Megumi ISOGAI Shinya SHIMIZU Hideaki KIMATA
Phase-based video magnification methods can magnify and reveal subtle motion changes invisible to the naked eye. In these methods, each image frame in a video is decomposed into an image pyramid, and subtle motion changes are then detected as local phase changes with arbitrary orientations at each pixel and each pyramid level. One problem with this process is a long computational time to calculate the local phase changes, which makes high-speed processing of video magnification difficult. Recently, a decomposition technique called the Riesz pyramid has been proposed that detects only local phase changes in the dominant orientation. This technique can remove the arbitrariness of orientations and lower the over-completeness, thus achieving high-speed processing. However, as the resolution of input video increases, a large amount of data must be processed, requiring a long computational time. In this paper, we focus on the correlation of local phase changes between adjacent pyramid levels and present a novel decomposition technique called the local Riesz pyramid that enables faster phase-based video magnification by automatically processing the minimum number of sufficient local image areas at several pyramid levels. Through this minimum pyramid processing, our proposed phase-based video magnification method using the local Riesz pyramid achieves good magnification results within a short computational time.
Radio frequency energy transfer (RET) technology has been introduced as a promising energy harvesting (EH) method to supply power in both wireless communication (WLC) and power-line communication (PLC) systems. However, current RET modified MAC (medium access control) protocols have been proposed only for WLC systems. Due to the difference in the MAC standard between WLC and PLC systems, these protocols are not suitable for PLC systems. Therefore, how to utilize RET technology to modify the MAC protocol of PLC systems (i.e., IEEE 1901), which can use the radio frequency signal to provide the transmission power and the PLC medium to finish the data transmission, i.e., realizing the ‘cooperative communication’ remains a challenge. To resolve this problem, we propose a RET modified MAC protocol for PLC systems (RET-PLC MAC). Firstly, we improve the standard PLC frame sequence by adding consultation and confirmation frames, so that the station can obtain suitable harvested energy, once it occupied the PLC medium, and the PLC system can be operated in an on-demand and self-sustainable manner. On this basis, we present the working principle of RET-PLC MAC. Then, we establish an analytical model to allow mathematical verification of RET-PLC MAC. A 2-dimension discrete Markov chain model is employed to derive the numerical analysis results of RET-PLC MAC. The impacts of buffer size, traffic rate, deferral counter process of 1901, heterogeneous environment and quality of information (QoI) are comprehensively considered in the modeling process. Moreover, we deduce the optimal results of system throughput and expected QoI. Through extensive simulations, we show the performance of RET-PLC MAC under different system parameters, and verify the corresponding analytical model. Our work provides insights into realizing cooperative communication at PLC's MAC layer.
Toi TOMITA Wakaha OGATA Kaoru KUROSAWA Ryo KUWAYAMA
In this paper, we propose a new leakage-resilient identity-based encryption (IBE) scheme that is secure against chosen-ciphertext attacks (CCA) in the bounded memory leakage model. The security of our scheme is based on the external k-Linear assumption. It is the first CCA-secure leakage-resilient IBE scheme which does not depend on q-type assumptions. The leakage rate 1/10 is achieved under the XDLIN assumption (k=2).
Ryuta SHINGAI Yuria HIRAGA Hisakazu FUKUOKA Takamasa MITANI Takashi NAKADA Yasuhiko NAKASHIMA
Modern deep learning has significantly improved performance and has been used in a wide variety of applications. Since the amount of computation required for the inference process of the neural network is large, it is processed not by the data acquisition location like a surveillance camera but by the server with abundant computing power installed in the data center. Edge computing is getting considerable attention to solve this problem. However, edge computing can provide limited computation resources. Therefore, we assumed a divided/distributed neural network model using both the edge device and the server. By processing part of the convolution layer on edge, the amount of communication becomes smaller than that of the sensor data. In this paper, we have evaluated AlexNet and the other eight models on the distributed environment and estimated FPS values with Wi-Fi, 3G, and 5G communication. To reduce communication costs, we also introduced the compression process before communication. This compression may degrade the object recognition accuracy. As necessary conditions, we set FPS to 30 or faster and object recognition accuracy to 69.7% or higher. This value is determined based on that of an approximation model that binarizes the activation of Neural Network. We constructed performance and energy models to find the optimal configuration that consumes minimum energy while satisfying the necessary conditions. Through the comprehensive evaluation, we found that the optimal configurations of all nine models. For small models, such as AlexNet, processing entire models in the edge was the best. On the other hand, for huge models, such as VGG16, processing entire models in the server was the best. For medium-size models, the distributed models were good candidates. We confirmed that our model found the most energy efficient configuration while satisfying FPS and accuracy requirements, and the distributed models successfully reduced the energy consumption up to 48.6%, and 6.6% on average. We also found that HEVC compression is important before transferring the input data or the feature data between the distributed inference processes.
Kenji MII Akihito NAGAHAMA Hirobumi WATANABE
This paper proposes an ultra-low quiescent current low-dropout regulator (LDO) with a flipped voltage follower (FVF)-based load transient enhanced circuit for wireless sensor network (WSN). Some characteristics of an FVF are low output impedance, low voltage operation, and simple circuit configuration [1]. In this paper, we focus on the characteristics of low output impedance and low quiescent current. A load transient enhanced circuit based on an FVF circuit configuration for an LDO was designed in this study. The proposed LDO, including the new circuit, was fabricated in a 0.6 µm CMOS process. The designed LDO achieved an undershoot of 75 mV under experimental conditions of a large load transient of 100 µA to 10 mA and a current slew rate (SR) of 1 µs. The quiescent current consumed by the LDO at no load operation was 204 nA.
In this paper, we propose a secure computation of sparse coding and its application to Encryption-then-Compression (EtC) systems. The proposed scheme introduces secure sparse coding that allows computation of an Orthogonal Matching Pursuit (OMP) algorithm in an encrypted domain. We prove theoretically that the proposed method estimates exactly the same sparse representations that the OMP algorithm for non-encrypted computation does. This means that there is no degradation of the sparse representation performance. Furthermore, the proposed method can control the sparsity without decoding the encrypted signals. Next, we propose an EtC system based on the secure sparse coding. The proposed secure EtC system can protect the private information of the original image contents while performing image compression. It provides the same rate-distortion performance as that of sparse coding without encryption, as demonstrated on both synthetic data and natural images.
Yi GUO Heming SUN Ping LEI Shinji KIMURA
Approximate multiplier design is an effective technique to improve hardware performance at the cost of accuracy loss. The current approximate multipliers are mostly ASIC-based and are dedicated for one particular application. In contrast, FPGA has been an attractive choice for many applications because of its high performance, reconfigurability, and fast development round. This paper presents a novel methodology for designing approximate multipliers by employing the FPGA-based fabrics (primarily look-up tables and carry chains). The area and latency are significantly reduced by applying approximation on carry results and cutting the carry propagation path in the multiplier. Moreover, we explore higher-order multipliers on architectural space by using our proposed small-size approximate multipliers as elementary modules. For different accuracy-hardware requirements, eight configurations for approximate 8×8 multiplier are discussed. In terms of mean relative error distance (MRED), the error of the proposed 8×8 multiplier is as low as 1.06%. Compared with the exact multiplier, our proposed design can reduce area by 43.66% and power by 24.24%. The critical path latency reduction is up to 29.50%. The proposed multiplier design has a better accuracy-hardware tradeoff than other designs with comparable accuracy. Moreover, image sharpening processing is used to assess the efficiency of approximate multipliers on application.
Plane wave scattering from a circular conducting cylinder and a circular conducting strip has been formulated by equivalent surface currents which are postulated from the scattering geometrical optics (GO) field. Thus derived radiation far fields are found to be the same as those formulated by a conventional physical optics (PO) approximation for both E and H polarizations.
Yuki SAITO Kei AKUZAWA Kentaro TACHIBANA
This paper presents a method for many-to-one voice conversion using phonetic posteriorgrams (PPGs) based on an adversarial training of deep neural networks (DNNs). A conventional method for many-to-one VC can learn a mapping function from input acoustic features to target acoustic features through separately trained DNN-based speech recognition and synthesis models. However, 1) the differences among speakers observed in PPGs and 2) an over-smoothing effect of generated acoustic features degrade the converted speech quality. Our method performs a domain-adversarial training of the recognition model for reducing the PPG differences. In addition, it incorporates a generative adversarial network into the training of the synthesis model for alleviating the over-smoothing effect. Unlike the conventional method, ours jointly trains the recognition and synthesis models so that they are optimized for many-to-one VC. Experimental evaluation demonstrates that the proposed method significantly improves the converted speech quality compared with conventional VC methods.
Masayuki KAWATA Kiichi TATEISHI Kenichi HIGUCHI
This paper investigates the performance of interleave division multiple access (IDMA)-based random access with various interference canceller structures in order to support massive machine-type communications (mMTC) in the fifth generation (5G) mobile communication system. To support massive connectivity in the uplink, a grant-free and contention-based multiple access scheme is essential to reduce the control signaling overhead and transmission latency. To suppress the packet loss due to collision and to achieve multi-packet reception, non-orthogonal multiple access (NOMA) with interference cancellation at the base station receiver is essential. We use IDMA and compare various interference canceller structures such as the parallel interference canceller (PIC), successive interference canceller (SIC), and their hybrid from the viewpoints of the error rate and decoding delay time. Based on extensive computer simulations, we show that IDMA-based random access is a promising scheme for supporting mMTC and the PIC-SIC hybrid achieves a good tradeoff between the error rate and decoding delay time.
In this paper, we propose a deep model of visual recognition based on hybrid KPCA Network(H-KPCANet), which is based on the combination of one-stage KPCANet and two-stage KPCANet. The proposed model consists of four types of basic components: the input layer, one-stage KPCANet, two-stage KPCANet and the fusion layer. The role of one-stage KPCANet is to calculate the KPCA filters for convolution layer, and two-stage KPCANet is to learn PCA filters in the first stage and KPCA filters in the second stage. After binary quantization mapping and block-wise histogram, the features from two different types of KPCANets are fused in the fusion layer. The final feature of the input image can be achieved by weighted serial combination of the two types of features. The performance of our proposed algorithm is tested on digit recognition and object classification, and the experimental results on visual recognition benchmarks of MNIST and CIFAR-10 validated the performance of the proposed H-KPCANet.
Vantruong NGUYEN Jueping CAI Linyu WEI Jie CHU
In this letter, a piecewise linear (PWL) sigmoid function approximation based on the statistical distribution probability of the neurons' values in each layer is proposed to improve the network recognition accuracy with only addition circuit. The sigmoid function is first divided into three fixed regions, and then according to the neurons' values distribution probability, the curve in each region is segmented into sub-regions to reduce the approximation error and improve the recognition accuracy. Experiments performed on Xilinx's FPGA-XC7A200T for MNIST and CIFAR-10 datasets show that the proposed method achieves 97.45% recognition accuracy in DNN, 98.42% in CNN on MNIST and 72.22% on CIFAR-10, up to 0.84%, 0.57% and 2.01% higher than other approximation methods with only addition circuit.