Recently, the performances of discriminative correlation filter (CF) trackers are getting better and better in visual tracking. In this paper, we propose spatial-temporal regularization with precise state estimation based on discriminative correlation filter (STPSE) in order to achieve more significant tracking performance. First, we consider the continuous change of the object state, using the information from the previous two filters for training the correlation filter model. Here, we train the correlation filter model with the hand-crafted features. Second, we introduce update control in which average peak-to-correlation energy (APCE) and the distance between the object locations obtained by HOG features and hand-crafted features are utilized to detect abnormality of the state around the object. APCE and the distance indicate the reliability of the filter response, thus if abnormality is detected, the proposed method does not update the scale and the object location estimated by the filter response. In the experiment, our tracker (STPSE) achieves significant and real-time performance with only CPU for the challenging benchmark sequence (OTB2013, OTB2015, and TC128).
Yanjiang LIU Xianzhao XIA Jingxin ZHONG Pengfei GUO Chunsheng ZHU Zibin DAI
Side-channel analysis is one of the most investigated hardware Trojan detection approaches. However, nearly all the side-channel analysis approaches require golden chips for reference, which are hard to obtain actually. Besides, majority of existing Trojan detection algorithms focus on the data similarity and ignore the Trojan misclassification during the detection. In this paper, we propose a cost-sensitive golden chip-free hardware Trojan detection framework, which aims to minimize the probability of Trojan misclassification during the detection. The post-layout simulation data of voltage variations at different process corners is utilized as a golden reference. Further, a classification algorithm based on the combination of principal component analysis and Naïve bayes is exploited to identify the existence of hardware Trojan with a minimum misclassification risk. Experimental results on ASIC demonstrate that the proposed approach improves the detection accuracy ratio compared with the three detection algorithms and distinguishes the Trojan with only 0.27% area occupies even under ±15% process variations.
Yasuyuki MAEKAWA Yoshiaki SHIBAGAKI
Rain attenuation characteristics due to typhoon passage are discussed using the Ku-band BS satellite signal observations conducted by Osaka Electro-Communication University in Neayagawa from 1988 to 2019. The degree of hourly rain attenuation due to rainfall rate is largely enhanced as typhoon passes the east side of the station, while it becomes smaller in the case of west side passage. Compared to hourly ground wind velocities of nearby AMeDAS, the equivalent path lengths of rain attenuation become larger as the wind directions approach the same angle to the satellite, while they become smaller as the wind directions approach the opposite angle to the satellite. The increase and decrease of the equivalent path lengths are confirmed in other Ku-band and Ka-band satellite paths with different azimuth angles, such as CS, SKP, and SBC. Modified equivalent path lengths calculated by a simple propagation path model including horizontal wind speeds along the same direction to the satellite agree well with the equivalent path lengths observed by each satellite. The equivalent path lengths are, for the first time, proved to be largely affected by the direction of typhoon passage and the horizontal wind velocities.
Jerdvisanop CHAKAROTHAI Katsumi FUJII Yukihisa SUZUKI Jun SHIBAYAMA Kanako WAKE
In this study, we develop a numerical method for determining transient energy deposition in biological bodies exposed to electromagnetic (EM) pulses. We use a newly developed frequency-dependent finite-difference time-domain (FD2TD) method, which is combined with the fast inverse Laplace transform (FILT) and Prony method. The FILT and Prony method are utilized to transform the Cole-Cole model of biological media into a sum of multiple Debye relaxation terms. Parameters of Debye terms are then extracted by comparison with the time-domain impulse responses. The extracted parameters are used in an FDTD formulation, which is derived using the auxiliary differential equation method, and transient energy deposition into a biological medium is calculated by the equivalent circuit method. The validity of our proposed method is demonstrated by comparing numerical results and those derived from an analytical method. Finally, transient energy deposition into human heads of TARO and HANAKO models is then calculated using the proposed method and, physical insights into pulse exposures of the human heads are provided.
Takahiro KAWAGUCHI Naofumi TAKAGI
A 32-bit arithmetic logic unit (ALU) is designed for a rapid single flux quantum (RSFQ) bit-parallel processor. In the ALU, clocked gates are partially replaced by clockless gates. This reduces the number of D flip flops (DFFs) required for path balancing. The number of clocked gates, including DFFs, is reduced by approximately 40 %, and size of the clock distribution network is reduced. The number of pipeline stages becomes modest. The layout design of the ALU and simulation results show the effectiveness of using clockless gates in wide datapath circuits.
Naoki TAKEUCHI Taiki YAMAE Christopher L. AYALA Hideo SUZUKI Nobuyuki YOSHIKAWA
The adiabatic quantum-flux-parametron (AQFP) is an energy-efficient superconductor logic element based on the quantum flux parametron. AQFP circuits can operate with energy dissipation near the thermodynamic and quantum limits by maximizing the energy efficiency of adiabatic switching. We have established the design methodology for AQFP logic and developed various energy-efficient systems using AQFP logic, such as a low-power microprocessor, reversible computer, single-photon image sensor, and stochastic electronics. We have thus demonstrated the feasibility of the wide application of AQFP logic in future information and communications technology. In this paper, we present a tutorial review on AQFP logic to provide insights into AQFP circuit technology as an introduction to this research field. We describe the historical background, operating principle, design methodology, and recent progress of AQFP logic.
Fumihiro CHINA Naoki TAKEUCHI Hideo SUZUKI Yuki YAMANASHI Hirotaka TERAI Nobuyuki YOSHIKAWA
The adiabatic quantum flux parametron (AQFP) is an energy-efficient, high-speed superconducting logic device. To observe the tiny output currents from the AQFP in experiments, high-speed voltage drivers are indispensable. In the present study, we develop a compact voltage driver for AQFP logic based on a Josephson latching driver (JLD), which has been used as a high-speed driver for rapid single-flux-quantum (RSFQ) logic. In the JLD-based voltage driver, the signal currents of AQFP gates are converted into gap-voltage-level signals via an AQFP/RSFQ interface and a four-junction logic gate. Furthermore, this voltage driver includes only 15 Josephson junctions, which is much fewer than in the case for the previously designed driver based on dc superconducting quantum interference devices (60 junctions). In measurement, we successfully operate the JLD-based voltage driver up to 4 GHz. We also evaluate the bit error rate (BER) of the driver and find that the BER is 7.92×10-10 and 2.67×10-3 at 1GHz and 4GHz, respectively.
Taiki YAMAE Naoki TAKEUCHI Nobuyuki YOSHIKAWA
The adiabatic quantum-flux-parametron (AQFP) is an energy-efficient superconductor logic device. In a previous study, we proposed a low-latency clocking scheme called delay-line clocking, and several low-latency AQFP logic gates have been demonstrated. In delay-line clocking, the latency between adjacent excitation phases is determined by the propagation delay of excitation currents, and thus the rising time of excitation currents should be sufficiently small; otherwise, an AQFP gate can switch before the previous gate is fully excited. This means that delay-line clocking needs high clock frequencies, because typical excitation currents are sinusoidal and the rising time depends on the frequency. However, AQFP circuits need to be tested in a wide frequency range experimentally. Hence, in the present study, we investigate AQFP circuits adopting delay-line clocking with square excitation currents to apply delay-line clocking in a low frequency range. Square excitation currents have shorter rising time than sinusoidal excitation currents and thus enable low frequency operation. We demonstrate an AQFP buffer chain with delay-line clocking using square excitation currents, in which the latency is approximately 20ps per gate, and confirm that the operating margin for the buffer chain is kept sufficiently wide at clock frequencies below 1GHz, whereas in the sinusoidal case the operating margin shrinks below 500MHz. These results indicate that AQFP circuits adopting delay-line clocking can operate in a low frequency range by using square excitation currents.
Tomohiro YAMAJI Masayuki SHIRANE Tsuyoshi YAMAMOTO
A Josephson parametric oscillator (JPO) is an interesting system from the viewpoint of quantum optics because it has two stable self-oscillating states and can deterministically generate quantum cat states. A theoretical proposal has been made to operate a network of multiple JPOs as a quantum annealer, which can solve adiabatically combinatorial optimization problems at high speed. Proof-of-concept experiments have been actively conducted for application to quantum computations. This article provides a review of the mechanism of JPOs and their application as a quantum annealer.
Kenta SATO Naonori SEGA Yuta SOMEI Hiroshi SHIMADA Takeshi ONOMI Yoshinao MIZUGAKI
We experimentally evaluated random number sequences generated by a superconducting hardware random number generator composed of a Josephson-junction oscillator, a rapid-single-flux-quantum (RSFQ) toggle flip-flop (TFF), and an RSFQ AND gate. Test circuits were fabricated using a 10 kA/cm2 Nb/AlOx/Nb integration process. Measurements were conducted in a liquid helium bath. The random numbers were generated for a trigger frequency of 500 kHz under the oscillating Josephson-junction at 29 GHz. 26 random number sequences of 20 kb length were evaluated for bias voltages between 2.0 and 2.7 mV. The NIST FIPS PUBS 140-2 tests were used for the evaluation. 100% pass rates were confirmed at the bias voltages of 2.5 and 2.6 mV. We found that the Monobit test limited the pass rates. As numerical simulations suggested, a detailed evaluation for the probability of obtaining “1” demonstrated the monotonical dependence on the bias voltage.
Jiaheng LIU Ryusuke EGAWA Hiroyuki TAKIZAWA
As the number of cores on a processor increases, cache hierarchies contain more cache levels and a larger last level cache (LLC). Thus, the power and energy consumption of the cache hierarchy becomes non-negligible. Meanwhile, because the cache usage behaviors of individual applications can be different, it is possible to achieve higher energy efficiency of the computing system by determining the appropriate cache configurations for individual applications. This paper proposes a cache control mechanism to improve energy efficiency by adjusting a cache hierarchy to each application. Our mechanism first bypasses and disables a less-significant cache level, then partially disables the LLC, and finally adjusts the associativity if it suffers from a large number of conflict misses. The mechanism can achieve significant energy saving at the sacrifice of small performance degradation. The evaluation results show that our mechanism improves energy efficiency by 23.9% and 7.0% on average over the baseline and the cache-level bypassing mechanisms, respectively. In addition, even if the LLC resource contention occurs, the proposed mechanism is still effective for improving energy efficiency.
Zhi WENG Longzhen FAN Yong ZHANG Zhiqiang ZHENG Caili GONG Zhongyue WEI
As the basis of fine breeding management and animal husbandry insurance, individual recognition of dairy cattle is an important issue in the animal husbandry management field. Due to the limitations of the traditional method of cow identification, such as being easy to drop and falsify, it can no longer meet the needs of modern intelligent pasture management. In recent years, with the rise of computer vision technology, deep learning has developed rapidly in the field of face recognition. The recognition accuracy has surpassed the level of human face recognition and has been widely used in the production environment. However, research on the facial recognition of large livestock, such as dairy cattle, needs to be developed and improved. According to the idea of a residual network, an improved convolutional neural network (Res_5_2Net) method for individual dairy cow recognition is proposed based on dairy cow facial images in this letter. The recognition accuracy on our self-built cow face database (3012 training sets, 1536 test sets) can reach 94.53%. The experimental results show that the efficiency of identification of dairy cows is effectively improved.
Tomoyuki TANAKA Christopher L. AYALA Nobuyuki YOSHIKAWA
Extremely energy-efficient logic devices are required for future low-power high-performance computing systems. Superconductor electronic technology has a number of energy-efficient logic families. Among them is the adiabatic quantum-flux-parametron (AQFP) logic family, which adiabatically switches the quantum-flux-parametron (QFP) circuit when it is excited by an AC power-clock. When compared to state-of-the-art CMOS technology, AQFP logic circuits have the advantage of relatively fast clock rates (5 GHz to 10 GHz) and 5 - 6 orders of magnitude reduction in energy before cooling overhead. We have been developing extremely energy-efficient computing processor components using the AQFP. The adder is the most basic computational unit and is important in the development of a processor. In this work, we designed and measured a 16-bit parallel prefix carry look-ahead Kogge-Stone adder (KSA). We fabricated the circuit using the AIST 10 kA/cm2 High-speed STandard Process (HSTP). Due to a malfunction in the measurement system, we were not able to confirm the complete operation of the circuit at the low frequency of 100 kHz in liquid He, but we confirmed that the outputs that we did observe are correct for two types of tests: (1) critical tests and (2) 110 random input tests in total. The operation margin of the circuit is wide, and we did not observe any calculation errors during measurement.
Takumi NISHIME Hiroshi HASHIGUCHI Naobumi MICHISHITA Hisashi MORISHITA
Platform-mounted small antennas increase dielectric loss and conductive loss and decrease the radiation efficiency. This paper proposes a novel antenna design method to improve radiation efficiency for platform-mounted small antennas by characteristic mode analysis. The proposed method uses mapping of modal weighting coefficient (MWC) and infinitesimal dipole and evaluate the metal casing with 100mm × 55mm × 23mm as a platform excited by an inverted-F antenna. The simulation and measurement results show that the radiation efficiency of 5% is improved with the whole system from 2.5% of the single antenna.
Stanislav SEDUKHIN Yoichi TOMIOKA Kohei YAMAMOTO
In this paper, starting from the algorithm, a performance- and energy-efficient 3D structure or shape of the Tensor Processing Engine (TPE) for CNN acceleration is systematically searched and evaluated. An optimal accelerator's shape maximizes the number of concurrent MAC operations per clock cycle while minimizes the number of redundant operations. The proposed 3D vector-parallel TPE architecture with an optimal shape can be very efficiently used for considerable CNN acceleration. Due to implemented support of inter-block image data independency, it is possible to use multiple of such TPEs for the additional CNN acceleration. Moreover, it is shown that the proposed TPE can also be uniformly used for acceleration of the different CNN models such as VGG, ResNet, YOLO, and SSD. We also demonstrate that our theoretical efficiency analysis is matched with the result of a real implementation for an SSD model to which a state-of-the-art channel pruning technique is applied.
Caixia CAI Wenyang GAN Han HAI Fengde JIA
In this paper, to improve communication system's energy-efficiency (EE), multi-case optimization of two new transmission strategies is investigated. Firstly, with amplify-and-forward relaying and full-duplex technique, two new transmission strategies are designed. The designed transmission strategies consider direct links and non-ideal transmission conditions. At the same time, detailed capacity and energy consumption analyses of the designed transmission strategies are given. In addition, EE optimization and analysis of the designed transmission strategies are studied. It is the first case of EE optimization and it is achieved by joint optimization of transmit time (TT) and transmit power (TP). Furthermore, the second and third cases of EE optimization with respectively optimizing TT and TP are given. Simulations reveal that the designed transmission strategies can effectively improve the communication system's EE.
Rizal Setya PERDANA Yoshiteru ISHIDA
This study presents a formulation for generating context-aware natural language by machine from visual representation. Given an image sequence input, the visual storytelling task (VST) aims to generate a coherent, object-focused, and contextualized sentence story. Previous works in this domain faced a problem in modeling an architecture that works in temporal multi-modal data, which led to a low-quality output, such as low lexical diversity, monotonous sentences, and inaccurate context. This study introduces a further improvement, that is, an end-to-end architecture, called cross-modal contextualize attention, optimized to extract visual-temporal features and generate a plausible story. Visual object and non-visual concept features are encoded from the convolutional feature map, and object detection features are joined with language features. Three scenarios are defined in decoding language generation by incorporating weights from a pre-trained language generation model. Extensive experiments are conducted to confirm that the proposed model outperforms other models in terms of automatic metrics and manual human evaluation.
Tongzhou QU Zibin DAI Yanjiang LIU Lin CHEN Xianzhao XIA
The existing research on Amdahl's law is limited to multi/many-core processors, and cannot be applied to the important parallel processing architecture of coarse-grained reconfigurable arrays. This paper studies the relation between the multi-level parallelism of block cipher algorithms and the architectural characteristics of coarse-grain reconfigurable arrays. We introduce the key variables that affect the performance of reconfigurable arrays, such as communication overhead and configuration overhead, into Amdahl's law. On this basis, we propose a performance model for coarse-grain reconfigurable block cipher array (CGRBA) based on the extended Amdahl's law. In addition, this paper establishes the optimal integer nonlinear programming model, which can provide a parameter reference for the architecture design of CGRBA. The experimental results show that: (1) reducing the communication workload ratio and increasing the number of configuration pages reasonably can significantly improve the algorithm performance on CGRBA; (2) the communication workload ratio has a linear effect on the execution time.
Fei ZHANG Peining ZHEN Dishan JING Xiaotang TANG Hai-Bao CHEN Jie YAN
Intrusion is one of major security issues of internet with the rapid growth in smart and Internet of Thing (IoT) devices, and it becomes important to detect attacks and set out alarm in IoT systems. In this paper, the support vector machine (SVM) and principal component analysis (PCA) based method is used to detect attacks in smart IoT systems. SVM with nonlinear scheme is used for intrusion classification and PCA is adopted for feature selection on the training and testing datasets. Experiments on the NSL-KDD dataset show that the test accuracy of the proposed method can reach 82.2% with 16 features selected from PCA for binary-classification which is almost the same as the result obtained with all the 41 features; and the test accuracy can achieve 78.3% with 29 features selected from PCA for multi-classification while 79.6% without feature selection. The Denial of Service (DoS) attack detection accuracy of the proposed method can achieve 8.8% improvement compared with existing artificial neural network based method.
With the spread of the 5th generation mobile phone, the increase of the output power of PA (power amplifier) has become important, and in recent years, differential amplifiers that can increase the output voltage amplitude for the power supply voltage have been examined from the viewpoint of power synthesis. In the case of a differential PA, in addition to the advantage of voltage amplitude, the load impedance can be set 4 times as much as that of a single-ended PA, which makes it possible to reduce the impact of parasitic resistance. With the study of the differential PA, many transformer matching circuits have been studied in addition to the LC matching circuits that have been widely used in the past. The transformer matching circuit can easily realize the differential-single conversion, and the transformer matching circuit is an indispensable technology in the differential PA. As with the LC matching circuit, widening the bandwidth of the transformer matching circuit is at issue. In this paper, characteristics of basic transformer matching circuits are analyzed by adding input/output shunt capacitance to transformers and the conditions of bandwidth improvement are clarified. In addition, by comparing the FBW (fractional bandwidth) with the LC 2-stage matching circuit, it is shown that the FBW can be competitive.