Dongzhu LI Zhijie ZHAN Rei SUMIKAWA Mototsugu HAMADA Atsutake KOSUGE Tadahiro KURODA
A 0.13mJ/prediction with 68.6% accuracy wired-logic deep neural network (DNN) processor is developed in a single 16-nm field-programmable gate array (FPGA) chip. Compared with conventional von-Neumann architecture DNN processors, the energy efficiency is greatly improved by eliminating DRAM/BRAM access. A technical challenge for conventional wired-logic processors is the large amount of hardware resources required for implementing large-scale neural networks. To implement a large-scale convolutional neural network (CNN) into a single FPGA chip, two technologies are introduced: (1) a sparse neural network known as a non-linear neural network (NNN), and (2) a newly developed raster-scan wired-logic architecture. Furthermore, a novel high-level synthesis (HLS) technique for wired-logic processor is proposed. The proposed HLS technique enables the automatic generation of two key components: (1) Verilog-hardware description language (HDL) code for a raster-scan-based wired-logic processor and (2) test bench code for conducting equivalence checking. The automated process significantly mitigates the time and effort required for implementation and debugging. Compared with the state-of-the-art FPGA-based processor, 238 times better energy efficiency is achieved with only a slight decrease in accuracy on the CIFAR-100 task. In addition, 7 times better energy efficiency is achieved compared with the state-of-the-art network-optimized application-specific integrated circuit (ASIC).
Priyadharshini MOHANRAJ Saravanan PARAMASIVAM
The detection of hardware trojans has been extensively studied in the past. In this article, we propose a side-channel analysis technique that uses wrapper-based feature selection for hardware trojan detection. The whale optimization algorithm is modified to carefully extract the best feature subset. The aim of the proposed technique is multiobjective: to improve the accuracy and to minimize the number of features. The power consumption traces measured from AES-128 trojan circuits are used as features in this experiment. The stabilizing property of the feature selection method helps to balance the trade-off between precision and recall, thereby minimizing the number of false negatives. The proposed hardware trojan detection scheme achieves up to a 10.3% improvement in accuracy and reduces the feature set to as few as a single feature by employing the modified whale optimization technique. The evaluation results on various Trust-Hub cryptographic benchmark circuits thus show that the proposed scheme is more efficient than existing state-of-the-art methods.
In this work, template attacks that aimed to leak the nonce were performed on 256-bit ECDSA hardware to evaluate its resistance against side-channel attacks. The target hardware was an ASIC and was revealed to be vulnerable to the combination of template attacks and lattice attacks. Furthermore, the attack results indicated that fixing the MSB of the nonce to 1, a common countermeasure, was not enough. Also, the success rate of template attacks was estimated by simulation. This estimation does not require actual hardware and enables us to test the security of the implementation in the design phase. To clarify the acceptable amount of nonce leakage, the computational cost of lattice attacks was compared to that of the ρ method, a cryptanalysis method. As a result, the success rate of 2-bit leakage of the nonce must be under 62% in the case of 256-bit ECDSA. In other words, the SNR must be under 2^-4 in our simulation model.
Ryotaro NEGISHI Tatsuki KURIHARA Nozomu TOGAWA
Technological devices have become deeply embedded in people's lives, and their demand is growing every year. It has been indicated that outsourcing the design and manufacturing of integrated circuits, which are essential for technological devices, may lead to the insertion of malicious circuitry, called hardware Trojans (HTs). This paper proposes an HT detection method at gate-level netlists based on XGBoost, one of the best gradient boosting decision tree models. We first propose the optimal set of HT features among many feature candidates at a netlist level through thorough evaluations. Then, we construct an XGBoost-based HT detection method with its optimized hyperparameters. Evaluation experiments were conducted on the netlists from Trust-HUB benchmarks and showed the average F-measure of 0.842 using the proposed method. Also, we newly propose a Trojan probability propagation method that effectively corrects the HT detection results and apply it to the results obtained by XGBoost-based HT detection. Evaluation experiments showed that the average F-measure is improved to 0.861. This value is 0.194 points higher than that of the existing best method proposed so far.
Kota HISAFURU Kazunari TAKASAKI Nozomu TOGAWA
In recent years, with the widespread adoption of Internet of Things (IoT) devices, security issues for hardware devices have been increasing, and detecting their anomalous behaviors has become quite important. One effective method for detecting anomalous behaviors of IoT devices is to utilize the consumed energy and operation duration extracted from their power waveforms. However, the existing methods do not consider the shape of time-series data and cannot distinguish between power waveforms with similar consumed energy and duration but different shapes. In this paper, we propose a method for detecting anomalous behaviors based on the shape of time-series data by incorporating a shape-based distance (SBD) measure. The proposed method first obtains the entire power waveform of the target IoT device and extracts several application power waveforms. After that, we give invariances to them so that the SBD between every two application power waveforms can be obtained effectively. Based on the SBD values, the local outlier factor (LOF) method can finally distinguish between normal application behaviors and anomalous application behaviors. Experimental results demonstrate that the proposed method successfully detects anomalous application behaviors, while the existing state-of-the-art method fails to detect them.
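As an illustration of the shape-based distance mentioned above, the following is a minimal sketch using the commonly cited definition, SBD = 1 - max over shifts of the normalized cross-correlation. The function names and the use of z-normalization as one way of "giving invariances" are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def z_normalize(waveform):
    """Remove offset and scale (assumed here as one possible invariance step)."""
    w = np.asarray(waveform, dtype=float)
    std = w.std()
    return (w - w.mean()) / std if std > 0 else w - w.mean()

def shape_based_distance(x, y):
    """SBD = 1 - max normalized cross-correlation over all shifts (range 0..2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    if denom == 0.0:
        return 1.0  # assumption: treat an all-zero waveform as uncorrelated
    cross_corr = np.correlate(x, y, mode="full")  # all relative shifts
    return 1.0 - cross_corr.max() / denom

pulse = [0, 1, 2, 1, 0, 0, 0, 0]
shifted_pulse = [0, 0, 0, 1, 2, 1, 0, 0]
plateau = [0, 2, 2, 2, 2, 0, 0, 0]
print(shape_based_distance(pulse, shifted_pulse))  # close to 0: same shape
print(shape_based_distance(pulse, plateau))        # clearly above 0
```

A matrix of pairwise SBD values computed this way could then be passed to an outlier detector such as LOF, as the abstract describes.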
Xingyu WANG Ruilin ZHANG Hirofumi SHINOHARA
This paper introduces an inverter-based true random number generator (I-TRNG). It uses a single CMOS inverter to amplify thermal noise multiple times. An adaptive calibration mechanism based on clock tuning provides robust operation across a wide range of supply voltage (0.5 to 1.1 V) and temperature (-40 to 140 °C). An 8-bit von Neumann post-processing circuit (VN8W) is implemented for maximum raw entropy extraction. In a 130 nm CMOS technology, the I-TRNG entropy source occupies only 635 μm² and consumes 0.016 pJ/raw-bit at 0.6 V. The I-TRNG occupies 13,406 μm², including the entropy source, adaptive calibration circuit, and post-processing circuit. The minimum energy consumption of the I-TRNG is 1.38 pJ/bit at 0.5 V, while passing all NIST 800-22 and 800-90B tests. Moreover, an equivalent 15-year life at 0.7 V, 25 °C is confirmed by an accelerated NBTI aging test.
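For readers unfamiliar with von Neumann post-processing, the sketch below shows the classic 2-bit von Neumann corrector, which removes bias from independent raw bits. It is not the 8-bit VN8W circuit described in the abstract, whose internals are not given here; the function name is hypothetical.

```python
def von_neumann_extract(raw_bits):
    """Classic von Neumann corrector: read non-overlapping bit pairs,
    emit the first bit of each unequal pair, discard equal pairs."""
    output = []
    for i in range(0, len(raw_bits) - 1, 2):
        first, second = raw_bits[i], raw_bits[i + 1]
        if first != second:
            output.append(first)  # (0,1) -> 0 and (1,0) -> 1
    return output

print(von_neumann_extract([0, 1, 1, 0, 1, 1, 0, 0, 1, 0]))  # [0, 1, 1]
```

The classic corrector discards at least half of the raw bits, which is one reason wider variants that extract more entropy per raw bit are of interest.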
Dody ICHWANA PUTRA Muhammad HARRY BINTANG PRATAMA Ryotaro ISSHIKI Yuhei NAGAO Leonardo LANANTE JR Hiroshi OCHI
This paper presents a unified software and hardware wireless AI platform (USHWAP) for developing and evaluating machine learning in wireless systems. The platform integrates multi-software development such as MATLAB and Python with hardware platforms like FPGA and SDR, allowing for flexible and scalable device and edge computing application development. The USHWAP is implemented and validated using FPGAs and SDRs. Wireless signal classification, wireless LAN sensing, and rate adaptation are used as examples to showcase the platform's capabilities. The platform enables versatile development, including software simulation and real-time hardware implementation, offering flexibility and scalability for multiple applications. It is intended to be used by wireless-AI researchers to develop and evaluate intelligent algorithms in a laboratory environment.
Peiqi ZHANG Shinya TAKAMAEDA-YAMAZAKI
Binary Neural Networks (BNNs) have binarized neuron and connection values so that their accelerators can be realized by extremely efficient hardware. However, there is a significant accuracy gap between BNNs and networks with wider bit-width. Conventional BNNs binarize feature maps by static globally-unified thresholds, which makes the produced bipolar image lose local details. This paper proposes a multi-input activation function to enable adaptive thresholding for binarizing feature maps: (a) At the algorithm level, instead of operating on each input pixel independently, adaptive thresholding dynamically changes the threshold according to the pixels surrounding the target pixel. When optimizing weights, adaptive thresholding is equivalent to an accompanied depth-wise convolution between normal convolution and binarization. Accompanied weights in the depth-wise filters are ternarized and optimized end-to-end. (b) At the hardware level, adaptive thresholding is realized through a multi-input activation function, which is compatible with common accelerator architectures. Compact activation hardware with only one extra accumulator is devised. With the proposed method deployed on an FPGA, a 4.1% accuracy improvement over the original BNN is achieved with only 1.1% extra LUT resources. Compared with state-of-the-art methods, the proposed idea further increases network accuracy by 0.8% on the CIFAR-10 dataset and 0.4% on the ImageNet dataset.
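The difference between a global threshold and a neighborhood-dependent one can be sketched as follows. This is only an illustration of the general idea for a single channel: the 3x3 kernel size, zero padding, the example kernel values, and the function names are all assumptions, not the authors' implementation.

```python
import numpy as np

def binarize_global(feature_map, threshold=0.0):
    """Conventional binarization: one static threshold for every pixel."""
    return np.where(np.asarray(feature_map) >= threshold, 1, -1)

def binarize_adaptive(feature_map, ternary_kernel):
    """Depth-wise filter with weights in {-1, 0, +1}, followed by sign().
    The neighbor terms act as a per-pixel threshold on the center pixel."""
    fmap = np.asarray(feature_map, dtype=float)
    kernel = np.asarray(ternary_kernel)
    if not np.isin(kernel, (-1, 0, 1)).all():
        raise ValueError("kernel weights must be ternary")
    pad = kernel.shape[0] // 2
    padded = np.pad(fmap, pad, mode="constant")  # assumed zero padding
    height, width = fmap.shape
    acc = np.zeros_like(fmap)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            acc += kernel[dy, dx] * padded[dy:dy + height, dx:dx + width]
    return np.where(acc >= 0, 1, -1)

fmap = [[5, 5, 5], [5, 6, 5], [5, 5, 5]]
kernel = [[0, 0, 0], [-1, 1, 0], [0, 0, 0]]  # center minus left neighbor
print(binarize_global(fmap))           # all +1: the local bump is lost
print(binarize_adaptive(fmap, kernel)) # the pixel right of the bump becomes -1
```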
Ann Jelyn TIEMPO Yong-Jin JEONG
Field-programmable gate arrays (FPGAs) are gaining popularity because of their reconfigurability, which also brings security concerns such as hardware trojan insertion. Various detection methods to overcome this threat have been proposed, but they target the ASIC supply chain and cannot be directly applied to FPGAs. In this paper, the authors implement a structural feature-based method for detecting hardware trojans in a cell-level netlist, which has not yet been well explored: the nets are segmented into smaller groups based on their interconnections and further analyzed by examining their structural similarities. Experiments show positive performance, with an average detection rate of 95.41%, an average false alarm rate of 2.87%, and an average accuracy of 96.27%.
Ryozo TAKAHASHI Takuji MIKI Makoto NAGATA
This brief presents a side-channel attack (SCA) technique on a high-speed asynchronous successive approximation register (SAR) analog-to-digital converter (ADC). The proposed dual neural network, based on multiple noise waveforms, separately discloses the sign and absolute value information of input signals, which are hidden by the differential structure and high-speed asynchronous operation. The target SAR ADC and on-chip noise monitors are designed on a single prototype chip for SCA demonstration. Experimental results on the prototype, fabricated in a 40 nm process, show that the proposed attack on the asynchronous SAR ADC successfully restores the input data with a competitive accuracy within 300 mV rms error.
Hardware oriented security and trust of semiconductor integrated circuit (IC) chips have been highly demanded. This paper outlines the requirements and recent developments in circuits and packaging systems of IC chips for security applications, with the particular emphasis on protections against physical implementation attacks. Power side channels are of undesired presence to crypto circuits once a crypto algorithm is implemented in Silicon, over power delivery networks (PDNs) on the frontside of a chip or even through the backside of a Si substrate, in the form of power voltage variation and electromagnetic wave emanation. Preventive measures have been exploited with circuit design and packaging technologies, and partly demonstrated with Si test vehicles.
Ann Jelyn TIEMPO Yong-Jin JEONG
Using third-party intellectual properties (3PIP) has become the norm in the IC design and development process to meet time-to-market demands while minimizing cost. But this flow introduces threats such as hardware trojans, which may compromise the security and trustworthiness of the underlying hardware, for example by disclosing confidential information, impeding normal execution, or even permanently damaging the system. Over the years, different detection methods have been explored, from simply identifying whether a circuit is infected with a hardware trojan using conventional methods to applying machine learning to identify which nets are most likely hardware trojans. But the performance is not satisfactory in terms of maximizing the detection rate and minimizing the false positive rate. In this paper, a new hardware trojan detection approach is proposed in which the gate-level netlist is first segmented into regions before analyzing which nets might be hardware trojans. The segmentation process depends on the nets' connectivity, more specifically on each fanout point. Further analysis then computes the structural similarity of each segmented region and differentiates hardware trojan nets from normal nets. Experimental results show 100% detection of the hardware trojan nets inserted in each benchmark circuit and an overall average false positive rate of 1.38%, which results in a higher average accuracy of 99.31%.
Binhao HE Meiting XUE Shubiao LIU Feng YU Weijie CHEN
The top-K sorting is a variant of sorting used heavily in applications such as database management systems. Recently, the use of field programmable gate arrays (FPGAs) to accelerate sorting operation has attracted the interest of researchers. However, existing hardware top-K sorting algorithms are either resource-intensive or of low throughput. In this paper, we present a resource-efficient top-K sorting architecture that is composed of L cascading sorting units, and each sorting unit is composed of P sorting cells. K=PL largest elements are produced when a variable length input sequence is processed. This architecture can operate at a high frequency while consuming fewer resources. The experimental results show that our architecture achieved a maximum 1.2x throughput-to-resource improvement compared to previous studies.
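The cascaded structure described above can be modeled in software as follows. This is a behavioral sketch only, assuming each sorting unit keeps its P largest values and forwards the evicted smallest value to the next unit; the class and function names are hypothetical and the model says nothing about the paper's cell-level hardware design.

```python
import bisect

class SortingUnit:
    """Holds up to P values in ascending order; on overflow evicts the smallest."""
    def __init__(self, num_cells):
        self.num_cells = num_cells
        self.cells = []

    def push(self, value):
        bisect.insort(self.cells, value)
        if len(self.cells) > self.num_cells:
            return self.cells.pop(0)  # smallest value moves downstream
        return None

def top_k_stream(stream, p, l):
    """Return the K = P * L largest elements of the stream, in descending order."""
    units = [SortingUnit(p) for _ in range(l)]
    for value in stream:
        carry = value
        for unit in units:
            carry = unit.push(carry)
            if carry is None:
                break  # nothing evicted, so nothing to pass on
        # a value evicted by the last unit is simply dropped
    result = []
    for unit in units:
        result.extend(reversed(unit.cells))
    return result

print(top_k_stream([5, 1, 9, 3, 7, 2, 8, 6], p=2, l=2))  # [9, 8, 7, 6]
```

Because each element only ever moves downstream, the input length does not need to be known in advance, which matches the variable-length input sequence mentioned in the abstract.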
Tatsuki KURIHARA Nozomu TOGAWA
Recently, with the spread of Internet of Things (IoT) devices, embedded hardware devices have been used in a variety of everyday electrical items. Due to the increased demand for embedded hardware devices, some of the IC design and manufacturing steps have been outsourced to third-party vendors. Since malicious third-party vendors may insert malicious circuits, called hardware Trojans, into their products, developing an effective hardware-Trojan detection method is strongly required. In this paper, we propose 25 hardware-Trojan features focusing on the structure of trigger circuits for machine-learning-based hardware-Trojan detection. Combining the proposed features with 11 existing hardware-Trojan features, we utilize 36 hardware-Trojan features in total for classification. Then we classify the nets in an unknown netlist into a set of normal nets and Trojan nets based on a random-forest classifier. The experimental results demonstrate that the average true positive rate (TPR) becomes 64.2% and the average true negative rate (TNR) becomes 100.0%. They improve the average TPR by 14.8 points while keeping the average TNR compared to existing state-of-the-art methods. In particular, the proposed method successfully finds Trojan nets in several benchmark circuits that are not found by the existing method.
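The TPR and TNR figures quoted above follow the standard definitions, TPR = TP / (TP + FN) and TNR = TN / (TN + FP). A minimal sketch for net-level labels is given below; the label encoding (1 for a Trojan net, 0 for a normal net) and the function name are assumptions for illustration.

```python
def tpr_tnr(true_labels, predicted_labels):
    """Return (TPR, TNR) for binary labels: 1 = Trojan net, 0 = normal net."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1:
            fn += 1  # Trojan net missed
        elif pred == 0:
            tn += 1
        else:
            fp += 1  # normal net flagged as Trojan
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return tpr, tnr

truth = [1, 1, 1, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 0, 0, 0, 0, 0]
print(tpr_tnr(truth, pred))  # TPR = 2/3, TNR = 1.0
```

A TNR of 100% means no normal net was flagged, so any remaining errors are missed Trojan nets, which is what the TPR captures.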
Yanjiang LIU Xianzhao XIA Jingxin ZHONG Pengfei GUO Chunsheng ZHU Zibin DAI
Side-channel analysis is one of the most investigated hardware Trojan detection approaches. However, nearly all side-channel analysis approaches require golden chips for reference, which are hard to obtain in practice. Besides, the majority of existing Trojan detection algorithms focus on data similarity and ignore Trojan misclassification during detection. In this paper, we propose a cost-sensitive, golden chip-free hardware Trojan detection framework, which aims to minimize the probability of Trojan misclassification during detection. The post-layout simulation data of voltage variations at different process corners is utilized as a golden reference. Further, a classification algorithm based on the combination of principal component analysis and naïve Bayes is exploited to identify the existence of hardware Trojans with minimum misclassification risk. Experimental results on an ASIC demonstrate that the proposed approach improves the detection accuracy compared with three existing detection algorithms and distinguishes a Trojan occupying only 0.27% of the area, even under ±15% process variations.
Kenta SATO Naonori SEGA Yuta SOMEI Hiroshi SHIMADA Takeshi ONOMI Yoshinao MIZUGAKI
We experimentally evaluated random number sequences generated by a superconducting hardware random number generator composed of a Josephson-junction oscillator, a rapid-single-flux-quantum (RSFQ) toggle flip-flop (TFF), and an RSFQ AND gate. Test circuits were fabricated using a 10 kA/cm² Nb/AlOx/Nb integration process. Measurements were conducted in a liquid helium bath. The random numbers were generated at a trigger frequency of 500 kHz with the Josephson junction oscillating at 29 GHz. Twenty-six random number sequences of 20 kb length were evaluated for bias voltages between 2.0 and 2.7 mV. The NIST FIPS PUB 140-2 tests were used for the evaluation. Pass rates of 100% were confirmed at bias voltages of 2.5 and 2.6 mV. We found that the Monobit test limited the pass rates. As numerical simulations suggested, a detailed evaluation of the probability of obtaining “1” demonstrated a monotonic dependence on the bias voltage.
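Since the Monobit test is reported as the limiting factor, a short sketch of that test may help. It counts the ones in a 20,000-bit sequence and passes if the count lies strictly inside an acceptance interval. The bounds used here (9725 and 10275) are the values I understand FIPS 140-2 with its change notices to specify; they are exposed as parameters and should be checked against the standard before use.

```python
def monobit_test(bits, low=9725, high=10275):
    """Monobit test on a 20,000-bit sequence: pass if low < ones < high."""
    if len(bits) != 20000:
        raise ValueError("the test is defined for 20,000-bit sequences")
    ones = sum(bits)
    return low < ones < high

balanced = [0, 1] * 10000           # exactly 10,000 ones
biased = [1] * 10400 + [0] * 9600   # too many ones
print(monobit_test(balanced))  # True
print(monobit_test(biased))    # False
```

Because the test depends only on the count of ones, a bias voltage that shifts the probability of obtaining "1" away from one half affects this test directly.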
Tatsuma MORI Taito MANABE Yuichiro SHIBATA
The convex hull is the smallest convex set that encloses a given set of points. Since the process of finding convex hulls has various practical application fields including embedded real-time systems, efficient acceleration of convex hull algorithms is an important problem in computational geometry. In this paper, we discuss an FPGA acceleration approach to address this problem. In order to compute the convex hull of an unsorted point set, it is necessary to store all the points during the computation, and thus the capacity of an on-chip memory is likely to be a major constraint for efficient FPGA implementation. On the other hand, approximate convex hulls are often sufficient for practical applications. Therefore, we propose a hardware-oriented approximate convex hull algorithm, which can process the input points as a stream without storing all the points in the memory. We also propose some computation reduction techniques for efficient FPGA implementation. Then, we present an FPGA implementation of the proposed algorithm, which is parallelized in both the temporal and spatial domains, and evaluate its effectiveness in terms of performance and accuracy. As a result, we demonstrated 11 to 30 times faster performance compared to the widely used convex hull software library Qhull. In addition, accuracy assessment revealed that the maximum approximation error normalized to the diameters of point sets was 0.038%, which was reasonably small for practical use cases.
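One well-known way to approximate a convex hull from a stream with constant memory is to keep, for a fixed set of directions, only the point that extends farthest in each direction. The sketch below illustrates that generic technique; it is not claimed to be the algorithm proposed in the paper, and the function name and the default of 16 directions are assumptions.

```python
import math

def approx_convex_hull(points, num_directions=16):
    """Streaming approximation: per direction, remember the most extreme point.
    Memory use is O(num_directions), independent of the number of points."""
    directions = [
        (math.cos(2 * math.pi * i / num_directions),
         math.sin(2 * math.pi * i / num_directions))
        for i in range(num_directions)
    ]
    best_point = [None] * num_directions
    best_score = [-math.inf] * num_directions
    for x, y in points:  # single pass over the stream
        for i, (dx, dy) in enumerate(directions):
            score = x * dx + y * dy  # projection onto direction i
            if score > best_score[i]:
                best_score[i] = score
                best_point[i] = (x, y)
    # Collapse consecutive duplicates; the result is in counter-clockwise order.
    hull = []
    for point in best_point:
        if point is not None and (not hull or hull[-1] != point):
            hull.append(point)
    if len(hull) > 1 and hull[0] == hull[-1]:
        hull.pop()
    return hull

pts = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3), (3, 1)]
print(approx_convex_hull(pts, num_directions=8))
```

Every returned point lies on the boundary of the exact hull, so the approximation is an inscribed polygon whose error shrinks as the number of directions grows.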
Junko TAKAHASHI Keiichi OKABE Hiroki ITOH Xuan-Thuy NGO Sylvain GUILLEY Ritu-Ranjan SHRIVASTWA Mushir AHMED Patrick LEJOLY
The growing threat of hardware Trojans (HTs) in the system-on-chip (SoC) industry has led embedded systems researchers to propose a series of methodologies for detecting Trojan circuits or logic inside a host design at various stages of the chip design and manufacturing process. Many state-of-the-art works propose different techniques for HT detection, among which the popular choice remains side-channel analysis (SCA) based methods, which perform differential analysis targeting differences in power consumption, changes in electromagnetic (EM) emanation, or delays in logic propagation along various paths of the circuit. Even though the effectiveness of these methods is well established, the evaluation is carried out on simplistic models such as AES coprocessors, and the analytical approaches used for these methods are limited by some statistical metrics such as direct comparison of EM traces or the T-test coefficients. In this paper, we propose two new detection methodologies based on machine learning (ML) algorithms. The first method applies supervised ML algorithms to raw EM traces for the classification and detection of HTs. It offers a detection rate close to 90% and a false negative rate smaller than 5%. In the second method, we propose an approach based on outlier/novelty detection algorithms. This method, combined with the T-test based signal processing technique, outperforms the state of the art, with a detection rate close to 100% and a false positive rate smaller than 1%. In different experiments, the false negative rate is at nearly the same level as the false positive rate, and for that reason only the false positive rate is shown in the results. We have evaluated the performance of our method on a complex target design: a generic RISC-V processor. Three HTs, with sizes of 0.53%, 0.27% and 0.09% of the RISC-V processor, are inserted for the experiments. In this paper we provide detailed descriptions of our tests and experimental process for reproducibility. The experimental results show that the inserted HTs, though minimalistic, can be successfully detected using our new methodology.
Kento HASEGAWA Tomotaka INOUE Nozomu TOGAWA
Due to the rapid growth of the information industry, various Internet of Things (IoT) devices have been widely used in our daily lives. Since the demand for low-cost and high-performance hardware devices has increased, malicious third-party vendors may insert malicious circuits into the products to degrade their performance or to leak secret information stored at the devices. The malicious circuit surreptitiously inserted into the hardware products is known as a ‘hardware Trojan.’ How to detect hardware Trojans becomes a significant concern in recent hardware production. In this paper, we propose a hardware Trojan detection method that employs two-stage neural networks and effectively utilizes the Trojan probability of neighbor nets. At the first stage, the 11 Trojan features are extracted from the nets in a given netlist, and then we estimate the Trojan probability that shows the probability of the Trojan nets. At the second stage, we learn the Trojan probability of the neighbor nets for each net in the netlist and classify the nets into a set of normal nets and Trojan ones. The experimental results demonstrate that the average true positive rate becomes 83.6%, and the average true negative rate becomes 96.5%, which is sufficiently high compared to the existing methods.
Xiao-yu WAN Rui-fei CHANG Zheng-qiang WANG Zi-fu FAN
This paper investigates the sum rate (SR) maximization problem for downlink cooperative non-orthogonal multiple access (C-NOMA) systems with hardware impairments (HIs). The source node communicates with users via a half-duplex amplify-and-forward (HD-AF) relay with HIs. First, we derive the SR expression of the systems under HIs. Then, the SR maximization problem is formulated under the maximum power constraints of the source and relay and the minimum rate constraint of each user. As the original SR maximization problem is non-convex, it is difficult to find the optimal resource allocation directly with traditional convex optimization methods. We use a variable substitution method to convert the non-convex SR maximization problem into an equivalent convex optimization problem. Finally, a joint power and rate allocation algorithm based on the interior point method is proposed to maximize the SR of the systems. Simulation results show that the algorithm can improve the SR of C-NOMA compared with the cooperative orthogonal multiple access (C-OMA) scheme.