Masahiro TADA Masayuki NISHIDA
In this study, we use a vision-based driving monitoring sensor to track drivers’ visual scanning behavior, a key factor for preventing traffic accidents. Our system evaluates driver’s behaviors by referencing the safety knowledge of professional driving instructors, and provides real-time voice-guided safety advice to encourage safer driving. Our system’s evaluation of safe driving behaviors matched the instructor’s evaluation with accuracy over 80%.
Linhan LI Qianying ZHANG Zekun XU Shijun ZHAO Zhiping SHI Yong GUAN
The Linux kernel has been applied in various security-sensitive fields, so ensuring its security is crucial. Vulnerabilities in the Linux kernel are usually caused by undefined behaviors of the C programming language, the most threatening of which are memory safety vulnerabilities. Both the software-based and hardware approaches to memory safety have disadvantages of poor performance, false positives, and poor compatibility. This paper explores the feasibility of using the safe programming language Rust to reconstruct a Linux kernel component and open-source the component's code. We leverage the Rust FFI mechanism to design a safe foreign interface layer to enable the reconstructed component to invoke other Linux functionalities, and then use Rust to reconstruct the component, during which we leverage Rust's type-safety and ownership mechanisms to improve its security, and finally export the C interface of the component to enable the invocation by the Linux kernel. The performance and memory overhead of the reconstructed component, referred to as “rOOM”, were evaluated, revealing a performance overhead of 8.9% in kernel mode, 5% in user mode, 3% in real time, and a memory overhead of 0.06%. These results suggest that it is possible to develop key components of the Linux kernel using Rust in terms of functionality, performance, and memory overhead.
Weisen LUO Xiuqin WEI Hiroo SEKIYA
This paper presents an analysis-based design method for designing the class-Φ22 wireless power transfer (WPT) system, taking its subsystems as a whole into account. By using the proposed design method, it is possible to derive accurate design values which can make sure the class-E Zero-Voltage-Switching/Zero-Derivative-Switching (ZVS/ZDS) to obtain without applying any tuning processes. Additionally, it is possible to take the effects of the switch on resistance, diode forward voltage drop, and equivalent series resistances (ESRs) of all passive elements on the system operations into account. Furthermore, design curves for a wide range of parameters are developed and organized as basic data for various applications. The validities of the proposed design procedure and derived design curves are confirmed by LTspice simulation and circuit experiment. In the experimental measurements, the class-Φ22 WPT system achieves 78.8% power-transmission efficiency at 6.78MHz operating frequency and 7.96W output power. Additionally, the results obtained from the LTspice simulation and laboratory experiment show quantitative agreements with the analytical predictions, which indicates the accuracy and validity of the proposed analytical method and design curves given in this paper.
Yuya DEGAWA Toru KOIZUMI Tomoki NAKAMURA Ryota SHIOYA Junichiro KADOMOTO Hidetsugu IRIE Shuichi SAKAI
One of the performance bottlenecks of a processor is the front-end that supplies instructions. Various techniques, such as cache replacement algorithms and hardware prefetching, have been investigated to facilitate smooth instruction supply at the front-end and to improve processor performance. In these approaches, one of the most important factors has been the reduction in the number of instruction cache misses. By using the number of instruction cache misses or derived factors, previous studies have explained the performance improvements achieved by their proposed methods. However, we found that the number of instruction cache misses does not always explain performance changes well in modern processors. This is because the front-end in modern processors handles subsequent instruction cache misses in overlap with earlier ones. Based on this observation, we propose a novel factor: the number of miss regions. We define a region as a sequence of instructions from one branch misprediction to the next, while we define a miss region as a region that contains one or more instruction cache misses. At the boundary of each region, the pipeline is flushed owing to a branch misprediction. Thus, cache misses after this boundary are not handled in overlap with cache misses before the boundary. As a result, the number of miss regions is equal to the number of cache misses that are processed without overlap. In this paper, we demonstrate that the number of miss regions can well explain the variation in performance through mathematical models and simulation results. The results show that the model explains cycles per instruction with an average error of 1.0% and maximum error of 4.1% when applying an existing prefetcher to the instruction cache. The idea of miss regions highlights that instruction cache misses and branch mispredictions interact with each other in processors with a decoupled front-end. We hope that considering this interaction will motivate the development of fast performance estimation methods and new microarchitectural methods.
Fuma SAWA Yoshinori KAMIZONO Wataru KOBAYASHI Ittetsu TANIGUCHI Hiroki NISHIKAWA Takao ONOYE
Advanced driver-assistance systems (ADAS) generally play an important role to support safe drive by detecting potential risk factors beforehand and informing the driver of them. However, if too many services in ADAS rely on visual-based technologies, the driver becomes increasingly burdened and exhausted especially on their eyes. The drivers should be back out of monitoring tasks other than significantly important ones in order to alleviate the burden of the driver as long as possible. In-vehicle auditory signals to assist the safe drive have been appealing as another approach to altering visual suggestions in recent years. In this paper, we developed an in-vehicle auditory signals evaluation platform in an existing driving simulator. In addition, using in-vehicle auditory signals, we have demonstrated that our developed platform has highlighted the possibility to partially switch from only visual-based tasks to mixing with auditory-based ones for alleviating the burden on drivers.
Joong-Won SHIN Masakazu TANUMA Shun-ichiro OHMI
In this research, we investigated the threshold voltage (VTH) control by partial polarization of metal-ferroelectric-semiconductor field-effect transistors (MFSFETs) with 5 nm-thick nondoped HfO2 gate insulator utilizing Kr-plasma sputtering for Pt gate electrode deposition. The remnant polarization (2Pr) of 7.2 μC/cm2 was realized by Kr-plasma sputtering for Pt gate electrode deposition. The memory window (MW) of 0.58 V was realized by the pulse amplitude and width of -5/5 V, 100 ms. Furthermore, the VTH of MFSFET was controllable by program/erase (P/E) input pulse even with the pulse width below 100 ns which may be caused by the reduction of leakage current with decreasing plasma damage.
Jonghyeok YOU Heesoo KIM Kilho LEE
This paper proposes a fault-resilient ROS platform supporting rapid fault detection and recovery. The platform employs heartbeat-based fault detection and node replication-based recovery. Our prototype implementation on top of the ROS Melodic shows a great performance in evaluations with a Nvidia development board and an inverted pendulum device.
Shinsei YOSHIKIYO Naoko MISAWA Kasidit TOPRASERTPONG Shinichi TAKAGI Chihiro MATSUI Ken TAKEUCHI
This paper proposes a layer-wise tunable retraining method for edge FeFET Computation-in-Memory (CiM) to compensate the accuracy degradation of neural network (NN) by FeFET device errors. The proposed retraining can tune the number of layers to be retrained to reduce inference accuracy degradation by errors that occur after retraining. Weights of the original NN model, accurately trained in cloud data center, are written into edge FeFET CiM. The written weights are changed by FeFET device errors in the field. By partially retraining the written NN model, the proposed method combines the error-affected layers of NN model with the retrained layers. The inference accuracy is thus recovered. After retraining, the retrained layers are re-written to CiM and affected by device errors again. In the evaluation, at first, the recovery capability of NN model by partial retraining is analyzed. Then the inference accuracy after re-writing is evaluated. Recovery capability is evaluated with non-volatile memory (NVM) typical errors: normal distribution, uniform shift, and bit-inversion. For all types of errors, more than 50% of the degraded percentage of inference accuracy is recovered by retraining only the final fully-connected (FC) layer of Resnet-32. To simulate FeFET Local-Multiply and Global-accumulate (LM-GA) CiM, recovery capability is also evaluated with FeFET errors modeled based on FeFET measurements. Retraining only FC layer achieves recovery rate of up to 53%, 66%, and 72% for FeFET write variation, read-disturb, and data-retention, respectively. In addition, just adding two more retraining layers improves recovery rate by 20-30%. In order to tune the number of retraining layers, inference accuracy after re-writing is evaluated by simulating the errors that occur after retraining. When NVM typical errors are injected, it is optimal to retrain FC layer and 3-6 convolution layers of Resnet-32. The optimal number of layers can be increased or decreased depending on the balance between the size of errors before retraining and errors after retraining.
Tsuyoshi SUGIURA Toshihiko YOSHIMASU
This paper presents a Ka-band high-efficiency power amplifier (PA) with a novel adaptively controlled gate capacitor circuit and a two-step adaptive bias circuit for 5th generation (5G) mobile terminal applications fabricated using a 45-nm silicon on insulator (SOI) CMOS process. The PA adopts a stacked FET structure to increase the output power because of the low breakdown voltage issue of scaled MOSFETs. The novel adaptive gate capacitor circuit properly controls the RF swing for each stacked FET to achieve high efficiency in the several-dB back-off region. Further, the novel two-step adaptive bias circuit effectively controls the gate voltage for each stacked FET for high linearity and high back-off efficiency. At a supply voltage of 4 V, the fabricated PA has exhibited a saturated output power of 20.0 dBm, a peak power added efficiency (PAE) of 42.7%, a 3dB back-off efficiency of 32.7%, a 6dB back-off efficiency of 22.7%, and a gain of 15.6 dB. The effective PA area was 0.82 mm by 0.74 mm.
Jeyoen KIM Takumi SOMA Tetsuya MANABE Aya KOJIMA
This paper attempts to identify which side of the road a bicycle is currently riding on using a common camera for realizing an advanced bicycle navigation system and bicycle riding safety support system. To identify the roadway area, the proposed method performs semantic segmentation on a front camera image captured by a bicycle drive recorder or smartphone. If the roadway area extends from the center of the image to the right, the bicyclist is riding on the left side of the roadway (i.e., the correct riding position in Japan). In contrast, if the roadway area extends to the left, the bicyclist is on the right side of the roadway (i.e., the incorrect riding position in Japan). We evaluated the accuracy of the proposed method on various road widths with different traffic volumes using video captured by riding bicycles in Tsuruoka City, Yamagata Prefecture, and Saitama City, Saitama Prefecture, Japan. High accuracy (>80%) was achieved for any combination of the segmentation model, riding side identification method, and experimental conditions. Given these results, we believe that we have realized an effective image segmentation-based method to identify which side of the roadway a bicycle riding is on.
Hiroshi SUZUKI Tsuyoshi FUNAKI
SiC-MOSFETs are being increasingly implemented in power electronics systems as low-loss, fast switching devices. Despite the advantages of an SiC-MOSFET, its large dv/dt or di/dt has fear of electromagnetic interference (EMI) noise. This paper proposes and demonstrates a simple and robust gate driver that can suppress ringing oscillation and surge voltage induced by the turn-off of the SiC-MOSFET body diode. The proposed gate driver utilizes the channel leakage current methodology (CLC) to enhance the damping effect by elevating the gate-source voltage (VGS) and inducing the channel leakage current in the device. The gate driver can self-adjust the timing of initiating CLC operation, which avoids an increase in switching loss. Additionally, the output voltage of the VGS elevation circuit does not need to be actively controlled in accordance with the operating conditions. Thus, the circuit topology is simple, and ringing oscillation can be easily attenuated with fixed circuit parameters regardless of operating conditions, minimizing the increase in switching loss. The effectiveness and versatility of proposed gate driver were experimentally validated for a wide range of operating conditions by double and single pulse switching tests.
There are continuous and strong demands for the DC-DC converter to reduce the size of passive components and increase the system power density. Advances in CMOS processes and GaN FETs enabled the switching frequency of DC-DC converters to be beyond 10MHz. The advancements of 3-D integrated magnetics will further reduce the footprint. In this paper, the overview of beyond-10MHz DC-DC converters will be provided first, and our recent achievements are introduced focusing on 3D-integration of Fe-based metal composite magnetic core inductor, and GaN FET control designs.
Shunsuke TSUKADA Hikaru TAKAYASHIKI Masayuki SATO Kazuhiko KOMATSU Hiroaki KOBAYASHI
A hybrid memory architecture (HMA) that consists of some distinct memory devices is expected to achieve a good balance between high performance and large capacity. Unlike conventional memory architectures, the HMA needs the metadata for data management since the data are migrated between the memory devices during the execution of an application. The memory controller caches the metadata to avoid accessing the memory devices for the metadata reference. However, as the amount of the metadata increases in proportion to the size of the HMA, the memory controller needs to handle a large amount of metadata. As a result, the memory controller cannot cache all the metadata and increases the number of metadata references. This results in an increase in the access latency to reach the target data and degrades the performance. To solve this problem, this paper proposes a metadata prefetching mechanism for HMAs. The proposed mechanism loads the metadata needed in the near future by prefetching. Moreover, to increase the effect of the metadata prefetching, the proposed mechanism predicts the metadata used in the near future based on an address difference that is the difference between two consecutive access addresses. The evaluation results show that the proposed metadata prefetching mechanism can improve the instructions per cycle by up to 44% and 9% on average.
Sejin JUNG Eui-Sub KIM Junbeom YOO
Traditional safety analysis techniques have shown difficulties in incorporating dynamically changing structures of CPSs (Cyber-Physical Systems). STPA (System-Theoretic Process Analysis), one of the widely used, needs to unfold and arrange all hidden structures before beginning a full-fledged analysis. This paper proposes an intermediate model “Information Unfolding Model (IUM)” and a process “Information Unfolding Process (IUP)” to unfold dynamic structures which are hidden in CPSs and so help analysts construct control structures in STPA thoroughly.
Masayoshi YAMAMOTO Shinya SHIRAI Senanayake THILAK Jun IMAOKA Ryosuke ISHIDO Yuta OKAWAUCHI Ken NAKAHARA
In response to fast charging systems, Silicon Carbide (SiC) power semiconductor devices are of great interest of the automotive power electronics applications as the next generation of fast charging systems require high voltage batteries. For high voltage battery EVs (Electric Vehicles) over 800V, SiC power semiconductor devices are suitable for 3-phase inverters, battery chargers, and isolated DC-DC converters due to their high voltage rating and high efficiency performance. However, SiC-MOSFETs have two characteristics that interfere with high-speed switching and high efficiency performance operations for SiC MOS-FET applications in automotive power electronics systems. One characteristic is the low voltage rating of the gate-source terminal, and the other is the large internal gate-resistance of SiC MOS-FET. The purpose of this work was to evaluate a proposed hybrid gate drive circuit that could ignore the internal gate-resistance and maintain the gate-source terminal stability of the SiC-MOSFET applications. It has been found that the proposed hybrid gate drive circuit can achieve faster and lower loss switching performance than conventional gate drive circuits by using the current source gate drive characteristics. In addition, the proposed gate drive circuit can use the voltage source gate drive characteristics to protect the gate-source terminals despite the low voltage rating of the SiC MOS-FET gate-source terminals.
Mitsuhiko IGARASHI Yuuki UCHIDA Yoshio TAKAZAWA Makoto YABUUCHI Yasumasa TSUKAMOTO Koji SHIBUTANI Kazutoshi KOBAYASHI
In this paper, we present an analysis of local variability of bias temperature instability (BTI) by measuring Ring-Oscillators (RO) on various processes and its impact on logic circuit and SRAM. The evaluation results based on measuring ROs of a test elementary group (TEG) fabricated in 7nm Fin Field Effect Transistor (FinFET) process, 16/14nm generation FinFET processes and a 28nm planer process show that the standard deviations of Negative BTI (NBTI) Vth degradation (σ(ΔVthp)) are proportional to the square root of the mean value (µ(ΔVthp)) at any stress time, Vth flavors and various recovery conditions. While the amount of local BTI variation depends on the gate length, width and number of fins, the amount of local BTI variation at the 7nm FinFET process is slightly larger than other processes. Based on these measurement results, we present an analysis result of its impact on logic circuit considering measured Vth dependency on global NBTI in the 7nm FinFET process. We also analyse its impact on SRAM minimum operation voltage (Vmin) of static noise margin (SNM) based on sensitivity analysis and shows non-negligible Vmin degradation caused by local NBTI.
We design a silicon gate-all-around junctionless field-effect transistor (JLFET) using a step thickness gate oxide (GOX) by the Sentaurus technology computer-aided design simulation. We demonstrate the different gate-induced drain leakage (GIDL) mechanism of the traditional inversion-mode field-effect transistor (IMFET) and JLFET. The off leakage in the IMFET is dominated by the parasitic bipolar junction transistor effect, whereas in the JLFET it is a result of the volume conduction due to the screening effect of the accumulated holes. With the introduction of a 4 nm thick-second GOX and remaining first GOX thickness of 1 nm, the tunneling generation is reduced at the channel-drain interface, leading to a decrease in the off current of the JLFET. A thicker second GOX has the total gate capacitance of JLFETs, where a 0.3 ps improved intrinsic delay is achieved. This alleviates the capacitive load of the transistor in the circuit applications. Finally, the short-channel effects of the step thickness GOX JLFET were investigated with a total gate length from 40 nm to 6 nm. The results indicate that the step thickness GOX JLFETs perform better on the on/off ratio and drain-induced barrier lowering but exhibit a small degradation on the subthreshold swing and threshold roll-off.
Cache prefetching technique brings huge benefits to performance improvement, but it comes at the cost of microarchitectural security in processors. In this letter, we deep dive into internal workings of a DCUIP prefetcher, which is one of prefetchers equipped in Intel processors. We discover that a DCUIP table is shared among different execution contexts in hyperthreading-enabled processors, which leads to another microarchitectural vulnerability. By exploiting the vulnerability, we propose a DCUIP poisoning attack. We demonstrate an AES encryption key can be extracted from an AES-NI implementation by mounting the proposed attack.
Li TAN Haoyu WANG Xiaofeng LIAN Jiaqi SHI Minji WANG
As the nodes of AWSN (Aerial Wireless Sensor Networks) fly around, the network topology changes frequently with high energy consumption and high cluster head mortality, and some sensor nodes may fly away from the original cluster and interrupt network communication. To ensure the normal communication of the network, this paper proposes an improved LEACH-M protocol for aerial wireless sensor networks. The protocol is improved based on the traditional LEACH-M protocol and MCR protocol. A Cluster head selection method based on maximum energy and an efficient solution for outlier nodes is proposed to ensure that cluster heads can be replaced prior to their death and ensure outlier nodes re-home quickly and efficiently. The experiments show that, compared with the LEACH-M protocol and MCR protocol, the improved LEACH-M protocol performance is significantly optimized, increasing network data transmission efficiency, improving energy utilization, and extending network lifetime.
Jianli CAO Zhikui CHEN Yuxin WANG He GUO Pengcheng WANG
Like many processors, GPGPU suffers from memory wall. The traditional solution for this issue is to use efficient schedulers to hide long memory access latency or use data prefetch mech-anism to reduce the latency caused by data transfer. In this paper, we study the instruction fetch stage of GPU's pipeline and analyze the relationship between the capacity of GPU kernel and instruction miss rate. We improve the next line prefetch mechanism to fit the SIMT model of GPU and determine the optimal parameters of prefetch mechanism on GPU through experiments. The experimental result shows that the prefetch mechanism can achieve 12.17% performance improvement on average. Compared with the solution of enlarging I-Cache, prefetch mechanism has the advantages of more beneficiaries and lower cost.