Yutaka MASUDA Jun NAGAYAMA TaiYu CHENG Tohru ISHIHARA Yoichi MOMIYAMA Masanori HASHIMOTO
This work proposes a design methodology that reduces power dissipation under voltage over-scaling (VOS) operation. The key idea is to combine critical path isolation (CPI) and bit-width scaling (BWS) under a constraint on computational quality, e.g., Peak Signal-to-Noise Ratio (PSNR) in the image processing domain. Conventional CPI inherently cannot reduce the delay of intrinsic critical paths (CPs), which may significantly restrict the power saving effect. In contrast, the proposed methodology tries to reduce the delay of both intrinsic and non-intrinsic CPs. Therefore, our design dramatically reduces the supply voltage and power dissipation while satisfying the quality constraint. Moreover, to reduce the co-design exploration space, the proposed methodology exploits the exclusiveness of the paths targeted by CPI and BWS: CPI aims at reducing the minimum supply voltage of non-intrinsic CPs, and BWS focuses on intrinsic CPs in arithmetic units. Based on this exclusiveness, the proposed design splits the simultaneous optimization problem into three sub-problems: (1) determining the bit-width reduction, (2) optimizing the timing of non-intrinsic CPs, and (3) finding the minimum supply voltage of the BWS- and CPI-applied circuit under the quality constraint, in order to reduce power dissipation. Thanks to this problem splitting, the proposed methodology can efficiently find a quality-constrained minimum-power design. Evaluation results show that CPI and BWS are highly compatible and that they significantly enhance the efficacy of VOS. In a case study of a GPGPU processor, the proposed design reduces power dissipation by 42.7% with an image processing workload and by 51.2% with a neural network inference workload.
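As a rough illustration of sub-problem (1), the sketch below searches for the largest number of low-order bits that can be dropped while a PSNR constraint still holds. It is not the authors' procedure: the truncation model, the function names (truncate_low_bits, psnr, max_dropped_bits), and the 8-bit pixel assumption are all hypothetical.

```python
import math

def truncate_low_bits(value, dropped_bits):
    """Zero the lowest `dropped_bits` bits, emulating a narrower datapath (assumed model)."""
    return (value >> dropped_bits) << dropped_bits

def psnr(reference, approximate, peak=255):
    """Peak signal-to-noise ratio in dB; infinite when both signals are identical."""
    mse = sum((r - a) ** 2 for r, a in zip(reference, approximate)) / len(reference)
    if mse == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / mse)

def max_dropped_bits(pixels, min_psnr_db, word_bits=8):
    """Largest bit-width reduction whose PSNR still meets the quality constraint."""
    best = 0
    for dropped in range(1, word_bits):
        approx = [truncate_low_bits(p, dropped) for p in pixels]
        if psnr(pixels, approx) < min_psnr_db:
            break  # quality only gets worse as more bits are dropped
        best = dropped
    return best
```

For example, max_dropped_bits(list(range(256)), 30) evaluates the constraint on a synthetic ramp of all 8-bit values; a real flow would use the actual workload data and would then continue with the timing and supply-voltage steps.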
The challenges posed by big data in the 21st century are complex: conventional wisdom held that polynomial-time algorithms are practical; however, when we handle big data, even a linear-time algorithm may be too slow. Thus, sublinear- and constant-time algorithms are required. The academic research project “Foundations of Innovative Algorithms for Big Data,” which started in 2014 and will finish in September 2021, aims at developing various techniques and frameworks for designing algorithms for big data. In this project, we introduce a “Sublinear Computation Paradigm.” Toward this purpose, we first provide a survey of constant-time algorithms, which are the most investigated framework in this area, and then present our recent results on sublinear progressive algorithms. A sublinear progressive algorithm first outputs a temporary approximate solution in constant time, then gradually suggests better solutions in sublinear time, and finally finds the exact solution. We present Sublinear Progressive Algorithm Theory (SPA Theory, for short), which makes it possible to construct a sublinear progressive algorithm for any property that has a constant-time algorithm and an exact algorithm (an exponential-time one is allowed), without losing any computation time in the big-O sense.
Double modular redundancy (DMR) executes an operation twice and detects a soft error by comparing the results of the duplicated operations. The soft error is corrected by re-executing the necessary operations. The re-execution requires error-free input data, and registers are needed to store such data. In this paper, a method to minimize the required number of registers is proposed, in which an appropriate subgraph partitioning of operation nodes is searched for. In addition, using the proposed register minimization method, a method to minimize the area of the functional units and registers required to implement the DMR design is proposed.
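The detect-and-re-execute idea behind DMR can be sketched in software as follows. This is only a behavioral illustration with hypothetical names (run_dmr, max_retries); it says nothing about the register minimization or subgraph partitioning proposed in the paper. The saved tuple plays the role of the registers that hold error-free inputs.

```python
def run_dmr(operation, inputs, max_retries=3):
    """Execute `operation` twice and compare; on a mismatch, re-execute from saved inputs."""
    saved = tuple(inputs)  # error-free copy kept so that re-execution is possible
    for _ in range(max_retries + 1):
        first = operation(*saved)
        second = operation(*saved)
        if first == second:
            return first  # duplicated results agree: accept the value
    raise RuntimeError("duplicated results kept disagreeing")
```

A mismatch only reveals that one of the two executions was corrupted, not which one, so both are repeated from the saved inputs.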
Tingyao WU Zhisong BIE Celimuge WU
The newly proposed orthogonal time frequency space (OTFS) system exhibits excellent error performance on high-Doppler fading channels. However, the rectangular prototype window function (PWF) inherent in OTFS leads to high out-of-band emission (OOBE), which reduces the spectral efficiency in multi-user scenarios. To this end, this paper presents an OTFS system based on bi-orthogonal frequency division multiplexing (OTFS-BFDM) modulation. In OTFS-BFDM systems, PWFs with bi-orthogonal properties can be optimized to provide lower OOBE than OTFS, which is a special case with rectangular PWF. We further derive that the OTFS-BFDM system is sparsely-connected so that the low-complexity message passing (MP) decoding algorithm can be adopted. Moreover, the power spectral density, peak to average power ratio (PAPR) and bit error rate (BER) of the OTFS-BFDM system with different PWFs are compared. Simulation results show that: i) the use of BFDM modulation significantly inhibits the OOBE of OTFS system; ii) the better the frequency-domain localization of PWFs, the smaller the BER and PAPR of OTFS-BFDM system.
Tatsumi KONISHI Hiroyuki NAKANO Yoshikazu YANO Michihiro AOKI
This letter proposes a transmission scheme called spatial vector (SV), which is effective for Nakagami-m fading multiple-input multiple-output channels. First, the analytical error rate of SV is derived for Nakagami-m fading MIMO channels. Next, an example of SV called integer SV (ISV) is introduced. The error performance was evaluated over Nakagami-m fading from m = 1 to m = 50 and compared with spatial modulation (SM), enhanced SM, and quadrature SM. The results show that for m > 1, ISV outperforms the SM schemes and is robust to m variations.
Yuan ZHAO Wuyi YUE Yutaka TAKAHASHI
In this paper, we consider the transmission needs of communication networks for two classes of secondary users (SUs), named SU1 and SU2 (lowest priority) in cognitive radio networks (CRNs). In such CRNs, primary users (PUs) have preemptive priority over both SU1's users (SU1s) and SU2's users (SU2s). We propose a preemptive scheme (referred to as the P Scheme) and a non-preemptive scheme (referred to as the Non-P Scheme) when considering the interactions between SU1s and SU2s. Focusing on the transmission interruptions to SU2 packets, we present a probabilistic returning scheme with a returning probability to realize feedback control for SU2 packets. We present a Markov chain model to develop some formulas for SU1 and SU2 packets, and compare the influences of the P Scheme and the Non-P Scheme in the proposed probabilistic returning scheme. Numerical analyses compare the impact of the returning probability on the P Scheme and the Non-P Scheme. Furthermore, we optimize the returning probability and compare the optimal numerical results yielded by the P Scheme and the Non-P Scheme.
Yasushi ESAKI Yuta NAKAHARA Toshiyasu MATSUSHIMA
Some studies have investigated how accurately a deep neural network can approximate a function that represents the generating pattern of data. However, they have confirmed only whether at least one function close to the function representing the generating pattern exists in the function classes of deep neural networks as the parameter values vary. Therefore, we propose a new criterion for inferring the approximation accuracy. Our criterion is the existence ratio of functions close to the function representing the generating pattern in the function classes. Moreover, we show through numerical simulations that, under the proposed criterion, a deep neural network with a larger number of layers approximates the function representing the generating pattern more accurately than one with a smaller number of layers.
Masaki TAKANASHI Shu-ichi SATO Kentaro INDO Nozomu NISHIHARA Hiroki HAYASHI Toru SUZUKI
The prediction of the malfunction timing of wind turbines is essential for maintaining the high profitability of the wind power generation industry. Studies have been conducted on machine learning methods that use condition monitoring system data, such as vibration data, and supervisory control and data acquisition (SCADA) data to detect and predict anomalies in wind turbines automatically. Autoencoder-based techniques that use unsupervised learning where the anomaly pattern is unknown have attracted significant interest in the area of anomaly detection and prediction. In particular, vibration data are considered useful because they include the changes that occur in the early stages of a malfunction. However, when autoencoder-based techniques are applied for prediction purposes, in the training process it is difficult to distinguish the difference between operating and non-operating condition data, which leads to the degradation of the prediction performance. In this letter, we propose a method in which both vibration data and SCADA data are utilized to improve the prediction performance, namely, a method that uses a power curve composed of active power and wind speed. We evaluated the method's performance using vibration and SCADA data obtained from an actual wind farm.
In this paper, we propose the first private decision tree evaluation (PDTE) schemes that are suitable for use in Machine Learning as a Service (MLaaS) scenarios. In our schemes, a user and a model owner send the ciphertexts of a sample and a decision tree model, respectively, and a single server classifies the sample without learning either the sample or the decision tree. Although many PDTE schemes have been proposed so far, most of them require the decision tree to be revealed to the server. This is undesirable because the classification model is the intellectual property of the model owner and/or may include sensitive information used to train the model; therefore, the model should also be hidden from the server. In other PDTE schemes, multiple servers jointly conduct the classification process, and the decision tree is kept secret from the servers under the assumption that they do not collude. Unfortunately, this assumption may not hold because MLaaS is usually provided by a single company. In contrast, our schemes do not have such problems. In principle, fully homomorphic encryption allows us to classify an encrypted sample based on an encrypted decision tree, and in fact, the existing non-interactive PDTE scheme can be modified so that the server performs classification while handling only ciphertexts. However, the resulting scheme is less efficient than ours. We also show experimental results for our schemes.
Eunsam KIM Jinsung KIM Hyoseop SHIN
This paper presents a novel cooperative recording scheme for networked PVRs based on P2P networks that increases storage efficiency compared with PVRs operating independently of each other, while maintaining program availability to a similar degree. We employ an erasure coding technique to guarantee the data availability of recorded programs in P2P networks. We determine the degree of data redundancy of recorded programs so that the system can support all the concurrent streaming requests for them and maintain as much availability as needed. We also present how to assign recording tasks to PVRs and play back the recorded programs without performance degradation. We show that our proposed scheme improves the storage efficiency significantly compared with PVRs that do not cooperate with each other, while keeping the playbackability of each request at a similar level.
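To make the link between erasure coding and availability concrete, the sketch below uses a textbook model that is not taken from the paper: a program is split into n fragments of which any k suffice to reconstruct it, and each fragment's PVR is assumed to be online independently with probability p. The function names and the max_n search bound are hypothetical.

```python
from math import comb

def availability(n, k, p):
    """Probability that at least k of n fragments are reachable (independent peers assumed)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

def min_fragments(k, p, target, max_n=1000):
    """Smallest n (redundancy degree n/k) whose availability reaches `target`."""
    for n in range(k, max_n + 1):
        if availability(n, k, p) >= target:
            return n
    raise ValueError("target availability not reachable within max_n fragments")
```

Under this model, storing n/k times the program size replaces full replication, which is where the storage saving of erasure coding comes from; the scheme in the paper additionally accounts for concurrent streaming requests, which this sketch ignores.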
Mineto TSUKADA Hiroki MATSUTANI
There has recently been increasing demand for real-time training on resource-limited IoT devices such as smart sensors, which realizes standalone online adaptation to streaming data without data transfers to remote servers. OS-ELM (Online Sequential Extreme Learning Machine) is one of the promising neural-network-based online algorithms for on-chip learning because it can perform online training at low computational cost and is easy to implement as a digital circuit. Existing OS-ELM digital circuits employ a fixed-point data format, and the bit-widths are often tuned manually; however, this may cause overflow or underflow, which can lead to unexpected behavior of the circuit. For on-chip learning systems, an overflow/underflow-free design has a great impact because online training is performed continuously and the intervals of intermediate variables change dynamically over time. In this paper, we propose an overflow/underflow-free bit-width optimization method for fixed-point digital circuits of OS-ELM. Experimental results show that our method realizes overflow/underflow-free OS-ELM digital circuits with 1.0x - 1.5x more area cost compared to the baseline simulation method, in which overflow or underflow can happen.
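Once the interval of a variable is known, the number of integer bits that rules out overflow follows directly. The sketch below assumes a signed two's-complement fixed-point format and takes the interval as given; how the intervals of OS-ELM's intermediate variables are actually derived is the subject of the paper and is not reproduced here. The name min_integer_bits is hypothetical.

```python
def min_integer_bits(lo, hi, frac_bits):
    """Fewest integer bits (sign bit included) so that [lo, hi] fits without overflow.

    Assumes signed two's complement: with I integer bits and F fractional bits the
    representable range is [-2**(I-1), 2**(I-1) - 2**-F].
    """
    if lo > hi:
        raise ValueError("empty interval")
    step = 2.0 ** -frac_bits  # resolution of one least-significant bit
    bits = 1
    while not (-(2 ** (bits - 1)) <= lo and hi <= 2 ** (bits - 1) - step):
        bits += 1
    return bits
```

Because the bound is computed from the whole interval rather than from observed simulation traces, a value inside the interval cannot exceed the chosen format, at the price of possibly wider words.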
Yuki IMAI Shinichi NISHIZAWA Kazuhito ITO
Environmental power generation devices such as solar cells are used as power sources for IoT devices. Because of the large internal resistance of such power sources, an LSI in an IoT device may malfunction when it operates at high speed, because a large current flows and the voltage drops. In this paper, a standard cell library of stacked-structure cells is proposed that increases the delay of logic circuits within a range not exceeding the clock cycle, thereby reducing the maximum current of the LSIs. We show that the maximum power consumption of LSIs can be reduced without increasing their energy consumption.
The effect of provision of “Neither-Good-Nor-Bad” (NGNB) information on the perceived trustworthiness of agents has been investigated in previous studies. The experimental results have revealed several conditions under which the provision of NGNB information works effectively to make users perceive greater trust of agents. However, the experiments in question were carried out in a situation in which a user is able to choose, with the agent's advice, one of a limited number of options. In practical problems, we are often at a loss as to which to choose because there are too many possible options and it is not easy to narrow them down. Furthermore, in the above-mentioned previous studies, it was easy to predict the size of profits that a user would obtain because its pattern was also limited. This prompted us, in this paper, to investigate the effect of provision of NGNB information on the users' trust of agents under conditions where it appears to the users that numerous options are available. Our experimental results reveal that an agent that reliably provides NGNB information tends to gain greater user trust in a situation where it appears to the users that there are numerous options and their consequences, and it is not easy to predict the size of profits. However, in contradiction to the previous study, the results in this paper also reveal that stable provision of NGNB information in the context of numerous options is less effective in a situation where it is harder to obtain larger profits.
Toi TOMITA Wakaha OGATA Kaoru KUROSAWA
In this paper, we construct the first efficient leakage-resilient CCA2 (LR-CCA2)-secure attribute-based encryption (ABE) schemes. We also construct the first efficient LR-CCA2-secure identity-based encryption (IBE) scheme with optimal leakage rate. To obtain our results, we develop a new quasi-adaptive non-interactive zero-knowledge (QA-NIZK) argument for the ciphertext consistency of the LR-CPA-secure schemes. Our ABE schemes are obtained by boosting the LR-CPA-security of some existing schemes to the LR-CCA2-security by using our QA-NIZK arguments. The schemes are almost as efficient as the underlying LR-CPA-secure schemes.
Kota YOSHIDA Masaya HOJO Takeshi FUJINO
Autonomous robots are controlled using physical information acquired by various sensors. The sensors are susceptible to physical attacks, which tamper with the observed values and interfere with control of the autonomous robots. Recently, sensor spoofing attacks targeting the downstream algorithms that use sensor data have become a major threat. In this paper, we introduce a new attack against the LiDAR-based simultaneous localization and mapping (SLAM) algorithm. The attack uses an adversarial LiDAR scan to fool a pose graph and a generated map. The adversary calculates a falsification amount for deceiving pose estimation and physically injects the spoofed distance into the LiDAR. The falsification amount is calculated by a gradient method against a cost function of the scan matching algorithm. The SLAM algorithm generates a wrong map from the deceived movement path estimated by scan matching. We evaluated our attack on two typical scan matching algorithms, iterative closest point (ICP) and normal distribution transform (NDT). Our experimental results show that SLAM can be fooled by tampering with the scan. Simple odometry sensor fusion is not a sufficient countermeasure. We argue that it is important to detect or prevent tampering with LiDAR scans and to notice inconsistencies in sensors caused by physical attacks.
Genki OSADA Budrul AHSAN Revoti PRASAD BORA Takashi NISHIDE
Virtual Adversarial Training (VAT) has shown impressive results among recently developed regularization methods called consistency regularization. VAT utilizes adversarial samples, generated by injecting perturbation in the input space, for training and thereby enhances the generalization ability of a classifier. However, such adversarial samples can be generated only within a very small area around the input data point, which limits the adversarial effectiveness of such samples. To address this problem, we propose LVAT (Latent space VAT), which injects perturbation in the latent space instead of the input space. LVAT can generate adversarial samples flexibly, resulting in a more adverse effect and thus more effective regularization. The latent space is built by a generative model, and in this paper we examine two different types of models: variational auto-encoder and normalizing flow, specifically Glow. We evaluated the performance of our method in both supervised and semi-supervised learning scenarios for an image classification task using the SVHN and CIFAR-10 datasets. In our evaluation, we found that our method outperforms VAT and other state-of-the-art methods.
Keitaro HIWATASHI Satsuya OHATA Koji NUIDA
Integer division is one of the most fundamental arithmetic operations and is used ubiquitously. However, the existing division protocols in secure multi-party computation (MPC) are inefficient and very complex, and this has been a barrier to applications of MPC such as secure machine learning. Some secure division protocols working in Z_{2^n} already exist. However, they have two drawbacks: they need many communication rounds, and they need to use integers larger than the inputs and outputs. In this paper, we improve a secure division protocol in two ways. First, we construct a new protocol that uses only integers of the same size as the inputs and outputs. Second, we build efficient constant-round building blocks that are used as subprotocols in the division protocol. With these two improvements, the number of communication rounds of our division protocol is reduced to about 36% (87 rounds → 31 rounds) of that of the most efficient previous protocol for 64-bit integers.
Masatoshi YAMADA Masaki OHATA Daisuke KAKOI
In ball games, players need to acquire change-of-direction skills. To reveal the mechanism of skill acquisition in this field, it is necessary to examine players' cognition as well as body movements that are measurable from outside. In the phase of change-of-direction performance that this study focuses on, cognitive factors, including the prediction of opposing players' movements and judgements of the situation, are significant. The purpose of this study was to reveal the cognitive transformation in the skill acquisition process for change-of-direction performance. The survey was conducted for three months, from August 29 to November 28, 2020, and the participants were seven university freshmen belonging to the women's basketball club of M University. The method used to analyze the verbal reports collected to explore the changes in the players' cognition is described in Sect. 2. In Sect. 3, we plot the temporal changes in the respective factors based on the coding outcomes for the verbal reports. As a result, four items were suggested as common cognitive transformations in the skill acquisition process for change-of-direction performance: (1) goal setting for skill acquisition, (2) experience of change in running direction, (3) experience of speed and acceleration, and (4) experience of the movement of the lower extremities, such as the legs and hip joints. In addition, cognitive transformation varied with the degree of skill acquisition for change-of-direction performance. It was indicated that paying too much attention to body feelings, including the position of and shift in the center of gravity of the body, posed an obstacle to skill acquisition for change-of-direction performance.
Hiroshi FUJIWARA Yuichi SHIRAI Hiroaki YAMAMOTO
The construction of a Huffman code can be understood as the problem of finding a full binary tree such that each leaf is associated with a linear function of the depth of the leaf and the sum of the function values is minimized. Fujiwara and Jacobs extended this to a general function and proved the extended problem to be NP-hard. The authors also showed the case where the functions associated with leaves are each non-decreasing and convex is solvable in polynomial time. However, the complexity of the case of non-decreasing non-convex functions remains unknown. In this paper we try to reveal the complexity by considering non-decreasing non-convex functions each of which takes the smaller value of either a linear function or a constant. As a result, we provide a polynomial-time algorithm for two subclasses of such functions.
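The classical special case mentioned at the start of the abstract, where leaf i contributes the linear function w_i * depth_i, is solved by Huffman's greedy algorithm. The sketch below computes only the minimum total cost for that linear case; it does not implement the extended problem or the polynomial-time algorithm of the paper, and the name huffman_cost is chosen here for illustration.

```python
import heapq

def huffman_cost(weights):
    """Minimum of sum(w_i * depth_i) over full binary trees with one leaf per weight."""
    if len(weights) < 2:
        return 0  # a single leaf is the root and sits at depth 0
    heap = list(weights)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)  # the two lightest subtrees are merged first
        b = heapq.heappop(heap)
        total += a + b           # every leaf below the new node gets one level deeper
        heapq.heappush(heap, a + b)
    return total
```

Each merge adds the combined weight once, which accounts for the one extra level of depth that all leaves under the merged node receive; summing over all merges therefore yields the weighted depth sum.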
A new adaptive binarization method is proposed for the vehicle state images obtained from the intelligent operation and maintenance system of rail transit. The method makes it possible to check the corresponding vehicle status information in the system more quickly and effectively, to track and monitor the vehicle operation status in real time, and to improve the emergency response ability of the system. The proposed method has two main advantages. For decolorization, we use contrast preserving decolorization[1] to obtain an appropriate ratio of R, G, and B for the grayscale conversion of the RGB image, which retains as much of the color information of the background of the vehicle state images as possible and maintains the contrast between the foreground and the background. For threshold selection, the mean and standard deviation of the gray values corresponding to the multi-color background of the vehicle state images are obtained by major cluster estimation[2], and the adaptive threshold for binarization is determined by the 2 sigma principle, which effectively extracts text, identifiers, and other target information. The experimental results show that, for vehicle state images with rich background color information, this method is better than traditional binarization methods, such as the threshold-based global threshold Otsu algorithm[3] and local threshold Sauvola algorithm[4],[5], and the statistical-learning-based Mean-Shift algorithm[6], K-Means algorithm[7], and Fuzzy C Means algorithm[8]. As an image preprocessing scheme for intelligent rail transit data verification, the method effectively improves the accuracy of text and identifier recognition, as verified by optical character recognition on a data set containing images of different vehicle statuses.
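The threshold-selection step can be illustrated with a minimal sketch of the 2 sigma principle. Several things here are assumptions rather than the paper's method: the background gray values are passed in directly instead of being found by major cluster estimation[2], a single background cluster is used, and pixels outside the band mean ± 2·std are treated as foreground. The function name two_sigma_binarize is hypothetical.

```python
from statistics import mean, pstdev

def two_sigma_binarize(gray, background_samples):
    """Mark pixels outside mean ± 2*std of the background as foreground (1), else 0."""
    mu = mean(background_samples)
    sigma = pstdev(background_samples)  # population standard deviation of the background
    low, high = mu - 2 * sigma, mu + 2 * sigma
    return [[0 if low <= px <= high else 1 for px in row] for row in gray]
```

With background samples [100, 102, 98, 100], the band is roughly 97.2 to 102.8, so a pixel of value 30 or 103 is kept as foreground while 99 and 100 are suppressed. Because the band adapts to the spread of the background, a noisier background automatically receives a wider tolerance.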