Sou NOBUKAWA Hirotaka DOHO Natsusaku SHIBATA Haruhiko NISHIMURA Teruya YAMANISHI
Fluctuations in nonlinear systems can enhance synchronization with weak input signals. These nonlinear synchronization phenomena are classified as stochastic resonance and chaotic resonance. Many applications of stochastic resonance have been realized, exploiting its ability to enhance signal sensitivity. However, although some studies have shown that the sensitivity of chaotic resonance is higher than that of stochastic resonance, only a few studies have investigated engineering applications of chaotic resonance. A possible reason is that, in chaotic resonance, the chaotic state must be adjusted through internal parameters to reach the state that allows resonance. In many cases, and especially in biological systems, such adjustments are difficult to perform externally. To overcome this difficulty, we developed a method that uses an external feedback signal to steer the chaotic state into a state appropriate for chaotic resonance. The method is called the reducing the range of orbit (RRO) feedback method. Previously, we developed the RRO feedback method for discrete chaotic systems. However, to apply the RRO feedback method to actual chaotic systems, including biological systems, RRO feedback signals for continuous chaotic systems must be developed. Therefore, in this study, we extended the RRO feedback method to continuous chaotic systems by focusing on the map function on the Poincaré section. We applied the extended RRO feedback method to Chua's circuit as a continuous chaotic system. The results confirmed that the RRO feedback signal can induce chaotic resonance. This study is the first to report the application of RRO feedback to a continuous chaotic system. The results of this study will facilitate further device development based on chaotic resonance.
Ryuta KAWANO Ryota YASUDO Hiroki MATSUTANI Michihiro KOIBUCHI Hideharu AMANO
Recently proposed irregular networks can reduce the latency of both on-chip and off-chip systems with a large number of computing nodes and thus can improve the performance of parallel applications. However, these networks usually suffer from deadlocks when packets are routed with a naive minimal path routing algorithm. To solve this problem, we focus on a recently proposed theory that generalizes the turn model to maintain network performance while guaranteeing deadlock freedom. However, applying its theorems to arbitrary topologies, including fully irregular networks, remains a challenge. In this paper, we extend the theorems to completely general ones. Moreover, we provide a feasible implementation of a deadlock-free routing method based on our extended theorem. Experimental results show that the routing method based on our proposed theorem can improve the network throughput by up to 138% compared with a conventional deterministic minimal routing method. Moreover, when used as the escape path in Duato's protocol, it can improve the throughput by up to 26.3% compared with the conventional up*/down* routing.
Qian CHENG Jiang ZHU Tao XIE Junshan LUO Zuohong XU
A low-complexity time-invariant angle-range dependent directional modulation (DM) based on time-modulated frequency diverse array (TM-FDA-DM) is proposed to achieve point-to-point physical layer security communications. The principle of TM-FDA is elaborated and the vector synthesis method is utilized to realize the proposal, TM-FDA-DM, where normalization and orthogonal matrices are designed to modulate the useful baseband symbols and inserted artificial noise, respectively. Since the two designed matrices are time-invariant fixed values, which avoid real-time calculation, the proposed TM-FDA-DM is much easier to implement than time-invariant DMs based on conventional linear FDA or logarithmical FDA, and it also outperforms the time-invariant angle-range dependent DM that utilizes genetic algorithm (GA) to optimize phase shifters on radio frequency (RF) frontend. Additionally, a robust synthesis method for TM-FDA-DM with imperfect angle and range estimations is proposed by optimizing normalization matrix. Simulations demonstrate that the proposed TM-FDA-DM exhibits time-invariant and angle-range dependent characteristics, and the proposed robust TM-FDA-DM can achieve better BER performance than the non-robust method when the maximum range error is larger than 7km and the maximum angle error is larger than 4°.
Changyan ZHENG Tieyong CAO Jibin YANG Xiongwei ZHANG Meng SUN
Compared with acoustic microphone (AM) speech, bone-conducted microphone (BCM) speech is much more immune to background noise, but it suffers from severe loss of information due to the characteristics of the human-body transmission channel. In this letter, a new method for speaker-dependent BCM speech enhancement is proposed, in which we focus on restoring the spectra of the distorted speech. To better infer the missing components, an attention-based bidirectional Long Short-Term Memory (AB-BLSTM) network is designed to make better use of contextual information when modeling the relationship between the spectra of BCM speech and those of the corresponding clean AM speech. In addition, a structural error metric originating from image processing, the Structural SIMilarity (SSIM) metric, is proposed as the loss function; it constrains the spectro-temporal structures when recovering the spectra. Experiments demonstrate that, compared with approaches based on a conventional DNN and mean square error (MSE), the proposed method better recovers the missing phonemes and obtains spectra whose spectro-temporal structure is more similar to that of the target, which leads to a large improvement in objective metrics.
Taku YAMAZAKI Ryo YAMAMOTO Genki HOSOKAWA Tadahide KUNITACHI Yoshiaki TANAKA
In wireless multi-hop networks such as ad hoc networks and sensor networks, backoff-based opportunistic routing protocols, which make forwarding decisions based on a backoff time, have been proposed. In these protocols, each potential forwarder calculates its backoff time as the product of a weight and a global scaling factor. The weight prioritizes potential forwarders and is calculated from the hop counts of the sender and the receiver to the destination. The global scaling factor is a predetermined value that maps the weight to the actual backoff time. However, the global scaling factor gives rise to three common issues. First, the predetermined global scaling factor must be properly shared among all terminals in a centralized manner so that they can calculate the backoff time. Second, it is almost impossible to change the global scaling factor while the network is in use. Third, it is difficult to set the global scaling factor to an appropriate value because the appropriate value differs according to the local surroundings of each forwarder. To address these issues, this paper proposes a novel decentralized local scaling factor control that does not rely on a predetermined global scaling factor. The proposed method consists of the following three mechanisms: (1) a sender-centric mechanism that sets a local scaling factor in a decentralized manner in place of the global scaling factor, (2) an adaptive scaling factor control mechanism that adapts the local scaling factor to the local surroundings of each forwarder, and (3) a mechanism that mitigates excessive increases in the local scaling factor so that it converges. Finally, this paper evaluates the backoff-based opportunistic routing protocol with and without the proposed method using computer simulations.
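The backoff rule described in the abstract above (backoff time = weight × scaling factor) can be sketched as follows. This is a minimal illustration, not the protocol evaluated in the paper: the weight formula, the function names, and the millisecond unit are all assumptions introduced here, since the abstract only states that the weight is derived from the hop counts of the sender and the receiver.

```python
def forwarder_weight(sender_hops: int, receiver_hops: int) -> int:
    """Hypothetical weight: 0 for a receiver one hop closer to the
    destination than the sender, growing as the receiver gets farther.
    The real protocols may define the weight differently."""
    return max(0, receiver_hops - sender_hops + 1)


def backoff_time_ms(weight: int, scaling_factor_ms: int) -> int:
    """Backoff time as the product of the weight and a scaling factor."""
    return weight * scaling_factor_ms


def pick_forwarder(sender_hops: int, candidates: dict, scaling_factor_ms: int) -> str:
    """Return the candidate whose backoff timer would expire first.
    `candidates` maps a node name to its hop count to the destination."""
    return min(
        candidates,
        key=lambda node: backoff_time_ms(
            forwarder_weight(sender_hops, candidates[node]), scaling_factor_ms
        ),
    )


# Sender is 3 hops from the destination; three neighbours overhear the packet.
neighbours = {"A": 4, "B": 2, "C": 3}
chosen = pick_forwarder(3, neighbours, scaling_factor_ms=10)
```

In this sketch the scaling factor only stretches or compresses the gaps between timers. That is why its value matters in practice: too small and neighbouring timers collide, too large and forwarding delay grows.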
Takuma IWATA Kohei NAKAMURA Yuta TOKUSASHI Hiroki MATSUTANI
In statistical analysis and data mining, change-point detection, which identifies the times at which the probability distribution of a time series changes, has been used for various purposes, such as anomaly detection in network traffic and transaction data. However, the computation cost of a conventional AR (Auto-Regression) model based approach is too high for online use. In this paper, an AR model based online change-point detection algorithm called ChangeFinder is implemented on an FPGA (Field Programmable Gate Array) based NIC (Network Interface Card). The proposed system computes the change-point score from time series data received over 10GbE (10Gbit Ethernet). More specifically, it computes the change-point score at the 10GbE NIC before the data reach host applications. It can find change-points on single or multiple streams using a context memory. This paper aims to reduce the host workload and improve change-point detection performance by offloading the ChangeFinder algorithm from the host to the NIC. In the evaluation, change-point detection in the FPGA NIC is compared, in terms of throughput, with a baseline software implementation and with software implementations enhanced by two network optimization techniques, DPDK and Netfilter. The results demonstrate a 16.8x improvement in change-point detection throughput compared with the baseline software implementation, which corresponds to the 10GbE line rate. Performance and area overheads when supporting multiple streams are also evaluated.
Tomoyuki SASAKI Hidehiro NAKANO
Particle swarm optimization (PSO) is a swarm intelligence algorithm that offers good search performance and is simple to implement. Because of these properties, PSO has been applied to various optimization problems. However, the search performance of the classical PSO (CPSO) depends on the reference frame of the solution space for each objective function. CPSO is invariant under translation and scale changes of the reference frame but is rotationally variant. As such, the search performance of CPSO is worse on rotated problems than on non-rotated problems. Under reference frame invariance, the search performance of an optimization algorithm is independent of rotation, translation, or scale changes of the reference frame of the solution space, which is a desirable property for optimization algorithms. In our previous study, we proposed the piecewise-linear particle swarm optimizer (PPSO), which is effective in solving rotated problems. Because PPSO particles can move freely in solution spaces without depending on the coordinate system, the PPSO algorithm may have rotational invariance. However, the reference frame invariance of PPSO has not been analyzed theoretically. In addition, although the behavior of each particle depends on the PPSO parameters, good parameter conditions for solving various optimization problems have not been sufficiently clarified. In this paper, we analyze the reference frame invariance of PPSO theoretically and investigate whether PPSO is invariant under changes of the reference frame. We also clarify, through numerical simulations, the control parameters of PPSO that affect the movement of each particle and the performance of PPSO.
Yi GUO Heming SUN Ping LEI Shinji KIMURA
Approximate computing has emerged as a promising approach for error-tolerant applications to improve hardware performance at the cost of some loss of accuracy. Multiplication is a key arithmetic operation in these applications. In this paper, we propose a low-cost approximate multiplier design by employing new probability-driven inexact compressors. This compressor design is introduced to reduce the height of partial product matrix into two rows, based on the probability distribution of the sum result of partial products. To compensate the accuracy loss of the multiplier, a grouped error recovery scheme is proposed and achieves different levels of accuracy. In terms of mean relative error distance (MRED), the accuracy losses of the proposed multipliers are from 1.07% to 7.86%. Compared with the Wallace multiplier using 40nm process, the most accurate variant of the proposed multipliers can reduce power by 59.75% and area by 42.47%. The critical path delay reduction is larger than 12.78%. The proposed multiplier design has a better accuracy-performance trade-off than other designs with comparable accuracy. In addition, the efficiency of the proposed multiplier design is assessed in an image processing application.
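The MRED figure quoted in the abstract above can be made concrete with a short sketch. MRED is commonly computed as the mean of |approximate − exact| / exact over the tested operand pairs. The truncating multiplier below is only a stand-in used to exercise the metric and is not the compressor-based design proposed in the paper. The 4-bit operand width and the exclusion of zero operands (to avoid dividing by zero) are assumptions made here.

```python
def truncating_multiplier(a: int, b: int, dropped_bits: int = 2) -> int:
    """Stand-in approximate multiplier (NOT the paper's design):
    computes the exact product and clears its lowest `dropped_bits` bits."""
    return ((a * b) >> dropped_bits) << dropped_bits


def mred(approx_mul, bits: int = 4) -> float:
    """Mean relative error distance over all non-zero operand pairs
    of the given bit width."""
    total = 0.0
    count = 0
    for a in range(1, 2 ** bits):
        for b in range(1, 2 ** bits):
            exact = a * b
            total += abs(approx_mul(a, b) - exact) / exact
            count += 1
    return total / count


exact_mred = mred(lambda a, b: a * b)      # an exact multiplier has zero error
approx_mred = mred(truncating_multiplier)  # the stand-in has a non-zero MRED
```

Exhaustive enumeration is feasible only for small widths. For wider multipliers the same function can be driven by randomly sampled operand pairs instead.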
Yi-Xian YANG Kung-Jui PAI Ruay-Shiung CHANG Jou-Ming CHANG
A set of spanning trees of a graph G is called completely independent spanning trees (CISTs for short) if, for every pair of vertices x, y∈V(G), the paths joining x and y in any two trees have neither a vertex nor an edge in common, except x and y. Constructing CISTs has applications in interconnection networks, such as fault-tolerant routing and secure message transmission. In this paper, we investigate the problem of constructing two CISTs in the balanced hypercube BHn, which is a hypercube-variant network and is superior to the hypercube because it has a smaller diameter. As a result, the diameter of the CISTs we constructed equals 9 for BH2 and 6n-2 for BHn when n≥3.
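The CIST definition in the abstract above can be checked by brute force on small graphs. The sketch below is an illustration of the definition only, not the construction from the paper: it tests two spanning trees of the complete graph K4 rather than of a balanced hypercube, and the function names are hypothetical. It assumes each edge list really is a spanning tree on the given vertices.

```python
from itertools import combinations


def tree_path(tree_edges, start, goal):
    """Return the unique path from start to goal in a tree given as an
    edge list. Assumes the edges form a tree containing both vertices."""
    adjacency = {}
    for u, v in tree_edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)
    parent = {start: None}
    stack = [start]
    while stack:
        node = stack.pop()
        if node == goal:
            break
        for neighbour in adjacency.get(node, []):
            if neighbour not in parent:
                parent[neighbour] = node
                stack.append(neighbour)
    path = [goal]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path[::-1]


def are_completely_independent(tree_a, tree_b, vertices):
    """Check that, for every vertex pair, the two tree paths share no
    internal vertex and no edge."""
    for x, y in combinations(vertices, 2):
        path_a = tree_path(tree_a, x, y)
        path_b = tree_path(tree_b, x, y)
        if set(path_a[1:-1]) & set(path_b[1:-1]):
            return False
        edges_a = {frozenset(e) for e in zip(path_a, path_a[1:])}
        edges_b = {frozenset(e) for e in zip(path_b, path_b[1:])}
        if edges_a & edges_b:
            return False
    return True


vertices = [0, 1, 2, 3]
path_tree = [(0, 1), (1, 2), (2, 3)]    # Hamiltonian path 0-1-2-3
other_path = [(1, 3), (3, 0), (0, 2)]   # Hamiltonian path 1-3-0-2
star_tree = [(1, 0), (1, 2), (1, 3)]    # star centred at vertex 1

independent = are_completely_independent(path_tree, other_path, vertices)
not_independent = are_completely_independent(path_tree, star_tree, vertices)
```

The star fails because it shares the edges (0, 1) and (1, 2) with the path tree, so the two paths joining vertices 0 and 1 coincide.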
Fanxin ZENG Yue ZENG Lisheng ZHANG Xiping HE Guixin XUAN Zhenyu ZHANG Yanni PENG Linjie QIAN Li YAN
Sequences that attain the smallest possible absolute sidelobes (SPASs) of the periodic autocorrelation function (PACF) play important roles in the synchronization of communication systems, large-scale integrated circuit testing, and so on. This letter presents an approach to constructing 16-QAM sequences of even periods based on known quaternary sequences. A relationship between the PACFs of 16-QAM and quaternary sequences is established, by which the proposed 16-QAM sequences have good PACF when quaternary sequences that attain the SPASs of PACF are employed.
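For readers unfamiliar with the PACF mentioned in the abstract above, the sketch below computes it for a complex-valued sequence using the usual definition R(τ) = Σ s(n)·conj(s((n+τ) mod N)). It illustrates the definition only and does not reproduce the 16-QAM construction of the letter. The two example sequences are small textbook cases chosen here, and which factor carries the conjugate is a convention that varies between authors.

```python
def pacf(sequence, shift):
    """Periodic autocorrelation of a complex sequence at the given shift."""
    n = len(sequence)
    return sum(
        complex(sequence[i]) * complex(sequence[(i + shift) % n]).conjugate()
        for i in range(n)
    )


def sidelobe_magnitudes(sequence):
    """Absolute PACF values at every non-zero shift (the sidelobes)."""
    return [abs(pacf(sequence, shift)) for shift in range(1, len(sequence))]


binary_example = [1, 1, 1, -1]   # entries in {+1, -1}
quaternary_example = [1, 1j]     # entries in {1, j, -1, -j}

peak = abs(pacf(binary_example, 0))
binary_sidelobes = sidelobe_magnitudes(binary_example)
quaternary_sidelobes = sidelobe_magnitudes(quaternary_example)
```

Both examples have all-zero sidelobes, which is the ideal case. For periods where zero sidelobes are unattainable, the smallest attainable sidelobe magnitudes play the same role.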
Toan H. VU An DANG Jia-Ching WANG
We develop a deep neural network (DNN) for detecting driver drowsiness in videos. The proposed DNN model, which receives the driver's face extracted from video frames as input, consists of three components: a convolutional neural network (CNN), a convolutional control gate-based recurrent neural network (ConvCGRNN), and a voting layer. The CNN learns facial representations from global faces, which are then fed to the ConvCGRNN to learn their temporal dependencies. The voting layer works like an ensemble of many sub-classifiers to predict the drowsiness state. Experimental results on the NTHU-DDD dataset show that our model not only achieves a competitive accuracy of 84.81% without any post-processing but also works in real time at about 100 fps.
Kasho YAMAMOTO Masayuki IKEBE Tetsuya ASAI Masato MOTOMURA Shinya TAKAMAEDA-YAMAZAKI
An annealing processor based on the Ising model is a remarkable candidate for combinatorial optimization problems and it is superior to general von Neumann computers. CMOS-based implementations of the annealing processor are efficient and feasible based on current semiconductor technology. However, critical problems with annealing processors remain. There are few simulated spins and inflexibility in terms of implementable graph topology due to hardware constraints. A prior approach to overcoming these problems is to emulate a complicated graph on a simple and high-density spin array with so-called minor embedding, a spin duplication method based on graph theory. When a complicated graph is embedded on such hardware, numerous spins are consumed to represent high-degree spins by combining multiple low-degree spins. In addition to the number of spins, the quality of solutions decreases as a result of dummy strong connections between the duplicated spins. Thus, the approach cannot handle large-scale practical problems. This paper proposes a flexible and scalable hardware architecture with time-division multiplexing for massive spins and high-degree topologies. A target graph is separated and mapped onto multiple virtual planes, and each plane is subject to interleaved simulation with time-division processing. Therefore, the behavior of high-degree spins is efficiently emulated over time, so that no dummy strong connections are required, and the solution quality is accordingly improved. We implemented a prototype hardware design for FPGAs, and we evaluated the proposed method in a software-based annealing processor simulator. The results indicate that the method increased the spins that can be deployed. In addition, our time-division multiplexing architecture improved the solution quality and convergence time with reasonable resource consumption.
Baojun ZHAO Boya ZHAO Linbo TANG Baoxian WANG
With the introduction of convolutional neural networks into object detection, many computer vision tasks have achieved favorable results. To adapt to targets of various scales, deep feature pyramids are widely used, in the same way that traditional object detection methods detect different objects in a Gaussian image pyramid. However, because of the mismatch between the anchors and the feature distributions of targets, accurate detection of targets with various scales is still a challenge. Considering the differences between the theoretical receptive field and the effective receptive field, we propose a novel anchor generation method that takes the effective receptive field as the standard. The proposed method is evaluated on the PASCAL VOC dataset and shows favorable results.
Yunjie GU Yuehang DING Yuxiang HU
A Service Function Chain (SFC) is an ordered sequence of virtual network functions (VNFs) that provides a network service. Most existing SFC orchestration schemes, however, cannot optimize resource allocation while guaranteeing the service delay constraint. To achieve this goal, we propose a Layered Graph based SFC Orchestration Scheme (LGOS). LGOS converts both the resource cost and the related delay into link weights in the layered graph, which allows the SFC orchestration problem to be abstracted as a shortest path problem. A simulated annealing based batch processing algorithm is then designed for sets of SFC requests. Through extensive evaluations, we demonstrate that, compared with other algorithms, our scheme can reduce the end-to-end delay and the operational expenditure by at least 21.6% and 13.7%, respectively, and can improve the acceptance ratio of request sets by 22.3%.
Marcus WALLDEN Stefano MARKIDIS Masao OKITA Fumihiko INO
We propose a novel compositing pipeline and a dynamic load balancing technique for volume rendering which utilizes a two-layered group structure to achieve effective and scalable load balancing. The technique enables each process to render data from non-contiguous regions of the volume with minimal impact on the total render time. We demonstrate the effectiveness of the proposed technique by performing a set of experiments on a modern GPU cluster. The experiments show that using the technique results in up to a 35.7% lower worst-case memory usage as compared to a dynamic k-d tree load balancing technique, whilst simultaneously achieving similar or higher render performance. The proposed technique was also able to lower the amount of transferred data during the load balancing stage by up to 72.2%. The technique has the potential to be used in many scenarios where other dynamic load balancing techniques have proved to be inadequate, such as during large-scale visualization.
JianNan ZHANG JiJun ZHOU JianFeng WU ShengYing YANG
Convolutional neural networks (CNNs) have a strong ability to understand and judge images. However, the enormous number of parameters and the computation cost of CNNs have limited their application in resource-limited devices. In this letter, we use parameter sharing and dense connection to compress the parameters along the channel direction of the convolution kernel, thus greatly reducing the number of model parameters. On this basis, we design Shared and Dense Channel-wise Convolutional Networks (SDChannelNets), which are mainly composed of Depth-wise Separable SD-Channel-wise Convolution layers. The advantage of SDChannelNets is that the number of model parameters is greatly reduced with little or no loss of accuracy. We also introduce a hyperparameter that can effectively balance the number of parameters against the accuracy of a model. We evaluate the proposed model on two popular image recognition tasks (CIFAR-10 and CIFAR-100). The results show that SDChannelNets achieve accuracy similar to that of other CNNs with a greatly reduced number of parameters.
This letter presents ternary convolutional codes and their punctured codes with optimum distance spectrum.
This paper constructs packet-oriented erasure correcting codes and their systematic forms for the distributed storage systems. The proposed codes are encoded by exclusive OR and bit-level shift operation. By the shift operation, the encoded packets are slightly longer than the source packets. This paper evaluates the extra length of the encoded packets, called overhead, and shows that the proposed codes have smaller overheads than the zigzag decodable codes, which are existing codes using bit-level shift operation and exclusive OR.
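The general idea of encoding with exclusive OR and bit-level shifts, and why it lengthens the encoded packets, can be sketched as follows. This is a toy illustration in the spirit of zigzag-style decoding, not the codes proposed in the paper. The two-packet setting, the 1-bit shift, the use of Python integers as packets, and the function names are all assumptions made here.

```python
def encode(p1: int, p2: int):
    """Produce two encoded packets from two source packets.
    The second one XORs p1 with p2 shifted left by one bit, so it can
    be one bit longer than the sources (the overhead)."""
    return p1 ^ p2, p1 ^ (p2 << 1)


def zigzag_decode(c1: int, c2: int, length: int):
    """Recover both source packets bit by bit, starting from the lowest
    bit of c2, which depends on p1 alone because of the shift."""
    p1 = p2 = 0
    previous_p2_bit = 0  # bit 0 of (p2 << 1) is always 0
    for i in range(length):
        p1_bit = ((c2 >> i) & 1) ^ previous_p2_bit
        p2_bit = ((c1 >> i) & 1) ^ p1_bit
        p1 |= p1_bit << i
        p2 |= p2_bit << i
        previous_p2_bit = p2_bit
    return p1, p2


source_length = 4
p1, p2 = 0b1011, 0b1101
c1, c2 = encode(p1, p2)
decoded = zigzag_decode(c1, c2, source_length)
overhead_bits = c2.bit_length() - source_length
```

Decoding uses only XOR and bit tests, which is the appeal of this family of codes. The price is the extra bit or bits carried by the shifted packets, which is the overhead the abstract refers to.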
Tetsunao MATSUTA Tomohiko UYEMATSU
In this paper, we consider a source coding with side information partially used at the decoder through a codeword. We assume that there exists a relative delay (or gap) of the correlation between the source sequence and side information. We also assume that the delay is unknown but the maximum of possible delays is known to two encoders and the decoder, where we allow the maximum of delays to change by the block length. In this source coding, we give an inner bound and an outer bound on the achievable rate region, where the achievable rate region is the set of rate pairs of encoders such that the decoding error probability vanishes as the block length tends to infinity. Furthermore, we clarify that the inner bound coincides with the outer bound when the maximum of delays for the block length converges to a constant.
Jin MITSUGI Yuki SATO Yuusuke KAWAKITA Haruhisa ICHIKAWA
Backscatter wireless communications offer sensors advantages such as batteryless operation, a small form factor, and exemption from radio regulations. The major challenge ahead of backscatter wireless communications is synchronized multicarrier data collection, which can be realized by rejecting mutual harmonics among backscatters. This paper analyzes the mutual interference of digitally modulated multicarrier backscatter and finds that interference from higher-frequency subcarriers to lower-frequency subcarriers, which does not occur in analog-modulated multicarrier backscatter, is harmful when subcarriers are densely populated. This reverse interference distorts the harmonics replica and degrades the performance of the existing method, which rejects mutual interference among subcarriers with a processing gain of 5dB. To solve this problem, this paper analyzes the relationship between subcarrier spacing and reverse interference and reveals that an alternate channel spacing, with a channel separation of twice the bandwidth of a subcarrier, can provide reasonably dense subcarrier allocation and alleviate reverse interference. The idea is examined with prototype sensors in a wired experiment and in an indoor propagation experiment. The results reveal that, with alternate channel spacing, the reverse interference becomes practically negligible and the existing interference rejection method achieves its original processing gain of 5dB, with the packet error rate reduced to one hundredth.