Qi ZHANG Hiroaki SASAKI Kazushi IKEDA
Estimation of the gradient of the logarithm of a probability density function is a versatile tool in statistical data analysis. A recent method for mode-seeking clustering called least-squares log-density gradient clustering (LSLDGC) [Sasaki et al., 2014] employs a sophisticated gradient estimator, which directly estimates the log-density gradients without going through density estimation. However, the typical implementation of LSLDGC is based on a spherical Gaussian function, which may not work well when the probability density function of the data has highly correlated local structures. To cope with this problem, we propose a new estimator for log-density gradients based on Gaussian mixture models (GMMs). Covariance matrices in GMMs enable the new estimator to capture the highly correlated structures. By applying the new gradient estimator to mode-seeking clustering and hierarchical clustering, we experimentally demonstrate the usefulness of our clustering methods over existing methods.
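As a point of reference for the quantity being estimated, the following minimal sketch evaluates the log-density gradient of a known GMM in closed form, grad log p(x) = sum_k r_k(x) Sigma_k^{-1} (mu_k - x), and follows it by plain gradient ascent to reach a mode. It is not the least-squares estimator described above; the function names, step size, and toy mixture are illustrative assumptions.

```python
import numpy as np

def gmm_log_density_gradient(x, weights, means, covs):
    """Gradient of log p(x) for p(x) = sum_k w_k N(x; mu_k, Sigma_k)."""
    x = np.asarray(x, dtype=float)
    d = x.size
    log_terms, pulls = [], []
    for w, mu, cov in zip(weights, means, covs):
        diff = np.asarray(mu, dtype=float) - x
        prec_diff = np.linalg.solve(cov, diff)      # Sigma^{-1} (mu - x)
        _, logdet = np.linalg.slogdet(cov)
        # log of w_k * N(x; mu_k, Sigma_k), kept in log space for stability
        log_terms.append(np.log(w) - 0.5 * (diff @ prec_diff + logdet + d * np.log(2 * np.pi)))
        pulls.append(prec_diff)
    log_terms = np.array(log_terms)
    resp = np.exp(log_terms - log_terms.max())      # responsibilities r_k(x)
    resp /= resp.sum()
    return resp @ np.array(pulls)

def seek_mode(x0, weights, means, covs, step=0.05, n_iter=500):
    """Plain gradient ascent on log p(x); step and n_iter are illustrative."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_iter):
        x = x + step * gmm_log_density_gradient(x, weights, means, covs)
    return x

# Toy mixture with strongly correlated (non-spherical) components.
cov = np.array([[1.0, 0.9], [0.9, 1.0]])
weights = [0.5, 0.5]
means = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]
covs = [cov, cov]
mode_a = seek_mode([0.5, -0.3], weights, means, covs)
mode_b = seek_mode([4.5, 5.2], weights, means, covs)
```

Points whose ascent paths end at the same mode would be assigned to the same cluster; the full covariance matrices are what let the gradient follow the correlated direction of each component.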
This paper presents a rigorous analysis of the electromagnetic scattering and transmission of misaligned dual metallic grating screens. The Fourier transform and the mode-matching technique are employed to obtain an analytical solution. Numerical results show that misaligned dual metallic grating screens exhibit asymmetric scattering and transmission properties with respect to the scattering and transmission angles. Parametric studies are conducted in terms of the lateral displacement and vertical distance between the dual metallic grating screens. For validation, the proposed method is compared with a numerical simulation, and good agreement is obtained.
A 3Gbps/lane transmission buffer chip including a high-speed mode detector is proposed for a field-programmable gate array (FPGA)-based frame generator supporting the mobile industry processor interface (MIPI) D-PHY version 1.2. It performs 1-to-3 repeat while buffering low voltage differential signaling (LVDS) or scalable low voltage signaling (SLVS) to SLVS.
Akihito TAYA Takayuki NISHIO Masahiro MORIKURA Koji YAMAMOTO
Sharing perceptual data (e.g., camera and LiDAR data) with other vehicles enhances the traffic safety of autonomous vehicles because it helps vehicles locate other vehicles and pedestrians in their blind spots. Such safety applications require high throughput and short delay, which cannot be achieved by conventional microwave vehicular communication systems. Therefore, millimeter-wave (mmWave) communications are considered to be a key technology for sharing perceptual data because of their wide bandwidth. One of the challenges of data sharing in mmWave communications is broadcasting, because narrow-beam directional antennas are used to obtain high gain. Because many vehicles should share their perceptual data with others within a short time frame in order to enlarge the areas that can be perceived based on shared perceptual data, an efficient scheduling scheme for concurrent transmission that improves spatial reuse is required for perceptual data sharing. This paper proposes a data sharing algorithm that employs graph-based concurrent transmission scheduling. The proposed algorithm realizes concurrent transmission to improve spatial reuse by designing a rule that determines whether two transmitter-receiver pairs interfere with each other, taking into account the radio propagation characteristics of narrow-beam antennas. A prioritization method that considers the geographical information in perceptual data is also designed to enlarge perceivable areas in situations where data sharing time is limited and not all data can be shared. Simulation results demonstrate that the proposed algorithm doubles the area of the cooperatively perceivable region compared with a conventional algorithm that does not consider mmWave communications, because the proposed algorithm achieves high-throughput transmission by improving spatial reuse. The prioritization also enlarges the perceivable region by a maximum of 20%.
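The idea of graph-based concurrent scheduling can be sketched as follows, assuming the interference rule has already been evaluated and is supplied as a set of conflicting link pairs. The greedy slot assignment, the function name, and the link labels are illustrative assumptions rather than the algorithm of the paper; links are processed in the given order, so a prioritization step would simply reorder the input list.

```python
def schedule_links(links, conflicts):
    """Greedily pack links into time slots so that no slot holds a conflicting pair.

    links: link identifiers in priority order (highest priority first).
    conflicts: set of frozenset({a, b}) pairs that must not transmit together.
    """
    slots = []
    for link in links:
        for slot in slots:
            # Reuse an existing slot only if the link conflicts with nobody in it.
            if all(frozenset((link, other)) not in conflicts for other in slot):
                slot.append(link)
                break
        else:
            slots.append([link])  # no compatible slot found, open a new one
    return slots

# Hypothetical example: the first two links interfere, the third is compatible with both.
links = ["A->B", "C->D", "E->F"]
conflicts = {frozenset(("A->B", "C->D"))}
slots = schedule_links(links, conflicts)
```

Here three links fit into two slots instead of three, which is the spatial-reuse gain that concurrent transmission aims for.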
Pietro NANNIPIERI Gianmarco DINELLI Luca FANUCCI
Data rate requirements, from consumer applications to automotive and aerospace, have grown rapidly in recent years. This has led to the development of a series of communication protocols (e.g., Ethernet, PCI-Express, RapidIO and SpaceFibre) that use more than one communication lane, both to increase the data rate and to improve link reliability. Some of these protocols, such as SpaceFibre, are able to detect real-time changes in the number of active lanes and to adapt the data flow accordingly, providing a flexible solution that is robust to lane failures. This results in a data path whose width varies in real time in the lower layers of the data handling system. The aim of this paper is to propose the architecture of a hardware block capable of reading a fixed number of words from a host FIFO and shaping them into a real-time variable number of words equal to the number of active lanes.
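The behavior of such a block can be illustrated with a small software model, offered here as a sketch under assumptions rather than the proposed hardware architecture: fixed-size groups of words are read from the host side, buffered, and emitted in groups whose size follows the number of active lanes in each output cycle. The function and parameter names are hypothetical.

```python
from collections import deque

def shape_words(fifo_reads, lanes_per_cycle):
    """Software model of a fixed-width to variable-width word shaper.

    fifo_reads: iterable of fixed-size word groups read from the host FIFO.
    lanes_per_cycle: number of active lanes in each successive output cycle.
    Returns (output groups, words left in the internal buffer).
    """
    buffer = deque()
    reads = iter(fifo_reads)
    out = []
    for lanes in lanes_per_cycle:
        # Refill from the FIFO until one word per active lane is available.
        while len(buffer) < lanes:
            group = next(reads, None)
            if group is None:
                break
            buffer.extend(group)
        if len(buffer) < lanes:
            break  # FIFO exhausted: stall instead of emitting a partial group
        out.append([buffer.popleft() for _ in range(lanes)])
    return out, list(buffer)

# Four-word FIFO reads; the link drops from 4 active lanes to 3 and then to 1.
groups, leftover = shape_words([[0, 1, 2, 3], [4, 5, 6, 7]], [4, 3, 1])
```

Word order is preserved across lane-count changes, which is the property the shaping block has to maintain when lanes fail or recover.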
Ruicong ZHI Hairui XU Ming WAN Tingting LI
Facial micro-expressions are momentary and subtle facial reactions, and it is still challenging to recognize them automatically with high accuracy in practical applications. Extracting spatiotemporal features from facial image sequences is essential for facial micro-expression recognition. In this paper, we employed 3D Convolutional Neural Networks (3D-CNNs) for self-learning feature extraction to represent facial micro-expressions effectively, since 3D-CNNs can extract spatiotemporal features from facial image sequences well. Moreover, transfer learning was utilized to deal with the problem of insufficient samples in facial micro-expression databases. We first pre-trained the 3D-CNNs on the normal facial expression database Oulu-CASIA by supervised learning; the pre-trained model was then transferred to the target domain, namely the facial micro-expression recognition task. The proposed method was evaluated on two available facial micro-expression datasets, i.e., CASME II and SMIC-HS. We obtained an overall accuracy of 97.6% on CASME II and 97.4% on SMIC, which were 3.4% and 1.6% higher than the 3D-CNNs model without transfer learning, respectively. The experimental results demonstrated that our method achieved superior performance compared to state-of-the-art methods.
Gang WANG Min-Yao NIU Jian GAO Fang-Wei FU
In this letter, as a generalization of Luo et al.'s constructions, a construction of codebooks that asymptotically meet the Welch bound is proposed. The parameters of the codebooks presented in this letter are new in some cases.
The efficiency of four-wave mixing (FWM) generation from phase-modulated (PM) optical signals is studied. An analysis that takes into account the bit shifts occurring during fiber propagation due to group velocity differences indicates that the FWM efficiency from PM signals is smaller than that from continuous waves in fiber transmission lines that are longer than the walk-off length between the transmitted optical signals.
In this paper, we consider a group testing (GT) problem. We derive a lower bound on the probability of error in decoding defective binary signals. To this end, we exploit Fano's inequality from information theory. We show that the probability of error is bounded in terms of an entropy function, the density of the pooling matrix and the sparsity of the binary signal. We show that decoding highly sparse signals requires a dense pooling matrix. Conversely, if dense signals are to be decoded, a sparse pooling matrix should be designed to achieve a small probability of error.
Cheng LUO Wei CAO Lingli WANG Philip H. W. LEONG
With the continuous refinement of Deep Neural Networks (DNNs), a series of deep and complex networks such as Residual Networks (ResNets) show impressive prediction accuracy in image classification tasks. Unfortunately, the structural complexity and computational cost of residual networks make hardware implementation difficult. In this paper, we present the quantized and reconstructed deep neural network (QR-DNN) technique, which first inserts batch normalization (BN) layers in the network during training, and later removes them to facilitate efficient hardware implementation. Moreover, an accurate and efficient residual network accelerator (RNA) is presented based on QR-DNN with batch-normalization-free structures and weights represented in a logarithmic number system. RNA employs a systolic array architecture to perform shift-and-accumulate operations instead of multiplication operations. QR-DNN is shown to achieve a 1∼2% improvement in accuracy over existing techniques, and RNA over previous best fixed-point accelerators. An FPGA implementation on a Xilinx Zynq XC7Z045 device achieves 804.03 GOPS, 104.15 FPS and 91.41% top-5 accuracy for the ResNet-50 benchmark, and state-of-the-art results are also reported for AlexNet and VGG.
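The shift-and-accumulate idea can be sketched in a few lines, assuming each weight is rounded to a signed power of two and activations are non-negative integers (for example, after a ReLU). This illustrates the arithmetic only, not the RNA datapath or the QR-DNN quantization scheme; the function names are hypothetical.

```python
import math

def quantize_log2(w):
    """Approximate w by sign * 2**exponent; returns (sign, exponent)."""
    if w == 0:
        return 0, 0
    return (1 if w > 0 else -1), round(math.log2(abs(w)))

def shift_accumulate(activations, log_weights):
    """Dot product of integer activations with power-of-two weights, using shifts only."""
    acc = 0
    for a, (sign, exp) in zip(activations, log_weights):
        # A left shift multiplies by 2**exp; a right shift divides and truncates.
        term = a << exp if exp >= 0 else a >> -exp
        acc += sign * term
    return acc

weights = [4.0, -2.0, 0.5]
log_weights = [quantize_log2(w) for w in weights]
result = shift_accumulate([3, 5, 8], log_weights)  # 3*4 - 5*2 + 8*0.5
```

Because every product reduces to a shift, a processing element needs a shifter and an adder instead of a multiplier, at the cost of the rounding introduced by the power-of-two approximation.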
Yu HUANG Zhiheng ZHOU Tianlei WANG Qian CAO Junchu HUANG Zirong CHEN
Vehicle detection is challenging in natural traffic scenes because occlusion is common. Because of occlusion, a detector's training strategy may lead to a mismatch between features and labels. As a result, some predicted bounding boxes may shift toward surrounding vehicles and receive lower confidences. These bounding boxes lead to a lower average precision (AP). In this letter, we propose a new approach to address this problem. We calculate the center of the visible part of the current vehicle based on road information. Then a variable-radius Gaussian weight based method is applied to reweight each anchor box in the loss function, based on the center of the visible part, during the training of SSD. The reweighted method is able to predict higher confidences and more accurate bounding boxes. In addition, the model is fast and can be trained end-to-end. Experimental results show that our proposed method outperforms some competitive methods in terms of speed and accuracy.
Osamu FURUKAWA Hideo SHIDA Shin-ichiro TEZUKA Satoshi MATSUURA Shoji ADACHI
A Brillouin optical correlation domain reflectometry (BOCDR) system that can set the measuring point to arbitrary positions, in any order, along an optical fiber (i.e., random accessibility) is proposed for measuring dynamic strain and is experimentally evaluated. This random-access system can allocate measurement bandwidth to each measuring point by assigning it a share of the total number of strain measurements. The assigned number need not be equal across measuring points; it is set as necessary for multiple objects with different natural frequencies. To verify the system, the strain of two vibrating objects with different natural frequencies was measured with a single optical fiber attached to both objects. The system allocated an appropriate measurement bandwidth to each object and simultaneously measured the dynamic strain corresponding to the vibrating objects.
Sornxayya PHETLASY Satoshi OHZAHATA Celimuge WU Toshihito KATO
An intrusion detection system (IDS) is a device or software that monitors a network system for malicious activity. In terms of detection results, there are two types of error: the false positive (FP), in which normal traffic is incorrectly detected as abnormal, and the false negative (FN), in which malicious traffic is incorrectly judged as normal. To protect the network system, FN should be minimized. However, since there is a trade-off between FP and FN when an IDS detects malicious traffic, it is difficult to reduce both metrics simultaneously. In this paper, we propose a sequential classifier combination method to reduce the effect of this trade-off. A single classifier generally suffers from a high FN rate; therefore, additional classifiers are sequentially combined in order to detect more positives (i.e., to reduce FN further). Since each classifier can reduce FN and does not generate many FPs in our approach, we can achieve a reduction of FN at the final output. In the evaluations, we use the NSL-KDD dataset, which is an updated version of the KDD Cup'99 dataset. WEKA is utilized as the classification tool in the experiments, and the results show that the proposed approach can reduce FN while improving the sensitivity and accuracy.
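One way to read the sequential combination is as a cascade in which traffic judged normal by one classifier is passed on to the next, so that a sample is reported as malicious if any stage flags it. The sketch below follows that reading; the exact combination rule is not given in the abstract, and the toy threshold classifiers and feature names are made up for illustration.

```python
def sequential_detect(sample, classifiers):
    """Return True (malicious) as soon as any stage flags the sample."""
    for clf in classifiers:
        if clf(sample):
            return True
    return False  # every stage judged the sample normal

def count_false_negatives(samples, labels, classifiers):
    """Number of malicious samples (label True) that the cascade reports as normal."""
    return sum(
        1 for s, y in zip(samples, labels)
        if y and not sequential_detect(s, classifiers)
    )

# Hypothetical stages: each looks at a different (made-up) traffic feature.
stage1 = lambda s: s["bytes"] > 1000
stage2 = lambda s: s["failed_logins"] > 3

samples = [
    {"bytes": 5000, "failed_logins": 0},  # malicious, caught by stage 1
    {"bytes": 200, "failed_logins": 9},   # malicious, caught only by stage 2
    {"bytes": 150, "failed_logins": 0},   # normal
]
labels = [True, True, False]
fn_single = count_false_negatives(samples, labels, [stage1])
fn_cascade = count_false_negatives(samples, labels, [stage1, stage2])
```

Adding a stage can only keep or lower the FN count under this rule, while the FP count can only stay the same or grow, which is why each added classifier should generate few false positives.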
Tomoya KAWAKAMI Tomoki YOSHIHISA Yuuichi TERANISHI
In this paper, we propose a method to construct a scalable sensor data stream delivery system that guarantees the specified delivery quality of service (i.e., total reachability to destinations), even when delivery server resources (nodes) are in a heterogeneous churn situation. A number of P2P-based methods have been proposed for constructing a scalable and efficient sensor data stream system that accommodates different delivery cycles by distributing communication loads of the nodes. However, no existing method can guarantee delivery quality of service when the nodes on the system have a heterogeneous churn rate. As an extension of existing methods, which assign relay nodes based on the distributed hashing of the time-to-deliver, our method specifies the number of replication nodes, based on the churn rate of each node and on the relevant delivery paths. Through simulations, we confirmed that our proposed method can guarantee the required reachability, while avoiding any increase in unnecessary resource assignment costs.
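To illustrate how a churn rate can translate into a replica count, consider the simplified assumption that each replica of a relay node leaves independently with probability p during a delivery cycle; a hop then remains reachable with probability 1 - p^r when r replicas are assigned. The sketch below picks the smallest r meeting a target. This independence model and the function name are assumptions for illustration; the abstract does not spell out the proposed method's actual rule, which also accounts for the delivery paths.

```python
def replicas_needed(churn_prob, target_reachability, max_replicas=64):
    """Smallest r with 1 - churn_prob**r >= target_reachability (independent churn assumed)."""
    if not 0.0 <= churn_prob < 1.0:
        raise ValueError("churn_prob must be in [0, 1)")
    if not 0.0 <= target_reachability < 1.0:
        raise ValueError("target_reachability must be in [0, 1)")
    replicas = 1
    all_fail = churn_prob  # probability that every replica has left
    while 1.0 - all_fail < target_reachability:
        replicas += 1
        all_fail *= churn_prob
        if replicas > max_replicas:
            raise RuntimeError("target not reachable within max_replicas")
    return replicas

# Heterogeneous churn: a stable node needs fewer replicas than an unstable one.
stable = replicas_needed(0.1, 0.995)
unstable = replicas_needed(0.5, 0.9)
```

Sizing the replica count per node in this way avoids assigning the worst-case number of replicas everywhere, which is the resource saving that churn-aware assignment targets.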
A novel image enhancement method for vein recognition is introduced. Inspired by the observation that the intensity of a vein vessel changes rapidly during the smoothing process compared to that of the background (i.e., skin tissue) due to its thin and long shape, we propose to exploit the smoothing speed as a restoration weight for vein image enhancement. Experimental results based on the CASIA multispectral palm vein database demonstrate that the proposed method is effective in improving the performance of vein recognition.
The affine projection sign algorithm (APSA) is an important adaptive filtering method for impulsive noise environments. However, the performance of APSA is poor if its regularization parameter is not well chosen. We propose a variable regularization APSA (VR-APSA) approach, which adopts a gradient-based method to recursively reduce the norm of the a priori error vector. The resulting VR-APSA leverages the time correlation of both the input signal matrix and the error vector to adjust the value of the regularization parameter. Simulation results confirm that our algorithm exhibits both fast convergence and small misadjustment.
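For context, the conventional APSA update with a fixed regularization parameter can be sketched as below: with input matrix X, a priori error vector e = d - X^T w, and x_s = X sgn(e), the weights move by mu * x_s / sqrt(x_s^T x_s + delta). This is the baseline with a constant delta, not the variable-regularization rule proposed above; the step size, filter length, and noise model are illustrative assumptions.

```python
import numpy as np

def apsa_identify(x, d, n_taps, order=2, mu=0.005, delta=1e-6):
    """Fixed-regularization APSA for system identification."""
    w = np.zeros(n_taps)
    for k in range(n_taps + order - 2, len(x)):
        # Columns hold the `order` most recent regressors, newest first.
        X = np.column_stack(
            [x[k - p - n_taps + 1 : k - p + 1][::-1] for p in range(order)]
        )
        e = d[k - np.arange(order)] - X.T @ w   # a priori error vector
        xs = X @ np.sign(e)                     # only the error signs are used
        w = w + mu * xs / np.sqrt(xs @ xs + delta)
    return w

# Toy system with small Gaussian noise plus sparse, large impulses.
rng = np.random.default_rng(0)
n = 5000
w_true = np.array([0.5, -0.3, 0.2, 0.1])
x = rng.standard_normal(n)
impulses = (rng.random(n) < 0.01) * 50.0 * rng.standard_normal(n)
d = np.convolve(x, w_true)[:n] + 0.01 * rng.standard_normal(n) + impulses
w_hat = apsa_identify(x, d, n_taps=4)
```

Because only the sign of the error enters the update, a single impulse moves the weights by at most about mu in norm, which is the source of the algorithm's robustness; delta mainly matters when x_s is small.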
In sparsity-based optimization problems for two-dimensional (2-D) direction-of-arrival (DOA) estimation using L-shaped nested arrays, one of the major issues is computational complexity. A 2-D DOA estimation algorithm is proposed based on reconstitution sparse Bayesian learning (RSBL) and cross covariance matrix decomposition. A single measurement vector (SMV) model is obtained from the difference coarray corresponding to a one-dimensional nested array. Through spatial smoothing, the signal measurement vector is transformed into a multiple measurement vector (MMV) matrix. The signal matrix is separated by singular value decomposition (SVD) of the matrix. Using this method, the dimensionality of the sensing matrix and the data size can be reduced. The sparse Bayesian learning algorithm is used to estimate the one-dimensional angles. Using the one-dimensional angle estimates, the steering vector matrix is reconstructed. The cross covariance matrix of the two dimensions is decomposed and transformed. Then a closed-form expression of the steering vector matrix of the other dimension is derived, and the angles are estimated. Automatic pairing can be achieved in two dimensions. Through the proposed algorithm, the 2-D search problem is transformed into a one-dimensional search problem and a matrix transformation problem. Simulations show that the proposed algorithm has better angle estimation accuracy than the traditional two-dimensional direction finding algorithm at low signal-to-noise ratios and with few samples.
Naruki SHINOHARA Koji IGARASHI Kyo INOUE
Inter-channel crosstalk is one of the crucial issues in multichannel optical systems. Conventional studies assume that the crosstalk and the main signals have the same format. The present study, in contrast, considers different signal formats for the main and crosstalk lights, and shows that the bit-error degradation depends on the modulation format. Statistical properties of the crosstalk are also investigated. The results quantitatively confirm that a crosstalk light whose signal distribution is closer to a Gaussian profile causes larger degradation.
Antoniette MONDIGO Tomohiro UENO Kentaro SANO Hiroyuki TAKIZAWA
Since the hardware resource of a single FPGA is limited, one idea to scale the performance of FPGA-based HPC applications is to expand the design space with multiple FPGAs. This paper presents a scalable architecture of a deeply pipelined stream computing platform, where available parallelism and inter-FPGA link characteristics are investigated to achieve a scaled performance. For a practical exploration of this vast design space, a performance model is presented and verified with the evaluation of a tsunami simulation application implemented on Intel Arria 10 FPGAs. Finally, scalability analysis is performed, where speedup is achieved when increasing the computing pipeline over multiple FPGAs while maintaining the problem size of computation. Performance is scaled with multiple FPGAs; however, performance degradation occurs with insufficient available bandwidth and large pipeline overhead brought by inadequate data stream size. Tsunami simulation results show that the highest scaled performance for 8 cascaded Arria 10 FPGAs is achieved with a single pipeline of 5 stream processing elements (SPEs), which obtained a scaled performance of 2.5 TFlops and a parallel efficiency of 98%, indicating the strong scalability of the multi-FPGA stream computing platform.
Feng KE Xiaoyu HUANG Weiliang ZENG Yuqin LIU
Wireless powered communication networks (WPCNs) utilize the wireless energy transfer (WET) technique to facilitate the wireless information transmission (WIT) of nodes. We propose a two-step iterative algorithm to maximize the sum throughput of the users in a MIMO WPCN with discrete signal inputs. First, the optimal solution of a convex power allocation problem is found for a fixed time allocation; second, a semi-closed-form solution for the optimal time allocation is obtained for a fixed power allocation matrix. By optimizing the power allocation and time allocation alternately, the two-step algorithm converges to a locally optimal point. Simulation results show that the proposed algorithm outperforms conventional schemes, which consider only Gaussian inputs.