Jingsong SHAN Jianxin LUO Guiqiang NI Yinjin FU Zhaofeng WU
Estimating the cardinality of flows over sliding windows on high-speed links remains challenging under time and space constraints. To solve this problem, we present a novel data structure that maintains a summary of the data and propose a constant-time update algorithm for quickly evicting expired information. Moreover, a further memory-reducing scheme is given at the cost of a very small loss of accuracy.
Shin-ichi NAKAYAMA Shigeru MASUYAMA
Given a graph G=(V,E), where V and E are the vertex set and the edge set, respectively, together with a subset VNT of vertices called a non-terminal set, a spanning tree with non-terminal set VNT is a connected and acyclic spanning subgraph of G that contains all the vertices of V and in which no vertex of VNT is a leaf. When each edge has a nonnegative integer weight, the problem of finding a minimum spanning tree with a non-terminal set VNT of G was known to be NP-hard. However, the complexity of finding such a spanning tree on general graphs where each edge has weight one was unknown. In this paper, we consider this problem and first show that it is NP-hard on general graphs even if each edge has weight one. We also show that if G is a cograph, then a spanning tree with a non-terminal set VNT of G can be found in linear time when each edge has weight one.
Wenming YANG Wenyang JI Fei ZHOU Qingmin LIAO
Automated biometric identification using finger vein images has attracted increasing interest among researchers, with emerging applications in human biometrics. The traditional feature-level fusion strategy is limited and expensive. To solve this problem, this paper investigates the use of infrared hybrid finger patterns on the back side of a finger, which include both finger vein and finger dorsal texture information in the original image, and a database using the proposed hybrid pattern is established. Accordingly, an Intersection enhanced Gabor based Direction Coding (IGDC) method is proposed. The experiment achieves a recognition rate of 98.4127% and an equal error rate of 0.00819 on our newly established database, which is fairly competitive.
Anhao XING Qingwei ZHAO Yonghong YAN
This paper proposes a new quantization framework for the activation functions of deep neural networks (DNNs). We implement a fixed-point DNN by quantizing the activations into power-of-two integers. The costly multiplication operations involved in using a DNN can then be replaced with low-cost bit shifts, which greatly reduces computation. Thus, applying DNN-based speech recognition on embedded systems becomes much easier. Experiments show that the proposed method leads to no performance degradation.
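The replacement of multiplications by bit shifts can be illustrated with a minimal sketch. The rounding rule (nearest power of two in the log domain), the exponent range `max_exp`, and all function names below are assumptions made for illustration; the abstract does not specify the paper's actual quantizer.

```python
import math


def quantize_pow2_exponent(x, max_exp=7):
    """Return e such that 2**e approximates the activation x.

    Assumption: round log2(x) to the nearest integer and clamp to
    [0, max_exp]. Non-positive activations map to None (treated as zero).
    """
    if x <= 0:
        return None
    e = round(math.log2(x))
    return max(0, min(max_exp, e))


def shift_multiply(weight, exp):
    """Multiply an integer weight by 2**exp using a left shift."""
    if exp is None:
        return 0
    return weight << exp


def dot_shift(weights, activations):
    """Dot product of integer weights with power-of-two quantized activations."""
    total = 0
    for w, a in zip(weights, activations):
        total += shift_multiply(w, quantize_pow2_exponent(a))
    return total


if __name__ == "__main__":
    # Activations 4, 2, 9 are quantized to 4, 2, 8 (exponents 2, 1, 3).
    print(dot_shift([3, -2, 5], [4, 2, 9]))
```

Only the exponent of each activation needs to be stored, so every product in the inner loop becomes a single shift of the integer weight.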
Naoki SAWADA Hiromitsu NISHIZAKI
This study proposes a two-pass spoken term detection (STD) method. The first pass uses phoneme-based dynamic time warping (DTW) STD, and the second pass recomputes the detection scores produced by the first pass using conditional random field (CRF)-based triphone detectors. In the second pass, we treat STD as a sequence labeling problem. We use CRF-based triphone detection models based on features generated from multiple types of phoneme-based transcriptions. The models learn recognition error patterns such as phoneme-to-phoneme confusions in the CRF framework. Consequently, the models can detect a triphone contained in a query term, together with a detection probability. In the experimental evaluation on two types of test collections, the CRF-based approach worked well in the re-ranking process for the DTW-based detections. CRF-based re-ranking showed 2.1% and 2.0% absolute improvements in F-measure on the two test collections, respectively.
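As a rough illustration of the first pass, the sketch below computes a plain DTW alignment cost between two phoneme sequences. The 0/1 local cost and the function name are assumptions; the abstract does not give the paper's local distance or its detection-scoring scheme.

```python
def dtw_distance(query, target, cost=lambda a, b: 0.0 if a == b else 1.0):
    """Classic dynamic time warping cost between two symbol sequences.

    D[i][j] holds the minimum accumulated cost of aligning query[:i]
    with target[:j]; each step may advance either sequence or both.
    """
    n, m = len(query), len(target)
    inf = float("inf")
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = cost(query[i - 1], target[j - 1])
            D[i][j] = c + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


if __name__ == "__main__":
    print(dtw_distance(["k", "a", "t"], ["k", "a", "a", "t"]))  # stretched vowel
    print(dtw_distance(["k", "a", "t"], ["k", "o", "t"]))       # one confusion
```

A repeated phoneme is absorbed by the warping at no cost, while a phoneme confusion adds to the score. Confusions of this kind are what the second pass is described as learning to handle.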
Xuyang WANG Pengyuan ZHANG Qingwei ZHAO Jielin PAN Yonghong YAN
The introduction of deep neural networks (DNNs) has led to a significant improvement in automatic speech recognition (ASR) performance. However, the overall ASR system remains complicated because of its dependence on the hidden Markov model (HMM). Recently, a new end-to-end ASR framework, which utilizes recurrent neural networks (RNNs) to directly model context-independent targets with the connectionist temporal classification (CTC) objective function, has been proposed and achieves results comparable to those of the hybrid HMM/DNN system. In this paper, we investigate per-dimension learning rate methods, including ADAGRAD and ADADELTA, to improve the recognition performance of the end-to-end system, based on the fact that the blank symbol used in the CTC technique dominates the output and that these methods give frequent features small learning rates. Experimental results show that a relative reduction of more than 4% in word error rate (WER) and a 5% absolute improvement in label accuracy on the training set are achieved when using ADADELTA, and that fewer training epochs are needed.
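The claim that per-dimension methods give frequent features small learning rates can be checked with a minimal ADAGRAD sketch. The toy gradient pattern (one dimension updated every step, one updated every tenth step) and all names are assumptions for illustration only, not the paper's setup.

```python
import math


def adagrad_step(params, grads, accum, lr=0.1, eps=1e-8):
    """One in-place ADAGRAD update.

    Each dimension accumulates its own squared gradients, and its step
    is scaled by 1 / sqrt(accumulated sum).
    """
    for i, g in enumerate(grads):
        accum[i] += g * g
        params[i] -= lr * g / (math.sqrt(accum[i]) + eps)
    return params, accum


def effective_rate(accum_i, lr=0.1, eps=1e-8):
    """Learning rate currently applied to a dimension with this accumulator."""
    return lr / (math.sqrt(accum_i) + eps)


def simulate(steps=100, rare_every=10, lr=0.1):
    """Dimension 0 gets a gradient every step, dimension 1 only occasionally."""
    params = [0.0, 0.0]
    accum = [0.0, 0.0]
    for t in range(steps):
        grads = [1.0, 1.0 if t % rare_every == 0 else 0.0]
        adagrad_step(params, grads, accum, lr)
    return params, accum


if __name__ == "__main__":
    _, accum = simulate()
    print(effective_rate(accum[0]), effective_rate(accum[1]))
```

The frequently updated dimension, which plays the role of the dominant blank symbol, ends up with the smaller effective rate.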
We propose a new visual tracking method in which the target appearance is represented by combining color distribution and keypoints. Firstly, the object is localized via a keypoint-based tracking and matching strategy, where a new clustering method is presented to remove outliers. Secondly, the tracking confidence is evaluated by the color template. According to the tracking confidence, local and global keypoint matching can be performed adaptively. Finally, we propose a target appearance update method in which the new appearance can be learned and added to the target model. The proposed tracker is compared with five state-of-the-art tracking methods on a recent benchmark dataset. Both qualitative and quantitative evaluations show that our method performs favorably.
Mengzhe CHEN Jielin PAN Qingwei ZHAO Yonghong YAN
Multi-task learning in deep neural networks has been proven to be effective for acoustic modeling in speech recognition. In this paper, the technique is applied to Mandarin-English code-mixing recognition. For the primary task of senone classification, three schemes of auxiliary tasks are proposed to introduce language information into the networks and improve the prediction of language switching. On a real-world Mandarin-English test corpus from mobile voice search, the proposed schemes improved recognition for both languages and achieved relative reductions in the overall error rate of 3.5%, 3.8% and 5.8%, respectively.
Lin GAO Jian HUANG Wen SUN Ping WEI Hongshu LIAO
The cardinality balanced multi-target multi-Bernoulli (CBMeMBer) filter has emerged as a promising tool for tracking a time-varying number of targets. However, the standard CBMeMBer filter may perform poorly when measurements are corrupted by sensor biases. This paper extends the CBMeMBer filter to simultaneous target tracking and sensor bias estimation by introducing the sensor translational biases into the multi-Bernoulli distribution. In the extended CBMeMBer filter, the biases are modeled as a first-order Gauss-Markov process and assumed to be uncorrelated with the target states. Furthermore, the sequential Monte Carlo (SMC) method is adopted to handle the non-linear and non-Gaussian conditions. Simulations are carried out to examine the performance of the proposed filter.
Abdel MARTINEZ ALONSO Masaya MIYAHARA Akira MATSUZAWA
This paper introduces a novel Direct Digital Frequency Synthesizer based on a Complementary Dual-Phase Latch-Based sequencing method. Compared to a conventional Direct Digital Frequency Synthesizer using flip-flops as synchronizing elements, the proposed architecture allows the data sampling rate to be doubled while trading off area and Power Efficiency. Digital domain modulations can be easily implemented by using a Direct Digital Frequency Synthesizer. However, due to performance limitations, CMOS-based applications have been almost exclusively restricted to the VHF, UHF and L bands. This work aims to increase the operation speed and extend the applicability of this technology to Multi-band Multi-standard wireless systems operating up to 2.7 GHz. The design features a 24-bit pipelined Phase Accumulator and a 14x10-bit Phase to Amplitude Converter. The Phase to Amplitude Converter module is compressed by using the Quarter Wave Symmetry technique and is entirely made up of combinational logic inserted into 12 Complementary Dual-Phase Latch-Based pipeline stages. The logic is represented in the form of Sum of Product terms obtained from a 14x10-bit sinusoidal Look-Up-Table. The proposed Direct Digital Frequency Synthesizer is designed and simulated based on 65 nm CMOS standard-cell technology. A maximum data sampling rate of 6.8 GS/s is expected. The estimated Spurious Free Dynamic Range and Power Efficiency are 61 dBc and 22 mW/(GS/s), respectively.
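A behavioral software model can make the phase accumulator and the quarter-wave symmetry compression more concrete. The sketch below is only an illustration under assumptions: it uses a stored quarter-wave table instead of the sum-of-products combinational logic described above, it ignores pipelining, and the half-sample phase offset, the signed 10-bit amplitude scaling, and all names are hypothetical choices. The bit widths (24, 14, 10) and the 6.8 GS/s rate are taken from the abstract.

```python
import math

ACC_BITS = 24                       # phase accumulator width
PHASE_BITS = 14                     # phase bits fed to the converter
AMP_BITS = 10                       # amplitude word width
QUARTER_BITS = PHASE_BITS - 2       # two MSBs select the quadrant
QUARTER_SIZE = 1 << QUARTER_BITS
AMP_MAX = (1 << (AMP_BITS - 1)) - 1  # assumed signed full scale (511)

# Only the first quadrant of the sine is stored (assumed half-sample offset,
# which makes the mirrored index land on the exact symmetric sample).
QUARTER_LUT = [
    round(AMP_MAX * math.sin(2 * math.pi * (i + 0.5) / (1 << PHASE_BITS)))
    for i in range(QUARTER_SIZE)
]


def phase_to_amplitude(phase14):
    """Rebuild a full sine period from the quarter-wave table."""
    quadrant = phase14 >> QUARTER_BITS
    idx = phase14 & (QUARTER_SIZE - 1)
    if quadrant & 1:                # quadrants 1 and 3: read the table backwards
        idx = QUARTER_SIZE - 1 - idx
    amp = QUARTER_LUT[idx]
    return -amp if quadrant & 2 else amp  # quadrants 2 and 3: negate


def ddfs(fcw, n):
    """Generate n output samples for frequency control word fcw."""
    acc = 0
    out = []
    for _ in range(n):
        out.append(phase_to_amplitude(acc >> (ACC_BITS - PHASE_BITS)))
        acc = (acc + fcw) & ((1 << ACC_BITS) - 1)  # wrap modulo 2**24
    return out


def output_frequency(fcw, f_clk):
    """Standard DDS relation: f_out = fcw * f_clk / 2**ACC_BITS."""
    return fcw * f_clk / (1 << ACC_BITS)


if __name__ == "__main__":
    print(ddfs(1 << 22, 4))
    print(output_frequency(1 << 22, 6.8e9))
```

Storing one quadrant reduces the table from 2^14 to 2^12 entries, which is the kind of compression that quarter-wave symmetry provides.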
Naoki HASEGAWA Naoki SHINOHARA Shigeo KAWASAKI
A high-performance GaN power amplifier circuit operating at 7.1 GHz was demonstrated for potential uses such as a space ground station. First, the GaN HEMT chips were investigated for the high power amplifier circuit design. Next, amplifier circuits designed to match the load and source impedances of the non-linear models were fabricated. In measurement, the class-AB power amplifier circuit with the four-cell chip showed a power added efficiency (PAE) of 42.6% and an output power of 41.7 dBm at -3 dB gain compression. Finally, the good performance of the power amplifier was confirmed in a 20-way radial power combiner, with a PAE of 17.4% and an output power of 52.6 dBm at -3 dB gain compression.
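To put the reported output powers in linear units, the following sketch converts dBm to watts and states the usual PAE definition. The function names are illustrative. The abstract reports neither input power nor DC power, so the PAE function is shown only as a definition and is not evaluated on the paper's data.

```python
def dbm_to_watts(p_dbm):
    """Convert a power level in dBm to watts (0 dBm = 1 mW)."""
    return 10 ** ((p_dbm - 30) / 10)


def power_added_efficiency(p_out_w, p_in_w, p_dc_w):
    """PAE = (P_out - P_in) / P_dc, returned as a fraction."""
    return (p_out_w - p_in_w) / p_dc_w


if __name__ == "__main__":
    print(round(dbm_to_watts(41.7), 2))  # single four-cell amplifier
    print(round(dbm_to_watts(52.6), 2))  # 20-way combined output
```

Under this conversion, 41.7 dBm corresponds to roughly 14.8 W and 52.6 dBm to roughly 182 W.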
Zhigang CHEN Xiaolei ZHANG Hussain KHURRAM He HUANG Guomei ZHANG
In this letter, a novel channel impulse response (CIR)-based fingerprinting positioning method using kernel principal component analysis (KPCA) is proposed. During the offline phase of the proposed method, a survey is performed to collect all CIRs from access points, and a fingerprint database is constructed whose vectors include the CIR and the physical location. During the online phase, KPCA is first employed to handle the nonlinearity and complexity in the CIR-position dependencies and to extract the principal nonlinear features in the CIRs, and support vector regression is then used to adaptively learn the regression function between the KPCA components and physical locations. In addition, an iterative narrowing-scope step is further used to refine the estimate. The performance comparison shows that the proposed method outperforms traditional received signal strength based positioning methods.
Zhongshan ZHANG Yuning CHEN Yuejin TAN Jungang YAN
This paper presents a non-crossover and multi-mutation based genetic algorithm (NMGA) for the Flexible Job-shop Scheduling problem (FJSP) with the criterion of minimizing the maximum completion time (makespan). In view of the characteristics of the FJSP, three mutation operators based on operation sequence coding and machine assignment coding are proposed: flip, slide, and swap. The NMGA framework, the coding scheme, and the decoding algorithm are also specially designed for the FJSP. The framework does not include the crossover recombination operator and employs a special selection strategy. Computational results on a set of representative benchmark problems are provided. The evidence indicates that the proposed algorithm is superior to several recently published genetic algorithms in terms of solution quality and convergence ability.
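The abstract names the three mutation operators but does not define them. The sketch below shows one common interpretation on an operation sequence, and these definitions are assumptions rather than the paper's: flip reverses a segment, slide moves one gene to a new position, and swap exchanges two genes. The job-repetition encoding, in which the k-th occurrence of a job number denotes that job's k-th operation, is also assumed.

```python
import random


def flip(seq, i, j):
    """Reverse the segment seq[i..j] (inclusive)."""
    s = list(seq)
    s[i:j + 1] = reversed(s[i:j + 1])
    return s


def slide(seq, i, j):
    """Move the gene at position i to position j, shifting the genes between."""
    s = list(seq)
    s.insert(j, s.pop(i))
    return s


def swap(seq, i, j):
    """Exchange the genes at positions i and j."""
    s = list(seq)
    s[i], s[j] = s[j], s[i]
    return s


def mutate(seq, rng):
    """Apply one randomly chosen operator at two random positions i < j."""
    i, j = sorted(rng.sample(range(len(seq)), 2))
    op = rng.choice((flip, slide, swap))
    return op(seq, i, j)


if __name__ == "__main__":
    ops = [1, 2, 1, 3, 2, 3]  # three jobs with two operations each
    print(flip(ops, 1, 4), slide(ops, 0, 3), swap(ops, 0, 5))
    print(mutate(ops, random.Random(0)))
```

All three operators only reorder genes. Under the assumed encoding every mutated sequence therefore contains each job the correct number of times, so no repair step is needed.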
Satoshi TAYU Toshihiko TAKAHASHI Eita KOBAYASHI Shuichi UENO
3-D channel routing is a fundamental problem in the physical design of 3-D integrated circuits. The 3-D channel is a 3-D grid G, and the terminals are vertices of G located in the top and bottom layers. A net is a set of terminals to be connected. The objective of the 3-D channel routing problem is to connect the terminals in each net with a Steiner tree (wire) in G using as few layers as possible and wires that are as short as possible, in such a way that wires for distinct nets are disjoint. This paper shows that the problem is intractable. We also show that a sparse set of ν 2-terminal nets can be routed in a 3-D channel with O(√ν) layers using wires of length O(√ν).
Hiroki YAMAOKA Toshimichi SAITO
A digital map is a simple dynamical system that is related to various digital dynamical systems, including cellular automata, dynamic binary neural networks, and digital spiking neurons. Depending on the parameters and the initial condition, the map can exhibit various periodic orbits and transient phenomena leading to them. In order to analyze the dynamics, we present two simple feature quantities. The first and second quantities characterize the richness of the periodic phenomena and the deviation of the transient phenomena, respectively. Using the two feature quantities, we construct a steady-versus-transient plot that is useful for visualizing and studying various digital dynamical systems. As a first step, we demonstrate analysis results for an example of a digital map based on analog bifurcating neuron models.
Zhigang CHEN Lei WANG He HUANG Guomei ZHANG
A novel virtual sensor-based positioning method that can make use of both direct and indirect paths is presented in this paper. By integrating the virtual sensor idea with a Bayesian state and observation framework, the method models the indirect paths corresponding to persistent virtual sensors as virtual direct paths and reformulates the wireless positioning problem as the maximum likelihood estimation of both the mobile terminal's positions and the persistent virtual sensors' positions. The method then adopts the EM (Expectation Maximization) and particle filtering schemes to estimate the virtual sensors' positions, and finally exploits the measurements of both the direct and the indirect paths to estimate the mobile terminal's positions, thus achieving better positioning performance. Simulation results demonstrate the effectiveness of the proposed method.
Kyohei YAMADA Naoki SAKAI Takashi OHIRA
Internal power losses in lumped-element impedance matching circuits are formulated in terms of the Q factors of the elements and the port impedances to be matched. Assuming that the Q factors are relatively high, the loss is expressed by a simple formula containing only the tangents of the impedances. The formula is a powerful tool for applications that emphasize power efficiency, such as wireless power transfer. In addition to the formulation, we illustrate some design examples using the derived formula: the design of the least lossy L-section circuit and of a two-stage low-pass ladder. The examples provide ready-to-use knowledge for low-loss matching design.
The Deep Neural Network (DNN) is a powerful machine learning model that has been successfully applied to a wide range of pattern classification tasks. Because DNNs are highly capable of learning complex mapping functions, it has been possible to train and deploy them largely as black boxes, without an in-depth understanding of the inner workings of the model. However, this often leads to solutions and systems that achieve great performance but offer very little insight into how and why they work. This paper introduces the Sensitivity-characterised Activity Neurogram (SCAN), a novel approach for understanding the inner workings of a DNN by analysing and visualising the sensitivity patterns of the neuron activities. SCAN constructs a low-dimensional visualisation space for the neurons so that the neuron activities can be visualised in a meaningful and interpretable way. The embedding of the neurons within this visualisation space can be used to compare the neurons, both within the same DNN and across different DNNs trained for the same task. This paper presents the observations from using SCAN to analyse DNN acoustic models for automatic speech recognition.
Huawei TAO Ruiyu LIANG Xinran ZHANG Li ZHAO
To examine whether rotational invariance plays a major role in spectrogram features, new spectral features based on local normalized center moments, denoted by LNCMSF, are proposed. The proposed LNCMSF first adopts 2nd order normalized center moments to describe the local energy distribution of the logarithmic energy spectrum, yielding the normalized center moment spectrograms NC1 and NC2. Secondly, the DCT (Discrete Cosine Transform) is used to eliminate the correlation of NC1 and NC2, yielding the high order cepstral coefficients TNC1 and TNC2. Finally, LNCMSF is generated by combining NC1, NC2, TNC1 and TNC2. The rotational invariance test experiment shows that rotational invariance is not a necessary property of partial spectrogram features. The recognition experiment shows that the maximum UA (Unweighted Average of Class-Wise Recall Rate) of LNCMSF is improved by at least 10.7% and 1.2% compared to that of MFCC (Mel Frequency Cepstrum Coefficient) and HuWSF (Weighted Spectral Features Based on Local Hu Moments), respectively.
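A small sketch of second-order normalized central moments on a local patch may help make the feature concrete. It uses the standard image-moment normalization, eta_pq = mu_pq / mu_00^(1 + (p+q)/2), and assumes non-negative patch values. Whether this is the exact normalization used for LNCMSF, and which moments correspond to NC1 and NC2, is not stated in the abstract, so the function name and the example are illustrative only.

```python
def normalized_central_moment(patch, p, q):
    """Normalized central moment eta_pq of a 2-D patch (list of rows).

    x indexes columns and y indexes rows; patch values act as weights
    and are assumed to be non-negative with a positive sum.
    """
    m00 = sum(sum(row) for row in patch)
    xbar = sum(x * v for row in patch for x, v in enumerate(row)) / m00
    ybar = sum(y * v for y, row in enumerate(patch) for v in row) / m00
    mu = sum(
        ((x - xbar) ** p) * ((y - ybar) ** q) * v
        for y, row in enumerate(patch)
        for x, v in enumerate(row)
    )
    return mu / m00 ** (1 + (p + q) / 2)


if __name__ == "__main__":
    horizontal = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]
    vertical = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
    for name, patch in (("horizontal", horizontal), ("vertical", vertical)):
        print(name,
              normalized_central_moment(patch, 2, 0),
              normalized_central_moment(patch, 0, 2))
```

Rotating the patch by 90 degrees swaps eta_20 and eta_02, so these individual moments are not rotation invariant, in contrast to Hu moments. This is the property the abstract sets out to test.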
Go IRIE Yukito WATANABE Takayuki KUROZUMI Tetsuya KINEBUCHI
Encoding multiple SIFT descriptors into a single vector is a key technique for efficient object image retrieval. In this paper, we propose an extension of the local coordinate system (LCS) for image representation. Previous LCS approaches encode each SIFT descriptor by a single local coordinate, which is not adequate for localizing its position in the descriptor space. Instead, we use multiple local coordinates to represent each descriptor with PCA-based decorrelation. Experiments show that this simple modification can improve retrieval performance significantly.