Yu ZHOU Junfeng LI Yanqing SUN Jianping ZHANG Yonghong YAN Masato AKAGI
In this paper, we present a hybrid speech emotion recognition system exploiting both spectral and prosodic features in speech. For capturing the emotional information in the spectral domain, we propose a new spectral feature extraction method by applying a novel non-uniform subband processing, instead of the mel-frequency subbands used in Mel-Frequency Cepstral Coefficients (MFCC). For prosodic features, a set of features that are closely correlated with speech emotional states is selected. In the proposed hybrid emotion recognition system, due to the inherently different characteristics of these two kinds of features (e.g., data size), the newly extracted spectral features are modeled by a Gaussian Mixture Model (GMM) and the selected prosodic features are modeled by a Support Vector Machine (SVM). The final result of the proposed emotion recognition system is obtained by combining the results from these two subsystems. Experimental results show that (1) the proposed non-uniform spectral features are more effective than the traditional MFCC features for emotion recognition; (2) the proposed hybrid emotion recognition system using both spectral and prosodic features yields a relative recognition error reduction of 17.0% over the traditional recognition systems using only the spectral features, and 62.3% over those using only the prosodic features.
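The abstract does not state how the two subsystem results are combined, so the following is only a minimal sketch under the assumption of a weighted sum of per-emotion scores that have already been normalized to a comparable scale. The function names, the `weight` parameter, and the example scores are hypothetical. The second function shows the arithmetic behind a relative error reduction figure.

```python
def fuse_scores(gmm_scores, svm_scores, weight=0.5):
    """Combine per-emotion scores from two subsystems by a weighted sum.

    Assumption: both score dictionaries are normalized to the same scale.
    This is an illustrative fusion rule, not necessarily the paper's.
    """
    emotions = gmm_scores.keys() & svm_scores.keys()
    fused = {
        e: weight * gmm_scores[e] + (1.0 - weight) * svm_scores[e]
        for e in emotions
    }
    # The recognized emotion is the one with the highest fused score.
    return max(fused, key=fused.get), fused


def relative_error_reduction(baseline_error, new_error):
    """Fraction of the baseline error rate removed by the new system."""
    return (baseline_error - new_error) / baseline_error


# Hypothetical scores for one utterance.
gmm = {"anger": 0.6, "joy": 0.3, "sadness": 0.1}
svm = {"anger": 0.2, "joy": 0.7, "sadness": 0.1}
label, fused = fuse_scores(gmm, svm)
```

With these made-up scores the two subsystems disagree, and the fused decision follows whichever emotion has the larger combined evidence.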
Seongyong AHN Hyejeong HONG HyunJin KIM Jin-Ho AHN Dongmyong BAEK Sungho KANG
This paper proposes a new pattern matching architecture with multi-character processing for deep packet inspection. The proposed pattern matching architecture detects the start point of pattern matching from multi-character input using input text alignment. By eliminating duplicate hardware components using a process element tree, the hardware cost of the proposed pattern matching architecture is greatly reduced.
Seong-Jun HAHM Yuichi OHKAWA Masashi ITO Motoyuki SUZUKI Akinori ITO Shozo MAKINO
In this paper, we propose an acoustic model that is robust to multiple noise environments, as well as a method for adapting the acoustic model to an environment to improve the model. The model is called "the multi-mixture model," which is based on a mixture of different HMMs, each of which is trained using speech under different noise conditions. Speech recognition experiments showed that the proposed model performs better than the conventional multi-condition model. The method for adaptation is based on the aspect model, which is a "mixture-of-mixture" model. To realize adaptation using an extremely small amount of adaptation data (i.e., a few seconds), we train a small number of mixture models, which can be interpreted as models for "clusters" of noise environments. Then, the models are mixed using weights, which are determined according to the adaptation data. The experimental results showed that the adaptation based on the aspect model improved the word accuracy in a heavy noise environment and showed no performance deterioration for any noise condition, while the conventional methods either did not improve the performance or showed both improvement and degradation of recognition performance depending on the noise conditions.
Tong WU Ying WANG Yushan PEI Gen LI Ping ZHANG
This letter proposes an intra-cell partial spectrum reuse (PSR) scheme for cellular OFDM-relay networks. The proposed method aims to increase the system throughput, while the SINR of the cell edge users can also be improved by utilizing the PSR scheme. The novel pre-allocation factor γ not only indicates the flexibility of PSR, but also decreases the complexity of the reuse mechanism. Through simulations, the proposed scheme is shown to offer superior performance in terms of system throughput and the SINR of the last 5% of users.
Sung Soo KIM Chang Woo HAN Nam Soo KIM
In this letter, we present useful features accounting for pronunciation prominence and propose a classification technique for prominence detection. A set of phone-specific features is extracted based on a forced alignment of the test pronunciation provided by a speech recognition system. These features are then applied to traditional classifiers such as the support vector machine (SVM), artificial neural network (ANN) and adaptive boosting (Adaboost) for detecting the place of prominence.
Cooperation is an attractive approach to improving the spectrum sensing performance of cognitive systems experiencing deep shadowing and fading. In this letter, an efficient weight-based cooperative spectrum sensing scheme is proposed. Simulation results show that the proposed scheme has better accuracy than "AND," "OR," and "half-voting" combination schemes and has similar spectrum sensing accuracy but with lower computational and communication complexity in comparison to the "optimal data fusion" rule.
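As a rough illustration of the combination schemes named above, the sketch below fuses the binary decisions of several sensing nodes. The "AND," "OR," and "half-voting" rules follow their usual definitions; the weighted rule is only an assumed form (a normalized weighted vote against a threshold), since the letter's actual weighting is not given here. All function names and the example weights are hypothetical.

```python
def and_rule(decisions):
    """Declare the channel occupied only if every node detects a signal."""
    return all(decisions)


def or_rule(decisions):
    """Declare the channel occupied if any node detects a signal."""
    return any(decisions)


def half_voting_rule(decisions):
    """Declare the channel occupied if at least half of the nodes agree."""
    return sum(decisions) >= len(decisions) / 2


def weighted_rule(decisions, weights, threshold=0.5):
    """Assumed weighted vote: reliable nodes count more than unreliable ones."""
    score = sum(w for d, w in zip(decisions, weights) if d)
    return score / sum(weights) >= threshold


# Three nodes; only the first (assumed most reliable) detects the primary user.
decisions = [True, False, False]
weights = [0.7, 0.2, 0.1]
results = {
    "AND": and_rule(decisions),
    "OR": or_rule(decisions),
    "half-voting": half_voting_rule(decisions),
    "weighted": weighted_rule(decisions, weights),
}
```

In this made-up case the unweighted majority is outvoted by a single trusted node, which is the kind of behavior a weight-based scheme is meant to allow.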
Junghyeun HWANG Hisakazu KIKUCHI Shogo MURAMATSU Jaeho SHIN
The error diffusion filter in this paper is optimized with respect to the ideal blue noise pattern corresponding to a single tone level. The filter coefficients are optimized by the minimization of the squared error norm between the Fourier power spectra of the resulting halftone and the blue noise pattern. During the process of optimization, the binary pattern power spectrum matching algorithm is applied with the aid of a new blue noise model. The number of the optimum filters is equal to that of different tones. The visual fidelity of the bilevel halftones generated by the error diffusion filters is evaluated in terms of a weighted signal-to-noise ratio, Fourier power spectra, and others. Experimental results have demonstrated that the proposed filter set generates satisfactory bilevel halftones of grayscale images.
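For readers unfamiliar with error diffusion, the following minimal sketch shows how a filter of this kind is applied to a grayscale image with values in [0, 1]. The optimized coefficients of the paper are not reproduced here; the classic Floyd-Steinberg weights are swapped in as a stand-in, and the function name and the `kernel` layout are assumptions.

```python
# Stand-in filter: Floyd-Steinberg weights keyed by (row offset, column offset).
FLOYD_STEINBERG = {(0, 1): 7 / 16, (1, -1): 3 / 16, (1, 0): 5 / 16, (1, 1): 1 / 16}


def error_diffuse(image, kernel=FLOYD_STEINBERG, threshold=0.5):
    """Binarize a grayscale image (list of rows) in raster-scan order."""
    rows, cols = len(image), len(image[0])
    work = [list(row) for row in image]        # working copy that accumulates error
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            old = work[r][c]
            new = 1 if old >= threshold else 0
            out[r][c] = new
            err = old - new                    # quantization error at this pixel
            # Push the error onto not-yet-visited neighbours.
            for (dr, dc), w in kernel.items():
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    work[rr][cc] += err * w
    return out


halftone = error_diffuse([[0.5] * 8 for _ in range(8)])
```

A tone-dependent filter set, as described in the abstract, would amount to choosing a different `kernel` for each input tone level.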
Chunxiao JIANG Xin MA Canfeng CHEN Jian MA Yong REN
Dynamic spectrum access has recently become a focal issue, in which identifying the available spectrum plays an important role. Much work has been done on secondary users (SUs) synchronously accessing the primary user's (PU's) network. However, on one hand, an SU may have no knowledge of the PU's communication protocols; on the other, it is possible that communications among PUs are not based on a synchronous scheme at all. To address these problems, this paper advances a strategy for SUs to search for available spectrum with asynchronous MAC-layer sensing. With this method, SUs need not know the communication mechanisms in the PU's network when dynamically accessing it. We focus on four aspects: 1) the strategy for searching available channels; 2) the vacating strategy when PUs come back; 3) estimation of channel parameters; 4) the impact of SUs' interference on the PU's data rate. The simulations show that our search strategy not only achieves nearly 50% less interference probability than equal allocation of total search time, but also adapts well to time-varying channels. Moreover, access by our strategies can attain 150% more access time than random access. The moment matching estimator shows good performance in estimating and tracking time-varying channels.
Sanaz SEYEDIN Seyed Mohammad AHADI
This paper presents a novel noise-robust feature extraction method for speech recognition. It is based on making the Minimum Variance Distortionless Response (MVDR) power spectrum estimation method robust against noise. This robustness is obtained by modifying the distortionless constraint of the MVDR spectral estimation method via weighting the sub-band power spectrum values based on the sub-band signal-to-noise ratios. The optimum weighting is obtained by employing the experimental findings of psychoacoustics. According to our experiments, this technique is successful in modifying the power spectrum of speech signals and making it robust against noise. The above method, when evaluated on the Aurora 2 task for recognition purposes, outperformed both the MFCC features as the baseline and the MVDR-based features in different noisy conditions.
Subjects in Electromagnetic Compatibility (EMC) research that have been presented at meetings of the IEICE Technical Committee on Electromagnetic Compatibility (EMCJ) are overviewed and categorized. The temporal changes in the proportions of the categorized subjects among the total number of presentations each year are also shown. Finally, speculative opinions are presented on what EMC subjects will be studied in the near future.
This paper proposes and verifies a specific absorption rate (SAR) measurement procedure for multi-antenna transmitters that requires measurement of two-dimensional electric field distributions for the number of antennas and calculation in order to obtain the three-dimensional SAR distributions for arbitrary weighting coefficients of the antennas prior to determining the average SAR. The proposed procedure is verified based on Finite-Difference Time-Domain (FDTD) calculation and measurement using electro-optic (EO) probes. For two reference dipoles, the differences in the 10 g SAR obtained based on the proposed procedure compared numerically and experimentally to that based on the original calculated three-dimensional SAR distribution are at most 4.8% and 3.6%, respectively, at 1950 MHz. At 3500 MHz, this difference is at most 5.2% in the numerical verification.
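The role of the weighting coefficients can be illustrated at a single point in tissue. The sketch below assumes that the total field is the weighted superposition of the per-antenna fields and uses the standard point relation SAR = σ|E|²/ρ with E the RMS electric field. It does not cover the paper's estimation of three-dimensional distributions from two-dimensional measurements or the 10 g averaging. The function name and the numeric values are hypothetical.

```python
def combined_sar(fields, weights, conductivity, density):
    """Point SAR in W/kg for a weighted superposition of antenna fields.

    fields: one (Ex, Ey, Ez) tuple of complex RMS field values (V/m) per antenna,
            each obtained with unit excitation of that antenna alone.
    weights: one complex weighting coefficient per antenna.
    conductivity: tissue conductivity sigma in S/m.
    density: tissue mass density rho in kg/m^3.
    """
    # Superpose the fields component by component before taking magnitudes.
    total = [
        sum(w * f[axis] for w, f in zip(weights, fields))
        for axis in range(3)
    ]
    magnitude_squared = sum(abs(component) ** 2 for component in total)
    return conductivity * magnitude_squared / density


# Two antennas producing identical x-polarized fields at the observation point.
fields = [(1 + 0j, 0j, 0j), (1 + 0j, 0j, 0j)]
in_phase = combined_sar(fields, [1, 1], conductivity=1.0, density=1000.0)
out_of_phase = combined_sar(fields, [1, -1], conductivity=1.0, density=1000.0)
```

Because the fields add before squaring, the SAR depends on the relative phase of the weights, which is why per-antenna field data (not per-antenna SAR alone) is needed for arbitrary weightings.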
Seong-Jun HAHM Yuichi OHKAWA Masashi ITO Motoyuki SUZUKI Akinori ITO Shozo MAKINO
We propose an improved reference speaker weighting (RSW) and speaker cluster weighting (SCW) approach that uses an aspect model. The concept of the approach is that the adapted model is a linear combination of a few latent reference models obtained from a set of reference speakers. The aspect model has specific latent-space characteristics that differ from the orthogonal basis vectors of eigenvoice. The aspect model is a "mixture-of-mixture" model. We first calculate a small number of latent reference models as mixtures of distributions of the reference speakers' models, and then the latent reference models are mixed to obtain the adapted distribution. The mixture weights are calculated based on the expectation maximization (EM) algorithm. We use the obtained mixture weights for interpolating mean parameters of the distributions. Both training and adaptation are performed based on likelihood maximization with respect to the training and adaptation data, respectively. We conduct a continuous speech recognition experiment using a Korean database (KAIST-TRADE). The results are compared to those of conventional MAP, MLLR, RSW, eigenvoice and SCW. An absolute word accuracy improvement of 2.06 points was achieved using the proposed method, even though we use only 0.3 s of adaptation data.
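The two-level "mixture-of-mixture" interpolation of mean parameters can be sketched as follows. This is a simplified illustration that assumes all weights are already known; the EM estimation of the weights is not shown, and the function names and toy values are hypothetical.

```python
def interpolate_means(means, weights):
    """Weighted sum of mean vectors (weights are assumed to sum to 1)."""
    dim = len(means[0])
    return [sum(w * m[d] for w, m in zip(weights, means)) for d in range(dim)]


def adapted_mean(speaker_means, latent_weights, adaptation_weights):
    """Two-level interpolation of one Gaussian mean.

    speaker_means: mean vector of the same Gaussian in each reference speaker model.
    latent_weights: one weight vector over reference speakers per latent model
                    (assumed to come from training).
    adaptation_weights: weights over latent models (assumed to come from adaptation).
    """
    # Level 1: each latent reference model is a mixture of reference speakers.
    latent_means = [interpolate_means(speaker_means, w) for w in latent_weights]
    # Level 2: the adapted mean is a mixture of the latent reference models.
    return interpolate_means(latent_means, adaptation_weights)


speaker_means = [[0.0, 0.0], [2.0, 0.0], [0.0, 4.0]]   # three reference speakers
latent_weights = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]    # two latent reference models
mean = adapted_mean(speaker_means, latent_weights, [0.5, 0.5])  # [0.5, 2.0]
```

Only the small set of adaptation weights has to be estimated from the adaptation data, which is consistent with the abstract's claim of working from a very short utterance.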
Hae Young LEE Seung-Min PARK Tae Ho CHO
This paper presents an approach to implementing simulation models for SAM fuzzy controllers without the use of external components. The approach represents a fuzzy controller as a composition of simple simulation models which involve only basic operations.
HyunJin KIM Hyejeong HONG Dongmyoung BAEK Sungho KANG
This paper proposes a pattern partitioning algorithm that maps multiple target patterns onto homogeneous memory-based string matchers. The proposed algorithm adopts a greedy search based on lexicographical sorting. By mapping as many target patterns as possible onto each string matcher, the memory requirements are greatly reduced.
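One way to picture a greedy partitioning based on lexicographical sorting is sketched below. The cost model is an assumption made for illustration: each string matcher is taken to store a trie with at most `max_states` states, so that lexicographically adjacent patterns share prefix states. The paper's actual matcher structure and capacity measure may differ, and the function names are hypothetical.

```python
def trie_states(patterns):
    """Number of trie states: the root plus one state per distinct prefix."""
    prefixes = {p[:i] for p in patterns for i in range(1, len(p) + 1)}
    return len(prefixes) + 1


def partition_patterns(patterns, max_states):
    """Greedily pack sorted patterns into matchers of limited state capacity."""
    groups, current = [], []
    for pattern in sorted(patterns):
        if trie_states([pattern]) > max_states:
            raise ValueError(f"pattern {pattern!r} does not fit in one matcher")
        if trie_states(current + [pattern]) <= max_states:
            current.append(pattern)      # still fits: keep filling this matcher
        else:
            groups.append(current)       # matcher is full: start a new one
            current = [pattern]
    if current:
        groups.append(current)
    return groups


groups = partition_patterns(["abcd", "abce", "xyz", "abc", "xy"], max_states=6)
# groups == [["abc", "abcd", "abce"], ["xy", "xyz"]]
```

Sorting first places patterns with common prefixes next to each other, so each matcher tends to hold more patterns for the same number of states.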
Lei WANG Baoyu ZHENG Qingmin MENG Chao CHEN
Free probability theory, which has become a main branch of random matrix theory, is a valuable tool for describing the asymptotic behavior of multiple systems, especially for large matrices. In this paper, using asymptotic free probability theory, a new cooperative scheme for spectrum sensing is proposed, which shows how the asymptotic free behavior of random matrices and the property of the Wishart distribution can be used to assist spectrum sensing for cognitive radio. Simulations over Rayleigh fading and AWGN channels demonstrate that the proposed scheme has better detection performance than the energy detection techniques and the Maximum-minimum eigenvalue (MME) scheme, even for the case of a small sample of observations.
Yeong-Sam KIM Seong-Hyun JANG Sang-Hun YOON Jong-Wha CHONG
A new algorithm for estimating the clock drift in symbol duration for high-precision ranging, based on multiple symbols of chirp spread spectrum (CSS), is proposed. Since the permissible error of a crystal oscillator in CSS is relatively high given the need to lower device costs, ranging results are perturbed by clock drift. We establish the phenomenon of clock drift in multiple symbols of CSS, and estimate the clock drift in symbol duration based on the phase difference between adjacent symbols. The proposed algorithm is analyzed and verified by Monte Carlo simulations.
Nobuyuki SHIMIZU Masashi SUGIYAMA Hiroshi NAKAGAWA
Traditionally, popular synonym acquisition methods are based on the distributional hypothesis, and a metric such as Jaccard coefficients is used to evaluate the similarity between the contexts of words to obtain synonyms for a query. On the other hand, when one tries to compile and clean a thesaurus, one often already has a modest number of synonym relations at hand. Could something be done with a half-built thesaurus alone? We propose the use of spectral methods and discuss their relation to other network-based algorithms in natural language processing (NLP), such as PageRank and Bootstrapping. Since compiling a thesaurus is very laborious, we believe that adding the proposed method to the toolkit of thesaurus constructors would significantly ease the pain in accomplishing this task.
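As background for the traditional baseline mentioned above, the Jaccard coefficient between two context sets is the size of their intersection divided by the size of their union. The sketch below ranks candidate synonyms for a query word by this score. It illustrates only the distributional baseline, not the proposed spectral method, and the toy contexts and function names are made up.

```python
def jaccard(a, b):
    """Jaccard coefficient of two collections treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0                       # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


def rank_synonyms(query, contexts):
    """Rank all other words by context similarity to the query word."""
    scores = [
        (word, jaccard(contexts[query], ctx))
        for word, ctx in contexts.items()
        if word != query
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical context words observed around each target word.
contexts = {
    "car": {"drive", "road", "engine", "wheel"},
    "automobile": {"drive", "road", "engine", "dealer"},
    "banana": {"eat", "fruit", "yellow"},
}
ranking = rank_synonyms("car", contexts)
# ranking[0] == ("automobile", 0.6)
```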
Young-Bok JOO Chan-Ho HAN Kil-Houm PARK
LCD Automatic Vision Inspection (AVI) systems automatically detect defect features and measure their sizes via camera vision. AVI systems usually report different measurements of the same defect under small variations in position or rotation, mainly because the acquired images differ. This is caused by possible variations in the image acquisition process, including optical factors, non-uniform illumination, random noise, and so on. For this reason, the conventional area-based defect measuring method has some problems in terms of robustness and consistency. In this paper, we propose a new defect size measuring method to overcome these problems. We utilize volume information, which is completely ignored in the conventional area-based defect measuring method. We choose a bell shape as the defect model for the experiments. The results show that our proposed method dramatically improves the robustness of defect size measurement. Given proper modeling, the proposed volume-based measuring method can be applied to various types of defects for better robustness and consistency.
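The difference between an area-based and a volume-based measure can be illustrated with a toy example. The sketch assumes that "area" means the count of pixels whose deviation from the background exceeds a threshold and that "volume" means the summed deviation over all pixels. These definitions, the function names, and the pixel values are assumptions, and the paper's bell-shaped defect model is not reproduced.

```python
def area_measure(image, background, threshold):
    """Count pixels whose deviation from the background exceeds the threshold."""
    return sum(1 for row in image for v in row if abs(v - background) > threshold)


def volume_measure(image, background):
    """Sum the deviation from the background over all pixels."""
    return sum(abs(v - background) for row in image for v in row)


# The same hypothetical defect captured twice with a sub-pixel shift:
# the intensity is redistributed between neighbouring pixels.
capture_a = [[0, 2, 6, 2, 0]]
capture_b = [[0, 4, 4, 2, 0]]

areas = (area_measure(capture_a, 0, 3), area_measure(capture_b, 0, 3))    # (1, 2)
volumes = (volume_measure(capture_a, 0), volume_measure(capture_b, 0))    # (10, 10)
```

In this constructed case the thresholded area doubles between the two captures, while the summed deviation is unchanged, which is the kind of consistency the abstract attributes to volume information.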
Masashi ETO Kotaro SONODA Daisuke INOUE Katsunari YOSHIOKA Koji NAKAO
Network monitoring systems that detect and analyze malicious activities, as well as respond to them, are becoming increasingly important. As malware such as worms, viruses, and bots can inflict significant damage on both infrastructure and end users, technologies for identifying such propagating malware are in great demand. In a large-scale darknet monitoring operation, we can see that malware exhibits various kinds of scan patterns in choosing destination IP addresses. Since many of those patterns seemed to oscillate with a natural periodicity, as if they were signal waveforms, we considered applying a spectrum analysis methodology to extract malware features. With a focus on such scan patterns, this paper proposes a novel concept of malware feature extraction and a distinct analysis method named "SPectrum Analysis for Distinction and Extraction of malware features (SPADE)". Through several evaluations using real scan traffic, we show that SPADE has the significant advantage of recognizing the similarities and dissimilarities between the same and different types of malware.
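The idea of treating a scan pattern as a signal waveform can be sketched with a plain discrete Fourier transform. The example below assumes the "signal" is the sequence of numeric destination address values in arrival order and simply reports the dominant period. The actual SPADE feature definition is not given in the abstract, so the function names and the toy sequence are hypothetical.

```python
import cmath


def power_spectrum(samples):
    """Power of DFT bins 0..N/2 after removing the mean (DC) component."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def dominant_period(samples):
    """Period (in samples) of the strongest non-DC spectral component."""
    spectrum = power_spectrum(samples)
    k = max(range(1, len(spectrum)), key=lambda i: spectrum[i])
    return len(samples) / k


# Hypothetical scanner cycling through four destination addresses.
destinations = [10, 20, 30, 40] * 4
period = dominant_period(destinations)   # 4.0
```

Two scanners using the same address-selection routine would be expected to show similar spectra, which is the intuition behind comparing malware by such features.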
Jianliang GAO Yinhe HAN Xiaowei LI
Bugs are becoming unavoidable in complex integrated circuit design. It is imperative to identify the bugs as soon as possible through post-silicon debug. For post-silicon debug, observability is one of the biggest challenges. The scan-based debug mechanism provides high observability by reusing scan chains. However, it is not feasible to scan dump cycle-by-cycle during program execution due to the excessive time required. In fact, it is not necessary to scan out the error-free states. In this paper, we introduce the Suspect Window to cover the clock cycle in which the bug is triggered. Then, we present an efficient approach to determine the suspect window. Based on the Suspect Window, we propose a novel debug mechanism to locate the bug both temporally and spatially. Since scan dumps are only taken in the suspect window with the proposed mechanism, the time required for locating the bug is greatly reduced. The approaches are evaluated using ISCAS'89 and ITC'99 benchmark circuits. The experimental results show that the proposed mechanism can significantly reduce the overall debug time compared to the scan-based debug mechanism while keeping high observability.