IEICE global.ieice.org Site

Author Search Result

[Author] Youngjoo SUH(4hit)

Noise Robust Speaker Identification Using Sub-Band Weighting in Multi-Band Approach
Sungtak KIM Mikyong JI Youngjoo SUH Hoirin KIM

LETTER-Speech and Hearing

Vol: E90-D No: 12 Page(s): 2110-2114
Recently, many techniques have been proposed to improve speaker identification in noisy environments. Among these techniques, we consider the feature recombination technique for the multi-band approach to noise-robust speaker identification. The conventional feature recombination technique is very effective under band-limited noise, but under broad-band noise it does not provide a notable performance improvement over the full-band system. Even when the speech is corrupted by broad-band noise, the degree of noise corruption differs from sub-band to sub-band. In the conventional feature recombination for speaker identification, all sub-band features are used to compute the multi-band likelihood score, but this likelihood computation does not exploit the advantage of the multi-band approach effectively, even though the sub-band features are extracted independently. Here we propose a new technique of sub-band likelihood computation with sub-band weighting in the feature recombination method. The signal-to-noise ratio (SNR) is used to compute the sub-band weights. The proposed sub-band-weighted likelihood computation makes a speaker identification system more robust to noise. Experimental results show that the average error reduction rate (ERR) in various noise environments is more than 24% compared with the conventional feature recombination-based speaker identification system.
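
The weighting idea in the abstract above can be sketched as follows. This is a minimal illustration, not the letter's actual formulation: the abstract states only that SNR is used to compute the sub-band weights, so the mapping from SNR to weight (normalized linear-scale SNR), the function names, and the data layout are all assumptions.

```python
def subband_weights(snr_db):
    """Map per-band SNRs in dB to non-negative weights that sum to 1.

    Assumption: weights are proportional to linear-scale SNR. The letter
    may use a different mapping.
    """
    linear = [10.0 ** (s / 10.0) for s in snr_db]
    total = sum(linear)
    return [x / total for x in linear]


def weighted_log_likelihood(subband_loglik, weights):
    """Combine per-band log-likelihoods, trusting cleaner bands more."""
    return sum(w * ll for w, ll in zip(weights, subband_loglik))


def identify(speaker_logliks, snr_db):
    """Return the speaker whose weighted multi-band score is highest.

    speaker_logliks maps a speaker name to a list of per-band
    log-likelihoods (hypothetical layout, one entry per sub-band).
    """
    weights = subband_weights(snr_db)
    scores = {
        name: weighted_log_likelihood(ll, weights)
        for name, ll in speaker_logliks.items()
    }
    return max(scores, key=scores.get)
```

With equal weights every band contributes the same amount to the score, which corresponds to the conventional recombination described above. With SNR-derived weights, a heavily corrupted band has little influence on the decision.
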
Cepstral Domain Feature Extraction Utilizing Entropic Distance-Based Filterbank
Youngjoo SUH Hoirin KIM

LETTER-Speech and Hearing

Vol: E93-D No: 2 Page(s): 392-394
The selection of effective features is especially important in achieving highly accurate speech recognition. Although the mel-cepstrum is a popular and effective feature for speech recognition, it is still unclear whether the filterbank adopted in the mel-cepstrum always produces the optimal performance regardless of the phonetic environment of a specific speech recognition task. In this paper, we propose a new cepstral domain feature extraction approach utilizing the entropic distance-based filterbank for highly accurate speech recognition. Experimental results showed that the cepstral features employing the proposed filterbank reduce the relative error by 31% for clean as well as noisy speech compared to the mel-cepstral features.
Histogram Equalization Utilizing Window-Based Smoothed CDF Estimation for Feature Compensation
Youngjoo SUH Hoirin KIM Munchurl KIM

LETTER-Speech and Hearing

Vol: E91-D No: 8 Page(s): 2199-2202
In this letter, we propose a new histogram equalization method to compensate for acoustic mismatches in speech recognition caused mainly by additive noise and channel distortion. The proposed method improves the test cumulative distribution function (CDF) by smoothing the conventional order statistics-based test CDF more accurately with window functions, yielding more robust feature compensation. Experiments on the AURORA 2 framework confirmed that the proposed method compensates speech recognition features effectively, reducing the averaged relative error by 13.12% over the conventional order statistics-based histogram equalization method and by 58.02% over the mel-cepstral-based features for the three test sets.
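
For orientation, the conventional order statistics-based baseline mentioned in the abstract above can be sketched roughly as below. This covers only the baseline, not the proposed window-based smoothing, whose details the abstract does not give. The CDF estimate (rank - 0.5) / N and the standard normal reference distribution are common choices and are assumptions here; the function name is hypothetical.

```python
from statistics import NormalDist


def order_statistic_heq(values):
    """Equalize one feature dimension of an utterance.

    Each value is replaced by the reference-distribution quantile at its
    empirical CDF position, estimated from its rank among the test values.
    Assumptions: CDF estimate (rank - 0.5) / N, standard normal reference.
    """
    n = len(values)
    # Indices of the values in ascending order.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    reference = NormalDist()  # mean 0, standard deviation 1
    return [reference.inv_cdf((r - 0.5) / n) for r in ranks]
```

Because the mapping depends only on ranks, any monotonic distortion of the input leaves the equalized features unchanged. The rank-based CDF is also a step function, which gives a coarse estimate when the utterance is short; the smoothing described in the abstract targets this test CDF.
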
Soft Counting Poisson Mixture Model-Based Polling Method for Speech/Nonspeech Classification
Youngjoo SUH Hoirin KIM Minsoo HAHN Yongju LEE

LETTER-Speech and Hearing

Vol: E89-D No: 12 Page(s): 2994-2997
In this letter, a new segment-level speech/nonspeech classification method based on the Poisson polling technique is proposed. The proposed method makes two modifications to the baseline Poisson polling method to further improve the classification accuracy. One is to employ Poisson mixture models to represent more accurately the various segmental patterns of the observed frequencies for frame-level input features. The other is soft counting-based frequency estimation, which improves the reliability of the observed frequencies. The effectiveness of the proposed method is confirmed by experimental results showing a maximum error reduction of 39% compared to the segmentally accumulated log-likelihood ratio-based method.
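
The two modifications in the abstract above might look roughly like the sketch below. Everything in it is an assumption made for illustration: that soft counts are sums of per-frame posterior probabilities over a set of bins, that fractional counts are scored with the gamma-function extension of the Poisson probability, that bins are independent within a mixture component, and all names and the model layout. The letter's actual formulation may differ.

```python
import math


def soft_counts(frame_posteriors):
    """Accumulate per-frame bin posteriors into fractional frequencies.

    Hard counting would add 1 to the single best bin of each frame;
    here every bin receives its posterior probability instead.
    """
    counts = [0.0] * len(frame_posteriors[0])
    for posterior in frame_posteriors:
        for k, p in enumerate(posterior):
            counts[k] += p
    return counts


def poisson_log_pmf(count, rate):
    """Log Poisson probability, extended to fractional counts via lgamma."""
    return count * math.log(rate) - rate - math.lgamma(count + 1.0)


def poisson_mixture_loglik(counts, mixture):
    """Log-likelihood of the counts under a mixture of Poisson components.

    mixture is a list of (weight, rates) pairs with one rate per bin.
    """
    terms = [
        math.log(weight)
        + sum(poisson_log_pmf(c, r) for c, r in zip(counts, rates))
        for weight, rates in mixture
    ]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def classify(frame_posteriors, models):
    """Return the class whose Poisson mixture best explains the segment."""
    counts = soft_counts(frame_posteriors)
    return max(
        models,
        key=lambda name: poisson_mixture_loglik(counts, models[name]),
    )
```

A single-component mixture with hard (0 or 1) posteriors reduces this to scoring integer counts with one Poisson model per class.
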
