Cepstral Statistics Compensation and Normalization Using Online Pseudo Stereo Codebooks for Robust Speech Recognition in Additive Noise Environments

Jeih-weih HUNG

doi:10.1093/ietisy/e91-d.2.296

IEICE TRANSACTIONS on Information

Summary:

This paper proposes several cepstral statistics compensation and normalization algorithms which alleviate the effect of additive noise on cepstral features for speech recognition. The algorithms are simple yet efficient noise reduction techniques that use online-constructed pseudo-stereo codebooks to evaluate the statistics in both clean and noisy environments. The process yields transformations for both clean speech cepstra and noise-corrupted speech cepstra, or for noise-corrupted speech cepstra only, so that the statistics of the transformed speech cepstra are similar for both environments. Experimental results show that these codebook-based algorithms can provide significant performance gains compared to results obtained by using conventional utterance-based normalization approaches. The proposed codebook-based cepstral mean and variance normalization (C-CMVN), linear least squares (LLS) and quadratic least squares (QLS) outperform utterance-based CMVN (U-CMVN) by 26.03%, 22.72% and 27.48%, respectively, in relative word error rate reduction for experiments conducted on Test Set A of the Aurora-2 digit database.
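For orientation, the utterance-based baseline (U-CMVN) and the "relative word error rate reduction" metric named in the summary can be sketched as follows. This is a minimal illustration of the conventional technique only, not the paper's codebook-based algorithms, whose details the summary does not give. The function names `utterance_cmvn` and `relative_wer_reduction`, the `eps` guard, and the toy affine distortion are assumptions made for the example.

```python
import numpy as np


def utterance_cmvn(cepstra, eps=1e-8):
    """Conventional utterance-based CMVN (assumed textbook form).

    cepstra: array of shape (num_frames, num_coeffs) for ONE utterance.
    Each coefficient track is shifted to zero mean and scaled to unit
    variance using statistics estimated from that utterance alone.
    """
    cepstra = np.asarray(cepstra, dtype=float)
    mean = cepstra.mean(axis=0)
    std = cepstra.std(axis=0)
    # eps is a hypothetical guard against division by zero
    return (cepstra - mean) / (std + eps)


def relative_wer_reduction(baseline_wer, new_wer):
    """Relative word error rate reduction over a baseline, in percent."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.normal(size=(200, 13))  # synthetic stand-in for cepstra
    # Toy distortion: a per-utterance scale and shift. Real additive noise
    # affects cepstra nonlinearly; this only illustrates why matching the
    # first two moments can bring the two conditions closer together.
    distorted = 0.5 * clean + 3.0
    gap = np.abs(utterance_cmvn(clean) - utterance_cmvn(distorted)).max()
    print("max difference after CMVN:", gap)
    print("relative reduction (%):", relative_wer_reduction(20.0, 15.0))
```

The error rates passed to `relative_wer_reduction` here are made-up inputs, not results from the paper. The reported percentages (26.03%, 22.72%, 27.48%) are figures of this relative kind, measured against U-CMVN.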

Publication: IEICE TRANSACTIONS on Information Vol.E91-D No.2 pp.296-311

Publication Date: 2008/02/01

Online ISSN: 1745-1361

DOI: 10.1093/ietisy/e91-d.2.296

Type of Manuscript: PAPER

Category: Speech and Hearing

Cite this

Jeih-weih HUNG, "Cepstral Statistics Compensation and Normalization Using Online Pseudo Stereo Codebooks for Robust Speech Recognition in Additive Noise Environments" in IEICE TRANSACTIONS on Information, vol. E91-D, no. 2, pp. 296-311, February 2008, doi: 10.1093/ietisy/e91-d.2.296.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e91-d.2.296/_p

@ARTICLE{e91-d_2_296,
author={Jeih-weih HUNG},
journal={IEICE TRANSACTIONS on Information},
title={Cepstral Statistics Compensation and Normalization Using Online Pseudo Stereo Codebooks for Robust Speech Recognition in Additive Noise Environments},
year={2008},
volume={E91-D},
number={2},
pages={296-311},
abstract={This paper proposes several cepstral statistics compensation and normalization algorithms which alleviate the effect of additive noise on cepstral features for speech recognition. The algorithms are simple yet efficient noise reduction techniques that use online-constructed pseudo-stereo codebooks to evaluate the statistics in both clean and noisy environments. The process yields transformations for both clean speech cepstra and noise-corrupted speech cepstra, or for noise-corrupted speech cepstra only, so that the statistics of the transformed speech cepstra are similar for both environments. Experimental results show that these codebook-based algorithms can provide significant performance gains compared to results obtained by using conventional utterance-based normalization approaches. The proposed codebook-based cepstral mean and variance normalization (C-CMVN), linear least squares (LLS) and quadratic least squares (QLS) outperform utterance-based CMVN (U-CMVN) by 26.03%, 22.72% and 27.48%, respectively, in relative word error rate reduction for experiments conducted on Test Set A of the Aurora-2 digit database.},
keywords={},
doi={10.1093/ietisy/e91-d.2.296},
ISSN={1745-1361},
month={February},
}

TY - JOUR
TI - Cepstral Statistics Compensation and Normalization Using Online Pseudo Stereo Codebooks for Robust Speech Recognition in Additive Noise Environments
T2 - IEICE TRANSACTIONS on Information
SP - 296
EP - 311
AU - Jeih-weih HUNG
PY - 2008
DO - 10.1093/ietisy/e91-d.2.296
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E91-D
IS - 2
JA - IEICE TRANSACTIONS on Information
Y1 - February 2008
AB - This paper proposes several cepstral statistics compensation and normalization algorithms which alleviate the effect of additive noise on cepstral features for speech recognition. The algorithms are simple yet efficient noise reduction techniques that use online-constructed pseudo-stereo codebooks to evaluate the statistics in both clean and noisy environments. The process yields transformations for both clean speech cepstra and noise-corrupted speech cepstra, or for noise-corrupted speech cepstra only, so that the statistics of the transformed speech cepstra are similar for both environments. Experimental results show that these codebook-based algorithms can provide significant performance gains compared to results obtained by using conventional utterance-based normalization approaches. The proposed codebook-based cepstral mean and variance normalization (C-CMVN), linear least squares (LLS) and quadratic least squares (QLS) outperform utterance-based CMVN (U-CMVN) by 26.03%, 22.72% and 27.48%, respectively, in relative word error rate reduction for experiments conducted on Test Set A of the Aurora-2 digit database.
ER -
