Dynamic Bayesian Network Inversion for Robust Speech Recognition

Lei XIE; Hongwu YANG

doi:10.1093/ietisy/e90-d.7.1117


Summary :

This paper presents DBNI, an inversion algorithm for dynamic Bayesian networks (DBNs) aimed at robust speech recognition. DBNI generalizes hidden Markov model inversion (HMMI). As the dual procedure of expectation maximization (EM)-based model reestimation, DBNI recovers the 'uncontaminated' speech by moving the noisy input speech toward the Gaussian means in the maximum likelihood (ML) sense, given DBN models trained on clean speech. The algorithm thereby combines the expressive power of DBNs with the noise-removal capability of model inversion. Experiments on the Aurora 2.0 database show that the hidden feature model (a typical DBN for speech recognition) with the DBNI algorithm achieves superior performance in terms of word error rate reduction.
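The inversion idea in the summary can be illustrated with a much simpler stand-in for a DBN. The sketch below is not the authors' algorithm: it assumes a single feature frame and a fixed diagonal-covariance Gaussian mixture standing in for a model trained on clean speech, and it applies an EM-style update to the input rather than to the model parameters. The function names (`responsibilities`, `invert_frame`, `log_likelihood`) and the toy parameters are hypothetical. In an HMM or DBN, the component posteriors would instead come from inference over the hidden variables across all frames.

```python
import math


def _log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def _component_logs(x, weights, means, variances):
    """Log of weight times density for every mixture component."""
    return [
        math.log(w) + _log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]


def log_likelihood(x, weights, means, variances):
    """Log-likelihood of x under the mixture (log-sum-exp for stability)."""
    logs = _component_logs(x, weights, means, variances)
    peak = max(logs)
    return peak + math.log(sum(math.exp(l - peak) for l in logs))


def responsibilities(x, weights, means, variances):
    """E-step: posterior probability of each component given the current x."""
    logs = _component_logs(x, weights, means, variances)
    peak = max(logs)
    unnorm = [math.exp(l - peak) for l in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def invert_frame(x, weights, means, variances, n_iter=10):
    """Move x toward the Gaussian means while the model stays fixed.

    Each iteration maximizes the EM auxiliary function with respect to the
    input, which gives a precision-weighted average of the component means.
    """
    x = list(x)
    for _ in range(n_iter):
        gamma = responsibilities(x, weights, means, variances)
        updated = []
        for d in range(len(x)):
            num = sum(g * m[d] / v[d] for g, m, v in zip(gamma, means, variances))
            den = sum(g / v[d] for g, v in zip(gamma, variances))
            updated.append(num / den)
        x = updated
    return x


# Toy usage: two "clean speech" components and one noisy 2-D frame.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [4.0, 4.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
noisy = [1.5, 0.5]
restored = invert_frame(noisy, weights, means, variances, n_iter=5)
```

Because each update is an EM step with the roles of data and parameters swapped, the likelihood of the restored frame under the fixed model never decreases from one iteration to the next, which is the sense in which the input is "moved to the Gaussian means" here.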

Publication: IEICE TRANSACTIONS on Information Vol.E90-D No.7 pp.1117-1120

Publication Date: 2007/07/01


Online ISSN: 1745-1361

DOI: 10.1093/ietisy/e90-d.7.1117

Type of Manuscript: LETTER

Category: Speech and Hearing


Lei XIE, Hongwu YANG, "Dynamic Bayesian Network Inversion for Robust Speech Recognition" in IEICE TRANSACTIONS on Information, vol. E90-D, no. 7, pp. 1117-1120, July 2007, doi: 10.1093/ietisy/e90-d.7.1117.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e90-d.7.1117/_p


@ARTICLE{e90-d_7_1117,
author={Xie, Lei and Yang, Hongwu},
journal={IEICE TRANSACTIONS on Information},
title={Dynamic Bayesian Network Inversion for Robust Speech Recognition},
year={2007},
volume={E90-D},
number={7},
pages={1117-1120},
abstract={This paper presents an inversion algorithm for dynamic Bayesian networks towards robust speech recognition, namely DBNI, which is a generalization of hidden Markov model inversion (HMMI). As a dual procedure of expectation maximization (EM)-based model reestimation, DBNI finds the 'uncontaminated' speech by moving the input noisy speech to the Gaussian means under the maximum likelihood (ML) sense given the DBN models trained on clean speech. This algorithm can provide both the expressive advantage from DBN and the noise-removal feature from model inversion. Experiments on the Aurora 2.0 database show that the hidden feature model (a typical DBN for speech recognition) with the DBNI algorithm achieves superior performance in terms of word error rate reduction.},
doi={10.1093/ietisy/e90-d.7.1117},
ISSN={1745-1361},
month={July}
}


TY - JOUR
TI - Dynamic Bayesian Network Inversion for Robust Speech Recognition
T2 - IEICE TRANSACTIONS on Information
SP - 1117
EP - 1120
AU - Lei XIE
AU - Hongwu YANG
PY - 2007
DO - 10.1093/ietisy/e90-d.7.1117
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E90-D
IS - 7
JA - IEICE TRANSACTIONS on Information
Y1 - July 2007
AB - This paper presents an inversion algorithm for dynamic Bayesian networks towards robust speech recognition, namely DBNI, which is a generalization of hidden Markov model inversion (HMMI). As a dual procedure of expectation maximization (EM)-based model reestimation, DBNI finds the 'uncontaminated' speech by moving the input noisy speech to the Gaussian means under the maximum likelihood (ML) sense given the DBN models trained on clean speech. This algorithm can provide both the expressive advantage from DBN and the noise-removal feature from model inversion. Experiments on the Aurora 2.0 database show that the hidden feature model (a typical DBN for speech recognition) with the DBNI algorithm achieves superior performance in terms of word error rate reduction.
ER -