Acoustic Model Adaptation Using First-Order Linear Prediction for Reverberant Speech

Tetsuya TAKIGUCHI; Masafumi NISHIMURA; Yasuo ARIKI

doi:10.1093/ietisy/e89-d.3.908

Summary:

This paper describes a hands-free speech recognition technique based on acoustic model adaptation to reverberant speech. In hands-free speech recognition, recognition accuracy is degraded by reverberation, since each segment of speech is affected by the reflection energy of the preceding segment. To compensate for the reflection signal, we introduce a frame-by-frame adaptation method that adds the reflection signal to the means of the acoustic model. The reflection signal is approximated by a first-order linear prediction from the observation signal at the preceding frame, and the linear prediction coefficient is estimated by maximum likelihood using the EM algorithm, which maximizes the likelihood of the adaptation data. The effectiveness of the method is confirmed by word recognition experiments on reverberant speech.
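
The adaptation described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a single scalar prediction coefficient `a`, a diagonal-covariance Gaussian mixture standing in for the full acoustic model, and feature vectors in a domain where the reflection is additive. The function names (`adapted_mean`, `estimate_coefficient`) and the closed-form update are illustrative assumptions derived from these simplifications.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def adapted_mean(mean, prev_obs, a):
    """Shift a model mean by the predicted reflection a * o[t-1]."""
    return [m + a * p for m, p in zip(mean, prev_obs)]

def estimate_coefficient(obs, weights, means, variances, n_iter=10):
    """EM estimate of a scalar coefficient a (assumed form, diagonal GMM)."""
    a = 0.0
    for _ in range(n_iter):
        num = den = 0.0
        for t in range(1, len(obs)):
            prev, cur = obs[t - 1], obs[t]
            # E-step: component posteriors under the currently adapted means.
            logp = [math.log(w) + log_gauss(cur, adapted_mean(m, prev, a), v)
                    for w, m, v in zip(weights, means, variances)]
            top = max(logp)
            post = [math.exp(lp - top) for lp in logp]
            total = sum(post)
            # M-step statistics: posterior-weighted least squares for a.
            for g, m, v in zip(post, means, variances):
                g /= total
                for c, p, mi, vi in zip(cur, prev, m, v):
                    num += g * (c - mi) * p / vi
                    den += g * p * p / vi
        a = num / den if den > 0 else 0.0
    return a
```

Under these assumptions, each EM iteration recomputes the component posteriors with the means shifted by `a * o[t-1]`, then updates `a` by a posterior-weighted least-squares fit of the residual `o[t] - mean` against the preceding frame. At recognition time, `adapted_mean` would be applied to every model mean at every frame.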

Publication: IEICE TRANSACTIONS on Information Vol.E89-D No.3 pp.908-914

Publication Date: 2006/03/01

Online ISSN: 1745-1361

DOI: 10.1093/ietisy/e89-d.3.908

Type of Manuscript: Special Section PAPER (Special Section on Statistical Modeling for Speech Processing)

Category: Speech Recognition

Tetsuya TAKIGUCHI, Masafumi NISHIMURA, Yasuo ARIKI, "Acoustic Model Adaptation Using First-Order Linear Prediction for Reverberant Speech" in IEICE TRANSACTIONS on Information, vol. E89-D, no. 3, pp. 908-914, March 2006, doi: 10.1093/ietisy/e89-d.3.908.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e89-d.3.908/_p

@ARTICLE{e89-d_3_908,
author={Tetsuya TAKIGUCHI and Masafumi NISHIMURA and Yasuo ARIKI},
journal={IEICE TRANSACTIONS on Information},
title={Acoustic Model Adaptation Using First-Order Linear Prediction for Reverberant Speech},
year={2006},
volume={E89-D},
number={3},
pages={908-914},
abstract={This paper describes a hands-free speech recognition technique based on acoustic model adaptation to reverberant speech. In hands-free speech recognition, the recognition accuracy is degraded by reverberation, since each segment of speech is affected by the reflection energy of the preceding segment. To compensate for the reflection signal we introduce a frame-by-frame adaptation method adding the reflection signal to the means of the acoustic model. The reflection signal is approximated by a first-order linear prediction from the observation signal at the preceding frame, and the linear prediction coefficient is estimated with a maximum likelihood method by using the EM algorithm, which maximizes the likelihood of the adaptation data. Its effectiveness is confirmed by word recognition experiments on reverberant speech.},
keywords={},
doi={10.1093/ietisy/e89-d.3.908},
ISSN={1745-1361},
month={March},}

TY - JOUR
TI - Acoustic Model Adaptation Using First-Order Linear Prediction for Reverberant Speech
T2 - IEICE TRANSACTIONS on Information
SP - 908
EP - 914
AU - Tetsuya TAKIGUCHI
AU - Masafumi NISHIMURA
AU - Yasuo ARIKI
PY - 2006
DO - 10.1093/ietisy/e89-d.3.908
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E89-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2006
AB - This paper describes a hands-free speech recognition technique based on acoustic model adaptation to reverberant speech. In hands-free speech recognition, the recognition accuracy is degraded by reverberation, since each segment of speech is affected by the reflection energy of the preceding segment. To compensate for the reflection signal we introduce a frame-by-frame adaptation method adding the reflection signal to the means of the acoustic model. The reflection signal is approximated by a first-order linear prediction from the observation signal at the preceding frame, and the linear prediction coefficient is estimated with a maximum likelihood method by using the EM algorithm, which maximizes the likelihood of the adaptation data. Its effectiveness is confirmed by word recognition experiments on reverberant speech.
ER -