In this paper, we address issues in improving hands-free speech recognition performance in different car environments using multiple spatially distributed microphones. In our previous work, we proposed the multiple linear regression of the log spectra (MRLS) for estimating the log spectra of speech at a close-talking microphone. Here, the concept is extended to nonlinear regressions, and regressions in the cepstrum domain are also investigated. An effective algorithm is developed to adapt the regression weights automatically to different noise environments. Compared with the nearest distant microphone and an adaptive beamformer (Generalized Sidelobe Canceller), the proposed adaptive nonlinear regression approach achieves average relative word error rate (WER) reductions of 58.5% and 10.3%, respectively, for isolated word recognition in 15 real car environments.
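The linear regression idea the abstract builds on (estimating the close-talking log spectrum from the log spectra at several distant microphones) can be illustrated with a minimal sketch. This is not the authors' implementation: the per-frequency-bin weights, the bias term, the ordinary least-squares fit, the function names, and the synthetic data are all assumptions made for illustration. The nonlinear and adaptive extensions described in the abstract are not covered.

```python
import numpy as np


def fit_log_spectral_regression(distant, close):
    """Fit one linear regression per frequency bin by ordinary least squares.

    distant: (n_frames, n_mics, n_bins) log spectra at the distant microphones
    close:   (n_frames, n_bins) log spectra at the close-talking microphone
    Returns weights of shape (n_bins, n_mics + 1); the last column is a bias.
    (Hypothetical helper; shapes and the bias term are assumptions.)
    """
    n_frames, n_mics, n_bins = distant.shape
    weights = np.empty((n_bins, n_mics + 1))
    ones = np.ones((n_frames, 1))
    for k in range(n_bins):
        # Design matrix for bin k: one column per microphone plus a constant.
        design = np.hstack([distant[:, :, k], ones])
        weights[k], *_ = np.linalg.lstsq(design, close[:, k], rcond=None)
    return weights


def predict_log_spectra(distant, weights):
    """Estimate close-talking log spectra as a weighted sum over microphones."""
    return np.einsum("fmk,km->fk", distant, weights[:, :-1]) + weights[:, -1]


# Synthetic, noise-free check: data generated from known weights should be
# recovered by the fit. Real log spectra would come from recorded speech.
rng = np.random.default_rng(0)
n_frames, n_mics, n_bins = 200, 3, 4
distant = rng.normal(size=(n_frames, n_mics, n_bins))
true_weights = rng.normal(size=(n_bins, n_mics + 1))
close = predict_log_spectra(distant, true_weights)

weights = fit_log_spectral_regression(distant, close)
print("max weight error:", np.max(np.abs(weights - true_weights)))
```

In practice the training targets would be log spectra recorded simultaneously at a close-talking microphone, and the fitted weights would then be applied to distant-microphone input at recognition time.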
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Weifeng LI, Chiyomi MIYAJIMA, Takanori NISHINO, Katsunobu ITOU, Kazuya TAKEDA, Fumitada ITAKURA, "Adaptive Nonlinear Regression Using Multiple Distributed Microphones for In-Car Speech Recognition," in IEICE TRANSACTIONS on Fundamentals, vol. E88-A, no. 7, pp. 1716-1723, July 2005, doi: 10.1093/ietfec/e88-a.7.1716.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1093/ietfec/e88-a.7.1716/_p
@ARTICLE{e88-a_7_1716,
author={Weifeng LI and Chiyomi MIYAJIMA and Takanori NISHINO and Katsunobu ITOU and Kazuya TAKEDA and Fumitada ITAKURA},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Adaptive Nonlinear Regression Using Multiple Distributed Microphones for In-Car Speech Recognition},
year={2005},
volume={E88-A},
number={7},
pages={1716--1723},
abstract={In this paper, we address issues in improving hands-free speech recognition performance in different car environments using multiple spatially distributed microphones. In the previous work, we proposed the multiple linear regression of the log spectra (MRLS) for estimating the log spectra of speech at a close-talking microphone. In this paper, the concept is extended to nonlinear regressions. Regressions in the cepstrum domain are also investigated. An effective algorithm is developed to adapt the regression weights automatically to different noise environments. Compared to the nearest distant microphone and adaptive beamformer (Generalized Sidelobe Canceller), the proposed adaptive nonlinear regression approach shows an advantage in the average relative word error rate (WER) reductions of 58.5% and 10.3%, respectively, for isolated word recognition under 15 real car environments.},
keywords={},
doi={10.1093/ietfec/e88-a.7.1716},
ISSN={},
month={July},}
TY - JOUR
TI - Adaptive Nonlinear Regression Using Multiple Distributed Microphones for In-Car Speech Recognition
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 1716
EP - 1723
AU - Weifeng LI
AU - Chiyomi MIYAJIMA
AU - Takanori NISHINO
AU - Katsunobu ITOU
AU - Kazuya TAKEDA
AU - Fumitada ITAKURA
PY - 2005
DO - 10.1093/ietfec/e88-a.7.1716
JO - IEICE TRANSACTIONS on Fundamentals
SN -
VL - E88-A
IS - 7
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - July 2005
AB - In this paper, we address issues in improving hands-free speech recognition performance in different car environments using multiple spatially distributed microphones. In the previous work, we proposed the multiple linear regression of the log spectra (MRLS) for estimating the log spectra of speech at a close-talking microphone. In this paper, the concept is extended to nonlinear regressions. Regressions in the cepstrum domain are also investigated. An effective algorithm is developed to adapt the regression weights automatically to different noise environments. Compared to the nearest distant microphone and adaptive beamformer (Generalized Sidelobe Canceller), the proposed adaptive nonlinear regression approach shows an advantage in the average relative word error rate (WER) reductions of 58.5% and 10.3%, respectively, for isolated word recognition under 15 real car environments.
ER -