Feature Compensation Employing Multiple Environmental Models for Robust In-Vehicle Speech Recognition

Wooil KIM; John H.L. HANSEN

doi:10.1093/ietisy/e91-d.3.430

IEICE TRANSACTIONS on Information

Feature Compensation Employing Multiple Environmental Models for Robust In-Vehicle Speech Recognition

Wooil KIM, John H.L. HANSEN

Full Text Views

0

Cite this

Summary :

An effective feature compensation method is developed for reliable speech recognition in real-life in-vehicle environments. The CU-Move corpus, used for evaluation, contains a range of speech and noise signals collected for a number of speakers under actual driving conditions. PCGMM-based feature compensation, considered in this paper, utilizes parallel model combination to generate noise-corrupted speech model by combining clean speech and the noise model. In order to address unknown time-varying background noise, an interpolation method of multiple environmental models is employed. To alleviate computational expenses due to multiple models, an Environment Transition Model is employed, which is motivated from Noise Language Model used in Environmental Sniffing. An environment dependent scheme of mixture sharing technique is proposed and shown to be more effective in reducing the computational complexity. A smaller environmental model set is determined by the environment transition model for mixture sharing. The proposed scheme is evaluated on the connected single digits portion of the CU-Move database using the Aurora2 evaluation toolkit. Experimental results indicate that our feature compensation method is effective for improving speech recognition in real-life in-vehicle conditions. A reduction of 73.10% of the computational requirements was obtained by employing the environment dependent mixture sharing scheme with only a slight change in recognition performance. This demonstrates that the proposed method is effective in maintaining the distinctive characteristics among the different environmental models, even when selecting a large number of Gaussian components for mixture sharing.

Publication: IEICE TRANSACTIONS on Information Vol.E91-D No.3 pp.430-438

Publication Date: 2008/03/01

Publicized

Online ISSN: 1745-1361

DOI: 10.1093/ietisy/e91-d.3.430

Type of Manuscript: Special Section PAPER (Special Section on Robust Speech Processing in Realistic Environments)

Category: Noisy Speech Recognition

Cite this

Copy

Wooil KIM, John H.L. HANSEN, "Feature Compensation Employing Multiple Environmental Models for Robust In-Vehicle Speech Recognition" in IEICE TRANSACTIONS on Information, vol. E91-D, no. 3, pp. 430-438, March 2008, doi: 10.1093/ietisy/e91-d.3.430.
Abstract: An effective feature compensation method is developed for reliable speech recognition in real-life in-vehicle environments. The CU-Move corpus, used for evaluation, contains a range of speech and noise signals collected for a number of speakers under actual driving conditions. PCGMM-based feature compensation, considered in this paper, utilizes parallel model combination to generate noise-corrupted speech model by combining clean speech and the noise model. In order to address unknown time-varying background noise, an interpolation method of multiple environmental models is employed. To alleviate computational expenses due to multiple models, an Environment Transition Model is employed, which is motivated from Noise Language Model used in Environmental Sniffing. An environment dependent scheme of mixture sharing technique is proposed and shown to be more effective in reducing the computational complexity. A smaller environmental model set is determined by the environment transition model for mixture sharing. The proposed scheme is evaluated on the connected single digits portion of the CU-Move database using the Aurora2 evaluation toolkit. Experimental results indicate that our feature compensation method is effective for improving speech recognition in real-life in-vehicle conditions. A reduction of 73.10% of the computational requirements was obtained by employing the environment dependent mixture sharing scheme with only a slight change in recognition performance. This demonstrates that the proposed method is effective in maintaining the distinctive characteristics among the different environmental models, even when selecting a large number of Gaussian components for mixture sharing.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e91-d.3.430/_p

Copy

@ARTICLE{e91-d_3_430,
author={Wooil KIM, John H.L. HANSEN, },
journal={IEICE TRANSACTIONS on Information},
title={Feature Compensation Employing Multiple Environmental Models for Robust In-Vehicle Speech Recognition},
year={2008},
volume={E91-D},
number={3},
pages={430-438},
abstract={An effective feature compensation method is developed for reliable speech recognition in real-life in-vehicle environments. The CU-Move corpus, used for evaluation, contains a range of speech and noise signals collected for a number of speakers under actual driving conditions. PCGMM-based feature compensation, considered in this paper, utilizes parallel model combination to generate noise-corrupted speech model by combining clean speech and the noise model. In order to address unknown time-varying background noise, an interpolation method of multiple environmental models is employed. To alleviate computational expenses due to multiple models, an Environment Transition Model is employed, which is motivated from Noise Language Model used in Environmental Sniffing. An environment dependent scheme of mixture sharing technique is proposed and shown to be more effective in reducing the computational complexity. A smaller environmental model set is determined by the environment transition model for mixture sharing. The proposed scheme is evaluated on the connected single digits portion of the CU-Move database using the Aurora2 evaluation toolkit. Experimental results indicate that our feature compensation method is effective for improving speech recognition in real-life in-vehicle conditions. A reduction of 73.10% of the computational requirements was obtained by employing the environment dependent mixture sharing scheme with only a slight change in recognition performance. This demonstrates that the proposed method is effective in maintaining the distinctive characteristics among the different environmental models, even when selecting a large number of Gaussian components for mixture sharing.},
keywords={},
doi={10.1093/ietisy/e91-d.3.430},
ISSN={1745-1361},
month={March},}

Copy

TY - JOUR
TI - Feature Compensation Employing Multiple Environmental Models for Robust In-Vehicle Speech Recognition
T2 - IEICE TRANSACTIONS on Information
SP - 430
EP - 438
AU - Wooil KIM
AU - John H.L. HANSEN
PY - 2008
DO - 10.1093/ietisy/e91-d.3.430
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E91-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2008
AB - An effective feature compensation method is developed for reliable speech recognition in real-life in-vehicle environments. The CU-Move corpus, used for evaluation, contains a range of speech and noise signals collected for a number of speakers under actual driving conditions. PCGMM-based feature compensation, considered in this paper, utilizes parallel model combination to generate noise-corrupted speech model by combining clean speech and the noise model. In order to address unknown time-varying background noise, an interpolation method of multiple environmental models is employed. To alleviate computational expenses due to multiple models, an Environment Transition Model is employed, which is motivated from Noise Language Model used in Environmental Sniffing. An environment dependent scheme of mixture sharing technique is proposed and shown to be more effective in reducing the computational complexity. A smaller environmental model set is determined by the environment transition model for mixture sharing. The proposed scheme is evaluated on the connected single digits portion of the CU-Move database using the Aurora2 evaluation toolkit. Experimental results indicate that our feature compensation method is effective for improving speech recognition in real-life in-vehicle conditions. A reduction of 73.10% of the computational requirements was obtained by employing the environment dependent mixture sharing scheme with only a slight change in recognition performance. This demonstrates that the proposed method is effective in maintaining the distinctive characteristics among the different environmental models, even when selecting a large number of Gaussian components for mixture sharing.
ER -

IEICE TRANSACTIONS on Information

Feature Compensation Employing Multiple Environmental Models for Robust In-Vehicle Speech Recognition

Summary :

Authors

Keyword

Latest Issue

Contents

Links

Call for Papers

Submit to IEICE Trans.

Transactions NEWS

Popular articles

IEICE TRANSACTIONS on Information

Feature Compensation Employing Multiple Environmental Models for Robust In-Vehicle Speech Recognition

Summary :

Authors

Keyword

Latest Issue

Contents

Copyrights notice of machine-translated contents

Cite this

Links

Call for Papers

Submit to IEICE Trans.

Transactions NEWS

Popular articles