IEICE TRANSACTIONS on Information

A Study on Acoustic Modeling of Pauses for Recognizing Noisy Conversational Speech

Jin-Song ZHANG, Konstantin MARKOV, Tomoko MATSUI, Satoshi NAKAMURA

Summary:

This paper presents a study on modeling inter-word pauses to improve the robustness of acoustic models for recognizing noisy conversational speech. When precise contextual modeling is used for pauses, their frequent occurrence and varying acoustics in noisy conversational speech make it difficult to automatically generate an accurate phonetic transcription of the training data for developing robust acoustic models. The paper proposes exploiting reliable phonetic heuristics about pauses in speech to aid the detection of these varying pauses. Based on this proposal, a stepwise approach to optimizing pause HMMs was applied to the data of the DARPA SPINE2 project, yielding a more accurate phonetic transcription. The cross-word triphone HMMs developed using this method achieved an absolute word error reduction of 9.2% compared to the conventional method, which uses only context-free modeling of pauses. For the same pause modeling method, the use of the optimized phonetic segmentation brought about an absolute improvement of 5.2%.

Publication
IEICE TRANSACTIONS on Information Vol.E86-D No.3 pp.489-496
Publication Date
2003/03/01
Type of Manuscript
Special Section PAPER (Special Issue on Speech Information Processing)
Category
Robust Speech Recognition and Enhancement