IEICE TRANSACTIONS on Information

ATR Parallel Decoding Based Speech Recognition System Robust to Noise and Speaking Styles

Shigeki MATSUDA, Takatoshi JITSUHIRO, Konstantin MARKOV, Satoshi NAKAMURA

Summary:

In this paper, we describe a parallel decoding-based ASR system developed at ATR that is robust to noise type, SNR, and speaking style. Speech affected by such varied factors is difficult to recognize, especially when an ASR system contains only a single acoustic model. One solution is to employ multiple acoustic models, one for each condition. Even though the robustness of each individual acoustic model is limited, the ASR system as a whole can handle various conditions appropriately. Our system has two recognition sub-systems that use different features, such as MFCC and Differential MFCC (DMFCC). Each sub-system has several acoustic models that depend on SNR, speaker gender, and speaking style, and during recognition each acoustic model is adapted by fast noise adaptation. From each sub-system, one hypothesis is selected based on posterior probability. The final recognition result is obtained by combining the best hypotheses from the two sub-systems. On the AURORA-2J task, which is widely used to evaluate noise robustness, our system achieved higher recognition performance than a system that contains only a single model. Our system was also tested on normal and hyper-articulated speech contaminated by several background noises, and it exhibited high robustness to noise and speaking styles.
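
The selection-and-combination flow described in the summary can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the `Hypothesis` type, the function names, and the utterance-level `posterior` score are all hypothetical, and the summary does not say how the two best hypotheses are combined, so the sketch simply assumes that the one with the higher posterior is kept.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Hypothesis:
    """One decoding result from one acoustic model (hypothetical type)."""
    words: Tuple[str, ...]
    posterior: float  # assumed utterance-level posterior score in [0, 1]
    model: str        # label of the acoustic model that produced it


def select_best(hypotheses: List[Hypothesis]) -> Hypothesis:
    """Pick the hypothesis with the highest posterior within one sub-system."""
    return max(hypotheses, key=lambda h: h.posterior)


def combine(candidates: List[Hypothesis]) -> Hypothesis:
    """Combine the per-sub-system winners.

    Assumption: keep the candidate with the highest posterior. The actual
    combination rule is not given in the summary.
    """
    return max(candidates, key=lambda h: h.posterior)


def recognize(subsystem_outputs: Dict[str, List[Hypothesis]]) -> Hypothesis:
    """Run selection per sub-system, then combine the winners."""
    winners = [select_best(hyps) for hyps in subsystem_outputs.values()]
    return combine(winners)


if __name__ == "__main__":
    # Made-up scores and model labels, for illustration only.
    outputs = {
        "mfcc": [
            Hypothesis(("one", "two"), 0.62, "mfcc/clean/male"),
            Hypothesis(("one", "too"), 0.41, "mfcc/snr10/female"),
        ],
        "dmfcc": [
            Hypothesis(("one", "two", "three"), 0.71, "dmfcc/snr10/male"),
            Hypothesis(("won", "two"), 0.30, "dmfcc/clean/female"),
        ],
    }
    print(recognize(outputs))
```

In this sketch every acoustic model in a sub-system decodes the same utterance in parallel, so each sub-system contributes one list of competing hypotheses. Feature extraction, decoding, and the fast noise adaptation step are outside the scope of the sketch.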

Publication
IEICE TRANSACTIONS on Information Vol.E89-D No.3 pp.989-997
Publication Date
2006/03/01
Online ISSN
1745-1361
DOI
10.1093/ietisy/e89-d.3.989
Type of Manuscript
Special Section PAPER (Special Section on Statistical Modeling for Speech Processing)
Category
Speech Recognition
