IEICE TRANSACTIONS on Information

Discriminative Approach to Build Hybrid Vocabulary for Conversational Telephone Speech Recognition of Agglutinative Languages

Xin LI, Jielin PAN, Qingwei ZHAO, Yonghong YAN

Summary:

Morphemes, obtained by morphological parsing, and statistical sub-words, derived by data-driven splitting, are commonly used as recognition units for speech recognition of agglutinative languages. In this letter, we propose a discriminative approach that selects, for each distinct word type, the splitting result that is more likely to improve the recognizer's performance. An objective function involving the unigram language model (LM) probability and the count of misrecognized phones on the acoustic training data is defined and minimized. After determining the splitting result for each word in the text corpus, we select the frequent units to build a hybrid vocabulary that includes both morphemes and statistical sub-words. Compared to a statistical sub-word based system, the hybrid system achieves a 0.8% reduction in letter error rate (LER) on the test set.
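
The selection step described in the summary can be illustrated with a minimal sketch. The summary does not give the exact form of the objective, so the code below assumes a simple weighted sum: the negative log unigram probability of the candidate units plus `weight` times the misrecognized-phone count. The function names, the `"morph"`/`"stat"` labels, the add-one smoothing, and the input dictionaries are all hypothetical and are not taken from the letter.

```python
import math
from collections import Counter


def unit_cost(units, unit_counts):
    """Negative log unigram probability of a unit sequence.

    Uses add-one smoothing (an assumption) so unseen units do not give log(0).
    """
    total = sum(unit_counts.values())
    vocab = len(unit_counts)
    return -sum(
        math.log((unit_counts.get(u, 0) + 1) / (total + vocab + 1)) for u in units
    )


def choose_splits(candidates, unit_counts, phone_errors, weight=1.0):
    """Pick, for each word type, the candidate split with the lowest cost.

    candidates:   word -> {"morph": [units], "stat": [units]}
    unit_counts:  unit -> corpus count, used for the unigram LM term
    phone_errors: (word, kind) -> misrecognized phones on acoustic training data
    weight:       assumed trade-off between the LM term and the error term
    """
    chosen = {}
    for word, options in candidates.items():
        def cost(kind):
            lm_term = unit_cost(options[kind], unit_counts)
            return lm_term + weight * phone_errors.get((word, kind), 0)

        # sorted() makes tie-breaking deterministic (alphabetical by label).
        chosen[word] = min(sorted(options), key=cost)
    return chosen


def build_vocabulary(chosen, candidates, word_counts, size):
    """Keep the `size` most frequent units after applying the chosen splits."""
    counts = Counter()
    for word, kind in chosen.items():
        for unit in candidates[word][kind]:
            counts[unit] += word_counts.get(word, 0)
    return [unit for unit, _ in counts.most_common(size)]
```

In this sketch, a word whose statistical split causes many phone errors falls back to its morpheme split, and the resulting vocabulary mixes both unit types. How the letter actually weights the two terms and obtains the phone error counts is not stated in the summary.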

Publication
IEICE TRANSACTIONS on Information Vol.E96-D No.11 pp.2478-2482
Publication Date
2013/11/01
Online ISSN
1745-1361
DOI
10.1587/transinf.E96.D.2478
Type of Manuscript
LETTER
Category
Speech and Hearing

Authors

Xin LI
  Chinese Academy of Sciences
Jielin PAN
  Chinese Academy of Sciences
Qingwei ZHAO
  Chinese Academy of Sciences
Yonghong YAN
  Chinese Academy of Sciences

Keyword