
Author Search Result

[Author] Yuya AKITA (2 hits)

  • Automatic Lecture Transcription Based on Discriminative Data Selection for Lightly Supervised Acoustic Model Training

    Sheng LI  Yuya AKITA  Tatsuya KAWAHARA  

     
    PAPER-Speech and Hearing

    Publicized: 2015/04/28
    Vol: E98-D No:8
    Page(s): 1545-1552

    This paper addresses a scheme of lightly supervised training of an acoustic model, which exploits a large amount of data accompanied by closed-caption texts rather than faithful transcripts. In the proposed scheme, the closed-caption text is aligned with the ASR hypothesis produced by the baseline system. A set of dedicated classifiers is then designed and trained to select the correct word from each aligned pair or to reject both. It is demonstrated that the classifiers effectively filter usable data for acoustic model training, so the scheme realizes automatic training of the acoustic model with an increased amount of data. A significant improvement in ASR accuracy is achieved over the baseline system and also in comparison with the conventional method of lightly supervised training based on simple matching.
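
    As a rough illustration of the selection step described above, the Python sketch below aligns a closed-caption word sequence against an ASR hypothesis and applies a per-word decision rule: keep agreements, trust a high-confidence hypothesis word on disagreement, and otherwise reject the pair. The difflib alignment, the confidence value, and the threshold rule are hypothetical simplifications standing in for the paper's trained classifiers and feature set.

      from difflib import SequenceMatcher

      def align(caption, hypothesis):
          # Align two word sequences with an edit-distance-style matcher;
          # None marks a word missing on the other side.
          sm = SequenceMatcher(a=caption, b=hypothesis, autojunk=False)
          pairs = []
          for tag, i1, i2, j1, j2 in sm.get_opcodes():
              a, b = caption[i1:i2], hypothesis[j1:j2]
              for k in range(max(len(a), len(b))):
                  pairs.append((a[k] if k < len(a) else None,
                                b[k] if k < len(b) else None))
          return pairs

      def select_word(cap_word, hyp_word, confidence, threshold=0.8):
          # Toy stand-in for the trained classifiers: keep agreements,
          # trust a confident recognizer on disagreement, else reject.
          if cap_word == hyp_word:
              return cap_word
          if hyp_word is not None and confidence >= threshold:
              return hyp_word
          return None  # rejected: excluded from acoustic model training

      caption = "the model is trained on captions".split()
      hyp = "the model was trained on caption data".split()
      for cap_w, hyp_w in align(caption, hyp):
          print(cap_w, hyp_w, "->", select_word(cap_w, hyp_w, confidence=0.9))

    In the paper's scheme this decision is made by dedicated trained classifiers rather than a fixed threshold, and rejected words are simply excluded from acoustic model training.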

  • Language Model Adaptation Based on PLSA of Topics and Speakers for Automatic Transcription of Panel Discussions

    Yuya AKITA  Tatsuya KAWAHARA  

     
    PAPER-Spoken Language Systems

    Vol: E88-D No:3
    Page(s): 439-445

    Appropriate language modeling is one of the major issues for automatic transcription of spontaneous speech. We propose an adaptation method for statistical language models based on both topic and speaker characteristics. The approach is applied to automatic transcription of meetings and panel discussions, in which multiple participants speak on a given topic in their own speaking styles. The baseline language model is a mixture of two models trained on different corpora covering various topics and speakers, respectively. Probabilistic latent semantic analysis (PLSA) is then performed on the same respective corpora and on the initial ASR result to provide two sets of unigram probabilities conditioned on the input speech, one for topics and one for speaker characteristics. Finally, the baseline model is adapted by scaling its N-gram probabilities with these unigram probabilities. For speaker adaptation, we make use of a portion of the Corpus of Spontaneous Japanese (CSJ), in which a large number of speakers gave talks on given topics. Experimental evaluation on real discussions showed that both topic and speaker adaptation reduced test-set perplexity, for an average overall reduction of 8.5%. The proposed adaptation method also improved word accuracy.
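
    The final adaptation step, scaling N-gram probabilities with the PLSA-derived unigram probabilities, is commonly realized as unigram rescaling. The Python sketch below shows that general form under assumptions: the exponent beta, the toy vocabulary, and the renormalization are illustrative choices and not necessarily the paper's exact formulation.

      def adapt_distribution(p_base_given_h, p_adapt_uni, p_base_uni, beta=0.5):
          # Unigram rescaling: multiply each baseline N-gram probability
          # P(w|h) by (P_adapt(w) / P_base(w)) ** beta, then renormalize
          # so the adapted distribution sums to one over the vocabulary.
          scores = {w: p * (p_adapt_uni[w] / p_base_uni[w]) ** beta
                    for w, p in p_base_given_h.items()}
          z = sum(scores.values())
          return {w: s / z for w, s in scores.items()}

      # Toy 3-word vocabulary; the adapted unigram boosts "budget",
      # as a topic- or speaker-conditioned PLSA unigram might.
      p_base_given_h = {"budget": 0.2, "weather": 0.5, "policy": 0.3}
      p_base_uni = {"budget": 0.1, "weather": 0.4, "policy": 0.5}
      p_adapt_uni = {"budget": 0.4, "weather": 0.1, "policy": 0.5}
      print(adapt_distribution(p_base_given_h, p_adapt_uni, p_base_uni))

    One plausible way to combine the two sets of unigram probabilities mentioned in the abstract is to apply such a rescaling with each in turn; the exact combination used in the paper is not specified here.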