Language Model Adaptation Based on PLSA of Topics and Speakers for Automatic Transcription of Panel Discussions

Yuya AKITA; Tatsuya KAWAHARA

doi:10.1093/ietisy/e88-d.3.439

IEICE TRANSACTIONS on Information and Systems

Summary:

Appropriate language modeling is one of the major issues for automatic transcription of spontaneous speech. We propose an adaptation method for statistical language models based on both topic and speaker characteristics. This approach is applied to automatic transcription of meetings and panel discussions, in which multiple participants speak on a given topic in their own speaking style. The baseline language model is a mixture of two models, which are trained with different corpora covering various topics and speakers, respectively. Then, probabilistic latent semantic analysis (PLSA) is performed on the same respective corpora and the initial ASR result to provide two sets of unigram probabilities conditioned on input speech, with regard to topics and speaker characteristics, respectively. Finally, the baseline model is adapted by scaling N-gram probabilities with these unigram probabilities. For speaker adaptation, we use a portion of the Corpus of Spontaneous Japanese (CSJ) in which a large number of speakers gave talks on given topics. Experimental evaluation with real discussions showed that both topic and speaker adaptation reduced test-set perplexity, and in total, an average reduction rate of 8.5% was obtained. Furthermore, an improvement in word accuracy was also achieved by the proposed adaptation method.
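
The mixture and scaling steps described in the summary can be illustrated with a minimal sketch. It assumes one common form of unigram scaling, in which each baseline probability P(w|h) is multiplied by the ratio of an adapted unigram probability to a background unigram probability, raised to an exponent, and then renormalized over the vocabulary. The summary does not give the exact formula, so this form, the function names `interpolate` and `unigram_scale`, the exponent `beta`, the mixture weight, and all toy probabilities are hypothetical.

```python
def interpolate(p_a, p_b, lam):
    """Linear mixture of two distributions defined over the same vocabulary."""
    return {w: lam * p_a[w] + (1.0 - lam) * p_b[w] for w in p_a}


def unigram_scale(p_base, p_adapt_uni, p_bg_uni, beta=1.0):
    """Scale P_base(w|h) by (P_adapt(w) / P_bg(w)) ** beta, then renormalize.

    This scaling form is an assumption, not taken from the paper.
    """
    scaled = {
        w: p * (p_adapt_uni[w] / p_bg_uni[w]) ** beta
        for w, p in p_base.items()
    }
    z = sum(scaled.values())  # normalizer so the result sums to 1
    return {w: s / z for w, s in scaled.items()}


# Toy next-word distributions for one fixed history h (made-up numbers).
p_topic_lm = {"budget": 0.2, "um": 0.3, "the": 0.5}
p_speaker_lm = {"budget": 0.1, "um": 0.5, "the": 0.4}
baseline = interpolate(p_topic_lm, p_speaker_lm, lam=0.5)

# Background unigrams and unigrams that PLSA might estimate
# from an initial ASR result (made-up numbers).
p_bg = {"budget": 0.1, "um": 0.3, "the": 0.6}
p_plsa = {"budget": 0.3, "um": 0.2, "the": 0.5}

adapted = unigram_scale(baseline, p_plsa, p_bg, beta=1.0)
print(adapted)
```

In this toy case, words that the adapted unigram distribution favors over the background (here "budget") gain probability mass, while the others lose it. With two sets of unigram probabilities, one for topics and one for speaker characteristics, the same function could be applied once per set.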

Publication: IEICE TRANSACTIONS on Information and Systems, Vol.E88-D, No.3, pp.439-445

Publication Date: 2005/03/01

DOI: 10.1093/ietisy/e88-d.3.439

Type of Manuscript: Special Section PAPER (Special Section on Corpus-Based Speech Technologies)

Category: Spoken Language Systems

Yuya AKITA, Tatsuya KAWAHARA, "Language Model Adaptation Based on PLSA of Topics and Speakers for Automatic Transcription of Panel Discussions," in IEICE TRANSACTIONS on Information and Systems, vol. E88-D, no. 3, pp. 439-445, March 2005, doi: 10.1093/ietisy/e88-d.3.439.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e88-d.3.439/_p

@ARTICLE{e88-d_3_439,
author={Yuya AKITA and Tatsuya KAWAHARA},
journal={IEICE TRANSACTIONS on Information and Systems},
title={Language Model Adaptation Based on PLSA of Topics and Speakers for Automatic Transcription of Panel Discussions},
year={2005},
volume={E88-D},
number={3},
pages={439--445},
abstract={Appropriate language modeling is one of the major issues for automatic transcription of spontaneous speech. We propose an adaptation method for statistical language models based on both topic and speaker characteristics. This approach is applied for automatic transcription of meetings and panel discussions, in which multiple participants speak on a given topic in their own speaking style. A baseline language model is a mixture of two models, which are trained with different corpora covering various topics and speakers, respectively. Then, probabilistic latent semantic analysis (PLSA) is performed on the same respective corpora and the initial ASR result to provide two sets of unigram probabilities conditioned on input speech, with regard to topics and speaker characteristics, respectively. Finally, the baseline model is adapted by scaling N-gram probabilities with these unigram probabilities. For speaker adaptation purpose, we make use of a portion of the Corpus of Spontaneous Japanese (CSJ) in which a large number of speakers gave talks for given topics. Experimental evaluation with real discussions showed that both topic and speaker adaptation reduced test-set perplexity, and in total, an average reduction rate of 8.5% was obtained. Furthermore, improvement on word accuracy was also achieved by the proposed adaptation method.},
doi={10.1093/ietisy/e88-d.3.439},
month={March}
}

TY - JOUR
TI - Language Model Adaptation Based on PLSA of Topics and Speakers for Automatic Transcription of Panel Discussions
T2 - IEICE TRANSACTIONS on Information and Systems
SP - 439
EP - 445
AU - Yuya AKITA
AU - Tatsuya KAWAHARA
PY - 2005
DO - 10.1093/ietisy/e88-d.3.439
JO - IEICE TRANSACTIONS on Information and Systems
SN -
VL - E88-D
IS - 3
JA - IEICE TRANSACTIONS on Information and Systems
Y1 - March 2005
AB - Appropriate language modeling is one of the major issues for automatic transcription of spontaneous speech. We propose an adaptation method for statistical language models based on both topic and speaker characteristics. This approach is applied for automatic transcription of meetings and panel discussions, in which multiple participants speak on a given topic in their own speaking style. A baseline language model is a mixture of two models, which are trained with different corpora covering various topics and speakers, respectively. Then, probabilistic latent semantic analysis (PLSA) is performed on the same respective corpora and the initial ASR result to provide two sets of unigram probabilities conditioned on input speech, with regard to topics and speaker characteristics, respectively. Finally, the baseline model is adapted by scaling N-gram probabilities with these unigram probabilities. For speaker adaptation purpose, we make use of a portion of the Corpus of Spontaneous Japanese (CSJ) in which a large number of speakers gave talks for given topics. Experimental evaluation with real discussions showed that both topic and speaker adaptation reduced test-set perplexity, and in total, an average reduction rate of 8.5% was obtained. Furthermore, improvement on word accuracy was also achieved by the proposed adaptation method.
ER -
