Learning Speech Variability in Discriminative Acoustic Model Adaptation

Shoei SATO; Takahiro OKU; Shinichi HOMMA; Akio KOBAYASHI; Toru IMAI

doi:10.1587/transinf.E93.D.2370

IEICE TRANSACTIONS on Information

Summary:

We present a new discriminative method of acoustic model adaptation that deals with task-dependent speech variability. We focus on differences in expressions and speaking styles between tasks, and the objective of the method is to improve the recognition accuracy of indistinctly pronounced phrases that depend on speaking style. The adaptation appends subword models for variants of subwords that are frequently observed in the task. To find the task-dependent variants, low-confidence words are statistically selected from the higher-frequency words in the task's adaptation data by using their word lattices. The HMM parameters of the subword models that depend on these words are discriminatively trained by using linear transforms with a minimum phoneme error (MPE) criterion. For the MPE training, a subword accuracy measure that discriminates between the variants and the originals is also investigated. In speech recognition experiments, the proposed adaptation with the subword variants reduced the word error rate by 12.0% relative in a Japanese conversational broadcast task.
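
The word-selection step in the summary can be illustrated with a minimal Python sketch. It is not the authors' procedure: the summary only says that low-confidence words are "statistically selected" among frequent words, so the mean-confidence rule, the function name `select_variant_candidates`, and the thresholds `min_count` and `max_mean_conf` are all hypothetical. The per-token confidences are assumed to have been derived beforehand (for example, from word-lattice posteriors).

```python
from collections import defaultdict


def select_variant_candidates(tokens, min_count=3, max_mean_conf=0.6):
    """Pick frequent words whose average confidence is low.

    tokens: iterable of (word, confidence) pairs, one per recognized token.
    min_count and max_mean_conf are illustrative thresholds, not values
    taken from the paper.
    """
    # Accumulate [occurrence count, summed confidence] per word.
    stats = defaultdict(lambda: [0, 0.0])
    for word, conf in tokens:
        stats[word][0] += 1
        stats[word][1] += conf

    selected = []
    for word, (count, total) in stats.items():
        mean_conf = total / count
        # Keep only words that are both frequent and poorly recognized.
        if count >= min_count and mean_conf < max_mean_conf:
            selected.append((word, count, mean_conf))

    # Least confident candidates first.
    selected.sort(key=lambda item: item[2])
    return selected


if __name__ == "__main__":
    # Toy data: "desu" is frequent and uncertain, "tokyo" is frequent and
    # reliable, "rare" is uncertain but too infrequent to be selected.
    toy_tokens = [
        ("desu", 0.4), ("desu", 0.5), ("desu", 0.3),
        ("tokyo", 0.9), ("tokyo", 0.95), ("tokyo", 0.9),
        ("rare", 0.1),
    ]
    for word, count, mean_conf in select_variant_candidates(toy_tokens):
        print(word, count, round(mean_conf, 2))
```

In this sketch, words returned by the function would be the ones for which word-dependent subword variants are appended; the discriminative MPE training of those variants is outside its scope.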

Publication: IEICE TRANSACTIONS on Information Vol.E93-D No.9 pp.2370-2378

Publication Date: 2010/09/01

Online ISSN: 1745-1361

DOI: 10.1587/transinf.E93.D.2370

Type of Manuscript: Special Section PAPER (Special Section on Processing Natural Speech Variability for Improved Verbal Human-Computer Interaction)

Category: Adaptation

Shoei SATO, Takahiro OKU, Shinichi HOMMA, Akio KOBAYASHI, Toru IMAI, "Learning Speech Variability in Discriminative Acoustic Model Adaptation" in IEICE TRANSACTIONS on Information, vol. E93-D, no. 9, pp. 2370-2378, September 2010, doi: 10.1587/transinf.E93.D.2370.
Abstract: We present a new discriminative method of acoustic model adaptation that deals with a task-dependent speech variability. We have focused on differences of expressions or speaking styles between tasks and set the objective of this method as improving the recognition accuracy of indistinctly pronounced phrases dependent on a speaking style. The adaptation appends subword models for frequently observable variants of subwords in the task. To find the task-dependent variants, low-confidence words are statistically selected from words with higher frequency in the task's adaptation data by using their word lattices. HMM parameters of subword models dependent on the words are discriminatively trained by using linear transforms with a minimum phoneme error (MPE) criterion. For the MPE training, subword accuracy discriminating between the variants and the originals is also investigated. In speech recognition experiments, the proposed adaptation with the subword variants reduced the word error rate by 12.0% relative in a Japanese conversational broadcast task.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E93.D.2370/_p

@ARTICLE{e93-d_9_2370,
author={Shoei SATO and Takahiro OKU and Shinichi HOMMA and Akio KOBAYASHI and Toru IMAI},
journal={IEICE TRANSACTIONS on Information},
title={Learning Speech Variability in Discriminative Acoustic Model Adaptation},
year={2010},
volume={E93-D},
number={9},
pages={2370--2378},
abstract={We present a new discriminative method of acoustic model adaptation that deals with a task-dependent speech variability. We have focused on differences of expressions or speaking styles between tasks and set the objective of this method as improving the recognition accuracy of indistinctly pronounced phrases dependent on a speaking style. The adaptation appends subword models for frequently observable variants of subwords in the task. To find the task-dependent variants, low-confidence words are statistically selected from words with higher frequency in the task's adaptation data by using their word lattices. HMM parameters of subword models dependent on the words are discriminatively trained by using linear transforms with a minimum phoneme error (MPE) criterion. For the MPE training, subword accuracy discriminating between the variants and the originals is also investigated. In speech recognition experiments, the proposed adaptation with the subword variants reduced the word error rate by 12.0% relative in a Japanese conversational broadcast task.},
keywords={},
doi={10.1587/transinf.E93.D.2370},
ISSN={1745-1361},
month={September},}

TY  - JOUR
TI  - Learning Speech Variability in Discriminative Acoustic Model Adaptation
T2  - IEICE TRANSACTIONS on Information
SP  - 2370
EP  - 2378
AU  - Shoei SATO
AU  - Takahiro OKU
AU  - Shinichi HOMMA
AU  - Akio KOBAYASHI
AU  - Toru IMAI
PY  - 2010
DO  - 10.1587/transinf.E93.D.2370
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E93-D
IS  - 9
JA  - IEICE TRANSACTIONS on Information
Y1  - September 2010
AB  - We present a new discriminative method of acoustic model adaptation that deals with a task-dependent speech variability. We have focused on differences of expressions or speaking styles between tasks and set the objective of this method as improving the recognition accuracy of indistinctly pronounced phrases dependent on a speaking style. The adaptation appends subword models for frequently observable variants of subwords in the task. To find the task-dependent variants, low-confidence words are statistically selected from words with higher frequency in the task's adaptation data by using their word lattices. HMM parameters of subword models dependent on the words are discriminatively trained by using linear transforms with a minimum phoneme error (MPE) criterion. For the MPE training, subword accuracy discriminating between the variants and the originals is also investigated. In speech recognition experiments, the proposed adaptation with the subword variants reduced the word error rate by 12.0% relative in a Japanese conversational broadcast task.
ER  - 
