Phoneme Set Design Based on Integrated Acoustic and Linguistic Features for Second Language Speech Recognition

Xiaoyun WANG; Tsuneo KATO; Seiichi YAMAMOTO

doi:10.1587/transinf.2016EDP7207

IEICE TRANSACTIONS on Information

Summary :

Recognition of second language (L2) speech is a challenging task even for state-of-the-art automatic speech recognition (ASR) systems, partly because the pronunciation of L2 speakers is usually strongly influenced by their mother tongue. Considering that the expressions of non-native speakers are usually simpler than those of native speakers, and that L2 speech usually includes mispronunciations and less fluent pronunciation, we propose a novel method that maximizes a unified acoustic and linguistic objective function to derive a phoneme set for L2 speech recognition. We verify the efficacy of the proposed method using L2 speech collected with a translation-game-type, dialogue-based computer-assisted language learning (CALL) system. We examine performance in terms of acoustic likelihood, linguistic discrimination ability, and the integrated objective function for L2 speech. Experiments demonstrate the validity of the phoneme set derived by the proposed method.

Publication: IEICE TRANSACTIONS on Information Vol.E100-D No.4 pp.857-864

Publication Date: 2017/04/01

Publicized: 2016/12/29

Online ISSN: 1745-1361

DOI: 10.1587/transinf.2016EDP7207

Type of Manuscript: PAPER

Category: Speech and Hearing

Authors

Xiaoyun WANG
  Doshisha University
Tsuneo KATO
  Doshisha University
Seiichi YAMAMOTO
  Doshisha University

Keyword

second language (L2) speech recognition, unified acoustic and linguistic objective function, reduced phoneme set (RPS), linguistic discrimination ability

Cite this

Xiaoyun WANG, Tsuneo KATO, Seiichi YAMAMOTO, "Phoneme Set Design Based on Integrated Acoustic and Linguistic Features for Second Language Speech Recognition" in IEICE TRANSACTIONS on Information, vol. E100-D, no. 4, pp. 857-864, April 2017, doi: 10.1587/transinf.2016EDP7207.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2016EDP7207/_p

@ARTICLE{e100-d_4_857,
author={Xiaoyun WANG and Tsuneo KATO and Seiichi YAMAMOTO},
journal={IEICE TRANSACTIONS on Information},
title={Phoneme Set Design Based on Integrated Acoustic and Linguistic Features for Second Language Speech Recognition},
year={2017},
volume={E100-D},
number={4},
pages={857-864},
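keywords={second language (L2) speech recognition, unified acoustic and linguistic objective function, reduced phoneme set (RPS), linguistic discrimination ability},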
doi={10.1587/transinf.2016EDP7207},
ISSN={1745-1361},
month={April}
}

TY - JOUR
TI - Phoneme Set Design Based on Integrated Acoustic and Linguistic Features for Second Language Speech Recognition
T2 - IEICE TRANSACTIONS on Information
SP - 857
EP - 864
AU - Xiaoyun WANG
AU - Tsuneo KATO
AU - Seiichi YAMAMOTO
PY - 2017
DO - 10.1587/transinf.2016EDP7207
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E100-D
IS - 4
JA - IEICE TRANSACTIONS on Information
Y1 - April 2017
ER -
