Automatic Evaluation of English Pronunciation Based on Speech Recognition Techniques

Hiroshi HAMADA; Satoshi MIKI; Ryohei NAKATSU

IEICE TRANSACTIONS on Information

Automatic Evaluation of English Pronunciation Based on Speech Recognition Techniques

Hiroshi HAMADA, Satoshi MIKI, Ryohei NAKATSU

Full Text Views

0

Cite this

Summary :

A new method is proposed for automatically evaluating the English pronunciation quality of non-native speakers. It is assumed that pronunciation can be rated using three criteria: the static characteristics of phonetic spectra, the dynamic structure of spectrum sequences, and the prosodic characteristics of utterances. The evaluation uses speech recognition techniques to compare the English words pronounced by a non-native speaker with those pronounced by a native speaker. Three evaluation measures are proposed to rate pronunciation quality. (1) The standard deviation of the mapping vectors, which map the codebook vectors of the non-native speaker onto the vector space of the native speaker, is used to evaluate the static phonetic spectra characteristics. (2) The spectral distance between words pronounced by the non-native speaker and those pronounced by the native speaker obtained by the DTW method is used to evaluate the dynamic characteristics of spectral sequences. (3) The differences in fundamental frequency and speech power between the pronunciation of the native and non-native speaker are used as the criteria for evaluating prosodic characteristics. Evaluation experiments are carried out using 441 words spoken by 10 Japanese speakers and 10 native speakers. One half of the 441 words was used to evaluate static phonetic spectra characteristics, and the other half was used to evaluate the dynamic characteristics of spectral sequences, as well as the prosodic characteristics. Based on the experimental results, the correlation between the evaluation scores and the scores determined by human judgement is found to be 0.90.

Publication: IEICE TRANSACTIONS on Information Vol.E76-D No.3 pp.352-359

Publication Date: 1993/03/25

Publicized

Online ISSN

DOI

Type of Manuscript: PAPER

Category: Speech Processing

Cite this

Copy

Hiroshi HAMADA, Satoshi MIKI, Ryohei NAKATSU, "Automatic Evaluation of English Pronunciation Based on Speech Recognition Techniques" in IEICE TRANSACTIONS on Information, vol. E76-D, no. 3, pp. 352-359, March 1993, doi: .
Abstract: A new method is proposed for automatically evaluating the English pronunciation quality of non-native speakers. It is assumed that pronunciation can be rated using three criteria: the static characteristics of phonetic spectra, the dynamic structure of spectrum sequences, and the prosodic characteristics of utterances. The evaluation uses speech recognition techniques to compare the English words pronounced by a non-native speaker with those pronounced by a native speaker. Three evaluation measures are proposed to rate pronunciation quality. (1) The standard deviation of the mapping vectors, which map the codebook vectors of the non-native speaker onto the vector space of the native speaker, is used to evaluate the static phonetic spectra characteristics. (2) The spectral distance between words pronounced by the non-native speaker and those pronounced by the native speaker obtained by the DTW method is used to evaluate the dynamic characteristics of spectral sequences. (3) The differences in fundamental frequency and speech power between the pronunciation of the native and non-native speaker are used as the criteria for evaluating prosodic characteristics. Evaluation experiments are carried out using 441 words spoken by 10 Japanese speakers and 10 native speakers. One half of the 441 words was used to evaluate static phonetic spectra characteristics, and the other half was used to evaluate the dynamic characteristics of spectral sequences, as well as the prosodic characteristics. Based on the experimental results, the correlation between the evaluation scores and the scores determined by human judgement is found to be 0.90.
URL: https://global.ieice.org/en_transactions/information/10.1587/e76-d_3_352/_p

Copy

@ARTICLE{e76-d_3_352,
author={Hiroshi HAMADA, Satoshi MIKI, Ryohei NAKATSU, },
journal={IEICE TRANSACTIONS on Information},
title={Automatic Evaluation of English Pronunciation Based on Speech Recognition Techniques},
year={1993},
volume={E76-D},
number={3},
pages={352-359},
abstract={A new method is proposed for automatically evaluating the English pronunciation quality of non-native speakers. It is assumed that pronunciation can be rated using three criteria: the static characteristics of phonetic spectra, the dynamic structure of spectrum sequences, and the prosodic characteristics of utterances. The evaluation uses speech recognition techniques to compare the English words pronounced by a non-native speaker with those pronounced by a native speaker. Three evaluation measures are proposed to rate pronunciation quality. (1) The standard deviation of the mapping vectors, which map the codebook vectors of the non-native speaker onto the vector space of the native speaker, is used to evaluate the static phonetic spectra characteristics. (2) The spectral distance between words pronounced by the non-native speaker and those pronounced by the native speaker obtained by the DTW method is used to evaluate the dynamic characteristics of spectral sequences. (3) The differences in fundamental frequency and speech power between the pronunciation of the native and non-native speaker are used as the criteria for evaluating prosodic characteristics. Evaluation experiments are carried out using 441 words spoken by 10 Japanese speakers and 10 native speakers. One half of the 441 words was used to evaluate static phonetic spectra characteristics, and the other half was used to evaluate the dynamic characteristics of spectral sequences, as well as the prosodic characteristics. Based on the experimental results, the correlation between the evaluation scores and the scores determined by human judgement is found to be 0.90.},
keywords={},
doi={},
ISSN={},
month={March},}

Copy

TY - JOUR
TI - Automatic Evaluation of English Pronunciation Based on Speech Recognition Techniques
T2 - IEICE TRANSACTIONS on Information
SP - 352
EP - 359
AU - Hiroshi HAMADA
AU - Satoshi MIKI
AU - Ryohei NAKATSU
PY - 1993
DO -
JO - IEICE TRANSACTIONS on Information
SN -
VL - E76-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 1993
AB - A new method is proposed for automatically evaluating the English pronunciation quality of non-native speakers. It is assumed that pronunciation can be rated using three criteria: the static characteristics of phonetic spectra, the dynamic structure of spectrum sequences, and the prosodic characteristics of utterances. The evaluation uses speech recognition techniques to compare the English words pronounced by a non-native speaker with those pronounced by a native speaker. Three evaluation measures are proposed to rate pronunciation quality. (1) The standard deviation of the mapping vectors, which map the codebook vectors of the non-native speaker onto the vector space of the native speaker, is used to evaluate the static phonetic spectra characteristics. (2) The spectral distance between words pronounced by the non-native speaker and those pronounced by the native speaker obtained by the DTW method is used to evaluate the dynamic characteristics of spectral sequences. (3) The differences in fundamental frequency and speech power between the pronunciation of the native and non-native speaker are used as the criteria for evaluating prosodic characteristics. Evaluation experiments are carried out using 441 words spoken by 10 Japanese speakers and 10 native speakers. One half of the 441 words was used to evaluate static phonetic spectra characteristics, and the other half was used to evaluate the dynamic characteristics of spectral sequences, as well as the prosodic characteristics. Based on the experimental results, the correlation between the evaluation scores and the scores determined by human judgement is found to be 0.90.
ER -

IEICE TRANSACTIONS on Information

Automatic Evaluation of English Pronunciation Based on Speech Recognition Techniques

Summary :

Authors

Keyword

Latest Issue

Contents

Links

Call for Papers

Submit to IEICE Trans.

Transactions NEWS

Popular articles

IEICE TRANSACTIONS on Information

Automatic Evaluation of English Pronunciation Based on Speech Recognition Techniques

Summary :

Authors

Keyword

Latest Issue

Contents

Copyrights notice of machine-translated contents

Cite this

Links

Call for Papers

Submit to IEICE Trans.

Transactions NEWS

Popular articles