Using Mutual Information Criterion to Design an Efficient Phoneme Set for Chinese Speech Recognition

Jin-Song ZHANG; Xin-Hui HU; Satoshi NAKAMURA

doi:10.1093/ietisy/e91-d.3.508

IEICE TRANSACTIONS on Information

Summary:

Chinese is a representative tonal language, and how to process tone information in state-of-the-art large-vocabulary speech recognition systems has been an attractive research topic. This paper presents a novel way to derive an efficient phoneme set of tone-dependent units for building a recognition system: pairs of tone-dependent units are merged iteratively according to the principle of minimal loss of mutual information (MI). The mutual information is measured between the word tokens and their phoneme transcriptions in a training text corpus, based on the system lexicon and language model. The approach keeps the discriminative tonal (and phoneme) contrasts that are most helpful for disambiguating words that would become homophones without tones, and merges those tonal (and phoneme) contrasts that are not important for word disambiguation in the recognition task. This enables a flexible selection of the phoneme set according to a balance between the amount of mutual information and the number of phonemes. We applied the method to the traditional phoneme set of Initials/Finals and derived several phoneme sets with different numbers of units. Speech recognition experiments using the derived sets showed the method's effectiveness.
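
The merging procedure summarized above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes unigram word counts in place of the system language model, exactly one pronunciation per word, and a toy lexicon. All names (`mutual_information`, `greedy_merge`, `same_base_different_tone`, and the toy data) are hypothetical. Under the single-pronunciation assumption the transcription is a function of the word, so the mutual information between words and transcriptions reduces to the entropy of the transcriptions.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(word_counts, lexicon, unit_map):
    """I(W; T) in bits, where T is the transcription of word W under unit_map.

    Assumption: one pronunciation per word, so I(W; T) equals H(T).
    """
    total = sum(word_counts.values())
    trans_counts = Counter()
    for word, count in word_counts.items():
        transcription = tuple(unit_map[u] for u in lexicon[word])
        trans_counts[transcription] += count
    return -sum((c / total) * math.log2(c / total) for c in trans_counts.values())


def same_base_different_tone(a, b):
    """Hypothetical merge constraint: same base unit, both carrying a tone digit."""
    return a[-1].isdigit() and b[-1].isdigit() and a[:-1] == b[:-1]


def greedy_merge(word_counts, lexicon, mergeable, target_size):
    """Repeatedly merge the allowed pair of units whose merge loses the least MI."""
    units = sorted({u for pron in lexicon.values() for u in pron})
    unit_map = {u: u for u in units}  # original unit -> current merged class
    history = []
    while len(set(unit_map.values())) > target_size:
        current = mutual_information(word_counts, lexicon, unit_map)
        classes = sorted(set(unit_map.values()))
        best = None
        for a, b in combinations(classes, 2):
            if not mergeable(a, b):
                continue
            # Trial mapping in which class b is folded into class a.
            trial = {u: (a if c == b else c) for u, c in unit_map.items()}
            loss = current - mutual_information(word_counts, lexicon, trial)
            if best is None or loss < best[0]:
                best = (loss, a, b, trial)
        if best is None:  # no allowed pair is left
            break
        loss, a, b, unit_map = best
        history.append((a, b, loss))
    return unit_map, history


# Toy data (hypothetical): tone separates "ma1"/"ma3" but not "xi1"/"qi4",
# because the latter pair already differs in the initial.
word_counts = {"ma1": 6, "ma3": 2, "xi1": 4, "qi4": 4}
lexicon = {
    "ma1": ("m", "a1"),
    "ma3": ("m", "a3"),
    "xi1": ("x", "i1"),
    "qi4": ("q", "i4"),
}

unit_map, history = greedy_merge(word_counts, lexicon, same_base_different_tone, 6)
print(history)
```

In this toy case the first merge joins `i1` and `i4`, since that tonal contrast distinguishes no words, while `a1` and `a3` stay separate because merging them would turn `ma1` and `ma3` into homophones. Stopping at different values of `target_size` corresponds to the trade-off between the amount of mutual information and the number of units described in the summary.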

Publication: IEICE TRANSACTIONS on Information Vol.E91-D No.3 pp.508-513

Publication Date: 2008/03/01

Online ISSN: 1745-1361

DOI: 10.1093/ietisy/e91-d.3.508

Type of Manuscript: Special Section PAPER (Special Section on Robust Speech Processing in Realistic Environments)

Category: Acoustic Modeling

Jin-Song ZHANG, Xin-Hui HU, Satoshi NAKAMURA, "Using Mutual Information Criterion to Design an Efficient Phoneme Set for Chinese Speech Recognition" in IEICE TRANSACTIONS on Information, vol. E91-D, no. 3, pp. 508-513, March 2008, doi: 10.1093/ietisy/e91-d.3.508.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e91-d.3.508/_p

@ARTICLE{e91-d_3_508,
author={Jin-Song ZHANG and Xin-Hui HU and Satoshi NAKAMURA},
journal={IEICE TRANSACTIONS on Information},
title={Using Mutual Information Criterion to Design an Efficient Phoneme Set for Chinese Speech Recognition},
year={2008},
volume={E91-D},
number={3},
pages={508--513},
abstract={Chinese is a representative tonal language, and it has been an attractive topic of how to process tone information in the state-of-the-art large vocabulary speech recognition system. This paper presents a novel way to derive an efficient phoneme set of tone-dependent units to build a recognition system, by iteratively merging a pair of tone-dependent units according to the principle of minimal loss of the Mutual Information (MI). The mutual information is measured between the word tokens and their phoneme transcriptions in a training text corpus, based on the system lexical and language model. The approach has a capability to keep discriminative tonal (and phoneme) contrasts that are most helpful for disambiguating homophone words due to lack of tones, and merge those tonal (and phoneme) contrasts that are not important for word disambiguation for the recognition task. This enables a flexible selection of phoneme set according to a balance between the MI information amount and the number of phonemes. We applied the method to traditional phoneme set of Initial/Finals, and derived several phoneme sets with different number of units. Speech recognition experiments using the derived sets showed its effectiveness.},
keywords={},
doi={10.1093/ietisy/e91-d.3.508},
ISSN={1745-1361},
month={March},
}

TY - JOUR
TI - Using Mutual Information Criterion to Design an Efficient Phoneme Set for Chinese Speech Recognition
T2 - IEICE TRANSACTIONS on Information
SP - 508
EP - 513
AU - Jin-Song ZHANG
AU - Xin-Hui HU
AU - Satoshi NAKAMURA
PY - 2008
DO - 10.1093/ietisy/e91-d.3.508
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E91-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2008
AB - Chinese is a representative tonal language, and it has been an attractive topic of how to process tone information in the state-of-the-art large vocabulary speech recognition system. This paper presents a novel way to derive an efficient phoneme set of tone-dependent units to build a recognition system, by iteratively merging a pair of tone-dependent units according to the principle of minimal loss of the Mutual Information (MI). The mutual information is measured between the word tokens and their phoneme transcriptions in a training text corpus, based on the system lexical and language model. The approach has a capability to keep discriminative tonal (and phoneme) contrasts that are most helpful for disambiguating homophone words due to lack of tones, and merge those tonal (and phoneme) contrasts that are not important for word disambiguation for the recognition task. This enables a flexible selection of phoneme set according to a balance between the MI information amount and the number of phonemes. We applied the method to traditional phoneme set of Initial/Finals, and derived several phoneme sets with different number of units. Speech recognition experiments using the derived sets showed its effectiveness.
ER -
