Processing Unknown Words in Continuous Speech Recognition

Kenji KITA; Terumasa EHARA; Tsuyoshi MORIMOTO

IEICE TRANSACTIONS on Fundamentals

Summary:

Current continuous speech recognition systems essentially ignore unknown words: they are designed to recognize only the words in their lexicon. For real applications such as spoken-language processing, however, handling unknown words is essential. This paper proposes a continuous speech recognition method that accepts any utterance, including utterances that contain unknown words. Words not in the lexicon are transcribed as phone sequences, while words in the lexicon are still recognized as words. The baseline is the HMM-LR speech recognition system, which integrates Hidden Markov Models with generalized LR parsing; it is enhanced with a syllable trigram model to account for the stochastic characteristics of the language. Two grammars are merged and used in the HMM-LR system: a task grammar, which describes the task, and a phonetic grammar, which describes constraints between phones. The phonetic grammar lets the system output a phonetic transcription for an unknown word. Experimental results indicate that the approach is promising.
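
As a rough illustration of the syllable trigram idea mentioned in the summary, the sketch below scores a syllable sequence by the log-probability of each syllable given the two before it. It is not the authors' implementation: the summary does not say how the model was estimated, so the add-one smoothing, the padding symbols, the function names `train_trigram` and `log_prob`, and the toy romanized corpus are all assumptions made for this example.

```python
import math
from collections import Counter

# Hypothetical padding symbols marking sequence start and end.
BOS, EOS = "<s>", "</s>"


def train_trigram(sequences):
    """Count syllable trigrams and their two-syllable contexts."""
    tri, bi, vocab = Counter(), Counter(), set()
    for syllables in sequences:
        padded = [BOS, BOS] + list(syllables) + [EOS]
        vocab.update(padded[2:])  # every symbol that can be predicted
        for i in range(2, len(padded)):
            tri[tuple(padded[i - 2:i + 1])] += 1
            bi[tuple(padded[i - 2:i])] += 1
    return tri, bi, vocab


def log_prob(syllables, model):
    """Log-probability of a syllable sequence under the trigram model.

    Add-one smoothing is an assumption of this sketch; it keeps unseen
    trigrams from receiving zero probability.
    """
    tri, bi, vocab = model
    padded = [BOS, BOS] + list(syllables) + [EOS]
    total = 0.0
    for i in range(2, len(padded)):
        context = tuple(padded[i - 2:i])
        numerator = tri[context + (padded[i],)] + 1
        denominator = bi[context] + len(vocab)
        total += math.log(numerator / denominator)
    return total


# Toy training data (made up for illustration only).
corpus = [["ka", "i", "gi"], ["ka", "i", "sha"], ["to", "o", "kyo", "o"]]
model = train_trigram(corpus)

seen = log_prob(["ka", "i", "gi"], model)      # sequence from the corpus
unseen = log_prob(["gi", "ka", "kyo"], model)  # implausible ordering
print(seen > unseen)
```

In a recognizer of the kind the summary describes, a score like this could be combined with the acoustic score so that phone sequences hypothesized for unknown words follow plausible syllable patterns; how the two scores are weighted is not stated in the summary.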

Publication: IEICE TRANSACTIONS on Fundamentals Vol.E74-A No.7 pp.1811-1816

Publication Date: 1991/07/25

Type of Manuscript: Special Section PAPER (Special Issue on Continuous Speech Recognition and Understanding)

Category: Continuous Speech Recognition

Authors

Kenji KITA
Terumasa EHARA
Tsuyoshi MORIMOTO

Cite this

Kenji KITA, Terumasa EHARA, Tsuyoshi MORIMOTO, "Processing Unknown Words in Continuous Speech Recognition," in IEICE TRANSACTIONS on Fundamentals, vol. E74-A, no. 7, pp. 1811-1816, July 1991.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/e74-a_7_1811/_p

@ARTICLE{e74-a_7_1811,
author={Kenji KITA and Terumasa EHARA and Tsuyoshi MORIMOTO},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Processing Unknown Words in Continuous Speech Recognition},
year={1991},
volume={E74-A},
number={7},
pages={1811-1816},
month={July},}

TY - JOUR
TI - Processing Unknown Words in Continuous Speech Recognition
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 1811
EP - 1816
AU - Kenji KITA
AU - Terumasa EHARA
AU - Tsuyoshi MORIMOTO
PY - 1991
JO - IEICE TRANSACTIONS on Fundamentals
VL - E74-A
IS - 7
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - July 1991
ER -
