A phonetic typewriter is an unlimited-vocabulary continuous speech recognition system that recognizes each phone in speech without the need for lexical information. This paper describes a Japanese phonetic typewriter system based on HMM phone recognition and syllable-based stochastic phone sequence modeling. Even though HMM methods have considerable capacity for recognizing speech, it is difficult to recognize individual phones in continuous speech without lexical information. HMM phone recognition is improved by incorporating syllable trigrams for phone sequence modeling. HMM phone units are trained using an isolated word database, and their duration parameters are modified according to speaking rate. Syllable trigram tables are made from a text database of over 300,000 syllables, and phone sequence probabilities calculated from the trigrams are combined with HMM probabilities. Using these probabilities to limit the number of intermediate candidates leads to an accurate phonetic typewriter system without requiring excessive computation time. An interpolated n-gram approach to phone sequence modeling is shown to be more effective than a simple trigram method.
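The abstract mentions interpolated n-gram modeling over syllables and combining the resulting sequence probabilities with HMM probabilities, but it does not give the formulas. The following is only a minimal sketch of one common form of linear interpolation, not the authors' method: the function names, the interpolation weights `lambdas`, the `weight` on the language-model term, and the toy syllable list are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_counts(syllables):
    """Collect n-gram counts and their context counts from a syllable list."""
    return {
        "uni": Counter(syllables),
        "bi": Counter(zip(syllables, syllables[1:])),
        "tri": Counter(zip(syllables, syllables[1:], syllables[2:])),
        # Context counts: how often each history is followed by another syllable.
        "uni_ctx": Counter(syllables[:-1]),
        "bi_ctx": Counter(zip(syllables, syllables[1:-1])),
        "total": len(syllables),
    }


def interpolated_prob(s1, s2, s3, c, lambdas=(0.6, 0.3, 0.1)):
    """Estimate P(s3 | s1, s2) as a weighted mix of trigram, bigram, unigram.

    The weights are hypothetical; in practice they would be tuned on held-out data.
    """
    l3, l2, l1 = lambdas
    p_tri = c["tri"][(s1, s2, s3)] / c["bi_ctx"][(s1, s2)] if c["bi_ctx"][(s1, s2)] else 0.0
    p_bi = c["bi"][(s2, s3)] / c["uni_ctx"][s2] if c["uni_ctx"][s2] else 0.0
    p_uni = c["uni"][s3] / c["total"] if c["total"] else 0.0
    return l3 * p_tri + l2 * p_bi + l1 * p_uni


def combined_score(acoustic_logprob, s1, s2, s3, c, weight=1.0, floor=1e-12):
    """Add a weighted log sequence probability to an acoustic (HMM) log score."""
    p = max(interpolated_prob(s1, s2, s3, c), floor)  # floor avoids log(0)
    return acoustic_logprob + weight * math.log(p)


# Toy data, purely illustrative.
counts = train_counts(["ka", "wa", "ka", "wa", "ta"])
p = interpolated_prob("ka", "wa", "ta", counts)
score = combined_score(-10.0, "ka", "wa", "ta", counts)
```

Because the bigram and unigram terms stay non-zero when a trigram was never observed, an interpolated estimate avoids assigning zero probability to unseen syllable triples, which is one plausible reason such a scheme can outperform a plain trigram table.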
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Takeshi KAWABATA, Toshiyuki HANAZAWA, Katsunobu ITOH, Kiyohiro SHIKANO, "Japanese Phonetic Typewriter Using HMM Phone Recognition and Stochastic Phone-Sequence Modeling" in IEICE TRANSACTIONS on Fundamentals,
vol. E74-A, no. 7, pp. 1783-1787, July 1991.
Abstract: A phonetic typewriter is an unlimited-vocabulary continuous speech recognition system that recognizes each phone in speech without the need for lexical information. This paper describes a Japanese phonetic typewriter system based on HMM phone recognition and syllable-based stochastic phone sequence modeling. Even though HMM methods have considerable capacity for recognizing speech, it is difficult to recognize individual phones in continuous speech without lexical information. HMM phone recognition is improved by incorporating syllable trigrams for phone sequence modeling. HMM phone units are trained using an isolated word database, and their duration parameters are modified according to speaking rate. Syllable trigram tables are made from a text database of over 300,000 syllables, and phone sequence probabilities calculated from the trigrams are combined with HMM probabilities. Using these probabilities to limit the number of intermediate candidates leads to an accurate phonetic typewriter system without requiring excessive computation time. An interpolated n-gram approach to phone sequence modeling is shown to be more effective than a simple trigram method.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/e74-a_7_1783/_p
@ARTICLE{e74-a_7_1783,
author={Takeshi KAWABATA and Toshiyuki HANAZAWA and Katsunobu ITOH and Kiyohiro SHIKANO},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Japanese Phonetic Typewriter Using HMM Phone Recognition and Stochastic Phone-Sequence Modeling},
year={1991},
volume={E74-A},
number={7},
pages={1783-1787},
abstract={A phonetic typewriter is an unlimited-vocabulary continuous speech recognition system that recognizes each phone in speech without the need for lexical information. This paper describes a Japanese phonetic typewriter system based on HMM phone recognition and syllable-based stochastic phone sequence modeling. Even though HMM methods have considerable capacity for recognizing speech, it is difficult to recognize individual phones in continuous speech without lexical information. HMM phone recognition is improved by incorporating syllable trigrams for phone sequence modeling. HMM phone units are trained using an isolated word database, and their duration parameters are modified according to speaking rate. Syllable trigram tables are made from a text database of over 300,000 syllables, and phone sequence probabilities calculated from the trigrams are combined with HMM probabilities. Using these probabilities to limit the number of intermediate candidates leads to an accurate phonetic typewriter system without requiring excessive computation time. An interpolated n-gram approach to phone sequence modeling is shown to be more effective than a simple trigram method.},
keywords={},
doi={},
ISSN={},
month={July},}
TY - JOUR
TI - Japanese Phonetic Typewriter Using HMM Phone Recognition and Stochastic Phone-Sequence Modeling
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 1783
EP - 1787
AU - Takeshi KAWABATA
AU - Toshiyuki HANAZAWA
AU - Katsunobu ITOH
AU - Kiyohiro SHIKANO
PY - 1991
DO -
JO - IEICE TRANSACTIONS on Fundamentals
SN -
VL - E74-A
IS - 7
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - July 1991
AB - A phonetic typewriter is an unlimited-vocabulary continuous speech recognition system that recognizes each phone in speech without the need for lexical information. This paper describes a Japanese phonetic typewriter system based on HMM phone recognition and syllable-based stochastic phone sequence modeling. Even though HMM methods have considerable capacity for recognizing speech, it is difficult to recognize individual phones in continuous speech without lexical information. HMM phone recognition is improved by incorporating syllable trigrams for phone sequence modeling. HMM phone units are trained using an isolated word database, and their duration parameters are modified according to speaking rate. Syllable trigram tables are made from a text database of over 300,000 syllables, and phone sequence probabilities calculated from the trigrams are combined with HMM probabilities. Using these probabilities to limit the number of intermediate candidates leads to an accurate phonetic typewriter system without requiring excessive computation time. An interpolated n-gram approach to phone sequence modeling is shown to be more effective than a simple trigram method.
ER -