Yutaka KOBAYASHI, Masanori OMOTE, Hidenori ENDO, Yasuhisa NIIMI, "SUSKIT---A Speech Understanding System Based on Robust Phone Spotting--" in IEICE TRANSACTIONS on Fundamentals,
vol. E74-A, no. 7, pp. 1863-1869, July 1991.
Abstract: This paper gives an overview of our speech understanding system and reports recent results of sentence recognition experiments. The system, which we call SUSKIT-, recognizes database queries in natural Japanese sentences. The user is expected to speak sentence by sentence. Among the difficult problems to overcome, this study paid primary attention to how to cope with contextual variations in pronunciation and how to verify partial sentence hypotheses in a hierarchical system. SUSKIT- predicts word strings in a top-down manner; however, the verification of hypotheses against the input speech is done using a unit independent of word boundaries. Words are not suitable units of verification because the smoothing effect owing to phonetic contexts makes it difficult to recognize short words. In order to avoid the misrecognition caused by the smoothing effect across word boundaries, SUSKIT- dynamically extracts phoneme strings bounded by easily detectable phonemes from the predicted word string as verification templates. A left-to-right time-synchronous beam-search strategy was adopted for searching likely sentences. We carried out sentence recognition experiments using a speech corpus consisting of 159 sentences read by three Japanese male speakers. The task perplexity was 8.3. Using speaker-dependent HMM parameters, we obtained sentence recognition rates of 83.0-92.5%.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/e74-a_7_1863/_p
@ARTICLE{e74-a_7_1863,
author={Yutaka KOBAYASHI and Masanori OMOTE and Hidenori ENDO and Yasuhisa NIIMI},
journal={IEICE TRANSACTIONS on Fundamentals},
title={SUSKIT---A Speech Understanding System Based on Robust Phone Spotting--},
year={1991},
volume={E74-A},
number={7},
pages={1863--1869},
abstract={This paper gives an overview of our speech understanding system and reports recent results of sentence recognition experiments. The system, which we call SUSKIT-, recognizes database queries in natural Japanese sentences. The user is expected to speak sentence by sentence. Among the difficult problems to overcome, this study paid primary attention to how to cope with contextual variations in pronunciation and how to verify partial sentence hypotheses in a hierarchical system. SUSKIT- predicts word strings in a top-down manner; however, the verification of hypotheses against the input speech is done using a unit independent of word boundaries. Words are not suitable units of verification because the smoothing effect owing to phonetic contexts makes it difficult to recognize short words. In order to avoid the misrecognition caused by the smoothing effect across word boundaries, SUSKIT- dynamically extracts phoneme strings bounded by easily detectable phonemes from the predicted word string as verification templates. A left-to-right time-synchronous beam-search strategy was adopted for searching likely sentences. We carried out sentence recognition experiments using a speech corpus consisting of 159 sentences read by three Japanese male speakers. The task perplexity was 8.3. Using speaker-dependent HMM parameters, we obtained sentence recognition rates of 83.0-92.5%.},
keywords={},
doi={},
ISSN={},
month={July},}
TY - JOUR
TI - SUSKIT---A Speech Understanding System Based on Robust Phone Spotting--
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 1863
EP - 1869
AU - Yutaka KOBAYASHI
AU - Masanori OMOTE
AU - Hidenori ENDO
AU - Yasuhisa NIIMI
PY - 1991
DO -
JO - IEICE TRANSACTIONS on Fundamentals
SN -
VL - E74-A
IS - 7
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - July 1991
AB - This paper gives an overview of our speech understanding system and reports recent results of sentence recognition experiments. The system, which we call SUSKIT-, recognizes database queries in natural Japanese sentences. The user is expected to speak sentence by sentence. Among the difficult problems to overcome, this study paid primary attention to how to cope with contextual variations in pronunciation and how to verify partial sentence hypotheses in a hierarchical system. SUSKIT- predicts word strings in a top-down manner; however, the verification of hypotheses against the input speech is done using a unit independent of word boundaries. Words are not suitable units of verification because the smoothing effect owing to phonetic contexts makes it difficult to recognize short words. In order to avoid the misrecognition caused by the smoothing effect across word boundaries, SUSKIT- dynamically extracts phoneme strings bounded by easily detectable phonemes from the predicted word string as verification templates. A left-to-right time-synchronous beam-search strategy was adopted for searching likely sentences. We carried out sentence recognition experiments using a speech corpus consisting of 159 sentences read by three Japanese male speakers. The task perplexity was 8.3. Using speaker-dependent HMM parameters, we obtained sentence recognition rates of 83.0-92.5%.
ER -