In this paper, we describe recent trends in automatic speech recognition. First, we point out that the current state of the art in machine speech recognition is admittedly inferior to human ability. In particular, we assert that acoustic models must be improved. Second, we describe feature parameters that are robust in noisy environments, which are important in practical use. We then indicate that a large amount of training data collected in the same environment as the recognition stage is useful from the viewpoints of information theory and pattern recognition. Third, we discuss acoustic models and language models, which are central issues in speech recognition techniques. We discuss the principle and limitations of the hidden Markov model (HMM) and recent extended models. The role of language models is to eliminate improbable candidate words, that is, to reduce the search space. In other words, language models with smaller entropy are preferable. From this standpoint, we survey stochastic language models. Finally, we state some points that deserve attention when constructing speech recognition systems.
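The link the abstract draws between entropy and search space can be illustrated with a small sketch. It is not taken from the paper: the two next-word distributions below are hypothetical, and the helper names `entropy_bits` and `perplexity` are chosen only for this example. Perplexity, defined as 2 raised to the entropy in bits, can be read as the effective number of equally likely candidate words the recognizer has to consider.

```python
import math

def entropy_bits(probs):
    """Shannon entropy, in bits, of a next-word probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def perplexity(probs):
    """Perplexity 2**H: the effective number of equally likely candidates."""
    return 2 ** entropy_bits(probs)

# Hypothetical next-word distributions over a 4-word vocabulary.
uniform = [0.25, 0.25, 0.25, 0.25]  # no linguistic knowledge: all words equally likely
peaked = [0.7, 0.1, 0.1, 0.1]       # a model that strongly favors one word

print("uniform:", entropy_bits(uniform), "bits, perplexity", perplexity(uniform))
print("peaked: ", entropy_bits(peaked), "bits, perplexity", perplexity(peaked))
```

Under these assumed numbers, the uniform model has a perplexity of 4 (every word remains a candidate), while the peaked model has a lower entropy and therefore a smaller effective search space.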
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Seiichi NAKAGAWA, "A Survey on Automatic Speech Recognition," in IEICE TRANSACTIONS on Information, vol. E85-D, no. 3, pp. 465-486, March 2002.
URL: https://global.ieice.org/en_transactions/information/10.1587/e85-d_3_465/_p
@ARTICLE{e85-d_3_465,
author={Seiichi NAKAGAWA},
journal={IEICE TRANSACTIONS on Information},
title={A Survey on Automatic Speech Recognition},
year={2002},
volume={E85-D},
number={3},
pages={465-486},
abstract={In this paper, we describe the recent trend in automatic speech recognition. First, we should point out that the current art of speech recognition by machines is admittedly inferior to the ability of human beings. In particular, we assert that the improvement of acoustic models is necessary. Second, we describe robust feature parameters for noisy environments, which are important in practical usage. Then, we indicate that much training data in the same environment as the recognition stage are useful from the viewpoints of information theory and pattern recognition. Third, we discuss acoustic models and language models which are central issues in speech recognition techniques. Then the principle and limitations of the hidden Markov model (HMM) and recent extended models are discussed. The role of language models is to eliminate improbable candidate words, that is, to reduce the search space. In other words, language models having smaller entropy are preferable. From this standpoint, we survey stochastic language models. Finally, we state some points which deserve attention when constructing speech recognition systems.},
month={March},}
TY - JOUR
TI - A Survey on Automatic Speech Recognition
T2 - IEICE TRANSACTIONS on Information
SP - 465
EP - 486
AU - Seiichi NAKAGAWA
PY - 2002
JO - IEICE TRANSACTIONS on Information
VL - E85-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2002
AB - In this paper, we describe the recent trend in automatic speech recognition. First, we should point out that the current art of speech recognition by machines is admittedly inferior to the ability of human beings. In particular, we assert that the improvement of acoustic models is necessary. Second, we describe robust feature parameters for noisy environments, which are important in practical usage. Then, we indicate that much training data in the same environment as the recognition stage are useful from the viewpoints of information theory and pattern recognition. Third, we discuss acoustic models and language models which are central issues in speech recognition techniques. Then the principle and limitations of the hidden Markov model (HMM) and recent extended models are discussed. The role of language models is to eliminate improbable candidate words, that is, to reduce the search space. In other words, language models having smaller entropy are preferable. From this standpoint, we survey stochastic language models. Finally, we state some points which deserve attention when constructing speech recognition systems.
ER -