An on-line automatic spoken word recognition system has been developed for research on automatic speech recognition. In this system, the spoken word is first converted into a time series of short-time spectra by a 29-channel filter bank of single-tuned, low-selectivity filters. Three major local peaks in the spectrum and the power of the speech wave are extracted every 10 ms. Using these three peaks and the speech power, the input speech is transformed into several possible phonemic sequences. The similarity of the sequence to every item of the word dictionary in the recognition system is computed, and the dictionary item with the maximum similarity is chosen as the recognition output. Recognition experiments have been carried out with the system. In the interactive experiment, the recognition score was 94% for 51 city names uttered by 25 arbitrarily chosen male speakers.
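The two steps the abstract spells out most concretely, picking three major local peaks from each spectral frame and choosing the dictionary item with maximum similarity, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not define the similarity measure, so a normalized edit distance stands in for it here, and the function names, the single-sequence input (the system considers several possible sequences), and the romanized city names are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def top_local_peaks(spectrum: Sequence[float], n_peaks: int = 3) -> List[int]:
    """Return channel indices of the n_peaks largest local maxima, in channel order."""
    peaks = [
        i for i in range(1, len(spectrum) - 1)
        if spectrum[i] > spectrum[i - 1] and spectrum[i] >= spectrum[i + 1]
    ]
    strongest = sorted(peaks, key=lambda i: spectrum[i], reverse=True)[:n_peaks]
    return sorted(strongest)


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two symbol sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = curr
    return prev[-1]


def similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Stand-in similarity (assumption): 1 minus the length-normalized edit distance."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def recognize(sequence: Sequence[str], dictionary: Dict[str, Sequence[str]]) -> str:
    """Return the dictionary word whose reference sequence is most similar to the input."""
    return max(dictionary, key=lambda word: similarity(sequence, dictionary[word]))


# One hypothetical 29-channel frame with four local maxima of different heights.
frame = [0.0] * 29
frame[3], frame[10], frame[20], frame[25] = 5.0, 9.0, 7.0, 2.0
peak_channels = top_local_peaks(frame)

# Hypothetical three-word dictionary; letters stand in for phoneme symbols.
city_dictionary = {name: list(name) for name in ("sendai", "tokyo", "osaka")}
best_match = recognize(list("sentai"), city_dictionary)
```

In this sketch a real front end would call `top_local_peaks` once per 10 ms frame and pass the peaks, together with the frame power, to a phoneme classifier, which the abstract does not describe and which is therefore omitted.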
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Ken'iti KIDO, Takahide MATSUOKA, Jouji MIWA, Shozo MAKINO, "Spoken Word Recognition System for Unlimited Adult Male Speakers" in IEICE TRANSACTIONS on transactions,
vol. E61-E, no. 8, pp. 593-598, August 1978.
URL: https://global.ieice.org/en_transactions/transactions/10.1587/e61-e_8_593/_p
@ARTICLE{e61-e_8_593,
author={Ken'iti KIDO and Takahide MATSUOKA and Jouji MIWA and Shozo MAKINO},
journal={IEICE TRANSACTIONS on transactions},
title={Spoken Word Recognition System for Unlimited Adult Male Speakers},
year={1978},
volume={E61-E},
number={8},
pages={593-598},
keywords={},
doi={},
ISSN={},
month={August},}
TY - JOUR
TI - Spoken Word Recognition System for Unlimited Adult Male Speakers
T2 - IEICE TRANSACTIONS on transactions
SP - 593
EP - 598
AU - Ken'iti KIDO
AU - Takahide MATSUOKA
AU - Jouji MIWA
AU - Shozo MAKINO
PY - 1978
DO -
JO - IEICE TRANSACTIONS on transactions
SN -
VL - E61-E
IS - 8
JA - IEICE TRANSACTIONS on transactions
Y1 - August 1978
ER -