A new analysis technique applicable to speech recognition is proposed, motivated by the auditory mechanism of speech perception, which emphasizes spectral dynamics and compensates for the spectral undershoot associated with coarticulation. A speech wave is represented by LPC cepstrum and logarithmic energy sequences, and at every frame period the time sequences over short periods are expanded in first- and second-order polynomial functions. The dynamics of the cepstrum sequences are then emphasized by a linear combination of their polynomial expansion coefficients (that is, their derivatives) and their instantaneous values. Speaker-independent word recognition experiments using time functions of the dynamics-emphasized cepstrum and the polynomial coefficient for energy indicate that this method substantially reduces the error rate. The experimental results are compared with those of the previous method, in which the polynomial coefficients for the cepstrum and energy time functions were used together with the original time functions of these parameters as independent parameters.
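The polynomial expansion and linear combination described in the abstract can be illustrated with a minimal sketch for a single cepstral coefficient track. This is not the paper's implementation: the window half-width, the weights `w1` and `w2`, the edge handling, and the function names `regression_coeffs` and `emphasize_dynamics` are all assumptions chosen for illustration, since the abstract does not specify them.

```python
def regression_coeffs(seq, half_width=2):
    """Least-squares first- and second-order polynomial coefficients
    over a sliding window of 2 * half_width + 1 frames.

    Returns (slope, curvature), one value per frame. Edge frames reuse
    the first or last value of the sequence (an assumption).
    """
    ks = range(-half_width, half_width + 1)
    # Orthogonal polynomials on the symmetric grid:
    # P1(k) = k, P2(k) = k^2 - mean(k^2)
    mean_k2 = sum(k * k for k in ks) / (2 * half_width + 1)
    p1 = [float(k) for k in ks]
    p2 = [k * k - mean_k2 for k in ks]
    norm1 = sum(p * p for p in p1)
    norm2 = sum(p * p for p in p2)

    n = len(seq)
    slope, curvature = [], []
    for t in range(n):
        # Clamp indices so the window stays inside the sequence.
        window = [seq[min(max(t + k, 0), n - 1)] for k in ks]
        slope.append(sum(p * x for p, x in zip(p1, window)) / norm1)
        curvature.append(sum(p * x for p, x in zip(p2, window)) / norm2)
    return slope, curvature


def emphasize_dynamics(seq, w1=1.0, w2=1.0, half_width=2):
    """Linear combination of instantaneous values and polynomial
    coefficients. The weights w1 and w2 are hypothetical."""
    slope, curvature = regression_coeffs(seq, half_width)
    return [x + w1 * s + w2 * c for x, s, c in zip(seq, slope, curvature)]


if __name__ == "__main__":
    track = [0.0, 0.1, 0.4, 0.9, 1.6, 2.5, 3.6]  # made-up values
    print(emphasize_dynamics(track, w1=0.5, w2=0.5))
```

In this sketch the first-order coefficient is the familiar regression slope (often called a delta feature), and the second-order coefficient is the weight on the quadratic term, which for an exactly quadratic track equals half its second derivative. A steady track is left unchanged, while a moving one is pushed further in the direction of its motion, which is one way to read the abstract's remark about compensating for undershoot.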
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Sadaoki FURUI, "Speaker-Independent Isolated Word Recognition Based on Dynamics-Emphasized Cepstrum" in IEICE TRANSACTIONS on transactions,
vol. E69-E, no. 12, pp. 1310-1317, December 1986.
URL: https://global.ieice.org/en_transactions/transactions/10.1587/e69-e_12_1310/_p
@ARTICLE{e69-e_12_1310,
author={Sadaoki FURUI},
journal={IEICE TRANSACTIONS on transactions},
title={Speaker-Independent Isolated Word Recognition Based on Dynamics-Emphasized Cepstrum},
year={1986},
volume={E69-E},
number={12},
pages={1310-1317},
abstract={A new analysis technique applicable to speech recognition is proposed considering the auditory mechanism of speech perception which emphasizes spectral dynamics as well as compensates for the spectral undershoot associated with coarticulation. A speech wave is represented by the LPC cepstrum and logarithmic energy sequences, and the time sequences over short periods are expanded by the first- and second-order polynomial functions at every frame period. The dynamics of the cepstrum sequences are then emphasized by the linear combination of their polynomial expansion coefficients, that is, derivatives, and their instantaneous values. Speaker-independent word recognition experiments using time functions of the dynamics-emphasized cepstrum and the polynomial coefficient for energy indicate that the error rate can be largely reduced by this method. The experimental results are compared with those obtained by the previous method in which the polynomial coefficients for the cepstrum and energy time functions were used in combination with the original time functions of these parameters as independent parameters.},
month={December},
}
TY - JOUR
TI - Speaker-Independent Isolated Word Recognition Based on Dynamics-Emphasized Cepstrum
T2 - IEICE TRANSACTIONS on transactions
SP - 1310
EP - 1317
AU - Sadaoki FURUI
PY - 1986
JO - IEICE TRANSACTIONS on transactions
VL - E69-E
IS - 12
JA - IEICE TRANSACTIONS on transactions
Y1 - December 1986
AB - A new analysis technique applicable to speech recognition is proposed considering the auditory mechanism of speech perception which emphasizes spectral dynamics as well as compensates for the spectral undershoot associated with coarticulation. A speech wave is represented by the LPC cepstrum and logarithmic energy sequences, and the time sequences over short periods are expanded by the first- and second-order polynomial functions at every frame period. The dynamics of the cepstrum sequences are then emphasized by the linear combination of their polynomial expansion coefficients, that is, derivatives, and their instantaneous values. Speaker-independent word recognition experiments using time functions of the dynamics-emphasized cepstrum and the polynomial coefficient for energy indicate that the error rate can be largely reduced by this method. The experimental results are compared with those obtained by the previous method in which the polynomial coefficients for the cepstrum and energy time functions were used in combination with the original time functions of these parameters as independent parameters.
ER -