Pitch frequency is a basic characteristic of the human voice, and pitch extraction is one of the most important topics in speech recognition. This paper describes a simple but effective technique for obtaining the correct pitch frequency from candidates (pitch candidates) extracted by the short-range autocorrelation function. The correction is performed by a neural network that takes temporal continuity into account by referring to the pitch candidates of previous frames. Since the neural network is trained by the back-propagation algorithm with training data, it adapts to any speaker and obtains good correction without sensitive adjustment and tuning. Pitch extraction was performed for 3 male and 3 female announcers, and the proposed method improves the percentage of correct pitch from 58.65% to 89.19%.
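The candidate-extraction step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pitch_candidates`, the search range `fmin`/`fmax`, the number of candidates, and the peak-picking rule are all assumptions made for illustration. It computes the autocorrelation of one short frame and returns the frequencies of its strongest peaks; in a scheme like the one described, such candidates from the current and previous frames would then be passed to a trained network that selects the corrected pitch.

```python
import numpy as np

def pitch_candidates(frame, sample_rate, fmin=60.0, fmax=400.0, n_candidates=3):
    """Return up to n_candidates pitch frequencies in Hz, strongest peak first.

    Hypothetical helper: the parameter names and defaults are illustrative
    assumptions, not values taken from the paper.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()  # remove any DC offset

    # Autocorrelation of the frame, keeping only non-negative lags.
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    if acf[0] <= 0:
        return []  # silent frame: no candidates
    acf = acf / acf[0]  # normalize so that lag 0 equals 1

    # Restrict the search to lags that correspond to plausible pitch values.
    lag_min = max(int(sample_rate / fmax), 1)
    lag_max = min(int(sample_rate / fmin), len(acf) - 2)
    lags = np.arange(lag_min, lag_max + 1)

    # A candidate is a local maximum of the autocorrelation.
    is_peak = (acf[lags] > acf[lags - 1]) & (acf[lags] >= acf[lags + 1])
    peak_lags = lags[is_peak]

    # Rank the peaks by autocorrelation value and convert lags to frequencies.
    order = np.argsort(acf[peak_lags])[::-1][:n_candidates]
    return [sample_rate / lag for lag in peak_lags[order]]


# Usage: a synthetic 200 Hz tone sampled at 8 kHz, 50 ms frame.
sample_rate = 8000
t = np.arange(400) / sample_rate
candidates = pitch_candidates(np.sin(2 * np.pi * 200.0 * t), sample_rate)
```

For a clean tone the strongest candidate sits at the true pitch, while the weaker ones fall at sub-multiples of it. On real speech the strongest peak is often one of those sub-multiples or a harmonic, which is the kind of error a correction stage using previous frames is meant to resolve.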
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Akio OGIHARA, Kunio FUKUNAGA, "A Correcting Method for Pitch Extraction Using Neural Networks" in IEICE TRANSACTIONS on Fundamentals,
vol. E77-A, no. 6, pp. 1015-1022, June 1994.
Abstract: Pitch frequency is a basic characteristic of the human voice, and pitch extraction is one of the most important topics in speech recognition. This paper describes a simple but effective technique for obtaining the correct pitch frequency from candidates (pitch candidates) extracted by the short-range autocorrelation function. The correction is performed by a neural network that takes temporal continuity into account by referring to the pitch candidates of previous frames. Since the neural network is trained by the back-propagation algorithm with training data, it adapts to any speaker and obtains good correction without sensitive adjustment and tuning. Pitch extraction was performed for 3 male and 3 female announcers, and the proposed method improves the percentage of correct pitch from 58.65% to 89.19%.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/e77-a_6_1015/_p
@ARTICLE{e77-a_6_1015,
author={Akio OGIHARA and Kunio FUKUNAGA},
journal={IEICE TRANSACTIONS on Fundamentals},
title={A Correcting Method for Pitch Extraction Using Neural Networks},
year={1994},
volume={E77-A},
number={6},
pages={1015--1022},
abstract={Pitch frequency is a basic characteristic of the human voice, and pitch extraction is one of the most important topics in speech recognition. This paper describes a simple but effective technique for obtaining the correct pitch frequency from candidates (pitch candidates) extracted by the short-range autocorrelation function. The correction is performed by a neural network that takes temporal continuity into account by referring to the pitch candidates of previous frames. Since the neural network is trained by the back-propagation algorithm with training data, it adapts to any speaker and obtains good correction without sensitive adjustment and tuning. Pitch extraction was performed for 3 male and 3 female announcers, and the proposed method improves the percentage of correct pitch from 58.65% to 89.19%.},
month={June},}
TY - JOUR
TI - A Correcting Method for Pitch Extraction Using Neural Networks
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 1015
EP - 1022
AU - Akio OGIHARA
AU - Kunio FUKUNAGA
PY - 1994
JO - IEICE TRANSACTIONS on Fundamentals
VL - E77-A
IS - 6
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - June 1994
AB - Pitch frequency is a basic characteristic of the human voice, and pitch extraction is one of the most important topics in speech recognition. This paper describes a simple but effective technique for obtaining the correct pitch frequency from candidates (pitch candidates) extracted by the short-range autocorrelation function. The correction is performed by a neural network that takes temporal continuity into account by referring to the pitch candidates of previous frames. Since the neural network is trained by the back-propagation algorithm with training data, it adapts to any speaker and obtains good correction without sensitive adjustment and tuning. Pitch extraction was performed for 3 male and 3 female announcers, and the proposed method improves the percentage of correct pitch from 58.65% to 89.19%.
ER -