Automatic labeling of prosodic features is an important topic when constructing large speech databases for speech synthesis or analysis purposes. Perceptually-related F0 parameters are proposed with the aim of automatically classifying phrase final tones. Analyses are conducted to verify how consistently subjects are able to categorize phrase final tones, and how perceptual features are related to the categories. Three types of acoustic parameters are proposed and analyzed for representing the perceptual features related to the tone categories: one related to pitch movement within the phrase final, one related to pitch reset prior to the phrase final, and one related to the length of the phrase final. A classification tree is constructed to evaluate automatic classification of phrase final tones, resulting in 79.2% accuracy for the consistently categorized samples, using the best combination among the proposed acoustic parameters.
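As a rough illustration of the three parameter types named in the abstract, the following Python sketch computes one possible value for each and passes them through a small hand-written decision tree. It is not the paper's method: the semitone-based definitions, the `PhraseFinal` fields, the thresholds, and the tone labels are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math
from dataclasses import dataclass


def semitones(f_from_hz: float, f_to_hz: float) -> float:
    """Signed musical interval from f_from_hz to f_to_hz, in semitones."""
    return 12.0 * math.log2(f_to_hz / f_from_hz)


@dataclass
class PhraseFinal:
    # All fields are hypothetical measurements, not taken from the paper.
    f0_prev_end_hz: float  # F0 at the end of the segment before the phrase final
    f0_start_hz: float     # F0 at the start of the phrase final
    f0_end_hz: float       # F0 at the end of the phrase final
    duration_s: float      # length of the phrase final in seconds


def features(p: PhraseFinal) -> dict:
    """One assumed parameter per type: movement, reset, and length."""
    return {
        "movement_st": semitones(p.f0_start_hz, p.f0_end_hz),
        "reset_st": semitones(p.f0_prev_end_hz, p.f0_start_hz),
        "duration_s": p.duration_s,
    }


def classify(p: PhraseFinal, pitch_thresh_st: float = 2.0,
             long_thresh_s: float = 0.25) -> str:
    """Toy decision tree; thresholds and labels are illustrative only."""
    f = features(p)
    if f["movement_st"] >= pitch_thresh_st:
        return "rise"
    if f["movement_st"] <= -pitch_thresh_st:
        return "fall"
    if f["reset_st"] >= pitch_thresh_st:
        return "flat, reset"
    return "flat, long" if f["duration_s"] >= long_thresh_s else "flat, short"


if __name__ == "__main__":
    sample = PhraseFinal(f0_prev_end_hz=120.0, f0_start_hz=120.0,
                         f0_end_hz=240.0, duration_s=0.2)
    print(features(sample), classify(sample))
```

A tree trained on labeled data would learn its split points from the samples; the fixed thresholds here only show how the three kinds of parameter could feed into such a tree.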
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Carlos Toshinori ISHI, "Perceptually-Related F0 Parameters for Automatic Classification of Phrase Final Tones" in IEICE TRANSACTIONS on Information,
vol. E88-D, no. 3, pp. 481-488, March 2005, doi: 10.1093/ietisy/e88-d.3.481.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e88-d.3.481/_p
@ARTICLE{e88-d_3_481,
author={Carlos Toshinori ISHI},
journal={IEICE TRANSACTIONS on Information},
title={Perceptually-Related F0 Parameters for Automatic Classification of Phrase Final Tones},
year={2005},
volume={E88-D},
number={3},
pages={481-488},
abstract={Automatic labeling of prosodic features is an important topic when constructing large speech databases for speech synthesis or analysis purposes. Perceptually-related F0 parameters are proposed with the aim of automatically classifying phrase final tones. Analyses are conducted to verify how consistently subjects are able to categorize phrase final tones, and how perceptual features are related with the categories. Three types of acoustic parameters are proposed and analyzed for representing the perceptual features related to the tone categories: one related to pitch movement within the phrase final, one related to pitch reset prior to the phrase final, and one related to the length of the phrase final. A classification tree is constructed to evaluate automatic classification of phrase final tones, resulting in 79.2% accuracy for the consistently categorized samples, using the best combination among the proposed acoustic parameters.},
keywords={},
doi={10.1093/ietisy/e88-d.3.481},
ISSN={},
month={March},}
TY  - JOUR
TI  - Perceptually-Related F0 Parameters for Automatic Classification of Phrase Final Tones
T2  - IEICE TRANSACTIONS on Information
SP  - 481
EP  - 488
AU  - Carlos Toshinori ISHI
PY  - 2005
DO  - 10.1093/ietisy/e88-d.3.481
JO  - IEICE TRANSACTIONS on Information
SN  -
VL  - E88-D
IS  - 3
JA  - IEICE TRANSACTIONS on Information
Y1  - March 2005
AB  - Automatic labeling of prosodic features is an important topic when constructing large speech databases for speech synthesis or analysis purposes. Perceptually-related F0 parameters are proposed with the aim of automatically classifying phrase final tones. Analyses are conducted to verify how consistently subjects are able to categorize phrase final tones, and how perceptual features are related with the categories. Three types of acoustic parameters are proposed and analyzed for representing the perceptual features related to the tone categories: one related to pitch movement within the phrase final, one related to pitch reset prior to the phrase final, and one related to the length of the phrase final. A classification tree is constructed to evaluate automatic classification of phrase final tones, resulting in 79.2% accuracy for the consistently categorized samples, using the best combination among the proposed acoustic parameters.
ER  -