Experiments were performed to investigate the perceptual contributions of static and dynamic features of vocal tract characteristics to talker individuality. An ARX (autoregressive with exogenous input) speech production model was used to separately extract voice source and vocal tract parameters from a Japanese sentence, /aoiueoie/ ("Say blue top" in English), uttered by three males. The Discrete Cosine Transform (DCT) was applied to resolve the formant trajectories of the speech signal into static and dynamic components. The perceptual contributions were quantitatively studied by systematically replacing the corresponding formant components of the sentences between the three talkers. The results show that the static (average) feature of the vocal tract is a primary cue to talker individuality.
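The DCT-based decomposition and the component swap described in the abstract can be illustrated with a minimal sketch. It assumes that the static component corresponds to the zeroth DCT coefficient (the trajectory average, as the abstract's "static (average)" suggests) and that all remaining coefficients form the dynamic component; the abstract does not state the exact split used in the paper. The function names and the formant values below are hypothetical and chosen only for illustration.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a sequence of floats."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for t in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        for k in range(1, n):
            s += c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (t + 0.5) * k / n)
        out.append(s)
    return out


def split_static_dynamic(trajectory):
    """Split a formant trajectory into static and dynamic parts.

    Assumption: static = zeroth DCT coefficient only (the average),
    dynamic = all higher-order coefficients.
    """
    coeffs = dct(trajectory)
    zeros = [0.0] * (len(coeffs) - 1)
    static = idct([coeffs[0]] + zeros)
    dynamic = idct([0.0] + coeffs[1:])
    return static, dynamic


def swap_static(traj_a, traj_b):
    """Combine talker A's dynamic component with talker B's static component."""
    if len(traj_a) != len(traj_b):
        raise ValueError("trajectories must have the same number of frames")
    coeffs_a, coeffs_b = dct(traj_a), dct(traj_b)
    return idct([coeffs_b[0]] + coeffs_a[1:])


# Hypothetical F1 trajectories in Hz for two talkers (invented values).
f1_a = [700.0, 500.0, 300.0, 320.0, 450.0, 480.0, 310.0, 460.0]
f1_b = [760.0, 540.0, 330.0, 350.0, 500.0, 520.0, 340.0, 510.0]

static_a, dynamic_a = split_static_dynamic(f1_a)
hybrid = swap_static(f1_a, f1_b)
```

Here `hybrid` keeps the frame-to-frame movement of talker A's trajectory while taking its average level from talker B, which is the kind of stimulus manipulation the abstract describes. Time alignment of the utterances and resynthesis through the ARX model are outside the scope of this sketch.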
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Weizhong ZHU, Hideki KASUYA, "Perceptual Contributions of Static and Dynamic Features of Vocal Tract Characteristics to Talker Individuality" in IEICE TRANSACTIONS on Fundamentals,
vol. E81-A, no. 2, pp. 268-274, February 1998.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/e81-a_2_268/_p
@ARTICLE{e81-a_2_268,
author={Weizhong ZHU and Hideki KASUYA},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Perceptual Contributions of Static and Dynamic Features of Vocal Tract Characteristics to Talker Individuality},
year={1998},
volume={E81-A},
number={2},
pages={268--274},
abstract={Experiments were performed to investigate perceptual contributions of static and dynamic features of vocal tract characteristics to talker individuality. An ARX (Auto-regressive with exogenous input) speech production model was used to extract separately voice source and vocal tract parameters from a Japanese sentence, /aoiueoie/ ("Say blue top" in English) uttered by three males. The Discrete Cosine Transform (DCT) was applied to resolve formant trajectories of the speech signal into static and dynamic components. The perceptual contributions were quantitatively studied by systematically replacing the corresponding formant components of the sentences between the three talkers. Results of the experiments show that the static (average) feature of the vocal tract is a primary cue to talker individuality.},
month={February},
}
TY  - JOUR
TI  - Perceptual Contributions of Static and Dynamic Features of Vocal Tract Characteristics to Talker Individuality
T2  - IEICE TRANSACTIONS on Fundamentals
SP  - 268
EP  - 274
AU  - Weizhong ZHU
AU  - Hideki KASUYA
PY  - 1998
JO  - IEICE TRANSACTIONS on Fundamentals
VL  - E81-A
IS  - 2
JA  - IEICE TRANSACTIONS on Fundamentals
Y1  - February 1998
AB  - Experiments were performed to investigate perceptual contributions of static and dynamic features of vocal tract characteristics to talker individuality. An ARX (Auto-regressive with exogenous input) speech production model was used to extract separately voice source and vocal tract parameters from a Japanese sentence, /aoiueoie/ ("Say blue top" in English) uttered by three males. The Discrete Cosine Transform (DCT) was applied to resolve formant trajectories of the speech signal into static and dynamic components. The perceptual contributions were quantitatively studied by systematically replacing the corresponding formant components of the sentences between the three talkers. Results of the experiments show that the static (average) feature of the vocal tract is a primary cue to talker individuality.
ER  - 