Generating F0 Contours by Statistical Manipulation of Natural F0 Shapes

Takashi SAITO

doi:10.1093/ietisy/e89-d.3.1100

IEICE TRANSACTIONS on Information

Summary:

This paper describes a method of generating F0 contours from natural F0 segmental shapes for speech synthesis. The extracted shapes of the F0 units are basically held invariant by eliminating any averaging operations in the analysis phase and by minimizing modification operations in the synthesis phase. The use of natural F0 shapes has great potential to cover a wide variety of speaking styles with the same framework, including not only read-aloud speech, but also dialogues and emotional speech. A linear-regression statistical model is used to "manipulate" the stored raw F0 shapes to build them up into a sentential F0 contour. Through experimental evaluations, the proposed model is shown to provide stable and robust F0 contour prediction for various speakers. By using this model, linguistically derived information about a sentence can be directly mapped, in a purely data-driven manner, to acoustic F0 values of the sentential intonation contour for a given target speaker.
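The idea in the summary, a regression model that positions stored natural F0 shapes without reshaping them, can be pictured with a small sketch. This is only an illustration under stated assumptions, not the paper's model: the single predictor (unit position in the sentence), the log-domain shift, the helper names `fit_line`, `place_shape`, and `build_contour`, and all numbers are hypothetical and chosen for brevity.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def place_shape(shape_hz, target_mean_log_f0):
    """Shift a stored F0 shape in the log domain so that its mean log-F0
    equals the target. Ratios between samples (the shape) are unchanged."""
    logs = [math.log(f) for f in shape_hz]
    shift = target_mean_log_f0 - sum(logs) / len(logs)
    return [math.exp(v + shift) for v in logs]

def build_contour(units, a, b):
    """Concatenate shifted shapes; each unit is (position, stored shape in Hz)."""
    contour = []
    for position, shape_hz in units:
        target = a + b * position  # regression prediction for this unit
        contour.extend(place_shape(shape_hz, target))
    return contour

# Made-up training data: mean log-F0 of units at sentence positions 0..3.
train_positions = [0, 1, 2, 3]
train_mean_log_f0 = [math.log(f) for f in (220.0, 200.0, 182.0, 165.0)]
a, b = fit_line(train_positions, train_mean_log_f0)

# Made-up stored natural shapes (Hz), tagged with their target positions.
stored_units = [(0, [180.0, 200.0, 190.0]), (1, [150.0, 170.0, 160.0])]
contour = build_contour(stored_units, a, b)
```

Because the only modification is a constant offset in the log domain, each stored shape keeps its internal rise and fall, which mirrors the summary's emphasis on minimizing modification in the synthesis phase. A realistic model would presumably use many linguistic predictors rather than position alone.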

Publication: IEICE TRANSACTIONS on Information Vol.E89-D No.3 pp.1100-1106

Publication Date: 2006/03/01

Online ISSN: 1745-1361

DOI: 10.1093/ietisy/e89-d.3.1100

Type of Manuscript: Special Section PAPER (Special Section on Statistical Modeling for Speech Processing)

Category: Speech Analysis

Takashi SAITO, "Generating F0 Contours by Statistical Manipulation of Natural F0 Shapes" in IEICE TRANSACTIONS on Information, vol. E89-D, no. 3, pp. 1100-1106, March 2006, doi: 10.1093/ietisy/e89-d.3.1100.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e89-d.3.1100/_p

@ARTICLE{e89-d_3_1100,
author={Takashi SAITO},
journal={IEICE TRANSACTIONS on Information},
title={Generating F0 Contours by Statistical Manipulation of Natural F0 Shapes},
year={2006},
volume={E89-D},
number={3},
pages={1100-1106},
abstract={This paper describes a method of generating F0 contours from natural F0 segmental shapes for speech synthesis. The extracted shapes of the F0 units are basically held invariant by eliminating any averaging operations in the analysis phase and by minimizing modification operations in the synthesis phase. The use of natural F0 shapes has great potential to cover a wide variety of speaking styles with the same framework, including not only read-aloud speech, but also dialogues and emotional speech. A linear-regression statistical model is used to "manipulate" the stored raw F0 shapes to build them up into a sentential F0 contour. Through experimental evaluations, the proposed model is shown to provide stable and robust F0 contour prediction for various speakers. By using this model, linguistically derived information about a sentence can be directly mapped, in a purely data-driven manner, to acoustic F0 values of the sentential intonation contour for a given target speaker.},
keywords={},
doi={10.1093/ietisy/e89-d.3.1100},
ISSN={1745-1361},
month={March}
}

TY - JOUR
TI - Generating F0 Contours by Statistical Manipulation of Natural F0 Shapes
T2 - IEICE TRANSACTIONS on Information
SP - 1100
EP - 1106
AU - Takashi SAITO
PY - 2006
DO - 10.1093/ietisy/e89-d.3.1100
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E89-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2006
AB - This paper describes a method of generating F0 contours from natural F0 segmental shapes for speech synthesis. The extracted shapes of the F0 units are basically held invariant by eliminating any averaging operations in the analysis phase and by minimizing modification operations in the synthesis phase. The use of natural F0 shapes has great potential to cover a wide variety of speaking styles with the same framework, including not only read-aloud speech, but also dialogues and emotional speech. A linear-regression statistical model is used to "manipulate" the stored raw F0 shapes to build them up into a sentential F0 contour. Through experimental evaluations, the proposed model is shown to provide stable and robust F0 contour prediction for various speakers. By using this model, linguistically derived information about a sentence can be directly mapped, in a purely data-driven manner, to acoustic F0 values of the sentential intonation contour for a given target speaker.
ER -
