Estimating Syntactic Structure from Prosody in Japanese Speech

Tomoko OHSUGA; Yasuo HORIUCHI; Akira ICHIKAWA

Summary:

In this study, we introduce a method for estimating the syntactic structure of Japanese speech from the F0 contour and pause duration. We define a prosodic unit (PU) as a segment bounded by a local minimum of the F0 contour or by a pause. By repeatedly combining pairs of adjacent PUs into a single PU, a tree structure is built up step by step. For each sequence of three PUs, a discriminant function, obtained by discriminant analysis of a speech corpus, decides which pair should be combined. We applied the method to the ATR Phonetically Balanced Sentences read by four Japanese speakers. The correct-judgement rate for each sequence of three PUs was 79%, and the estimation accuracy for the entire syntactic structure of each sentence was 26%. We consider this a good degree of accuracy for the difficult task of estimating syntactic structure from prosody alone.
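The bottom-up combination step described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the `PU` class, the single `pause_after` feature, the `toy_discriminant` rule, and the left-to-right scanning order are all assumptions made for illustration. The summary only states that a discriminant function chooses which pair in a window of three PUs is combined; the actual features and coefficients come from the paper's discriminant analysis and are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PU:
    """Hypothetical prosodic unit: a label plus one illustrative feature."""
    label: str
    pause_after: float  # assumed feature: pause (seconds) following the unit


def merge(left: PU, right: PU) -> PU:
    # A combined PU keeps the bracketing in its label and inherits the
    # pause that follows its right-hand child.
    return PU(f"({left.label} {right.label})", right.pause_after)


def toy_discriminant(a: PU, b: PU, c: PU) -> float:
    # Stand-in for the trained discriminant function (an assumption).
    # A value >= 0 means "combine (a, b)"; a value < 0 means "combine (b, c)".
    # Here the pair separated by the shorter pause is preferred.
    return b.pause_after - a.pause_after


def build_tree(units: List[PU],
               discriminant: Callable[[PU, PU, PU], float] = toy_discriminant) -> PU:
    """Combine PUs pairwise until one PU (the root of the tree) remains."""
    pus = list(units)
    if not pus:
        raise ValueError("at least one PU is required")
    while len(pus) > 2:
        i = 0
        while True:
            a, b, c = pus[i:i + 3]
            if discriminant(a, b, c) >= 0:
                pus[i:i + 2] = [merge(a, b)]      # combine the left pair
                break
            if i + 3 == len(pus):
                pus[i + 1:i + 3] = [merge(b, c)]  # last window: combine the right pair
                break
            i += 1                                # defer and slide the window right
    if len(pus) == 2:
        pus = [merge(pus[0], pus[1])]
    return pus[0]


example = [PU("A", 0.30), PU("B", 0.05), PU("C", 0.40), PU("D", 0.0)]
print(build_tree(example).label)  # prints ((A (B C)) D)
```

Because each merge is decided from a local window of three PUs, a single wrong decision changes the bracketing of the whole sentence, which is consistent with the per-window rate being much higher than the whole-sentence accuracy reported in the summary.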

Publication: IEICE TRANSACTIONS on Information Vol.E86-D No.3 pp.558-564

Publication Date: 2003/03/01

Type of Manuscript: Special Section PAPER (Special Issue on Speech Information Processing)

Category: Speech Synthesis and Prosody

Tomoko OHSUGA, Yasuo HORIUCHI, Akira ICHIKAWA, "Estimating Syntactic Structure from Prosody in Japanese Speech" in IEICE TRANSACTIONS on Information, vol. E86-D, no. 3, pp. 558-564, March 2003.
URL: https://global.ieice.org/en_transactions/information/10.1587/e86-d_3_558/_p

@ARTICLE{e86-d_3_558,
author={Tomoko OHSUGA and Yasuo HORIUCHI and Akira ICHIKAWA},
journal={IEICE TRANSACTIONS on Information},
title={Estimating Syntactic Structure from Prosody in Japanese Speech},
year={2003},
volume={E86-D},
number={3},
pages={558-564},
abstract={In this study, we introduce a method for estimating the syntactic structure of Japanese speech from F0 contour and pause duration. We defined a prosodic unit (PU) which is divided by the local minimal point of an F0 contour or pause. Combining PUs repeatedly (a pair of PUs is combined into one PU), a tree structure is gradually generated. Which pair of PUs in a sequence of three PUs should be combined is decided by a discriminant function based on the discriminant analysis of a corpus of speech data. We applied the method to the ATR Phonetically Balanced Sentences read by four Japanese speakers. We found that with this method, the correct rate of judgement for each sequence of three PUs is 79% and the estimation accuracy of the entire syntactic structure for each sentence is 26%. We consider this result to demonstrate a good degree of accuracy for the difficult task of estimating syntactic structure only from prosody.},
month={March}
}

TY - JOUR
TI - Estimating Syntactic Structure from Prosody in Japanese Speech
T2 - IEICE TRANSACTIONS on Information
SP - 558
EP - 564
AU - Tomoko OHSUGA
AU - Yasuo HORIUCHI
AU - Akira ICHIKAWA
PY - 2003
JO - IEICE TRANSACTIONS on Information
VL - E86-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2003
AB - In this study, we introduce a method for estimating the syntactic structure of Japanese speech from F0 contour and pause duration. We defined a prosodic unit (PU) which is divided by the local minimal point of an F0 contour or pause. Combining PUs repeatedly (a pair of PUs is combined into one PU), a tree structure is gradually generated. Which pair of PUs in a sequence of three PUs should be combined is decided by a discriminant function based on the discriminant analysis of a corpus of speech data. We applied the method to the ATR Phonetically Balanced Sentences read by four Japanese speakers. We found that with this method, the correct rate of judgement for each sequence of three PUs is 79% and the estimation accuracy of the entire syntactic structure for each sentence is 26%. We consider this result to demonstrate a good degree of accuracy for the difficult task of estimating syntactic structure only from prosody.
ER -