There are various difficulties in processing spoken dialogs because of acoustic, phonetic, and grammatical ill-formedness, and because of interactions among participants. This paper describes temporal characteristics of utterances in human-human task-oriented dialogs and interactions between the participants, analyzed in relation to the topic structure of the dialog. We analyzed 12 task-oriented simulated dialogs of ASJ continuous speech corpus conducted by 13 different participants whose total length being 66 minutes. Speech data was segmented into utterance units each of which is a speech interval segmented by pauses. There were 3876 utterance units, and 38.9% of them were interjections, fillers, false starts and chiming utterances. Each dialog consisted of 6 to 15 topic segments in each of which participants exchange specific information of the task. Eighty-six out of 119 new topic segments started with interjectory utterances and filled pauses. It was found that the durations of turn-taking interjections and fillers including the preceding silent pause were significantly longer in topic boundaries than the other positions. The results indicate that the duration of interjection words and filled pauses is a sign of a topic shift in spoken dialogs. In natural conversations, participants' speaking modes change dynamically as the conversation develops. Response time of both client and agent role speakers became shorter as the dialog proceeded. This indicates that interactions between the participants become active as the dialog proceeds. Speech rate was also affected by the dialog structure. It was generally fast in the initiating and terminating parts where most utterances are of fixed expressions, and slow in topic segments of the body part of the dialog where both client and agent participants stalled to speak in order to retrieve task knowledge. The results can be utilized in man-machine dialog systems, e.g., in order to detect topic shifts of a dialog, and to make the speech interface of dialog systems more natural to a human participant.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Kazuyuki TAKAGI, Shuichi ITAHASHI, "Temporal Characteristics of Utterance Units and Topic Structure of Spoken Dialogs" in IEICE TRANSACTIONS on Information,
vol. E78-D, no. 3, pp. 269-276, March 1995, doi: .
Abstract: There are various difficulties in processing spoken dialogs because of acoustic, phonetic, and grammatical ill-formedness, and because of interactions among participants. This paper describes temporal characteristics of utterances in human-human task-oriented dialogs and interactions between the participants, analyzed in relation to the topic structure of the dialog. We analyzed 12 task-oriented simulated dialogs of ASJ continuous speech corpus conducted by 13 different participants whose total length being 66 minutes. Speech data was segmented into utterance units each of which is a speech interval segmented by pauses. There were 3876 utterance units, and 38.9% of them were interjections, fillers, false starts and chiming utterances. Each dialog consisted of 6 to 15 topic segments in each of which participants exchange specific information of the task. Eighty-six out of 119 new topic segments started with interjectory utterances and filled pauses. It was found that the durations of turn-taking interjections and fillers including the preceding silent pause were significantly longer in topic boundaries than the other positions. The results indicate that the duration of interjection words and filled pauses is a sign of a topic shift in spoken dialogs. In natural conversations, participants' speaking modes change dynamically as the conversation develops. Response time of both client and agent role speakers became shorter as the dialog proceeded. This indicates that interactions between the participants become active as the dialog proceeds. Speech rate was also affected by the dialog structure. It was generally fast in the initiating and terminating parts where most utterances are of fixed expressions, and slow in topic segments of the body part of the dialog where both client and agent participants stalled to speak in order to retrieve task knowledge. The results can be utilized in man-machine dialog systems, e.g., in order to detect topic shifts of a dialog, and to make the speech interface of dialog systems more natural to a human participant.
URL: https://global.ieice.org/en_transactions/information/10.1587/e78-d_3_269/_p
Copy
@ARTICLE{e78-d_3_269,
author={Kazuyuki TAKAGI, Shuichi ITAHASHI, },
journal={IEICE TRANSACTIONS on Information},
title={Temporal Characteristics of Utterance Units and Topic Structure of Spoken Dialogs},
year={1995},
volume={E78-D},
number={3},
pages={269-276},
abstract={There are various difficulties in processing spoken dialogs because of acoustic, phonetic, and grammatical ill-formedness, and because of interactions among participants. This paper describes temporal characteristics of utterances in human-human task-oriented dialogs and interactions between the participants, analyzed in relation to the topic structure of the dialog. We analyzed 12 task-oriented simulated dialogs of ASJ continuous speech corpus conducted by 13 different participants whose total length being 66 minutes. Speech data was segmented into utterance units each of which is a speech interval segmented by pauses. There were 3876 utterance units, and 38.9% of them were interjections, fillers, false starts and chiming utterances. Each dialog consisted of 6 to 15 topic segments in each of which participants exchange specific information of the task. Eighty-six out of 119 new topic segments started with interjectory utterances and filled pauses. It was found that the durations of turn-taking interjections and fillers including the preceding silent pause were significantly longer in topic boundaries than the other positions. The results indicate that the duration of interjection words and filled pauses is a sign of a topic shift in spoken dialogs. In natural conversations, participants' speaking modes change dynamically as the conversation develops. Response time of both client and agent role speakers became shorter as the dialog proceeded. This indicates that interactions between the participants become active as the dialog proceeds. Speech rate was also affected by the dialog structure. It was generally fast in the initiating and terminating parts where most utterances are of fixed expressions, and slow in topic segments of the body part of the dialog where both client and agent participants stalled to speak in order to retrieve task knowledge. The results can be utilized in man-machine dialog systems, e.g., in order to detect topic shifts of a dialog, and to make the speech interface of dialog systems more natural to a human participant.},
keywords={},
doi={},
ISSN={},
month={March},}
Copy
TY - JOUR
TI - Temporal Characteristics of Utterance Units and Topic Structure of Spoken Dialogs
T2 - IEICE TRANSACTIONS on Information
SP - 269
EP - 276
AU - Kazuyuki TAKAGI
AU - Shuichi ITAHASHI
PY - 1995
DO -
JO - IEICE TRANSACTIONS on Information
SN -
VL - E78-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 1995
AB - There are various difficulties in processing spoken dialogs because of acoustic, phonetic, and grammatical ill-formedness, and because of interactions among participants. This paper describes temporal characteristics of utterances in human-human task-oriented dialogs and interactions between the participants, analyzed in relation to the topic structure of the dialog. We analyzed 12 task-oriented simulated dialogs of ASJ continuous speech corpus conducted by 13 different participants whose total length being 66 minutes. Speech data was segmented into utterance units each of which is a speech interval segmented by pauses. There were 3876 utterance units, and 38.9% of them were interjections, fillers, false starts and chiming utterances. Each dialog consisted of 6 to 15 topic segments in each of which participants exchange specific information of the task. Eighty-six out of 119 new topic segments started with interjectory utterances and filled pauses. It was found that the durations of turn-taking interjections and fillers including the preceding silent pause were significantly longer in topic boundaries than the other positions. The results indicate that the duration of interjection words and filled pauses is a sign of a topic shift in spoken dialogs. In natural conversations, participants' speaking modes change dynamically as the conversation develops. Response time of both client and agent role speakers became shorter as the dialog proceeded. This indicates that interactions between the participants become active as the dialog proceeds. Speech rate was also affected by the dialog structure. It was generally fast in the initiating and terminating parts where most utterances are of fixed expressions, and slow in topic segments of the body part of the dialog where both client and agent participants stalled to speak in order to retrieve task knowledge. The results can be utilized in man-machine dialog systems, e.g., in order to detect topic shifts of a dialog, and to make the speech interface of dialog systems more natural to a human participant.
ER -