
Keyword Search Result

[Keyword] speaking mode (1 hit)

1-1hit
  • Temporal Characteristics of Utterance Units and Topic Structure of Spoken Dialogs

    Kazuyuki TAKAGI  Shuichi ITAHASHI  

     
    PAPER-Speech Processing

    Vol: E78-D  No: 3  Page(s): 269-276

    Processing spoken dialogs is difficult because of acoustic, phonetic, and grammatical ill-formedness, and because of interactions among participants. This paper describes temporal characteristics of utterances in human-human task-oriented dialogs and interactions between the participants, analyzed in relation to the topic structure of the dialog. We analyzed 12 simulated task-oriented dialogs from the ASJ continuous speech corpus, conducted by 13 different participants, with a total length of 66 minutes. The speech data were segmented into utterance units, each of which is a speech interval delimited by pauses. There were 3876 utterance units, and 38.9% of them were interjections, fillers, false starts, and chiming utterances. Each dialog consisted of 6 to 15 topic segments, in each of which the participants exchanged specific information about the task. Eighty-six of the 119 new topic segments started with interjectory utterances and filled pauses. The durations of turn-taking interjections and fillers, including the preceding silent pause, were significantly longer at topic boundaries than at other positions. The results indicate that the duration of interjection words and filled pauses is a sign of a topic shift in spoken dialogs. In natural conversations, participants' speaking modes change dynamically as the conversation develops. The response times of both client-role and agent-role speakers became shorter as the dialog proceeded, which indicates that interaction between the participants becomes more active as the dialog proceeds. Speech rate was also affected by the dialog structure. It was generally fast in the initiating and terminating parts, where most utterances are fixed expressions, and slow in the topic segments of the body of the dialog, where both client and agent participants stalled in their speech in order to retrieve task knowledge.
The results can be utilized in man-machine dialog systems, e.g., in order to detect topic shifts of a dialog, and to make the speech interface of dialog systems more natural to a human participant.
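
As a rough illustration of the pause-based segmentation mentioned in the abstract, the Python sketch below merges detected speech intervals into utterance units. The abstract does not state the pause threshold or the data representation, so the 0.4 s `min_pause` default, the `(start, end)` interval format, and the function name `segment_utterance_units` are assumptions made for this example, not the authors' procedure.

```python
from typing import List, Tuple

# Hypothetical representation: one speech interval as (start_sec, end_sec).
Interval = Tuple[float, float]


def segment_utterance_units(
    speech: List[Interval], min_pause: float = 0.4
) -> List[Interval]:
    """Group speech intervals into utterance units delimited by pauses.

    Two neighboring intervals belong to the same unit when the silence
    between them is shorter than min_pause (an assumed threshold).
    """
    units: List[Interval] = []
    for start, end in sorted(speech):
        if units and start - units[-1][1] < min_pause:
            # Silence too short to count as a pause: extend the current unit.
            units[-1] = (units[-1][0], max(units[-1][1], end))
        else:
            # A pause (or the very first interval) starts a new unit.
            units.append((start, end))
    return units


if __name__ == "__main__":
    detected = [(0.0, 1.2), (1.3, 2.0), (2.9, 3.5)]
    for unit_start, unit_end in segment_utterance_units(detected):
        print(f"{unit_start:.1f}-{unit_end:.1f} s")
```

With the sample input, the 0.1 s gap is absorbed into the first unit, while the 0.9 s gap is treated as a pause that separates two units. Durations of such units, together with the preceding silent pause, are the kind of measurement the abstract compares between topic boundaries and other positions.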