This paper discusses issues concerning dialogue corpora for speech and natural language research. Speech and text corpora of dialogue have recently become more important for the development and evaluation of speech- and text-based dialogue systems. However, the design and construction of dialogue corpora remain open research issues, and many problems have not yet been clarified. Many kinds of corpora are needed to study the various aspects of dialogue. At the same time, each corpus must be large enough for its purpose to be statistically meaningful. This paper presents issues related to the design and creation of dialogue corpora: the selection of a task domain, transcription conventions, situations for collection, syntactic and semantic ill-formedness, and politeness. Future directions for dialogue corpus creation are also discussed.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Satoru HAYAMIZU, Shuichi ITAHASHI, Tetsunori KOBAYASHI, Toshiyuki TAKEZAWA, "Design and Creation of Speech and Text Corpora of Dialogue," in IEICE TRANSACTIONS on Information, vol. E76-D, no. 1, pp. 17-22, January 1993.
URL: https://global.ieice.org/en_transactions/information/10.1587/e76-d_1_17/_p
@ARTICLE{e76-d_1_17,
author={Satoru HAYAMIZU and Shuichi ITAHASHI and Tetsunori KOBAYASHI and Toshiyuki TAKEZAWA},
journal={IEICE TRANSACTIONS on Information},
title={Design and Creation of Speech and Text Corpora of Dialogue},
year={1993},
volume={E76-D},
number={1},
pages={17-22},
abstract={This paper discusses issues concerning dialogue corpora for speech and natural language research. Speech and text corpora of dialogue have recently become more important for the development and evaluation of speech- and text-based dialogue systems. However, the design and construction of dialogue corpora remain open research issues, and many problems have not yet been clarified. Many kinds of corpora are needed to study the various aspects of dialogue. At the same time, each corpus must be large enough for its purpose to be statistically meaningful. This paper presents issues related to the design and creation of dialogue corpora: the selection of a task domain, transcription conventions, situations for collection, syntactic and semantic ill-formedness, and politeness. Future directions for dialogue corpus creation are also discussed.},
month={January},
}
TY - JOUR
TI - Design and Creation of Speech and Text Corpora of Dialogue
T2 - IEICE TRANSACTIONS on Information
SP - 17
EP - 22
AU - Satoru HAYAMIZU
AU - Shuichi ITAHASHI
AU - Tetsunori KOBAYASHI
AU - Toshiyuki TAKEZAWA
PY - 1993
JO - IEICE TRANSACTIONS on Information
VL - E76-D
IS - 1
JA - IEICE TRANSACTIONS on Information
Y1 - January 1993
AB - This paper discusses issues concerning dialogue corpora for speech and natural language research. Speech and text corpora of dialogue have recently become more important for the development and evaluation of speech- and text-based dialogue systems. However, the design and construction of dialogue corpora remain open research issues, and many problems have not yet been clarified. Many kinds of corpora are needed to study the various aspects of dialogue. At the same time, each corpus must be large enough for its purpose to be statistically meaningful. This paper presents issues related to the design and creation of dialogue corpora: the selection of a task domain, transcription conventions, situations for collection, syntactic and semantic ill-formedness, and politeness. Future directions for dialogue corpus creation are also discussed.
ER -