In this paper, we discuss the construction of a large in-car spoken dialogue corpus and the results of its analysis. We have developed a system specially built into a Data Collection Vehicle (DCV) which supports the synchronous recording of multichannel audio data from 16 microphones that can be placed in flexible positions, multichannel video data from 3 cameras, and vehicle-related data. Multimedia data has been collected for three sessions of spoken dialogue with different modes of navigation, during an approximately 60-minute drive by each of 800 subjects. We have characterized the collected dialogues across the three sessions. Some characteristics such as sentence complexity and SNR are found to differ significantly among the sessions. Linear regression analysis results also clarify the relative importance of various corpus characteristics.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Kazuya TAKEDA, Hiroshi FUJIMURA, Katsunobu ITOU, Nobuo KAWAGUCHI, Shigeki MATSUBARA, Fumitada ITAKURA, "Construction and Evaluation of a Large In-Car Speech Corpus" in IEICE TRANSACTIONS on Information,
vol. E88-D, no. 3, pp. 553-561, March 2005, doi: 10.1093/ietisy/e88-d.3.553.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e88-d.3.553/_p
@ARTICLE{e88-d_3_553,
author={Kazuya TAKEDA and Hiroshi FUJIMURA and Katsunobu ITOU and Nobuo KAWAGUCHI and Shigeki MATSUBARA and Fumitada ITAKURA},
journal={IEICE TRANSACTIONS on Information},
title={Construction and Evaluation of a Large In-Car Speech Corpus},
year={2005},
volume={E88-D},
number={3},
pages={553--561},
abstract={In this paper, we discuss the construction of a large in-car spoken dialogue corpus and the results of its analysis. We have developed a system specially built into a Data Collection Vehicle (DCV) which supports the synchronous recording of multichannel audio data from 16 microphones that can be placed in flexible positions, multichannel video data from 3 cameras, and vehicle-related data. Multimedia data has been collected for three sessions of spoken dialogue with different modes of navigation, during an approximately 60-minute drive by each of 800 subjects. We have characterized the collected dialogues across the three sessions. Some characteristics such as sentence complexity and SNR are found to differ significantly among the sessions. Linear regression analysis results also clarify the relative importance of various corpus characteristics.},
keywords={},
doi={10.1093/ietisy/e88-d.3.553},
ISSN={},
month={March},}
TY - JOUR
TI - Construction and Evaluation of a Large In-Car Speech Corpus
T2 - IEICE TRANSACTIONS on Information
SP - 553
EP - 561
AU - Kazuya TAKEDA
AU - Hiroshi FUJIMURA
AU - Katsunobu ITOU
AU - Nobuo KAWAGUCHI
AU - Shigeki MATSUBARA
AU - Fumitada ITAKURA
PY - 2005
DO - 10.1093/ietisy/e88-d.3.553
JO - IEICE TRANSACTIONS on Information
SN -
VL - E88-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2005
AB - In this paper, we discuss the construction of a large in-car spoken dialogue corpus and the results of its analysis. We have developed a system specially built into a Data Collection Vehicle (DCV) which supports the synchronous recording of multichannel audio data from 16 microphones that can be placed in flexible positions, multichannel video data from 3 cameras, and vehicle-related data. Multimedia data has been collected for three sessions of spoken dialogue with different modes of navigation, during an approximately 60-minute drive by each of 800 subjects. We have characterized the collected dialogues across the three sessions. Some characteristics such as sentence complexity and SNR are found to differ significantly among the sessions. Linear regression analysis results also clarify the relative importance of various corpus characteristics.
ER -