This study focuses on modeling the storytelling performance of participants in a group conversation. Storytelling is one of the fundamental communication techniques for conveying information and entertainment effectively to a listener. We present a multimodal analysis of storytelling performance in a group conversation, as evaluated by external observers. A new multimodal data corpus, including the participants' performance scores, is collected through this group storytelling task. We extract multimodal (verbal and nonverbal) features of storytellers and listeners from manual transcriptions of the spoken dialog and from various nonverbal patterns, including each participant's speaking turns, utterance prosody, head gestures, hand gestures, and head direction. We also extract multimodal co-occurrence features, such as co-occurring head gestures, and interaction features, such as a storyteller's utterance overlapping with a listener's backchannel. In the experiment, we model the relationship between the performance indices and the multimodal features using machine-learning techniques. Experimental results show that the highest accuracy (R² = 0.299) for the total storytelling performance (the sum of the index scores) is obtained with a combination of verbal and nonverbal features in a regression task.
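The abstract describes a two-step pipeline: extract verbal and nonverbal features (including interaction features such as utterance/backchannel overlap), then fit a regression from those features to the observer-assigned performance scores, evaluated by R². The sketch below is a minimal, hypothetical illustration of that idea, not the authors' implementation: the overlap_duration helper, the interval values, the Ridge learner, and the synthetic data are all assumptions made for this example.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

def overlap_duration(storyteller_utts, listener_backchannels):
    # Total seconds during which a storyteller utterance overlaps a
    # listener backchannel; each argument is a list of (start, end) pairs.
    total = 0.0
    for s0, s1 in storyteller_utts:
        for b0, b1 in listener_backchannels:
            total += max(0.0, min(s1, b1) - max(s0, b0))
    return total

# Toy intervals (seconds): overlaps of 0.5 s and 0.4 s.
utts = [(0.0, 4.0), (6.0, 9.5)]
backchannels = [(3.5, 4.2), (7.0, 7.4)]
print(overlap_duration(utts, backchannels))  # -> 0.9

# Regress a performance index on multimodal features and report R².
# The feature matrix and scores here are synthetic placeholders.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))               # 40 participants x 5 features
y = X @ np.array([0.8, 0.0, 0.5, -0.3, 0.1]) + rng.normal(scale=0.5, size=40)
r2 = cross_val_score(Ridge(alpha=1.0), X, y, scoring="r2", cv=5).mean()
print(f"mean cross-validated R²: {r2:.3f}")

The pairwise interval sweep is quadratic in the number of segments; for long recordings, sorting both interval lists and merging them in a single pass is the usual speed-up.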
Shogo OKADA
Tokyo Institute of Technology
Mi HANG
Tokyo Institute of Technology
Katsumi NITTA
Tokyo Institute of Technology
Shogo OKADA, Mi HANG, Katsumi NITTA, "Predicting Performance of Collaborative Storytelling Using Multimodal Analysis," in IEICE Transactions on Information and Systems,
vol. E99-D, no. 6, pp. 1462-1473, June 2016, doi: 10.1587/transinf.2015CBP0003.
Abstract: This study focuses on modeling the storytelling performance of participants in a group conversation. Storytelling is one of the fundamental communication techniques for conveying information and entertainment effectively to a listener. We present a multimodal analysis of storytelling performance in a group conversation, as evaluated by external observers. A new multimodal data corpus, including the participants' performance scores, is collected through this group storytelling task. We extract multimodal (verbal and nonverbal) features of storytellers and listeners from manual transcriptions of the spoken dialog and from various nonverbal patterns, including each participant's speaking turns, utterance prosody, head gestures, hand gestures, and head direction. We also extract multimodal co-occurrence features, such as co-occurring head gestures, and interaction features, such as a storyteller's utterance overlapping with a listener's backchannel. In the experiment, we model the relationship between the performance indices and the multimodal features using machine-learning techniques. Experimental results show that the highest accuracy (R² = 0.299) for the total storytelling performance (the sum of the index scores) is obtained with a combination of verbal and nonverbal features in a regression task.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2015CBP0003/_p
@ARTICLE{e99-d_6_1462,
author={Shogo OKADA and Mi HANG and Katsumi NITTA},
journal={IEICE Transactions on Information and Systems},
title={Predicting Performance of Collaborative Storytelling Using Multimodal Analysis},
year={2016},
volume={E99-D},
number={6},
pages={1462-1473},
abstract={This study focuses on modeling the storytelling performance of participants in a group conversation. Storytelling is one of the fundamental communication techniques for conveying information and entertainment effectively to a listener. We present a multimodal analysis of storytelling performance in a group conversation, as evaluated by external observers. A new multimodal data corpus, including the participants' performance scores, is collected through this group storytelling task. We extract multimodal (verbal and nonverbal) features of storytellers and listeners from manual transcriptions of the spoken dialog and from various nonverbal patterns, including each participant's speaking turns, utterance prosody, head gestures, hand gestures, and head direction. We also extract multimodal co-occurrence features, such as co-occurring head gestures, and interaction features, such as a storyteller's utterance overlapping with a listener's backchannel. In the experiment, we model the relationship between the performance indices and the multimodal features using machine-learning techniques. Experimental results show that the highest accuracy (R² = 0.299) for the total storytelling performance (the sum of the index scores) is obtained with a combination of verbal and nonverbal features in a regression task.},
doi={10.1587/transinf.2015CBP0003},
ISSN={1745-1361},
month={June},
}
TY - JOUR
TI - Predicting Performance of Collaborative Storytelling Using Multimodal Analysis
T2 - IEICE Transactions on Information and Systems
SP - 1462
EP - 1473
AU - OKADA, Shogo
AU - HANG, Mi
AU - NITTA, Katsumi
PY - 2016
DO - 10.1587/transinf.2015CBP0003
JO - IEICE Transactions on Information and Systems
SN - 1745-1361
VL - E99-D
IS - 6
JA - IEICE Transactions on Information and Systems
Y1 - 2016/06//
AB - This study focuses on modeling the storytelling performance of participants in a group conversation. Storytelling is one of the fundamental communication techniques for conveying information and entertainment effectively to a listener. We present a multimodal analysis of storytelling performance in a group conversation, as evaluated by external observers. A new multimodal data corpus, including the participants' performance scores, is collected through this group storytelling task. We extract multimodal (verbal and nonverbal) features of storytellers and listeners from manual transcriptions of the spoken dialog and from various nonverbal patterns, including each participant's speaking turns, utterance prosody, head gestures, hand gestures, and head direction. We also extract multimodal co-occurrence features, such as co-occurring head gestures, and interaction features, such as a storyteller's utterance overlapping with a listener's backchannel. In the experiment, we model the relationship between the performance indices and the multimodal features using machine-learning techniques. Experimental results show that the highest accuracy (R² = 0.299) for the total storytelling performance (the sum of the index scores) is obtained with a combination of verbal and nonverbal features in a regression task.
ER -