Kenya SAKAMOTO Shizuka SHIRAI Noriko TAKEMURA Jason ORLOSKY Hiroyuki NAGATAKI Mayumi UEDA Yuki URANISHI Haruo TAKEMURA
This study explores significant eye-gaze features that can be used to estimate subjective difficulty while reading educational comics. Educational comics have grown rapidly as a promising way to teach difficult topics using illustrations and text. However, a comic page carries many kinds of information at once, so learners' states such as subjective difficulty are hard to detect automatically with approaches such as system log-based detection, which is common in the Learning Analytics field. To address this problem, this study focuses on 28 eye-gaze features, including three newly proposed features called “Variance in Gaze Convergence,” “Movement between Panels,” and “Movement between Tiles,” to estimate two degrees of subjective difficulty. We ran an experiment in a simulated environment using Virtual Reality (VR) to collect gaze information accurately. We extracted features at two unit levels, page and panel, and evaluated the accuracy of each in user-dependent and user-independent settings. Using a Support Vector Machine (SVM), our proposed features achieved average F1 scores of 0.721 and 0.742 at the panel level in the user-dependent and user-independent models, respectively.
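As a rough illustration of what a panel-level gaze feature could look like, the sketch below counts how often consecutive gaze samples fall in different panels. This is only one plausible reading of “Movement between Panels”; the abstract does not give the definition, and the rectangular panel layout, the helper names `panel_of` and `count_panel_transitions`, and the sample data are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
# Hypothetical panel representation: axis-aligned box (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def panel_of(point: Point, panels: List[Box]) -> Optional[int]:
    """Return the index of the first panel containing the point, or None."""
    x, y = point
    for idx, (x0, y0, x1, y1) in enumerate(panels):
        if x0 <= x <= x1 and y0 <= y <= y1:
            return idx
    return None


def count_panel_transitions(gaze: List[Point], panels: List[Box]) -> int:
    """Count changes of panel between consecutive on-panel gaze samples."""
    transitions = 0
    previous = None
    for point in gaze:
        current = panel_of(point, panels)
        if current is None:
            continue  # ignore samples that land between panels
        if previous is not None and current != previous:
            transitions += 1
        previous = current
    return transitions


# Made-up page with two side-by-side panels and six gaze samples.
panels = [(0, 0, 100, 100), (110, 0, 210, 100)]
gaze = [(10, 10), (50, 50), (150, 20), (105, 50), (160, 30), (20, 80)]
print(count_panel_transitions(gaze, panels))  # prints 2
```

A count like this would be one entry in a per-page feature vector; a per-panel variant would need a different aggregation, which is not attempted here.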
Alparslan YILDIZ Noriko TAKEMURA Maiya HORI Yoshio IWAI Kosuke SATO
In this study, we introduce a system for tracking multiple people using multiple active cameras. Our main objective is to surveil as many targets as possible, at any time, using a limited number of active cameras. In our context, an active camera is a statically located pan-tilt-zoom camera. We aim to optimize the camera configuration to achieve maximum coverage of the targets. We first devise a method for efficient tracking and estimation of target locations in the environment. Our tracking method can track an unknown number of targets and easily estimate their locations for multiple future time-steps, which is a requirement for active cameras. Next, we present a variable time-step optimization of the camera configuration that is optimal given the estimated object likelihoods for multiple future frames. We confirmed our results using simulation and real videos, and show that, without introducing any significant computational complexity, active cameras can be used to track and observe multiple targets very effectively.
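The coverage objective can be illustrated with a much simplified toy problem: a single future time-step, pan-only cameras in 2D, and a brute-force search over a few discrete pan angles. This is not the optimization described in the abstract (which uses a variable time-step and multiple future frames); the field-of-view model, the function names `covers` and `best_configuration`, and all numbers are assumptions chosen for the sketch.

```python
import itertools
import math


def covers(cam_pos, pan_deg, target, half_fov_deg=30.0, max_range=10.0):
    """True if the target lies within the camera's range and angular field of view."""
    dx, dy = target[0] - cam_pos[0], target[1] - cam_pos[1]
    if math.hypot(dx, dy) > max_range:
        return False
    bearing = math.degrees(math.atan2(dy, dx))
    # Wrap the angular difference into [-180, 180).
    diff = (bearing - pan_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= half_fov_deg


def best_configuration(cameras, pan_options, targets):
    """Exhaustively pick one pan angle per camera to maximize covered likelihood."""
    best_score, best_pans = -1.0, None
    for pans in itertools.product(pan_options, repeat=len(cameras)):
        # A target counts once if at least one camera sees it.
        score = sum(
            weight
            for pos, weight in targets
            if any(covers(cam, pan, pos) for cam, pan in zip(cameras, pans))
        )
        if score > best_score:
            best_score, best_pans = score, pans
    return best_pans, best_score


# Two fixed cameras, three predicted targets given as (position, likelihood).
cameras = [(0.0, 0.0), (10.0, 0.0)]
pan_options = [0.0, 90.0, 180.0]
targets = [((0.0, 5.0), 0.9), ((5.0, 0.0), 0.6), ((10.0, 5.0), 0.8)]

pans, score = best_configuration(cameras, pan_options, targets)
print(pans, round(score, 3))  # prints (90.0, 90.0) 1.7
```

Exhaustive search grows exponentially with the number of cameras, so it only serves to make the objective concrete; a practical system would need a more efficient search.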
Ruochen LIAO Kousuke MORIWAKI Yasushi MAKIHARA Daigo MURAMATSU Noriko TAKEMURA Yasushi YAGI
In this study, we propose a method to estimate body composition-related health indicators (e.g., ratios of body fat, body water, and muscle) using video-based gait analysis. This method is more efficient than individual measurement using a conventional body composition meter. Specifically, we designed a deep-learning framework with a convolutional neural network (CNN), where the input is a gait energy image (GEI) and the output consists of the health indicators. Although a vast amount of training data is typically required to train network parameters, it is infeasible to collect sufficient ground-truth data, i.e., pairs consisting of the gait video and the health indicators measured using a body composition meter for each subject. We therefore use a two-step approach to exploit an auxiliary gait dataset that contains a large number of subjects but lacks the ground-truth health indicators. In the first step, we pre-train a backbone network using the auxiliary dataset to output gait primitives such as arm swing, stride, the degree of stoop, and the body width, which are considered to be relevant to the health indicators. In the second step, we add some layers to the backbone network and fine-tune the entire network to output the health indicators, even with a limited number of ground-truth data points. Experimental results show that the proposed method outperforms both training from scratch and an auto-encoder-based pre-training and fine-tuning approach; it achieves relatively high estimation accuracy for the body composition-related health indicators, except for the body fat-relevant ones.
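For readers unfamiliar with the network input, a gait energy image is commonly computed as the pixel-wise average of aligned, size-normalized binary silhouettes over a gait cycle. The sketch below shows that averaging step in plain Python. It assumes the silhouettes are already segmented and aligned; the function name `gait_energy_image` and the tiny 2x2 frames are made up for illustration and say nothing about the preprocessing used in the paper.

```python
def gait_energy_image(silhouettes):
    """Average a sequence of equally sized binary silhouette frames pixel by pixel."""
    if not silhouettes:
        raise ValueError("at least one silhouette frame is required")
    height, width = len(silhouettes[0]), len(silhouettes[0][0])
    totals = [[0.0] * width for _ in range(height)]
    for frame in silhouettes:
        for r in range(height):
            for c in range(width):
                totals[r][c] += frame[r][c]
    n = len(silhouettes)
    return [[value / n for value in row] for row in totals]


# Two toy 2x2 frames: 1 marks a foreground (body) pixel, 0 the background.
frames = [
    [[0, 1], [1, 1]],
    [[0, 1], [0, 1]],
]
gei = gait_energy_image(frames)
print(gei)  # prints [[0.0, 1.0], [0.5, 1.0]]
```

Pixels that are foreground in every frame end up at 1.0, while pixels covered only part of the time (typically swinging limbs) take intermediate values, which is why the image carries motion information in a single frame.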