Kazuaki KONDO Takuto FUJIWARA Yuichi NAKAMURA
When using a gesture-based interface for pointing to targets on a wide screen, displaying a large pointer instead of a typical spot pattern reduces disturbance caused by measurement errors in the user's pointing posture. However, it remains unclear why a large pointer helps facilitate easy pointing. To examine this issue, in this study we propose a mathematical model that formulates human pointing motions affected by a large pointer. Our idea is to describe the effect of the large pointer as human visual perception, because the user will perceive the pointer-target distance as being shorter than it actually is. We embedded this scheme, referred to as a non-linear distance filter (NDF), into a typical feedback loop model designed to formulate human pointing motions. We also proposed a method to estimate the NDF mapping from pointing trajectories, and used it to investigate the applicability of the model under three typical disturbance patterns: small vibration, smooth shift, and step signal. Experimental results demonstrated that the proposed NDF-based model could accurately reproduce actual pointing trajectories, achieving high similarity values of 0.89, 0.97, and 0.91 for the three respective disturbance patterns. The results indicate the applicability of the proposed method. In addition, we confirmed that the obtained NDF mappings suggested rationales for why a large pointer helps facilitate easy pointing.
Masamune NOMURA Yuki NAKAMURA Hiroo TARAO Amane TAKEI
This paper describes the effectiveness of the geometric multi-grid method in a current density analysis using a numerical human body model. The scalar potential finite difference (SPFD) method is used as a numerical method for analyzing the current density inside a human body due to contact with charged objects in a low-frequency band, and research has been conducted on methods for solving the large-scale simultaneous equations of the SPFD method faster. In previous research, the block incomplete Cholesky conjugate gradient (ICCG) method was proposed as an effective way to solve the simultaneous equations faster. However, even when the block ICCG method is used, many iterations are still needed. Therefore, in this study, we focus on the geometric multi-grid method as a way to solve this problem. We develop the geometric multi-grid method and evaluate its performance by comparing it with the block ICCG method in terms of computation time and the number of iterations. The results show that the number of iterations needed for the geometric multi-grid method is much smaller than that for the block ICCG method. In addition, the computation time is much shorter, depending on the number of threads and the number of coarse grids. Also, by using multi-color ordering, the parallel performance of the geometric multi-grid method can be greatly improved.
Rousslan F. J. DOSSA Xinyu LIAN Hirokazu NOMOTO Takashi MATSUBARA Kuniaki UEHARA
Reinforcement learning methods achieve performance superior to humans in a wide range of complex tasks and uncertain environments. However, high performance is not the sole metric for practical use such as in a game AI or autonomous driving. A highly efficient agent performs greedily and selfishly, and is thus inconvenient for surrounding users, hence a demand for human-like agents. Imitation learning reproduces the behavior of a human expert and builds a human-like agent. However, its performance is limited to the expert's. In this study, we propose a training scheme to construct a human-like and efficient agent via mixing reinforcement and imitation learning for discrete and continuous action space problems. The proposed hybrid agent achieves a higher performance than a strict imitation learning agent and exhibits more human-like behavior, which is measured via a human sensitivity test.
Hitoshi NISHIMURA Naoya MAKIBUCHI Kazuyuki TASAKA Yasutomo KAWANISHI Hiroshi MURASE
Multiple human tracking is widely used in various fields such as marketing and surveillance. The typical approach associates human detection results between consecutive frames using the features and bounding boxes (position+size) of detected humans. Some methods use an omnidirectional camera to cover a wider area, but ID switches often occur when associating detections, owing to the following two factors: i) The feature is adversely affected because the bounding box includes many background regions when a human is captured from an oblique angle. ii) The position and size change dramatically between consecutive frames because the distance metric is non-uniform in an omnidirectional image. In this paper, we propose a novel method that accurately tracks humans with an association metric for omnidirectional images. The proposed method has two key points: i) For feature extraction, we introduce local rectification, which reduces the effect of background regions in the bounding box. ii) For distance calculation, we describe the positions in a world coordinate system where the distance metric is uniform. In the experiments, we confirmed that the Multiple Object Tracking Accuracy (MOTA) improved by 3.3 on the LargeRoom dataset and by 2.3 on the SmallRoom dataset.
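For reference, MOTA as commonly defined in the CLEAR MOT metrics combines misses, false positives, and ID switches into a single score. The sketch below assumes that standard definition; the abstract does not state the exact evaluation protocol, and the function name and the example counts are illustrative only.

```python
def mota(false_negatives, false_positives, id_switches, num_ground_truth):
    """Multiple Object Tracking Accuracy under the usual CLEAR MOT definition.

    All counts are summed over every frame of the sequence.
    num_ground_truth is the total number of ground-truth objects over all frames.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    errors = false_negatives + false_positives + id_switches
    # MOTA can become negative when the errors outnumber the ground-truth objects.
    return 1.0 - errors / num_ground_truth


# Hypothetical counts, not taken from the paper.
score = mota(false_negatives=10, false_positives=5, id_switches=5, num_ground_truth=100)
print(f"MOTA = {score:.3f}")
```

Because ID switches enter the numerator directly, reducing them, as the proposed association metric aims to do, raises MOTA even when the detections themselves are unchanged.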
Takuya MATSUMOTO Kodai SHIMOSATO Takahiro MAEDA Tatsuya MURAKAMI Koji MURAKOSO Kazuhiko MINO Norimichi UKITA
This paper proposes a framework for automatically annotating the keypoints of a human body in images for learning 2D pose estimation models. Ground-truth annotation for supervised learning is difficult and cumbersome in most machine vision tasks. While considerable contributions in the community provide us with a huge number of pose-annotated images, they mainly focus on people wearing common clothes, whose body keypoints are relatively easy to annotate. This paper, on the other hand, focuses on annotating people wearing loose-fitting clothes (e.g., Japanese Kimono) that occlude many body keypoints. In order to automatically and correctly annotate these people, we reuse the 3D coordinates of the keypoints observed without loose-fitting clothes, which can be captured by a motion capture system (MoCap). These 3D keypoints are projected to an image where the body pose under loose-fitting clothes is similar to the one captured by the MoCap. Pose similarity between bodies with and without loose-fitting clothes is evaluated with 3D geometric configurations of MoCap markers that are visible even with loose-fitting clothes (e.g., markers on the head, wrists, and ankles). Experimental results validate the effectiveness of our proposed framework for human pose estimation.
This paper reviews the wide-band human body communication technology we developed for wearable and implantable robot control. The wearable and implantable robots are assumed to be controlled by myoelectric signals and to operate according to the operator's will. The signal transmission for wearable robot control was shown to be mainly realized by electrostatic coupling, and the signal transmission for implantable robot control was shown to be mainly determined by the lossy frequency-dependent dielectric properties of the human body. Based on these basic observations on signal transmission mechanisms, we developed a 10-50MHz band impulse radio transceiver based on human body communication technology, and applied it to wireless control of a robotic hand using myoelectric signals for the first time. In addition, we examined its applicability to implantable robot control, and evaluated the communication performance of implant signal transmission using a living swine. These experimental results showed that the proposed technology is well suited for detection and transmission of biological signals for wearable and implantable robot control.
Taeyoung JUNG Hyuk-Ju KWON Joonku HAHN Sung-Hak LEE
We propose an image synthesis method using luminance-adapted range compression and detail-preserved blending. Range compression is performed using the correlated visual gamma; image blending is then performed by a local adaptive mixing and selecting method. Simulations show that the proposed method reproduces natural images without any increase in noise or color desaturation.
Takafumi HIGASHI Hideaki KANAI
To improve the cutting skills of learners, we developed a method for improving the skill involved in creating paper cuttings based on a steering task in the field of human-computer interaction. We made patterns using the white and black boundaries that make up a picture. The index of difficulty (ID) is a numerical value based on the width and distance of the steering law. First, we evaluated novice and expert pattern-cutters, and measured their moving time (MT), error rate, and compliance with the steering law, confirming that the MT and error rate are affected by pattern width and distance. Moreover, we quantified the skills of novices and experts using ID- and MT-based models. We then observed changes in the cutting skills of novices who practiced with various widths and evaluated the impact of the difficulty level on skill improvement. Patterns considered to be moderately difficult for novices led to a significant improvement in skills.
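The relation between ID and MT mentioned above can be illustrated with the steering law in its usual form for a straight path of constant width: ID = distance / width and MT = a + b * ID. The sketch below assumes that form and fits a and b by ordinary least squares; the function names and the sample values are hypothetical and are not the authors' data or analysis code.

```python
def index_of_difficulty(distance, width):
    """Steering-law ID for a straight path of constant width (assumed form)."""
    if width <= 0:
        raise ValueError("width must be positive")
    return distance / width


def fit_steering_law(ids, mts):
    """Least-squares fit of MT = a + b * ID; returns (a, b)."""
    n = len(ids)
    if n < 2 or n != len(mts):
        raise ValueError("need at least two (ID, MT) pairs of equal length")
    mean_id = sum(ids) / n
    mean_mt = sum(mts) / n
    sxx = sum((x - mean_id) ** 2 for x in ids)
    if sxx == 0:
        raise ValueError("all ID values are identical; slope is undefined")
    sxy = sum((x - mean_id) * (y - mean_mt) for x, y in zip(ids, mts))
    b = sxy / sxx
    a = mean_mt - b * mean_id
    return a, b


# Hypothetical patterns: (distance in mm, width in mm) and measured MT in seconds.
patterns = [(20.0, 10.0), (40.0, 10.0), (60.0, 10.0), (80.0, 10.0)]
ids = [index_of_difficulty(d, w) for d, w in patterns]
mts = [1.0, 1.5, 2.0, 2.5]
a, b = fit_steering_law(ids, mts)
print(f"MT = {a:.2f} + {b:.2f} * ID")
```

Under this model, a smaller slope b means that movement time grows more slowly as patterns become harder, which is one way the fitted parameters could separate novices from experts.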
Xinxin HAN Jian YE Jia LUO Haiying ZHOU
The triaxial accelerometer is one of the most important sensors for human activity recognition (HAR). It has been observed that the relations between the axes of a triaxial accelerometer play a significant role in improving the accuracy of activity recognition. However, existing research rarely focuses on these relations, but rather on the fusion of multiple sensors. In this paper, we propose a data fusion-based convolutional neural network (CNN) approach to effectively use the relations between the axes. We design a single-channel data fusion method and a multichannel data fusion method in consideration of the diversified formats of sensor data. After obtaining the fused data, a CNN is used to extract the features and perform classification. The experiments show that the proposed approach has an advantage over the CNN in accuracy. Moreover, the single-channel model achieves an accuracy of 98.83% with the WISDM dataset, which is higher than that of state-of-the-art methods.
This paper proposes a visual analytics (VA) interface for time-series data that addresses the problems arising from the properties of such data: a collision between interaction and animation on the temporal aspect, a collision of interaction between the temporal and spatial aspects, and the trade-off of exploration accuracy, efficiency, and scalability between different visualization methods. To solve these problems, the proposed VA interface handles temporal and spatial changes uniformly. Trajectories show temporal changes spatially, and directly manipulating them makes it possible to examine the relationship among objects either at a certain time point or throughout the entire time range. The usefulness of the proposed interface is demonstrated through experiments.
Kohei YOSHIGAMI Taishi HAYASHI Masateru TSUNODA Hidetake UWANO Shunichiro SASAKI Kenichi MATSUMOTO
Recently, many studies have applied gamification to software engineering education and software development to enhance work results. Gamification is defined as “the use of game design elements in non-game contexts.” When applying gamification, we make various game rules, such as a time limit. However, it is not clear whether such a rule affects working time. For example, if we apply a time limit to impatient developers, the working time may become shorter, but the rule may have a negative effect because of time pressure. In this study, we analyze with subjective experiments whether the rules affect work results such as working time. Our experimental results suggest that, for the coding tasks, working time was shortened when we applied a rule that made developers aware of working time by showing elapsed time.
Zhiyu SHAO Juan WU Qiangqiang OUYANG
Many quality metrics have been proposed for compliance perception to assess haptic device performance and perceived results. Perceived compliance may be influenced by factors such as object properties, experimental conditions, and human perceptual habits. In this paper, softness perception was analyzed to identify the quality metrics that dominate the compliance perception system and their correlation with perception results, by expressing these metrics in terms of the basic physical parameters that characterize these factors. Based on three psychophysical experiments, just noticeable differences (JNDs) for the perceived softness of combinations of different stiffness coefficients and damping levels rendered by haptic devices were analyzed. Data recorded during the interaction process were also analyzed. Preliminary experimental results show that the discrimination ability of softness perception changes with the ratio of damping to stiffness when subjects explore at their habitual speed. Analysis results indicate that the quality metrics Rate-hardness, Extended Rate-hardness, and the ratio of damping to stiffness correlate highly with the perceived results. Further analysis shows that parameters reflecting object properties (stiffness, damping), experimental conditions (force bandwidth), and human perceptual habits (initial speed, maximum force change rate) change these quality metrics, which in turn produce different perceptual feelings and finally change the discrimination ability. The findings in this paper may provide a better understanding of softness perception and useful guidance for improving haptic and teleoperation devices.
Maya OKAWA Yusuke TANAKA Takeshi KURASHIMA Hiroyuki TODA Tomohiro YAMADA
With the acceptance of social sharing, public bike sharing services have become popular worldwide. One of the most important tasks in operating a bike sharing system is managing the bike supply at each station to avoid either running out of bicycles or docks to park them. This requires the system operator to redistribute bicycles from overcrowded stations to under-supplied ones. Trip demand prediction plays a crucial role in improving redistribution strategies. Predicting trip demand is a highly challenging problem because it is influenced by multiple levels of factors, both environmental and individual, e.g., weather and user characteristics. Although several existing studies successfully address either of them in isolation, no framework exists that can consider all factors simultaneously. This paper starts by analyzing trip data from real-world bike-sharing systems. The analysis reveals the interplay of the multiple levels of the factors. Based on the analysis results, we develop a novel form of the point process; it jointly incorporates multiple levels of factors to predict trip demand, i.e., predicting the pick-up and drop-off levels in the future and when over-demand is likely to occur. Our extensive experiments on real-world bike sharing systems demonstrate the superiority of our trip demand prediction method over five existing methods.
Mitsuki NAKAMURA Motoharu SASAKI Wataru YAMADA Naoki KITA Takeshi ONIZAWA Yasushi TAKATORI Masashi NAKATSUGAWA Minoru INOMATA Koshiro KITAO Tetsuro IMAI
This paper proposes a path loss model for crowded outdoor environments that can consider the density of people. Measurement results in an anechoic chamber with three blocking persons showed that multiple human body shadowing can be calculated by using finite width screens. As a result, path loss in crowded environments can be calculated by using the path losses of the multipath and the multiple human body shadowing on those paths. The path losses of the multipath are derived from a ray tracing simulation, and the simulation results are then used to predict the path loss in crowded environments. The predicted path loss of the proposed model was examined through measurements in the crowded outdoor station square in front of Shibuya Station in Tokyo, and results showed that it can accurately predict the path loss in crowded environments at the frequencies of 4.7GHz and 26.4GHz under two different conditions of antenna height and density of people. The RMS error of the proposed model was less than 4dB.
Dai SASAKAWA Naoki HONMA Takeshi NAKAYAMA Shoichi IIZUKA
This paper introduces a method that identifies human activity from the height and Doppler Radar Cross Section (RCS) information detected by Multiple-Input Multiple-Output (MIMO) radar. This method estimates the three-dimensional target location by applying the MUltiple SIgnal Classification (MUSIC) method to the observed MIMO channel; the Doppler RCS is calculated from the signal reflected from the target. A gesture recognition algorithm is applied to the trajectory of the temporal transition of the estimated human height and the Doppler RCS. In experiments, the proposed method achieves over 90% recognition rate (average).
Toshihiro KITAJIMA Edwardo Arata Y. MURAKAMI Shunsuke YOSHIMOTO Yoshihiro KURODA Osamu OSHIRO
The arrival of the era of the Internet of Things (IoT) has ensured the ubiquity of human-sensing technologies. Cameras have become inexpensive instruments for human sensing and have been increasingly used for this purpose. Because cameras produce large quantities of information, they are powerful tools for sensing; however, because camera images contain information allowing individuals to be personally identified, their use poses risks of personal privacy violations. In addition, because IoT-ready home appliances are connected to the Internet, camera-captured images of individual users may be unintentionally leaked. In developing our human-detection method [33], [34], we proposed techniques for detecting humans from unclear images in which individuals cannot be identified; however, a drawback of this method was its inability to detect moving humans. Thus, to enable tracking of humans even though the images are blurred to protect privacy, we introduce a particle-filter framework and propose a human-tracking method based on motion detection and heart-rate detection. We also show how the use of integral images [32] can accelerate the execution of our algorithms. In performance tests involving unclear images, the proposed method yields results superior to those obtained with the existing mean-shift method or with a face-detection method based on Haar-like features. We confirm the acceleration afforded by the use of integral images and show that the speed of our method is sufficient to enable real-time operation. Moreover, we demonstrate that the proposed method allows successful tracking even in cases where the posture of the individual changes, such as when the person lies down, a situation that arises in real-world usage environments. We discuss the reasons behind the superior behavior of our method in performance tests compared to those of other methods.
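The speed-up from integral images mentioned above comes from a general property: once a summed-area table is built, the sum over any axis-aligned rectangle takes four lookups regardless of its size. The sketch below shows that generic technique on a small list-of-lists image; it is not the authors' implementation, and the function names are illustrative.

```python
def integral_image(img):
    """Build a summed-area table with one extra row and column of zeros.

    ii[y][x] holds the sum of img over rows [0, y) and columns [0, x).
    """
    h = len(img)
    w = len(img[0]) if h else 0
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of the original image over rows [top, bottom) and columns [left, right)."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
table = integral_image(image)
print(box_sum(table, 0, 0, 3, 3))  # whole image
print(box_sum(table, 1, 1, 3, 3))  # lower-right 2x2 block
```

In a particle filter, many candidate regions are scored per frame, so a constant-time region sum of this kind is a plausible source of the reported acceleration.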
Kunho PARK Min Joo JEONG Jong Jin BAEK Se Woong KIM Youn Tae KIM
This paper presents the bit error rate (BER) performance of human body communication (HBC) receivers in interference-rich environments. The BER performance was measured while applying an interference signal to the HBC receiver to consider the effect of receiver performance on BER performance. During the measurement, a signal attenuator was used to mimic the signal loss of the human body channel, which improved the repeatability of the measurement results. The measurement results showed that HBC is robust against interference when frequency selective digital transmission (FSDT) is used as the modulation scheme. The BER performance reported in this paper can be effectively used to evaluate the communication performance of HBC.
Yotaro FUSE Hiroshi TAKENOUCHI Masataka TOKUMARU
Herein, we propose a robot model that obeys the norm of a certain group by interacting with the group members. Using this model, a robot system learns the norm of the group as a group member itself. People with individual differences form a group and a characteristic norm that reflects the group members' personalities. When robots join a group that includes humans, the robots need to obey this characteristic norm: a group norm. We investigated whether the robot system generates a decision-making criterion to obey group norms by learning from interactions through reinforcement learning. In this experiment, human group members and the robot system answer the same easy quizzes, which could have several vague answers. When the group members answered differently from one another at first, we investigated whether the group members answered the quizzes while considering the group norm. To avoid bias toward the system's answers, one of the participants in a group only obeys the system, whereas the other participants are unaware of the system. Our experiments revealed that the group comprising the participants and the robot system forms group norms. The proposed model enables a social robot to make decisions socially in order to adjust its behavior to common sense not only in a large human society but also in partial human groups, e.g., local communities. Therefore, we presumed that these robots can join human groups by interacting with their members. To adapt to these groups, these robots adjust their own behaviors. However, further studies are required to reveal whether the robots' answers affect people and whether the participants can form a group norm based on a robot's answer even in a situation wherein the participants recognize that they are interacting in a group that includes a real robot. Moreover, some participants in a group do not know that the other participant only obeys the system's decisions and pretends to answer questions to prevent biased answers.
A fusion framework combining a CNN and an RNN is proposed specifically for air-writing recognition. By modeling the air-writing using both spatial and temporal features, the proposed network can learn more information than existing techniques. The performance of the proposed network is evaluated using the alphabet and numeric datasets in the public 6DMG database. The average accuracy of the proposed fusion network exceeds that of other techniques, reaching 99.25% for the alphabet gestures and 99.83% for the numeric gestures. A simplified RNN structure is also proposed, which attains about a twofold speed-up over an ordinary BLSTM network. It is also confirmed that the distance between consecutive sampling points alone is enough to attain high recognition performance.
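The last observation above, that the distance between consecutive sampling points is a sufficient feature, can be illustrated with a short sketch. It assumes the trajectory is a list of 3D positions and uses the Euclidean distance; the abstract does not specify the exact feature definition, so both choices and the function name are assumptions.

```python
import math


def consecutive_distances(points):
    """Euclidean distance between each pair of consecutive trajectory samples.

    points: sequence of equal-length coordinate tuples, e.g. (x, y, z).
    Returns a list with len(points) - 1 entries (empty for fewer than 2 points).
    """
    return [math.dist(p, q) for p, q in zip(points, points[1:])]


# Hypothetical air-writing trajectory, not taken from the 6DMG database.
trajectory = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
print(consecutive_distances(trajectory))
```

A sequence like this is one-dimensional per time step, which keeps the temporal input of a recurrent network small compared with feeding raw positions or orientations.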
Research on inertial sensor-based human action detection and recognition (HADR) is a new area in machine learning. We propose a novel time sequence-based interval convolutional neural network framework for HADR that combines an interesting-interval proposal generator with an interval-based classifier. Experiments demonstrate the good performance of our method.