In this paper, we introduce a method for recognizing a complex target object in a real-world environment. We use a three-dimensional model of the object described by line segments, together with the data provided by a three-axis orientation sensor attached to the video camera. We assume that existing methods for finding line features in the image allow at least one model line segment to be detected as a single continuous segment. The method consists of two main steps: generation of pose hypotheses, followed by evaluation of each pose in order to select the most appropriate one. The first stage is three-fold: model visibility, line matching and pose estimation; the second stage ranks the poses by evaluating the similarity between the projected model lines and the image lines. Furthermore, we propose an additional step that refines the best candidate pose by using the Lie group formalism of spatial rigid motions. This formalism provides an efficient local parameterization of the set of rigid motions via the exponential map. A set of experiments demonstrating the robustness of this approach is presented.
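The abstract does not reproduce the parameterization itself; as a point of reference, local parameterizations of rigid motions of the kind mentioned above are commonly built on the exponential map from the Lie algebra se(3) to SE(3). The formulas below are the standard ones, given only for illustration and not taken from the paper:

\exp\!\begin{pmatrix} [\boldsymbol{\omega}]_\times & \mathbf{v} \\ \mathbf{0}^{\top} & 0 \end{pmatrix}
= \begin{pmatrix} R(\boldsymbol{\omega}) & V(\boldsymbol{\omega})\,\mathbf{v} \\ \mathbf{0}^{\top} & 1 \end{pmatrix},
\qquad \theta = \|\boldsymbol{\omega}\|,

R(\boldsymbol{\omega}) = I + \frac{\sin\theta}{\theta}\,[\boldsymbol{\omega}]_\times + \frac{1-\cos\theta}{\theta^{2}}\,[\boldsymbol{\omega}]_\times^{2},
\qquad
V(\boldsymbol{\omega}) = I + \frac{1-\cos\theta}{\theta^{2}}\,[\boldsymbol{\omega}]_\times + \frac{\theta-\sin\theta}{\theta^{3}}\,[\boldsymbol{\omega}]_\times^{2},

so that a candidate pose can be refined by optimizing over the six local parameters (\boldsymbol{\omega}, \mathbf{v}) in a neighborhood of the current estimate.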
Yukiko I. NAKANO Toshiyasu MURAYAMA Toyoaki NISHIDA
In story-based communication, where a message is conveyed in story form, it is important to embody the story with expressive materials. However, it is quite difficult for users to create rich multimedia content using multimedia editing tools. This paper proposes a web-based multimedia environment, SPOC (Stream-oriented Public Opinion Channel), which aims to help unskilled users convert their stories into TV-like programs very easily. The system can produce digital camerawork for graphics and video clips, as well as automatically generate an agent animation according to a narration text. Findings from evaluation experiments showed that SPOC is easy to use and easy to learn for novice users. Given a short instruction, the subjects not only mastered the operations of the software, but also succeeded in creating highly original programs. In the subjective evaluation, the subjects answered that they enjoyed using the software without feeling difficulty. These results suggest that this system reduces the users' cost of making a program and encourages communication in a network community.
Masashi OKAMOTO Yukiko I. NAKANO Kazunori OKAMOTO Ken'ichi MATSUMURA Toyoaki NISHIDA
Thanks to great progress in computer graphics technologies, CG movies have become popular. However, cinematography techniques, which contribute to improving a content's comprehensibility, need to be learned through professional experience and are not easily acquired by non-professionals. This paper focuses on film cutting as one of the most important cinematography techniques in conversational scenes, and presents a system that automatically generates shot transitions to improve the comprehensibility of CG contents. First, we propose a cognitive model of User Involvement that serves as constraints on selecting shot transitions. Then, to examine the validity of the model, we analyze shot transitions in TV programs and, based on the analysis, implement a CG content creation system. The results of our preliminary evaluation experiment show the effectiveness of the proposed method, specifically in enhancing contents' comprehensibility.
Toshiya NAKAKURA Yasuyuki SUMI Toyoaki NISHIDA
This paper proposes a system called Neary that detects conversational fields based on the similarity of the auditory situation among users. The similarity of the auditory situation between each pair of users is measured by the similarity of the frequency characteristics of the sound captured by the users' head-worn microphones. Neary is implemented with a simple algorithm and runs on portable PCs. Experimental results show that Neary can successfully distinguish conversation groups and track their dynamic changes. This paper also presents two examples of deploying Neary to detect user contexts during experience sharing: touring a zoo and attending an academic conference.
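The abstract does not specify the similarity measure or the grouping rule; the following is a minimal sketch, assuming cosine similarity of short-term magnitude spectra and a hypothetical greedy grouping threshold, of how a Neary-like detector could compare users' auditory situations:

import numpy as np

def spectral_signature(samples, n_fft=1024):
    """Average magnitude spectrum of a short audio buffer from one microphone."""
    frames = np.lib.stride_tricks.sliding_window_view(samples, n_fft)[::n_fft // 2]
    window = np.hanning(n_fft)
    spectra = np.abs(np.fft.rfft(frames * window, axis=1))
    return spectra.mean(axis=0)

def auditory_similarity(sig_a, sig_b):
    """Cosine similarity between two spectral signatures (assumed metric)."""
    denom = np.linalg.norm(sig_a) * np.linalg.norm(sig_b) + 1e-12
    return float(np.dot(sig_a, sig_b) / denom)

def group_by_threshold(signatures, threshold=0.9):
    """Greedy grouping: users whose pairwise similarity exceeds the threshold
    are placed in the same conversational field (threshold is hypothetical)."""
    groups = []
    for user, sig in signatures.items():
        for g in groups:
            if all(auditory_similarity(sig, signatures[u]) >= threshold for u in g):
                g.append(user)
                break
        else:
            groups.append([user])
    return groups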
The objective of this paper is to provide an effective approach to infrared spectrum recognition. Traditionally, recognizing infrared spectra is a quantitative analysis problem. However, using only quantitative analysis runs into two difficulties in practice: (1) quantitative analysis is generally very complex, and in some cases it may even become intractable; and (2) when spectral data are inaccurate, it is hard to give concrete solutions. Our approach performs qualitative reasoning before complex quantitative analysis starts so that the above difficulties can be efficiently overcome. We present a novel model for qualitatively decomposing and analyzing infrared spectra. A list of candidates can be obtained based on the solutions of the model, and quantitative analysis is then applied only to these limited candidates. We also present a novel model for handling the inaccuracy of spectral data. The model can capture qualitative features of infrared spectra and can consider qualitative correlations among spectral data as evidence when the spectral data are inaccurate. We have tested the approach against about 300 real infrared spectra. This paper also describes the implementation of the approach.
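As an illustration of the qualitative-before-quantitative idea, a qualitative screening pass might narrow down candidate functional groups before any detailed quantitative analysis is attempted. The band table and tolerance below are a small set of textbook values chosen for illustration, not the paper's model:

# Illustrative qualitative screening of an infrared spectrum: each absorption
# peak position (in cm^-1) is matched against approximate textbook band ranges
# to produce a list of candidate functional groups, which would then be passed
# to quantitative analysis. The table is an illustrative subset only.
BAND_TABLE = {
    "O-H stretch": (3200.0, 3550.0),
    "C-H stretch": (2850.0, 3000.0),
    "C=O stretch": (1670.0, 1780.0),
}

def qualitative_candidates(peaks, tolerance=15.0):
    """Return functional-group candidates whose band range covers a peak;
    the tolerance loosely models inaccurate spectral data."""
    candidates = set()
    for peak in peaks:
        for group, (lo, hi) in BAND_TABLE.items():
            if lo - tolerance <= peak <= hi + tolerance:
                candidates.add(group)
    return sorted(candidates)

# Example: peaks near 1715 and 2930 cm^-1 suggest C=O and C-H stretches.
print(qualitative_candidates([1715.0, 2930.0]))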
Toyoaki NISHIDA Kazunori TERADA Takashi TAJIMA Makoto HATAKEYAMA Yoshiyasu OGASAWARA Yasuyuki SUMI Yong XU Yasser F. O. MOHAMMAD Kateryna TARASENKO Taku OHYA Tatsuya HIRAMATSU
We describe attempts to have robots behave as embodied knowledge media that permit knowledge to be communicated through embodied interactions in the real world. The key issue here is to give robots the ability to associate interactions with information content while interacting with a communication partner. Toward this end, we present two contributions in this paper. The first concerns the formation and maintenance of joint intention, which is needed to sustain the communication of knowledge between humans and robots. We describe an architecture consisting of multiple layers that enables interaction with people at different speeds. We propose the use of an affordance-based method for fast interactions. For medium-speed interactions, we propose basing control on an entrainment mechanism. For slow interactions, we propose employing defeasible interaction patterns based on probabilistic reasoning. The second contribution concerns the design and implementation of a robot that can listen to a human instructor to elicit knowledge and present the content of this knowledge to a person who needs it in an appropriate situation. In addition, we discuss a future research agenda for achieving robots that serve as embodied knowledge media, and place the robots-as-embodied-knowledge-media view within the larger perspective of Conversational Informatics.
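The paper describes the architecture only at this level; as a rough sketch (not the authors' implementation), a layered controller with fast, medium and slow update rates could be organized as follows, with hypothetical placeholder handlers standing in for the affordance-based, entrainment-based and defeasible-pattern mechanisms:

import time

class Layer:
    """One control layer that runs at its own update period (in seconds)."""
    def __init__(self, name, period_s, step):
        self.name = name          # layer name, for logging
        self.period_s = period_s  # update period in seconds
        self.step = step          # callable taking the current percepts
        self._last = 0.0

    def tick(self, now, percepts):
        """Run the layer's step function if its period has elapsed."""
        if now - self._last >= self.period_s:
            self._last = now
            self.step(percepts)

def run(layers, get_percepts, duration_s=5.0):
    """Single-threaded scheduler that interleaves layers of different speeds."""
    start = time.time()
    while time.time() - start < duration_s:
        now = time.time()
        percepts = get_percepts()
        for layer in layers:
            layer.tick(now, percepts)
        time.sleep(0.01)

# Placeholder handlers for the three mechanisms named in the paper.
layers = [
    Layer("affordance", 0.05, lambda p: None),    # fast reactive responses
    Layer("entrainment", 0.5, lambda p: None),    # medium-speed entrainment control
    Layer("deliberation", 2.0, lambda p: None),   # slow defeasible interaction patterns
]
run(layers, get_percepts=lambda: {}, duration_s=0.1)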
Taishi OGAWA Atsushi NAKAZAWA Toyoaki NISHIDA
We present a human point-of-gaze (PoG) estimation system that uses corneal surface reflections and omnidirectional images taken by spherical panorama cameras, which have become popular in recent years. Our system can determine where a user is looking in a 360° surrounding scene image from an eye image alone, and thus does not need the gaze mapping from partial scene images to a whole scene image that is necessary in conventional eye-gaze tracking systems. We first generate multiple perspective scene images from an omnidirectional (equirectangular) image and perform registration between the corneal reflection and the perspective images using a corneal reflection-scene image registration technique. We then compute the point of gaze using a corneal imaging technique leveraged by a 3D eye model, and project the point onto the omnidirectional image. The 3D eye pose is estimated by a particle-filter-based tracking algorithm. In experiments, we evaluated the accuracy of the 3D eye pose estimation, the robustness of registration, and the accuracy of PoG estimation using two indoor and five outdoor scenes, and found that the gaze mapping error was 5.546 degrees on average.
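The full pipeline (corneal imaging, registration, particle filtering) is beyond a short example, but the final projection step has a standard closed form: mapping an estimated gaze direction onto an equirectangular image. The sketch below assumes a camera frame with +z forward and +y down and is illustrative only; the angular-error helper reflects the usual way such gaze mapping errors are reported in degrees:

import numpy as np

def direction_to_equirectangular(d, width, height):
    """Map a unit gaze direction in the camera frame to pixel coordinates
    in an equirectangular (omnidirectional) image.
    Assumed convention: +z forward, +x right, +y down; longitude 0 at image centre."""
    x, y, z = d / np.linalg.norm(d)
    lon = np.arctan2(x, z)               # longitude in [-pi, pi]
    lat = np.arcsin(np.clip(y, -1, 1))   # latitude in [-pi/2, pi/2]
    u = (lon / (2 * np.pi) + 0.5) * width
    v = (lat / np.pi + 0.5) * height
    return u, v

def angular_error_deg(d_est, d_true):
    """Angle in degrees between estimated and ground-truth gaze directions."""
    c = np.dot(d_est, d_true) / (np.linalg.norm(d_est) * np.linalg.norm(d_true))
    return float(np.degrees(np.arccos(np.clip(c, -1.0, 1.0))))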