Setsuo YOKOYAMA Toyohide WATANABE
Hiroshi SHIMANUKI Toyohide WATANABE Koichi ASAKURA Hideki SATO Taketoshi USHIAMA
When people learn a handicraft from instructional content such as books, videos, and web pages, many give up halfway because the content does not always make clear how to make the item. This study aims to provide origami learners, especially beginners, with feedback on their folding operations. An approach is proposed that recognizes the state of the learner using a single top-view camera and points out mistakes made during the origami folding operation. First, an instruction model that stores easy-to-follow folding operations is defined. Second, a method for recognizing the state of the learner's origami paper sheet is proposed. Third, a method is proposed for detecting the learner's mistakes by means of anomaly detection with a one-class support vector machine (one-class SVM) classifier, using the folding progress and the difference between the learner's origami shape and the correct shape. Because the camera images contain noise due to shadows and occlusions caused by the learner's hands, the shapes of the origami sheet are not always extracted accurately. To train the one-class SVM classifier with high accuracy, a data cleansing method that automatically sifts out noisy video frames is proposed. Moreover, using the statistics of features extracted from the frames in a sliding window makes it possible to reduce the influence of the noise. The proposed method was experimentally demonstrated to be sufficiently accurate and robust against noise, and its false alarm rate (false positive rate) can be reduced to zero. Requiring only a single camera and common origami paper, the proposed method makes it possible to monitor mistakes made by origami learners and support their self-learning.
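The sliding-window idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not say which statistics are used, so a running median is assumed here, and a fixed threshold stands in for the one-class SVM. The names `windowed_median`, `flag_mistakes`, `window`, and `threshold` are hypothetical.

```python
from collections import deque
from statistics import median


def windowed_median(values, window=5):
    """Return the median of the last `window` values at every frame."""
    buf = deque(maxlen=window)  # oldest frame drops out automatically
    out = []
    for v in values:
        buf.append(v)
        out.append(median(buf))
    return out


def flag_mistakes(shape_diff, window=5, threshold=0.3):
    """Flag a frame when the windowed median of the shape difference
    exceeds `threshold` (an assumed stand-in for the one-class SVM)."""
    return [m > threshold for m in windowed_median(shape_diff, window)]


# A single noisy frame (for example, a hand occluding the sheet).
spike = [0.05, 0.06, 0.9, 0.05, 0.04, 0.06, 0.05]
# A deviation that persists over several frames.
sustained = [0.05, 0.05, 0.05, 0.8, 0.8, 0.8, 0.8, 0.8]

print(flag_mistakes(spike))
print(flag_mistakes(sustained))
```

With these made-up values, the isolated spike never pushes the median over the threshold, while the sustained deviation does once it fills most of the window. That is the sense in which window statistics reduce the influence of noisy frames.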
Toyohide WATANABE Qin LUO Noboru SUGIE
Document structure recognition and document understanding are today among the interesting subjects from the viewpoint of practical applications. The research objective is to extract meaningful data from document images by interpretation and to classify the data automatically into predefined items. In comparison with traditional image-processing-based approaches, knowledge-based approaches, which use various kinds of knowledge to interpret the structural/constructive features of documents, are currently being investigated as more flexible and applicable methods. In this paper, we propose a totally integrated paradigm for understanding table-form documents from the viewpoint of the architectural framework.
Jie ZHANG Chuan XIAO Toyohide WATANABE Yoshiharu ISHIKAWA
Presentation slide composition is an important job for knowledge workers. Instead of starting from scratch, users tend to make new presentation slides by reusing existing ones. A primary challenge in slide reuse is to select desired materials from a collection of existing slides. The state-of-the-art solution utilizes texts and images in slides as well as file names to help users to retrieve the materials they want. However, it only allows users to choose an entire slide as a query but does not support the search for a single element such as a few keywords, a sentence, an image, or a diagram. In this paper, we investigate content-based search for a variety of elements in presentation slides. Users may freely choose a slide element as a query. We propose different query processing methods to deal with various types of queries and improve the search efficiency. A system with a user-friendly interface is designed, based on which experiments are performed to evaluate the effectiveness and the efficiency of the proposed methods.
Tomoko KOJIRI Yosuke MURASE Toyohide WATANABE
This paper focuses on the collaborative learning of mathematics in which learners effectively acquire knowledge of common exercises through discussion with other learners. During collaborative learning, learners sometimes cannot solve exercises successfully, because they cannot derive answers by themselves or they hesitate to propose answers through discussion. To cope with such situations, this paper proposes two support functions using diagrams to encourage active discussion, since diagrams are often used to graphically illustrate mathematical concepts. One function indicates the differences between learner diagrams and the group diagram in order to encourage participation in discussions. To compare the characteristics of diagrams drawn by different learners, an internal representation of the diagram, which consists of types of figures and remarkable relations to other figures, is introduced. The other function provides hints in the group diagram so that all learners can consider their answers collaboratively through discussions. Since preparing hints for all exercises is difficult, rules for drawing supplementary figures, which are general methods for drawing supplementary figures that correspond to individual answering methods/formulas, are also developed. By applying the available rules to the current group diagram, appropriate supplementary figures that can resolve the current learning situation may be generated. The experimental results showed that the generated hints successfully increased the number of utterances in the groups. Moreover, learners were also able to derive answers by themselves and tended to propose more opinions in discussions when the uniqueness of their diagrams was indicated.
The subject of document image understanding is to extract and classify individual data meaningfully from paper-based documents. To date, many methods and approaches have been proposed for the recognition of various kinds of documents, for various technical problems in extending OCR, and for the requirements of practical use. Although the technical research issues in the early stage were regarded as complementary to traditional OCR, which depends on character recognition techniques, the range of applications and related issues is now widely investigated and should be established progressively. This paper surveys current topics in document image understanding from a technical point of view.
Jien KATO Toyohide WATANABE Hiroyuki HASE
Automatic traffic surveillance based on visual tracking techniques has been desired for many years. This paper proposes a basic highway surveillance system using an HMM-based segmentation method. The presented system meets the essential requirement of ITS: real-time operation. Another advantage is its robustness to the shadows of moving objects, which have been recognized as one of the main obstacles to robust car tracking. At present, the system can estimate the velocity of vehicles with high accuracy. To acquire metric information in the real world, the system does not require precise calibration but needs only four point correspondences between the image plane and the ground plane.
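Four point correspondences between the image plane and the ground plane are enough to determine a planar homography, which is presumably how metric information is recovered without full calibration. The sketch below illustrates that standard construction; it is not taken from the paper. The function names, the pixel coordinates, and the 7 m by 30 m ground rectangle are all illustrative assumptions.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_4_points(image_pts, ground_pts):
    """Estimate the 3x3 image-to-ground homography (h33 fixed to 1)."""
    a, b = [], []
    for (x, y), (u, v) in zip(image_pts, ground_pts):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def to_ground(h, pt):
    """Map an image point (pixels) to ground-plane coordinates."""
    x, y = pt
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def speed(h, p0, p1, dt):
    """Ground-plane speed between two tracked image positions dt seconds apart."""
    (x0, y0), (x1, y1) = to_ground(h, p0), to_ground(h, p1)
    return math.hypot(x1 - x0, y1 - y0) / dt


# Assumed example: a lane seen as a trapezoid maps to a 7 m x 30 m rectangle.
image_pts = [(100, 400), (540, 400), (400, 200), (240, 200)]
ground_pts = [(0, 0), (7, 0), (7, 30), (0, 30)]
H = homography_from_4_points(image_pts, ground_pts)
print(speed(H, (100, 400), (240, 200), 1.0))
```

With exactly four correspondences and no three points collinear, the eight unknowns are fixed by an 8x8 linear system, so no least-squares step is needed. A real system would likely use more points and a robust estimator.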