The search functionality is under construction.

Keyword Search Result

[Keyword] video indexing (9 hits)

Hits 1-9
  • Visual Indexing of Large Scale Train-Borne Video for Rail Condition Perceiving

    Peng DAI  Shengchun WANG  Yaping HUANG  Hao WANG  Xinyu DU  Qiang HAN  

     
    PAPER

      Publicized:
    2017/06/14
      Vol:
    E100-D No:9
      Page(s):
    2017-2026

    Train-borne video, captured by a camera installed at the front or back of a train, has been used for railway environment surveillance, including detecting missing communication units and bolts on the track, broken fences, and unexpected objects falling into the rail area or hanging on the wires above the rails. Moreover, the track condition can be perceived visually from the video by observing and analyzing the train swaying that arises from track irregularity. However, frequently examining large-scale video that runs up to dozens of hours is time-consuming and labor-intensive. In this paper, we propose a simple and effective method to detect train swaying quickly and automatically. We first generate a long rail track panorama (RTP) by stitching stripes cut from the video frames, and then extract the track profile and run an unevenness detection algorithm on the RTP. The experimental results show that the RTP, a compact video representation, allows the visual train-swaying information to be examined quickly for track condition perceiving; on it, we detect irregular spots with 92.86% recall and 82.98% precision in only 2 minutes of computation for a video close to 1 hour long.
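The stripe-stitching idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the "rail is the brightest pixel in each row" rule, and the moving-average unevenness test are all assumptions made for the example.

```python
import numpy as np

def build_rtp(frames, row, height=1):
    """Stack one horizontal stripe per frame into a panorama (one stripe per time step)."""
    stripes = [frame[row:row + height, :] for frame in frames]
    return np.vstack(stripes)

def rail_profile(rtp):
    """Assumption: the rail is the brightest pixel in each panorama row."""
    return np.argmax(rtp, axis=1)

def irregular_rows(profile, window=5, threshold=3.0):
    """Flag rows whose rail position deviates from a moving average (window must be odd)."""
    kernel = np.ones(window) / window
    padded = np.pad(profile.astype(float), window // 2, mode="edge")
    smooth = np.convolve(padded, kernel, mode="valid")
    return np.flatnonzero(np.abs(profile - smooth) > threshold)
```

Because each frame contributes only a thin stripe, an hour of video collapses into a single image whose row index is time, which is what makes a fast scan for swaying plausible.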

  • Spectral Fluctuation Method: A Texture-Based Method to Extract Text Regions in General Scene Images

    Yoichiro BABA  Akira HIROSE  

     
    PAPER-Pattern Recognition

      Vol:
    E92-D No:9
      Page(s):
    1702-1715

    To obtain text information included in a scene image, we first need to extract text regions from the image before recognizing the text. In this paper, we examine human vision and propose a novel method to extract text regions by evaluating textural variation. Human beings are often attracted by textural variation in scenes, which causes foveation. We hypothesize that text also has a similar property that distinguishes it from the natural background. In our method, we calculate the spatial variation of texture to obtain the distribution of the likelihood of a text region. Here we evaluate the changes in the local spatial spectrum as the textural variation. We investigate two options to evaluate the spectrum, namely those based on one- and two-dimensional Fourier transforms. In this paper, we put particular emphasis on the one-dimensional transform, which functions like the Gabor filter. The proposal can be applied to a wide range of characters, mainly because it employs neither templates nor heuristics concerning character size, aspect ratio, specific direction, alignment, and so on. We demonstrate that the method effectively extracts text regions contained in various general scene images. We present a quantitative evaluation of the method using publicly available databases.

  • Filtering of Block Motion Vectors for Use in Motion-Based Video Indexing and Retrieval

    Golam SORWAR  Manzur MURSHED  Laurence DOOLEY  

     
    PAPER

      Vol:
    E88-A No:10
      Page(s):
    2593-2599

    Though block-based motion estimation techniques are primarily designed for video coding applications, they are increasingly being used in other video analysis applications due to their simplicity and ease of implementation. The major drawback of these techniques is that noise, in the form of false motion vectors, cannot be avoided while capturing block motion vectors. Similar noise may further be introduced when global motion compensation is applied to obtain true object motion from video sequences where both camera and object motions are present. This paper presents a new technique for capturing a large number of true object motion vectors by eliminating the false motion vector fields, for use in object-motion-based video indexing and retrieval. Experimental results show that our proposed technique significantly increases the percentage of retained true object motion vectors while eliminating all false motion vectors for a variety of standard and non-standard video sequences.

  • Caption Detection Algorithm Using Temporal Information in Video

    Chung-Ho SHIN  Chul-Hyun KWON  Su-Yeon KIM  Sang-Hui PARK  

     
    LETTER-Image Processing, Image Pattern Recognition

      Vol:
    E87-D No:2
      Page(s):
    487-490

    A novel caption text detection and recognition algorithm using the temporal nature of video is proposed in this paper. A text registration technique is used to locate the temporal and spatial positions of captions in video from the accumulated frame difference information. Experimental results show that the proposed method is effective and robust. Also, a high processing speed is achieved since no time-consuming operation is included.
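One way to read "accumulated frame difference" is sketched below, under the assumption that captions stay fixed while the background changes. The function name and threshold are hypothetical, and the abstract does not describe the text registration step, so it is not reproduced here.

```python
import numpy as np

def static_region_mask(frames, diff_threshold=10.0):
    """Mark pixels whose accumulated absolute frame-to-frame difference stays small."""
    stack = np.stack([np.asarray(f, dtype=float) for f in frames])
    # Sum |frame[t+1] - frame[t]| over time for every pixel.
    accumulated = np.abs(np.diff(stack, axis=0)).sum(axis=0)
    return accumulated < diff_threshold
```

A static background would also pass this test, so a real detector would need further cues (for example edge density) before treating the masked region as a caption.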

  • Fast Algorithm for Aligning Images Having Large Displacements

    JunWei HSIEH  Cheng-Chin CHIANG  

     
    LETTER-Image Processing, Image Pattern Recognition

      Vol:
    E86-D No:7
      Page(s):
    1321-1324

    This paper presents an edge alignment method for stitching images when they have large displacements and light changes. First, without building any correspondences, the proposed method predicts all possible translation solutions by examining the consistency between edge positions. Then, the best solution can be obtained from the set of possible translations by a verification process. The proposed method has better capabilities to stitch images when they have large light changes and displacements. Since the method doesn't require building any correspondences or involve any optimization process, it performs more efficiently than other correlation techniques like feature-matching or phase-correlation approaches. Due to its simplicity and efficiency, different images can be very quickly aligned (less than 0.1 seconds) with the proposed method. Experimental results are provided to verify the superiority of the proposed method.
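The "predict all possible translations, then verify" idea can be illustrated with a simple voting scheme over edge positions. This is only a sketch under assumptions: the abstract does not specify how consistency between edge positions is measured, so the pairwise voting below and the name `candidate_translations` are illustrative choices, and the verification step is left out.

```python
from collections import Counter

def candidate_translations(edges_a, edges_b, top=3):
    """Vote for the (dx, dy) offsets that map edge points of image A onto those of image B."""
    votes = Counter(
        (bx - ax, by - ay)
        for ax, ay in edges_a
        for bx, by in edges_b
    )
    # The most-voted offsets are the candidates a verification step would then check.
    return [offset for offset, _ in votes.most_common(top)]
```

No point correspondences are built here: every pair of edge points votes, and the true translation stands out because it is the only offset that many pairs agree on.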

  • A Dimensionality Reduction Method for Efficient Search of High-Dimensional Databases

    Zaher AGHBARI  Kunihiko KANEKO  Akifumi MAKINOUCHI  

     
    PAPER-Databases

      Vol:
    E86-D No:6
      Page(s):
    1032-1041

    In this paper, we present a novel approach for efficient search of high-dimensional databases, such as video shots. The idea is to map feature vectors from the high-dimensional feature space into a point in a low-dimensional distance space. Then, a spatial access method, such as an R-tree, is used to cluster these points based on their distances in the low-dimensional space. Our mapping method, called topological mapping, guarantees no false dismissals in the result of a query. However, the result of a query might contain some false alarms. Hence, two refinement steps are performed to remove these false alarms. Comparative experiments on a database of video shots show the superior efficiency of the topological mapping method over other known methods.
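The filter-and-refine pattern described here can be sketched with a pivot-based distance mapping. Note that this is a stand-in technique, not the paper's topological mapping, whose details the abstract does not give; it also uses a linear scan where the paper uses an R-tree. The function names and the choice of pivots are assumptions.

```python
import numpy as np

def to_distance_space(vectors, pivots):
    """Map each high-dimensional vector to its distances from a few pivot vectors."""
    return np.linalg.norm(vectors[:, None, :] - pivots[None, :, :], axis=2)

def range_query(vectors, mapped, pivots, query, radius):
    """Return indices of vectors within `radius` of `query`."""
    q_mapped = np.linalg.norm(pivots - query, axis=1)
    # Filter step: by the triangle inequality, |d(x, p) - d(q, p)| <= d(x, q),
    # so a true match can never be discarded (no false dismissals).
    lower_bound = np.max(np.abs(mapped - q_mapped), axis=1)
    candidates = np.flatnonzero(lower_bound <= radius)
    # Refinement step: remove false alarms with the exact distance.
    exact = np.linalg.norm(vectors[candidates] - query, axis=1)
    return candidates[exact <= radius]
```

The filter only ever underestimates the true distance, which is why the candidate set may contain false alarms but never misses a real match.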

  • A Representative-Video-Frame Selection Method for a Content-Based Video-Query-Agent System

    Katsunobu FUSHIKIDA  Yoshitsugu HIWATARI  Hideyo WAKI  

     
    PAPER-Image Processing, Image Pattern Recognition

      Vol:
    E83-D No:6
      Page(s):
    1274-1281

    An optimum representative-frames (r-frames) selection method using step-wise function approximation has been developed to provide automatic indexing for a video-query-agent (VQA) system. It uses dynamic programming to simultaneously select the r-frames and corresponding segment boundaries. Experiments showed that the approximation error of the selected r-frames was less than that of two conventional methods. Retrieval experiments using a feature-based image-search engine showed that the proposed method is more robust and effective than the two conventional methods. The proposed method was implemented in a VQA system and processing time was evaluated. The results showed that the processing time for indexing was shorter than that of the conventional method.
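Selecting r-frames and segment boundaries jointly by dynamic programming can be sketched as below. The cost function (squared feature distance between each frame and its segment's r-frame) and the name `select_r_frames` are assumptions; the abstract does not state the exact approximation error used.

```python
import numpy as np

def select_r_frames(features, num_segments):
    """Split a frame sequence into contiguous segments, each represented by one of its frames."""
    x = np.asarray(features, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    n = len(x)
    dist = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=2)
    # prefix[r, j] = sum of dist[r, 0:j], so a segment error is a difference of two entries.
    prefix = np.concatenate([np.zeros((n, 1)), np.cumsum(dist, axis=1)], axis=1)

    def segment_cost(i, j):
        """Best r-frame for frames i..j-1 and its approximation error."""
        errors = prefix[i:j, j] - prefix[i:j, i]
        r = int(np.argmin(errors))
        return errors[r], int(i) + r

    best = np.full((num_segments + 1, n + 1), np.inf)
    back = np.zeros((num_segments + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, num_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                cost = best[k - 1, i] + segment_cost(i, j)[0]
                if cost < best[k, j]:
                    best[k, j] = cost
                    back[k, j] = i

    # Trace back the optimal boundaries from the last frame.
    r_frames, boundaries = [], []
    j = n
    for k in range(num_segments, 0, -1):
        i = int(back[k, j])
        boundaries.append((i, int(j)))
        r_frames.append(segment_cost(i, j)[1])
        j = i
    return r_frames[::-1], boundaries[::-1]
```

For a one-dimensional feature sequence such as `[0, 0, 0, 5, 5, 5, 9, 9]` with three segments, the segments follow the three plateaus. The triple loop is cubic in the number of frames, so this form is only suitable for short sequences.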

  • Content-Based Video Indexing and Retrieval-- A Natural Language Approach--

    Yeun-Bae KIM  Masahiro SHIBATA  

     
    PAPER

      Vol:
    E79-D No:6
      Page(s):
    695-705

    This paper describes methods in which natural language is used to describe video contents, knowledge of which is needed for intelligent video manipulation. The content encoded in natural language is extracted by a language analyzer in the form of subject-centered dependency structures, a language-oriented structure, and is combined incrementally into a single structure called a multi-path index tree. Content descriptors and their inter-relations are extracted from the index tree in order to provide high-speed retrieval and flexibility. The content-based video index is represented in a two-dimensional structure wherein the descriptors are mapped onto a component axis and temporal references (i.e., video segments aligned to the descriptors) are mapped onto a time axis. We implemented an experimental image retrieval system to illustrate that the proposed index structure 1) has superior retrieval capabilities compared to those used in conventional methods, 2) can be generated by an automated procedure, and 3) has a compact and flexible structure that is easily expandable, making integration with vision processing possible.
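The two-dimensional index (descriptors on one axis, temporal references on the other) can be pictured with a small data structure. This sketch assumes descriptors are plain strings and segments are `(start, end)` times in seconds; the class name `VideoIndex` is hypothetical, and the language analysis and multi-path index tree from the abstract are not modeled.

```python
from collections import defaultdict

class VideoIndex:
    """Maps content descriptors (component axis) to video segments (time axis)."""

    def __init__(self):
        self._segments = defaultdict(list)

    def add(self, descriptor, start, end):
        """Attach the segment [start, end) to a descriptor."""
        self._segments[descriptor].append((start, end))

    def lookup(self, descriptor):
        """Component-axis query: all segments aligned to a descriptor."""
        return sorted(self._segments.get(descriptor, []))

    def at_time(self, t):
        """Time-axis query: all descriptors active at time t."""
        return sorted(
            d for d, segs in self._segments.items()
            if any(start <= t < end for start, end in segs)
        )
```

Querying along either axis is what lets one index answer both "when does X appear?" and "what is shown at time t?".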

  • A Structured Video Handling Technique for Multimedia Systems

    Yoshinobu TONOMURA  Akihito AKUTSU  

     
    PAPER-Image Processing, Computer Graphics and Pattern Recognition

      Vol:
    E78-D No:6
      Page(s):
    764-777

    This paper proposes a functional video handling technique based on structured video. The video handling architecture, which includes a video data structure, file management structure, and visual interface structure, is introduced as the core concept of this technique. One of the key features of this architecture is that the newly proposed video indexing method is performed automatically based on image processing. The video data structure, which plays an important role in the architecture, has two kinds of data structures: content and node. The central idea behind these structures is to separate the video contents from the processing operations and to create links between them. Video indexes work as a backend mechanism in structuring video content. A prototype video handling system called the MediaBENCH, a hypermedia basic environment for computer and human interactions, which demonstrates the actual implementation of the proposed concept and technique, is described. Basic functions such as browsing and editing, which are achieved based on the architecture, exhibit the advantages of structured video handling. The concept and the methods proposed in this paper assure various video-computer applications, which will play major roles in the multimedia field.