
Author Search Result

[Author] Naokazu YOKOYA (6 hits)

Showing 1-6 of 6 hits
  • Three Point Based Registration for Binocular Augmented Reality

    Steve VALLERAND  Masayuki KANBARA  Naokazu YOKOYA  

     
    PAPER-Multimedia Pattern Processing

    Vol: E87-D No:6, Page(s): 1554-1565

In order to register virtual objects in vision-based augmented reality systems, the relation between the real and virtual worlds must be estimated. This paper presents a three-point vision-based registration method for video see-through augmented reality systems using binocular cameras. The proposed registration method combines monocular and stereoscopic registration methods. A correction method is proposed that optimizes the registration by correcting the 2D image positions of the marker feature points. Also, an extraction strategy based on color information is put forward to make the system robust to fast user motion. In addition, a quantification method is used to evaluate the stability of the produced registration. Timing and stability results are presented. The proposed registration method is shown to be more stable than the standard stereoscopic registration method and to be independent of the distance. Even when the user moves quickly, the developed system produces stable three-point based registration. Therefore, the proposed methods are interesting alternatives for producing the registration in binocular augmented reality systems when only three points are available.
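    The stereoscopic side of such a registration pipeline ultimately rests on triangulating marker feature points seen by both cameras. As a minimal sketch (not the paper's actual method), assuming an ideal rectified binocular pair with focal length f and baseline b, depth follows from disparity; all numeric values below are illustrative:

    ```python
    import numpy as np

    def triangulate_rectified(xl, xr, y, f, b):
        """Triangulate a point from an ideal rectified stereo pair.

        xl, xr: horizontal image coordinates in left/right views (pixels)
        y:      vertical image coordinate (same in both views after rectification)
        f:      focal length in pixels
        b:      baseline between camera centres in metres
        Returns the 3-D point (X, Y, Z) in the left-camera frame.
        """
        d = xl - xr            # disparity; larger for closer points
        Z = f * b / d          # standard stereo depth relation
        X = xl * Z / f
        Y = y * Z / f
        return np.array([X, Y, Z])

    # A marker feature point seen at xl=120, xr=100 with f=800 px, b=0.1 m
    p = triangulate_rectified(120.0, 100.0, 40.0, 800.0, 0.1)  # -> [0.6, 0.2, 4.0]
    ```

    With three such triangulated marker points, the rigid transform between the marker and camera coordinate frames can then be estimated.
    
    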

  • Finding Important People in a Video Using Deep Neural Networks with Conditional Random Fields

    Mayu OTANI  Atsushi NISHIDA  Yuta NAKASHIMA  Tomokazu SATO  Naokazu YOKOYA  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2018/07/20, Vol: E101-D No:10, Page(s): 2509-2517

Finding important regions is essential for applications such as content-aware video compression and video retargeting, which automatically crops a region of a video for small screens. Since people are among the main subjects when taking a video, some methods for finding important regions use a visual attention model based on face/pedestrian detection to incorporate the knowledge that people are important. However, such methods usually do not distinguish important people from passers-by and bystanders, which results in false positives. In this paper, we propose a deep neural network (DNN)-based method that classifies each person as important or unimportant, given a video captured with a hand-held camera and containing multiple people in a single frame. Intuitively, people whose spatial motions are similar are likely to share the same important/unimportant label. Based on this assumption, we propose to boost the performance of our important/unimportant classification by using conditional random fields (CRFs) built upon the DNN, which can be trained in an end-to-end manner. Our experimental results show that our method successfully classifies important people and that the use of a DNN with CRFs improves the accuracy.
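    The coupling exploited here can be sketched with a tiny fully connected binary CRF over the people in a frame: unary costs come from a per-person classifier, and a Potts-style pairwise term penalises different labels for people with similar motion. This is an illustrative toy with exact MAP by enumeration, not the paper's actual model; all weights and feature values are made up:

    ```python
    import itertools
    import numpy as np

    def crf_map_labels(unary, motion, w=1.0, sigma=1.0):
        """Exact MAP for a small fully connected binary CRF.

        unary:  (n, 2) array; unary[i, l] = cost of giving person i label l
                (0 = unimportant, 1 = important), e.g. negated classifier scores
        motion: (n, d) array of per-person motion features
        The pairwise cost penalises different labels for similar motions.
        """
        n = len(unary)
        best, best_e = None, np.inf
        for labels in itertools.product((0, 1), repeat=n):
            e = sum(unary[i, labels[i]] for i in range(n))
            for i in range(n):
                for j in range(i + 1, n):
                    if labels[i] != labels[j]:
                        sim = np.exp(-np.sum((motion[i] - motion[j]) ** 2)
                                     / (2 * sigma ** 2))
                        e += w * sim
            if e < best_e:
                best, best_e = labels, e
        return best

    # Two people with near-identical motion; the unaries weakly disagree,
    # but the pairwise term pulls them onto the same label.
    unary = np.array([[0.2, 0.8],    # person 0 weakly prefers "unimportant"
                      [0.9, 0.1]])   # person 1 strongly prefers "important"
    motion = np.array([[1.0, 0.0], [1.0, 0.05]])
    labels = crf_map_labels(unary, motion, w=2.0)  # -> (1, 1)
    ```

    In practice the CRF in the paper is trained jointly with the DNN rather than solved by brute force, but the energy structure is the same.
    
    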

  • Passive Range Sensing Techniques: Depth from Images

    Naokazu YOKOYA  Takeshi SHAKUNAGA  Masayuki KANBARA  

     
    INVITED SURVEY PAPER

    Vol: E82-D No:3, Page(s): 523-533

Acquisition of three-dimensional information of a real-world scene from two-dimensional images has been one of the most important issues in computer vision and image understanding over the last two decades. Noncontact range acquisition techniques can be essentially classified into two classes: passive and active. This paper concentrates on passive depth extraction techniques, which have the advantage that 3-D information can be obtained without affecting the scene. Passive range sensing techniques are often referred to as shape-from-x, where x is one of several visual cues such as shading, texture, contour, focus, stereo, and motion. These techniques produce 2.5-D representations of visible surfaces. This survey discusses aspects of this research field and reviews some recent advances, including video-rate range imaging sensors, as well as emerging themes and applications.
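    To make the shape-from-stereo cue concrete, the core computation is a correspondence search: for each pixel in one view, find the horizontal shift that best matches the other view, and read depth off the resulting disparity. A toy 1-D sketch with synthetic signals and SSD block matching (purely illustrative, not from the survey):

    ```python
    import numpy as np

    def disparity_1d(left, right, max_disp, window=2):
        """Per-sample disparity for 1-D signals via SSD block matching.

        For each position in `left`, searches shifts 0..max_disp in `right`
        and keeps the shift with the lowest sum of squared differences.
        """
        n = len(left)
        disp = np.zeros(n, dtype=int)
        for i in range(window, n - window):
            best, best_cost = 0, np.inf
            for d in range(max_disp + 1):
                if i - d < window:
                    break                      # patch would fall off the signal
                patch_l = left[i - window:i + window + 1]
                patch_r = right[i - d - window:i - d + window + 1]
                cost = np.sum((patch_l - patch_r) ** 2)
                if cost < best_cost:
                    best, best_cost = d, cost
            disp[i] = best
        return disp

    # The "right" signal shifted by 3 samples plays the role of the left view,
    # so the recovered disparity should be 3 everywhere away from the borders.
    rng = np.random.default_rng(0)
    right = rng.standard_normal(40)
    left = np.roll(right, 3)
    d = disparity_1d(left, right, max_disp=5)
    ```

    Real stereo matchers work on 2-D patches along epipolar lines and add regularisation, but the matching principle is this one.
    
    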

  • Generation of a Zoomed Stereo Video Using Two Synchronized Videos with Different Magnifications

    Yusuke HAYASHI  Norihiko KAWAI  Tomokazu SATO  Miyuki OKUMOTO  Naokazu YOKOYA  

     
    PAPER-Image Processing and Video Processing

    Publicized: 2015/06/17, Vol: E98-D No:9, Page(s): 1691-1701

This paper proposes a novel approach to generating stereo video in which the zoom magnification is not constant. Although this has conventionally been achieved mechanically, that approach requires developing a mechanically complex system for each stereo camera. Instead of a mechanical solution, we take a software approach: using a pair of zoomed and non-zoomed videos, a part of each non-zoomed frame is cut out and super-resolved to generate stereo video without special hardware. To achieve this, (1) the zoom magnification parameter is automatically determined by using distributions of intensities, and (2) the cutout image is super-resolved by using optically zoomed images as exemplars. The effectiveness of the proposed method is quantitatively and qualitatively validated through experiments.
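    The cutout step can be sketched directly: given zoom magnification m, the region of the non-zoomed frame corresponding to the zoomed view is the central 1/m crop, which is then scaled back up (here with plain nearest-neighbour sampling; the paper instead super-resolves it using the optically zoomed frames as exemplars). A minimal sketch with an illustrative 8x8 frame:

    ```python
    import numpy as np

    def zoom_cutout(frame, m):
        """Crop the central 1/m region of `frame` and upscale it back to
        the original size with nearest-neighbour sampling.

        frame: (H, W) array; m: zoom magnification (> 1).
        """
        h, w = frame.shape
        ch, cw = int(round(h / m)), int(round(w / m))
        top, left = (h - ch) // 2, (w - cw) // 2
        crop = frame[top:top + ch, left:left + cw]
        # nearest-neighbour index maps from output to crop coordinates
        rows = np.arange(h) * ch // h
        cols = np.arange(w) * cw // w
        return crop[np.ix_(rows, cols)]

    frame = np.arange(64, dtype=float).reshape(8, 8)
    out = zoom_cutout(frame, 2.0)   # same size as frame, from the centre crop
    ```

    Replacing the nearest-neighbour upscale with exemplar-based super-resolution is what lets the generated view match the sharpness of the optically zoomed one.
    
    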

  • Real-Time Tracking of Multiple Moving Object Contours in a Moving Camera Image Sequence

    Shoichi ARAKI  Takashi MATSUOKA  Naokazu YOKOYA  Haruo TAKEMURA  

     
    PAPER-Image Processing, Image Pattern Recognition

    Vol: E83-D No:7, Page(s): 1583-1591

This paper describes a new method for detecting and tracking moving objects in a moving camera image sequence using robust estimation and active contour models. We assume that the apparent background motion between two consecutive image frames can be approximated by an affine transformation. In order to register the static background, we estimate the affine transformation parameters using the LMedS (Least Median of Squares) method, a robust estimator. Split-and-merge contour models are employed for tracking multiple moving objects. The image energy of the contour models is defined on the image obtained by subtracting the previous frame, transformed with the estimated affine parameters, from the current frame. We have implemented the method on an image processing system consisting of DSP boards for real-time tracking of moving objects from a moving camera image sequence.
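    The LMedS step can be sketched as follows: repeatedly fit an affine transform to a minimal sample of three point correspondences, score each fit by the median of squared residuals over all correspondences, and keep the best. A toy version with synthetic correspondences and a fraction of gross outliers standing in for moving objects; sample counts and noise values are illustrative, not the paper's settings:

    ```python
    import numpy as np

    def fit_affine(src, dst):
        """Least-squares 2-D affine transform mapping src -> dst.
        src, dst: (n, 2). Returns (2, 3) matrix A with dst ~ A @ [x, y, 1]."""
        X = np.hstack([src, np.ones((len(src), 1))])     # (n, 3)
        A, *_ = np.linalg.lstsq(X, dst, rcond=None)      # (3, 2)
        return A.T                                       # (2, 3)

    def lmeds_affine(src, dst, n_trials=200, seed=0):
        """Robust affine estimation by Least Median of Squares."""
        rng = np.random.default_rng(seed)
        X = np.hstack([src, np.ones((len(src), 1))])
        best_A, best_med = None, np.inf
        for _ in range(n_trials):
            idx = rng.choice(len(src), size=3, replace=False)  # minimal sample
            A = fit_affine(src[idx], dst[idx])
            r2 = np.sum((X @ A.T - dst) ** 2, axis=1)          # squared residuals
            med = np.median(r2)                                # LMedS score
            if med < best_med:
                best_A, best_med = A, med
        return best_A

    # Synthetic background motion: rotation + translation, with 30% outliers.
    rng = np.random.default_rng(1)
    src = rng.uniform(0, 100, size=(50, 2))
    theta = 0.1
    A_true = np.array([[np.cos(theta), -np.sin(theta), 5.0],
                       [np.sin(theta),  np.cos(theta), -3.0]])
    dst = np.hstack([src, np.ones((50, 1))]) @ A_true.T
    dst[:15] += rng.uniform(20, 40, size=(15, 2))   # gross outliers
    A_est = lmeds_affine(src, dst)
    ```

    Because the median ignores up to half the residuals, the estimate is unaffected by the outlying (moving-object) correspondences, which is exactly why LMedS suits background registration under a moving camera.
    
    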

  • Optimization Approaches in Computer Vision and Image Processing

    Katsuhiko SAKAUE  Akira AMANO  Naokazu YOKOYA  

     
    INVITED SURVEY PAPER

    Vol: E82-D No:3, Page(s): 534-547

In this paper, the authors present general views of computer vision and image processing based on optimization. Relaxation and regularization, in both broad and narrow senses, are used in various fields and problems of computer vision and image processing, and they are currently being combined with general-purpose optimization algorithms. The principles and case examples of relaxation and regularization are discussed; the application of optimization to shape description, a particularly important problem in the field, is described; and the use of genetic algorithms (GAs) as an optimization method is introduced.
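    As a minimal instance of regularization in this sense, consider recovering a smooth 1-D signal u from noisy data f by minimizing ||u - f||^2 + lam * ||D u||^2, where D is the first-difference operator; the minimiser solves the linear system (I + lam * D^T D) u = f. A small sketch, with an illustrative noisy step signal:

    ```python
    import numpy as np

    def regularize_1d(f, lam):
        """Tikhonov-regularised smoothing of a 1-D signal.

        Minimises ||u - f||^2 + lam * ||D u||^2, where D takes first
        differences; the minimiser solves (I + lam * D^T D) u = f,
        a symmetric tridiagonal system.
        """
        n = len(f)
        D = np.diff(np.eye(n), axis=0)        # (n-1, n) first-difference operator
        A = np.eye(n) + lam * D.T @ D
        return np.linalg.solve(A, f)

    # Noisy step signal; larger lam trades data fidelity for smoothness.
    rng = np.random.default_rng(0)
    f = np.concatenate([np.zeros(20), np.ones(20)]) + 0.2 * rng.standard_normal(40)
    u = regularize_1d(f, lam=5.0)
    ```

    The same fidelity-plus-smoothness structure, with image gradients in place of D, underlies many of the regularized vision problems the survey covers.
    
    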