
Keyword Search Result

[Keyword] stereo vision (23 hits)

Hits 1-20 of 23

  • FPGA Implementation of 3-Bit Quantized Multi-Task CNN for Contour Detection and Disparity Estimation

    Masayuki MIYAMA  

     
    PAPER-Image Recognition, Computer Vision

      Publicized:
    2021/10/26
      Vol:
    E105-D No:2
      Page(s):
    406-414

    Object contour detection is the task of extracting the shape created by the boundaries between objects in an image. Conventional methods limit the detection targets to specific categories, or misdetect edges of patterns inside an object. We propose a new method to represent a contour image where the pixel value is the distance to the boundary. Contour detection becomes a regression problem that estimates this contour image. A deep convolutional network for contour estimation is combined with stereo vision to detect unspecified object contours. Furthermore, thanks to similar inference targets and a common network structure, we propose a network that simultaneously estimates both contour and disparity with fully shared weights. In experiments, the multi-tasking network drew a good precision-recall curve, and the F-measure was about 0.833 on the FlyingThings3D dataset. The L1 loss of disparity estimation on the dataset was 2.571. This network reduces the amount of calculation and memory capacity by half, and the accuracy drop compared to the dedicated networks is slight. We then quantize both weights and activations of the network to 3 bits. We devise a dedicated hardware architecture for the quantized CNN and implement it on an FPGA. This circuit uses only internal memory to perform forward propagation calculations, which eliminates high-power external memory accesses. The circuit is a stall-free pixel-by-pixel pipeline and performs convolution calculations for 8 rows, 16 input channels, 16 output channels, and 3×3 pixels in parallel. The convolution calculation performance at the operating frequency of 250 MHz is 9 TOPs/s.
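
The throughput figure in this abstract can be sanity-checked from the stated parallelism. The sketch below is only a back-of-the-envelope calculation; it assumes that one multiply-accumulate is counted as two operations, which the abstract does not state, and the function name is hypothetical.

```python
# Back-of-the-envelope check of the quoted convolution throughput.
# Assumption (not stated in the abstract): one multiply-accumulate (MAC)
# is counted as two operations.

def peak_ops_per_second(rows, in_ch, out_ch, kernel_h, kernel_w,
                        clock_hz, ops_per_mac=2):
    """Peak operations per second for a fully parallel convolution array."""
    macs_per_cycle = rows * in_ch * out_ch * kernel_h * kernel_w
    return macs_per_cycle * ops_per_mac * clock_hz

# 8 rows, 16 input channels, 16 output channels, 3x3 kernel, 250 MHz clock.
tops = peak_ops_per_second(8, 16, 16, 3, 3, 250e6) / 1e12
print(f"{tops:.3f} TOPS")
```

Under that assumption the array performs 18,432 MACs per cycle, which gives roughly 9.2 tera-operations per second and is consistent with the rounded figure of 9 in the abstract.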

  • Ground Plane Detection with a New Local Disparity Texture Descriptor

    Kangru WANG  Lei QU  Lili CHEN  Jiamao LI  Yuzhang GU  Dongchen ZHU  Xiaolin ZHANG  

     
    LETTER-Pattern Recognition

      Publicized:
    2017/06/27
      Vol:
    E100-D No:10
      Page(s):
    2664-2668

    In this paper, a novel approach is proposed for stereo vision-based ground plane detection at the superpixel level, which is implemented by employing a Disparity Texture Map in a convolutional neural network architecture. In particular, the Disparity Texture Map is calculated with a new Local Disparity Texture Descriptor (LDTD). The experimental results demonstrate the superior performance of our approach on the KITTI dataset.

  • Occlusion-Robust Human Tracking with Integrated Multi-View Depth Imagery

    Kenichiro FUKUSHI  Itsuo KUMAZAWA  

     
    PAPER-Image Recognition, Computer Vision

      Vol:
    E97-D No:12
      Page(s):
    3181-3191

    In this paper, we present a computer vision-based human tracking system with multiple stereo cameras. Many widely used methods, such as KLT-tracker, update the trackers “frame-to-frame,” so that features extracted from one frame are utilized to update their current state. In contrast, we propose a novel optimization technique for the “multi-frame” approach that computes resultant trajectories directly from video sequences, in order to achieve high-level robustness against severe occlusion, which is known to be a challenging problem in computer vision. We developed a heuristic optimization technique to estimate human trajectories, instead of using dynamic programming (DP) or an iterative approach, which makes our method sufficiently computationally efficient to operate in realtime. Six video sequences where one to six people walk in a narrow laboratory space are processed using our system. The results confirm that our system is capable of tracking cluttered scenes in which severe occlusion occurs and people are frequently in close proximity to each other. Moreover, minimal information is required for tracking, instead of full camera images, which is communicated over the network. Hence, commonly used network devices are sufficient for constructing our tracking system.

  • A Study of Stereoscopic Image Quality Assessment Model Corresponding to Disparate Quality of Left/Right Image for JPEG Coding

    Masaharu SATO  Yuukou HORITA  

     
    LETTER-Quality Metrics

      Vol:
    E95-A No:8
      Page(s):
    1264-1269

    Our research is focused on examining a stereoscopic quality assessment model for stereoscopic images with disparate quality in left and right images for glasses-free stereo vision. In this paper, we examine the objective assessment model of 3-D images, considering the difference in image quality between each viewpoint generated by the disparity-compensated coding. An overall stereoscopic image quality can be estimated by using only predicted values of left and right 2-D image qualities based on the MPEG-7 descriptor information without using any disparity information. As a result, the stereoscopic still image quality is assessed with high prediction accuracy, with correlation coefficient=0.98 and average error=0.17.

  • Simplified Relative Model to Measure Visual Fatigue in a Stereoscopy

    Jae Gon KIM  Jun-Dong CHO  

     
    LETTER

      Vol:
    E94-A No:12
      Page(s):
    2830-2831

    In this paper, we propose a quantitative metric for measuring the degree of visual fatigue in stereoscopy. To the best of our knowledge, this is the first simplified relative quantitative approach describing the visual fatigue value of stereoscopy. Our experimental result shows that a correlation index of more than 98% is obtained between our Simplified Relative Visual Fatigue (SRVF) model and the Mean Opinion Score (MOS).

  • Memory Allocation for Multi-Resolution Image Processing

    Yasuhiro KOBAYASHI  Masanori HARIYAMA  Michitaka KAMEYAMA  

     
    PAPER-VLSI Systems

      Vol:
    E91-D No:10
      Page(s):
    2386-2397

    Hierarchical approaches using multi-resolution images are well-known techniques to reduce the computational amount without degrading quality. One major issue in designing image processors is to design a memory system that supports parallel access with a simple interconnection network. The complexity of the interconnection network mainly depends on memory allocation, which maps pixels onto memory modules and determines the required number of memory modules. This paper presents a memory allocation method to minimize the number of memory modules for image processing using multi-resolution images. For efficient search, the proposed method exploits the regularity of window-type image processing. A practical example demonstrates that the number of memory modules is reduced to less than 14% of that of conventional methods.

  • A Passive 3D Face Recognition System and Its Performance Evaluation

    Akihiro HAYASAKA  Takuma SHIBAHARA  Koichi ITO  Takafumi AOKI  Hiroshi NAKAJIMA  Koji KOBAYASHI  

     
    PAPER

      Vol:
    E91-A No:8
      Page(s):
    1974-1981

    This paper proposes a three-dimensional (3D) face recognition system using passive stereo vision. So far, the reported 3D face recognition techniques have used active 3D measurement methods to capture high-quality 3D facial information. However, active methods employ structured illumination (structure projection, phase shift, moire topography, etc.) or laser scanning, which is not desirable in many human recognition applications. Addressing this problem, we propose a face recognition system that uses (i) passive stereo vision to capture 3D facial information and (ii) 3D matching using an ICP (Iterative Closest Point) algorithm with its improvement techniques. Experimental evaluation demonstrates efficient recognition performance of the proposed system compared with an active 3D face recognition system and a passive 3D face recognition system employing the original ICP algorithm.

  • 3D Precise Inspection of Terminal Lead for Electronic Devices by Single Camera Stereo Vision

    Takashi WATANABE  Akira KUSANO  Takayuki FUJIWARA  Hiroyasu KOSHIMIZU  

     
    PAPER

      Vol:
    E91-D No:7
      Page(s):
    1885-1892

    It is very important to guarantee the quality of industrial products by means of visual inspection. To reduce soldering defects caused by terminal deformation and terminal burrs in the manufacturing process, this paper proposes a 3D visual inspection system based on stereo vision with a single camera. Technically, the baseline of this single-camera stereo setup is precisely calibrated by an image processing procedure. In addition, an original algorithm reduces the error in extracting the measuring-point coordinates used for computing disparity. Compared with human inspection using an industrial microscope, the proposed 3D inspection could be an alternative in terms of precision and processing cost. Since the practical specification for 3D precision is less than 1 pixel and the experimental performance was around the same, the proposed system demonstrated that soldering defects with terminal deformation and terminal burrs were decreased in inspection, especially in 3D inspection. To realize inline inspection, this paper also suggests how human inspection of products could be modeled and implemented by a computer system, especially in the manufacturing process.

  • Interactive Cosmetic Makeup of a 3D Point-Based Face Model

    Jeong-Sik KIM  Soo-Mi CHOI  

     
    PAPER-Interface Design

      Vol:
    E91-D No:6
      Page(s):
    1673-1680

    We present an interactive system for cosmetic makeup of a point-based face model acquired by 3D scanners. We first enhance the texture of a face model in 3D space using low-pass Gaussian filtering, median filtering, and histogram equalization. The user is provided with a stereoscopic display and haptic feedback, and can perform simulated makeup tasks including the application of foundation, color makeup, and lip gloss. Fast rendering is achieved by processing surfels using the GPU, and we use a BSP tree data structure and a dynamic local refinement of the facial surface to provide interactive haptics. We have implemented a prototype system and evaluated its performance.

  • Design of a Trinocular-Stereo-Vision VLSI Processor Based on Optimal Scheduling

    Masanori HARIYAMA  Naoto YOKOYAMA  Michitaka KAMEYAMA  

     
    PAPER

      Vol:
    E91-C No:4
      Page(s):
    479-486

    This paper presents a processor architecture for high-speed and reliable trinocular stereo matching based on adaptive window-size control of SAD (Sum of Absolute Differences) computation. To reduce its computational complexity, SADs are computed using images divided into non-overlapping regions, and the matching result is iteratively refined by reducing the window size. A window-parallel-and-pixel-parallel architecture is also proposed to fully exploit the potential parallelism of the algorithm. The architecture also reduces the complexity of the interconnection network between memory and functional units based on the regularity of reference pixels. The stereo matching processor is designed in a 0.18 µm CMOS technology. The processing time is 83.2 µs@100 MHz. By using optimal scheduling, the increases in area and processing time are only 5% and 3%, respectively, compared to binocular stereo vision, although the computational amount is double.

  • Surface Reconstruction from Stereo Data Using a Three-Dimensional Markov Random Field Model

    Hotaka TAKIZAWA  Shinji YAMAMOTO  

     
    PAPER-Stereo and Multiple View Analysis

      Vol:
    E89-D No:7
      Page(s):
    2028-2035

    In the present paper, we propose a method for reconstructing the surfaces of objects from stereo data. Both the fitness of stereo data to surfaces and interrelation between the surfaces are defined in the framework of a three-dimensional (3-D) Markov Random Field (MRF) model. The surface reconstruction is accomplished by searching for the most likely state of the MRF model. Three experimental results are shown for synthetic and real stereo data.

  • A New Efficient Stereo Line Segment Matching Algorithm Based on More Effective Usage of the Photometric, Geometric and Structural Information

    Ghader KARIMIAN  Abolghasem A. RAIE  Karim FAEZ  

     
    PAPER-Stereo and Multiple View Analysis

      Vol:
    E89-D No:7
      Page(s):
    2012-2020

    In this paper, a new stereo line segment matching algorithm is presented. The main purpose of this algorithm is to increase efficiency, i.e., to increase the number of correctly matched lines while avoiding an increase in mismatches. In this regard, the reasons for the elimination of correct matches as well as the existence of erroneous ones in some existing algorithms have been investigated. An attempt was also made to make efficient use of the photometric, geometric and structural information through the introduction of new constraints, criteria, and procedures. Hence, in the candidate determination stage of the designed algorithm, two new constraints were employed in addition to the reliable epipolar, maximum and minimum disparity, and orientation similarity constraints. The disambiguation and final match selection process, which is the main problem in matching, is a completely new development with respect to the employed constraints, the criterion function, and its optimization. The algorithm was applied to images of several indoor scenes, and its high efficiency was illustrated by correct matching of 96% of the line segments with no mismatches.

  • A High-Accuracy Passive 3D Measurement System Using Phase-Based Image Matching

    Mohammad Abdul MUQUIT  Takuma SHIBAHARA  Takafumi AOKI  

     
    PAPER-Image/Vision Processing

      Vol:
    E89-A No:3
      Page(s):
    686-697

    This paper presents a high-accuracy 3D (three-dimensional) measurement system using multi-camera passive stereo vision to reconstruct 3D surfaces of free form objects. The proposed system is based on an efficient stereo correspondence technique, which consists of (i) coarse-to-fine correspondence search, and (ii) outlier detection and correction, both employing phase-based image matching. The proposed sub-pixel correspondence search technique contributes to dense reconstruction of arbitrary-shaped 3D surfaces with high accuracy. The outlier detection and correction technique contributes to high reliability of reconstructed 3D points. Through a set of experiments, we show that the proposed system measures 3D surfaces of objects with sub-mm accuracy. Also, we demonstrate high-quality dense 3D reconstruction of a human face as a typical example of free form objects. The result suggests a potential possibility of our approach to be used in many computer vision applications.

  • FPGA Implementation of a Stereo Matching Processor Based on Window-Parallel-and-Pixel-Parallel Architecture

    Masanori HARIYAMA  Yasuhiro KOBAYASHI  Haruka SASAKI  Michitaka KAMEYAMA  

     
    PAPER-VLSI Architecture

      Vol:
    E88-A No:12
      Page(s):
    3516-3522

    This paper presents a processor architecture for high-speed and reliable stereo matching based on adaptive window-size control of SAD (Sum of Absolute Differences) computation. To reduce its computational complexity, SADs are computed using images divided into non-overlapping regions, and the matching result is iteratively refined by reducing the window size. A window-parallel-and-pixel-parallel architecture is also proposed to fully exploit the potential parallelism of the algorithm. The architecture also reduces the complexity of the interconnection network between memory and functional units based on the regularity of reference pixels. The stereo matching processor is implemented on an FPGA. Its performance is 80 times higher than that of a microprocessor (Pentium4@2 GHz), and is enough to generate a 3-D depth image at the video rate of 33 MHz.
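
The idea of SAD matching refined with a shrinking window can be illustrated in software. The following is a minimal sketch, not the architecture described in the abstract: it assumes rectified grayscale images stored as lists of rows, assumes the caller keeps every window inside the image, and uses hypothetical function names and window sizes.

```python
# Simplified software illustration of SAD matching with a shrinking window.
# Assumptions: rectified images as lists of rows; a left pixel at column x
# corresponds to the right pixel at column x - d; windows stay in bounds.

def sad(left, right, x, y, d, half):
    """Sum of absolute differences for a (2*half+1)^2 window at disparity d."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[y + dy][x + dx] - right[y + dy][x + dx - d])
    return total

def best_disparity(left, right, x, y, candidates, half):
    """Return the candidate disparity with the smallest SAD."""
    return min(candidates, key=lambda d: sad(left, right, x, y, d, half))

def coarse_to_fine_disparity(left, right, x, y, max_disp, halves=(2, 1)):
    """Search the full range with the largest window, then re-check the
    neighbourhood of the current estimate with each smaller window."""
    d = best_disparity(left, right, x, y, range(max_disp + 1), halves[0])
    for half in halves[1:]:
        lo, hi = max(0, d - 1), min(max_disp, d + 1)
        d = best_disparity(left, right, x, y, range(lo, hi + 1), half)
    return d

# Synthetic test pair: the left image is the right image shifted by 3 pixels.
W, H, SHIFT = 24, 9, 3
right_img = [[(x * x + 3 * y) % 97 for x in range(W)] for y in range(H)]
left_img = [[right_img[y][x - SHIFT] if x >= SHIFT else 0 for x in range(W)]
            for y in range(H)]
print(coarse_to_fine_disparity(left_img, right_img, x=12, y=4, max_disp=6))
```

On this synthetic pair the estimate at pixel (12, 4) is the 3-pixel shift used to build the images. A hardware design would evaluate many windows and pixels at once; the sequential loops here only show the arithmetic.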

  • Architecture of a Stereo Matching VLSI Processor Based on Hierarchically Parallel Memory Access

    Masanori HARIYAMA  Haruka SASAKI  Michitaka KAMEYAMA  

     
    PAPER-Digital Circuits and Computer Arithmetic

      Vol:
    E88-D No:7
      Page(s):
    1486-1491

    This paper presents a VLSI processor for high-speed and reliable stereo matching based on adaptive window-size control of SAD (Sum of Absolute Differences) computation. To reduce its computational complexity, SADs are computed using multi-resolution images. Parallel memory access is essential for highly parallel image processing. For parallel memory access, this paper also presents an optimal memory allocation that minimizes the hardware amount under the condition of parallel memory access at specified resolutions.

  • Three Point Based Registration for Binocular Augmented Reality

    Steve VALLERAND  Masayuki KANBARA  Naokazu YOKOYA  

     
    PAPER-Multimedia Pattern Processing

      Vol:
    E87-D No:6
      Page(s):
    1554-1565

    In order to perform the registration of virtual objects in vision-based augmented reality systems, the estimation of the relation between the real and virtual worlds is needed. This paper presents a three-point vision-based registration method for video see-through augmented reality systems using binocular cameras. The proposed registration method is based on a combination of monocular and stereoscopic registration methods. A correction method that optimizes the registration by correcting the 2D positions of the marker feature points in the images is proposed. Also, an extraction strategy based on color information is put forward to make the system robust to fast user motion. In addition, a quantification method is used to evaluate the stability of the produced registration. Timing and stability results are presented. The proposed registration method is shown to be more stable than the standard stereoscopic registration method and to be independent of the distance. Even when the user moves quickly, our developed system succeeds in producing stable three-point based registration. Therefore, our proposed methods can be considered as interesting alternatives to produce the registration in binocular augmented reality systems when only three points are available.

  • Fast Stereo Matching Using Constraints in Discrete Space

    Hong JEONG  Yuns OH  

     
    PAPER-Image Processing, Image Pattern Recognition

      Vol:
    E83-D No:7
      Page(s):
    1592-1600

    We present a new basis for discrete representation of stereo correspondence. This center-referenced basis permits a more natural, complete and concise representation of constraints in stereo matching. In this context, a MAP formulation for disparity estimation is derived and reduced to unconstrained minimization of an energy function. Incorporating natural constraints, the problem is simplified to the shortest path problem in a sparsely connected trellis structure, which is solved by an efficient dynamic programming algorithm. The computational complexity is the same as the best of other dynamic programming methods, but a very high degree of concurrency is possible in the algorithm, making it suitable for implementation with parallel processors. Experimental results confirm the performance of this method, and matching errors are found to degrade gracefully in exponential form with respect to noise.

  • Optimal Estimation of Three-Dimensional Rotation and Reliability Evaluation

    Naoya OHTA  Kenichi KANATANI  

     
    PAPER-Image Processing, Computer Graphics and Pattern Recognition

      Vol:
    E81-D No:11
      Page(s):
    1247-1252

    We discuss optimal rotation estimation from two sets of 3-D points in the presence of anisotropic and inhomogeneous noise. We first present a theoretical accuracy bound and then give a method that attains that bound, which can be viewed as describing the reliability of the solution. We also show that an efficient computational scheme can be obtained by using quaternions and applying renormalization. Using real stereo images for 3-D reconstruction, we demonstrate that our method is superior to the least-squares method and confirm the predictions of our theory by applying a bootstrap procedure.

  • Infinity and Planarity Test for Stereo Vision

    Yasushi KANAZAWA  Kenichi KANATANI  

     
    PAPER-Image Processing, Computer Graphics and Pattern Recognition

      Vol:
    E80-D No:8
      Page(s):
    774-779

    Introducing a mathematical model of noise in stereo images, we propose a new criterion for intelligent statistical inference about the scene we are viewing by using the geometric information criterion (geometric AIC). Using synthetic and real-image experiments, we demonstrate that a robot can test whether or not the object is located very far away or the object is a planar surface without using any knowledge about the noise magnitude or any empirically adjustable thresholds.

  • Reliability of 3-D Reconstruction by Stereo Vision

    Yasushi KANAZAWA  Kenichi KANATANI  

     
    PAPER-Image Processing, Computer Graphics and Pattern Recognition

      Vol:
    E78-D No:10
      Page(s):
    1301-1306

    Theoretically, corresponding pairs of feature points between two stereo images can determine their 3-D locations uniquely by triangulation. In the presence of noise, however, corresponding feature points may not satisfy the epipolar equation exactly, so we must first correct the corresponding pairs so as to satisfy the epipolar equation. In this paper, we present an optimal correction method based on a statistical model of image noise. Our method allows us to evaluate the magnitude of image noise a posteriori and compute the covariance matrix of each of the reconstructed 3-D points. We demonstrate the effectiveness of our method by doing numerical simulation and real-image experiments.
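
The epipolar equation mentioned in this abstract can be written as x2^T F x1 = 0 for homogeneous image points x1, x2 and a fundamental matrix F. The sketch below only evaluates that residual for a noisy pair; it does not reproduce the optimal correction method of the paper. The matrix used is the standard one for an ideal rectified stereo pair, and the function name and point values are hypothetical.

```python
# Residual of the epipolar equation x2^T F x1 for one correspondence.
# Assumption: an ideal rectified pair, for which the fundamental matrix
# reduces (up to scale) to the matrix below and the residual is v1 - v2.

F_RECTIFIED = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, -1.0],
    [0.0, 1.0, 0.0],
]

def epipolar_residual(F, p1, p2):
    """Return x2^T F x1 for pixel coordinates p1 = (u1, v1), p2 = (u2, v2)."""
    x1 = (p1[0], p1[1], 1.0)
    x2 = (p2[0], p2[1], 1.0)
    Fx1 = [sum(F[i][j] * x1[j] for j in range(3)) for i in range(3)]
    return sum(x2[i] * Fx1[i] for i in range(3))

# A noise-free pair lies on the same image row, so the residual vanishes.
print(epipolar_residual(F_RECTIFIED, (10.0, 5.0), (7.0, 5.0)))
# A noisy pair violates the equation and would need correction
# before triangulation.
print(epipolar_residual(F_RECTIFIED, (10.0, 5.0), (7.0, 5.5)))
```

A nonzero residual indicates that the two viewing rays do not intersect exactly, which is why a correction step is required before triangulating.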
