Keyword Search Result

[Keyword] stereo (113 hits)

1-20 of 113 hits

  • Simultaneous Visible Light Communication and Ranging Using High-Speed Stereo Cameras Based on Bicubic Interpolation Considering Multi-Level Pulse-Width Modulation

    Ruiyi HUANG  Masayuki KINOSHITA  Takaya YAMAZATO  Hiraku OKADA  Koji KAMAKURA  Shintaro ARAI  Tomohiro YENDO  Toshiaki FUJII  

     
    PAPER-Communication Theory and Signals

    Publicized: 2022/12/26  Vol: E106-A No:7  Page(s): 990-997

    Visible light communication (VLC) and visible light ranging are applicable techniques for intelligent transportation systems (ITS). They use every unique light-emitting diode (LED) on roads for data transmission and range estimation. Performing VLC and ranging simultaneously can improve the performance of both, but doing so requires a high data rate together with high-accuracy ranging. We use pulse-width modulation (PWM) as the signal modulation method to increase the data rate. However, when PWM is used for VLC data transmission, images of the LED transmitters are captured at different luminance levels and are easily saturated, and LED saturation leads to inaccurate range estimation. In this paper, we establish a novel simultaneous visible light communication and ranging system for ITS using PWM. We analyze the LED saturation problem and apply bicubic interpolation to solve it, thereby improving the communication and ranging performance. Simultaneous communication and ranging are enabled using a stereo camera. Communication is realized using maximal-ratio combining (MRC), while ranging is achieved using phase-only correlation (POC) and sinc function approximation. Furthermore, we measured the performance of our proposed system in a field trial experiment. The results show that error-free performance can be achieved up to a communication distance of 55 m and that the range estimation errors are below 0.5 m within 60 m.
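Stereo ranging of the kind described above usually rests on the pinhole triangulation relation Z = f·B/d, where d is the disparity between the two camera images. The sketch below is not taken from the paper; it only illustrates why sub-pixel disparity accuracy matters at long range. The focal length, baseline, and disparity values are hypothetical.

```python
def range_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Pinhole stereo triangulation: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def range_error(focal_px: float, baseline_m: float, range_m: float,
                disparity_error_px: float) -> float:
    """First-order range error caused by a disparity error: dZ ~= Z^2 / (f * B) * dd."""
    return range_m ** 2 / (focal_px * baseline_m) * disparity_error_px


# Hypothetical camera parameters (not from the paper).
F_PX = 2000.0   # focal length in pixels
B_M = 0.5       # baseline in metres

z = range_from_disparity(F_PX, B_M, disparity_px=20.0)
# Range error at 60 m if the disparity estimate is off by 0.1 px.
err = range_error(F_PX, B_M, range_m=60.0, disparity_error_px=0.1)
print(f"range = {z:.1f} m, error at 60 m for 0.1 px = {err:.2f} m")
```

Because the error grows with the square of the range, a fixed disparity error costs far more at 60 m than at 10 m, which is consistent with the abstract's emphasis on avoiding saturated LED images before estimating the shift.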

  • RT-libSGM: FPGA-Oriented Real-Time Stereo Matching System with High Scalability

    Kaijie WEI  Yuki KUNO  Masatoshi ARAI  Hideharu AMANO  

     
    PAPER-Computer System

    Publicized: 2022/12/07  Vol: E106-D No:3  Page(s): 337-348

    Stereo depth estimation has become an attractive topic in the computer vision field. Although various algorithms strive to optimize the speed and the precision of estimation, the energy cost of a system is also an essential metric for an embedded system. Among these algorithms, Semi-Global Matching (SGM) has been a popular choice for some real-world applications because of its balance between accuracy and speed. However, its power consumption makes it difficult to apply to an embedded system. Thus, we propose a robust stereo matching system, RT-libSGM, running on Xilinx Field-Programmable Gate Array (FPGA) platforms. The dedicated design of each module optimizes the speed of the entire system while ensuring the flexibility of the system structure. In an evaluation on a Zynq FPGA board called M-KUBOS, RT-libSGM achieves state-of-the-art performance with lower power consumption. Compared with the benchmark design (libSGM) running on the Tegra X2 GPU, RT-libSGM runs more than 2× faster at a much lower energy cost.
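For readers unfamiliar with SGM, the core of the algorithm is a per-path cost aggregation with two smoothness penalties. The following is a minimal single-path sketch of that textbook recurrence in plain Python. It is not the libSGM or RT-libSGM implementation, and the penalty values `p1` and `p2` and the toy cost table are hypothetical.

```python
def aggregate_path(cost, p1, p2):
    """Aggregate matching costs along one left-to-right path.

    cost[x][d] is the raw matching cost of pixel x at disparity d.
    Recurrence: L(x, d) = C(x, d) + min(L(x-1, d),
                                        L(x-1, d-1) + p1,
                                        L(x-1, d+1) + p1,
                                        min_k L(x-1, k) + p2) - min_k L(x-1, k)
    """
    n_disp = len(cost[0])
    agg = [list(cost[0])]
    for x in range(1, len(cost)):
        prev = agg[-1]
        prev_min = min(prev)
        row = []
        for d in range(n_disp):
            candidates = [prev[d], prev_min + p2]
            if d > 0:
                candidates.append(prev[d - 1] + p1)
            if d + 1 < n_disp:
                candidates.append(prev[d + 1] + p1)
            row.append(cost[x][d] + min(candidates) - prev_min)
        agg.append(row)
    return agg


# Toy cost table: 3 pixels, 3 disparities. The middle pixel is ambiguous.
raw = [[0, 5, 5],
       [4, 4, 4],
       [5, 0, 5]]
agg = aggregate_path(raw, p1=1, p2=4)
print(agg)
```

In this toy case the middle pixel has identical raw costs for every disparity, and the aggregated costs break the tie in favour of its left neighbour's disparity. A full SGM sums such aggregations over several path directions before picking the minimum.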

  • FPGA Implementation of 3-Bit Quantized Multi-Task CNN for Contour Detection and Disparity Estimation

    Masayuki MIYAMA  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2021/10/26  Vol: E105-D No:2  Page(s): 406-414

    Object contour detection is the task of extracting the shape created by the boundaries between objects in an image. Conventional methods limit the detection targets to specific categories, or misdetect edges of patterns inside an object. We propose a new method to represent a contour image where the pixel value is the distance to the boundary. Contour detection becomes a regression problem that estimates this contour image. A deep convolutional network for contour estimation is combined with stereo vision to detect unspecified object contours. Furthermore, thanks to similar inference targets and a common network structure, we propose a network that simultaneously estimates both contour and disparity with fully shared weights. In experiments, the multi-task network drew a good precision-recall curve, and the F-measure was about 0.833 for the FlyingThings3D dataset. The L1 loss of disparity estimation for the dataset was 2.571. This network reduces the amount of calculation and memory capacity by half, and the accuracy drop compared to the dedicated networks is slight. We then quantize both weights and activations of the network to 3 bits. We devise a dedicated hardware architecture for the quantized CNN and implement it on an FPGA. This circuit uses only internal memory to perform forward propagation calculations, which eliminates high-power external memory accesses. This circuit is a stall-free pixel-by-pixel pipeline, and performs convolution calculations for 8 rows, 16 input channels, 16 output channels, and 3 × 3 pixels in parallel. The convolution calculation performance at the operating frequency of 250 MHz is 9 TOPS.
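The throughput figure quoted above can be checked with simple arithmetic from the stated parallelism. The sketch below assumes that each multiply-accumulate counts as two operations, which is a common convention but is not stated in the abstract.

```python
# Parallelism stated in the abstract.
ROWS = 8
IN_CHANNELS = 16
OUT_CHANNELS = 16
KERNEL_TAPS = 3 * 3
CLOCK_HZ = 250e6

# Assumption: one multiply-accumulate (MAC) is counted as 2 operations.
OPS_PER_MAC = 2

macs_per_cycle = ROWS * IN_CHANNELS * OUT_CHANNELS * KERNEL_TAPS
ops_per_second = OPS_PER_MAC * macs_per_cycle * CLOCK_HZ
tops = ops_per_second / 1e12

print(f"{macs_per_cycle} MACs/cycle, about {tops:.3f} tera-operations per second")
```

Under that assumption the result is 18,432 MACs per cycle and roughly 9.2 tera-operations per second, in line with the rounded figure in the abstract.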

  • Dynamic Image Adjustment Method and Evaluation for Glassless 3D Viewing Systems

    Takayuki NAKATA  Isao NISHIHARA  

     
    PAPER-Image Processing and Video Processing

    Publicized: 2020/08/24  Vol: E103-D No:11  Page(s): 2351-2361

    In this paper, we propose an accurate calibration method for glassless stereoscopic systems. The method uses a lenticular lens on a general display. Glassless stereoscopic displays are currently used in many fields; however, accurately adjusting their physical display position is difficult because an accuracy of several microns or one hundredth of a degree is required, particularly given their larger display area. The proposed method enables dynamic adjustment of the positions of images on the display to match various physical conditions in three-dimensional (3D) displays. In particular, compared with existing approaches, it avoids degradation of the image quality due to the image location on the screen while improving the image quality by local mapping. Moreover, it is shown to decrease the calibration time by performing simultaneous processing for each local area. As a result of the calibration, the offset jitter representing the crosstalk is reduced from 14.946 to 8.645 mm. It is shown that high-quality 3D videos can be generated. Finally, we construct a stereoscopic viewing system using a high-resolution display and lenticular lens and produce high-quality 3D images with automatic calibration.

  • Asymmetric Learning for Stereo Matching Cost Computation

    Zhongjian MA  Dongzhen HUANG  Baoqing LI  Xiaobing YUAN  

     
    PAPER-Artificial Intelligence, Data Mining

    Publicized: 2020/07/13  Vol: E103-D No:10  Page(s): 2162-2167

    Current stereo matching methods benefit a lot from the precise stereo estimation with Convolutional Neural Networks (CNNs). Nevertheless, patch-based siamese networks rely on the implicit assumption of constant depth within a window, which does not hold for slanted surfaces. Existing methods for handling slanted patches focus on post-processing. In contrast, we propose a novel module for matching cost networks to overcome this bias. Slanted objects appear horizontally stretched between stereo pairs, suggesting that the feature extraction in the horizontal direction should be different from that in the vertical direction. To tackle this distortion, we utilize asymmetric convolutions in our proposed module. Experimental results show that the proposed module in matching cost networks can achieve higher accuracy with fewer parameters compared to conventional methods.

  • Using Temporal Correlation to Optimize Stereo Matching in Video Sequences

    Ming LI  Li SHI  Xudong CHEN  Sidan DU  Yang LI  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2019/03/01  Vol: E102-D No:6  Page(s): 1183-1196

    The large computational complexity makes stereo matching a big challenge in real-time application scenarios. The problem of stereo matching in a video sequence is slightly different from that in a still image because there exists temporal correlation among video frames. However, no existing method has considered temporal consistency of disparity for algorithm acceleration. In this work, we propose a scheme called the dynamic disparity range (DDR) to optimize the matching cost calculation and cost aggregation steps by narrowing the disparity search range, and a scheme called temporal cost aggregation path to optimize the cost aggregation step. Based on these schemes, we propose the DDR-SGM and DDR-MCCNN algorithms for stereo matching in video sequences. Evaluation results show that the proposed algorithms significantly reduce the computational complexity with only a very slight loss of accuracy. We show that the proposed optimizations for stereo matching are effective and that the temporal consistency in stereo video is highly useful for either improving accuracy or reducing computational complexity.
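The idea of narrowing the disparity search range using the previous frame can be illustrated on a single scanline. The sketch below is a toy illustration, not the authors' DDR-SGM or DDR-MCCNN: it uses plain sum-of-absolute-differences matching, and the `margin` parameter, the helper names, and the synthetic data are all hypothetical.

```python
def sad(left, right, x, d, half):
    """Sum of absolute differences between the window at x (left) and x - d (right)."""
    return sum(abs(left[x + k] - right[x - d + k]) for k in range(-half, half + 1))


def best_disparity(left, right, x, d_min, d_max, half=1):
    """Return (best disparity, number of candidates evaluated) for pixel x."""
    candidates = [d for d in range(d_min, d_max + 1) if x - d - half >= 0]
    best = min(candidates, key=lambda d: sad(left, right, x, d, half))
    return best, len(candidates)


def dynamic_range(prev_disparity, margin, d_max):
    """Search only around the disparity found in the previous frame."""
    return max(0, prev_disparity - margin), min(d_max, prev_disparity + margin)


# Synthetic scanline: the left row is the right row shifted by 3 pixels.
TRUE_D = 3
right_row = [(i * 37) % 101 for i in range(40)]
left_row = [0] * TRUE_D + right_row[:-TRUE_D]

full = best_disparity(left_row, right_row, x=20, d_min=0, d_max=15)
lo, hi = dynamic_range(prev_disparity=3, margin=1, d_max=15)
narrow = best_disparity(left_row, right_row, x=20, d_min=lo, d_max=hi)
print(full, narrow)
```

Both searches find the same disparity here, but the narrowed search evaluates 3 candidates instead of 16. The saving depends on the previous frame's disparity still being close to the current one, which is the temporal-consistency assumption the abstract refers to.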

  • A Robust Depth Image Based Rendering Scheme for Stereoscopic View Synthesis with Adaptive Domain Transform Based Filtering Framework

    Wei LIU  Yun Qi TANG  Jian Wei DING  Ming Yue CUI  

     
    PAPER-Image Processing and Video Processing

    Publicized: 2018/08/31  Vol: E101-D No:12  Page(s): 3138-3149

    Depth image based rendering (DIBR), which is utilized to render virtual views with a color image and the corresponding depth map, is one of the key procedures in the 2D to 3D conversion process. However, some troubling problems, such as depth edge misalignment, disocclusion occurrences and cracks at resampling, still exist in current DIBR systems. To solve these problems, we present a robust depth image based rendering scheme for stereoscopic view synthesis. The core of the proposed scheme consists of two depth map filters that share a common domain transform based filtering framework. As a first step, a filter of this framework is applied to realize texture-depth boundary alignment and directional disocclusion reduction smoothing simultaneously. Then, after depth map 3D warping, another adaptive filter is applied to the warped depth maps with delivered scene gradient structures to further diminish the remaining cracks and noise. Finally, with the optimized depth map of the virtual view, backward texture warping is adopted to retrieve the final texture virtual view. The proposed scheme yields visually satisfactory results for high quality 2D to 3D conversion. Experimental results demonstrate the excellent performance of the proposed approach.

  • New Context-Adaptive Arithmetic Coding Scheme for Lossless Bit Rate Reduction of Parametric Stereo in Enhanced aacPlus

    Hee-Suk PANG  Jun-seok LIM  Hyun-Young JIN  

     
    LETTER-Speech and Hearing

    Publicized: 2018/09/18  Vol: E101-D No:12  Page(s): 3258-3262

    We propose a new context-adaptive arithmetic coding (CAAC) scheme for lossless bit rate reduction of parametric stereo (PS) in enhanced aacPlus. Based on a probability analysis of the stereo parameter indices in PS, we propose a stereo band-dependent CAAC scheme for PS. We also propose a new coding structure for the scheme which is simple but effective. The proposed scheme has normal and memory-reduced versions, which are superior to the original and conventional schemes and guarantee significant bit rate reduction of PS. The proposed scheme can be an alternative to the original PS coding scheme at low bit rates, where coding efficiency is very important.

  • A Stereo Wind-Noise Suppressor with Null Beamforming and Frequency-Domain Noise Averaging

    Masanori KATO  Akihiko SUGIYAMA  Tatsuya KOMATSU  

     
    PAPER-Digital Signal Processing

    Vol: E101-A No:10  Page(s): 1631-1637

    This paper proposes a stereo wind-noise suppressor with frequency-domain noise averaging. A directional gain for diffuse wind noise is estimated frame by frame using a null beamformer based on interchannel phase difference which blocks the target signal. The wind-noise gain estimate is commonly multiplied by the input noisy signal to generate channel dependent wind noise estimates in order to cope with interchannel wind-noise imbalance. Interchannel phase agreement by target signal dominance or incidentally equal wind-noise phase, which leads to underestimation, is offset by averaging channel dependent wind-noise estimates along frequency. Evaluation results show that the mean PESQ score by the proposed wind-noise suppressor reaches 2.1 which is 0.2 higher than that by the wind-noise suppressor without averaging and 0.3 higher than that by a conventional monaural-noise suppressor with a statistically significant difference.

  • Stereophonic Music Separation Based on Non-Negative Tensor Factorization with Cepstral Distance Regularization

    Shogo SEKI  Tomoki TODA  Kazuya TAKEDA  

     
    PAPER-Engineering Acoustics

    Vol: E101-A No:7  Page(s): 1057-1064

    This paper proposes a semi-supervised source separation method for stereophonic music signals containing multiple recorded or processed signals, focusing on synthesized music. As synthesized music signals are often generated as linear combinations of many individual source signals and their respective mixing gains, phase or phase difference information between inter-channel signals, which represents spatial characteristics of recording environments, cannot be utilized as an acoustic cue for source separation. Non-negative Tensor Factorization (NTF) is an effective technique which can be used to resolve this problem by decomposing amplitude spectrograms of stereo channel music signals into basis vectors and activations of individual music source signals, along with their corresponding mixing gains. However, it is difficult to achieve sufficient separation performance using this method alone, as the acoustic cues available for separation are limited. To address this issue, this paper proposes a Cepstral Distance Regularization (CDR) method for NTF-based stereo channel separation, which involves making the cepstrum of the separated source signals follow Gaussian Mixture Models (GMMs) of the corresponding music source signal. These GMMs are trained in advance using available samples. Experimental evaluations separating three and four sound sources are conducted to investigate the effectiveness of the proposed method in both supervised and semi-supervised separation frameworks, and performance is also compared with that of a conventional NTF method. Experimental results demonstrate that the proposed method yields significant improvements within both separation frameworks, and that cepstral distance regularization provides better separation parameters.

  • Feature Ensemble Network with Occlusion Disambiguation for Accurate Patch-Based Stereo Matching

    Xiaoqing YE  Jiamao LI  Han WANG  Xiaolin ZHANG  

     
    LETTER-Image Recognition, Computer Vision

    Publicized: 2017/09/14  Vol: E100-D No:12  Page(s): 3077-3080

    Accurate stereo matching remains a challenging problem in the case of weakly-textured areas, discontinuities and occlusions. In this letter, a novel stereo matching method is presented that uses a feature ensemble network to compute the matching cost, an error detection network to predict outliers, and priority-based occlusion disambiguation for refinement. Experiments on the Middlebury benchmark demonstrate that the proposed method yields competitive results against state-of-the-art algorithms.

  • Generating Questions for Inquiry-Based Learning of History in Elementary Schools by Using Stereoscopic 3D Images (Open Access)

    Takashi SHIBATA  Kazunori SATO  Ryohei IKEJIRI  

     
    INVITED PAPER

    Vol: E100-C No:11  Page(s): 1012-1020

    We conducted experimental classes in an elementary school to examine how the advantages of using stereoscopic 3D images could be applied in education. More specifically, we selected a unit of the Tumulus period in Japan for sixth-graders as the source of our 3D educational materials. This unit represents part of the coursework for the topic of Japanese history. The educational materials used in our study included stereoscopic 3D images for examining the stone chambers and Haniwa (i.e., terracotta clay figures) of the Tumulus period. The results of our experimental class showed that 3D educational materials helped students focus on specific parts in images such as attached objects of the Haniwa and also understand 3D spaces and concavo-convex shapes. The experimental class revealed that 3D educational materials also helped students come up with novel questions regarding attached objects of the Haniwa, and Haniwa's spatial balance and spatial alignment. The results suggest that the educational use of stereoscopic 3D images is worthwhile in that they lead to question and hypothesis generation and an inquiry-based learning approach to history.

  • Ground Plane Detection with a New Local Disparity Texture Descriptor

    Kangru WANG  Lei QU  Lili CHEN  Jiamao LI  Yuzhang GU  Dongchen ZHU  Xiaolin ZHANG  

     
    LETTER-Pattern Recognition

    Publicized: 2017/06/27  Vol: E100-D No:10  Page(s): 2664-2668

    In this paper, a novel approach is proposed for stereo vision-based ground plane detection at the superpixel level, which is implemented by employing a Disparity Texture Map in a convolutional neural network architecture. In particular, the Disparity Texture Map is calculated with a new Local Disparity Texture Descriptor (LDTD). The experimental results demonstrate superior performance on the KITTI dataset.

  • Saliency-Guided Stereo Camera Control for Comfortable VR Explorations

    Yeo-Jin YOON  Jaechun NO  Soo-Mi CHOI  

     
    LETTER-Human-computer Interaction

    Publicized: 2017/06/01  Vol: E100-D No:9  Page(s): 2245-2248

    The quality of visual comfort and depth perception is a crucial requirement for virtual reality (VR) applications. This paper investigates major causes of visual discomfort and proposes a novel virtual camera controlling method using visual saliency to minimize visual discomfort. We extract the saliency of each scene and properly adjust the convergence plane to preserve realistic 3D effects. We also evaluate the effectiveness of our method on free-form architecture models. The results indicate that the proposed saliency-guided camera control is more comfortable than typical camera control and gives more realistic depth perception.

  • Sensor Fusion and Registration of Lidar and Stereo Camera without Calibration Objects

    Vijay JOHN  Qian LONG  Yuquan XU  Zheng LIU  Seiichi MITA  

     
    PAPER

    Vol: E100-A No:2  Page(s): 499-509

    Environment perception is an important task for intelligent vehicle applications. Typically, multiple sensors with different characteristics are employed to perceive the environment. To robustly perceive the environment, the information from the different sensors is often integrated or fused. In this article, we propose to perform the sensor fusion and registration of the LIDAR and stereo camera using the particle swarm optimization algorithm, without the aid of any external calibration objects. The proposed algorithm automatically calibrates the sensors and registers the LIDAR range image with the stereo depth image. The registered LIDAR range image functions as the disparity map for the stereo disparity estimation and results in an effective sensor fusion mechanism. Additionally, we perform image denoising using the modified non-local means filter on the input image during the stereo disparity estimation to improve the robustness, especially at night. To evaluate our proposed algorithm, the calibration and registration algorithm is compared with baseline algorithms on multiple datasets acquired under varying illumination. Compared to the baseline algorithms, our proposed algorithm demonstrates better accuracy. We also demonstrate that integrating the LIDAR range image within the stereo disparity estimation results in an improved disparity map with a significant reduction in computational complexity.

  • Auto-Radiometric Calibration in Photometric Stereo

    Wiennat MONGKULMANN  Takahiro OKABE  Yoichi SATO  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2016/09/01  Vol: E99-D No:12  Page(s): 3154-3164

    We propose a framework to perform auto-radiometric calibration in photometric stereo methods to estimate surface orientations of an object from a sequence of images taken using a radiometrically uncalibrated camera under varying illumination conditions. Our proposed framework allows the simultaneous estimation of surface normals and radiometric responses, and as a result can avoid cumbersome and time-consuming radiometric calibration. The key idea of our framework is to use the consistency between the irradiance values converted from pixel values by using the inverse response function and those computed from the surface normals. Consequently, a linear optimization problem is formulated to estimate the surface normals and the response function simultaneously. Finally, experiments on both synthetic and real images demonstrate that our framework enables photometric stereo methods to accurately estimate surface normals even when the images are captured using cameras with unknown and nonlinear response functions.

  • Stereo Matching Based on Efficient Image-Guided Cost Aggregation

    Yunlong ZHAN  Yuzhang GU  Xiaolin ZHANG  Lei QU  Jiatian PI  Xiaoxia HUANG  Yingguan WANG  Jufeng LUO  Yunzhou QIU  

     
    LETTER-Image Recognition, Computer Vision

    Publicized: 2015/12/09  Vol: E99-D No:3  Page(s): 781-784

    Cost aggregation is one of the most important steps in local stereo matching, but it is difficult to achieve both accuracy and speed. In this letter, a novel cost aggregation method, consisting of a guidance image, a fast aggregation function and simplified scan-line optimization, is developed. Experiments demonstrate that the proposed algorithm has competitive performance compared with state-of-the-art aggregation methods on 32 Middlebury stereo datasets in both accuracy and speed.

  • A Gaze-Reactive Display for Simulating Depth-of-Field of Eyes When Viewing Scenes with Multiple Depths

    Tatsuro ORIKASA  Takayuki OKATANI  

     
    PAPER-Computer Graphics

    Publicized: 2015/11/30  Vol: E99-D No:3  Page(s): 739-746

    The depth-of-field limitation of our eyes causes out-of-focus blur in the retinal images. The blur dynamically changes whenever we change our gaze and accordingly the scene point we are looking at changes its depth. This paper proposes an image display that reproduces retinal out-of-focus blur by using a stereoscopic display and eye trackers. Its purpose is to provide the viewer with more realistic visual experiences than conventional (stereoscopic) displays. Unlike previous similar systems that track only one of the viewer's eyes to estimate the gaze depth, the proposed system tracks both eyes individually using two eye trackers and estimates the gaze depth from the convergence angle calculated by triangulation. This provides several advantages over existing schemes, such as being able to deal with scenes having multiple depths. We describe detailed implementations of the proposed system and show the results of an experiment conducted to examine its effectiveness. In the experiment, creating a scene having two depths using two LCD displays together with a half mirror, we examined how difficult it is for viewers to distinguish between the real scene and its virtual reproduction created by the proposed display system. The results of the experiment show the effectiveness of the proposed approach.
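Estimating gaze depth from two tracked eyes comes down to intersecting two gaze rays. The sketch below shows one simple planar version of that triangulation. It is an illustration under stated assumptions, not the system described in the paper: both eyes lie on the x-axis, the angles are horizontal gaze angles measured from straight ahead, and the interpupillary distance and target position are hypothetical values.

```python
import math


def gaze_depth(ipd_m: float, left_angle_rad: float, right_angle_rad: float) -> float:
    """Depth of the fixation point from the two horizontal gaze angles.

    Eyes sit at x = -ipd/2 (left) and x = +ipd/2 (right), looking along +z.
    Angles are measured from the +z axis, positive toward +x, so that
    tan(left) - tan(right) = ipd / depth.
    """
    denom = math.tan(left_angle_rad) - math.tan(right_angle_rad)
    if denom <= 0:
        raise ValueError("gaze rays do not converge in front of the viewer")
    return ipd_m / denom


# Hypothetical viewer and target: 64 mm eye separation, target 0.8 m away
# and 0.1 m to the right of the midpoint between the eyes.
IPD = 0.064
target_x, target_z = 0.10, 0.80
left = math.atan2(target_x + IPD / 2, target_z)
right = math.atan2(target_x - IPD / 2, target_z)

depth = gaze_depth(IPD, left, right)
convergence_deg = math.degrees(left - right)
print(f"convergence angle = {convergence_deg:.2f} deg, depth = {depth:.3f} m")
```

Because the convergence angle shrinks quickly with distance, small angular errors from the eye trackers translate into large depth errors for far targets, so a real system would likely need filtering or calibration beyond this sketch.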

  • Multi-Phase Convex Lens Array for Directional Backlights to Improve Luminance Distribution of Autostereoscopic Display (Open Access)

    Shuta ISHIZUKA  Takuya MUKAI  Hideki KAKEYA  

     
    INVITED PAPER

    Vol: E98-C No:11  Page(s): 1023-1027

    We realize homogeneous luminance of the directional backlight for the time-division multiplexing autostereoscopic display using a convex lens array whose elemental lenses are placed with a different phase in each row. The validity of the proposed optical design is confirmed by a prototype system.

  • Phase-Based Window Matching with Geometric Correction for Multi-View Stereo

    Shuji SAKAI  Koichi ITO  Takafumi AOKI  Takafumi WATANABE  Hiroki UNTEN  

     
    PAPER-Image Recognition, Computer Vision

    Vol: E98-D No:10  Page(s): 1818-1828

    Window matching methods for estimating 3D points are among the most significant factors affecting the accuracy, robustness, and computational cost of Multi-View Stereo (MVS) algorithms. Most existing MVS algorithms employ window matching based on Normalized Cross-Correlation (NCC) to estimate the depth of a 3D point. NCC-based window matching estimates the displacement between matching windows with sub-pixel accuracy by linear/cubic interpolation, which does not represent accurate sub-pixel values of matching windows. This paper proposes a highly accurate window matching technique using Phase-Only Correlation (POC) with geometric correction for MVS. The accurate sub-pixel displacement between two matching windows can be estimated by fitting the analytical correlation peak model of the POC function. The proposed method also corrects the geometric transformations of matching windows by taking into consideration the 3D shape of a target object. The proposed geometric correction approach makes it possible to achieve accurate 3D reconstruction from multi-view images even for images with large transformations. In a set of experiments, the proposed method demonstrates more accurate 3D reconstruction from multi-view images than the conventional methods.
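Phase-only correlation normalises the cross spectrum of two signals to unit magnitude so that only the phase difference remains, which produces a sharp correlation peak at the displacement. The sketch below is a minimal 1D version in plain Python with a naive DFT. It only locates the integer peak; the sub-pixel peak-model fitting and the geometric correction described in the abstract are not implemented, and the test signal is synthetic.

```python
import cmath
import random


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (inverse includes the 1/n factor)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def poc(f, g):
    """Phase-only correlation of two equal-length real signals.

    If g is f circularly shifted by s samples, the result peaks at index s.
    """
    spec_f, spec_g = dft(f), dft(g)
    cross = [gk * fk.conjugate() for fk, gk in zip(spec_f, spec_g)]
    # Keep only the phase; guard against division by a vanishing magnitude.
    phase = [c / abs(c) if abs(c) > 1e-12 else 0j for c in cross]
    return [v.real for v in dft(phase, inverse=True)]


# Synthetic test: a random signal and a copy shifted by 5 samples.
N, SHIFT = 32, 5
rng = random.Random(0)
f = [rng.random() for _ in range(N)]
g = [f[(n - SHIFT) % N] for n in range(N)]

r = poc(f, g)
peak = max(range(N), key=lambda n: r[n])
print(f"estimated shift = {peak}")
```

For a pure circular shift the correlation is close to a single impulse, which is what makes the peak easy to localise. Real image windows are not circularly shifted copies, so practical POC implementations typically add windowing and spectral weighting before fitting the peak.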
