Yusuke HAYASHI Norihiko KAWAI Tomokazu SATO Miyuki OKUMOTO Naokazu YOKOYA
This paper proposes a novel approach to generating stereo video in which the zoom magnification is not constant. Although this has conventionally been achieved mechanically, that approach requires developing a mechanically complex system for each stereo camera system. Instead of a mechanical solution, we employ an approach from the software side: using a pair of zoomed and non-zoomed videos, a part of the non-zoomed video image is cut out and super-resolved to generate stereo video without special hardware. To achieve this, (1) the zoom magnification parameter is automatically determined by using distributions of intensities, and (2) the cutout image is super-resolved by using optically zoomed images as exemplars. The effectiveness of the proposed method is quantitatively and qualitatively validated through experiments.
Kenichiro FUKUSHI Itsuo KUMAZAWA
In this paper, we present a computer vision-based human tracking system with multiple stereo cameras. Many widely used methods, such as the KLT tracker, update the trackers “frame-to-frame,” so that features extracted from one frame are utilized to update their current state. In contrast, we propose a novel optimization technique for the “multi-frame” approach that computes resultant trajectories directly from video sequences, in order to achieve high-level robustness against severe occlusion, which is known to be a challenging problem in computer vision. We developed a heuristic optimization technique to estimate human trajectories, instead of using dynamic programming (DP) or an iterative approach, which makes our method sufficiently computationally efficient to operate in real time. Six video sequences in which one to six people walk in a narrow laboratory space are processed using our system. The results confirm that our system is capable of tracking cluttered scenes in which severe occlusion occurs and people are frequently in close proximity to each other. Moreover, tracking requires only minimal information, rather than full camera images, to be communicated over the network. Hence, commonly used network devices are sufficient for constructing our tracking system.
Jae-woong JEONG Young-cheol PARK Dae-hee YOUN
This paper presents an approximated virtual source imaging system based on crosstalk cancellation with a pair of closely spaced loudspeakers. Utilizing the frequency-dependent relative importance of sound localization cues, the proposed system provides separate approximations for the low- and high-frequency bands. Experimental results show that the system provides good approximations within ±55° in the stereo dipole setup with natural sound quality.
Jangwon LEE Kugjin YUN Doug Young SUH Kyuheon KIM
This letter proposes a new delivery format in order to realize unified transmissions of stereoscopic video contents over a dynamic adaptive streaming scheme. With the proposed delivery format, various forms of stereoscopic video contents regardless of their encoding and composition types can be delivered over the current dynamic adaptive streaming scheme. In addition, the proposed delivery format supports dynamic and efficient switching between 2D and 3D sequences in an interoperable manner for both 2D and 3D digital devices, regardless of their capabilities. This letter describes the designed delivery format and shows dynamic interoperable applications for 2D and 3D mixed contents with the implemented system in order to verify its features and efficiency.
Norimichi UKITA Kazuki MATSUDA
This paper proposes a method for reconstructing accurate 3D surface points. To this end, robust and dense reconstruction with Shape-from-Silhouettes (SfS) and accurate multiview stereo are integrated. Unlike the gradual shape shrinking and/or brute-force large-space search of existing space carving approaches, our method obtains 3D points by SfS and stereo independently, and then selects the correct ones from them. The point selection is achieved in accordance with the spatial consistency and smoothness of 3D point coordinates and normals. The globally optimized points are selected by graph cuts. Experimental results with several subjects containing complex shapes demonstrate that our method outperforms existing approaches and our previous method.
Histogram modification based image enhancement algorithms have been extensively used in 2-D image applications. In this letter, we apply a histogram modification framework to stereoscopic image enhancement. The proposed algorithm estimates the histogram of a stereo image pair without explicitly computing the pixel-wise disparity. Then, the histogram in the occluded regions is estimated and used to determine the target histogram of the stereo image. Experimental results demonstrate the effectiveness of the proposed algorithm.
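As a rough sketch of the enhancement idea above, the following illustrates mapping one view of a stereo pair toward a target histogram; the algorithm's occlusion-aware target estimation is not reproduced here, and all function names and pixel values are illustrative assumptions:

```python
# Minimal histogram-matching sketch (illustrative only; pure Python).

def histogram(pixels, levels=256):
    h = [0] * levels
    for p in pixels:
        h[p] += 1
    return h

def cdf(hist):
    # Cumulative distribution normalized to [0, 1].
    total, acc, out = sum(hist), 0, []
    for v in hist:
        acc += v
        out.append(acc / total)
    return out

def match_to_target(pixels, target_hist, levels=256):
    """Map intensities so the output histogram approximates target_hist."""
    src_cdf = cdf(histogram(pixels, levels))
    tgt_cdf = cdf(target_hist)
    # For each source level, pick the smallest target level whose CDF
    # reaches the source CDF value.
    lut = [next(i for i, v in enumerate(tgt_cdf) if v >= src_cdf[s])
           for s in range(levels)]
    return [lut[p] for p in pixels]

# A joint target built from both views keeps the pair consistent.
left = [10, 10, 20, 200]
right = [12, 10, 22, 198]
enhanced_left = match_to_target(left, histogram(left + right))
```

Matching both views to the same joint target histogram is one simple way to avoid introducing a brightness mismatch between the pair.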
Kei SADAKUNI Takuya INOUE Hirotsugu YAMAMOTO Shiro SUYAMA
Three methods of presenting a three-dimensional (3-D) image – a real object, a protruding stereoscopic display, and the depth-fused 3-D (DFD) display – have different tendencies for the change in perceived depth produced when the visual acuity of the dominant eye is decreased by an occlusion foil. These different tendencies are estimated from the slope and correlation coefficient of the plot of perceived depth difference versus stimulus depth difference. This estimation was derived using the same experimental setup, composed of two displays and a half mirror, for all three 3-D display methods. The perceived depth difference was measured for four subjects using calipers held with two fingers. The slope and correlation coefficient showed almost the same tendencies, as follows. The real object had the smallest decrease among the three 3-D display methods when the dominant eye's visual acuity was decreased, and the protruding stereoscopic display had the largest decrease. The DFD display method had an intermediate decrease between those of the real object and the protruding stereoscopic display. When the dominant eye's visual acuity was high enough, the differences among the three 3-D display methods were small. When its visual acuity was decreased, the differences among the three 3-D display methods increased and became statistically significant.
Our research is focused on examining a stereoscopic quality assessment model for stereoscopic images with disparate quality in the left and right images for glasses-free stereo vision. In this paper, we examine an objective assessment model of 3-D images, considering the difference in image quality between the viewpoints generated by disparity-compensated coding. The overall stereoscopic image quality can be estimated using only predicted values of the left and right 2-D image qualities based on MPEG-7 descriptor information, without using any disparity information. As a result, the stereoscopic still image quality is assessed with high prediction accuracy, with a correlation coefficient of 0.98 and an average error of 0.17.
Jegoon RYU Sei-ichiro KAMATA Alireza AHRARY
In this paper, we propose a novel gait recognition framework - Spherical Space Model with Human Point Clouds (SSM-HPC) - to recognize the front view of human gait. A new gait representation - Marching in Place (MIP) gait - is also introduced, which preserves the spatiotemporal characteristics of individual gait manner. In contrast to previous studies on gait recognition, which usually use human silhouette images from image sequences, this research applies three-dimensional (3D) point cloud data of the human body obtained from a stereo camera. The proposed framework exhibits gait recognition rates superior to those of other gait recognition methods.
Kazuki MATSUDA Norimichi UKITA
This paper proposes a method for reconstructing a smooth and accurate 3D surface. Recent machine vision techniques can reconstruct accurate 3D points and normals of an object. The reconstructed point cloud is used for generating its 3D surface by surface reconstruction. The more accurate the point cloud, the more correct the surface becomes. To improve the surface, we propose how to integrate the advantages of existing techniques for point reconstruction. Specifically, robust and dense reconstruction with Shape-from-Silhouettes (SfS) and accurate stereo reconstruction are integrated. Unlike the gradual shape shrinking of space carving, our method obtains 3D points by SfS and stereo independently and then accepts only the correctly reconstructed points. Experimental results show the improvement achieved by our method.
In this paper, we propose an optimized virtual re-convergence system designed to reduce the visual fatigue caused by binocular stereoscopy. Our unique idea for reducing visual fatigue is to utilize virtual re-convergence based on an optimized disparity map that contains more depth information in the negative disparity area than in the positive area. Accordingly, our system employs a unique search-range scheme, especially for negative disparity exploration. In addition, we use a dedicated method based on the so-called Global-Shift Value (GSV), the total shift value applied to each image in stereoscopy, to converge on the main object that most strongly affects visual fatigue. The experimental result, a subjective assessment by participants, shows that the proposed method makes stereoscopy significantly more comfortable and attractive to view than existing methods.
Nitin SINGHAL Jin Woo YOO Ho Yeol CHOI In Kyu PARK
In this paper, we analyze the key factors underlying the implementation, evaluation, and optimization of image processing and computer vision algorithms on an embedded GPU using the OpenGL ES 2.0 shader model. First, we present the characteristics of the embedded GPU and its inherent advantages compared to an embedded CPU. Additionally, we propose techniques to achieve increased performance with optimized shader design. To show the effectiveness of the proposed techniques, we employ cartoon-style non-photorealistic rendering (NPR), speeded-up robust feature (SURF) detection, and stereo matching as our example algorithms. Performance is evaluated in terms of the execution time and the speed-up achieved in comparison with the implementation on an embedded CPU.
Chenbo SHI Guijin WANG Xiaokang PEI Bei HE Xinggang LIN
In this paper, we propose an interleaving updating framework of disparity and confidence maps (IUFDCM) for stereo matching, which eliminates redundant and interfering information from unreliable pixels. Compared with other propagation algorithms that use matching costs as messages, IUFDCM instead updates the disparity map and the confidence map in an interleaving manner. Based on the Confidence-based Support Window (CSW), the disparity map is updated adaptively to alleviate the effect of input parameters. Unreliable pixels are reassigned based on reliable messages, which makes the result more likely to retain the ground-truth disparity. The confidence map is then updated according to the previous disparity map and the left-right consistency. The top ranks on the Middlebury benchmark at different error thresholds demonstrate that our algorithm is competitive with the best current stereo matching algorithms.
Chenbo SHI Guijin WANG Xiaokang PEI Bei HE Xinggang LIN
This paper addresses stereo matching under scenarios with smooth regions and obviously slanted planes. We explore the flexible handling of color disparity, spatial relations, and the reliability of matching pixels in support windows. Building upon these key ingredients, we present a robust stereo matching algorithm using local plane fitting with a Confidence-based Support Window (CSW). For each CSW, only those pixels with high confidence are employed to estimate the optimal disparity plane. Since RANSAC has been shown to be robust in suppressing the disturbance caused by outliers, we employ it to solve the local plane fitting problem. Compared with state-of-the-art local methods in the computer vision community, our approach achieves better performance and time efficiency on the Middlebury benchmark.
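The RANSAC-based local plane fitting above can be sketched as follows, assuming a disparity plane model d = ax + by + c estimated over high-confidence pixels only; the iteration count and inlier threshold are illustrative assumptions, not values from the paper:

```python
import random

def fit_plane(pts):
    """Exact plane d = a*x + b*y + c through three (x, y, d) points
    via Cramer's rule; returns None for degenerate (collinear) samples."""
    (x1, y1, d1), (x2, y2, d2), (x3, y3, d3) = pts
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-9:
        return None
    a = (d1 * (y2 - y3) - y1 * (d2 - d3) + (d2 * y3 - d3 * y2)) / det
    b = (x1 * (d2 - d3) - d1 * (x2 - x3) + (x2 * d3 - x3 * d2)) / det
    c = (x1 * (y2 * d3 - y3 * d2) - y1 * (x2 * d3 - x3 * d2)
         + d1 * (x2 * y3 - x3 * y2)) / det
    return a, b, c

def ransac_plane(points, iters=200, thresh=1.0, seed=0):
    """points: (x, y, disparity) triples from high-confidence pixels only."""
    rng = random.Random(seed)
    best, best_inliers = None, -1
    for _ in range(iters):
        model = fit_plane(rng.sample(points, 3))
        if model is None:
            continue
        a, b, c = model
        inliers = sum(1 for x, y, d in points
                      if abs(a * x + b * y + c - d) < thresh)
        if inliers > best_inliers:
            best, best_inliers = model, inliers
    return best
```

In practice the inlier threshold would be tied to the disparity precision of the underlying matcher.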
In this paper, we propose a quantitative metric for measuring the degree of visual fatigue in stereoscopy. To the best of our knowledge, this is the first simplified relative quantitative approach describing the visual fatigue value of stereoscopy. Our experimental result shows that a correlation index of more than 98% is obtained between our Simplified Relative Visual Fatigue (SRVF) model and the Mean Opinion Score (MOS).
In this paper, we deal with the pedestrian detection task in outdoor scenes. Because of the complexity of such scenes, generally used gradient-feature-based detectors do not work well on them. We propose using sparse 3D depth information as an additional cue for the detection task, in order to achieve a fast improvement in performance. Our proposed method uses a probabilistic model to integrate image-feature-based classification with sparse depth estimation. Benefiting from the depth estimates, we map the prior distribution of humans' actual height onto the image and probabilistically update the image-feature-based classification result. We make two contributions in this paper: 1) a simplified graphical model that can efficiently integrate the depth cue into detection; and 2) a sparse depth estimation method that provides fast and reliable estimates of depth information. An experiment shows that our method provides a promising enhancement over the baseline detector with minimal additional time.
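The height-prior update can be sketched roughly as follows, assuming a pinhole camera and a Gaussian prior on human height; the mean, standard deviation, and multiplicative fusion rule are illustrative assumptions, not the paper's exact graphical model:

```python
import math

# Illustrative prior on human height (metres); not values from the paper.
MEAN_HEIGHT_M, STD_HEIGHT_M = 1.7, 0.1

def physical_height(box_height_px, depth_m, focal_px):
    # Pinhole model: pixel height scaled by depth over focal length.
    return box_height_px * depth_m / focal_px

def height_likelihood(h_m):
    # Unnormalized Gaussian likelihood of the implied physical height.
    z = (h_m - MEAN_HEIGHT_M) / STD_HEIGHT_M
    return math.exp(-0.5 * z * z)

def fused_score(det_score, box_height_px, depth_m, focal_px):
    """Down-weight detections whose implied physical height is implausible."""
    return det_score * height_likelihood(
        physical_height(box_height_px, depth_m, focal_px))
```

A detection whose bounding box implies a plausible height keeps its score, while a box implying, say, a 0.6 m "pedestrian" is suppressed almost entirely.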
Lili MENG Yao ZHAO Anhong WANG Jeng-Shyang PAN Huihui BAI
A stereo video coding scheme compatible with mono-view processors is presented in this paper. In addition, this paper proposes an adaptive prediction structure that allows different prediction modes to be applied to different groups of pictures (GOPs) according to temporal and inter-view correlations, improving the coding efficiency. Moreover, the advanced video coding standard H.264 is conveniently used to maximize the coding efficiency. Finally, the effectiveness of the proposed scheme is verified by extensive experimental results.
Ryo NAKASHIMA Kei UTSUGI Keita TAKAHASHI Takeshi NAEMURA
We propose a new stereo image retargeting method based on the framework of shift-map image editing. Retargeting is the process of changing the image size according to the target display while preserving as much of the richness of the image as possible, and is often applied to monocular images and videos. Retargeting stereo images poses a new challenge because pixel correspondences between the stereo pair should be preserved to keep the scene's structure. The main contribution of this paper is integrating a stereo correspondence constraint into the retargeting process. Among several retargeting methods, we adopt shift-map image editing because this framework can be extended naturally to stereo images, as we show in this paper. We confirmed the effectiveness of our method through experiments.
Trung Thanh NGO Yuichiro KOJIMA Hajime NAGAHARA Ryusuke SAGAWA Yasuhiro MUKAIGAWA Masahiko YACHIDA Yasushi YAGI
For fast egomotion of a camera, computing feature correspondence and motion parameters by global search becomes highly time-consuming. Therefore, the complexity of the estimation needs to be reduced for real-time applications. In this paper, we propose a compound omnidirectional vision sensor and an algorithm for estimating its fast egomotion. The proposed sensor has both multiple baselines and a large field of view (FOV). Our method uses the multi-baseline stereo vision capability to classify feature points as near or far features. After the classification, we can estimate the camera rotation and translation separately by using random sample consensus (RANSAC) to reduce the computational complexity. The large FOV also improves the robustness, since the translation and rotation are clearly distinguished. To date, there has been no work on combining multi-baseline stereo with large-FOV characteristics for estimation, even though these characteristics are individually important in improving egomotion estimation. Experiments showed that the proposed method is robust and produces reasonable accuracy in real time for fast motion of the sensor.
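The near/far decomposition can be sketched in a planar (yaw plus 2-D translation) simplification; the disparity threshold, the one-parameter rotation, the motion convention, and the RANSAC settings below are all illustrative assumptions rather than the paper's full formulation:

```python
import math
import random

def classify(features, near_disparity=5.0):
    """Split stereo features into near/far by disparity (illustrative threshold)."""
    near = [f for f in features if f["disparity"] >= near_disparity]
    far = [f for f in features if f["disparity"] < near_disparity]
    return near, far

def estimate_rotation(far, iters=50, thresh=0.01, seed=0):
    """Far features move almost purely with camera rotation, so RANSAC on
    the per-feature bearing-angle change recovers the yaw."""
    rng = random.Random(seed)
    deltas = [f["angle1"] - f["angle0"] for f in far]
    best, best_count = 0.0, -1
    for _ in range(iters):
        cand = rng.choice(deltas)  # one-point hypothesis
        inliers = [d for d in deltas if abs(d - cand) < thresh]
        if len(inliers) > best_count:
            best, best_count = sum(inliers) / len(inliers), len(inliers)
    return best

def estimate_translation(near, yaw):
    """Assumed motion model p0 = R(yaw) @ p1 + t: with yaw fixed by the
    far features, the translation follows from the near features alone."""
    c, s = math.cos(yaw), math.sin(yaw)
    tx = ty = 0.0
    for f in near:
        x1, y1 = f["p1"]
        tx += f["p0"][0] - (c * x1 - s * y1)
        ty += f["p0"][1] - (s * x1 + c * y1)
    return tx / len(near), ty / len(near)
```

Splitting the problem this way shrinks each RANSAC search space, which is the source of the computational saving the abstract describes.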
Young Han LEE Deok Su KIM Hong Kook KIM Jongmo SUNG Mi Suk LEE Hyun Joo BAE
In this paper, we propose a bandwidth-scalable stereo audio coding method based on a layered structure. The proposed stereo coding method encodes super-wideband (SWB) stereo signals and is able to decode either wideband (WB) stereo signals or SWB stereo signals, depending on the network congestion. The performance of the proposed stereo coding method is then compared with that of a conventional stereo coding method that separately decodes WB or SWB stereo signals, in terms of subjective quality, algorithmic delay, and computational complexity. Experimental results show that when stereo audio signals sampled at a rate of 32 kHz are compressed to 64 kbit/s, the proposed method provides significantly better audio quality with a 64-sample shorter algorithmic delay, and comparable computational complexity.