Object contour detection is the task of extracting the shape formed by the boundaries between objects in an image. Conventional methods limit the detection targets to specific categories, or misdetect edges of patterns inside an object. We propose a new method to represent a contour image where the pixel value is the distance to the boundary. Contour detection becomes a regression problem that estimates this contour image. A deep convolutional network for contour estimation is combined with stereo vision to detect unspecified object contours. Furthermore, thanks to similar inference targets and a common network structure, we propose a network that simultaneously estimates both contour and disparity with fully shared weights. In experiments, the multi-tasking network drew a good precision-recall curve, and the F-measure was about 0.833 for the FlyingThings3D dataset. The L1 loss of disparity estimation for the dataset was 2.571. This network reduces the amount of calculation and memory capacity by half, and the accuracy drop compared to the dedicated networks is slight. We then quantize both weights and activations of the network to 3 bits. We devise a dedicated hardware architecture for the quantized CNN and implement it on an FPGA. This circuit uses only internal memory to perform forward propagation calculations, which eliminates high-power external memory accesses. The circuit is a stall-free pixel-by-pixel pipeline, and performs convolution calculations for 8 rows, 16 input channels, 16 output channels, and 3 by 3 pixels in parallel. The convolution calculation performance at the operating frequency of 250 MHz is about 9 TOPS.
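The quoted throughput can be checked with a short calculation. The sketch below assumes that each multiply-accumulate counts as two operations (a common convention that the abstract does not state); the function name and parameters are illustrative only.

```python
def peak_ops_per_second(rows, in_ch, out_ch, kernel_h, kernel_w,
                        clock_hz, ops_per_mac=2):
    """Peak throughput of a fully parallel convolution array.

    ops_per_mac=2 is an assumption: one multiply plus one add per MAC.
    """
    macs_per_cycle = rows * in_ch * out_ch * kernel_h * kernel_w
    return macs_per_cycle * ops_per_mac * clock_hz


# Parallelism figures taken from the abstract: 8 rows, 16 input channels,
# 16 output channels, 3 by 3 kernel, 250 MHz clock.
ops = peak_ops_per_second(8, 16, 16, 3, 3, 250_000_000)
tops = ops / 1e12
print(f"{tops:.3f} TOPS")
```

Under this assumption the array performs 18,432 multiply-accumulates per cycle, which is consistent with the roughly 9 TOPS stated above.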
Gou HOUBEN Shu FUJITA Keita TAKAHASHI Toshiaki FUJII
Depth (disparity) estimation from a light field (a set of dense multi-view images) is currently attracting much research interest. This paper focuses on how to handle a noisy light field for disparity estimation, because if left as it is, the noise deteriorates the accuracy of estimated disparity maps. Several researchers have worked on this problem, e.g., by introducing disparity cues that are robust to noise. However, it is not easy to break the trade-off between the accuracy and computational speed. To tackle this trade-off, we have integrated a fast denoising scheme in a fast disparity estimation framework that works in the epipolar plane image (EPI) domain. Specifically, we found that a simple 1-D slanted filter is very effective for reducing noise while preserving the underlying structure in an EPI. Moreover, this simple filtering does not require elaborate parameter configurations in accordance with the target noise level. Experimental results including real-world inputs show that our method can achieve good accuracy with much less computational time compared to some state-of-the-art methods.
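The idea of a 1-D slanted filter on an EPI can be illustrated with a minimal sketch. It assumes the EPI is stored as one row per view, that a scene point moves by `slope` pixels per view, and that the filter is a plain average along that slanted line; the abstract does not give the actual filter, so these choices and all names are hypothetical.

```python
def slanted_filter(epi, slope, center=None):
    """Average an EPI along lines of a given slope (pixels per view).

    epi: list of rows, one row per view, each a list of intensities.
    Returns the filtered row for the center view.
    """
    n_views = len(epi)
    width = len(epi[0])
    if center is None:
        center = n_views // 2
    filtered = []
    for x in range(width):
        total, count = 0.0, 0
        for s in range(n_views):
            # Position of the same scene point in view s (nearest pixel).
            xs = x + round(slope * (s - center))
            if 0 <= xs < width:
                total += epi[s][xs]
                count += 1
        # count >= 1 because the center view always contributes.
        filtered.append(total / count)
    return filtered


# Toy EPI: a step edge that shifts by one pixel per view (5 views, 8 pixels).
clean = [[1.0 if x >= 4 + (s - 2) else 0.0 for x in range(8)] for s in range(5)]

along_structure = slanted_filter(clean, slope=1.0)  # follows the edge
across_structure = slanted_filter(clean, slope=0.0)  # ignores the slant
print(along_structure)
print(across_structure)
```

Averaging along the correct slope leaves the edge intact while averaging several views per pixel, whereas a filter with the wrong slope blurs the edge.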
Ming LI Li SHI Xudong CHEN Sidan DU Yang LI
The large computational complexity makes stereo matching a big challenge in real-time application scenarios. The problem of stereo matching in a video sequence is slightly different from that in a still image because there exists temporal correlation among video frames. However, no existing method has considered the temporal consistency of disparity for algorithm acceleration. In this work, we propose a scheme called the dynamic disparity range (DDR) to optimize the matching cost calculation and cost aggregation steps by narrowing the disparity search range, and a scheme called the temporal cost aggregation path to optimize the cost aggregation step. Based on these schemes, we propose the DDR-SGM and DDR-MCCNN algorithms for stereo matching in video sequences. Evaluation results show that the proposed algorithms significantly reduce the computational complexity with only a very slight loss of accuracy. We show that the proposed optimizations for stereo matching are effective and that the temporal consistency in stereo video is highly useful for either improving accuracy or reducing computational complexity.
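How narrowing the disparity search range saves work can be sketched on a single scanline. The example below is not the DDR-SGM or DDR-MCCNN algorithm: it assumes a per-pixel absolute-difference cost, winner-takes-all selection, and a search window of `margin` disparities around the previous frame's estimate, all of which are illustrative choices.

```python
def match_scanline(left, right, max_disp, prev_disp=None, margin=1):
    """Winner-takes-all matching on one scanline.

    If prev_disp is given, each pixel searches only within `margin`
    of the disparity found in the previous frame (assumed scheme).
    Returns (disparities, number of cost evaluations).
    """
    disparities = []
    evaluated = 0
    for x in range(len(left)):
        if prev_disp is None:
            lo, hi = 0, max_disp
        else:
            lo = max(0, prev_disp[x] - margin)
            hi = min(max_disp, prev_disp[x] + margin)
        hi = min(hi, x)   # right[x - d] must stay inside the image
        lo = min(lo, hi)  # keep the range non-empty
        best_d, best_cost = lo, float("inf")
        for d in range(lo, hi + 1):
            cost = abs(left[x] - right[x - d])
            evaluated += 1
            if cost < best_cost:
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities, evaluated


right = [10, 20, 30, 40, 50, 60, 70, 80]
left = [1, 2, 10, 20, 30, 40, 50, 60]  # right shifted by two pixels

full, full_evals = match_scanline(left, right, max_disp=4)
# Next frame of a static scene: reuse the previous result to narrow the search.
narrow, narrow_evals = match_scanline(left, right, max_disp=4, prev_disp=full)
print(full, full_evals)
print(narrow, narrow_evals)
```

In this toy case the narrowed search returns the same disparities with fewer cost evaluations; the saving grows with the size of the full disparity range.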
Kangru WANG Lei QU Lili CHEN Jiamao LI Yuzhang GU Dongchen ZHU Xiaolin ZHANG
In this paper, a novel approach is proposed for stereo vision-based ground plane detection at the superpixel level, which is implemented by employing a Disparity Texture Map in a convolutional neural network architecture. In particular, the Disparity Texture Map is calculated with a new Local Disparity Texture Descriptor (LDTD). The experimental results demonstrate the superior performance of our approach on the KITTI dataset.
Takahiro SUZUKI Keita TAKAHASHI Toshiaki FUJII
Structure tensor analysis on epipolar plane images (EPIs) is a successful approach to estimating disparity from a light field, i.e., a dense set of multi-view images. However, the disparity range allowable for the light field is limited because the estimation becomes less accurate as the range of disparities becomes larger. To overcome this limitation, we developed a new method called sheared EPI analysis, where EPIs are sheared before the structure tensor analysis. The results of analysis obtained with different shear values are integrated into a final disparity map through a smoothing process, which is the key idea of our method. In this paper, we closely investigate the performance of sheared EPI analysis and demonstrate the effectiveness of the smoothing process by extensively evaluating the proposed method with 15 datasets that have large disparity ranges.
Duck-Ho BAE Jong-Min LEE Sang-Wook KIM Youngjoon WON Yongsu PARK
A burst of social network services increases the need for in-depth analysis of network activities. Privacy breaches for network participants are a concern in such analysis efforts. This paper investigates the structural and property changes caused by several privacy-preserving (anonymization) methods for social networks. The anonymized social network does not follow the power law for node degree distribution as the original network does. The peak hop for node connectivity increases by at most 1, and the clustering coefficient of neighbor nodes shows a 6.5-fold increase after anonymization. Thus, we observe inconsistency among privacy-preserving methods in social network analysis.
In this paper, we propose an optimized virtual re-convergence system designed to reduce the visual fatigue caused by binocular stereoscopy. Our idea for reducing visual fatigue is to apply virtual re-convergence based on an optimized disparity map that contains more depth information in the negative disparity area than in the positive area. To this end, our system employs a unique search-range scheme, especially for negative disparity exploration. In addition, we use a dedicated method based on the so-called Global-Shift Value (GSV), which is the total shift value of each image in the stereo pair, to converge on the main object, which affects visual fatigue the most. The experimental result, a subjective assessment by participants, shows that the proposed method makes stereoscopy significantly more comfortable and attractive to view than existing methods.
Duhwan JO Sumi HELAL Eunsam KIM Wonjun LEE Choonhwa LEE
This paper presents novel hybrid push-pull protocols for peer-to-peer video streaming. Our approaches intend to reap the best of push- and pull-based schemes by adaptively switching back and forth between the two modes according to video chunk distributions. The efficacy of the proposed protocols is validated through an evaluation study that demonstrates substantial performance gains.
Dong-Hoon HAN Yung-Ki LEE Yung-Lyul LEE
Since multiview video coding (MVC) based on H.264/AVC uses a prediction scheme exploiting the inter-view correlation among multiview video, an MVC encoder compresses multiple views more efficiently than a simulcast H.264/AVC encoder. However, as the number of views to be encoded increases in MVC, the total encoding time increases greatly. To reduce the computational complexity of MVC, a fast mode decision using both macroblock-based region segmentation information and the global disparity vector among views is proposed. The proposed method achieves, on average, a 1.5 to 2.9 reduction of the total encoding time with a PSNR (Peak Signal-to-Noise Ratio) degradation of about 0.05 dB.
Ali M. FOTOUHI Abolghasem A. RAIE
In this paper, a new two-stage local matching algorithm for estimating a dense disparity map in stereo vision is presented. In the first stage, the search space is reduced with high efficiency, i.e., a remarkable decrease in the average number of candidates per pixel, at low computational cost and with high assurance of retaining the correct answer. This outcome is due to the effective use of multiple radial windows, intensity information, and some usual and new constraints: the stage retains those candidates which satisfy more constraints and, in particular, are more likely to satisfy the assumption implied in using support windows, i.e., the disparity consistency of the window pixels. The output of the first stage speeds up the final selection of disparity in the second stage because the search space is reduced, and it also promises a more accurate result because the candidates are more reliable. In the second stage, the weighted window, although not necessarily the only choice, is employed and examined. The experimental results on the standard stereo benchmarks for the developed algorithm are presented, confirming that the massive computation needed to obtain more precise matching costs in the weighted window is reduced to about 1/11 and that the final disparity map is also improved.
Yuu TANAKA Atsushi YAMASHITA Toru KANEKO Kenjiro T. MIURA
In this paper, we propose a new method that can remove view-disturbing noises from stereo images. One of the thorny problems in outdoor surveillance by a camera is that adherent noises, such as waterdrops on the surface of the lens-protecting glass, disturb the view from the camera. Therefore, we propose a method for removing adherent noises from stereo images taken with a stereo camera system. Our method is based on stereo measurement and utilizes the disparities between the images of a stereo pair. Positions of noises in images can be detected by comparing disparities measured from stereo images with the distance between the stereo camera system and the glass surface. True disparities of image regions hidden by noises can be estimated from the property that disparities are generally similar to those around the noises. Finally, we can remove noises from images by replacing the above regions with textures of the corresponding image regions obtained by referring to the disparities. Experimental results show the effectiveness of the proposed method.
This paper presents an approach that uses the Viterbi algorithm in a stereo correspondence problem. We propose a matching process which is visualized as a trellis diagram to find the maximum a posteriori result. The matching process is divided into two parts: matching the left scene to the right scene and matching the right scene to the left scene. The final result of the stereo problem is selected based on the minimum error for uniqueness by comparing the results of the two parts of the matching process. This makes stereo matching possible without explicitly detecting occlusions. Moreover, this stereo matching algorithm can improve the accuracy of the disparity image, and it has an acceptable running time for practical applications since it uses a trellis diagram iteratively and bi-directionally. The complexity of our proposed method is approximately O(N²P), in which N is the number of disparities and P is the length of the epipolar line in both the left and right images. Our proposed method has proven to be robust when applied to well-known samples of stereo images such as random dot, Pentagon, and Tsukuba images. It provides an accuracy of 95.7 percent within radius 1 (differing by 1) for the Tsukuba images.
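A trellis search of this kind can be sketched as follows for one matching direction. The cost terms are assumptions made for illustration (absolute intensity difference as the data cost and a linear penalty `smooth` on disparity changes); the abstract does not specify them, and the bidirectional comparison step is omitted.

```python
def viterbi_scanline(left, right, max_disp, smooth=1.0):
    """Minimum-cost disparity path along one scanline.

    States are disparities 0..max_disp; the two nested loops over states
    at each of the P pixels give the O(N^2 P) cost noted in the text.
    """
    n = max_disp + 1
    width = len(left)
    inf = float("inf")

    def data_cost(x, d):
        # Disparities that point outside the right image are forbidden.
        return abs(left[x] - right[x - d]) if x - d >= 0 else inf

    cost = [data_cost(0, d) for d in range(n)]
    back = []  # back[x - 1][d] = best predecessor state of d at pixel x
    for x in range(1, width):
        new_cost, pointers = [], []
        for d in range(n):
            best_prev = min(range(n),
                            key=lambda p: cost[p] + smooth * abs(d - p))
            new_cost.append(cost[best_prev] + smooth * abs(d - best_prev)
                            + data_cost(x, d))
            pointers.append(best_prev)
        cost = new_cost
        back.append(pointers)

    # Backtrack from the cheapest final state.
    d = min(range(n), key=lambda p: cost[p])
    path = [d]
    for pointers in reversed(back):
        d = pointers[d]
        path.append(d)
    path.reverse()
    return path


right = [10, 20, 30, 40, 50, 60, 70, 80]
left = [1, 2, 10, 20, 30, 40, 50, 60]  # right shifted by two pixels
path = viterbi_scanline(left, right, max_disp=3)
print(path)
```

Running the same search with the roles of the two images exchanged and comparing both results would correspond to the two-part matching described above.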
Dae-Hyun KIM Jung-Hoon KIM Yong-In YOON In-Hwan OH Jong-Soo CHOI
In this paper, we propose an algorithm that automatically generates intermediate scenes using bidirectional disparity morphing (BDM) from parallel stereo images. A two-step search strategy is used to speed up the computation of the bidirectional disparity map, and three occluding patterns are used to smooth the computed disparities more elaborately. Using the bidirectional disparity map, we interpolate the left and the right images to their intermediate scenes. Then we dissolve the two interpolated images into the desired intermediate scene, in which the holes are removed and the effect of the disparity estimation errors is minimized. We implemented the proposed algorithm on a TM1300 supported by TriMedia using pSOSystem, which enables multiprocessing. As a result, we can interpolate high-quality intermediate scenes in real time.
Payman MOALLEM Karim FAEZ Javad HADDADNIA
Finding corresponding edges is considered to be the most difficult part of edge-based stereo matching algorithms. Usually, the correspondence for a feature point in the first image is obtained by searching in a predefined region of the second image, based on the epipolar line and the maximum disparity. Reducing the search region can improve the performance of the matching process in terms of execution time and accuracy. Traditionally, hierarchical multiresolution techniques, as the fastest methods, are used to decrease the search space and therefore increase the processing speed. Considering the maximum of the directional derivative of disparity in real scenes, we formulated some relations between the maximum search space in the second image and the relative displacement of connected edges (as the feature points) in successive scan lines of the first image. Then we proposed a new matching strategy to reduce the search space for edge-based stereo matching algorithms. Afterward, we developed some fast stereo matching algorithms based on the proposed matching strategy and the hierarchical multiresolution techniques. The proposed algorithms have two stages: feature extraction and feature matching. We applied these new algorithms to some stereo images and compared their results with those of some hierarchical multiresolution ones. The execution times of our proposed methods are decreased by 30% to 55% in the feature matching stage. Moreover, the execution time of the overall algorithms (including the feature extraction and the feature matching) is decreased by 15% to 40% in real scenes. In some cases, the accuracy is increased as well. Theoretical investigation and experimental results show that our algorithms perform very well on real complex scenes; therefore, these new algorithms are very suitable for fast edge-based stereo applications in real scenes, such as robotic applications.
Chiho LEE Gwangzeen KO Kiseon KIM
In this paper, we propose an activity-based estimation scheme to determine the received signal power disparity, which enhances the BER performance of the SIC scheme in a DS/CDMA system considering a practical voice activity factor, and compare its BER performance with those of other schemes with or without estimation. Numerical analysis results show that the SIC scheme with the proposed activity-based estimation improves the BER performance compared with that without considering voice activity, and that it approaches that of the ideal estimation as the total number of concurrent users increases. In addition, the higher the maximum attainable SNR, the better the BER performance of the proposed activity-based estimation scheme.
We present a new basis for the discrete representation of stereo correspondence. This center-referenced basis permits a more natural, complete and concise representation of constraints in stereo matching. In this context, a MAP formulation for disparity estimation is derived and reduced to the unconstrained minimization of an energy function. By incorporating natural constraints, the problem is simplified to a shortest path problem in a sparsely connected trellis structure, which is solved by an efficient dynamic programming algorithm. The computational complexity is the same as the best of other dynamic programming methods, but a very high degree of concurrency is possible in the algorithm, making it suitable for implementation with parallel processors. Experimental results confirm the performance of this method, and matching errors are found to degrade gracefully in exponential form with respect to noise.
Toshiaki SUGIHARA Tsutomu MIYASATO Ryohei NAKATSU
In this paper, we describe an experimental evaluation of visual fatigue in a binocular disparity type 3-D display system. To evaluate this fatigue, we use a subjective assessment method and focus on the mismatch between convergence and accommodation, which is a major weakness of binocular disparity 3-D displays. For this subjective assessment, we use a newly developed binocular disparity 3-D display system with a compensation function for accommodation. Because this equipment allows us to compare conditions in terms of the mismatch itself, the evaluation is more accurate than in similar previous works.
Sang Hwa LEE Jong-Il PARK Seiki INOUE Choong Woong LEE
In this paper, a general formula for disparity estimation based on the Bayesian Maximum A Posteriori (MAP) algorithm is derived and implemented with simplified probabilistic models. The formula is a generalized probabilistic diffusion equation based on the Bayesian model, and can be implemented in different forms corresponding to the probabilistic models in the disparity neighborhood system or configuration. The probabilistic models are independence and similarity among the neighboring disparities in the configuration. The independence model guarantees the discontinuity at object boundary regions, and the similarity model guarantees the continuity or high correlation of the disparity distribution. According to the experimental results, the proposed algorithm has good estimation performance. This result shows that the derived formula generalizes probabilistic diffusion based on the Bayesian MAP algorithm for disparity estimation. Also, the proposed probabilistic models are reasonable and approximate the pure joint probability distribution very well while decreasing the computations to O(n()) from O(n()4) of the generalized formula.
This paper proposes an object-oriented face region detection and tracking method using range and color information. Range segmentation of the objects from a complicated background is obtained using a disparity histogram (DH). The facial regions among the range-segmented objects are detected using a skin-color transform technique that provides a gray-level image in which facial regions are enhanced. A computationally efficient matching pixel count (MPC) disparity measure is introduced to enhance the matching accuracy by removing the effect of unexpected noise in the boundary region. Redundant operations inherent in the area-based matching operation are removed to enhance the processing speed. For the skin-color transformation, the generalized facial color distribution (GFCD) is modeled by a 2D Gaussian function in a normalized color space. The concept of a disparity difference histogram (DDH) from two consecutive frames is introduced to estimate the range information effectively. Detailed geometrical analysis provides the exact variation of the range information of a moving object. The experimental results show that the proposed algorithm works well in various environments, at a rate of 1 frame per second with 512 × 480 resolution on a general-purpose workstation.
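A matching pixel count measure can be sketched on one scanline as follows. The sketch assumes the usual form of MPC, in which a pixel pair counts as matching when its intensity difference is within a threshold and the disparity with the highest count in the window wins; the window size, threshold, and function names are illustrative and not taken from the paper.

```python
def mpc(left, right, x, d, half_window=1, threshold=4):
    """Count window pixels whose left/right intensities agree at disparity d."""
    count = 0
    for i in range(x - half_window, x + half_window + 1):
        j = i - d  # corresponding position in the right scanline
        if 0 <= i < len(left) and 0 <= j < len(right):
            if abs(left[i] - right[j]) <= threshold:
                count += 1
    return count


def best_disparity(left, right, x, max_disp, **kwargs):
    """Disparity with the largest matching pixel count (first one on ties)."""
    return max(range(max_disp + 1),
               key=lambda d: mpc(left, right, x, d, **kwargs))


right = [10, 20, 30, 40, 50, 60, 70, 80]
# Right shifted by two pixels, with one corrupted pixel at index 4.
left = [1, 2, 10, 20, 200, 40, 50, 60]

print(mpc(left, right, x=4, d=2))
print(best_disparity(left, right, x=4, max_disp=3))
```

Because each pixel contributes at most one count, a single corrupted pixel cannot dominate the measure the way it can in a sum of absolute differences.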
This paper presents a technique for disparity selection in the context of binocular pursuit. For vergence control in binocular pursuit, it is a crucial problem to find the disparity which corresponds to the target among multiple disparities generally observed in a scene. To solve the problem of the selection, we propose an approach based on histogramming the disparities obtained in the scene. Here we use an extended phase-based disparity estimation algorithm. The idea is to slice the scene using the disparity histogram so that only the target remains. The slice is chosen around a peak in the histogram using prediction of the target disparity and target location obtained by back projection. The tracking of the peak enables robustness against other, possibly dominant, objects in the scene. The approach is investigated through experiments and shown to work appropriately.
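The histogram-based selection can be sketched as below. The sketch assumes integer disparities, a fixed search radius around the predicted target disparity, and a fixed slice tolerance; these parameters and the function names are hypothetical, and the phase-based disparity estimation itself is not shown.

```python
from collections import Counter


def select_target_disparity(disparities, predicted, search_radius=2):
    """Pick the histogram peak closest to the predicted target disparity.

    Only bins within search_radius of the prediction are considered, so a
    larger peak elsewhere (e.g. a dominant background) is ignored.
    """
    hist = Counter(disparities)
    candidates = [d for d in hist if abs(d - predicted) <= search_radius]
    if not candidates:
        return None
    # Highest bin wins; ties go to the bin nearest the prediction.
    return max(candidates, key=lambda d: (hist[d], -abs(d - predicted)))


def slice_mask(disparities, target, tolerance=1):
    """Mark the pixels that fall inside the disparity slice around the target."""
    return [abs(d - target) <= tolerance for d in disparities]


# Background at disparity 10 dominates; the target sits near disparity 3.
scene = [10] * 6 + [3] * 3 + [4] + [10] * 2
target = select_target_disparity(scene, predicted=4)
mask = slice_mask(scene, target)
print(target)
print(mask)
```

Restricting the peak search to the neighborhood of the prediction is one simple way to keep the tracker on the target even when another object produces the largest peak in the histogram.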