This paper surveys the recent literature on occlusion handling in online visual tracking. The discussion first reviews the visual tracking field and pinpoints why the occlusion problem requires dedicated attention. The findings suggest that although occlusion detection markedly improves tracking, it has been largely neglected. The literature further shows that most research is concentrated on human tracking and crowd analysis. This is followed by a novel taxonomy of occlusion types and of the challenges that arise during and after an occlusion. The discussion then investigates approaches that handle occlusion on a frame-by-frame basis. The analysis reveals that researchers have examined every aspect of tracker design hypothesized to benefit robust tracking under occlusion. State-of-the-art solutions identified in the literature involve various camera settings, simplifying assumptions, appearance and motion models, target state representations, and observation models. The identified clusters are then analyzed and discussed, and their merits and demerits are explained. Finally, promising areas for future research are presented.
Dang Ngoc Hai NGUYEN NamUk KIM Yung-Lyul LEE
A new method for video frame rate up-conversion (FRUC) is presented that combines a median filter and motion estimation (ME) with an occlusion detection (OD) method. First, ME is performed to obtain motion vectors (MVs). Then, the OD method is used to refine the MVs in occlusion regions. When occlusion occurs, median filtering is applied; otherwise, bidirectional motion-compensated interpolation (BDMC) is applied to create the interpolated frames. The experimental results show that the proposed algorithm outperforms the conventional approach. The average PSNR (Peak Signal-to-Noise Ratio) gain is consistently higher than that of the other methods on the Full HD test sequences.
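The abstract above describes the overall pipeline but not the specific ME and OD criteria. The following is a minimal, illustrative sketch of occlusion-aware frame interpolation in that spirit, with block-matching ME and a forward/backward MV-consistency check standing in for the unspecified details (these, the block size, and the thresholds are assumptions, not the authors' method):

```python
# Sketch of occlusion-aware FRUC: BDMC where motion is consistent,
# a median-based fallback where occlusion is suspected.
import numpy as np

BLOCK, SEARCH = 8, 4  # block size and search range (illustrative values)

def block_me(prev, curr):
    """Block-matching ME: one MV (dy, dx) per BLOCK x BLOCK block of `curr`."""
    h, w = curr.shape
    mvs = np.zeros((h // BLOCK, w // BLOCK, 2), dtype=int)
    for by in range(h // BLOCK):
        for bx in range(w // BLOCK):
            y, x = by * BLOCK, bx * BLOCK
            blk = curr[y:y + BLOCK, x:x + BLOCK].astype(int)
            best, best_mv = np.inf, (0, 0)
            for dy in range(-SEARCH, SEARCH + 1):
                for dx in range(-SEARCH, SEARCH + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy <= h - BLOCK and 0 <= xx <= w - BLOCK:
                        sad = np.abs(blk - prev[yy:yy + BLOCK, xx:xx + BLOCK].astype(int)).sum()
                        if sad < best:
                            best, best_mv = sad, (dy, dx)
            mvs[by, bx] = best_mv
    return mvs

def interpolate(prev, curr, occl_thresh=2):
    """Create the in-between frame from two consecutive frames."""
    fwd = block_me(prev, curr)   # motion prev -> curr
    bwd = block_me(curr, prev)   # motion curr -> prev
    out = np.zeros_like(prev, dtype=float)
    h, w = prev.shape
    for by in range(h // BLOCK):
        for bx in range(w // BLOCK):
            y, x = by * BLOCK, bx * BLOCK
            dy, dx = fwd[by, bx]
            # Occlusion proxy: forward and backward MVs should roughly cancel.
            if np.abs(fwd[by, bx] + bwd[by, bx]).sum() > occl_thresh:
                # Median-based fallback over co-located pixels of both frames.
                out[y:y+BLOCK, x:x+BLOCK] = np.median(
                    np.stack([prev[y:y+BLOCK, x:x+BLOCK],
                              curr[y:y+BLOCK, x:x+BLOCK]]), axis=0)
            else:
                # BDMC: average the two motion-compensated predictions at half motion.
                py = int(np.clip(y + dy // 2, 0, h - BLOCK))
                px = int(np.clip(x + dx // 2, 0, w - BLOCK))
                cy = int(np.clip(y - dy // 2, 0, h - BLOCK))
                cx = int(np.clip(x - dx // 2, 0, w - BLOCK))
                out[y:y+BLOCK, x:x+BLOCK] = 0.5 * (
                    prev[py:py+BLOCK, px:px+BLOCK].astype(float) +
                    curr[cy:cy+BLOCK, cx:cx+BLOCK].astype(float))
    return out.astype(prev.dtype)
```

The switch between the median fallback and BDMC mirrors the two branches named in the abstract; any real implementation would refine the MVs in the occlusion region rather than use the simple consistency test shown here.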
In this paper, we first discuss a framework for a 3D image display system that combines passive sensing and active display technologies. Passive sensing enables real scenes to be captured under natural conditions. Active display enables arbitrary views to be presented with proper motion parallax following the observer's motion. The requirements of passive sensing technology for 3D image displays are discussed in comparison with those for robot vision. Then, a new stereo algorithm, called SEA (Stereo by Eye Array), which satisfies these requirements, is described in detail. The SEA uses nine images captured by a 3×3 camera array. It has the following features for depth estimation: 1) Pixel-based correspondence search yields a dense, high-spatial-resolution depth map. 2) Correspondence ambiguity for linear edges oriented parallel to a particular baseline is eliminated by using multiple baselines with different orientations. 3) Occlusion can be easily detected, and an occlusion-free depth map with sharp object boundaries is generated. The feasibility of the SEA is demonstrated through experiments using real image data.
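To make the multi-baseline idea concrete, here is a short sketch of pixel-based depth search over a 3×3 array: a central reference view plus eight surrounding views, with matching costs accumulated over baselines of different orientations. The cost measure, the shift convention, and the occlusion rule (keeping only the better half of the baselines per pixel, on the reasoning that an occluder rarely blocks the target point in every direction at once) are illustrative assumptions, not the SEA algorithm itself:

```python
# Sketch of multi-baseline, pixel-based inverse-depth search with an
# occlusion-tolerant cost aggregation.
import numpy as np

def depth_map(ref, neighbors, offsets, max_disp=16):
    """ref: HxW reference image; neighbors: list of HxW images;
    offsets: unit baseline direction (dy, dx) of each neighbor w.r.t. ref."""
    h, w = ref.shape
    n = len(neighbors)
    best_cost = np.full((h, w), np.inf)
    best_disp = np.zeros((h, w), dtype=int)
    ref_f = ref.astype(float)
    for d in range(max_disp + 1):                  # hypothesized inverse depth
        costs = np.empty((n, h, w))
        for k, (img, (oy, ox)) in enumerate(zip(neighbors, offsets)):
            # Disparity grows with the baseline length/direction; np.roll is a
            # simplification (it wraps at the borders, a real system would not).
            shifted = np.roll(img.astype(float), (oy * d, ox * d), axis=(0, 1))
            costs[k] = np.abs(ref_f - shifted)     # per-baseline absolute difference
        # Occlusion-tolerant aggregation: per pixel, sum only the n//2 smallest
        # per-baseline costs so that baselines blocked by an occluder are ignored.
        agg = np.sort(costs, axis=0)[: max(1, n // 2)].sum(axis=0)
        update = agg < best_cost
        best_cost[update] = agg[update]
        best_disp[update] = d
    return best_disp
```

Because each baseline has a different orientation, an edge parallel to one baseline still produces a discriminative cost along the others, which is the ambiguity-removal property the abstract highlights.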