Yinhui ZHANG Zifen HE Changyu LIU
Segmenting foreground objects from highly dynamic scenes with missing data is very challenging. We present a novel unsupervised segmentation approach that copes with extensive scene dynamics as well as a substantial amount of missing data present in the scene. To make this possible, we first exploit convex optimization of total variation for images with missing data for which a depletion mask is available. Inpainting depleted images using total variation facilitates detecting ambiguous objects in highly dynamic images, because it tends to yield object-instance regions with improved grayscale contrast. We use a conditional random field that adaptively integrates both appearance and motion cues of the foreground objects. Our approach segments foreground object instances while inpainting the highly dynamic scene with varying amounts of missing data in a coupled manner. We demonstrate this on a very challenging dataset from the UCSD Highly Dynamic Scene Benchmarks (HDSB), compare our method with two state-of-the-art unsupervised image sequence segmentation algorithms, and provide quantitative and qualitative performance comparisons.
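To illustrate the total-variation inpainting step, the following is a minimal sketch that fills depleted pixels by gradient descent on a smoothed TV energy, assuming a known boolean depletion mask. The function name and parameters are illustrative; the abstract's actual convex-optimization solver is more sophisticated than this simple descent.

```python
import numpy as np

def tv_inpaint(image, mask, n_iter=200, step=0.15):
    """Fill masked pixels by gradient descent on a smoothed total-variation energy.

    image : 2D float array; mask : bool array, True where data are missing.
    """
    u = image.astype(float).copy()
    u[mask] = image[~mask].mean()  # initialize holes with the mean of known pixels
    eps = 1e-6                     # smoothing constant to avoid division by zero
    for _ in range(n_iter):
        # forward differences of the current estimate
        ux = np.diff(u, axis=1, append=u[:, -1:])
        uy = np.diff(u, axis=0, append=u[-1:, :])
        mag = np.sqrt(ux**2 + uy**2 + eps)
        px, py = ux / mag, uy / mag
        # divergence of the normalized gradient (backward differences)
        div = (px - np.roll(px, 1, axis=1)) + (py - np.roll(py, 1, axis=0))
        u[mask] += step * div[mask]  # evolve only the depleted pixels
    return u
```

Known pixels act as fixed boundary conditions, so the evolution propagates surrounding structure into the holes.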
Sang-Churl NAM Masahide ABE Masayuki KAWAMATA
This paper proposes a fast, efficient algorithm for detecting missing data (also referred to as blotches) based on Markov random field (MRF) models, with a lower computational load and a lower false alarm rate than existing MRF-based blotch detection algorithms. The proposed algorithm reduces the computational load by applying fast block-matching motion estimation based on the diamond search pattern and by restricting the blotch detection process to candidate blotch areas only. Blotches are frequently confused with image content in the vicinity of moving objects owing to poorly estimated motion vectors. To solve this problem, we incorporate into the formulation a weighting function over the pixels accurately detected by our moving-edge detector. The blotch detection problem, formulated as maximum a posteriori (MAP) estimation, is solved with the iterated conditional modes (ICM) algorithm. Experimental results show that the proposed method produces fewer blotch detection errors than conventional blotch detectors, and achieves lower computational cost and more efficient detection performance than existing MRF-based detectors.
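The diamond search pattern mentioned above can be sketched as follows: a large diamond of candidate offsets is probed repeatedly until the center wins, then a small diamond refines the result. This is a generic sketch of diamond search block matching with sum-of-absolute-differences (SAD) cost, not the authors' exact implementation; block size and iteration cap are illustrative.

```python
import numpy as np

def sad(block, frame, y, x, bs):
    """Sum of absolute differences; infinite cost outside the frame."""
    h, w = frame.shape
    if y < 0 or x < 0 or y + bs > h or x + bs > w:
        return np.inf
    return np.abs(block - frame[y:y+bs, x:x+bs]).sum()

def diamond_search(cur, ref, y0, x0, bs=8, max_iter=16):
    """Estimate the motion vector of the block at (y0, x0) in `cur`
    relative to `ref` using the diamond search pattern."""
    ldsp = [(0, 0), (2, 0), (-2, 0), (0, 2), (0, -2),
            (1, 1), (1, -1), (-1, 1), (-1, -1)]      # large diamond
    sdsp = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]  # small diamond
    block = cur[y0:y0+bs, x0:x0+bs]
    cy, cx = y0, x0
    for _ in range(max_iter):  # probe the large diamond until the center wins
        costs = [sad(block, ref, cy+dy, cx+dx, bs) for dy, dx in ldsp]
        best = int(np.argmin(costs))
        if best == 0:
            break
        cy += ldsp[best][0]; cx += ldsp[best][1]
    costs = [sad(block, ref, cy+dy, cx+dx, bs) for dy, dx in sdsp]
    best = int(np.argmin(costs))
    cy += sdsp[best][0]; cx += sdsp[best][1]
    return cy - y0, cx - x0  # motion vector (dy, dx)
```

Because only a handful of candidates are probed per step, the cost is far below that of an exhaustive full search over the same range.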
This paper proposes a new face recognition method based on mutual projection of feature distributions. The proposed method introduces a new robust measure between two feature distributions, computed as the harmonic mean of the two distance values obtained by projecting each mean vector onto the opposite feature distribution. The proposed method does not require eigenvalue analysis of the two subspaces. The method was applied to a face recognition task on temporal image sequences. Experimental results demonstrate that the computational cost is reduced without degradation of identification performance in comparison with the conventional method.
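One plausible reading of the measure can be sketched as follows: represent each distribution by a principal subspace, take the residual of each mean vector after projection onto the other distribution's subspace, and combine the two residuals by a harmonic mean. This is an illustrative interpretation under assumed details (PCA subspaces, Euclidean residuals); the paper's exact measurement may differ.

```python
import numpy as np

def subspace_basis(X, k):
    """Orthonormal basis (k principal axes) of the row-sample matrix X."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:k]  # rows are basis vectors

def mutual_projection_distance(X1, X2, k=2):
    """Harmonic mean of the two residual distances obtained by projecting
    each distribution's mean onto the other distribution's subspace."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    B1, B2 = subspace_basis(X1, k), subspace_basis(X2, k)

    def residual(v, mean, B):
        d = v - mean
        proj = B.T @ (B @ d)          # projection onto the subspace
        return np.linalg.norm(d - proj)

    d12 = residual(m1, m2, B2)  # mean of X1 against subspace of X2
    d21 = residual(m2, m1, B1)
    return 2.0 * d12 * d21 / (d12 + d21 + 1e-12)
```

The harmonic mean makes the measure symmetric while penalizing the case where either one-directional distance is large.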
Daiki KAWANAKA Takayuki OKATANI Koichiro DEGUCHI
In this paper, we present a method for recognizing human activity as a series of actions from an image sequence. The difficulty of the problem is a chicken-and-egg dilemma: each action needs to be extracted in advance for its recognition, but precise extraction is only possible after the action is correctly identified. To solve this dilemma, we use as many models as there are actions of interest, and test each model against a given sequence to find a matched model for each action occurring in the sequence. For each action, a model is designed to represent any activity containing that action. The hierarchical hidden Markov model (HHMM) is employed to represent the models; each model is composed of a submodel of the target action and submodels that can represent any action, connected appropriately. Several experimental results are shown.
Lei ZHOU Qiang NI Yuanhua ZHOU
An automatic and efficient algorithm for the removal of intensity flicker is proposed. The repair process is founded on a block-based estimation and restoration algorithm for luminance variation. It is easily realized and controlled so as to remove most intensity flicker while preserving intended effects such as fade-ins and fade-outs.
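The abstract does not give the block-level details, so the following sketch shows one plausible block-based luminance correction: map each block's mean and standard deviation to those of a flicker-free reference frame. The function name, block size, and the affine (gain/offset) model are assumptions and may differ from the authors' estimation and restoration algorithm.

```python
import numpy as np

def deflicker_block(frame, ref, bs=16):
    """Correct blockwise luminance flicker by mapping each block's mean and
    standard deviation to those of the corresponding reference block."""
    out = frame.astype(float).copy()
    h, w = frame.shape
    for y in range(0, h, bs):
        for x in range(0, w, bs):
            f = out[y:y+bs, x:x+bs]
            r = ref[y:y+bs, x:x+bs].astype(float)
            gain = r.std() / (f.std() + 1e-6)   # match contrast
            out[y:y+bs, x:x+bs] = (f - f.mean()) * gain + r.mean()  # match mean
    return out
```

A full system would additionally smooth the per-block parameters spatially and temporally so that genuine fades are preserved rather than cancelled.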
Viet HUYNH QUANG HUY Michio MIWA Hidenori MARUTA Makoto SATO
In this paper, we propose a fixed monocular camera whose focus changes cyclically in order to fully recover the three-dimensional translational motion of a rigid object. The images captured in a half cycle of the focus change form a multi-focus image sequence. The motion in depth of the object, as well as the focus change of the camera, causes defocus blur. We develop an in-focus frame tracking operator that automatically detects the in-focus frame in a multi-focus image sequence of a moving object. The in-focus frame gives a 3D position along the object's motion at the time the frame was captured. The motion of the object is reconstructed by applying non-uniform sampling theory to the 3D position samples inferred from the in-focus frames of the multi-focus image sequences.
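The authors' in-focus frame tracking operator is not specified in the abstract; the sketch below uses a generic variance-of-Laplacian focus measure, a common choice for ranking sharpness, to pick the in-focus frame from a multi-focus sequence. Both function names are illustrative.

```python
import numpy as np

def focus_measure(img):
    """Variance of a discrete Laplacian: high for sharp (in-focus) images."""
    lap = (-4 * img + np.roll(img, 1, 0) + np.roll(img, -1, 0)
           + np.roll(img, 1, 1) + np.roll(img, -1, 1))
    return lap.var()

def in_focus_index(frames):
    """Index of the sharpest frame in a multi-focus image sequence."""
    return int(np.argmax([focus_measure(f) for f in frames]))
```

Given the camera's focus setting at that frame index, the thin-lens equation then yields the object's depth at that instant.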
Existing methods for reconstructing a super-resolution image from a sequence of undersampled, subpixel-shifted images must solve a large ill-conditioned system of equations, either by approximating the pseudo-inverse matrix or by performing many iterations to approach the solution. The former imposes a heavy computational burden, and the latter amplifies artifacts and noise. To solve these problems, this paper applies a pyramid structure to the super-resolution of image sequences and presents a suitable pyramid framework, called the Super-Resolution Image Pyramid (SRIP). Based on the imaging process of the image sequence, the proposed method divides one large back-projection into a series of small back-projections at different levels, thereby avoiding the above problems. As an example, the Iterative Back-Projection (IBP) method suggested by Peleg is included in this pyramid framework. Computer simulations and error analyses are conducted, and the effectiveness of the proposed framework is demonstrated. Image resolution can be improved even for severely undersampled images. In addition, other general super-resolution methods can easily be included in this framework and executed in parallel to meet real-time processing requirements.
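For reference, the IBP building block can be sketched as follows under a much simplified imaging model: integer shifts and plain decimation, with no sensor blur. The estimate is refined until its simulated low-resolution images match the observations. This illustrates plain IBP, not the SRIP pyramid decomposition itself.

```python
import numpy as np

def downsample(hr, shift, s=2):
    """Imaging model: integer shift followed by decimation by factor s."""
    return np.roll(hr, shift, axis=(0, 1))[::s, ::s]

def back_project(err, shift, shape, s=2):
    """Adjoint of `downsample`: place the LR error at the sampled HR sites."""
    up = np.zeros(shape)
    up[::s, ::s] = err
    return np.roll(up, (-shift[0], -shift[1]), axis=(0, 1))

def ibp(lr_images, shifts, s=2, n_iter=10, step=1.0):
    """Iterative back-projection super-resolution for integer-shifted,
    decimated observations."""
    shape = (lr_images[0].shape[0] * s, lr_images[0].shape[1] * s)
    hr = np.kron(lr_images[0], np.ones((s, s)))  # crude initial estimate
    for _ in range(n_iter):
        for lr, sh in zip(lr_images, shifts):
            err = lr - downsample(hr, sh, s)           # residual in LR domain
            hr += step * back_project(err, sh, shape, s)  # correct HR estimate
    return hr
```

With fractional shifts and a blur kernel, the same loop applies but the back-projection spreads the error through the transposed blur, which is where ill-conditioning and noise amplification arise.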
Osamu YAMAGUCHI Kazuhiro FUKUI
Face recognition provides an important means of realizing man-machine interfaces and security. This paper presents "Smartface," a PC-based face recognition system using temporal image sequences. The face recognition engine of the system employs a robust facial-parts detection method and a pattern recognition algorithm that is stable against variations in facial pose and expression. The functions of Smartface include (i) a screensaver with face recognition, (ii) customization of the PC environment, and (iii) real-time disguising, an entertainment application. The system is operable on a portable PC with a camera and is implemented entirely in software; no image processing hardware is required.
Naoyuki ICHIMURA Norikazu IKOMA
Filtering and smoothing using a non-Gaussian state space model are proposed for the motion trajectories of feature points in image sequences. A heavy-tailed non-Gaussian distribution is used for the measurement noise to reduce the effect of outliers in the motion trajectory. Experimental results are presented to show the usefulness of the proposed method.
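The abstract does not name the inference algorithm, so the following sketch illustrates the heavy-tailed idea with an assumed setup: a bootstrap particle filter for a 1D random-walk trajectory with Cauchy measurement noise. Because the Cauchy likelihood is nearly flat far from the observation, an isolated outlier barely moves the estimate.

```python
import numpy as np

def particle_filter(obs, n_particles=1000, sys_sd=0.5, meas_scale=0.5, seed=0):
    """Bootstrap particle filter for a random-walk trajectory with
    Cauchy (heavy-tailed) measurement noise."""
    rng = np.random.default_rng(seed)
    particles = np.full(n_particles, obs[0], dtype=float)
    estimates = []
    for z in obs:
        particles += rng.normal(0.0, sys_sd, n_particles)      # system transition
        w = 1.0 / (1.0 + ((z - particles) / meas_scale) ** 2)  # Cauchy likelihood
        w /= w.sum()
        estimates.append(float(np.dot(w, particles)))          # posterior mean
        idx = rng.choice(n_particles, n_particles, p=w)        # resample
        particles = particles[idx]
    return np.array(estimates)
```

A Kalman filter (Gaussian noise) would be dragged toward the outlier in proportion to its magnitude; the heavy-tailed model effectively ignores it.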
Mitsuhiko MEGURO Akira TAGUCHI Nozomu HAMADA
In this study, we consider a filtering method for image sequences degraded by additive Gaussian noise and/or impulse noise (i.e., mixed noise). Weighted median filters are well known as a proper choice for removing mixed noise from 1D/2D signals. We previously proposed a filtering tool based on the weighted median filter with a data-dependent approach, called the data-dependent weighted median (DDWM) filter. Nevertheless, the DDWM filter, whose weights are controlled by local information only, does not perform well enough in restoring image sequences degraded by this noise, because it cannot achieve good filtering performance in both the still and the moving regions of an image sequence. To overcome this drawback, we add motion information, in the form of a motion detector, to the local information that controls the filter weights. The new filter is called the Video-Data-Dependent Weighted Median (Video-DDWM) filter. Simulations show that the Video-DDWM filter gives more effective restoration results than the DDWM filter and a conventional filtering method with motion compensation (MC).
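The weighted median building block can be sketched as follows: the output is the sample at which the cumulative weight first reaches half the total, so impulses are rejected while weighting can bias the filter toward, e.g., the window center. This generic sketch does not include the data-dependent or motion-dependent weight control of the DDWM/Video-DDWM filters.

```python
import numpy as np

def weighted_median(values, weights):
    """Weighted median: value at which cumulative weight reaches half the total."""
    order = np.argsort(values)
    v, w = np.asarray(values, float)[order], np.asarray(weights, float)[order]
    cum = np.cumsum(w)
    return v[np.searchsorted(cum, 0.5 * cum[-1])]

def wm_filter_1d(signal, weights):
    """Sliding weighted median over a 1D signal (window = len(weights))."""
    r = len(weights) // 2
    padded = np.pad(signal, r, mode='edge')  # replicate edges
    return np.array([weighted_median(padded[i:i + len(weights)], weights)
                     for i in range(len(signal))])
```

With all weights equal this reduces to the ordinary median filter; increasing the center weight makes the filter more detail-preserving at the cost of weaker impulse rejection.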
Kazuhiro OTSUKA Tsutomu HORIKOSHI Haruhiko KOJIMA Satoshi SUZUKI
A novel method is proposed to retrieve image sequences with the goal of forecasting complex and time-varying natural patterns. To that end, we introduce a framework called Memory-Based Forecasting; it provides forecast information based on the temporal development of past retrieved sequences. This paper targets the radar echo patterns in weather radar images, and aims to realize an image retrieval method that supports weather forecasters in predicting local precipitation. To characterize the radar echo patterns, an appearance-based representation of the echo pattern, and its velocity field are employed. Temporal texture features are introduced to represent local pattern features including non-rigid complex motion. Furthermore, the temporal development of a sequence is represented as paths in eigenspaces of the image features, and a normalized distance between two sequences in the eigenspace is proposed as a dissimilarity measure that is used in retrieving similar sequences. Several experiments confirm the good performance of the proposed retrieval scheme, and indicate the predictability of the image sequence.
This paper describes a factorization-based algorithm that reconstructs 3D object structure as well as motion from a set of multiple uncalibrated perspective images. The factorization method introduced by Tomasi and Kanade is believed to be applicable only under the assumption of a linear approximation of the imaging system. In this paper we show that the method can be extended to truly perspective images if projective depths are recovered. We establish this fact by interpreting their purely mathematical theory in terms of the projective geometry of the imaging system, thereby giving physical meaning to the parameters involved. We also provide a method for recovering the projective depths using the fundamental matrices and epipoles estimated from pairs of images in the image set. Our method is applicable to the general case in which the images are taken not by a single moving camera but by different cameras with individual camera parameters. The experimental results clearly demonstrate the feasibility of the proposed method.
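For context, the linear-camera baseline that this work extends can be sketched as follows: under an affine (e.g., orthographic) camera with centered points, the 2F x P measurement matrix of tracked image coordinates has rank 3 and factors by SVD into motion and structure, up to an affine ambiguity. The synthetic data below are illustrative; the projective-depth extension described in the abstract is not shown.

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.random((3, 10)) - 0.5       # structure: 10 centered 3D points
W_rows = []
for _ in range(5):                  # 5 orthographic views
    q, _ = np.linalg.qr(rng.random((3, 3)))  # random rotation
    W_rows.append(q[:2] @ S)        # first two rotation rows give image x, y
W = np.vstack(W_rows)               # 2F x P measurement matrix (rank 3)

U, s, Vt = np.linalg.svd(W, full_matrices=False)
M = U[:, :3] * s[:3]                # motion, up to an affine ambiguity
X = Vt[:3]                          # structure, up to the same ambiguity
```

In the projective extension, each entry of W is scaled by an unknown projective depth, raising the rank to 4; recovering those depths from fundamental matrices and epipoles is what restores the factorizable form.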
Yoshiaki SHIRAI Tsuyoshi YAMANE Ryuzo OKADA
This paper describes methods for tracking moving objects against a cluttered background by integrating optical flow, depth data, and/or uniform brightness regions. First, a basic method is introduced that extracts a region with uniform optical flow as the target region. Then an extended method is described in which optical flow and depth are fused: a target region is extracted by Bayesian inference in terms of optical flow, depth, and the predicted target location. This method works only for textured objects, because optical flow and depth can be extracted only for textured objects. To solve this problem, uniform brightness regions are used for tracking in addition to the optical flow. Real-time human tracking is realized for real image sequences by using a real-time processor with multiple DSPs.
Yuichiro NAKAYA Hiroshi HARASHIMA
Despite its potential to realize image communication at extremely low bit rates, model-based coding (analysis-synthesis coding) still has problems to be solved before practical use. The main problems are the difficulty of modeling unknown objects and the presence of analysis errors. To cope with these difficulties, we incorporate waveform coding into model-based coding (model-based/waveform hybrid coding). The incorporated waveform coder can code unmodeled objects and cancel the artifacts caused by analysis errors. From a different point of view, the performance of a practical waveform coder can be improved by incorporating model-based coding: since the model-based coder codes the modeled part of the image at extremely low rates, more bits can be allocated to coding the unmodeled region. In this paper, we present the basic concept of model-based/waveform hybrid coding and develop a model-based/MC-DCT hybrid coding system designed to improve the performance of the widely used MC-DCT coder. Simulation results show that this coding method is effective at very low transmission rates such as 16 kb/s, at which image transmission is quite difficult for an MC-DCT coder without the contribution of the model-based coder.