Chi-Hsi SU Hsueh-Ming HANG David W. LIN
A global motion parameter estimation method is proposed. The method can be used to segment an image sequence into regions of different moving objects. For any two pixels belonging to the same moving object, their associated global motion components have a fixed relationship determined by the projection geometry of camera imaging. Therefore, by examining the measured motion vectors we are able to group pixels into objects and, at the same time, identify some global motion information. In the presence of camera zoom, the object shape is distorted and conventional translational motion estimation may not yield accurate motion modeling. A deformable block motion estimation scheme is thus proposed to estimate the local motion of an object in this situation. Some simulation results are reported. For an artificially generated sequence containing only zoom activity, we find that the maximum estimation error in the zoom factor is about 2.8%. Rather good moving object segmentation results are obtained using the proposed object local motion estimation method after zoom extraction. The deformable block motion compensation is also seen to outperform conventional translational block motion compensation for video material containing zoom activity.
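Under a simple zoom-plus-pan camera model, the motion vector of a pixel p relates to the zoom factor s and pan t as v(p) = (s - 1)(p - c) + t, where c is the image center; this is one common way to express the fixed relationship mentioned above. The following sketch, with assumed names and an assumed least-squares formulation rather than the authors' exact procedure, fits s and t from measured block motion vectors:

```python
import numpy as np

# Hypothetical sketch: least-squares fit of a zoom+pan global motion model
# v(p) = (s - 1) * (p - c) + t, where s is the zoom factor, c the image
# center and t the panning translation.  Names and the model form are
# assumptions, not the authors' exact formulation.
def fit_zoom_pan(points, vectors, center):
    p = np.asarray(points, dtype=float) - np.asarray(center, dtype=float)
    v = np.asarray(vectors, dtype=float)
    # Unknowns: a = s - 1, tx, ty.  Each measured vector gives two equations.
    A = np.zeros((2 * len(p), 3))
    A[0::2, 0] = p[:, 0]; A[0::2, 1] = 1.0   # vx = a * x + tx
    A[1::2, 0] = p[:, 1]; A[1::2, 2] = 1.0   # vy = a * y + ty
    b = v.reshape(-1)
    (a, tx, ty), *_ = np.linalg.lstsq(A, b, rcond=None)
    return 1.0 + a, np.array([tx, ty])       # zoom factor, pan vector
```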
Jeng-Shyang PAN Jing-Wein WANG
In this paper, a new feature, characterized by the extrema density of 2-D wavelet frames estimated at the output of the corresponding filter bank, is proposed for texture segmentation. With and without feature selection, the discrimination ability of features based on pyramidal and tree-structured decompositions is comparatively studied using the extrema density, energy, and entropy as features, respectively. These comparisons are demonstrated with separable and non-separable wavelets. With the three-, four-, and five-category textured images from the Brodatz album, it is observed that most performances with feature selection improve significantly over those without feature selection. In addition, the experimental results show that the extrema density-based measure performs best among the three types of features investigated. A Min-Min method based on genetic algorithms, a novel approach with the spatial separation criterion (SPC) as the evaluation function, is presented to evaluate the segmentation performance of each subset of selected features. In this work, the SPC is defined as the Euclidean distance within a class divided by the Euclidean distance between classes in the spatial domain. It is shown that, with feature selection, the tree-structured wavelet decomposition based on non-separable wavelet frames performs better in the experiments than the tree-structured decomposition based on separable wavelet frames and the pyramidal decompositions based on separable and non-separable wavelet frames. Finally, we compare with the segmentation results evaluated against the templates of the textured images and verify the effectiveness of the proposed criterion. Moreover, it is shown from the feature selection vector that the discriminatory characteristics of the features spread over all subbands.
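As a rough illustration of the SPC defined above, the following sketch computes the mean within-class Euclidean distance divided by the mean between-class Euclidean distance for labeled samples; the exact averaging and normalization used by the authors are assumptions here:

```python
import numpy as np

# Minimal sketch of the spatial separation criterion (SPC) as stated in the
# abstract: mean within-class Euclidean distance divided by mean between-class
# Euclidean distance.  The normalization choices are assumptions.
def spc(samples, labels):
    samples = np.asarray(samples, dtype=float)
    labels = np.asarray(labels)
    within, between = [], []
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            d = np.linalg.norm(samples[i] - samples[j])
            (within if labels[i] == labels[j] else between).append(d)
    return np.mean(within) / np.mean(between)   # smaller is better
```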
This paper describes a new system for extracting and classifying bibliography regions from the color image of a book cover. As in all color image processing, segmentation of the color space is an essential and important step in our system; here the HSI color space is adopted rather than the RGB color space. The color space is first segmented into achromatic and chromatic regions, and the segmentation is completed by thresholding the intensity histogram of the achromatic region and the hue histogram of the chromatic region. Text region extraction and classification then follow. After detecting fundamental features (stroke width and local label width), text regions are determined by comparing smeared blocks to the original candidate image. Based on a general cover design model, text regions are further classified into author, title, and publisher regions, and a bibliography image is obtained as a result, without applying OCR. The 3D appearance of the book is reconstructed as well. In this paper, two examples are presented.
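A hedged sketch of the color-space split described above is given below; it approximates HSI with Python's HLS conversion, treats low-saturation pixels as achromatic and labels them by intensity, and labels the remaining chromatic pixels by hue. The threshold values and bin count are illustrative assumptions, not the paper's:

```python
import numpy as np
import colorsys

# Achromatic/chromatic split followed by intensity- or hue-based labelling.
# HSI is approximated here by the HLS conversion; thresholds are assumed.
def hsi_split(rgb_image, sat_thresh=0.15, int_thresh=0.5, hue_bins=6):
    h, w, _ = rgb_image.shape
    labels = np.zeros((h, w), dtype=int)
    for y in range(h):
        for x in range(w):
            r, g, b = rgb_image[y, x] / 255.0
            hue, light, sat = colorsys.rgb_to_hls(r, g, b)
            if sat < sat_thresh:                       # achromatic region
                labels[y, x] = 0 if light < int_thresh else 1
            else:                                      # chromatic region
                labels[y, x] = 2 + int(hue * hue_bins) % hue_bins
    return labels
```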
Takayuki NAKACHI Tatsuya FUJII Junji SUZUKI
In this paper, we propose an adaptive predictive coding method based on image segmentation for lossless compression. MAR (multiplicative autoregressive) predictive coding is an efficient lossless compression scheme. Because the MAR model operates on local image data, its predictors can adapt to changes in the local image statistics. However, the performance of the MAR method degrades when it is applied to images whose local statistics change within the blocks of the block-by-block subdivided image. Furthermore, side-information such as prediction coefficients must be transmitted to the decoder for each block. In order to enhance the compression performance, we improve the MAR coding method by using image segmentation. The proposed MAR predictor can adapt efficiently to the local statistics of the image at each pixel. Furthermore, less side-information needs to be transmitted compared with the conventional MAR method.
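To illustrate the idea of per-segment predictor adaptation (not the authors' multiplicative AR model itself), the following sketch fits a simple causal linear predictor separately for each segment of a given segmentation map and returns the prediction residual, whose reduced entropy is what a lossless coder would exploit:

```python
import numpy as np

# Illustrative stand-in only: a causal linear predictor re-fitted per segment.
# This is a simplification of MAR prediction, used to show how adapting the
# predictor to each segment shrinks the residual to be entropy-coded.
def segment_adaptive_residual(img, seg):
    img = img.astype(float)
    res = np.zeros_like(img)
    for label in np.unique(seg):
        ys, xs = np.nonzero(seg == label)
        rows, targets = [], []
        for y, x in zip(ys, xs):
            if y == 0 or x == 0:          # skip pixels without a full context
                continue
            rows.append([img[y, x - 1], img[y - 1, x], img[y - 1, x - 1], 1.0])
            targets.append(img[y, x])
        if not rows:
            continue
        coef, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
        for y, x in zip(ys, xs):
            if y == 0 or x == 0:
                continue
            ctx = np.array([img[y, x - 1], img[y - 1, x], img[y - 1, x - 1], 1.0])
            res[y, x] = img[y, x] - ctx @ coef
    return res   # lower-entropy residual -> better lossless compression
```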
Homogeneous but distinct visual objects with low-contrast boundaries are merged by most segmentation algorithms. To alleviate this problem, an efficient image segmentation algorithm based on a bottom-up approach is proposed using spatial-domain information only. For initial image segmentation, we adopt a new marker extraction algorithm conforming to the human visual system. It generates dense markers in visually complex areas and sparse markers in visually homogeneous areas. Then, two region-merging algorithms are applied successively so that homogeneous visual objects can be represented as simply as possible without destroying the low-contrast real boundaries among them. The first removes insignificant regions in a proper merging order, and the second merges only homogeneous regions, based on ternary region classification. The resulting segmentation describes homogeneous visual objects with few regions while preserving semantic object shapes well. Finally, a size-based region decision procedure may be applied to represent complex visual objects more simply if their precise semantic contents are not necessary. Experimental results show that the proposed image segmentation algorithm represents homogeneous visual objects with few regions and describes complex visual objects with a marginal number of regions while preserving semantic object shapes well.
Makoto ISHIKAWA Naotake KAMIURA Yutaka HATA
This paper proposes a thresholding-based segmentation method aided by Kleene algebra. For a given image including some regions of interest (ROIs for short) with coherent intensity levels, assume that we can segment each ROI by applying a thresholding technique. Three segmented states are then derived for every ROI: Shortage, denoted by logic value 0; Correct, denoted by 1; and Excess, denoted by 2. The segmented states for every ROI in the image can then be expressed in a ternary logic system. Our goal is to find the "Correct (1)" state for every ROI. First, a procedure based on unate functions, which are a model of Kleene algebra, is proposed. However, this method is not complete for some cases; the ratio of correctly segmented cases is about 70% for three- and four-ROI segmentation. For the failed cases, Brzozowski operations, which are defined on De Morgan algebra, can be used to find all "Correct" states completely. Finally, we apply these procedures to segmentation problems of a human brain MR image and a foot CT image. As a result, we can find all "1" states for the ROIs, i.e., we can correctly segment the ROIs.
Jun-ichiro TORIWAKI Kensaku MORI
In this article we present a survey of medical image processing, with stress on applications of image generation and pattern recognition/understanding to computer-aided diagnosis (CAD) and surgery (CAS). First, topics and fields of research in medical image processing are summarized. Second, the importance of 3D image processing and of the virtualized human body (VHB) is pointed out. Third, visualization and observation methods for the VHB are introduced. In the fourth section the virtualized endoscope system is presented from the viewpoint of observing the VHB with moving viewpoints. The fifth topic is the use of the VHB with deformation, such as the simulation of surgical operations, intra-operative aids, and image overlay. In the seventh section several topics on image processing methodologies are introduced, including model generation, registration, segmentation, rendering, and the use of knowledge processing.
This paper proposes an object-oriented face region detection and tracking method using range and color information. Range segmentation of the objects from a complicated background is obtained using a disparity histogram (DH). The facial regions among the range-segmented objects are detected using a skin-color transform technique that provides a gray-level image in which facial regions are enhanced. A computationally efficient matching pixel count (MPC) disparity measure is introduced to enhance the matching accuracy by removing the effect of unexpected noise in boundary regions. Redundant operations inherent in area-based matching are removed to enhance the processing speed. For the skin-color transformation, the generalized facial color distribution (GFCD) is modeled by a 2D Gaussian function in a normalized color space. A disparity difference histogram (DDH) computed from two consecutive frames is introduced to estimate the range information effectively. A detailed geometrical analysis provides the exact variation of the range information of a moving object. The experimental results show that the proposed algorithm works well in various environments, at a rate of 1 frame per second with 512 × 480 resolution on a general-purpose workstation.
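The skin-color transform can be illustrated, under assumed parameter values, as a 2D Gaussian likelihood evaluated in the normalized (r, g) color space; the sketch below returns the facial-region-enhanced gray-level image mentioned above:

```python
import numpy as np

# Sketch of a skin-colour transform in the spirit of the GFCD: skin likelihood
# modelled as a 2-D Gaussian in the normalised (r, g) colour space.
# The mean and covariance values below are placeholders, not the paper's.
SKIN_MEAN = np.array([0.44, 0.31])
SKIN_COV = np.array([[0.0020, -0.0005],
                     [-0.0005, 0.0010]])

def skin_likelihood(rgb_image):
    rgb = rgb_image.astype(float) + 1e-6
    s = rgb.sum(axis=2, keepdims=True)
    rg = (rgb / s)[..., :2] - SKIN_MEAN               # normalised r, g minus mean
    inv = np.linalg.inv(SKIN_COV)
    m = np.einsum('...i,ij,...j->...', rg, inv, rg)   # squared Mahalanobis distance
    return np.exp(-0.5 * m)                           # facial-region-enhanced image
```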
Rachid SAMMOUDA Noboru NIKI Hiromu NISHITANI Emi KYOKAGE
In our current work, we attempt to build an automatic diagnostic system for lung cancer based on the analysis of sputum color images. In order to form general diagnostic rules, we have collected a database of thousands of sputum color images from normal and abnormal subjects. As a first step, in this paper, we present a segmentation method for sputum color images prepared by the Papanicolaou standard staining method. The segmentation is performed based on energy function minimization using an unsupervised Hopfield neural network (HNN). This HNN has been used for the segmentation of magnetic resonance images (MRI). The results have been acceptable; however, the method has some limitations because the network can become stuck in an early local minimum, since the energy landscape in general has more than one local minimum owing to the nonconvex nature of the energy surface. To overcome this problem, we suggested some contributions in our previous work. As with MRI, color images can be considered multidimensional data, as each pixel is represented by its three components in the RGB image planes. We have applied the RGB components of several sputum images to the input of the HNN. However, the extreme variations in the gray levels of the images and the relative contrast among nuclei, caused by unavoidable staining variations among individual cells, cytoplasm folds, and debris cells, make the segmentation less accurate and prevent its automation, since the number of regions is difficult to estimate in advance. On the other hand, the most important objective in processing cell clusters is the detection and accurate segmentation of the nuclei, because most quantitative procedures are based on measurements of nuclear features. For this reason, based on our collected database of sputum color images, we developed an algorithm for masking non-sputum cells. Once these masked images are determined, they are given, together with some of the RGB components of the raw image, to the input of the HNN to make a crisp segmentation by assigning each pixel a label such as background, cytoplasm, or nucleus. The proposed technique has yielded correct segmentation of complex sputum scenes prepared by the ordinary manual staining method in most of the tested images selected from our database of thousands of sputum color images.
The adaptive associative memory proposed by Ma is used to construct a new model of semantic network, referred to as associative semantic memory (ASM). The main novelty is its computational effectiveness, which is an important issue in knowledge representation: the ASM can perform inference over large conceptual hierarchies extremely fast, in time that does not increase with the size of the conceptual hierarchy. This performance cannot be matched by existing systems. In addition, the ASM has a simple and easily understandable architecture and is flexible in the sense that knowledge can easily be modified using one-shot relearning, and generalization of knowledge is a basic system property. Theoretical analyses are given for the general case to guarantee that the ASM can infer flawlessly via pattern segmentation and recovery, the two basic functions of the adaptive associative memory.
Iren VALOVA Yusuke SUGANAMI Yukio KOSUGI
Segmenting images obtained from magnetic resonance imaging (MRI) is an important process for visualization of human soft tissues. For MR applications, we often have to introduce a suitable segmentation technique. Neural networks may provide superior solutions for the pattern classification of medical images compared with conventional methods. For image segmentation with the aid of neural networks of a reasonable size, it is important to select the most effective combination of secondary indices to be used for the classification. In this paper, we introduce a vector quantized class entropy (VQCCE) criterion to evaluate which indices are effective for pattern classification, without testing on the actual classifiers. We have exploited a newly developed neural tree classifier for accomplishing the segmentation task. This network effectively partitions the feature space into subregions, and each final subregion is assigned a class label according to the data routed to it. As the tree grows, the number of training data at each node decreases, which results in fewer weight-update epochs and less time consumption. The partitioning of the feature space at each node is done by a simple neural network, the appropriateness of which is measured by a newly proposed estimation criterion, the measure for assessment of a neuron (MAN). It facilitates obtaining a neuron with maximum correlation between a unit's value and the residual error at a given output. The application of this criterion guarantees adopting the best-fit neuron to split the feature space. The proposed neural classifier has achieved a 95% correct classification rate on average for the white/gray matter segmentation problem. The performance of the proposed method is compared to that of a multilayered perceptron (MLP), the latter being a widely used network in the field of image processing and pattern recognition. The experiments show the superiority of the introduced method in terms of fewer iterations and weight updates necessary to train the neural network, i.e., lower computational complexity, as well as a higher correct classification rate.
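The MAN measure is described as selecting the neuron whose value correlates most strongly with the residual error; the following sketch, in the spirit of that description (the exact formula is an assumption), scores candidate units by the magnitude of their normalized correlation with the residual:

```python
import numpy as np

# Correlation-based candidate selection in the spirit of the MAN measure:
# pick the unit whose activations correlate most with the residual error.
# The exact criterion used by the authors is assumed, not reproduced.
def best_candidate(candidate_outputs, residual_error):
    residual = residual_error - residual_error.mean()
    scores = []
    for v in candidate_outputs:            # v: unit activations over the data
        v = v - v.mean()
        denom = np.sqrt((v ** 2).sum() * (residual ** 2).sum()) + 1e-12
        scores.append(abs(np.dot(v, residual)) / denom)
    return int(np.argmax(scores))          # index of the best-fit neuron
```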
Takashi IMORI Tadahiko KIMOTO Bunpei TOUJI Toshiaki FUJII Masayuki TANIMOTO
This paper presents a new scheme to estimate depth in a natural three-dimensional scene using a multi-viewpoint image set. In the conventional Multiple-Baseline Stereo (MBS) scheme for such an image set, although errors of stereo matching are somewhat reduced by using multiple stereo pairs, the use of square blocks of fixed size sometimes causes false matching, especially in image areas where occlusion occurs and in areas with small variance of brightness levels. In the proposed scheme, the reference image is segmented into regions of arbitrary shape, and a depth value is estimated for each region. Also, by comparing the image generated by projection with the original image, depth values are re-estimated in a top-down manner; errors in the previous depth values are thereby detected and corrected. The results of experiments show the advantages of the proposed scheme over the MBS scheme.
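The multiple-baseline idea of summing matching costs from several stereo pairs at a common depth hypothesis can be sketched as follows; array shapes, the SSD cost, and the pre-alignment of disparity indices across pairs are assumptions for illustration:

```python
import numpy as np

# Minimal sketch of multiple-baseline matching for one reference pixel:
# SSD costs from several stereo pairs are summed over a common depth index,
# which sharpens the cost minimum compared with a single pair.
def multi_baseline_depth(ref_patch, patches_per_pair):
    # patches_per_pair[k][d] is the candidate patch in pair k at depth index d
    # (disparities are assumed pre-scaled so index d means the same depth in
    # every pair).
    n_d = len(patches_per_pair[0])
    total = np.zeros(n_d)
    for patches in patches_per_pair:
        for d, cand in enumerate(patches):
            total[d] += np.sum((ref_patch.astype(float) - cand.astype(float)) ** 2)
    return int(np.argmin(total))   # depth index with minimum summed SSD
```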
Keisuke KAMEYAMA Kenzo MORI Yukio KOSUGI
A novel neural network architecture for image texture classification is introduced. The proposed model (Kernel Modifying Neural Network: KM Net), which incorporates the convolution filter kernel and the classifier in one, enables automated texture feature extraction in multichannel texture classification through modification of the kernel and the connection weights by a backpropagation-based training rule. The first-layer units, working as the convolution kernels, are constrained to be an array of Gabor filters, which achieves efficient texture feature localization. The following layers work as a classifier of the extracted texture feature vectors. The capability of the KM Net and its training rule is verified on a basic problem using a synthetic texture image. In addition, the possibility of applying the KM Net to natural texture classification and to biological tissue classification using an ultrasonic echo image has been examined.
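For reference, a standard even (cosine-phase) Gabor kernel of the kind the first-layer units are constrained to might look like the following; the parameterization is a common textbook form, not necessarily the paper's:

```python
import numpy as np

# A 2-D even Gabor kernel: a Gaussian envelope modulated by a cosine carrier
# of given spatial frequency and orientation.  Parameter names are assumed.
def gabor_kernel(size, freq, theta, sigma):
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    xr = x * np.cos(theta) + y * np.sin(theta)        # rotated coordinate
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return envelope * np.cos(2.0 * np.pi * freq * xr)

# A texture feature channel is then the local energy of the image filtered
# with such a kernel, which the following classifier layers operate on.
```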
The approach presented in this paper extends conventional Markov random field (MRF) models to a more practical problem: the unsupervised and adaptive segmentation of gray-level images. "Unsupervised" segmentation means that all the model parameters, including the number of image classes, are unknown and have to be estimated from the observed image. In addition, "adaptive" segmentation means that both the region distribution and the image feature within a region are location-dependent, and their corresponding parameters must be estimated from location to location. We estimate local parameters independently from multiple small windows under the assumption that the observed image consists of objects with smooth, untextured surfaces. Under this assumption, the intensity of each region is a slowly varying function plus noise, and conventional homogeneous hidden MRF (HMRF) models are appropriate within these windows. In each window, we employ the EM algorithm for maximum-likelihood (ML) parameter estimation, and the estimated parameters are then used for "maximizer of the posterior marginals" (MPM) segmentation. To keep segments continuous across windows, a scheme for combining window fragments is proposed. The scheme comprises two parts: the programming of windows and the Bayesian merging of window fragments. Finally, a remerging procedure is used as post-processing to remove the over-segmented small regions that may remain after the Bayesian merging. Since the final segments are obtained by merging, the number of image classes is determined automatically. The use of multiple parallel windows makes our algorithm suitable for parallel implementation. Experimental results on real-world images showed that surfaces (objects) consistent with our model assumptions were all correctly segmented as connected regions.
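The ML parameter estimation step inside one window can be sketched as EM for a K-class Gaussian mixture of pixel intensities; the MRF prior and the MPM labeling step are omitted here, and the initialization is an assumption:

```python
import numpy as np

# Hedged sketch: EM estimation of a K-class Gaussian mixture inside one
# window, standing in for the ML-parameter step described above.  The MRF
# prior and MPM labelling are omitted; pixelwise posteriors are returned.
def em_window(pixels, k=2, iters=50):
    x = np.asarray(pixels, dtype=float).ravel()
    mu = np.linspace(x.min(), x.max(), k)         # assumed initialization
    var = np.full(k, x.var() / k + 1e-6)
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: posterior probability of each class for each pixel
        lik = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        post = lik / lik.sum(axis=1, keepdims=True)
        # M-step: update mixture weights, means and variances
        n = post.sum(axis=0)
        pi = n / len(x)
        mu = (post * x[:, None]).sum(axis=0) / n
        var = (post * (x[:, None] - mu) ** 2).sum(axis=0) / n + 1e-6
    return pi, mu, var, post
```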
Mohammed BENNAMOUN Boualem BOASHASH
Within the framework of a previously proposed vision system, a new part-segmentation algorithm that breaks an object, defined by its contour, into its constituent parts is presented. The contour is assumed to be obtained using an edge detector. The decomposition is achieved in two stages. The first stage is a preprocessing step which consists of extracting the convex dominant points (CDPs) of the contour. To this end, we present a new technique which relaxes the compromise that exists in most classical methods for the selection of the width of the Gaussian filter. In the subsequent stage, the extracted CDPs are used to break the object into convex parts. This is performed as follows: among all the points of the contour, only the CDPs are moved along their normals until they touch another moving CDP or a point on the contour. The results show that this part-segmentation algorithm is invariant to transformations such as rotation, scaling, and shift in position of the object, which is very important for object recognition. The algorithm has been tested on many object contours, with and without noise, and its advantages are listed in this paper. Our results are visually similar to a human's intuitive decomposition of objects into their parts.
Shengjin WANG Makoto SATO Hiroshi KAWARADA
High-speed display of 3-D objects in virtual reality environments is one of the currently important subjects. Shape simplification is considered an efficient method. This paper presents a method of hierarchical cube-based segmentation for shape simplification and multiresolution model construction. The relations among shape simplification, resolution and visual distance are derived firstly. The first level model is generated from scattered range data by cube-base segmentation with the first level cube size. Multiresolution models are then generated by re-sampling polygonal patch vertices of each former level model with hierarchical cube-based segmentation structure. The results show that the algorithm is efficient for constructing multiresolution models of free-form shape 3-D objects from scattered range data and high compression ratio can be obtained with little noticeable difference during the visualization.
The segmentation of images into regions that share common properties is a fundamental problem in low-level computer vision. In this paper, a region-growing method for segmentation is studied. A coarse-to-fine processing strategy is adopted to identify the homogeneity of subregions of an image. Pixels in the image are checked by a hypothesis test based on a nested triple-layer neighborhood system, and can then be classified into single pixels or grain pixels of different size and coarseness. Instead of using a global threshold for region growing, local thresholds are determined adaptively for each pixel in the image. The strength of the proposed method lies in the fact that the thresholds are computed automatically. Experiments on synthetic and natural images show the efficiency of our method.
Adam KURIASKI Takeshi AGUI Hiroshi NAGAHASHI
A method of motion segmentation in RGB image sequences is presented in detail. The method is based on modeling the moving object by a six-variate Gaussian distribution and on a hidden Markov random field (MRF) framework. It is an extended and improved version of our previous work. Based on mathematical principles, the MRF energy expression is modified. Moreover, an initialization procedure for the first frame of the sequence is introduced. Both modifications result in new, interesting features. The first involves a rather simple parameter estimation which has to be performed before the method is used; the values of the maximum likelihood (ML) estimators of the parameters can now be used without any modification by the user. The second allows one to avoid manually finding the localization mask of the moving object in the first frame. Experimental results showing the usefulness of the method are also included.
Mikio HASEGAWA Tohru IKEGUCHI Takeshi MATOZAKI Kazuyuki AIHARA
We propose a novel segmentation algorithm which combines an image segmentation method based on small regions with chaotic neurodynamics, which has already been shown to be effective for solving some combinatorial optimization problems. The basic image segmentation algorithm is variable-shape-block segmentation (VB), which searches for an optimal segmentation state by moving the vertices of quadrangular regions. However, since the algorithm for moving vertices is based upon steepest-descent dynamics, this segmentation method suffers from a local minimum problem: the algorithm gets stuck at undesirable local minima. In order to treat this problem of the VB and improve its performance, we introduce chaotic neurodynamics for optimization. The results of our novel method are compared with those of conventional stochastic dynamics for escaping from undesirable local minima. As a result, better results are obtained with the chaotic neurodynamical image segmentation.
This paper presents an overview of research activities in Japan in the field of very low bit-rate video coding. Related research based on the concept of "intelligent image coding" started in the mid-1980s. Although this concept originated from consideration of a new type of image coding, it can also be applied to other interesting applications such as human interfaces and psychology. On the other hand, since the beginning of the 1990s, research on the improvement of waveform coding has been actively pursued to realize very low bit-rate video coding. Key techniques employed here are improved motion compensation and the adoption of region segmentation. In addition to the above, we propose new concepts of image coding which have the potential to open up new aspects of image coding, e.g., ideas of interactive image coding, integrated 3-D visual communication, and coding of multimedia information considering the mutual relationships among various media.