IEICE TRANSACTIONS on Information

  • Impact Factor

    0.59

  • Eigenfactor

    0.002

  • article influence

    0.1

  • Cite Score

    1.4

Volume E83-D No.7  (Publication Date:2000/07/25)

    Special Issue on Machine Vision Applications
  • FOREWORD

    Katsushi IKEUCHI  

     
    FOREWORD

      Page(s):
    1329-1330
  • A New Image Sensor with Space Variant Sampling Control on a Focal Plane

    Yasuhiro OHTSUKA  Takayuki HAMAMOTO  Kiyoharu AIZAWA  

     
    PAPER

      Page(s):
    1331-1337

    We propose a new sampling control system for an image sensor array. Unlike random-access pixel sensors, the proposed sensor can read out spatially variant sampled pixels at high speed without requiring a pixel address for each access. The sensor has a memory array that stores the sampling positions, and these positions can be changed dynamically by rewriting that memory. It can therefore realize any spatially varying sampling pattern. A prototype of 64 × 64 pixels was fabricated in a 0.7 µm CMOS process.

  • Hybrid Defect Detection Method Based on the Shape Measurement and Feature Extraction for Complex Patterns

    Hilario Haruomi KOBAYASHI  Yasuhiko HARA  Hideaki DOI  Kazuo TAKAI  Akiyoshi SUMIYA  

     
    PAPER

      Page(s):
    1338-1345

    The visual inspection of printed circuit boards (PCBs) at the final production stage is necessary for quality assurance, and the requirements for an automated inspection system are very high. However, consistent inspection of patterns on these PCBs is very difficult due to pattern complexity. Most previously developed techniques are not sensitive enough to detect defects in complex patterns. To solve this problem, we propose a new optical system that discriminates the pattern types existing on a PCB, such as copper, solder resist and silk-screen printing. We have also developed a hybrid defect detection technique to inspect the discriminated patterns. This technique is based on shape measurement and feature extraction methods. We used the proposed techniques in an actual automated inspection system, realizing real-time transactions with a combination of hardware equipped with image processing LSIs and PC software. Evaluation with this inspection system shows a 100% defect detection rate and a fairly low false alarm rate (0.06%). The present paper describes the inspection algorithm and briefly explains the automated inspection system.

  • Automatic Evaluation of the Appearance of Seam Puckers on Suits

    Tsunehiro AIBARA  Takehiro MABUCHI  Masanori IZUMIDA  

     
    PAPER

      Page(s):
    1346-1352

    This paper deals with the fundamental problem of automatically assessing the appearance of seam puckers on suits, and suggests possibilities for practical usage. Presently, evaluations are done by inspectors who compare standard photographs of suits to test samples. To avoid human error, however, a method of automatic evaluation is desired. We treat the problem as one of pattern recognition, using fractal dimensions as the feature. The fractal dimensions obtained from the standard photographs are used as template patterns. To make the fractal dimensions easier to calculate, we plot a curve representing the appearance of the seam puckers and calculate the fractal dimensions of that curve. The seam puckers in gray-scale images are confused with the material's texture, so the seam puckers must be enhanced for a precise evaluation. By using the concept of variance, we select the images with seam puckers and enhance only those images. This is the novel aspect of this work. Twenty suits are used for the evaluation experiment, and the result is almost the same as the evaluation obtained by inspection. That is, the evaluation of 11 samples is the same as that obtained by inspection, the results of 8 samples differ by 1 grade, and the evaluation of 1 sample differs by 2 grades. The results are also compared to the evaluation of a system using the Daubechies wavelet feature. The comparison shows that the present method gives a better evaluation than the system using the Daubechies wavelet.

  • A Machine Vision Approach to Seam Sensing for High-Speed Robotic Sealing

    Kenichi ARAKAWA  Takao KAKIZAKI  Shinji OMYO  

     
    PAPER

      Page(s):
    1353-1357

    In industrial assembly lines, seam sealing is a painting process used for making watertight seals or for preventing rusting. In the process, sealant is painted on seams located at the joints of pressed metal parts. We developed a sealing robot system that adjusts the sealing gun motion adaptively to the seam position sensed by a range sensor (a scanning laser rangefinder which senses profile range data). In this paper, we propose a high-speed and highly reliable algorithm for seam position computation from the sensed profile range data around the seam. It is proved experimentally that the sealing robot system used with the developed algorithm is very effective, especially for reducing wasted sealant.

  • Real Time Visual Servoing around a Complex Object

    Francois BERRY  Philippe MARTINET  Jean GALLICE  

     
    PAPER

      Page(s):
    1358-1368

    In visual servoing, most studies are concerned with robotic applications involving known objects. In this paper, the problem of controlling a motion by visual servoing around an unknown object is addressed. In this case, the approach is interpreted as an initial step towards a perception goal of an unmodeled object. The main goal is to perform motion with regard to the object in order to discover several viewpoints of the object. An adaptive visual servoing scheme is proposed to perform such a task. The originality of our work lies in the choice and extraction of visual features in accordance with the motions to be performed. The notion of an invariant feature is introduced to control the navigational task around the unknown object. In the experiments, a Cartesian robot connected to a real-time vision system is used. A CCD camera is mounted on the end effector of the robot. The experimental results present a linkage of desired motions around different kinds of objects.

  • Optimal Homography Computation with a Reliability Measure

    Kenichi KANATANI  Naoya OHTA  Yasushi KANAZAWA  

     
    PAPER

      Page(s):
    1369-1374

    We describe a theoretically optimal algorithm for computing the homography between two images. First, we derive a theoretical accuracy bound based on a mathematical model of image noise and do simulation to confirm that our renormalization technique effectively attains that bound. Then, we apply our technique to mosaicing of images with small overlaps. By using real images, we show how our algorithm reduces the instability of the image mapping.
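
    As a reminder of what a homography does once it has been estimated, the sketch below maps an image point through a 3 × 3 matrix in homogeneous coordinates. It does not implement the renormalization-based estimation described in the abstract; the function name apply_homography and the sample matrices are illustrative assumptions.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography H (nested lists, row-major)."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the third homogeneous coordinate to return to pixel coordinates.
    return (u / w, v / w)


# Hypothetical matrices: a pure translation, and the identity scaled by 2.
translation = [[1, 0, 2], [0, 1, 3], [0, 0, 1]]
scaled_identity = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]

print(apply_homography(translation, (1, 1)))
print(apply_homography(scaled_identity, (5, 7)))
```

    The second matrix leaves every point unchanged, which illustrates that a homography is only defined up to a nonzero scale factor.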

  • A Multiple View Approach for Auto-Calibration of a Rotating and Zooming Camera

    Yongduek SEO  Min-Ho AHN  Ki-Sang HONG  

     
    PAPER

      Page(s):
    1375-1385

    In this paper we deal with the problem of calibrating a rotating and zooming camera, without a 3D pattern, whose internal calibration parameters change frame by frame. First, we theoretically show the existence of the calibration parameters up to an orthogonal transformation under the assumption that the skew of the camera is zero. Auto-calibration becomes possible by analyzing inter-image homographies, which can be obtained from the matches in images of the same scene or through direct nonlinear iteration. In general, at least four homographies are needed for auto-calibration. When we further assume that the aspect ratio is known and the principal point is fixed during the sequence, one homography yields the camera parameters; when the aspect ratio is unknown and the principal point is fixed, two homographies are enough. In the case of a fixed principal point, we suggest a method for obtaining the calibration parameters by searching the space of the principal point. If this is not the case, then nonlinear iteration is applied. The algorithm is implemented and validated on several sets of synthetic data. Experimental results for real images are also given.

  • Randomized Adaptive Algorithms for Mosaicing Systems

    Frank NIELSEN  

     
    PAPER

      Page(s):
    1386-1394

    Given a set of still images taken from a hand-held camera, we present a fast method for mosaicing them into a single blended picture. We design time- and memory-efficient still image mosaicing algorithms based on geometric point feature matchings that can handle both arbitrary rotations and large zoom factors. We discuss extensions of the methodology to related problems such as recovering the epipolar geometry for 3D reconstruction and object recognition tasks.

  • Facet Matching from an Uncalibrated Pair of Images

    Lukas THEILER  Houda CHABBI  

     
    PAPER

      Page(s):
    1395-1399

    Since planar invariants are known to be more easily obtained than others for recognition tasks, decomposing a scene into planar parts becomes very interesting. This paper presents a new approach to finding the projections of planar surfaces in a pair of images. For this task we introduce the facet concept, defined by linked edges (chains) and corners. We use collineations as projective information to match the facets and verify their planarity. Our contribution consists in obtaining, from an uncalibrated stereo pair of images, a match of "planar" chains based on matched corners. Collineations are constrained by the fundamental matrix information, and a Kalman filter approach is used to refine their computation.

  • Generation of Missing Medical Slices Using Morphing Technology

    Hasnine HAQUE  Aboul-Ella HASSANIEN  Masayuki NAKAJIMA  

     
    PAPER

      Page(s):
    1400-1407

    When the inter-slice spacing of tomographic image slices is large, it is necessary to estimate the locations and intensities of the pixels that would appear in the non-existent intermediate slices. This paper presents a new method for generating the missing medical slices from two given slices. It uses the contours of organs as the control parameters for the intensity information in the physical gaps between sequential medical slices. The Snake model is used to generate the control points required by the elastic body spline (EBS) morphing algorithm. Contour information derived from this segmentation pre-process is then further processed and used as control parameters to warp the corresponding regions in both input slices into compatible shapes. In this way, the intensity information of the interpolated intermediate slices can be derived more faithfully. In comparison with existing intensity interpolation methods, including linear interpolation, which only consider corresponding points in a small physical neighborhood, this method warps the data images into similar shapes according to contour information to provide a more meaningful correspondence relationship.
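
    For contrast with the contour-driven morphing described above, the following sketch shows the plain linear intensity interpolation that the abstract names as an existing baseline. It is not the proposed EBS method; the function name interpolate_slice, the list-of-lists slice format and the sample values are assumptions made for illustration.

```python
def interpolate_slice(slice_a, slice_b, t):
    """Pixel-wise linear blend of two equally sized 2D slices.

    t = 0 returns slice_a, t = 1 returns slice_b, and 0 < t < 1 gives an
    intermediate slice. No shape correspondence is used, which is the
    limitation the abstract points out for this kind of baseline.
    """
    same_rows = len(slice_a) == len(slice_b)
    same_cols = all(len(ra) == len(rb) for ra, rb in zip(slice_a, slice_b))
    if not (same_rows and same_cols):
        raise ValueError("slices must have the same shape")
    return [
        [(1.0 - t) * a + t * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(slice_a, slice_b)
    ]


# Hypothetical 2 x 2 intensity slices.
lower = [[0, 100], [50, 200]]
upper = [[100, 100], [150, 0]]
print(interpolate_slice(lower, upper, 0.5))
```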

  • 3D Fundus Shape Reconstruction and Display from Stereo Fundus Images

    Koichiro DEGUCHI  Daisuke KAWAMATA  Kanae MIZUTANI  Hidekata HONTANI  Kiwa WAKABAYASHI  

     
    PAPER

      Page(s):
    1408-1414

    A new method is developed to recover and display the 3D fundus pattern on the inner bottom surface of the eyeball from a stereo fundus image pair. For fundus stereo images, a simple stereo technique does not work, because the fundus is observed through the eye lens and a contact wide-angle enlarging lens. In this method, utilizing the fact that the fundus forms part of a sphere, we identify the optical parameters of these lenses and correct the skews of the lines of sight. Then, we obtain 3D images of the fundus by back-projecting the stereo images.

  • 3D Reconstruction of Skin Surface from Image Sequence

    Takeshi YAMADA  Hideo SAITO  Shinji OZAWA  

     
    PAPER

      Page(s):
    1415-1421

    This paper proposes a new method for reconstructing the shape of a skin surface replica from a shaded image sequence taken with different light source directions. Since the shaded images include shadows caused by surface height fluctuation, as well as specular and inter-reflections, the conventional photometric stereo method is not suitable for reconstructing the surface accurately. In the proposed method, we choose measured intensities that do not include specular reflections, inter-reflections or self-shadows, so that we can calculate an accurate normal vector from the selected intensities using the SVD (Singular Value Decomposition) method. The experimental results from real images demonstrate that the proposed method is effective for shape reconstruction from shaded images that include specular reflections, inter-reflections and self-shadows.

  • High Speed 3D Reconstruction by Spatio-Temporal Division of Video Image Processing

    Yoshinari KAMEDA  Takeo TAODA  Michihiko MINOH  

     
    PAPER

      Page(s):
    1422-1428

    A high-speed 3D shape reconstruction method with multiple video cameras and multiple computers on a LAN is presented. The video cameras are set to surround the real 3D space where people are present. The reconstructed 3D space is displayed in voxel format, and users can see the space from any viewpoint with a VR viewer. We implemented a prototype system that performs the 3D reconstruction at 10.55 fps with a delay of 313 ms.

  • Reconstruction of Textured Urban 3D Model by Fusing Ground-Based Laser Range and CCD Images

    Huijing ZHAO  Ryosuke SHIBASAKI  

     
    PAPER

      Page(s):
    1429-1440

    In this paper, a method of fusing ground-based laser range images and CCD images for the reconstruction of textured 3D urban objects is proposed. An acquisition system is developed to capture laser range images and CCD images simultaneously from the same platform. A registration method is developed that uses both laser range and CCD images in a coarse-to-fine process. Laser range images are registered under an assumption on the sensor's setup, which aims at robustly detecting an initial configuration between the sensor coordinate systems of two views. CCD images are matched to refine the accuracy of the initial transformation, which might be degraded by improper sensor setup or unreliable feature extraction, or limited by the low spatial resolution of the laser range image. A textured 3D model is generated using planar faces for vertical walls and triangular cells for the ground surface, trees and bushes. Through an outdoor experiment of reconstructing a building using six views of laser range and CCD images, it is demonstrated that textured 3D models of urban objects can be generated in an automated manner.

  • Modeling of Urban Scenes by Aerial Photographs and Simply Reconstructed Buildings

    Katsuyuki KAMEI  Wayne HOY  Takashi TAMADA  Kazuo SEO  

     
    PAPER

      Page(s):
    1441-1449

    In many fields such as city administration and facilities management, there are an increasing number of requests for a Geographic Information System (GIS) that provides users with automated mapping functions. A mechanism which displays 3D views of an urban scene is particularly required because it would allow the construction of an intuitive and understandable environment for managing objects in the scene. In this paper, we present a new urban modeling system utilizing both image-based and geometry-based approaches. Our method is based on a new concept in which a wide urban area can be displayed with natural photo-realistic images, and each object drawn in the view can be identified by pointing to it. First, to generate natural urban views from any viewpoint, we employ an image-based rendering method, Image Walkthrough, and modify it to handle aerial images. This method can interpolate and generate natural views by assembling several source photographs. Next, to identify each object in the scene, we recover its shape using computer vision techniques (a geometry-based approach). The rough shape of each building is reconstructed from various aerial images, and then its drawn position on the generated view is also determined. This means that it becomes possible to identify each building from an urban view. We have combined both of these approaches, yielding a new style of urban information management. The users of the system can enjoy an intuitive understanding of the area and easily identify their target, by generating natural views from any viewpoint and suitably reconstructing the shapes of objects. We have made a prototype system of this new concept of GIS, which has shown the validity of our method.

  • Extracting Object Information from Aerial Images: A Map-Based Approach

    Yukio OGAWA  Kazuaki IWAMURA  Shigeru KAKUMOTO  

     
    PAPER

      Page(s):
    1450-1457

    We have developed a map-based approach that enables us to efficiently extract information about man-made objects, such as buildings, from aerial images. An image is matched with a corresponding map in order to estimate the object information in the image (i.e., presence, location, shape, size, kind, and surroundings). This approach is characterized by using a figure contained in a map as an object model for a top-down (model-driven) analysis of an object in the aerial image. We determined the principal steps of the map-based approach needed to extract object information and update a map. These steps were then applied to obtain the locations of missing buildings and the heights of existing buildings. The extraction results of experiments using aerial images of Kobe City (taken after the 1995 earthquake) show that the approach is effective for automatically extracting building information from aerial images and for rapidly updating map data.

  • Image Sequence Retrieval for Forecasting Weather Radar Echo Pattern

    Kazuhiro OTSUKA  Tsutomu HORIKOSHI  Haruhiko KOJIMA  Satoshi SUZUKI  

     
    PAPER

      Page(s):
    1458-1465

    A novel method is proposed to retrieve image sequences with the goal of forecasting complex and time-varying natural patterns. To that end, we introduce a framework called Memory-Based Forecasting; it provides forecast information based on the temporal development of past retrieved sequences. This paper targets the radar echo patterns in weather radar images, and aims to realize an image retrieval method that supports weather forecasters in predicting local precipitation. To characterize the radar echo patterns, an appearance-based representation of the echo pattern, and its velocity field are employed. Temporal texture features are introduced to represent local pattern features including non-rigid complex motion. Furthermore, the temporal development of a sequence is represented as paths in eigenspaces of the image features, and a normalized distance between two sequences in the eigenspace is proposed as a dissimilarity measure that is used in retrieving similar sequences. Several experiments confirm the good performance of the proposed retrieval scheme, and indicate the predictability of the image sequence.

  • Illumination Invariant Face Recognition Using Photometric Stereo

    Seok Cheol KEE  Kyoung Mu LEE  Sang Uk LEE  

     
    PAPER

      Page(s):
    1466-1474

    In this paper, we propose an elegant approach to illumination-invariant face recognition based on the photometric stereo technique. The basic idea is to reconstruct the surface normal and the albedo of a face using photometric stereo images, and then use them as an illumination-independent model of the face. We have also investigated the optimal light source directions for accurate surface shape reconstruction, and a robust estimation technique for the illumination direction of an input face image. We have tested the proposed algorithm with 125 real face images of 25 persons, taken under 5 quite different illumination conditions, and achieved a success rate of more than 80%. The proposed method is also compared with conventional face recognition methods. These results demonstrate that the proposed technique has great potential for robust face recognition even when the lighting conditions change severely.
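
    The recovery of surface normal and albedo mentioned above can be illustrated with the textbook three-light Lambertian formulation, in which each measured intensity equals the albedo times the dot product of the light direction and the normal. The sketch below is a minimal version under that assumption and is not claimed to be the authors' algorithm; the helper names det3, solve3 and normal_and_albedo and the light directions are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(matrix, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(matrix)
    if abs(d) < 1e-12:
        raise ValueError("light directions must be linearly independent")
    solution = []
    for col in range(3):
        # Replace one column of the matrix with the right-hand side.
        replaced = [[rhs[r] if c == col else matrix[r][c] for c in range(3)]
                    for r in range(3)]
        solution.append(det3(replaced) / d)
    return solution


def normal_and_albedo(lights, intensities):
    """Recover (unit normal, albedo) at one pixel from three intensities.

    Assumes a Lambertian surface with no shadows or highlights, so that
    intensities[i] = albedo * dot(lights[i], normal).
    """
    g = solve3(lights, intensities)  # g = albedo * normal
    albedo = math.sqrt(sum(c * c for c in g))
    if albedo == 0:
        raise ValueError("zero intensity under all lights")
    return [c / albedo for c in g], albedo


# Hypothetical unit light directions and a synthetic pixel facing the camera.
lights = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]
true_normal, true_albedo = [0.0, 0.0, 1.0], 0.5
intensities = [true_albedo * sum(l * n for l, n in zip(light, true_normal))
               for light in lights]
print(normal_and_albedo(lights, intensities))
```

    With more than three images, the same linear model is usually solved in the least-squares sense instead of by direct inversion.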

  • An Approach to Vehicle Recognition Using Supervised Learning

    Takeo KATO  Yoshiki NINOMIYA  

     
    PAPER

      Page(s):
    1475-1479

    To enhance safety and traffic efficiency, driver assistance systems and autonomous vehicle systems are being developed. A method for recognizing the preceding vehicle is important for developing such systems. In this paper, a vision-based preceding vehicle recognition method based on supervised learning from sample images is proposed. An improvement to the Modified Quadratic Discriminant Function (MQDF) classifier used in the proposed method is also shown. Many studies have been reported on road environment recognition, including preceding vehicle recognition. However, those studies have rarely included a quantitative evaluation with a large number of images. In this paper, by contrast, over 1,000 sample images of passenger vehicles, recorded on a highway during daytime, are used for the evaluation. The evaluation result shows that the performance in the low-order case is improved over the ordinary MQDF. Accordingly, the calculation time is reduced by more than 20% by using the proposed method. The feasibility of the proposed method is also demonstrated by a classification rate of over 98%.

  • Regular Section
  • Optimal k-Bounded Placement of Resources in Distributed Computing Systems

    Jong-Hoon KIM  Cheol-Hoon LEE  

     
    PAPER-Theory/Models of Computation

      Page(s):
    1480-1487

    We consider the problem of placing resources in a distributed computing system so that certain performance requirements may be met while minimizing the number of resource copies needed. Resources include special I/O processors, expensive peripheral devices, or such software modules as compilers, library routines, and data files. Due to the delay in accessing each of these resources, system performance degrades as the distance between each processor and its nearest resource copy increases. Thus, every processor must be within a given distance k ≥ 1 of at least one resource copy, which is called the k-bounded placement problem. The structure of a distributed computing system is represented by a graph. The k-bounded placement problem is first transformed into the problem of finding smallest k-dominating sets in a graph. Searching for smallest k-dominating sets is formulated as a state-space search problem. We derive heuristic information to speed up the search, which is then used to solve the problem with the well-known A* algorithm. An illustrative example and some experimental results are presented to demonstrate the effectiveness of the heuristic search.
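
    To make the k-dominating set condition concrete, the sketch below checks whether a placement puts every node within distance k of some resource copy, and finds a smallest such placement by exhaustive search. The exhaustive search is a deliberately simple stand-in, not the heuristic A* search of the paper; all function names and the sample graph are assumptions.

```python
from collections import deque
from itertools import combinations


def within_k(graph, sources, k):
    """Nodes within distance k of any source (multi-source BFS).

    graph is an adjacency dict {node: [neighbours]} of an undirected graph.
    """
    dist = {s: 0 for s in sources}
    queue = deque(dist)
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue  # do not expand beyond the distance bound
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def is_k_dominating(graph, placement, k):
    """True if every node is within distance k of a node in placement."""
    return within_k(graph, placement, k) == set(graph)


def smallest_k_dominating_set(graph, k):
    """Brute-force search over placements of increasing size."""
    nodes = sorted(graph)
    for size in range(1, len(nodes) + 1):
        for candidate in combinations(nodes, size):
            if is_k_dominating(graph, candidate, k):
                return set(candidate)
    return None


# Hypothetical 6-processor chain: 0 - 1 - 2 - 3 - 4 - 5.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(smallest_k_dominating_set(chain, 1))
```

    The brute-force loop grows exponentially with the number of nodes, which is the cost a heuristic state-space search is meant to avoid.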

  • BPL: A Language for Parallel Algorithms on the Butterfly Network

    Fattaneh TAGHIYAREH  Hiroshi NAGAHASHI  

     
    PAPER-Algorithms

      Page(s):
    1488-1496

    A number of parallel algorithms have been developed to solve large-scale real-world problems. Although there has been much work on the design of parallel algorithms, there has been little on the design of languages for expressing these algorithms. This paper describes BPL, a new parallel language designed for butterfly networks. The purpose of this language is to help designers by hiding the complexity of the algorithm and leaving the details of the mapping between data and processors to a lower level. BPL provides a simpler virtual machine for the designer, who need not think about the control of processors and data. From another point of view, BPL helps the designer to logically check the algorithm and correct any possible errors in it. The paper gives some examples implemented in this language. In addition, we have implemented a software tool which simulates the running of the algorithm on the network. The results lead us to believe that this language would be useful in representing all kinds of algorithms on this network, including normal algorithms and others.

  • Scheduling DAGs on Message Passing m-Processor Systems

    Sanjeev BASKIYAR  

     
    PAPER-Computer Systems

      Page(s):
    1497-1507

    Scheduling directed acyclic task graphs (DAGs) onto multiprocessors is known to be an intractable problem. Although there have been several heuristic algorithms for scheduling DAGs onto multiprocessors, few address the mapping onto a given number of completely connected processors with the objective of minimizing the finish time. We present an efficient algorithm called ClusterMerge to statically schedule directed acyclic task graphs onto a homogeneous, completely connected MIMD system with a given number of processors. The algorithm clusters tasks in a DAG using a longest path heuristic and then iteratively merges these clusters to give a number of clusters identical to the number of available processors. Each of these clusters is then scheduled on a separate processor. Using simulations, we demonstrate that ClusterMerge schedules task graphs with the same or lower execution times than those of other researchers, but using fewer processors. We also discuss pitfalls in the various approaches to defining the longest path in a directed acyclic task graph.
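
    The remark about defining the longest path can be illustrated with a small sketch: the same task graph has a different longest path depending on whether communication costs on the edges are counted. This is only a generic longest-path computation in topological order, not the ClusterMerge algorithm; the function name, the cost model and the sample graph are assumptions.

```python
from collections import deque


def longest_path_length(tasks, edges):
    """Length of the longest path in a task DAG.

    tasks: {task: computation cost}
    edges: {(u, v): communication cost} meaning u must finish before v.
    Path length = sum of task costs plus edge costs along the path.
    """
    successors = {t: [] for t in tasks}
    indegree = {t: 0 for t in tasks}
    for (u, v), cost in edges.items():
        successors[u].append((v, cost))
        indegree[v] += 1

    # finish[t]: longest path ending at t, including t's own cost.
    finish = dict(tasks)
    queue = deque(t for t in tasks if indegree[t] == 0)
    visited = 0
    while queue:  # Kahn's algorithm: process tasks in topological order
        u = queue.popleft()
        visited += 1
        for v, cost in successors[u]:
            finish[v] = max(finish[v], finish[u] + cost + tasks[v])
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if visited != len(tasks):
        raise ValueError("graph contains a cycle")
    return max(finish.values())


# Hypothetical four-task graph with two routes from a to d.
tasks = {"a": 2, "b": 3, "c": 1, "d": 4}
edges = {("a", "b"): 1, ("a", "c"): 5, ("b", "d"): 1, ("c", "d"): 1}
no_comm = {edge: 0 for edge in edges}

print(longest_path_length(tasks, edges))    # counts communication costs
print(longest_path_length(tasks, no_comm))  # computation costs only
```

    In this example the route through c is longest when communication is counted, while the route through b is longest when it is ignored, so the two definitions would drive a clustering heuristic differently.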

  • A Discrete Gompertz Equation and a Software Reliability Growth Model

    Daisuke SATOH  

     
    PAPER-Software Engineering

      Page(s):
    1508-1513

    I describe a software reliability growth model that yields accurate parameter estimates even with a small amount of input data. The model is based on a proposed discrete analog of a Gompertz equation that has an exact solution. The difference equation tends to a differential equation on which the Gompertz curve model is defined, when the time interval tends to zero. The exact solution also tends to the exact solution of the differential equation when the time interval tends to zero. The discrete model conserves the characteristics of the Gompertz model because the difference equation has an exact solution. Therefore, the proposed model provides accurate parameter estimates, making it possible to predict in the early test phase when software can be released.
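
    For orientation, the continuous Gompertz curve and the differential equation it satisfies can be written as below. The symbols k, a and b are conventional notation assumed here rather than taken from the abstract, and the paper's discrete analog is not reproduced.

```latex
\[
  G(t) = k\,a^{b^{t}}, \qquad k > 0,\quad 0 < a < 1,\quad 0 < b < 1
\]
\[
  \frac{dG}{dt} = (\ln b)\,G\,\ln\frac{G}{k}
\]
```

    Both logarithms on the right-hand side are negative for 0 < G < k, so the curve increases monotonically toward its saturation level k.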

  • ICU/COWS: A Distributed Transactional Workflow System Supporting Multiple Workflow Types

    Dongsoo HAN  Jaeyong SHIM  Chansu YU  

     
    PAPER-Databases

      Page(s):
    1514-1525

    In this paper, we describe a distributed transactional workflow system named ICU/COWS, which supports the multiple workflow types of large-scale enterprises. The system aims to support the whole workflow of a large-scale enterprise effectively within a single workflow system, and it is designed to satisfy several design goals such as availability, scalability, and reliability. Transactional tasks and special tasks such as alternative tasks and compensating tasks are developed and utilized to achieve the design goals at the task model level, and the system is constructed with distributed transactional objects to achieve the design goals in a distributed system environment. Structured ad hoc workflow is defined as a special type of ad hoc workflow that should be automated by a workflow management system because many benefits can be obtained by automating it, and a connector facility is proposed as a means to support structured ad hoc workflow effectively. Some characteristics of a workflow system can be identified by monitoring the system behavior under different conditions such as workloads or system configurations. An early version of the system has been implemented, and performance data for the system are presented.

  • New Constructions for Nondominated k-Coteries

    Eun Hye CHOI  Tatsuhiro TSUCHIYA  Tohru KIKUNO  

     
    PAPER-Fault Tolerance

      Page(s):
    1526-1532

    The k-mutual exclusion problem is the problem of guaranteeing that no more than k computing nodes enter a critical section simultaneously. The use of a k-coterie, which is a special set of node groups, is known as a robust approach to this problem. In general, k-coteries are classified as either dominated or nondominated, and a mutual exclusion mechanism has maximal availability when it employs a nondominated k-coterie. In this paper, we propose two new schemes called VOT and D-VOT for constructing nondominated k-coteries. We conduct a comparative evaluation of the proposed schemes and well-known previous schemes. The results clearly show the superiority of the proposed schemes.

  • Seismic Events Discrimination Using a New FLVQ Clustering Model

    Payam NASSERY  Karim FAEZ  

     
    PAPER-Pattern Recognition

      Page(s):
    1533-1539

    In this paper, the LVQ (Learning Vector Quantization) model and its variants are used as clustering tools to discriminate natural seismic events (earthquakes) from artificial ones (nuclear explosions). The study is based on six spectral features of the P-wave spectra computed from short-period teleseismic recordings. The conventional LVQ proposed by Kohonen and the Fuzzy LVQ (FLVQ) models proposed by Sakuraba and Bezdek are all tested on a set of 26 earthquakes and 24 nuclear explosions using the leave-one-out testing strategy. The preliminary experimental results have shown that the shapes, the number and the overlaps of the clusters play an important role in seismic classification. The results also showed how an improper partitioning of the feature space would strongly weaken both the clustering and recognition phases. To improve the numerical results, a new combined FLVQ algorithm is employed in this paper. The algorithm is composed of two nested sub-algorithms. The inner sub-algorithm tries to generate a well-defined fuzzy partitioning with the fuzzy reference vectors in the feature space. To achieve this goal, a cost function is defined as a function of the number, the shapes and the overlaps of the fuzzy reference vectors. The update rule tries to minimize this cost function in a stepwise learning algorithm. The outer sub-algorithm, on the other hand, tries to find an optimum value for the number of clusters in each step. For this optimization in the outer loop, we have used two different criteria. In the first criterion, the newly defined "fuzzy entropy" is used, while in the second criterion, a performance index is employed by generalizing the Huntsberger formula for the learning rate, using the concept of fuzzy distance. The experimental results of the new model show a promising improvement in the error rate, an acceptable convergence time, and more flexibility in boundary decision making.
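
    As background for the conventional LVQ mentioned above, the sketch below shows one update step of the basic LVQ1 rule: the nearest reference vector is moved toward a training sample when their class labels agree and away from it otherwise. It does not implement the fuzzy variants or the combined FLVQ algorithm of the paper; the function name lvq1_step, the two-dimensional features and the learning rate are illustrative assumptions.

```python
def lvq1_step(prototypes, labels, x, y, lr):
    """Apply one LVQ1 update in place and return the winning index.

    prototypes: list of reference vectors (lists of floats)
    labels:     class label of each reference vector
    x, y:       training sample and its class label
    lr:         learning rate, 0 < lr < 1
    """
    def squared_distance(p):
        return sum((pi - xi) ** 2 for pi, xi in zip(p, x))

    winner = min(range(len(prototypes)),
                 key=lambda i: squared_distance(prototypes[i]))
    # Attract the winner if the labels match, repel it otherwise.
    sign = 1.0 if labels[winner] == y else -1.0
    prototypes[winner] = [pi + sign * lr * (xi - pi)
                          for pi, xi in zip(prototypes[winner], x)]
    return winner


# Hypothetical 2D feature space with one reference vector per class.
prototypes = [[0.0, 0.0], [4.0, 4.0]]
labels = ["earthquake", "explosion"]

lvq1_step(prototypes, labels, [1.0, 1.0], "earthquake", 0.5)  # attracts vector 0
lvq1_step(prototypes, labels, [3.0, 3.0], "earthquake", 0.5)  # repels vector 1
print(prototypes)
```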

  • Spectral Peak-Weighted Liftering of Cepstral Coefficients for Speech Recognition

    Hong Kook KIM  Hwang Soo LEE  

     
    PAPER-Speech and Hearing

      Page(s):
    1540-1549

    In this paper, we propose a peak-weighted cepstral lifter (PWL) for enhancing the spectral peaks of an all-pole model spectrum in the cepstral domain. The design parameter of the PWL is the degree of pole enhancement, or pole shifting toward the unit circle. The optimal pole shifting factor is chosen by considering the sensitivity to spectral resonance peaks, the variability of cepstral variances, and the recognition accuracy. Next, we generalize the PWL so that the optimal shifting factor is determined adaptively on a frame-by-frame basis. Compared with other cepstral lifters, a speech recognizer employing the frame-adaptive PWL provides better recognition performance.

  • Statistical Modelling of Speech Segment Duration by Constrained Tree Regression

    Naoto IWAHASHI  Yoshinori SAGISAKA  

     
    PAPER-Speech and Hearing

      Page(s):
    1550-1559

    This paper presents a new method for statistical modelling of prosody control in speech synthesis. The proposed method, referred to as Constrained Tree Regression (CTR), can suitably represent the complex effects of prosody control factors with a moderate amount of learning data. It is based on recursive splits of predictor variable spaces and partial imposition of constraints of linear independence among predictor variables. It incorporates both linear and tree regressions with categorical predictor variables, which have been conventionally used for prosody control, and extends them to more general models. In addition, a hierarchical error function is presented to account for the hierarchical structure in prosody control. This new method is applied to the modelling of speech segmental duration. Experimental results show that the proposed regression method yields better duration models than linear and tree regressions using the same number of free parameters. It is also shown that the hierarchical structure of phoneme and syllable durations can be represented efficiently using the hierarchical error function.

  • Training Method for Pattern Classifier Based on the Performance after Adaptation

    Naoto IWAHASHI  

     
    PAPER-Speech and Hearing

      Page(s):
    1560-1566

    This paper describes a method for training a pattern classifier that will perform well after it has been adapted to changes in input conditions. Considering the adaptation methods which are based on the transformation of classifier parameters, we formulate the problem of optimizing classifiers, and propose a method for training them. In the proposed training method, the classifier is trained while the adaptation is being carried out. The objective function for the training is given based on the recognition performance obtained by the adapted classifier. The utility of the proposed training method is demonstrated by experiments in a five-class Japanese vowel pattern recognition task with speaker adaptation.

  • Epipolar Constraint from 2D Affine Lines, and Its Application in Face Image Rendering

    Kuntal SENGUPTA  Jun OHYA  

     
    PAPER-Image Processing, Image Pattern Recognition

      Page(s):
    1567-1573

    This paper has two parts. In the first part, we note that under the paraperspective camera projection model, the set of 2D images produced by a 3D point can be optimally represented by two lines in the affine space (α-β space). The slopes of these two lines are the same, and we observe that this constraint is exactly the epipolar line constraint. Using this constraint, the equation of the epipolar line can be derived. In the second part, we use the "same slope" property of the lines in the α-β space to derive the affine structure of the human face. The input to the algorithm is not limited to an image sequence of a human head under rigid motion. It can be snapshots of the human face taken by the same or different cameras, over different periods of time. Since the depth variation of the human face is not very large, we use the paraperspective camera projection model. Using this property, we reformulate the (human) face structure reconstruction problem in terms of the more familiar multiple baseline stereo matching problem. Apart from the face modeling aspect, we also show how we use the results for reprojecting human faces in identification tasks.

  • Vanishing Point and Vanishing Line Estimation with Line Clustering

    Akihiro MINAGAWA  Norio TAGAWA  Tadashi MORIYA  Toshiyuki GOTOH  

     
    PAPER-Image Processing, Image Pattern Recognition

      Page(s):
    1574-1582

    In conventional methods for detecting vanishing points and vanishing lines, the observed feature points are clustered into collections that represent different lines. The multiple lines are then detected and the vanishing points are detected as points of intersection of the lines. The vanishing line is then detected based on the points of intersection. However, for the purpose of optimization, these processes should be integrated and carried out simultaneously. In the present paper, we assume that the observed noise model for the feature points is a two-dimensional Gaussian mixture and define the likelihood function, including obvious vanishing points and vanishing line parameters. As a result, the simultaneous detection described above can be formulated as a maximum likelihood estimation problem. In addition, an iterative computation method for achieving this estimation is proposed based on the EM (Expectation Maximization) algorithm. The proposed method involves new techniques by which stable convergence is achieved and computational cost is reduced. The effectiveness of the proposed method, including these techniques, is confirmed by computer simulations and real images.

  • Real-Time Tracking of Multiple Moving Object Contours in a Moving Camera Image Sequence

    Shoichi ARAKI  Takashi MATSUOKA  Naokazu YOKOYA  Haruo TAKEMURA  

     
    PAPER-Image Processing, Image Pattern Recognition

      Page(s):
    1583-1591

    This paper describes a new method for detecting and tracking moving objects in a moving camera image sequence using robust estimation and active contour models. We assume that the apparent background motion between two consecutive image frames can be approximated by an affine transformation. In order to register the static background, we estimate the affine transformation parameters using the LMedS (Least Median of Squares) method, which is a robust estimator. Split-and-merge contour models are employed for tracking multiple moving objects. The image energy of the contour models is defined on the image obtained by subtracting the previous frame, transformed with the estimated affine parameters, from the current frame. We have implemented the method on an image processing system consisting of DSP boards for real-time tracking of moving objects in a moving camera image sequence.
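
    The background registration step can be sketched as follows, under stated assumptions. The abstract names LMedS but gives no sampling details, so the random three-point sampling, the trial count, the helper names, and the synthetic point correspondences below are all illustrative choices, not the authors' implementation. Each trial fits an affine transformation to three correspondences and scores it by the median squared residual over all points, so points on moving objects do not bias the estimate as long as the background accounts for the majority.

```python
import random
from statistics import median


def solve3(m, v):
    """Solve the 3x3 system m * u = v by Cramer's rule; None if singular."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    d = det(m)
    if abs(d) < 1e-9:
        return None
    solution = []
    for k in range(3):
        mk = [row[:] for row in m]
        for r in range(3):
            mk[r][k] = v[r]
        solution.append(det(mk) / d)
    return tuple(solution)


def fit_affine(src, dst):
    """Exact affine fit from three correspondences: returns ((a, b, c), (d, e, f))."""
    m = [[x, y, 1.0] for x, y in src]
    row_x = solve3(m, [q[0] for q in dst])
    row_y = solve3(m, [q[1] for q in dst])
    return None if row_x is None or row_y is None else (row_x, row_y)


def apply_affine(params, pt):
    (a, b, c), (d, e, f) = params
    return a * pt[0] + b * pt[1] + c, d * pt[0] + e * pt[1] + f


def lmeds_affine(src, dst, trials=200, seed=0):
    """Keep the three-point fit with the smallest median squared residual."""
    rng = random.Random(seed)
    best, best_med = None, float("inf")
    for _ in range(trials):
        idx = rng.sample(range(len(src)), 3)
        params = fit_affine([src[i] for i in idx], [dst[i] for i in idx])
        if params is None:  # collinear sample, skip it
            continue
        residuals = []
        for p, q in zip(src, dst):
            u, v = apply_affine(params, p)
            residuals.append((u - q[0]) ** 2 + (v - q[1]) ** 2)
        med = median(residuals)
        if med < best_med:
            best, best_med = params, med
    return best, best_med


# Hypothetical data: 20 background points under a known affine motion,
# plus 6 points on an independently moving object (outliers).
true_params = ((1.0, 0.02, 3.0), (-0.02, 1.0, -2.0))
background = [(float(x), float(y)) for x in range(0, 50, 10) for y in range(0, 40, 10)]
moving = [(12.0, 7.0), (13.0, 8.0), (14.0, 7.0), (12.0, 9.0), (13.0, 6.0), (14.0, 9.0)]
src = background + moving
dst = [apply_affine(true_params, p) for p in background] + [(x + 8.0, y - 5.0) for x, y in moving]
estimate, median_residual = lmeds_affine(src, dst)
```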

  • Fast Stereo Matching Using Constraints in Discrete Space

    Hong JEONG  Yuns OH  

     
    PAPER-Image Processing, Image Pattern Recognition

      Page(s):
    1592-1600

    We present a new basis for the discrete representation of stereo correspondence. This center-referenced basis permits a more natural, complete, and concise representation of constraints in stereo matching. In this context, a MAP formulation for disparity estimation is derived and reduced to unconstrained minimization of an energy function. Incorporating natural constraints, the problem is simplified to a shortest path problem in a sparsely connected trellis structure, which is solved by an efficient dynamic programming algorithm. The computational complexity is the same as that of the best of the other dynamic programming methods, but a very high degree of concurrency is possible in the algorithm, making it suitable for implementation with parallel processors. Experimental results confirm the performance of this method, and matching errors are found to degrade gracefully in exponential form with respect to noise.
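
    To make the shortest-path formulation concrete, the sketch below runs a generic dynamic program over a disparity trellis for a single scanline. It is a conventional left-referenced formulation, not the center-referenced basis introduced in the paper. The absolute-difference data cost, the linear smoothness penalty, the function name, and the toy intensity rows are assumptions made for this example.

```python
def scanline_disparity(left, right, max_disp, smooth=1.0):
    """Minimum-cost disparity path for one scanline.

    Pixel x in the left row is matched to pixel x - d in the right row.
    Energy = sum of |left[x] - right[x - d]| + smooth * |d - d_previous|.
    """
    inf = float("inf")
    disparities = range(max_disp + 1)

    def data_cost(x, d):
        # Matches that fall outside the right row are forbidden.
        return abs(left[x] - right[x - d]) if x - d >= 0 else inf

    cost = [data_cost(0, d) for d in disparities]
    back = []  # back[x - 1][d] is the best predecessor disparity for (x, d)
    for x in range(1, len(left)):
        new_cost, pointers = [], []
        for d in disparities:
            prev = min(disparities, key=lambda p: cost[p] + smooth * abs(d - p))
            new_cost.append(cost[prev] + smooth * abs(d - prev) + data_cost(x, d))
            pointers.append(prev)
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest path back from the last pixel.
    d = min(disparities, key=lambda k: cost[k])
    path = [d]
    for pointers in reversed(back):
        d = pointers[d]
        path.append(d)
    path.reverse()
    return path


# Hypothetical rows: the left row is the right row shifted by two pixels.
right_row = [10, 50, 20, 80, 30, 90, 40, 70, 15, 60, 25, 85]
left_row = [0, 0] + right_row[:-2]
disparity_path = scanline_disparity(left_row, right_row, max_disp=3)
```

    Each column of the trellis depends only on the previous one, and the per-disparity minimizations within a column are independent of each other, which hints at why such formulations lend themselves to parallel hardware.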

  • An Approach to Estimating the Motion Parameters for a Linear Motion Blurred Image

    Yung-Sheng CHEN  I-San CHOA  

     
    LETTER-Image Processing, Image Pattern Recognition

      Page(s):
    1601-1603

    Identification of motion parameters is an important issue in the restoration of images degraded by linear motion blur. Based on the properties of human visual-motion sensing, an integrated approach combining some known image processing techniques is proposed for estimating the direction and extent of motion in a linear motion blurred image. Experimental results confirm the feasibility of our approach.