Hiroaki AKUTSU Ko ARAI
Lanxi LIU Pengpeng YANG Suwen DU Sani M. ABDULLAHI
Xiaoguang TU Zhi HE Gui FU Jianhua LIU Mian ZHONG Chao ZHOU Xia LEI Juhang YIN Yi HUANG Yu WANG
Yingying LU Cheng LU Yuan ZONG Feng ZHOU Chuangao TANG
Jialong LI Takuto YAMAUCHI Takanori HIRANO Jinyu CAI Kenji TEI
Wei LEI Yue ZHANG Hanfeng XIE Zebin CHEN Zengping CHEN Weixing LI
David CLARINO Naoya ASADA Atsushi MATSUO Shigeru YAMASHITA
Takashi YOKOTA Kanemitsu OOTSU
Xiaokang Jin Benben Huang Hao Sheng Yao Wu
Tomoki MIYAMOTO
Ken WATANABE Katsuhide FUJITA
Masashi UNOKI Kai LI Anuwat CHAIWONGYEN Quoc-Huy NGUYEN Khalid ZAMAN
Takaharu TSUBOYAMA Ryota TAKAHASHI Motoi IWATA Koichi KISE
Chi ZHANG Li TAO Toshihiko YAMASAKI
Ann Jelyn TIEMPO Yong-Jin JEONG
Haruhisa KATO Yoshitaka KIDANI Kei KAWAMURA
Jiakun LI Jiajian LI Yanjun SHI Hui LIAN Haifan WU
Gyuyeong KIM
Hyun KWON Jun LEE
Fan LI Enze YANG Chao LI Shuoyan LIU Haodong WANG
Guangjin Ouyang Yong Guo Yu Lu Fang He
Yuyao LIU Qingyong LI Shi BAO Wen WANG
Cong PANG Ye NI Jia Ming CHENG Lin ZHOU Li ZHAO
Nikolay FEDOROV Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Yukasa MURAKAMI Yuta YAMASAKI Masateru TSUNODA Akito MONDEN Amjed TAHIR Kwabena Ebo BENNIN Koji TODA Keitaro NAKASAI
Kazuya KAKIZAKI Kazuto FUKUCHI Jun SAKUMA
Yitong WANG Htoo Htoo Sandi KYAW Kunihiro FUJIYOSHI Keiichi KANEKO
Waqas NAWAZ Muhammad UZAIR Kifayat ULLAH KHAN Iram FATIMA
Haeyoung Lee
Ji XI Pengxu JIANG Yue XIE Wei JIANG Hao DING
Weiwei JING Zhonghua LI
Sena LEE Chaeyoung KIM Hoorin PARK
Akira ITO Yoshiaki TAKAHASHI
Rindo NAKANISHI Yoshiaki TAKATA Hiroyuki SEKI
Chuzo IWAMOTO Ryo TAKAISHI
Chih-Ping Wang Duen-Ren Liu
Yuya TAKADA Rikuto MOCHIDA Miya NAKAJIMA Syun-suke KADOYA Daisuke SANO Tsuyoshi KATO
Yi Huo Yun Ge
Rikuto MOCHIDA Miya NAKAJIMA Haruki ONO Takahiro ANDO Tsuyoshi KATO
Koichi FUJII Tomomi MATSUI
Yaotong SONG Zhipeng LIU Zhiming ZHANG Jun TANG Zhenyu LEI Shangce GAO
Souhei TAKAGI Takuya KOJIMA Hideharu AMANO Morihiro KUGA Masahiro IIDA
Jun ZHOU Masaaki KONDO
Tetsuya MANABE Wataru UNUMA
Kazuyuki AMANO
Takumi SHIOTA Tonan KAMATA Ryuhei UEHARA
Hitoshi MURAKAMI Yutaro YAMAGUCHI
Jingjing Liu Chuanyang Liu Yiquan Wu Zuo Sun
Zhenglong YANG Weihao DENG Guozhong WANG Tao FAN Yixi LUO
Yoshiaki TAKATA Akira ONISHI Ryoma SENDA Hiroyuki SEKI
Dinesh DAULTANI Masayuki TANAKA Masatoshi OKUTOMI Kazuki ENDO
Kento KIMURA Tomohiro HARAMIISHI Kazuyuki AMANO Shin-ichi NAKANO
Ryotaro MITSUBOSHI Kohei HATANO Eiji TAKIMOTO
Genta INOUE Daiki OKONOGI Satoru JIMBO Thiem Van CHU Masato MOTOMURA Kazushi KAWAMURA
Hikaru USAMI Yusuke KAMEDA
Yinan YANG
Takumi INABA Takatsugu ONO Koji INOUE Satoshi KAWAKAMI
Fengshan ZHAO Qin LIU Takeshi IKENAGA
Naohito MATSUMOTO Kazuhiro KURITA Masashi KIYOMI
Tomohiro KOBAYASHI Tomomi MATSUI
Shin-ichi NAKANO
Ming PAN
Yasuhiro OHTSUKA Takayuki HAMAMOTO Kiyoharu AIZAWA
We propose a new sampling control system for image sensor arrays. In contrast to random-access pixels, the proposed sensor can read out spatially variant sampled pixels at high speed without requiring a pixel address for each access. The sensor has a memory array that stores the sampling positions, and the positions can be changed dynamically by rewriting this memory. It can achieve any spatially varying sampling pattern. A prototype of 64
Hilario Haruomi KOBAYASHI Yasuhiko HARA Hideaki DOI Kazuo TAKAI Akiyoshi SUMIYA
The visual inspection of printed circuit boards (PCBs) at the final production stage is necessary for quality assurance, and the requirements for an automated inspection system are very high. However, consistent inspection of patterns on these PCBs is very difficult due to pattern complexity. Most of the previously developed techniques are not sensitive enough to detect defects in complex patterns. To solve this problem, we propose a new optical system that discriminates the pattern types existing on a PCB, such as copper, solder resist and silk-screen printing. We have also developed a hybrid defect detection technique to inspect the discriminated patterns. This technique is based on shape measurement and feature extraction methods. We used the proposed techniques in an actual automated inspection system, realizing real-time processing with a combination of hardware equipped with image processing LSIs and PC software. Evaluation with this inspection system shows a 100% defect detection rate and a fairly low false alarm rate (0.06%). The present paper describes the inspection algorithm and briefly explains the automated inspection system.
Tsunehiro AIBARA Takehiro MABUCHI Masanori IZUMIDA
This paper deals with the fundamental problem of automatically assessing the appearance of seam puckers on suits, and suggests possibilities for practical use. Presently, evaluations are done by inspectors who compare test samples with standard photographs of suits. To avoid human error, however, a method of automatic evaluation is desired. We treat the problem as one of pattern recognition, using fractal dimensions as features. The fractal dimensions obtained from standard photographs are used as template patterns. To make it easier to calculate fractal dimensions, we plot a curve representing the appearance of seam puckers, from which the fractal dimensions of the curve can be calculated. The seam puckers in gray-scale images are easily confused with the material's texture, so the seam puckers must be enhanced for a precise evaluation. By using the concept of variance, we select images with seam puckers and enhance only those images. This is the novel aspect of this work. Twenty suits are used for the evaluation experiment, and we obtain results almost the same as those obtained by human inspection: the evaluation of 11 samples is the same as that obtained by inspection, the results of 8 samples differ by 1 grade, and the evaluation of 1 sample differs by 2 grades. The results are also compared with those of a system using the Daubechies wavelet feature. The comparison shows that the present method gives a better evaluation than the system using the Daubechies wavelet.
Kenichi ARAKAWA Takao KAKIZAKI Shinji OMYO
In industrial assembly lines, seam sealing is a painting process used for making watertight seals or for preventing rusting. In the process, sealant is painted on seams located at the joints of pressed metal parts. We developed a sealing robot system that adjusts the sealing gun motion adaptively to the seam position sensed by a range sensor (a scanning laser rangefinder which senses profile range data). In this paper, we propose a high-speed and highly reliable algorithm for seam position computation from the sensed profile range data around the seam. It is proved experimentally that the sealing robot system used with the developed algorithm is very effective, especially for reducing wasted sealant.
Francois BERRY Philippe MARTINET Jean GALLICE
In visual servoing, most studies are concerned with robotic applications involving known objects. In this paper, the problem of controlling a motion by visual servoing around an unknown object is addressed. In this case, the approach is interpreted as an initial step towards the perception of an unmodeled object. The main goal is to perform motion with regard to the object in order to discover several viewpoints of the object. An adaptive visual servoing scheme is proposed to perform such a task. The originality of our work lies in the choice and extraction of visual features in accordance with the motions to be performed. The notion of an invariant feature is introduced to control the navigational task around the unknown object. In the experiments, a Cartesian robot connected to a real-time vision system is used. A CCD camera is mounted on the end effector of the robot. The experimental results present a sequence of desired motions around different kinds of objects.
Kenichi KANATANI Naoya OHTA Yasushi KANAZAWA
We describe a theoretically optimal algorithm for computing the homography between two images. First, we derive a theoretical accuracy bound based on a mathematical model of image noise and do simulation to confirm that our renormalization technique effectively attains that bound. Then, we apply our technique to mosaicing of images with small overlaps. By using real images, we show how our algorithm reduces the instability of the image mapping.
Yongduek SEO Min-Ho AHN Ki-Sang HONG
In this paper we deal with the problem of calibrating a rotating and zooming camera, without 3D pattern, whose internal calibration parameters change frame by frame. First, we theoretically show the existence of the calibration parameters up to an orthogonal transformation under the assumption that the skew of the camera is zero. Auto-calibration becomes possible by analyzing inter-image homographies which can be obtained from the matches in images of the same scene, or through direct nonlinear iteration. In general, at least four homographies are needed for auto-calibration. When we further assume that the aspect ratio is known and the principal point is fixed during the sequence then one homography yields camera parameters, and when the aspect ratio is assumed to be unknown with fixed principal point then two homographies are enough. In the case of a fixed principal point, we suggest a method for obtaining the calibration parameters by searching the space of the principal point. If this is not the case, then nonlinear iteration is applied. The algorithm is implemented and validated on several sets of synthetic data. Also experimental results for real images are given.
Given a set of still images taken from a hand-held camera, we present a fast method for mosaicing them into a single blended picture. We design time- and memory-efficient still image mosaicing algorithms based on geometric point feature matching that can handle both arbitrary rotations and large zoom factors. We discuss extensions of the methodology to related problems such as recovering the epipolar geometry for 3D reconstruction and object recognition tasks.
Since planar invariants are known to be more easily obtained than others for recognition tasks, decomposing a scene into planar parts becomes very interesting. This paper presents a new approach to finding the projections of planar surfaces in a pair of images. For this task we introduce the facet concept, defined by linked edges (chains) and corners. We use collineations as projective information to match facets and verify their planarity. Our contribution consists in obtaining, from an uncalibrated stereo pair of images, a match of "planar" chains based on matched corners. Collineations are constrained by the fundamental matrix information, and a Kalman filter approach is used to refine their computation.
Hasnine HAQUE Aboul-Ella HASSANIEN Masayuki NAKAJIMA
When the inter-slice resolution of tomographic image slices is large, it is necessary to estimate the locations and intensities of the pixels that would appear in the non-existent intermediate slices. This paper presents a new method for generating the missing medical slices from two given slices. It uses the contours of organs as the control parameters for the intensity information in the physical gaps between sequential medical slices. The Snake model is used for generating the control points required for the elastic body spline (EBS) morphing algorithm. Contour information derived from this segmentation pre-process is then further processed and used as control parameters to warp the corresponding regions in both input slices into compatible shapes. In this way, the intensity information of the interpolated intermediate slices can be derived more faithfully. In comparison with existing intensity interpolation methods, including linear interpolation, which consider only corresponding points in a small physical neighborhood, this method warps the data images into similar shapes according to contour information to provide a more meaningful correspondence relationship.
Koichiro DEGUCHI Daisuke KAWAMATA Kanae MIZUTANI Hidekata HONTANI Kiwa WAKABAYASHI
A new method to recover and display the 3D fundus pattern on the inner bottom surface of the eyeball from a stereo fundus image pair is developed. For fundus stereo images, a simple stereo technique does not work, because the fundus is observed through the eye lens and a contact wide-angle enlarging lens. In this method, utilizing the fact that the fundus forms part of a sphere, we identify their optical parameters and correct the skews of the lines of sight. Then, we obtain 3D images of the fundus by back-projecting the stereo images.
Takeshi YAMADA Hideo SAITO Shinji OZAWA
This paper proposes a new method for reconstructing the shape of a skin surface replica from a sequence of shaded images taken with different light source directions. Since the shaded images include shadows caused by surface height fluctuation, as well as specular reflections and inter-reflections, the conventional photometric stereo method is not suitable for reconstructing the surface accurately. In the proposed method, we select measured intensities that do not include specular reflections, inter-reflections or self-shadows, so that we can calculate an accurate normal vector from the selected intensities using the SVD (Singular Value Decomposition) method. The experimental results from real images demonstrate that the proposed method is effective for shape reconstruction from shaded images that include specular reflections, inter-reflections and self-shadows.
Yoshinari KAMEDA Takeo TAODA Michihiko MINOH
A high-speed 3D shape reconstruction method using multiple video cameras and multiple computers on a LAN is presented. The video cameras are set to surround the real 3D space where people exist. The reconstructed 3D space is displayed in voxel format, and users can see the space from any viewpoint with a VR viewer. We implemented a prototype system that performs the 3D reconstruction at 10.55 fps with a delay of 313 ms.
Huijing ZHAO Ryosuke SHIBASAKI
In this paper, a method of fusing ground-based laser range images and CCD images for the reconstruction of textured 3D urban objects is proposed. An acquisition system is developed to capture laser range images and CCD images simultaneously from the same platform. A registration method is developed using both laser range and CCD images in a coarse-to-fine process. Laser range images are registered with an assumption on the sensor's setup, which aims at robustly detecting an initial configuration between the sensor coordinate systems of two views. CCD images are matched to refine the accuracy of the initial transformation, which might be degraded by improper sensor setup or unreliable feature extraction, or limited by the low spatial resolution of the laser range image. A textured 3D model is generated using planar faces for vertical walls and triangular cells for the ground surface, trees and bushes. Through an outdoor experiment of reconstructing a building using six views of laser range and CCD images, it is demonstrated that textured 3D models of urban objects can be generated in an automated manner.
Katsuyuki KAMEI Wayne HOY Takashi TAMADA Kazuo SEO
In many fields such as city administration and facilities management, there are an increasing number of requests for a Geographic Information System (GIS) that provides users with automated mapping functions. A mechanism that displays 3D views of an urban scene is particularly required because it would allow the construction of an intuitive and understandable environment for managing objects in the scene. In this paper, we present a new urban modeling system utilizing both image-based and geometry-based approaches. Our method is based on a new concept in which a wide urban area can be displayed with natural photo-realistic images, and each object drawn in the view can be identified by pointing to it. First, to generate natural urban views from any viewpoint, we employ an image-based rendering method, Image Walkthrough, and modify it to handle aerial images. This method can interpolate and generate natural views by assembling several source photographs. Next, to identify each object in the scene, we recover its shape using computer vision techniques (a geometry-based approach). The rough shape of each building is reconstructed from various aerial images, and its drawn position in the generated view is then determined. This makes it possible to identify each building in an urban view. We have combined both of these approaches, yielding a new style of urban information management. The users of the system can gain an intuitive understanding of the area and easily identify their target, by generating natural views from any viewpoint and suitably reconstructing the shapes of objects. We have built a prototype system based on this new GIS concept, which has shown the validity of our method.
Yukio OGAWA Kazuaki IWAMURA Shigeru KAKUMOTO
We have developed a map-based approach that enables us to efficiently extract information about man-made objects, such as buildings, from aerial images. An image is matched with a corresponding map in order to estimate the object information in the image (i.e., presence, location, shape, size, kind, and surroundings). This approach is characterized by using a figure contained in a map as an object model for a top-down (model-driven) analysis of an object in the aerial image. We determined the principal steps of the map-based approach needed to extract object information and update a map. These steps were then applied to obtain the locations of missing buildings and the heights of existing buildings. The extraction results of experiments using aerial images of Kobe City (taken after the 1995 earthquake) show that the approach is effective for automatically extracting building information from aerial images and for rapidly updating map data.
Kazuhiro OTSUKA Tsutomu HORIKOSHI Haruhiko KOJIMA Satoshi SUZUKI
A novel method is proposed to retrieve image sequences with the goal of forecasting complex and time-varying natural patterns. To that end, we introduce a framework called Memory-Based Forecasting; it provides forecast information based on the temporal development of past retrieved sequences. This paper targets the radar echo patterns in weather radar images, and aims to realize an image retrieval method that supports weather forecasters in predicting local precipitation. To characterize the radar echo patterns, an appearance-based representation of the echo pattern, and its velocity field are employed. Temporal texture features are introduced to represent local pattern features including non-rigid complex motion. Furthermore, the temporal development of a sequence is represented as paths in eigenspaces of the image features, and a normalized distance between two sequences in the eigenspace is proposed as a dissimilarity measure that is used in retrieving similar sequences. Several experiments confirm the good performance of the proposed retrieval scheme, and indicate the predictability of the image sequence.
Seok Cheol KEE Kyoung Mu LEE Sang Uk LEE
In this paper, we propose an approach to illumination-invariant face recognition based on the photometric stereo technique. The basic idea is to reconstruct the surface normal and the albedo of a face using photometric stereo images, and then use them as an illumination-independent model of the face. We have also investigated the optimal light source directions for accurate surface shape reconstruction, and a robust technique for estimating the illumination direction of an input face image. We have tested the proposed algorithm with 125 real face images of 25 persons taken under 5 quite different illumination conditions, and achieved a success rate of more than 80%. The proposed method is also compared with conventional face recognition methods. These results demonstrate that the proposed technique has great potential for robust face recognition even when the lighting conditions change severely.
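As a rough illustration of the photometric stereo idea mentioned above, the following sketch recovers the albedo and surface normal at a single pixel from three intensity measurements, assuming a Lambertian surface and three known, linearly independent light directions. The function names, the three-light restriction and the Lambertian model are assumptions made for this example; the abstract does not state the exact formulation used in the paper.

```python
import math


def solve3(A, b):
    """Solve the 3x3 linear system A x = b by Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det(A)
    if abs(d) < 1e-12:
        raise ValueError("light directions must be linearly independent")
    x = []
    for col in range(3):
        M = [row[:] for row in A]      # copy A, then swap in b as column `col`
        for r in range(3):
            M[r][col] = b[r]
        x.append(det(M) / d)
    return x


def photometric_stereo_pixel(lights, intensities):
    """Return (albedo, unit normal) for one pixel.

    Assumed Lambertian model: intensity_k = albedo * dot(light_k, normal),
    where each row of `lights` is a unit light direction.
    """
    g = solve3(lights, intensities)    # g = albedo * normal
    albedo = math.sqrt(sum(v * v for v in g))
    if albedo == 0.0:
        return 0.0, [0.0, 0.0, 0.0]    # completely dark pixel, normal undefined
    return albedo, [v / albedo for v in g]
```

The recovered pair (albedo, normal) does not depend on which lights were used, which is what makes it usable as an illumination-independent description of the surface.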
To enhance safety and traffic efficiency, driver assistance systems and autonomous vehicle systems are being developed. A preceding vehicle recognition method is important for developing such systems. In this paper, a vision-based preceding vehicle recognition method based on supervised learning from sample images is proposed. An improvement to the Modified Quadratic Discriminant Function (MQDF) classifier used in the proposed method is also shown. Many studies have been reported on road environment recognition, including preceding vehicle recognition. However, those studies have rarely included a quantitative evaluation with a large number of images. In this paper, by contrast, over 1,000 sample images of passenger vehicles, recorded on a highway during daytime, are used for evaluation. The evaluation result shows that the performance in a low order case is improved over the ordinary MQDF. Accordingly, the calculation time is reduced by more than 20% by using the proposed method. The feasibility of the proposed method is also demonstrated by a classification rate of over 98%.
We consider the problem of placing resources in a distributed computing system so that certain performance requirements may be met while minimizing the number of resource copies needed. Resources include special I/O processors, expensive peripheral devices, or such software modules as compilers, library routines, and data files. Due to the delay in accessing each of these resources, system performance degrades as the distance between each processor and its nearest resource copy increases. Thus, every processor must be within a given distance k
Fattaneh TAGHIYAREH Hiroshi NAGAHASHI
A number of parallel algorithms have been developed to solve large-scale real-world problems. Although there has been much work on the design of parallel algorithms, there has been little on the design of languages for expressing these algorithms. This paper describes BPL, a new parallel language designed for butterfly networks. The purpose of this language is to help designers by hiding the complexity of the algorithm and leaving the details of the mapping between data and processors to a lower level. BPL provides a simpler virtual machine for the designer, so that the designer need not think about the control of processors and data. From another point of view, BPL helps the designer to logically check the algorithm and correct any possible errors in it. The paper gives some examples implemented in this language. In addition, we have implemented a software tool that simulates the running of the algorithm on the network. The results lead us to believe that this language would be useful in representing all kinds of algorithms on this network, including normal algorithms and others.
Scheduling directed acyclic task graphs (DAGs) onto multiprocessors is known to be an intractable problem. Although there have been several heuristic algorithms for scheduling DAGs onto multiprocessors, few address the mapping onto a given number of completely connected processors with the objective of minimizing the finish time. We present an efficient algorithm called ClusterMerge to statically schedule directed acyclic task graphs onto a homogeneous, completely connected MIMD system with a given number of processors. The algorithm clusters tasks in a DAG using a longest path heuristic and then iteratively merges these clusters until the number of clusters equals the number of available processors. Each of these clusters is then scheduled on a separate processor. Using simulations, we demonstrate that ClusterMerge schedules task graphs yielding the same or lower execution times than those of other researchers, but using fewer processors. We also discuss pitfalls in the various approaches to defining the longest path in a directed acyclic task graph.
I describe a software reliability growth model that yields accurate parameter estimates even with a small amount of input data. The model is based on a proposed discrete analog of a Gompertz equation that has an exact solution. The difference equation tends to a differential equation on which the Gompertz curve model is defined, when the time interval tends to zero. The exact solution also tends to the exact solution of the differential equation when the time interval tends to zero. The discrete model conserves the characteristics of the Gompertz model because the difference equation has an exact solution. Therefore, the proposed model provides accurate parameter estimates, making it possible to predict in the early test phase when software can be released.
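To make the idea of a difference equation with an exact solution concrete, here is a minimal sketch for the standard Gompertz curve G(t) = k * a^(b^t). Taking logarithms gives ln G(t) - ln k = b^t ln a, so stepping by a time interval delta satisfies y[n+1] = k^(1-B) * y[n]^B with B = b^delta, and the iterates lie exactly on the continuous curve. This recurrence is an illustration derived from the closed form; it is not necessarily the discrete analog proposed in the paper, and all names below are chosen for the example.

```python
import math


def gompertz(t, k, a, b):
    """Continuous Gompertz curve G(t) = k * a**(b**t), with 0 < a < 1 and 0 < b < 1."""
    return k * a ** (b ** t)


def gompertz_steps(k, a, b, delta, n_steps):
    """Iterate y[n+1] = k**(1 - B) * y[n]**B with B = b**delta, starting at y[0] = G(0).

    Because ln y[n] - ln k shrinks by the factor B at every step, y[n]
    equals G(n * delta) exactly (up to floating-point rounding).
    """
    B = b ** delta
    y = k * a                      # G(0) = k * a**(b**0) = k * a
    values = [y]
    for _ in range(n_steps):
        y = k ** (1.0 - B) * y ** B
        values.append(y)
    return values


if __name__ == "__main__":
    k, a, b, delta = 100.0, 0.05, 0.8, 0.5
    for n, y in enumerate(gompertz_steps(k, a, b, delta, 5)):
        # Each iterate should agree with the closed-form curve at t = n * delta.
        print(n, y, math.isclose(y, gompertz(n * delta, k, a, b), rel_tol=1e-9))
```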
Dongsoo HAN Jaeyong SHIM Chansu YU
In this paper, we describe a distributed transactional workflow system named ICU/COWS, which supports multiple workflow types of large scale enterprises. The system aims to support the whole workflow for large scale enterprises effectively within a single workflow system and the system is designed to satisfy several design goals such as availability, scalability, and reliability. Transactional task and special tasks such as alternative task and compensating task are developed and utilized to achieve the design goals in task model level and the system is constructed with distributed transactional objects to achieve the design goals in distributed system environment. In this paper, structured ad hoc workflow is defined as a special type of ad hoc workflow that should be automated by workflow management system because many benefits can be obtained by automating it and connector facility is proposed as a means to support structured ad hoc workflow effectively. Some characteristics of a workflow system can be identified by monitoring the system behavior on different conditions like workloads or system configurations. An early version of the system has been implemented and the performance data of the system is illustrated.
Eun Hye CHOI Tatsuhiro TSUCHIYA Tohru KIKUNO
The k-mutual exclusion problem is the problem of guaranteeing that no more than k computing nodes enter a critical section simultaneously. The use of a k-coterie, which is a special set of node groups, is known as a robust approach to this problem. In general, k-coteries are classified as either dominated or nondominated, and a mutual exclusion mechanism has maximal availability when it employs a nondominated k-coterie. In this paper, we propose two new schemes called VOT and D-VOT for constructing nondominated k-coteries. We conduct a comparative evaluation of the proposed schemes and well-known previous schemes. The results clearly show the superiority of the proposed schemes.
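The safety side of a quorum-based solution can be sketched as follows. If every node must collect permission from all members of some quorum before entering the critical section, then at most k nodes can be inside simultaneously as long as no k+1 quorums are pairwise disjoint. The brute-force check below tests only this intersection property; it does not test non-domination, and it is unrelated to the VOT and D-VOT constructions, whose details are not given in the abstract. Function names are hypothetical.

```python
from itertools import combinations


def max_disjoint_quorums(quorums):
    """Largest number of pairwise disjoint quorums, found by brute force."""
    qs = [frozenset(q) for q in quorums]
    best = 0
    for r in range(1, len(qs) + 1):
        found = any(
            all(a.isdisjoint(b) for a, b in combinations(group, 2))
            for group in combinations(qs, r)
        )
        if not found:
            break                  # no r disjoint quorums implies no r + 1 either
        best = r
    return best


def allows_at_most_k(quorums, k):
    """True if no k + 1 quorums are pairwise disjoint (the safety condition)."""
    return max_disjoint_quorums(quorums) <= k


if __name__ == "__main__":
    # Four nodes, every pair of nodes forms a quorum.
    pairs = list(combinations(range(1, 5), 2))
    print(max_disjoint_quorums(pairs))   # two disjoint pairs fit in four nodes
    print(allows_at_most_k(pairs, 2))
    print(allows_at_most_k(pairs, 1))
```

The search is exponential in the number of quorums, so it is only meant for checking small hand-built examples.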
In this paper, the LVQ (Learning Vector Quantization) model and its variants are regarded as clustering tools to discriminate natural seismic events (earthquakes) from artificial ones (nuclear explosions). The study is based on six spectral features of the P-wave spectra computed from short-period teleseismic recordings. The conventional LVQ proposed by Kohonen and the Fuzzy LVQ (FLVQ) models proposed by Sakuraba and Bezdek are all tested on a set of 26 earthquakes and 24 nuclear explosions using the leave-one-out testing strategy. The primary experimental results have shown that the shapes, the number and the overlaps of the clusters play an important role in seismic classification. The results also showed how an improper feature space partitioning would strongly weaken both the clustering and recognition phases. To improve the numerical results, a new combined FLVQ algorithm is employed in this paper. The algorithm is composed of two nested sub-algorithms. The inner sub-algorithm tries to generate a well-defined fuzzy partitioning with the fuzzy reference vectors in the feature space. To achieve this goal, a cost function is defined as a function of the number, the shapes and the overlaps of the fuzzy reference vectors. The update rule tries to minimize this cost function in a stepwise learning algorithm. The outer sub-algorithm, on the other hand, tries to find an optimum value for the number of clusters in each step. For this optimization in the outer loop, we have used two different criteria. In the first criterion, the newly defined "fuzzy entropy" is used, while in the second criterion, a performance index is employed by generalizing the Huntsberger formula for the learning rate, using the concept of fuzzy distance. The experimental results of the new model show a promising improvement in the error rate, an acceptable convergence time, and more flexibility in boundary decision making.
In this paper, we propose a peak-weighted cepstral lifter (PWL) for enhancing the spectral peaks of an all-pole model spectrum in the cepstral domain. The design parameter of the PWL is the degree of pole enhancement, or pole shifting toward the unit circle. The optimal pole shifting factor is chosen by considering the sensitivity to spectral resonance peaks, the variability of cepstral variances, and the recognition accuracy. Next, we generalize the PWL so that the optimal shifting factor is adaptively determined on a frame-by-frame basis. Compared with other cepstral lifters, a speech recognizer employing the frame-adaptive PWL provides better recognition performance.
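One standard relationship that connects pole shifting to cepstral liftering can be sketched as follows. For a stable all-pole model with poles p_i, the cepstral coefficients are c_n = (1/n) * sum_i p_i^n for n >= 1, so multiplying every pole by a factor gamma multiplies c_n by gamma^n. A lifter with weights gamma^n and gamma > 1 therefore moves the poles toward the unit circle and sharpens the spectral peaks. Whether the PWL in the paper has exactly this form is not stated in the abstract; the function names and the example values are assumptions for illustration.

```python
import cmath


def cepstrum_from_poles(poles, n_coeffs):
    """Cepstral coefficients c_1..c_N of the all-pole model 1 / prod(1 - p_i z^-1).

    Uses c_n = (1/n) * sum_i p_i**n, valid when every |p_i| < 1.
    Poles are expected in complex-conjugate pairs, so the sum is real.
    """
    return [sum(p ** n for p in poles).real / n for n in range(1, n_coeffs + 1)]


def exponential_lifter(cepstrum, gamma):
    """Weight c_n by gamma**n, which equals scaling every pole radius by gamma."""
    return [gamma ** n * c for n, c in enumerate(cepstrum, start=1)]


if __name__ == "__main__":
    p = cmath.rect(0.8, 0.7)               # pole with radius 0.8 and angle 0.7 rad
    poles = [p, p.conjugate()]
    gamma = 1.1                            # shifted radius 0.88 stays inside the unit circle
    liftered = exponential_lifter(cepstrum_from_poles(poles, 8), gamma)
    shifted = cepstrum_from_poles([gamma * q for q in poles], 8)
    # The two sequences should agree up to rounding.
    print(max(abs(x - y) for x, y in zip(liftered, shifted)))
```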
Naoto IWAHASHI Yoshinori SAGISAKA
This paper presents a new method for statistical modelling of prosody control in speech synthesis. The proposed method, which is referred to as Constrained Tree Regression (CTR), can make suitable representation of complex effects of control factors for prosody with a moderate amount of learning data. It is based on recursive splits of predictor variable spaces and partial imposition of constraints of linear independence among predictor variables. It incorporates both linear and tree regressions with categorical predictor variables, which have been conventionally used for prosody control, and extends them to more general models. In addition, a hierarchical error function is presented to consider hierarchical structure in prosody control. This new method is applied to modelling of speech segmental duration. Experimental results show that better duration models are obtained by using the proposed regression method compared with linear and tree regressions using the same number of free parameters. It is also shown that the hierarchical structure of phoneme and syllable durations can be represented efficiently using the hierarchical error function.
This paper describes a method for training a pattern classifier that will perform well after it has been adapted to changes in input conditions. Considering the adaptation methods which are based on the transformation of classifier parameters, we formulate the problem of optimizing classifiers, and propose a method for training them. In the proposed training method, the classifier is trained while the adaptation is being carried out. The objective function for the training is given based on the recognition performance obtained by the adapted classifier. The utility of the proposed training method is demonstrated by experiments in a five-class Japanese vowel pattern recognition task with speaker adaptation.
This paper has two parts. In the first part, we note that under the paraperspective camera projection model, the set of 2D images produced by a 3D point can be optimally represented by two lines in the affine space (α-β space). The slopes of these two lines are the same, and we observe that this constraint is exactly the same as the epipolar line constraint. Using this constraint, the equation of the epipolar line can be derived. In the second part, we use the "same slope" property of the lines in the α-β space to derive the affine structure of the human face. The input to the algorithm is not limited to an image sequence of a human head under rigid motion. It can be snapshots of the human face taken by the same or different cameras, over different periods of time. Since the depth variation of the human face is not very large, we use the paraperspective camera projection model. Using this property, we reformulate the (human) face structure reconstruction problem in terms of the more familiar multiple-baseline stereo matching problem. Apart from the face modeling aspect, we also show how we use the results for reprojecting human faces in identification tasks.
Akihiro MINAGAWA Norio TAGAWA Tadashi MORIYA Toshiyuki GOTOH
In conventional methods for detecting vanishing points and vanishing lines, the observed feature points are first clustered into collections that represent different lines. The multiple lines are then detected, and the vanishing points are detected as points of intersection of the lines. The vanishing line is then detected based on these points of intersection. However, for the purpose of optimization, these processes should be integrated and carried out simultaneously. In the present paper, we assume that the noise model for the observed feature points is a two-dimensional Gaussian mixture and define a likelihood function that explicitly includes the parameters of the vanishing points and the vanishing line. As a result, the simultaneous detection described above can be formulated as a maximum likelihood estimation problem. In addition, an iterative computation method for this estimation is proposed based on the EM (Expectation Maximization) algorithm. The proposed method involves new techniques by which stable convergence is achieved and computational cost is reduced. The effectiveness of the proposed method, including these techniques, is confirmed by computer simulations and real images.
Shoichi ARAKI Takashi MATSUOKA Naokazu YOKOYA Haruo TAKEMURA
This paper describes a new method for detecting and tracking moving objects in an image sequence taken by a moving camera, using robust estimation and active contour models. We assume that the apparent background motion between two consecutive image frames can be approximated by an affine transformation. In order to register the static background, we estimate the affine transformation parameters using the LMedS (Least Median of Squares) method, which is a robust estimator. Split-and-merge contour models are employed for tracking multiple moving objects. The image energy of the contour models is defined on the image obtained by subtracting the previous frame, transformed with the estimated affine parameters, from the current frame. We have implemented the method on an image processing system consisting of DSP boards for real-time tracking of moving objects from a moving camera image sequence.
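As an illustration of the LMedS step named in the abstract above, the following is a minimal sketch and not the authors' implementation. It assumes that point correspondences between two consecutive frames are already available, which the abstract does not specify. All function names (`det3`, `solve3`, `affine_from_three`, `squared_residuals`, `lmeds_affine`) and the parameters `trials` and `seed` are hypothetical. The sketch repeatedly fits an exact affine map to three random correspondences and keeps the map whose squared residuals over all correspondences have the smallest median.

```python
import random
from statistics import median


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, rhs):
    """Solve a 3x3 linear system by Cramer's rule; None if near-singular."""
    d = det3(m)
    if abs(d) < 1e-9:
        return None
    sol = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = rhs[r]
        sol.append(det3(mc) / d)
    return sol


def affine_from_three(src, dst):
    """Exact affine map [a, b, c, d, e, f] from three correspondences,
    where u = a*x + b*y + c and v = d*x + e*y + f."""
    m = [[x, y, 1.0] for x, y in src]
    row_u = solve3(m, [u for u, _ in dst])
    row_v = solve3(m, [v for _, v in dst])
    if row_u is None or row_v is None:
        return None  # the three source points are (nearly) collinear
    return row_u + row_v


def squared_residuals(params, src, dst):
    """Squared distance between each mapped source point and its target."""
    a, b, c, d, e, f = params
    return [(a * x + b * y + c - u) ** 2 + (d * x + e * y + f - v) ** 2
            for (x, y), (u, v) in zip(src, dst)]


def lmeds_affine(src, dst, trials=200, seed=0):
    """Least Median of Squares: keep the minimal-sample fit whose median
    squared residual over all correspondences is smallest."""
    rng = random.Random(seed)
    best_params, best_median = None, float("inf")
    for _ in range(trials):
        i, j, k = rng.sample(range(len(src)), 3)
        params = affine_from_three([src[i], src[j], src[k]],
                                   [dst[i], dst[j], dst[k]])
        if params is None:
            continue
        med = median(squared_residuals(params, src, dst))
        if med < best_median:
            best_params, best_median = params, med
    return best_params, best_median
```

Because the score is a median rather than a sum, correspondences on independently moving objects do not pull the background estimate as long as they make up less than half of the data, which is the usual motivation for choosing this estimator for background registration.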
We present a new basis for the discrete representation of stereo correspondence. This center-referenced basis permits a more natural, complete and concise representation of constraints in stereo matching. In this context, a MAP formulation for disparity estimation is derived and reduced to the unconstrained minimization of an energy function. By incorporating natural constraints, the problem is simplified to a shortest path problem in a sparsely connected trellis structure, which is solved by an efficient dynamic programming algorithm. The computational complexity is the same as that of the best of the other dynamic programming methods, but a very high degree of concurrency is possible in the algorithm, making it suitable for implementation on parallel processors. Experimental results confirm the performance of this method, and matching errors are found to degrade gracefully in exponential form with respect to noise.
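The idea of disparity estimation as a shortest path through a trellis can be sketched for a single scanline as follows. This is a generic dynamic programming formulation and does not reproduce the center-referenced basis or the MAP energy of the paper. The function name `scanline_disparity`, the absolute-difference data cost, and the `smooth` penalty weight are assumptions made for illustration only.

```python
def scanline_disparity(left, right, max_disp, smooth=1.0):
    """Viterbi-style shortest path over (pixel, disparity) trellis nodes.

    Node (x, d) matches left[x] to right[x - d]. The path cost is the sum of
    absolute intensity differences plus smooth * |d - d_prev| per step.
    """
    n = len(left)
    inf = float("inf")

    def data_cost(x, d):
        # Matches that would fall outside the right scanline are forbidden.
        return abs(left[x] - right[x - d]) if x - d >= 0 else inf

    cost = [[inf] * (max_disp + 1) for _ in range(n)]
    back = [[0] * (max_disp + 1) for _ in range(n)]
    for d in range(max_disp + 1):
        cost[0][d] = data_cost(0, d)

    for x in range(1, n):
        for d in range(max_disp + 1):
            dc = data_cost(x, d)
            if dc == inf:
                continue
            best_prev, best_val = 0, inf
            for p in range(max_disp + 1):
                val = cost[x - 1][p] + smooth * abs(d - p)
                if val < best_val:
                    best_prev, best_val = p, val
            cost[x][d] = best_val + dc
            back[x][d] = best_prev

    # Backtrack from the cheapest node in the last column.
    d = min(range(max_disp + 1), key=lambda k: cost[n - 1][k])
    path = [0] * n
    for x in range(n - 1, -1, -1):
        path[x] = d
        d = back[x][d]
    return path
```

Each trellis column depends only on the previous column, and all nodes within a column can be updated independently of one another, which is one simple way to see why dynamic programming formulations of this kind lend themselves to parallel hardware.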
Identification of motion parameters is an important issue in restoring images degraded by linear motion blur. Based on the properties of human visual motion sensing, an integrated approach combining several known image processing techniques is proposed for estimating the direction and extent of motion in an image degraded by linear motion blur. Experimental results confirm the feasibility of our approach.