Yoichi YAMASHITA Tomoyoshi ISHIDA Kazuki SHIMADERA
One of the fundamental issues concerning the F0 contour is modeling the relationship between F0 parameters and the linguistic information of a sentence. This paper proposes a stochastic F0 model that probabilistically models the relationship between the F0 contour and the linguistic information. For application to speech synthesis, an F0 generator selects the most probable F0 contour from candidates given by the probabilistic F0 model. The F0 contour of a Japanese sentence is represented by a concatenation of the F0 patterns of a Japanese syntactic unit, the bunsetsu. A bunsetsu F0 pattern is composed of an F0 average and an F0 shape. The F0 average is predicted independently for each bunsetsu by a quantification theory from linguistic features of the bunsetsu. The most probable sequence of bunsetsu F0 shapes for a sentence is found in the F0 shape database based on a probabilistic measure. The probability that an F0 contour is observed for a sentence is defined by two kinds of probabilities, the F0 shape production probability and the F0 shape bigram. The latter is the probability that two F0 shapes occur adjacently, which is similar to a word bigram in speech recognition. Several typical bunsetsu F0 shapes are extracted by clustering the training data and stored in the F0 shape database. The probability of F0 shape production is computed for each bunsetsu based on the distribution of values of the linguistic features in a cluster. The RMS prediction error of the F0 contour is 0.26 [octave].
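The search for the most probable sequence of bunsetsu F0 shapes can be illustrated with a small dynamic-programming sketch. This is not the authors' implementation: the abstract does not name the search algorithm, so a Viterbi-style search is assumed here, and the function name and the table layouts (`production[t][k]` for the production probability of shape k at bunsetsu t, `bigram[j][k]` for the probability that shape k follows shape j) are hypothetical.

```python
import math


def _log(p):
    # Log probability; zero probability maps to -inf so it never wins a max.
    return math.log(p) if p > 0 else float("-inf")


def most_probable_shapes(production, bigram):
    """Return the shape index sequence that maximizes the product of
    production probabilities and bigram probabilities (assumed scoring)."""
    n_shapes = len(production[0])
    # Best log score of any path ending in each shape at the first bunsetsu.
    score = [_log(p) for p in production[0]]
    back = []  # back[t-1][k] = best predecessor of shape k at bunsetsu t
    for t in range(1, len(production)):
        new_score, pointers = [], []
        for k in range(n_shapes):
            best_j = max(range(n_shapes),
                         key=lambda j: score[j] + _log(bigram[j][k]))
            new_score.append(score[best_j] + _log(bigram[best_j][k])
                             + _log(production[t][k]))
            pointers.append(best_j)
        score = new_score
        back.append(pointers)
    # Trace back from the best final shape.
    k = max(range(n_shapes), key=lambda i: score[i])
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    return path[::-1]
```

With two shapes, production probabilities `[[0.6, 0.4], [0.5, 0.5], [0.1, 0.9]]` and bigram table `[[0.9, 0.1], [0.2, 0.8]]`, the search returns shape 1 for every bunsetsu, even though shape 0 has the higher production probability at the first bunsetsu. This shows how the bigram term can override a purely local choice.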
A quasi-periodic signal is a periodic signal with period and amplitude variations. Several physiological signals, including the electrocardiogram (ECG), can be treated as quasi-periodic. Vector quantization (VQ) is a valuable and universal tool for signal compression. However, compressing quasi-periodic signals using VQ presents several problems. First, a pre-trained codebook adapts poorly to signal variations, so the quality of the reconstructed signals cannot be controlled. Second, the periodicity of the signal causes data redundancy in the codebook, where many codevectors are highly correlated. These two problems are solved by the proposed codebook replenishment VQ (CRVQ) scheme based on a bar-shaped (BS) codebook structure. In the CRVQ, codevectors can be updated online according to signal variations, and the quality of the reconstructed signals can be specified. With the BS codebook structure, the codebook redundancy is reduced significantly and considerable codebook storage space is saved; moreover, variable-dimension (VD) codevectors can be used to minimize the coding bit rate subject to a distortion constraint. The theoretical rationale and implementation scheme of the VD-CRVQ are given. ECG data from the MIT/BIH arrhythmia database are tested, and the results are substantially better than those of other VQ compression methods.
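The idea of updating codevectors online under a quality constraint can be sketched as follows. This is a minimal illustration under stated assumptions, not the CRVQ scheme itself: the replenishment rule (append the input vector as a new codevector whenever the nearest codevector exceeds a squared-error threshold), the function names, and the fixed vector dimension are all assumptions, and the bar-shaped codebook and variable-dimension codevectors are not modeled.

```python
def squared_error(a, b):
    # Sum of squared differences between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode_with_replenishment(vectors, codebook, max_distortion):
    """Encode each vector as (codevector index, was_replenished).

    The codebook list is extended in place whenever no existing codevector
    meets the distortion constraint (assumed replenishment rule)."""
    codes = []
    for v in vectors:
        # Nearest-neighbour search over the current codebook.
        idx = min(range(len(codebook)),
                  key=lambda i: squared_error(v, codebook[i]))
        if squared_error(v, codebook[idx]) > max_distortion:
            # Quality target missed: add the vector itself as a codevector.
            codebook.append(list(v))
            codes.append((len(codebook) - 1, True))
        else:
            codes.append((idx, False))
    return codes
```

In a real coder the decoder would have to receive each replenished codevector as side information, so the threshold trades reconstruction quality against bit rate.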
Feng-Xiang GE Ying-Ning PENG Xiu-Tan WANG
A novel power spectral density accumulation (PSDA) method for estimating the bandwidth of clutter spectra is proposed, based on a priori knowledge of the shape of the clutter spectra. A comparison of the complexity and performance of the PSDA method and general methods is presented. It is shown that the PSDA method is effective for short-time clutter data in practical applications.
Rajalida LIPIKORN Akinobu SHIMIZU Yoshihiro HAGIHARA Hidefumi KOBATAKE
The skeleton and the skeleton function of an object are important representations for shape analysis and recognition. They contain enough information to recognize an object and to reconstruct its original shape. However, they are sensitive to distortion caused by rotation and noise. This paper presents another approach to binary object representation, called a modified exoskeleton (mES), that combines the previously defined exoskeleton with the use of a symmetric object whose dominant property is rotation invariance. The mES is the skeleton of a circular background around the object, and it preserves the skeleton properties, including significant information about the object for use in object recognition. A matching algorithm for object recognition based on the mES is then presented. We applied the matching algorithm to evaluate the mES against the skeleton obtained using the 4-neighbor distance transformation on a set of artificial objects. The experimental results reveal that the mES is more robust to distortion caused by rotation and noise than the skeleton, and that the matching algorithm is capable of recognizing objects effectively regardless of their size and orientation.
Shinji FUKUI Yuji IWAHORI Robert J. WOODHAM Kenji FUNAHASHI Akira IWATA
This paper proposes a new method to recover the sign of local Gaussian curvature from multiple (more than three) shading images. The information required to recover the sign of Gaussian curvature is obtained by applying Principal Components Analysis (PCA) to the normalized irradiance measurements. The sign of the Gaussian curvature is recovered based on the relative orientation of measurements obtained on a local five point test pattern to those in the 2-D subspace called the eigen plane. Using multiple shading images gives a more accurate and robust result and minimizes the effect of shadows by allowing a larger area of the visible surface to be analyzed compared to methods using only three shading images. Furthermore, it allows the method to be applied to specular surfaces. Since PCA removes linear correlation among images, the method can produce results of high quality even when the light source directions are not widely dispersed.
Pierre-Louis BAZIN Jean-Marc VEZIEN
This paper presents a new approach to shape and motion estimation based on geometric primitives and relations in a model-based framework. Describing a scene in terms of structured geometric elements that share relationships allows a parametric model with Euclidean constraints to be derived, and a camera model is also proposed to reduce the dimensionality of the problem. This leads to a sequential MAP estimation that gives accurate and comprehensible results on real images.
Hiroyuki UKIDA Katsunobu KONISHI
We propose a method for recovering the 3D shape of an object using a color image scanner that has three light sources. Photometric stereo is the traditional way to recover the surface normals of objects using multiple light sources. It usually assumes distant light sources to keep the optical model simple. However, the light sources in an image scanner are so close to the object that the illuminant intensity varies with the distance from the light source, so these light sources should be modeled as linear light sources. Using these models and a two-step algorithm, consisting of an initial estimation by iterative computation followed by optimization with the non-linear least squares method, the proposed method estimates not only the surface normal but also the absolute distance from the light source to the surface. This allows the 3D shape to be recovered more precisely. In experiments, the 3D shapes of real objects are recovered and the effectiveness of the proposed method is shown.
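For context, the traditional distant-light photometric stereo that the abstract contrasts with can be sketched as below. This is the textbook baseline, not the proposed linear-light-source method: it assumes a Lambertian surface and three known unit light directions, so each intensity is the dot product of a light direction with the albedo-scaled normal. The function names are hypothetical, and the 3x3 system is solved with Cramer's rule to keep the sketch dependency-free.

```python
import math


def det3(m):
    # Determinant of a 3x3 matrix given as nested lists.
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def surface_normal(light_dirs, intensities):
    """Solve light_dirs * g = intensities for g = albedo * normal.

    light_dirs: three unit light direction vectors (one per row).
    intensities: the three observed intensities at one pixel.
    Returns (unit normal, albedo)."""
    d = det3(light_dirs)
    if abs(d) < 1e-12:
        raise ValueError("light directions must not be coplanar")
    g = []
    for col in range(3):
        # Cramer's rule: replace one column with the intensity vector.
        m = [row[:] for row in light_dirs]
        for r in range(3):
            m[r][col] = intensities[r]
        g.append(det3(m) / d)
    albedo = math.sqrt(sum(x * x for x in g))
    if albedo == 0:
        raise ValueError("zero intensities give no normal estimate")
    return [x / albedo for x in g], albedo
```

Because this baseline treats the light direction and intensity as constant over the surface, it cannot recover the absolute distance to the light source, which is the limitation the abstract addresses.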
Davar PISHVA Atsuo KAWAI Kouji HIRAKAWA Kazunori YAMAMORI Tsutomu SHIINO
We propose a new field of application for machine vision, a machine-vision-based cash-register system. We show that the overall system of color analysis for such an application should include the method of color distribution analysis which we propose, and that the analysis of shape and size is important. We present our test results and identify a few technical issues which may have to be considered for its practical utilization.
Hidetoshi MIIKE Sosuke TSUKAMOTO Keishi NISHIHARA Takashi KURODA
This paper proposes a precise method of realizing simultaneous measurement of microscopic defects and the macroscopic three-dimensional shapes of planar objects having specular reflection surfaces. The direction vector field of surface tilt is evaluated directly by the introduction of a moving slit-light technique based on computer graphic animation. A reflected image created by the moving slit-light is captured by a video camera, and the image sequence of the slit-light deformation is analyzed. The obtained direction vector field of the surface tilt recovers the surface shape by means of integration. Two sample objects, a concave mirror and a plane plastic injection molding, are tested to measure the performance of the proposed method. Surface anomalies such as surface dent and warpage are detected quantitatively at a high resolution (about 0.2 [µm]) and a high accuracy (about 95%) in a wide area (about 15 [cm]) of the test object.
Weichun YE Yuankai ZHENG Seidikkurippu N. PIRAMANAYAGAM Yu LIN Victor Y. KRACHKOVSKY
Two isolated pulse models, the Lorentzian-like model and the Mixture model, were used to investigate the effect of GMR heads and media with different geometric and magnetic parameters on the readback pulse shape. How well these two models match an actual pulse was compared in detail. The dependence of the readback pulse shape of the GMR head on the head-media parameters and non-linear distortions is discussed. When applying these models to evaluate the performance of a recording system, it is necessary to take into account the difference between the linear superposition of the isolated pulse and the actual readback data pattern. We suggest linearizing the captured isolated pulse so that the model can be used correctly as a tool for evaluating system performance.
Kiyomi NAKAMURA Shingo MIYAMOTO
Although artificial neural networks have been actively applied to object shape recognition in previous studies, little attention has been paid to the recognition of spatial elements (e.g. position, rotation and size). In the present study, a rotation and size spreading associative neural network (RS-SAN net) is proposed and the efficacy of the RS-SAN net in object orientation (rotation), size and shape recognition is shown. The RS-SAN net pays attention to the fact that the spatial recognition system in the brain (parietal cortex) is involved in both the spatial (e.g. position, rotation and size) and shape recognition of an object. The RS-SAN net uses spatial spreading by spreading layers, generalized inverse learning and population vector methods for the recognition of the object. The information on the object orientation and size is spread by double spreading layers, which have tuning characteristics similar to those of spatial discrimination neurons (e.g. axis orientation neurons and size discrimination neurons) in the parietal cortex. The RS-SAN net simultaneously recognizes the size of the object irrespective of its orientation and shape, the orientation irrespective of its size and shape, and the shape irrespective of its size and orientation.
Yen-Ping CHU Chin-Hsing CHEN Kuan-Cheng LIN
ATM networks are connection-oriented. Making a call requires first sending a message to perform admission control, which guarantees the QoS (quality of service) of the connections in the network. In this paper, we focus on the problem of translating a global QoS requirement into a set of local QoS requirements in ATM networks. Usually, an end user is concerned only with the QoS requirements on an end-to-end basis and does not care about the QoS at local switching nodes. Most recent research efforts focus only on the worst-case end-to-end delay bound and pay no attention to the problem of distributing the end-to-end delay bound among the local switching nodes. After admission control, when a new connection is admitted to the network, existing schemes allocate the excess delay equally and reserve the same bandwidth at each switch along the path. However, this does not improve network utilization efficiently. This motivates us to design a novel local QoS requirement allocation scheme that achieves better performance. Using the maximum number of supportable connections as the performance index, we derive an optimal delay allocation (OPT) policy. In addition, we propose an analysis model to evaluate the proposed allocation scheme and the equal allocation (EQ) scheme in a series of switching nodes with the rate-controlled scheduling architecture, which includes a traffic shaper and a non-preemptive earliest-deadline-first scheduler. The numerical results show the importance of the allocation policy and reveal the factors that affect the performance index.
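The equal allocation (EQ) baseline mentioned above can be made concrete with a short sketch. The abstract does not define "excess delay", so this example assumes it is the end-to-end delay bound minus the sum of the minimum delays each switch needs; the function name and parameters are hypothetical, and the OPT policy derived in the paper is not reproduced.

```python
def equal_allocation(end_to_end_bound, min_node_delays):
    """Split the excess delay equally among the switches on a path.

    end_to_end_bound: the global delay bound requested by the end user.
    min_node_delays: assumed minimum delay each switch requires.
    Returns the local delay bound assigned to each switch."""
    excess = end_to_end_bound - sum(min_node_delays)
    if excess < 0:
        raise ValueError("end-to-end bound cannot be met on this path")
    share = excess / len(min_node_delays)
    # Every switch receives the same share, regardless of its load.
    return [d + share for d in min_node_delays]
```

For example, a 10 ms bound over switches needing 1, 2 and 3 ms leaves 4 ms of excess, so each switch receives an extra 4/3 ms. The local bounds always sum to the global bound, but the split ignores how heavily loaded each switch is, which is the inefficiency the abstract points out.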
Shinfeng D. LIN Chien-Chuang LIN Shih-Chieh SHIE
MPEG-4 emphasizes coding efficiency and allows for content-based access and transmission of arbitrarily shaped objects. It addresses the encoding of video objects using shape coding, motion estimation, and texture coding for interactivity, high compression ratio, and scalability. In this letter, an advanced object-adaptive vertex-based shape coding method is proposed for encoding the shape of video objects. This method exploits an octant-based representation of the relation between adjacent vertices, and that relation can be used to improve coding efficiency. Simulation results demonstrate that the proposed method may save more bits for closely spaced vertices.
Yoshiaki YASUNO Yasunori SUTOH Masahiko MORI Masahide ITOH Toyohiko YATAGAI
An improved pulse shaper is proposed which is able to control both the spatial and temporal profile of femtosecond light pulses. Our pulse shaper exploits the spatio-temporal coupling effect seen in pulse shapers. Its properties are numerically analyzed by application of the Wigner distribution function. We confirm that the spatio-temporal output pulse track dictates the differentiation of the phase mask; that the degree of spatio-temporal coupling is determined by the focal length ratio of the lenses in the pulse shaper; and that space to spatial-frequency chirp results from misalignment of lenses.
Koichiro DEGUCHI Daisuke KAWAMATA Kanae MIZUTANI Hidekata HONTANI Kiwa WAKABAYASHI
A new method to recover and display the 3D fundus pattern on the inner bottom surface of the eyeball from a stereo fundus image pair is developed. For fundus stereo images, a simple stereo technique does not work, because the fundus is observed through the eye lens and a contact wide-angle enlarging lens. In this method, utilizing the fact that the fundus forms part of a sphere, we identify their optical parameters and correct the skews of the lines of sight. Then, we obtain 3D images of the fundus by back-projecting the stereo images.
Katsuyuki KAMEI Wayne HOY Takashi TAMADA Kazuo SEO
In many fields such as city administration and facilities management, there is an increasing number of requests for a Geographic Information System (GIS) that provides users with automated mapping functions. A mechanism that displays 3D views of an urban scene is particularly required because it would allow the construction of an intuitive and understandable environment for managing objects in the scene. In this paper, we present a new urban modeling system utilizing both image-based and geometry-based approaches. Our method is based on a new concept in which a wide urban area can be displayed with natural photo-realistic images, and each object drawn in the view can be identified by pointing to it. First, to generate natural urban views from any viewpoint, we employ an image-based rendering method, Image Walkthrough, and modify it to handle aerial images. This method can interpolate and generate natural views by assembling several source photographs. Next, to identify each object in the scene, we recover its shape using computer vision techniques (a geometry-based approach). The rough shape of each building is reconstructed from various aerial images, and then its drawn position on the generated view is also determined. This makes it possible to identify each building in an urban view. We have combined both of these approaches, yielding a new style of urban information management. The users of the system can gain an intuitive understanding of the area and easily identify their target, by generating natural views from any viewpoint and suitably reconstructing the shapes of objects. We have built a prototype system based on this new concept of GIS, which has shown the validity of our method.
Yoshinari KAMEDA Takeo TAODA Michihiko MINOH
A high-speed 3D shape reconstruction method using multiple video cameras and multiple computers on a LAN is presented. The video cameras are set up to surround the real 3D space where people are present. The reconstructed 3D space is displayed in voxel format, and users can see the space from any viewpoint with a VR viewer. We implemented a prototype system that performs 3D reconstruction at 10.55 fps with a delay of 313 ms.
Hilario Haruomi KOBAYASHI Yasuhiko HARA Hideaki DOI Kazuo TAKAI Akiyoshi SUMIYA
The visual inspection of printed circuit boards (PCBs) at the final production stage is necessary for quality assurance, and the requirements for an automated inspection system are very high. However, consistent inspection of patterns on these PCBs is very difficult due to pattern complexity. Most of the previously developed techniques are not sensitive enough to detect defects in complex patterns. To solve this problem, we propose a new optical system that discriminates the pattern types existing on a PCB, such as copper, solder resist and silk-screen printing. We have also developed a hybrid defect detection technique to inspect the discriminated patterns. This technique is based on shape measurement and feature extraction methods. We used the proposed techniques in an actual automated inspection system, realizing real-time transactions with a combination of hardware equipped with image processing LSIs and PC software. Evaluation with this inspection system demonstrates a 100% defect detection rate and a fairly low false alarm rate (0.06%). The present paper describes the inspection algorithm and briefly explains the automated inspection system.
Conventional shape from focus (SFF) methods are inaccurate because of their piecewise constant approximation of the focused image surface (FIS). We propose a more accurate scheme for SFF based on a representation of the three-dimensional FIS in terms of neural network weights. The neural networks are trained to learn the shape of the FIS that maximizes the focus measure.
In this letter we propose a new Shape from Focus (SFF) method using piecewise curved search windows for accurate 3-D shape recovery. The new method uses piecewise curved windows to compute the focus measure and to search for the Focused Image Surface (FIS) in image space. The experimental results show that our new method gives more accurate results than previous SFF methods.
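The baseline that both SFF abstracts improve upon can be sketched in a few lines: for each pixel, evaluate a focus measure in every frame of a focus stack and take the frame index where it peaks as the depth. This is a generic illustration, not either proposed method. The modified Laplacian is assumed as the focus measure, the function names are hypothetical, and no curved windows or neural network representation of the FIS are modeled.

```python
def modified_laplacian(img, r, c):
    # Sum of absolute second differences along the row and the column.
    return (abs(2 * img[r][c] - img[r][c - 1] - img[r][c + 1])
            + abs(2 * img[r][c] - img[r - 1][c] - img[r + 1][c]))


def depth_from_focus(stack):
    """Return, per interior pixel, the index of the best-focused frame.

    stack: list of equally sized grayscale images (nested lists), one per
    focus setting. Border pixels are left at 0 because the focus measure
    needs all four neighbours."""
    rows, cols = len(stack[0]), len(stack[0][0])
    depth = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            depth[r][c] = max(
                range(len(stack)),
                key=lambda k: modified_laplacian(stack[k], r, c))
    return depth
```

Taking the depth from a single frame index per pixel is a coarse version of the piecewise constant approximation criticized above, since the true focused image surface generally passes between frames and curves within a window.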