Yasuhisa HAYASHI Akio OGIHARA Kunio FUKUNAGA
We propose a recognition method for HMM using a simultaneous generative histogram. The proposed method uses the correlation between two features, which is expressed by a simultaneous generative histogram. The output probabilities of the integrated HMM are then conditioned on the codeword of the other feature. The proposed method is applied to isolated digit word recognition to confirm its validity.
Hui ZHAO Toru SATO Iwane KIMURA
This paper presents an adaptive rate error control scheme for digital communication over time-varying channels. A cyclic code with majority-logic decoding is used in a cascaded way as an inner code to create a simple and powerful hybrid-ARQ error control scheme. The inner code is used only for error correction, and the outer code is used for both error correction and error detection. When an error is detected, retransmission is required. The unsuccessful packets are not discarded as in conventional schemes, but are combined with their retransmitted copies. Approximations for the throughput efficiency and the undetectable error probability are given. High reliability coupled with a simple high-speed implementation makes the scheme suitable for high data rate error control over both stationary and nonstationary channels. An adaptive error control scheme becomes the best solution for time-varying channels when the optimum code is selected according to the actual channel conditions to enhance the system performance. The main feature of this system is that the basic structure of the encoder and decoder need not be modified while the error-correction capability of the code increases. Results of a comparative analysis show that the proposed scheme outperforms other similar ARQ protocols.
This paper gives a model to explain one phenomenon found in the process of creative concept formation, i.e. the phenomenon that people often get trapped in a state where the mental world remains nebulous and sometimes suddenly make a jump to a new concept. This phenomenon has been explained qualitatively, mainly by philosophers, but there have been no models for explaining it quantitatively. Such a model is necessary in a new research field that studies systems for aiding human creative activities. So far, the work on creation aid has had no theoretical background, and the systems have been built based only on trial and error. The model given in this paper explains some aspects of the phenomena found in creative activities and gives some suggestions for future systems for aiding creative concept formation.
Jie CHEN Shuichi ITOH Takeshi HASHIMOTO
A new method for the compression of electrocardiographic (ECG) data is presented. The method is based on the orthonormal wavelet analysis recently developed in applied mathematics. Using the wavelet transform, the original signal is decomposed into a set of sub-signals with different frequency channels corresponding to the different physical features of the signal. Using the optimum bit allocation scheme, each decomposed sub-signal is treated according to its contribution to the total reconstruction distortion and to the bit rate. In our experiments, compression ratios (CR) from 13.5:1 to 22.9:1 with the corresponding percent rms difference (PRD) between 5.5% and 13.3% have been obtained at a clinically acceptable signal quality. Experimental results show that the proposed method seems suitable for the compression of ECG data in the sense of high compression ratio and high speed.
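The two figures of merit quoted above can be computed as in the following minimal sketch. It assumes the commonly used PRD definition (rms of the reconstruction error normalized by the rms of the original signal, expressed in percent); the abstract does not state which normalization the authors used, and the function names are illustrative only.

```python
import math

def compression_ratio(original_bits, compressed_bits):
    """Compression ratio CR, e.g. 20.0 means 20:1."""
    return original_bits / compressed_bits

def prd(original, reconstructed):
    """Percent rms difference between a signal and its reconstruction.

    Assumed definition: 100 * sqrt(sum((x - y)^2) / sum(x^2)).
    """
    error_energy = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    signal_energy = sum(x ** 2 for x in original)
    return 100.0 * math.sqrt(error_energy / signal_energy)
```

A perfect reconstruction gives a PRD of 0, while reconstructing every sample as zero gives a PRD of 100 under this definition.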
The performance of various ARQ protocols has recently been analyzed for multidestination environments. In all previous work, the round-trip delays between the transmitter and each of the receivers are assumed (or forced) to be equal to the maximum one, to simplify the analysis and/or the operation. This assumption obviously will sacrifice the system performance. In this paper, we evaluate the throughput efficiencies of three multidestination GBN ARQ protocols under unequal round-trip delays. In the investigated protocols, multiple copies of each data block are (re)transmitted contiguously to the receivers. Tight lower bounds are obtained for the throughput efficiencies of the schemes in which each data block is transmitted with the optimum number of copies. Results show that assuming all the round-trip delays to be equal to the maximum one may sacrifice the performance significantly. We also compare the performances of the three investigated protocols. In general, the performance becomes better as the transmitter utilizes more of the outcomes of previous transmission attempts.
Queueing problems are investigated for very wide classes of input traffic and service time models to obtain good approximations of the loss probability and the waiting time probability. The proposed approximation is based on the fundamental recursion formula and the Chernoff bound technique, both of which require no particular assumption about the stochastic nature of the input traffic and service time, such as renewal or Markovian properties. The only essential assumption is stationarity. The accuracy of the obtained approximation is confirmed by comparison with computer simulation. The proposed method of approximation has a number of advantages when applied to network capacity design or path accommodation design problems. First, the proposed method is applicable to multi-media traffic. In an ATM network, a variety of bursty and non-bursty cell traffic exists and is superposed, so a unified analysis methodology that does not depend on each traffic's characteristics is required. Since our method assumes only the stationarity of the input and service processes, it is applicable to arbitrary types of cell streams. Further, this approach can be used for unexpected future traffic models. The second advantage in application is that the proposed probability approximation requires only a small amount of computation. Because of the use of the Chernoff bound technique, the convolution of every traffic's probability density function is replaced by the product of probability generating functions. Hence, the proposed method provides a fast algorithm for, say, the call admission control problem. Third, it has the advantage of accuracy. In this paper, we applied the approximation to the cases of homogeneous CBR traffic, non-homogeneous CBR traffic, M/D/1, AR(1)/D/1, M/M/1 and D/M/1. In all cases, the approximate values are sufficiently accurate compared with the exact values or computer simulation results, from low traffic load to high load.
Moreover, in all cases of the numerical comparison, our approximations are upper bounds of the real values. This is very important for the sake of conservative network design.
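The remark that a convolution is replaced by a product of generating functions can be illustrated with the textbook Chernoff bound for a sum of independent sources. The sketch below is not the paper's queueing approximation (which also relies on a recursion formula not given in the abstract); it assumes hypothetical on/off sources, each described by an activity probability and a peak rate, and bounds the probability that their total rate reaches a link capacity.

```python
import math

def chernoff_tail_bound(sources, capacity, thetas=None):
    """Upper bound on P(total rate >= capacity) for independent on/off sources.

    sources: list of (p_on, rate) pairs (illustrative traffic model).
    For any theta > 0, P(S >= c) <= exp(-theta * c) * prod_i M_i(theta),
    where M_i(theta) = 1 - p_on + p_on * exp(theta * rate).
    The product of the M_i replaces the convolution of the distributions.
    """
    if thetas is None:
        thetas = [0.01 * k for k in range(1, 1001)]  # coarse grid search
    best = 1.0
    for theta in thetas:
        log_product = sum(
            math.log(1.0 - p + p * math.exp(theta * r)) for p, r in sources
        )
        best = min(best, math.exp(log_product - theta * capacity))
    return best

def exact_tail_homogeneous(n, p, k):
    """Exact P(X >= k) for X ~ Binomial(n, p), for comparison."""
    return sum(math.comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k, n + 1))
```

Because the inequality holds for every positive theta, the value returned is always an upper bound on the true tail probability, which matches the conservative behavior described above.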
Yasuhiro TAKISHIMA Masahiro WADA Hitomi MURAKAMI
We analyze frame rates in low bit rate video coding and show that an optimal frame rate can be theoretically obtained. In low bit rate video coding, the frame rate usually has to be reduced to limit the total amount of coded information. The choice of frame rate, however, has a great effect on the picture quality because of the trade-off between coded picture quality and motion smoothness. It is known from experience that, in order to achieve an optimum balance between these two factors, a frame rate has to be selected which is appropriate for the coding scheme, the properties of the video sequences and the coding bit rate. A theoretical analysis of the existence of an optimal frame rate and of how it would be expressed, however, has not been performed. In this paper, coding distortion measured by mean square error is analyzed by using video signal models such as a rate-distortion function for coded frames and inter-frame correlation coefficients for non-coded frames. Overall picture quality, taking account of coded picture quality and motion smoothness simultaneously, is expressed as a function of frame rate. This analysis shows that the optimum frame rate can be uniquely specified. The maximum frame rate is optimal when the coding bit rate is higher than a certain value for a given video scene, while a frame rate less than the maximum is optimal otherwise. The result of the theoretical analysis is compared with the results of computer simulation. In addition, the relation between this analysis and a subjective evaluation is described. From both comparisons, this theoretical analysis can be justified as an effective scheme to indicate the optimal frame rate, and it shows the possibility of improving picture quality by selecting the frame rate adaptively.
Gang WU Kaiji MUKUMOTO Akira FUKUDA
In our preceding paper, I-ISMA (Idle Signal Multiple Access for Integrated services), a combination of ISMA and a time reservation technique, was proposed to transmit integrated voice and data traffic in third generation wireless communication networks. There, the channel capacity of I-ISMA was evaluated by static analysis. To fully evaluate the performance of contention-based channel access protocols, however, we also need dynamic analysis to evaluate stability, delay, etc. In particular, in systems involving real-time voice transmission, delay is one of the most important performance measures. A six-mode model to describe an I-ISMA system is set up. With some assumptions for simplification, the dynamic behavior of the system is approximated by a Markov process so that EPA (Equilibrium Point Analysis), a fluid approximation method, can be applied to the analysis. Then, numerical and simulation results are obtained for some examples. By means of the same analysis method and under the same conditions, the performance of PRMA is evaluated and compared briefly with that of I-ISMA.
Yonehiko SUNAHARA Hiroyuki OHMINE Hiroshi AOKI Takashi KATAGI Tsutomu HASHIMOTO
This paper describes a novel method to calculate the fields scattered by a polyhedron structure for an incident plane wave. In this method, the fields diffracted by an edge are calculated using the equivalent edge currents, which are separated into components dependent on each of the two surfaces that form the edge. The separated equivalent edge currents are based on the Geometrical Theory of Diffraction (GTD). Using this Separated Equivalent Edge Current Method (SEECM), fields scattered by a polyhedron structure can be calculated without special treatment of the singularity in the diffraction coefficient. This method can also be applied successfully to structures with convex surfaces by modeling them as polyhedron structures.
Kazuhiko OGUSU Masashi YOSHIMURA Hiroo KOMURA
The intensity-dependent transmission characteristics of an Ag+Na+ ion-exchanged glass waveguide with a nematic liquid crystal MBBA cover have been investigated experimentally using an Ar+ laser. It is found that the transmission characteristics of the TE1 mode are strongly influenced by temperature. Optical bistability has been observed at a particular temperature. Such a strong temperature dependence is believed to be brought about by an increase in the ordinary refractive index of the MBBA cover due to the temperature rise.
Hiroya FUJISAKI Keikichi HIROSE Noboru TAKAHASHI
Prosodic features of spoken Japanese play an important role in the transmission of linguistic information concerning the lexical word accent, the sentence structure and the discourse structure. In order to construct prosodic rules for synthesizing high-quality speech, therefore, prosodic features of speech should be quantitatively analyzed with respect to the linguistic information. With a special focus on the fundamental frequency contour, we first define four prosodic units for spoken Japanese, viz., prosodic word, prosodic phrase, prosodic clause and prosodic sentence, based on a decomposition of the fundamental frequency contour using a functional model for the generation process. Syntactic units are also introduced which have rough correspondence to these prosodic units. The relationships between the linguistic information and the characteristics of the components of the fundamental frequency contour are then described on the basis of results obtained by the analysis of two sets of speech material. Analysis of weathercast and newscast sentences showed that prosodic boundaries given by the manner of continuation/termination of phrase components fall into three categories, and are primarily related to the syntactic boundaries. On the other hand, analysis of noun phrases with various combinations of word accent types, syntactic structures, and focal conditions indicated that the magnitude and the shape of the accent components, which of course reflect the information concerning the lexical accent types of constituent words, are largely influenced by the focal structure. The results also indicated that there are cases where prosody fails to meet all the requirements presented by word accent, syntax and discourse.
Nobuyoshi KAIKI Yoshinori SAGISAKA
In this paper, we quantitatively analyzed speech data in seven different styles in order to synthesize natural Japanese conversational speech. Three reading styles were produced at different speeds (slow, normal and fast), and four speaking styles were produced by enacting conversation in different situations (free, hurried, angry and polite). To clarify the differences in prosodic characteristics between conversational speech and read speech, means and standard deviations of vowel duration, vowel amplitude and fundamental frequency (F0) were analyzed. We found large variation in these prosodic parameters. To look more precisely at the segmental duration and segmental amplitude differences between conversational speech and read speech, control rules of prosodic parameters in reading styles were applied to conversational speech. F0 contours of different speaking styles were superposed by normalizing the segmental duration. The differences between estimated values and actual values were analyzed. Large differences were found at sentence-final positions and key (focused) phrases. Sentence-final positions showed lengthening of segmental vowel duration and increased segmental vowel amplitude. Key phrase positions featured raised F0.
Kenji HASHIMOTO Takemi MOCHIDA Yasuaki SATO Tetsunori KOBAYASHI Katsuhiko SHIRAI
For the production of high quality synthetic sounds in a text-to-speech system, an excellent method of synthesizing speech signals is indispensable. In this paper, a new speech analysis-synthesis method for the text-to-speech system is proposed. The signals of voiced speech, which have a line spectrum structure at intervals of pitch in the linear frequency domain, can be represented approximately by the superposition of sinusoidal waves. In our system, analysis and synthesis are performed using such a harmonic structure of the signals of voiced speech. In the analysis phase, assuming an exact harmonic structure model at intervals of pitch against the fine structure of the short-time power spectrum, the fundamental frequency f0 is decided so as to minimize the error of the log-power spectrum at each peak position. At the same time, according to the value of the above minimized error, the rate of periodicity of the speech signal is determined. Then the log-power spectrum envelope is represented by the cosine series interpolating the data which are sampled at every pitch period. In the synthesis phase, numerical solutions of non-linear differential equations which generate sinusoidal waves are used. For voiced sounds, those equations behave as a group of mutually synchronized oscillators. These sinusoidal waves are superposed so as to reconstruct the line spectrum structure. For voiceless sounds, those non-linear differential equations work as passive filters with input noise sources. Our system has the following characteristics. (1) Voiced and voiceless sounds can be treated in the same framework. (2) Since the phase and the power information of each sinusoidal wave can be easily controlled, periodic waveforms in the voiced sounds can, if necessary, be precisely reproduced in the time domain. (3) The fundamental frequency f0 and phoneme duration can be easily changed without much degradation of the original sound quality.
The paper indicates the importance of suitability assessment in speech synthesis applications. Human factors involved in the use of synthetic speech are first discussed on the basis of an example of a newspaper company where synthetic speech is extensively used as an aid for proofreading a manuscript. Some findings obtained from perceptual experiments on the subjects' preference for paralinguistic properties of synthetic speech are then described, focusing primarily on the suitability of pitch characteristics, speaker's gender, and speaking rates in the task where subjects are asked to proofread a printed text while listening to the speech. The paper finally claims the need for a flexible speech synthesis system which helps the users create their own synthetic speech.
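The idea that voiced speech can be approximated by a superposition of sinusoids at multiples of the fundamental frequency can be sketched as follows. This is a simplified illustration that sums sine waves directly; it does not reproduce the non-linear differential-equation oscillators described above, and the function name, parameters and default sampling rate are assumptions made for the example.

```python
import math

def synthesize_voiced(f0_hz, amplitudes, phases, duration_s, sample_rate_hz=8000):
    """Sum of harmonics k * f0 (k = 1, 2, ...) with given amplitudes and phases.

    amplitudes[k - 1] and phases[k - 1] control the k-th harmonic, so the
    resulting spectrum has lines at intervals of the pitch frequency f0.
    """
    n_samples = round(duration_s * sample_rate_hz)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate_hz
        value = sum(
            a * math.sin(2.0 * math.pi * f0_hz * (k + 1) * t + ph)
            for k, (a, ph) in enumerate(zip(amplitudes, phases))
        )
        samples.append(value)
    return samples
```

Changing `f0_hz` while keeping the amplitude envelope fixed shifts the pitch, and changing `duration_s` changes the segment length, which is the kind of independent control the abstract attributes to the harmonic representation.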
Keikichi HIROSE Hiroya FUJISAKI
A text-to-speech conversion system for Japanese has been developed for the purpose of producing high-quality speech output. This system consists of four processing stages: 1) linguistic processing, 2) phonological processing, 3) control parameter generation, and 4) speech waveform generation. Although the processing at the first stage is restricted to the texts on general weather conditions, the other three stages can also cope with texts of news and narrations on other topics. Since the prosodic features of speech are largely related to the linguistic information, such as word accent, syntactic structure and discourse structure, linguistic processing of a wider range than ever, at least a sentence, is indispensable to obtain good quality speech with respect to the prosody. From this point of view, input text was restricted to the weather forecast sentences and a method for linguistic processing was developed to conduct morpheme, syntactic and semantic analyses simultaneously. A quantitative model for generating fundamental frequency contours was adopted to make a good reflection of the linguistic information on the prosody of synthetic speech. A set of prosodic rules was constructed to generate prosodic symbols representing prosodic structures of the text from the linguistic information obtained at the first stage. A new speech synthesizer based on the terminal analog method was also developed to improve the segmental quality of synthetic speech. It consists of four paths of cascade connection of pole/zero filters and three waveform generators. The four paths are respectively used for the synthesis of vowels and vowel-like sounds, nasal murmur and buzz bar, friction, and plosion, while the three generators produce voicing source waveform approximated by polynomials, white Gaussian noise source for fricatives and impulse source for plosives. 
The validity of the approach above has been confirmed by the listening tests using speech synthesized by the developed system. Improvements both in the quality of prosodic features and in the quality of segmental features were realized for the synthetic speech.
Yoshinobu HIGAMI Seiji KAJIHARA Kozo KINOSHITA
In this paper we present a method to generate test sequences for stuck-at faults in sequential circuits which have distinguishing sequences. Since a circuit may have no distinguishing sequence, we use two design techniques for obtaining circuits which have distinguishing sequences. One is at the state transition level and the other is at the gate level. With our proposed method, a complete test sequence can be generated. The sequence consists of test vectors for the combinational part of the circuit, distinguishing sequences and transition sequences. The test vectors, which are generated by a combinational test generator, cause faulty states or faulty output responses for a fault, and distinguishing sequences identify the differences between faulty states and fault-free states. Transition sequences are necessary to set the state required by the combinational vectors. The distinguishing sequence and the transition sequence are also used in the initializing sequence. Some techniques for shortening the test sequence are also proposed. The basic ideas of the techniques are to use a short initializing sequence and to find a good order in which to concatenate sequences. Fault simulation is conducted, however, so as not to miss any faults. The initializing sequence is obtained by using a distinguishing sequence. The efficiency of our method is shown by the experimental results for benchmark circuits.
The Knowledge-based Database Assistant is an expert system designed to help novice users formulate correct and complete database queries. This paper describes a knowledge-based database assistant with advanced facilities such as (1) menu-based query-making guidance, (2) a menu-based natural-language user interface, and (3) a database-command generator which formulates formal database queries in SQL. The system works as an intelligent front-end to an SQL database system or as a computer-aided SQL tutorial system. In this paper, we also discuss a semantic-network model, named S-Net, which is used to represent the knowledge for formal database-query formulating processes. The menu-based English user interface allows end-users to make a query by filling a certain query pattern with appropriate words. The query-pattern filling process is guided by pop-up menus provided by the system. The query-pattern instances thus obtained are then translated into formal database queries. The translation is carried out by evaluating operations on the S-Net knowledge base, which conveys knowledge about the application domain and the underlying database schema.
Tomohiro MURATA Kenzou KURIHARA Ayako ASHIDA
Reactive systems respond to internal or external stimuli and act in an event-driven manner. It is generally difficult to specify a complex reactive system's behavior using the conventional state machine formalism. One reason is that actual reactive systems are usually formed by combining multiple state machines that behave concurrently. This paper presents the State Diagram Matrix (SDM), which is a visual and hierarchical formalism for such a reactive system's behavior. SDM has two concepts. The first is a matrix plane description onto which a 3-dimensional state space is projected. The second is state abstraction for hierarchical state-machine definition. The understandability and reliability of control software were improved as a consequence of adopting SDM for specifying disk-subsystem control requirements. The development support functions of SDM using a workstation are also described.
Ikuo ARAI Kazuma MOTOMURA Tsutomu SUZUKI
A method to measure displacement from the phase rotation of a Doppler signal that carries the displacement information of a moving body is proposed, in which the displacement resolution can be improved 4 times by making the phase rotation faster. Furthermore, this test system is applied in clinical use. The test system is built by using a two-phase microwave Doppler sensor covering a 10 GHz band, where the Doppler frequency is multiplied 4 times by signal processing. Thus, the resolution is improved from a conventional 12.6 mm (in the case of 11.9 GHz) to 3.15 mm, and practical utilization has been attained. The microwave Doppler radar system described in this paper is adequate for the displacement measurement of a relatively fast moving body. As a medical sensor for clinical use, measurement examples of head movement in a vestibule examination (vestibule oculomotor reflexive inspection) and finger movement in a cerebellum function test are given. Furthermore, by using two sets of this Doppler radar system, a 2-dimensional measurement of head movement is possible.
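The resolution figures above follow from simple arithmetic: the round-trip path changes by twice the displacement, so one full rotation of the Doppler phase corresponds to half a wavelength of movement, and multiplying the Doppler frequency by 4 divides that distance by 4. The short sketch below checks the quoted numbers; the function name and the `multiplication` parameter are illustrative.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0

def displacement_per_cycle_mm(freq_hz, multiplication=1):
    """Displacement (mm) corresponding to one full Doppler phase rotation.

    One rotation equals half a wavelength of target movement; multiplying
    the Doppler frequency by `multiplication` shortens it proportionally.
    """
    wavelength_mm = SPEED_OF_LIGHT_M_PER_S / freq_hz * 1000.0
    return wavelength_mm / 2.0 / multiplication

conventional = displacement_per_cycle_mm(11.9e9)                 # about 12.6 mm
improved = displacement_per_cycle_mm(11.9e9, multiplication=4)   # about 3.15 mm
```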
Mario G. FROMOW RANGEL Akira NOGUCHI
The inverse problem we consider in this paper seeks, based on the equivalent source method, to determine the shape of perfectly conducting cylinders from the scattered far-field data obtained by using several incident waves. When incident waves of different frequencies are used, the shape of the scatterer can be reconstructed by employing only a small number of observation points. In the reconstruction problem, the conjugate gradient method is applied to determine the shape of the scatterer. The general approach is applicable to cylindrical scatterers of arbitrary shape. Results of numerical simulations are presented to support the suggested approach.