IEICE global.ieice.org Site

Keyword Search Result

[Keyword] Y(22683hit)

21041-21060hit(22683hit)

Duration Modeling with Decreased Intra-Group Temporal Variation for HMM-Based Phoneme Recognition
Nobuaki MINEMATSU Keikichi HIROSE

PAPER

Vol: E78-D No:6, Page(s): 654-661
A new clustering method was proposed to increase the effect of duration modeling on the HMM-based phoneme recognition. A precise observation on the temporal correspondences between a phoneme HMM with output probabilities by single Gaussian modeling and its training data indicated that there were two extreme cases, one with several types of correspondences in a phoneme class completely different from each other, and the other with only one type of correspondence. Although duration modeling was commonly used to incorporate the temporal information in the HMMs, a good modeling could not be obtained for the former case. Further observation for phoneme HMMs with output probabilities by Gaussian mixture modeling also showed that some HMMs still had multiple temporal correspondences, though the number of such phonemes was reduced as compared to the case of single Gaussian modeling. An appropriate duration modeling cannot be obtained for these phoneme HMMs by the conventional methods, where the duration distribution for each HMM state is represented by a distribution function. In order to cope with the problem, a new method was proposed which was based on the clustering of phoneme classes with plural types of temporal correspondences into sub-classes. The clustering was conducted so as to reduce the variations of the temporal correspondences in sub-classes. After the clustering, an HMM was constructed for each sub-class. Using the proposed method, speaker dependent recognition experiments were performed for phonemes segmented from isolated words. A few-percent increase was realized in the recognition rate, which was not obtained by another method based on the duration modeling with a Gaussian mixture.
A Study on Speaker Adaptation for Mandarin Syllable Recognition with Minimum Error Discriminative Training
Chih-Heng LIN Chien-Hsing WU Pao-Chung CHANG

PAPER

Vol: E78-D No:6, Page(s): 712-718
This paper investigates a different method of speaker adaptation for Mandarin syllable recognition. Based on the minimum classification error (MCE) criterion, we use the generalized probabilistic descent (GPD) algorithm to adjust iteratively the parameters of the hidden Markov models (HMM). The experiments on the multi-speaker Mandarin syllable database of Telecommunication Laboratories (T.L.) yield the following results: 1) Efficient speaker adaptation can be achieved through discriminative training using the MCE criterion and the GPD algorithm. 2) The computations required can be reduced through the use of the confusion sets in Mandarin base syllables. 3) For the discriminative training, the adjustment on the mean values of the Gaussian mixtures has the most prominent effect on speaker adaptation. 4) The discriminative training approach can be used to enhance the speaker adaptation capability of the maximum a posteriori (MAP) approach.
A Comparative Study of Output Probability Functions in HMMs
Seiichi NAKAGAWA Li ZHAO Hideyuki SUZUKI

PAPER

Vol: E78-D No:6, Page(s): 669-675
One of the most effective methods in speech recognition is the HMM, which has been used to model speech statistically. The discrete distribution and the continuous distribution HMMs have been widely used in various applications. However, in recent years, HMMs with various output probability functions have been proposed to further improve recognition performance, e.g. the Gaussian mixture continuous and the semi-continuous distributed HMMs. We recently have also proposed the RBF (radial basis function)-based HMM and the VQ-distortion based HMM, which use an RBF function and a VQ-distortion measure at each state instead of an output probability density function used by traditional HMMs. In this paper, we describe the RBF-based HMM and the VQ-distortion based HMM and compare their performance with the discrete distributed, the Gaussian mixture distributed and the semi-continuous distributed HMMs based on their speech recognition performance rates through experiments on speaker-independent spoken digit recognition. Our results confirmed that the RBF-based and VQ-distortion based HMMs are more robust and superior to traditional HMMs.
Tone Recognition of Chinese Dissyllables Using Hidden Markov Models
Xinhui HU Keikichi HIROSE

PAPER

Vol: E78-D No:6, Page(s): 685-691
A method of tone recognition has been developed for dissyllabic speech of Standard Chinese based on discrete hidden Markov modeling. As for the feature parameters of recognition, combination of macroscopic and microscopic parameters of fundamental frequency contours was shown to give a better result as compared to the isolated use of each parameter. Speaker normalization was realized by introducing an offset to the fundamental frequency. In order to avoid recognition errors due to syllable segmentation, a scheme of concatenated learning was adopted for training hidden Markov models. Based on the observations of fundamental frequency contours of dissyllables, a scheme was introduced to the method, where a contour was represented with a series of three syllabic tone models, two for the first and the second syllables and one for the transition part around the syllabic boundary. Corresponding to the voiceless consonant of the second syllable, fundamental frequency contour of a dissyllable may include a part without fundamental frequencies. This part was linearly interpolated in the current method. To prove the validity of the proposed method, it was compared with other methods, such as representing all of the dissyllabic contours as the concatenation of two models, assigning a special code to the voiceless part, and so on. Tone sandhi was also taken into account by introducing two additional models for the half-third tone and for the first 4th tone of the combination of two 4th tones. With the proposed method, average recognition rate of 96% was achieved for 5 male and 5 female speakers.
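The linear interpolation of the voiceless part mentioned in the abstract above can be illustrated with a minimal sketch. The frame-based representation, the use of `None` for frames without a fundamental frequency, and the function name `interpolate_unvoiced` are all assumptions made for illustration; the abstract does not describe the authors' actual implementation.

```python
def interpolate_unvoiced(f0):
    """Fill unvoiced frames (None) by linear interpolation between
    the nearest voiced frames on either side (hypothetical helper)."""
    out = list(f0)
    voiced = [i for i, value in enumerate(out) if value is not None]
    # Walk over consecutive pairs of voiced frames and fill the gap between them.
    for left, right in zip(voiced, voiced[1:]):
        gap = right - left
        for k in range(1, gap):
            out[left + k] = out[left] + (out[right] - out[left]) * k / gap
    # Leading or trailing unvoiced frames have only one neighbour and are left as None.
    return out


# Example contour in Hz with a three-frame voiceless stretch (made-up values).
contour = [200.0, 190.0, None, None, None, 150.0, 140.0]
filled = interpolate_unvoiced(contour)
```

After this step the contour has a value at every interior frame, so a single observation sequence can be passed to the tone models without a special code for the voiceless part.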
Recent Trends in Medical Microwave Radiometry
Shizuo MIZUSHINA Hiroyuki OHBA Katsumi ABE Shinya MIZOSHIRI Toshifumi SUGIURA

INVITED PAPER

Vol: E78-B No:6, Page(s): 789-798
Microwave radiometry has been investigated for non-invasive measurement of temperature in the human body. Recent trends are to explore the capability of retrieving a temperature profile or map from a set of brightness temperatures measured by a multifrequency radiometer operating in a 1-6GHz range. The retrieval of temperature from the multifrequency measurement data is formulated as an inverse problem in which the number of independent measurements or data is limited (7) and the data suffer from considerably large random fluctuations. The standard deviation of the data fluctuation is given by the brightness temperature resolution of the instrument (0.04-0.1K). Solutions are prone to instabilities and large errors unless proper solution methods are used. Solution methods developed during the last few years are reviewed: singular system analysis, bio-heat transfer solution matched with radiometric data, and model-fitting combined with Monte Carlo technique. Typical results obtained by these methods are presented to indicate a cross-section of the present state of development in the field. This review concludes with discussions on the radiometric weighting function which connects physical temperatures in the object to the brightness temperature. Three-dimensional weighting functions derived by the modal analysis and the FDTD method for a rectangular waveguide antenna coupled to a four-layered lossy medium are discussed. Development of temperature retrieval procedures incorporating the 3-D weighting functions is an important and challenging task for future work in this field.
Recent Progress of Electromagnetic Techniques in Hyperthermia Treatment
Makoto KIKUCHI

INVITED PAPER

Vol: E78-B No:6, Page(s): 799-808
In the early stage of hyperthermia, a large number of engineering efforts were made in the development or the improvement of the heating and temperature measuring techniques. However, they were not always satisfactory clinically. Thus, even at this moment, various engineering research efforts as well as the electromagnetic techniques for hyperthermia should be built up rapidly. This paper describes some of the highlights of developed or ongoing electromagnetic heating techniques in hyperthermia and identifies a trend of emerging electromagnetic heating. Furthermore, the author emphasizes that few medical engineering efforts have been made in the boundary field between pure physics and clinics, and that the proper way to develop the hyperthermia equipment is to make the best use of successes in the three essential regions: Physics, Biology and Clinics.
Performance of Spread Spectrum Medical Telemetry System in a Sharing Frequency Band with Current Telemetry System
Masaki KYOSO Toshiaki TAKANE Akihiko UCHIYAMA

LETTER

Vol: E78-B No:6, Page(s): 862-865
To make medical telemetry systems more reliable in severe electromagnetic environments, we applied spread spectrum communication to ECG data transmission. Spread spectrum communication systems have shown superior performance to other systems, especially with respect to anti-jamming, which allows them to share the frequency band with current telemetry systems. In this study, we show the characteristics of a spread spectrum transmitter when it is used in the same frequency band as a narrow-band transmitter. The result shows that the spread spectrum telemetry system can use the same frequency band permitted for medical telemetry systems.
Electromagnetic Near Fields of Rectangular Waveguide Antennas in Contact with Biological Objects Obtained by the FD-TD Method
Katsumi ABE Shinya MIZOSHIRI Toshifumi SUGIURA Shizuo MIZUSHINA

LETTER

Vol: E78-B No:6, Page(s): 866-870
Multifrequency microwave radiometry for non-invasive measurement of temperature in biological objects has been investigated in our laboratory. An open-ended rectangular waveguide filled with a dielectric has been used as a contact-type antenna of a radiometer operating over a 1-4GHz range. In the radiometric measurement, the radiometer measures the thermal radiation emitted by the object via the antenna as the brightness temperature. The brightness temperature is related to the physical temperatures in the object through the radiometric weighting function. By virtue of the reciprocity of the antenna, the weighting function can be derived from the field distribution induced in the object by the same antenna when it is operated in the active mode. In this paper, the FD-TD method is used to analyze the problem of coupling between the rectangular waveguide antenna and a biological object. The objects studied in this paper are a homogeneous and a four-layered lossy medium. The working frequency is 1.2GHz, which is the center frequency of the lowest-frequency band of our radiometer. Numerical results are presented in the form of SAR patterns. It is found that the SAR patterns tend to spread out in the lateral directions in the bolus, skin and fat layers due to the diffraction, which becomes stronger at lower frequencies. Results also suggest that the lateral spreading can be controlled to a certain extent by choosing the size of the antenna flange properly.
Analyses of Virtual Path Bandwidth Control Effects in ATM Networks
Hisaya HADAMA Ken-ichi SATO Ikuo TOKIZAWA

PAPER-Communication Systems and Transmission Equipment

Vol: E78-B No:6, Page(s): 907-915
This paper presents a newly developed analytical method which evaluates the virtual path bandwidth control effects for a general topology ATM (Asynchronous Transfer Mode) transport network. The virtual path concept can enhance the controllability of path bandwidth. Required link capacity to attain a specified call blocking probability can be reduced by applying virtual path bandwidth control. This paper proposes an analytical method to evaluate the call blocking probability of a general topology ATM network, which includes many virtual paths, that is using virtual path bandwidth control. A method for designing the link capacities of the network is also proposed. These methods make it possible to design an optimum transport network with path bandwidth control. Finally, a newly developed approximation technique is used to derive some analytical results on the effects of dynamic path bandwidth control, which are provided to demonstrate its effectiveness.
Coding for Multi-Pulse PPM with Imperfect Slot Synchronization in Optical Direct-Detection Channels
Kazumi SATO Tomoaki OHTSUKI Iwao SASASE

PAPER-Optical Communication

Vol: E78-B No:6, Page(s): 916-922
The performance of coded multi-pulse pulse position modulation (MPPM) consisting of m slots and 2 pulses, denoted as (m, 2) MPPM, with imperfect slot synchronization is analyzed. Convolutional codes and Reed-Solomon (RS) codes are employed for (m, 2) MPPM, and the bit error probability of coded (m, 2) MPPM in the presence of the timing offset is derived. In each coded (m, 2) MPPM, we compare the performance of some different code rate systems. Moreover, we compare the performance of both systems at the same information bit rate. It is shown that in both coded systems, the performance of code rate-1/2 coded (m, 2) MPPM is the best when the timing offset is small. When the timing offset is somewhat large, however, uncoded (m, 2) MPPM is shown to perform better than coded (m, 2) MPPM. Further, convolutional coded (m, 2) MPPM with the constraint length k = 7 is shown to perform better than RS coded (m, 2) MPPM for the same code rate.
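As a small illustration of the (m, 2) MPPM signal set described above, the sketch below enumerates the ways of placing 2 pulses in m slots and counts how many whole bits one symbol can carry. The function names and the choice m = 8 are assumptions for illustration only; the abstract does not give the slot counts used in the paper, and the coding and timing-offset analysis are not modelled here.

```python
from itertools import combinations
from math import comb


def mppm_symbols(m, pulses=2):
    """All ways to place `pulses` pulses in `m` slots (hypothetical helper).
    Each symbol is a tuple of the slot indices that contain a pulse."""
    return list(combinations(range(m), pulses))


def bits_per_symbol(m, pulses=2):
    """Whole information bits one symbol can carry: floor(log2(C(m, pulses)))."""
    # For a positive integer x, x.bit_length() - 1 equals floor(log2(x)) exactly.
    return comb(m, pulses).bit_length() - 1


symbols = mppm_symbols(8)       # assumed example: (8, 2) MPPM
usable_bits = bits_per_symbol(8)
```

For the assumed m = 8 there are C(8, 2) = 28 distinct symbols, of which 16 would be enough to carry 4 bits per symbol.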
Cooperative Spoken Dialogue Model Using Bayesian Network and Event Hierarchy
Masahiro ARAKI Shuji DOSHITA

PAPER

Vol: E78-D No:6, Page(s): 629-635
In this paper, we propose a dialogue model that reflects two important aspects of a spoken dialogue system: to be 'robust' and to be 'cooperative'. For this purpose, our model has two main inference spaces: Conversational Space (CS) and Problem Solving Space (PSS). CS is a kind of dynamic Bayesian network that represents the meaning of an utterance and general dialogue rules. The 'robust' aspect is treated in CS. PSS is a network, the so-called Event Hierarchy, that represents the structure of task domain problems. The 'cooperative' aspect is mainly treated in PSS. In constructing CS and making inferences on PSS, the system's process, from meaning understanding through response generation, is modeled by dividing it into five steps. These steps are (1) meaning understanding, (2) intention understanding, (3) communicative effect, (4) reaction generation, and (5) response generation. The meaning understanding step constructs CS, and the response generation step composes a surface expression of the system's response from a part of CS. The intention understanding step maps an utterance type in CS to an action in PSS. The reaction generation step selects a cooperative reaction in PSS and expands the reaction into an utterance type of CS. The status of problem solving and the user's declared preferences are recorded in the mental state by the communicative effect step. From our point of view, a cooperative problem solving dialogue is then regarded as a process of constructing CS and achieving a goal in PSS through these five steps.
Fast Solutions for Consecutive 2-out-of-r-from-n: F System
Yoichi HIGASHIYAMA Hiromu ARIYOSHI Miro KRAETZL

PAPER

Vol: E78-A No:6, Page(s): 680-684
The previous literature on consecutive k-out-of-r-from-n: F systems gives recursive equations for the system reliability only for the special case when all component probabilities are equal. This paper deals with the problem of calculating the reliability for a (linear or circular) consecutive 2-out-of-r-from-n: F system with unequal component probabilities. We provide two new algorithms for the linear and circular systems which have time complexity of O(n) and O(nr), respectively. The results of some computational experiments are also described.
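To make the system definition concrete: a linear consecutive 2-out-of-r-from-n: F system fails when some window of r consecutive components contains at least 2 failed components. The sketch below is a straightforward dynamic program written for illustration, not the authors' algorithm (the abstract does not give it). It assumes independent components and n >= r, and runs in O(nr) time rather than the O(n) reported in the paper. The function names and the example probabilities are made up.

```python
from itertools import product


def linear_reliability(p, r):
    """Reliability of a linear consecutive 2-out-of-r-from-n: F system.
    p[i] is the probability that component i works (independent components)."""
    n = len(p)
    # w[i]: probability that no two failures among the first i components
    # lie within the same window of r consecutive components.
    w = [1.0] * (n + 1)
    for i in range(1, n + 1):
        q_i = 1.0 - p[i - 1]
        # If component i fails, the r-1 components just before it must all work.
        block = 1.0
        for j in range(max(i - r + 1, 1), i):
            block *= p[j - 1]
        w[i] = p[i - 1] * w[i - 1] + q_i * block * w[max(i - r, 0)]
    return w[n]


def brute_force_reliability(p, r):
    """Exhaustive check over all 2**n component states (small n only)."""
    total = 0.0
    for state in product([True, False], repeat=len(p)):  # True = working
        failed = [i for i, ok in enumerate(state) if not ok]
        # Two failures share a window of r consecutive slots iff they are < r apart.
        if any(b - a < r for a, b in zip(failed, failed[1:])):
            continue
        prob = 1.0
        for ok, p_i in zip(state, p):
            prob *= p_i if ok else 1.0 - p_i
        total += prob
    return total


example_p = [0.9, 0.8, 0.95, 0.7, 0.85, 0.9]  # made-up component probabilities
rel_dp = linear_reliability(example_p, r=3)
rel_bf = brute_force_reliability(example_p, r=3)
```

The exhaustive version exists only to cross-check the recurrence on small inputs; the circular case needs extra handling of windows that wrap around and is not covered here.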
Global Dynamic Behaviour of a Parallel Blower System
Hideaki OKAZAKI Hideo NAKANO Takehiko KAWASE

PAPER-Nonlinear Problems

Vol: E78-A No:6, Page(s): 715-726
A parallel blower system is quite familiar in hydraulic machine systems and quite often employed in many process industries. It is dynamically dual to the fundamental functional element of digital computer, that is, the flip-flop circuit, which was extensively studied by Moser. Although the global dynamic behaviour of such systems has significant bearing upon system operation, no substantial study reports have hitherto been presented. Extensive research concern has primarily been concentrated upon the local stability of the equilibrium point. In the paper, a piecewise linear model is used to analytically and numerically investigate its manifold global dynamic behaviour. To do this, first the Poincaré map is defined as a composition boundary map, each of which is defined as the point transformation from the entry point to the end point of any trajectory on some boundary planes. It was already shown that, in some parameter region, the system exhibits the so-called chaotic states. The chaos generating process is investigated using the above Poincaré map and it is shown that the map contains the contracting, stretching and folding operations which, as we often see in many cases particularly in the horseshoe map, produce the chaotic states. Considering the one dimensional motions of the orbits by such Poincaré map, that is, the stretching and folding operations, a one dimensional approximation of the Poincaré map is introduced to more closely and exactly study manifold bifurcation processes and some illustrative bifurcation diagrams in relation to system parameters are presented. Thus it is shown how many types of bifurcations like the Hopf, period doubling, saddle node, and homoclinic bifurcations come to exist in the system.
A Markovian Software Availability Measurement with a Geometrically Decreasing Failure-Occurrence Rate
Koichi TOKUNO Shigeru YAMADA

PAPER-Reliability and Fault Analysis

Vol: E78-A No:6, Page(s): 737-741
We develop a software availability model incorporating software failure-occurrence and fault-correction times, under the assumption that the hazard rate for software failure-occurrence decreases geometrically with the progress of the fault-removal process. Considering that the software system alternates between two states, i.e. the operational state in which the system is operating and the maintenance state in which the system is inoperable due to the fault-correction activity, we model the time-dependent behavior of the system with a Markov process. Expressions for several quantities of software system performance are derived from this model. Finally, numerical examples are presented for illustration of software availability measurement.
New Error Probability Upper Bound on Maximum Likelihood Sequence Estimation for Intersymbol Interference Channels
Hiroshi NOGAMI Gordon L. STÜBER

PAPER-Information Theory and Coding Theory

Vol: E78-A No:6, Page(s): 742-752
A new upper bound on the error probability for maximum likelihood sequence estimation of digital signaling on intersymbol interference channels with additive white Gaussian noise is presented. The basic idea is to exclude all parallel error sequences and to exclude some of the overlapping error events from the union bound. It is shown that the new upper bound can be easily and efficiently computed by using a properly labeled error-state diagram and a one-directional stack algorithm. Several examples are presented that compare the new upper bound with bounds previously reported in the literature.
Automatic Language Identification Using Sequential Information of Phonemes
Takayuki ARAI

PAPER

Vol: E78-D No:6, Page(s): 705-711
In this paper approaches to language identification based on the sequential information of phonemes are described. These approaches assume that each language can be identified from its own phoneme structure, or phonotactics. To extract this phoneme structure, we use phoneme classifiers and grammars for each language. The phoneme classifier for each language is implemented as a multi-layer perceptron trained on quasi-phonetic hand-labeled transcriptions. After training the phoneme classifiers, the grammars for each language are calculated as a set of transition probabilities for each phoneme pair. Because of the interest in automatic language identification for worldwide voice communication, we decided to use telephone speech for this study. The data for this study were drawn from the OGI (Oregon Graduate Institute)-TS (telephone speech) corpus, a standard corpus for this type of research. To investigate the basic issues of this approach, two languages, Japanese and English, were selected. The language classification algorithms are based on Viterbi search constrained by a bigram grammar and by minimum and maximum durations. Using a phoneme classifier trained only on English phonemes, we achieved 81.1% accuracy. We achieved 79.3% accuracy using a phoneme classifier trained on Japanese phonemes. Using both the English and the Japanese phoneme classifiers together, we obtained our best result: 83.3%. Our results were comparable to those obtained by other methods such as that based on the hidden Markov model.
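The "grammar" in the abstract above is described as a set of transition probabilities for each phoneme pair. A minimal sketch of estimating such bigram probabilities from labeled phoneme sequences follows. The relative-frequency estimate without smoothing, the function name `bigram_probs`, and the toy sequences are assumptions for illustration; the abstract does not specify how the authors estimate or smooth these probabilities.

```python
from collections import Counter


def bigram_probs(sequences):
    """Estimate P(next phoneme | current phoneme) by relative frequency
    (hypothetical helper, no smoothing)."""
    pair_counts = Counter()
    first_counts = Counter()
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            pair_counts[(current, nxt)] += 1
            first_counts[current] += 1
    return {
        (current, nxt): count / first_counts[current]
        for (current, nxt), count in pair_counts.items()
    }


# Toy phoneme label sequences (made up, not from the OGI-TS corpus).
train = [["k", "a", "t", "a"], ["t", "a", "k", "o"]]
probs = bigram_probs(train)
```

One such table per language could then serve as the bigram constraint in a Viterbi search, with the language whose table best explains the decoded phoneme sequence being chosen.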
An Objective Measure Based on an Auditory Model for Assessing Low-Rate Coded Speech
Toshiro WATANABE Shinji HAYASHI

PAPER

Vol: E78-D No:6, Page(s): 751-757
We propose an objective measure for assessing low-rate coded speech. The model for this objective measure, in which several known features of the perceptual processing of speech sounds by the human ear are emulated, is based on the Hertz-to-Bark transformation, critical-band filtering with preemphasis to boost higher frequencies, nonlinear conversion for subjective loudness, and temporal (forward) masking. The effectiveness of the measure, called the Bark spectral distortion rating (BSDR), was validated by second-order polynomial regression analysis between the computed BSDR values and subjective MOS ratings obtained for a large number of utterances coded by several versions of CELP coders and one VSELP coder under three degradation conditions: input speech levels, transmission error rates, and background noise levels. The BSDR values correspond better to MOS ratings than several commonly used measures. Thus, BSDR can be used to accurately predict subjective scores.
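The Hertz-to-Bark transformation named above maps frequency in Hz onto the perceptual Bark (critical-band) scale. Several closed-form approximations exist, and the abstract does not say which one the authors use. The sketch below assumes one widely used form, b = 6 asinh(f / 600), with its inverse f = 600 sinh(b / 6); the function names are illustrative.

```python
import math


def hz_to_bark(f_hz):
    """Approximate Bark value for a frequency in Hz (assumed formula:
    b = 6 * asinh(f / 600); other approximations are also in common use)."""
    return 6.0 * math.asinh(f_hz / 600.0)


def bark_to_hz(bark):
    """Inverse of hz_to_bark: f = 600 * sinh(b / 6)."""
    return 600.0 * math.sinh(bark / 6.0)


# Band edges spaced uniformly in Bark become progressively wider in Hz.
edges_hz = [bark_to_hz(b) for b in range(0, 16)]
```

Spacing analysis bands uniformly on this scale gives narrow bands at low frequencies and wide bands at high frequencies, which is the property that critical-band filtering relies on.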
Error Analysis of Field Trial Results of a Spoken Dialogue System for Telecommunications Applications
Shingo KUROIWA Kazuya TAKEDA Masaki NAITO Naomi INOUE Seiichi YAMAMOTO

PAPER

Vol: E78-D No:6, Page(s): 636-641
We carried out a one year field trial of a voice-activated automatic telephone exchange service at KDD Laboratories which has about 200 branch phones. This system has DSP-based continuous speech recognition hardware which can process incoming calls in real time using a vocabulary of 300 words. The recognition accuracy was found to be 92.5% for speech read from a written text under laboratory conditions independent of the speaker. In this paper, we describe the performance of the system obtained as a result of the field trial. Apart from recognition accuracy, there was about 20% error due to out-of-vocabulary input and incorrect detection of speech endpoints which had not been allowed for in the laboratory experiments. Also, we found that the recognition accuracy for actual speech was about 18% lower than for speech read from text even if there were no out-of-vocabulary words. In this paper, we examine error variations for individual data in order to try and pinpoint the cause of incorrect recognition. It was found from experiments on the collected data that the pause model used, filled pause grammar and differences of channel frequency response seriously affected recognition accuracy. With the help of simple techniques to overcome these problems, we finally obtained a recognition accuracy of 88.7% for real data.
Computation of the Field Distribution Generated by a Rectangular Aperture in a Four-Layered Lossy Dielectric Medium by Modal Analysis
Shinya MIZOSHIRI Katsumi ABE Toshifumi SUGIURA Shizuo MIZUSHINA

PAPER

Vol: E78-B No:6, Page(s): 851-858
An open-ended rectangular waveguide filled with a dielectric has been used as a contact-type antenna of a microwave radiometer for non-invasive measurement of temperature in a biological object. In this application, the thermal radiation emitted by the object is measured as the brightness temperature by the instrument via the antenna. The brightness temperature is related to the physical temperatures in the object through the radiometric weighting function. By virtue of the reciprocity of the antenna, the weighting function can be derived from the field distribution induced in the object by the antenna when it is operated in the active mode. In this work, we treat the problem of the rectangular waveguide antenna radiating into a four-layered medium by modal analysis. The results are first compared with those obtained by the FD-TD method to show that the results of the two methods are in good agreement. The operation of an antenna used in a radiometer system in our laboratory is then analyzed by this method, the weighting functions at different frequencies are computed, and the results are presented and discussed.
A Flexible Hybrid Channel Assignment Strategy Using an Artificial Neural Network in a Cellular Mobile Communication System
Kazuhiko SHIMADA Masakazu SENGOKU Takeo ABE

PAPER

Vol: E78-A No:6, Page(s): 693-700
A novel algorithm, as an advanced Hybrid Channel Assignment strategy, for channel assignment problem in a cellular system is proposed. A difference from the conventional Hybrid Channel Assignment method is that flexible fixed channel allocations which are variable through the channel assignment can be performed in order to cope with varying traffic. This strategy utilizes the Channel Rearrangement technique using the artificial neural network algorithm in order to enhance channel occupancy on the fixed channels. The strategy is applied to two simulation models which are the spatial homogeneous and inhomogeneous systems in traffic. The simulation results show that the strategy can effectively improve blocking probability in comparison with pure dynamic channel assignment strategy only with the Channel Rearrangement.
