IEICE global.ieice.org Site

Keyword Search Result

[Keyword] (42756hit)

40381-40400hit(42756hit)

An Integration of Knowledge and Neural Networks toward a Phoneme Typewriter without a Language Model
Yasuhiro KOMORI Kaichiro HATAZAKI

PAPER-Continuous Speech Recognition

Vol:
E74-A No:7
Page(s):
1797-1805
In this paper, a speech recognition system toward a phoneme typewriter without a language model is proposed. The system is realized as an integration of spectrogram reading knowledge and Time-Delay Neural Networks (TDNNs). The system mainly consists of two parts. In the consonant recognition part, a sophisticated integration of knowledge and TDNN is proposed; this not only improves recognition performance and segmentation accuracy but also reduces insertion errors drastically. In the vowel recognition part, a TDNN is used for detection and rough segmentation, exploiting its tolerance to time shifts. The knowledge part is mainly used for verification of categories and boundaries. A phoneme recognition experiment on 2,620 Japanese words uttered by one male speaker showed a 91.4% (11,612/12,710) recognition rate, a 3.6% deletion error rate, a 5.0% substitution error rate, and a 20.7% insertion error rate for all Japanese phonemes. This result was obtained without any language model.
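The reported figures can be cross-checked with a few lines of arithmetic. The sketch below uses only the counts and percentages quoted in the abstract; the helper name `percent` is hypothetical, and the assumption that the recognition, deletion, and substitution rates share the same denominator (12,710 reference phonemes) is an inference from the numbers, not something the abstract states.

```python
def percent(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)

# Counts quoted in the abstract.
total_phonemes = 12710
correct_phonemes = 11612

recognition_rate = percent(correct_phonemes, total_phonemes)

# Phonemes that were not recognized correctly (deleted or substituted).
misrecognized = total_phonemes - correct_phonemes
misrecognition_rate = percent(misrecognized, total_phonemes)

# Percentages quoted in the abstract; under the shared-denominator
# assumption, correct + deleted + substituted should cover every phoneme.
reported_deletion = 3.6
reported_substitution = 5.0
reported_sum = round(recognition_rate + reported_deletion + reported_substitution, 1)

print(recognition_rate, misrecognition_rate, reported_sum)
```

Insertions are extra phonemes in the output rather than errors on reference phonemes, which is why the 20.7% insertion rate sits outside this sum.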
SUSKIT---A Speech Understanding System Based on Robust Phone Spotting--
Yutaka KOBAYASHI Masanori OMOTE Hidenori ENDO Yasuhisa NIIMI

PAPER-Speech Understanding

Vol:
E74-A No:7
Page(s):
1863-1869
This paper gives an overview of our speech understanding system and reports the recent results of sentence recognition experiments. The system, which we call SUSKIT-, recognizes database queries in natural Japanese sentences. The user is expected to speak sentence by sentence. Among the difficult problems to overcome, this study paid particular attention to how to cope with the contextual variations of pronunciation and how to verify partial sentence hypotheses in a hierarchical system. The SUSKIT- predicts word strings in a top-down manner; however, the verification of hypotheses against the input speech is done using a unit independent of word boundaries. Words are not suitable units of verification because the smoothing effect owing to phonetic contexts makes it difficult to recognize short words. In order to avoid the misrecognition caused by the smoothing effect across word boundaries, the SUSKIT- dynamically extracts those phoneme strings bounded by the easily detectable phonemes from the predicted word string as verification templates. The left-to-right time-synchronous beam-search strategy was adopted for searching likely sentences. We carried out sentence recognition experiments using a speech corpus consisting of 159 sentences read by three Japanese male speakers. The task perplexity was 8.3. Using speaker-dependent HMM parameters, we obtained sentence recognition rates of 83.0-92.5%.
FOREWORD
Tadayuki KOBAYASHI

FOREWORD

Vol:
E74-C No:7
Page(s):
1947-1948
Japanese Phonetic Typewriter Using HMM Phone Recognition and Stochastic Phone-Sequence Modeling
Takeshi KAWABATA Toshiyuki HANAZAWA Katsunobu ITOH Kiyohiro SHIKANO

PAPER-Dictation Systems

Vol:
E74-A No:7
Page(s):
1783-1787
A phonetic typewriter is an unlimited-vocabulary continuous speech recognition system recognizing each phone in speech without the need for lexical information. This paper describes a Japanese phonetic typewriter system based on HMM phone recognition and syllable-based stochastic phone sequence modeling. Even though HMM methods have considerable capacity for recognizing speech, it is difficult to recognize individual phones in continuous speech without lexical information. HMM phone recognition is improved by incorporating syllable trigrams for phone sequence modeling. HMM phone units are trained using an isolated word database, and their duration parameters are modified according to speaking rate. Syllable trigram tables are made from a text database of over 300,000 syllables, and phone sequence probabilities calculated from the trigrams are combined with HMM probabilities. Using these probabilities to limit the number of intermediate candidates leads to an accurate phonetic typewriter system without requiring excessive computation time. An interpolated n-gram approach to phone sequence modeling is shown to be more effective than a simple trigram method.
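To illustrate why an interpolated n-gram can outperform a plain trigram, the sketch below mixes trigram, bigram, and unigram relative frequencies with fixed weights (simple linear interpolation). This is an assumed, generic formulation: the abstract does not give the paper's interpolation scheme, and the function names, the weights `(0.6, 0.3, 0.1)`, and the toy syllable data are all hypothetical.

```python
from collections import Counter

def train(sequences):
    """Collect n-gram and context counts from syllable sequences."""
    uni, bi, tri = Counter(), Counter(), Counter()
    ctx1, ctx2 = Counter(), Counter()
    for seq in sequences:
        padded = ["<s>", "<s>"] + list(seq)  # two start symbols as left context
        for i in range(2, len(padded)):
            w2, w1, w = padded[i - 2], padded[i - 1], padded[i]
            uni[w] += 1
            bi[(w1, w)] += 1
            tri[(w2, w1, w)] += 1
            ctx1[w1] += 1
            ctx2[(w2, w1)] += 1
    return uni, bi, tri, ctx1, ctx2, sum(uni.values())

def trigram_prob(w, w2, w1, model):
    """Plain maximum-likelihood trigram estimate (0 for unseen events)."""
    _, _, tri, _, ctx2, _ = model
    return tri[(w2, w1, w)] / ctx2[(w2, w1)] if ctx2[(w2, w1)] else 0.0

def interpolated_prob(w, w2, w1, model, lambdas=(0.6, 0.3, 0.1)):
    """Linear interpolation of trigram, bigram, and unigram estimates."""
    uni, bi, _, ctx1, _, total = model
    l3, l2, l1 = lambdas  # hypothetical weights; they must sum to 1
    p_uni = uni[w] / total if total else 0.0
    p_bi = bi[(w1, w)] / ctx1[w1] if ctx1[w1] else 0.0
    return l3 * trigram_prob(w, w2, w1, model) + l2 * p_bi + l1 * p_uni

# Toy training data (hypothetical syllable sequences).
model = train([["ka", "ta", "na"], ["ka", "ta", "ka"]])

seen = interpolated_prob("na", "ka", "ta", model)      # trigram observed
unseen_tri = trigram_prob("na", "ta", "ka", model)     # trigram never observed
unseen_int = interpolated_prob("na", "ta", "ka", model)
print(seen, unseen_tri, unseen_int)
```

The unseen syllable sequence gets zero probability from the plain trigram but a small non-zero probability from the interpolated model, so a correct but rare phone sequence is not pruned outright during the candidate search.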
Processing Unknown Words in Continuous Speech Recognition
Kenji KITA Terumasa EHARA Tsuyoshi MORIMOTO

PAPER-Continuous Speech Recognition

Vol:
E74-A No:7
Page(s):
1811-1816
Current continuous speech recognition systems essentially ignore unknown words. Systems are designed to recognize words in the lexicon. However, to use speech recognition systems in a real application such as spoken-language processing, it is very important to process unknown words. This paper proposes a continuous speech recognition method which accepts any utterance that might include unknown words. In this method, words not in the lexicon are transcribed as phone sequences, while words in the lexicon are recognized correctly. The HMM-LR speech recognition system, which is an integration of Hidden Markov Models and generalized LR parsing, is used as the baseline system and is enhanced with a trigram model of syllables to take into account the stochastic characteristics of a language. In our approach, two kinds of grammars, a task grammar which describes the task and a phonetic grammar which describes constraints between phones, are merged and used in the HMM-LR system. The system can output a phonetic transcription for an unknown word by using the phonetic grammar. Experimental results indicate that our approach is very promising.
Continuous Speech Recognition Using Two-Level LR Parsing
Kenji KITA Toshiyuki TAKEZAWA Tsuyoshi MORIMOTO

PAPER-Continuous Speech Recognition

Vol:
E74-A No:7
Page(s):
1806-1810
This paper describes a continuous speech recognition system using two-level LR parsing and phone based HMMs. ATR has already implemented a predictive LR parsing algorithm in an HMM-based speech recognition system for Japanese. However, up to now, this system has used only intra-phrase grammatical constraints. In Japanese, a sentence is composed of several phrases and thus, two kinds of grammars, namely an intra-phrase grammar and an inter-phrase grammar, are sufficient for recognizing sentences. Two-level LR parsing makes it possible to use not only intra-phrase grammatical constraints but also inter-phrase grammatical constraints during speech recognition. The system is applied to Japanese sentence recognition where sentences were uttered phrase by phrase, and attains a word accuracy of 95.9% and a sentence accuracy of 84.7%.
Phenomenological Description of Conduction Mechanism of High-T_c Superconductors by Three-Fluid Model
Yoshio KOBAYASHI Tadashi IMAI

PAPER

Vol:
E74-C No:7
Page(s):
1986-1992
A new conduction mechanism of high-Tc superconductors is proposed to interpret experimental results phenomenologically. It is called a three-fluid model, since a nonpairing residual normal electron density n_res is added to the super electron density n_s and the normal electron density n_n of the well-known two-fluid model. According to this model, an admittance equivalent circuit of a superconductor in a unit cube is derived, and n_n, n_s, the complex conductivity, the surface impedance Z_s, the skin depth δ, and the penetration depth λ are expressed as functions of temperature. The electron densities n_s, n_n, and n_res are also expressed as functions of Z_s. For Z_s, the complex conductivity, δ, λ, n_n, and n_s of a YBCO plate, the values calculated by this model agree well, at temperatures T below about 0.9 Tc, with the ones measured by the dielectric resonator method, where Tc is the critical temperature; thus the validity of this model is verified. Also, a ratio n_res/n_t of 0.18 is determined by using n_res of 1.7×10^24 m^-3 and n_t of 9.5×10^24 m^-3 obtained from the measured Z_s values; this means that there is an 18% nonpairing electron density in this YBCO plate. The ratio n_res/n_t can be used as a figure of merit of high-Tc superconductors.
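The 18% figure follows directly from the two quoted densities. As a worked check, taking the reported values as n_res = 1.7×10^24 m^-3 and n_t = 9.5×10^24 m^-3:

```latex
\[
\frac{n_{\mathrm{res}}}{n_t}
  = \frac{1.7 \times 10^{24}\ \mathrm{m^{-3}}}{9.5 \times 10^{24}\ \mathrm{m^{-3}}}
  \approx 0.179 \approx 0.18
\]
```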
Comparison of Syntax-Oriented Spoken Japanese Understanding System with Semantic-Oriented System
Seiichi NAKAGAWA Yoshimitsu HIRATA Isao MURASE Tomohiro TANOUE

PAPER-Speech Understanding

Vol:
E74-A No:7
Page(s):
1854-1862
This paper describes syntax/semantics-oriented spoken Japanese understanding systems named "SPOJUS-SYNO/SEMO" and compares them. First, these systems automatically build Hidden Markov Models (HMMs) for word units by concatenating syllables. Then a word lattice is hypothesized for an input utterance by using a word spotting algorithm and the word-based HMMs. In SPOJUS-SYNO, a time-synchronous left-to-right parsing algorithm is executed to find the best word sequence from the word lattice according to syntactic and semantic knowledge represented by a context-free semantic grammar. In SPOJUS-SEMO, the knowledge of syntax and semantics is represented by a dependency and case grammar. These systems were implemented in the "UNIX-QA" task with a vocabulary size of 521 words. Experimental results show that the sentence recognition/understanding rate was about 80/87% for six male speakers with SPOJUS-SYNO, but performance was very low with SPOJUS-SEMO.
Induced Noise Properties Caused by Circuit Interruption with Electric Contacts
Keiichi UCHIMURA Hiroshi FUJITA

PAPER-Electromagnetic Compatibility

Vol:
E74-B No:7
Page(s):
1935-1940
Electric contacts are among the most important sources of electromagnetic noise. Hence, the noise of contact switching has been researched from various points of view with respect to its generation mechanism and properties. However, many phenomena have not yet been investigated. In this paper, we describe our recent investigations into the characteristics of the induced noise that is produced by the break of a silver contact. The number of TTL IC malfunctions during relay switching is counted under conditions of inductive load (10 mH), circuit current (0.1-2 A), and low source voltage (24 V). From these experimental results, it became clear that the rate of malfunction decreased with increasing circuit current. To clarify this phenomenon, the circuit current dependence of the induced noise voltage was measured. It was observed that the level of induced noise voltage reached its maximum in the current range of 0.2-1 A. This property is discussed in terms of the occurrence mechanism of each discharge mode on the break of contacts and the observation of the induced noise corresponding to each mode.
Continuous Speech Recognition Using a Dependency Grammar and Phoneme-Based HMMs
Sho-ichi MATSUNAGA Shigeru HOMMA Shigeki SAGAYAMA Sadaoki FURUI

PAPER-Continuous Speech Recognition

Vol:
E74-A No:7
Page(s):
1826-1833
This paper describes two Japanese continuous speech recognition systems (system-1 and system-2) based on phoneme-based HMMs and a two-level grammar approach. The two grammars are an intra-phrase transition network grammar for phrase recognition and an inter-phrase dependency grammar for sentence recognition. A joint score, combining acoustic likelihood and linguistic certainty factors derived from phoneme-based HMMs and dependency rules, is maximized to obtain the best sentence recognition results. System-1 is tuned for sentences uttered phrase-by-phrase and system-2 is tuned for sentence utterances, to make the amount of computation practical. In system-1, an efficient parsing algorithm is used for each grammar: a bi-directional network parser and a breadth-first dependency parser. With the phrase-network parser, input phrase utterances are parsed bi-directionally, both left-to-right and right-to-left, and optimal Viterbi paths are found along which the accumulated phonetic likelihood is maximized. The dependency parser utilizes efficient breadth-first search and beam search algorithms. For system-2, we have extended the dependency analysis algorithm for sentence utterances, using a technique for detecting most-likely multi-phrase candidates based on the Viterbi phrase alignment. Where the perplexity of the phrase syntax is 40, system-1 and system-2 increase phrase recognition performance in the sentence by approximately 6% and 14%, respectively, showing the effectiveness of semantic dependency analysis.
A Use of Current Continuity Condition in GTD-MM Hybrid Technique
Xu ZHANG Naoki INAGAKI Nobuyoshi KIKUMA

PAPER-Electromagnetic Theory

Vol:
E74-C No:7
Page(s):
2055-2060
A current continuity equation is proposed as an additional equation for the GTD-MM hybrid technique formulation to secure the uniqueness of the solution, which was lacking in the conventional formulation with the matching-point equation. The current continuity equation, which ensures the current continuity and satisfies the boundary condition, can be written down directly by equating the MM-region current to the GTD-region current at the boundary between the regions. It is proved that the current continuity equation is equivalent to the special case of the matching-point equation in which the matching point is located very close to the boundary, which gave the best solution in the conventional formulation with the matching-point equation, as explained by Burnside et al. The validity of the new equation is confirmed through numerical results.
A Note on Dual Trail Partition of a Plane Graph
Shuichi UENO Katsufumi TSUJI Yoji KAJITANI

LETTER-Graphs, Networks and Matroids

Vol:
E74-A No:7
Page(s):
1915-1917
Given a plane graph G, a trail of G is said to be dual if it is also a trail in the geometric dual of G. We show that the problem of partitioning the edges of G into the minimum number of dual trails is NP-hard.
Characteristics of All-Nb Thin Film Microbridges Fabricated by Nanometer Process
Yoshinori UZAWA Nobumitsu HIROSE Yuichi HARADA Motoaki SANO Matsuo SEKINE Kazuo YAMAGUCHI Hiroyuki OZAKI Akira HIRAO Shigeru YOSHIMORI Mitsuo KAWAMURA

PAPER

Vol:
E74-C No:7
Page(s):
2015-2019
We have fabricated all-Nb thin film microbridges by a nanometer process using a new resist that we developed, electron beam lithography (EBL), and reactive ion etching (RIE) using CBrF3 gas. The resistance of this EB resist against CBrF3 plasma is 4-10 times that of poly(methyl methacrylate) (PMMA). The merits of RIE using CBrF3 gas are anisotropic etching and high selectivity between resist and target. A trench about 20 nm wide was fabricated. Using this technique, a bridge 40 nm long and 50 nm wide was fabricated, and the thickness of the bridge was 100 nm. The capacitance of the junction was estimated as 0.004 pF. Because of this small capacitance, the fabricated samples are suitable for detection of submillimeter waves. The critical current Ic(T) of the fabricated samples was proportional to (1 − T/Tc)^(3/2), as for a variable thickness bridge (VTB). Moreover, Shapiro steps up to the 11th were observed under millimeter wave radiation (70 GHz).
Study of Mass Production of Low Ohm Metal Film Resistors Prepared by Electroless Plating
Hiroo AOKI

PAPER-Components

Vol:
E74-C No:7
Page(s):
2049-2054
Low ohm metal film resistors under 1 Ω can be made by electroless Ni-P plating. However, they are not suitable for precision use because their TCR cannot be made less than 100 ppm/°C. The author therefore developed Ni-W-P metal film resistors whose TCR is equal to that of conventional precision metal film resistors, by changing the ratio of ingredients in the plating bath. First, in this paper, the trade-off point of the ingredient ratio with respect to TCR and deposition rate is investigated, assuming mass production of 30,000 resistors in a lot. Then reliability tests, including short-time overload, temperature cycle, pulse, step stress, load life, and moisture load life, were carried out for both conventional Ni-P and Ni-W-P metal film resistors. Before the tests, the films were grooved by laser and mechanical cutting. As a result, it has been found that Ni-W-P is superior to Ni-P, especially in step stress and pulse tests, and that laser cutting is preferable to mechanical cutting.
Fabrication of Y-Ba-Cu-O Superconducting Films by Rapid Thermal Annealing
Yasuo UNEKAWA Takuo SUGANO

PAPER

Vol:
E74-C No:7
Page(s):
1955-1959
Rapid thermal annealing of sputter-deposited Y-Ba-Cu-O (YBCO) films is investigated. Annealing above 980°C results predominantly in the decomposition reaction of YBCO, but annealing at 960°C for 2 min yields crystallization of YBCO into a low-Tc phase which has a c-axis preferred orientation on MgO substrates. The Tc end of the films was about 60 K. Superconducting YBCO films are also obtained on ZrO2/Si substrates by rapid thermal annealing at 940°C for 5 sec and holding at 500°C for 5 min.
Formation of Au Electrodes on 80K-Phase Bi-Sr-Ca-Cu-O Single Crystal Surfaces and Their Characteristics by XPS
Satoru KISHIDA Heizo TOKUTAKA Makoto CHIHAYA Wataru FUTO Fumihiko TODA Katsumi NISHIMORI Naganori ISHIHARA

PAPER

Vol:
E74-C No:7
Page(s):
1967-1971
Au/Bi-Sr-Ca-Cu-O (BSCCO) specimens were prepared by evaporating Au onto the air-cleaved surfaces of 80 K-phase BSCCO single crystals and heating in Ar gas. The electrical contact resistance of the Au/BSCCO specimen decreased on heating at 400°C in Ar gas. From the depth profiles of the specimens obtained by X-ray photoelectron spectroscopy, we found that there was a reaction region (Ar+-sputter etching time is 0 to 2 min (20 )) on the surface of the Au/BSCCO specimen heated at 400°C, where Au diffused into the bulk of the BSCCO single crystals. The reaction region is thought to cause the decrease of the electrical contact resistance of the specimen.
A Japanese Text Dictation System Based on Phoneme Recognition and a Dependency Grammar
Shozo MAKINO Akinori ITO Mitsuru ENDO Ken'iti KIDO

PAPER-Dictation Systems

Vol:
E74-A No:7
Page(s):
1773-1782
This paper gives an overview of a Japanese text dictation system composed of an acoustic processor and a linguistic processor. The system deals with 843 conceptual words and 431 functional words. The phoneme recognition is carried out using a modified LVQ2 method which we propose. The phoneme recognition score was 86.1% for 226 sentences uttered by two male speakers. The linguistic processor is composed of a processor for spotting Bunsetsu-units and a syntactic processor. The structure of the Bunsetsu-unit is effectively described by a finite-state automaton. The test-set perplexity of the finite-state automaton is 230. In the processor for spotting Bunsetsu-units, the Bunsetsu-units are spotted from a recognized phoneme sequence using a syntax-driven continuous-DP matching algorithm, and then a Bunsetsu-unit lattice is generated. In the syntactic processor, the Bunsetsu-unit lattice is parsed based on the dependency grammar. The dependency grammar is expressed as the correspondence between a FEATURE marker in a modifier-Bunsetsu and a SLOT-FILLER marker in a head-Bunsetsu. The recognition scores for Bunsetsu-units and conceptual words were 73.2% and 85.7%, respectively, for 226 sentences uttered by the two male speakers.
Alternate Approach to the Stability of Linear Combinations of Polynomials
Norio FUKUMA Takehiro MORI

PAPER-Control and Computing

Vol:
E74-A No:7
Page(s):
1911-1914
The stability of convex combinations of polynomials and a stability margin of stable polynomials are studied using Hermite matrices for continuous-time systems. Available results are found to impose a heavy computational burden, especially in checking the stability of a polytope of polynomials by means of "the edge theorem". We propose alternate stability conditions and a margin which reduce the computational burden. In our approach, the stability condition reported by Bialas and Garloff can be derived readily.
Highly Sensitive Electric Field Sensor Using LiNbO₃ Optical Modulator
Kimihiro TAJIMA Nobuo KUWABARA Fujio AMEMIYA

LETTER-Electromagnetic Compatibility

Vol:
E74-B No:7
Page(s):
1941-1943
This letter describes a highly sensitive broadband electric field sensor that uses a LiNbO₃ optical modulator. A broadband, low driving-power optical modulator and a high-power optical source are used to achieve high sensitivity. A minimum detection level of 1 mV/m and a bandwidth of 1 GHz are obtained.
Evaluation for a Database Recovery Action with Periodical Checkpoint Generations
Satoshi FUKUMOTO Naoto KAIO Shunji OSAKI

PAPER-Fault Tolerant Computing

Vol:
E74-D No:7
Page(s):
2076-2082
It is of great importance to carry out a recovery action to reconstruct the logical consistency of the database on the occasion of a system failure. Such a recovery action consists of two operations. One is the UNDO operation, which rolls back the effects of all incomplete transactions from the database, and the other is the REDO operation, which reflects the results of all complete transactions in the database. In general, we limit the amount of REDO operation by generating checkpoints, in which the results of complete transaction(s) are collected in a safe place. In this paper, we discuss the evaluation of a database recovery action with periodical checkpoint generations. A new model is proposed to evaluate the recovery action in a case where the failure rate of the system changes with time. The expected recovery time and the availability for one cycle are derived under the assumption of an arbitrary failure-time distribution. In particular, we analytically obtain the optimum checkpoint interval with the maximum availability in the case of an exponential distribution. We numerically calculate the above results by assuming Weibull distributions. We further discuss the numerical results while varying the parameters that we define in our model, and show the impact of these parameters on the recovery action.
