IEICE global.ieice.org Site

Keyword Search Result

[Keyword] EE(4073hit)

3921-3940hit(4073hit)

A Combined Fast Adaptive Filter Algorithm with an Automatic Switching Method
Youhua WANG Kenji NAKAYAMA

PAPER-Adaptive Signal Processing

Vol:
E77-A No:1
Page(s):
247-256
This paper proposes a new combined fast algorithm for transversal adaptive filters. The fast transversal filter (FTF) algorithm and the normalized LMS (NLMS) algorithm are combined in the following way. In the initialization period, the FTF is used to obtain fast convergence. After convergence, the algorithm is switched to the NLMS algorithm because the FTF cannot be used for a long time due to its numerical instability. Nonstationary environments, for instance time-varying unknown systems, are classified into three categories: slow time-varying, fast time-varying and sudden time-varying systems. The NLMS algorithm is applied to the first situation. In the latter two cases, however, the NLMS algorithm cannot provide good performance, so the FTF algorithm is selected. Switching between the two algorithms is automatically controlled by using the difference of the MSE sequence. If the difference exceeds a threshold, then the FTF is selected. Otherwise, the NLMS is selected. Compared with the RLS algorithm, the proposed combined algorithm needs less computation, while maintaining the same performance. Furthermore, compared with the FTF algorithm, it provides numerically stable operation.
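The switching rule in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `nlms_step` and `select_algorithm`, the step size `mu`, the regularizer `eps`, and the use of the absolute MSE difference are all assumptions made for illustration, and the FTF branch itself is not implemented.

```python
def nlms_step(weights, x, desired, mu=0.5, eps=1e-8):
    """One normalized LMS update; returns (new_weights, error).

    mu and eps are illustrative defaults, not values from the paper.
    """
    y = sum(w * xi for w, xi in zip(weights, x))   # filter output
    error = desired - y                            # a priori error
    norm = sum(xi * xi for xi in x) + eps          # input power (regularized)
    new_weights = [w + mu * error * xi / norm for w, xi in zip(weights, x)]
    return new_weights, error


def select_algorithm(mse_prev, mse_curr, threshold):
    """Choose FTF when the MSE changes by more than the threshold, else NLMS.

    Taking the absolute difference is an assumption; the abstract only says
    that the difference of the MSE sequence is compared with a threshold.
    """
    return "FTF" if abs(mse_curr - mse_prev) > threshold else "NLMS"
```

In a full filter loop, `select_algorithm` would be called on successive MSE estimates to decide which update to run on the next block of samples.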
New Proposal and Comparison of Closure Tests--More Efficient than the CRYPTO'92 Test for DES--
Hikaru MORITA Kazuo OHTA

PAPER

Vol:
E77-A No:1
Page(s):
15-19
The well-known closure tests, the cycling closure test (CCT) and the meet-in-the-middle closure test (MCT), were introduced by Kaliski, Rivest and Sherman to analyze the algebraic properties of cryptosystems, and CCT indicates that DES is not closed. Though Coppersmith showed that DES can be proved not to be closed in a particular way, the closure tests can check various kinds of cryptosystems generally. Thus, successors to MCT and CCT have been proposed at CRYPTO. This paper expands the MCT successor, the switching closure test (SCT), to apply to DES-like cryptosystems, and shows that this SCT variant is more efficient than the closure test proposed at CRYPTO'92, because the SCT variant establishes a better relationship between the computation cost and the probability of error (the evaluation index). The MCT successors are more important than the CCTs, because the MCTs can directly break closed cryptosystems. Therefore, if you want to detect the closure property of cryptosystems generally, the SCT variant is better.
Optimal Redundancy of Systems for Minimizing the Probability of Dangerous Errors
Kyoichi NAKASHIMA Hitoshi MATZNAGA

PAPER-Reliability and Safety

Vol:
E77-A No:1
Page(s):
228-236
For systems in which the probability that an incorrect output is observed differs with input values, we adopt the redundant usage of n copies of identical systems which we call the n-redundant system. This paper presents a method to find the optimal redundancy of systems for minimizing the probability of dangerous errors. First, it is proved that a k-out-of-n redundancy or a mixture of two kinds of k-out-of-n redundancies minimizes the probability of D-errors under the condition that the probability of output errors including both dangerous errors and safe errors is below a specified value. Next, an algorithm is given to find the optimal series-parallel redundancy of systems by using the properties of the distance between two structure functions.
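As a rough illustration of how a k-out-of-n redundancy trades dangerous errors against safe errors, the following Python sketch computes both probabilities under simplifying assumptions that are not taken from the paper: the n copies fail independently, the output is binary, a dangerous error is a wrong "1" (probability `p_dangerous` per copy), a safe error is a wrong "0" (probability `p_safe` per copy), and the redundant system outputs "1" only when at least k copies do. The function names are hypothetical.

```python
from math import comb


def at_least(k, n, p):
    """Probability that at least k of n independent events occur, each with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def k_out_of_n_errors(k, n, p_dangerous, p_safe):
    """Return (dangerous, safe) error probabilities of a k-out-of-n voter.

    Dangerous: at least k copies wrongly output 1.
    Safe: fewer than k copies correctly output 1, i.e. at least n-k+1 copies wrongly output 0.
    """
    dangerous = at_least(k, n, p_dangerous)
    safe = at_least(n - k + 1, n, p_safe)
    return dangerous, safe
```

Under these assumptions, raising k lowers the dangerous-error probability and raises the safe-error probability, which is why the total error probability has to be constrained while the dangerous part is minimized.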
On Claw Free Families
Wakaha OGATA Kaoru KUROSAWA

PAPER

Vol:
E77-A No:1
Page(s):
72-80
This paper points out that there are two types of claw free families with respect to a level of claw freeness. We formulate them as weak claw free families and strong claw free families. Then, we present sufficient conditions for each type of claw free families. (A similar result is known for weak claw free families.) They are represented as some algebraic forms of one way functions. A new example of strong claw free families is also given.
A Consideration of the Thin Planar Antenna with Wire-Grid Model
Nozomu ISHII Kiyohiko ITOH

PAPER

Vol:
E76-B No:12
Page(s):
1518-1525
A theoretical and experimental study of a thin card-sized antenna is presented. The method of moments with a wire-grid model is used to analyze this antenna. In order to validate the numerical efficiency, measurements using the Wheeler method are performed on this antenna and its wire-grid models. The experimental and theoretical results are in good agreement if the wire conductivity is well chosen. The noise reduction of the measured Wheeler efficiency using the least mean square method is also examined.
Speech Recognition of Isolated Digits Using Simultaneous Generative Histogram
Yasuhisa HAYASHI Akio OGIHARA Kunio FUKUNAGA

LETTER

Vol:
E76-A No:12
Page(s):
2052-2054
We propose a recognition method for HMM using a simultaneous generative histogram. The proposed method uses the correlation between two features, which is expressed by a simultaneous generative histogram. The output probabilities of the integrated HMM are then conditioned by the codeword of another feature. The proposed method is applied to isolated digit word recognition to confirm its validity.
Single-Board SIMD Processors Using Gate-Array LSIs for Parallel Processing
Toshio KONDO Yoshimasa KIMURA Noboru SONEHARA

PAPER

Vol:
E76-C No:12
Page(s):
1827-1834
We have developed an SIMD processor on a double-height VME board. We achieved a good balance between cost and performance by combining four identical gate-array LSIs in the processor array with a 16-bit digital signal processor (DSP), standard dynamic random-access memories (DRAMs) and other peripherals. The gate-array LSIs have 168-bit processing elements (PEs), each containing a one-bit processing block and a serial multiplier. This PE structure offers high-level bit processing capability and peak performance of 512 million operations per second (MOPS) for 8-bit multiply and accumulate operations. Effective performance of more than 300 MOPS for 8-bit array data processing is achieved by using an LSI structure tuned to the DRAM access rate, although the processing speed is reduced by the DRAM access bottleneck. The LSIs also have two unique additional hardware structures that speed up various array data processes. One is an inter-PE routing register array for supporting a transmission, rotation and memory access path. The other is a tree-structure network for propagating operations among PEs. With these cost-effective structures, the SIMD processor is expected to be widely used for two-dimensional data processing, such as image processing and pattern recognition.
An Error-Correcting Version of the Leiss's Parser for Context-Free Languages
Ken-ichi KURODA Eiichi TANAKA

LETTER-Automaton, Language and Theory of Computing

Vol:
E76-D No:12
Page(s):
1528-1531
This paper describes an error-correcting parser (ec-parser) for context-free languages that is an extension of Leiss's parser. Since the ec-parser uses precomputed information and a pruning technique based on lookahead, it is always faster than Lyon's parser. Several examples are shown.
Calculation of the Potential Distribution around an Impurity-Atom-Wire--The Validity of the Thomas-Fermi Approximation--
Tomonori SEKIGUCHI Kazuhito FURUYA

PAPER-Semiconductor Materials and Devices

Vol:
E76-C No:12
Page(s):
1842-1846
The potential distribution around a linear array of donor atoms in a semiconductor crystal is calculated, approximating the linear array by a continuous line charge. Two methods are used for the analysis. One is the self-consistent calculation of Poisson's equation and the effective mass Schrödinger's equation, and the other is the Thomas-Fermi approximation. Results of both methods agree very well, and it is shown that it is possible to form a potential distribution as fine as the electron wavelength by appropriate arrangement of the impurity atoms. Arrays of impurity atoms therefore can act as building elements for future electron wave devices.
An Autocorrelation Associative Neural Network with Self-Feedbacks
Hiroshi UEDA Masaya OHTA Akio OGIHARA Kunio FUKUNAGA

LETTER

Vol:
E76-A No:12
Page(s):
2072-2075
In this article, the autocorrelation associative neural network, one of the well-known applications of neural networks, is improved to extend its capacity and error-correcting ability. Our approach is based on the consideration that negative self-feedbacks remove spurious states. Therefore, we propose a method to make the self-feedbacks as small as possible within the range in which all stored patterns are stable. A state transition rule that enables escape from oscillation is also presented, because the method may fall into oscillation. The efficiency of the method is confirmed by computer simulations.
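The idea of choosing self-feedbacks as small as possible while keeping every stored pattern stable can be sketched in Python as follows. This is only an illustration under stated assumptions, not the method of the paper: patterns are bipolar (+1/-1) lists, the off-diagonal weights are the plain autocorrelation (sum of outer products), a pattern counts as stable when each neuron's aligned input is positive, and the `margin` parameter and the function names `train` and `is_stable` are hypothetical.

```python
def train(patterns, margin=1.0):
    """Build autocorrelation weights, then set each self-feedback w[i][i] to the
    smallest value that keeps every stored pattern stable by at least `margin`."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    # Off-diagonal weights: sum of outer products of the stored patterns.
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    # Diagonal: the aligned field from the other neurons, plus w[i][i],
    # must stay >= margin for every stored pattern.
    for i in range(n):
        aligned = [
            p[i] * sum(w[i][j] * p[j] for j in range(n) if j != i)
            for p in patterns
        ]
        w[i][i] = margin - min(aligned)
    return w


def is_stable(w, p):
    """True if every neuron's input has the same sign as its state in pattern p."""
    n = len(p)
    return all(p[i] * sum(w[i][j] * p[j] for j in range(n)) > 0 for i in range(n))
```

When the stored patterns are strongly supported by the off-diagonal weights, the resulting diagonal entries are negative, which corresponds to the negative self-feedback discussed in the abstract.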
Synthesis of Protocol Specifications for Design of Responsive Protocols
Hirotaka IGARASHI Yoshiaki KAKUDA Tohru KIKUNO

PAPER

Vol:
E76-D No:11
Page(s):
1375-1385
Responsive protocols are communication protocols which ensure timely and reliable recovery when error events occur. Protocol synthesis for the design of responsive protocols derives a protocol specification from a service specification. In the previous methods, if the service specification includes simultaneous transmission of primitives from a high layer to a low layer through different service access points, then the derived protocol specification includes protocol errors of unspecified reception caused by message collisions. Also, they only include a recovery function such as retransmission of messages. This is not enough for recovery from abnormal states due to coordination loss. This paper extends the class of derived protocol specifications to include message collisions, which usually occur in real communication protocols. Furthermore, this paper proposes a new method for synthesis of a responsive protocol specification derived from a service specification such that the derived protocol specification is free from protocol errors of unspecified reception caused by message collisions and includes two recovery functions: message retransmission and checkpoint restart functions.
An Effective Defect-Repair Scheme for a High Speed SRAM
Sadayuki OOKUMA Katsuyuki SATO Akira IDE Hideyuki AOKI Takashi AKIOKA Hideaki UCHIDA

PAPER-SRAM

Vol:
E76-C No:11
Page(s):
1620-1625
To achieve a high yield for a fast Bi-CMOS SRAM without speed degradation, three defect-repair methods, the address comparison method, the fuse decoder method and the distributed fuse method, were considered in detail and their advantages and disadvantages were made clear. The distributed fuse method is shown to be further improved by a built-in fuse word driver, a built-in fuse column selector, and fuse analog switches. This enhanced distributed fuse scheme was examined in a fast Bi-CMOS SRAM. A maximum access time of 14 ns and a chip size of 8.8 mm × 17.4 mm are expected for a 4 Mb Bi-CMOS SRAM in the future.
Physiologically-Based Speech Synthesis Using Neural Networks
Makoto HIRAYAMA Eric Vatikiotis-BATESON Mitsuo KAWATO

PAPER

Vol:
E76-A No:11
Page(s):
1898-1910
This paper focuses on two areas in our effort to synthesize speech from neuromotor input using neural network models that effect transforms between cognitive intentions to speak, their physiological effects on vocal tract structures, and subsequent realization as acoustic signals. The first area concerns the biomechanical transform between motor commands to muscles and the ensuing articulator behavior. Using physiological data of muscle EMG (electromyography) and articulator movements during natural English speech utterances, three articulator-specific neural networks learn the forward dynamics that relate motor commands to the muscles and motion of the tongue, jaw, and lips. Compared to a fully-connected network, mapping muscle EMG and motion for all three sets of articulators at once, this modular approach has improved performance by reducing network complexity and has eliminated some of the confounding influence of functional coupling among articulators. Network independence has also allowed us to identify and assess the effects of technical and empirical limitations on an articulator-by-articulator basis. This is particularly important for modeling the tongue whose complex structure is very difficult to examine empirically. The second area of progress concerns the transform between articulator motion and the speech acoustics. From the articulatory movement trajectories, a second neural network generates PARCOR (partial correlation) coefficients which are then used to synthesize the speech acoustics. In the current implementation, articulator velocities have been added as inputs to the network. As a result, the model now follows the fast changes of the coefficients for consonants generated by relatively slow articulatory movements during natural English utterances. Although much work still needs to be done, progress in these areas brings us closer to our goal of emulating speech production processes computationally.
Tree-Based Approaches to Automatic Generation of Speech Synthesis Rules for Prosodic Parameters
Yoichi YAMASHITA Manabu TANAKA Yoshitake AMAKO Yasuo NOMURA Yoshikazu OHTA Atsunori KITOH Osamu KAKUSHO Riichiro MIZOGUCHI

PAPER

Vol:
E76-A No:11
Page(s):
1934-1941
This paper describes automatic generation of speech synthesis rules which predict a stress level for each bunsetsu in long noun phrases. The rules are inductively inferred from a large amount of speech data by using two kinds of tree-based methods, the conventional decision tree and the SBR-tree methods. The rule sets automatically generated by the two methods have almost the same performance and reduce the prediction error of the accent component value from 23 Hz to about 14 Hz. The rate of correct reproduction of the change for adjacent bunsetsu pairs is also used as a measure for evaluating the generated rule sets, and they correctly reproduce about 80% of the changes. The effectiveness of the rule sets is verified through a listening test. With regard to the comprehensibility of the generated rules, the rules produced by the SBR-tree method are very compact, easy for human experts to interpret, and consistent with former studies.
Significance of Suitability Assessment in Speech Synthesis Applications
Hideki KASUYA

INVITED PAPER

Vol:
E76-A No:11
Page(s):
1893-1897
The paper indicates the importance of suitability assessment in speech synthesis applications. Human factors involved in the use of synthetic speech are first discussed on the basis of an example of a newspaper company where synthetic speech is extensively used as an aid for proofreading a manuscript. Some findings obtained from perceptual experiments on the subjects' preference for paralinguistic properties of synthetic speech are then described, focusing primarily on the suitability of pitch characteristics, speaker's gender, and speaking rates in the task where subjects are asked to proofread a printed text while listening to the speech. The paper finally claims the need for a flexible speech synthesis system which helps the users create their own synthetic speech.
Power Control of a Terminal Analog Synthesizer Using a Glottal Model
Mikio YAMAGUCHI

PAPER

Vol:
E76-A No:11
Page(s):
1957-1963
A terminal-analog synthesizer which uses a glottal model has already been proposed for rule-based speech synthesis, but the control strategy for glottal source intensity levels has not yet been defined. On the other hand, power-control rules which determine the target segmental power of synthetic speech have been proposed, based on statistical analysis of the power in natural speech. It is pointed out that there is a close correlation between observed fundamental frequency and power levels in natural speech; however, the theoretical reasons for this correlation have not been explained. This paper shows the relationship between fundamental frequency and resultant power in a terminal-analog synthesizer which uses a glottal model. From the equations it can be deduced that the tendency in natural speech for power to increase with fundamental frequency can be closely simulated by the sum of the effect of the radiation characteristic and the effect of the synthesis system's vocal tract transfer function. In addition, this paper proposes a method for adjusting the power of synthetic speech to any desired value. This control method can be executed in real-time.
Development of TTS Card for PCs and TTS Software for WSs
Yoshiyuki HARA Tsuneo NITTA Hiroyoshi SAITO Ken'ichiro KOBAYASHI

PAPER

Vol:
E76-A No:11
Page(s):
1999-2007
Text-to-speech synthesis (TTS) is currently one of the most important media conversion techniques. In this paper, we describe a Japanese TTS card developed for constructing a personal-computer-based multimedia platform, and a TTS software package developed for a workstation-based multimedia platform. Some applications of this hardware and software are also discussed. The TTS consists of a linguistic processing stage for converting text into phonetic and prosodic information, and a speech processing stage for producing speech from the phonetic and prosodic symbols. The linguistic processing stage uses morphological analysis, rewriting rules for accent movement and pause insertion, and other techniques to impart correct accentuation and a natural-sounding intonation to the synthesized speech. The speech processing stage employs the cepstrum method with consonant-vowel (CV) syllables as the synthesis unit to achieve clear and smooth synthesized speech. All of the processing for converting Japanese text (consisting of mixed Japanese Kanji and Kana characters) to synthesized speech is done internally on the TTS card. This allows the card to be used widely in various applications, including electronic mail and telephone service systems without placing any processing burden on the personal computer. The TTS software was used for an E-mail reading tool on a workstation.
High Quality Speech Synthesis System Based on Waveform Concatenation of Phoneme Segment
Tomohisa HIROKAWA Kenzo ITOH Hirokazu SATO

PAPER

Vol:
E76-A No:11
Page(s):
1964-1970
A new system for speech synthesis by concatenating waveforms selected from a dictionary is described. The dictionary is constructed from two hours of speech that includes isolated words and sentences uttered by one male speaker, and contains over 45,000 entries which are identified by their average pitch, a dynamic pitch parameter which represents the micro pitch structure in a segment, duration and average amplitude. Phoneme duration is set according to phoneme environment, and phoneme power is controlled by both pitch frequency and phoneme environment. Tests show the average errors in vowel duration and consonant duration are 28.8 ms and 16.8 ms respectively, and the vowel power average error is 2.9 dB. The pitch frequency patterns are calculated according to a conventional model in which the accent component is added to a gross phrase component. Given a phoneme string and prosody information, the optimum waveforms are selected from the dictionary by matching their attributes with the given phonetic and prosodic information. A waveform selection function, which has two terms corresponding to prosody and phonological coincidence between rule-set values and waveform values from the dictionary, is proposed. The weight coefficients used in the selection function are determined through subjective hearing tests. The selected waveform segments are then modified in the waveform domain to further adjust for the desired prosody. A pitch frequency modification method based on the pitch synchronous overlap-add technique is introduced into the system. Lastly, the waveforms are interpolated between voiced waveforms to avoid abrupt changes in voice spectrum and waveform shape. An absolute evaluation test of five grades is performed on the synthesized voice and the mean score is 3.1, which is over "good," while the original speaker quality is retained.
Development of a Rule-Based Speech Synthesizer Module for Embedded Use
Mikio YAMAGUCHI John-Paul HOSOM

PAPER

Vol:
E76-A No:11
Page(s):
1990-1998
A module for rule-based Japanese speech synthesis has been developed. The synthesizer was constructed using the Multiple-Cascade Terminal Analog (MCTA) structure, and this structure has been improved in three respects: the voicing-source model has an increased number of variable parameters which allows for voicing-source waveforms that better approximate natural speech; the spectral characteristics of the fricative source have been improved; and the path used for nasal consonants has an increased number of resonators to better conform to theory. The current synthesis system uses a modified stored-pattern data structure which allows better transitions between syllables; however, time-invariant values are used in certain cases in order to decrease the amount of required memory. This system also has a new consolidated method for generating geminate obstruents and syllabic nasals. This synthesizer and synthesis system have been implemented in a re-developed rule-based speech-synthesis module. This module has been constructed using ASIC technology and has both small size (56368 mm) and light weight (19g); it is therefore possible to embed it in various types of portable or moving machinery. The module can be connected directly to a microprocessor bus and accepts as input sentences which are generated by the host computer. The input sentences are written with the Japanese katakana or romaji syllabaries and other symbols which describe the sentence structure. The syllable articulation rate for one hundred Japanese syllables (including palatalized sounds) is 65% and for sixty-seven syllables (not including palatalized sounds) is 74%. The word intelligibility, measured using phonetically-balanced words, is 88%.
Phoneme Power Control for Speech Synthesis
Kenzo ITOH Tomohisa HIROKAWA Hirokazu SATO

PAPER

Vol:
E76-A No:11
Page(s):
1911-1918
This paper proposes a new method of phoneme power control for speech synthesis by rule. The innovation of this method lies in its use of the phoneme environment and the relationship between speech power and pitch frequency. First, the permissible threshold (PT) for power modification is measured by subjective experiments using power manipulated speech material. As a result, it is concluded that the PT of power modification is 4.1 dB. This experimental result is significant when discussing power control and gives a criterion for power control accuracy. Next, the relationship between speech power and pitch frequency is analyzed using a very large speech data base. The results show that the relationship between phoneme power and pitch frequency is affected by the kind of phoneme, the adjoining phonemes, rising or falling pitch, and initial or final position in the sentence. Finally, we propose that the phoneme power should be controlled by pitch frequency and phoneme environment. This proposal is implemented in a waveform concatenation type text-to-speech synthesizer. This new method yields an averaged root mean square error between real and estimated speech power of 2.17 dB. This value indicates that 94% of the estimated power values are within the permissible threshold of human perception.
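The link between the 2.17 dB RMS error and the 94% figure can be checked with a short calculation. The Python sketch below assumes, purely for illustration, that the estimation errors are zero-mean and normally distributed, so that the RMS error equals the standard deviation; the abstract does not state the error distribution, and the function name `fraction_within` is hypothetical.

```python
from math import erf, sqrt


def fraction_within(threshold_db, rms_error_db):
    """Fraction of zero-mean Gaussian errors whose magnitude is below the threshold.

    For X ~ N(0, sigma^2): P(|X| < t) = erf(t / (sigma * sqrt(2))).
    """
    return erf(threshold_db / (rms_error_db * sqrt(2.0)))


# Permissible threshold of 4.1 dB and RMS error of 2.17 dB, both from the abstract.
print(fraction_within(4.1, 2.17))  # roughly 0.94 under the Gaussian assumption
```

Under this assumption the threshold lies about 1.9 standard deviations from zero, which is consistent with the reported share of estimates falling inside the permissible threshold.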
