The search functionality is under construction.

Keyword Search Result

[Keyword] EE(4073hit)

3781-3800hit(4073hit)

  • Reconstructing Data Flow Diagrams from Structure Charts Based on the Input and Output Relationship

    Shuichiro YAMAMOTO  

     
    PAPER-Methodologies

      Vol:
    E78-D No:9
      Page(s):
    1118-1126

    The traceability of data flow diagrams against structure charts is very important for large software development. Specifying whether there is a relationship between a data flow diagram and a structure chart is a time-consuming task. Existing CASE tools provide a way to maintain traceability. If we can extract the input-output relationship of a system from a structure chart, the corresponding data flow diagram can be automatically generated from the relationship. For example, Benedusi et al. proposed a reverse engineering methodology to reconstruct a data flow diagram from existing code. The methodology develops a hierarchical data flow diagram from dependency relationships between the program variables. The methodology, however, transforms each module in structure charts into a process in data flow diagrams. The reconstructed diagrams may have different processes with the same name. This paper proposes a transformation algorithm that solves these problems. It analyzes the structure charts and extracts the input and output relationships, then determines how the set of outputs depends on the set of inputs for the data flow diagram process. After that, it produces a data flow diagram based on the include operation between the sets of output items. The major characteristics of the algorithm are that it is simple, because it only uses the basic operations of sets, it generates data flow diagrams with deterministic steps, and it can generate minimal data flow diagrams. This process will reduce the cost of traceability between data flow diagrams and structure charts.

  • Multisegment Multiple VQ Codebooks-Based Speaker Independent Isolated-Word Recognition Using Unbiased Mel Cepstrum

    Liang ZHOU  Satoshi IMAI  

     
    PAPER-Speech Processing and Acoustics

      Vol:
    E78-D No:9
      Page(s):
    1178-1187

    In this paper, we propose a new approach to speaker-independent isolated-word speech recognition using multisegment multiple vector quantization (VQ) codebooks. In this approach, a separate set of multisegment multiple VQ codebooks is designed for each word in the recognition vocabulary by dividing the word equally into multiple segments, the number of which correlates with the number of syllables or phonemes in the word, and designing two individual VQ codebooks consisting of both instantaneous and transitional speech features for each segment. Using this approach, the influence of within-word coarticulation can be minimized, the time-sequence information of speech can be used, and differences in word length across the vocabulary and variations in speaking rate can be accommodated automatically. Moreover, mel-cepstral coefficients based on unbiased estimation of log spectrum (UELS) are used, and a comparison experiment with LPC-derived mel-cepstral coefficients is carried out. Recognition experiments using testing databases consisting of 100 Japanese words (Waseda database) and 216 phonetically balanced words (ATR database) confirmed the effectiveness of the new method and the new speech features. The approach is described, its computational complexity and memory requirements are analyzed, and the experimental results are presented.

  • High Speed GaAs Digital Integrated Circuits

    Masahiro AKIYAMA  Seiji NISHI  Yasushi KAWAKAMI  

     
    INVITED PAPER

      Vol:
    E78-C No:9
      Page(s):
    1165-1170

    High speed GaAs ICs (Integrated Circuits) using FETs (Field Effect Transistors) are reported. As fabrication techniques, ion implantation processes for both 0.5 µm and 0.2 µm gate FETs using W/Al refractory metal and a 0.2 µm recessed gate process with MBE-grown epitaxial wafers are shown. These fabrication processes are selected depending on the circuit speed and the integration level. The outline of the circuit design and examples of ICs, which were developed for 10 Gb/s optical communication systems, are also shown with the obtained characteristics.

  • Identification of a Class of Time-Varying Nonlinear System Based on the Wiener Model with Application to Automotive Engineering

    Jonathon C. RALSTON  Abdelhak M. ZOUBIR  Boualem BOASHASH  

     
    PAPER

      Vol:
    E78-A No:9
      Page(s):
    1192-1200

    We consider the identification of a class of systems which are both time-varying and nonlinear. Time-varying nonlinear systems are often encountered in practice, but tend to be avoided due to the difficulties that arise in modelling and estimation. We study a particular time-varying polynomial model, which is a member of the class of time-varying Wiener models. The model can characterise both time-variation and nonlinearity in a straightforward manner, without requiring an excessively large number of coefficients. We formulate a procedure to find least-squares estimates of the model coefficients. An advantage of the approach is that systems with rapidly changing dynamics can be characterised. In addition, we do not require that the input is stationary or Gaussian. The approach is validated with an application to an automobile modelling problem, where a time-varying nonlinear model is seen to more accurately characterise the system than a time-invariant nonlinear one.

  • Unsupervised Speaker Adaptation Using All-Phoneme Ergodic Hidden Markov Network

    Yasunage MIYAZAWA  Jun-ichi TAKAMI  Shigeki SAGAYAMA  Shoichi MATSUNAGA  

     
    PAPER-Speech Processing and Acoustics

      Vol:
    E78-D No:8
      Page(s):
    1044-1050

    This paper proposes an unsupervised speaker adaptation method using an "all-phoneme ergodic Hidden Markov Network" that combines allophonic (context-dependent phone) acoustic models with stochastic language constraints. Hidden Markov Network (HMnet) for allophone modeling and allophonic bigram probabilities derived from a large text database are combined to yield a single large ergodic HMM which represents arbitrary speech signals in a particular language so that the model parameters can be re-estimated using text-unknown speech samples with the Baum-Welch algorithm. When combined with the Vector Field Smoothing (VFS) technique, unsupervised speaker adaptation can be effectively performed. This method experimentally gave better performances compared with our previous unsupervised adaptation method which used conventional phonetic HMMs and phoneme bigram probabilities especially when the amount of training data was small.

  • Fundamental Time Domain Solutions for Plane TEM-Waves in Lossy Media and Applications

    Michael SCHINKE  Karl REISS  

     
    PAPER

      Vol:
    E78-C No:8
      Page(s):
    1111-1116

    Closed-form solutions of the characteristic initial value problem for electric and magnetic fields propagating as nonsinusoidal plane TEM-waves in lossy unbounded media are calculated with Riemann's method and discussed in detail. As an application, the reflection and transmission of waves on a planar boundary is examined, when one semi-infinite medium is lossy.

  • A Minimum Error Approach to Spotting-Based Pattern Recognition

    Takashi KOMORI  Shigeru KATAGIRI  

     
    PAPER-Speech Processing and Acoustics

      Vol:
    E78-D No:8
      Page(s):
    1032-1043

    Keyword spotting is a fundamental approach to recognizing/understanding naturally and spontaneously spoken language. To spot acoustic events such as keywords, an overall spotting system, comprising acoustic models and decision thresholds, primarily needs to be optimized to minimize all spotting errors. However, in most conventional spotting systems, the acoustic models and the thresholds are separately and heuristically designed: there has not necessarily been a theoretical basis that has allowed one to design an overall system consistently. This paper introduces a novel approach to spotting, by proposing a new design method called Minimum SPotting Error learning (MSPE). MSPE is conceptually based on a recent discriminative learning theory, i.e., the Minimum Classification Error learning/Generalized Probabilistic Descent method (MCE/GPD); it features a rigorous framework for minimizing spotting error objectives. MSPE can be used in a wide range of pattern spotting applications, such as spoken phonemes and written characters as well as spoken words. Experimental results for a Japanese consonant spotting task clearly demonstrate the promising future of the proposed approach.

  • High-Speed Digital Circuit for Discrete Cosine Transform

    Motonobu TONOMURA  

     
    PAPER

      Vol:
    E78-A No:8
      Page(s):
    957-962

    This paper deals with a high-speed digital circuit for discrete cosine transform (DCT). We propose a new algorithm that reduces the number of calculations for partial sum-of-products in the DCT and synthesize a small-gate-depth DCT circuit by using carry-propagation-free adders based on redundant binary {-1,0,1} representation. The gate depth is only one half to one third that of the conventional algorithms with the same number of gates.

  • Analysis of Database Production Rules by Process Algebra

    Yoshinao ISOBE  Isao KOJIMA  Kazuhito OHMAKI  

     
    PAPER-Databases

      Vol:
    E78-D No:8
      Page(s):
    992-1002

    The purpose of this research is to analyze production rules with coupling modes in active databases and to develop an assistant system for rule programming. Each production rule is a specification including an event, a condition, and an action. The action is automatically executed whenever the event occurs and the condition is satisfied. Coupling modes are useful to control the execution order of transactions. For example, a transaction for consistency check should be executed after transactions for update. An active database, which is a database with production rules, can spontaneously update database states and check their consistency. Production rules provide a powerful mechanism for knowledge-bases. However, it is very difficult in general to predict how a set of production rules will behave because of cascading rule triggers, concurrency, and so on. We are attempting to adopt a process algebra as a basic tool to analyze production rules. For describing and analyzing concurrent and communicating systems, process algebras such as CCS, CSP, ACP, and π-calculus are well known. However, it is difficult to apply existing process algebras to the analysis of production rules, in which process trees grow by process creation. In this paper we propose a process algebra named CCSPR (a Calculus of Communicating Systems with Production Rules), which is an extension of CCS. An advantage of CCSPR is that it can syntactically describe growing process trees. Therefore, production rules can be appropriately analyzed in CCSPR. After giving definitions and properties of CCSPR, we show an example of analysis of production rules in CCSPR.

  • Characterization of Single and Coupled Microstrip Lines Covered with Protective Dielectric Film

    Kazuhiko ATSUKI  Keren LI  Shoichiro YAMAGUCHI  

     
    PAPER

      Vol:
    E78-C No:8
      Page(s):
    1095-1099

    In this paper, we present an analysis of single and coupled microstrip lines covered with protective dielectric film, which is commonly used in microwave integrated circuits. The method employed in the characterization is called the partial-boundary element method (p-BEM). The p-BEM provides an efficient means for analyzing structures with multilayered media or covered with protective dielectric film. The numerical results show that by changing the thickness of protective dielectric films such as SiO2, Si and polyimide covering these lines on a GaAs substrate, the coupled microstrip lines vary within 10% in characteristic impedance and within 25% in effective dielectric constant for the odd mode, in comparison with the structures without the protective dielectric film. In contrast, the single microstrip lines vary within 4% in characteristic impedance and within 8% in effective dielectric constant. The protective dielectric film affects the odd mode of the coupled lines more strongly than the even mode and the characteristics of the single microstrip lines.

  • Three-Dimensional MMIC and Its Application: An Ultra-Wideband Miniature Balun

    Ichihiko TOYODA  Makoto HIRANO  Tsuneo TOKUMITSU  

     
    PAPER

      Vol:
    E78-C No:8
      Page(s):
    919-924

    A new three-dimensional MMIC structure and an ultra-wideband miniature MMIC balun are proposed. The MMIC is a combined structure of multilayer MMICs and U-shaped micro-wires. This technology effectively reduces chip size and enhances MMIC performance. The proposed balun is constructed with three narrow conductors located side by side. The U-shaped micro-wire technology is employed to reduce the insertion loss and chip size. 1.51 dB insertion loss over 10 to 30 GHz, and 2 dB and 5 degrees of amplitude and phase balances over 5 to 35 GHz have been obtained. The intrinsic area of the balun is only 450 × 800 µm, about 1/5 to 1/3 the area of recently reported miniaturized MMIC baluns.

  • 8-kb/s Low-Delay Speech Coding with 4-ms Frame Size

    Yoshiaki ASAKAWA  Preeti RAO  Hidetoshi SEKINE  

     
    PAPER

      Vol:
    E78-A No:8
      Page(s):
    927-933

    This paper describes modifications to a previously proposed 8-kb/s 4-ms-delay CELP speech coding algorithm with a view to improving the speech quality while maintaining low delay and only moderately increasing complexity. The modifications are intended to improve the effectiveness of interframe pitch lag prediction and the sub-optimality level of the excitation coding to the backward adapted synthesis filter by using delayed decision and joint optimization techniques. Results of subjective listening tests using Japanese speech indicate that the coded speech quality is significantly superior to that of the 8-kb/s VSELP coder which has a 20-ms delay. A method that reduces the computational complexity of closed-loop 3-tap pitch prediction with no perceptible degradation in speech quality is proposed, based on representing the pitch-tap vector as the product of a scalar pitch gain and a normalized shape codevector.

  • Spectrum Broadening of Telephone Band Signals Using Multirate Processing for Speech Quality Enhancement

    Hiroshi YASUKAWA  

     
    LETTER

      Vol:
    E78-A No:8
      Page(s):
    996-998

    This paper describes a system that can enhance speech quality degraded by severe band limitation during speech transmission. We have already proposed a spectrum widening method that utilizes aliasing in sampling rate conversion and digital filtering for spectrum shaping. This paper proposes a new method that offers improved performance in terms of the spectrum distortion characteristics. Implementation procedures are clarified, and its performance is discussed. The proposed method can effectively enhance speech quality.

  • Enhanced Feeding Structure of Microstrip Antenna

    Sanghoon CHOI  Sangwook NAM  

     
    PAPER

      Vol:
    E78-C No:8
      Page(s):
    984-987

    In this paper, a waveguide-fed slot-coupled microstrip antenna is proposed as an enhanced feeding structure for microstrip antennas, and an analysis is presented. The presence of the dielectric substrate between a strip and a slot is explicitly taken into account in this analysis. The evaluation of the antenna characteristics is carried out using the method of moments and the spectral domain approach in terms of the electric current distribution on the strip and the magnetic current distribution on the slot.

  • An 11-GHz-Band Subharmonic-Injection-Locked Oscillator MMIC

    Kenji KAMOGAWA  Ichihiko TOYODA  Tsuneo TOKUMITSU  

     
    PAPER

      Vol:
    E78-C No:8
      Page(s):
    925-930

    A subharmonic injection-locked oscillator (ILO) MMIC chain is proposed for the local oscillators and synthesizers used at millimeter-wave frequencies. A fabricated primary 11-GHz-band injection-locked oscillator MMIC, serving as the first-stage ILO in the ILO-chain MMIC, achieves a wide subharmonic-injection-locking range at the subharmonic factors 1/n (n=1, 2, 3, ...) of 1/1, 1/2 and 1/3. The ILO MMIC's suitability for synthesizer applications was confirmed with an injection-locking time of only 100-200 nsec, which is less than 1/100 that of PLL oscillators, and also with free-running oscillation performance and a wide injection-locking range within a temperature range of -30 to 80°C.

  • The Complexity of Drawing Tree-Structured Diagrams

    Kensei TSUCHIDA  

     
    PAPER-Algorithm and Computational Complexity

      Vol:
    E78-D No:7
      Page(s):
    901-908

    Concerning the complexity of tree drawing, the following result of Supowit and Reingold is known: the problem of minimum drawing of binary trees under several constraints is NP-complete. There remain, however, many open problems. For example, is it still NP-complete if we eliminate some constraints from the above set? In this paper, we treat tree-structured diagrams. A tree-structured diagram is a tree with variably sized rectangular nodes. We consider the layout problem of tree-structured diagrams on Z² (the integral lattice). Our problems are different from that of Supowit and Reingold, even if our problems are limited to binary trees. In fact, our set of constraints and that of Supowit and Reingold are incomparable. We show that a problem is NP-complete under a certain set of constraints. Furthermore, we also show that another problem is still NP-complete, even if we delete a constraint concerning symmetry from the previous set of constraints. This constraint corresponds to one of the constraints of Supowit and Reingold, if the problem is limited to binary trees.

  • Duration Modeling with Decreased Intra-Group Temporal Variation for HMM-Based Phoneme Recognition

    Nobuaki MINEMATSU  Keikichi HIROSE  

     
    PAPER

      Vol:
    E78-D No:6
      Page(s):
    654-661

    A new clustering method was proposed to increase the effect of duration modeling on the HMM-based phoneme recognition. A precise observation on the temporal correspondences between a phoneme HMM with output probabilities by single Gaussian modeling and its training data indicated that there were two extreme cases, one with several types of correspondences in a phoneme class completely different from each other, and the other with only one type of correspondence. Although duration modeling was commonly used to incorporate the temporal information in the HMMs, a good modeling could not be obtained for the former case. Further observation for phoneme HMMs with output probabilities by Gaussian mixture modeling also showed that some HMMs still had multiple temporal correspondences, though the number of such phonemes was reduced as compared to the case of single Gaussian modeling. An appropriate duration modeling cannot be obtained for these phoneme HMMs by the conventional methods, where the duration distribution for each HMM state is represented by a distribution function. In order to cope with the problem, a new method was proposed which was based on the clustering of phoneme classes with plural types of temporal correspondences into sub-classes. The clustering was conducted so as to reduce the variations of the temporal correspondences in sub-classes. After the clustering, an HMM was constructed for each sub-class. Using the proposed method, speaker dependent recognition experiments were performed for phonemes segmented from isolated words. A few-percent increase was realized in the recognition rate, which was not obtained by another method based on the duration modeling with a Gaussian mixture.

  • Speaker-Consistent Parsing for Speaker-Independent Continuous Speech Recognition

    Kouichi YAMAGUCHI  Harald SINGER  Shoichi MATSUNAGA  Shigeki SAGAYAMA  

     
    PAPER

      Vol:
    E78-D No:6
      Page(s):
    719-724

    This paper describes a novel speaker-independent speech recognition method, called "speaker-consistent parsing", which is based on an intra-speaker correlation called the speaker-consistency principle. We focus on the fact that a sentence or a string of words is uttered by an individual speaker even in a speaker-independent task. Thus, the proposed method searches through speaker variations in addition to the contents of utterances. As a result of the recognition process, an appropriate standard speaker is selected for speaker adaptation. This new method is experimentally compared with a conventional speaker-independent speech recognition method. Since the speaker-consistency principle best demonstrates its effect with a large number of training and test speakers, a small-scale experiment may not fully exploit this principle. Nevertheless, even the results of our small-scale experiment show that the new method significantly outperforms the conventional method. In addition, this framework's speaker selection mechanism can drastically reduce the likelihood map computation.

  • Automatic Language Identification Using Sequential Information of Phonemes

    Takayuki ARAI  

     
    PAPER

      Vol:
    E78-D No:6
      Page(s):
    705-711

    In this paper approaches to language identification based on the sequential information of phonemes are described. These approaches assume that each language can be identified from its own phoneme structure, or phonotactics. To extract this phoneme structure, we use phoneme classifiers and grammars for each language. The phoneme classifier for each language is implemented as a multi-layer perceptron trained on quasi-phonetic hand-labeled transcriptions. After training the phoneme classifiers, the grammars for each language are calculated as a set of transition probabilities for each phoneme pair. Because of the interest in automatic language identification for worldwide voice communication, we decided to use telephone speech for this study. The data for this study were drawn from the OGI (Oregon Graduate Institute)-TS (telephone speech) corpus, a standard corpus for this type of research. To investigate the basic issues of this approach, two languages, Japanese and English, were selected. The language classification algorithms are based on Viterbi search constrained by a bigram grammar and by minimum and maximum durations. Using a phoneme classifier trained only on English phonemes, we achieved 81.1% accuracy. We achieved 79.3% accuracy using a phoneme classifier trained on Japanese phonemes. Using both the English and the Japanese phoneme classifiers together, we obtained our best result: 83.3%. Our results were comparable to those obtained by other methods such as that based on the hidden Markov model.

  • A Scheme for Word Detection in Continuous Speech Using Likelihood Scores of Segments Modified by Their Context Within a Word

    Sumio OHNO  Keikichi HIROSE  Hiroya FUJISAKI  

     
    PAPER

      Vol:
    E78-D No:6
      Page(s):
    725-731

    In conventional word-spotting methods for automatic recognition of continuous speech, individual frames or segments of the input speech are assigned labels and local likelihood scores solely on the basis of their own acoustic characteristics. On the other hand, experiments on human speech perception conducted by the present authors and others show that human perception of words in connected speech is based not only on the acoustic characteristics of individual segments, but also on the acoustic and linguistic contexts in which these segments occur. In other words, individual segments are not correctly perceived by humans unless they are accompanied by their context. These findings on the process of human speech perception have to be applied in automatic speech recognition in order to improve the performance. From this point of view, the present paper proposes a new scheme for detecting words in continuous speech based on template matching where the likelihood of each segment of a word is determined not only by its own characteristics but also by the likelihood of its context within the framework of a word. This is accomplished by modifying the likelihood score of each segment by the likelihood score of its phonetic context, the latter representing the degree of similarity of the context to that of a candidate word in the lexicon. Higher enhancement is given to the segmental likelihood score if the likelihood score of its context is higher. The advantage of the proposed scheme over conventional schemes is demonstrated by an experiment on constructing a word lattice using connected speech of Japanese uttered by a male speaker. The result indicates that the scheme is especially effective in giving correct recognition in cases where there are two or more candidate words which are almost equal in raw segmental likelihood scores.
