
Author Search Result

[Author] Tsuyoshi MORIMOTO (7 hits)

  • A Unification-Based Japanese Parser for Speech-to-Speech Translation

    Masaaki NAGATA  Tsuyoshi MORIMOTO  

     
    PAPER
    Vol: E76-D No:1  Page(s): 51-61

    A unification-based Japanese parser has been implemented for an experimental Japanese-to-English spoken language translation system (SL-TRANS). The parser consists of a unification-based spoken-style Japanese grammar and an active chart parser. The grammar handles syntactic, semantic, and pragmatic constraints in an integrated fashion, using an HPSG-based framework, in order to cope with speech recognition errors. The parser takes multiple sentential candidates from the HMM-LR speech recognizer and produces a semantic representation associated with the best-scoring parse, based on acoustic and linguistic plausibility. The unification-based parser has been tested on 12 dialogues in the conference registration domain, comprising 261 sentences uttered by one male speaker. The sentence recognition accuracy of the underlying speech recognizer is 73.6% for the top candidate and 83.5% for the top three candidates, where the test-set perplexity of the CFG grammar is 65. By ruling out erroneous speech recognition results using various linguistic constraints, the parser improves the sentence recognition accuracy to 81.6% for the top candidate and 85.8% for the top three candidates. From the experimental results, we found that the combination of syntactic restrictions, selectional restrictions, and coordinate structure restrictions is sufficient to rule out recognition errors between case-marking particles with the same vowel, which are the most likely type of error. However, we also found that pragmatic information, such as topic, presupposition, and discourse structure, is necessary to rule out recognition errors involving topicalizing particles and sentence-final particles.
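    The core selection step described above can be illustrated with a minimal sketch: the parser keeps only recognizer candidates that satisfy linguistic constraints, then ranks the survivors by combined acoustic and linguistic plausibility. The candidate data, the constraint check, the linguistic score, and the weighting are all invented placeholders, not the paper's actual formulation.

```python
# Sketch of N-best rescoring: filter candidates that violate linguistic
# constraints, then pick the best by a weighted acoustic + linguistic score.

def rescore_candidates(candidates, passes_constraints, lm_score, acoustic_weight=0.7):
    """Filter recognizer candidates by linguistic constraints and return
    the survivor with the best combined score (or None if all are ruled out)."""
    survivors = [c for c in candidates if passes_constraints(c["text"])]
    if not survivors:
        return None
    def combined(c):
        return acoustic_weight * c["acoustic"] + (1 - acoustic_weight) * lm_score(c["text"])
    return max(survivors, key=combined)

# Toy usage: a constraint that rejects a misrecognized case particle,
# even though the erroneous candidate has the higher acoustic score.
candidates = [
    {"text": "kaigi ga sanka shimasu", "acoustic": 0.9},  # ill-formed (wrong particle)
    {"text": "kaigi ni sanka shimasu", "acoustic": 0.8},  # well-formed
]
ok = lambda s: " ni sanka" in s   # stand-in for the grammar/selectional check
lm = lambda s: 0.5                # stand-in for linguistic plausibility
best = rescore_candidates(candidates, ok, lm)
```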

  • Spoken Sentence Recognition Based on HMM-LR with Hybrid Language Modeling

    Kenji KITA  Tsuyoshi MORIMOTO  Kazumi OHKURA  Shigeki SAGAYAMA  Yaneo YANO  

     
    PAPER
    Vol: E77-D No:2  Page(s): 258-265

    This paper describes Japanese spoken sentence recognition using hybrid language modeling, which combines the advantages of both syntactic and stochastic language models. As the baseline system, we adopted the HMM-LR speech recognition system, with which we have already achieved good performance on Japanese phrase recognition tasks. Several improvements, aimed at handling continuously spoken sentences, have been made to this system. The first is HMM training with continuous utterances as well as word utterances. In previous implementations, HMMs were trained with only word utterances; continuous utterances are included in the HMM training data because coarticulation effects are much stronger in continuous utterances. The second is the development of a sentential grammar for Japanese, created by combining inter- and intra-phrase CFG grammars that were developed separately. The third is the incorporation of stochastic linguistic knowledge, which includes a stochastic CFG and a bigram model of production rules. The system was evaluated using continuously spoken sentences from a conference registration task that included approximately 750 words. We attained a sentence accuracy of 83.9% in the speaker-dependent condition.
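    The hybrid scoring idea above can be sketched in a few lines: a derivation is scored both by per-rule probabilities (the stochastic CFG) and by a bigram model over successive production rules. The rule names, probabilities, and the simple log-sum combination below are illustrative assumptions, not the paper's trained model.

```python
# Toy sketch of hybrid scoring: stochastic-CFG rule probabilities
# combined with a bigram model over production rules in a derivation.
import math

RULE_PROB = {"S->NP VP": 0.6, "NP->N": 0.5, "VP->V": 0.7}          # made-up stochastic CFG
RULE_BIGRAM = {("S->NP VP", "NP->N"): 0.8, ("NP->N", "VP->V"): 0.6}  # made-up rule bigrams

def derivation_log_prob(rules, floor=1e-6):
    """Log probability of a rule sequence: sum of per-rule log probs
    plus log probs of each rule-to-rule transition (floored when unseen)."""
    total = sum(math.log(RULE_PROB.get(r, floor)) for r in rules)
    for prev, cur in zip(rules, rules[1:]):
        total += math.log(RULE_BIGRAM.get((prev, cur), floor))
    return total
```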

  • Integration of Speech Recognition and Language Processing in a Japanese to English Spoken Language Translation System

    Tsuyoshi MORIMOTO  Kiyohiro SHIKANO  Kiyoshi KOGURE  Hitoshi IIDA  Akira KUREMATSU  

     
    PAPER-Speech Understanding
    Vol: E74-A No:7  Page(s): 1889-1896

    The experimental spoken language translation system (SL-TRANS) has been implemented. It can recognize Japanese speech, translate it into English, and output synthesized English speech. One of the most important problems in realizing such a system is how to integrate, or connect, speech recognition and language processing. In this paper, a new method realized in the system is described. The method is composed of three processes: grammar-driven predictive speech recognition, Kakariuke-dependency-based candidate filtering, and HPSG-based lattice parsing supplemented with a sentence preference mechanism. Input speech is uttered phrase by phrase. The speech recognizer takes an input phrase utterance and outputs several candidates, with recognition scores, for each phrase. A Japanese phrasal grammar is used in recognition; it contributes to the output of grammatically well-formed phrase candidates, as well as to the reduction of phone perplexity. The candidate filter takes a phrase lattice, which is a sequence of multiple candidates for each phrase, and outputs a reduced phrase lattice, removing semantically inappropriate phrase candidates by applying the Kakariuke dependency relationships between phrases. Finally, the HPSG-based lattice parser takes the phrase lattice and chooses the most plausible sentence by checking syntactic and semantic legitimacy and evaluating sentential preference. Experimental results for the system are also reported, confirming the usefulness of the method.
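    The three-process integration described above can be sketched as a simple pipeline: per-phrase recognition produces a lattice, the dependency filter prunes it, and the lattice parser selects a sentence. The stage functions in the usage example are trivial stand-ins for the real recognizer, filter, and parser.

```python
# High-level sketch of the three-stage integration: recognize phrase
# candidates, prune by Kakariuke dependency, then lattice-parse.

def translate_pipeline(utterances, recognize, dependency_filter, lattice_parse):
    """Chain the three processes; each argument is a stage function."""
    lattice = [recognize(u) for u in utterances]   # candidates per phrase utterance
    lattice = dependency_filter(lattice)           # remove implausible candidates
    return lattice_parse(lattice)                  # choose the best whole sentence

# Toy usage with placeholder stages: recognition yields two candidates per
# phrase, the filter keeps only "_a" candidates, the parser joins the first.
sentence = translate_pipeline(
    ["utt1", "utt2"],
    recognize=lambda u: [u + "_a", u + "_b"],
    dependency_filter=lambda lat: [[c for c in ph if c.endswith("_a")] for ph in lat],
    lattice_parse=lambda lat: " ".join(ph[0] for ph in lat),
)
```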

  • Continuous Speech Recognition Using a Combination of Syntactic Constraints and Dependency Relationships

    Tsuyoshi MORIMOTO  

     
    PAPER-Speech Processing and Acoustics
    Vol: E79-D No:1  Page(s): 54-62

    This paper proposes a Japanese continuous speech recognition mechanism in which a full-sentence-level context-free grammar (CFG) and one kind of semantic constraint, called "dependency relationships" between two bunsetsu (a kind of Japanese phrase), are used during speech recognition in an integrated way. Each dependency relationship is a modification relationship between two bunsetsu; these relationships include the case-frame relationship of a noun bunsetsu to a predicate bunsetsu, and adnominal modification relationships, such as that of a noun bunsetsu to another noun bunsetsu. To suppress the processing overhead caused by using relationships of this type during speech recognition, no rigorous semantic analysis is performed; instead, a simple "matching with examples" approach is adopted. An experiment was carried out, and the results were compared with a case employing only CFG constraints. They show that speech recognition accuracy is improved and that the overhead is acceptably small.
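    The "matching with examples" idea can be sketched minimally: a candidate dependency (modifier bunsetsu, particle, head bunsetsu) is accepted when it matches a stored example, avoiding any full semantic analysis. The example triples and the exact-match criterion below are invented for illustration; a real system would also back off to similarity between words.

```python
# Sketch of example-based dependency checking: accept a bunsetsu dependency
# if an identical (modifier, particle, head) example is stored.

EXAMPLES = {
    ("kaigi", "ni", "sanka-suru"),   # "participate in a conference"
    ("ronbun", "wo", "kaku"),        # "write a paper"
}

def plausible(modifier, particle, head, examples=EXAMPLES):
    """Exact-match stand-in for the matching-with-examples test; a fuller
    sketch would score near-matches via a thesaurus-based similarity."""
    return (modifier, particle, head) in examples
```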

  • LR Parsing with a Category Reachability Test Applied to Speech Recognition

    Kenji KITA  Tsuyoshi MORIMOTO  Shigeki SAGAYAMA  

     
    PAPER
    Vol: E76-D No:1  Page(s): 23-28

    In this paper, we propose an extended LR parsing algorithm, called LR parsing with a category reachability test (the LR-CRT algorithm). The LR-CRT algorithm enables a parser to efficiently recognize those sentences that belong to a specified grammatical category. The key point of the algorithm is the use of an augmented LR parsing table in which each action entry contains a set of reachable categories. When executing a shift or reduce action, the parser checks, using the augmented table, whether the action can reach a given category. We apply the LR-CRT algorithm to improve a speech recognition system based on two-level LR parsing. This system uses two kinds of grammars, inter- and intra-phrase grammars, to recognize Japanese sentential speech. Two-level LR parsing guides the search of speech recognition through two levels of symbol prediction, phrase category prediction and phone prediction, based on these grammars. The LR-CRT algorithm makes efficient phone prediction based on phrase category prediction possible. The system was evaluated using sentential speech data uttered phrase by phrase, and attained a word accuracy of 97.5% and a sentence accuracy of 91.2%.
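    The augmented-table check can be sketched directly: each action entry carries its set of reachable categories, and the parser keeps only actions that can reach the requested category. The table fragment below is a made-up example, not derived from any real grammar.

```python
# Sketch of the LR-CRT check: discard LR actions whose reachable-category
# set does not contain the category the input must parse as.

ACTION_TABLE = {
    # (state, lookahead) -> list of (action, reachable categories)
    (0, "noun"): [("shift 3", {"NP", "S"}), ("shift 5", {"VP"})],
    (0, "verb"): [("shift 7", {"VP", "S"})],
}

def applicable_actions(state, lookahead, goal_category):
    """Return only the actions in this entry that can reach goal_category,
    i.e. the reachability test applied before shift/reduce execution."""
    entries = ACTION_TABLE.get((state, lookahead), [])
    return [act for act, cats in entries if goal_category in cats]
```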

  • Continuous Speech Recognition Using Two-Level LR Parsing

    Kenji KITA  Toshiyuki TAKEZAWA  Tsuyoshi MORIMOTO  

     
    PAPER-Continuous Speech Recognition
    Vol: E74-A No:7  Page(s): 1806-1810

    This paper describes a continuous speech recognition system using two-level LR parsing and phone-based HMMs. ATR has already implemented a predictive LR parsing algorithm in an HMM-based speech recognition system for Japanese. Up to now, however, this system has used only intra-phrase grammatical constraints. In Japanese, a sentence is composed of several phrases; thus, two kinds of grammars, namely an intra-phrase grammar and an inter-phrase grammar, are sufficient for recognizing sentences. Two-level LR parsing makes it possible to use not only intra-phrase but also inter-phrase grammatical constraints during speech recognition. The system is applied to Japanese sentence recognition in which sentences are uttered phrase by phrase, and attains a word accuracy of 95.9% and a sentence accuracy of 84.7%.
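    The two-level constraint idea can be sketched as chained prediction: the inter-phrase grammar predicts which phrase categories may follow, and per-category intra-phrase information then predicts the admissible next phones. All grammar contents below are invented placeholders.

```python
# Simplified sketch of two-level prediction: phrase-category prediction
# (inter-phrase grammar) feeding phone prediction (intra-phrase grammars).

INTER_PHRASE = {"START": {"NOUN_PHRASE"}, "NOUN_PHRASE": {"VERB_PHRASE"}}
INTRA_PHRASE_FIRST_PHONES = {"NOUN_PHRASE": {"k", "s"}, "VERB_PHRASE": {"t", "m"}}

def predict_phones(prev_category):
    """Phones the recognizer should consider next: the union of initial
    phones over all phrase categories the inter-phrase grammar allows."""
    phones = set()
    for cat in INTER_PHRASE.get(prev_category, set()):
        phones |= INTRA_PHRASE_FIRST_PHONES.get(cat, set())
    return phones
```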

  • Processing Unknown Words in Continuous Speech Recognition

    Kenji KITA  Terumasa EHARA  Tsuyoshi MORIMOTO  

     
    PAPER-Continuous Speech Recognition
    Vol: E74-A No:7  Page(s): 1811-1816

    Current continuous speech recognition systems essentially ignore unknown words; systems are designed to recognize only words in the lexicon. However, for using speech recognition systems in a real application such as spoken-language processing, it is very important to handle unknown words. This paper proposes a continuous speech recognition method which accepts any utterance that might include unknown words. In this method, words not in the lexicon are transcribed as phone sequences, while words in the lexicon are recognized correctly. The HMM-LR speech recognition system, which is an integration of hidden Markov models and generalized LR parsing, is used as the baseline system and is enhanced with a trigram model of syllables to take into account the stochastic characteristics of the language. In our approach, two kinds of grammars, a task grammar which describes the task and a phonetic grammar which describes constraints between phones, are merged and used in the HMM-LR system. The system can output a phonetic transcription for an unknown word by using the phonetic grammar. Experimental results indicate that our approach is very promising.
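    The syllable-trigram enhancement can be sketched as a simple scoring function over a padded syllable sequence, which is the kind of constraint used to rank phonetic transcriptions of unknown words. The probabilities below are fabricated; a real model would be estimated from a corpus with smoothing.

```python
# Sketch of syllable-trigram scoring for candidate phonetic transcriptions
# of an unknown word (toy probabilities, unseen trigrams floored).
import math

TRIGRAM = {
    ("<s>", "<s>", "to"): 0.2,
    ("<s>", "to", "kyo"): 0.5,
    ("to", "kyo", "</s>"): 0.4,
}

def log_prob(syllables, trigram=TRIGRAM, floor=1e-6):
    """Sum of log trigram probabilities over the padded syllable sequence."""
    seq = ["<s>", "<s>"] + list(syllables) + ["</s>"]
    total = 0.0
    for i in range(2, len(seq)):
        total += math.log(trigram.get(tuple(seq[i - 2:i + 1]), floor))
    return total
```

A higher score marks a candidate syllable sequence as more language-like, so implausible transcriptions can be pruned even though the word is not in the lexicon.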