The search functionality is under construction.

Keyword Search Result

[Keyword] phoneme recognition (3 hits)

1-3 hits
  • Multi-Task Learning for Improved Recognition of Multiple Types of Acoustic Information

    Jae-Won KIM  Hochong PARK  

     
    LETTER-Speech and Hearing

      Publicized:
    2021/07/14
      Vol:
    E104-D No:10
      Page(s):
    1762-1765

    We propose a new method for improving the recognition performance of phonemes, speech emotions, and music genres using multi-task learning. When tasks are closely related, multi-task learning can improve the performance of each task by learning a common feature representation for all tasks. However, the recognition tasks considered in this study demand different input signals of speech and music at different time scales, resulting in input features with different characteristics. In addition, a training dataset with multiple labels for all information sources is not available. Considering these issues, we conduct multi-task learning in a sequential training process using input features with a single label for one information source. A comparative evaluation confirms that the proposed multi-task learning method provides higher performance on all recognition tasks than learning each task individually, as in conventional methods.

  • N-Gram Modeling Based on Recognized Phonemes in Automatic Language Identification

    Hingkeung KWAN  Keikichi HIROSE  

     
    PAPER-Speech Processing and Acoustics

      Vol:
    E81-D No:11
      Page(s):
    1224-1231

    Because the phoneme recognition rate for noisy telephone speech is rather low, N-grams built upon recognized phoneme labels may differ greatly from those built upon the original attached phoneme labels, which in turn affects the performance of N-gram-based language identification methods. The use of N-grams built upon recognized phoneme labels from the training data was evaluated and shown to be more effective for language identification. The performance of a mixed phoneme recognizer, in which both language-dependent and language-independent phonemes were included, was also evaluated. Results showed that its performance was better than that of parallel language-dependent phoneme recognizers, in which bias existed due to the different numbers of phonemes among languages.

  • A New HMnet Construction Algorithm Requiring No Contextual Factors

    Motoyuki SUZUKI  Shozo MAKINO  Akinori ITO  Hirotomo ASO  Hiroshi SHIMODAIRA  

     
    PAPER

      Vol:
    E78-D No:6
      Page(s):
    662-668

    Many methods have been proposed for constructing context-dependent phoneme models using Hidden Markov Models (HMMs) to improve performance. These conventional methods require previously defined contextual factors. If these factors are deficient, the methods exhibit poor recognition performance. In this paper, we propose a new construction algorithm for HMnet which does not require pre-defined contextual factors. Experiments demonstrated that the new algorithm could construct the HMnet even in cases where the Successive State Splitting (SSS) algorithm could not. The new algorithm produced better phoneme recognition characteristics than the SSS algorithm.
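
As a rough illustration of the N-gram based language identification mentioned in the second entry above, the sketch below trains one phoneme bigram model per language and picks the language whose model gives a phoneme sequence the highest log-likelihood. It is not the authors' implementation: the bigram order, the add-one smoothing, the function names, and the toy phoneme sequences (stand-ins for the output of a phoneme recognizer) are all assumptions made for this example.

```python
import math
from collections import Counter

def train_bigram_model(sequences):
    """Count phoneme bigrams and their left contexts for one language."""
    bigrams, contexts, vocab = Counter(), Counter(), set()
    for seq in sequences:
        padded = ["<s>"] + list(seq) + ["</s>"]  # sentence boundary markers
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts, vocab

def log_likelihood(seq, model, vocab_size):
    """Log-probability of a phoneme sequence under an add-one smoothed bigram model."""
    bigrams, contexts, _ = model
    padded = ["<s>"] + list(seq) + ["</s>"]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        prob = (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
        total += math.log(prob)
    return total

def identify_language(seq, models):
    """Return the language whose bigram model scores the sequence highest."""
    # Use one shared vocabulary size so scores are comparable across languages.
    vocab_size = len(set().union(*(m[2] for m in models.values())))
    return max(models, key=lambda lang: log_likelihood(seq, models[lang], vocab_size))

# Hypothetical toy data: in practice these would be recognized phoneme labels
# produced by running a phoneme recognizer over the training speech.
training = {
    "lang_a": [["k", "a", "t", "a"], ["t", "a", "k", "a"], ["k", "a", "k", "a"]],
    "lang_b": [["s", "i", "n", "i"], ["n", "i", "s", "i"], ["s", "i", "s", "i"]],
}
models = {lang: train_bigram_model(seqs) for lang, seqs in training.items()}
predicted = identify_language(["t", "a", "t", "a"], models)
```

Training the per-language models on recognized rather than reference phoneme labels, as the abstract describes, would only change where the `training` sequences come from; the scoring step stays the same.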