Author Search Result

[Author] Kazumasa MURAI(1hit)

  • Face-to-Talk: Audio-Visual Speech Detection for Robust Speech Recognition in Noisy Environment

    Kazumasa MURAI  Satoshi NAKAMURA  

     
    PAPER-Robust Speech Recognition and Enhancement

    Vol: E86-D No:3
    Page(s): 505-513

    This paper discusses "face-to-talk" audio-visual speech detection for robust speech recognition in noisy environments. The method consists of a facial-orientation-based switch and audio-visual speech section detection. Most of today's speech recognition systems must be explicitly turned on and off by a switch, e.g. "push-to-talk", to indicate which utterance should be recognized, and a specific speech section must be detected prior to any further analysis. To improve usability and performance, we have researched how to extract useful information from the visual modality. We implemented a facial-orientation-based switch, which activates speech recognition while the speaker is facing the camera. The speech section is then detected by analyzing the image of the face. Visual speech detection is robust to audio noise, but because articulation starts before the speech and lasts longer than the speech, the visually detected section tends to be longer than the actual speech, which results in insertion errors. Therefore, we fuse the sections detected from the audio and visual modalities. Our experiment confirms that the proposed audio-visual speech detection method improves recognition performance in noisy environments.
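
    The abstract does not state how the audio and visual sections are fused, so the following is only an illustrative sketch under a stated assumption: fusion is modeled as the intersection of time intervals, gated by the periods in which the speaker faces the camera. The names `intersect` and `fuse_sections`, the interval representation, and the sample timings are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_sec, end_sec), assumed representation


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the overlap of two sorted, non-overlapping interval lists."""
    out: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def fuse_sections(
    facing: List[Interval],
    visual: List[Interval],
    audio: List[Interval],
) -> List[Interval]:
    """Hypothetical fusion rule (an assumption, not the paper's method):
    keep audio-detected speech only while the speaker faces the camera
    and the visual detector also reports articulation."""
    return intersect(intersect(facing, visual), audio)


# Made-up example timings, in seconds.
facing = [(0.0, 10.0)]                            # face-to-talk switch is on
visual = [(1.0, 4.0), (6.0, 9.0)]                 # longer than the real speech
audio = [(1.5, 3.5), (5.0, 5.5), (6.5, 8.0)]      # (5.0, 5.5) is a noise burst

fused = fuse_sections(facing, visual, audio)
```

    Under this assumption, the visual sections are trimmed to the audio boundaries, which addresses the over-long visual detections, while the audio-only noise burst is discarded because no articulation was seen at that time.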