
Keyword Search Result

[Keyword] lip reading (3 hits)

1-3 of 3 hits
  • Visual Speech Recognition Using Weighted Dynamic Time Warping

    Kyungsun LEE  Minseok KEUM  David K. HAN  Hanseok KO  

     
    LETTER-Image Recognition, Computer Vision

    Publicized: 2015/04/09
    Vol: E98-D No:7
    Page(s): 1430-1433

    It is unclear whether the Hidden Markov Model (HMM) or Dynamic Time Warping (DTW) is more appropriate for visual speech recognition when only small data samples are available. In this letter, the two approaches are compared in terms of sensitivity to the amount of training data and computing time, with the objective of determining the tipping point. The limited-training-data problem is addressed by straightforward template matching via weighted DTW (WDTW). The proposed framework refines DTW by adjusting the warping paths with judiciously injected weights, ensuring a smooth diagonal path for accurate alignment without added computational load. The proposed WDTW is evaluated for visual recognition performance on three databases (two in the public domain and one developed in-house). Subsequent experiments indicate that the proposed WDTW significantly enhances the recognition rate compared to DTW- and HMM-based algorithms, especially with limited data samples.
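    The letter does not reproduce the exact weighting scheme, but the core idea of biasing the warping path toward the diagonal can be sketched as follows. This is a minimal illustration, not the authors' implementation; the weight values `w_diag` and `w_off` are hypothetical:

    ```python
    import numpy as np

    def weighted_dtw(x, y, w_diag=1.0, w_off=1.5):
        """Weighted DTW between two feature sequences x and y.

        Off-diagonal steps (insertions/deletions) are multiplied by a
        larger weight w_off than diagonal steps (w_diag), so the optimal
        warping path is nudged toward a smooth diagonal alignment.
        """
        n, m = len(x), len(y)
        D = np.full((n + 1, m + 1), np.inf)  # accumulated-cost matrix
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = np.linalg.norm(x[i - 1] - y[j - 1])
                D[i, j] = min(
                    w_diag * cost + D[i - 1, j - 1],  # diagonal step
                    w_off * cost + D[i - 1, j],       # step in x only
                    w_off * cost + D[i, j - 1],       # step in y only
                )
        return D[n, m]
    ```

    With `w_off = w_diag` this reduces to ordinary DTW; raising `w_off` penalizes paths that wander off the diagonal without adding any computation per cell.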

  • Japanese 45 Single Sounds Recognition Using Intraoral Shape

    Takeshi SAITOH  Ryosuke KONISHI  

     
    LETTER-Pattern Recognition

    Vol: E91-D No:11
    Page(s): 2735-2738

    This paper describes a recognition method for Japanese single sounds for application to lip reading. Related research has investigated only five or ten sounds. In this paper, experiments were conducted on 45 Japanese single sounds, classified into a five-vowel category, a ten-consonant category, and a 45-sound category. With the trajectory feature, we obtained recognition rates of 94.7%, 30.9%, and 30.0%, respectively.

  • An Efficient Lip-Reading Method Robust to Illumination Variations

    Jinyoung KIM  Joohun LEE  Katsuhiko SHIRAI  

     
    LETTER-Speech and Hearing

    Vol: E85-A No:9
    Page(s): 2164-2168

    In this paper, an efficient (smaller feature data size) and robust (better recognition under different lighting conditions) method is proposed for real-time, automatic, image-transform-based lip reading under illumination variations. The image-transform-based approach obtains a compressed representation of the pixel values of the speaker's mouth and is reported to show superior lip-reading performance. However, this approach inevitably produces large feature vectors of lip information, requiring considerable computation time even when principal component analysis (PCA) is applied. To reduce the dimension of the feature vectors, the proposed method folds the lip image in each frame based on its symmetry. This folding also compensates for unbalanced illumination between the left and right lip areas. Additionally, to filter out the inter-frame spectral distortion of each pixel contaminated by illumination noise, the method applies high-pass filtering to the variations of pixel values between consecutive frames. In experiments on a database recorded under various lighting conditions, the proposed lip folding and/or inter-frame filtering greatly reduced the number of required features (principal components in this work) and showed a superior recognition rate compared to the conventional method.
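    The two preprocessing ideas, folding the mouth image about its vertical symmetry axis and high-pass filtering pixel values across frames, can be sketched roughly as below. The function names, the simple averaging of the two halves, and the first-order temporal difference used as the high-pass filter are all assumptions for illustration, not the authors' exact method:

    ```python
    import numpy as np

    def fold_lip_image(img):
        """Fold a mouth-region image about its vertical symmetry axis.

        Averaging the left half with the mirrored right half halves the
        feature size and balances left/right illumination.
        """
        h, w = img.shape
        half = w // 2
        left = img[:, :half].astype(float)
        right = img[:, w - half:][:, ::-1].astype(float)  # mirror right half
        return (left + right) / 2.0

    def interframe_highpass(frames):
        """First-order temporal difference of pixel values between
        consecutive frames, a simple stand-in for the inter-frame
        high-pass filtering described in the abstract."""
        frames = np.asarray(frames, dtype=float)
        return frames[1:] - frames[:-1]
    ```

    Folding before PCA means the analysis runs on half as many pixels, which is where the reported reduction in principal components would come from under this reading.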