
Author Search Result

[Author] Liyanage C. DE SILVA (4 hits)

Results 1-4 of 4
  • Use of Multimodal Information in Facial Emotion Recognition

    Liyanage C. DE SILVA  Tsutomu MIYASATO  Ryohei NAKATSU  

     
    PAPER-Artificial Intelligence and Cognitive Science

    Vol: E81-D No:1  Page(s): 105-114

    Detection of facial emotions is mainly addressed by computer vision researchers, based on facial display; detection of vocal expressions of emotion is likewise found in work by acoustics researchers. Most of these research paradigms are devoted purely to visual or purely to auditory emotion detection. However, we find it very interesting to consider the auditory and visual information together, since we expect this kind of multimodal processing to become a standard of information processing in the coming multimedia era. Through several intensive subjective evaluation studies we found that human beings recognize Anger, Happiness, Surprise and Dislike mainly by visual appearance, compared with voice-only detection. When the audio track of each emotion clip was dubbed with a different type of auditory emotional expression, Anger, Happiness and Surprise remained video dominant, whereas the Dislike emotion gave mixed responses for different speakers. In both studies we found that the Sadness and Fear emotions were audio dominant. We conclude the paper by proposing a hybrid method of facial emotion detection that uses multimodal information for emotion recognition.
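The modality-dominance findings above suggest a simple decision rule: weight the visual channel for video-dominant emotions and the auditory channel for audio-dominant ones. The sketch below is purely illustrative (not the authors' method); the weight values and score format are assumptions.

```python
# Illustrative rule-based audio-visual fusion following the dominance
# findings in the abstract. The numeric weights are assumptions, not
# values from the paper.

VIDEO_DOMINANT = {"anger", "happiness", "surprise"}
AUDIO_DOMINANT = {"sadness", "fear"}

def fuse_scores(video_scores, audio_scores):
    """Combine per-emotion confidences (dicts: emotion -> score in [0, 1])
    and return the emotion with the highest fused score."""
    fused = {}
    for emotion in video_scores:
        if emotion in VIDEO_DOMINANT:
            w_video = 0.8          # assumed: visual channel dominates
        elif emotion in AUDIO_DOMINANT:
            w_video = 0.2          # assumed: auditory channel dominates
        else:                      # e.g. "dislike": mixed responses, equal weight
            w_video = 0.5
        fused[emotion] = (w_video * video_scores[emotion]
                          + (1 - w_video) * audio_scores[emotion])
    return max(fused, key=fused.get)
```

A classifier per modality would supply the score dictionaries; the rule then arbitrates between them.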

  • Detection and Tracking of Facial Features by Using Edge Pixel Counting and Deformable Circular Template Matching

    Liyanage C. DE SILVA  Kiyoharu AIZAWA  Mitsutoshi HATORI  

     
    PAPER-Image Processing, Computer Graphics and Pattern Recognition

    Vol: E78-D No:9  Page(s): 1195-1207

    In this paper, face feature detection and tracking are discussed using two methods, called edge pixel counting and deformable circular template matching. Instead of utilizing color or gray-scale information of the facial image, the proposed edge pixel counting method uses edge information to estimate the positions of face features such as the eyes, nose and mouth, using a variable-size face feature template whose initial size is predetermined from a facial image database. The method is robust in the sense that detection is possible for facial images with different skin colors and different facial orientations. Subsequently, deformable circular template matching determines the two iris positions of the face, which are then used by edge pixel counting to track the features in the next frame. Whereas feature tracking using gray-scale template matching often fails when the inter-frame correlation around the feature areas is very low due to facial expression changes (such as talking, smiling or eye blinking), feature tracking using edge pixel counting tracks facial features reliably. Experimental results are shown to demonstrate the effectiveness of the proposed method.
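The core of edge pixel counting, as described above, is to score candidate template positions by the number of edge pixels they enclose. A minimal sketch, assuming a precomputed binary edge map and a fixed-size window (the paper's template is variable-size and face-specific):

```python
# Hypothetical sketch of edge pixel counting: slide a window over a
# binary edge map and pick the position enclosing the most edge pixels.
# The fixed window size and plain maximisation are assumptions for
# illustration; they are not the paper's exact template scheme.

def count_edges(edge_map, top, left, height, width):
    """Number of edge pixels inside the window anchored at (top, left)."""
    return sum(edge_map[r][c]
               for r in range(top, top + height)
               for c in range(left, left + width))

def locate_feature(edge_map, height, width):
    """Return (top, left) of the window containing the most edge pixels."""
    rows, cols = len(edge_map), len(edge_map[0])
    best, best_pos = -1, (0, 0)
    for top in range(rows - height + 1):
        for left in range(cols - width + 1):
            n = count_edges(edge_map, top, left, height, width)
            if n > best:
                best, best_pos = n, (top, left)
    return best_pos
```

Because only edge density matters, the score is insensitive to absolute gray levels, which is what makes the approach robust to skin-color variation.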

  • Stress Classification Using Subband Based Features

    Tin Lay NWE  Say Wei FOO  Liyanage C. DE SILVA  

     
    PAPER-Speech Synthesis and Prosody

    Vol: E86-D No:3  Page(s): 565-573

    In research to determine reliable acoustic indicators of the type of stress present in speech, most systems have concentrated on statistics extracted from the pitch contour, the energy contour, wavelet-based subband features and Teager Energy Operator (TEO) based feature parameters. These systems work mostly on pairwise distinction between stressed and neutral speech, and their performance decreases substantially when tested on multi-style detection among many stress categories. In this paper, a novel system is proposed using linear short-time Log Frequency Power Coefficients (LFPC) and TEO-based nonlinear LFPC features in both the time and frequency domains. A five-state Hidden Markov Model (HMM) with continuous Gaussian mixture distributions is used. The stress classification ability of the system is tested using data from the SUSAS (Speech Under Simulated and Actual Stress) database to categorize five stress conditions individually. It is found that the linear acoustic feature LFPC performs better than the nonlinear TEO-based LFPC feature parameters. Results show that with the linear LFPC features, an average accuracy of 84% and a best accuracy of 95% can be achieved in classifying the five categories. Tests under different signal-to-noise conditions show that the performance of the system does not degrade drastically as noise increases. It is also observed that classification using nonlinear frequency-domain LFPC features gives relatively higher accuracy than that using nonlinear time-domain LFPC features.
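The two building blocks named in the abstract can be sketched briefly. The TEO below is the standard discrete form; the "log frequency power coefficients" are approximated here as log-compressed power in logarithmically spaced subbands of a power spectrum. The band-edge rule and normalisation are illustrative assumptions, not the paper's exact parameters.

```python
import math

def teager_energy(x):
    """Discrete Teager Energy Operator:
    psi[x](n) = x(n)^2 - x(n-1) * x(n+1)."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def lfpc(power_spectrum, n_bands=4):
    """Log power in logarithmically spaced subbands of a power spectrum.
    Band edges grow geometrically over bins 1..N (bin 0, DC, is skipped);
    this spacing rule is an assumption for illustration."""
    n = len(power_spectrum)
    edges = [int(round(n ** (k / n_bands))) for k in range(n_bands + 1)]
    coeffs = []
    for lo, hi in zip(edges, edges[1:]):
        hi = max(hi, lo + 1)                 # ensure a non-empty band
        band_power = sum(power_spectrum[lo:hi])
        coeffs.append(math.log(band_power + 1e-12))
    return coeffs
```

For a pure sinusoid A*sin(W*n) the TEO output is the constant A^2*sin(W)^2, which is why it is often read as a joint amplitude-frequency energy measure; applying the TEO to subband signals before computing the log power yields the nonlinear LFPC variant mentioned above.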

  • Emotion Enhanced Face to Face Meetings Using the Concept of Virtual Space Teleconferencing

    Liyanage C. DE SILVA  Tsutomu MIYASATO  Fumio KISHINO  

     
    PAPER

    Vol: E79-D No:6  Page(s): 772-780

    Here we investigate the unique advantages of our proposed Virtual Space Teleconferencing System (VST) in the area of multimedia teleconferencing, with emphasis on facial emotion transmission and recognition. Specifically, we show that this concept enables a unique mode of communication in which the emotions of the local participant are transmitted to the remote party with a higher recognition rate, by enhancing the emotions through intelligent processing between the local and remote participants. In other words, such emotion-enhanced teleconferencing systems can surpass face-to-face meetings by effectively alleviating the barriers to recognizing emotions across different nations. We also show that this approach is a better alternative to the blurred or mosaiced facial images found in some television interviews with people who do not wish to be identified in public.