IEICE global.ieice.org Site

Author Search Result

[Author] Shifeng OU(2hit)


Learning Corpus-Invariant Discriminant Feature Representations for Speech Emotion Recognition
Peng SONG Shifeng OU Zhenbin DU Yanyan GUO Wenming MA Jinglei LIU Wenming ZHENG

LETTER-Speech and Hearing

Publicized: 2017/02/02
Vol: E100-D No: 5
Page(s): 1136-1139
As a hot topic in speech signal processing, speech emotion recognition methods have developed rapidly in recent years, and some satisfactory results have been achieved. However, it should be noted that most of these methods are trained and evaluated on the same corpus. In reality, the training data and testing data are often collected from different corpora, and the feature distributions of different datasets often differ. These discrepancies greatly affect the recognition performance. To tackle this problem, a novel corpus-invariant discriminant feature representation algorithm, called transfer discriminant analysis (TDA), is presented for speech emotion recognition. The basic idea of TDA is to integrate the kernel LDA algorithm and the similarity measurement of distributions into one objective function. Experimental results under cross-corpus conditions show that our proposed method can significantly improve the recognition rates.
Transfer Semi-Supervised Non-Negative Matrix Factorization for Speech Emotion Recognition
Peng SONG Shifeng OU Xinran ZHANG Yun JIN Wenming ZHENG Jinglei LIU Yanwei YU

LETTER-Speech and Hearing

Publicized: 2016/07/01
Vol: E99-D No: 10
Page(s): 2647-2650
In practice, emotional speech utterances are often collected from different devices or under different conditions, which leads to a discrepancy between the training and testing data and results in a sharp decrease in recognition rates. To solve this problem, in this letter, a novel transfer semi-supervised non-negative matrix factorization (TSNMF) method is presented. A semi-supervised non-negative matrix factorization (SNMF) algorithm, utilizing both labeled source and unlabeled target data, is adopted to learn common feature representations. Meanwhile, the maximum mean discrepancy (MMD) is employed as a similarity measurement to reduce the distance between the feature distributions of the two databases. Finally, the TSNMF algorithm, which optimizes the SNMF and MMD functions together, is proposed to obtain robust feature representations across databases. Extensive experiments demonstrate that, in comparison to state-of-the-art approaches, our proposed method can significantly improve the cross-corpus recognition rates.
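
The MMD term mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which kernel the letter uses, so the linear kernel below is an assumption; with it, the squared MMD reduces to the squared distance between the mean feature vectors of the two databases. The function names and the toy feature values are hypothetical and are not taken from the letter.

```python
def feature_means(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def linear_mmd2(source, target):
    """Squared MMD under a linear kernel (an assumed kernel choice):
    the squared Euclidean distance between the two feature means."""
    mu_s = feature_means(source)
    mu_t = feature_means(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


# Toy features standing in for two corpora (hypothetical values).
source = [[0.0, 1.0], [2.0, 3.0]]   # mean is [1.0, 2.0]
target = [[1.0, 1.0], [3.0, 5.0]]   # mean is [2.0, 3.0]

gap = linear_mmd2(source, target)
print(gap)  # 2.0
```

A value of zero means the two feature means coincide; a transfer method of the kind described would add a term like this to its factorization objective so that the learned representations pull the two corpora together.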
