A major challenge for speech emotion recognition is that recognition rates drop markedly when the training and deployment conditions do not use the same speech corpus. Transfer learning, which has successfully addressed cross-domain classification and recognition problems, is applied here to cross-corpus speech emotion recognition. First, using maximum mean discrepancy embedding (MMDE) optimization and dimension reduction algorithms, two close low-dimensional feature spaces are obtained for the source and target speech corpora, respectively. Then, a classifier is trained on the learned low-dimensional features of the labeled source corpus and applied directly to the unlabeled target corpus for emotion label recognition. Experimental results demonstrate that the transfer learning method can significantly outperform the traditional automatic recognition technique for cross-corpus speech emotion recognition.
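The quantity that an MMD-based embedding tries to make small can be illustrated with the empirical maximum mean discrepancy between source and target feature sets. The following is only a minimal sketch, not the authors' MMDE optimization: the RBF kernel, the bandwidth `gamma`, and the function names `rbf_kernel` and `mmd2` are assumptions introduced for illustration.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    # Pairwise squared Euclidean distances between rows of a and rows of b.
    sq = (np.sum(a ** 2, axis=1)[:, None]
          + np.sum(b ** 2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq, 0.0))

def mmd2(source, target, gamma=1.0):
    # Biased empirical estimate of the squared MMD (assumed RBF kernel):
    # mean k(s, s') + mean k(t, t') - 2 * mean k(s, t).
    k_ss = rbf_kernel(source, source, gamma).mean()
    k_tt = rbf_kernel(target, target, gamma).mean()
    k_st = rbf_kernel(source, target, gamma).mean()
    return k_ss + k_tt - 2.0 * k_st

# Hypothetical feature matrices standing in for two corpora.
rng = np.random.default_rng(0)
source_feats = rng.normal(size=(50, 3))
target_feats = rng.normal(size=(50, 3)) + 3.0  # shifted distribution
print(mmd2(source_feats, target_feats))
```

A value near zero suggests the two feature sets are distributed similarly under the chosen kernel. A larger value indicates the kind of corpus mismatch that the abstract describes.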
Peng SONG
Southeast University
Yun JIN
Jiangsu Normal University
Li ZHAO
Southeast University
Minghai XIN
Southeast University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Peng SONG, Yun JIN, Li ZHAO, Minghai XIN, "Speech Emotion Recognition Using Transfer Learning" in IEICE TRANSACTIONS on Information, vol. E97-D, no. 9, pp. 2530-2532, September 2014, doi: 10.1587/transinf.2014EDL8038.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2014EDL8038/_p
@ARTICLE{e97-d_9_2530,
author={Peng SONG and Yun JIN and Li ZHAO and Minghai XIN},
journal={IEICE TRANSACTIONS on Information},
title={Speech Emotion Recognition Using Transfer Learning},
year={2014},
volume={E97-D},
number={9},
pages={2530--2532},
abstract={A major challenge for speech emotion recognition is that when the training and deployment conditions do not use the same speech corpus, the recognition rates will obviously drop. Transfer learning, which has successfully addressed the cross-domain classification or recognition problem, is presented for cross-corpus speech emotion recognition. First, by using the maximum mean discrepancy embedding (MMDE) optimization and dimension reduction algorithms, two close low-dimensional feature spaces are obtained for source and target speech corpora, respectively. Then, a classifier function is trained using the learned low-dimensional features in the labeled source corpus, and directly applied to the unlabeled target corpus for emotion label recognition. Experimental results demonstrate that the transfer learning method can significantly outperform the traditional automatic recognition technique for cross-corpus speech emotion recognition.},
keywords={},
doi={10.1587/transinf.2014EDL8038},
ISSN={1745-1361},
month={September},
}
TY - JOUR
TI - Speech Emotion Recognition Using Transfer Learning
T2 - IEICE TRANSACTIONS on Information
SP - 2530
EP - 2532
AU - Peng SONG
AU - Yun JIN
AU - Li ZHAO
AU - Minghai XIN
PY - 2014
DO - 10.1587/transinf.2014EDL8038
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E97-D
IS - 9
JA - IEICE TRANSACTIONS on Information
Y1 - September 2014
AB - A major challenge for speech emotion recognition is that when the training and deployment conditions do not use the same speech corpus, the recognition rates will obviously drop. Transfer learning, which has successfully addressed the cross-domain classification or recognition problem, is presented for cross-corpus speech emotion recognition. First, by using the maximum mean discrepancy embedding (MMDE) optimization and dimension reduction algorithms, two close low-dimensional feature spaces are obtained for source and target speech corpora, respectively. Then, a classifier function is trained using the learned low-dimensional features in the labeled source corpus, and directly applied to the unlabeled target corpus for emotion label recognition. Experimental results demonstrate that the transfer learning method can significantly outperform the traditional automatic recognition technique for cross-corpus speech emotion recognition.
ER -