A salient feature extraction algorithm is proposed to improve the recognition rate of speech emotion. First, the spectrogram of the emotional speech is calculated. Second, imitating the selective attention mechanism, the color, direction, and brightness maps of the spectrogram are computed. Each map is normalized and down-sampled to form a low-resolution feature matrix. Then, each feature matrix is converted to a row vector, and principal component analysis (PCA) is used to reduce feature redundancy, making the subsequent classification more practical. Finally, the speech emotion is classified with a support vector machine. Compared with traditional features, the improvement in recognition rate reaches 15%.
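The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the frame length, hop size, 8x8 output grid, and all function names are hypothetical; the brightness and direction maps are crude stand-ins (the raw log-spectrogram and its gradient magnitudes); the color map is omitted because this sketch works on a single-channel spectrogram; and PCA is done with a plain SVD. The final support vector machine step is left out and could be handled by any SVM library.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram from a Hann-windowed short-time FFT."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(mag).T  # shape: (frequency bins, time frames)

def normalize(m):
    """Scale a map to [0, 1]; a constant map becomes all zeros."""
    span = m.max() - m.min()
    return (m - m.min()) / span if span > 0 else np.zeros_like(m)

def feature_maps(spec):
    """Assumed stand-ins: brightness = spectrogram, direction = gradients."""
    d_freq, d_time = np.gradient(spec)
    return [spec, np.abs(d_freq), np.abs(d_time)]

def downsample(m, out_shape=(8, 8)):
    """Block-average a map to out_shape, cropping any remainder."""
    bh, bw = m.shape[0] // out_shape[0], m.shape[1] // out_shape[1]
    m = m[:bh * out_shape[0], :bw * out_shape[1]]
    return m.reshape(out_shape[0], bh, out_shape[1], bw).mean(axis=(1, 3))

def row_vector(signal):
    """Normalize, down-sample, and flatten every map into one row vector."""
    maps = feature_maps(spectrogram(signal))
    return np.concatenate([downsample(normalize(m)).ravel() for m in maps])

def pca(X, k):
    """Project the rows of X onto the top-k principal components."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:k].T

# Synthetic tones stand in for emotional speech recordings.
rng = np.random.default_rng(0)
t = np.arange(4000) / 8000.0
signals = [np.sin(2 * np.pi * f * t) + 0.1 * rng.standard_normal(t.size)
           for f in (200, 400, 600, 800, 1000, 1200)]

X = np.stack([row_vector(s) for s in signals])  # one row per utterance
Z = pca(X, k=3)                                 # reduced features for a classifier
```

Here `X` holds one 192-dimensional row per signal (three maps of 8x8 values each), and `Z` is the reduced representation that would be passed to the classifier.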
Ruiyu LIANG
Nanjing Institute of Technology
Huawei TAO
Southeast University
Guichen TANG
Nanjing Institute of Technology
Qingyun WANG
Nanjing Institute of Technology
Li ZHAO
Southeast University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Ruiyu LIANG, Huawei TAO, Guichen TANG, Qingyun WANG, Li ZHAO, "A Salient Feature Extraction Algorithm for Speech Emotion Recognition" in IEICE TRANSACTIONS on Information, vol. E98-D, no. 9, pp. 1715-1718, September 2015, doi: 10.1587/transinf.2015EDL8091.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2015EDL8091/_p
@ARTICLE{e98-d_9_1715,
author={Ruiyu LIANG and Huawei TAO and Guichen TANG and Qingyun WANG and Li ZHAO},
journal={IEICE TRANSACTIONS on Information},
title={A Salient Feature Extraction Algorithm for Speech Emotion Recognition},
year={2015},
volume={E98-D},
number={9},
pages={1715-1718},
keywords={},
doi={10.1587/transinf.2015EDL8091},
ISSN={1745-1361},
month={September},}
TY - JOUR
TI - A Salient Feature Extraction Algorithm for Speech Emotion Recognition
T2 - IEICE TRANSACTIONS on Information
SP - 1715
EP - 1718
AU - Ruiyu LIANG
AU - Huawei TAO
AU - Guichen TANG
AU - Qingyun WANG
AU - Li ZHAO
PY - 2015
DO - 10.1587/transinf.2015EDL8091
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E98-D
IS - 9
JA - IEICE TRANSACTIONS on Information
Y1 - September 2015
ER -