IEICE global.ieice.org Site

Author Search Result

[Author] Zhenlin LIANG (3 hits)

Hits 1-3

Speech Emotion Recognition Using Multihead Attention in Both Time and Feature Dimensions
Yue XIE, Ruiyu LIANG, Zhenlin LIANG, Xiaoyan ZHAO, Wenhao ZENG

LETTER-Speech and Hearing

Publicized: 2023/02/21
Vol: E106-D No: 5
Page(s): 1098-1101
To enhance emotion features and improve the performance of speech emotion recognition, an attention mechanism is employed to identify the important information in both the time and feature dimensions. In the time dimension, multi-head attention is modified to use the last state of the long short-term memory (LSTM) output, matching the time-accumulation characteristic of LSTM. In the feature dimension, scaled dot-product attention is replaced with additive attention, modeled on the LSTM state update, to construct multi-head attention. This means that a nonlinear transformation replaces the linear mapping in classical multi-head attention. Experiments on the IEMOCAP dataset demonstrate that the attention mechanism enhances emotional information and improves the performance of speech emotion recognition.
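As a rough illustration of the time-dimension idea only, the sketch below uses the last LSTM output as the attention query over all time steps. It is a single-head simplification with no learned projections; the function names `softmax` and `last_state_attention`, the scaled dot-product scoring, and the toy input are assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def last_state_attention(outputs):
    """Attend over LSTM outputs using the last time step as the query.

    outputs: list of T vectors (one per time step), each of equal length.
    Returns (weights, context). Assumption: scaled dot-product scoring,
    one head, no learned projection matrices.
    """
    query = outputs[-1]
    dim = len(query)
    scores = [
        sum(q * h for q, h in zip(query, step)) / math.sqrt(dim)
        for step in outputs
    ]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of all time steps.
    context = [
        sum(w * step[d] for w, step in zip(weights, outputs))
        for d in range(dim)
    ]
    return weights, context


# Toy example: three time steps with two features each.
weights, context = last_state_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

In this toy input the last step receives the largest weight because it is the most similar to itself; a multi-head version would repeat the computation on several learned projections of the outputs.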
Attention-Based Dense LSTM for Speech Emotion Recognition (Open Access)
Yue XIE, Ruiyu LIANG, Zhenlin LIANG, Li ZHAO

LETTER-Pattern Recognition

Publicized: 2019/04/17
Vol: E102-D No: 7
Page(s): 1426-1429
Despite the widespread use of deep learning for speech emotion recognition, deep models are severely restricted by information loss in the high layers of deep neural networks, as well as by the degradation problem. To utilize information efficiently and address degradation, an attention-based dense long short-term memory (LSTM) network is proposed for speech emotion recognition. LSTM networks, which can process time series such as speech, are constructed, and attention-based dense connections are introduced into them. That is, weight coefficients are added to the skip connections of each layer to distinguish the emotional information carried by different layers and to keep redundant information from the bottom layers from interfering with the effective information from the top layers. Experiments demonstrate that the proposed method improves recognition performance by 12% and 7% on the eNTERFACE and IEMOCAP corpora, respectively.
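The weighted skip connections described above can be pictured with a minimal sketch: the input to a layer is a weighted combination of the outputs of all earlier layers. The abstract does not say how the weight coefficients are computed, so the softmax over per-connection scores used here, along with the names `softmax`, `weighted_dense_input`, and `skip_logits`, is an assumption for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_dense_input(prev_outputs, skip_logits):
    """Combine the outputs of earlier layers into one input vector.

    prev_outputs: list of vectors, one per earlier layer (equal length).
    skip_logits: one score per skip connection (hypothetical parameters;
    in a trained model these would be learned).
    """
    weights = softmax(skip_logits)
    dim = len(prev_outputs[0])
    return [
        sum(w * out[d] for w, out in zip(weights, prev_outputs))
        for d in range(dim)
    ]


# Equal scores give a plain average of the two earlier layers' outputs.
combined = weighted_dense_input([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0])
```

Raising one score relative to the others shifts the combination toward that layer, which is one way a model could down-weight redundant lower-layer information.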
Weighted Gradient Pretrain for Low-Resource Speech Emotion Recognition
Yue XIE, Ruiyu LIANG, Xiaoyan ZHAO, Zhenlin LIANG, Jing DU

LETTER-Speech and Hearing

Publicized: 2022/04/04
Vol: E105-D No: 7
Page(s): 1352-1355
To reduce the dependency on the quantity of training data in speech emotion recognition, a weighted gradient pre-training algorithm for low-resource speech emotion recognition is proposed. Multiple public emotion corpora are used for pre-training to generate shared hidden layer (SHL) parameters with generalization ability. These parameters are used to initialize the downstream network of the recognition task for the low-resource dataset, thereby improving recognition performance on low-resource emotion corpora. However, the emotion categories differ among the public corpora, and the number of samples varies greatly, which increases the difficulty of joint training on multiple emotion datasets. To this end, a weighted gradient (WG) algorithm is proposed to enable the shared layer to learn a generalized representation of the different datasets without affecting the priority of emotion recognition on each corpus. Experiments show that accuracy is improved by using CASIA, IEMOCAP, and eNTERFACE as the known datasets to pre-train the emotion models for GEMEP, and that performance can be improved further by combining WG with a gradient reversal layer.

