IEICE global.ieice.org Site

Author Search Result

[Author] Shidang LI(2hit)

Multimodal Speech Emotion Recognition Based on Large Language Model (Open Access)
Congcong FANG, Yun JIN, Guanlin CHEN, Yunfan ZHANG, Shidang LI, Yong MA, Yue XIE

LETTER-Speech and Hearing

Publicized: 2024/07/22
Vol: E107-D No: 11
Page(s): 1463-1467
Currently, an increasing number of tasks in speech emotion recognition rely on the analysis of both speech and text features. However, there remains a paucity of research exploring the potential of leveraging large language models like GPT-3 to enhance emotion recognition. In this investigation, we harness the power of the GPT-3 model to extract semantic information from transcribed texts, generating text modal features with a dimensionality of 1536. Subsequently, we perform feature fusion, combining the 1536-dimensional text features with 1188-dimensional acoustic features to yield comprehensive multi-modal recognition outcomes. Our findings reveal that the proposed method achieves a weighted accuracy of 79.62% across the four emotion categories in IEMOCAP, underscoring the considerable enhancement in emotion recognition accuracy facilitated by integrating large language models.
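
The fusion step in the abstract above can be illustrated with a minimal sketch. It assumes that fusion means plain concatenation; the abstract does not say how the two feature sets are combined, so this is only one possible reading. The function name `fuse_features` and the constant placeholder vectors are hypothetical and stand in for real GPT-3 text embeddings and acoustic features.

```python
TEXT_DIM = 1536      # text feature dimensionality stated in the abstract
ACOUSTIC_DIM = 1188  # acoustic feature dimensionality stated in the abstract


def fuse_features(text_vec, acoustic_vec):
    """Concatenate one utterance's text and acoustic features.

    Assumption: fusion is simple concatenation; the paper may use a
    different fusion scheme.
    """
    if len(text_vec) != TEXT_DIM:
        raise ValueError(f"expected {TEXT_DIM} text features, got {len(text_vec)}")
    if len(acoustic_vec) != ACOUSTIC_DIM:
        raise ValueError(f"expected {ACOUSTIC_DIM} acoustic features, got {len(acoustic_vec)}")
    return list(text_vec) + list(acoustic_vec)


# Placeholder vectors (not real data), used only to show the shapes involved.
text_features = [0.0] * TEXT_DIM
acoustic_features = [1.0] * ACOUSTIC_DIM
fused = fuse_features(text_features, acoustic_features)
print(len(fused))
```

Under this assumption the classifier input would have 1536 + 1188 = 2724 dimensions per utterance.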

Robust Beamforming for Joint Transceiver Design in K-User Interference Channel over Energy Efficient 5G
Shidang LI, Chunguo LI, Yongming HUANG, Dongming WANG, Luxi YANG

LETTER-Communication Theory and Signals

Vol: E98-A No: 8
Page(s): 1860-1864
Considering worst-case channel uncertainties, we investigate the robust energy-efficient (EE) beamforming design problem in a K-user multiple-input single-output (MISO) interference channel. Our objective is to maximize the worst-case sum EE under individual transmit power constraints. In general, finding the optimal solution to this fractional programming problem is NP-hard. To gain insight into the problem, we first transform the original problem into its lower-bound problem in max-min fractional form by exploiting the relationship between the user rate and the minimum mean square error (MMSE) and using the min-max inequality. To make it tractable, we transform the fractional-form problem into a subtractive form using the Dinkelbach transformation, and then propose an iterative algorithm based on Lagrangian duality, which leads to a locally optimal solution. Simulation results demonstrate that our proposed robust EE beamforming scheme outperforms the conventional algorithm.
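
The Dinkelbach transformation mentioned in the abstract above turns a ratio maximization max f(x)/g(x) into a sequence of subtractive problems max f(x) - λ g(x). The sketch below shows the generic iteration on a toy single-link energy-efficiency problem. It is not the paper's robust beamforming algorithm: the scalar power variable, the channel gain, the circuit power, and the grid search used for the inner problem are all illustrative assumptions.

```python
import math


def dinkelbach(numerator, denominator, candidates, tol=1e-9, max_iter=100):
    """Maximize numerator(x) / denominator(x) over a finite candidate set.

    Assumes denominator(x) > 0 for every candidate. The inner subtractive
    problem is solved by exhaustive search, which is only viable for toy cases.
    """
    lam = 0.0
    best = candidates[0]
    for _ in range(max_iter):
        # Subtractive-form subproblem: maximize f(x) - lam * g(x).
        best = max(candidates, key=lambda x: numerator(x) - lam * denominator(x))
        gap = numerator(best) - lam * denominator(best)
        if gap < tol:
            break  # lam is (numerically) the optimal ratio
        lam = numerator(best) / denominator(best)
    return best, lam


# Toy setup (hypothetical values): rate in bit/s/Hz over total consumed power.
GAIN = 4.0           # assumed channel gain-to-noise ratio
CIRCUIT_POWER = 1.0  # assumed fixed circuit power


def rate(p):
    return math.log2(1.0 + GAIN * p)


def total_power(p):
    return p + CIRCUIT_POWER


powers = [i / 1000 for i in range(1, 2001)]  # transmit power grid up to 2.0
p_opt, ee_opt = dinkelbach(rate, total_power, powers)
print(p_opt, ee_opt)
```

Each iteration updates λ to the ratio achieved by the current maximizer, so λ never decreases, and the loop stops once the subtractive objective can no longer exceed zero.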