Unconstrained Facial Expression Recognition Based on Feature Enhanced CNN and Cross-Layer LSTM

Ying TONG; Rui CHEN; Ruiyu LIANG

doi:10.1587/transinf.2020EDL8065

Summary:

LSTM networks have been shown to perform strongly in facial expression recognition from video sequences. Given the limited representation ability of a single-layer LSTM, a hierarchical attention model with an enhanced feature branch is proposed. The new network architecture consists of the conventional VGG-16-FACE with an enhanced feature branch, followed by a cross-layer LSTM. The VGG-16-FACE with the enhanced branch extracts spatial features, while the cross-layer LSTM extracts the temporal relations between different frames in the video. The proposed method is evaluated on public emotion databases in subject-independent and cross-database tasks and outperforms state-of-the-art methods.
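
The data flow described in the summary (per-frame spatial features feeding a multi-layer LSTM whose layers are combined) can be illustrated with a small sketch. This is not the authors' implementation: the summary does not define how the "cross-layer" connection or the enhanced feature branch work, so the sketch assumes, purely for illustration, that the final hidden states of two stacked LSTM layers are concatenated. The CNN is replaced by random stand-in feature vectors, and all names (`make_lstm`, `run_lstm`, `cross_layer_descriptor`) and dimensions are hypothetical.

```python
import math
import random


def make_lstm(input_dim, hidden_dim, rng):
    """Random weights for one LSTM layer: four gates, each acting on [x; h]."""
    rows, cols = 4 * hidden_dim, input_dim + hidden_dim
    return {
        "W": [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)],
        "b": [0.0] * rows,
        "hidden_dim": hidden_dim,
    }


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def run_lstm(layer, sequence):
    """Run one LSTM layer over a sequence; return the hidden state at every step."""
    n = layer["hidden_dim"]
    h, c = [0.0] * n, [0.0] * n
    outputs = []
    for x in sequence:
        xh = list(x) + h  # concatenate the input with the previous hidden state
        z = [sum(w * v for w, v in zip(row, xh)) + b
             for row, b in zip(layer["W"], layer["b"])]
        i = [sigmoid(v) for v in z[0:n]]            # input gate
        f = [sigmoid(v) for v in z[n:2 * n]]        # forget gate
        g = [math.tanh(v) for v in z[2 * n:3 * n]]  # candidate cell update
        o = [sigmoid(v) for v in z[3 * n:4 * n]]    # output gate
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        outputs.append(h)
    return outputs


def cross_layer_descriptor(frame_features, layer1, layer2):
    """ASSUMPTION: 'cross-layer' is modeled here as concatenating the final
    hidden states of both stacked layers; the paper may define it differently."""
    h1 = run_lstm(layer1, frame_features)  # temporal features from layer 1
    h2 = run_lstm(layer2, h1)              # layer 2 consumes layer 1's outputs
    return h1[-1] + h2[-1]                 # list concatenation of both layers


rng = random.Random(0)
feat_dim, hidden_dim, num_frames = 8, 4, 5  # hypothetical toy sizes

# Stand-in for the per-frame spatial features a CNN would produce.
frames = [[rng.uniform(-1.0, 1.0) for _ in range(feat_dim)]
          for _ in range(num_frames)]

layer1 = make_lstm(feat_dim, hidden_dim, rng)
layer2 = make_lstm(hidden_dim, hidden_dim, rng)
descriptor = cross_layer_descriptor(frames, layer1, layer2)
print(len(descriptor))  # one video-level vector of size 2 * hidden_dim
```

In a full system, a video-level descriptor of this kind would be passed to a classifier over the expression categories; the weights here are random, so the sketch shows only the shape of the computation, not a trained model.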

Publication: IEICE TRANSACTIONS on Information Vol.E103-D No.11 pp.2403-2406

Publication Date: 2020/11/01

Publicized: 2020/07/30

Online ISSN: 1745-1361

DOI: 10.1587/transinf.2020EDL8065

Type of Manuscript: LETTER

Category: Image Recognition, Computer Vision

Authors

Ying TONG
  Nanjing Institute of Technology
Rui CHEN
  Nanjing Institute of Technology
Ruiyu LIANG
  Nanjing Institute of Technology

Keywords

facial expression recognition, video sequence, long short-term memory, feature extraction

Cite this

Ying TONG, Rui CHEN, Ruiyu LIANG, "Unconstrained Facial Expression Recognition Based on Feature Enhanced CNN and Cross-Layer LSTM" in IEICE TRANSACTIONS on Information, vol. E103-D, no. 11, pp. 2403-2406, November 2020, doi: 10.1587/transinf.2020EDL8065.
Abstract: LSTM networks have been shown to perform strongly in facial expression recognition from video sequences. Given the limited representation ability of a single-layer LSTM, a hierarchical attention model with an enhanced feature branch is proposed. The new network architecture consists of the conventional VGG-16-FACE with an enhanced feature branch, followed by a cross-layer LSTM. The VGG-16-FACE with the enhanced branch extracts spatial features, while the cross-layer LSTM extracts the temporal relations between different frames in the video. The proposed method is evaluated on public emotion databases in subject-independent and cross-database tasks and outperforms state-of-the-art methods.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2020EDL8065/_p

@ARTICLE{e103-d_11_2403,
author={Ying TONG and Rui CHEN and Ruiyu LIANG},
journal={IEICE TRANSACTIONS on Information},
title={Unconstrained Facial Expression Recognition Based on Feature Enhanced CNN and Cross-Layer LSTM},
year={2020},
volume={E103-D},
number={11},
pages={2403-2406},
abstract={LSTM networks have been shown to perform strongly in facial expression recognition from video sequences. Given the limited representation ability of a single-layer LSTM, a hierarchical attention model with an enhanced feature branch is proposed. The new network architecture consists of the conventional VGG-16-FACE with an enhanced feature branch, followed by a cross-layer LSTM. The VGG-16-FACE with the enhanced branch extracts spatial features, while the cross-layer LSTM extracts the temporal relations between different frames in the video. The proposed method is evaluated on public emotion databases in subject-independent and cross-database tasks and outperforms state-of-the-art methods.},
keywords={facial expression recognition, video sequence, long short-term memory, feature extraction},
doi={10.1587/transinf.2020EDL8065},
ISSN={1745-1361},
month={November}
}

TY  - JOUR
TI  - Unconstrained Facial Expression Recognition Based on Feature Enhanced CNN and Cross-Layer LSTM
T2  - IEICE TRANSACTIONS on Information
SP  - 2403
EP  - 2406
AU  - Ying TONG
AU  - Rui CHEN
AU  - Ruiyu LIANG
PY  - 2020
DO  - 10.1587/transinf.2020EDL8065
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E103-D
IS  - 11
JA  - IEICE TRANSACTIONS on Information
Y1  - November 2020
AB  - LSTM networks have been shown to perform strongly in facial expression recognition from video sequences. Given the limited representation ability of a single-layer LSTM, a hierarchical attention model with an enhanced feature branch is proposed. The new network architecture consists of the conventional VGG-16-FACE with an enhanced feature branch, followed by a cross-layer LSTM. The VGG-16-FACE with the enhanced branch extracts spatial features, while the cross-layer LSTM extracts the temporal relations between different frames in the video. The proposed method is evaluated on public emotion databases in subject-independent and cross-database tasks and outperforms state-of-the-art methods.
ER  - 