In this letter, we study video-based bimodal emotion recognition from face and gesture image sequences, combining Harris plus cuboids spatio-temporal features (HST) with a sparse canonical correlation analysis (SCCA) fusion method. To extract the spatio-temporal features effectively, we adopt the Harris 3D feature detector proposed by Laptev and Lindeberg to find interest points in both the face and gesture videos, and then apply the cuboids feature descriptor to extract the facial expression and gesture emotion features [1],[2]. To further extract the emotion features common to the facial expression feature set and the gesture feature set, we apply SCCA and use the resulting features for bimodal emotion classification, with the K-nearest neighbor classifier and the SVM classifier each used in turn. We test the method on the bimodal face and body gesture (FABO) database, and the experimental results show better recognition accuracy than other methods.
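The fusion and classification stage described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the abstract does not state which SCCA formulation is used, so the sketch swaps in a simple penalized power iteration with soft-thresholding; the face and gesture feature matrices are synthetic stand-ins for real HST descriptors; the fused representation is a plain concatenation of the two projections; and a 1-nearest-neighbor rule stands in for the K-nearest neighbor classifier. All function names (`sparse_cca`, `fuse`, `nearest_neighbor_predict`) are hypothetical.

```python
import numpy as np


def soft_threshold(a, t):
    """Shrink each entry of `a` toward zero by `t` (the L1 proximal step)."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)


def sparse_cca(X, Y, k=2, lam=0.3, n_iter=50):
    """Return k sparse direction pairs (U, V) via penalized power iterations.

    `lam` in [0, 1) is a relative threshold: larger values zero out more
    entries. This is an illustrative variant, not the paper's formulation.
    """
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    C = Xc.T @ Yc / (len(X) - 1)  # cross-covariance, shape (dx, dy)
    U, V = [], []
    for _component in range(k):
        # Start from the leading right singular vector of the current C.
        v = np.linalg.svd(C, full_matrices=False)[2][0]
        for _step in range(n_iter):
            a = C @ v
            u = soft_threshold(a, lam * np.abs(a).max())
            u /= np.linalg.norm(u) + 1e-12
            b = C.T @ u
            v = soft_threshold(b, lam * np.abs(b).max())
            v /= np.linalg.norm(v) + 1e-12
        C = C - (u @ C @ v) * np.outer(u, v)  # deflate before the next pair
        U.append(u)
        V.append(v)
    return np.column_stack(U), np.column_stack(V)


def fuse(X, Y, U, V, x_mean, y_mean):
    """Project both modalities and concatenate (an assumed fusion rule)."""
    return np.hstack([(X - x_mean) @ U, (Y - y_mean) @ V])


def nearest_neighbor_predict(train_F, train_y, test_F):
    """1-nearest-neighbor classification with squared Euclidean distance."""
    d2 = ((test_F[:, None, :] - train_F[None, :, :]) ** 2).sum(axis=2)
    return train_y[d2.argmin(axis=1)]


# Synthetic stand-ins for face (X) and gesture (Y) feature vectors that
# share a class-dependent signal; real inputs would be HST descriptors.
rng = np.random.default_rng(0)
n, dx, dy, n_classes = 60, 20, 15, 3
labels = rng.integers(0, n_classes, size=n)
latent = np.eye(n_classes)[labels]
X = latent @ rng.normal(size=(n_classes, dx)) + 0.5 * rng.normal(size=(n, dx))
Y = latent @ rng.normal(size=(n_classes, dy)) + 0.5 * rng.normal(size=(n, dy))

train, test = slice(0, 40), slice(40, n)
U, V = sparse_cca(X[train], Y[train], k=2, lam=0.3)
x_mean, y_mean = X[train].mean(axis=0), Y[train].mean(axis=0)
F_train = fuse(X[train], Y[train], U, V, x_mean, y_mean)
F_test = fuse(X[test], Y[test], U, V, x_mean, y_mean)
pred = nearest_neighbor_predict(F_train, labels[train], F_test)
print("held-out accuracy on synthetic data:", (pred == labels[test]).mean())
```

The projections are learned on the training split only and then reused for the held-out split, so the classifier never sees test data during fusion. The printed accuracy describes the synthetic data alone and says nothing about results on FABO.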
Jingjie YAN
Southeast University
Wenming ZHENG
Southeast University
Minhai XIN
Southeast University
Jingwei YAN
Southeast University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Jingjie YAN, Wenming ZHENG, Minhai XIN, Jingwei YAN, "Integrating Facial Expression and Body Gesture in Videos for Emotion Recognition" in IEICE TRANSACTIONS on Information,
vol. E97-D, no. 3, pp. 610-613, March 2014, doi: 10.1587/transinf.E97.D.610.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E97.D.610/_p
@ARTICLE{e97-d_3_610,
author={Jingjie YAN and Wenming ZHENG and Minhai XIN and Jingwei YAN},
journal={IEICE TRANSACTIONS on Information},
title={Integrating Facial Expression and Body Gesture in Videos for Emotion Recognition},
year={2014},
volume={E97-D},
number={3},
pages={610-613},
keywords={},
doi={10.1587/transinf.E97.D.610},
ISSN={1745-1361},
month={March},}
TY - JOUR
TI - Integrating Facial Expression and Body Gesture in Videos for Emotion Recognition
T2 - IEICE TRANSACTIONS on Information
SP - 610
EP - 613
AU - Jingjie YAN
AU - Wenming ZHENG
AU - Minhai XIN
AU - Jingwei YAN
PY - 2014
DO - 10.1587/transinf.E97.D.610
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E97-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2014
ER -