Recognizing human actions in complex scenes is a challenging problem in computer vision. Action-unrelated concepts, such as camera position, can significantly affect the appearance of local spatio-temporal features, which degrades the performance of methods based on low-level features. In this letter, we define one such action-unrelated concept, the camera position, as a high-level feature. We observe that camera position features can serve as a prior for local spatio-temporal features in human action recognition. We encode this prior by modeling interactions between spatio-temporal features and camera position features, and we infer the camera position features from the local spatio-temporal features via these interactions. The parameters of the model are estimated by a new max-margin algorithm. We evaluate the proposed method on the KTH, IXMAS, and YouTube action datasets. Experimental results show the effectiveness of the proposed method.
Wen ZHOU
Chinese Academy of Sciences
Chunheng WANG
Chinese Academy of Sciences
Baihua XIAO
Chinese Academy of Sciences
Zhong ZHANG
Chinese Academy of Sciences
Yunxue SHAO
Chinese Academy of Sciences
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Wen ZHOU, Chunheng WANG, Baihua XIAO, Zhong ZHANG, Yunxue SHAO, "Modeling Interactions between Low-Level and High-Level Features for Human Action Recognition" in IEICE TRANSACTIONS on Information, vol. E96-D, no. 12, pp. 2896-2899, December 2013, doi: 10.1587/transinf.E96.D.2896.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E96.D.2896/_p
@ARTICLE{e96-d_12_2896,
author={Wen ZHOU and Chunheng WANG and Baihua XIAO and Zhong ZHANG and Yunxue SHAO},
journal={IEICE TRANSACTIONS on Information},
title={Modeling Interactions between Low-Level and High-Level Features for Human Action Recognition},
year={2013},
volume={E96-D},
number={12},
pages={2896--2899},
abstract={Recognizing human action in complex scenes is a challenging problem in computer vision. Some action-unrelated concepts, such as camera position features, could significantly affect the appearance of local spatio-temporal features, and therefore the performance of low-level features based methods degrades. In this letter, we define the action-unrelated concept: the position of camera as high-level features. We observe that they can serve as a prior to local spatio-temporal features for human action recognition. We encode this prior by modeling interactions between spatio-temporal features and camera position features. We infer camera position features from local spatio-temporal features via these interactions. The parameters of this model are estimated by a new max-margin algorithm. We evaluate the proposed method on KTH, IXMAS and Youtube actions datasets. Experimental results show the effectiveness of the proposed method.},
keywords={},
doi={10.1587/transinf.E96.D.2896},
ISSN={1745-1361},
month={December}
}
TY  - JOUR
TI  - Modeling Interactions between Low-Level and High-Level Features for Human Action Recognition
T2  - IEICE TRANSACTIONS on Information
SP  - 2896
EP  - 2899
AU  - Wen ZHOU
AU  - Chunheng WANG
AU  - Baihua XIAO
AU  - Zhong ZHANG
AU  - Yunxue SHAO
PY  - 2013
DO  - 10.1587/transinf.E96.D.2896
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E96-D
IS  - 12
JA  - IEICE TRANSACTIONS on Information
Y1  - December 2013
AB  - Recognizing human action in complex scenes is a challenging problem in computer vision. Some action-unrelated concepts, such as camera position features, could significantly affect the appearance of local spatio-temporal features, and therefore the performance of low-level features based methods degrades. In this letter, we define the action-unrelated concept: the position of camera as high-level features. We observe that they can serve as a prior to local spatio-temporal features for human action recognition. We encode this prior by modeling interactions between spatio-temporal features and camera position features. We infer camera position features from local spatio-temporal features via these interactions. The parameters of this model are estimated by a new max-margin algorithm. We evaluate the proposed method on KTH, IXMAS and Youtube actions datasets. Experimental results show the effectiveness of the proposed method.
ER  - 