Collective activity recognition plays an important role in high-level video analysis. Most current feature representations rely on contextual information extracted from the behaviour of nearby people. This requires detecting every person and estimating each person's pose. After feature extraction, hierarchical graphical models are typically employed to model the spatio-temporal patterns of individuals and their interactions, so complex preprocessing and inference operations cannot be avoided. To overcome these drawbacks, we present a new feature representation method, called the attribute-based spatio-temporal (AST) descriptor. First, two types of information, spatio-temporal (ST) features and attribute features, are exploited. The attribute features are manually specified. An attribute classifier is trained to model the relationship between the ST features and the attribute features, and the attribute features are then refreshed according to this relationship. Next, the ST features, the attribute features and the relationship between the attributes are combined to form the AST descriptor. An objective classifier can be specified on the AST descriptor, and the weight parameters of the classifier are used for recognition. Experiments on standard collective activity benchmark sets show the effectiveness of the proposed descriptor.
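The pipeline outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which classifiers are used or how the relationship between attributes is encoded. The sketch assumes a closed-form ridge regressor as a stand-in for both the attribute classifier and the objective classifier, pairwise products of attribute scores as the attribute relationship, and synthetic random data in place of real ST features. All function names (`fit_ridge`, `predict_ridge`, `ast_descriptor`) are hypothetical.

```python
import numpy as np


def fit_ridge(X, Y, lam=1e-2):
    """Closed-form ridge regression, used here as a stand-in linear classifier."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    return np.linalg.solve(Xb.T @ Xb + lam * np.eye(Xb.shape[1]), Xb.T @ Y)


def predict_ridge(W, X):
    """Apply the linear model learned by fit_ridge."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ W


def ast_descriptor(st, attr_W):
    """Concatenate ST features, refreshed attributes and attribute pair terms."""
    attrs = predict_ridge(attr_W, st)            # attributes refreshed from ST features
    i, j = np.triu_indices(attrs.shape[1], k=1)  # all unordered attribute pairs
    pairs = attrs[:, i] * attrs[:, j]            # assumed encoding of attribute relationships
    return np.hstack([st, attrs, pairs])


# Synthetic stand-ins for real data (assumption: 200 samples, 16-D ST features,
# 4 manually specified binary attributes, 3 activity classes).
rng = np.random.default_rng(0)
n, d, k, c = 200, 16, 4, 3
st = rng.normal(size=(n, d))
attr_labels = (st @ rng.normal(size=(d, k)) > 0).astype(float)
activity = (attr_labels[:, 0] + attr_labels[:, 1]).astype(int)  # labels in {0, 1, 2}

# Step 1: attribute classifier mapping ST features to attribute labels.
attr_W = fit_ridge(st, attr_labels)

# Step 2: build the combined descriptor from ST features and refreshed attributes.
desc = ast_descriptor(st, attr_W)

# Step 3: objective classifier on the descriptor (one-vs-all via one-hot targets).
obj_W = fit_ridge(desc, np.eye(c)[activity])
pred = predict_ridge(obj_W, desc).argmax(axis=1)

print(desc.shape, pred.shape)
```

With 16 ST dimensions, 4 attributes and 6 attribute pairs, each descriptor row has 26 entries. The pairwise-product term is only one plausible way to represent relationships between attributes; the paper itself should be consulted for the actual formulation.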
Changhong CHEN
Nanjing University of Posts and Telecommunications
Hehe DOU
Nanjing University of Posts and Telecommunications
Zongliang GAN
Nanjing University of Posts and Telecommunications
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Changhong CHEN, Hehe DOU, Zongliang GAN, "Collective Activity Recognition by Attribute-Based Spatio-Temporal Descriptor" in IEICE TRANSACTIONS on Information,
vol. E98-D, no. 10, pp. 1875-1878, October 2015, doi: 10.1587/transinf.2015EDL8108.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2015EDL8108/_p
@ARTICLE{e98-d_10_1875,
author={Changhong CHEN and Hehe DOU and Zongliang GAN},
journal={IEICE TRANSACTIONS on Information},
title={Collective Activity Recognition by Attribute-Based Spatio-Temporal Descriptor},
year={2015},
volume={E98-D},
number={10},
pages={1875-1878},
keywords={},
doi={10.1587/transinf.2015EDL8108},
ISSN={1745-1361},
month={October},}
TY - JOUR
TI - Collective Activity Recognition by Attribute-Based Spatio-Temporal Descriptor
T2 - IEICE TRANSACTIONS on Information
SP - 1875
EP - 1878
AU - Changhong CHEN
AU - Hehe DOU
AU - Zongliang GAN
PY - 2015
DO - 10.1587/transinf.2015EDL8108
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E98-D
IS - 10
JA - IEICE TRANSACTIONS on Information
Y1 - October 2015
ER -