In this paper, we propose an extension of the Attention Branch Network (ABN) that uses instance segmentation to generate sharper attention maps for action recognition. Methods for visual explanation such as Grad-CAM usually generate blurry maps that are not intuitive for humans to understand, particularly when recognizing actions of people in videos. Our proposed method, Object-ABN, tackles this issue by introducing a new mask loss that makes the generated attention maps close to the instance segmentation result. Furthermore, the Prototype Conformity (PC) loss and multiple attention maps are introduced to enhance the sharpness of the maps and improve classification performance. Experimental results with UCF101 and SSv2 show that the maps generated by the proposed method are much clearer, qualitatively and quantitatively, than those of the original ABN.
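The abstract does not state the form of the mask loss, so the following is only a minimal sketch of the general idea, assuming a mean squared error between an attention map and a binary instance segmentation mask. The function name `mask_loss`, the nested-list layout, and the choice of squared error are all assumptions made for illustration and are not taken from the paper.

```python
def mask_loss(attention, mask):
    """Mean squared difference between an attention map and a binary mask.

    Assumption: both arguments are 2D lists of floats with the same shape.
    The loss actually used by Object-ABN is not given in the abstract.
    """
    if len(attention) != len(mask):
        raise ValueError("attention and mask must have the same height")
    total, count = 0.0, 0
    for att_row, mask_row in zip(attention, mask):
        if len(att_row) != len(mask_row):
            raise ValueError("attention and mask must have the same width")
        for a, m in zip(att_row, mask_row):
            total += (a - m) ** 2
            count += 1
    if count == 0:
        raise ValueError("attention and mask must not be empty")
    return total / count


# Toy 2x2 example (hypothetical values): one mask, two candidate maps.
mask = [[1.0, 0.0], [0.0, 0.0]]    # object occupies the top-left cell
blurry = [[0.5, 0.5], [0.5, 0.5]]  # attention spread evenly everywhere
sharp = [[1.0, 0.0], [0.0, 0.0]]   # attention concentrated on the object
```

Under this assumed loss, a map that matches the mask exactly scores zero and a uniformly blurry map scores higher, so minimizing it pushes attention toward the segmented region.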
Tomoya NITTA
Nagoya Institute of Technology
Tsubasa HIRAKAWA
Chubu University
Hironobu FUJIYOSHI
Chubu University
Toru TAMAKI
Nagoya Institute of Technology
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Tomoya NITTA, Tsubasa HIRAKAWA, Hironobu FUJIYOSHI, Toru TAMAKI, "Object-ABN: Learning to Generate Sharp Attention Maps for Action Recognition" in IEICE TRANSACTIONS on Information,
vol. E106-D, no. 3, pp. 391-400, March 2023, doi: 10.1587/transinf.2022EDP7138.
Abstract: In this paper, we propose an extension of the Attention Branch Network (ABN) that uses instance segmentation to generate sharper attention maps for action recognition. Methods for visual explanation such as Grad-CAM usually generate blurry maps that are not intuitive for humans to understand, particularly when recognizing actions of people in videos. Our proposed method, Object-ABN, tackles this issue by introducing a new mask loss that makes the generated attention maps close to the instance segmentation result. Furthermore, the Prototype Conformity (PC) loss and multiple attention maps are introduced to enhance the sharpness of the maps and improve classification performance. Experimental results with UCF101 and SSv2 show that the maps generated by the proposed method are much clearer, qualitatively and quantitatively, than those of the original ABN.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2022EDP7138/_p
@ARTICLE{e106-d_3_391,
author={Tomoya NITTA and Tsubasa HIRAKAWA and Hironobu FUJIYOSHI and Toru TAMAKI},
journal={IEICE TRANSACTIONS on Information},
title={Object-ABN: Learning to Generate Sharp Attention Maps for Action Recognition},
year={2023},
volume={E106-D},
number={3},
pages={391-400},
abstract={In this paper, we propose an extension of the Attention Branch Network (ABN) that uses instance segmentation to generate sharper attention maps for action recognition. Methods for visual explanation such as Grad-CAM usually generate blurry maps that are not intuitive for humans to understand, particularly when recognizing actions of people in videos. Our proposed method, Object-ABN, tackles this issue by introducing a new mask loss that makes the generated attention maps close to the instance segmentation result. Furthermore, the Prototype Conformity (PC) loss and multiple attention maps are introduced to enhance the sharpness of the maps and improve classification performance. Experimental results with UCF101 and SSv2 show that the maps generated by the proposed method are much clearer, qualitatively and quantitatively, than those of the original ABN.},
keywords={},
doi={10.1587/transinf.2022EDP7138},
ISSN={1745-1361},
month={March},}
TY - JOUR
TI - Object-ABN: Learning to Generate Sharp Attention Maps for Action Recognition
T2 - IEICE TRANSACTIONS on Information
SP - 391
EP - 400
AU - Tomoya NITTA
AU - Tsubasa HIRAKAWA
AU - Hironobu FUJIYOSHI
AU - Toru TAMAKI
PY - 2023
DO - 10.1587/transinf.2022EDP7138
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E106-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2023
AB - In this paper, we propose an extension of the Attention Branch Network (ABN) that uses instance segmentation to generate sharper attention maps for action recognition. Methods for visual explanation such as Grad-CAM usually generate blurry maps that are not intuitive for humans to understand, particularly when recognizing actions of people in videos. Our proposed method, Object-ABN, tackles this issue by introducing a new mask loss that makes the generated attention maps close to the instance segmentation result. Furthermore, the Prototype Conformity (PC) loss and multiple attention maps are introduced to enhance the sharpness of the maps and improve classification performance. Experimental results with UCF101 and SSv2 show that the maps generated by the proposed method are much clearer, qualitatively and quantitatively, than those of the original ABN.
ER -