Fine-grained image classification is a difficult problem, and previous studies mainly overcome this problem by locating multiple discriminative regions in different scales and then aggregating complementary information explored from the located regions. However, locating discriminative regions introduces heavy overhead and is not suitable for real-world application. In this paper, we propose the recursive multi-scale channel-spatial attention module (RMCSAM) for addressing this problem. Following the experience of previous research on fine-grained image classification, RMCSAM explores multi-scale attentional information. However, the attentional information is explored by recursively refining the deep feature maps of a convolutional neural network (CNN) to better correspond to multi-scale channel-wise and spatial-wise attention, instead of localizing attention regions. In this way, RMCSAM provides a lightweight module that can be inserted into standard CNNs. Experimental results show that RMCSAM can improve the classification accuracy and attention capturing ability over baselines. Also, RMCSAM performs better than other state-of-the-art attention modules in fine-grained image classification, and is complementary to some state-of-the-art approaches for fine-grained image classification. Code is available at https://github.com/Dichao-Liu/Recursive-Multi-Scale-Channel-Spatial-Attention-Module.
Dichao LIU
Nagoya University
Yu WANG
Ritsumeikan University
Kenji MASE
Nagoya University
Jien KATO
Ritsumeikan University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Dichao LIU, Yu WANG, Kenji MASE, Jien KATO, "Recursive Multi-Scale Channel-Spatial Attention for Fine-Grained Image Classification" in IEICE TRANSACTIONS on Information,
vol. E105-D, no. 3, pp. 713-726, March 2022, doi: 10.1587/transinf.2021EDP7166.
Abstract: Fine-grained image classification is a difficult problem, and previous studies mainly overcome this problem by locating multiple discriminative regions in different scales and then aggregating complementary information explored from the located regions. However, locating discriminative regions introduces heavy overhead and is not suitable for real-world application. In this paper, we propose the recursive multi-scale channel-spatial attention module (RMCSAM) for addressing this problem. Following the experience of previous research on fine-grained image classification, RMCSAM explores multi-scale attentional information. However, the attentional information is explored by recursively refining the deep feature maps of a convolutional neural network (CNN) to better correspond to multi-scale channel-wise and spatial-wise attention, instead of localizing attention regions. In this way, RMCSAM provides a lightweight module that can be inserted into standard CNNs. Experimental results show that RMCSAM can improve the classification accuracy and attention capturing ability over baselines. Also, RMCSAM performs better than other state-of-the-art attention modules in fine-grained image classification, and is complementary to some state-of-the-art approaches for fine-grained image classification. Code is available at https://github.com/Dichao-Liu/Recursive-Multi-Scale-Channel-Spatial-Attention-Module.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2021EDP7166/_p
Copy
@ARTICLE{e105-d_3_713,
author={Dichao LIU, Yu WANG, Kenji MASE, Jien KATO, },
journal={IEICE TRANSACTIONS on Information},
title={Recursive Multi-Scale Channel-Spatial Attention for Fine-Grained Image Classification},
year={2022},
volume={E105-D},
number={3},
pages={713-726},
abstract={Fine-grained image classification is a difficult problem, and previous studies mainly overcome this problem by locating multiple discriminative regions in different scales and then aggregating complementary information explored from the located regions. However, locating discriminative regions introduces heavy overhead and is not suitable for real-world application. In this paper, we propose the recursive multi-scale channel-spatial attention module (RMCSAM) for addressing this problem. Following the experience of previous research on fine-grained image classification, RMCSAM explores multi-scale attentional information. However, the attentional information is explored by recursively refining the deep feature maps of a convolutional neural network (CNN) to better correspond to multi-scale channel-wise and spatial-wise attention, instead of localizing attention regions. In this way, RMCSAM provides a lightweight module that can be inserted into standard CNNs. Experimental results show that RMCSAM can improve the classification accuracy and attention capturing ability over baselines. Also, RMCSAM performs better than other state-of-the-art attention modules in fine-grained image classification, and is complementary to some state-of-the-art approaches for fine-grained image classification. Code is available at https://github.com/Dichao-Liu/Recursive-Multi-Scale-Channel-Spatial-Attention-Module.},
keywords={},
doi={10.1587/transinf.2021EDP7166},
ISSN={1745-1361},
month={March},}
Copy
TY - JOUR
TI - Recursive Multi-Scale Channel-Spatial Attention for Fine-Grained Image Classification
T2 - IEICE TRANSACTIONS on Information
SP - 713
EP - 726
AU - Dichao LIU
AU - Yu WANG
AU - Kenji MASE
AU - Jien KATO
PY - 2022
DO - 10.1587/transinf.2021EDP7166
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E105-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2022
AB - Fine-grained image classification is a difficult problem, and previous studies mainly overcome this problem by locating multiple discriminative regions in different scales and then aggregating complementary information explored from the located regions. However, locating discriminative regions introduces heavy overhead and is not suitable for real-world application. In this paper, we propose the recursive multi-scale channel-spatial attention module (RMCSAM) for addressing this problem. Following the experience of previous research on fine-grained image classification, RMCSAM explores multi-scale attentional information. However, the attentional information is explored by recursively refining the deep feature maps of a convolutional neural network (CNN) to better correspond to multi-scale channel-wise and spatial-wise attention, instead of localizing attention regions. In this way, RMCSAM provides a lightweight module that can be inserted into standard CNNs. Experimental results show that RMCSAM can improve the classification accuracy and attention capturing ability over baselines. Also, RMCSAM performs better than other state-of-the-art attention modules in fine-grained image classification, and is complementary to some state-of-the-art approaches for fine-grained image classification. Code is available at https://github.com/Dichao-Liu/Recursive-Multi-Scale-Channel-Spatial-Attention-Module.
ER -