
The Comparison of Attention Mechanisms with Different Embedding Modes for Performance Improvement of Fine-Grained Classification

Wujian YE, Run TAN, Yijun LIU, Chin-Chen CHANG

Summary:

Fine-grained image classification is one of the key basic tasks in computer vision. Combining a traditional deep convolutional neural network (DCNN) with an attention mechanism allows the model to focus on partial and local features of fine-grained images, but existing work still overlooks how different attention modules are embedded into the network, leading to unsatisfactory classification performance. To address this problem, three different attention mechanisms, namely the SE, CBAM, and ECA modules, are introduced into DCNN backbones (such as ResNet and VGGNet) so that the DCNN can better focus on the key local features of salient regions in an image. At the same time, we adopt three different embedding modes for the attention modules, including serial, residual, and parallel modes, to further improve the performance of the classification model. The experimental results show that the three attention modules combined with the three embedding modes can effectively improve the performance of the DCNN network. Moreover, compared with SE and ECA, CBAM has a stronger feature-extraction capability. Among the configurations, CBAM embedded in parallel mode makes the local information attended to by the DCNN richer and more accurate and brings the best results, with accuracy 1.98% and 1.57% higher than that of the original VGG16 and ResNet34, respectively, on the CUB-200-2011 dataset. The visualization analysis also indicates that the attention modules can be easily embedded into DCNN networks, especially in parallel mode, with strong generality and universality.
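To make the three embedding modes concrete, the following is a minimal PyTorch sketch, not the authors' code: it places an SE-style channel-attention module inside a ResNet-style basic block in serial, residual, or parallel mode. The class and parameter names (SEBlock, AttnBasicBlock, mode) are illustrative assumptions, and the exact wiring of each mode follows one plausible reading of the abstract.

# Hedged sketch (not the paper's implementation): SE-style channel attention
# embedded in a ResNet-style basic block via three modes described above.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation channel attention."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                       # squeeze: global average pool
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                                  # per-channel weights in (0, 1)
        )

    def forward(self, x):
        return x * self.fc(x)                              # reweight channels

class AttnBasicBlock(nn.Module):
    """Basic block with attention embedded in 'serial', 'residual', or 'parallel' mode (illustrative)."""
    def __init__(self, channels, mode="parallel"):
        super().__init__()
        self.mode = mode
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.attn = SEBlock(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        f = self.body(x)
        if self.mode == "serial":
            # attention applied to the conv features before the skip connection
            out = x + self.attn(f)
        elif self.mode == "residual":
            # attention output added back onto the conv features
            out = x + f + self.attn(f)
        else:  # parallel
            # attention branch runs on the block input, alongside the conv branch
            out = x + f + self.attn(x)
        return self.relu(out)

if __name__ == "__main__":
    x = torch.randn(2, 64, 56, 56)
    for mode in ("serial", "residual", "parallel"):
        print(mode, AttnBasicBlock(64, mode=mode)(x).shape)  # torch.Size([2, 64, 56, 56])

In this sketch, CBAM or ECA could be swapped in for SEBlock without changing the surrounding block, which is what the abstract means by attention modules being easily embedded into DCNN networks.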

Publication
IEICE TRANSACTIONS on Information Vol.E106-D No.5 pp.590-600
Publication Date
2023/05/01
Publicized
2021/12/22
Online ISSN
1745-1361
DOI
10.1587/transinf.2022DLP0006
Type of Manuscript
Special Section PAPER (Special Section on Deep Learning Technologies: Architecture, Optimization, Techniques, and Applications)
Category
Core Methods

Authors

Wujian YE
  Guangdong University of Technology
Run TAN
  Guangdong University of Technology
Yijun LIU
  Guangdong University of Technology
Chin-Chen CHANG
  Feng Chia University

Keyword