Convolutional Neural Networks (CNNs) and Transformers have achieved remarkable performance in detection and classification tasks. Nevertheless, neither feature extractor captures both local and global information, which leaves room to improve detection and classification performance. In addition, deep learning networks are becoming increasingly complex, and the computation and storage they require have grown significantly. This paper proposes a combination of CNN and Transformer and designs a local feature enhancement module and a global context modeling module to enhance the cascade network. The local feature enhancement module increases the range of feature extraction, while the global context modeling module captures the global information of the feature maps. To decrease model complexity, a shared sublayer is designed to share weight parameters between adjacent convolutional layers or across convolutional layers, thereby reducing the number of convolutional weight parameters. Moreover, to improve the detection performance of neural networks without increasing network parameters, an optimal transport assignment approach is proposed to resolve the problem of label assignment. The classification loss and regression loss are the summations of the cost between the demander and supplier. Experimental results demonstrate that the proposed Combination of CNN and Transformer with Shared Sublayer (CCTSS) performs better than state-of-the-art methods on various datasets and applications.
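The parameter saving that the abstract attributes to the shared sublayer can be illustrated with simple counting. The sketch below is not the paper's implementation: the function names, the `share_every` grouping scheme, and the layer sizes (four 3x3 convolutions with 64 channels) are all assumptions chosen for illustration. It only shows how reusing one weight tensor across a group of same-shape convolutional layers reduces the number of stored weights.

```python
def conv_params(c_in, c_out, k, bias=True):
    """Learnable parameters in one k x k convolution (weights plus optional bias)."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def stack_params(num_layers, channels, k, share_every=1):
    """Stored parameters for `num_layers` same-shape convolutional layers.

    share_every is a hypothetical knob: 1 means every layer owns its weights,
    2 means each pair of adjacent layers reuses a single weight tensor.
    """
    unique_layers = -(-num_layers // share_every)  # ceiling division
    return unique_layers * conv_params(channels, channels, k)


# Assumed example sizes, not taken from the paper.
baseline = stack_params(num_layers=4, channels=64, k=3)
shared = stack_params(num_layers=4, channels=64, k=3, share_every=2)

print("without sharing:", baseline)
print("with pairwise sharing:", shared)
```

Under these assumed sizes, pairwise sharing halves the stored convolutional parameters. The amount of computation per forward pass is unchanged in this sketch, since every layer still runs.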
Aorui GOU
Fudan University
Jingjing LIU
Fudan University, Shanghai University
Xiaoxiang CHEN
Fudan University
Xiaoyang ZENG
Fudan University
Yibo FAN
Fudan University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Aorui GOU, Jingjing LIU, Xiaoxiang CHEN, Xiaoyang ZENG, Yibo FAN, "CCTSS: The Combination of CNN and Transformer with Shared Sublayer for Detection and Classification" in IEICE TRANSACTIONS on Fundamentals,
vol. E107-A, no. 1, pp. 141-156, January 2024, doi: 10.1587/transfun.2022EAP1157.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/transfun.2022EAP1157/_p
@ARTICLE{e107-a_1_141,
author={Aorui GOU and Jingjing LIU and Xiaoxiang CHEN and Xiaoyang ZENG and Yibo FAN},
journal={IEICE TRANSACTIONS on Fundamentals},
title={CCTSS: The Combination of CNN and Transformer with Shared Sublayer for Detection and Classification},
year={2024},
volume={E107-A},
number={1},
pages={141--156},
keywords={},
doi={10.1587/transfun.2022EAP1157},
ISSN={1745-1337},
month={January},}
TY - JOUR
TI - CCTSS: The Combination of CNN and Transformer with Shared Sublayer for Detection and Classification
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 141
EP - 156
AU - Aorui GOU
AU - Jingjing LIU
AU - Xiaoxiang CHEN
AU - Xiaoyang ZENG
AU - Yibo FAN
PY - 2024
DO - 10.1587/transfun.2022EAP1157
JO - IEICE TRANSACTIONS on Fundamentals
SN - 1745-1337
VL - E107-A
IS - 1
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - January 2024
ER -