CCTSS: The Combination of CNN and Transformer with Shared Sublayer for Detection and Classification

Aorui GOU; Jingjing LIU; Xiaoxiang CHEN; Xiaoyang ZENG; Yibo FAN

doi:10.1587/transfun.2022EAP1157

IEICE TRANSACTIONS on Fundamentals

Summary:

Convolutional Neural Networks (CNNs) and Transformers have achieved remarkable performance in detection and classification tasks. Nevertheless, neither extracts features that capture both local and global information, so detection and classification performance can be improved further. In addition, deep learning networks are becoming increasingly complex, and the computation and storage space they require have grown significantly. This paper proposes a combination of CNN and Transformer, and designs a local feature enhancement module and a global context modeling module to enhance the cascade network. The local feature enhancement module increases the range of feature extraction, while the global context modeling module captures the global information of the feature maps. To decrease model complexity, a shared sublayer is designed to share weight parameters between adjacent convolutional layers or across convolutional layers, thereby reducing the number of convolutional weight parameters. Moreover, to improve the detection performance of neural networks without increasing network parameters, an optimal transport assignment approach is proposed to resolve the problem of label assignment. The classification and regression losses are computed as sums of the costs between demanders and suppliers. The experimental results demonstrate that the proposed Combination of CNN and Transformer with Shared Sublayer (CCTSS) performs better than state-of-the-art methods on various datasets and applications.
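
The supplier and demander terminology in the summary comes from optimal transport. The following is a minimal sketch of how label assignment can be posed that way. It is not the authors' implementation: it assumes that each ground-truth object acts as a supplier offering `k` positive labels, that a background supplier covers the remaining anchors, that each anchor is a demander needing one label, and that the plan is solved with Sinkhorn iterations. The function names, the cost values, and the parameters `k`, `eps`, and `iters` are all hypothetical.

```python
import math


def sinkhorn(cost, supply, demand, eps=0.1, iters=200):
    """Entropy-regularised optimal transport solved with Sinkhorn iterations.

    cost[i][j] is the cost of moving one unit from supplier i to demander j.
    Returns the transport plan as a list of rows (one row per supplier).
    """
    m, n = len(supply), len(demand)
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(n)] for i in range(m)]
    u, v = [1.0] * m, [1.0] * n
    for _ in range(iters):
        # Rescale rows to match the supply, then columns to match the demand.
        u = [supply[i] / sum(kernel[i][j] * v[j] for j in range(n)) for i in range(m)]
        v = [demand[j] / sum(kernel[i][j] * u[i] for i in range(m)) for j in range(n)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(n)] for i in range(m)]


def assign_labels(gt_cost, bg_cost, k):
    """Give every anchor one label: a ground-truth index, or len(gt_cost) for background.

    gt_cost[i][j]: assumed classification + regression cost between object i and anchor j.
    bg_cost[j]:    assumed cost of labelling anchor j as background.
    k:             assumed number of positive labels each object supplies.
    """
    num_gt, num_anchors = len(gt_cost), len(bg_cost)
    cost = gt_cost + [bg_cost]
    # Total supply equals total demand: k per object, the rest from background.
    supply = [float(k)] * num_gt + [float(num_anchors - k * num_gt)]
    demand = [1.0] * num_anchors
    plan = sinkhorn(cost, supply, demand)
    # Each anchor takes the supplier that ships it the most mass.
    return [max(range(num_gt + 1), key=lambda i: plan[i][j]) for j in range(num_anchors)]


# Hypothetical costs for 2 objects and 6 anchors.
gt_cost = [
    [0.1, 0.2, 2.0, 2.0, 2.0, 2.0],  # object 0 matches anchors 0 and 1 well
    [2.0, 2.0, 0.1, 0.2, 2.0, 2.0],  # object 1 matches anchors 2 and 3 well
]
bg_cost = [1.0] * 6
labels = assign_labels(gt_cost, bg_cost, k=2)
print(labels)
```

In this toy case the two anchors with the lowest cost for each object receive that object's label, and the last two anchors fall to background (index 2). Because assignment only changes which targets the losses are computed against, it adds no network parameters, which is consistent with the claim in the summary.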

Publication: IEICE TRANSACTIONS on Fundamentals Vol.E107-A No.1 pp.141-156

Publication Date: 2024/01/01

Publicized: 2023/07/06

Online ISSN: 1745-1337

DOI: 10.1587/transfun.2022EAP1157

Type of Manuscript: PAPER

Category: Image

Authors

Aorui GOU
  Fudan University
Jingjing LIU
  Fudan University, Shanghai University
Xiaoxiang CHEN
  Fudan University
Xiaoyang ZENG
  Fudan University
Yibo FAN
  Fudan University

Keywords

convolutional neural networks, coefficient sharing, convolution decomposition, shared sublayer, transformers, optimal transport

Cite this

Aorui GOU, Jingjing LIU, Xiaoxiang CHEN, Xiaoyang ZENG, Yibo FAN, "CCTSS: The Combination of CNN and Transformer with Shared Sublayer for Detection and Classification" in IEICE TRANSACTIONS on Fundamentals, vol. E107-A, no. 1, pp. 141-156, January 2024, doi: 10.1587/transfun.2022EAP1157.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/transfun.2022EAP1157/_p

@ARTICLE{e107-a_1_141,
author={Aorui GOU and Jingjing LIU and Xiaoxiang CHEN and Xiaoyang ZENG and Yibo FAN},
journal={IEICE TRANSACTIONS on Fundamentals},
title={CCTSS: The Combination of CNN and Transformer with Shared Sublayer for Detection and Classification},
year={2024},
volume={E107-A},
number={1},
pages={141-156},
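keywords={convolutional neural networks, coefficient sharing, convolution decomposition, shared sublayer, transformers, optimal transport},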
doi={10.1587/transfun.2022EAP1157},
ISSN={1745-1337},
month={January},}

TY - JOUR
TI - CCTSS: The Combination of CNN and Transformer with Shared Sublayer for Detection and Classification
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 141
EP - 156
AU - Aorui GOU
AU - Jingjing LIU
AU - Xiaoxiang CHEN
AU - Xiaoyang ZENG
AU - Yibo FAN
PY - 2024
DO - 10.1587/transfun.2022EAP1157
JO - IEICE TRANSACTIONS on Fundamentals
SN - 1745-1337
VL - E107-A
IS - 1
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - January 2024
ER -