IEICE TRANSACTIONS on Information

Robust Visual Tracking Using Hierarchical Vision Transformer with Shifted Windows Multi-Head Self-Attention

Peng GAO, Xin-Yue ZHANG, Xiao-Li YANG, Jian-Cheng NI, Fei WANG

Summary :

Although Siamese trackers have attracted much attention in recent years for their scalability and efficiency, researchers have ignored background appearance, which limits the trackers' ability to recognize arbitrary target objects under appearance variations, especially in complex scenarios with background clutter and distractors. In this paper, we present a simple yet effective Siamese tracker in which shifted windows multi-head self-attention is used to learn the characteristics of a specific given target object for visual tracking. To validate the effectiveness of the proposed tracker, we use the Swin Transformer as the backbone network and introduce an auxiliary feature enhancement network. Extensive experimental results on two evaluation datasets demonstrate that the proposed tracker outperforms other baselines.
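
As an illustration of the shifted-windows idea named in the summary, the following is a minimal sketch of how tokens on a feature map might be grouped into local attention windows, first on a regular grid and then on a cyclically shifted grid. It shows only the grouping step, not the attention computation and not the tracker described in the letter. The function names `partition` and `cyclic_shift`, the 4x4 map, and the window size of 2 are assumptions chosen for illustration; a full Swin-style layer would additionally apply learned projections, multiple heads, and a mask so that wrapped-around tokens do not attend to each other.

```python
def partition(grid, window):
    """Split a 2D grid (list of rows) into non-overlapping window x window blocks.

    Each block is returned as a flat list in row-major order.
    """
    h, w = len(grid), len(grid[0])
    assert h % window == 0 and w % window == 0, "grid must divide evenly"
    blocks = []
    for top in range(0, h, window):
        for left in range(0, w, window):
            blocks.append([grid[r][c]
                           for r in range(top, top + window)
                           for c in range(left, left + window)])
    return blocks


def cyclic_shift(grid, shift):
    """Roll the grid up and left by `shift` cells, wrapping around the edges."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r + shift) % h][(c + shift) % w] for c in range(w)]
            for r in range(h)]


# Hypothetical 4x4 feature map whose entries are token ids 0..15.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
window = 2

# Regular windows: self-attention would be computed inside each block only.
regular = partition(grid, window)

# Shifted windows: shift by half the window size, then partition again,
# so each new block mixes tokens from neighbouring regular blocks.
shifted = partition(cyclic_shift(grid, window // 2), window)

print(regular[0])  # [0, 1, 4, 5]
print(shifted[0])  # [5, 6, 9, 10]
```

In this toy case the first shifted window holds tokens 5, 6, 9, and 10, one from each of the four regular windows, which is how alternating regular and shifted layers can pass information across window boundaries.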

Publication
IEICE TRANSACTIONS on Information Vol.E107-D No.1 pp.161-164
Publication Date
2024/01/01
Publicized
2023/10/20
Online ISSN
1745-1361
DOI
10.1587/transinf.2023EDL8053
Type of Manuscript
LETTER
Category
Image Recognition, Computer Vision

Authors

Peng GAO
  Qufu Normal University
Xin-Yue ZHANG
  Qufu Normal University
Xiao-Li YANG
  Qufu Normal University
Jian-Cheng NI
  Qufu Normal University
Fei WANG
  Harbin Institute of Technology

Keyword