
Keyword Search Result

[Keyword] visual tracking (29 hits)

Showing results 1-20 of 29

  • Robust Visual Tracking Using Hierarchical Vision Transformer with Shifted Windows Multi-Head Self-Attention

    Peng GAO  Xin-Yue ZHANG  Xiao-Li YANG  Jian-Cheng NI  Fei WANG  

     
    LETTER-Image Recognition, Computer Vision

    Publicized: 2023/10/20  Vol: E107-D No:1  Page(s): 161-164

    Although Siamese trackers have attracted much attention in recent years for their scalability and efficiency, they largely ignore background appearance, which makes it hard to recognize arbitrary target objects under large appearance variations, especially in complex scenarios with background clutter and distractors. In this paper, we present a simple yet effective Siamese tracker in which shifted-windows multi-head self-attention is employed to learn the characteristics of a specific target object for visual tracking. We use the Swin Transformer as the backbone network and introduce an auxiliary feature enhancement network. Extensive experimental results on two evaluation datasets demonstrate that the proposed tracker outperforms other baselines.
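
    As an illustration of the matching step shared by such Siamese trackers, the sketch below cross-correlates template features with search-region features to produce a response map. It assumes PyTorch; `backbone` is a hypothetical stand-in for the paper's Swin Transformer with its auxiliary feature enhancement network, which is not reproduced here.

    ```python
    # Minimal Siamese matching step: the template features act as a
    # depth-wise convolution kernel over the search-region features.
    import torch
    import torch.nn.functional as F

    def siamese_response(backbone, template_img, search_img):
        z = backbone(template_img)   # e.g. (1, C, 7, 7)   template features
        x = backbone(search_img)     # e.g. (1, C, 31, 31) search features
        # Depth-wise cross-correlation, one template channel per feature channel.
        r = F.conv2d(x, z.permute(1, 0, 2, 3), groups=z.size(1))
        return r.sum(dim=1, keepdim=True)  # response map; peak ~ target position
    ```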

  • Convolutional Neural Networks Based Dictionary Pair Learning for Visual Tracking

    Chenchen MENG  Jun WANG  Chengzhi DENG  Yuanyun WANG  Shengqian WANG  

     
    PAPER-Vision

    Publicized: 2022/02/21  Vol: E105-A No:8  Page(s): 1147-1156

    Feature representation is a key component of most visual tracking algorithms. Low-level hand-crafted features have weak representation capacity and struggle with complex appearance changes. In this paper, we propose a novel tracking algorithm that combines joint dictionary pair learning with convolutional neural networks (CNNs). We utilize a CNN model trained on ImageNet-Vid, with three convolutional layers and two fully connected layers, to extract target features; a dictionary pair learning module follows the second fully connected layer. The joint dictionary pair is learned on the deep features extracted by the trained CNN model, and the dictionary learning captures the temporal variations of target appearance. We use the learned dictionaries to encode target candidates, representing each candidate as a linear combination of atoms in the learned dictionary. Extensive experimental evaluations on OTB2015 demonstrate superior performance against state-of-the-art trackers.
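
    A minimal sketch of the candidate-encoding step, using a ridge-regularized coding in closed form; the paper's joint dictionary pair learning itself is not reproduced, and `lam` is an illustrative parameter.

    ```python
    # Encode a target candidate as a linear combination of dictionary atoms.
    import numpy as np

    def encode_candidate(D, x, lam=0.01):
        """D: (d, k) learned dictionary of k atoms; x: (d,) deep feature."""
        # c = argmin_c ||x - D c||^2 + lam ||c||^2  (closed form)
        c = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ x)
        err = np.linalg.norm(x - D @ c)  # low error => candidate resembles target
        return c, err
    ```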

  • Reinforced Tracker Based on Hierarchical Convolutional Features

    Xin ZENG  Lin ZHANG  Zhongqiang LUO  Xingzhong XIONG  Chengjie LI  

     
    PAPER-Image Processing and Video Processing

    Publicized: 2022/03/10  Vol: E105-D No:6  Page(s): 1225-1233

    Visual tracking has improved steadily in recent years, but many methods still suffer from low accuracy and success rates, and the more accurate trackers tend to cost more time. To address this trade-off, we propose a reinforced tracker based on Hierarchical Convolutional Features (HCF). HOG, color-naming and grayscale features are combined with different weights to supplement the convolutional features, which enhances tracking robustness. At the same time, we improve the model update strategy to save time. The tracker is called RHCF and the code is published at https://github.com/z15846/RHCF. Experiments on the OTB2013 dataset show that our tracker achieves clear gains in accuracy and success rate.
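
    A minimal sketch of the weighted fusion idea: per-feature response maps are blended with fixed weights before locating the peak. The weights below are illustrative, not the values used in the paper.

    ```python
    import numpy as np

    def fused_response(resp_conv, resp_hog, resp_cn, resp_gray,
                       weights=(0.6, 0.2, 0.1, 0.1)):
        # Blend the convolutional response with the hand-crafted ones.
        fused = sum(w * m for w, m in
                    zip(weights, (resp_conv, resp_hog, resp_cn, resp_gray)))
        return np.unravel_index(np.argmax(fused), fused.shape)  # target position
    ```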

  • Siamese Visual Tracking with Dual-Pipeline Correlated Fusion Network

    Ying KANG  Cong LIU  Ning WANG  Dianxi SHI  Ning ZHOU  Mengmeng LI  Yunlong WU  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2021/07/09  Vol: E104-D No:10  Page(s): 1702-1711

    Siamese visual tracking, viewed as a problem of maximum-similarity matching to the target template, has attracted increasing attention in computer vision. However, current Siamese trackers struggle to balance accuracy in real-time tracking against robustness in long-term tracking. This work proposes a new Siamese tracker with a dual-pipeline correlated fusion network (named ADF-SiamRPN), which consists of an initial template for robust correlation and a transient template with adaptive optimal feature selection for accurate correlation. A learnable correlation-response fusion network then combines the two responses, pursuing an overall improvement in tracking performance. To compare ADF-SiamRPN with state-of-the-art trackers, we conduct extensive experiments on benchmarks such as OTB100, UAV123, VOT2016, VOT2018, GOT-10k, LaSOT and TrackingNet. The results demonstrate that ADF-SiamRPN outperforms all compared trackers and achieves the best balance between accuracy and robustness.
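
    One plausible form of a learnable correlation-response fusion, sketched as a 1x1 convolution over the two stacked response maps; this is an assumption for illustration, and the paper's actual fusion network may differ.

    ```python
    import torch
    import torch.nn as nn

    class ResponseFusion(nn.Module):
        """Learn how to blend the initial-template and transient-template responses."""
        def __init__(self):
            super().__init__()
            self.merge = nn.Conv2d(2, 1, kernel_size=1)  # learnable blend weights

        def forward(self, resp_initial, resp_transient):
            stacked = torch.cat([resp_initial, resp_transient], dim=1)  # (N, 2, H, W)
            return self.merge(stacked)                                  # (N, 1, H, W)
    ```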

  • Correlation Filter-Based Visual Tracking Using Confidence Map and Adaptive Model

    Zhaoqian TANG  Kaoru ARAKAWA  

     
    PAPER-Vision

    Vol: E103-A No:12  Page(s): 1512-1519

    Recently, visual trackers based on the kernelized correlation filter (KCF) framework have achieved robust and accurate results. These trackers must learn information about the object from each frame, so state changes of the object affect tracking performance. To deal with state changes, we propose a novel KCF tracker that uses the filter response map, namely a confidence map, together with an adaptive model. The method first applies a skipped scale pool that varies the window size every two frames. Second, the location of the object is estimated by combining the filter response with the similarity of the luminance histogram at multiple points in the confidence map; re-detection at the multiple peaks of the confidence map prevents target drift and reduces the influence of illumination. Third, the learning rate for the object model is adjusted according to the filter response and the luminance-histogram similarity, reflecting the state of the object. Experimentally, the proposed tracker (CFCA) achieves outstanding performance on the challenging benchmark sequences OTB2013 and OTB2015.
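
    A minimal sketch of a confidence-driven model update of this kind; the base rate and the way confidence is formed from the two cues are illustrative assumptions, not values from the paper.

    ```python
    import numpy as np

    def update_model(model, new_obs, peak_response, hist_similarity,
                     base_rate=0.02):
        confidence = peak_response * hist_similarity  # both assumed in [0, 1]
        eta = base_rate * confidence                  # low confidence => slower update
        return (1.0 - eta) * model + eta * new_obs
    ```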

  • Combining Siamese Network and Regression Network for Visual Tracking

    Yao GE  Rui CHEN  Ying TONG  Xuehong CAO  Ruiyu LIANG  

     
    LETTER-Image Recognition, Computer Vision

    Publicized: 2020/05/13  Vol: E103-D No:8  Page(s): 1924-1927

    We combine a Siamese network with a recurrent regression network, proposing a two-stage tracking framework termed SiamReg. Our method addresses the inability of the classic Siamese network to judge the target size precisely, and it simplifies the regression procedures in training and testing. We perform experiments on three challenging tracking datasets: VOT2016, OTB100, and VOT2018. The results indicate that, after offline training, SiamReg obtains a higher expected average overlap measure.

  • Prediction-Based Scale Adaptive Correlation Filter Tracker

    Zuopeng ZHAO  Hongda ZHANG  Yi LIU  Nana ZHOU  Han ZHENG  Shanyi SUN  Xiaoman LI  Sili XIA  

     
    LETTER-Image Recognition, Computer Vision

    Publicized: 2019/07/30  Vol: E102-D No:11  Page(s): 2267-2271

    Although correlation filter-based trackers have demonstrated excellent performance for visual object tracking, there remain several challenges to be addressed. In this work, we propose a novel tracker based on the correlation filter framework. Traditional trackers face difficulty in accurately adapting to changes in the scale of the target when the target moves quickly. To address this, we suggest a scale adaptive scheme based on prediction scales. We also incorporate a speed-based adaptive model update method to further improve overall tracking performance. Experiments with samples from the OTB100 and KITTI datasets demonstrate that our method outperforms existing state-of-the-art tracking algorithms in fast motion scenes.
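
    One plausible reading of the speed-based update, sketched below; the mapping from speed to learning rate is an illustrative assumption, not the paper's formula.

    ```python
    import numpy as np

    def speed_adaptive_rate(prev_pos, cur_pos, base_rate=0.025, scale=0.1):
        # Faster inter-frame motion => more cautious model update.
        speed = np.linalg.norm(np.asarray(cur_pos) - np.asarray(prev_pos))
        return base_rate / (1.0 + scale * speed)
    ```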

  • Faster-ADNet for Visual Tracking

    Tiansa ZHANG  Chunlei HUO  Zhiqiang ZHOU  Bo WANG  

     
    LETTER-Image Recognition, Computer Vision

    Publicized: 2018/12/12  Vol: E102-D No:3  Page(s): 684-687

    By taking advantage of deep learning and reinforcement learning, ADNet (Action Decision Network) outperforms other approaches. However, its speed and performance are still limited by factors such as unreliable confidence-score estimation and redundant historical actions. To address these limitations, this paper proposes a faster and more accurate approach named Faster-ADNet. By optimizing the tracking process with a status re-identification network, the proposed approach is more efficient and six times faster than ADNet, while accuracy and stability are enhanced by removing historical actions. Experiments demonstrate the advantages of Faster-ADNet.

  • Real-Time Sparse Visual Tracking Using Circulant Reverse Lasso Model

    Chenggang GUO  Dongyi CHEN  Zhiqi HUANG  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2018/10/09  Vol: E102-D No:1  Page(s): 175-184

    Sparse representation has been successfully applied to visual tracking. Recent progress in sparse tracking has mainly been made within the particle filter framework. However, most sparse trackers need to extract complex feature representations for each particle in the limited sample space, leading to expensive computation and inferior tracking performance. To deal with these issues, we propose a novel sparse tracking method based on the circulant reverse lasso model. Benefiting from the properties of circulant matrices, densely sampled target candidates are implicitly generated by cyclically shifting the base feature descriptors and then embedded into a reverse sparse reconstruction model as a dictionary to encode a robust appearance template. The alternating direction method of multipliers is employed to solve the reverse sparse model, and the optimization can be carried out efficiently in the frequency domain, which enables the proposed tracker to run in real time. The calculated sparse coefficient map represents the similarity scores between the template and the circularly shifted samples, so the target location can be predicted directly from the coordinates of the peak coefficient. A scale-aware template updating strategy is combined with correlation filter template learning to account for both appearance deformations and scale variations. Quantitative and qualitative evaluations on two challenging tracking benchmarks demonstrate that the proposed algorithm performs favorably against several state-of-the-art sparse representation based tracking methods.
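
    The circulant trick at the core of this family of methods fits in a few lines: every cyclic shift of a base patch is scored against a template at once via the convolution theorem, without materializing the shifted samples. The reverse lasso and its ADMM solver from the paper are not reproduced here.

    ```python
    import numpy as np

    def all_shift_scores(template, base_patch):
        """Correlation of `template` with every cyclic shift of `base_patch`."""
        scores = np.fft.ifft2(np.conj(np.fft.fft2(template))
                              * np.fft.fft2(base_patch)).real
        return scores  # scores[dy, dx] = <template, patch shifted by (dy, dx)>
    ```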

  • Accurate Scale Adaptive and Real-Time Visual Tracking with Correlation Filters

    Jiatian PI  Shaohua ZENG  Qing ZUO  Yan WEI  

     
    LETTER-Image Recognition, Computer Vision

    Publicized: 2018/07/27  Vol: E101-D No:11  Page(s): 2855-2858

    Visual tracking has been studied for several decades but continues to draw significant attention because of its critical role in many applications. This letter addresses the fixed template size problem of the Kernelized Correlation Filter (KCF) tracker with no significant decrease in speed. Extensive experiments are performed on the new OTB dataset.

  • An Efficient Misalignment Method for Visual Tracking Based on Sparse Representation

    Shan JIANG  Cheng HAN  Xiaoqiang DI  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2018/05/14  Vol: E101-D No:8  Page(s): 2123-2131

    Sparse representation has been widely applied to visual tracking for several years. In the sparse representation framework, the tracking problem is cast as an L1-minimization problem. During tracking, however, the appearance of the target is affected by the external environment. We therefore propose a robust tracking algorithm that combines traditional sparse representation with a particle filter framework. First, we obtain the observation image set from the particle filter. We then introduce a 2D transformation on the observation image set, which makes the set of target candidates more robust to misalignment in complex scenes. Moreover, we adopt an occlusion detection mechanism before template updating, which effectively reduces drift. Experimental evaluations on five challenging public sequences exhibiting occlusion, illumination variation, scale changes and motion blur demonstrate the accuracy and robustness of our tracker in comparison with the state of the art.
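
    A minimal sketch of the L1-minimization step, coding a candidate over the template set with a few ISTA iterations; the 2D transformation and the occlusion detection mechanism are not shown.

    ```python
    import numpy as np

    def l1_code(T, y, lam=0.05, n_iter=200):
        """Solve c = argmin_c 0.5*||y - T c||^2 + lam*||c||_1 by ISTA.
        T: (d, k) template matrix; y: (d,) candidate observation."""
        L = np.linalg.norm(T, 2) ** 2     # Lipschitz constant of the gradient
        c = np.zeros(T.shape[1])
        for _ in range(n_iter):
            g = T.T @ (T @ c - y)         # gradient of the smooth term
            v = c - g / L
            c = np.sign(v) * np.maximum(np.abs(v) - lam / L, 0.0)  # soft threshold
        return c
    ```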

  • Regularized Kernel Representation for Visual Tracking

    Jun WANG  Yuanyun WANG  Chengzhi DENG  Shengqian WANG  Yong QIN  

     
    PAPER-Digital Signal Processing

    Vol: E101-A No:4  Page(s): 668-677

    Developing a robust appearance model is challenging because of appearance variations such as partial occlusion, illumination variation, rotation and background clutter. Existing tracking algorithms employ linear combinations of target templates to represent target appearances, which are not accurate enough to deal with such variations: the underlying relationship between target candidates and target templates is highly nonlinear. To address this, this paper presents a regularized kernel representation for visual tracking. The feature vectors of target appearances are mapped into a higher-dimensional feature space, in which a target candidate is approximately represented by a nonlinear combination of target templates. The kernel-based appearance model thus captures the nonlinear relationship and nonlinear similarity between target candidates and target templates, and l2-regularization on the coding coefficients makes the approximate solution of target representations more stable. Comprehensive experiments demonstrate superior performance in comparison with state-of-the-art trackers.
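
    A minimal sketch of kernelized coding with l2-regularized coefficients, assuming a Gaussian kernel; the paper's kernel choice and parameters may differ.

    ```python
    import numpy as np

    def gaussian_kernel(A, B, sigma=1.0):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))

    def kernel_code(templates, candidate, lam=0.1, sigma=1.0):
        """templates: (k, d); candidate: (d,). Returns coefficients (k,)."""
        K = gaussian_kernel(templates, templates, sigma)               # (k, k)
        kappa = gaussian_kernel(templates, candidate[None, :], sigma)[:, 0]
        # Minimizing ||phi(y) - Phi^T c||^2 + lam ||c||^2 gives (K + lam I) c = kappa.
        return np.linalg.solve(K + lam * np.eye(len(K)), kappa)
    ```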

  • Deep Correlation Tracking with Backtracking

    Yulong XU  Yang LI  Jiabao WANG  Zhuang MIAO  Hang LI  Yafei ZHANG  Gang TAO  

     
    LETTER-Vision

    Vol: E100-A No:7  Page(s): 1601-1605

    The feature extractor is an important component of a tracker, and convolutional neural networks (CNNs) have demonstrated excellent performance in visual tracking. However, CNN features do not perform well under low illumination. To address this issue, we propose a novel deep correlation tracker with backtracking, which consists of target translation, backtracking and scale estimation. We employ four correlation filters, one with a histogram of oriented gradients (HOG) descriptor and the other three with CNN features, to estimate the translation. In particular, we propose a backtracking algorithm to reconfirm the translation location. Comprehensive experiments on a large-scale challenging benchmark dataset show that the proposed algorithm outperforms state-of-the-art methods in accuracy and robustness.
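
    One plausible form of such a backtracking check is a forward-backward consistency test, sketched below; this is an assumption for illustration, not necessarily the paper's exact algorithm.

    ```python
    import numpy as np

    def backtrack_verify(track_step, frame_prev, frame_cur, pos_prev, tol=3.0):
        pos_fwd = track_step(frame_prev, frame_cur, pos_prev)  # forward pass
        pos_bwd = track_step(frame_cur, frame_prev, pos_fwd)   # backward pass
        err = np.linalg.norm(np.asarray(pos_bwd) - np.asarray(pos_prev))
        # Keep the forward estimate only if the backward pass returns home.
        return pos_fwd if err < tol else pos_prev
    ```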

  • Low-Rank Representation with Graph Constraints for Robust Visual Tracking

    Jieyan LIU  Ao MA  Jingjing LI  Ke LU  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2017/03/08  Vol: E100-D No:6  Page(s): 1325-1338

    Subspace representation models are an important family of visual tracking algorithms. Compared with models that operate on the original data space, subspace representation models can effectively reduce computational complexity and filter out high-dimensional noise. However, in complicated situations, e.g., dramatic illumination changes, large occlusions and abrupt object drifting, traditional subspace representation models may fail. In this paper, we propose a novel subspace representation algorithm for robust visual tracking using low-rank representation with graph constraints (LRGC). Low-rank representation is well known for its ability to handle corrupted samples, and graph constraints flexibly characterize sample relationships. We aim to exploit the benefits of both and deploy them for challenging visual tracking problems. Specifically, we first propose a novel graph structure to characterize the relationship of the target object across different observation states. We then learn a subspace by jointly optimizing low-rank representation and graph embedding in a unified framework. Finally, the learned subspace is embedded into a Bayesian inference framework through the dynamical model and the observation model. Experiments on several video benchmarks demonstrate that our algorithm performs better than traditional ones, especially in dynamically changing and drifting situations.
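
    A plausible form of the joint objective, combining the standard nuclear-norm low-rank term with a graph-embedding regularizer over the Laplacian L of the proposed graph; this is illustrative, and the paper's exact formulation may differ.

    ```latex
    % Z: low-rank representation of the observations X over themselves,
    % E: sparse error, L: graph Laplacian encoding observation-state relations.
    \min_{Z,E}\; \|Z\|_{*}
      + \alpha\,\operatorname{tr}\!\left(Z L Z^{\top}\right)
      + \beta\,\|E\|_{2,1}
    \quad \text{s.t.} \quad X = XZ + E
    ```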

  • Adaptive Updating Probabilistic Model for Visual Tracking

    Kai FANG  Shuoyan LIU  Chunjie XU  Hao XUE  

     
    LETTER-Pattern Recognition

    Publicized: 2017/01/06  Vol: E100-D No:4  Page(s): 914-917

    In this paper, an adaptively updating probabilistic model is proposed to track an object in real-world environments that include motion blur, illumination changes, pose variations, and occlusions. The model adaptively updates the tracker through a searching process and an updating process: the searching process focuses on learning an appropriate tracker, and the updating process corrects it so that it remains robust and efficient in unconstrained real-world environments. Specifically, according to changes in the object's appearance and the recent probability matrix (TPM), the tracker probability is obtained in an Expectation-Maximization (EM) manner. When tracking in each frame is completed, the estimated object state is used to update the current TPM and tracker probability, again via EM. The highest tracker probability indicates the object location in every frame. Experimental results demonstrate that our method tracks targets accurately and robustly in real-world tracking environments.

  • Feature Adaptive Correlation Tracking

    Yulong XU  Yang LI  Jiabao WANG  Zhuang MIAO  Hang LI  Yafei ZHANG  

     
    LETTER-Image Recognition, Computer Vision

    Publicized: 2016/11/28  Vol: E100-D No:3  Page(s): 594-597

    The feature extractor plays an important role in visual tracking, but most state-of-the-art methods employ the same feature representation in all scenes. Given the diversity of scenes, a tracker should choose different features according to the video. In this work, we propose a novel feature-adaptive correlation tracker that decomposes the tracking task into translation and scale estimation. According to the luminance of the target, our approach automatically selects either hierarchical convolutional features or histogram of oriented gradients features for translation, depending on the scenario. Furthermore, we employ a discriminative correlation filter to handle scale variations. Extensive experiments on a large-scale challenging benchmark dataset show that the proposed algorithm outperforms state-of-the-art trackers in accuracy and robustness.
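
    A minimal sketch of the luminance-gated feature choice; the threshold is an illustrative assumption, not the paper's value.

    ```python
    import numpy as np

    def select_features(patch_gray, cnn_features, hog_features, threshold=0.25):
        """patch_gray: grayscale target patch normalized to [0, 1]."""
        luminance = patch_gray.mean()
        # Dark scenes fall back to HOG; otherwise use hierarchical CNN features.
        return hog_features if luminance < threshold else cnn_features
    ```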

  • Combining Color Features for Real-Time Correlation Tracking

    Yulong XU  Zhuang MIAO  Jiabao WANG  Yang LI  Hang LI  Yafei ZHANG  Weiguang XU  Zhisong PAN  

     
    LETTER-Image Recognition, Computer Vision

    Publicized: 2016/10/04  Vol: E100-D No:1  Page(s): 225-228

    Correlation filter-based approaches achieve competitive results in visual tracking, but traditional correlation tracking methods fail to exploit the color information in videos. To address this issue, we propose a novel tracker that combines color features within a correlation filter framework: it extracts not only gray but also color feature maps and computes the maximum response location via multi-channel correlation filters. In particular, we modify the label function of the conventional classifier to improve positioning accuracy and employ a discriminative correlation filter to handle scale variations. Experiments on 35 challenging benchmark color sequences clearly show that our method outperforms state-of-the-art tracking approaches while operating in real time.
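
    A minimal sketch of a multi-channel correlation response: per-channel filters are applied in the frequency domain and their responses summed, so the color channels contribute alongside the gray channel. Filter training is omitted.

    ```python
    import numpy as np

    def multichannel_response(filters_fft, feature_maps):
        """filters_fft, feature_maps: lists of (H, W) arrays, one per channel."""
        resp_fft = sum(np.conj(Hc) * np.fft.fft2(Fc)
                       for Hc, Fc in zip(filters_fft, feature_maps))
        response = np.fft.ifft2(resp_fft).real
        return np.unravel_index(np.argmax(response), response.shape)
    ```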

  • Robust and Adaptive Object Tracking via Correspondence Clustering

    Bo WU  Yurui XIE  Wang LUO  

     
    LETTER-Image Recognition, Computer Vision

    Publicized: 2016/06/23  Vol: E99-D No:10  Page(s): 2664-2667

    We propose a new visual tracking method in which the target appearance is represented by combining a color distribution with keypoints. First, the object is localized via a keypoint-based tracking and matching strategy, with a new clustering method to remove outliers. Second, the tracking confidence is evaluated with the color template, and according to this confidence, local and global keypoint matching is performed adaptively. Finally, we propose a target appearance update method in which new appearances are learned and added to the target model. The proposed tracker is compared with five state-of-the-art tracking methods on a recent benchmark dataset. Both qualitative and quantitative evaluations show that our method performs favorably.

  • Robust Scale Adaptive and Real-Time Visual Tracking with Correlation Filters

    Jiatian PI  Keli HU  Yuzhang GU  Lei QU  Fengrong LI  Xiaolin ZHANG  Yunlong ZHAN  

     
    PAPER-Image Recognition, Computer Vision

    Publicized: 2016/04/07  Vol: E99-D No:7  Page(s): 1895-1902

    Visual tracking has been studied for several decades but continues to draw significant attention because of its critical role in many applications. Recent years have seen greater interest in correlation filters for visual tracking, owing to their compelling results in various competitions and benchmarks. However, overall tracking capability still needs to improve against issues such as large scale variation, occlusion, and deformation. This paper presents a tracker with robust scale estimation that handles the fixed template size problem of the Kernelized Correlation Filter (KCF) tracker with no significant decrease in speed. We apply a discriminative correlation filter for scale estimation as an independent step after finding the optimal translation with the KCF tracker. Compared to an exhaustive scale-space search, our approach provides improved performance while remaining computationally efficient. To reveal the effectiveness of the approach, we use benchmark sequences annotated with 11 attributes to evaluate how well the tracker handles different attributes. Numerous experiments demonstrate that the proposed algorithm performs favorably against several state-of-the-art algorithms, and appealing results in both accuracy and robustness are achieved on all 51 benchmark sequences.
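
    A minimal sketch of the two-step scheme, scoring a small pool of scales around the estimated translation; `extract_patch` and `score_scale` are hypothetical stand-ins for the paper's patch extraction and discriminative scale filter, and the scale factors are illustrative.

    ```python
    import numpy as np

    def estimate_scale(extract_patch, score_scale, frame, pos, cur_scale,
                       factors=(0.95, 1.0, 1.05)):
        # Translation `pos` comes from the KCF step; only scale is searched here.
        scores = [score_scale(extract_patch(frame, pos, cur_scale * f))
                  for f in factors]
        return cur_scale * factors[int(np.argmax(scores))]
    ```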

  • Object Tracking with Embedded Deformable Parts in Dynamic Conditional Random Fields

    Suofei ZHANG  Zhixin SUN  Xu CHENG  Lin ZHOU  

     
    LETTER-Image Recognition, Computer Vision

    Publicized: 2016/01/19  Vol: E99-D No:4  Page(s): 1268-1271

    This work presents an object tracking framework based on the integration of Deformable Part-based Models (DPMs) and Dynamic Conditional Random Fields (DCRF). In this framework, we propose a novel DCRF-based way to track an object and its details at multiple resolutions simultaneously, while DPMs handle drastic variations in target appearance such as pose, view, scale and illumination changes. To embed DPMs into the DCRF, we design specific temporal potential functions between vertices by explicitly formulating deformation and partial occlusion, and temporal transition functions between mixture models bring higher robustness to perspective and pose changes. To evaluate the efficacy of the proposed method, quantitative tests on six challenging video sequences are conducted and the results analyzed. Experimental results indicate that the method effectively addresses serious problems in object tracking and performs favorably against state-of-the-art trackers.

Showing results 1-20 of 29