The search functionality is under construction.

Keyword Search Result

[Keyword] Siamese network(6hit)

1-6hit
  • Robust Visual Tracking Using Hierarchical Vision Transformer with Shifted Windows Multi-Head Self-Attention

    Peng GAO  Xin-Yue ZHANG  Xiao-Li YANG  Jian-Cheng NI  Fei WANG  

     
    LETTER-Image Recognition, Computer Vision

      Pubricized:
    2023/10/20
      Vol:
    E107-D No:1
      Page(s):
    161-164

    Despite Siamese trackers attracting much attention due to their scalability and efficiency in recent years, researchers have ignored the background appearance, which leads to their inapplicability in recognizing arbitrary target objects with various variations, especially in complex scenarios with background clutter and distractors. In this paper, we present a simple yet effective Siamese tracker, where the shifted windows multi-head self-attention is produced to learn the characteristics of a specific given target object for visual tracking. To validate the effectiveness of our proposed tracker, we use the Swin Transformer as the backbone network and introduced an auxiliary feature enhancement network. Extensive experimental results on two evaluation datasets demonstrate that the proposed tracker outperforms other baselines.

  • Detection and Tracking Method for Dynamic Barcodes Based on a Siamese Network

    Menglong WU  Cuizhu QIN  Hongxia DONG  Wenkai LIU  Xiaodong NIE  Xichang CAI  Yundong LI  

     
    PAPER-Wireless Communication Technologies

      Pubricized:
    2022/01/13
      Vol:
    E105-B No:7
      Page(s):
    866-875

    In many screen to camera communication (S2C) systems, the barcode preprocessing method is a significant prerequisite because barcodes may be deformed due to various environmental factors. However, previous studies have focused on barcode detection under static conditions; to date, few studies have been carried out on dynamic conditions (for example, the barcode video stream or the transmitter and receiver are moving). Therefore, we present a detection and tracking method for dynamic barcodes based on a Siamese network. The backbone of the CNN in the Siamese network is improved by SE-ResNet. The detection accuracy achieved 89.5%, which stands out from other classical detection networks. The EAO reaches 0.384, which is better than previous tracking methods. It is also superior to other methods in terms of accuracy and robustness. The SE-ResNet in this paper improved the EAO by 1.3% compared with ResNet in SiamMask. Also, our method is not only applicable to static barcodes but also allows real-time tracking and segmentation of barcodes captured in dynamic situations.

  • Siamese Visual Tracking with Dual-Pipeline Correlated Fusion Network

    Ying KANG  Cong LIU  Ning WANG  Dianxi SHI  Ning ZHOU  Mengmeng LI  Yunlong WU  

     
    PAPER-Image Recognition, Computer Vision

      Pubricized:
    2021/07/09
      Vol:
    E104-D No:10
      Page(s):
    1702-1711

    Siamese visual tracking, viewed as a problem of max-similarity matching to the target template, has absorbed increasing attention in computer vision. However, it is a challenge for current Siamese trackers that the demands of balance between accuracy in real-time tracking and robustness in long-time tracking are hard to meet. This work proposes a new Siamese based tracker with a dual-pipeline correlated fusion network (named as ADF-SiamRPN), which consists of one initial template for robust correlation, and the other transient template with the ability of adaptive feature optimal selection for accurate correlation. By the promotion from the learnable correlation-response fusion network afterwards, we are in pursuit of the synthetical improvement of tracking performance. To compare the performance of ADF-SiamRPN with state-of-the-art trackers, we conduct lots of experiments on benchmarks like OTB100, UAV123, VOT2016, VOT2018, GOT-10k, LaSOT and TrackingNet. The experimental results of tracking demonstrate that ADF-SiamRPN outperforms all the compared trackers and achieves the best balance between accuracy and robustness.

  • Combining Siamese Network and Regression Network for Visual Tracking

    Yao GE  Rui CHEN  Ying TONG  Xuehong CAO  Ruiyu LIANG  

     
    LETTER-Image Recognition, Computer Vision

      Pubricized:
    2020/05/13
      Vol:
    E103-D No:8
      Page(s):
    1924-1927

    We combine the siamese network and the recurrent regression network, proposing a two-stage tracking framework termed as SiamReg. Our method solves the problem that the classic siamese network can not judge the target size precisely and simplifies the procedures of regression in the training and testing process. We perform experiments on three challenging tracking datasets: VOT2016, OTB100, and VOT2018. The results indicate that, after offline trained, SiamReg can obtain a higher expected average overlap measure.

  • Siamese Attention-Based LSTM for Speech Emotion Recognition

    Tashpolat NIZAMIDIN  Li ZHAO  Ruiyu LIANG  Yue XIE  Askar HAMDULLA  

     
    LETTER-Engineering Acoustics

      Vol:
    E103-A No:7
      Page(s):
    937-941

    As one of the popular topics in the field of human-computer interaction, the Speech Emotion Recognition (SER) aims to classify the emotional tendency from the speakers' utterances. Using the existing deep learning methods, and with a large amount of training data, we can achieve a highly accurate performance result. Unfortunately, it's time consuming and difficult job to build such a huge emotional speech database that can be applicable universally. However, the Siamese Neural Network (SNN), which we discuss in this paper, can yield extremely precise results with just a limited amount of training data through pairwise training which mitigates the impacts of sample deficiency and provides enough iterations. To obtain enough SER training, this study proposes a novel method which uses Siamese Attention-based Long Short-Term Memory Networks. In this framework, we designed two Attention-based Long Short-Term Memory Networks which shares the same weights, and we input frame level acoustic emotional features to the Siamese network rather than utterance level emotional features. The proposed solution has been evaluated on EMODB, ABC and UYGSEDB corpora, and showed significant improvement on SER results, compared to conventional deep learning methods.

  • Deep Discriminative Supervised Hashing via Siamese Network

    Yang LI  Zhuang MIAO  Jiabao WANG  Yafei ZHANG  Hang LI  

     
    LETTER-Artificial Intelligence, Data Mining

      Pubricized:
    2017/09/12
      Vol:
    E100-D No:12
      Page(s):
    3036-3040

    The latest deep hashing methods perform hash codes learning and image feature learning simultaneously by using pairwise or triplet labels. However, generating all possible pairwise or triplet labels from the training dataset can quickly become intractable, where the majority of those samples may produce small costs, resulting in slow convergence. In this letter, we propose a novel deep discriminative supervised hashing method, called DDSH, which directly learns hash codes based on a new combined loss function. Compared to previous methods, our method can take full advantages of the annotated data in terms of pairwise similarity and image identities. Extensive experiments on standard benchmarks demonstrate that our method preserves the instance-level similarity and outperforms state-of-the-art deep hashing methods in the image retrieval application. Remarkably, our 16-bits binary representation can surpass the performance of existing 48-bits binary representation, which demonstrates that our method can effectively improve the speed and precision of large scale image retrieval systems.