Hang LI Yafei ZHANG Jiabao WANG Yulong XU Yang LI Zhisong PAN
State-of-the-art background subtraction and foreground detection methods still face a variety of challenges, including illumination changes, camouflage, dynamic backgrounds, shadows, and intermittent object motion. Foreground detection via robust principal component analysis (RPCA) and its extensions, which are based on low-rank and sparse structures, achieves good performance in many scenes of datasets such as Changedetection.net (CDnet); however, the conventional RPCA method does not handle shadows well. To address this issue, we propose an approach that considers observed video data as the sum of three parts, namely a low-rank background, sparse moving objects, and moving shadows. We then impose inequality constraints on the basic RPCA model and use an alternating direction method of multipliers framework combined with Rockafellar multipliers to derive a closed-form solution of the shadow matrix sub-problem. Our experiments demonstrate that our method works effectively on challenging datasets that contain shadows.
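For orientation, the decomposition described in this abstract can be written out as follows. This is only a sketch: the first problem is the standard RPCA (principal component pursuit) formulation, while the symbols D, L, S, H and the weight \lambda are assumed names that do not appear in the abstract. The abstract does not state the penalty on the shadow term or the form of the inequality constraints, so both are left out.

```latex
% Conventional RPCA (principal component pursuit): D is the observed data matrix,
% L the low-rank background, S the sparse foreground, \lambda > 0 a trade-off weight.
\min_{L,\,S}\ \|L\|_{*} + \lambda \|S\|_{1}
\quad \text{s.t.} \quad D = L + S

% Three-part decomposition described in the abstract. H is an assumed name for the
% moving-shadow matrix; its penalty term and the inequality constraints are not
% given in the abstract and are therefore omitted.
D = L + S + H
```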
Yang LI Zhuang MIAO Jiabao WANG Yafei ZHANG Hang LI
The latest deep hashing methods perform hash code learning and image feature learning simultaneously by using pairwise or triplet labels. However, generating all possible pairwise or triplet labels from the training dataset can quickly become intractable, and the majority of those samples may produce small costs, resulting in slow convergence. In this letter, we propose a novel deep discriminative supervised hashing method, called DDSH, which directly learns hash codes based on a new combined loss function. Compared to previous methods, our method can take full advantage of the annotated data in terms of pairwise similarity and image identities. Extensive experiments on standard benchmarks demonstrate that our method preserves the instance-level similarity and outperforms state-of-the-art deep hashing methods in the image retrieval application. Remarkably, our 16-bit binary representation can surpass the performance of existing 48-bit binary representations, which demonstrates that our method can effectively improve the speed and precision of large-scale image retrieval systems.
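To illustrate why compact binary codes make retrieval fast, the following minimal sketch ranks database images by Hamming distance to a query code. It is not the DDSH method itself: the function names, the image names, and the 16-bit code values are hypothetical, and the codes are assumed to be already computed and stored as Python integers.

```python
from typing import Dict, List, Tuple


def hamming_distance(code_a: int, code_b: int) -> int:
    """Count the bit positions in which two binary codes differ."""
    # XOR leaves a 1 wherever the codes disagree; count those ones.
    return bin(code_a ^ code_b).count("1")


def rank_by_hamming(query: int, database: Dict[str, int]) -> List[Tuple[str, int]]:
    """Return (name, distance) pairs sorted from nearest to farthest."""
    scored = [(name, hamming_distance(query, code)) for name, code in database.items()]
    # Sort by distance, breaking ties by name so the order is deterministic.
    return sorted(scored, key=lambda pair: (pair[1], pair[0]))


# Hypothetical 16-bit codes, for illustration only.
database = {
    "img_a": 0b1010101010101010,
    "img_b": 0b1010101010101000,
    "img_c": 0b0101010101010101,
}
query = 0b1010101010101011
ranking = rank_by_hamming(query, database)
```

Each comparison costs one XOR and one bit count, so shorter codes reduce both storage and comparison time.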
Yulong XU Yang LI Jiabao WANG Zhuang MIAO Hang LI Yafei ZHANG Gang TAO
The feature extractor is an important component of a tracker, and convolutional neural networks (CNNs) have demonstrated excellent performance in visual tracking. However, CNN features do not perform well under conditions of low illumination. To address this issue, we propose a novel deep correlation tracker with backtracking, which consists of target translation, backtracking, and scale estimation. We employ four correlation filters, one with a histogram of oriented gradient (HOG) descriptor and the other three with CNN features, to estimate the translation. In particular, we propose a backtracking algorithm to reconfirm the translation location. Comprehensive experiments are performed on a large-scale challenging benchmark dataset. The results show that the proposed algorithm outperforms state-of-the-art methods in accuracy and robustness.
Yulong XU Yang LI Jiabao WANG Zhuang MIAO Hang LI Yafei ZHANG
The feature extractor plays an important role in visual tracking, but most state-of-the-art methods employ the same feature representation in all scenes. Given the diversity of scenes, a tracker should choose different features for different videos. In this work, we propose a novel feature adaptive correlation tracker, which decomposes the tracking task into translation and scale estimation. According to the luminance of the target, our approach automatically selects either hierarchical convolutional features or histogram of oriented gradient features for translation estimation in varied scenarios. Furthermore, we employ a discriminative correlation filter to handle scale variations. Extensive experiments are performed on a large-scale challenging benchmark dataset. The results show that the proposed algorithm outperforms state-of-the-art trackers in accuracy and robustness.
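The luminance-based selection described above might look roughly like the sketch below. Several details are assumptions not stated in the abstract: the Rec. 601 luma weights, the threshold value of 60, the rule that dark targets use HOG features and bright targets use convolutional features, and the function names themselves.

```python
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]  # (r, g, b), each channel in 0..255


def mean_luminance(patch: Sequence[Sequence[Pixel]]) -> float:
    """Average Rec. 601 luma over an RGB patch given as rows of pixels."""
    total = 0.0
    count = 0
    for row in patch:
        for r, g, b in row:
            total += 0.299 * r + 0.587 * g + 0.114 * b
            count += 1
    if count == 0:
        raise ValueError("patch must contain at least one pixel")
    return total / count


def select_feature(patch: Sequence[Sequence[Pixel]], threshold: float = 60.0) -> str:
    """Pick a feature type from target luminance (assumed rule and threshold)."""
    # Assumption: dark targets use HOG, brighter targets use CNN features.
    return "hog" if mean_luminance(patch) < threshold else "cnn"


dark_patch = [[(10, 10, 10), (20, 20, 20)]]
bright_patch = [[(200, 200, 200)]]
choices = (select_feature(dark_patch), select_feature(bright_patch))
```

In a real tracker the threshold would be tuned on validation sequences rather than fixed by hand.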
Yang LI Zhuang MIAO Ming HE Yafei ZHANG Hang LI
How to represent images as highly compact binary codes is a critical issue in many computer vision tasks. Existing deep hashing methods typically focus on designing loss functions by using pairwise or triplet labels. However, these methods ignore the attention mechanism in the human visual system. In this letter, we propose a novel Deep Attention Residual Hashing (DARH) method, which directly learns hash codes based on a simple pointwise classification loss function. Compared to previous methods, our method does not need to generate all possible pairwise or triplet labels from the training dataset. Specifically, we develop a new type of attention layer which can learn human eye fixation and significantly improve the representation ability of hash codes. In addition, we embed the attention layer into the residual network to simultaneously learn discriminative image features and hash codes in an end-to-end manner. Extensive experiments on standard benchmarks demonstrate that our method preserves the instance-level similarity and outperforms state-of-the-art deep hashing methods in the image retrieval application.
Yulong XU Zhuang MIAO Jiabao WANG Yang LI Hang LI Yafei ZHANG Weiguang XU Zhisong PAN
Correlation filter-based approaches achieve competitive results in visual tracking, but traditional correlation tracking methods fail to exploit the color information of the videos. To address this issue, we propose a novel tracker combined with color features in a correlation filter framework, which extracts not only gray but also color information as the feature maps to compute the maximum response location via multi-channel correlation filters. In particular, we modify the label function of the conventional classifier to improve positioning accuracy and employ a discriminative correlation filter to handle scale variations. Experiments are performed on 35 challenging benchmark color sequences. The results clearly show that our method outperforms state-of-the-art tracking approaches while operating in real time.
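The idea of locating the maximum response across several feature channels can be sketched in one dimension as below. This is a simplified illustration, not the tracker from the abstract: the templates are fixed rather than learned, the correlation is computed directly in the spatial domain instead of the Fourier domain, and all names and sample values are hypothetical.

```python
from typing import List, Sequence, Tuple


def circular_shift(values: Sequence[float], shift: int) -> List[float]:
    """Move every element `shift` positions to the right, wrapping around."""
    n = len(values)
    return [values[(i - shift) % n] for i in range(n)]


def circular_correlation(signal: Sequence[float], template: Sequence[float]) -> List[float]:
    """Correlation score of the template against the signal at every circular shift."""
    n = len(signal)
    return [
        sum(template[k] * signal[(k + shift) % n] for k in range(n))
        for shift in range(n)
    ]


def multi_channel_peak(
    channels: Sequence[Sequence[float]], templates: Sequence[Sequence[float]]
) -> Tuple[int, List[float]]:
    """Sum the per-channel responses and return (peak shift, summed response)."""
    n = len(channels[0])
    response = [0.0] * n
    for signal, template in zip(channels, templates):
        for shift, value in enumerate(circular_correlation(signal, template)):
            response[shift] += value
    peak = max(range(n), key=response.__getitem__)
    return peak, response


# Hypothetical gray and color channel templates; the target then moves by 3 samples.
gray_template = [1.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
color_template = [0.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
frame = [circular_shift(gray_template, 3), circular_shift(color_template, 3)]
peak, response = multi_channel_peak(frame, [gray_template, color_template])
```

Summing the channel responses before taking the maximum lets a channel with a strong match compensate for one with a weak match, which is the motivation for adding color channels to a gray-only filter.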