Representing images as highly compact binary codes is a critical issue in many computer vision tasks. Existing deep hashing methods typically focus on designing loss functions with pairwise or triplet labels. However, these methods ignore the attention mechanism in the human visual system. In this letter, we propose a novel Deep Attention Residual Hashing (DARH) method, which directly learns hash codes with a simple pointwise classification loss function. Compared to previous methods, our method does not need to generate all possible pairwise or triplet labels from the training dataset. Specifically, we develop a new type of attention layer which learns human eye fixation and significantly improves the representation ability of hash codes. In addition, we embed the attention layer into a residual network to simultaneously learn discriminative image features and hash codes in an end-to-end manner. Extensive experiments on standard benchmarks demonstrate that our method preserves instance-level similarity and outperforms state-of-the-art deep hashing methods in image retrieval.
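The abstract describes three ingredients: an attention layer that reweights features, a residual connection that preserves the original signal, and binary hash codes trained with a pointwise (per-image) classification loss. The toy NumPy sketch below illustrates how these pieces compose; the sigmoid attention form, layer shapes, and weight names are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def attention_residual_block(x, w_att):
    # Hypothetical attention: a sigmoid mask in (0, 1) reweights features,
    # and a residual connection adds back the unweighted input.
    mask = 1.0 / (1.0 + np.exp(-(x @ w_att)))
    return x + mask * x

def hash_codes(features, w_hash):
    # Continuous hash activations squashed to (-1, 1), then binarized.
    h = np.tanh(features @ w_hash)
    return np.where(h >= 0, 1, -1)

def pointwise_loss(h, labels, w_cls):
    # Pointwise classification: softmax cross-entropy against each image's
    # own class label -- no pairwise or triplet label generation needed.
    logits = h @ w_cls
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return -np.mean(np.log(p[np.arange(len(labels)), labels]))

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))          # 4 images, 8-dim features (toy scale)
w_att = rng.normal(size=(8, 8))
w_hash = rng.normal(size=(8, 12))    # 12-bit codes
w_cls = rng.normal(size=(12, 3))     # 3 classes

features = attention_residual_block(x, w_att)
codes = hash_codes(features, w_hash)
loss = pointwise_loss(np.tanh(features @ w_hash), np.array([0, 1, 2, 0]), w_cls)
assert codes.shape == (4, 12)
```

In the actual method these stages are trained jointly end-to-end by backpropagation through a deep residual network; the sketch only shows the forward composition of attention, residual addition, binarization, and the pointwise loss.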
Yang LI
PLA University of Science and Technology (PLAUST)
Zhuang MIAO
PLA University of Science and Technology (PLAUST)
Ming HE
PLA University of Science and Technology (PLAUST)
Yafei ZHANG
PLA University of Science and Technology (PLAUST)
Hang LI
PLA University of Science and Technology (PLAUST)
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Yang LI, Zhuang MIAO, Ming HE, Yafei ZHANG, Hang LI, "Deep Attention Residual Hashing" in IEICE TRANSACTIONS on Fundamentals,
vol. E101-A, no. 3, pp. 654-657, March 2018, doi: 10.1587/transfun.E101.A.654.
Abstract: Representing images as highly compact binary codes is a critical issue in many computer vision tasks. Existing deep hashing methods typically focus on designing loss functions with pairwise or triplet labels. However, these methods ignore the attention mechanism in the human visual system. In this letter, we propose a novel Deep Attention Residual Hashing (DARH) method, which directly learns hash codes with a simple pointwise classification loss function. Compared to previous methods, our method does not need to generate all possible pairwise or triplet labels from the training dataset. Specifically, we develop a new type of attention layer which learns human eye fixation and significantly improves the representation ability of hash codes. In addition, we embed the attention layer into a residual network to simultaneously learn discriminative image features and hash codes in an end-to-end manner. Extensive experiments on standard benchmarks demonstrate that our method preserves instance-level similarity and outperforms state-of-the-art deep hashing methods in image retrieval.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/transfun.E101.A.654/_p
@ARTICLE{e101-a_3_654,
author={Yang LI and Zhuang MIAO and Ming HE and Yafei ZHANG and Hang LI},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Deep Attention Residual Hashing},
year={2018},
volume={E101-A},
number={3},
pages={654-657},
abstract={Representing images as highly compact binary codes is a critical issue in many computer vision tasks. Existing deep hashing methods typically focus on designing loss functions with pairwise or triplet labels. However, these methods ignore the attention mechanism in the human visual system. In this letter, we propose a novel Deep Attention Residual Hashing (DARH) method, which directly learns hash codes with a simple pointwise classification loss function. Compared to previous methods, our method does not need to generate all possible pairwise or triplet labels from the training dataset. Specifically, we develop a new type of attention layer which learns human eye fixation and significantly improves the representation ability of hash codes. In addition, we embed the attention layer into a residual network to simultaneously learn discriminative image features and hash codes in an end-to-end manner. Extensive experiments on standard benchmarks demonstrate that our method preserves instance-level similarity and outperforms state-of-the-art deep hashing methods in image retrieval.},
keywords={},
doi={10.1587/transfun.E101.A.654},
ISSN={1745-1337},
month={March}
}
TY - JOUR
TI - Deep Attention Residual Hashing
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 654
EP - 657
AU - Yang LI
AU - Zhuang MIAO
AU - Ming HE
AU - Yafei ZHANG
AU - Hang LI
PY - 2018
DO - 10.1587/transfun.E101.A.654
JO - IEICE TRANSACTIONS on Fundamentals
SN - 1745-1337
VL - E101-A
IS - 3
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - March 2018
AB - Representing images as highly compact binary codes is a critical issue in many computer vision tasks. Existing deep hashing methods typically focus on designing loss functions with pairwise or triplet labels. However, these methods ignore the attention mechanism in the human visual system. In this letter, we propose a novel Deep Attention Residual Hashing (DARH) method, which directly learns hash codes with a simple pointwise classification loss function. Compared to previous methods, our method does not need to generate all possible pairwise or triplet labels from the training dataset. Specifically, we develop a new type of attention layer which learns human eye fixation and significantly improves the representation ability of hash codes. In addition, we embed the attention layer into a residual network to simultaneously learn discriminative image features and hash codes in an end-to-end manner. Extensive experiments on standard benchmarks demonstrate that our method preserves instance-level similarity and outperforms state-of-the-art deep hashing methods in image retrieval.
ER -