Cross-modal hashing technology has attracted much attention for its favorable retrieval performance and low storage cost. However, for existing cross-modal hashing methods, the heterogeneity of data across modalities remains a challenge, and how to fully explore and utilize intra-modality features has not been well studied. In this paper, we propose a novel cross-modal hashing approach called Modality-fused Graph Network (MFGN). The network architecture consists of a text channel and an image channel that learn modality-specific features, and a modality fusion channel that uses a graph network to learn modality-shared representations, reducing the heterogeneity across modalities. In addition, an integration module is introduced for the image and text channels to fully explore intra-modality features. Experiments on two widely used datasets show that our approach achieves better results than state-of-the-art cross-modal hashing methods.
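The pipeline the abstract describes (modality-specific channels, a graph-based fusion step, and binary hash codes) can be sketched in miniature. This is an illustrative toy, not the authors' actual MFGN: the single-layer "channels", the cosine-similarity graph, the one-step fusion, and all weights below are assumptions made purely to show the data flow.

```python
import math

def channel(x, w):
    # A one-layer tanh projection standing in for a modality-specific
    # channel (in the real method these are deep image/text networks).
    return [math.tanh(sum(xi * wij for xi, wij in zip(x, col))) for col in w]

def cosine(a, b):
    na = math.sqrt(sum(v * v for v in a)) or 1.0
    nb = math.sqrt(sum(v * v for v in b)) or 1.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def graph_fuse(feats):
    # One graph-convolution-style step: each node's feature becomes the
    # similarity-weighted average over all nodes, so image and text
    # features mix into a modality-shared representation.
    fused = []
    for f in feats:
        weights = [max(cosine(f, g), 0.0) for g in feats]
        total = sum(weights) or 1.0
        fused.append([
            sum(w * g[d] for w, g in zip(weights, feats)) / total
            for d in range(len(f))
        ])
    return fused

def to_hash(f):
    # Sign quantization yields the binary code used for retrieval.
    return [1 if v >= 0 else 0 for v in f]

# Hypothetical toy weights and inputs (2-dim features for readability).
img_w = [[0.5, -0.2], [0.1, 0.4]]
txt_w = [[0.3, 0.3], [-0.4, 0.2]]
img_feat = channel([1.0, -1.0], img_w)
txt_feat = channel([0.5, 0.8], txt_w)
shared = graph_fuse([img_feat, txt_feat])
codes = [to_hash(f) for f in shared]
```

After fusion, the image and text codes agree because each fused feature is a similarity-weighted blend of both modalities; matching codes across modalities is what makes cross-modal retrieval by Hamming distance possible.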
Fei WU
Nanjing University of Posts and Telecommunications
Shuaishuai LI
Nanjing University of Posts and Telecommunications
Guangchuan PENG
Nanjing University of Posts and Telecommunications
Yongheng MA
Nanjing University of Posts and Telecommunications
Xiao-Yuan JING
Wuhan University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Fei WU, Shuaishuai LI, Guangchuan PENG, Yongheng MA, Xiao-Yuan JING, "Modality-Fused Graph Network for Cross-Modal Retrieval" in IEICE TRANSACTIONS on Information,
vol. E106-D, no. 5, pp. 1094-1097, May 2023, doi: 10.1587/transinf.2022EDL8069.
Abstract: Cross-modal hashing technology has attracted much attention for its favorable retrieval performance and low storage cost. However, for existing cross-modal hashing methods, the heterogeneity of data across modalities is still a challenge and how to fully explore and utilize the intra-modality features has not been well studied. In this paper, we propose a novel cross-modal hashing approach called Modality-fused Graph Network (MFGN). The network architecture consists of a text channel and an image channel that are used to learn modality-specific features, and a modality fusion channel that uses the graph network to learn the modality-shared representations to reduce the heterogeneity across modalities. In addition, an integration module is introduced for the image and text channels to fully explore intra-modality features. Experiments on two widely used datasets show that our approach achieves better results than the state-of-the-art cross-modal hashing methods.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2022EDL8069/_p
@ARTICLE{e106-d_5_1094,
author={Fei WU and Shuaishuai LI and Guangchuan PENG and Yongheng MA and Xiao-Yuan JING},
journal={IEICE TRANSACTIONS on Information},
title={Modality-Fused Graph Network for Cross-Modal Retrieval},
year={2023},
volume={E106-D},
number={5},
pages={1094-1097},
abstract={Cross-modal hashing technology has attracted much attention for its favorable retrieval performance and low storage cost. However, for existing cross-modal hashing methods, the heterogeneity of data across modalities is still a challenge and how to fully explore and utilize the intra-modality features has not been well studied. In this paper, we propose a novel cross-modal hashing approach called Modality-fused Graph Network (MFGN). The network architecture consists of a text channel and an image channel that are used to learn modality-specific features, and a modality fusion channel that uses the graph network to learn the modality-shared representations to reduce the heterogeneity across modalities. In addition, an integration module is introduced for the image and text channels to fully explore intra-modality features. Experiments on two widely used datasets show that our approach achieves better results than the state-of-the-art cross-modal hashing methods.},
doi={10.1587/transinf.2022EDL8069},
ISSN={1745-1361},
month={May},}
TY - JOUR
TI - Modality-Fused Graph Network for Cross-Modal Retrieval
T2 - IEICE TRANSACTIONS on Information
SP - 1094
EP - 1097
AU - Fei WU
AU - Shuaishuai LI
AU - Guangchuan PENG
AU - Yongheng MA
AU - Xiao-Yuan JING
PY - 2023
DO - 10.1587/transinf.2022EDL8069
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E106-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2023
AB - Cross-modal hashing technology has attracted much attention for its favorable retrieval performance and low storage cost. However, for existing cross-modal hashing methods, the heterogeneity of data across modalities is still a challenge and how to fully explore and utilize the intra-modality features has not been well studied. In this paper, we propose a novel cross-modal hashing approach called Modality-fused Graph Network (MFGN). The network architecture consists of a text channel and an image channel that are used to learn modality-specific features, and a modality fusion channel that uses the graph network to learn the modality-shared representations to reduce the heterogeneity across modalities. In addition, an integration module is introduced for the image and text channels to fully explore intra-modality features. Experiments on two widely used datasets show that our approach achieves better results than the state-of-the-art cross-modal hashing methods.
ER -