Cross-modal hashing technology has attracted much attention for its favorable retrieval performance and low storage cost. However, for existing cross-modal hashing methods, the heterogeneity of data across modalities remains a challenge, and how to fully explore and utilize intra-modality features has not been well studied. In this paper, we propose a novel cross-modal hashing approach called Modality-fused Graph Network (MFGN). The network architecture consists of a text channel and an image channel that learn modality-specific features, and a modality fusion channel that uses a graph network to learn modality-shared representations, reducing the heterogeneity across modalities. In addition, an integration module is introduced for the image and text channels to fully explore intra-modality features. Experiments on two widely used datasets show that our approach achieves better results than state-of-the-art cross-modal hashing methods.
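The pipeline the abstract describes (modality-specific channels, a graph-based fusion step, and binary hash codes) can be sketched in miniature. This is an illustrative toy, not the authors' actual MFGN: the single-layer "channels", the cosine-similarity graph, the one-step fusion, and all weights below are assumptions made purely to show the data flow.

```python
import math

def channel(x, w):
    # A one-layer tanh projection standing in for a modality-specific
    # channel (in the real method these are deep image/text networks).
    return [math.tanh(sum(xi * wij for xi, wij in zip(x, col))) for col in w]

def cosine(a, b):
    na = math.sqrt(sum(v * v for v in a)) or 1.0
    nb = math.sqrt(sum(v * v for v in b)) or 1.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def graph_fuse(feats):
    # One graph-convolution-style step: each node's feature becomes the
    # similarity-weighted average over all nodes, so image and text
    # features mix into a modality-shared representation.
    fused = []
    for f in feats:
        weights = [max(cosine(f, g), 0.0) for g in feats]
        total = sum(weights) or 1.0
        fused.append([
            sum(w * g[d] for w, g in zip(weights, feats)) / total
            for d in range(len(f))
        ])
    return fused

def to_hash(f):
    # Sign quantization yields the binary code used for retrieval.
    return [1 if v >= 0 else 0 for v in f]

# Hypothetical toy weights and inputs (2-dim features for readability).
img_w = [[0.5, -0.2], [0.1, 0.4]]
txt_w = [[0.3, 0.3], [-0.4, 0.2]]
img_feat = channel([1.0, -1.0], img_w)
txt_feat = channel([0.5, 0.8], txt_w)
shared = graph_fuse([img_feat, txt_feat])
codes = [to_hash(f) for f in shared]
```

After fusion, the image and text codes agree because each fused feature is a similarity-weighted blend of both modalities; matching codes across modalities is what makes cross-modal retrieval by Hamming distance possible.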
Fei WU
Nanjing University of Posts and Telecommunications
Shuaishuai LI
Nanjing University of Posts and Telecommunications
Guangchuan PENG
Nanjing University of Posts and Telecommunications
Yongheng MA
Nanjing University of Posts and Telecommunications
Xiao-Yuan JING
Wuhan University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Fei WU, Shuaishuai LI, Guangchuan PENG, Yongheng MA, Xiao-Yuan JING, "Modality-Fused Graph Network for Cross-Modal Retrieval" in IEICE TRANSACTIONS on Information,
vol. E106-D, no. 5, pp. 1094-1097, May 2023, doi: 10.1587/transinf.2022EDL8069.
Abstract: Cross-modal hashing technology has attracted much attention for its favorable retrieval performance and low storage cost. However, for existing cross-modal hashing methods, the heterogeneity of data across modalities is still a challenge and how to fully explore and utilize the intra-modality features has not been well studied. In this paper, we propose a novel cross-modal hashing approach called Modality-fused Graph Network (MFGN). The network architecture consists of a text channel and an image channel that are used to learn modality-specific features, and a modality fusion channel that uses the graph network to learn the modality-shared representations to reduce the heterogeneity across modalities. In addition, an integration module is introduced for the image and text channels to fully explore intra-modality features. Experiments on two widely used datasets show that our approach achieves better results than the state-of-the-art cross-modal hashing methods.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2022EDL8069/_p
@ARTICLE{e106-d_5_1094,
author={Fei WU and Shuaishuai LI and Guangchuan PENG and Yongheng MA and Xiao-Yuan JING},
journal={IEICE TRANSACTIONS on Information},
title={Modality-Fused Graph Network for Cross-Modal Retrieval},
year={2023},
volume={E106-D},
number={5},
pages={1094-1097},
abstract={Cross-modal hashing technology has attracted much attention for its favorable retrieval performance and low storage cost. However, for existing cross-modal hashing methods, the heterogeneity of data across modalities is still a challenge and how to fully explore and utilize the intra-modality features has not been well studied. In this paper, we propose a novel cross-modal hashing approach called Modality-fused Graph Network (MFGN). The network architecture consists of a text channel and an image channel that are used to learn modality-specific features, and a modality fusion channel that uses the graph network to learn the modality-shared representations to reduce the heterogeneity across modalities. In addition, an integration module is introduced for the image and text channels to fully explore intra-modality features. Experiments on two widely used datasets show that our approach achieves better results than the state-of-the-art cross-modal hashing methods.},
doi={10.1587/transinf.2022EDL8069},
ISSN={1745-1361},
month={May},}
TY - JOUR
TI - Modality-Fused Graph Network for Cross-Modal Retrieval
T2 - IEICE TRANSACTIONS on Information
SP - 1094
EP - 1097
AU - Fei WU
AU - Shuaishuai LI
AU - Guangchuan PENG
AU - Yongheng MA
AU - Xiao-Yuan JING
PY - 2023
DO - 10.1587/transinf.2022EDL8069
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E106-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2023
AB - Cross-modal hashing technology has attracted much attention for its favorable retrieval performance and low storage cost. However, for existing cross-modal hashing methods, the heterogeneity of data across modalities is still a challenge and how to fully explore and utilize the intra-modality features has not been well studied. In this paper, we propose a novel cross-modal hashing approach called Modality-fused Graph Network (MFGN). The network architecture consists of a text channel and an image channel that are used to learn modality-specific features, and a modality fusion channel that uses the graph network to learn the modality-shared representations to reduce the heterogeneity across modalities. In addition, an integration module is introduced for the image and text channels to fully explore intra-modality features. Experiments on two widely used datasets show that our approach achieves better results than the state-of-the-art cross-modal hashing methods.
ER -