Modeling Joint Representation with Tri-Modal Deep Belief Networks for Query and Question Matching

Nan JIANG; Wenge RONG; Baolin PENG; Yifan NIE; Zhang XIONG

doi:10.1587/transinf.2015DAP0009

IEICE TRANSACTIONS on Information

Summary:

One of the main research tasks in community question answering (cQA) is finding the most relevant questions for a given new query, thereby providing useful knowledge for users. The straightforward approach is to rely on textual features, such as a bag-of-words (BoW) representation, to match queries against questions. However, such approaches suffer from a lexical gap: if lexical matching fails, they cannot model semantic meaning. Latent semantic models, such as latent semantic analysis (LSA), attempt to map queries to their semantically similar questions through a lower-dimensional representation. But LSA is a shallow, linear model that cannot capture highly non-linear correlations in cQA. Moreover, both BoW and semantics-oriented solutions use a single dictionary to represent the query, question, and answer in the same feature space, whereas the correlations between them, as we observe from data, imply that they lie in entirely different feature spaces. In light of these observations, this paper proposes a tri-modal deep belief network (tri-DBN) to extract a unified representation for the query, question, and answer, under the hypothesis that they lie in three different feature spaces. We compare the unified representation extracted by our model with other representations using queries from the Yahoo! Answers dataset. Experimental results reveal that the proposed model captures semantic meaning both within and between queries, questions, and answers. The results also suggest that the joint representation extracted by the proposed method can improve the performance of searching cQA archives.
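The lexical gap mentioned in the summary can be illustrated with a minimal sketch. It is not the paper's method or data: the whitespace tokenizer, the helper names `bow` and `cosine`, and the three example strings are all assumptions made for illustration. It shows only that a plain BoW cosine score drops to zero for a paraphrase that shares no words with the query.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words counts from lowercase, whitespace-split tokens (a simplifying assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical examples, not taken from the Yahoo! Answers data.
query = "cheap laptop recommendations"
paraphrase = "which affordable notebook should i buy"  # same intent, no shared words
overlapping = "cheap laptop deals"                      # shares two of three words

sim_paraphrase = cosine(bow(query), bow(paraphrase))
sim_overlap = cosine(bow(query), bow(overlapping))
print(sim_paraphrase, sim_overlap)
```

The paraphrase scores zero although it asks the same thing, while the lexically overlapping question scores higher. This is the failure mode that learned semantic representations are meant to address.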

Publication: IEICE TRANSACTIONS on Information Vol.E99-D No.4 pp.927-935

Publication Date: 2016/04/01

Publicized: 2016/01/14

Online ISSN: 1745-1361

DOI: 10.1587/transinf.2015DAP0009

Type of Manuscript: Special Section PAPER (Special Section on Data Engineering and Information Management)

Authors

Nan JIANG
  Beihang University
Wenge RONG
  Beihang University
Baolin PENG
  The Chinese University of Hong Kong
Yifan NIE
  Université de Montréal
Zhang XIONG
  Beihang University

Keyword

cQA, deep belief networks, joint representation, tri-modal deep belief network

Cite this

Nan JIANG, Wenge RONG, Baolin PENG, Yifan NIE, Zhang XIONG, "Modeling Joint Representation with Tri-Modal Deep Belief Networks for Query and Question Matching" in IEICE TRANSACTIONS on Information, vol. E99-D, no. 4, pp. 927-935, April 2016, doi: 10.1587/transinf.2015DAP0009.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2015DAP0009/_p

@ARTICLE{e99-d_4_927,
author={Nan JIANG and Wenge RONG and Baolin PENG and Yifan NIE and Zhang XIONG},
journal={IEICE TRANSACTIONS on Information},
title={Modeling Joint Representation with Tri-Modal Deep Belief Networks for Query and Question Matching},
year={2016},
volume={E99-D},
number={4},
pages={927-935},
keywords={cQA, deep belief networks, joint representation, tri-modal deep belief network},
doi={10.1587/transinf.2015DAP0009},
ISSN={1745-1361},
month={April},}

TY - JOUR
TI - Modeling Joint Representation with Tri-Modal Deep Belief Networks for Query and Question Matching
T2 - IEICE TRANSACTIONS on Information
SP - 927
EP - 935
AU - Nan JIANG
AU - Wenge RONG
AU - Baolin PENG
AU - Yifan NIE
AU - Zhang XIONG
PY - 2016
DO - 10.1587/transinf.2015DAP0009
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E99-D
IS - 4
JA - IEICE TRANSACTIONS on Information
Y1 - April 2016
ER -
