A widely used query on a graph is the regular path query (RPQ), whose answer is a set of tuples of nodes connected by paths that match a given regular expression. Traditionally, evaluating an RPQ on a large graph takes substantial memory and a long response time. Recently, several studies have focused on improving the response time of RPQ evaluation by splitting the original RPQ into smaller subqueries, evaluating them in parallel, and combining the partial answers. In these works, the choice of split labels in an RPQ is a key factor in evaluation performance, and rare labels of a graph can be used as split labels. However, there is still room for improvement, because a rare label does not always guarantee the minimum evaluation cost. In this paper, we propose a novel approach that selects split labels by estimating the evaluation cost of each split subquery with a unit-subquery cost matrix (USCM), which can be obtained from a graph prior to evaluating an RPQ. The USCM represents the evaluation cost of each unit-subquery, the smallest possible subquery, and the evaluation cost of an RPQ can be estimated by decomposing it into a set of unit-subqueries. Experimental results show that the proposed approach outperforms rare-label-based approaches.
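The abstract does not spell out how the USCM is built or how costs are combined, so the following is only a toy sketch of the general idea, not the authors' algorithm. It assumes that the RPQ is a plain concatenation of labels, that a unit-subquery is a pair of adjacent labels, that a subquery's cost is the sum of its unit costs, and that a split's cost is the larger of its two halves because they run in parallel. The names `estimate_cost` and `choose_split` and all numbers are hypothetical.

```python
from typing import Dict, List, Tuple

Label = str
# Hypothetical USCM: precomputed cost of each adjacent-label unit-subquery.
USCM = Dict[Tuple[Label, Label], float]


def estimate_cost(labels: List[Label], uscm: USCM) -> float:
    """Estimate a subquery's cost as the sum of its unit-subquery costs."""
    return sum(uscm[(a, b)] for a, b in zip(labels, labels[1:]))


def choose_split(labels: List[Label], uscm: USCM) -> Tuple[int, float]:
    """Return (index, estimated cost) of the cheapest interior split label."""
    best_index, best_cost = -1, float("inf")
    for k in range(1, len(labels) - 1):
        left = estimate_cost(labels[: k + 1], uscm)   # ends at the split label
        right = estimate_cost(labels[k:], uscm)       # starts at the split label
        cost = max(left, right)  # assumption: both halves run in parallel
        if cost < best_cost:
            best_index, best_cost = k, cost
    return best_index, best_cost


# Made-up query, costs, and label frequencies, for illustration only.
query = ["a", "b", "c", "d"]
uscm = {("a", "b"): 50.0, ("b", "c"): 5.0, ("c", "d"): 40.0}
frequency = {"a": 1000, "b": 900, "c": 20, "d": 800}

split_index, split_cost = choose_split(query, uscm)
# Rare-label baseline: pick the interior label with the lowest frequency.
rare_index = min(range(1, len(query) - 1), key=lambda k: frequency[query[k]])

print("cost-based split label:", query[split_index], "estimated cost:", split_cost)
print("rare-label split label:", query[rare_index])
```

In this made-up instance the rarest interior label is `c`, yet splitting at `b` yields the lower estimated cost, which is the kind of case the abstract refers to when it says a rare label does not always minimize evaluation cost.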
Van-Quyet NGUYEN
Chonnam National University
Kyungbaek KIM
Chonnam National University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Van-Quyet NGUYEN, Kyungbaek KIM, "Efficient Regular Path Query Evaluation by Splitting with Unit-Subquery Cost Matrix" in IEICE TRANSACTIONS on Information, vol. E100-D, no. 10, pp. 2648-2652, October 2017, doi: 10.1587/transinf.2017EDL8060.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2017EDL8060/_p
@ARTICLE{e100-d_10_2648,
author={Van-Quyet NGUYEN and Kyungbaek KIM},
journal={IEICE TRANSACTIONS on Information},
title={Efficient Regular Path Query Evaluation by Splitting with Unit-Subquery Cost Matrix},
year={2017},
volume={E100-D},
number={10},
pages={2648--2652},
keywords={},
doi={10.1587/transinf.2017EDL8060},
ISSN={1745-1361},
month={October},
}
TY - JOUR
TI - Efficient Regular Path Query Evaluation by Splitting with Unit-Subquery Cost Matrix
T2 - IEICE TRANSACTIONS on Information
SP - 2648
EP - 2652
AU - Van-Quyet NGUYEN
AU - Kyungbaek KIM
PY - 2017
DO - 10.1587/transinf.2017EDL8060
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E100-D
IS - 10
JA - IEICE TRANSACTIONS on Information
Y1 - October 2017
ER -