In many applications, tables are distributed across different data sources, and each source is updated at a different frequency. Techniques have been proposed to express the temporal order between values, so that the most current (i.e., up-to-date) value of a given data item can be selected according to that order. However, data items in the same table may differ in currency. That is, when a user asks for a table D, there is no guarantee that the most current values of all data items in D are stored in a single table. Since different data sources may overlap, we can construct a conjunctive query over multiple tables to obtain all the required current values. In this paper, we formalize such a conjunctive query as a currency preserving query, and study how to generate a minimized currency preserving query to reduce the cost of visiting different data sources. First, we propose a graph model to represent the distributed tables and their relationships. Based on this model, we prove that a currency preserving query is equivalent to a terminal tree in the graph, and give an algorithm that generates a query from a terminal tree. We then study the problem of finding a minimized currency preserving query. We prove that the problem is NP-hard and provide several heuristic strategies to solve it. Finally, we conduct experiments on both synthetic and real data sets to verify the effectiveness and efficiency of the proposed techniques.
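The idea of connecting the tables that hold the current values can be illustrated with a small sketch. It assumes, as a simplification not taken from the paper, that tables are nodes of an undirected graph, that an edge means two tables overlap and can be joined, and that the "terminals" are the tables known to hold the most current values. The function names, the table names `T1` to `T5`, and the greedy nearest-terminal strategy (a generic heuristic for Steiner-tree-like problems) are illustrative choices, not the authors' algorithm or heuristics.

```python
from collections import deque


def shortest_path(graph, sources, target):
    """BFS from any node in `sources` to `target`; returns the node path or None."""
    parent = {s: None for s in sources}
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None


def approximate_terminal_tree(graph, terminals):
    """Greedily grow a connected set of tables that contains every terminal.

    Hypothetical helper: repeatedly attaches the terminal that is closest
    (in number of joins) to the tables already selected.
    """
    terminals = list(terminals)
    tree_nodes = {terminals[0]}
    tree_edges = set()
    remaining = set(terminals) - tree_nodes
    while remaining:
        best = None
        for terminal in sorted(remaining):
            path = shortest_path(graph, sorted(tree_nodes), terminal)
            if path is None:
                raise ValueError(f"table {terminal} cannot be joined to the others")
            if best is None or len(path) < len(best):
                best = path
        # Record each join along the chosen path as an undirected edge.
        for a, b in zip(best, best[1:]):
            tree_edges.add(frozenset((a, b)))
        tree_nodes.update(best)
        remaining -= tree_nodes
    return tree_nodes, tree_edges


# Assumed toy instance: an edge means the two tables share joinable data items.
example_graph = {
    "T1": ["T2"],
    "T2": ["T1", "T3", "T5"],
    "T3": ["T2", "T4"],
    "T4": ["T3"],
    "T5": ["T2"],
}

# Suppose the most current values live in T1 and T4 only.
nodes, edges = approximate_terminal_tree(example_graph, ["T1", "T4"])
```

In this toy instance the selected tables are the ones a query would have to visit, and `T5` is left out because it lies on no path between the terminals. The greedy choice gives no optimality guarantee, which is consistent with the minimization problem being NP-hard.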
Mohan LI
Guangzhou University
Yanbin SUN
Guangzhou University
Mohan LI, Yanbin SUN, "Currency Preserving Query: Selecting the Newest Values from Multiple Tables" in IEICE TRANSACTIONS on Information, vol. E101-D, no. 12, pp. 3059-3072, December 2018, doi: 10.1587/transinf.2018EDP7058.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2018EDP7058/_p
@ARTICLE{e101-d_12_3059,
author={Mohan LI and Yanbin SUN},
journal={IEICE TRANSACTIONS on Information},
title={Currency Preserving Query: Selecting the Newest Values from Multiple Tables},
year={2018},
volume={E101-D},
number={12},
pages={3059-3072},
keywords={},
doi={10.1587/transinf.2018EDP7058},
ISSN={1745-1361},
month={December},
}
TY - JOUR
TI - Currency Preserving Query: Selecting the Newest Values from Multiple Tables
T2 - IEICE TRANSACTIONS on Information
SP - 3059
EP - 3072
AU - Mohan LI
AU - Yanbin SUN
PY - 2018
DO - 10.1587/transinf.2018EDP7058
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E101-D
IS - 12
JA - IEICE TRANSACTIONS on Information
Y1 - December 2018
ER -