1-4hit |
Song-Hyon KIM Inchul SONG Kyong-Ha LEE Yoon-Joon LEE
Subgraph matching is a fundamental operation for querying graph-structured data. Due to potential errors and noises in real-world graph data, exact subgraph matching is sometimes inappropriate in practice. In this paper we consider an approximate subgraph matching model that allows missing edges. Based on this model, approximate subgraph matching finds all occurrences of a given query graph in a database graph, allowing missing edges. A straightforward approach is to first generate query subgraphs of a given query graph by deleting edges and then perform exact subgraph matching for each query subgraph. In this paper we propose a sharing-based approach to approximate subgraph matching, called SASUM. Our method is based on the fact that query subgraphs are highly overlapped. Due to this overlapping nature of query subgraphs, the matches of a query subgraph can be computed from the matches of a smaller query subgraph, which results in reducing the number of query subgraphs that require expensive exact subgraph matching. Our method uses a lattice framework to identify sharing opportunities between query subgraphs. To further reduce the number of graphs that need exact subgraph matching, SASUM generates small base graphs that are shared by query subgraphs and chooses the minimum number of base graphs whose matches are used to derive the matching results of all query subgraphs. A comprehensive set of experiments shows that our approach outperforms the state-of-the-art approach by orders of magnitude in terms of query execution time.
Song-Hyon KIM Kyong-Ha LEE Inchul SONG Hyebong CHOI Yoon-Joon LEE
We address the problem of processing graph pattern matching queries over a massive set of data graphs in this letter. As the number of data graphs is growing rapidly, it is often hard to process such queries with serial algorithms in a timely manner. We propose a distributed graph querying algorithm, which employs feature-based comparison and a filter-and-verify scheme working on the MapReduce framework. Moreover, we devise an efficient scheme that adaptively tunes a proper feature size at runtime by sampling data graphs. With various experiments, we show that the proposed method outperforms conventional algorithms in terms of scalability and efficiency.
Junsu KIM Kyong-Ha LEE Myoung-Ho KIM
With rapid increase of the number of applications as well as the sizes of data, multi-query processing on the MapReduce framework has gained much attention. Meanwhile, there have been much interest in skyline query processing due to its power of multi-criteria decision making and analysis. Recently, there have been attempts to optimize multi-query processing in MapReduce. However, they are not appropriate to process multiple skyline queries efficiently and they also require modifications of the Hadoop internals. In this paper, we propose an efficient method for processing multi-skyline queries with MapReduce without any modification of the Hadoop internals. Through various experiments, we show that our approach outperforms previous studies by orders of magnitude.
Young-Kyoon SUH Seounghyeon KIM Joo-Young LEE Hawon CHU Junyoung AN Kyong-Ha LEE
In this letter we analyze the economic worth of GPU on analytical processing of GPU-accelerated database management systems (DBMSes). To this end, we conducted rigorous experiments with TPC-H across three popular GPU DBMSes. Consequently, we show that co-processing with CPU and GPU in the GPU DBMSes was cost-effective despite exposed concerns.