Two-Phased Bulk Insertion by Seeded Clustering for R-Trees

Taewon LEE; Sukho LEE

doi:10.1093/ietisy/e89-d.1.228

Summary:

With great advances in mobile technology and wireless communications, users expect to be online anytime, anywhere. However, because of the high cost of staying online, applications are still implemented as only partially connected to the server. In many data-intensive mobile client/server frameworks, it is a daunting task to archive and index the massive volume of complex data that is continuously added to the server whenever a mobile client comes online. In this paper, we propose a scalable technique called Seeded Clustering that allows us to maintain R-tree indexes by bulk insertion while keeping pace with high data arrival rates. Our approach uses a seed tree, copied from the top k levels of the target R-tree, to classify input data objects into clusters. We then build an R-tree for each cluster and insert these input R-trees into the target R-tree in bulk, one at a time. We present detailed algorithms for seeded clustering and bulk insertion, as well as the results of our extensive experimental study. The experimental results show that bulk insertion by seeded clustering outperforms previously known methods in terms of both insertion cost and the quality of the target R-trees as measured by query performance.
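
The clustering phase described in the summary can be illustrated with a minimal Python sketch. This is not the authors' algorithm, whose details are given only in the full paper. It assumes 2-D rectangles stored as `(xmin, ymin, xmax, ymax)`, a seed tree given as nested `SeedNode` objects (a hypothetical type standing in for the copied top k levels), and the classic least-area-enlargement rule for choosing a child, which the summary does not confirm.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def area(r: Rect) -> float:
    """Area of a rectangle (0 for degenerate rectangles)."""
    return max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])


def union(a: Rect, b: Rect) -> Rect:
    """Minimum bounding rectangle covering both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def enlargement(mbr: Rect, r: Rect) -> float:
    """Extra area mbr would need in order to cover r."""
    return area(union(mbr, r)) - area(mbr)


@dataclass
class SeedNode:
    """Hypothetical seed-tree node: a copy of one node from the top k levels."""
    mbr: Rect
    children: List["SeedNode"] = field(default_factory=list)
    cluster: List[Rect] = field(default_factory=list)  # used only at seed leaves


def classify(seed_root: SeedNode, obj: Rect) -> SeedNode:
    """Route obj down the seed tree and store it in a seed-leaf cluster.

    Assumption: at each level, pick the child needing the least area
    enlargement, breaking ties by the smaller MBR area.
    """
    node = seed_root
    while node.children:
        node = min(node.children,
                   key=lambda c: (enlargement(c.mbr, obj), area(c.mbr)))
    node.cluster.append(obj)
    return node


def seeded_clusters(seed_root: SeedNode, objects: List[Rect]) -> List[List[Rect]]:
    """Classify every input object, then collect the non-empty clusters."""
    for obj in objects:
        classify(seed_root, obj)
    clusters: List[List[Rect]] = []
    stack = [seed_root]
    while stack:
        node = stack.pop()
        if node.children:
            stack.extend(node.children)
        elif node.cluster:
            clusters.append(node.cluster)
    return clusters


if __name__ == "__main__":
    left = SeedNode((0, 0, 5, 10))
    right = SeedNode((5, 0, 10, 10))
    root = SeedNode((0, 0, 10, 10), [left, right])
    for cluster in seeded_clusters(root, [(1, 1, 2, 2), (6, 6, 7, 7), (0.5, 3, 1, 4)]):
        print(cluster)
```

In the second phase described in the summary, each cluster would be packed into its own small R-tree and that tree inserted into the target R-tree in one step. The sketch stops before that phase because the summary does not say how the input R-trees are built or attached.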

Publication: IEICE TRANSACTIONS on Information Vol.E89-D No.1 pp.228-236

Publication Date: 2006/01/01

Online ISSN: 1745-1361

DOI: 10.1093/ietisy/e89-d.1.228

Type of Manuscript: PAPER

Category: Database

Cite this

Taewon LEE, Sukho LEE, "Two-Phased Bulk Insertion by Seeded Clustering for R-Trees" in IEICE TRANSACTIONS on Information, vol. E89-D, no. 1, pp. 228-236, January 2006, doi: 10.1093/ietisy/e89-d.1.228.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e89-d.1.228/_p

@ARTICLE{e89-d_1_228,
author={Taewon LEE and Sukho LEE},
journal={IEICE TRANSACTIONS on Information},
title={Two-Phased Bulk Insertion by Seeded Clustering for R-Trees},
year={2006},
volume={E89-D},
number={1},
pages={228-236},
abstract={With great advances in the mobile technology and wireless communications, users expect to be online anytime anywhere. However, due to the high cost of being online, applications are still implemented as partially connected to the server. In many data-intensive mobile client/server frameworks, it is a daunting task to archive and index such a mass volume of complex data that are continuously added to the server when each mobile client gets online. In this paper, we propose a scalable technique called Seeded Clustering that allows us to maintain R-tree indexes by bulk insertion while keeping pace with high data arrival rates. Our approach uses a seed tree, which is copied from the top k levels of a target R-tree, to classify input data objects into clusters. We then build an R-tree for each of the clusters and insert the input R-trees into the target R-tree in bulk one at a time. We present detailed algorithms for the seeded clustering and bulk insertion as well as the results from our extensive experimental study. The experimental results show that the bulk insertion by seeded clustering outperforms the previously known methods in terms of insertion cost and the quality of target R-trees measured by their query performance.},
keywords={},
doi={10.1093/ietisy/e89-d.1.228},
ISSN={1745-1361},
month={January},
}

TY - JOUR
TI - Two-Phased Bulk Insertion by Seeded Clustering for R-Trees
T2 - IEICE TRANSACTIONS on Information
SP - 228
EP - 236
AU - Taewon LEE
AU - Sukho LEE
PY - 2006
DO - 10.1093/ietisy/e89-d.1.228
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E89-D
IS - 1
JA - IEICE TRANSACTIONS on Information
Y1 - January 2006
AB - With great advances in the mobile technology and wireless communications, users expect to be online anytime anywhere. However, due to the high cost of being online, applications are still implemented as partially connected to the server. In many data-intensive mobile client/server frameworks, it is a daunting task to archive and index such a mass volume of complex data that are continuously added to the server when each mobile client gets online. In this paper, we propose a scalable technique called Seeded Clustering that allows us to maintain R-tree indexes by bulk insertion while keeping pace with high data arrival rates. Our approach uses a seed tree, which is copied from the top k levels of a target R-tree, to classify input data objects into clusters. We then build an R-tree for each of the clusters and insert the input R-trees into the target R-tree in bulk one at a time. We present detailed algorithms for the seeded clustering and bulk insertion as well as the results from our extensive experimental study. The experimental results show that the bulk insertion by seeded clustering outperforms the previously known methods in terms of insertion cost and the quality of target R-trees measured by their query performance.
ER -