R-trees have traditionally been optimized for I/O performance, with disk pages as tree nodes. Recently, researchers have proposed cache-conscious variations of R-trees optimized for CPU cache performance in main memory environments, where the node size is several cache lines wide and more entries are packed into a node by compressing MBR keys. However, because the node sizes of the two types of R-trees differ greatly, disk-optimized R-trees show poor cache performance while cache-optimized R-trees exhibit poor disk performance. In this paper, we propose a cache- and disk-optimized R-tree, called the PR-tree (Prefetching R-tree). For cache performance, the node size of the PR-tree is wider than a cache line, and the prefetch instruction is used to reduce the number of cache misses. For I/O performance, the nodes of the PR-tree are fitted into one disk page. We present a detailed analysis of cache misses for range queries, and enumerate all reasonable in-page leaf and nonleaf node sizes and heights of in-page trees to determine the tree parameters that give the best cache and I/O performance. The PR-tree achieves better cache performance than the disk-optimized R-tree: an improvement by a factor of 3.5-15.1 for one-by-one insertions, 6.5-15.1 for deletions, 1.3-1.9 for range queries, and 2.7-9.7 for k-nearest neighbor queries. None of the experimental results shows a notable decline in I/O performance.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Myungsun PARK, Sukho LEE, "A Cache Optimized Multidimensional Index in Disk-Based Environments" in IEICE TRANSACTIONS on Information,
vol. E88-D, no. 8, pp. 1932-1939, August 2005, doi: 10.1093/ietisy/e88-d.8.1932.
Abstract: R-trees have traditionally been optimized for I/O performance, with disk pages as tree nodes. Recently, researchers have proposed cache-conscious variations of R-trees optimized for CPU cache performance in main memory environments, where the node size is several cache lines wide and more entries are packed into a node by compressing MBR keys. However, because the node sizes of the two types of R-trees differ greatly, disk-optimized R-trees show poor cache performance while cache-optimized R-trees exhibit poor disk performance. In this paper, we propose a cache- and disk-optimized R-tree, called the PR-tree (Prefetching R-tree). For cache performance, the node size of the PR-tree is wider than a cache line, and the prefetch instruction is used to reduce the number of cache misses. For I/O performance, the nodes of the PR-tree are fitted into one disk page. We present a detailed analysis of cache misses for range queries, and enumerate all reasonable in-page leaf and nonleaf node sizes and heights of in-page trees to determine the tree parameters that give the best cache and I/O performance. The PR-tree achieves better cache performance than the disk-optimized R-tree: an improvement by a factor of 3.5-15.1 for one-by-one insertions, 6.5-15.1 for deletions, 1.3-1.9 for range queries, and 2.7-9.7 for k-nearest neighbor queries. None of the experimental results shows a notable decline in I/O performance.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e88-d.8.1932/_p
@ARTICLE{e88-d_8_1932,
author={Myungsun PARK and Sukho LEE},
journal={IEICE TRANSACTIONS on Information},
title={A Cache Optimized Multidimensional Index in Disk-Based Environments},
year={2005},
volume={E88-D},
number={8},
pages={1932-1939},
abstract={R-trees have traditionally been optimized for I/O performance, with disk pages as tree nodes. Recently, researchers have proposed cache-conscious variations of R-trees optimized for CPU cache performance in main memory environments, where the node size is several cache lines wide and more entries are packed into a node by compressing MBR keys. However, because the node sizes of the two types of R-trees differ greatly, disk-optimized R-trees show poor cache performance while cache-optimized R-trees exhibit poor disk performance. In this paper, we propose a cache- and disk-optimized R-tree, called the PR-tree (Prefetching R-tree). For cache performance, the node size of the PR-tree is wider than a cache line, and the prefetch instruction is used to reduce the number of cache misses. For I/O performance, the nodes of the PR-tree are fitted into one disk page. We present a detailed analysis of cache misses for range queries, and enumerate all reasonable in-page leaf and nonleaf node sizes and heights of in-page trees to determine the tree parameters that give the best cache and I/O performance. The PR-tree achieves better cache performance than the disk-optimized R-tree: an improvement by a factor of 3.5-15.1 for one-by-one insertions, 6.5-15.1 for deletions, 1.3-1.9 for range queries, and 2.7-9.7 for k-nearest neighbor queries. None of the experimental results shows a notable decline in I/O performance.},
keywords={},
doi={10.1093/ietisy/e88-d.8.1932},
ISSN={},
month={August},}
TY - JOUR
TI - A Cache Optimized Multidimensional Index in Disk-Based Environments
T2 - IEICE TRANSACTIONS on Information
SP - 1932
EP - 1939
AU - Myungsun PARK
AU - Sukho LEE
PY - 2005
DO - 10.1093/ietisy/e88-d.8.1932
JO - IEICE TRANSACTIONS on Information
SN -
VL - E88-D
IS - 8
JA - IEICE TRANSACTIONS on Information
Y1 - August 2005
AB - R-trees have traditionally been optimized for I/O performance, with disk pages as tree nodes. Recently, researchers have proposed cache-conscious variations of R-trees optimized for CPU cache performance in main memory environments, where the node size is several cache lines wide and more entries are packed into a node by compressing MBR keys. However, because the node sizes of the two types of R-trees differ greatly, disk-optimized R-trees show poor cache performance while cache-optimized R-trees exhibit poor disk performance. In this paper, we propose a cache- and disk-optimized R-tree, called the PR-tree (Prefetching R-tree). For cache performance, the node size of the PR-tree is wider than a cache line, and the prefetch instruction is used to reduce the number of cache misses. For I/O performance, the nodes of the PR-tree are fitted into one disk page. We present a detailed analysis of cache misses for range queries, and enumerate all reasonable in-page leaf and nonleaf node sizes and heights of in-page trees to determine the tree parameters that give the best cache and I/O performance. The PR-tree achieves better cache performance than the disk-optimized R-tree: an improvement by a factor of 3.5-15.1 for one-by-one insertions, 6.5-15.1 for deletions, 1.3-1.9 for range queries, and 2.7-9.7 for k-nearest neighbor queries. None of the experimental results shows a notable decline in I/O performance.
ER -