When developing an index for a similarity search in metric spaces, how to divide the space for effective search pruning is a fundamental issue. We present Maximal Metric Margin Partitioning (MMMP), a partitioning scheme for similarity search indexes. MMMP divides the data based on its distribution pattern, especially for the boundaries of clusters. A partitioning boundary created by MMMP is likely to be located in a sparse area between clusters. Moreover, the partitioning boundary is at maximum distances from the two cluster edges. We also present an indexing scheme, named the MMMP-Index, which uses MMMP and pivot filtering. The MMMP-Index can prune many objects that are not relevant to a query, and it reduces the query execution cost. Our experimental results show that MMMP effectively indexes clustered data and reduces the search cost. For clustered data in a vector space, the MMMP-Index reduces the computational cost to less than two thirds that of comparable schemes.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Hisashi KURASAWA, Daiji FUKAGAWA, Atsuhiro TAKASU, Jun ADACHI, "Margin-Based Pivot Selection for Similarity Search Indexes" in IEICE TRANSACTIONS on Information,
vol. E93-D, no. 6, pp. 1422-1432, June 2010, doi: 10.1587/transinf.E93.D.1422.
Abstract: When developing an index for a similarity search in metric spaces, how to divide the space for effective search pruning is a fundamental issue. We present Maximal Metric Margin Partitioning (MMMP), a partitioning scheme for similarity search indexes. MMMP divides the data based on its distribution pattern, especially for the boundaries of clusters. A partitioning boundary created by MMMP is likely to be located in a sparse area between clusters. Moreover, the partitioning boundary is at maximum distances from the two cluster edges. We also present an indexing scheme, named the MMMP-Index, which uses MMMP and pivot filtering. The MMMP-Index can prune many objects that are not relevant to a query, and it reduces the query execution cost. Our experimental results show that MMMP effectively indexes clustered data and reduces the search cost. For clustered data in a vector space, the MMMP-Index reduces the computational cost to less than two thirds that of comparable schemes.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E93.D.1422/_p
Copy
@ARTICLE{e93-d_6_1422,
author={Hisashi KURASAWA, Daiji FUKAGAWA, Atsuhiro TAKASU, Jun ADACHI, },
journal={IEICE TRANSACTIONS on Information},
title={Margin-Based Pivot Selection for Similarity Search Indexes},
year={2010},
volume={E93-D},
number={6},
pages={1422-1432},
abstract={When developing an index for a similarity search in metric spaces, how to divide the space for effective search pruning is a fundamental issue. We present Maximal Metric Margin Partitioning (MMMP), a partitioning scheme for similarity search indexes. MMMP divides the data based on its distribution pattern, especially for the boundaries of clusters. A partitioning boundary created by MMMP is likely to be located in a sparse area between clusters. Moreover, the partitioning boundary is at maximum distances from the two cluster edges. We also present an indexing scheme, named the MMMP-Index, which uses MMMP and pivot filtering. The MMMP-Index can prune many objects that are not relevant to a query, and it reduces the query execution cost. Our experimental results show that MMMP effectively indexes clustered data and reduces the search cost. For clustered data in a vector space, the MMMP-Index reduces the computational cost to less than two thirds that of comparable schemes.},
keywords={},
doi={10.1587/transinf.E93.D.1422},
ISSN={1745-1361},
month={June},}
Copy
TY - JOUR
TI - Margin-Based Pivot Selection for Similarity Search Indexes
T2 - IEICE TRANSACTIONS on Information
SP - 1422
EP - 1432
AU - Hisashi KURASAWA
AU - Daiji FUKAGAWA
AU - Atsuhiro TAKASU
AU - Jun ADACHI
PY - 2010
DO - 10.1587/transinf.E93.D.1422
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E93-D
IS - 6
JA - IEICE TRANSACTIONS on Information
Y1 - June 2010
AB - When developing an index for a similarity search in metric spaces, how to divide the space for effective search pruning is a fundamental issue. We present Maximal Metric Margin Partitioning (MMMP), a partitioning scheme for similarity search indexes. MMMP divides the data based on its distribution pattern, especially for the boundaries of clusters. A partitioning boundary created by MMMP is likely to be located in a sparse area between clusters. Moreover, the partitioning boundary is at maximum distances from the two cluster edges. We also present an indexing scheme, named the MMMP-Index, which uses MMMP and pivot filtering. The MMMP-Index can prune many objects that are not relevant to a query, and it reduces the query execution cost. Our experimental results show that MMMP effectively indexes clustered data and reduces the search cost. For clustered data in a vector space, the MMMP-Index reduces the computational cost to less than two thirds that of comparable schemes.
ER -