1-1hit |
Masahiro ISHIKAWA Kazutaka FURUSE Hanxiong CHEN Nobuo OHBO
Clustering is one of the most important topics in the field of knowledge discovery from databases. Especially, hierarchical clustering is useful since it gives a hierarchical view of a whole database and can be used to guide users in browsing a huge database. In many cases, clustering can be modeled as a graph partitioning problem. When an appropriate distance function between database objects is given, a database can be viewed as an edge-weighted complete graph, where vertices are database objects and weights of edges are distances between them. Then a process of MST (Minimal Spanning Tree) construction can be viewed as a process of a single-linkage agglomerative clustering process for database objects. In this paper, we propose an efficient MST construction method for a large complete metric graph, which is derived from a database with a metric distance function defined on it. Our method utilizes a metric index to reduce the number of distance calculations. The basic idea is to exclude those edges less probable to be a part of an MST by using the metric postulate. For this purpose, we introduce a new metric index named MetricMatrix. Experimental results show that our method can drastically reduce the number of distance calculations needed for MST construction in comparison with the classical method.