IEICE global.ieice.org Site

Keyword Search Result

[Keyword] database processing(2hit)


Examination of Criterion for Choosing a Run Time Method in GN Hash Join Algorithm
Miyuki NAKANO Masaru KITSUREGAWA

PAPER-Databases

Vol: E79-D No: 11
Page(s): 1561-1569
The join operation is one of the most expensive operations in relational database systems, and many researchers have proposed hash-based algorithms for it. In a hash-based algorithm, a large relation is first partitioned into several clusters. When clusters overflow, that is, when the size of a cluster exceeds the size of main memory, the performance of hash-based algorithms degrades substantially. We previously proposed the GN hash join algorithm, which is robust in the presence of overflowing clusters. The GN hash join algorithm combines the Grace hash join and hash-based nested-loop join algorithms. We analyze the performance of the GN hash join algorithm when applied to relations with a non-uniform, Zipf-like data distribution. The performance is compared with other hash-based join algorithms: Grace, Hybrid, nested-loop, and simple hash join. The GN hash join algorithm is found to have higher performance on non-uniformly distributed relations. In this paper, we verify the robustness of the GN hash algorithm with respect to the choice of a run time method. In the GN hash algorithm, the criterion for selecting a run time method from the two algorithms is determined by a value calculated from the I/O cost formulas of the two algorithms. This criterion cannot be guaranteed to be optimal under every data distribution; that is, the optimal criterion may change depending on the data distribution. When the data distribution is unknown, all data has to be repartitioned in order to obtain an accurate optimal criterion. However, because the method is chosen at run time, the GN hash algorithm needs to determine an appropriate criterion regardless of the data distribution. Thus, we examine the criterion adopted in our algorithm in a simulation environment.
From the simulation results, we find that the range of the criterion is very wide under any data distribution, and we confirm that the criterion determined under the assumption of a uniform data distribution can be used even when the data is highly skewed. Consequently, we conclude that the GN hash algorithm, which dynamically selects between the nested-loop and Grace hash algorithms, provides good performance in the presence of data skew and that its performance is not sensitive to the criterion.
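The idea of choosing a run time method by comparing I/O costs can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the cost formulas, so `nested_loop_cost` and `repartition_cost` below are simplified, hypothetical page-count models, and all function names are assumptions made for illustration.

```python
from collections import Counter


def partition_sizes(keys, num_clusters):
    """Count how many tuples a simple modulo hash sends to each cluster."""
    counts = Counter(key % num_clusters for key in keys)
    return [counts.get(i, 0) for i in range(num_clusters)]


def nested_loop_cost(r_pages, s_pages, memory_pages):
    """Hypothetical I/O cost (in pages) of a hash-based nested-loop join.

    Assumption: the overflowing cluster of R is read once in memory-sized
    chunks, and the matching cluster of S is scanned once per chunk.
    """
    chunks = -(-r_pages // memory_pages)  # ceiling division
    return r_pages + chunks * s_pages


def repartition_cost(r_pages, s_pages):
    """Hypothetical I/O cost (in pages) of a Grace-style repartitioning step.

    Assumption: both clusters are read and written once to repartition,
    then read once more to join, giving three passes over the data.
    """
    return 3 * (r_pages + s_pages)


def choose_method(r_pages, s_pages, memory_pages):
    """Pick a method for one cluster pair by comparing the modelled costs."""
    if r_pages <= memory_pages:
        return "in-memory"  # no overflow, so no choice is needed
    nl = nested_loop_cost(r_pages, s_pages, memory_pages)
    gr = repartition_cost(r_pages, s_pages)
    return "nested-loop" if nl <= gr else "grace"


if __name__ == "__main__":
    # A skewed key set: most keys hash to cluster 0.
    print(partition_sizes([0, 1, 2, 3, 4, 4, 4, 4], 4))
    # A mildly overflowing cluster versus a heavily overflowing one.
    print(choose_method(150, 200, 100))
    print(choose_method(1000, 1000, 100))
```

Under these assumed formulas, a cluster that only slightly exceeds memory favors the nested-loop method, while a cluster many times larger than memory favors repartitioning. The crossover point between the two plays the role of the criterion discussed in the abstract.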
A Study on Customer Traffic Data Management Method
Kazuhiko OHKUBO Hiroshi ARIMICHI

LETTER-Communication Networks and Service

Vol: E78-B No: 9
Page(s): 1322-1325
In this paper, we analyze the traffic data management requirements of the customers, describe the functions of the traffic database needed to satisfy their requirements, and propose a highly distributed database system which can efficiently implement these functions. Finally, we report the results of system performance evaluations.
