IEICE global.ieice.org Site

Author Search Result

[Author] Yoichirou SATO(2hit)

Complexity and a Method of Extracting a Database Schema over Semistructured Documents
Nobutaka SUZUKI, Yoichirou SATO, Michiyoshi HAYASE

PAPER-Databases

Vol: E85-D No: 6
Page(s): 940-949

Semistructured data has an irregular structure and no a priori database schema, which leads to problems such as inefficient data retrieval and wasteful data storage. To cope with such problems, several schema extraction algorithms for semistructured data have been proposed, in which data is modeled as an unordered tree. However, the order of elements is indispensable for document data, so we consider extracting an optimal database schema over an ordered tree. We consider the optimization problem of extracting a smallest database schema such that the density of each class is no less than a given threshold, where the density of a class represents the similarity between the type of the class and the types of the objects in the class. We first prove that the corresponding decision problem is strongly NP-complete, and show that another version of the problem is strongly NP-hard and belongs to $\Delta_2^P$. Then we show that, for any r < 3/2, there is no polynomial-time r-approximation algorithm for the optimization problem unless P = NP. Finally, we propose a kind of class, called a bounded class, that can be constructed efficiently, and give a polynomial-time algorithm for constructing a database schema from bounded classes.

Extracting Typical Classes and a Database Schema from Semistructured Data
Nobutaka SUZUKI, Yoichirou SATO, Michiyoshi HAYASE

PAPER-Databases

Vol: E84-D No: 1
Page(s): 100-112

Semistructured data has no a-priori schema information, which causes some problems such as inefficient storage and query execution. To cope with such problems, extracting schema information from semistructured data has been an important issue. However, in most cases optimal schema information cannot be extracted efficiently, and few efficient approximation algorithms have been proposed. In this paper, we consider an approximation algorithm for extracting "typical" classes from semistructured data. Intuitively, a class C is said to be typical if the structure of C is "similar" to those of "many" objects. We present the following results. First, we prove that the problem of deciding if a typical class can be extracted from given semistructured data is NP-complete. Second, we present an approximation algorithm for extracting typical classes from given semistructured data, and show a sufficient condition for the approximation algorithm to run in polynomial time. Finally, by using extracted classes obtained by the approximation algorithm, we propose a polynomial-time algorithm for constructing a set R of classes such that R covers all the objects to form a database schema.
