Full-Text and Structural Indexing of XML Documents on B⁺-Tree

Toshiyuki SHIMIZU; Masatoshi YOSHIKAWA

doi:10.1093/ietisy/e89-d.1.237

IEICE TRANSACTIONS on Information

Summary:

XML query processing is one of the most active areas of database research. Although past research has focused mainly on processing structural XML queries, demand for full-text search over XML documents is growing. In this paper, we propose XICS (XML Indices for Content and Structural search), which aims at high-speed processing of both full-text and structural queries on XML documents. An important design principle of our indices is the use of a B⁺-tree. To represent the structural information of XML trees, each node in the XML tree is labeled with an identifier. The identifier contains an integer representing the path information from the root node. XICS consists of two types of indices, the COB-tree (COntent B⁺-tree) and the STB-tree (STructure B⁺-tree). Each search key of the COB-tree is a pair consisting of a text fragment in the XML document and the identifier of the leaf node that contains the text, whereas the search keys of the STB-tree are the node identifiers. Because the search keys include a node identifier, we can retrieve only the entries that match the path information in the query. The STB-tree can filter nodes using structural conditions in queries, while the COB-tree can filter nodes using text conditions. We have implemented a COB-tree and an STB-tree using GiST and examined index size and query processing time. Our experimental results show the efficiency of XICS in query processing.
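The key layout described in the summary can be illustrated with a small sketch. This is not the authors' implementation: sorted Python lists stand in for the B⁺-trees, the functions `build_indices`, `range_scan`, `content_search` and `structure_search` are hypothetical names, and the identifier is assumed to be a pair (path number, preorder node number), which the summary does not specify. Text is taken only from each element's leading text, split on whitespace.

```python
from bisect import bisect_left, bisect_right
import xml.etree.ElementTree as ET


def build_indices(xml_text):
    """Build toy COB/STB key lists (sorted lists stand in for B+-trees)."""
    root = ET.fromstring(xml_text)
    path_ids = {}  # root-to-node path -> integer (assumed path numbering)
    cob = []       # keys: (term, path_id, node_no)
    stb = []       # keys: (path_id, node_no)
    counter = 0

    def visit(elem, parent_path):
        nonlocal counter
        path = parent_path + "/" + elem.tag
        pid = path_ids.setdefault(path, len(path_ids))
        node_no = counter  # preorder number, assumed part of the identifier
        counter += 1
        stb.append((pid, node_no))
        # Simplification: only the element's leading text is indexed.
        for term in (elem.text or "").lower().split():
            cob.append((term, pid, node_no))
        for child in elem:
            visit(child, path)

    visit(root, "")
    cob.sort()
    stb.sort()
    return path_ids, cob, stb


def range_scan(sorted_keys, prefix):
    """Return all keys starting with `prefix`, as a B+-tree range scan would."""
    lo = bisect_left(sorted_keys, prefix)
    hi = bisect_right(sorted_keys, prefix + (float("inf"),))
    return sorted_keys[lo:hi]


def content_search(cob, path_ids, term, path=None):
    """Node numbers whose text contains `term`, optionally limited to `path`."""
    prefix = (term,) if path is None else (term, path_ids[path])
    return [key[-1] for key in range_scan(cob, prefix)]


def structure_search(stb, path_ids, path):
    """Node numbers of all nodes reached by the given root-to-node path."""
    return [key[-1] for key in range_scan(stb, (path_ids[path],))]


doc = ("<doc><sec><title>XML index</title><p>B tree index</p></sec>"
       "<sec><p>XML query</p></sec></doc>")
path_ids, cob, stb = build_indices(doc)
print(content_search(cob, path_ids, "xml"))                # [2, 5]
print(content_search(cob, path_ids, "xml", "/doc/sec/p"))  # [5]
print(structure_search(stb, path_ids, "/doc/sec"))         # [1, 4]
```

Because the path number follows the term in each content key, a query that fixes both the term and the path reads one contiguous key range, which is the effect the summary attributes to including the node identifier in the search key.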

Publication: IEICE TRANSACTIONS on Information Vol.E89-D No.1 pp.237-247

Publication Date: 2006/01/01

Online ISSN: 1745-1361

DOI: 10.1093/ietisy/e89-d.1.237

Type of Manuscript: PAPER

Category: Contents Technology and Web Information Systems

Toshiyuki SHIMIZU, Masatoshi YOSHIKAWA, "Full-Text and Structural Indexing of XML Documents on B+-Tree" in IEICE TRANSACTIONS on Information, vol. E89-D, no. 1, pp. 237-247, January 2006, doi: 10.1093/ietisy/e89-d.1.237.
URL: https://global.ieice.org/en_transactions/information/10.1093/ietisy/e89-d.1.237/_p

@ARTICLE{e89-d_1_237,
author={Toshiyuki SHIMIZU and Masatoshi YOSHIKAWA},
journal={IEICE TRANSACTIONS on Information},
title={Full-Text and Structural Indexing of XML Documents on B+-Tree},
year={2006},
volume={E89-D},
number={1},
pages={237-247},
keywords={},
doi={10.1093/ietisy/e89-d.1.237},
ISSN={1745-1361},
month={January}
}

TY - JOUR
TI - Full-Text and Structural Indexing of XML Documents on B+-Tree
T2 - IEICE TRANSACTIONS on Information
SP - 237
EP - 247
AU - Toshiyuki SHIMIZU
AU - Masatoshi YOSHIKAWA
PY - 2006
DO - 10.1093/ietisy/e89-d.1.237
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E89-D
IS - 1
JA - IEICE TRANSACTIONS on Information
Y1 - January 2006
ER -
