
Tag-Annotated Text Search Using Extended Region Algebra

Katsuya MASUDA, Jun'ichi TSUJII

Summary:

This paper presents algorithms that use Region Algebra to search tag-annotated text for regions matching specified annotation information. The original algebra and its efficient algorithms are extended to handle both nested and crossed regions; these extensions are necessary for text search over rich linguistic annotations. We first assign a depth number to every nested tag region so that these regions can be ordered, and use the depth number to write efficient algorithms for containment operations that can handle nested tag regions. Next, we introduce variables for tag attribute values into the algebra to handle annotations in which attributes refer to other tag regions, and propose an efficient method of handling re-entrancy by incrementally determining the values of the variables. Our algorithms have been implemented in a text search engine for MEDLINE, a large textbase of abstracts in medical science. Experiments on tag-annotated MEDLINE abstracts demonstrate the effectiveness of specifying annotations and the efficiency of our algorithms. The system is publicly accessible at http://www-tsujii.is.s.u-tokyo.ac.jp/medie/.
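To make the idea of depth numbers and containment more concrete, the following is a minimal illustrative sketch, not the algorithm from the paper. It assumes regions are half-open character spans [start, end) that are properly nested (no crossing) and distinct. The names Region, assign_depths and containing are hypothetical, and the containment check is a naive quadratic scan rather than the efficient depth-ordered method the summary refers to.

```python
from typing import List, NamedTuple, Tuple


class Region(NamedTuple):
    """A tag region as a half-open span [start, end) with a nesting depth."""
    start: int
    end: int
    depth: int = 0


def assign_depths(spans: List[Tuple[int, int]]) -> List[Region]:
    """Assign a depth number to each span (assumes proper nesting, no crossing)."""
    # Sort by start ascending, then end descending, so outer spans come first.
    ordered = sorted(spans, key=lambda s: (s[0], -s[1]))
    open_ends: List[int] = []  # end offsets of currently open ancestor spans
    result: List[Region] = []
    for start, end in ordered:
        # Close every ancestor that ends at or before this span begins.
        while open_ends and open_ends[-1] <= start:
            open_ends.pop()
        result.append(Region(start, end, depth=len(open_ends)))
        open_ends.append(end)
    return result


def containing(outer: List[Region], inner: List[Region]) -> List[Region]:
    """Return the regions in `outer` that contain at least one region in `inner`.

    Naive O(len(outer) * len(inner)) scan, for illustration only.
    Containment here is non-strict: a region contains an identical span.
    """
    return [
        a for a in outer
        if any(a.start <= b.start and b.end <= a.end for b in inner)
    ]


# Hypothetical example: one outer region, two children, one grandchild.
regions = assign_depths([(0, 10), (1, 4), (2, 3), (5, 9)])
targets = [Region(2, 3)]
hits = containing(regions, targets)
```

In this sketch the depth number gives nested regions that start at the same position a well-defined order, which is the property an efficient containment algorithm could exploit; how the paper actually uses it is not shown here.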

Publication
IEICE TRANSACTIONS on Information Vol.E92-D No.12 pp.2369-2377
Publication Date
2009/12/01
Online ISSN
1745-1361
DOI
10.1587/transinf.E92.D.2369
Type of Manuscript
Special Section PAPER (Special Section on Natural Language Processing and its Applications)
Category
Information Retrieval
