This paper presents algorithms for searching text regions with specifying annotated information in tag-annotated text by using Region Algebra. The original algebra and its efficient algorithms are extended to handle both nested regions and crossed regions. The extensions are necessary for text search by using rich linguistic annotations. We first assign a depth number to every nested tag region to order these regions and write efficient algorithms using the depth number for the containment operations which can treat nested tag regions. Next, we introduce variables for attribute values of tags into the algebra to treat annotations in which attributes indicate another tag regions, and propose an efficient method of treating re-entrancy by incrementally determining values for variables. Our algorithms have been implemented in a text search engine for MEDLINE, which is a large textbase of abstracts in medical science. Experiments in tag-annotated MEDLINE abstracts demonstrate the effectiveness of specifying annotations and the efficiency of our algorithms. The system is made publicly accessible at http://www-tsujii.is.s.u-tokyo.ac.jp/medie/.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Katsuya MASUDA, Jun'ichi TSUJII, "Tag-Annotated Text Search Using Extended Region Algebra" in IEICE TRANSACTIONS on Information,
vol. E92-D, no. 12, pp. 2369-2377, December 2009, doi: 10.1587/transinf.E92.D.2369.
Abstract: This paper presents algorithms for searching text regions with specifying annotated information in tag-annotated text by using Region Algebra. The original algebra and its efficient algorithms are extended to handle both nested regions and crossed regions. The extensions are necessary for text search by using rich linguistic annotations. We first assign a depth number to every nested tag region to order these regions and write efficient algorithms using the depth number for the containment operations which can treat nested tag regions. Next, we introduce variables for attribute values of tags into the algebra to treat annotations in which attributes indicate another tag regions, and propose an efficient method of treating re-entrancy by incrementally determining values for variables. Our algorithms have been implemented in a text search engine for MEDLINE, which is a large textbase of abstracts in medical science. Experiments in tag-annotated MEDLINE abstracts demonstrate the effectiveness of specifying annotations and the efficiency of our algorithms. The system is made publicly accessible at http://www-tsujii.is.s.u-tokyo.ac.jp/medie/.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E92.D.2369/_p
Copy
@ARTICLE{e92-d_12_2369,
author={Katsuya MASUDA, Jun'ichi TSUJII, },
journal={IEICE TRANSACTIONS on Information},
title={Tag-Annotated Text Search Using Extended Region Algebra},
year={2009},
volume={E92-D},
number={12},
pages={2369-2377},
abstract={This paper presents algorithms for searching text regions with specifying annotated information in tag-annotated text by using Region Algebra. The original algebra and its efficient algorithms are extended to handle both nested regions and crossed regions. The extensions are necessary for text search by using rich linguistic annotations. We first assign a depth number to every nested tag region to order these regions and write efficient algorithms using the depth number for the containment operations which can treat nested tag regions. Next, we introduce variables for attribute values of tags into the algebra to treat annotations in which attributes indicate another tag regions, and propose an efficient method of treating re-entrancy by incrementally determining values for variables. Our algorithms have been implemented in a text search engine for MEDLINE, which is a large textbase of abstracts in medical science. Experiments in tag-annotated MEDLINE abstracts demonstrate the effectiveness of specifying annotations and the efficiency of our algorithms. The system is made publicly accessible at http://www-tsujii.is.s.u-tokyo.ac.jp/medie/.},
keywords={},
doi={10.1587/transinf.E92.D.2369},
ISSN={1745-1361},
month={December},}
Copy
TY - JOUR
TI - Tag-Annotated Text Search Using Extended Region Algebra
T2 - IEICE TRANSACTIONS on Information
SP - 2369
EP - 2377
AU - Katsuya MASUDA
AU - Jun'ichi TSUJII
PY - 2009
DO - 10.1587/transinf.E92.D.2369
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E92-D
IS - 12
JA - IEICE TRANSACTIONS on Information
Y1 - December 2009
AB - This paper presents algorithms for searching text regions with specifying annotated information in tag-annotated text by using Region Algebra. The original algebra and its efficient algorithms are extended to handle both nested regions and crossed regions. The extensions are necessary for text search by using rich linguistic annotations. We first assign a depth number to every nested tag region to order these regions and write efficient algorithms using the depth number for the containment operations which can treat nested tag regions. Next, we introduce variables for attribute values of tags into the algebra to treat annotations in which attributes indicate another tag regions, and propose an efficient method of treating re-entrancy by incrementally determining values for variables. Our algorithms have been implemented in a text search engine for MEDLINE, which is a large textbase of abstracts in medical science. Experiments in tag-annotated MEDLINE abstracts demonstrate the effectiveness of specifying annotations and the efficiency of our algorithms. The system is made publicly accessible at http://www-tsujii.is.s.u-tokyo.ac.jp/medie/.
ER -