Extracting Semantic Frames from Thai Medical-Symptom Unstructured Text with Unknown Target-Phrase Boundaries

Peerasak INTARAPAIBOON, Ekawit NANTAJEEWARAWAT, Thanaruk THEERAMUNKONG

Summary:

Because language-processing tools for the Thai language are limited, pattern-based information extraction from Thai documents requires supplementary techniques. We present a framework, based on sliding-window rule application and extraction filtering, for extracting semantic information from medical-symptom phrases with unknown boundaries in Thai unstructured-text information entries. A supervised rule learning algorithm is employed to construct information extraction rules automatically from hand-tagged training symptom phrases. Two filtering components are introduced: the first uses a classification model to predict rule application across a symptom-phrase boundary, based on instantiation features of rule-internal wildcards; the second uses weighted classification confidence to resolve conflicts arising from overlapping extractions. Our experimental study focuses on two basic types of symptom phrasal descriptions: one concerns abnormal characteristics of observable entities, and the other concerns the human-body locations at which primitive symptoms appear. The experimental results show that the filtering components improve precision while preserving recall satisfactorily.
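
The general idea of sliding-window rule application followed by overlap filtering can be illustrated with a minimal sketch. This is not the authors' implementation: the rule format, the `WILDCARD` marker, the `Extraction` type, the confidence values, and the English placeholder tokens are all assumptions made for illustration, and the greedy highest-confidence selection merely stands in for the weighted classification confidence described in the summary. The boundary-prediction classifier is not modeled here.

```python
from typing import Dict, List, NamedTuple, Tuple

WILDCARD = "*"  # hypothetical marker: matches exactly one token


class Extraction(NamedTuple):
    start: int         # index of the first token covered
    end: int           # index one past the last token covered
    rule_id: str
    confidence: float  # hypothetical score, e.g. produced by a classifier


def match_at(pattern: List[str], tokens: List[str], pos: int) -> bool:
    """Return True if the pattern matches the tokens starting at pos."""
    if pos + len(pattern) > len(tokens):
        return False
    window = tokens[pos:pos + len(pattern)]
    return all(p == WILDCARD or p == t for p, t in zip(pattern, window))


def apply_rules(tokens: List[str],
                rules: Dict[str, Tuple[List[str], float]]) -> List[Extraction]:
    """Slide each rule over every start position and collect all matches."""
    found = []
    for rule_id, (pattern, conf) in rules.items():
        for pos in range(len(tokens) - len(pattern) + 1):
            if match_at(pattern, tokens, pos):
                found.append(Extraction(pos, pos + len(pattern), rule_id, conf))
    return found


def resolve_overlaps(extractions: List[Extraction]) -> List[Extraction]:
    """Greedily keep the highest-confidence extraction among overlapping ones."""
    kept: List[Extraction] = []
    for ext in sorted(extractions, key=lambda e: -e.confidence):
        # Keep ext only if it shares no token with anything already kept.
        if all(ext.end <= k.start or ext.start >= k.end for k in kept):
            kept.append(ext)
    return sorted(kept, key=lambda e: e.start)


# Placeholder English tokens; the paper itself works on Thai text.
tokens = ["patient", "has", "red", "rash", "on", "left", "arm"]
rules = {
    "abnormal": (["red", WILDCARD], 0.9),
    "location_long": ([WILDCARD, "on", WILDCARD, "arm"], 0.6),
    "location_short": (["on", WILDCARD, "arm"], 0.7),
}
candidates = apply_rules(tokens, rules)
result = resolve_overlaps(candidates)
```

In this toy input, the two location rules produce overlapping candidates, and `location_long` also overlaps the `abnormal` match, so only the non-conflicting, higher-confidence extractions survive in `result`.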

Publication
IEICE TRANSACTIONS on Information Vol.E94-D No.3 pp.465-478
Publication Date
2011/03/01
Online ISSN
1745-1361
DOI
10.1587/transinf.E94.D.465
Type of Manuscript
Special Section PAPER (Special Section on Knowledge Discovery, Data Mining and Creativity Support System)