
Author Search Result

[Author] Peerasak INTARAPAIBOON (3 hits)

  • Extracting Chemical Reactions from Thai Text for Semantics-Based Information Retrieval

    Peerasak INTARAPAIBOON  Ekawit NANTAJEEWARAWAT  Thanaruk THEERAMUNKONG  

     
    PAPER

    Vol: E94-D  No: 3  Page(s): 479-486

    Based on sliding-window rule application and extraction filtering, we present a framework for extracting multi-slot frames describing chemical reactions from Thai free text with unknown target-phrase boundaries. A supervised rule learning algorithm is employed for automatic construction of pattern-based extraction rules from hand-tagged training phrases. A filtering method is devised for removal of incorrect extraction results based on features observed from text portions appearing between adjacent slot fillers in source documents. Extracted reaction frames are represented as concept expressions in description logics and are used as metadata for document indexing. A document knowledge base supporting semantics-based information retrieval is constructed by integrating document metadata with domain-specific ontologies.

  • Extracting Semantic Frames from Thai Medical-Symptom Unstructured Text with Unknown Target-Phrase Boundaries

    Peerasak INTARAPAIBOON  Ekawit NANTAJEEWARAWAT  Thanaruk THEERAMUNKONG  

     
    PAPER

    Vol: E94-D  No: 3  Page(s): 465-478

    Due to the limitations of language-processing tools for the Thai language, pattern-based information extraction from Thai documents requires supplementary techniques. Based on sliding-window rule application and extraction filtering, we present a framework for extracting semantic information from medical-symptom phrases with unknown boundaries in Thai unstructured-text information entries. A supervised rule learning algorithm is employed for automatic construction of information extraction rules from hand-tagged training symptom phrases. Two filtering components are introduced: one uses a classification model to predict rule application across a symptom-phrase boundary based on instantiation features of rule internal wildcards, the other uses weighted classification confidence to resolve conflicts arising from overlapping extractions. In our experimental study, we focus our attention on two basic types of symptom phrasal descriptions: one is concerned with abnormal characteristics of some observable entities and the other with human-body locations at which primitive symptoms appear. The experimental results show that the filtering components improve precision while preserving recall satisfactorily.

  • An Application of Intuitionistic Fuzzy Sets to Improve Information Extraction from Thai Unstructured Text

    Peerasak INTARAPAIBOON  Thanaruk THEERAMUNKONG  

     
    PAPER-Artificial Intelligence, Data Mining

    Publicized: 2018/05/23  Vol: E101-D  No: 9  Page(s): 2334-2345

    Multi-slot information extraction, also known as frame extraction, is a task that identifies several related entities simultaneously. Most research on this task is concerned with applying IE patterns (rules) to extract related entities from unstructured documents. An important obstacle to success in this task is not knowing where the text portions containing the information of interest are. This problem is more complicated for languages with sentence-boundary ambiguity, e.g., the Thai language. Applying IE rules to all reasonable text portions can reduce the effect of this obstacle, but it raises another problem: incorrect (unwanted) extractions. This paper presents a method for removing these incorrect extractions. In the method, extractions are represented as intuitionistic fuzzy sets (IFSs), and a similarity measure for IFSs is used to calculate the distance between the IFS of an unclassified extraction and that of each already-classified extraction. The concept of k nearest neighbors is adopted to decide whether the unclassified extraction is correct or not. In experiments on various domains, the proposed technique improves extraction precision while satisfactorily preserving recall.
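
The last abstract describes classifying an extraction by comparing its intuitionistic fuzzy set (IFS) with those of already-classified extractions and taking a k-nearest-neighbor vote. The following is a minimal sketch of that general idea, not the paper's actual method. It assumes each extraction is encoded as a list of (membership, non-membership) pairs, one per feature, and it uses the normalized Hamming distance between IFSs as a stand-in, since the abstract does not say which similarity measure the authors use. The names `ifs_distance` and `knn_is_correct`, and the feature encoding itself, are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

# Assumed encoding: one (membership, non_membership) pair per feature,
# with membership + non_membership <= 1 for every pair.
IFS = List[Tuple[float, float]]


def ifs_distance(a: IFS, b: IFS) -> float:
    """Normalized Hamming distance between two IFSs (an assumed measure)."""
    if not a or len(a) != len(b):
        raise ValueError("IFSs must be non-empty and of equal length")
    total = 0.0
    for (mu_a, nu_a), (mu_b, nu_b) in zip(a, b):
        # The hesitation degree is whatever is left after membership
        # and non-membership are accounted for.
        pi_a = 1.0 - mu_a - nu_a
        pi_b = 1.0 - mu_b - nu_b
        total += abs(mu_a - mu_b) + abs(nu_a - nu_b) + abs(pi_a - pi_b)
    return total / (2 * len(a))


def knn_is_correct(query: IFS, labeled: List[Tuple[IFS, bool]], k: int = 3) -> bool:
    """Majority vote among the k already-classified extractions nearest to query."""
    ranked = sorted(labeled, key=lambda item: ifs_distance(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes[True] > votes[False]


# Toy data, invented purely for illustration.
labeled_extractions = [
    ([(0.9, 0.05), (0.8, 0.1)], True),
    ([(0.85, 0.1), (0.7, 0.2)], True),
    ([(0.1, 0.8), (0.2, 0.7)], False),
    ([(0.15, 0.8), (0.1, 0.85)], False),
]
unclassified = [(0.8, 0.1), (0.75, 0.15)]
decision = knn_is_correct(unclassified, labeled_extractions, k=3)
```

An odd `k` avoids tied votes in this two-class setting. Any other IFS distance or similarity measure could be swapped in for `ifs_distance` without changing the voting step.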