Despite the importance of domain-specific resource construction for domain ontology development, few studies have sought to develop a method for automatically identifying domain ontology-relevant web pages. To address this situation, here we propose a web page filtering scheme for domain ontology that identifies domain-relevant web pages from the web based on the context of concepts. Testing of the proposed filtering scheme with a business domain ontology on YahooPicks web pages yielded promising filtering results that were superior to those obtained using the baseline system.
Bo-Yeong KANG Dae-Won KIM Qing LI
A great deal of research has been conducted to model the vagueness and uncertainty in information retrieval. One such line of research concerns fuzzy ranking models, which have shown superior performance in handling the uncertainty involved in the retrieval process. However, these conventional fuzzy ranking models have a limited ability to incorporate user preferences when calculating the rank of documents. To address this issue, in this study we develop a new fuzzy ranking model based on user preferences. Through experiments on the TREC-2 collection of Wall Street Journal documents, we show that the proposed method outperforms the conventional fuzzy ranking models.
Bo-Yeong KANG Sung-Hyon MYAENG
Since sentences are the basic propositional units of text, knowing their themes should help in completing various tasks, such as automatic summarization, that require knowledge of the semantic content of text. Despite the importance of determining the theme of a sentence, however, few studies have investigated the problem of automatically assigning a theme to a sentence. In this paper, we examine the notion of sentence theme and propose an automatic scheme in which head-driven patterns are used for theme assignment. We tested our scheme with sentences in encyclopedia articles and obtained promising F-scores of 98.96% on the training data and 88.57% on the testing data, outperforming the baseline that uses all but the head-driven patterns.