
Author Search Result

[Author] Kong-Joo LEE (4 hits)

  • Improving Automatic English Writing Assessment Using Regression Trees and Error-Weighting

    Kong-Joo LEE  Jee-Eun KIM  

     
    PAPER-Natural Language Processing

    Vol: E93-D  No: 8  Page(s): 2281-2290

    The proposed automated scoring system for English writing tests provides test-takers with an assessment result, including a score and diagnostic feedback, without human effort. The system analyzes an input sentence and detects errors related to spelling, syntax, and content similarity. The scoring model adopts a statistical approach, a regression tree. In general, a scoring model calculates a score based on the number and types of automatically detected errors. Accordingly, a system that detects errors more accurately also scores a test more accurately. The accuracy of the system, however, cannot be fully guaranteed for several reasons, such as parsing failures, incomplete knowledge bases, and the ambiguous nature of natural language. In this paper, we introduce an error-weighting technique, which is similar to the term-weighting widely used in information retrieval. The error-weighting technique is applied to judge the reliability of the errors detected by the system. The score calculated with the technique is shown to be more accurate than the score calculated without it.
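The abstract compares error-weighting to term-weighting but does not give a formula. The following is only a rough sketch of what an IDF-style weight over error types could look like. The function names, the smoothed formula, and the idea that error types flagged in almost every answer deserve less weight are all assumptions made for illustration, not the paper's method.

```python
import math
from collections import Counter


def error_weights(detected_errors_per_answer):
    """Compute an IDF-style weight for each error type (hypothetical scheme).

    Error types that are flagged in nearly every answer get a weight close
    to 1.0; rarer error types get a larger weight.
    """
    n_answers = len(detected_errors_per_answer)
    answer_freq = Counter()
    for errors in detected_errors_per_answer:
        # Count each error type once per answer, like document frequency.
        answer_freq.update(set(errors))
    return {
        etype: math.log((1 + n_answers) / (1 + freq)) + 1.0
        for etype, freq in answer_freq.items()
    }


def weighted_error_features(errors, weights):
    """Turn raw error counts for one answer into weighted counts."""
    counts = Counter(errors)
    # Unknown error types fall back to a neutral weight of 1.0.
    return {etype: count * weights.get(etype, 1.0) for etype, count in counts.items()}


# Hypothetical detector output for three answers.
answers = [["spelling", "agreement"], ["spelling"], ["spelling", "word_order"]]
weights = error_weights(answers)
features = weighted_error_features(["spelling", "spelling", "agreement"], weights)
```

Weighted counts of this kind could then serve as input features to a regression tree in place of raw error counts.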

  • Normalizing Syntactic Structures Using Part-of-Speech Tags and Binary Rules

    Seongyong KIM  Kong-Joo LEE  Key-Sun CHOI  

     
    PAPER

    Vol: E86-D  No: 10  Page(s): 2049-2056

    We propose a normalization scheme for syntactic structures using a binary phrase structure grammar with composite labels. The normalization adopts binary rules so that the dependency between two sub-trees can be represented in the label of the tree. The label of a tree is composed of two attributes, one extracted from each sub-tree, so that it can represent the compositional information of the tree. The composite label is generated from part-of-speech tags using an automatic labelling algorithm. Since the proposed normalization scheme is binary and uses only part-of-speech information, it can readily be used to compare the results of different syntactic analyses independently of their syntactic descriptions, and it can be applied to other languages as well. It can also be used for syntactic analysis, where it performs better than the previous syntactic description on a Korean corpus. We implement a tool that transforms a syntactic description into a normalized one based on the proposed scheme. It can help construct a unified syntactic corpus and extract syntactic information from various types of syntactic corpora in a uniform way.

  • Extracting Partial Parsing Rules from Tree-Annotated Corpus: Toward Deterministic Global Parsing

    Myung-Seok CHOI  Kong-Joo LEE  Key-Sun CHOI  Gil Chang KIM  

     
    PAPER-Natural Language Processing

    Vol: E88-D  No: 6  Page(s): 1248-1255

    It is not always possible to find a global parse for an input sentence, owing to problems such as errors in the sentence and incompleteness of the lexicon and grammar. Partial parsing is an alternative approach that responds to these problems. Partial parsing techniques try to recover syntactic information efficiently and reliably by sacrificing completeness and depth of analysis. One of the difficulties in partial parsing is how the grammar might be automatically extracted. In this paper, we present a method of automatically extracting partial parsing rules from a tree-annotated corpus using the decision tree method. Our goal is deterministic global parsing using partial parsing rules; in other words, to extract partial parsing rules with higher accuracy and broader expansion. First, we define a rule template that enables learning a subtree for a given substring, so that the resultant rules can be more specific and stricter to apply. Second, rule candidates extracted from a training corpus are enriched with contextual and lexical information using the decision tree method and verified through cross-validation. Last, we underspecify non-deterministic rules by merging the ambiguous substructures in those rules. The learned grammar is similar to a phrase structure grammar with contextual and lexical information, but allows building structures of depth one or more. Thanks to automatic learning, the partial parsing rules can be consistent and domain-independent. Partial parsing with this grammar processes an input sentence deterministically using a longest-match heuristic and applies rules to the input sentence recursively. The experiments showed that the partial parser using automatically extracted rules is not only accurate and efficient but also achieves reasonable coverage for Korean.
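The deterministic, longest-match, recursive rule application described in the abstract can be illustrated with a small sketch. It is a simplification under stated assumptions: the rules here are plain tag-sequence-to-label mappings without the contextual and lexical conditions the paper learns, and the tag set, the rule table, and the names `one_pass` and `partial_parse` are hypothetical.

```python
def one_pass(nodes, rules, max_len):
    """Scan left to right once, reducing the longest matching span at each position."""
    out = []
    changed = False
    i = 0
    while i < len(nodes):
        matched = False
        # Try the longest candidate span first (longest-match heuristic).
        for length in range(min(max_len, len(nodes) - i), 1, -1):
            span = nodes[i:i + length]
            key = tuple(label for label, _children in span)
            if key in rules:
                out.append((rules[key], span))  # build a new subtree
                i += length
                matched = changed = True
                break
        if not matched:
            out.append(nodes[i])  # leave the node unattached
            i += 1
    return out, changed


def partial_parse(tags, rules, max_len=4):
    """Apply rules repeatedly until no rule matches; may return several subtrees."""
    nodes = [(tag, []) for tag in tags]
    changed = True
    while changed:
        nodes, changed = one_pass(nodes, rules, max_len)
    return nodes


# Hypothetical toy rules and input; the paper's rules are learned from a corpus.
rules = {
    ("DET", "ADJ", "NOUN"): "NP",
    ("DET", "NOUN"): "NP",
    ("VERB", "NP"): "VP",
}
result = partial_parse(["DET", "ADJ", "NOUN", "VERB", "DET", "NOUN"], rules)
top_labels = [label for label, _children in result]
```

Because no rule in this toy table combines `NP` and `VP`, the parser stops with two subtrees instead of failing, which is the kind of partial result the abstract refers to. Every reduction shortens the node list, so the loop always terminates.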

  • Document Genre Classification for User Interface of Web Search Engine

    Kong-Joo LEE  

     
    LETTER-Natural Language Processing

    Vol: E87-D  No: 7  Page(s): 1982-1986

    In this letter, we suggest sets of features for classifying the genres of web documents. Web documents differ from plain textual documents in that they contain URLs and HTML tags within the pages. We introduce features specific to web documents, which are extracted from URLs and HTML tags. Experimental results enable us to evaluate their characteristics and performance. On the basis of the experimental results, we implement a user interface for a web search engine that presents documents grouped by genre.
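As a rough illustration of features drawn from URLs and HTML tags, the sketch below uses only the Python standard library. The letter's actual feature sets are not listed in the abstract, so every feature name here (`url_depth`, `n_links`, and so on) is a hypothetical choice.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse


class TagCounter(HTMLParser):
    """Count how often each HTML start tag occurs in a page."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def web_features(url, html):
    """Extract a few illustrative URL and HTML-tag features (hypothetical set)."""
    parsed = urlparse(url)
    path_parts = [part for part in parsed.path.split("/") if part]
    counter = TagCounter()
    counter.feed(html)
    counter.close()
    return {
        "url_depth": len(path_parts),
        "url_has_query": int(bool(parsed.query)),
        "url_tld": (parsed.hostname or "").rsplit(".", 1)[-1],
        "n_links": counter.counts["a"],
        "n_images": counter.counts["img"],
        "n_forms": counter.counts["form"],
        "n_tables": counter.counts["table"],
    }


# Hypothetical page used only to exercise the function.
page = '<html><body><a href="/a">x</a><a href="/b">y</a><img src="p.png"><form></form></body></html>'
features = web_features("http://example.com/shop/item?id=3", page)
```

A feature dictionary of this shape could be fed to any standard classifier. Which features actually separate genres well is an empirical question that the letter addresses through its experiments.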