Author Search Result

[Author] Seung-Hoon NA (2 hits)

  • Phrase-Based Statistical Model for Korean Morpheme Segmentation and POS Tagging

    Seung-Hoon NA  Young-Kil KIM  

     
    PAPER-Natural Language Processing

    Publicized: 2017/11/13
    Vol: E101-D  No: 2
    Page(s): 512-522

    In this paper, we propose a novel phrase-based model for Korean morphological analysis that treats the phrase as the basic processing unit, which generalizes all other existing processing units. Using phrases in this way is largely motivated by the success of phrase-based statistical machine translation (SMT), which convincingly shows that the larger the processing unit, the better the performance. Experimental results on the SEJONG dataset show that the proposed phrase-based models outperform the morpheme-based models used as baselines. In particular, when combined with the conditional random field (CRF) model, our model yields statistically significant improvements over the state-of-the-art CRF method.

  • Pruning-Based Unsupervised Segmentation for Korean

    In-Su KANG  Seung-Hoon NA  Jong-Hyeok LEE  

     
    PAPER-Natural Language Processing

    Vol: E89-D  No: 10
    Page(s): 2670-2677

    Compound noun segmentation is a key component of Korean language processing. Supervised approaches require some form of human intervention, such as maintaining lexicons, manually segmenting corpora, or devising heuristic rules. Thus, they suffer from the unknown word problem and cannot distinguish domain-oriented or corpus-directed segmentation results from the others. These problems can be overcome by unsupervised approaches that employ segmentation clues obtained purely from a raw corpus. However, most unsupervised approaches require tuning of empirical parameters or learning of a statistical dictionary. To develop a tuning-less, learning-free unsupervised segmentation algorithm, this study proposes a pruning-based unsupervised technique that eliminates unhelpful segmentation candidates. In addition, unlike previous unsupervised methods that have relied on purely character-based segmentation clues, this study utilizes word-based segmentation clues. Experimental evaluations show that the pruning scheme is very effective for unsupervised segmentation of Korean compound nouns, and that the use of word-based prior knowledge improves segmentation accuracy. This study also shows that the proposed algorithm performs competitively with or better than other unsupervised methods.