Author Search Result

[Author] Jong-Hoon OH(2hit)

  • Machine Learning Based English-to-Korean Transliteration Using Grapheme and Phoneme Information

    Jong-Hoon OH  Key-Sun CHOI  

     
    PAPER-Natural Language Processing

    Vol: E88-D  No: 7  Page(s): 1737-1748

    Machine transliteration is an automatic method for generating characters or words in one alphabetical system from the corresponding characters in another alphabetical system. Machine transliteration can play an important role in natural language applications such as information retrieval and machine translation, especially for handling proper nouns and technical terms. Previous work has focused on either a grapheme-based or a phoneme-based method. However, transliteration is both an orthographic and a phonetic conversion process. Therefore, both grapheme and phoneme information should be considered in machine transliteration. In this paper, we propose a grapheme- and phoneme-based transliteration model and compare it with previous grapheme-based and phoneme-based models using several machine learning techniques. Our method shows a performance improvement of about 13-78%.

  • An Alignment Model for Extracting English-Korean Translations of Term Constituents

    Jong-Hoon OH  Key-Sun CHOI  Hitoshi ISAHARA  

     
    PAPER-Natural Language Processing

    Vol: E89-D  No: 12  Page(s): 2972-2980

    Technical terms are linguistic representations of domain concepts, and their constituents are the components used to represent those concepts. Technical terms are usually multi-word terms, and their meanings can be inferred from their constituents. Therefore, term constituents are essential for understanding the designated meaning of technical terms. However, there are several problems in finding the correct meanings of technical terms from their term constituents. First, because a term constituent is usually a morphological unit rather than a conceptual unit in the case of Korean technical terms, we need to first identify conceptual units by chunking term constituents. Second, conceptual units are sometimes homonyms or synonyms. Moreover, their meanings show domain dependency. It is therefore necessary to provide information about conceptual units and their possible meanings, including homonyms, synonyms, and domain dependency, so that natural language applications can properly handle technical terms. In this paper, we propose a term constituent alignment algorithm that extracts such information from bilingual technical term pairs. Our algorithm recognizes conceptual units and their meanings by finding English term constituents and their corresponding Korean term constituents for given English-Korean term pairs. Our experimental results indicate that this method can effectively find conceptual units and their meanings, with an alignment error rate (AER) of about 6% on manually analyzed experimental data and about 14% on automatically analyzed experimental data.
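
The abstract above reports alignment error rate (AER) without defining it. As a rough illustration only, the sketch below computes AER under the commonly used definition AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|), where A is the set of predicted links, S the "sure" gold links, and P the "possible" gold links (a superset of S). Whether the paper uses exactly this definition is an assumption, and the function name, the index-pair representation of links, and the toy data are all hypothetical.

```python
def alignment_error_rate(predicted, sure, possible=None):
    """Compute AER = 1 - (|A & S| + |A & P|) / (|A| + |S|).

    Each link is a hypothetical (english_index, korean_index) pair.
    If no separate "possible" set is given, P is taken to equal S.
    """
    a = set(predicted)
    s = set(sure)
    # By convention the possible links always include the sure links.
    p = s if possible is None else set(possible) | s
    denominator = len(a) + len(s)
    if denominator == 0:
        return 0.0  # nothing predicted and nothing expected
    return 1.0 - (len(a & s) + len(a & p)) / denominator


# Toy data (not from the paper): three gold links, one predicted link is wrong.
gold_links = {(0, 0), (1, 1), (2, 2)}
predicted_links = {(0, 0), (1, 1), (2, 3)}

aer = alignment_error_rate(predicted_links, gold_links)
perfect = alignment_error_rate(gold_links, gold_links)
```

In this toy case two of the three predicted links match the gold standard, so the AER is 1 - 4/6, roughly 0.33; a prediction identical to the gold links gives 0.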