Author Search Result

[Author] Heyan HUANG(4hit)

1-4hit
  • Efficient Algorithm for Sentence Information Content Computing in Semantic Hierarchical Network

    Hao WU  Heyan HUANG  

     
    LETTER-Natural Language Processing

      Publicized:
    2016/10/18
      Vol:
    E100-D No:1
      Page(s):
    238-241

    We previously proposed an unsupervised model that uses the inclusion-exclusion principle to compute sentence information content. Although it achieves good experimental results on sentence semantic similarity, its computational complexity is more than O(2^n). In this paper, we propose an efficient method for calculating sentence information content that applies the idea of the difference set in a hierarchical network. Experimental results show that the computational complexity decreases to O(n). We prove the algorithm in the form of theorems. Performance analysis and experiments are also provided.

  • Preordering for Chinese-Vietnamese Statistical Machine Translation

    Huu-Anh TRAN  Heyan HUANG  Phuoc TRAN  Shumin SHI  Huu NGUYEN  

     
    PAPER-Natural Language Processing

      Publicized:
    2018/11/12
      Vol:
    E102-D No:2
      Page(s):
    375-382

    Word order is one of the most significant differences between Chinese and Vietnamese. In phrase-based statistical machine translation, the reordering model learns reordering rules from bilingual corpora. If the bilingual corpora are large and of good quality, the learned reordering rules are accurate and have broad coverage. However, Chinese-Vietnamese is a low-resource language pair, so the extraction of reordering rules is limited, and as a result the quality of reordering in Chinese-Vietnamese machine translation is not high. In this paper, we combine Chinese dependency relations and Chinese-Vietnamese word alignment results to pre-order Chinese sentences so that their word order better matches that of Vietnamese. The experimental results show that our method improves machine translation performance compared with a translation system that uses only the reordering models of phrase-based statistical machine translation.

  • An Empirical Study of Classifier Combination Based Word Sense Disambiguation

    Wenpeng LU  Hao WU  Ping JIAN  Yonggang HUANG  Heyan HUANG  

     
    PAPER-Natural Language Processing

      Publicized:
    2017/08/23
      Vol:
    E101-D No:1
      Page(s):
    225-233

    Word sense disambiguation (WSD) aims to identify the correct sense of an ambiguous word by mining its context information. Previous studies show that classifier combination is an effective approach to enhancing WSD performance. In this paper, we systematically review state-of-the-art methods for classifier combination based WSD, including probability-based and voting-based approaches. Furthermore, we propose a new classifier combination based WSD method, namely the probability weighted voting method with dynamic self-adaptation. Compared with existing approaches, the new method takes into consideration differences among classifiers as well as among ambiguous instances. Exhaustive experiments are performed on a real-world dataset, and the results show the superiority of our method over state-of-the-art methods.

  • Sentence Similarity Computational Model Based on Information Content

    Hao WU  Heyan HUANG  

     
    PAPER-Natural Language Processing

      Publicized:
    2016/03/14
      Vol:
    E99-D No:6
      Page(s):
    1645-1652

    Sentence similarity computation is an increasingly important task in natural language processing applications such as information retrieval, machine translation, and text summarization. From the viewpoint of information theory, the essential attribute of natural language is that it is a carrier of information, and the amount of information it carries can be measured by information content, which has already been used successfully for word similarity computation in simple ways. Existing sentence similarity methods do not emphasize the information contained in the sentence, and the complicated models they employ often require empirical or trained parameters. This paper presents a fully unsupervised computational model of sentence semantic similarity. It is also a simple and straightforward model that neither needs any empirical parameters nor relies on other NLP tools. The method obtains state-of-the-art experimental results, which show that sentence similarity evaluated by the model is closer to human judgment than that of multiple competing baselines. The paper also examines the influence of the external corpus, the performance with semantic nets of various sizes, and the relationship between efficiency and accuracy.
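The first entry in this list contrasts an inclusion-exclusion computation with one based on difference sets. As a rough illustration of why that change can matter, and not a reproduction of the authors' algorithm, the sketch below computes the total weight of a union of sets in both ways. The function names, the toy `ancestor_sets`, and the `weight` table are all hypothetical; the only claim is the standard set identity that the weight of a union equals the sum of the weights of each set's elements not already covered by earlier sets.

```python
from itertools import combinations


def union_weight_inclusion_exclusion(sets, weight):
    """Weight of the union via inclusion-exclusion.

    Visits every non-empty subset of the n input sets (2^n - 1 of them).
    """
    total = 0.0
    for k in range(1, len(sets) + 1):
        sign = 1 if k % 2 == 1 else -1
        for combo in combinations(sets, k):
            common = set.intersection(*combo)
            total += sign * sum(weight[x] for x in common)
    return total


def union_weight_difference(sets, weight):
    """Same quantity via successive difference sets.

    Makes a single pass over the n input sets, adding only the elements
    that earlier sets have not already contributed.
    """
    seen = set()
    total = 0.0
    for s in sets:
        new = s - seen  # difference set: elements not yet counted
        total += sum(weight[x] for x in new)
        seen |= new
    return total


# Hypothetical toy data: each set holds a word's ancestors in a small hierarchy.
ancestor_sets = [
    {"entity", "animal", "dog"},
    {"entity", "animal", "cat"},
    {"entity", "plant", "tree"},
]
weight = {"entity": 0.5, "animal": 1.0, "dog": 2.0,
          "cat": 2.0, "plant": 1.0, "tree": 2.0}

print(union_weight_inclusion_exclusion(ancestor_sets, weight))
print(union_weight_difference(ancestor_sets, weight))
```

Both functions return the same value for the same input. The difference is in how many set-level steps they take: the first grows exponentially with the number of sets, the second linearly (the cost of each individual set operation still depends on set size).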