
Author Search Result

[Author] Le-Minh NGUYEN (2 hits)

  • High-Performance Training of Conditional Random Fields for Large-Scale Applications of Labeling Sequence Data

    Xuan-Hieu PHAN  Le-Minh NGUYEN  Yasushi INOGUCHI  Susumu HORIGUCHI  

     
    PAPER-Parallel Processing System

    Vol: E90-D  No: 1  Page(s): 13-21

    Conditional random fields (CRFs) have been successfully applied to various tasks of predicting and labeling structured data, such as natural language tagging & parsing, image segmentation & object recognition, and protein secondary structure prediction. The key advantages of CRFs are their ability to encode a variety of overlapping, non-independent features from empirical data and their capability for global normalization and optimization. However, estimating parameters for CRFs is very time-consuming because of the intensive forward-backward computation needed to compute the likelihood function and its gradient during training. This paper presents a high-performance training method for CRFs on massively parallel processing systems that allows us to handle huge datasets with hundreds of thousands of data sequences and millions of features. We performed experiments on an important natural language processing task (text chunking) using large-scale corpora and achieved significant results in terms of both reduced computation time and improved prediction accuracy.

  • Personal Name Resolution Crossover Documents by a Semantics-Based Approach

    Xuan-Hieu PHAN  Le-Minh NGUYEN  Susumu HORIGUCHI  

     
    PAPER-Natural Language Processing

    Vol: E89-D  No: 2  Page(s): 825-836

    Cross-document personal name resolution is the process of identifying whether or not a common personal name mentioned in different documents refers to the same individual. Most previous approaches rely on lexical matching, such as the occurrence of common words surrounding the entity name, to measure the similarity between documents, and then cluster the documents according to their referents. In spite of certain successes, measuring similarity based on lexical comparison sometimes ignores important linguistic phenomena at the semantic level, such as synonymy or paraphrase. This paper presents a semantics-based approach to resolving personal names across documents that makes the most of both lexical evidence and semantic clues. In our method, the similarity values between documents are determined by estimating the semantic relatedness between words. Further, the semantic labels attached to sentences allow us to highlight the common personal facts that are potentially available among documents. An evaluation on three web datasets demonstrates that our method achieves better performance than previous work.