High-Performance Training of Conditional Random Fields for Large-Scale Applications of Labeling Sequence Data

Xuan-Hieu PHAN, Le-Minh NGUYEN, Yasushi INOGUCHI, Susumu HORIGUCHI

Summary:

Conditional random fields (CRFs) have been successfully applied to a variety of tasks that predict and label structured data, such as natural language tagging and parsing, image segmentation and object recognition, and protein secondary structure prediction. The key advantages of CRFs are their ability to encode a variety of overlapping, non-independent features from empirical data and their capacity for global normalization and optimization. However, estimating the parameters of CRFs is very time-consuming because training requires an intensive forward-backward computation to evaluate the likelihood function and its gradient. This paper presents a high-performance method for training CRFs on massively parallel processing systems, which allows us to handle huge datasets with hundreds of thousands of data sequences and millions of features. We performed experiments on an important natural language processing task (text chunking) using large-scale corpora and achieved significant results in terms of both reduced computation time and improved prediction accuracy.
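
To illustrate why this computation lends itself to parallel processing, the following is a minimal sketch and not the authors' implementation. It assumes a linear-chain CRF whose per-position label scores (`emit`) and label-to-label scores (`trans`) have already been computed from features; these names, and the helper functions, are hypothetical. The forward recursion yields the global normalizer log Z for each sequence, and because the training log-likelihood is a sum over sequences, each worker can process its own shard and only the partial sums need to be combined. The gradient is a sum over sequences as well, so it decomposes in the same way. Workers are simulated sequentially here.

```python
import math

def logsumexp(values):
    # Numerically stable log(sum(exp(v))) over a list of floats.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def sequence_log_likelihood(emit, trans, labels):
    """Log-likelihood of one labeled sequence under a linear-chain CRF.

    emit[t][y]  : score of label y at position t (assumed precomputed)
    trans[a][b] : score of moving from label a to label b
    labels      : gold label index at each position
    """
    n_labels = len(trans)
    # Unnormalized score of the gold label path.
    gold = emit[0][labels[0]]
    for t in range(1, len(labels)):
        gold += trans[labels[t - 1]][labels[t]] + emit[t][labels[t]]
    # Forward recursion in log space; its final log-sum is log Z.
    alpha = list(emit[0])
    for t in range(1, len(emit)):
        alpha = [
            emit[t][b]
            + logsumexp([alpha[a] + trans[a][b] for a in range(n_labels)])
            for b in range(n_labels)
        ]
    return gold - logsumexp(alpha)

def shard_log_likelihood(shard, trans):
    # Work done independently by one (hypothetical) worker.
    return sum(sequence_log_likelihood(e, trans, y) for e, y in shard)

def parallel_log_likelihood(dataset, trans, n_workers):
    # Split sequences round-robin into shards, then reduce the partial sums.
    # A real system would run each shard on a separate processor.
    shards = [dataset[i::n_workers] for i in range(n_workers)]
    return sum(shard_log_likelihood(s, trans) for s in shards)
```

In this sketch the only communication between workers is the final sum, which is why the cost of the forward-backward pass can be spread across many processors as the number of sequences grows.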

Publication
IEICE TRANSACTIONS on Information and Systems, Vol. E90-D, No. 1, pp. 13-21
Publication Date
2007/01/01
Online ISSN
1745-1361
Type of Manuscript
Special Section PAPER (Special Section on Parallel/Distributed Processing and Systems)
Category
Parallel Processing System