IEICE TRANSACTIONS on Information

Short Text Classification Based on Distributional Representations of Words

Chenglong MA, Qingwei ZHAO, Jielin PAN, Yonghong YAN

Summary:

Short texts usually suffer from data sparseness because they provide too little term co-occurrence information. In this paper, we show how word embeddings can mitigate this problem in short text classification. We assume that a short text document is a specific sample from one distribution in a Gaussian-Bayesian framework. Furthermore, a fast clustering algorithm is used to expand and enrich the context of a short text in the embedding space. The approach is compared with classical bag-of-words approaches and neural-network-based methods. Experimental results validate the effectiveness of the proposed method.
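
As a rough illustration of the general idea of treating a short text as a sample from a class-specific Gaussian in embedding space, the sketch below fits one diagonal Gaussian per class to the word vectors of its training texts and labels a new text by the highest class log-posterior. It is not the authors' model: the 2-D vectors in `EMB`, the toy training set, the diagonal covariance, the variance floor, and all function names are assumptions made for this example, and the clustering-based context expansion mentioned in the summary is not shown.

```python
import math
from collections import defaultdict

# Hypothetical 2-D "embeddings", hand-made for illustration only.
# Real word embeddings would be learned and have far more dimensions.
EMB = {
    "goal": (0.9, 0.1), "match": (0.8, 0.2),
    "team": (0.85, 0.15), "score": (0.7, 0.3),
    "stock": (0.1, 0.9), "market": (0.2, 0.8),
    "bank": (0.15, 0.85), "price": (0.3, 0.7),
}

VAR_FLOOR = 1e-2  # assumed smoothing term so variances never collapse to zero


def fit(docs):
    """Fit one diagonal Gaussian per class.

    docs: list of (tokens, label) pairs.
    Returns {label: (log_prior, mean, var)}.
    """
    vecs = defaultdict(list)
    counts = defaultdict(int)
    for tokens, label in docs:
        counts[label] += 1
        vecs[label].extend(EMB[t] for t in tokens if t in EMB)

    total = sum(counts.values())
    model = {}
    for label, vs in vecs.items():
        dim = len(vs[0])
        mean = [sum(v[d] for v in vs) / len(vs) for d in range(dim)]
        var = [
            sum((v[d] - mean[d]) ** 2 for v in vs) / len(vs) + VAR_FLOOR
            for d in range(dim)
        ]
        model[label] = (math.log(counts[label] / total), mean, var)
    return model


def log_gauss(x, mean, var):
    """Log-density of x under a diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * s) + (xi - m) ** 2 / s)
        for xi, m, s in zip(x, mean, var)
    )


def classify(model, tokens):
    """Return the label with the highest log-prior plus summed word log-likelihood."""
    vs = [EMB[t] for t in tokens if t in EMB]  # out-of-vocabulary words are skipped
    scores = {
        label: log_prior + sum(log_gauss(v, mean, var) for v in vs)
        for label, (log_prior, mean, var) in model.items()
    }
    return max(scores, key=scores.get)


TRAIN = [
    (["goal", "match"], "sports"),
    (["team", "score"], "sports"),
    (["stock", "market"], "finance"),
    (["bank", "price"], "finance"),
]
MODEL = fit(TRAIN)
```

With this toy data, `classify(MODEL, ["team", "goal"])` selects the class whose Gaussian lies closest to those word vectors. Because the score is computed over dense vectors rather than exact term matches, a short text can still be scored against a class when it shares no words with the training texts, as long as its words are close to that class in embedding space.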

Publication
IEICE TRANSACTIONS on Information Vol.E99-D No.10 pp.2562-2565
Publication Date
2016/10/01
Publicized
2016/07/19
Online ISSN
1745-1361
DOI
10.1587/transinf.2016SLL0006
Type of Manuscript
Special Section LETTER (Special Section on Recent Advances in Machine Learning for Spoken Language Processing)
Category
Text classification

Authors

Chenglong MA
  Chinese Academy of Sciences
Qingwei ZHAO
  Chinese Academy of Sciences
Jielin PAN
  Chinese Academy of Sciences
Yonghong YAN
  Chinese Academy of Sciences

Keyword