IEICE global.ieice.org Site

Author Search Result

[Author] Ruhua CHEN (1 hit)

Improving Text Categorization with Semantic Knowledge in Wikipedia
Xiang WANG, Yan JIA, Ruhua CHEN, Hua FAN, Bin ZHOU

PAPER-Artificial Intelligence, Data Mining

Vol: E96-D No: 12
Page(s): 2786-2794

Text categorization, especially short text categorization, is a challenging task because text data is sparse and multidimensional. Traditional text classification methods represent document texts with the “Bag of Words (BOW)” representation scheme, which is based on word co-occurrence and has many limitations. In this paper, we map document texts to Wikipedia concepts and use a Wikipedia-concept-based document representation in place of the traditional BOW model for text classification. To overcome the document representation model's weakness of ignoring semantic relationships among terms, and to exploit the rich semantic knowledge in Wikipedia, we construct a semantic matrix that enriches the Wikipedia-concept-based document representation. Experimental evaluation on five real datasets of long and short text shows that our approach outperforms the traditional BOW method.
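
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the term-to-concept dictionary, the concept list, the relatedness values in the semantic matrix, and all function names below are hypothetical, and the enrichment step (multiplying the concept vector by the semantic matrix) is only one plausible reading of "enrich". The sketch shows how two short texts that share no words, and no concepts, can still become similar once related concepts are taken into account.

```python
import math
from collections import Counter

# Hypothetical toy mapping from surface terms to Wikipedia concept titles.
TERM_TO_CONCEPT = {
    "puma": "Cougar",
    "cougar": "Cougar",
    "jaguar": "Jaguar",
    "cat": "Felidae",
}

# Fixed concept order used for all vectors (assumption for this sketch).
CONCEPTS = ["Cougar", "Jaguar", "Felidae"]

# Hypothetical semantic matrix: SEMANTIC[i][j] is an invented relatedness
# score between CONCEPTS[i] and CONCEPTS[j]; real values would be derived
# from Wikipedia.
SEMANTIC = [
    [1.0, 0.5, 0.8],
    [0.5, 1.0, 0.8],
    [0.8, 0.8, 1.0],
]


def concept_vector(text):
    """Map a text to counts over CONCEPTS, ignoring unmapped terms."""
    counts = Counter(
        TERM_TO_CONCEPT[t] for t in text.lower().split() if t in TERM_TO_CONCEPT
    )
    return [float(counts[c]) for c in CONCEPTS]


def enrich(vec):
    """Spread each concept's weight to related concepts (vector times matrix)."""
    n = len(CONCEPTS)
    return [sum(vec[i] * SEMANTIC[i][j] for i in range(n)) for j in range(n)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


doc_a = concept_vector("puma sighting")   # only the "Cougar" concept
doc_b = concept_vector("jaguar habitat")  # only the "Jaguar" concept

plain_similarity = cosine(doc_a, doc_b)
enriched_similarity = cosine(enrich(doc_a), enrich(doc_b))
print(plain_similarity, enriched_similarity)
```

In this toy setting the plain concept vectors are orthogonal, while the enriched vectors overlap because both concepts are related to each other and to "Felidae". That is the kind of sparsity relief the abstract attributes to the semantic matrix for short texts.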
