IEICE TRANSACTIONS on Information

Cluster-Based Minority Over-Sampling for Imbalanced Datasets

Kamthorn PUNTUMAPON, Thanawin RAKTHAMAMON, Kitsana WAIYAMAI

Summary:

Synthetic over-sampling is a well-known approach to class imbalance: it modifies the class distribution by generating synthetic samples. A large number of synthetic over-sampling techniques have been proposed; however, most of them suffer from the over-generalization problem, whereby synthetic minority class samples are generated inside the majority class region. A classifier trained on such an over-generalized dataset may misclassify majority class members as belonging to the minority class. In this paper, a method called TRIM is proposed to overcome the over-generalization problem. The idea is to identify minority class regions that strike a compromise between generalization and overfitting. TRIM first identifies all the minority class regions in the form of clusters. It then merges a large number of small minority class clusters into more generalized clusters. To enhance the generalization ability, a cluster connection step is proposed that increases generalization of the minority class while avoiding over-generalization toward the majority class. As a result, the classifier is able to correctly classify more minority class samples while maintaining its precision. Compared with SMOTE and extended versions such as Borderline-SMOTE, experimental results show that TRIM yields significant performance improvements in terms of F-measure and AUC. TRIM can be used as a pre-processing step for synthetic over-sampling methods such as SMOTE and its extended versions.
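
The over-generalization problem mentioned in the summary can be illustrated with a small sketch. The code below is not TRIM, whose details are not given on this page; it is a simplified SMOTE-style interpolation written only for illustration. The function name `smote_like_oversample`, its parameters, and the toy data are assumptions. The toy minority class has two groups, and the gap between them is assumed to be occupied by the majority class.

```python
import random

def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each on the line segment between a
    minority sample and one of its k nearest minority neighbours.
    Simplified SMOTE-style interpolation (hypothetical helper)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic

# Toy data (assumed): two minority groups near x = 0..1 and x = 9..10.
minority = [(0.0, 0.0), (1.0, 0.0), (9.0, 0.0), (10.0, 0.0)]

# k=1: each sample only pairs with its closest neighbour, so synthetic
# points stay inside the two groups.
local_points = smote_like_oversample(minority, n_new=20, k=1)

# k=3: neighbours may come from the other group, so synthetic points can
# land in the gap between the groups (the over-generalization case).
wide_points = smote_like_oversample(minority, n_new=20, k=3)
```

With a small neighbourhood, the synthetic points stay within each minority group; with a larger one, interpolation can bridge the two groups and place minority samples in the region between them. Restricting generation to suitable minority regions, as the summary describes in terms of clusters, is one way to limit this effect.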

Publication
IEICE TRANSACTIONS on Information Vol.E99-D No.12 pp.3101-3109
Publication Date
2016/12/01
Publicized
2016/09/06
Online ISSN
1745-1361
DOI
10.1587/transinf.2016EDP7130
Type of Manuscript
PAPER
Category
Artificial Intelligence, Data Mining

Authors

Kamthorn PUNTUMAPON
  Kasetsart University
Thanawin RAKTHAMAMON
  Kasetsart University
Kitsana WAIYAMAI
  Kasetsart University

Keyword