Improved Majority Filtering Algorithm for Cleaning Class Label Noise in Supervised Learning

Muhammad Ammar MALIK, Jae Young CHOI, Moonsoo KANG, Bumshik LEE

Summary:

In most supervised learning problems, the labeling quality of a dataset plays a paramount role in learning high-performance classifiers. A classifier's performance can be significantly degraded if it is trained on mislabeled data, so identifying such examples in the dataset is of critical importance. In this study, we propose an improved majority filtering algorithm that exploits the ability of a support vector machine (SVM) to capture potentially mislabeled examples as support vectors (SVs). The key technical contribution of our work is that the base (or component) classifiers that make up the ensemble are trained on the non-SV examples only, while the examples captured as SVs are used at test time. An example is tagged as mislabeled if the majority of the base classifiers classify it incorrectly. Experimental results confirm that our algorithm not only identifies mislabeled examples with high accuracy and higher F1 scores, but is also significantly faster than previous methods.
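The train-on-non-SVs, vote-on-SVs step described in the summary can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names (`majority_filter`, `nearest_centroid`, `knn`) are hypothetical, the base learners are simple stand-ins because the summary does not say which classifiers form the ensemble, and the support-vector indices `sv_idx` are assumed to have been obtained beforehand from a fitted SVM (for instance, the `support_` attribute of scikit-learn's `SVC`).

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nearest_centroid(train_X, train_y):
    """Stand-in base learner: predict the class with the closest centroid."""
    sums, counts = {}, Counter(train_y)
    for x, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    centroids = {c: [s / counts[c] for s in acc] for c, acc in sums.items()}
    return lambda x: min(centroids, key=lambda c: sq_dist(x, centroids[c]))


def knn(k):
    """Stand-in base learner factory: k-nearest-neighbour majority vote."""
    def train(train_X, train_y):
        def predict(x):
            pairs = sorted(zip(train_X, train_y), key=lambda p: sq_dist(x, p[0]))
            return Counter(label for _, label in pairs[:k]).most_common(1)[0][0]
        return predict
    return train


def majority_filter(X, y, sv_idx, trainers):
    """Return indices of SV examples misclassified by a strict majority.

    Base classifiers are trained on the non-SV examples only and are then
    evaluated on the examples listed in sv_idx (assumed to be SVM support
    vectors computed elsewhere).
    """
    sv = set(sv_idx)
    clean_X = [x for i, x in enumerate(X) if i not in sv]
    clean_y = [t for i, t in enumerate(y) if i not in sv]
    models = [train(clean_X, clean_y) for train in trainers]
    flagged = []
    for i in sorted(sv):
        wrong = sum(model(X[i]) != y[i] for model in models)
        if wrong > len(models) / 2:  # strict majority disagrees with the label
            flagged.append(i)
    return flagged


# Toy data: two clusters; index 10 lies in the class-0 cluster but is labeled 1.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6),
     (2, 2), (4, 4), (0.5, 0.5)]
y = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1]
sv_idx = [8, 9, 10]  # assumed support-vector indices, not computed here

flagged = majority_filter(X, y, sv_idx, [nearest_centroid, knn(1), knn(3)])
print(flagged)  # [10]
```

Only the SV candidates are ever scored, which is one plausible reason the approach can be faster than a filter that cross-validates over every example; the two correctly labeled boundary points (indices 8 and 9) are left untouched because every base classifier agrees with their labels.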

Publication
IEICE TRANSACTIONS on Fundamentals Vol.E102-A No.11 pp.1556-1559
Publication Date
2019/11/01
Online ISSN
1745-1337
DOI
10.1587/transfun.E102.A.1556
Type of Manuscript
LETTER
Category
Digital Signal Processing

Authors

Muhammad Ammar MALIK
  Chosun University
Jae Young CHOI
  Hankuk University of Foreign Studies
Moonsoo KANG
  Chosun University
Bumshik LEE
  Chosun University

Keyword