The Performance Stability of Defect Prediction Models with Class Imbalance: An Empirical Study

Qiao YU, Shujuan JIANG, Yanmei ZHANG

Summary:

Class imbalance has drawn much attention from researchers in software defect prediction. In practice, the performance of defect prediction models may be affected by the class imbalance problem. In this paper, we present an approach to evaluating the performance stability of defect prediction models on imbalanced datasets. First, random sampling is applied to convert the original imbalanced dataset into a set of new datasets with different imbalance ratios. Second, typical prediction models are selected to make predictions on these newly constructed datasets, and the Coefficient of Variation (C·V) is used to evaluate the performance stability of the different models. Finally, an empirical study is designed to evaluate the performance stability of six prediction models that are widely used in software defect prediction. The results show that the performance of C4.5 is unstable on imbalanced datasets, and that the performance of Naive Bayes and Random Forest is more stable than that of the other models.
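
The two building blocks named in the summary, resampling to a target imbalance ratio and scoring stability with C·V, can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the helper names `resample_to_ratio` and `coefficient_of_variation` are hypothetical, the resampling is assumed to undersample the non-defective class only, and C·V is assumed to be the population standard deviation divided by the mean, expressed as a percentage. The summary itself does not specify any of these details.

```python
import random
import statistics


def coefficient_of_variation(scores):
    """Return standard deviation / mean of the scores, as a percentage."""
    mean = statistics.fmean(scores)
    if mean == 0:
        raise ValueError("C·V is undefined when the mean is zero")
    # Assumption: population standard deviation; the summary does not say
    # whether the population or the sample form is used.
    return statistics.pstdev(scores) / mean * 100.0


def resample_to_ratio(defective, clean, ratio, rng):
    """Build a dataset with roughly `ratio` clean modules per defective one.

    Hypothetical helper: keeps every defective module and randomly
    undersamples the clean modules (assumed sampling scheme).
    """
    n_clean = min(len(clean), round(ratio * len(defective)))
    return list(defective) + rng.sample(list(clean), n_clean)


# Example: derive datasets at several imbalance ratios from one original set.
rng = random.Random(0)
defective = [("module_d%d" % i, 1) for i in range(10)]
clean = [("module_c%d" % i, 0) for i in range(100)]
datasets = {r: resample_to_ratio(defective, clean, r, rng) for r in (1, 3, 5)}
```

A model would then be trained and evaluated once per resampled dataset, and the resulting performance scores (one per imbalance ratio) passed to `coefficient_of_variation`. Under this reading, a lower C·V indicates that the model's performance changes less as the imbalance ratio varies.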

Publication
IEICE TRANSACTIONS on Information Vol.E100-D No.2 pp.265-272
Publication Date
2017/02/01
Publicized
2016/11/04
Online ISSN
1745-1361
DOI
10.1587/transinf.2016EDP7204
Type of Manuscript
PAPER
Category
Software Engineering

Authors

Qiao YU
  China University of Mining and Technology
Shujuan JIANG
  China University of Mining and Technology
Yanmei ZHANG
  China University of Mining and Technology

Keyword