Cross-Validation-Based Association Rule Prioritization Metric for Software Defect Characterization

Takashi WATANABE; Akito MONDEN; Zeynep YÜCEL; Yasutaka KAMEI; Shuji MORISAKI

doi:10.1587/transinf.2018EDP7020

IEICE TRANSACTIONS on Information

Summary:

Association rule mining discovers relationships among variables in a data set and represents them as rules. Such rules are often expected to have predictive ability, that is, to be able to predict future events, but commonly used rule interestingness measures, such as support and confidence, do not directly assess their predictive power. This paper proposes a cross-validation-based metric that quantifies the predictive power of such rules for characterizing software defects. An experimental evaluation of this metric on four open-source data sets (Mylyn, NetBeans, Apache Ant and jEdit) shows that it can improve rule prioritization performance over conventional metrics (support, confidence and odds ratio) by 72.8% for Mylyn, 15.0% for NetBeans, 10.5% for Apache Ant and 0% for jEdit in terms of the SumNormPre(100) precision criterion. This suggests that the proposed metric can provide better rule prioritization performance than conventional metrics and can provide at least similar performance even in the worst case.
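
To make the conventional measures named in the summary concrete, the following is a minimal Python sketch that computes support, confidence and odds ratio for a single rule ("large module implies defective") using their standard textbook definitions. The data, field names and function names are all hypothetical. The last function, `mean_held_out_confidence`, is only an assumed stand-in for the general idea of scoring a rule on held-out folds; it is not the metric defined in the paper, whose exact formulation is not given on this page.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, bool]
Predicate = Callable[[Record], bool]


def contingency(records: List[Record], antecedent: Predicate,
                consequent: Predicate) -> Tuple[int, int, int, int]:
    """Return the 2x2 counts (a, b, c, d) for the rule antecedent => consequent.

    a: antecedent and consequent hold, b: antecedent only,
    c: consequent only, d: neither holds.
    """
    a = b = c = d = 0
    for r in records:
        x, y = antecedent(r), consequent(r)
        if x and y:
            a += 1
        elif x:
            b += 1
        elif y:
            c += 1
        else:
            d += 1
    return a, b, c, d


def support(a: int, b: int, c: int, d: int) -> float:
    """Fraction of all records where antecedent and consequent both hold."""
    n = a + b + c + d
    return a / n if n else 0.0


def confidence(a: int, b: int, c: int, d: int) -> float:
    """Fraction of records matching the antecedent that also match the consequent."""
    return a / (a + b) if (a + b) else 0.0


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Standard odds ratio (a*d)/(b*c); infinite when b*c is zero."""
    return (a * d) / (b * c) if b * c else float("inf")


def mean_held_out_confidence(records: List[Record], antecedent: Predicate,
                             consequent: Predicate, k: int = 3) -> float:
    """ASSUMPTION, not the paper's metric: average the rule's confidence
    over k held-out folds (record i goes to fold i % k), skipping folds
    in which the antecedent never matches."""
    scores = []
    for fold in range(k):
        held_out = [r for i, r in enumerate(records) if i % k == fold]
        a, b, c, d = contingency(held_out, antecedent, consequent)
        if a + b:
            scores.append(confidence(a, b, c, d))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: (module is large, module is defective).
_PAIRS = [(True, True), (True, True), (True, True), (True, False),
          (True, True), (True, False), (False, True), (False, False),
          (False, False), (False, False), (False, True), (False, False)]
RECORDS: List[Record] = [{"large": l, "defective": d} for l, d in _PAIRS]


def is_large(r: Record) -> bool:
    return r["large"]


def is_defective(r: Record) -> bool:
    return r["defective"]


if __name__ == "__main__":
    counts = contingency(RECORDS, is_large, is_defective)
    print("support:", support(*counts))
    print("confidence:", confidence(*counts))
    print("odds ratio:", odds_ratio(*counts))
    print("mean held-out confidence:",
          mean_held_out_confidence(RECORDS, is_large, is_defective, k=3))
```

Support, confidence and odds ratio are all computed once on the full data set, which is why they describe how well a rule fits the observed data rather than how well it predicts unseen data. Scoring on held-out folds is one simple way to probe the latter.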

Publication: IEICE TRANSACTIONS on Information Vol.E101-D No.9 pp.2269-2278

Publication Date: 2018/09/01

Publicized: 2018/06/13

Online ISSN: 1745-1361

DOI: 10.1587/transinf.2018EDP7020

Type of Manuscript: PAPER

Category: Software Engineering

Authors

Takashi WATANABE
  Okayama University
Akito MONDEN
  Okayama University
Zeynep YÜCEL
  Okayama University
Yasutaka KAMEI
  Kyushu University
Shuji MORISAKI
  Nagoya University

Keyword

association rule mining, defect prediction, cross-validation, data mining, software quality

Cite this

Takashi WATANABE, Akito MONDEN, Zeynep YÜCEL, Yasutaka KAMEI, Shuji MORISAKI, "Cross-Validation-Based Association Rule Prioritization Metric for Software Defect Characterization" in IEICE TRANSACTIONS on Information, vol. E101-D, no. 9, pp. 2269-2278, September 2018, doi: 10.1587/transinf.2018EDP7020.
Abstract: Association rule mining discovers relationships among variables in a data set and represents them as rules. Such rules are often expected to have predictive ability, that is, to be able to predict future events, but commonly used rule interestingness measures, such as support and confidence, do not directly assess their predictive power. This paper proposes a cross-validation-based metric that quantifies the predictive power of such rules for characterizing software defects. An experimental evaluation of this metric on four open-source data sets (Mylyn, NetBeans, Apache Ant and jEdit) shows that it can improve rule prioritization performance over conventional metrics (support, confidence and odds ratio) by 72.8% for Mylyn, 15.0% for NetBeans, 10.5% for Apache Ant and 0% for jEdit in terms of the SumNormPre(100) precision criterion. This suggests that the proposed metric can provide better rule prioritization performance than conventional metrics and can provide at least similar performance even in the worst case.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2018EDP7020/_p

@ARTICLE{e101-d_9_2269,
author={Takashi WATANABE and Akito MONDEN and Zeynep YÜCEL and Yasutaka KAMEI and Shuji MORISAKI},
journal={IEICE TRANSACTIONS on Information},
title={Cross-Validation-Based Association Rule Prioritization Metric for Software Defect Characterization},
year={2018},
volume={E101-D},
number={9},
pages={2269-2278},
abstract={Association rule mining discovers relationships among variables in a data set and represents them as rules. Such rules are often expected to have predictive ability, that is, to be able to predict future events, but commonly used rule interestingness measures, such as support and confidence, do not directly assess their predictive power. This paper proposes a cross-validation-based metric that quantifies the predictive power of such rules for characterizing software defects. An experimental evaluation of this metric on four open-source data sets (Mylyn, NetBeans, Apache Ant and jEdit) shows that it can improve rule prioritization performance over conventional metrics (support, confidence and odds ratio) by 72.8% for Mylyn, 15.0% for NetBeans, 10.5% for Apache Ant and 0% for jEdit in terms of the SumNormPre(100) precision criterion. This suggests that the proposed metric can provide better rule prioritization performance than conventional metrics and can provide at least similar performance even in the worst case.},
keywords={association rule mining, defect prediction, cross-validation, data mining, software quality},
doi={10.1587/transinf.2018EDP7020},
ISSN={1745-1361},
month={September}
}

TY - JOUR
TI - Cross-Validation-Based Association Rule Prioritization Metric for Software Defect Characterization
T2 - IEICE TRANSACTIONS on Information
SP - 2269
EP - 2278
AU - Takashi WATANABE
AU - Akito MONDEN
AU - Zeynep YÜCEL
AU - Yasutaka KAMEI
AU - Shuji MORISAKI
PY - 2018
DO - 10.1587/transinf.2018EDP7020
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E101-D
IS - 9
JA - IEICE TRANSACTIONS on Information
Y1 - September 2018
AB - Association rule mining discovers relationships among variables in a data set and represents them as rules. Such rules are often expected to have predictive ability, that is, to be able to predict future events, but commonly used rule interestingness measures, such as support and confidence, do not directly assess their predictive power. This paper proposes a cross-validation-based metric that quantifies the predictive power of such rules for characterizing software defects. An experimental evaluation of this metric on four open-source data sets (Mylyn, NetBeans, Apache Ant and jEdit) shows that it can improve rule prioritization performance over conventional metrics (support, confidence and odds ratio) by 72.8% for Mylyn, 15.0% for NetBeans, 10.5% for Apache Ant and 0% for jEdit in terms of the SumNormPre(100) precision criterion. This suggests that the proposed metric can provide better rule prioritization performance than conventional metrics and can provide at least similar performance even in the worst case.
ER -
