Association rule mining discovers relationships among variables in a data set and represents them as rules. Such rules are often expected to have predictive ability, that is, to be able to predict future events, but commonly used rule interestingness measures, such as support and confidence, do not directly assess their predictive power. This paper proposes a cross-validation-based metric that quantifies the predictive power of such rules for characterizing software defects. An experimental evaluation of this metric on four open-source data sets (Mylyn, NetBeans, Apache Ant and jEdit) shows that it can improve rule prioritization performance over conventional metrics (support, confidence and odds ratio) by 72.8% for Mylyn, 15.0% for NetBeans, 10.5% for Apache Ant and 0% for jEdit in terms of the SumNormPre(100) precision criterion. This suggests that the proposed metric can provide better rule prioritization performance than conventional metrics and, even in the worst case, at least similar performance.
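For reference, the conventional measures named in the abstract (support, confidence and odds ratio) can be computed from a 2x2 contingency table of a rule "antecedent => consequent". The following is a minimal sketch using their textbook definitions; the function name `rule_measures`, the field names `loc_high` and `defective`, and the toy data are hypothetical. It does not implement the proposed cross-validation-based metric, whose definition is not given on this page.

```python
def rule_measures(rows, antecedent, consequent):
    """Return (support, confidence, odds_ratio) for the rule antecedent => consequent.

    rows       -- list of records (here: dicts); hypothetical toy format
    antecedent -- predicate on a record (left-hand side of the rule)
    consequent -- predicate on a record (right-hand side of the rule)
    """
    n = len(rows)
    # 2x2 contingency table counts
    a = sum(1 for r in rows if antecedent(r) and consequent(r))      # A and C
    b = sum(1 for r in rows if antecedent(r) and not consequent(r))  # A and not C
    c = sum(1 for r in rows if not antecedent(r) and consequent(r))  # not A and C
    d = n - a - b - c                                                # not A and not C

    support = a / n if n else None                  # fraction of records matching the whole rule
    confidence = a / (a + b) if (a + b) else None   # P(C | A) estimated on the same data
    odds_ratio = (a * d) / (b * c) if (b * c) else None  # undefined when b or c is zero
    return support, confidence, odds_ratio


# Hypothetical toy data: 10 modules, rule "loc_high => defective"
rows = (
    [{"loc_high": True, "defective": True}] * 3
    + [{"loc_high": True, "defective": False}] * 1
    + [{"loc_high": False, "defective": True}] * 1
    + [{"loc_high": False, "defective": False}] * 5
)

measures = rule_measures(rows, lambda r: r["loc_high"], lambda r: r["defective"])
print(measures)  # (0.3, 0.75, 15.0)
```

All three values are computed on the same data the rule would be mined from, which is why, as the abstract notes, they do not directly measure how well a rule predicts unseen data.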
Takashi WATANABE
Okayama University
Akito MONDEN
Okayama University
Zeynep YÜCEL
Okayama University
Yasutaka KAMEI
Kyushu University
Shuji MORISAKI
Nagoya University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Takashi WATANABE, Akito MONDEN, Zeynep YÜCEL, Yasutaka KAMEI, Shuji MORISAKI, "Cross-Validation-Based Association Rule Prioritization Metric for Software Defect Characterization" in IEICE TRANSACTIONS on Information,
vol. E101-D, no. 9, pp. 2269-2278, September 2018, doi: 10.1587/transinf.2018EDP7020.
Abstract: Association rule mining discovers relationships among variables in a data set and represents them as rules. Such rules are often expected to have predictive ability, that is, to be able to predict future events, but commonly used rule interestingness measures, such as support and confidence, do not directly assess their predictive power. This paper proposes a cross-validation-based metric that quantifies the predictive power of such rules for characterizing software defects. An experimental evaluation of this metric on four open-source data sets (Mylyn, NetBeans, Apache Ant and jEdit) shows that it can improve rule prioritization performance over conventional metrics (support, confidence and odds ratio) by 72.8% for Mylyn, 15.0% for NetBeans, 10.5% for Apache Ant and 0% for jEdit in terms of the SumNormPre(100) precision criterion. This suggests that the proposed metric can provide better rule prioritization performance than conventional metrics and, even in the worst case, at least similar performance.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2018EDP7020/_p
@ARTICLE{e101-d_9_2269,
author={Takashi WATANABE and Akito MONDEN and Zeynep YÜCEL and Yasutaka KAMEI and Shuji MORISAKI},
journal={IEICE TRANSACTIONS on Information},
title={Cross-Validation-Based Association Rule Prioritization Metric for Software Defect Characterization},
year={2018},
volume={E101-D},
number={9},
pages={2269-2278},
abstract={Association rule mining discovers relationships among variables in a data set and represents them as rules. Such rules are often expected to have predictive ability, that is, to be able to predict future events, but commonly used rule interestingness measures, such as support and confidence, do not directly assess their predictive power. This paper proposes a cross-validation-based metric that quantifies the predictive power of such rules for characterizing software defects. An experimental evaluation of this metric on four open-source data sets (Mylyn, NetBeans, Apache Ant and jEdit) shows that it can improve rule prioritization performance over conventional metrics (support, confidence and odds ratio) by 72.8% for Mylyn, 15.0% for NetBeans, 10.5% for Apache Ant and 0% for jEdit in terms of the SumNormPre(100) precision criterion. This suggests that the proposed metric can provide better rule prioritization performance than conventional metrics and, even in the worst case, at least similar performance.},
keywords={},
doi={10.1587/transinf.2018EDP7020},
ISSN={1745-1361},
month={September},}
TY - JOUR
TI - Cross-Validation-Based Association Rule Prioritization Metric for Software Defect Characterization
T2 - IEICE TRANSACTIONS on Information
SP - 2269
EP - 2278
AU - Takashi WATANABE
AU - Akito MONDEN
AU - Zeynep YÜCEL
AU - Yasutaka KAMEI
AU - Shuji MORISAKI
PY - 2018
DO - 10.1587/transinf.2018EDP7020
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E101-D
IS - 9
JA - IEICE TRANSACTIONS on Information
Y1 - September 2018
AB - Association rule mining discovers relationships among variables in a data set and represents them as rules. Such rules are often expected to have predictive ability, that is, to be able to predict future events, but commonly used rule interestingness measures, such as support and confidence, do not directly assess their predictive power. This paper proposes a cross-validation-based metric that quantifies the predictive power of such rules for characterizing software defects. An experimental evaluation of this metric on four open-source data sets (Mylyn, NetBeans, Apache Ant and jEdit) shows that it can improve rule prioritization performance over conventional metrics (support, confidence and odds ratio) by 72.8% for Mylyn, 15.0% for NetBeans, 10.5% for Apache Ant and 0% for jEdit in terms of the SumNormPre(100) precision criterion. This suggests that the proposed metric can provide better rule prioritization performance than conventional metrics and, even in the worst case, at least similar performance.
ER -