Association rule mining discovers relationships among variables in a data set, representing them as rules. These rules are often expected to have predictive ability, that is, to be able to predict future events, but commonly used rule interestingness measures, such as support and confidence, do not directly assess their predictive power. This paper proposes a cross-validation-based metric that quantifies the predictive power of such rules for characterizing software defects. An experimental evaluation of this metric on four open-source data sets (Mylyn, NetBeans, Apache Ant and jEdit) shows that it can improve rule prioritization performance over conventional metrics (support, confidence and odds ratio) by 72.8% for Mylyn, 15.0% for NetBeans, 10.5% for Apache Ant and 0% for jEdit in terms of the **SumNormPre**(100) precision criterion. This suggests that the proposed metric can provide better rule prioritization performance than conventional metrics and can at least provide similar performance even in the worst case.
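The abstract does not define the proposed metric, so the following is only an illustrative sketch of the general idea it describes: support and confidence are computed on the data a rule was mined from, whereas a cross-validated score measures how well the rule holds on data it has not seen. The function names, the fold scheme, the thresholds and the toy records are all assumptions made for this example and are not taken from the paper.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, bool]
Predicate = Callable[[Row], bool]


def support_confidence(rows: List[Row], antecedent: Predicate,
                       consequent: Predicate) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent => consequent."""
    n = len(rows)
    n_a = sum(1 for r in rows if antecedent(r))
    n_ac = sum(1 for r in rows if antecedent(r) and consequent(r))
    support = n_ac / n if n else 0.0
    confidence = n_ac / n_a if n_a else 0.0
    return support, confidence


def cross_validated_confidence(rows: List[Row], antecedent: Predicate,
                               consequent: Predicate, k: int = 5,
                               min_support: float = 0.01,
                               min_confidence: float = 0.5) -> float:
    """Mean held-out confidence of a rule over k folds.

    Hypothetical stand-in for a cross-validation-based score, NOT the
    paper's definition: a fold counts only if the rule passes the
    (assumed) thresholds on the training part and its antecedent
    matches at least one record in the held-out part.
    """
    folds = [rows[i::k] for i in range(k)]  # simple interleaved split
    scores = []
    for i, test in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        train_sup, train_conf = support_confidence(train, antecedent, consequent)
        if train_sup < min_support or train_conf < min_confidence:
            continue  # rule would not have been mined from this training set
        if not any(antecedent(r) for r in test):
            continue  # rule never fires on the held-out records
        scores.append(support_confidence(test, antecedent, consequent)[1])
    return sum(scores) / len(scores) if scores else 0.0


# Toy module records (made up): "loc_high" marks large modules,
# "defective" marks modules with at least one reported defect.
demo_rows: List[Row] = [
    {"loc_high": i % 4 < 2, "defective": i % 4 < 2 and i != 9}
    for i in range(12)
]


def is_large(r: Row) -> bool:
    return r["loc_high"]


def is_defective(r: Row) -> bool:
    return r["defective"]


in_sample = support_confidence(demo_rows, is_large, is_defective)
held_out = cross_validated_confidence(demo_rows, is_large, is_defective, k=3)
```

Ranking rules by a held-out score of this kind, rather than by in-sample confidence alone, is one way to favor rules that generalize; how the paper actually combines or normalizes fold results is not stated in the abstract.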

- Publication: IEICE TRANSACTIONS on Information Vol.E101-D No.9 pp.2269-2278
- Publication Date: 2018/09/01
- Publicized: 2018/06/13
- Online ISSN: 1745-1361
- DOI: 10.1587/transinf.2018EDP7020
- Type of Manuscript: PAPER
- Category: Software Engineering

- Takashi WATANABE, Okayama University
- Akito MONDEN, Okayama University
- Zeynep YÜCEL, Okayama University
- Yasutaka KAMEI, Kyushu University
- Shuji MORISAKI, Nagoya University

The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.

Takashi WATANABE, Akito MONDEN, Zeynep YÜCEL, Yasutaka KAMEI, Shuji MORISAKI, "Cross-Validation-Based Association Rule Prioritization Metric for Software Defect Characterization," in IEICE TRANSACTIONS on Information, vol. E101-D, no. 9, pp. 2269-2278, September 2018, doi: 10.1587/transinf.2018EDP7020.

URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2018EDP7020/_p

@ARTICLE{e101-d_9_2269,
  author={Takashi WATANABE and Akito MONDEN and Zeynep YÜCEL and Yasutaka KAMEI and Shuji MORISAKI},
  journal={IEICE TRANSACTIONS on Information},
  title={Cross-Validation-Based Association Rule Prioritization Metric for Software Defect Characterization},
  year={2018},
  volume={E101-D},
  number={9},
  pages={2269--2278},
  abstract={Association rule mining discovers relationships among variables in a data set, representing them as rules. These rules are often expected to have predictive ability, that is, to be able to predict future events, but commonly used rule interestingness measures, such as support and confidence, do not directly assess their predictive power. This paper proposes a cross-validation-based metric that quantifies the predictive power of such rules for characterizing software defects. An experimental evaluation of this metric on four open-source data sets (Mylyn, NetBeans, Apache Ant and jEdit) shows that it can improve rule prioritization performance over conventional metrics (support, confidence and odds ratio) by 72.8% for Mylyn, 15.0% for NetBeans, 10.5% for Apache Ant and 0% for jEdit in terms of the **SumNormPre**(100) precision criterion. This suggests that the proposed metric can provide better rule prioritization performance than conventional metrics and can at least provide similar performance even in the worst case.},
  keywords={},
  doi={10.1587/transinf.2018EDP7020},
  ISSN={1745-1361},
  month={September},
}

TY  - JOUR
TI  - Cross-Validation-Based Association Rule Prioritization Metric for Software Defect Characterization
T2  - IEICE TRANSACTIONS on Information
SP  - 2269
EP  - 2278
AU  - Takashi WATANABE
AU  - Akito MONDEN
AU  - Zeynep YÜCEL
AU  - Yasutaka KAMEI
AU  - Shuji MORISAKI
PY  - 2018
DO  - 10.1587/transinf.2018EDP7020
JO  - IEICE TRANSACTIONS on Information
SN  - 1745-1361
VL  - E101-D
IS  - 9
JA  - IEICE TRANSACTIONS on Information
Y1  - September 2018
AB  - Association rule mining discovers relationships among variables in a data set, representing them as rules. These rules are often expected to have predictive ability, that is, to be able to predict future events, but commonly used rule interestingness measures, such as support and confidence, do not directly assess their predictive power. This paper proposes a cross-validation-based metric that quantifies the predictive power of such rules for characterizing software defects. An experimental evaluation of this metric on four open-source data sets (Mylyn, NetBeans, Apache Ant and jEdit) shows that it can improve rule prioritization performance over conventional metrics (support, confidence and odds ratio) by 72.8% for Mylyn, 15.0% for NetBeans, 10.5% for Apache Ant and 0% for jEdit in terms of the **SumNormPre**(100) precision criterion. This suggests that the proposed metric can provide better rule prioritization performance than conventional metrics and can at least provide similar performance even in the worst case.
ER  -