To improve the accuracy of predicting the number of faults, many studies have proposed prediction models. Such a model is built from a dataset collected in past projects, and the number of faults is predicted by applying the model to data from the current project. Datasets sometimes contain many data points where the dependent variable, i.e., the number of faults, is zero. When a multiple linear regression model is fitted to such a dataset, the model may not be built properly. To avoid this problem, the Tobit model is considered effective for predicting software faults. It assumes that the range of the dependent variable is limited, and the model is built on that assumption. Similar to the Tobit model, the Poisson regression model assumes there are many data points whose value is zero on the dependent variable. In addition, log-transformation is sometimes applied to improve model accuracy, and ensemble methods are effective for improving the prediction accuracy of the models. We evaluated the prediction accuracy of these methods separately for cases where the number of faults is zero and where it is not. In the experiment, our proposed ensemble method showed the highest accuracy: Pred25 was 21% when the number of faults was not zero and 45% when it was zero.
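As a rough illustration of the evaluation described in the abstract, the sketch below computes Pred25 separately for zero and non-zero fault counts. Pred25 is conventionally the share of predictions whose relative error is at most 25%. The abstract does not say how a prediction is scored when the actual count is zero, where relative error is undefined, so the sketch assumes an absolute-error check with a hypothetical `zero_tolerance` of 0.5 faults. The function name and sample data are also illustrative and do not come from the paper.

```python
from typing import Dict, Optional, Sequence


def pred25(actual: Sequence[float], predicted: Sequence[float],
           zero_tolerance: float = 0.5) -> Dict[str, Optional[float]]:
    """Return Pred25 separately for non-zero and zero actual fault counts."""
    hits = {"nonzero": 0, "zero": 0}
    totals = {"nonzero": 0, "zero": 0}
    for a, p in zip(actual, predicted):
        if a == 0:
            # ASSUMPTION: relative error is undefined for a == 0, so a
            # prediction counts as a hit if it is within zero_tolerance faults.
            totals["zero"] += 1
            hits["zero"] += abs(p) <= zero_tolerance
        else:
            # Conventional rule: hit if the relative error is at most 25%.
            totals["nonzero"] += 1
            hits["nonzero"] += abs(p - a) / a <= 0.25
    # None marks a group that has no data points.
    return {k: (hits[k] / totals[k] if totals[k] else None) for k in totals}


# Made-up fault counts and predictions, for illustration only.
actual = [0, 0, 4, 10, 8]
predicted = [0.2, 1.5, 5.0, 9.0, 3.0]
print(pred25(actual, predicted))
```

Reporting the two groups separately keeps a model that does well on the many faultless cases from hiding poor accuracy on the modules that do contain faults.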

- Publication
- IEICE TRANSACTIONS on Information Vol.E103-D No.6 pp.1319-1327

- Publication Date
- 2020/06/01

- Publicized
- 2020/03/09

- Online ISSN
- 1745-1361

- DOI
- 10.1587/transinf.2019KBP0019

- Type of Manuscript
- Special Section PAPER (Special Section on Knowledge-Based Software Engineering)


Yukasa MURAKAMI

Kindai University

Masateru TSUNODA

Kindai University

Koji TODA

Fukuoka Institute of Technology University

The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.

Yukasa MURAKAMI, Masateru TSUNODA, Koji TODA, "Evaluation of Software Fault Prediction Models Considering Faultless Cases," in IEICE TRANSACTIONS on Information, vol. E103-D, no. 6, pp. 1319-1327, June 2020, doi: 10.1587/transinf.2019KBP0019.

URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2019KBP0019/_p

@ARTICLE{e103-d_6_1319,
author={Yukasa MURAKAMI and Masateru TSUNODA and Koji TODA},
journal={IEICE TRANSACTIONS on Information},
title={Evaluation of Software Fault Prediction Models Considering Faultless Cases},
year={2020},
volume={E103-D},
number={6},
pages={1319-1327},
abstract={To enhance the prediction accuracy of the number of faults, many studies proposed various prediction models. The model is built using a dataset collected in past projects, and the number of faults is predicted using the model and the data of the current project. Datasets sometimes have many data points where the dependent variable, i.e., the number of faults is zero. When a multiple linear regression model is made using the dataset, the model may not be built properly. To avoid the problem, the Tobit model is considered to be effective when predicting software faults. The model assumes that the range of a dependent variable is limited and the model is built based on the assumption. Similar to the Tobit model, the Poisson regression model assumes there are many data points whose value is zero on the dependent variable. Also, log-transformation is sometimes applied to enhance the accuracy of the model. Additionally, ensemble methods are effective to enhance prediction accuracy of the models. We evaluated the prediction accuracy of the methods separately, when the number of faults is zero and not zero. In the experiment, our proposed ensemble method showed the highest accuracy, and Pred25 was 21% when the number of faults was not zero, and it was 45% when the number was zero.},

keywords={},
doi={10.1587/transinf.2019KBP0019},
ISSN={1745-1361},
month={June},
}

TY - JOUR
TI - Evaluation of Software Fault Prediction Models Considering Faultless Cases
T2 - IEICE TRANSACTIONS on Information
SP - 1319
EP - 1327
AU - Yukasa MURAKAMI
AU - Masateru TSUNODA
AU - Koji TODA
PY - 2020
DO - 10.1587/transinf.2019KBP0019
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E103-D
IS - 6
JA - IEICE TRANSACTIONS on Information
Y1 - June 2020
AB - To enhance the prediction accuracy of the number of faults, many studies proposed various prediction models. The model is built using a dataset collected in past projects, and the number of faults is predicted using the model and the data of the current project. Datasets sometimes have many data points where the dependent variable, i.e., the number of faults is zero. When a multiple linear regression model is made using the dataset, the model may not be built properly. To avoid the problem, the Tobit model is considered to be effective when predicting software faults. The model assumes that the range of a dependent variable is limited and the model is built based on the assumption. Similar to the Tobit model, the Poisson regression model assumes there are many data points whose value is zero on the dependent variable. Also, log-transformation is sometimes applied to enhance the accuracy of the model. Additionally, ensemble methods are effective to enhance prediction accuracy of the models. We evaluated the prediction accuracy of the methods separately, when the number of faults is zero and not zero. In the experiment, our proposed ensemble method showed the highest accuracy, and Pred25 was 21% when the number of faults was not zero, and it was 45% when the number was zero.

ER -