Cross-project defect prediction (CPDP) has recently become a hot research topic. It uses data from an existing source project to build a prediction model and predicts the defect-proneness of software instances in a target project. However, bridging the distribution difference between projects is challenging. To minimize this difference and predict unlabeled target instances, we present a novel approach called selective pseudo-labeling based subspace learning (SPSL). SPSL learns a common subspace using both labeled source instances and pseudo-labeled target instances. The accuracy of pseudo-labeling is improved by an iterative selective pseudo-labeling strategy: the pseudo-labeled target instances are iteratively updated by selecting the instances with high confidence from two pseudo-labeling techniques. Experiments on the AEEEM dataset show that SPSL is effective for CPDP.
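The iterative loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' SPSL implementation: the abstract does not specify the subspace objective or the two pseudo-labeling techniques, so the sketch assumes a regularized Fisher discriminant projection as a stand-in for subspace learning, and a nearest-centroid labeler plus a k-nearest-neighbor labeler as the two stand-in techniques. All function names, the confidence scores, and the selection schedule (keep the top `it / n_iter` fraction of target instances on which both labelers agree) are hypothetical choices made for illustration.

```python
import numpy as np


def lda_projection(X, y, reg=1e-3):
    """Stand-in subspace step (assumption): regularized Fisher discriminant directions."""
    classes = np.unique(y)
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = reg * np.eye(d)          # within-class scatter, regularized
    Sb = np.zeros((d, d))         # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mean, mc - mean)
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(-vals.real)[: len(classes) - 1]
    return vecs[:, order].real


def centroid_labeler(Zs, ys, Zt, classes):
    """Labeler 1 (assumption): nearest source class centroid, softmax of -distance."""
    cents = np.array([Zs[ys == c].mean(axis=0) for c in classes])
    d = np.linalg.norm(Zt[:, None, :] - cents[None, :, :], axis=2)
    score = np.exp(-(d - d.min(axis=1, keepdims=True)))
    prob = score / score.sum(axis=1, keepdims=True)
    return classes[prob.argmax(axis=1)], prob.max(axis=1)


def knn_labeler(Zs, ys, Zt, classes, k=5):
    """Labeler 2 (assumption): vote share among the k nearest source instances."""
    d = np.linalg.norm(Zt[:, None, :] - Zs[None, :, :], axis=2)
    nearest = ys[np.argsort(d, axis=1)[:, :k]]
    votes = np.stack([(nearest == c).mean(axis=1) for c in classes], axis=1)
    return classes[votes.argmax(axis=1)], votes.max(axis=1)


def selective_pseudo_labeling(Xs, ys, Xt, n_iter=5, k=5):
    """Return pseudo-labels for Xt and a mask of the confidently selected instances."""
    classes = np.unique(ys)
    selected = np.zeros(len(Xt), dtype=bool)
    pseudo = np.zeros(len(Xt), dtype=ys.dtype)
    for it in range(1, n_iter + 1):
        # Learn the subspace from labeled source data plus the
        # currently selected pseudo-labeled target instances.
        X_train = np.vstack([Xs, Xt[selected]])
        y_train = np.concatenate([ys, pseudo[selected]])
        P = lda_projection(X_train, y_train)
        Zs, Zt = Xs @ P, Xt @ P

        # Pseudo-label the target instances with both labelers.
        l1, c1 = centroid_labeler(Zs, ys, Zt, classes)
        l2, c2 = knn_labeler(Zs, ys, Zt, classes, k)
        agree = l1 == l2
        pseudo = np.where(c1 >= c2, l1, l2)

        # Keep a growing fraction of the most confident instances,
        # restricted to those on which the two labelers agree.
        conf = np.where(agree, c1 * c2, 0.0)
        n_keep = int(np.ceil(len(Xt) * it / n_iter))
        selected = np.zeros(len(Xt), dtype=bool)
        selected[np.argsort(-conf)[:n_keep]] = True
        selected &= agree
    return pseudo, selected
```

Calling `selective_pseudo_labeling(Xs, ys, Xt)` with a labeled source matrix `Xs`, source labels `ys`, and an unlabeled target matrix `Xt` returns one pseudo-label per target instance and a boolean mask of the instances that were treated as confident in the final round. Growing the selected fraction over iterations is one simple way to let early, reliable pseudo-labels shape the subspace before less certain ones are admitted.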
Ying SUN
Nanjing University of Posts and Telecommunications (NJUPT)
Xiao-Yuan JING
NJUPT, Wuhan University
Fei WU
NJUPT
Yanfei SUN
NJUPT, Jiangsu Engineering Research Center of HPC and Intelligent Processing
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Ying SUN, Xiao-Yuan JING, Fei WU, Yanfei SUN, "Selective Pseudo-Labeling Based Subspace Learning for Cross-Project Defect Prediction," in IEICE TRANSACTIONS on Information, vol. E103-D, no. 9, pp. 2003-2006, September 2020, doi: 10.1587/transinf.2020EDL8034.
Abstract: Cross-project defect prediction (CPDP) has recently become a hot research topic. It uses data from an existing source project to build a prediction model and predicts the defect-proneness of software instances in a target project. However, bridging the distribution difference between projects is challenging. To minimize this difference and predict unlabeled target instances, we present a novel approach called selective pseudo-labeling based subspace learning (SPSL). SPSL learns a common subspace using both labeled source instances and pseudo-labeled target instances. The accuracy of pseudo-labeling is improved by an iterative selective pseudo-labeling strategy: the pseudo-labeled target instances are iteratively updated by selecting the instances with high confidence from two pseudo-labeling techniques. Experiments on the AEEEM dataset show that SPSL is effective for CPDP.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2020EDL8034/_p
@ARTICLE{e103-d_9_2003,
author={Ying SUN and Xiao-Yuan JING and Fei WU and Yanfei SUN},
journal={IEICE TRANSACTIONS on Information},
title={Selective Pseudo-Labeling Based Subspace Learning for Cross-Project Defect Prediction},
year={2020},
volume={E103-D},
number={9},
pages={2003-2006},
abstract={Cross-project defect prediction (CPDP) has recently become a hot research topic. It uses data from an existing source project to build a prediction model and predicts the defect-proneness of software instances in a target project. However, bridging the distribution difference between projects is challenging. To minimize this difference and predict unlabeled target instances, we present a novel approach called selective pseudo-labeling based subspace learning (SPSL). SPSL learns a common subspace using both labeled source instances and pseudo-labeled target instances. The accuracy of pseudo-labeling is improved by an iterative selective pseudo-labeling strategy: the pseudo-labeled target instances are iteratively updated by selecting the instances with high confidence from two pseudo-labeling techniques. Experiments on the AEEEM dataset show that SPSL is effective for CPDP.},
keywords={},
doi={10.1587/transinf.2020EDL8034},
ISSN={1745-1361},
month={September},}
TY - JOUR
TI - Selective Pseudo-Labeling Based Subspace Learning for Cross-Project Defect Prediction
T2 - IEICE TRANSACTIONS on Information
SP - 2003
EP - 2006
AU - Ying SUN
AU - Xiao-Yuan JING
AU - Fei WU
AU - Yanfei SUN
PY - 2020
DO - 10.1587/transinf.2020EDL8034
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E103-D
IS - 9
JA - IEICE TRANSACTIONS on Information
Y1 - September 2020
AB - Cross-project defect prediction (CPDP) has recently become a hot research topic. It uses data from an existing source project to build a prediction model and predicts the defect-proneness of software instances in a target project. However, bridging the distribution difference between projects is challenging. To minimize this difference and predict unlabeled target instances, we present a novel approach called selective pseudo-labeling based subspace learning (SPSL). SPSL learns a common subspace using both labeled source instances and pseudo-labeled target instances. The accuracy of pseudo-labeling is improved by an iterative selective pseudo-labeling strategy: the pseudo-labeled target instances are iteratively updated by selecting the instances with high confidence from two pseudo-labeling techniques. Experiments on the AEEEM dataset show that SPSL is effective for CPDP.
ER -