The search functionality is under construction.

Author Search Result

[Author] Song HUANG(3hit)

1-3hit
  • Empirical Studies of a Kernel Density Estimation Based Naive Bayes Method for Software Defect Prediction

    Haijin JI  Song HUANG  Xuewei LV  Yaning WU  Yuntian FENG  

     
    PAPER-Software Engineering

      Pubricized:
    2018/10/03
      Vol:
    E102-D No:1
      Page(s):
    75-84

    Software defect prediction (SDP) plays a significant part in allocating testing resources reasonably, reducing testing costs, and ensuring software quality. One of the most widely used algorithms of SDP models is Naive Bayes (NB) because of its simplicity, effectiveness and robustness. In NB, when a data set has continuous or numeric attributes, they are generally assumed to follow normal distributions and incorporate the probability density function of normal distribution into their conditional probabilities estimates. However, after conducting a Kolmogorov-Smirnov test, we find that the 21 main software metrics follow non-normal distribution at the 5% significance level. Therefore, this paper proposes an improved NB approach, which estimates the conditional probabilities of NB with kernel density estimation of training data sets, to help improve the prediction accuracy of NB for SDP. To evaluate the proposed method, we carry out experiments on 34 software releases obtained from 10 open source projects provided by PROMISE repository. Four well-known classification algorithms are included for comparison, namely Naive Bayes, Support Vector Machine, Logistic Regression and Random Tree. The obtained results show that this new method is more successful than the four well-known classification algorithms in the most software releases.

  • A Method of K-Means Clustering Based on TF-IDF for Software Requirements Documents Written in Chinese Language

    Jing ZHU  Song HUANG  Yaqing SHI  Kaishun WU  Yanqiu WANG  

     
    PAPER-Software Engineering

      Pubricized:
    2021/12/28
      Vol:
    E105-D No:4
      Page(s):
    736-754

    Nowadays there is no way to automatically obtain the function points when using function point analyze (FPA) method, especially for the requirement documents written in Chinese language. Considering the characteristics of Chinese grammar in words segmentation, it is necessary to divide words accurately Chinese words, so that the subsequent entity recognition and disambiguation can be carried out in a smaller range, which lays a solid foundation for the efficient automatic extraction of the function points. Therefore, this paper proposed a method of K-Means clustering based on TF-IDF, and conducts experiments with 24 software requirement documents written in Chinese language. The results show that the best clustering effect is achieved when the extracted information is retained by 55% to 75% and the number of clusters takes the middle value of the total number of clusters. Not only for Chinese, this method and conclusion of this paper, but provides an important reference for automatic extraction of function points from software requirements documents written in other Oriental languages, and also fills the gaps of data preprocessing in the early stage of automatic calculation function points.

  • MTGAN: Extending Test Case set for Deep Learning Image Classifier

    Erhu LIU  Song HUANG  Cheng ZONG  Changyou ZHENG  Yongming YAO  Jing ZHU  Shiqi TANG  Yanqiu WANG  

     
    PAPER-Software Engineering

      Pubricized:
    2021/02/05
      Vol:
    E104-D No:5
      Page(s):
    709-722

    During the recent several years, deep learning has achieved excellent results in image recognition, voice processing, and other research areas, which has set off a new upsurge of research and application. Internal defects and external malicious attacks may threaten the safe and reliable operation of a deep learning system and even cause unbearable consequences. The technology of testing deep learning systems is still in its infancy. Traditional software testing technology is not applicable to test deep learning systems. In addition, the characteristics of deep learning such as complex application scenarios, the high dimensionality of input data, and poor interpretability of operation logic bring new challenges to the testing work. This paper focuses on the problem of test case generation and points out that adversarial examples can be used as test cases. Then the paper proposes MTGAN which is a framework to generate test cases for deep learning image classifiers based on Generative Adversarial Network. Finally, this paper evaluates the effectiveness of MTGAN.