
Author Search Result

[Author] Jung-Been LEE (2 hits)

1-2hit
  • Automatic Stop Word Generation for Mining Software Artifact Using Topic Model with Pointwise Mutual Information

    Jung-Been LEE  Taek LEE  Hoh Peter IN  

     
    PAPER-Software Engineering

    Publicized: 2019/05/27
    Vol: E102-D No: 9
    Page(s): 1761-1772

    Mining software artifacts is a useful way to understand the source code of software projects. Topic modeling in particular has been widely used to discover meaningful information from software artifacts. However, software artifacts are unstructured and contain a mix of textual types within the natural text. These characteristics degrade the performance of topic modeling. Among several natural language pre-processing tasks, removing stop words to reduce meaningless and uninteresting terms is an efficient way to improve the quality of topic models. Although many approaches are used to generate effective stop words, the resulting lists are outdated or too general to apply to mining software artifacts. In addition, the performance of the topic model is sensitive to the datasets used in the training for each approach. To resolve these problems, we propose an automatic stop word generation approach for topic models of software artifacts. By measuring topic coherence among the words in each topic using Pointwise Mutual Information (PMI), we add words with a low PMI score to our stop word list in every topic modeling loop. Our experiment shows that our stop word list yields higher topic model performance than lists from other approaches.
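
    The PMI-based filtering described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes PMI is estimated from document-level co-occurrence counts, and the function names (doc_frequencies, pmi, low_coherence_words), the smoothing constant eps, and the threshold parameter are all hypothetical choices made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def doc_frequencies(docs):
    """Count, per word and per word pair, how many documents contain them."""
    word_df, pair_df = Counter(), Counter()
    for doc in docs:
        vocab = sorted(set(doc))          # sorted so each pair has one canonical key
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))
    return word_df, pair_df


def pmi(a, b, word_df, pair_df, n_docs, eps=1e-12):
    """PMI of two distinct words; eps (an assumption) avoids log(0).

    Both words are assumed to occur in at least one document.
    """
    key = (a, b) if a < b else (b, a)
    p_ab = pair_df[key] / n_docs
    p_a = word_df[a] / n_docs
    p_b = word_df[b] / n_docs
    return math.log((p_ab + eps) / (p_a * p_b))


def low_coherence_words(topic_words, word_df, pair_df, n_docs, threshold=0.0):
    """Return topic words whose mean PMI with the other topic words is below threshold."""
    flagged = []
    for w in topic_words:
        others = [o for o in topic_words if o != w]
        if not others:
            continue
        mean_pmi = sum(pmi(w, o, word_df, pair_df, n_docs) for o in others) / len(others)
        if mean_pmi < threshold:
            flagged.append(w)
    return flagged


# Toy corpus (invented for illustration): each document is a list of tokens.
docs = [["bug", "fix", "the"], ["bug", "fix"], ["the", "test"], ["the", "build"]]
word_df, pair_df = doc_frequencies(docs)
candidates = low_coherence_words(["bug", "fix", "the"], word_df, pair_df, len(docs))
```

    In a loop of the kind the abstract describes, the flagged words would be appended to the stop word list before the topic model is retrained. The threshold value and the stopping rule are left open here.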

  • Effect Analysis of Coding Convention Violations on Readability of Post-Delivered Code

    Taek LEE  Jung-Been LEE  Hoh Peter IN  

     
    PAPER-Software Engineering

    Publicized: 2015/04/10
    Vol: E98-D No: 7
    Page(s): 1286-1296

    Adherence to coding conventions during the code production stage of software development is essential. Benefits include enabling programmers to quickly understand the context of shared code, communicate with one another in a consistent manner, and easily maintain the source code at low cost. In reality, however, programmers tend to doubt or ignore the degree to which the quality of their code is affected by adherence to these guidelines. This paper addresses research questions such as “Do violations of coding conventions affect the readability of the produced code?”, “What kinds of coding violations reduce code readability?”, and “How much do variable factors such as developer experience, project size, team size, and project maturity influence coding violations?” To answer these research questions, we explored 210 open-source Java projects with 117 coding conventions from the Sun standard checklist. We believe our findings and the analysis approach used in the paper will encourage programmers and QA managers to develop their own customized and effective coding style guidelines.