Author Search Result

[Author] Kenichi MATSUMOTO (15 hits)

1-15 hits
  • Investigating and Projecting Population Structures in Open Source Software Projects: A Case Study of Projects in GitHub

    Saya ONOUE  Hideaki HATA  Akito MONDEN  Kenichi MATSUMOTO  

     
    PAPER-Software Engineering

    Publicized: 2016/02/05  Vol: E99-D No:5  Page(s): 1304-1315

    GitHub is a developers' social networking service that hosts a great number of open source software (OSS) projects. Although some of the hosted projects are growing and have many developers, most projects are organized by a few developers and face difficulties in terms of sustainability. OSS projects depend mainly on volunteer developers, and attracting and retaining these volunteers are major concerns of the project stakeholders. To investigate the population structures of OSS development communities in detail and conduct software analytics to obtain actionable information, we apply a demographic approach. Demography is the scientific study of population and seeks to identify the levels and trends in the size and components of a population. This paper presents a case study, investigating the characteristics of the population structures of OSS projects on GitHub, and shows population projections generated with the well-known cohort component method. We found that there are four types of population structures in OSS development communities in terms of experiences and contributions. In addition, we projected the future population accurately using a cohort component population projection method. This method predicts the population of the next period using a survival rate calculated from past populations. To the best of our knowledge, this is the first study to apply demography to the field of OSS research. Our approach to addressing OSS-related problems based on demography will bring new insights, since studying population is novel in OSS research. Understanding current and future structures of OSS projects can help practitioners to monitor a project, gain awareness of what is happening, manage risks, and evaluate past decisions.
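
The projection step described in this abstract can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the cohort layout (developers grouped by how many periods they have been active), the function names, the sample numbers, and the choice to pass the number of newcomers in as an input are all assumptions. For simplicity, the oldest cohort drops out of the projection.

```python
def survival_rates(previous, current):
    """Estimate, for each cohort, the share that survives into the next cohort.

    previous[k] and current[k] are head counts of cohort k in two
    consecutive periods (hypothetical layout).
    """
    return [
        current[k + 1] / previous[k] if previous[k] else 0.0
        for k in range(len(previous) - 1)
    ]


def project_next(previous, current, newcomers):
    """Project the next period by ageing each cohort with its survival rate."""
    rates = survival_rates(previous, current)
    projected = [newcomers]  # cohort 0 is supplied by the caller (assumption)
    projected += [current[k] * rates[k] for k in range(len(rates))]
    return projected


# Made-up example: three experience cohorts observed in two periods.
previous = [100, 40, 20]
current = [120, 50, 10]
projection = project_next(previous, current, newcomers=120)
```

Here half of the first cohort and a quarter of the second cohort survived between the two observed periods, and those rates are applied to the current head counts.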

  • Understanding Developer Commenting in Code Reviews

    Toshiki HIRAO  Raula GAIKOVINA KULA  Akinori IHARA  Kenichi MATSUMOTO  

     
    PAPER

    Publicized: 2019/09/11  Vol: E102-D No:12  Page(s): 2423-2432

    Modern code review is a well-known practice to assess the quality of software where developers discuss the quality in a web-based review tool. However, this lightweight approach may risk inefficient review participation, especially when comments become either excessive (i.e., too many) or underwhelming (i.e., too few). In this study, we investigate the phenomena of reviewer commenting. Through a large-scale empirical analysis of over 1.1 million reviews from five OSS systems, we conduct an exploratory study to investigate the frequency, size, and evolution of reviewer commenting. Moreover, we also conduct a modeling study to understand the most important features that potentially drive reviewer comments. Our results show that (i) the number of comments and the number of words in the comments tend to vary among reviews and across studied systems; (ii) reviewers change their behaviours in commenting over time; and (iii) human experience and patch property aspects impact the number of comments and the number of words in the comments.

  • How Does Time Conscious Rule of Gamification Affect Coding and Review?

    Kohei YOSHIGAMI  Taishi HAYASHI  Masateru TSUNODA  Hidetake UWANO  Shunichiro SASAKI  Kenichi MATSUMOTO  

     
    LETTER

    Publicized: 2019/09/18  Vol: E102-D No:12  Page(s): 2435-2440

    Recently, many studies have applied gamification to software engineering education and software development to enhance work results. Gamification is defined as “the use of game design elements in non-game contexts.” When applying gamification, we make various game rules, such as a time limit. However, it is not clear whether such a rule affects working time or not. For example, if we apply a time limit to impatient developers, the working time may become shorter, but the rule may negatively affect the work because of time pressure. In this study, we analyze with experiments with subjects whether the rules affect work results such as working time. Our experimental results suggest that for the coding tasks, working time was shortened when we applied a rule that made developers aware of working time by showing elapsed time.

  • An Empirical Study of Package Management Issues via Stack Overflow

    Syful ISLAM  Raula GAIKOVINA KULA  Christoph TREUDE  Bodin CHINTHANET  Takashi ISHIO  Kenichi MATSUMOTO  

     
    PAPER

    Publicized: 2022/11/18  Vol: E106-D No:2  Page(s): 138-147

    The package manager (PM) is crucial to most technology stacks, acting as a broker to ensure that a verified dependency package is correctly installed, configured, or removed from an application. Diversity in technology stacks has led to dozens of PMs with various features. While our recent study indicates that package management features of PMs are related to end-user experiences, it is unclear what those issues are and what information is required to resolve them. In this paper, we have investigated PM issues faced by end-users through an empirical study of content on Stack Overflow (SO). We carried out a qualitative analysis of 1,131 questions and their accepted answer posts for three popular PMs (i.e., Maven, npm, and NuGet) to identify issue types, underlying causes, and their resolutions. Our results confirm that end-users struggle with PM tool usage (approximately 64-72%). We observe that most issues are raised by end-users due to a lack of instructions and error messages from PM tools. In terms of issue resolution, we find that external link sharing is the most common practice to resolve PM issues. Additionally, we observe that links pointing to useful resources (i.e., official documentation websites, tutorials, etc.) are most frequently shared, indicating the potential for tool support and the ability to provide relevant information for PM end-users.

  • Analyzing Web Search Strategy of Software Developers to Modify Source Codes

    Keitaro NAKASAI  Masateru TSUNODA  Kenichi MATSUMOTO  

     
    LETTER

    Publicized: 2021/10/29  Vol: E105-D No:1  Page(s): 31-36

    Software developers often use a web search engine to improve work efficiency. However, web search strategies (e.g., frequently changing web search keywords) may be different for each developer. In this study, we attempted to define a better web search strategy. Although many previous studies analyzed web search behavior in programming, they did not provide guidelines for web search strategies. To suggest guidelines for web search strategies, we asked 10 subjects four questions about programming which they had to solve, and analyzed their behavior. In the analysis, we focused on the subjects' task time and the web search metrics defined by us. Based on our experiment, to enhance the effectiveness of the search, we suggest that (1) one should not go through the next search result pages, (2) the number of keywords in queries should be kept small, and (3) previously used keywords must be avoided when creating a new query.

  • LSA-X: Exploiting Productivity Factors in Linear Size Adaptation for Analogy-Based Software Effort Estimation

    Passakorn PHANNACHITTA  Akito MONDEN  Jacky KEUNG  Kenichi MATSUMOTO  

     
    PAPER-Software Engineering

    Publicized: 2015/10/15  Vol: E99-D No:1  Page(s): 151-162

    Analogy-based software effort estimation has gained a considerable amount of attention in current research and practice. Its excellent estimation accuracy relies on its solution adaptation stage, where an effort estimate is produced from similar past projects. This study proposes a solution adaptation technique named LSA-X that introduces an approach to exploit the potential of productivity factors, i.e., project variables with a high correlation with software productivity, in the solution adaptation stage. The LSA-X technique tailors the exploitation of the productivity factors with a procedure based on the Linear Size Adaptation (LSA) technique. The results, based on 19 datasets, show that in circumstances where a dataset exhibits a high correlation coefficient between productivity and a related factor (r≥0.30), the proposed LSA-X technique statistically outperformed (95% confidence) the other 8 commonly used techniques compared in this study. In other circumstances, our results suggest using any linear adaptation technique based on software size to compensate for the limitations of the LSA-X technique.
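
As a rough illustration of the size-based linear adaptation this abstract builds on, the sketch below scales the effort of one similar past project by the ratio of software sizes. This reflects the commonly described form of linear size adaptation and is an assumption here; it does not reproduce LSA-X or its handling of productivity factors, and the function name and numbers are hypothetical.

```python
def linear_size_adaptation(size_new, size_analogue, effort_analogue):
    """Scale the analogue's effort linearly by the size ratio (assumed form)."""
    if size_analogue <= 0:
        raise ValueError("analogue size must be positive")
    return effort_analogue * (size_new / size_analogue)


# Made-up example: the new project is twice the size of its closest analogue,
# which took 500 person-hours.
estimate = linear_size_adaptation(size_new=200, size_analogue=100, effort_analogue=500)
```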

  • Extraction of Library Update History Using Source Code Reuse Detection

    Kanyakorn JEWMAIDANG  Takashi ISHIO  Akinori IHARA  Kenichi MATSUMOTO  Pattara LEELAPRUTE  

     
    LETTER-Software Engineering

    Publicized: 2017/12/20  Vol: E101-D No:3  Page(s): 799-802

    This paper proposes a method to extract and visualize a library update history in a project. The method identifies reused library versions by comparing source code in a product with existing versions of the library so that developers can understand when their own copy of a library has been copied, modified, and updated.

  • A Novel Approach to Address External Validity Issues in Fault Prediction Using Bandit Algorithms

    Teruki HAYAKAWA  Masateru TSUNODA  Koji TODA  Keitaro NAKASAI  Amjed TAHIR  Kwabena Ebo BENNIN  Akito MONDEN  Kenichi MATSUMOTO  

     
    LETTER-Software Engineering

    Publicized: 2020/10/30  Vol: E104-D No:2  Page(s): 327-331

    Various software fault prediction models have been proposed in the past twenty years. Many studies have compared and evaluated existing prediction approaches in order to identify the most effective ones. However, in most cases, such models and techniques provide varying results, and their outcomes do not result in the best possible performance across different datasets. This is mainly due to the diverse nature of software development projects, and therefore, there is a risk that the selected models lead to inconsistent results across multiple datasets. In this work, we propose the use of bandit algorithms in cases where the accuracy of the models is inconsistent across multiple datasets. In the experiment discussed in this work, we used four conventional prediction models, tested on three different datasets, and then selected the best possible model dynamically by applying bandit algorithms. We then compared our results with those obtained using majority voting. As a result, Epsilon-greedy with ϵ=0.3 showed the best or second-best prediction performance compared with using only one prediction model and majority voting. Our results showed that bandit algorithms can provide promising outcomes when used in fault prediction.
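
The dynamic model selection mentioned in this abstract can be illustrated with a generic epsilon-greedy loop. This is a minimal sketch under assumptions, not the experimental setup of the paper: each prediction model is treated as an arm, the reward is assumed to be 1 when the chosen model predicts a module correctly and 0 otherwise, and the names `run_bandit` and `reward_table` are hypothetical.

```python
import random


def epsilon_greedy_select(mean_rewards, epsilon, rng):
    """Explore a random arm with probability epsilon, else exploit the best arm."""
    if rng.random() < epsilon:
        return rng.randrange(len(mean_rewards))
    return max(range(len(mean_rewards)), key=lambda i: mean_rewards[i])


def run_bandit(reward_table, epsilon=0.3, seed=0):
    """Pick one model per module and update its running mean reward.

    reward_table[t][i] is 1 if model i predicted module t correctly, else 0
    (assumed reward definition).
    """
    rng = random.Random(seed)
    n_models = len(reward_table[0])
    counts = [0] * n_models
    means = [0.0] * n_models
    chosen = []
    for rewards in reward_table:
        arm = epsilon_greedy_select(means, epsilon, rng)
        counts[arm] += 1
        # Incremental update of the mean reward observed for the chosen model.
        means[arm] += (rewards[arm] - means[arm]) / counts[arm]
        chosen.append(arm)
    return chosen, counts, means


# Made-up rewards for four models over ten modules; model 0 is always correct.
table = [[1, 0, 0, 1]] * 10
chosen, counts, means = run_bandit(table, epsilon=0.3)
```

With `epsilon=0.0` the loop never explores and keeps using whichever model currently has the highest mean reward.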

  • An Exploration of Cross-Patch Collaborations via Patch Linkage in OpenStack

    Dong WANG  Patanamon THONGTANUNAM  Raula GAIKOVINA KULA  Kenichi MATSUMOTO  

     
    PAPER

    Publicized: 2022/11/18  Vol: E106-D No:2  Page(s): 148-156

    Contemporary development projects benefit from code review as it improves the quality of a project. Large ecosystems of inter-dependent projects like OpenStack generate a large number of reviews, which poses new challenges for collaboration (improving patches, fixing defects). Review tools allow developers to link between patches, to indicate patch dependency, competing solutions, or provide broader context. We hypothesize that such patch linkage may also stimulate cross-collaboration. With a case study of OpenStack, we take a first step to explore collaborations that occur after a patch linkage was posted between two patches (i.e., cross-patch collaboration). Our empirical results show that although patch linkage that requests collaboration is relatively less prevalent, the probability of collaboration is relatively higher. Interestingly, the results also show that collaborative contributions via patch linkage are non-trivial, i.e., contributions can affect the review outcome (such as voting) or even improve the patch (i.e., revising). This work opens up future directions to understand barriers and opportunities related to this new kind of collaboration that assists with code review and development tasks in large ecosystems.

  • An Exploration of npm Package Co-Usage Examples from Stack Overflow: A Case Study

    Syful ISLAM  Dong WANG  Raula GAIKOVINA KULA  Takashi ISHIO  Kenichi MATSUMOTO  

     
    PAPER

    Publicized: 2021/10/11  Vol: E105-D No:1  Page(s): 11-18

    Third-party package usage has become a common practice in contemporary software development. Developers often face different challenges, including choosing the right libraries, installation errors, discrepancies, setting up the environment, and build failures during software development. The risks of maintaining a third-party package are well known, but it is unclear how information from Stack Overflow (SO) can be useful. This paper presents an empirical study to explore npm package co-usage examples from SO. From over 30,000 SO question posts, we extracted 2,100 posts with package usage information and matched them against the 217,934 npm library packages. We find that popular and highly used libraries are not discussed as often in SO. However, we can see that the accepted answers may prove useful, as we believe that the usage examples and executable commands could be reused for tool support.

  • SōjiTantei: Function-Call Reachability Detection of Vulnerable Code for npm Packages

    Bodin CHINTHANET  Raula GAIKOVINA KULA  Rodrigo ELIZA ZAPATA  Takashi ISHIO  Kenichi MATSUMOTO  Akinori IHARA  

     
    LETTER

    Publicized: 2021/09/27  Vol: E105-D No:1  Page(s): 19-20

    It has become common practice for software projects to adopt third-party dependencies. Developers are encouraged to update any outdated dependency to remain safe from potential threats of vulnerabilities. In this study, we present an approach to help developers determine whether or not vulnerable code is reachable in JavaScript projects. Our prototype, SōjiTantei, is evaluated in two ways: (i) the accuracy when compared to a manual approach and (ii) a larger-scale analysis of 780 clients from 78 security vulnerability cases. The first evaluation shows that SōjiTantei has a high accuracy of 83.3%, with an analysis time of less than a second per client. The second evaluation reveals that 68 out of the studied 78 vulnerabilities reported having at least one clean client. The study shows that automation is promising with the potential for further improvement.

  • How are IF-Conditional Statements Fixed Through Peer CodeReview?

    Yuki UEDA  Akinori IHARA  Takashi ISHIO  Toshiki HIRAO  Kenichi MATSUMOTO  

     
    PAPER-Software Engineering

    Publicized: 2018/08/08  Vol: E101-D No:11  Page(s): 2720-2729

    Peer code review is key to ensuring the absence of software defects. To reduce review costs, software developers adopt code convention checking tools that automatically identify maintainability issues in source code. However, these tools do not always address the maintainability issues of a particular project. The goal of this study is to understand how code review fixes conditional statement issues, which are the most frequent changes in software development. We conduct an empirical study to understand if-statement changes through code review. Using review requests in the Qt and OpenStack projects, we analyze changes of the if-conditional statements that are (1) requested to be reviewed and (2) revised through code review. We find the most frequently changed symbols are “( )”, “.”, and “!”. We also find project-specific fixing patterns for improving code readability by association rule mining. For example, the “!” operator is frequently replaced with a function call. These rules are useful for improving a coding convention checker tailored for the projects.

  • An Empirical Study of README contents for JavaScript Packages

    Shohei IKEDA  Akinori IHARA  Raula Gaikovina KULA  Kenichi MATSUMOTO  

     
    PAPER-Software Engineering

    Publicized: 2018/10/24  Vol: E102-D No:2  Page(s): 280-288

    Contemporary software projects often utilize a README.md to share crucial information such as installation and usage examples related to their software. Furthermore, these files serve as an important source of updated and useful documentation for developers and prospective users of the software. Nonetheless, both novice and seasoned developers are sometimes unsure of what is required for a good README file. To understand the contents of README, we investigate the contents of 43,900 JavaScript packages. Results show that these packages contain common content themes (i.e., ‘usage’, ‘install’ and ‘license’). Furthermore, we find that application-specific packages more frequently included content themes such as ‘options’, while library-based packages more frequently included other specific content themes (i.e., ‘install’ and ‘license’).

  • Analysis of Work Efficiency and Quality of Software Maintenance Using Cross-Company Dataset

    Masateru TSUNODA  Akito MONDEN  Kenichi MATSUMOTO  Sawako OHIWA  Tomoki OSHINO  

     
    PAPER

    Publicized: 2020/08/31  Vol: E104-D No:1  Page(s): 76-90

    Software maintenance is an important activity in the software lifecycle. Software maintenance does not only mean removing faults found after software release. Software needs extensions or modifications of its functions owing to changes in the business environment, and software maintenance also covers these. To help users and service suppliers benchmark work efficiency for software maintenance, and to clarify the relationships between software quality, work efficiency, and unit cost of staff, we used a dataset that includes 134 data points collected by the Economic Research Association in 2012, and analyzed the factors that affected the work efficiency of software maintenance. In the analysis, using a multiple regression model, we clarified the relationships between work efficiency and programming language and productivity factors. To analyze the influence on quality, relationships with the fault ratio were analyzed using correlation coefficients. The programming language and productivity factors affect work efficiency. Higher work efficiency and higher unit cost of staff do not affect the quality of software maintenance.

  • Influence of Outliers on Estimation Accuracy of Software Development Effort

    Kenichi ONO  Masateru TSUNODA  Akito MONDEN  Kenichi MATSUMOTO  

     
    PAPER

    Publicized: 2020/10/02  Vol: E104-D No:1  Page(s): 91-105

    When applying estimation methods, the issue of outliers is inevitable. The extent of their influence has not been clarified, though several studies have evaluated outlier elimination methods. It is unclear whether we should always be sensitive to outliers, whether outliers should always be removed before estimation, and what amount of precaution is required for collecting project data. Therefore, the goal of this study is to provide a guideline that suggests how sensitively we should handle outliers. In the analysis, we experimentally add outliers to three datasets to analyze their influence. We modified the percentage of outliers, their extent (e.g., we varied the actual effort from 100 to 200 person-hours when the extent was 100%), the variables including outliers (e.g., adding outliers to function points or effort), and the locations of outliers in a dataset. Next, the effort was estimated using these datasets. We used multiple linear regression analysis and analogy-based estimation to estimate the development effort. The experimental results indicate that the influence of outliers on the estimation accuracy is non-trivial when the extent or percentage of outliers is considerable (i.e., 100% and 20%, respectively). In contrast, their influence is negligible when the extent and percentage are small (i.e., 50% and 10%, respectively). Moreover, in some cases, the linear regression analysis was less affected by outliers than analogy-based estimation.
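
The outlier injection described in this abstract can be sketched as below, following the stated example in which an extent of 100% turns 100 person-hours into 200. The function name is hypothetical, and altering the first projects in the list is an assumption made for brevity; the study itself also varies the location of outliers and the affected variable.

```python
def add_outliers(values, percentage, extent):
    """Inflate a share of the values to simulate outliers.

    percentage: fraction of data points to alter (0.2 means 20%).
    extent: relative increase (1.0 means 100%, so 100 becomes 200).
    """
    n_outliers = round(len(values) * percentage)
    result = list(values)
    for i in range(n_outliers):  # assumption: the first n points are altered
        result[i] = values[i] * (1 + extent)
    return result


# Made-up effort values in person-hours.
efforts = [100, 100, 100, 100, 100]
with_outliers = add_outliers(efforts, percentage=0.2, extent=1.0)
```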