Shi QIU Daniel M. GERMAN Katsuro INOUE
Software copyright grants the copyright owner an exclusive right to determine whether and under what conditions others can modify, reuse, or redistribute the software. For Free and Open Source Software (FOSS), it is very important to identify the copyright owner, who can control those activities through license compliance. A copyright notice is a few sentences, mostly placed in the header of a source file as a comment or in a license document in a FOSS project, and it is an important clue for establishing the ownership of a FOSS project. Repositories of FOSS projects contain rich and varied information on development, including the source code contributors, who are also an important clue for establishing ownership. In this paper, as a first step toward understanding copyright ownership, we explore the state of software copyright in the Linux kernel, a typical example of FOSS, by analyzing and comparing two kinds of datasets: copyright notices in source files and source code contributors in the software repositories. The discrepancy between the two kinds of analysis results is defined as copyright inconsistency. The analysis results indicate that copyright inconsistencies are prevalent in the Linux kernel. We also found that code reuse, affiliation change, refactoring, support function, and others' contributions potentially affect the occurrence of copyright inconsistencies in the Linux kernel. This study exposes the difficulty of managing software copyright in FOSS, highlighting the usefulness of future work to address software copyright problems.
Shi QIU Daniel M. GERMAN Katsuro INOUE
For Free and Open Source Software (FOSS), identifying copyright notices is important. However, both the collaborative manner of FOSS project development and the large number of source files make this difficult. In this paper, we aim to automatically identify the copyright notices in source files using machine learning techniques. The evaluation experiment shows that our method outperforms FOSSology, the only existing method based on regular expressions.
Geunseok YANG Tao ZHANG Byungjeong LEE
Many software development teams tend to focus on maintenance activities. Recently, many studies on bug severity prediction have been proposed to help a bug reporter determine severity. However, they do not consider the reporter's expression of emotion in the bug report when predicting the bug severity level. In this paper, we propose a novel approach to severity prediction for reported bugs that uses emotion similarity. First, we compute an emotion-word probability vector using a smoothed unigram model (UM), and we use the new bug report to find bug reports with similar emotion by means of Kullback-Leibler divergence (KL-divergence). Then, we introduce a new algorithm, Emotion Similarity (ES)-Multinomial, which modifies the original Naïve Bayes Multinomial algorithm. We train the model on the emotion bug reports using ES-Multinomial. Finally, we predict the bug severity level of the new bug report. To compare performance in bug severity prediction, we select related studies, including Emotion Words-based Dictionary (EWD)-Multinomial, Naïve Bayes Multinomial, and another study, as baseline approaches on open source projects (e.g., Eclipse, GNU, JBoss, Mozilla, and WireShark). The results show that our approach outperforms the baselines and can reflect reporters' emotional expressions during bug reporting.
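The first step described above, building a smoothed unigram vector over emotion words and ranking past reports by KL-divergence, can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the abstract does not say which smoothing is used (additive smoothing is assumed), and the word list `EMOTION_WORDS`, the report texts, and the function names are hypothetical.

```python
import math
from collections import Counter

def unigram(tokens, vocab, alpha=1.0):
    """Additively smoothed unigram probabilities over a fixed vocabulary.

    Assumption: additive (Laplace) smoothing; the abstract only says "smoothed".
    """
    counts = Counter(t for t in tokens if t in vocab)
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)

# Hypothetical emotion-word list and bug reports, for illustration only.
EMOTION_WORDS = ["annoying", "crash", "terrible", "fine", "minor"]
new_report = "terrible crash annoying crash".split()
history = {
    "bug-1": "crash terrible crash data lost".split(),
    "bug-2": "minor typo fine".split(),
}

p_new = unigram(new_report, EMOTION_WORDS)
# Rank past reports from most to least emotionally similar (smallest divergence first).
ranked = sorted(
    history,
    key=lambda k: kl_divergence(p_new, unigram(history[k], EMOTION_WORDS)),
)
print(ranked)
```

Smoothing keeps every probability strictly positive, so the logarithm in the divergence is always defined even when a report contains none of the emotion words.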
Panita MEANANEATRA Songsakdi RONGVIRIYAPANISH Taweesup APIWATTANAPONG
An important step in improving software analyzability is applying refactorings during the maintenance phase to remove bad smells, especially the long method bad smell. The long method bad smell occurs most frequently and is a root cause of other bad smells. However, no research has proposed an approach that repeats refactoring identification, suggestion, and application until all long method bad smells have been removed completely without reducing software analyzability. This paper proposes an effective approach to identifying refactoring opportunities and suggesting an effective refactoring set for complete removal of the long method bad smell without reducing code analyzability. This approach, called the long method remover or LMR, uses refactoring enabling conditions based on program analysis and code metrics to identify four refactoring techniques, and uses a technique embedded in JDeodorant to identify extract method. For effective refactoring set suggestion, LMR uses two criteria: code analyzability level and the number of statements impacted by the refactorings. LMR also uses side effect analysis to ensure behavior preservation. To evaluate LMR, we apply it to the core package of a real-world Java application. Our evaluation criteria are 1) the preservation of code functionality, 2) the removal rate of long method characteristics, and 3) the improvement in analyzability. The results show that methods to which the suggested refactoring sets are applied are completely free of the long method bad smell, preserve behavior, and do not decrease in analyzability. We conclude that LMR meets its objectives in almost all classes. We also discuss the issues we found during evaluation as lessons learned.
Katsuhisa MARUYAMA Takayuki OMORI Shinpei HAYASHI
Change-aware development environments can automatically record fine-grained code changes on a program and allow programmers to replay the recorded changes in chronological order. However, since programmers do not always need to replay all the code changes to investigate how a particular entity of the program has been changed, they often eliminate code changes of no interest by manually skipping them during replay. This skipping action is an obstacle that makes many programmers hesitate to use existing replaying tools. This paper proposes a slicing mechanism that automatically removes the code changes that would otherwise be skipped manually from the whole history of past code changes and extracts only those necessary to build a particular class member of a Java program. In this mechanism, fine-grained code changes are represented by edit operations recorded on the source code of a program, and dependencies among edit operations are formalized. The paper also presents a running tool that slices the operation history and replays the resulting slices. With this tool, programmers can avoid replaying edit operations that are nonessential to the construction of the class members they want to understand. Experimental results show that the tool improves on conventional replaying tools in reducing the number of edit operations that need to be examined, and on history filtering tools in the accuracy of the edit operations to be replayed.
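The idea of slicing an operation history along dependencies can be sketched as a backward traversal over a dependency graph. This is only an illustration under stated assumptions, not the paper's formalization: the `EditOp` record, its fields, and the example history are hypothetical, and dependencies are assumed to be given explicitly rather than derived from the recorded edits.

```python
from dataclasses import dataclass, field

@dataclass
class EditOp:
    op_id: int                 # chronological sequence number (hypothetical)
    member: str                # class member the operation edits (hypothetical)
    depends_on: list = field(default_factory=list)  # ids of earlier operations

def slice_history(history, target_member):
    """Return, in chronological order, the operations needed for target_member."""
    by_id = {op.op_id: op for op in history}
    # Start from every operation that directly edits the target member.
    worklist = [op.op_id for op in history if op.member == target_member]
    needed = set()
    while worklist:
        op_id = worklist.pop()
        if op_id in needed:
            continue
        needed.add(op_id)
        # Follow dependencies backward to operations on other members.
        worklist.extend(by_id[op_id].depends_on)
    return [op for op in history if op.op_id in needed]

# Hypothetical history: operation 5 on Foo.bar also depends on operation 2.
history = [
    EditOp(1, "Foo.bar"),
    EditOp(2, "Foo.baz"),
    EditOp(3, "Foo.bar", [1]),
    EditOp(4, "Foo.qux", [2]),
    EditOp(5, "Foo.bar", [3, 2]),
]
sliced_ids = [op.op_id for op in slice_history(history, "Foo.bar")]
print(sliced_ids)
```

Replaying only the sliced operations skips operation 4, which neither edits `Foo.bar` nor is required by any operation that does.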
Yoji YAMATO Shinichiro KATSURAGI Shinji NAGAO Norihiro MIURA
We evaluated the software maintenance of an open source cloud platform system that we developed using an agile software development method. We previously reported on a rapid service launch using the agile software development method despite the large scale of the development. For this study, we analyzed inquiries and the defect removal efficiency of our recently developed software over one year of operation. We found that the defect removal efficiency of our recently developed software was 98%. This indicates that we could achieve sufficient quality in spite of large-scale agile development. In terms of the maintenance process, we could answer all inquiries within three business days and could perform version upgrades quickly. Thus, we conclude that software maintenance of agile software development is not ineffective.
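For readers unfamiliar with the metric, defect removal efficiency is commonly defined as the share of all known defects that were removed before release. The sketch below assumes that common definition, which the abstract does not spell out, and uses hypothetical defect counts chosen only to reproduce a 98% figure; they are not the authors' data.

```python
def defect_removal_efficiency(found_before_release, found_after_release):
    """Fraction of all known defects that were removed before release.

    Assumption: the commonly used definition of the metric.
    """
    total = found_before_release + found_after_release
    if total == 0:
        raise ValueError("no defects recorded")
    return found_before_release / total

# Hypothetical counts, for illustration only.
dre = defect_removal_efficiency(found_before_release=98, found_after_release=2)
print(f"{dre:.0%}")
```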
Software products are increasingly complex, so it is becoming more difficult to find and correct bugs in large programs. Software developers rely on bug reports to fix bugs; thus, bug-tracking tools have been introduced to allow developers to upload, manage, and comment on bug reports to guide corrective software maintenance. However, the very high frequency of duplicate bug reports means that the triagers who help software developers eliminate bugs must allocate large amounts of time and effort to identifying and analyzing these bug reports. In addition, classifying bug reports can help triagers arrange bugs into categories for the fixers who have more experience in resolving historical bugs in the same category. Unfortunately, because a large number of bug reports are submitted every day, classifying them manually increases the triagers' workload. To resolve these problems, in this study we develop a novel technique for automatic duplicate detection and classification of bug reports, which reduces the time and effort triagers spend on bug fixing. Our technique uses a support vector machine to check whether a new bug report is a duplicate. The concept profile is also used to classify the bug reports into related categories in a taxonomic tree. Finally, we conduct experiments that demonstrate the feasibility of our proposed approach using bug reports extracted from the large-scale open source project Mozilla.
Kazuma AIZAWA Haruhiko KAIYA Kenji KAIJIRI
We introduce a method, the so-called FC method, for maintaining software resources, such as source code and design documents, in consumer electronics products. Because a consumer electronics product is frequently and rapidly revised, the software components in such a product are revised in the same way. However, it is not easy for software engineers to follow the revisions of the product because requirements changes for the product, including changes to its functionality and its hardware components, are largely independent of the structure of the current software resources. The FC method lets software engineers restructure software resources, especially design documents, stepwise so that they can follow the requirements changes for the product easily. We report an application of this method in our company to validate it. From this application, we confirm that the quality of the software improved about twofold and that the efficiency of the development process improved more than fourfold.
Takahiro NAKANISHI Motoshi SAEKI
In the software maintenance phase, quality assurance engineers frequently change only the source code, so consistency between the source code and its specification documents cannot be maintained. In this paper, we propose a supporting technique for changing specification documents automatically so that the specifications remain consistent with the source code. In our technique, we represent a program with multiple graphs and treat changes to the program as modifications of the graphs. A modification of the graphs is formalized as a sequence of operations on the graphs. We design rules that relate the operations on program graphs to operations on the graphs that represent specification documents. By applying these rules when the program is modified, we can detect which modifications should be made to which parts of the specification document to maintain consistency between the specification and the program.