IEICE global.ieice.org Site

Author Search Result

[Author] Yoshiki HIGO(2hit)


Proposing and Evaluating Clone Detection Approaches with Preprocessing Input Source Files
Eunjong CHOI Norihiro YOSHIDA Yoshiki HIGO Katsuro INOUE

PAPER-Software Engineering

Publicized: 2014/10/28
Vol: E98-D No: 2
Page(s): 325-333
Many approaches to code clone detection have been proposed, and they differ in the degree of normalization they apply (e.g., removal of white space, tokenization, and regularization of identifiers). Different degrees of normalization lead to different granularities of source code being detected as code clones. To investigate how normalization affects code clone detection, this study proposes six approaches that detect code clones after preprocessing the input source files with different degrees of normalization. More precisely, in the preprocessing step each normalization is applied to the input source files, and the files are then partitioned into equivalence classes. Code clones are then detected from the set of files that represent each equivalence class, using the token-based code clone detection tool CCFinder. The proposed approaches fall into two categories: those without normalization and those with normalization. The former detect only identical files, without any normalization. The latter detect files that are identical after different degrees of normalization, such as removal of all lines containing macros. In the case study, we observed that the proposed approaches detect code clones faster than the approach that uses only CCFinder. We also found that, in many cases, the approach without normalization is the fastest of the proposed approaches.
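The preprocessing described in this abstract (normalize each file, partition the files into equivalence classes, keep one representative per class) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the sample files, and the two normalizations shown (whitespace removal, and dropping lines that begin with `#` as a stand-in for "lines containing macros") are assumptions made for the example. The clone detection step itself (CCFinder) is not modeled.

```python
import hashlib
import re
from collections import defaultdict
from typing import Callable, Dict, List


def identity(text: str) -> str:
    """No normalization: only byte-identical files share a class."""
    return text


def strip_whitespace(text: str) -> str:
    """Assumed normalization: remove every whitespace character."""
    return re.sub(r"\s+", "", text)


def drop_macro_lines(text: str) -> str:
    """Assumed normalization: drop lines starting with '#' (C-style macros)."""
    kept = [ln for ln in text.splitlines() if not ln.lstrip().startswith("#")]
    return "\n".join(kept)


def partition(files: Dict[str, str],
              normalize: Callable[[str], str]) -> List[List[str]]:
    """Group file names whose normalized contents are identical."""
    classes = defaultdict(list)
    for name in sorted(files):
        digest = hashlib.sha256(normalize(files[name]).encode("utf-8")).hexdigest()
        classes[digest].append(name)
    return sorted(classes.values())


def representatives(classes: List[List[str]]) -> List[str]:
    """Pick the first member of each class; only these go to the detector."""
    return [members[0] for members in classes]


# Hypothetical input files used only for illustration.
FILES = {
    "a.c": "#include <x.h>\nint f() { return 1; }\n",
    "b.c": "#include <x.h>\nint f() { return 1; }\n",      # identical to a.c
    "c.c": "#include <x.h>\nint f() {  return 1;  }\n",    # whitespace differs
    "d.c": "#include <y.h>\nint f() { return 1; }\n",      # macro line differs
}

for normalize in (identity, strip_whitespace, drop_macro_lines):
    print(normalize.__name__, representatives(partition(FILES, normalize)))
```

Each normalization merges a different subset of the files, so the set of representatives handed to the clone detector changes with the degree of normalization applied.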
Dataset of Functionally Equivalent Java Methods and Its Application to Evaluating Clone Detection Tools (Open Access)
Yoshiki HIGO

PAPER-Software System

Publicized: 2024/02/21
Vol: E107-D No: 6
Page(s): 751-760
Modern high-level programming languages offer a wide variety of grammatical constructs, so the same functionality can be implemented in different ways. The authors believe that a large amount of code implementing the same functionality in different ways exists even in open source software, where the source code is publicly available, and that collecting such code yields a dataset useful for various studies in software engineering. In this study, we construct a dataset of pairs of Java methods that have the same functionality but different structures, drawn from approximately 314 million lines of source code. To construct this dataset, the authors used an automated test generation tool, EvoSuite. Test cases generated by automated test generation techniques have the property that they always succeed on the code from which they were generated. Using this property, the test cases generated from each of two methods were executed against the other method to automatically determine whether the two methods behave the same, at least to some extent. Pairs of methods for which all test cases succeeded in this cross-running were then manually inspected to confirm that they are functionally equivalent. This paper also reports the results of an accuracy evaluation of code clone detection tools using the constructed dataset. The purpose of this evaluation is to assess how accurately code clone detection tools can find functionally equivalent methods, not to assess their accuracy in detecting ordinary clones. The constructed dataset is available on GitHub (https://github.com/YoshikiHigo/FEMPDataset).
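The cross-running idea in this abstract can be illustrated with a small sketch. It does not use EvoSuite or Java: here a "generated test" is simply a recorded (input, output) pair of the method it was generated from, which by construction always passes on that method. All names and the sample methods are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple

TestCase = Tuple[int, int]  # (input, expected output)


def generate_tests(method: Callable[[int], int],
                   inputs: Iterable[int]) -> List[TestCase]:
    """Stand-in for a test generator: record the method's current behavior."""
    return [(x, method(x)) for x in inputs]


def passes_all(method: Callable[[int], int], tests: List[TestCase]) -> bool:
    """Run recorded tests against a (possibly different) method."""
    return all(method(x) == expected for x, expected in tests)


def cross_run(m1: Callable[[int], int], inputs1: Iterable[int],
              m2: Callable[[int], int], inputs2: Iterable[int]) -> bool:
    """True if m2 passes the tests generated from m1 and vice versa."""
    tests1 = generate_tests(m1, inputs1)
    tests2 = generate_tests(m2, inputs2)
    return passes_all(m2, tests1) and passes_all(m1, tests2)


def sum_loop(n: int) -> int:
    """Sum of 1..n computed with a loop (0 for n < 1)."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_formula(n: int) -> int:
    """Sum of 1..n computed with the closed-form formula."""
    return n * (n + 1) // 2


# Candidate pair: every cross-run test succeeds on these inputs.
print(cross_run(sum_loop, [0, 1, 5], sum_formula, [2, 10]))
# A test generated with a negative input exposes a behavioral difference.
print(cross_run(sum_loop, [0, 1, 5], sum_formula, [-3]))
```

The two calls show why passing the cross-run only establishes equivalence "to some extent": the result depends on which inputs the generated tests happen to exercise, which is consistent with the abstract's subsequent manual inspection of the surviving pairs.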
