A Novel Technique for Duplicate Detection and Classification of Bug Reports

Tao ZHANG; Byungjeong LEE

doi:10.1587/transinf.E97.D.1756

Summary:

Software products are increasingly complex, making it more difficult to find and correct bugs in large programs. Software developers rely on bug reports to fix bugs; thus, bug-tracking tools have been introduced to allow developers to upload, manage, and comment on bug reports to guide corrective software maintenance. However, the very high frequency of duplicate bug reports means that triagers, who help software developers eliminate bugs, must spend large amounts of time and effort identifying and analyzing these reports. In addition, classifying bug reports helps triagers arrange bugs into categories for fixers who have more experience resolving historical bugs in the same category. Unfortunately, because a large number of bug reports are submitted every day, manual classification increases the triagers' workload. To resolve these problems, we develop a novel technique for automatic duplicate detection and classification of bug reports, which reduces the time and effort triagers spend on bug fixing. The technique uses a support vector machine to check whether a new bug report is a duplicate, and a concept profile to classify bug reports into related categories in a taxonomic tree. Finally, we conduct experiments on bug reports extracted from the large-scale open source project Mozilla that demonstrate the feasibility of the proposed approach.

Publication: IEICE TRANSACTIONS on Information Vol.E97-D No.7 pp.1756-1768

Publication Date: 2014/07/01

Online ISSN: 1745-1361

DOI: 10.1587/transinf.E97.D.1756

Type of Manuscript: PAPER

Category: Software Engineering

Authors

Tao ZHANG
University of Seoul
Byungjeong LEE
University of Seoul

Keywords

bug report classification, concept profile, duplicate detection, support vector machine, software maintenance

Cite this

Tao ZHANG, Byungjeong LEE, "A Novel Technique for Duplicate Detection and Classification of Bug Reports" in IEICE TRANSACTIONS on Information, vol. E97-D, no. 7, pp. 1756-1768, July 2014, doi: 10.1587/transinf.E97.D.1756.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E97.D.1756/_p

@ARTICLE{e97-d_7_1756,
author={Tao ZHANG and Byungjeong LEE},
journal={IEICE TRANSACTIONS on Information},
title={A Novel Technique for Duplicate Detection and Classification of Bug Reports},
year={2014},
volume={E97-D},
number={7},
pages={1756--1768},
abstract={Software products are increasingly complex, so it is becoming more difficult to find and correct bugs in large programs. Software developers rely on bug reports to fix bugs; thus, bug-tracking tools have been introduced to allow developers to upload, manage, and comment on bug reports to guide corrective software maintenance. However, the very high frequency of duplicate bug reports means that the triagers who help software developers in eliminating bugs must allocate large amounts of time and effort to the identification and analysis of these bug reports. In addition, classifying bug reports can help triagers arrange bugs in categories for the fixers who have more experience for resolving historical bugs in the same category. Unfortunately, due to a large number of submitted bug reports every day, the manual classification for these bug reports increases the triagers' workload. To resolve these problems, in this study, we develop a novel technique for automatic duplicate detection and classification of bug reports, which reduces the time and effort consumed by triagers for bug fixing. Our novel technique uses a support vector machine to check whether a new bug report is a duplicate. The concept profile is also used to classify the bug reports into related categories in a taxonomic tree. Finally, we conduct experiments that demonstrate the feasibility of our proposed approach using bug reports extracted from the large-scale open source project Mozilla.},
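keywords={bug report classification, concept profile, duplicate detection, support vector machine, software maintenance},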
doi={10.1587/transinf.E97.D.1756},
ISSN={1745-1361},
month={July}
}

TY - JOUR
TI - A Novel Technique for Duplicate Detection and Classification of Bug Reports
T2 - IEICE TRANSACTIONS on Information
SP - 1756
EP - 1768
AU - Tao ZHANG
AU - Byungjeong LEE
PY - 2014
DO - 10.1587/transinf.E97.D.1756
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E97-D
IS - 7
JA - IEICE TRANSACTIONS on Information
Y1 - 2014/07/01
AB - Software products are increasingly complex, so it is becoming more difficult to find and correct bugs in large programs. Software developers rely on bug reports to fix bugs; thus, bug-tracking tools have been introduced to allow developers to upload, manage, and comment on bug reports to guide corrective software maintenance. However, the very high frequency of duplicate bug reports means that the triagers who help software developers in eliminating bugs must allocate large amounts of time and effort to the identification and analysis of these bug reports. In addition, classifying bug reports can help triagers arrange bugs in categories for the fixers who have more experience for resolving historical bugs in the same category. Unfortunately, due to a large number of submitted bug reports every day, the manual classification for these bug reports increases the triagers' workload. To resolve these problems, in this study, we develop a novel technique for automatic duplicate detection and classification of bug reports, which reduces the time and effort consumed by triagers for bug fixing. Our novel technique uses a support vector machine to check whether a new bug report is a duplicate. The concept profile is also used to classify the bug reports into related categories in a taxonomic tree. Finally, we conduct experiments that demonstrate the feasibility of our proposed approach using bug reports extracted from the large-scale open source project Mozilla.
ER -