Abstract

SZZ is a widely used algorithm in the software engineering community to identify changes that are likely to introduce bugs (i.e., bug-introducing changes). Despite its wide adoption, SZZ still has room for improvements. For example, current SZZ implementations may still flag refactoring changes as bug-introducing. Refactorings should be disregarded as bug-introducing because they do not change the system behaviour. In this paper, we empirically investigate how refactorings impact both the input (bug-fix changes) and the output (bug-introducing changes) of the SZZ algorithm. We analyse 31,518 issues of ten Apache projects with 20,298 bug-introducing changes. We use an existing tool that automatically detects refactorings in code changes. We observe that 6.5% of lines that are flagged as bug-introducing changes by SZZ are in fact refactoring changes. Regarding bug-fix changes, we observe that 19.9% of lines that are removed during a fix are related to refactorings and, therefore, their respective inducing changes are false positives. We then incorporate the refactoring-detection tool in our Refactoring Aware SZZ Implementation (RA-SZZ). Our results reveal that RA-SZZ reduces 20.8% of the lines that are flagged as bug-introducing changes compared to the state-of-the-art SZZ implementations. Finally, we perform a manual analysis to identify change patterns that are not captured by the refactoring identification tool used in our study. Our results reveal that 47.95% of the analyzed bug-introducing changes contain additional change patterns that RA-SZZ should not flag as bug-introducing.

Bibtex

@inproceedings{neto2018impact,
  title={The impact of refactoring changes on the szz algorithm: An empirical study},
  author={Neto, Edmilson Campos and da Costa, Daniel Alencar and Kulesza, Uir{\'a}},
  booktitle={2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER)},
  pages={380--390},
  year={2018},
  organization={IEEE}
}