Abstract

Behavior Driven Development (BDD) is an agile approach that uses .feature files to describe the functionalities of a software system using natural language constructs (English- like phrases). Because of the English-like structure of .feature files, BDD specifications become an evolving documentation that helps all (even non-technical) stakeholders to understand and contribute to a software project. After specifying a .feature file, developers can use a BDD tool (e.g., Cucumber) to automatically generate test cases and implement the code of the specified functionality. However, maintaining traceability between .feature files and source code requires human efforts. Therefore, .feature files can be out-of-date, reducing the advantages of BDD. Furthermore, existing research do not attempt to improve the traceability between .feature files and source code files. In this paper, we study the co-changes between .feature files and source code files to improve the traceability between .feature files and source code files. Due to the English-like syntax of .feature files, we use natural language processing to identify links between .feature files and source code files, with an accuracy of 79%. We study the characteristics of BDD co-changes and build random forest models to predict when a .feature files should be modified before committing a code change. The random forest model obtains an AUC of 0.77. The model can assist developers in identifying when a .feature files should be modified in code commits. Once the traceability is up-to-date, BDD developers can write test code more efficiently and keep the software documentation up-to-date.

Bibtex

@inproceedings{bernardo2018studying,
  title={Predicting Co-Changes between Functionality Specifications and Source Code in Behavior Driven Development},
  author={Aidan Yang, Daniel Alencar da Costa and Ying Zou},
  booktitle={16th International Conference on Mining Software Repositories (MSR)},
  pages={toAppear},
  year={2019},
  organization={IEEE}
}