Summary: | Software developers use commits to track source code changes made to a project and to allow multiple developers to make changes simultaneously. To ensure that commits can be traced to the issues that describe the work to be performed, developers typically add the issue identifier to the commit message, which links the commit to the issue. However, developers are not infallible, and not all desirable links are captured manually. Several techniques have been created to help find and improve manually specified links. Although many software engineering tools, such as defect predictors, depend on the links between commits and issues, there is currently no way to assess the quality of existing links. To provide a means of assessing link quality, I propose two quality attributes: completeness and consistency. Completeness measures whether all appropriate commits link to an issue, and consistency measures whether commits are linked to the most specific issue. I applied these quality attributes to assess a number of existing link techniques and found that the links these techniques create lack both completeness and consistency. To enable researchers to better assess their techniques, I built a dataset that improves the link data for two open source projects. In addition, I analyze information in issue repositories, in the form of relationships between issues, that might help improve existing link augmentation techniques. === Science, Faculty of === Computer Science, Department of === Graduate
|