Supporting project comprehension with revision control system repository analysis

Context: Project comprehension is an activity relevant to all aspects of software engineering, from requirements specification to maintenance. The historical, transactional data stored in revision control systems can be mined and analysed to produce a great deal of information about a project. Aims:...

Full description

Bibliographic Details
Main Author: Burn, Andrew James
Published: Durham University 2011
Subjects:
004
Online Access:http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.531245
id ndltd-bl.uk-oai-ethos.bl.uk-531245
record_format oai_dc
spelling ndltd-bl.uk-oai-ethos.bl.uk-5312452015-03-20T04:50:50ZSupporting project comprehension with revision control system repository analysisBurn, Andrew James2011Context: Project comprehension is an activity relevant to all aspects of software engineering, from requirements specification to maintenance. The historical, transactional data stored in revision control systems can be mined and analysed to produce a great deal of information about a project. Aims: This research aims to explore how the data-mining, analysis and presentation of revision control systems can be used to augment aspects of project comprehension, including change prediction, maintenance, visualization, management, profiling, sampling and assessment. Method: A series of case studies investigate how transactional data can be used to support project comprehension. A thematic analysis of revision logs is used to explore the development process and developer behaviour. A benchmarking study of a history-based model of change prediction is conducted to assess how successfully such a technique can be used to augment syntax-based models. A visualization tool is developed for managers of student projects with the aim of evaluating what visualizations best support their roles. Finally, a quasi-experiment is conducted to determine how well an algorithmic model can automatically select a representative sample of code entities from a project, in comparison with expert strategies. Results: The thematic analysis case study classified maintenance activities in 22 undergraduate projects and four real-world projects. The change prediction study calculated information retrieval metrics for 34 undergraduate projects and three real-world projects, as well as an in-depth exploration of the model's performance and applications in two selected projects. File samples for seven projects were generated by six experts and three heuristic models and compared to assess agreement rates, both within the experts and between the experts and the models. Conclusions: When the results from each study are evaluated together, the evidence strongly shows that the information stored in revision control systems can indeed be used to support a range of project comprehension activities in a manner which complements existing, syntax-based techniques. The case studies also help to develop the empirical foundation of repository analysis in the areas of visualization, maintenance, sampling, profiling and management; the research also shows that students can be viable substitutes for industrial practitioners in certain areas of software engineering research, which weakens one of the primary obstacles to empirical studies in these areas.004Durham Universityhttp://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.531245http://etheses.dur.ac.uk/716/Electronic Thesis or Dissertation
collection NDLTD
sources NDLTD
topic 004
spellingShingle 004
Burn, Andrew James
Supporting project comprehension with revision control system repository analysis
description Context: Project comprehension is an activity relevant to all aspects of software engineering, from requirements specification to maintenance. The historical, transactional data stored in revision control systems can be mined and analysed to produce a great deal of information about a project. Aims: This research aims to explore how the data-mining, analysis and presentation of revision control systems can be used to augment aspects of project comprehension, including change prediction, maintenance, visualization, management, profiling, sampling and assessment. Method: A series of case studies investigate how transactional data can be used to support project comprehension. A thematic analysis of revision logs is used to explore the development process and developer behaviour. A benchmarking study of a history-based model of change prediction is conducted to assess how successfully such a technique can be used to augment syntax-based models. A visualization tool is developed for managers of student projects with the aim of evaluating what visualizations best support their roles. Finally, a quasi-experiment is conducted to determine how well an algorithmic model can automatically select a representative sample of code entities from a project, in comparison with expert strategies. Results: The thematic analysis case study classified maintenance activities in 22 undergraduate projects and four real-world projects. The change prediction study calculated information retrieval metrics for 34 undergraduate projects and three real-world projects, as well as an in-depth exploration of the model's performance and applications in two selected projects. File samples for seven projects were generated by six experts and three heuristic models and compared to assess agreement rates, both within the experts and between the experts and the models. Conclusions: When the results from each study are evaluated together, the evidence strongly shows that the information stored in revision control systems can indeed be used to support a range of project comprehension activities in a manner which complements existing, syntax-based techniques. The case studies also help to develop the empirical foundation of repository analysis in the areas of visualization, maintenance, sampling, profiling and management; the research also shows that students can be viable substitutes for industrial practitioners in certain areas of software engineering research, which weakens one of the primary obstacles to empirical studies in these areas.
author Burn, Andrew James
author_facet Burn, Andrew James
author_sort Burn, Andrew James
title Supporting project comprehension with revision control system repository analysis
title_short Supporting project comprehension with revision control system repository analysis
title_full Supporting project comprehension with revision control system repository analysis
title_fullStr Supporting project comprehension with revision control system repository analysis
title_full_unstemmed Supporting project comprehension with revision control system repository analysis
title_sort supporting project comprehension with revision control system repository analysis
publisher Durham University
publishDate 2011
url http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.531245
work_keys_str_mv AT burnandrewjames supportingprojectcomprehensionwithrevisioncontrolsystemrepositoryanalysis
_version_ 1716786750628560896