Analysis of cross-system porting and porting errors in software projects

Software forking---creating a variant product by copying and modifying an existing project---is often considered an ad hoc, low-cost alternative to principled product line development. To maintain forked projects, developers need to manually port existing features or bug fixes from one project to an...


Bibliographic Details
Main Author: Ray, Baishakhi
Format: Others
Language: en_US
Published: 2013
Subjects:
Bug
Online Access: http://hdl.handle.net/2152/22103
id ndltd-UTEXAS-oai-repositories.lib.utexas.edu-2152-22103
record_format oai_dc
spelling ndltd-UTEXAS-oai-repositories.lib.utexas.edu-2152-22103; 2015-09-20T17:17:31Z; text; 2013-11-11T18:45:15Z; 2013-08; 2013-11-11; August 2013; 2013-11-11T18:45:16Z; application/pdf; http://hdl.handle.net/2152/22103; en_US
collection NDLTD
language en_US
format Others
sources NDLTD
topic Software evolution
Forking
Porting
Repetitive changes
Code clones
Static analysis
Subgraph isomorphism
Bug
Error detection
Copy-paste error
description Software forking---creating a variant product by copying and modifying an existing project---is often considered an ad hoc, low-cost alternative to principled product line development. To maintain forked projects, developers need to manually port existing features or bug fixes from one project to another. Such manual porting is not only tedious but also error-prone. When the contexts of the ported code vary, developers often have to adapt the ported code to fit its surroundings. Faulty adaptations or inconsistent updates of the ported code could introduce subtle inconsistencies in the codebase. To build a deeper understanding of cross-system porting and porting-related errors, this dissertation investigates: (1) How can we identify ported code from software version histories? (2) What is the overhead of cross-system porting required to maintain forked projects? (3) What are the extent and characteristics of porting errors that occur in practice? and (4) How can we detect and characterize potential porting errors? As a first step towards assessing the overhead of cross-system porting, we implement REPERTOIRE, a tool to analyze repeated work of cross-system porting across peer projects. REPERTOIRE detects ported edits between program patches with 94% precision and 84% recall. Using REPERTOIRE, we study the temporal, spatial, and developer dimensions of cross-system porting using 18 years of parallel evolution history of the BSD product family. Our study finds that cross-system porting happens periodically and that the porting rate does not necessarily decrease over time. The upkeep work of porting changes from peer projects is significant, and porting practice currently seems to depend heavily on developers doing their porting work on time.
Analyzing version histories of Linux and FreeBSD, we derive five categories of porting errors, including incorrect control- and data-flow, code redundancy, and inconsistent identifier and token renamings. Leveraging this categorization, we design a static control- and data-dependence analysis technique, SPA, to detect and characterize porting inconsistencies. SPA detects porting inconsistencies with 65% to 73% precision and 90% recall, and identifies inconsistency types with 58% to 63% precision and 92% recall on average. In a comparison with two existing error detection tools, SPA outperforms them with 14% to 17% better precision.
author Ray, Baishakhi
title Analysis of cross-system porting and porting errors in software projects
publishDate 2013
url http://hdl.handle.net/2152/22103