Analysis of cross-system porting and porting errors in software projects

Software forking---creating a variant product by copying and modifying an existing project---is often considered an ad hoc, low cost alternative to principled product line development. To maintain forked projects, developers need to manually port existing features or bug-fixes from one project to an...

Bibliographic Details
Main Author:	Ray, Baishakhi
Format:	Others
Language:	en_US
Published:	2013
Subjects:	Software evolution; Forking; Porting; Repetitive changes; Code clones; Static analysis; Subgraph isomorphism; Bug; Error detection; Copy-paste error
Online Access:	http://hdl.handle.net/2152/22103

id	ndltd-UTEXAS-oai-repositories.lib.utexas.edu-2152-22103
record_format	oai_dc
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
topic	Software evolution; Forking; Porting; Repetitive changes; Code clones; Static analysis; Subgraph isomorphism; Bug; Error detection; Copy-paste error
description	Software forking, creating a variant product by copying and modifying an existing project, is often considered an ad hoc, low-cost alternative to principled product line development. To maintain forked projects, developers need to manually port existing features or bug-fixes from one project to another. Such manual porting is not only tedious but also error-prone. When the contexts of the ported code vary, developers often have to adapt the ported code to fit its surroundings. Faulty adaptations or inconsistent updates of the ported code could potentially introduce subtle inconsistencies in the codebase. To build a deeper understanding of cross-system porting and porting-related errors, this dissertation investigates: (1) How can we identify ported code from software version histories? (2) What is the overhead of cross-system porting required to maintain forked projects? (3) What are the extent and characteristics of porting errors that occur in practice? and (4) How can we detect and characterize potential porting errors? As a first step towards assessing the overhead of cross-system porting, we implement REPERTOIRE, a tool to analyze repeated work of cross-system porting across peer projects. REPERTOIRE can detect ported edits between program patches with high accuracy (94% precision and 84% recall). Using REPERTOIRE, we study the temporal, spatial, and developer dimensions of cross-system porting using 18 years of parallel evolution history of the BSD product family. Our study finds that cross-system porting happens periodically and the porting rate does not necessarily decrease over time. The upkeep work of porting changes from peer projects is significant, and porting practice currently seems to depend heavily on developers doing their porting job on time.
Analyzing version histories of Linux and FreeBSD, we derive five categories of porting errors, including incorrect control- and data-flow, code redundancy, and inconsistent identifier and token renamings. Leveraging this categorization, we design a static control- and data-dependence analysis technique, SPA, to detect and characterize porting inconsistencies. SPA detects porting inconsistencies with 65% to 73% precision and 90% recall, and identifies inconsistency types with 58% to 63% precision and 92% recall on average. In a comparison with two existing error detection tools, SPA outperforms them with 14% to 17% better precision.
author	Ray, Baishakhi
title	Analysis of cross-system porting and porting errors in software projects
publishDate	2013
url	http://hdl.handle.net/2152/22103
_version_	1716823333153013760

Similar Items