A Novel Approach for Software Defect prediction Based on the Power Law Function

Power law describes a common behavior in which a few factors play decisive roles in one thing. Most software defects occur in very few instances. In this study, we proposed a novel approach that adopts power law function characteristics for software defect prediction. The first step in this approach...

Bibliographic Details
Main Authors: Junhua Ren, Feng Liu
Format: Article
Language: English
Published: MDPI AG 2020-03-01
Series: Applied Sciences
Subjects: software defect prediction; power law function; unlabeled data; curvature
Online Access: https://www.mdpi.com/2076-3417/10/5/1892
id doaj-d467596571ab45279373ab28b30ac645
record_format Article
spelling doaj-d467596571ab45279373ab28b30ac645; 2020-11-25T01:41:51Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2020-03-01; vol. 10, no. 5, art. 1892; DOI 10.3390/app10051892 (app10051892)
Junhua Ren (School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China)
Feng Liu (School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China)
collection DOAJ
language English
format Article
sources DOAJ
author Junhua Ren
Feng Liu
title A Novel Approach for Software Defect prediction Based on the Power Law Function
publisher MDPI AG
series Applied Sciences
issn 2076-3417
publishDate 2020-03-01
description Power law describes a common behavior in which a few factors play decisive roles in one thing. Most software defects occur in very few instances. In this study, we proposed a novel approach that adopts power law function characteristics for software defect prediction. The first step in this approach is to establish the power law function of the majority of metrics in a software system. Following this, the power law function's maximal curvature value is applied as the threshold value for determining higher metric values. Furthermore, the total number of higher metric values is counted in each instance. Finally, the statistical data are clustered into different categories as defect-free and defect-prone instances. Case studies and a comparison were conducted based on twelve public datasets of Promise, SoftLab, and ReLink by using five different algorithms. The results indicate that the precision, recall, and F-measure values obtained by the proposed approach are the best among the five tested algorithms; the average values of recall and F-measure were improved by 14.3% and 6.0%, respectively. Furthermore, the complexity of the proposed approach based on the power law function is O(2n), which is the lowest among the five tested algorithms. The proposed approach is thus demonstrated to be feasible and highly efficient at software defect prediction with unlabeled datasets.
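The four steps named in the description (fit a power law per metric, take the point of maximal curvature as a threshold, count the high metric values per instance, group the instances) can be sketched roughly as below. This is an illustrative sketch only, not the authors' implementation. The record does not say how the power law is fitted or how the counts are clustered, so the rank-size log-log fit and the "more than half of the metrics are high" grouping rule are assumptions of this example, and all function names are hypothetical.

```python
import math


def fit_power_law(values):
    """Least-squares fit of y = a * x**b on log-log axes.

    ASSUMPTION: x is the 1-based rank of each positive value after
    sorting in descending order (a rank-size fit).
    """
    ys = sorted((v for v in values if v > 0), reverse=True)
    if len(ys) < 2:
        raise ValueError("need at least two positive values")
    lx = [math.log(r) for r in range(1, len(ys) + 1)]
    ly = [math.log(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    sxx = sum((x - mx) ** 2 for x in lx)
    b = sum((x - mx) * (y - my) for x, y in zip(lx, ly)) / sxx
    a = math.exp(my - b * mx)
    return a, b


def max_curvature_point(a, b):
    """x at which y = a * x**b has maximal curvature (decaying case, b < 0).

    Curvature: k(x) = |a*b*(b-1)| * x**(b-2) / (1 + a^2 b^2 x**(2b-2))**1.5
    Setting dk/dx = 0 gives a^2 b^2 x**(2b-2) = (b-2) / (2b-1).
    """
    if b >= 0:
        raise ValueError("expected a decaying power law (b < 0)")
    u = (b - 2) / (2 * b - 1)
    return (u / (a * a * b * b)) ** (1 / (2 * b - 2))


def metric_threshold(values):
    """Value of the fitted curve at its point of maximal curvature."""
    a, b = fit_power_law(values)
    return a * max_curvature_point(a, b) ** b


def predict_defect_prone(instances):
    """instances: list of equal-length lists of metric values.

    Returns one boolean per instance (True = defect-prone).
    ASSUMPTION: an instance is defect-prone when more than half of its
    metric values exceed their thresholds; the record only says the
    counts are "clustered".
    """
    n_metrics = len(instances[0])
    thresholds = [
        metric_threshold([row[j] for row in instances])
        for j in range(n_metrics)
    ]
    counts = [
        sum(row[j] > thresholds[j] for j in range(n_metrics))
        for row in instances
    ]
    return [c > n_metrics / 2 for c in counts]
```

Curvature is not scale-invariant, so the threshold in this sketch depends on the units of each metric. Whether and how the published method normalizes metric values is not stated in this record.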
topic software defect prediction
power law function
unlabeled data
curvature