A Novel Approach for Software Defect prediction Based on the Power Law Function
Power law describes a common behavior in which a few factors play decisive roles in one thing. Most software defects occur in very few instances. In this study, we proposed a novel approach that adopts power law function characteristics for software defect prediction. The first step in this approach...
Main Authors: | Junhua Ren, Feng Liu |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-03-01 |
Series: | Applied Sciences |
Subjects: | software defect prediction; power law function; unlabeled data; curvature |
Online Access: | https://www.mdpi.com/2076-3417/10/5/1892 |
id |
doaj-d467596571ab45279373ab28b30ac645 |
---|---|
record_format |
Article |
spelling |
doaj-d467596571ab45279373ab28b30ac645; 2020-11-25T01:41:51Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2020-03-01; vol. 10, no. 5, art. 1892; doi:10.3390/app10051892; app10051892; A Novel Approach for Software Defect prediction Based on the Power Law Function; Junhua Ren, Feng Liu (both: School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China); https://www.mdpi.com/2076-3417/10/5/1892; software defect prediction; power law function; unlabeled data; curvature |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Junhua Ren Feng Liu |
author_sort |
Junhua Ren |
title |
A Novel Approach for Software Defect prediction Based on the Power Law Function |
publisher |
MDPI AG |
series |
Applied Sciences |
issn |
2076-3417 |
publishDate |
2020-03-01 |
description |
Power law describes a common behavior in which a few factors play a decisive role in an outcome. Most software defects occur in very few instances. In this study, we propose a novel approach that adopts power law function characteristics for software defect prediction. The first step in this approach is to establish the power law function of the majority of metrics in a software system. Following this, the power law function's maximal curvature value is applied as the threshold value for determining higher metric values. Next, the total number of higher metric values is counted in each instance. Finally, the statistical data are clustered into two categories: defect-free and defect-prone instances. Case studies and a comparison were conducted on twelve public datasets from Promise, SoftLab, and ReLink using five different algorithms. The results indicate that the precision, recall, and F-measure values obtained by the proposed approach are the best among the five tested algorithms; the average values of recall and F-measure were improved by 14.3% and 6.0%, respectively. Furthermore, the complexity of the proposed approach based on the power law function is O(2n), which is the lowest among the five tested algorithms. The proposed approach is thus demonstrated to be feasible and highly efficient at software defect prediction with unlabeled datasets. |
topic |
software defect prediction power law function unlabeled data curvature |
url |
https://www.mdpi.com/2076-3417/10/5/1892 |
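The four steps named in the description (fit a power law per metric, take the point of maximal curvature as a threshold, count the higher metric values per instance, split instances into two groups) can be illustrated with a minimal Python sketch. This record does not say how the authors fit the power law or which clustering method they use, so the sketch makes two loudly labeled assumptions: each metric is fitted as value ≈ a · rank^b by least squares on log-log axes, and a simple one-dimensional two-means split stands in for the clustering step. All function names are hypothetical and are not taken from the paper.

```python
import numpy as np


def power_law_threshold(values):
    """Return the fitted metric value at the rank of maximal curvature.

    Assumption (not from the paper): the metric is modeled as
    value ~ a * rank**b, fitted by least squares in log-log space.
    """
    v = np.sort(np.asarray(values, dtype=float))[::-1]  # largest first
    v = v[v > 0]  # the log-log fit needs strictly positive values
    if len(v) < 2:
        raise ValueError("need at least two positive metric values")
    ranks = np.arange(1, len(v) + 1, dtype=float)
    b, log_a = np.polyfit(np.log(ranks), np.log(v), 1)
    a = np.exp(log_a)

    # Curvature of y = a * x**b: |y''| / (1 + y'**2)**1.5, on a dense grid.
    x = np.linspace(1.0, float(len(v)), 2000)
    d1 = a * b * x ** (b - 1)
    d2 = a * b * (b - 1) * x ** (b - 2)
    kappa = np.abs(d2) / (1.0 + d1 ** 2) ** 1.5
    x_star = x[np.argmax(kappa)]
    return a * x_star ** b


def split_by_count(counts, iters=20):
    """One-dimensional two-means split; True marks the high-count group.

    Assumption: this stands in for the unspecified clustering step.
    """
    c = np.asarray(counts, dtype=float)
    lo, hi = c.min(), c.max()
    if lo == hi:  # every instance has the same count: nothing to separate
        return np.zeros(len(c), dtype=bool)
    for _ in range(iters):
        high = np.abs(c - hi) < np.abs(c - lo)
        lo, hi = c[~high].mean(), c[high].mean()
    return high


def predict_defect_prone(X):
    """X has one row per instance and one column per metric."""
    X = np.asarray(X, dtype=float)
    thresholds = np.array(
        [power_law_threshold(X[:, j]) for j in range(X.shape[1])]
    )
    counts = (X > thresholds).sum(axis=1)  # higher metric values per instance
    return counts, split_by_count(counts)


if __name__ == "__main__":
    ranks = np.arange(1, 11, dtype=float)
    # Synthetic heavy-tailed metrics, for illustration only.
    X = np.column_stack([9.0 / ranks, 16.0 / ranks, 25.0 / ranks])
    counts, labels = predict_defect_prone(X)
    print(counts, labels)
```

No labels are used anywhere in the sketch, which matches the record's description of the approach as working on unlabeled datasets. Curvature depends on the scale of the axes, so a real implementation would need to settle on a normalization for each metric; the sketch leaves the values unscaled.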