Analyses of Classifier’s Performance Measures Used in Software Fault Prediction Studies
Assessing the quality of the software is both important and difficult. For this purpose, software fault prediction (SFP) models have been extensively used. However, selecting the right model and declaring the best out of multiple models are dependent on the performance measures. We analyze 14 frequently used, non-graphic classifier's performance measures used in SFP studies. These analyses would help machine learning practitioners and researchers in SFP to select the most appropriate performance measure for the models' evaluation. We analyze the performance measures for resilience against producing invalid values through our proposed plausibility criterion. After that, consistency and discriminancy analyses are performed to find the best out of the 14 performance measures. Finally, we draw the order of the selected performance measures from better to worse in both balance and imbalance datasets. Our analyses conclude that the F-measure and the G-mean1 are equally the best candidates to evaluate the SFP models with careful analysis of the result, as there is a risk of invalid values in certain scenarios.
Main Authors: | Muhammad Rizwan, Aamer Nadeem, Muddassar Azam Sindhu |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2019-01-01 |
Series: | IEEE Access |
Subjects: | Classification; evaluation parameters; machine learning; performance measures; software fault prediction |
Online Access: | https://ieeexplore.ieee.org/document/8741051/ |
id |
doaj-58121f0b8f464fd69da701b4fca8a2c6 |
---|---|
record_format |
Article |
spelling |
Muhammad Rizwan (https://orcid.org/0000-0002-0855-3465), Aamer Nadeem — Department of Computer Science, Capital University of Science and Technology (CUST), Islamabad, Pakistan; Muddassar Azam Sindhu (https://orcid.org/0000-0002-3411-9224) — Department of Computer Science, Quaid-i-Azam University (QAU), Islamabad, Pakistan. "Analyses of Classifier's Performance Measures Used in Software Fault Prediction Studies." IEEE Access, vol. 7, 2019-01-01, pp. 82764-82775. ISSN 2169-3536. DOI: 10.1109/ACCESS.2019.2923821. Article no. 8741051. https://ieeexplore.ieee.org/document/8741051/ |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Muhammad Rizwan Aamer Nadeem Muddassar Azam Sindhu |
title |
Analyses of Classifier’s Performance Measures Used in Software Fault Prediction Studies |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2019-01-01 |
description |
Assessing the quality of the software is both important and difficult. For this purpose, software fault prediction (SFP) models have been extensively used. However, selecting the right model and declaring the best out of multiple models are dependent on the performance measures. We analyze 14 frequently used, non-graphic classifier's performance measures used in SFP studies. These analyses would help machine learning practitioners and researchers in SFP to select the most appropriate performance measure for the models' evaluation. We analyze the performance measures for resilience against producing invalid values through our proposed plausibility criterion. After that, consistency and discriminancy analyses are performed to find the best out of the 14 performance measures. Finally, we draw the order of the selected performance measures from better to worse in both balance and imbalance datasets. Our analyses conclude that the F-measure and the G-mean1 are equally the best candidates to evaluate the SFP models with careful analysis of the result, as there is a risk of invalid values in certain scenarios. |
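For context on the abstract's caveat that the F-measure and G-mean1 "risk invalid values in certain scenarios": both are built from precision and recall, and both become undefined when a denominator is zero. The sketch below is not taken from the article itself; it uses the common definitions F = 2PR/(P+R) and G-mean1 = sqrt(P·R), and the helper name `confusion_scores` is hypothetical.

```python
import math

def confusion_scores(tp, fp, fn):
    """Precision, recall, F-measure and G-mean1 from confusion-matrix counts.

    Returns None for any score whose denominator is zero, illustrating the
    'invalid values in certain scenarios' risk noted in the abstract.
    """
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    if precision is None or recall is None or (precision + recall) == 0:
        f_measure = None
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    if precision is None or recall is None:
        g_mean1 = None
    else:
        g_mean1 = math.sqrt(precision * recall)
    return precision, recall, f_measure, g_mean1

# A classifier that predicts every module as non-faulty (tp=0, fp=0, fn=10):
# precision is undefined, so the F-measure and G-mean1 are undefined too.
print(confusion_scores(0, 0, 10))  # -> (None, 0.0, None, None)
```

Such degenerate predictions are exactly what imbalanced fault datasets tend to produce, which is why the abstract recommends careful analysis of the result alongside these measures.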
topic |
Classification evaluation parameters machine learning performance measures software fault prediction |
url |
https://ieeexplore.ieee.org/document/8741051/ |