Analyses of Classifier’s Performance Measures Used in Software Fault Prediction Studies

Assessing the quality of software is both important and difficult. For this purpose, software fault prediction (SFP) models have been used extensively. However, selecting the right model and declaring the best among multiple models depend on the performance measures. We analyze 14 non-graphical classifier performance measures frequently used in SFP studies. These analyses will help machine learning practitioners and researchers in SFP select the most appropriate performance measure for model evaluation. We first analyze the performance measures for resilience against producing invalid values through our proposed plausibility criterion. After that, consistency and discriminancy analyses are performed to find the best of the 14 performance measures. Finally, we order the selected performance measures from better to worse on both balanced and imbalanced datasets. Our analyses conclude that the F-measure and the G-mean1 are equally the best candidates for evaluating SFP models, provided the results are analyzed carefully, as there is a risk of invalid values in certain scenarios.

Bibliographic Details
Main Authors: Muhammad Rizwan, Aamer Nadeem, Muddassar Azam Sindhu
Format: Article
Language: English
Published: IEEE, 2019-01-01
Series: IEEE Access
Subjects: Classification; evaluation parameters; machine learning; performance measures; software fault prediction
Online Access:https://ieeexplore.ieee.org/document/8741051/
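The abstract notes that the F-measure and the G-mean1 can produce invalid values in certain scenarios. A minimal Python sketch (not from the article; assuming the common definition G-mean1 = √(precision × recall)) of where those divide-by-zero cases arise from a confusion matrix:

```python
import math

def prf_measures(tp, fp, fn, tn):
    """Return (precision, recall, F-measure, G-mean1); None marks an undefined value."""
    # Precision is undefined when the classifier makes no positive predictions.
    precision = tp / (tp + fp) if (tp + fp) else None
    # Recall is undefined when the dataset contains no actual positives.
    recall = tp / (tp + fn) if (tp + fn) else None
    if precision is None or recall is None or (precision + recall) == 0:
        f_measure = None  # the invalid-value scenario the abstract warns about
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    if precision is None or recall is None:
        g_mean1 = None
    else:
        g_mean1 = math.sqrt(precision * recall)
    return precision, recall, f_measure, g_mean1

# A classifier that flags nothing as faulty: precision (and hence F, G-mean1) is undefined.
print(prf_measures(tp=0, fp=0, fn=5, tn=95))  # → (None, 0.0, None, None)
```

Guarding these denominators (rather than letting them raise `ZeroDivisionError`) makes the "careful analysis of the result" the abstract recommends explicit in code.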
id doaj-58121f0b8f464fd69da701b4fca8a2c6
record_format Article
spelling doaj-58121f0b8f464fd69da701b4fca8a2c6 (indexed 2021-03-30T00:16:04Z)
Language: English; Publisher: IEEE; Journal: IEEE Access; ISSN: 2169-3536; Published: 2019-01-01; Volume: 7; Pages: 82764-82775; DOI: 10.1109/ACCESS.2019.2923821; Article number: 8741051
Title: Analyses of Classifier’s Performance Measures Used in Software Fault Prediction Studies
Authors: Muhammad Rizwan (ORCID: 0000-0002-0855-3465), Department of Computer Science, Capital University of Science and Technology (CUST), Islamabad, Pakistan; Aamer Nadeem, Department of Computer Science, Capital University of Science and Technology (CUST), Islamabad, Pakistan; Muddassar Azam Sindhu (ORCID: 0000-0002-3411-9224), Department of Computer Science, Quaid-i-Azam University (QAU), Islamabad, Pakistan
Abstract: Assessing the quality of software is both important and difficult. For this purpose, software fault prediction (SFP) models have been used extensively. However, selecting the right model and declaring the best among multiple models depend on the performance measures. We analyze 14 non-graphical classifier performance measures frequently used in SFP studies. These analyses will help machine learning practitioners and researchers in SFP select the most appropriate performance measure for model evaluation. We first analyze the performance measures for resilience against producing invalid values through our proposed plausibility criterion. After that, consistency and discriminancy analyses are performed to find the best of the 14 performance measures. Finally, we order the selected performance measures from better to worse on both balanced and imbalanced datasets. Our analyses conclude that the F-measure and the G-mean1 are equally the best candidates for evaluating SFP models, provided the results are analyzed carefully, as there is a risk of invalid values in certain scenarios.
Online Access: https://ieeexplore.ieee.org/document/8741051/
Keywords: Classification; evaluation parameters; machine learning; performance measures; software fault prediction
collection DOAJ
language English
format Article
sources DOAJ
author Muhammad Rizwan
Aamer Nadeem
Muddassar Azam Sindhu
spellingShingle Muhammad Rizwan
Aamer Nadeem
Muddassar Azam Sindhu
Analyses of Classifier’s Performance Measures Used in Software Fault Prediction Studies
IEEE Access
Classification
evaluation parameters
machine learning
performance measures
software fault prediction
author_facet Muhammad Rizwan
Aamer Nadeem
Muddassar Azam Sindhu
author_sort Muhammad Rizwan
title Analyses of Classifier’s Performance Measures Used in Software Fault Prediction Studies
title_short Analyses of Classifier’s Performance Measures Used in Software Fault Prediction Studies
title_full Analyses of Classifier’s Performance Measures Used in Software Fault Prediction Studies
title_fullStr Analyses of Classifier’s Performance Measures Used in Software Fault Prediction Studies
title_full_unstemmed Analyses of Classifier’s Performance Measures Used in Software Fault Prediction Studies
title_sort analyses of classifier’s performance measures used in software fault prediction studies
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2019-01-01
description Assessing the quality of software is both important and difficult. For this purpose, software fault prediction (SFP) models have been used extensively. However, selecting the right model and declaring the best among multiple models depend on the performance measures. We analyze 14 non-graphical classifier performance measures frequently used in SFP studies. These analyses will help machine learning practitioners and researchers in SFP select the most appropriate performance measure for model evaluation. We first analyze the performance measures for resilience against producing invalid values through our proposed plausibility criterion. After that, consistency and discriminancy analyses are performed to find the best of the 14 performance measures. Finally, we order the selected performance measures from better to worse on both balanced and imbalanced datasets. Our analyses conclude that the F-measure and the G-mean1 are equally the best candidates for evaluating SFP models, provided the results are analyzed carefully, as there is a risk of invalid values in certain scenarios.
topic Classification
evaluation parameters
machine learning
performance measures
software fault prediction
url https://ieeexplore.ieee.org/document/8741051/
work_keys_str_mv AT muhammadrizwan analysesofclassifierx2019sperformancemeasuresusedinsoftwarefaultpredictionstudies
AT aamernadeem analysesofclassifierx2019sperformancemeasuresusedinsoftwarefaultpredictionstudies
AT muddassarazamsindhu analysesofclassifierx2019sperformancemeasuresusedinsoftwarefaultpredictionstudies
_version_ 1724188440609685504