Optimal Statistical Feature Subset Selection for Bearing Fault Detection and Severity Estimation

The performance of bearing fault detection systems based on machine learning techniques largely depends on the selected features. Hence, selection of an ideal number of dominant features from a comprehensive list of features is needed to decrease the number of computations involved in fault detectio...

Bibliographic Details
Main Authors:	Chhaya Grover, Neelam Turk
Format:	Article
Language:	English
Published:	Hindawi Limited 2020-01-01
Series:	Shock and Vibration
Online Access:	http://dx.doi.org/10.1155/2020/5742053

id	doaj-aa13c3196c8d4dd6a59b27db8afcfba0
record_format	Article
spelling	doaj-aa13c3196c8d4dd6a59b27db8afcfba0; 2020-11-25T03:44:06Z; eng; Hindawi Limited; Shock and Vibration; 1070-9622; 1875-9203; 2020-01-01; 2020; 10.1155/2020/5742053; 5742053; Optimal Statistical Feature Subset Selection for Bearing Fault Detection and Severity Estimation; Chhaya Grover, Neelam Turk (both: Department of Electronics Engineering, J.C. Bose University of Science and Technology (YMCA), Sector-6, Faridabad, Haryana 121006, India); http://dx.doi.org/10.1155/2020/5742053
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Chhaya Grover Neelam Turk
title	Optimal Statistical Feature Subset Selection for Bearing Fault Detection and Severity Estimation
publisher	Hindawi Limited
series	Shock and Vibration
issn	1070-9622 1875-9203
publishDate	2020-01-01
description	The performance of bearing fault detection systems based on machine learning techniques depends largely on the selected features. Hence, a small set of dominant features must be selected from a comprehensive feature list to reduce the number of computations involved in fault detection. In this paper, we apply, for the first time, the statistical time-domain features Hjorth parameters (activity, mobility, and complexity) and normal negative log likelihood for a Gaussian mixture model (GMM), in addition to 26 other established statistical features, to identify bearing fault type and severity. Two datasets are derived from a publicly available database of Case Western Reserve University to assess the capability of the features in fault identification under various fault sizes and motor loads. The features are investigated using a two-step approach: filter-based ranking with 3 metrics, followed by feature subset selection with 11 search techniques. The results indicate that the feature set consisting of root mean square, geometric mean, zero crossing rate, Hjorth mobility, and normal negative log likelihood for GMM outperforms other features. We also compare the diagnostic performance of normal negative log likelihood for GMM with that of the established feature normal negative log likelihood for a single Gaussian. The selected feature set is validated using ensemble rule-based classifiers and achieves an average accuracy of 96.75% with the proposed feature subset and 99.63% with all 30 features. F-measure and G-mean scores are also calculated to investigate performance on datasets with class imbalance. The diagnostic effectiveness of the features is further validated on a bearing dataset obtained from an operating thermal power plant. The results show that the proposed feature subset plays a major role in achieving good classification results and has the potential to be used in high-dimensional datasets with multidomain features.
url	http://dx.doi.org/10.1155/2020/5742053
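
Illustrative note: the abstract names several time-domain features (root mean square, zero crossing rate, and the Hjorth parameters) without giving formulas. The sketch below computes them from a sampled vibration signal using the commonly cited textbook definitions, which are assumed here and may differ in detail from the paper's implementation. All function names are hypothetical and chosen only for this example.

```python
import math


def variance(xs):
    """Population variance of a sequence of numbers."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def first_difference(xs):
    """Discrete approximation of the derivative: x[n+1] - x[n]."""
    return [b - a for a, b in zip(xs, xs[1:])]


def hjorth_parameters(signal):
    """Return (activity, mobility, complexity) of a signal.

    Assumed definitions (standard, not taken from the record):
      activity   = var(x)
      mobility   = sqrt(var(x') / var(x))
      complexity = mobility(x') / mobility(x)
    """
    d1 = first_difference(signal)
    d2 = first_difference(d1)
    activity = variance(signal)
    mobility = math.sqrt(variance(d1) / activity)
    complexity = math.sqrt(variance(d2) / variance(d1)) / mobility
    return activity, mobility, complexity


def root_mean_square(signal):
    """Square root of the mean of the squared samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def zero_crossing_rate(signal):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0)
    )
    return crossings / (len(signal) - 1)


if __name__ == "__main__":
    # Synthetic test signal: 10 periods of a unit sine wave, 1000 samples.
    omega = 2 * math.pi * 0.01
    x = [math.sin(omega * n) for n in range(1000)]
    print(hjorth_parameters(x))
    print(root_mean_square(x), zero_crossing_rate(x))
```

For a pure sinusoid the complexity is close to 1, and it grows as the waveform departs from a sine shape, which is one reason such parameters are used to characterize vibration signals.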

Similar Items