An Evolutionary Algorithm for Feature Subset Selection in Hard Disk Drive Failure Prediction
Hard disk drives are used in everyday life to store critical data. Although they are reliable, failure of a hard disk drive can be catastrophic, especially in applications like medicine, banking, air traffic control systems, missile guidance systems, computer numerical controlled machines, and more....
Main Author: | Bhasin, Harpreet |
---|---|
Format: | Others |
Published: | NSUWorks 2011 |
Subjects: | Computer Sciences |
Online Access: | http://nsuworks.nova.edu/gscis_etd/91 http://nsuworks.nova.edu/cgi/viewcontent.cgi?article=1090&context=gscis_etd |
id |
ndltd-nova.edu-oai-nsuworks.nova.edu-gscis_etd-1090 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-nova.edu-oai-nsuworks.nova.edu-gscis_etd-1090 2016-10-20T03:59:12Z An Evolutionary Algorithm for Feature Subset Selection in Hard Disk Drive Failure Prediction Bhasin, Harpreet 2011-01-01T08:00:00Z text application/pdf http://nsuworks.nova.edu/gscis_etd/91 http://nsuworks.nova.edu/cgi/viewcontent.cgi?article=1090&context=gscis_etd CEC Theses and Dissertations NSUWorks Computer Sciences |
collection |
NDLTD |
format |
Others |
sources |
NDLTD |
topic |
Computer Sciences |
description |
Hard disk drives are used in everyday life to store critical data. Although they are reliable, failure of a hard disk drive can be catastrophic, especially in applications like medicine, banking, air traffic control systems, missile guidance systems, computer numerical controlled machines, and more. The use of Self-Monitoring, Analysis and Reporting Technology (SMART) can aid in failure prediction by monitoring specific drive attributes and warning the user of an impending failure so that the user can back up data while there is still time. As a consequence, hard drive failure prediction has become an important problem and the subject of active research.
The best available approaches for hard drive failure prediction achieve acceptably low false alarm rates by first selecting a subset of features using non-parametric statistical methods such as reverse arrangements and then using the multiple-instance naïve Bayes classifier for the prediction task. However, the prediction accuracy of this approach is not sufficiently high.
The focus of this dissertation was to improve drive failure prediction accuracy while maintaining a low false alarm rate by using a genetic algorithm for feature set reduction in conjunction with the multiple-instance naïve Bayes classifier for the prediction task. This research achieved a failure detection rate of 81% with a 0% false alarm rate using 12 attributes selected by the genetic algorithm. As a secondary contribution, this dissertation investigated the tradeoff between feature subset reduction and prediction accuracy in the hard drive failure prediction problem. This research found that as the number of features decreased below 10, the detection accuracy decreased significantly. |
author |
Bhasin, Harpreet |
title |
An Evolutionary Algorithm for Feature Subset Selection in Hard Disk Drive Failure Prediction |
publisher |
NSUWorks |
publishDate |
2011 |
url |
http://nsuworks.nova.edu/gscis_etd/91 http://nsuworks.nova.edu/cgi/viewcontent.cgi?article=1090&context=gscis_etd |
_version_ |
1718387599452667904 |
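
The description above names a genetic algorithm for feature subset selection but gives no implementation details. The following is a minimal illustrative sketch of the general technique, not the dissertation's method: each candidate is a boolean mask over the features, and the population evolves through tournament selection, one-point crossover, bit-flip mutation, and elitism. All names here (`select_features`, `toy_fitness`, `INFORMATIVE`) and all parameter values are assumptions made for illustration. In particular, `toy_fitness` is a synthetic stand-in; in the setting the record describes, the fitness would come from evaluating a multiple-instance naïve Bayes classifier on the selected SMART attributes.

```python
import random
from typing import Callable, Tuple

Mask = Tuple[bool, ...]  # True at index i means feature i is selected


def select_features(
    n_features: int,
    fitness: Callable[[Mask], float],
    pop_size: int = 30,
    generations: int = 40,
    mutation_rate: float = 0.05,
    seed: int = 0,
) -> Mask:
    """Return the best feature mask found by a simple generational GA.

    Requires n_features >= 2 and pop_size >= 2.
    """
    rng = random.Random(seed)  # seeded so runs are reproducible

    def tournament(scored):
        # Draw two distinct individuals and keep the fitter one.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] >= b[0] else b[1]

    def crossover(p1: Mask, p2: Mask) -> Mask:
        # One-point crossover with the cut strictly inside the mask.
        cut = rng.randrange(1, n_features)
        return p1[:cut] + p2[cut:]

    def mutate(mask: Mask) -> Mask:
        # Flip each bit independently with probability mutation_rate.
        return tuple(bit != (rng.random() < mutation_rate) for bit in mask)

    population = [
        tuple(rng.random() < 0.5 for _ in range(n_features))
        for _ in range(pop_size)
    ]
    best = (float("-inf"), population[0])
    for _ in range(generations):
        scored = [(fitness(m), m) for m in population]
        best = max([best] + scored, key=lambda s: s[0])
        children = [
            mutate(crossover(tournament(scored), tournament(scored)))
            for _ in range(pop_size - 1)
        ]
        # Elitism: carry the best mask seen so far into the next generation.
        population = [best[1]] + children
    final = [(fitness(m), m) for m in population]
    return max([best] + final, key=lambda s: s[0])[1]


# Hypothetical stand-in for classifier-based fitness: reward a fixed set of
# "informative" feature indices and penalize the size of the subset.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask: Mask) -> float:
    chosen = {i for i, bit in enumerate(mask) if bit}
    return len(chosen & INFORMATIVE) - 0.1 * len(chosen)


best_mask = select_features(8, toy_fitness, seed=1)
print([i for i, bit in enumerate(best_mask) if bit])
```

The size penalty in `toy_fitness` plays the role of the tradeoff the description mentions between shrinking the feature subset and keeping detection accuracy; with a real classifier, the penalty weight would need to be tuned against measured detection and false alarm rates.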