A Hybridized Feature Selection and Extraction Approach for Enhancing Cancer Prediction Based on DNA Methylation

Due to the vital role of aberrant DNA methylation in the development of diseases such as cancer, understanding its mechanism has become essential in recent years for early detection and diagnosis. Despite the advent of high-throughput technologies, there are still several challenges...


Bibliographic Details
Main Authors: Abeer A. Raweh, Mohammed Nassef, Amr Badr
Format: Article
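Language: English
Published: IEEE 2018-01-01
Series: IEEE Access
Subjects: Cancer prediction; epigenetics; biomarkers; DNA methylation; feature selection; feature extraction
Online Access: https://ieeexplore.ieee.org/document/8307050/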
id doaj-aab1735d951f404389aa6394c075199e
record_format Article
spelling doaj-aab1735d951f404389aa6394c075199e; 2021-03-29T20:48:11Z; eng; IEEE; IEEE Access; 2169-3536; 2018-01-01; vol. 6; pp. 15212-15223; doi 10.1109/ACCESS.2018.2812734; 8307050; A Hybridized Feature Selection and Extraction Approach for Enhancing Cancer Prediction Based on DNA Methylation; Abeer A. Raweh (https://orcid.org/0000-0001-8449-5239), Mohammed Nassef, Amr Badr; Faculty of Computers and Information, Cairo University, Cairo, Egypt (all three authors); https://ieeexplore.ieee.org/document/8307050/; Cancer prediction; epigenetics; biomarkers; DNA methylation; feature selection; feature extraction
collection DOAJ
language English
format Article
sources DOAJ
author Abeer A. Raweh
Mohammed Nassef
Amr Badr
author_sort Abeer A. Raweh
title A Hybridized Feature Selection and Extraction Approach for Enhancing Cancer Prediction Based on DNA Methylation
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2018-01-01
description Due to the vital role of aberrant DNA methylation in the development of diseases such as cancer, understanding its mechanism has become essential in recent years for early detection and diagnosis. Despite the advent of high-throughput technologies, several challenges remain in classifying samples from DNA methylation data. The high dimensionality and noisiness of DNA methylation data may degrade prediction accuracy. It is therefore increasingly important to employ robust computational tools, such as feature selection and extraction methods, to extract the informative features from among thousands of candidates and thereby improve cancer prediction. Using the degree of DNA methylation in promoter and probe regions, this paper aims to predict cancer with a hybridized approach based on feature selection and feature extraction techniques. The suggested approach uses a filter feature selection method, the F-score, to overcome the high-dimensionality problem of DNA methylation data, and proposes an extraction model that employs three novel feature extraction methods: the peaks of the mean methylation density, the fast Fourier transform algorithm, and the symmetry between the methylation density of a sample and the mean methylation density of both sample types (normal and cancer). The goal is accurate cancer classification with reduced training time. To evaluate the reliability of our approach, the naïve Bayes, random forest, and support vector machine algorithms are used to predict different cancer types (breast, colon, head, kidney, lung, thyroid, and uterine) with and without the hybridized approach. The results show that classification accuracy improves in almost all cases, which also indirectly supports the reliability of the approach.
topic Cancer prediction
epigenetics
biomarkers
DNA methylation
feature selection
feature extraction
url https://ieeexplore.ieee.org/document/8307050/
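
The description in this record names an F-score filter and the fast Fourier transform as two building blocks of the approach. The sketch below illustrates what those two steps might look like. It assumes the commonly used two-class F-score (squared distance of each class mean from the overall mean, divided by the sum of the within-class sample variances); the record does not state which variant the authors use. The function names f_scores, select_top_k and fft_magnitudes, the eps parameter, and the synthetic data are all hypothetical. The peak and symmetry features and the classifiers mentioned in the description are not reproduced here.

```python
import numpy as np


def f_scores(X, y, eps=1e-12):
    """Per-feature F-score for binary labels y (1 = cancer, 0 = normal).

    Assumed definition: between-class separation of the class means
    divided by the sum of the within-class sample variances.
    Each class needs at least two samples for the variance to be defined.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    pos, neg = X[y == 1], X[y == 0]
    mean_all = X.mean(axis=0)
    between = (pos.mean(axis=0) - mean_all) ** 2 + (neg.mean(axis=0) - mean_all) ** 2
    within = pos.var(axis=0, ddof=1) + neg.var(axis=0, ddof=1)
    # eps avoids division by zero for features that are constant within both classes.
    return between / (within + eps)


def select_top_k(scores, k):
    """Indices of the k highest-scoring features, best first."""
    return np.argsort(scores)[::-1][:k]


def fft_magnitudes(X, n_coeffs):
    """Magnitudes of the first n_coeffs real-FFT coefficients of each row."""
    X = np.asarray(X, dtype=float)
    return np.abs(np.fft.rfft(X, axis=1))[:, :n_coeffs]


# Synthetic stand-in for methylation levels: 20 samples x 64 sites in [0, 1].
rng = np.random.default_rng(0)
X_demo = rng.random((20, 64))
y_demo = np.array([0] * 10 + [1] * 10)
# Shift the first five sites in the "cancer" class so they become informative.
X_demo[y_demo == 1, :5] = np.clip(X_demo[y_demo == 1, :5] + 0.5, 0.0, 1.0)

scores = f_scores(X_demo, y_demo)
# Keep the selected sites in their original column order before the FFT.
keep = np.sort(select_top_k(scores, 8))
features = fft_magnitudes(X_demo[:, keep], 4)
print(keep, features.shape)
```

The filter step only ranks sites one at a time, so it is cheap on thousands of features but ignores interactions between them. The FFT step then summarizes each reduced profile with a few spectral magnitudes that could be passed to a classifier.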