A Hybridized Feature Selection and Extraction Approach for Enhancing Cancer Prediction Based on DNA Methylation
Because aberrant DNA methylation plays a vital role in the development of diseases such as cancer, understanding its mechanism has become essential in recent years for early detection and diagnosis. Despite the advent of high-throughput technologies, there are still several challenges...
Main Authors: | Abeer A. Raweh, Mohammed Nassef, Amr Badr |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2018-01-01 |
Series: | IEEE Access |
Subjects: | Cancer prediction, epigenetics, biomarkers, DNA methylation, feature selection, feature extraction |
Online Access: | https://ieeexplore.ieee.org/document/8307050/ |
id |
doaj-aab1735d951f404389aa6394c075199e |
---|---|
record_format |
Article |
spelling |
doaj-aab1735d951f404389aa6394c075199e; 2021-03-29T20:48:11Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2018-01-01; vol. 6, pp. 15212-15223; DOI 10.1109/ACCESS.2018.2812734; document 8307050; Abeer A. Raweh (https://orcid.org/0000-0001-8449-5239), Mohammed Nassef, Amr Badr; Faculty of Computers and Information, Cairo University, Cairo, Egypt (all three authors); https://ieeexplore.ieee.org/document/8307050/; keywords: Cancer prediction, epigenetics, biomarkers, DNA methylation, feature selection, feature extraction |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Abeer A. Raweh Mohammed Nassef Amr Badr |
title |
A Hybridized Feature Selection and Extraction Approach for Enhancing Cancer Prediction Based on DNA Methylation |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2018-01-01 |
description |
Because aberrant DNA methylation plays a vital role in the development of diseases such as cancer, understanding its mechanism has become essential in recent years for early detection and diagnosis. Despite the advent of high-throughput technologies, classification using DNA methylation data still faces several challenges. The high dimensionality and noisiness of DNA methylation data may degrade prediction accuracy. It is therefore increasingly important to employ robust computational tools, such as feature selection and extraction methods, to extract the informative features from among thousands and hence improve cancer prediction. Using the degree of DNA methylation in promoter and probe regions, this paper aims to predict cancer with a hybridized approach based on feature selection and feature extraction techniques. The suggested approach uses a filter feature selection method called F-score to overcome the high dimensionality of DNA methylation data, and proposes an extraction model that employs three novel feature extraction methods: the peaks of the mean methylation density, the fast Fourier transform algorithm, and the symmetry between the methylation density of a sample and the mean methylation density of both sample types (normal and cancer). The goal is accurate cancer classification with reduced training time. To evaluate the reliability of the approach, the naïve Bayes, random forest, and support vector machine algorithms are used to predict different cancer types (breast, colon, head, kidney, lung, thyroid, and uterine) with and without the hybridized approach. The results show that classification accuracy improves in almost all cases, which also indirectly demonstrates the reliability of the approach. |
topic |
Cancer prediction epigenetics biomarkers DNA methylation feature selection feature extraction |
url |
https://ieeexplore.ieee.org/document/8307050/ |
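
The description above names an F-score filter for feature selection and the fast Fourier transform as one of the feature extraction steps, but this record gives no formulas. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the commonly used two-class F-score (squared distance of each class mean from the overall mean, divided by the sum of the within-class sample variances) and, purely for illustration, takes the magnitudes of the leading FFT coefficients of each sample's selected features. The function names (`f_score`, `select_top_k`, `fft_features`) and the toy methylation values are hypothetical.

```python
import numpy as np


def f_score(X, y):
    """Two-class F-score per feature (assumed definition, see lead-in).

    X: (n_samples, n_features) methylation levels.
    y: binary labels, 1 = cancer, 0 = normal.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    pos, neg = X[y == 1], X[y == 0]
    mean_all = X.mean(axis=0)
    # Between-class separation of the class means from the overall mean.
    numerator = (pos.mean(axis=0) - mean_all) ** 2 + (neg.mean(axis=0) - mean_all) ** 2
    # Within-class spread, using sample variances (ddof=1).
    denominator = pos.var(axis=0, ddof=1) + neg.var(axis=0, ddof=1)
    # Avoid division by zero for features that are constant within both classes.
    return numerator / np.maximum(denominator, 1e-12)


def select_top_k(X, y, k):
    """Return the column indices of the k highest-scoring features, in ascending order."""
    scores = f_score(X, y)
    top = np.argsort(scores)[::-1][:k]
    return np.sort(top)


def fft_features(X, n_coeffs):
    """Magnitudes of the first n_coeffs real-FFT coefficients of each sample (row)."""
    spectrum = np.fft.rfft(np.asarray(X, dtype=float), axis=1)
    return np.abs(spectrum[:, :n_coeffs])


# Hypothetical toy data: 4 samples x 3 methylation features.
X_toy = np.array([
    [0.1, 0.5, 0.9],  # normal
    [0.2, 0.5, 0.8],  # normal
    [0.9, 0.5, 0.1],  # cancer
    [0.8, 0.6, 0.2],  # cancer
])
y_toy = np.array([0, 0, 1, 1])

selected = select_top_k(X_toy, y_toy, k=2)
features = fft_features(X_toy[:, selected], n_coeffs=2)
print("selected feature indices:", selected)
print("FFT feature matrix shape:", features.shape)
```

In a pipeline of the kind the description outlines, a matrix like `features` would then be passed to a classifier such as naïve Bayes, random forest, or a support vector machine; how the paper combines the selected and extracted features is not stated in this record.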