Feature Extension of Gut Microbiome Data for Deep Neural Network-Based Colorectal Cancer Classification

Colorectal cancer (CRC) is the third most deadly cancer worldwide. The use of gut microbiome in early detection of the disease has attracted much attention from the research community, mainly because of its noninvasive nature. Recent achievements in next generation sequencing technology have led to...


Bibliographic Details
Main Authors: Mwenge Mulenga, Sameem Abdul Kareem, Aznul Qalid Md Sabri, Manjeevan Seera, Suresh Govind, Chandramathi Samudi, Saharuddin Bin Mohamad
Format: Article
Language: English
Published: IEEE 2021-01-01
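Series: IEEE Access
Subjects: Colorectal cancer; deep neural network; feature dominance; gut microbiome; normalization; feature extension
Online Access: https://ieeexplore.ieee.org/document/9319639/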
id doaj-3fbdbb65ee514f438cb6b7c7ab39fb61
record_format Article
spelling doaj-3fbdbb65ee514f438cb6b7c7ab39fb61
record timestamp 2021-03-30T15:06:22Z
language code eng
journal IEEE Access
volume 9
pages 23565-23578
doi 10.1109/ACCESS.2021.3050838
IEEE Xplore document 9319639
author affiliations (listed in author order)
Mwenge Mulenga (https://orcid.org/0000-0001-5961-4830): School of Science, Engineering and Technology, Mulungushi University, Kabwe, Zambia
Sameem Abdul Kareem: Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia
Aznul Qalid Md Sabri (https://orcid.org/0000-0002-4758-5400): Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia
Manjeevan Seera (https://orcid.org/0000-0002-2797-3668): Department of Econometrics and Business Statistics, School of Business, Monash University Malaysia, Subang Jaya, Malaysia
Suresh Govind: Department of Parasitology, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia
Chandramathi Samudi: Department of Medical Microbiology, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia
Saharuddin Bin Mohamad: Faculty of Science, Institute of Biological Sciences, University of Malaya, Kuala Lumpur, Malaysia
collection DOAJ
language English
format Article
sources DOAJ
author Mwenge Mulenga
Sameem Abdul Kareem
Aznul Qalid Md Sabri
Manjeevan Seera
Suresh Govind
Chandramathi Samudi
Saharuddin Bin Mohamad
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2021-01-01
description Colorectal cancer (CRC) is the third most deadly cancer worldwide. The use of gut microbiome in early detection of the disease has attracted much attention from the research community, mainly because of its noninvasive nature. Recent achievements in next generation sequencing technology have led to increased availability of sequence data and enabled an environment for the growth of gut microbiome research. The use of conventional machine learning algorithms for automatic detection of CRC based on the microbiome is limited by factors such as low accuracy and the need for manual selection of features. Despite their success in other fields, Deep Neural Network (DNN) algorithms have limitations in microbiome-based CRC classification. These limitations include high dimensionality of microbiome data and other characteristics associated with sequence data such as feature dominance. In this paper, we propose a feature augmentation approach that aggregates data normalization methods to extend existing features of a dataset. The proposed method combines feature extension with data augmentation to improve CRC classification performance of a DNN model. The proposed model obtained area under the curve (AUC) scores of 0.96 and 0.89 on two publicly available microbiome datasets.
topic Colorectal cancer
deep neural network
feature dominance
gut microbiome
normalization
feature extension
url https://ieeexplore.ieee.org/document/9319639/
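Illustrative sketch: the abstract describes extending a dataset's existing features by aggregating several data normalization methods. The Python example below is one way that idea could look, not the authors' implementation. The record does not say which normalization methods the paper uses, so the three chosen here (total-sum scaling, log transform, min-max scaling), all function names, and the toy count matrix are assumptions made for illustration only.

```python
import numpy as np


def total_sum_scale(counts):
    """Convert raw counts to per-sample relative abundances (rows sum to 1)."""
    totals = counts.sum(axis=1, keepdims=True)
    # Guard against empty samples to avoid division by zero.
    return counts / np.where(totals == 0, 1.0, totals)


def log_transform(counts):
    """Compress large counts with log(1 + x) so dominant taxa weigh less."""
    return np.log1p(counts)


def min_max_scale(counts):
    """Rescale each feature (column) to the range [0, 1]."""
    lo = counts.min(axis=0, keepdims=True)
    span = counts.max(axis=0, keepdims=True) - lo
    # Constant columns have zero span; map them to 0 instead of dividing by 0.
    return (counts - lo) / np.where(span == 0, 1.0, span)


def extend_features(counts, normalizers):
    """Concatenate several normalized views of the same matrix column-wise."""
    counts = np.asarray(counts, dtype=float)
    return np.hstack([normalize(counts) for normalize in normalizers])


# Hypothetical toy data: 3 samples (rows) x 3 taxa (columns) of raw counts.
counts = np.array([[10, 0, 90],
                   [5, 5, 0],
                   [0, 20, 20]])

extended = extend_features(
    counts, [total_sum_scale, log_transform, min_max_scale]
)
print(extended.shape)  # (3, 9): 3 taxa x 3 normalized views per sample
```

Each sample keeps its row, while the number of columns grows with the number of normalization methods; a classifier such as a DNN would then be trained on the widened matrix.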