Integrative machine learning analysis of multiple gene expression profiles in cervical cancer

Although most cervical cancer cases are reported to be closely related to Human Papillomavirus (HPV) infection, there is a need to study genes that stand out differentially in the eventual development of cervical cancer following HPV infection. In this study, we proposed an integrative mac...

Bibliographic Details
Main Authors: Mei Sze Tan, Siow-Wee Chang, Phaik Leng Cheah, Hwa Jen Yap
Format: Article
Language: English
Published: PeerJ Inc. 2018-07-01
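Series: PeerJ
Subjects: Gene expression profiling; Meta-analysis; Machine learning; Feature selection; Cervical cancer prognosis; Potential gene signature
Online Access: https://peerj.com/articles/5285.pdf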
id doaj-e0799ec20a9b4d79855687e2b4fa6ee5
record_format Article
spelling doaj-e0799ec20a9b4d79855687e2b4fa6ee5
2020-11-24T23:43:30Z; eng; PeerJ Inc.; PeerJ; ISSN 2167-8359; 2018-07-01; vol. 6; e5285; DOI 10.7717/peerj.5285
Integrative machine learning analysis of multiple gene expression profiles in cervical cancer
Mei Sze Tan (0); Siow-Wee Chang (1); Phaik Leng Cheah (2); Hwa Jen Yap (3)
(0) Bioinformatics Programme, Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia
(1) Bioinformatics Programme, Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia
(2) Department of Pathology, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia
(3) Department of Mechanical Engineering, Faculty of Engineering, University of Malaya, Kuala Lumpur, Malaysia
collection DOAJ
language English
format Article
sources DOAJ
author Mei Sze Tan
Siow-Wee Chang
Phaik Leng Cheah
Hwa Jen Yap
title Integrative machine learning analysis of multiple gene expression profiles in cervical cancer
publisher PeerJ Inc.
series PeerJ
issn 2167-8359
publishDate 2018-07-01
description Although most cervical cancer cases are reported to be closely related to Human Papillomavirus (HPV) infection, there is a need to study genes that stand out differentially in the eventual development of cervical cancer following HPV infection. In this study, we proposed an integrative machine learning approach to analyse multiple gene expression profiles in cervical cancer in order to identify a set of genetic markers that are associated with, and may eventually aid in, the diagnosis or prognosis of cervical cancer. The proposed integrative analysis is composed of three steps: (i) gene expression analysis of each individual dataset; (ii) meta-analysis of multiple datasets; and (iii) feature selection and machine learning analysis. As a result, 21 gene expressions were identified through the integrative machine learning analysis, which included seven supervised methods and one unsupervised method. A functional analysis with GSEA (Gene Set Enrichment Analysis) was performed on the selected 21-gene expression set and showed significant enrichment in a potential nine-gene expression signature, namely PEG3, SPON1, BTD and RPLP2 (upregulated genes) and PRDX3, COPB2, LSM3, SLC5A3 and AS1B (downregulated genes).
topic Gene expression profiling
Meta-analysis
Machine learning
Feature selection
Cervical cancer prognosis
Potential gene signature
url https://peerj.com/articles/5285.pdf
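
Illustrative sketch (not part of the record): the three steps named in the description, (i) per-dataset gene expression analysis, (ii) meta-analysis across datasets, and (iii) feature selection followed by machine learning, could be arranged roughly as below. This record does not name the statistics or classifiers the authors used, so every technique here is a stand-in chosen for brevity: a Welch t-statistic per gene, a Stouffer-style weighted combination that treats those t-statistics as approximate z-scores, top-k selection, and a nearest-centroid classifier. All function names, gene labels, and numbers are hypothetical toy values, not data from the article.

```python
import math
from statistics import mean, variance


def welch_t(tumour, normal):
    """Welch t-statistic for one gene: tumour vs. normal expression values."""
    se = math.sqrt(variance(tumour) / len(tumour) + variance(normal) / len(normal))
    return (mean(tumour) - mean(normal)) / se


def per_dataset_scores(dataset):
    """Step (i): score every gene within a single dataset.

    `dataset` maps gene -> (tumour_values, normal_values); this layout is assumed.
    """
    return {gene: welch_t(t, n) for gene, (t, n) in dataset.items()}


def stouffer_combine(score_tables, weights):
    """Step (ii): weighted Stouffer-style combination, gene by gene.

    Simplification: t-statistics are used directly as approximate z-scores.
    Only genes measured in every dataset are kept.
    """
    shared = set.intersection(*(set(s) for s in score_tables))
    norm = math.sqrt(sum(w * w for w in weights))
    return {g: sum(w * s[g] for w, s in zip(weights, score_tables)) / norm
            for g in shared}


def select_top_genes(combined, k):
    """Step (iii), part 1: keep the k genes with the largest |combined score|."""
    return sorted(combined, key=lambda g: abs(combined[g]), reverse=True)[:k]


def nearest_centroid(train, genes):
    """Step (iii), part 2: a tiny nearest-centroid classifier on selected genes.

    `train` maps label -> list of samples, each sample a dict gene -> value.
    """
    centroids = {label: [mean(s[g] for s in samples) for g in genes]
                 for label, samples in train.items()}

    def predict(sample):
        point = [sample[g] for g in genes]
        return min(centroids, key=lambda lab: math.dist(point, centroids[lab]))

    return predict


# Toy datasets, invented for illustration only.
ds1 = {"GENE_A": ([8.1, 7.9, 8.4], [5.0, 5.2, 4.9]),
       "GENE_B": ([3.0, 3.1, 2.9], [3.0, 3.2, 2.8]),
       "GENE_C": ([2.0, 2.2, 1.9], [4.1, 4.0, 4.3])}
ds2 = {"GENE_A": ([7.5, 7.8, 7.6, 7.7], [5.1, 5.3, 5.0, 5.2]),
       "GENE_B": ([3.2, 2.9, 3.1, 3.0], [3.1, 3.0, 2.9, 3.2]),
       "GENE_C": ([2.5, 2.3, 2.4, 2.6], [4.0, 4.2, 3.9, 4.1]),
       "GENE_D": ([1.0, 1.1, 0.9, 1.2], [1.0, 1.2, 1.1, 0.9])}  # only in ds2

tables = [per_dataset_scores(ds1), per_dataset_scores(ds2)]
weights = [math.sqrt(6), math.sqrt(8)]  # square root of each dataset's sample count
combined = stouffer_combine(tables, weights)
top_genes = select_top_genes(combined, k=2)

train = {"tumour": [{"GENE_A": 8.0, "GENE_C": 2.0}, {"GENE_A": 7.6, "GENE_C": 2.4}],
         "normal": [{"GENE_A": 5.0, "GENE_C": 4.1}, {"GENE_A": 5.2, "GENE_C": 4.0}]}
predict = nearest_centroid(train, top_genes)
print(top_genes, predict({"GENE_A": 7.9, "GENE_C": 2.1}))
```

In this toy setup a gene must appear in every dataset to enter the meta-analysis, which is one common but not universal design choice; a real analysis would also derive z-scores from proper p-values and validate the classifier on held-out samples.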