RNA-Seq-Based Breast Cancer Subtypes Classification Using Machine Learning Approaches
Background. Breast invasive carcinoma (BRCA) is not a single disease, as each subtype has a distinct morphological structure. Although several computational methods have been proposed for breast cancer subtype identification, the specific interaction mechanisms of genes involved in the subtypes a...
Main Authors: | Zhezhou Yu, Zhuo Wang, Xiangchun Yu, Zhe Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | Hindawi Limited, 2020-01-01 |
Series: | Computational Intelligence and Neuroscience |
Online Access: | http://dx.doi.org/10.1155/2020/4737969 |
id |
doaj-bb553df3de0040eea2a002f29e630c2a |
---|---|
record_format |
Article |
spelling |
doaj-bb553df3de0040eea2a002f29e630c2a; 2020-11-25T04:06:19Z; eng; Hindawi Limited; Computational Intelligence and Neuroscience; 1687-5273; 2020-01-01; 2020; 10.1155/2020/4737969; 4737969; RNA-Seq-Based Breast Cancer Subtypes Classification Using Machine Learning Approaches; Zhezhou Yu (College of Computer Science and Technology); Zhuo Wang (College of Computer Science and Technology); Xiangchun Yu (College of Computer Science and Technology); Zhe Zhang (College of Computer Science and Technology); http://dx.doi.org/10.1155/2020/4737969 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Zhezhou Yu, Zhuo Wang, Xiangchun Yu, Zhe Zhang |
author_sort |
Zhezhou Yu |
title |
RNA-Seq-Based Breast Cancer Subtypes Classification Using Machine Learning Approaches |
publisher |
Hindawi Limited |
series |
Computational Intelligence and Neuroscience |
issn |
1687-5273 |
publishDate |
2020-01-01 |
description |
Background. Breast invasive carcinoma (BRCA) is not a single disease, as each subtype has a distinct morphological structure. Although several computational methods have been proposed for breast cancer subtype identification, the specific interaction mechanisms of the genes involved in each subtype are still incompletely understood. Identifying and exploring these interaction mechanisms for each subtype could have an important impact on personalized treatment. Methods. We integrate the biological importance of genes, derived from gene regulatory networks, into the differential expression analysis to obtain weighted differentially expressed genes (weighted DEGs). A gene with a high weight regulates more target genes and is therefore considered more biologically important. In addition, we constructed gene coexpression networks for the control and experiment groups; the significantly different interaction structures between them motivated us to design a Gene Ontology (GO) enrichment analysis based on gene coexpression networks (GOEGCN). GOEGCN performs a two-sided distinction analysis between the coexpression networks of the control and experiment groups, which allows us to study how the modulated coexpressed gene pairs affect biological functions at the GO level. Results. We built a binary classifier with weighted DEGs for each subtype. The binary classifiers made good predictions on unseen samples, and the experimental results validated the effectiveness of the proposed approaches. The novel enriched GO terms identified by GOEGCN for the control and experiment groups of each subtype explain, to some extent, the specific changes in biological function in terms of the two-sided distinction between coexpression network structures. Conclusion. The weighted DEGs carry biological importance derived from the gene regulatory network. Based on the weighted DEGs, five binary classifiers were learned and showed good performance on the Sensitivity, Specificity, Accuracy, F1, and AUC metrics. GOEGCN with weighted DEGs for the control and experiment groups produced novel GO enrichment analysis results, and the novel enriched GO terms may, to some extent, further reveal the changes in specific biological functions among the BRCA subtypes. The R code for this research is available at https://github.com/yxchspring/GOEGCN_BRCA_Subtypes. |
url |
http://dx.doi.org/10.1155/2020/4737969 |
_version_ |
1715049259271716864 |
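
The five metrics named in the description (Sensitivity, Specificity, Accuracy, F1, and AUC) can be illustrated with a minimal Python sketch. This is not the authors' R code, which is in the repository linked in the description. The function name `binary_metrics`, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration only. Label 1 is assumed to mark the experiment group and label 0 the control group.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_score: Sequence[float],
                   threshold: float = 0.5) -> Dict[str, float]:
    """Compute five standard metrics for one binary classifier.

    y_true  : ground-truth labels (1 = experiment, 0 = control; assumed coding)
    y_score : predicted scores or probabilities for label 1
    threshold : hypothetical cut-off used to turn scores into hard labels
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    accuracy = (tp + tn) / len(y_true) if len(y_true) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0

    # AUC as the probability that a random positive sample scores higher
    # than a random negative sample (ties count as one half).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if pos and neg:
        wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
                   for p in pos for n in neg)
        auc = wins / (len(pos) * len(neg))
    else:
        auc = float("nan")  # undefined when one class is absent

    return {
        "Sensitivity": sensitivity,
        "Specificity": specificity,
        "Accuracy": accuracy,
        "F1": f1,
        "AUC": auc,
    }


if __name__ == "__main__":
    # Toy data, invented for illustration; not taken from the paper.
    labels = [1, 1, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
    for name, value in binary_metrics(labels, scores).items():
        print(f"{name}: {value:.3f}")
```

The pairwise AUC computation is quadratic in the number of samples, which is acceptable for a small illustration; a rank-based implementation would be preferable for large cohorts.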