Exploration of predictive and prognostic alternative splicing signatures in lung adenocarcinoma using machine learning methods
Abstract. Background: Alternative splicing (AS) plays critical roles in generating protein diversity and complexity. Dysregulation of AS underlies the initiation and progression of tumors. Machine learning approaches have emerged as efficient tools to identify promising biomarkers. It is meaningful to...
Main Authors: | Qidong Cai, Boxue He, Pengfei Zhang, Zhenyu Zhao, Xiong Peng, Yuqian Zhang, Hui Xie, Xiang Wang |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2020-12-01 |
Series: | Journal of Translational Medicine |
Subjects: | Lung adenocarcinoma; Alternative splicing; Random forests; Splicing switch; Metastasis; Prognosis |
Online Access: | https://doi.org/10.1186/s12967-020-02635-y |
id |
doaj-29a314920dba4008a0a4865dc55a70a8 |
---|---|
record_format |
Article |
spelling |
doaj-29a314920dba4008a0a4865dc55a70a8 2020-12-07T19:32:59Z eng BMC Journal of Translational Medicine 1479-5876 2020-12-01 181115 10.1186/s12967-020-02635-y Exploration of predictive and prognostic alternative splicing signatures in lung adenocarcinoma using machine learning methods. Qidong Cai, Boxue He, Pengfei Zhang, Zhenyu Zhao, Xiong Peng, Yuqian Zhang, Hui Xie, Xiang Wang (all authors: Department of Thoracic Surgery, The Second Xiangya Hospital, Central South University). Abstract. Background: Alternative splicing (AS) plays critical roles in generating protein diversity and complexity. Dysregulation of AS underlies the initiation and progression of tumors. Machine learning approaches have emerged as efficient tools to identify promising biomarkers. It is meaningful to explore pivotal AS events (ASEs) to deepen understanding and improve prognostic assessments of lung adenocarcinoma (LUAD) via machine learning algorithms. Method: RNA sequencing data and AS data were extracted from The Cancer Genome Atlas (TCGA) database and the TCGA SpliceSeq database. Using several machine learning methods, we identified 24 pairs of LUAD-related ASEs implicated in splicing switches and a random forest-based classifier for identifying lymph node metastasis (LNM) consisting of 12 ASEs. Furthermore, we identified key prognosis-related ASEs and established a 16-ASE-based prognostic model to predict overall survival for LUAD patients using a Cox regression model, random survival forest analysis, and a forward selection model. Bioinformatics analyses were also applied to identify underlying mechanisms and associated upstream splicing factors (SFs). Results: Each pair of ASEs was spliced from the same parent gene and exhibited perfect inverse intrapair correlation (correlation coefficient = −1). The 12-ASE-based classifier showed robust ability to evaluate the LNM status of LUAD patients, with an area under the receiver operating characteristic (ROC) curve (AUC) above 0.7 in fivefold cross-validation. The prognostic model performed well at 1, 3, 5, and 10 years in both the training cohort and the internal test cohort. Univariate and multivariate Cox regression indicated that the prognostic model could be used as an independent prognostic factor for patients with LUAD. Further analysis revealed correlations between the prognostic model and American Joint Committee on Cancer stage, T stage, N stage, and living status. The splicing network constructed from survival-related SFs and ASEs depicts the regulatory relationships between them. Conclusion: In summary, our study provides insight into LUAD research and management based on these AS biomarkers. https://doi.org/10.1186/s12967-020-02635-y Lung adenocarcinoma; Alternative splicing; Random forests; Splicing switch; Metastasis; Prognosis |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Qidong Cai Boxue He Pengfei Zhang Zhenyu Zhao Xiong Peng Yuqian Zhang Hui Xie Xiang Wang |
spellingShingle |
Qidong Cai Boxue He Pengfei Zhang Zhenyu Zhao Xiong Peng Yuqian Zhang Hui Xie Xiang Wang Exploration of predictive and prognostic alternative splicing signatures in lung adenocarcinoma using machine learning methods Journal of Translational Medicine Lung adenocarcinoma Alternative splicing Random forests Splicing switch Metastasis Prognosis |
author_sort |
Qidong Cai |
title |
Exploration of predictive and prognostic alternative splicing signatures in lung adenocarcinoma using machine learning methods |
title_sort |
exploration of predictive and prognostic alternative splicing signatures in lung adenocarcinoma using machine learning methods |
publisher |
BMC |
series |
Journal of Translational Medicine |
issn |
1479-5876 |
publishDate |
2020-12-01 |
description |
Abstract. Background: Alternative splicing (AS) plays critical roles in generating protein diversity and complexity. Dysregulation of AS underlies the initiation and progression of tumors. Machine learning approaches have emerged as efficient tools to identify promising biomarkers. It is meaningful to explore pivotal AS events (ASEs) to deepen understanding and improve prognostic assessments of lung adenocarcinoma (LUAD) via machine learning algorithms. Method: RNA sequencing data and AS data were extracted from The Cancer Genome Atlas (TCGA) database and the TCGA SpliceSeq database. Using several machine learning methods, we identified 24 pairs of LUAD-related ASEs implicated in splicing switches and a random forest-based classifier for identifying lymph node metastasis (LNM) consisting of 12 ASEs. Furthermore, we identified key prognosis-related ASEs and established a 16-ASE-based prognostic model to predict overall survival for LUAD patients using a Cox regression model, random survival forest analysis, and a forward selection model. Bioinformatics analyses were also applied to identify underlying mechanisms and associated upstream splicing factors (SFs). Results: Each pair of ASEs was spliced from the same parent gene and exhibited perfect inverse intrapair correlation (correlation coefficient = −1). The 12-ASE-based classifier showed robust ability to evaluate the LNM status of LUAD patients, with an area under the receiver operating characteristic (ROC) curve (AUC) above 0.7 in fivefold cross-validation. The prognostic model performed well at 1, 3, 5, and 10 years in both the training cohort and the internal test cohort. Univariate and multivariate Cox regression indicated that the prognostic model could be used as an independent prognostic factor for patients with LUAD. Further analysis revealed correlations between the prognostic model and American Joint Committee on Cancer stage, T stage, N stage, and living status. The splicing network constructed from survival-related SFs and ASEs depicts the regulatory relationships between them. Conclusion: In summary, our study provides insight into LUAD research and management based on these AS biomarkers. |
topic |
Lung adenocarcinoma Alternative splicing Random forests Splicing switch Metastasis Prognosis |
url |
https://doi.org/10.1186/s12967-020-02635-y |
_version_ |
1724397242367868928 |