ELMO: An Efficient Logistic Regression-Based Multi-Omic Integrated Analysis Method for Breast Cancer Intrinsic Subtypes

Bibliographic Details
Main Authors: Yexian Zhang, Ruoyao Shi, Chaorong Chen, Meiyu Duan, Shuai Liu, Yanjiao Ren, Lan Huang, Xiaofeng Dai, Fengfeng Zhou
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Breast cancer; intrinsic subtypes; multi-omics; feature selection
Online Access: https://ieeexplore.ieee.org/document/8935338/
id doaj-1244a452ea6f418090560c56548332f9
record_format Article
spelling doaj-1244a452ea6f418090560c56548332f9
Record timestamp: 2021-03-30T01:13:20Z
Language: eng
Publisher: IEEE
Journal: IEEE Access, ISSN 2169-3536
Published: 2020-01-01, vol. 8, pp. 5121-5130
DOI: 10.1109/ACCESS.2019.2960373
IEEE document number: 8935338
Authors, ORCID iDs, and affiliations (in record order):
0. Yexian Zhang, https://orcid.org/0000-0002-0287-4065, BioKnow Health Informatics Lab, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
1. Ruoyao Shi, https://orcid.org/0000-0001-8957-1712, BioKnow Health Informatics Lab, College of Life Sciences, Jilin University, Changchun, China
2. Chaorong Chen, https://orcid.org/0000-0001-5492-4426, BioKnow Health Informatics Lab, College of Software, Jilin University, Changchun, China
3. Meiyu Duan, https://orcid.org/0000-0001-7171-2695, BioKnow Health Informatics Lab, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
4. Shuai Liu, https://orcid.org/0000-0003-2867-4683, BioKnow Health Informatics Lab, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
5. Yanjiao Ren, https://orcid.org/0000-0003-4393-2505, College of Information Technology, Jilin Agricultural University, Changchun, China
6. Lan Huang, https://orcid.org/0000-0003-3233-3777, BioKnow Health Informatics Lab, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
7. Xiaofeng Dai, https://orcid.org/0000-0001-5323-7886, Wuxi School of Medicine, Jiangnan University, Wuxi, China
8. Fengfeng Zhou, https://orcid.org/0000-0002-8108-6007, BioKnow Health Informatics Lab, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
collection DOAJ
language English
format Article
sources DOAJ
author Yexian Zhang
Ruoyao Shi
Chaorong Chen
Meiyu Duan
Shuai Liu
Yanjiao Ren
Lan Huang
Xiaofeng Dai
Fengfeng Zhou
author_sort Yexian Zhang
title ELMO: An Efficient Logistic Regression-Based Multi-Omic Integrated Analysis Method for Breast Cancer Intrinsic Subtypes
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description Breast cancer is one of the most frequently occurring cancers in women and a major cause of death among women worldwide. It is heterogeneous in both molecular characteristics and clinical outcomes across its molecular subtypes. High-throughput technologies have enabled the rapid accumulation of multi-omic data for cancer patients, and these data pose a computational challenge for efficient integrated analysis. Existing studies have usually investigated differential representation or machine learning problems using a single type of omic data. This study hypothesized that different omic types contribute complementary information, and that their integrated analysis may improve on single-omic models. An efficient logistic regression-based multi-omic integrated analysis method (ELMO) was proposed to integrate RNA-seq and DNA methylation data to detect breast cancer intrinsic subtypes. ELMO achieved the highest accuracy with fewer features than the existing filter and wrapper feature selection methods evaluated in this study. The experimental data supported the hypothesis that multi-omic models outperform single-omic ones.
topic Breast cancer
intrinsic subtypes
multi-omics
feature selection
url https://ieeexplore.ieee.org/document/8935338/