Data integration by multi-tuning parameter elastic net regression

Abstract. Background: To integrate molecular features from multiple high-throughput platforms in prediction, a regression model that penalizes features from all platforms equally is commonly used. However, data from different platforms are likely to differ in effect sizes, the proportion of predictive...


Bibliographic Details
Main Authors:	Jie Liu, Gangning Liang, Kimberly D Siegmund, Juan Pablo Lewinger
Format:	Article
Language:	English
Published:	BMC 2018-10-01
Series:	BMC Bioinformatics
Subjects:	Data integration Classification Elastic net
Online Access:	http://link.springer.com/article/10.1186/s12859-018-2401-1

id	doaj-b23a285ad1b54a30b6208477ddc37f78
record_format	Article
spelling	doaj-b23a285ad1b54a30b6208477ddc37f78; 2020-11-25T02:45:11Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2018-10-01; 19119; 10.1186/s12859-018-2401-1; Data integration by multi-tuning parameter elastic net regression; Jie Liu (0), Gangning Liang (1), Kimberly D Siegmund (2), Juan Pablo Lewinger (3); (0) Department of Preventive Medicine, USC Keck School of Medicine; (1) USC Institute of Urology and the Catherine & Joseph Aresty Department of Urology, Norris Comprehensive Cancer Center, University of Southern California; (2) Department of Preventive Medicine, USC Keck School of Medicine; (3) Department of Preventive Medicine, USC Keck School of Medicine; http://link.springer.com/article/10.1186/s12859-018-2401-1; Data integration; Classification; Elastic net
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Jie Liu Gangning Liang Kimberly D Siegmund Juan Pablo Lewinger
title	Data integration by multi-tuning parameter elastic net regression
publisher	BMC
series	BMC Bioinformatics
issn	1471-2105
publishDate	2018-10-01
description	Abstract. Background: To integrate molecular features from multiple high-throughput platforms in prediction, a regression model that penalizes features from all platforms equally is commonly used. However, data from different platforms are likely to differ in effect sizes, the proportion of predictive features, and correlation structures. Subtle but important features may be missed by shrinking all features equally. Results: We propose an elastic net (EN) model with a separate penalty tuning parameter for each platform that is fit using standard software. In a comprehensive simulation study, we evaluated the performance of EN logistic regression with multiple tuning parameters. We found that when the number of informative features differs among the platforms, and when there is no notable correlation between the features from different platforms, the multi-tuning parameter EN yields more predictive models. Moreover, the multi-tuning parameter EN is robust, in the sense that there is no loss of predictivity relative to a single tuning parameter EN when features across all platforms have similar effects. We also investigated the performance of the multi-tuning parameter EN using real cancer datasets. Conclusion: The proposed multi-tuning parameter EN model, fit using standard penalized regression software, can achieve better prediction in sample classification when integrating multiple genomic platforms, compared to the traditional method in which a single penalty parameter is used for all features across platforms.
topic	Data integration Classification Elastic net
url	http://link.springer.com/article/10.1186/s12859-018-2401-1
_version_	1724763602248794112
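
The idea in the abstract, an elastic net logistic regression in which each platform gets its own penalty, can be illustrated with a small sketch. This is not the authors' implementation: the abstract says the model is fit with standard penalized regression software, whereas the code below is a hand-written proximal-gradient loop in plain Python, chosen only so the example is self-contained. The objective is assumed to be the mean logistic loss plus, for each feature j in platform k, a penalty lambdas[k] * (alpha * |b_j| + (1 - alpha) / 2 * b_j^2). The function name `fit_multi_penalty_en`, its parameters, and the simulated two-platform data are all hypothetical.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def soft_threshold(z, t):
    """Proximal operator of t * |x| (the lasso part of the penalty)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def fit_multi_penalty_en(X, y, platform, lambdas, alpha=0.5, step=0.1, n_iter=300):
    """Hypothetical helper: logistic regression with a per-platform elastic net penalty.

    platform[j] is the platform index of feature j; lambdas[k] is the
    tuning parameter applied to every feature of platform k (assumed form).
    """
    n, p = len(X), len(X[0])
    intercept, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        # Residuals (fitted probability minus label) at the current estimate.
        resid = [sigmoid(intercept + sum(b * x for b, x in zip(beta, row))) - yi
                 for row, yi in zip(X, y)]
        intercept -= step * sum(resid) / n  # the intercept is not penalized
        for j in range(p):
            lam = lambdas[platform[j]]
            grad = sum(r * row[j] for r, row in zip(resid, X)) / n
            # Gradient step on loss + ridge term, then soft-threshold for the L1 term.
            z = beta[j] - step * (grad + lam * (1.0 - alpha) * beta[j])
            beta[j] = soft_threshold(z, step * lam * alpha)
    return intercept, beta


# Simulated data (an assumption for illustration): two platforms with three
# features each; only feature 0, in platform 0, is informative.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(6)] for _ in range(200)]
y = [1 if random.random() < sigmoid(2.0 * row[0]) else 0 for row in X]
platform = [0, 0, 0, 1, 1, 1]

# Light penalty on platform 0, heavy penalty on platform 1.
_, b_multi = fit_multi_penalty_en(X, y, platform, lambdas=[0.01, 10.0])
# Single heavy penalty shared by both platforms.
_, b_single = fit_multi_penalty_en(X, y, platform, lambdas=[10.0, 10.0])
print("multi-penalty coefficients: ", b_multi)
print("single-penalty coefficients:", b_single)
```

In this toy setting the heavy penalty removes the uninformative platform entirely, while the informative platform keeps its signal; a single heavy penalty shared by both platforms shrinks everything to zero. In practice the per-platform values would be chosen by cross-validation over a grid, which this sketch does not do.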

Data integration by multi-tuning parameter elastic net regression

Similar Items