Data integration by multi-tuning parameter elastic net regression
Abstract. Background: To integrate molecular features from multiple high-throughput platforms in prediction, a regression model that penalizes features from all platforms equally is commonly used. However, data from different platforms are likely to differ in effect sizes, the proportion of predictive...
Main Authors: | Jie Liu, Gangning Liang, Kimberly D Siegmund, Juan Pablo Lewinger |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2018-10-01 |
Series: | BMC Bioinformatics |
Subjects: | Data integration, Classification, Elastic net |
Online Access: | http://link.springer.com/article/10.1186/s12859-018-2401-1 |
id |
doaj-b23a285ad1b54a30b6208477ddc37f78 |
---|---|
record_format |
Article |
spelling |
doaj-b23a285ad1b54a30b6208477ddc37f78 2020-11-25T02:45:11Z eng BMC BMC Bioinformatics 1471-2105 2018-10-01 19119 10.1186/s12859-018-2401-1
Data integration by multi-tuning parameter elastic net regression
Jie Liu (Department of Preventive Medicine, USC Keck School of Medicine); Gangning Liang (USC Institute of Urology and the Catherine & Joseph Aresty Department of Urology, Norris Comprehensive Cancer Center, University of Southern California); Kimberly D Siegmund (Department of Preventive Medicine, USC Keck School of Medicine); Juan Pablo Lewinger (Department of Preventive Medicine, USC Keck School of Medicine)
Abstract. Background: To integrate molecular features from multiple high-throughput platforms in prediction, a regression model that penalizes features from all platforms equally is commonly used. However, data from different platforms are likely to differ in effect sizes, the proportion of predictive features, and correlation structures. Subtle but important features may be missed by shrinking all features equally. Results: We propose an elastic net (EN) model with a separate tuning parameter (penalty) for each platform that is fit using standard software. In a comprehensive simulation study, we evaluated the performance of EN logistic regression with multiple tuning penalties. We found that when the number of informative features differs among the platforms, and when there is no notable correlation between the features from different platforms, the multi-tuning parameter EN yields more predictive models. Moreover, the multi-tuning parameter EN is robust, in the sense that there is no loss of predictivity relative to a single tuning parameter EN when features across all platforms have similar effects. We also investigated the performance of multi-tuning parameter EN using real cancer datasets. Conclusion: The proposed multi-tuning parameter EN model, fit using standard penalized regression software, can achieve better prediction in sample classification when integrating multiple genomic platforms, compared to the traditional method where a single penalty parameter is used for all features in different platforms.
http://link.springer.com/article/10.1186/s12859-018-2401-1
Data integration; Classification; Elastic net |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Jie Liu Gangning Liang Kimberly D Siegmund Juan Pablo Lewinger |
author_sort |
Jie Liu |
title |
Data integration by multi-tuning parameter elastic net regression |
publisher |
BMC |
series |
BMC Bioinformatics |
issn |
1471-2105 |
publishDate |
2018-10-01 |
description |
Abstract. Background: To integrate molecular features from multiple high-throughput platforms in prediction, a regression model that penalizes features from all platforms equally is commonly used. However, data from different platforms are likely to differ in effect sizes, the proportion of predictive features, and correlation structures. Subtle but important features may be missed by shrinking all features equally. Results: We propose an elastic net (EN) model with a separate tuning parameter (penalty) for each platform that is fit using standard software. In a comprehensive simulation study, we evaluated the performance of EN logistic regression with multiple tuning penalties. We found that when the number of informative features differs among the platforms, and when there is no notable correlation between the features from different platforms, the multi-tuning parameter EN yields more predictive models. Moreover, the multi-tuning parameter EN is robust, in the sense that there is no loss of predictivity relative to a single tuning parameter EN when features across all platforms have similar effects. We also investigated the performance of multi-tuning parameter EN using real cancer datasets. Conclusion: The proposed multi-tuning parameter EN model, fit using standard penalized regression software, can achieve better prediction in sample classification when integrating multiple genomic platforms, compared to the traditional method where a single penalty parameter is used for all features in different platforms. |
topic |
Data integration Classification Elastic net |
url |
http://link.springer.com/article/10.1186/s12859-018-2401-1 |
_version_ |
1724763602248794112 |
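
The idea summarized in the abstract, an elastic net logistic regression in which each platform gets its own penalty strength, can be illustrated with a minimal sketch. This is not the authors' implementation (the record only says the model is fit with standard penalized regression software). The function name `fit_multi_penalty_en`, the `groups` and `group_lambdas` arguments, the plain proximal-gradient solver, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def fit_multi_penalty_en(X, y, groups, group_lambdas, alpha=0.5, n_iter=1000):
    """Elastic net logistic regression with one penalty strength per platform.

    Hypothetical helper. Minimizes the mean logistic loss plus
    sum_j lam_j * (alpha * |b_j| + (1 - alpha) / 2 * b_j**2),
    where lam_j is the penalty of the platform that feature j belongs to.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    # Expand the per-platform penalties to one value per feature.
    lam = np.asarray(group_lambdas, dtype=float)[np.asarray(groups)]
    l1 = alpha * lam            # lasso part, handled by soft-thresholding
    l2 = (1.0 - alpha) * lam    # ridge part, handled in the gradient
    # Step size from a Lipschitz bound on the smooth part (intercept included).
    X1 = np.hstack([X, np.ones((n, 1))])
    step = 1.0 / (np.linalg.norm(X1, 2) ** 2 / (4.0 * n) + l2.max() + 1e-12)
    beta = np.zeros(p)
    b0 = 0.0
    for _ in range(n_iter):
        z = np.clip(X @ beta + b0, -30.0, 30.0)
        resid = 1.0 / (1.0 + np.exp(-z)) - y
        grad = X.T @ resid / n + l2 * beta
        b0 -= step * resid.mean()          # the intercept is not penalized
        u = beta - step * grad
        # Proximal step for the weighted L1 term (soft-thresholding).
        beta = np.sign(u) * np.maximum(np.abs(u) - step * l1, 0.0)
    return b0, beta


# Simulated example: two "platforms" of 10 features each; only the first
# platform carries signal, so it gets a light penalty and the second a heavy one.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))
logit = X[:, :3] @ np.array([2.0, 2.0, 2.0])
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-logit))).astype(float)
groups = np.repeat([0, 1], 10)
b0, beta = fit_multi_penalty_en(X, y, groups, group_lambdas=[0.01, 10.0])
print(np.count_nonzero(beta[:10]), np.count_nonzero(beta[10:]))
```

In practice the per-platform penalties would be chosen by cross-validation over a grid of candidate values rather than fixed by hand as in this sketch.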