The partitioned LASSO-patternsearch algorithm with application to gene expression data
Abstract. Background: In systems biology, the task of reverse engineering gene pathways from data has been limited not just by the curse of dimensionality (the interaction space is huge) but also by systematic error in the data. The gene expression barcod...
Main Authors: | Shi Weiliang, Wahba Grace, Irizarry Rafael A, Bravo Hector, Wright Stephen J |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2012-05-01 |
Series: | BMC Bioinformatics |
Online Access: | http://www.biomedcentral.com/1471-2105/13/98 |
id |
doaj-548d93c3501b4c8db169d21d3ff3517c |
---|---|
record_format |
Article |
spelling |
doaj-548d93c3501b4c8db169d21d3ff3517c 2020-11-25T00:26:06Z eng BMC BMC Bioinformatics 1471-2105 2012-05-01 13 1 98 10.1186/1471-2105-13-98 The partitioned LASSO-patternsearch algorithm with application to gene expression data Shi Weiliang Wahba Grace Irizarry Rafael A Bravo Hector Wright Stephen J Abstract. Background: In systems biology, the task of reverse engineering gene pathways from data has been limited not just by the curse of dimensionality (the interaction space is huge) but also by systematic error in the data. The gene expression barcode reduces spurious association driven by batch effects and probe effects. The binary nature of the resulting expression calls lends itself perfectly to modern regularization approaches that thrive in high-dimensional settings. Results: The Partitioned LASSO-Patternsearch algorithm is proposed to identify patterns of multiple dichotomous risk factors for outcomes of interest in genomic studies. A partitioning scheme is used to identify promising patterns by solving many LASSO-Patternsearch subproblems in parallel. All variables that survive this stage proceed to an aggregation stage where the most significant patterns are identified by solving a reduced LASSO-Patternsearch problem in just these variables. This approach was applied to genetic data sets with expression levels dichotomized by the gene expression barcode. Most of the genes and second-order interactions thus selected are known to be related to the outcomes. Conclusions: We demonstrate with simulations and data analyses that the proposed method not only selects variables and patterns more accurately, but also provides smaller models with better prediction accuracy, in comparison to several alternative methodologies. http://www.biomedcentral.com/1471-2105/13/98 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Shi Weiliang Wahba Grace Irizarry Rafael A Bravo Hector Wright Stephen J |
author_sort |
Shi Weiliang |
title |
The partitioned LASSO-patternsearch algorithm with application to gene expression data |
title_sort |
partitioned lasso-patternsearch algorithm with application to gene expression data |
publisher |
BMC |
series |
BMC Bioinformatics |
issn |
1471-2105 |
publishDate |
2012-05-01 |
description |
Abstract. Background: In systems biology, the task of reverse engineering gene pathways from data has been limited not just by the curse of dimensionality (the interaction space is huge) but also by systematic error in the data. The gene expression barcode reduces spurious association driven by batch effects and probe effects. The binary nature of the resulting expression calls lends itself perfectly to modern regularization approaches that thrive in high-dimensional settings. Results: The Partitioned LASSO-Patternsearch algorithm is proposed to identify patterns of multiple dichotomous risk factors for outcomes of interest in genomic studies. A partitioning scheme is used to identify promising patterns by solving many LASSO-Patternsearch subproblems in parallel. All variables that survive this stage proceed to an aggregation stage where the most significant patterns are identified by solving a reduced LASSO-Patternsearch problem in just these variables. This approach was applied to genetic data sets with expression levels dichotomized by the gene expression barcode. Most of the genes and second-order interactions thus selected are known to be related to the outcomes. Conclusions: We demonstrate with simulations and data analyses that the proposed method not only selects variables and patterns more accurately, but also provides smaller models with better prediction accuracy, in comparison to several alternative methodologies. |
url |
http://www.biomedcentral.com/1471-2105/13/98 |
_version_ |
1725346045310795776 |
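
The two-stage scheme summarized in the abstract (screen patterns within partitions, then refit on the surviving variables) can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not specify the partitioning rule, the penalty level, or the solver, so everything here is an assumption made for illustration: contiguous blocks of variables, patterns limited to main effects and second-order products, an L1-penalized logistic loss, and a plain proximal-gradient (ISTA) loop. The names `soft_threshold`, `lasso_pattern_fit`, `partitioned_search`, `block_size`, and `lam` are hypothetical.

```python
import math
from itertools import combinations


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal map of the L1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_pattern_fit(rows, labels, variables, lam, n_iter=500):
    """L1-penalized logistic regression over patterns of the given variables.

    A pattern is a tuple of variable indices; its basis function is the
    product of those binary variables. Only orders 1 and 2 are built here
    (an assumption). Labels must contain both classes.
    Returns {pattern: coefficient} for coefficients that are nonzero.
    """
    patterns = [(v,) for v in variables] + list(combinations(variables, 2))
    if not patterns:
        return {}
    design = [[float(all(row[v] for v in pat)) for pat in patterns]
              for row in rows]
    n, k = len(rows), len(patterns)
    mean_y = sum(labels) / n
    intercept = math.log(mean_y / (1.0 - mean_y))  # null-model optimum
    coefs = [0.0] * k
    # Step 1/L: each row (with intercept) has squared norm at most 1 + k.
    step = 4.0 / (1.0 + k)
    for _ in range(n_iter):
        grad_b, grad = 0.0, [0.0] * k
        for feats, y in zip(design, labels):
            z = intercept + sum(c * f for c, f in zip(coefs, feats))
            resid = 1.0 / (1.0 + math.exp(-z)) - y
            grad_b += resid / n
            for j, f in enumerate(feats):
                if f:
                    grad[j] += resid / n
        intercept -= step * grad_b  # the intercept is not penalized
        coefs = [soft_threshold(c - step * g, step * lam)
                 for c, g in zip(coefs, grad)]
    return {pat: c for pat, c in zip(patterns, coefs) if c != 0.0}


def partitioned_search(rows, labels, n_vars, block_size, lam):
    """Stage 1: screen within blocks. Stage 2: refit on the survivors."""
    blocks = [list(range(start, min(start + block_size, n_vars)))
              for start in range(0, n_vars, block_size)]
    survivors = set()
    for block in blocks:  # independent subproblems; could run in parallel
        for pattern in lasso_pattern_fit(rows, labels, block, lam):
            survivors.update(pattern)
    return lasso_pattern_fit(rows, labels, sorted(survivors), lam)


if __name__ == "__main__":
    from itertools import product

    # Toy data: every 0/1 combination of six variables; the outcome is 1
    # exactly when variables 0 and 1 are both "expressed".
    rows = list(product([0, 1], repeat=6))
    labels = [r[0] & r[1] for r in rows]
    print(partitioned_search(rows, labels, n_vars=6, block_size=3, lam=0.1))
```

In this toy setup the second block (variables 3 to 5) carries no signal, so none of its variables reach the aggregation stage, and the final fit keeps the pattern `(0, 1)`. One limitation of the contiguous-block assumption is that an interaction whose two variables fall in different blocks can only be recovered if both variables survive stage 1 on their own.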