Identifying Prognostic SNPs in Clinical Cohorts: Complementing Univariate Analyses by Resampling and Multivariable Modeling.
Clinical cohorts with time-to-event endpoints are increasingly characterized by measurements of single nucleotide polymorphisms (SNPs), whose number is an order of magnitude larger than the number of measurements typically considered at the gene level. At the same time, the size of clinical cohorts often is st...
Main Authors: | Stefanie Hieke, Axel Benner, Richard F Schlenk, Martin Schumacher, Lars Bullinger, Harald Binder |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2016-01-01 |
Series: | PLoS ONE |
Online Access: | http://europepmc.org/articles/PMC4861340?pdf=render |
id |
doaj-888780320d2a42caa8b2af5373c205c2 |
---|---|
record_format |
Article |
spelling |
doaj-888780320d2a42caa8b2af5373c205c2; 2020-11-24T21:50:24Z; eng; Public Library of Science (PLoS); PLoS ONE; ISSN 1932-6203; 2016-01-01; 11(5): e0155226; doi:10.1371/journal.pone.0155226 |
collection |
DOAJ |
sources |
DOAJ |
author |
Stefanie Hieke Axel Benner Richard F Schlenk Martin Schumacher Lars Bullinger Harald Binder |
title |
Identifying Prognostic SNPs in Clinical Cohorts: Complementing Univariate Analyses by Resampling and Multivariable Modeling. |
issn |
1932-6203 |
description |
Clinical cohorts with time-to-event endpoints are increasingly characterized by measurements of single nucleotide polymorphisms (SNPs), whose number is an order of magnitude larger than the number of measurements typically considered at the gene level. At the same time, the size of clinical cohorts is often still limited, calling for novel analysis strategies for identifying potentially prognostic SNPs that can help to better characterize disease processes. We propose such a strategy, drawing on univariate testing ideas from epidemiological case-control studies on the one hand, and multivariable regression techniques developed for gene expression data on the other. In particular, we focus on stable selection of a small set of SNPs and corresponding genes for subsequent validation. For univariate analysis, a permutation-based approach is proposed to test at the gene level. We use regularized multivariable regression models to consider all SNPs simultaneously and select a small set of potentially important prognostic SNPs. Stability is judged by resampling inclusion frequencies for both the univariate and the multivariable approach. The overall strategy is illustrated with data from a cohort of acute myeloid leukemia patients and explored in a simulation study. The multivariable approach is seen to automatically focus on a smaller set of SNPs than the univariate approach, roughly in line with blocks of correlated SNPs. This more targeted extraction of SNPs results in more stable selection at the SNP as well as at the gene level. Thus, within the proposed analysis strategy for SNP data in clinical cohorts, the multivariable regression approach with resampling highlights what regularized regression techniques can add to univariate analyses. |
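The idea of judging stability by resampling inclusion frequencies can be illustrated with a minimal sketch. This is not the authors' implementation. It rests on several assumptions made only for illustration: the outcome is an uncensored continuous value rather than a censored survival time, absolute Pearson correlation stands in for the univariate test or the regularized regression used in the article, and subsamples are drawn without replacement at a fraction of 0.632. All function names and the simulated data are hypothetical.

```python
import random
from statistics import mean


def abs_correlation(x, y):
    """Absolute Pearson correlation; 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def select_top_k(genotypes, outcome, rows, k):
    """Indices of the k SNPs with the largest score on the given rows.

    Assumption: a simple correlation score replaces the article's
    selection procedures, which work on time-to-event data.
    """
    y = [outcome[i] for i in rows]
    scores = []
    for j in range(len(genotypes[0])):
        x = [genotypes[i][j] for i in rows]
        scores.append((abs_correlation(x, y), j))
    scores.sort(reverse=True)
    return {j for _, j in scores[:k]}


def inclusion_frequencies(genotypes, outcome, k=2, n_resamples=100,
                          fraction=0.632, seed=1):
    """Share of subsamples in which each SNP is among the selected k."""
    rng = random.Random(seed)
    n = len(genotypes)
    m = max(2, int(round(fraction * n)))  # subsample size (assumed fraction)
    counts = [0] * len(genotypes[0])
    for _ in range(n_resamples):
        rows = rng.sample(range(n), m)  # subsample without replacement
        for j in select_top_k(genotypes, outcome, rows, k):
            counts[j] += 1
    return [c / n_resamples for c in counts]


def simulate(n=120, n_snps=10, seed=0):
    """Hypothetical data: genotypes coded 0/1/2, only SNP 0 affects the outcome."""
    rng = random.Random(seed)
    genotypes = [[rng.choice([0, 1, 2]) for _ in range(n_snps)]
                 for _ in range(n)]
    outcome = [2.0 * row[0] + rng.gauss(0.0, 1.0) for row in genotypes]
    return genotypes, outcome


if __name__ == "__main__":
    freqs = inclusion_frequencies(*simulate())
    for j, f in enumerate(freqs):
        print(f"SNP {j}: inclusion frequency {f:.2f}")
```

In this toy setting, a SNP with a real effect should be selected in nearly every subsample, while the second slot is shared among noise SNPs, so their inclusion frequencies stay low. Replacing `select_top_k` with a different selection rule leaves the resampling loop unchanged, which is what allows the same stability measure to be applied to both a univariate and a multivariable approach.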