Methods for Predicting an Ordinal Response with High-Throughput Genomic Data

Multigenic diagnostic and prognostic tools can be derived for ordinal clinical outcomes using data from high-throughput genomic experiments. A challenge in this setting is that the number of predictors is much greater than the sample size, so traditional ordinal response modeling techniques must be...


Bibliographic Details
Main Author: Ferber, Kyle L
Format: Others
Published: VCU Scholars Compass 2016
Subjects:
Online Access: http://scholarscompass.vcu.edu/etd/4585
http://scholarscompass.vcu.edu/cgi/viewcontent.cgi?article=5629&context=etd
id ndltd-vcu.edu-oai-scholarscompass.vcu.edu-etd-5629
record_format oai_dc
spelling ndltd-vcu.edu-oai-scholarscompass.vcu.edu-etd-5629 2017-03-17T08:35:25Z Methods for Predicting an Ordinal Response with High-Throughput Genomic Data Ferber, Kyle L Multigenic diagnostic and prognostic tools can be derived for ordinal clinical outcomes using data from high-throughput genomic experiments. A challenge in this setting is that the number of predictors is much greater than the sample size, so traditional ordinal response modeling techniques must be exchanged for more specialized approaches. Existing methods perform well on some datasets, but there is room for improvement in terms of variable selection and predictive accuracy. Therefore, we extended an impressive binary response modeling technique, Feature Augmentation via Nonparametrics and Selection, to the ordinal response setting. Through simulation studies and analyses of high-throughput genomic datasets, we showed that our Ordinal FANS method is sensitive and specific when discriminating between important and unimportant features from the high-dimensional feature space and is highly competitive in terms of predictive accuracy. Discrete survival time is another example of an ordinal response. For many illnesses and chronic conditions, it is impossible to record the precise date and time of disease onset or relapse. Further, the HIPAA Privacy Rule prevents recording of protected health information, which includes all elements of dates (except year), so in the absence of a “limited dataset,” date of diagnosis or date of death is not available for calculating overall survival. Thus, we developed a method that is suitable for modeling high-dimensional discrete survival time data and assessed its performance by conducting a simulation study and by predicting the discrete survival times of acute myeloid leukemia patients using a high-dimensional dataset.
2016-01-01T08:00:00Z text application/pdf http://scholarscompass.vcu.edu/etd/4585 http://scholarscompass.vcu.edu/cgi/viewcontent.cgi?article=5629&context=etd © Kyle L. Ferber Theses and Dissertations VCU Scholars Compass ordinal high-dimensional predictive modeling data genomics
collection NDLTD
format Others
sources NDLTD
topic ordinal
high-dimensional
predictive
modeling
data
genomics
description Multigenic diagnostic and prognostic tools can be derived for ordinal clinical outcomes using data from high-throughput genomic experiments. A challenge in this setting is that the number of predictors is much greater than the sample size, so traditional ordinal response modeling techniques must be exchanged for more specialized approaches. Existing methods perform well on some datasets, but there is room for improvement in terms of variable selection and predictive accuracy. Therefore, we extended an impressive binary response modeling technique, Feature Augmentation via Nonparametrics and Selection, to the ordinal response setting. Through simulation studies and analyses of high-throughput genomic datasets, we showed that our Ordinal FANS method is sensitive and specific when discriminating between important and unimportant features from the high-dimensional feature space and is highly competitive in terms of predictive accuracy. Discrete survival time is another example of an ordinal response. For many illnesses and chronic conditions, it is impossible to record the precise date and time of disease onset or relapse. Further, the HIPAA Privacy Rule prevents recording of protected health information, which includes all elements of dates (except year), so in the absence of a “limited dataset,” date of diagnosis or date of death is not available for calculating overall survival. Thus, we developed a method that is suitable for modeling high-dimensional discrete survival time data and assessed its performance by conducting a simulation study and by predicting the discrete survival times of acute myeloid leukemia patients using a high-dimensional dataset.
author Ferber, Kyle L
author_facet Ferber, Kyle L
author_sort Ferber, Kyle L
title Methods for Predicting an Ordinal Response with High-Throughput Genomic Data
title_sort methods for predicting an ordinal response with high-throughput genomic data
publisher VCU Scholars Compass
publishDate 2016
url http://scholarscompass.vcu.edu/etd/4585
http://scholarscompass.vcu.edu/cgi/viewcontent.cgi?article=5629&context=etd