Variable selection and estimation procedures for high-dimensional survival data
In survival analysis, the popular models are usually well suited to data with few covariates and many observations. In contrast, in many other fields, such as microarray studies, it is necessary in practice to consider the opposite case, where the number of covariates (number of genes) far exceeds the number...
Main Author: | Khan, M. H. R. |
---|---|
Published: | University of Warwick, 2013 |
Subjects: | 519.5 QA Mathematics |
Online Access: | http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.582399 |
id |
ndltd-bl.uk-oai-ethos.bl.uk-582399 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-bl.uk-oai-ethos.bl.uk-582399 2015-12-03T04:14:49Z Variable selection and estimation procedures for high-dimensional survival data Khan, M. H. R. 2013 519.5 QA Mathematics University of Warwick http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.582399 http://wrap.warwick.ac.uk/57484/ Electronic Thesis or Dissertation |
collection |
NDLTD |
sources |
NDLTD |
topic |
519.5 QA Mathematics |
description |
In survival analysis, the popular models are usually well suited to data with few covariates and many observations. In contrast, in many other fields, such as microarray studies, it is necessary in practice to consider the opposite case, where the number of covariates (number of genes) far exceeds the number of observations. However, with such data, accelerated failure time (AFT) models have not received much attention in the variable selection literature. This thesis attempts to meet this need, extending and applying the modern tools of variable selection and estimation to high-dimensional censored data. We introduce two new variable selection strategies for AFT models. The first is based upon regularized weighted least squares and leads to four adaptive elastic net type variable selection approaches: one adaptive elastic net, one weighted elastic net, and two extensions that incorporate censoring constraints into the optimization framework of the methods. The second variable selection strategy is based upon the synthesis of the Buckley–James method and the Dantzig selector, which results in two modified Buckley–James methods and one adaptive Dantzig selector. The adaptive Dantzig selector uses both standard and novel weights, giving rise to three new algorithms. Beyond the variable selection strategies, we focus on two important issues. One is the sensitivity of Stute’s weighted least squares estimator to the censored largest observations when Efron’s tail correction approach violates one of the basic right censoring assumptions. We propose some intuitive imputing approaches for the censored largest observations that allow Efron’s approach to be applied without violating the censoring assumption and, furthermore, generate estimates with reduced mean squared errors and bias. The other issue concerns modifications to the jackknife estimate of bias for Kaplan–Meier estimators. The proposed modifications relax the conditions needed for such bias creation by suitably applying the above imputing methods. It also appears that, without the modifications, the bias of Kaplan–Meier estimators can be badly underestimated by jackknifing. |
author |
Khan, M. H. R. |
title |
Variable selection and estimation procedures for high-dimensional survival data |
publisher |
University of Warwick |
publishDate |
2013 |
url |
http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.582399 |