Variable selection and estimation procedures for high-dimensional survival data

In survival analysis, the popular models are usually well suited to data with few covariates and many observations. In contrast, in fields such as microarray analysis, it is necessary in practice to consider the opposite case, where the number of covariates (number of genes) far exceeds the number...

Bibliographic Details
Main Author: Khan, M. H. R.
Published: University of Warwick 2013
Online Access: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.582399
id ndltd-bl.uk-oai-ethos.bl.uk-582399
record_format oai_dc
spelling ndltd-bl.uk-oai-ethos.bl.uk-582399
2015-12-03T04:14:49Z
Variable selection and estimation procedures for high-dimensional survival data
Khan, M. H. R.
2013
519.5
QA Mathematics
University of Warwick
http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.582399
http://wrap.warwick.ac.uk/57484/
Electronic Thesis or Dissertation
collection NDLTD
sources NDLTD
topic 519.5
QA Mathematics
description In survival analysis, the popular models are usually well suited to data with few covariates and many observations. In contrast, in fields such as microarray analysis, it is necessary in practice to consider the opposite case, where the number of covariates (number of genes) far exceeds the number of observations. However, for such data, accelerated failure time (AFT) models have not received much attention in the variable selection literature. This thesis attempts to meet this need, extending and applying modern tools of variable selection and estimation to high-dimensional censored data. We introduce two new variable selection strategies for AFT models. The first is based on regularized weighted least squares and leads to four adaptive elastic net type variable selection approaches: one adaptive elastic net, one weighted elastic net, and two extensions that incorporate censoring constraints into the optimization framework of the methods. The second strategy is based on a synthesis of the Buckley–James method and the Dantzig selector, and results in two modified Buckley–James methods and one adaptive Dantzig selector. The adaptive Dantzig selector uses both standard and novel weights, giving rise to three new algorithms. Apart from the variable selection strategies, we focus on two important issues. One is the sensitivity of Stute’s weighted least squares estimator to the censored largest observations when Efron’s tail correction approach violates one of the basic right censoring assumptions. We propose some intuitive imputing approaches for the censored largest observations that allow Efron’s approach to be applied without violating the censoring assumption and that, furthermore, generate estimates with reduced mean squared error and bias. The other issue concerns modifications to the jackknife estimate of bias for Kaplan–Meier estimators. The proposed modifications relax the conditions needed for such bias creation by suitably applying the above imputing methods. It also appears that, without the modifications, the bias of Kaplan–Meier estimators can be badly underestimated by jackknifing.
author Khan, M. H. R.
title Variable selection and estimation procedures for high-dimensional survival data
publisher University of Warwick
publishDate 2013
url http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.582399
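
The description's reference to Stute's weighted least squares estimator and Efron's tail correction can be illustrated with a minimal Python sketch. It assumes the usual Kaplan–Meier weights (the jumps of the Kaplan–Meier estimator at the ordered observations) and treats Efron's correction as reclassifying the largest observation as uncensored. The function names `kaplan_meier_weights` and `stute_wls` and the `efron_tail` flag are hypothetical and are not taken from the thesis; the imputation approaches and penalized methods proposed in the thesis are not reproduced here.

```python
import numpy as np


def kaplan_meier_weights(time, event, efron_tail=False):
    """Kaplan-Meier weights (jumps of the KM estimator), in input order.

    time       : observed (possibly log-transformed) times
    event      : 1 if the time is an observed failure, 0 if right-censored
    efron_tail : if True, treat the largest observation as uncensored
                 (assumed form of Efron's tail correction)
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=float)
    n = time.size

    # Sort by time; among tied times, failures come before censored values.
    order = np.lexsort((1.0 - event, time))
    d = event[order].copy()
    if efron_tail:
        d[-1] = 1.0

    ranks = np.arange(1, n + 1)
    # Factor ((n - j) / (n - j + 1)) ** delta_(j) for each ordered observation.
    factors = ((n - ranks) / (n - ranks + 1.0)) ** d
    # Product of the factors over all earlier observations (empty product = 1).
    surv_before = np.concatenate(([1.0], np.cumprod(factors[:-1])))
    w_sorted = d / (n - ranks + 1.0) * surv_before

    # Map the weights back to the original observation order.
    w = np.empty(n)
    w[order] = w_sorted
    return w


def stute_wls(X, log_time, event, efron_tail=True):
    """Weighted least squares fit of an AFT model using Kaplan-Meier weights.

    Returns the coefficient vector with the intercept first.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(log_time, dtype=float)
    w = kaplan_meier_weights(y, event, efron_tail=efron_tail)

    # Scale rows by sqrt(weight) so ordinary least squares minimizes
    # sum_i w_i * (y_i - intercept - x_i' beta) ** 2.
    design = np.column_stack([np.ones(y.size), X])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    return coef
```

For times (1, 2, 3, 4) with event indicators (1, 0, 1, 0), the uncorrected weights are (0.25, 0, 0.375, 0) and sum to 0.625, because the censored largest observation receives no weight. With `efron_tail=True` that observation receives weight 0.375 and the weights sum to 1, which shows why the treatment of a censored largest observation matters for the estimator.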