Prediction and Variable Selection
Main Author: | Dey, Tanujit |
---|---|
Language: | English |
Published: | Case Western Reserve University School of Graduate Studies / OhioLINK, 2008 |
Subjects: | Statistics posterior model |
Online Access: | http://rave.ohiolink.edu/etdc/view?acc_num=case1212581055 |
id |
ndltd-OhioLink-oai-etd.ohiolink.edu-case1212581055 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-OhioLink-oai-etd.ohiolink.edu-case1212581055 2021-08-03T05:32:35Z Prediction and Variable Selection Dey, Tanujit Statistics posterior model <p>Variable selection in linear regression models is an important aspect of many scientific analyses. We review several frequentist model selection techniques in the introductory chapter. Model uncertainty is one of the serious issues related to the model selection problem. One way this issue can be resolved is by using a Bayesian technique called Bayesian model averaging (BMA). In Chapter 2, we discuss BMA techniques and illustrate the ideas with examples. An often-used BMA approach to model selection is based on the so-called highest posterior probability model.</p><p>In Chapter 3 we discuss several asymptotic properties of this model selection technique. Under a spike and slab hierarchy we find that the highest posterior model is total risk consistent for model selection, but that it also possesses some curious properties. Most important of these is a marked underfitting in finite samples, a phenomenon well noted in the literature for Bayesian Information Criterion (BIC) related procedures, but not often associated with highest posterior model selection. We employ a rescaling of the hierarchy and show that the resulting rescaled spike and slab models mitigate the effects of underfitting due to a perfect cancellation of a BIC-like penalty term. By drawing upon an equivalence between the highest posterior model and the median model, we consider the issue of how to calibrate rescaled spike and slab models by looking at their posterior inclusion probabilities.</p><p>In Chapter 4 we describe a new spike and slab model for model space exploration and variable selection in linear regression models. Several theoretical features are discussed to motivate the approach. An R package modelSampler has been developed and applications are presented. In Chapter 5 we present a more stable variable selection technique. 
We also discuss the issue of model selection uncertainty. Numerical examples are provided. Chapter 6 discusses the issue of imputing missing values without biasing variable selection and prediction. Several methods are discussed with examples and a new tree-based imputation technique is proposed.</p> 2008-06-24 English text Case Western Reserve University School of Graduate Studies / OhioLINK http://rave.ohiolink.edu/etdc/view?acc_num=case1212581055 http://rave.ohiolink.edu/etdc/view?acc_num=case1212581055 unrestricted This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws. |
collection |
NDLTD |
language |
English |
sources |
NDLTD |
topic |
Statistics posterior model |
spellingShingle |
Statistics posterior model Dey, Tanujit Prediction and Variable Selection |
author |
Dey, Tanujit |
author_facet |
Dey, Tanujit |
author_sort |
Dey, Tanujit |
title |
Prediction and Variable Selection |
title_short |
Prediction and Variable Selection |
title_full |
Prediction and Variable Selection |
title_fullStr |
Prediction and Variable Selection |
title_full_unstemmed |
Prediction and Variable Selection |
title_sort |
prediction and variable selection |
publisher |
Case Western Reserve University School of Graduate Studies / OhioLINK |
publishDate |
2008 |
url |
http://rave.ohiolink.edu/etdc/view?acc_num=case1212581055 |
work_keys_str_mv |
AT deytanujit predictionandvariableselection |
_version_ |
1719421535881527296 |