Variable screening and graphical modeling for ultra-high dimensional longitudinal data
Ultrahigh-dimensional variable selection is of great importance in statistical research. Independence screening is a powerful tool for selecting important variables when the number of variables is massive. Some commonly used independence screening procedures are based on single replicate data and are no...
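The general idea of independence screening can be illustrated with a minimal sketch of marginal-correlation screening on single replicate data. This is not the longitudinal procedure proposed in the dissertation. The function names (`pearson`, `marginal_screen`, `simulate`), the simulated data, and the choice of Pearson correlation as the marginal utility are all assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no marginal signal
    return sxy / math.sqrt(sxx * syy)


def marginal_screen(columns, y, d):
    """Return the (sorted) indices of the d columns with the largest |correlation| with y."""
    scores = [abs(pearson(col, y)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:d])


def simulate(n=50, p=500, seed=0):
    """Hypothetical toy data: p predictors, n samples, only predictors 0 and 1 are active."""
    rng = random.Random(seed)
    columns = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
    y = [3 * columns[0][i] - 3 * columns[1][i] + rng.gauss(0, 1) for i in range(n)]
    return columns, y


# Reduce p = 500 candidate predictors to d = 10, a number below the sample size n = 50.
columns, y = simulate()
kept = marginal_screen(columns, y, d=10)
print(kept)
```

Each predictor is ranked on its own, so the cost grows linearly with the number of predictors. Any downstream model is then fitted only on the retained indices.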
Main Author: | Zhang, Yafei |
---|---|
Other Authors: | |
Format: | Others |
Published: | Virginia Tech 2020 |
Subjects: | |
Online Access: | http://hdl.handle.net/10919/101662 |
id |
ndltd-VTETD-oai-vtechworks.lib.vt.edu-10919-101662 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-VTETD-oai-vtechworks.lib.vt.edu-10919-101662; 2020-12-25T06:09:05Z; Variable screening and graphical modeling for ultra-high dimensional longitudinal data; Zhang, Yafei; Statistics; Du, Pang; Wu, Xiaowei; Kim, Inyoung; Hong, Yili; graphical model variable screening longitudinal data analysis; Doctor of Philosophy; 2020-12-24T07:00:36Z; 2020-12-24T07:00:36Z; 2019-07-02; Dissertation; vt_gsexam:20483; http://hdl.handle.net/10919/101662; This item is protected by copyright and/or related rights. Some uses of this item may be deemed fair and permitted by law even without permission from the rights holder(s), or the rights holder(s) may have licensed the work for use under certain conditions. For other uses you need to obtain permission from the rights holder(s).; ETD; application/pdf; Virginia Tech |
collection |
NDLTD |
format |
Others |
sources |
NDLTD |
topic |
graphical model variable screening longitudinal data analysis |
description |
Ultrahigh-dimensional variable selection is of great importance in statistical research. Independence screening is a powerful tool for selecting important variables when the number of variables is massive. Some commonly used independence screening procedures are based on single replicate data and are not applicable to longitudinal data. This motivates us to propose a new Sure Independence Screening (SIS) procedure that brings the dimension from ultra-high down to a relatively large scale that is similar to or smaller than the sample size. In Chapter 2, we provide two types of SIS and their iterative extensions (iterative SIS) to enhance the finite sample performance. An upper bound on the number of variables to be included is derived, and assumptions are given under which sure screening is applicable. The proposed procedures are assessed by simulations, and an application to a study on systemic lupus erythematosus illustrates their practical use. After the variable screening process, we explore the relationships among the variables. Graphical models are commonly used to explore the association network for a set of variables, which could be genes or other objects under study. However, the graphical models currently in use are designed only for single replicate data, not for longitudinal data. In Chapter 3, we propose a penalized likelihood approach to identify the edges in a conditional independence graph for longitudinal data. We use pairwise coordinate descent combined with second-order cone programming to optimize the penalized likelihood and estimate the parameters. Furthermore, we extend the nodewise regression method to the longitudinal data case. Simulation and real data analysis demonstrate the competitive performance of the penalized likelihood method. === Doctor of Philosophy |
author2 |
Statistics |
author |
Zhang, Yafei |
title |
Variable screening and graphical modeling for ultra-high dimensional longitudinal data |
publisher |
Virginia Tech |
publishDate |
2020 |
url |
http://hdl.handle.net/10919/101662 |