Bayesian Modeling of Complex High-Dimensional Data
With the rapid development of modern high-throughput technologies, scientists can now collect high-dimensional complex data in different forms, such as medical images and genomics measurements. However, acquisition of more data does not automatically lead to better knowledge discovery. One needs effici...
Main Author: | Huo, Shuning |
---|---|
Other Authors: | |
Format: | Others |
Published: | Virginia Tech 2020 |
Subjects: | |
Online Access: | http://hdl.handle.net/10919/101037 |
id |
ndltd-VTETD-oai-vtechworks.lib.vt.edu-10919-101037 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-VTETD-oai-vtechworks.lib.vt.edu-10919-1010372020-12-10T05:33:41Z Bayesian Modeling of Complex High-Dimensional Data Huo, Shuning Statistics Zhu, Hongxiao Deng, Xinwei Kim, Inyoung Gramacy, Robert B. Variational Inference Bayesian Variable Selection Functional Mixed Model Parallel Computing Bayesian Hierarchical Clustering Dirichlet Diffusion Tree With the rapid development of modern high-throughput technologies, scientists can now collect high-dimensional complex data in different forms, such as medical images, genomics measurements. However, acquisition of more data does not automatically lead to better knowledge discovery. One needs efficient and reliable analytical tools to extract useful information from complex datasets. The main objective of this dissertation is to develop innovative Bayesian methodologies to enable effective and efficient knowledge discovery from complex high-dimensional data. It contains two parts—the development of computationally efficient functional mixed models and the modeling of data heterogeneity via Dirichlet Diffusion Tree. The first part focuses on tackling the computational bottleneck in Bayesian functional mixed models. We propose a computational framework called variational functional mixed model (VFMM). This new method facilitates efficient data compression and high-performance computing in basis space. We also propose a new multiple testing procedure in basis space, which can be used to detect significant local regions. The effectiveness of the proposed model is demonstrated through two datasets, a mass spectrometry dataset in a cancer study and a neuroimaging dataset in an Alzheimer's disease study. The second part is about modeling data heterogeneity by using Dirichlet Diffusion Trees. We propose a Bayesian latent tree model that incorporates covariates of subjects to characterize the heterogeneity and uncover the latent tree structure underlying data. 
This innovative model may reveal the hierarchical evolution process through branch structures and estimate systematic differences between groups of samples. We demonstrate the effectiveness of the model through the simulation study and a brain tumor real data. Doctor of Philosophy With the rapid development of modern high-throughput technologies, scientists can now collect high-dimensional data in different forms, such as engineering signals, medical images, and genomics measurements. However, acquisition of such data does not automatically lead to efficient knowledge discovery. The main objective of this dissertation is to develop novel Bayesian methods to extract useful knowledge from complex high-dimensional data. It has two parts—the development of an ultra-fast functional mixed model and the modeling of data heterogeneity via Dirichlet Diffusion Trees. The first part focuses on developing approximate Bayesian methods in functional mixed models to estimate parameters and detect significant regions. Two datasets demonstrate the effectiveness of proposed method—a mass spectrometry dataset in a cancer study and a neuroimaging dataset in an Alzheimer's disease study. The second part focuses on modeling data heterogeneity via Dirichlet Diffusion Trees. The method helps uncover the underlying hierarchical tree structures and estimate systematic differences between the group of samples. We demonstrate the effectiveness of the method through the brain tumor imaging data. 2020-12-08T09:00:20Z 2020-12-08T09:00:20Z 2020-12-07 Dissertation vt_gsexam:28197 http://hdl.handle.net/10919/101037 In Copyright http://rightsstatements.org/vocab/InC/1.0/ ETD application/pdf Virginia Tech |
collection |
NDLTD |
format |
Others |
sources |
NDLTD |
topic |
Variational Inference Bayesian Variable Selection Functional Mixed Model Parallel Computing Bayesian Hierarchical Clustering Dirichlet Diffusion Tree |
spellingShingle |
Variational Inference Bayesian Variable Selection Functional Mixed Model Parallel Computing Bayesian Hierarchical Clustering Dirichlet Diffusion Tree Huo, Shuning Bayesian Modeling of Complex High-Dimensional Data |
description |
With the rapid development of modern high-throughput technologies, scientists can now
collect high-dimensional complex data in different forms, such as medical images and genomics
measurements. However, acquisition of more data does not automatically lead to better
knowledge discovery. One needs efficient and reliable analytical tools to extract useful information
from complex datasets. The main objective of this dissertation is to develop innovative
Bayesian methodologies to enable effective and efficient knowledge discovery from
complex high-dimensional data. It contains two parts—the development of computationally
efficient functional mixed models and the modeling of data heterogeneity via Dirichlet
Diffusion Trees. The first part focuses on tackling the computational bottleneck in Bayesian
functional mixed models. We propose a computational framework called variational functional
mixed model (VFMM). This new method facilitates efficient data compression and
high-performance computing in basis space. We also propose a new multiple testing procedure
in basis space, which can be used to detect significant local regions. The effectiveness
of the proposed model is demonstrated on two datasets: a mass spectrometry dataset
in a cancer study and a neuroimaging dataset in an Alzheimer's disease study. The second
part models data heterogeneity using Dirichlet Diffusion Trees. We propose
a Bayesian latent tree model that incorporates covariates of subjects to characterize the
heterogeneity and uncover the latent tree structure underlying the data. This innovative model
may reveal the hierarchical evolution process through branch structures and estimate systematic
differences between groups of samples. We demonstrate the effectiveness of the model
through a simulation study and a real brain tumor dataset.
=== Doctor of Philosophy ===
With the rapid development of modern high-throughput technologies, scientists can now
collect high-dimensional data in different forms, such as engineering signals, medical images,
and genomics measurements. However, acquisition of such data does not automatically lead
to efficient knowledge discovery. The main objective of this dissertation is to develop novel
Bayesian methods to extract useful knowledge from complex high-dimensional data. It has
two parts—the development of an ultra-fast functional mixed model and the modeling of
data heterogeneity via Dirichlet Diffusion Trees. The first part focuses on developing approximate
Bayesian methods in functional mixed models to estimate parameters and detect
significant regions. Two datasets demonstrate the effectiveness of the proposed method: a mass
spectrometry dataset in a cancer study and a neuroimaging dataset in an Alzheimer's disease
study. The second part focuses on modeling data heterogeneity via Dirichlet Diffusion
Trees. The method helps uncover the underlying hierarchical tree structures and estimate
systematic differences between groups of samples. We demonstrate the effectiveness of
the method on brain tumor imaging data. |
author2 |
Statistics |
author_facet |
Statistics Huo, Shuning |
author |
Huo, Shuning |
author_sort |
Huo, Shuning |
title |
Bayesian Modeling of Complex High-Dimensional Data |
title_short |
Bayesian Modeling of Complex High-Dimensional Data |
title_full |
Bayesian Modeling of Complex High-Dimensional Data |
title_fullStr |
Bayesian Modeling of Complex High-Dimensional Data |
title_full_unstemmed |
Bayesian Modeling of Complex High-Dimensional Data |
title_sort |
bayesian modeling of complex high-dimensional data |
publisher |
Virginia Tech |
publishDate |
2020 |
url |
http://hdl.handle.net/10919/101037 |
work_keys_str_mv |
AT huoshuning bayesianmodelingofcomplexhighdimensionaldata |
_version_ |
1719370077598384128 |