Bayesian Models for Capturing Heterogeneity in Discrete Data
Population heterogeneity exists frequently in discrete data. Many Bayesian models perform reasonably well in capturing this subpopulation structure. Typically, the Dirichlet process mixture model (DPMM) and a variable dimensional alternative that we refer to as the mixture of finite mixtures (MFM) m...
Other Authors: | Geng, Junxian (author) |
---|---|
Format: | Others |
Language: | English |
Published: | Florida State University |
Subjects: | Statistics |
Online Access: | http://purl.flvc.org/fsu/fd/FSU_2017SP_Geng_fsu_0071E_13791 |
id |
ndltd-fsu.edu-oai-fsu.digital.flvc.org-fsu_507662 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-fsu.edu-oai-fsu.digital.flvc.org-fsu_507662; 2020-06-24T03:08:53Z; Bayesian Models for Capturing Heterogeneity in Discrete Data; Geng, Junxian (author); Slate, Elizabeth H. (professor co-directing dissertation); Pati, Debdeep (professor co-directing dissertation); Schmertmann, Carl P. (university representative); Zhang, Xin (committee member); Florida State University (degree granting institution); College of Arts and Sciences (degree granting college); Department of Statistics (degree granting department); Text; doctoral thesis; Florida State University; English; 1 online resource (98 pages); computer; application/pdf; Statistics; FSU_2017SP_Geng_fsu_0071E_13791; http://purl.flvc.org/fsu/fd/FSU_2017SP_Geng_fsu_0071E_13791; This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s). The copyright in theses and dissertations completed at Florida State University is held by the students who author them.; http://diginole.lib.fsu.edu/islandora/object/fsu%3A507662/datastream/TN/view/Bayesian%20Models%20for%20Capturing%20Heterogeneity%20in%20Discrete%20Data.jpg |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
topic |
Statistics |
description |
Population heterogeneity exists frequently in discrete data. Many Bayesian models perform reasonably well in capturing this subpopulation structure. Typically, the Dirichlet process mixture model (DPMM) and a variable dimensional alternative that we refer to as the mixture of finite mixtures (MFM) model are used, as they both have natural byproducts of clustering derived from Polya urn schemes. The first part of this dissertation focuses on a model for the association between a binary response and binary predictors. The model incorporates Boolean combinations of predictors, called logic trees, as parameters arising from a DPMM or MFM. Joint modeling is proposed to solve the identifiability issue that arises when using a mixture model for a binary response. Different MCMC algorithms are introduced and compared for fitting these models. The second part of this dissertation is the application of the mixture of finite mixtures model to community detection problems. Here, the communities are analogous to the clusters in the earlier work. A probabilistic framework that allows simultaneous estimation of the number of clusters and the cluster configuration is proposed. We prove clustering consistency in this setting. We also illustrate the performance of these methods with simulation studies and discuss applications. A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy. Spring Semester 2017. April 5, 2017. Keywords: Community Detection, Discrete data, Joint Modeling, MCMC, Mixture Model, Population heterogeneity. Includes bibliographical references. Elizabeth H. Slate, Professor Co-Directing Dissertation; Debdeep Pati, Professor Co-Directing Dissertation; Carl P. Schmertmann, University Representative; Xin Zhang, Committee Member. |
author2 |
Geng, Junxian (author) |
title |
Bayesian Models for Capturing Heterogeneity in Discrete Data |
publisher |
Florida State University |
url |
http://purl.flvc.org/fsu/fd/FSU_2017SP_Geng_fsu_0071E_13791 |