Bayesian Models for Capturing Heterogeneity in Discrete Data

Population heterogeneity exists frequently in discrete data. Many Bayesian models perform reasonably well in capturing this subpopulation structure. Typically, the Dirichlet process mixture model (DPMM) and a variable dimensional alternative that we refer to as the mixture of finite mixtures (MFM) m...


Bibliographic Details
Other Authors: Geng, Junxian (author)
Format: Others
Language: English
Published: Florida State University
Subjects: Statistics
Online Access: http://purl.flvc.org/fsu/fd/FSU_2017SP_Geng_fsu_0071E_13791
id ndltd-fsu.edu-oai-fsu.digital.flvc.org-fsu_507662
record_format oai_dc
Record timestamp: 2020-06-24T03:08:53Z

Contributors:
Geng, Junxian (author)
Slate, Elizabeth H. (professor co-directing dissertation)
Pati, Debdeep (professor co-directing dissertation)
Schmertmann, Carl P. (university representative)
Zhang, Xin (committee member)
Florida State University (degree granting institution)
College of Arts and Sciences (degree granting college)
Department of Statistics (degree granting department)

Type: Text; doctoral thesis
Extent: 1 online resource (98 pages); computer; application/pdf

Abstract: Population heterogeneity exists frequently in discrete data. Many Bayesian models perform reasonably well in capturing this subpopulation structure. Typically, the Dirichlet process mixture model (DPMM) and a variable dimensional alternative that we refer to as the mixture of finite mixtures (MFM) model are used, as they both have natural byproducts of clustering derived from Polya urn schemes. The first part of this dissertation focuses on a model for the association between a binary response and binary predictors. The model incorporates Boolean combinations of predictors, called logic trees, as parameters arising from a DPMM or MFM. Joint modeling is proposed to solve the identifiability issue that arises when using a mixture model for a binary response. Different MCMC algorithms are introduced and compared for fitting these models. The second part of this dissertation is the application of the mixture of finite mixtures model to community detection problems. Here, the communities are analogous to the clusters in the earlier work. A probabilistic framework that allows simultaneous estimation of the number of clusters and the cluster configuration is proposed. We prove clustering consistency in this setting. We also illustrate the performance of these methods with simulation studies and discuss applications.

Notes: A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy. Spring Semester 2017. April 5, 2017. Includes bibliographical references.
Keywords: Community Detection, Discrete data, Joint Modeling, MCMC, Mixture Model, Population heterogeneity
Identifier: FSU_2017SP_Geng_fsu_0071E_13791
URL: http://purl.flvc.org/fsu/fd/FSU_2017SP_Geng_fsu_0071E_13791
Rights: This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s). The copyright in theses and dissertations completed at Florida State University is held by the students who author them.
Thumbnail: http://diginole.lib.fsu.edu/islandora/object/fsu%3A507662/datastream/TN/view/Bayesian%20Models%20for%20Capturing%20Heterogeneity%20in%20Discrete%20Data.jpg
collection NDLTD
sources NDLTD
topic Statistics