Bayesian Models for Capturing Heterogeneity in Discrete Data


Bibliographic Details
Other Authors: Geng, Junxian (author)
Format: Others
Language: English
Published: Florida State University
Subjects:
Online Access: http://purl.flvc.org/fsu/fd/FSU_2017SP_Geng_fsu_0071E_13791
Description
Summary: Population heterogeneity is common in discrete data, and many Bayesian models capture this subpopulation structure reasonably well. Typically, the Dirichlet process mixture model (DPMM) and a variable-dimensional alternative that we refer to as the mixture of finite mixtures (MFM) model are used, as both yield clusterings as a natural byproduct of their Polya urn schemes. The first part of this dissertation focuses on a model for the association between a binary response and binary predictors. The model incorporates Boolean combinations of predictors, called logic trees, as parameters arising from a DPMM or MFM. Joint modeling is proposed to resolve the identifiability issue that arises when a mixture model is used for a binary response. Different MCMC algorithms for fitting these models are introduced and compared. The second part of this dissertation applies the MFM model to community detection problems, where the communities are analogous to the clusters in the first part. A probabilistic framework that allows simultaneous estimation of the number of clusters and the cluster configuration is proposed. We prove clustering consistency in this setting. We also illustrate the performance of these methods with simulation studies and discuss applications.

A Dissertation submitted to the Department of Statistics in partial fulfillment of the requirements for the degree of Doctor of Philosophy.
Spring Semester 2017.
April 5, 2017.
Keywords: Community Detection, Discrete data, Joint Modeling, MCMC, Mixture Model, Population heterogeneity
Includes bibliographical references.
Elizabeth H. Slate, Professor Co-Directing Dissertation; Debdeep Pati, Professor Co-Directing Dissertation; Carl P. Schmertmann, University Representative; Xin Zhang, Committee Member.