Summary: | Analysis of familial data with quantitative traits based on the multivariate normal distribution
has been well studied. However, little attention has been devoted to traits which do not have a
multivariate normal distribution, such as traits with discrete or censored values. In this thesis, we
(1) construct models for familial data when the trait value is discrete and/or
censored, and (2) study alternative estimation methods for when maximum likelihood estimation is
infeasible. We discuss two existing classes of models: models with random effects which are multivariate
normally distributed, and models constructed from the multivariate normal copula. These
two classes include a variety of models which can be applied to familial data. We also propose
another class of models which we call conditional independence models. This type of model is
based on a conditional independence assumption: for a trait variable, we assume independence of a
pair of non-sibling relatives conditional on their parents, so that the dependence structure is built
on the Markov property.
Maximum likelihood estimates are generally difficult to obtain for random effect models and
copula models when large families are involved. We propose two estimation procedures based
on composite likelihoods. The first is a two-stage method in which the univariate marginal parameters
are estimated from the univariate marginal distributions, and the dependence parameters are then
estimated separately from the bivariate marginal distributions with the marginal parameters treated
as known. In the second, all the parameters are estimated jointly from the likelihoods of the bivariate
marginal distributions. The composite likelihood methods can greatly reduce the computation in
parameter estimation, but at the price of some loss of efficiency. In this thesis, extensive investigations
based on asymptotic covariance matrices and simulations were carried out to compare the asymptotic
efficiency of these two procedures with that of the maximum likelihood method. In our efficiency
comparisons, we investigated the multivariate normal model for a continuous trait, the multivariate
probit model for a binary trait, the multivariate Poisson-lognormal mixture model for a count trait,
and the multivariate lognormal model for a censored variable. We found that when the dependence is
strong, the first approach is inefficient for the regression parameters, whereas when the dependence
is weak, the second approach is inefficient for the dependence parameters.
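As an illustration of the first (two-stage) procedure, the following is a minimal sketch for the simplest case mentioned above, a multivariate normal trait. It is not the implementation used in the thesis. It assumes families of equal size, a common mean and standard deviation with no covariates, and a single exchangeable correlation; the function names `simulate_families`, `pairwise_negloglik` and `two_stage_estimate` are hypothetical.

```python
import numpy as np
from itertools import combinations
from scipy.optimize import minimize_scalar


def simulate_families(n_fam, size, mu, sigma, rho, rng):
    """Simulate exchangeable normal traits (assumed setup, requires 0 <= rho < 1).

    Each family shares one random effect, so every within-family pair
    has correlation rho.
    """
    shared = rng.standard_normal((n_fam, 1))
    indiv = rng.standard_normal((n_fam, size))
    z = np.sqrt(rho) * shared + np.sqrt(1.0 - rho) * indiv
    return mu + sigma * z


def pairwise_negloglik(rho, z):
    """Negative bivariate composite log-likelihood in rho.

    z holds standardized traits, one row per family. Every within-family
    pair contributes a standard bivariate normal log-density; additive
    constants that do not depend on rho are dropped.
    """
    total = 0.0
    for j, k in combinations(range(z.shape[1]), 2):
        q = (z[:, j] ** 2 - 2.0 * rho * z[:, j] * z[:, k] + z[:, k] ** 2) / (1.0 - rho ** 2)
        total += np.sum(0.5 * np.log(1.0 - rho ** 2) + 0.5 * q)
    return total


def two_stage_estimate(y):
    """Two-stage composite likelihood estimate of (mu, sigma, rho)."""
    # Stage 1: marginal parameters from the univariate margins only.
    mu_hat = y.mean()
    sigma_hat = y.std()
    # Stage 2: dependence parameter from the bivariate margins,
    # with the stage-1 estimates treated as known.
    z = (y - mu_hat) / sigma_hat
    res = minimize_scalar(pairwise_negloglik, bounds=(-0.99, 0.99),
                          args=(z,), method="bounded")
    return mu_hat, sigma_hat, res.x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = simulate_families(n_fam=2000, size=4, mu=1.0, sigma=2.0, rho=0.5, rng=rng)
    print(two_stage_estimate(y))
```

Only one- and two-dimensional densities are evaluated, which is where the computational saving comes from. The second procedure would instead maximize the same pairwise objective over all three parameters at once.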
In many familial analyses, quantifying familial association is of great interest. For a binary
trait, the odds ratio may be used as a measure of association between a parent-offspring pair or a
sibling pair. We develop theory that allows the asymptotic variance of an odds ratio to be computed
from a 2 x 2 contingency table formed by dependent pairs.
=== Science, Faculty of === Statistics, Department of === Graduate
|