Marker-set Genetic Association Studies with An Uncertainty-coding Matrix

Bibliographic Details
Main Authors: Yung-Hsiang Huang, 黃詠詳
Other Authors: Chuhsing K. Hsiao
Format: Others
Language: en_US
Published: 2012
Online Access: http://ndltd.ncl.edu.tw/handle/05078448317489251507
Description
Summary: Ph.D. === National Taiwan University === Institute of Epidemiology and Preventive Medicine === 100 === Compared with single-marker analysis, marker-set analysis usually offers richer biological interpretation and can provide more informative and powerful results. For instance, a haplotype, which captures linkage disequilibrium among SNP markers, is one marker-set representation. Such analysis, however, does not come free; there are difficulties that need to be overcome. For example, the uncertainty in haplotype phase and the large number of haplotypes in the genetic region of interest may cause problems in statistical inference. In addition, in family study designs, the transmission/non-transmission status from parents to affected offspring is uncertain. In this thesis, I construct an uncertainty-coding matrix for the marker set based on the collected genotype data and apply it to family-based association studies in two stages. First, based on the trio design, I derive the transmitted/non-transmitted haplotypes and employ Bayesian conditional logistic regression. Second, I use a Bayesian generalized mixed-effects model to incorporate the whole marker-set information of the other offspring in the family. Such a design matrix can cope with both the phase uncertainty and the transmission uncertainty in family studies. Furthermore, an evolutionary-based clustering method can avoid the curse of dimensionality. Simulation studies and real-data applications are presented and compared with other tools. The proposed methods are implemented in R and will be available for free download in the future.
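
The summary does not spell out how the uncertainty-coding matrix is built. As a rough illustration of the general idea of coding phase uncertainty into a design matrix, the Python sketch below computes one row of expected haplotype dosages for an unphased genotype. It is not the thesis's construction: the function names `compatible_pairs` and `uncertainty_row`, the example haplotype frequencies, and the Hardy-Weinberg weighting of phase configurations are all assumptions made for illustration.

```python
from itertools import product


def compatible_pairs(genotype):
    """List ordered haplotype pairs (h1, h2) consistent with an unphased genotype.

    genotype: tuple of minor-allele counts (0, 1, or 2), one per SNP.
    """
    per_snp = []
    for g in genotype:
        if g == 0:
            per_snp.append([(0, 0)])
        elif g == 2:
            per_snp.append([(1, 1)])
        else:
            # Heterozygous SNP: the minor allele may sit on either haplotype.
            per_snp.append([(0, 1), (1, 0)])
    pairs = []
    for combo in product(*per_snp):
        h1 = tuple(a for a, _ in combo)
        h2 = tuple(b for _, b in combo)
        pairs.append((h1, h2))
    return pairs


def uncertainty_row(genotype, hap_freq):
    """Expected dosage of each haplotype for one individual.

    Each compatible phase is weighted by the product of its two haplotype
    frequencies (an assumed Hardy-Weinberg weighting), then normalized.
    Columns follow sorted(hap_freq).
    """
    columns = sorted(hap_freq)
    pairs = compatible_pairs(genotype)
    weights = [hap_freq.get(h1, 0.0) * hap_freq.get(h2, 0.0) for h1, h2 in pairs]
    total = sum(weights)
    row = [0.0] * len(columns)
    if total == 0:
        return row  # genotype is incompatible with the supplied haplotypes
    for (h1, h2), w in zip(pairs, weights):
        if w == 0:
            continue
        p = w / total
        row[columns.index(h1)] += p
        row[columns.index(h2)] += p
    return row


# Hypothetical frequencies for the four haplotypes of a two-SNP marker set.
hap_freq = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

ambiguous = uncertainty_row((1, 1), hap_freq)  # double heterozygote, phase unknown
resolved = uncertainty_row((0, 2), hap_freq)   # phase fully determined
```

In this sketch a genotype with known phase yields integer dosages, while a double heterozygote yields fractional entries that still sum to two, so the row carries the phase uncertainty instead of forcing a single best-guess haplotype pair.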