The GLM Approach for QTL Linkage Analysis Based on the Haseman-Elston Method

Master's === National Taiwan University === Graduate Institute of Epidemiology === 91 === The Haseman-Elston method (1972) is a well-known method for analyzing quantitative traits. It collects the marker genotypes of sib pairs and their parents together with the sibs' trait values, and then relates the expectation of the square...

Bibliographic Details
Main Authors: Wan-Yu Lin, 林菀俞
Other Authors: John Jen Tai
Format: Others
Language: en_US
Published: 2003
Online Access: http://ndltd.ncl.edu.tw/handle/64594738770323660581
id ndltd-TW-091NTU01544009
record_format oai_dc
spelling ndltd-TW-091NTU01544009 2016-06-20T04:15:58Z http://ndltd.ncl.edu.tw/handle/64594738770323660581 The GLM Approach for QTL Linkage Analysis Based on the Haseman-Elston Method 基於Haseman-Elston方法之廣義線性模式數量性狀連鎖分析 Wan-Yu Lin 林菀俞 Master's, National Taiwan University, Graduate Institute of Epidemiology, 91 John Jen Tai 戴政 2003 thesis 74 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Taiwan University === Graduate Institute of Epidemiology === 91 === The Haseman-Elston method (1972) is a well-known method for analyzing quantitative traits. It collects the marker genotypes of sib pairs and their parents together with the sibs' trait values, and then relates the expectation of the squared trait difference (SQD) of the sibs to the estimated proportion of alleles shared identical by descent (i.b.d.) at the marker locus. Haseman and Elston (1972) showed that the expectation of the SQD is linear in the estimated proportion of alleles shared i.b.d. at the marker locus. Subsequent research has stayed within this framework: regressing the SQD on the estimated proportion of genes i.b.d. at the marker locus and testing whether the slope is significantly negative; performing a nonparametric analysis to examine whether the SQD is correlated with the estimated proportion of marker genes i.b.d.; or substituting other transformations for the SQD. In this thesis, we focus on the regression of the SQD on the estimated proportion of marker genes i.b.d. In Chapter 3, we classify the data into several strata by marker informativeness and then calculate the conditional variance of the SQD in each stratum; the inverse of this variance measures the reliability of the data. Analyses based on the classical linear model are questionable when the conditional variances are not constant. Moreover, the distribution of the SQD is more right-skewed than the normal distribution. In Chapter 4, we apply the generalized linear model (GLM), which allows a nonnormal distribution. The maximum likelihood estimator (MLE) in a GLM is computed by iteratively reweighted least squares (IRLS), which takes the inverse of the variance function as the weight and repeats weighted least squares until the estimate converges. The GLM therefore also addresses the problem raised in Chapter 3, namely heterogeneous error variance.
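For reference, the IRLS step described above can be written in the standard textbook form below. The notation is an assumption introduced here and is not taken from the thesis: $X$ is the design matrix (an intercept and the estimated i.b.d. proportion), $y_i$ is the SQD of sib pair $i$, $g$ is the link function with linear predictor $\eta_i = g(\mu_i)$, and $V$ is the variance function.

```latex
\hat{\beta}^{(t+1)} = \left(X^{\top} W^{(t)} X\right)^{-1} X^{\top} W^{(t)} z^{(t)},
\qquad
W^{(t)} = \operatorname{diag}\!\left(
  \frac{1}{V\!\left(\mu_i^{(t)}\right)\, g'\!\left(\mu_i^{(t)}\right)^{2}}
\right),
\qquad
z_i^{(t)} = \eta_i^{(t)} + \left(y_i - \mu_i^{(t)}\right) g'\!\left(\mu_i^{(t)}\right)
```

With an identity link, $g'(\mu) = 1$, so the weight reduces to $1 / V(\mu_i^{(t)})$ and the working response is $z_i = y_i$. This is the special case in which the inverse of the variance function is the weight, as the abstract states.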
In Chapter 5, we compare the GLM with simple linear regression (SLR) by simulation. The two methods perform similarly in large samples, while the GLM approach improves on the Haseman-Elston analysis in small or moderate samples. The simulation study used a diallelic marker, which reduces the proportion of data with complete marker i.b.d. information. However, with the more informative molecular markers now available, the number of marker alleles is expected to range from four to ten, which increases the proportion of data with complete marker i.b.d. information and violates the assumption of constant error variance more severely. In that situation, the GLM approach will be more useful.
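The comparison of SLR and GLM fits described above can be illustrated with a minimal sketch. This is not the thesis's simulation design. It assumes the true i.b.d. proportion is known for every sib pair, that the trait difference is normal with variance `alpha + beta * pi` (so the SQD has variance proportional to the square of its mean), and an identity-link GLM with variance function V(mu) = mu**2. All function names and parameter values are hypothetical.

```python
import random


def weighted_least_squares(x, y, w):
    """Closed-form weighted fit of y = a + b * x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def irls_identity_link(x, y, max_iter=50, tol=1e-8, floor=1e-6):
    """IRLS for an identity-link GLM with assumed variance function mu**2."""
    # Start from the ordinary (unweighted) least-squares fit.
    a, b = weighted_least_squares(x, y, [1.0] * len(x))
    for _ in range(max_iter):
        # Fitted means, floored so the weights stay finite and positive.
        mu = [max(a + b * xi, floor) for xi in x]
        # Weight = inverse of the variance function at the current mean.
        w = [1.0 / m ** 2 for m in mu]
        a_new, b_new = weighted_least_squares(x, y, w)
        converged = abs(a_new - a) + abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    return a, b


def simulate_sib_pairs(n, alpha=3.0, beta=-2.0, seed=0):
    """Simulate i.b.d. proportions and SQDs (illustrative values only)."""
    rng = random.Random(seed)
    # Sib pairs share 0, 1 or 2 alleles i.b.d. with probability 1/4, 1/2, 1/4.
    pi = rng.choices([0.0, 0.5, 1.0], weights=[1, 2, 1], k=n)
    # E[SQD | pi] = alpha + beta * pi, with a negative slope under linkage.
    sqd = [rng.gauss(0.0, (alpha + beta * p) ** 0.5) ** 2 for p in pi]
    return pi, sqd


pi, sqd = simulate_sib_pairs(2000)
slr_fit = weighted_least_squares(pi, sqd, [1.0] * len(pi))
glm_fit = irls_identity_link(pi, sqd)
print("SLR intercept, slope:", slr_fit)
print("GLM intercept, slope:", glm_fit)
```

Both fits estimate the same straight line. They differ only in how sib pairs are weighted: the SLR treats every SQD as equally reliable, while the IRLS fit down-weights pairs whose fitted mean, and hence variance, is large.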
author2 John Jen Tai
author_facet John Jen Tai
Wan-Yu Lin
林菀俞
author Wan-Yu Lin
林菀俞
author_sort Wan-Yu Lin
title The GLM Approach for QTL Linkage Analysis Based on the Haseman-Elston Method
publishDate 2003
url http://ndltd.ncl.edu.tw/handle/64594738770323660581