The GLM Approach for QTL Linkage Analysis Based on the Haseman-Elston Method
Master's === National Taiwan University === Graduate Institute of Epidemiology === 91 === The Haseman-Elston method (1972) is a well-known approach to analyzing quantitative traits. It proceeds by collecting the marker genotypes of sib pairs and their parents and the traits of the sibs, and then constructs the relationship between the expectation of the square...
Main Authors: | Wan-Yu Lin, 林菀俞 |
---|---|
Other Authors: | John Jen Tai |
Format: | Others |
Language: | en_US |
Published: | 2003 |
Online Access: | http://ndltd.ncl.edu.tw/handle/64594738770323660581 |
id |
ndltd-TW-091NTU01544009 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-091NTU01544009 2016-06-20T04:15:58Z http://ndltd.ncl.edu.tw/handle/64594738770323660581 The GLM Approach for QTL Linkage Analysis Based on the Haseman-Elston Method 基於Haseman-Elston方法之廣義線性模式數量性狀連鎖分析 Wan-Yu Lin 林菀俞 Master's, National Taiwan University, Graduate Institute of Epidemiology, 91 John Jen Tai 戴政 2003 學位論文 ; thesis 74 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taiwan University === Graduate Institute of Epidemiology === 91 === The Haseman-Elston method (1972) is a well-known approach to analyzing quantitative traits. It proceeds by collecting the marker genotypes of sib pairs and their parents and the traits of the sibs, and then constructs the relationship between the expectation of the squared trait difference (SQD) of the sibs and the estimated proportion of alleles shared identical by descent (i.b.d.) at the marker locus. Haseman and Elston (1972) showed that the expectation of the SQD is linear in the estimated proportion of alleles shared i.b.d. at the marker locus. Subsequent research has stayed within three lines of work: regressing the SQD on the estimated proportion of genes i.b.d. at the marker locus to test whether the slope is significantly negative; performing a nonparametric analysis to examine whether the SQD and the estimated proportion of marker genes i.b.d. are correlated; or substituting other transformations for the SQD. In this thesis, we focus on the regression of the SQD on the estimated proportion of marker genes i.b.d. In Chapter 3, we classify the data into several strata by marker informativeness and then calculate the conditional variance of the SQD, which is the inverse of the data reliability. Analyses based on the classical linear model are questionable when the conditional variances are not constant. Moreover, the distribution of the SQD is more right-skewed than the normal distribution. In Chapter 4, we apply the generalized linear model (GLM) to allow for a nonnormal distribution. The algorithm used to obtain the maximum likelihood estimator (MLE) in the GLM is iteratively reweighted least squares (IRLS): it takes the inverse of the variance function as the weight and repeats weighted least squares until the estimate converges. It therefore also addresses the problem raised in Chapter 3, namely heterogeneous error variance.
In Chapter 5, we compare the GLM with simple linear regression (SLR) by simulation. We find that the two methods perform similarly in large samples, and that the GLM approach improves on the Haseman-Elston analysis in small or moderate samples. We used a diallelic marker in the simulation study, which reduces the portion of data with complete marker i.b.d. information. However, with the more informative molecular markers now available, the number of marker alleles is expected to range from four to ten, which increases the portion of data with complete marker i.b.d. information and violates the assumption of constant error variance more severely. In that situation, the GLM approach will be more useful.
|
author2 |
John Jen Tai |
author_facet |
John Jen Tai Wan-Yu Lin 林菀俞 |
author |
Wan-Yu Lin 林菀俞 |
spellingShingle |
Wan-Yu Lin 林菀俞 The GLM Approach for QTL Linkage Analysis Based on the Haseman-Elston Method |
author_sort |
Wan-Yu Lin |
title |
The GLM Approach for QTL Linkage Analysis Based on the Haseman-Elston Method |
title_sort |
glm approach for qtl linkage analysis based on the haseman-elston method |
publishDate |
2003 |
url |
http://ndltd.ncl.edu.tw/handle/64594738770323660581 |
_version_ |
1718310967021928448 |
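The IRLS procedure summarized in the abstract can be illustrated with a minimal sketch. This is not the thesis's own implementation: the identity link, the gamma-type variance function V(mu) = mu^2, the function name `irls_identity_link`, and its parameters are all assumptions made for illustration, since the record does not state which GLM family or link the author used. The sketch regresses the SQD on the estimated i.b.d. proportion, reweighting each pair by the inverse of the variance function until the coefficients stop changing.

```python
import numpy as np


def irls_identity_link(pi_hat, sqd, variance=lambda mu: mu ** 2,
                       tol=1e-8, max_iter=100):
    """Fit E[SQD] = a + b * pi_hat by iteratively reweighted least squares.

    Assumptions (not taken from the thesis): identity link and a
    gamma-type variance function V(mu) = mu**2 by default.
    Returns the coefficient vector [a, b].
    """
    x = np.asarray(pi_hat, dtype=float)
    y = np.asarray(sqd, dtype=float)
    X = np.column_stack([np.ones_like(x), x])

    # Start from the ordinary least-squares (classical Haseman-Elston) fit.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]

    for _ in range(max_iter):
        # Fitted means; clipped so the variance function stays positive.
        mu = np.clip(X @ beta, 1e-8, None)
        # Weight = inverse of the variance function at the current fit.
        w = 1.0 / variance(mu)
        sw = np.sqrt(w)
        # With an identity link the working response is y itself,
        # so each step is a plain weighted least-squares fit.
        new_beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        if np.max(np.abs(new_beta - beta)) < tol:
            return new_beta
        beta = new_beta
    return beta
```

Calling `irls_identity_link(pi_hat, sqd)` with arrays of estimated i.b.d. proportions and squared trait differences returns the intercept and slope; a negative slope is the direction that the Haseman-Elston test looks for. Passing `variance=lambda mu: np.ones_like(mu)` gives constant weights, so the fit reduces to ordinary least squares.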