Genotype imputation using LD-based Weighted K Nearest Neighbor
Master's thesis === National Taiwan University === Graduate Institute of Agronomy === 103 === Detecting single nucleotide polymorphisms (SNPs) with high-throughput sequencing technologies has become an efficient and robust strategy for SNP discovery and genome-wide association studies. However, conventional high-throughput genotyping techniques often prod...
Main Authors: | Jhih-Wun Zeng, 曾志文 |
---|---|
Other Authors: | 蔡政安 |
Format: | Others |
Language: | zh-TW |
Published: | 2014 |
Online Access: | http://ndltd.ncl.edu.tw/handle/11447615957510197320 |
id |
ndltd-TW-103NTU05417002 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-103NTU05417002 2016-05-22T04:40:55Z http://ndltd.ncl.edu.tw/handle/11447615957510197320 Genotype imputation using LD-based Weighted K Nearest Neighbor 利用連鎖失衡加權K最近鄰法於基因型資料填補之研究 Jhih-Wun Zeng 曾志文 Master's thesis National Taiwan University Graduate Institute of Agronomy 103 Detecting single nucleotide polymorphisms (SNPs) with high-throughput sequencing technologies has become an efficient and robust strategy for SNP discovery and genome-wide association studies. However, conventional high-throughput genotyping techniques often produce a certain proportion of missing calls. It has long been recognized that failing to account for these missing data can dramatically reduce the power to detect SNPs. A variety of methods have been developed to impute the missing genotypes. Methods based on K-nearest neighbors (KNN) and weighted K-nearest neighbors (wtKNN) have received some attention because they consider similarities in haplotype structure. More recently, a number of powerful methods based on the hidden Markov model (HMM) have become popular for SNP imputation. However, these methods are time consuming or mostly suitable for small marker sets, and they cannot exploit the structure of indirect association among tightly linked SNPs. In this study, we propose a novel but computationally simple imputation method based on weighted K-nearest neighbors (wtKNN) that takes linkage disequilibrium (LD) into account. We demonstrate the performance of our method in imputing missing SNPs using both genotyping-by-sequencing (GBS) data and simulation studies. In addition, we compare the accuracy and performance of our method with competing imputation methods. 蔡政安 2014 thesis 84 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's thesis === National Taiwan University === Graduate Institute of Agronomy === 103 === Detecting single nucleotide polymorphisms (SNPs) with high-throughput sequencing technologies has become an efficient and robust strategy for SNP discovery and genome-wide association studies. However, conventional high-throughput genotyping techniques often produce a certain proportion of missing calls. It has long been recognized that failing to account for these missing data can dramatically reduce the power to detect SNPs. A variety of methods have been developed to impute the missing genotypes. Methods based on K-nearest neighbors (KNN) and weighted K-nearest neighbors (wtKNN) have received some attention because they consider similarities in haplotype structure. More recently, a number of powerful methods based on the hidden Markov model (HMM) have become popular for SNP imputation. However, these methods are time consuming or mostly suitable for small marker sets, and they cannot exploit the structure of indirect association among tightly linked SNPs. In this study, we propose a novel but computationally simple imputation method based on weighted K-nearest neighbors (wtKNN) that takes linkage disequilibrium (LD) into account. We demonstrate the performance of our method in imputing missing SNPs using both genotyping-by-sequencing (GBS) data and simulation studies. In addition, we compare the accuracy and performance of our method with competing imputation methods. |
author2 |
蔡政安 |
author_facet |
蔡政安 Jhih-Wun Zeng 曾志文 |
author |
Jhih-Wun Zeng 曾志文 |
author_sort |
Jhih-Wun Zeng |
title |
Genotype imputation using LD-based Weighted K Nearest Neighbor |
publishDate |
2014 |
url |
http://ndltd.ncl.edu.tw/handle/11447615957510197320 |
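
The abstract above does not spell out the algorithm, so the following is only a minimal sketch of how an LD-weighted KNN imputation could work, not the thesis's actual method. It assumes genotypes coded 0/1/2 with `NaN` for missing calls, uses squared correlation (r²) between SNPs as a stand-in for LD, and lets the K nearest individuals vote with inverse-distance weights. The names `ld_r2` and `impute_ld_wtknn`, the parameters `k` and `eps`, and the toy matrix are all hypothetical.

```python
import numpy as np

def ld_r2(G, j):
    """r^2 between SNP j and every SNP, over samples observed at both (assumed LD proxy)."""
    n_snps = G.shape[1]
    r2 = np.zeros(n_snps)
    for s in range(n_snps):
        ok = ~np.isnan(G[:, j]) & ~np.isnan(G[:, s])
        if ok.sum() < 3:
            continue  # too few shared samples to estimate a correlation
        x, y = G[ok, j], G[ok, s]
        if x.std() == 0 or y.std() == 0:
            continue  # monomorphic SNP: correlation undefined, weight stays 0
        r2[s] = np.corrcoef(x, y)[0, 1] ** 2
    r2[j] = 0.0  # the target SNP never contributes to its own distance
    return r2

def impute_ld_wtknn(G, k=3, eps=1e-6):
    """Fill NaN entries of an individuals-by-SNPs matrix (illustrative sketch)."""
    G = np.asarray(G, dtype=float)
    out = G.copy()
    for i, j in zip(*np.where(np.isnan(G))):
        w = ld_r2(G, j)
        donors = np.where(~np.isnan(G[:, j]))[0]  # individuals observed at SNP j
        if donors.size == 0:
            continue  # nothing to borrow from; leave the entry missing
        dists = np.full(donors.size, np.inf)
        for n, d in enumerate(donors):
            shared = ~np.isnan(G[i]) & ~np.isnan(G[d]) & (w > 0)
            if shared.any():
                diff = np.abs(G[i, shared] - G[d, shared])
                dists[n] = np.sum(w[shared] * diff) / np.sum(w[shared])
        nearest = np.argsort(dists)[:k]
        nearest = nearest[np.isfinite(dists[nearest])]
        if nearest.size == 0:
            # Fallback: most common observed genotype at SNP j
            vals, counts = np.unique(G[donors, j], return_counts=True)
            out[i, j] = vals[np.argmax(counts)]
            continue
        votes = {}
        for n in nearest:
            g = G[donors[n], j]
            votes[g] = votes.get(g, 0.0) + 1.0 / (dists[n] + eps)
        out[i, j] = max(votes, key=votes.get)
    return out

# Toy data: SNP 0 and SNP 1 are in perfect LD, SNP 2 is uncorrelated with SNP 0.
G = np.array([
    [0, 0, 2],
    [0, 0, 0],
    [2, 2, 0],
    [2, 2, 2],
    [1, 1, 1],
    [np.nan, 2, 0],
])
print(impute_ld_wtknn(G, k=3)[5, 0])  # → 2.0
```

In the toy matrix the last individual matches the donors carrying genotype 2 at the tightly linked SNP 1, while the uncorrelated SNP 2 receives zero weight and does not affect the distance. Recomputing r² for every missing entry is wasteful; a practical version would cache the weights per SNP.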