Genotype imputation using LD-based Weighted K Nearest Neighbor

Master's === National Taiwan University === Graduate Institute of Agronomy === 103 === Detection of single nucleotide polymorphisms (SNPs) with high-throughput sequencing technologies has become an efficient and robust strategy for SNP discovery and genome-wide association studies. However, conventional high-throughput genotyping techniques often prod...

Bibliographic Details
Main Authors: Jhih-Wun Zeng, 曾志文
Other Authors: 蔡政安
Format: Others
Language: zh-TW
Published: 2014
Online Access: http://ndltd.ncl.edu.tw/handle/11447615957510197320
id ndltd-TW-103NTU05417002
record_format oai_dc
spelling ndltd-TW-103NTU05417002 2016-05-22T04:40:55Z http://ndltd.ncl.edu.tw/handle/11447615957510197320 Genotype imputation using LD-based Weighted K Nearest Neighbor 利用連鎖失衡加權K最近鄰法於基因型資料填補之研究 Jhih-Wun Zeng 曾志文 Master's National Taiwan University Graduate Institute of Agronomy 103 蔡政安 2014 thesis 84 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Taiwan University === Graduate Institute of Agronomy === 103 === Detection of single nucleotide polymorphisms (SNPs) with high-throughput sequencing technologies has become an efficient and robust strategy for SNP discovery and genome-wide association studies. However, conventional high-throughput genotyping techniques often produce a certain proportion of missing calls. It has long been recognized that failing to account for these missing data can dramatically reduce the power to detect SNPs. A variety of methods have been developed to impute the missing genotypes. Methods based on K-nearest neighbors (KNN) and weighted K-nearest neighbors (wtKNN) have received some attention because they consider similarities in haplotype structure. More recently, a number of powerful methods based on hidden Markov models (HMMs) have become popular for SNP imputation. However, these methods are time-consuming or mostly suited to imputing small marker sets, and they cannot exploit the indirect association structure among tightly linked SNPs. In this study, we propose a novel but computationally simple imputation method based on weighted K-nearest neighbors (wtKNN) that takes linkage disequilibrium (LD) into account. We demonstrate the performance of our method in imputing missing SNPs using both genotyping-by-sequencing (GBS) data and simulation studies. In addition, we compare the accuracy and performance of our method with those of competing imputation methods.
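The abstract does not spell out the algorithm, so the following is only an illustrative sketch of what an LD-weighted KNN imputation could look like, not the thesis's implementation. It assumes genotypes coded 0/1/2 with NaN for missing calls, uses the squared genotype correlation (r²) as the LD weight, and takes an inverse-distance vote among the k nearest samples. The function names `ld_r2` and `impute_ld_wknn`, the parameters `k` and `eps`, and the toy matrix are all hypothetical.

```python
import numpy as np

def ld_r2(G, j):
    """Squared genotype correlation between SNP j and every other SNP,
    computed on samples observed at both SNPs (an assumed LD proxy)."""
    n_snps = G.shape[1]
    r2 = np.zeros(n_snps)
    for m in range(n_snps):
        if m == j:
            continue
        ok = ~np.isnan(G[:, j]) & ~np.isnan(G[:, m])
        if ok.sum() < 3:
            continue
        a, b = G[ok, j], G[ok, m]
        if a.std() == 0 or b.std() == 0:
            continue  # correlation undefined for a monomorphic SNP
        r2[m] = np.corrcoef(a, b)[0, 1] ** 2
    return r2

def impute_ld_wknn(G, k=3, eps=1e-6):
    """Fill NaN genotypes (coded 0/1/2) by an LD-weighted KNN vote."""
    G = np.asarray(G, dtype=float)
    out = G.copy()
    for j in range(G.shape[1]):
        missing = np.where(np.isnan(G[:, j]))[0]
        if missing.size == 0:
            continue
        w = ld_r2(G, j)                             # LD weight of each SNP
        donors = np.where(~np.isnan(G[:, j]))[0]    # samples observed at SNP j
        for i in missing:
            dists = []
            for d in donors:
                shared = ~np.isnan(G[i]) & ~np.isnan(G[d]) & (w > 0)
                if not shared.any():
                    continue
                # LD-weighted mean absolute genotype difference
                dist = (np.sum(w[shared] * np.abs(G[i, shared] - G[d, shared]))
                        / np.sum(w[shared]))
                dists.append((dist, d))
            if not dists:
                continue  # no usable neighbor: leave the call missing
            dists.sort()
            votes = np.zeros(3)
            for dist, d in dists[:k]:
                votes[int(G[d, j])] += 1.0 / (dist + eps)  # closer donors count more
            out[i, j] = float(np.argmax(votes))
    return out

# Toy data: SNP 0 and SNP 1 are in perfect LD, SNP 2 is unrelated.
G_demo = np.array([
    [0, 0, 2],
    [0, 0, 0],
    [2, 2, 0],
    [2, 2, 2],
    [1, 1, 1],
    [np.nan, 2, 0],
])
imputed = impute_ld_wknn(G_demo, k=2)
print(imputed[5, 0])
```

In this toy example the missing call at SNP 0 is decided almost entirely by SNP 1, because the unrelated SNP 2 receives zero weight. A real analysis would likely restrict the weights to a window of nearby markers and tune `k`, neither of which the abstract specifies.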
author2 蔡政安
author_facet 蔡政安
Jhih-Wun Zeng
曾志文
author Jhih-Wun Zeng
曾志文
spellingShingle Jhih-Wun Zeng
曾志文
Genotype imputation using LD-based Weighted K Nearest Neighbor
author_sort Jhih-Wun Zeng
title Genotype imputation using LD-based Weighted K Nearest Neighbor
publishDate 2014
url http://ndltd.ncl.edu.tw/handle/11447615957510197320