WinHAP: an efficient haplotype phasing algorithm based on scalable sliding windows.

Haplotype phasing is an essential step in studying the association of genomic polymorphisms with complex genetic diseases and in determining targets for drug design. In recent years, huge amounts of genotype data have been produced by rapidly evolving high-throughput sequencing technolog...


Bibliographic Details
Main Authors: Yun Xu, Wenhua Cheng, Pengyu Nie, Fengfeng Zhou
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2012-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3419172?pdf=render
id doaj-5ceeb18787a54c4c84380d5194ac7b62
record_format Article
spelling doaj-5ceeb18787a54c4c84380d5194ac7b62 | 2020-11-25T01:29:12Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2012-01-01 | 7 | 8 | e43163 | 10.1371/journal.pone.0043163 | WinHAP: an efficient haplotype phasing algorithm based on scalable sliding windows. | Yun Xu | Wenhua Cheng | Pengyu Nie | Fengfeng Zhou
collection DOAJ
language English
format Article
sources DOAJ
author Yun Xu
Wenhua Cheng
Pengyu Nie
Fengfeng Zhou
title WinHAP: an efficient haplotype phasing algorithm based on scalable sliding windows.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2012-01-01
description Haplotype phasing is an essential step in studying the association of genomic polymorphisms with complex genetic diseases and in determining targets for drug design. In recent years, huge amounts of genotype data have been produced by rapidly evolving high-throughput sequencing technologies, and this data volume challenges the community to develop more efficient haplotype phasing algorithms, in terms of both running time and overall accuracy. 2SNP is one of the fastest haplotype phasing algorithms, with error rates comparably low to those of other algorithms. The most time-consuming step of 2SNP is the construction of a maximum spanning tree (MST) among all the heterozygous SNP pairs. We simplified this step by replacing the MST with the initial haplotypes of adjacent heterozygous SNP pairs. The multi-SNP haplotypes were then estimated within a sliding window along the chromosomes. Comparative studies on four genotype datasets of different scales suggest that our algorithm, WinHAP, outperforms 2SNP and most other haplotype phasing algorithms in both running speed and overall accuracy. To facilitate the application of WinHAP to more practical biological datasets, we have released the software for free at: http://staff.ustc.edu.cn/~xuyun/winhap/index.htm.
url http://europepmc.org/articles/PMC3419172?pdf=render