Algorithms for Computational Genetics Epidemiology
The most intriguing problems in genetics epidemiology are to predict genetic disease susceptibility and to associate single nucleotide polymorphisms (SNPs) with diseases. In such studies, it is necessary to resolve the ambiguities in genetic data. The primary obstacle for ambiguity resolution...
Main Author: | He, Jingwu |
---|---|
Format: | Others |
Published: | Digital Archive @ GSU 2006 |
Subjects: | Tagging Phasing Haplotype Genotype SNP Computer Sciences |
Online Access: | http://digitalarchive.gsu.edu/cs_diss/10 http://digitalarchive.gsu.edu/cgi/viewcontent.cgi?article=1009&context=cs_diss |
id |
ndltd-GEORGIA-oai-digitalarchive.gsu.edu-cs_diss-1009 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-GEORGIA-oai-digitalarchive.gsu.edu-cs_diss-1009 2013-04-23T03:18:55Z Algorithms for Computational Genetics Epidemiology He, Jingwu
2006-09-11 text application/pdf http://digitalarchive.gsu.edu/cs_diss/10 http://digitalarchive.gsu.edu/cgi/viewcontent.cgi?article=1009&context=cs_diss Computer Science Dissertations Digital Archive @ GSU Tagging Phasing Haplotype Genotype SNP Computer Sciences |
collection |
NDLTD |
format |
Others |
sources |
NDLTD |
topic |
Tagging Phasing Haplotype Genotype SNP Computer Sciences |
description |
The most intriguing problems in genetics epidemiology are to predict genetic disease susceptibility and to associate single nucleotide polymorphisms (SNPs) with diseases. In such studies, it is necessary to resolve the ambiguities in genetic data. The primary obstacle for ambiguity resolution is that the physical methods for separating two haplotypes from an individual genotype (phasing) are too expensive. Although computational haplotype inference is a well-explored problem, high error rates continue to degrade association accuracy. Secondly, it is essential to use a small subset of informative SNPs (tag SNPs) that accurately represents the rest of the SNPs (tagging). Tagging can achieve budget savings by genotyping only a limited number of SNPs and computationally inferring all other SNPs. Recent successes in high-throughput genotyping technologies have drastically increased the length of available SNP sequences. This elevates the importance of informative SNP selection for compacting huge genetic data sets to make fine genotype analysis feasible. Finally, even if complete and accurate data are available, it is unclear whether common statistical methods can determine the susceptibility to complex diseases. The dissertation explores the above computational problems with a variety of methods, including linear algebra, graph theory, linear programming, and greedy methods. The contributions include (1) a significant speed-up of popular phasing tools without compromising their quality, (2) state-of-the-art tagging tools applied to disease association, and (3) a graph-based method for disease tagging and predicting disease susceptibility. |
author |
He, Jingwu |
title |
Algorithms for Computational Genetics Epidemiology |
publisher |
Digital Archive @ GSU |
publishDate |
2006 |
url |
http://digitalarchive.gsu.edu/cs_diss/10 http://digitalarchive.gsu.edu/cgi/viewcontent.cgi?article=1009&context=cs_diss |