A Study on Some Optimization Problems Related to SNPs and Haplotypes

Doctoral === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 94 === This dissertation studies several optimization problems related to SNPs and haplotypes. Most problems studied in this dissertation are shown to be NP-hard. To efficiently solve these problems, we design and implement a series of approximation algorithms. Our the...

Bibliographic Details
Main Authors: Yao-Ting Huang, 黃耀廷
Other Authors: Kun-Mao Chao
Format: Others
Language: en_US
Published: 2006
Online Access: http://ndltd.ncl.edu.tw/handle/39365715893271752449
id ndltd-TW-094NTU05392080
record_format oai_dc
spelling ndltd-TW-094NTU05392080 2015-12-16T04:38:36Z http://ndltd.ncl.edu.tw/handle/39365715893271752449 A Study on Some Optimization Problems Related to SNPs and Haplotypes 單核苷酸多型性與單體型相關的最佳化問題之研究 Yao-Ting Huang 黃耀廷 Doctoral National Taiwan University Graduate Institute of Computer Science and Information Engineering 94 Kun-Mao Chao 趙坤茂 2006 學位論文 ; thesis 90 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Doctoral === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 94 === This dissertation studies several optimization problems related to SNPs and haplotypes. Most of these problems are shown to be NP-hard. To solve them efficiently, we design and implement a series of approximation algorithms. Our theoretical analysis and experimental results indicate that these algorithms are efficient and that the solutions they find are close to optimal.
In Part I of this dissertation, we show that there exists a set of SNPs, called robust tag SNPs, that can tolerate missing SNPs in genotyping. The problem of finding a minimum set of robust tag SNPs is shown to be NP-hard. We give two greedy algorithms and one linear programming relaxation algorithm for this problem. Our theoretical analysis and experimental results show that these algorithms run fast and find near-optimal solutions.
In Part II of this dissertation, we study the problem of selecting a minimum set of tag SNPs by multimarker haplotypes. This problem is divided into three subproblems, two of which are shown to be NP-hard. Several exact and approximation algorithms are proposed to solve these subproblems. The experimental results indicate that the program that integrates these algorithms finds a smaller set of tag SNPs and runs much faster than existing methods.
In Part III of this dissertation, we study the problem of haplotype inference by maximum parsimony. We formulate this problem as an integer quadratic programming problem and present an approximation algorithm based on iterative semidefinite programming relaxation. Our theoretical analysis and experimental results show that the solution found is close to optimal and that its accuracy improves on existing methods.
In Part IV of this dissertation, we study the problem of selecting discriminative SNPs for classifying cases and controls in genome-wide association studies. We describe an efficient algorithm for identifying discriminative SNPs and compare it with several existing methods using a variety of classifiers. The experimental results indicate that our method consistently achieves higher accuracy than the other methods when sufficiently many discriminative SNPs are selected.
author2 Kun-Mao Chao
author_facet Kun-Mao Chao
Yao-Ting Huang
黃耀廷
author Yao-Ting Huang
黃耀廷
spellingShingle Yao-Ting Huang
黃耀廷
A Study on Some Optimization Problems Related to SNPs and Haplotypes
author_sort Yao-Ting Huang
title A Study on Some Optimization Problems Related to SNPs and Haplotypes
title_short A Study on Some Optimization Problems Related to SNPs and Haplotypes
title_full A Study on Some Optimization Problems Related to SNPs and Haplotypes
title_fullStr A Study on Some Optimization Problems Related to SNPs and Haplotypes
title_full_unstemmed A Study on Some Optimization Problems Related to SNPs and Haplotypes
title_sort study on some optimization problems related to snps and haplotypes
publishDate 2006
url http://ndltd.ncl.edu.tw/handle/39365715893271752449
work_keys_str_mv AT yaotinghuang astudyonsomeoptimizationproblemsrelatedtosnpsandhaplotypes
AT huángyàotíng astudyonsomeoptimizationproblemsrelatedtosnpsandhaplotypes
AT yaotinghuang dānhégānsuānduōxíngxìngyǔdāntǐxíngxiāngguāndezuìjiāhuàwèntízhīyánjiū
AT huángyàotíng dānhégānsuānduōxíngxìngyǔdāntǐxíngxiāngguāndezuìjiāhuàwèntízhīyánjiū
AT yaotinghuang studyonsomeoptimizationproblemsrelatedtosnpsandhaplotypes
AT huángyàotíng studyonsomeoptimizationproblemsrelatedtosnpsandhaplotypes
_version_ 1718150407571636224