Taking advantage of phylogenetic trees in comparative genomics

Phylogenomics can be regarded as evolution and genomics in co-operation. Various kinds of evolutionary studies, gene family analysis among them, demand access to genome-scale datasets. But it is also clear that many genomics studies, such as assignment of gene function, are much improved by evolutio...

Bibliographic Details
Main Author: Åkerborg, Örjan
Format: Doctoral Thesis
Language: English
Published: KTH, Beräkningsbiologi, CB 2008
Subjects: Computer Science, Bioinformatics
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-4757
http://nbn-resolving.de/urn:isbn:978-91-7178-987-7
id ndltd-UPSALLA1-oai-DiVA.org-kth-4757
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-kth-4757
2013-01-08T13:06:38Z
Taking advantage of phylogenetic trees in comparative genomics
eng
Åkerborg, Örjan
KTH, Beräkningsbiologi, CB
Stockholm : KTH
2008
Computer Science
Bioinformatics
Bioinformatik
QC 20100923
Doctoral thesis, comprehensive summary
info:eu-repo/semantics/doctoralThesis
text
http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-4757
urn:isbn:978-91-7178-987-7
Trita-CSC-A, 1653-5723 ; 2008:09
application/pdf
info:eu-repo/semantics/openAccess
collection NDLTD
sources NDLTD
topic Computer Science
Bioinformatics
Bioinformatik
description Phylogenomics can be regarded as evolution and genomics in co-operation. Various kinds of evolutionary studies, gene family analysis among them, demand access to genome-scale datasets. It is equally clear that many genomics studies, such as assignment of gene function, are much improved by evolutionary analysis. The work leading to this thesis is a contribution to the phylogenomics field.
We have used phylogenetic relationships between species in genome-scale searches for two intriguing genomic features, namely functional pseudogenes and A-to-I RNA editing. In the first case we used pairwise species comparisons, specifically human-mouse and human-chimpanzee, to infer the existence of functional mammalian pseudogenes. In the second case we profited from the rapid growth in recent years of the number of sequenced genomes, and used 17-species multiple sequence alignments. In both studies we used non-genomic data, among them gene expression data and synteny relations, to verify predictions. In the A-to-I editing project we used 454 sequencing for experimental verification.
We have further contributed a maximum a posteriori (MAP) method for fast and accurate dating analysis of speciations and other evolutionary events. This work follows the recent trend of abandoning the strict molecular clock when performing phylogenetic inference. We discretised the time interval from the leaves to the root of the tree, and used a dynamic programming (DP) algorithm to optimally factorise branch lengths into substitution rates and divergence times. We analysed two biological datasets and compared our results with recent MCMC-based methodologies. The dating point estimates that our method delivers were found to be of high quality, while the gain in speed was dramatic.
Finally we applied the DP strategy in a new setting. This time we used a grid laid out on a species tree instead of on an interval. Together with the speciation times, the discretisation gives a common timeframe for a gene tree and the corresponding species tree. This is the key to integrating the sequence evolution process and the gene evolution process. Out of several potential application areas we chose gene tree reconstruction. We performed a genome-wide analysis of yeast gene families and found that our methodology performs very well.
QC 20100923
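The dating idea in the description (discretise the leaf-to-root time interval, then use dynamic programming to split each branch length into a rate and a time span) can be illustrated with a minimal Python sketch. This is not the thesis's implementation. Everything specific here is an assumption made for illustration: the function names `log_rate_density` and `date_tree`, the independent log-normal prior on rates, the fixed root age, the uniform grid, and treating branch lengths as known exactly.

```python
import math

def log_rate_density(rate, mu=0.0, sigma=0.5):
    """Log-density of an ASSUMED log-normal prior on the substitution rate."""
    z = (math.log(rate) - mu) / sigma
    return -0.5 * z * z - math.log(rate * sigma * math.sqrt(2.0 * math.pi))

def date_tree(children, branch_len, root, root_age, n_steps):
    """Hypothetical helper: MAP node ages on a uniform grid over [0, root_age].

    children:   dict mapping a node to its list of children (leaves may be absent)
    branch_len: dict mapping a non-root node to the (positive) length of the branch above it
    Returns a dict node -> age; leaves get age 0 and the root gets root_age.
    """
    step = root_age / n_steps
    best = {}  # best[v][i]: best log-score of the subtree under v if v is at grid index i
    arg = {}   # arg[(c, i)]: best grid index for child c when its parent is at index i

    def child_score(c, i, j):
        dt = (i - j) * step
        # Density of the length l = rate * dt, including the 1/dt change-of-variable term.
        return best[c][j] + log_rate_density(branch_len[c] / dt) - math.log(dt)

    def solve(v):
        kids = children.get(v, [])
        if not kids:
            best[v] = [0.0] + [-math.inf] * n_steps  # leaves are pinned to age 0
            return
        for c in kids:
            solve(c)
        best[v] = [-math.inf] * (n_steps + 1)  # internal nodes cannot have age 0
        for i in range(1, n_steps + 1):
            total = 0.0
            for c in kids:
                # Each child must be strictly younger than its parent (j < i).
                j_best = max(range(i), key=lambda j: child_score(c, i, j))
                arg[(c, i)] = j_best
                total += child_score(c, i, j_best)
            best[v][i] = total

    solve(root)

    ages = {}

    def backtrack(v, i):
        ages[v] = i * step
        for c in children.get(v, []):
            backtrack(c, arg[(c, i)])

    backtrack(root, n_steps)  # the root is fixed at the top of the grid
    return ages

# Toy tree ((A,B)X,C)R with made-up branch lengths.
toy_children = {"R": ["X", "C"], "X": ["A", "B"]}
toy_lengths = {"A": 0.1, "B": 0.1, "X": 0.1, "C": 0.2}
toy_ages = date_tree(toy_children, toy_lengths, "R", root_age=2.0, n_steps=20)
```

Because every node's table depends only on its children's tables, the work grows with the number of nodes times the square of the number of grid points, which gives some intuition for why a grid-based DP can be much faster than sampling-based dating.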