Inferring angiosperm phylogeny from EST data with widespread gene duplication

BACKGROUND: Most studies inferring species phylogenies use sequences from single copy genes or sets of orthologs culled from gene families. For taxa such as plants, with very high levels of gene duplication in their nuclear genomes, this has limited the exploitation of nuclear sequences for phylogene...


Bibliographic Details
Main Authors: Sanderson, Michael; McMahon, Michelle
Other Authors: Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA
Language: en
Published: BioMed Central 2007
Online Access: http://hdl.handle.net/10150/610375
http://arizona.openrepository.com/arizona/handle/10150/610375
id ndltd-arizona.edu-oai-arizona.openrepository.com-10150-610375
record_format oai_dc
spelling ndltd-arizona.edu-oai-arizona.openrepository.com-10150-610375 2016-05-22T03:02:05Z Inferring angiosperm phylogeny from EST data with widespread gene duplication Sanderson, Michael McMahon, Michelle Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA Department of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
2007 Article BMC Evolutionary Biology 2007, 7(Suppl 1):S3 doi:10.1186/1471-2148-7-S1-S3 10.1186/1471-2148-7-S1-S3 http://hdl.handle.net/10150/610375 http://arizona.openrepository.com/arizona/handle/10150/610375 1471-2148 BMC Evolutionary Biology en http://www.biomedcentral.com/1471-2148/7/S1/S3 © 2007 Sanderson and McMahon; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0) BioMed Central
collection NDLTD
language en
sources NDLTD
description BACKGROUND: Most studies inferring species phylogenies use sequences from single copy genes or sets of orthologs culled from gene families. For taxa such as plants, with very high levels of gene duplication in their nuclear genomes, this has limited the exploitation of nuclear sequences for phylogenetic studies, such as those available in large EST libraries. One rarely used method of inference, gene tree parsimony, can infer species trees from gene families undergoing duplication and loss, but its performance has not been evaluated at a phylogenomic scale for EST data in plants. RESULTS: A gene tree parsimony analysis based on EST data was undertaken for six angiosperm model species and Pinus, an outgroup. Although a large fraction of the tentative consensus sequences obtained from the TIGR database of ESTs was assembled into homologous clusters too small to be phylogenetically informative, some 557 clusters contained promising levels of information. Based on maximum likelihood estimates of the gene trees obtained from these clusters, gene tree parsimony correctly inferred the accepted species tree with strong statistical support. A slight variant of this species tree was obtained when maximum parsimony was used to infer the individual gene trees instead. CONCLUSION: Despite the complexity of the EST data and the relatively small fraction eventually used in inferring a species tree, the gene tree parsimony method performed well in the face of very high apparent rates of duplication.
author2 Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA
author_facet Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA
Sanderson, Michael
McMahon, Michelle
author Sanderson, Michael
McMahon, Michelle
spellingShingle Sanderson, Michael
McMahon, Michelle
Inferring angiosperm phylogeny from EST data with widespread gene duplication
author_sort Sanderson, Michael
title Inferring angiosperm phylogeny from EST data with widespread gene duplication
title_short Inferring angiosperm phylogeny from EST data with widespread gene duplication
title_full Inferring angiosperm phylogeny from EST data with widespread gene duplication
title_fullStr Inferring angiosperm phylogeny from EST data with widespread gene duplication
title_full_unstemmed Inferring angiosperm phylogeny from EST data with widespread gene duplication
title_sort inferring angiosperm phylogeny from est data with widespread gene duplication
publisher BioMed Central
publishDate 2007
url http://hdl.handle.net/10150/610375
http://arizona.openrepository.com/arizona/handle/10150/610375
work_keys_str_mv AT sandersonmichael inferringangiospermphylogenyfromestdatawithwidespreadgeneduplication
AT mcmahonmichelle inferringangiospermphylogenyfromestdatawithwidespreadgeneduplication
_version_ 1718274657513111552
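
Illustrative note on gene tree parsimony: the abstract names the method but does not spell out its scoring. The sketch below shows one common formulation, under stated assumptions rather than as the authors' implementation. Each gene tree node is mapped to the smallest species-tree clade containing its species, a node counts as a duplication when it maps to the same clade as one of its children, and the candidate species tree with the fewest total duplications is preferred. Losses are not counted here. The taxa A, B, C, the nested-tuple tree encoding, and all function names are hypothetical.

```python
def clades(tree):
    """Return (leaf set of tree, list of every clade's leaf set).

    Trees are binary nested tuples; leaves are species names (strings).
    """
    if isinstance(tree, str):
        leaf = frozenset([tree])
        return leaf, [leaf]
    left_set, left_clades = clades(tree[0])
    right_set, right_clades = clades(tree[1])
    here = left_set | right_set
    return here, left_clades + right_clades + [here]


def smallest_clade(species_clades, species_set):
    """Smallest species-tree clade containing species_set (the LCA mapping)."""
    return min((c for c in species_clades if species_set <= c), key=len)


def count_duplications(gene_tree, species_tree):
    """Count gene tree nodes that map to the same clade as one of their children."""
    _, species_clades = clades(species_tree)
    duplications = 0

    def visit(node):
        nonlocal duplications
        if isinstance(node, str):
            leaf = frozenset([node])
            return leaf, smallest_clade(species_clades, leaf)
        left_set, left_map = visit(node[0])
        right_set, right_map = visit(node[1])
        here = left_set | right_set
        here_map = smallest_clade(species_clades, here)
        if here_map == left_map or here_map == right_map:
            duplications += 1
        return here, here_map

    visit(gene_tree)
    return duplications


def best_species_tree(gene_trees, candidates):
    """Pick the candidate species tree with the fewest implied duplications."""
    return min(
        candidates,
        key=lambda sp: sum(count_duplications(g, sp) for g in gene_trees),
    )


# Hypothetical toy data: gene tree leaves are labelled by species of origin.
gene_trees = [(("A", "B"), "C"), (("A", "B"), "C"), (("A", "C"), "B")]
candidates = [(("A", "B"), "C"), (("A", "C"), "B"), (("B", "C"), "A")]
print(best_species_tree(gene_trees, candidates))
```

In this toy case two of the three gene trees agree with ((A, B), C), so that candidate implies the fewest duplications. A real analysis would search over species trees rather than enumerate a short candidate list.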