PhyloFisher: A phylogenomic package for resolving eukaryotic relationships.

Bibliographic Details
Main Authors: Alexander K Tice, David Žihala, Tomáš Pánek, Robert E Jones, Eric D Salomaki, Serafim Nenarokov, Fabien Burki, Marek Eliáš, Laura Eme, Andrew J Roger, Antonis Rokas, Xing-Xing Shen, Jürgen F H Strassert, Martin Kolísko, Matthew W Brown
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2021-08-01
Series: PLoS Biology
Online Access: https://doi.org/10.1371/journal.pbio.3001365
ISSN: 1544-9173, 1545-7885
Description: Phylogenomic analyses of hundreds of protein-coding genes aimed at resolving phylogenetic relationships are now common practice. However, no software currently exists that includes tools for dataset construction and subsequent analysis with diverse validation strategies to assess robustness. Furthermore, there are no publicly available high-quality curated databases designed to assess deep (>100 million years) relationships in the tree of eukaryotes. To address these issues, we developed an easy-to-use software package, PhyloFisher (https://github.com/TheBrownLab/PhyloFisher), written in Python 3. PhyloFisher includes a manually curated database of 240 protein-coding genes from 304 eukaryotic taxa covering known eukaryotic diversity, a novel tool for ortholog selection, and utilities that will perform diverse analyses required by state-of-the-art phylogenomic investigations. Through phylogenetic reconstructions of the tree of eukaryotes and of the Saccharomycetaceae clade of budding yeasts, we demonstrate the utility of the PhyloFisher workflow and the provided starting database to address phylogenetic questions across a large range of evolutionary time points for diverse groups of organisms. We also demonstrate that undetected paralogy can remain in phylogenomic "single-copy orthogroup" datasets constructed using widely accepted methods such as all vs. all BLAST searches followed by Markov Cluster Algorithm (MCL) clustering and application of automated tree pruning algorithms. Finally, we show how the PhyloFisher workflow helps detect inadvertent paralog inclusions, allowing the user to make more informed decisions regarding orthology assignments, leading to a more accurate final dataset.