Testing the agreement of trees with internal labels

Background: A semi-labeled tree is a tree where all leaves as well as, possibly, some internal nodes are labeled with taxa. Semi-labeled trees encompass ordinary phylogenetic trees and taxonomies. Suppose we are given a collection P = {T_1, T_2, …, T_k} of semi-labeled trees, called input trees, over...

Bibliographic Details
Main Authors: Fernández-Baca, D. (Author), Liu, L. (Author)
Format: Article
Language: English
Published: BioMed Central Ltd 2021
Subjects: Agreement; Algorithm; Phylogenetic tree; Taxonomy
Online Access: https://doi.org/10.1186/s13015-021-00201-9
LEADER 02163nam a2200193Ia 4500
001 10.1186-s13015-021-00201-9
008 220427s2021 CNT 000 0 und d
020 |a 17487188 (ISSN) 
245 1 0 |a Testing the agreement of trees with internal labels 
260 0 |b BioMed Central Ltd  |c 2021 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1186/s13015-021-00201-9 
520 3 |a Background: A semi-labeled tree is a tree where all leaves as well as, possibly, some internal nodes are labeled with taxa. Semi-labeled trees encompass ordinary phylogenetic trees and taxonomies. Suppose we are given a collection P = {T_1, T_2, …, T_k} of semi-labeled trees, called input trees, over partially overlapping sets of taxa. The agreement problem asks whether there exists a tree T, called an agreement tree, whose taxon set is the union of the taxon sets of the input trees such that the restriction of T to the taxon set of T_i is isomorphic to T_i, for each i ∈ {1, 2, …, k}. The agreement problem is a special case of the supertree problem, the problem of synthesizing a collection of phylogenetic trees with partially overlapping taxon sets into a single supertree that represents the information in the input trees. An obstacle to building large phylogenetic supertrees is the limited amount of taxonomic overlap among the phylogenetic studies from which the input trees are obtained. Incorporating taxonomies into supertree analyses can alleviate this issue. Results: We give an O(nk(∑_{i∈[k]} d_i + log^2(nk))) algorithm for the agreement problem, where n is the total number of distinct taxa in P, k is the number of trees in P, and d_i is the maximum number of children of a node in T_i. Conclusion: Our algorithm can aid in integrating taxonomies into supertree analyses. Our computational experience with the algorithm suggests that its performance in practice is much better than its worst-case bound indicates. © 2021, The Author(s). 
650 0 4 |a Agreement 
650 0 4 |a Algorithm 
650 0 4 |a Phylogenetic tree 
650 0 4 |a Taxonomy 
700 1 |a Fernández-Baca, D.  |e author 
700 1 |a Liu, L.  |e author 
773 |t Algorithms for Molecular Biology
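
Illustrative note: the agreement condition stated in the abstract can be checked directly for a given candidate tree. The sketch below is not the authors' algorithm, which decides whether any agreement tree exists; it only verifies one candidate by brute force. It assumes rooted trees, at most one label per node, and one common reading of "restriction" (drop nodes with no retained labeled descendant, unlabel nodes outside the taxon set, and suppress unlabeled nodes left with a single child). The names Node, restrict, canonical, taxa, and is_agreement_tree are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    """A node of a rooted semi-labeled tree; label is None when unlabeled."""
    label: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def taxa(node: Node) -> Set[str]:
    """Collect every label that appears in the subtree rooted at node."""
    found = {node.label} if node.label is not None else set()
    for child in node.children:
        found |= taxa(child)
    return found


def restrict(node: Node, keep: Set[str]) -> Optional[Node]:
    """Restrict the subtree at node to the labels in keep (assumed definition).

    Returns None when the subtree contains no label from keep.
    """
    kept = [r for r in (restrict(c, keep) for c in node.children) if r is not None]
    label = node.label if node.label in keep else None
    if label is None:
        if not kept:
            return None      # nothing of interest below this node
        if len(kept) == 1:
            return kept[0]   # suppress an unlabeled node with a single child
    return Node(label, kept)


def canonical(node: Node):
    """Order-independent encoding; equal encodings mean isomorphic labeled trees."""
    return (node.label or "", tuple(sorted(canonical(c) for c in node.children)))


def is_agreement_tree(candidate: Node, input_trees: List[Node]) -> bool:
    """Check that candidate restricted to each input tree's taxa matches that tree."""
    if taxa(candidate) != set().union(*(taxa(t) for t in input_trees)):
        return False
    for tree in input_trees:
        restricted = restrict(candidate, taxa(tree))
        if restricted is None or canonical(restricted) != canonical(tree):
            return False
    return True


# Hypothetical inputs: a phylogeny ((a,b),c) and a taxonomy X with children b, d.
t1 = Node(None, [Node(None, [Node("a"), Node("b")]), Node("c")])
t2 = Node("X", [Node("b"), Node("d")])
# Hypothetical candidate: X groups a, b, d, and c is its sibling.
candidate = Node(None, [Node("X", [Node("a"), Node("b"), Node("d")]), Node("c")])
result = is_agreement_tree(candidate, [t1, t2])
```

Comparing canonical encodings keeps the isomorphism test simple, at the cost of re-traversing the candidate once per input tree; the bound quoted in the abstract refers to the authors' method, not to this check.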