Maximum Parsimony on Phylogenetic networks


Bibliographic Details
Main Authors: Kannan Lavanya, Wheeler Ward C
Format: Article
Language: English
Published: BMC 2012-05-01
Series: Algorithms for Molecular Biology
Online Access: http://www.almob.org/content/7/1/9
id doaj-b4aad06a0011443bb2ff35b5c624f864
record_format Article
spelling doaj-b4aad06a0011443bb2ff35b5c624f864 2020-11-24T22:06:38Z eng BMC Algorithms for Molecular Biology 1748-7188 2012-05-01 7 1 9 10.1186/1748-7188-7-9 Maximum Parsimony on Phylogenetic networks Kannan Lavanya Wheeler Ward C http://www.almob.org/content/7/1/9
collection DOAJ
language English
format Article
sources DOAJ
author Kannan Lavanya
Wheeler Ward C
spellingShingle Kannan Lavanya
Wheeler Ward C
Maximum Parsimony on Phylogenetic networks
Algorithms for Molecular Biology
author_facet Kannan Lavanya
Wheeler Ward C
author_sort Kannan Lavanya
title Maximum Parsimony on Phylogenetic networks
title_short Maximum Parsimony on Phylogenetic networks
title_full Maximum Parsimony on Phylogenetic networks
title_fullStr Maximum Parsimony on Phylogenetic networks
title_full_unstemmed Maximum Parsimony on Phylogenetic networks
title_sort maximum parsimony on phylogenetic networks
publisher BMC
series Algorithms for Molecular Biology
issn 1748-7188
publishDate 2012-05-01
description Abstract
Background: Phylogenetic networks are generalizations of phylogenetic trees that are used to model evolutionary events in various contexts. Several different methods and criteria have been introduced for reconstructing phylogenetic trees. Maximum Parsimony is a character-based approach that infers a phylogenetic tree by minimizing the total number of evolutionary steps required to explain a given set of data assigned to the leaves. Exact solutions for optimizing parsimony scores on phylogenetic trees have been introduced in the past.
Results: In this paper, we define the parsimony score on networks as the sum of the substitution costs along all the edges of the network, and show that certain well-known algorithms that calculate the optimum parsimony score on trees, such as the Sankoff and Fitch algorithms, extend naturally to networks, barring conflicting assignments at the reticulate vertices. We provide heuristics for finding the optimum parsimony scores on networks. Our algorithms can be applied to any cost matrix, including one with unequal substitution costs for transforming between different characters along different edges of the network. We tested the heuristics on experimental data with 10 or fewer leaves and at most 2 reticulations and found that, for almost all networks, the bounds returned by the heuristics matched the exhaustively determined optimum parsimony scores.
Conclusion: The parsimony score we define here does not directly reflect the cost of the best tree in the network that displays the evolution of the character. However, when searching for the most parsimonious network that describes a collection of characters, it becomes necessary to add cost considerations that prefer simpler structures, such as trees over networks. The parsimony score on a network that we describe here takes into account the substitution costs along the additional edges incident on each reticulate vertex, in addition to the substitution costs along the other edges, which are common to all the branching patterns introduced by the reticulate vertices. Thus the score contains a built-in cost for the number of reticulate vertices in the network, and would provide a criterion that is comparable among all networks. Although the problem of finding the parsimony score on a network is believed to be computationally hard to solve, heuristics such as the ones described here would be beneficial in our efforts to find a most parsimonious network.
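The score defined in the abstract (the sum of substitution costs along all edges of the network, including every edge entering a reticulate vertex) can be illustrated with a minimal Python sketch. This is not the authors' heuristic and not the extended Sankoff or Fitch procedure; it is a brute-force search over all internal-vertex assignments for a single character, comparable in spirit to the exhaustive baseline the abstract mentions. The function name `network_parsimony_score`, the toy network, and the `unit_cost` function are all hypothetical and chosen only for illustration.

```python
from itertools import product


def network_parsimony_score(edges, leaf_states, alphabet, cost):
    """Minimum, over all internal assignments, of the summed edge costs.

    edges       -- list of (parent, child) pairs; a reticulate vertex
                   simply appears as the child of two different parents
    leaf_states -- dict mapping each leaf name to its observed state
    alphabet    -- iterable of possible character states
    cost        -- function cost(parent_state, child_state) -> number
    """
    alphabet = list(alphabet)
    nodes = {vertex for edge in edges for vertex in edge}
    internal = sorted(nodes - set(leaf_states))
    best = float("inf")
    # Exhaustive search: exponential in the number of internal vertices,
    # so this is only feasible for very small networks.
    for states in product(alphabet, repeat=len(internal)):
        assignment = dict(leaf_states)
        assignment.update(zip(internal, states))
        total = sum(cost(assignment[u], assignment[v]) for u, v in edges)
        best = min(best, total)
    return best


def unit_cost(a, b):
    """Hypothetical cost matrix: every substitution costs 1."""
    return 0 if a == b else 1


# Hypothetical toy network: root r, tree vertices u and v, and one
# reticulate vertex h whose two parents are u and v.
edges = [
    ("r", "u"), ("r", "v"),
    ("u", "a"), ("u", "h"),
    ("v", "h"), ("v", "c"),
    ("h", "b"),
]
leaf_states = {"a": "A", "b": "C", "c": "C"}

score = network_parsimony_score(edges, leaf_states, "AC", unit_cost)
print(score)
```

Because both edges into `h` contribute to the sum, any assignment in which the two parents of a reticulate vertex disagree pays for at least one extra substitution, which is the built-in reticulation cost the abstract refers to. Passing a different `cost` function is how unequal substitution costs would be modeled in this sketch.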
url http://www.almob.org/content/7/1/9
work_keys_str_mv AT kannanlavanya maximumparsimonyonphylogeneticnetworks
AT wheelerwardc maximumparsimonyonphylogeneticnetworks
_version_ 1725822748810280960