Reconstructing phylogenies from noisy quartets in polynomial time with a high success probability
Abstract. Background: In recent years, quartet-based phylogeny reconstruction methods have received considerable attention in the computational biology community. Traditionally, the accuracy of a phylogeny reconstruction method is measured by simulations...
Main Authors: | Wu Gang, Kao Ming-Yang, Lin Guohui, You Jia-Huai |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2008-01-01 |
Series: | Algorithms for Molecular Biology |
Online Access: | http://www.almob.org/content/3/1/1 |
id |
doaj-7b513dc298fa49198d456b447cc08635 |
---|---|
record_format |
Article |
spelling |
doaj-7b513dc298fa49198d456b447cc08635 2020-11-24T23:43:32Z eng BMC Algorithms for Molecular Biology 1748-7188 2008-01-01 3 1 1 10.1186/1748-7188-3-1 Reconstructing phylogenies from noisy quartets in polynomial time with a high success probability Wu Gang Kao Ming-Yang Lin Guohui You Jia-Huai Abstract. Background: In recent years, quartet-based phylogeny reconstruction methods have received considerable attention in the computational biology community. Traditionally, the accuracy of a phylogeny reconstruction method is measured by simulations on synthetic datasets with known "true" phylogenies, while little theoretical analysis has been done. In this paper, we present a new model-based approach to measuring the accuracy of a quartet-based phylogeny reconstruction method. Under this model, we propose three efficient algorithms to reconstruct the "true" phylogeny with a high success probability. Results: The first algorithm can reconstruct the "true" phylogeny from an error-free input quartet topology set in $O(n^2)$ time by querying at most $(n-4)\log(n-1)$ quartet topologies, where $n$ is the number of taxa. When the input quartet topology set contains errors, the second algorithm can reconstruct the "true" phylogeny with probability approximately $1-p$ in $O(n^4 \log n)$ time, where $p$ is the probability that a quartet topology is erroneous. The third algorithm runs in $O(n^5)$ time and improves this probability to approximately $\frac{1}{1 + q^2 + \frac{1}{2}q^4 + \frac{1}{16}q^5}$, where $q = \frac{p}{1-p}$; this is at least 0.984 when $p < 0.05$. Conclusion: The three proposed algorithms are mathematically guaranteed to reconstruct the "true" phylogeny with a high success probability. The experimental results showed that the third algorithm produced phylogenies with a higher probability than its aforementioned theoretical lower bound and outperformed some existing phylogeny reconstruction methods in both speed and accuracy. http://www.almob.org/content/3/1/1 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Wu Gang Kao Ming-Yang Lin Guohui You Jia-Huai |
spellingShingle |
Wu Gang Kao Ming-Yang Lin Guohui You Jia-Huai Reconstructing phylogenies from noisy quartets in polynomial time with a high success probability Algorithms for Molecular Biology |
author_facet |
Wu Gang Kao Ming-Yang Lin Guohui You Jia-Huai |
author_sort |
Wu Gang |
title |
Reconstructing phylogenies from noisy quartets in polynomial time with a high success probability |
title_short |
Reconstructing phylogenies from noisy quartets in polynomial time with a high success probability |
title_full |
Reconstructing phylogenies from noisy quartets in polynomial time with a high success probability |
title_fullStr |
Reconstructing phylogenies from noisy quartets in polynomial time with a high success probability |
title_full_unstemmed |
Reconstructing phylogenies from noisy quartets in polynomial time with a high success probability |
title_sort |
reconstructing phylogenies from noisy quartets in polynomial time with a high success probability |
publisher |
BMC |
series |
Algorithms for Molecular Biology |
issn |
1748-7188 |
publishDate |
2008-01-01 |
description |
Abstract. Background: In recent years, quartet-based phylogeny reconstruction methods have received considerable attention in the computational biology community. Traditionally, the accuracy of a phylogeny reconstruction method is measured by simulations on synthetic datasets with known "true" phylogenies, while little theoretical analysis has been done. In this paper, we present a new model-based approach to measuring the accuracy of a quartet-based phylogeny reconstruction method. Under this model, we propose three efficient algorithms to reconstruct the "true" phylogeny with a high success probability. Results: The first algorithm can reconstruct the "true" phylogeny from an error-free input quartet topology set in $O(n^2)$ time by querying at most $(n-4)\log(n-1)$ quartet topologies, where $n$ is the number of taxa. When the input quartet topology set contains errors, the second algorithm can reconstruct the "true" phylogeny with probability approximately $1-p$ in $O(n^4 \log n)$ time, where $p$ is the probability that a quartet topology is erroneous. The third algorithm runs in $O(n^5)$ time and improves this probability to approximately $\frac{1}{1 + q^2 + \frac{1}{2}q^4 + \frac{1}{16}q^5}$, where $q = \frac{p}{1-p}$; this is at least 0.984 when $p < 0.05$. Conclusion: The three proposed algorithms are mathematically guaranteed to reconstruct the "true" phylogeny with a high success probability. The experimental results showed that the third algorithm produced phylogenies with a higher probability than its aforementioned theoretical lower bound and outperformed some existing phylogeny reconstruction methods in both speed and accuracy. |
url |
http://www.almob.org/content/3/1/1 |
work_keys_str_mv |
AT wugang reconstructingphylogeniesfromnoisyquartetsinpolynomialtimewithahighsuccessprobability AT kaomingyang reconstructingphylogeniesfromnoisyquartetsinpolynomialtimewithahighsuccessprobability AT linguohui reconstructingphylogeniesfromnoisyquartetsinpolynomialtimewithahighsuccessprobability AT youjiahuai reconstructingphylogeniesfromnoisyquartetsinpolynomialtimewithahighsuccessprobability |
_version_ |
1725501180133507072 |
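
The approximate success probabilities quoted in the description can be evaluated numerically. The following is a minimal sketch, assuming only the two closed-form expressions given in the abstract (approximately 1 - p for the second algorithm, and approximately 1 / (1 + q^2 + q^4/2 + q^5/16) with q = p / (1 - p) for the third). The function names are hypothetical and chosen for illustration; the code evaluates the formulas and does not implement the reconstruction algorithms themselves.

```python
def odds_ratio(p: float) -> float:
    """Return q = p / (1 - p), as defined in the abstract.

    p is the probability that a single quartet topology is erroneous.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in the interval [0, 1)")
    return p / (1.0 - p)


def second_algorithm_bound(p: float) -> float:
    """Approximate success probability quoted for the second algorithm: 1 - p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in the interval [0, 1)")
    return 1.0 - p


def third_algorithm_bound(p: float) -> float:
    """Approximate success probability quoted for the third algorithm.

    Evaluates 1 / (1 + q^2 + (1/2) q^4 + (1/16) q^5) with q = p / (1 - p).
    """
    q = odds_ratio(p)
    return 1.0 / (1.0 + q**2 + 0.5 * q**4 + q**5 / 16.0)


if __name__ == "__main__":
    # Illustrative error rates (hypothetical values, not taken from the paper).
    for p in (0.01, 0.05, 0.10):
        print(
            f"p={p:.2f}  second~{second_algorithm_bound(p):.4f}  "
            f"third~{third_algorithm_bound(p):.4f}"
        )
```

Evaluated at p = 0.05, the third expression stays above the 0.984 figure stated in the abstract. Both values are described in the source as approximations, so they are best read as indicative rather than exact guarantees.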