Choice of assembly software has a critical impact on virome characterisation

Abstract. Background: The viral component of microbial communities plays a vital role in driving bacterial diversity, facilitating nutrient turnover and shaping community composition. Despite their importance, the vast majority of viral sequences are poorly annotated and share little or no homology to...

Bibliographic Details
Main Authors: Thomas D. S. Sutton, Adam G. Clooney, Feargal J. Ryan, R. Paul Ross, Colin Hill
Format: Article
Language: English
Published: BMC 2019-01-01
Series: Microbiome
Subjects: Virome; Viral; Assembly; Metagenome; Benchmark; Comparison
Online Access: http://link.springer.com/article/10.1186/s40168-019-0626-5
id doaj-7011bb25cc5a4d60ab9e589c17a861b0
record_format Article
spelling doaj-7011bb25cc5a4d60ab9e589c17a861b0 | 2020-11-25T01:23:00Z | eng | BMC | Microbiome | 2049-2618 | 2019-01-01 | 7 | 1 | 1 | 15 | 10.1186/s40168-019-0626-5 | Choice of assembly software has a critical impact on virome characterisation | Thomas D. S. Sutton 0 | Adam G. Clooney 1 | Feargal J. Ryan 2 | R. Paul Ross 3 | Colin Hill 4 | APC Microbiome Ireland | APC Microbiome Ireland | APC Microbiome Ireland | APC Microbiome Ireland | APC Microbiome Ireland | Abstract. Background: The viral component of microbial communities plays a vital role in driving bacterial diversity, facilitating nutrient turnover and shaping community composition. Despite their importance, the vast majority of viral sequences are poorly annotated and share little or no homology to reference databases. As a result, investigation of the viral metagenome (virome) relies heavily on de novo assembly of short sequencing reads to recover compositional and functional information. Metagenomic assembly is particularly challenging for virome data, often resulting in fragmented assemblies and poor recovery of viral community members. Despite the essential role of assembly in virome analysis and difficulties posed by these data, current assembly comparisons have been limited to subsections of virome studies or bacterial datasets. Design: This study presents the most comprehensive virome assembly comparison to date, featuring 16 metagenomic assembly approaches which have featured in human virome studies. Assemblers were assessed using four independent virome datasets, namely, simulated reads, two mock communities, viromes spiked with a known phage and human gut viromes. Results: Assembly performance varied significantly across all test datasets, with SPAdes (meta) performing consistently well. Performance of MIRA and VICUNA varied, highlighting the importance of using a range of datasets when comparing assembly programs. It was also found that while some assemblers addressed the challenges of virome data better than others, all assemblers had limitations. Low read coverage and genomic repeats resulted in assemblies with poor genome recovery, high degrees of fragmentation and low-accuracy contigs across all assemblers. These limitations must be considered when setting thresholds for downstream analysis and when drawing conclusions from virome data. | http://link.springer.com/article/10.1186/s40168-019-0626-5 | Virome | Viral | Assembly | Metagenome | Benchmark | Comparison
collection DOAJ
language English
format Article
sources DOAJ
author Thomas D. S. Sutton
Adam G. Clooney
Feargal J. Ryan
R. Paul Ross
Colin Hill
spellingShingle Thomas D. S. Sutton
Adam G. Clooney
Feargal J. Ryan
R. Paul Ross
Colin Hill
Choice of assembly software has a critical impact on virome characterisation
Microbiome
Virome
Viral
Assembly
Metagenome
Benchmark
Comparison
author_facet Thomas D. S. Sutton
Adam G. Clooney
Feargal J. Ryan
R. Paul Ross
Colin Hill
author_sort Thomas D. S. Sutton
title Choice of assembly software has a critical impact on virome characterisation
title_sort choice of assembly software has a critical impact on virome characterisation
publisher BMC
series Microbiome
issn 2049-2618
publishDate 2019-01-01
description Abstract. Background: The viral component of microbial communities plays a vital role in driving bacterial diversity, facilitating nutrient turnover and shaping community composition. Despite their importance, the vast majority of viral sequences are poorly annotated and share little or no homology to reference databases. As a result, investigation of the viral metagenome (virome) relies heavily on de novo assembly of short sequencing reads to recover compositional and functional information. Metagenomic assembly is particularly challenging for virome data, often resulting in fragmented assemblies and poor recovery of viral community members. Despite the essential role of assembly in virome analysis and difficulties posed by these data, current assembly comparisons have been limited to subsections of virome studies or bacterial datasets. Design: This study presents the most comprehensive virome assembly comparison to date, featuring 16 metagenomic assembly approaches which have featured in human virome studies. Assemblers were assessed using four independent virome datasets, namely, simulated reads, two mock communities, viromes spiked with a known phage and human gut viromes. Results: Assembly performance varied significantly across all test datasets, with SPAdes (meta) performing consistently well. Performance of MIRA and VICUNA varied, highlighting the importance of using a range of datasets when comparing assembly programs. It was also found that while some assemblers addressed the challenges of virome data better than others, all assemblers had limitations. Low read coverage and genomic repeats resulted in assemblies with poor genome recovery, high degrees of fragmentation and low-accuracy contigs across all assemblers. These limitations must be considered when setting thresholds for downstream analysis and when drawing conclusions from virome data.
topic Virome
Viral
Assembly
Metagenome
Benchmark
Comparison
url http://link.springer.com/article/10.1186/s40168-019-0626-5