Bioinformatics strategies for taxonomy independent binning and visualization of sequences in shotgun metagenomics

One of the main steps in a study of microbial communities is resolving their composition, diversity and function. In the past, these issues were mostly addressed by amplicon sequencing of a target gene because of its reasonable price and the easier computational postprocessing of the bioinformatic da...

Bibliographic Details
Main Authors: Karel Sedlar, Kristyna Kupkova, Ivo Provaznik
Format: Article
Language: English
Published: Elsevier 2017-01-01
Series: Computational and Structural Biotechnology Journal
Online Access: http://www.sciencedirect.com/science/article/pii/S2001037016300678
id doaj-93f1483a405947059757cebd9b004d19
record_format Article
spelling doaj-93f1483a405947059757cebd9b004d19 2020-11-25T02:20:48Z eng Elsevier Computational and Structural Biotechnology Journal 2001-0370 2017-01-01 15 48 55 Bioinformatics strategies for taxonomy independent binning and visualization of sequences in shotgun metagenomics
Karel Sedlar (corresponding author), Department of Biomedical Engineering, Brno University of Technology, Technicka 12, Brno, Czech Republic
Kristyna Kupkova, Department of Biomedical Engineering, Brno University of Technology, Technicka 12, Brno, Czech Republic
Ivo Provaznik, Department of Biomedical Engineering, Brno University of Technology, Technicka 12, Brno, Czech Republic
collection DOAJ
language English
format Article
sources DOAJ
author Karel Sedlar
Kristyna Kupkova
Ivo Provaznik
title Bioinformatics strategies for taxonomy independent binning and visualization of sequences in shotgun metagenomics
publisher Elsevier
series Computational and Structural Biotechnology Journal
issn 2001-0370
publishDate 2017-01-01
description One of the main steps in a study of microbial communities is resolving their composition, diversity and function. In the past, these issues were mostly addressed by amplicon sequencing of a target gene because of its reasonable price and the easier computational postprocessing of the bioinformatic data. With the advancement of sequencing techniques, the main focus has shifted to whole metagenome shotgun sequencing, which allows a much more detailed analysis of metagenomic data, including the reconstruction of novel microbial genomes and insight into the genetic potential and metabolic capacities of whole environments. On the other hand, the output of whole metagenome shotgun sequencing is a mixture of short DNA fragments belonging to various genomes; this approach therefore requires more sophisticated computational algorithms for clustering related sequences, commonly referred to as sequence binning. There are currently two types of binning methods: taxonomy dependent and taxonomy independent. The first type classifies DNA fragments by performing standard homology inference against a reference database, while the second performs reference-free binning by applying clustering techniques to features extracted from the sequences. In this review, we describe the strategies within the second approach. Although these strategies do not require prior knowledge, they place higher demands on sequence length. Besides their basic principle, an overview of particular methods and tools is provided. Furthermore, the review covers the use of the methods in relation to sequence length and discusses the need for metagenomic data preprocessing in the form of an initial assembly prior to binning. Keywords: Metagenomics, Taxonomy independent, Sequence binning, Genomic signature, Abundance, Visualization
url http://www.sciencedirect.com/science/article/pii/S2001037016300678