CoMeta: classification of metagenomes using k-mers.

The study of environmental samples is developing rapidly. Characterizing the composition of an environment broadens knowledge of the relationship between species composition and environmental conditions. An important element of extracting the knowledge of the sample composition...


Bibliographic Details
Main Authors: Jolanta Kawulok, Sebastian Deorowicz
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2015-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0121453
id doaj-88f1dc90b1b94d0d868bd9f708251a75
record_format Article
spelling doaj-88f1dc90b1b94d0d868bd9f708251a75 2021-03-03T20:06:16Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2015-01-01 10 4 e0121453 10.1371/journal.pone.0121453 CoMeta: classification of metagenomes using k-mers. Jolanta Kawulok Sebastian Deorowicz The study of environmental samples is developing rapidly. Characterizing the composition of an environment broadens knowledge of the relationship between species composition and environmental conditions. An important element of extracting the knowledge of the sample composition is to compare the extracted fragments of DNA with sequences derived from known organisms. In this paper, we introduce an algorithm called CoMeta (Classification of metagenomes), which assigns a query read (a DNA fragment) to one of the groups previously prepared by the user. Typically, the groups correspond to one of the taxonomic ranks (e.g., phylum, genus); however, the prepared groups may contain sequences having various functions. CoMeta uses an exact method for read classification based on short subsequences (k-mers) and a fast program for indexing large sets of k-mers. In contrast to the most popular methods based on BLAST, where the query is compared with each reference sequence, we begin the classification from the top of the taxonomy tree to reduce the number of comparisons. The presented experimental study confirms that CoMeta outperforms other programs used in this context. CoMeta is available at https://github.com/jkawulok/cometa under a free GNU GPL 2 license. https://doi.org/10.1371/journal.pone.0121453
collection DOAJ
language English
format Article
sources DOAJ
author Jolanta Kawulok
Sebastian Deorowicz
title CoMeta: classification of metagenomes using k-mers.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2015-01-01
description The study of environmental samples is developing rapidly. Characterizing the composition of an environment broadens knowledge of the relationship between species composition and environmental conditions. An important element of extracting the knowledge of the sample composition is to compare the extracted fragments of DNA with sequences derived from known organisms. In this paper, we introduce an algorithm called CoMeta (Classification of metagenomes), which assigns a query read (a DNA fragment) to one of the groups previously prepared by the user. Typically, the groups correspond to one of the taxonomic ranks (e.g., phylum, genus); however, the prepared groups may contain sequences having various functions. CoMeta uses an exact method for read classification based on short subsequences (k-mers) and a fast program for indexing large sets of k-mers. In contrast to the most popular methods based on BLAST, where the query is compared with each reference sequence, we begin the classification from the top of the taxonomy tree to reduce the number of comparisons. The presented experimental study confirms that CoMeta outperforms other programs used in this context. CoMeta is available at https://github.com/jkawulok/cometa under a free GNU GPL 2 license.
url https://doi.org/10.1371/journal.pone.0121453
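
The description above outlines exact k-mer matching combined with a top-down walk of the taxonomy tree. The Python sketch below illustrates that general idea only; it is not CoMeta's implementation. The `Group` class, the `score` function (the fraction of a read's k-mers found in a group), the `threshold` value, and the toy sequences are all assumptions introduced for illustration, and the record does not state how CoMeta actually scores or thresholds matches.

```python
from dataclasses import dataclass, field


def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


@dataclass
class Group:
    """A node of a toy taxonomy tree (hypothetical structure)."""
    name: str
    references: list[str] = field(default_factory=list)
    children: list["Group"] = field(default_factory=list)

    def index(self, k: int) -> set[str]:
        """Collect the k-mers of this group's references and all descendants."""
        found: set[str] = set()
        for ref in self.references:
            found |= kmers(ref, k)
        for child in self.children:
            found |= child.index(k)
        return found


def score(read: str, index: set[str], k: int) -> float:
    """Fraction of the read's distinct k-mers that occur exactly in the index."""
    read_kmers = kmers(read, k)
    if not read_kmers:
        return 0.0
    return len(read_kmers & index) / len(read_kmers)


def classify(read: str, root: Group, k: int = 4, threshold: float = 0.5) -> list[str]:
    """Walk from the root toward the leaves, following the best-scoring child.

    Only the children of the current node are compared at each level, so
    groups under a rejected branch are never examined.
    """
    path: list[str] = []
    node = root
    while node.children:
        scored = [(score(read, child.index(k), k), child) for child in node.children]
        best_score, best = max(scored, key=lambda pair: pair[0])
        if best_score < threshold:
            break  # no child matches well enough; stop at the current rank
        path.append(best.name)
        node = best
    return path


# Toy reference groups (made-up sequences, not real genomes).
root = Group("root", children=[
    Group("phylum_A", children=[
        Group("genus_A1", references=["ACGTACGTAC"]),
        Group("genus_A2", references=["ACGGGGGGAC"]),
    ]),
    Group("phylum_B", references=["TTTTTTTTTT"]),
])

print(classify("ACGTACGT", root))
```

For brevity the sketch rebuilds each group's k-mer set on every call; a practical tool would build and store the indexes once, which is the role the description assigns to the k-mer indexing program.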