CANGS: a user-friendly utility for processing and analyzing 454 GS-FLX data in biodiversity studies

Abstract. Background: Next generation sequencing (NGS) technologies have substantially increased the sequence output while the costs were dramatically reduced. In addition to the use in whole genome sequencing, the 454 GS-FLX platform is becoming a widely...


Bibliographic Details
Main Authors: Schlötterer Christian, Nolte Viola, Pandey Ram
Format: Article
Language: English
Published: BMC 2010-01-01
Series: BMC Research Notes
Online Access: http://www.biomedcentral.com/1756-0500/3/3
id doaj-196c8effed1e45eda33cdfee80fc03ff
record_format Article
spelling doaj-196c8effed1e45eda33cdfee80fc03ff 2020-11-24T21:50:31Z eng BMC BMC Research Notes 1756-0500 2010-01-01 313 10.1186/1756-0500-3-3 CANGS: a user-friendly utility for processing and analyzing 454 GS-FLX data in biodiversity studies Schlötterer Christian Nolte Viola Pandey Ram http://www.biomedcentral.com/1756-0500/3/3
collection DOAJ
language English
format Article
sources DOAJ
author Schlötterer Christian
Nolte Viola
Pandey Ram
title CANGS: a user-friendly utility for processing and analyzing 454 GS-FLX data in biodiversity studies
publisher BMC
series BMC Research Notes
issn 1756-0500
publishDate 2010-01-01
description Abstract. Background: Next generation sequencing (NGS) technologies have substantially increased sequence output while dramatically reducing costs. In addition to its use in whole genome sequencing, the 454 GS-FLX platform is becoming a widely used tool for biodiversity surveys based on amplicon sequencing. Using NGS for biodiversity surveys requires software tools that perform quality control, trimming of the sequence reads, removal of PCR primers, and generation of input files for downstream analyses. A user-friendly software utility that carries out these steps is still lacking. Findings: We developed CANGS (Cleaning and Analyzing Next Generation Sequences), a flexible and user-friendly integrated software utility designed for amplicon-based biodiversity surveys using the 454 sequencing platform. CANGS filters low quality sequences, removes PCR primers, filters singletons, identifies barcodes, and generates input files for downstream analyses. The downstream analyses rely either on third-party software (e.g., rarefaction analyses) or on CANGS-specific scripts. The latter include modules that link 454 sequences with the name of the closest taxonomic reference retrieved from the NCBI database and report the sequence divergence between them. The software can be easily adapted to handle sequencing projects with different amplicon sizes, primer sequences, and quality thresholds, which makes it especially useful for non-bioinformaticians. Conclusion: CANGS performs PCR primer clipping, filters low quality sequences, links sequences to NCBI taxonomy, and provides input files for common rarefaction analysis software programs. CANGS is written in Perl, runs on Mac OS X/Linux, and is available at http://i122server.vu-wien.ac.at/pop/software.html
url http://www.biomedcentral.com/1756-0500/3/3
_version_ 1725883425223606272
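
The preprocessing steps named in the abstract (filtering low quality sequences, removing PCR primers, and filtering singletons) can be illustrated with a minimal sketch. This is not CANGS code: CANGS itself is written in Perl, and the record does not describe its internals. The function names, the mean-quality criterion, the exact-match primer rule, and the default thresholds below are all assumptions chosen for illustration only.

```python
from collections import Counter


def mean_quality(quals):
    """Average of per-base quality scores (0.0 for an empty read)."""
    return sum(quals) / len(quals) if quals else 0.0


def clip_primer(seq, primer):
    """Remove an exactly matching primer from the start of seq.

    Returns None when the read does not start with the primer.
    Exact matching is an assumption made for this sketch.
    """
    if seq.startswith(primer):
        return seq[len(primer):]
    return None


def preprocess(reads, forward_primer, min_length=5, min_mean_quality=20.0):
    """Quality-filter reads, clip the primer, and drop singletons.

    reads: iterable of (sequence, list_of_quality_scores) pairs.
    Returns a dict mapping each retained sequence to its read count.
    Both thresholds are hypothetical defaults, not CANGS settings.
    """
    kept = []
    for seq, quals in reads:
        if mean_quality(quals) < min_mean_quality:
            continue  # low quality read
        clipped = clip_primer(seq, forward_primer)
        if clipped is None or len(clipped) < min_length:
            continue  # primer missing or read too short after clipping
        kept.append(clipped)
    counts = Counter(kept)
    # A singleton is a sequence observed exactly once after clipping.
    return {seq: n for seq, n in counts.items() if n > 1}


example_reads = [
    ("ACGTTTGACCA", [30] * 11),  # kept
    ("ACGTTTGACCA", [30] * 11),  # kept (same sequence, so not a singleton)
    ("ACGTGGGGGGG", [30] * 11),  # singleton after clipping, dropped
    ("ACGTTTGACCA", [5] * 11),   # low quality, dropped
    ("TTTTTTGACCA", [30] * 11),  # primer not found, dropped
]
result = preprocess(example_reads, forward_primer="ACGT")
print(result)  # {'TTGACCA': 2}
```

A real amplicon pipeline would typically also tolerate primer mismatches and handle barcodes and the reverse primer; those steps are left out here to keep the sketch short.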