FACEPAI: a script for fast and consistent environmental DNA processing and identification

Abstract Background: The use of environmental DNA (eDNA) has become an increasingly important tool in environmental surveys and taxonomic research. High-throughput sequencing of samples from soil, water, sediment, trap alcohol or bulk samples generates large amounts of barcode sequences that can be assig...

Bibliographic Details
Main Author: Emma Wahlberg
Format: Article
Language: English
Published: BMC 2019-12-01
Series: BMC Ecology
Subjects: eDNA; Filtering sequence reads; BLAST; Identification; Bash script; Bioinformatics
Online Access: https://doi.org/10.1186/s12898-019-0269-1
id doaj-c9055df7d50541ce9963a12e57021e56
record_format Article
spelling doaj-c9055df7d50541ce9963a12e57021e56 2021-09-02T15:43:52Z; eng; BMC; BMC Ecology; 1472-6785; 2019-12-01; 19116; 10.1186/s12898-019-0269-1; Emma Wahlberg, Department of Zoology, Stockholm University
collection DOAJ
language English
format Article
sources DOAJ
author Emma Wahlberg
title FACEPAI: a script for fast and consistent environmental DNA processing and identification
publisher BMC
series BMC Ecology
issn 1472-6785
publishDate 2019-12-01
description Abstract. Background: The use of environmental DNA (eDNA) has become an increasingly important tool in environmental surveys and taxonomic research. High-throughput sequencing of samples from soil, water, sediment, trap alcohol or bulk samples generates large amounts of barcode sequences that can be assigned to a known taxon with a reference sequence. This process can, however, be bioinformatically cumbersome and time consuming, especially for researchers without specialised bioinformatic training. A number of different software packages and pipelines are available, but they require training in preparing data, running analyses and formatting results. Comparing results produced by different packages is often difficult. Results: FACEPAI is an open source script, dependent on a few open source applications, that provides a pipeline for rapid analysis and taxonomic assignment of environmental DNA samples. It requires an initial formatting of a reference database, using the script CaPReSe, and a configuration file, and can thereafter be run to process any number of samples in succession using the same settings and references. Both configuration and execution are designed to demand as little hands-on work as possible, while assuring repeatable results. Conclusion: The demonstration using example data from real environmental samples provides results in a time span ranging from less than 3 min to just above 15 min, depending on the number of sequences to process. The memory usage is below 2 GB on a desktop PC. FACEPAI and CaPReSe provide a pipeline for analysing a large number of eDNA samples on common equipment, with few bioinformatic skills necessary, for subsequent ecological and taxonomical studies.
topic eDNA
Filtering sequence reads
BLAST
Identification
Bash script
Bioinformatics
url https://doi.org/10.1186/s12898-019-0269-1