FACEPAI: a script for fast and consistent environmental DNA processing and identification

Abstract. Background: The use of environmental DNA (eDNA) has become an increasingly important tool in environmental surveys and taxonomic research. High-throughput sequencing of samples from soil, water, sediment, trap alcohol or bulk samples generates large amounts of barcode sequences that can be assig...

Bibliographic Details
Main Author:	Emma Wahlberg
Format:	Article
Language:	English
Published:	BMC 2019-12-01
Series:	BMC Ecology
Subjects:	eDNA; Filtering sequence reads; BLAST; Identification; Bash script; Bioinformatics
Online Access:	https://doi.org/10.1186/s12898-019-0269-1

id	doaj-c9055df7d50541ce9963a12e57021e56
record_format	Article
spelling	doaj-c9055df7d50541ce9963a12e57021e56; 2021-09-02T15:43:52Z; eng; BMC; BMC Ecology; 1472-6785; 2019-12-01; 19116; 10.1186/s12898-019-0269-1; FACEPAI: a script for fast and consistent environmental DNA processing and identification; Emma Wahlberg, Department of Zoology, Stockholm University; https://doi.org/10.1186/s12898-019-0269-1
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Emma Wahlberg
title	FACEPAI: a script for fast and consistent environmental DNA processing and identification
publisher	BMC
series	BMC Ecology
issn	1472-6785
publishDate	2019-12-01
description	Abstract. Background: The use of environmental DNA (eDNA) has become an increasingly important tool in environmental surveys and taxonomic research. High-throughput sequencing of samples from soil, water, sediment, trap alcohol or bulk samples generates large amounts of barcode sequences that can be assigned to a known taxon with a reference sequence. This process can, however, be bioinformatically cumbersome and time consuming, especially for researchers without specialised bioinformatic training. A number of different software packages and pipelines are available, but they require training in data preparation, running analyses and formatting results. Comparison of results produced by different packages is often difficult. Results: FACEPAI is an open source script, dependent on a few open source applications, that provides a pipeline for rapid analysis and taxonomic assignment of environmental DNA samples. It requires an initial formatting of a reference database, using the script CaPReSe, and a configuration file, and can thereafter be run to process any number of samples in succession using the same settings and references. Both configuration and execution are designed to demand as little hands-on work as possible while ensuring repeatable results. Conclusion: The demonstration using example data from real environmental samples produces results in times ranging from less than 3 min to just above 15 min, depending on the number of sequences to process. Memory usage is below 2 GB on a desktop PC. FACEPAI and CaPReSe provide a pipeline for analysing a large number of eDNA samples on common equipment, with few bioinformatic skills necessary, for subsequent ecological and taxonomic studies.
topic	eDNA; Filtering sequence reads; BLAST; Identification; Bash script; Bioinformatics
url	https://doi.org/10.1186/s12898-019-0269-1

Similar Items