Improving ancient DNA read mapping against modern reference genomes

Abstract. Background: Next-Generation Sequencing has revolutionized our approach to ancient DNA (aDNA) research by providing complete genomic sequences of ancient individuals and extinct species. However, the recovery of genetic material from long-dead o...

Bibliographic Details
Main Authors: Schubert Mikkel, Ginolhac Aurelien, Lindgreen Stinus, Thompson John F, AL-Rasheid Khaled AS, Willerslev Eske, Krogh Anders, Orlando Ludovic
Format: Article
Language: English
Published: BMC 2012-05-01
Series: BMC Genomics
Online Access: http://www.biomedcentral.com/1471-2164/13/178
id doaj-c64dab67f6ef4d6a8e056cb4781ecb10
record_format Article
spelling doaj-c64dab67f6ef4d6a8e056cb4781ecb10 2020-11-25T00:19:54Z eng BMC BMC Genomics 1471-2164 2012-05-01 13 1 178 10.1186/1471-2164-13-178 Improving ancient DNA read mapping against modern reference genomes Schubert Mikkel, Ginolhac Aurelien, Lindgreen Stinus, Thompson John F, AL-Rasheid Khaled AS, Willerslev Eske, Krogh Anders, Orlando Ludovic
collection DOAJ
language English
format Article
sources DOAJ
author Schubert Mikkel
Ginolhac Aurelien
Lindgreen Stinus
Thompson John F
AL-Rasheid Khaled AS
Willerslev Eske
Krogh Anders
Orlando Ludovic
title Improving ancient DNA read mapping against modern reference genomes
publisher BMC
series BMC Genomics
issn 1471-2164
publishDate 2012-05-01
description Abstract
Background: Next-Generation Sequencing has revolutionized our approach to ancient DNA (aDNA) research by providing complete genomic sequences of ancient individuals and extinct species. However, the recovery of genetic material from long-dead organisms is still complicated by a number of issues, including post-mortem DNA damage and high levels of environmental contamination. Together with error profiles specific to the sequencing platform used, these issues could limit our ability to map sequencing reads against modern reference genomes and therefore to identify endogenous ancient reads, reducing the efficiency of shotgun sequencing of aDNA.
Results: In this study, we compare different computational methods for improving the accuracy and sensitivity of aDNA sequence identification, based on shotgun sequencing reads recovered from Pleistocene horse extracts using the Illumina GAIIx and Helicos HeliScope platforms. We show that the Burrows-Wheeler Aligner (BWA), which was developed for mapping undamaged sequencing reads from platforms with low rates of indel-type sequencing errors, can be employed at acceptable run-times by modifying default parameters in a platform-specific manner. We also examine whether trimming likely damaged positions at read ends can increase the recovery of genuine aDNA fragments, and whether human contamination can be accurately identified using a previously suggested strategy based on best-hit filtering. We show that combining our different mapping and filtering approaches can increase the number of high-quality endogenous hits recovered by up to 33%.
Conclusions: We have shown that Illumina and Helicos sequences recovered from aDNA extracts cannot be aligned to modern reference genomes with the same efficiency unless mapping parameters are optimized for the specific types of errors generated by these platforms and by post-mortem DNA damage. Our findings have important implications for future aDNA research, as we define mapping guidelines that improve our ability to identify genuine aDNA sequences, which in turn could improve the genotyping accuracy of ancient specimens. Our framework provides a significant improvement to the standard procedures used for characterizing ancient genomes, a task that is challenged by contamination and often low amounts of DNA material.
url http://www.biomedcentral.com/1471-2164/13/178