Characterization and remediation of sample index swaps by non-redundant dual indexing on massively parallel sequencing platforms

Bibliographic Details
Main Authors: Maura Costello, Mark Fleharty, Justin Abreu, Yossi Farjoun, Steven Ferriera, Laurie Holmes, Brian Granger, Lisa Green, Tom Howd, Tamara Mason, Gina Vicente, Michael Dasilva, Wendy Brodeur, Timothy DeSmet, Sheila Dodge, Niall J. Lennon, Stacey Gabriel
Format: Article
Language: English
Published: BMC 2018-05-01
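Series: BMC Genomics
Subjects: Next generation sequencing; Massively parallel sequencing; ILLUMINA sequencing; Index swapping; Index hopping; Multiplexing
Online Access: http://link.springer.com/article/10.1186/s12864-018-4703-0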
id doaj-2079b2d7551942bc810c187c8e5c737e
record_format Article
spelling doaj-2079b2d7551942bc810c187c8e5c737e
2020-11-24T21:11:06Z
eng
BMC
BMC Genomics
1471-2164
2018-05-01
19
1
1
10
10.1186/s12864-018-4703-0
Broad Genomics, Broad Institute of MIT and Harvard (affiliation given for each author)
collection DOAJ
language English
format Article
sources DOAJ
author Maura Costello
Mark Fleharty
Justin Abreu
Yossi Farjoun
Steven Ferriera
Laurie Holmes
Brian Granger
Lisa Green
Tom Howd
Tamara Mason
Gina Vicente
Michael Dasilva
Wendy Brodeur
Timothy DeSmet
Sheila Dodge
Niall J. Lennon
Stacey Gabriel
publisher BMC
series BMC Genomics
issn 1471-2164
publishDate 2018-05-01
description Abstract
Background: Here we present an in-depth characterization of the mechanism of sequencer-induced sample contamination due to index swapping, a phenomenon that affects Illumina sequencers employing patterned flow cells with Exclusion Amplification (ExAmp) chemistry (HiSeq X, HiSeq 4000, and NovaSeq). We also present a remediation method that minimizes the impact of such swaps.
Results: Leveraging data collected over a two-year period, we demonstrate the widespread prevalence of index swapping in patterned flow cell data. We calculate mean swap rates across multiple sample preparation methods and sequencer models, demonstrating that different library methods can have vastly different swapping rates and that even non-ExAmp chemistry instruments display trace levels of index swapping. We provide methods for eliminating sample data cross-contamination by utilizing non-redundant dual indexing for complete filtering of index-swapped reads, and share the sequences for 96 non-combinatorial dual indexes we have validated across various library preparation methods and sequencer models. Finally, using computational methods, we provide greater insight into the mechanism of index swapping.
Conclusions: Index swapping in pooled libraries is a prevalent phenomenon that we observe at a rate of 0.2% to 6% in all sequencing runs on HiSeq X, HiSeq 4000/3000, and NovaSeq. Utilizing non-redundant dual indexing allows for the removal (flagging/filtering) of these swapped reads and eliminates swapping-induced sample contamination, which is critical for sensitive applications such as RNA-seq, single cell, blood biopsy using circulating tumor DNA, or clinical sequencing.
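The filtering idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy index sequences, and the definition of swap rate used here (swapped reads divided by all reads whose two indexes are both known) are assumptions made only for illustration. The sketch relies on the property that, with non-redundant dual indexing, each i7 and each i5 index belongs to exactly one sample, so a read pairing two known indexes from different samples can be flagged as swapped.

```python
from collections import Counter

def classify_read(i7, i5, expected_pairs, known_i7, known_i5):
    """Classify one read by its observed (i7, i5) index pair."""
    if (i7, i5) in expected_pairs:
        return "expected"   # pair matches a sample in the pool
    if i7 in known_i7 and i5 in known_i5:
        return "swapped"    # both indexes known, but from different samples
    return "unknown"        # at least one index is not in the pool

def swap_rate(observed_pairs, expected_pairs):
    """Count read classes and return (counts, swapped fraction).

    Assumed definition: swapped / (expected + swapped).
    """
    known_i7 = {i7 for i7, _ in expected_pairs}
    known_i5 = {i5 for _, i5 in expected_pairs}
    counts = Counter(
        classify_read(i7, i5, expected_pairs, known_i7, known_i5)
        for i7, i5 in observed_pairs
    )
    assigned = counts["expected"] + counts["swapped"]
    rate = counts["swapped"] / assigned if assigned else 0.0
    return counts, rate

# Hypothetical pool of two samples with non-redundant (unique) dual indexes.
expected = {("AAAAAAAA", "CCCCCCCC"), ("GGGGGGGG", "TTTTTTTT")}

# Toy reads: 8 correct pairs, 2 swapped pairs, 1 pair with an unknown index.
reads = (
    [("AAAAAAAA", "CCCCCCCC")] * 5
    + [("GGGGGGGG", "TTTTTTTT")] * 3
    + [("AAAAAAAA", "TTTTTTTT"), ("GGGGGGGG", "CCCCCCCC")]
    + [("ACGTACGT", "CCCCCCCC")]
)

counts, rate = swap_rate(reads, expected)
print(dict(counts), f"swap rate = {rate:.1%}")
```

With combinatorial indexing, where the same i7 or i5 index is reused across samples, a swapped read can carry another sample's valid pair and would be counted as "expected" by this check, which is why the sketch assumes every index is used only once.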
topic Next generation sequencing
Massively parallel sequencing
ILLUMINA sequencing
Index swapping
Index hopping
Multiplexing
url http://link.springer.com/article/10.1186/s12864-018-4703-0