Optimizing 16S rRNA gene profile analysis from low biomass nasopharyngeal and induced sputum specimens

Abstract
Background: Careful consideration of experimental artefacts is required in order to successfully apply high-throughput 16S ribosomal ribonucleic acid (rRNA) gene sequencing technology. Here we introduce experimental design, quality control and “denoising” approaches for sequencing low biomas...

Bibliographic Details
Main Authors: Shantelle Claassen-Weitz, Sugnet Gardner-Lubbe, Kilaza S. Mwaikono, Elloise du Toit, Heather J. Zar, Mark P. Nicol
Format: Article
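Language: English
Published: BMC 2020-05-01
Series: BMC Microbiology
Subjects: 16S rRNA gene; Bacteriome; Contamination; High-throughput sequencing; Low biomass; Mock controls
Online Access: http://link.springer.com/article/10.1186/s12866-020-01795-7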
id doaj-26285dcb969a4cd49e0391acb869fc2b
record_format Article
spelling doaj-26285dcb969a4cd49e0391acb869fc2b
2020-11-25T02:18:36Z
eng
BMC
BMC Microbiology
1471-2180
2020-05-01
Volume 20, issue 1, pages 1-26
10.1186/s12866-020-01795-7
Optimizing 16S rRNA gene profile analysis from low biomass nasopharyngeal and induced sputum specimens
Shantelle Claassen-Weitz (0)
Sugnet Gardner-Lubbe (1)
Kilaza S. Mwaikono (2)
Elloise du Toit (3)
Heather J. Zar (4)
Mark P. Nicol (5)
(0) Division of Medical Microbiology, Department of Pathology, Faculty of Health Sciences, University of Cape Town
(1) Department of Statistics and Actuarial Science, Faculty of Economic and Management Sciences, Stellenbosch University
(2) Computational Biology Group and H3ABioNet, Department of Integrative Biomedical Sciences, University of Cape Town
(3) Division of Medical Microbiology, Department of Pathology, Faculty of Health Sciences, University of Cape Town
(4) Department of Paediatrics and Child Health, Red Cross War Memorial Children’s Hospital
(5) Division of Medical Microbiology, Department of Pathology, Faculty of Health Sciences, University of Cape Town
Abstract
Background: Careful consideration of experimental artefacts is required in order to successfully apply high-throughput 16S ribosomal ribonucleic acid (rRNA) gene sequencing technology. Here we introduce experimental design, quality control and “denoising” approaches for sequencing low biomass specimens.
Results: We found that bacterial biomass is a key driver of 16S rRNA gene sequencing profiles generated from bacterial mock communities and that the use of different deoxyribonucleic acid (DNA) extraction methods [DSP Virus/Pathogen Mini Kit® (Kit-QS) and ZymoBIOMICS DNA Miniprep Kit (Kit-ZB)] and storage buffers [PrimeStore® Molecular Transport medium (Primestore) and Skim-milk, Tryptone, Glucose and Glycerol (STGG)] further influence these profiles. Kit-QS better represented hard-to-lyse bacteria from bacterial mock communities compared to Kit-ZB. Primestore storage buffer yielded lower levels of background operational taxonomic units (OTUs) from low biomass bacterial mock community controls compared to STGG. In addition to bacterial mock community controls, we used technical repeats (nasopharyngeal and induced sputum processed in duplicate, triplicate or quadruplicate) to further evaluate the effect of specimen biomass and participant age at specimen collection on resultant sequencing profiles. We observed a positive correlation (r = 0.16) between specimen biomass and participant age at specimen collection: low biomass technical repeats (represented by < 500 16S rRNA gene copies/μl) were primarily collected at < 14 days of age. We found that low biomass technical repeats also produced higher alpha diversities (r = −0.28); 16S rRNA gene profiles similar to no template controls (Primestore); and reduced sequencing reproducibility. Finally, we show that the use of statistical tools for in silico contaminant identification, as implemented through the decontam package in R, provides better representations of indigenous bacteria following decontamination.
Conclusions: We provide insight into experimental design, quality control steps and “denoising” approaches for 16S rRNA gene high-throughput sequencing of low biomass specimens. We highlight the need for careful assessment of DNA extraction methods and storage buffers; sequence quality and reproducibility; and in silico identification of contaminant profiles in order to avoid spurious results.
http://link.springer.com/article/10.1186/s12866-020-01795-7
16S rRNA gene
Bacteriome
Contamination
High-throughput sequencing
Low biomass
Mock controls
collection DOAJ
language English
format Article
sources DOAJ
publisher BMC
series BMC Microbiology
issn 1471-2180
publishDate 2020-05-01
topic 16S rRNA gene
Bacteriome
Contamination
High-throughput sequencing
Low biomass
Mock controls
url http://link.springer.com/article/10.1186/s12866-020-01795-7