A systematic sequencing-based approach for microbial contaminant detection and functional inference

Abstract Background Microbial contamination poses a major difficulty for successful data analysis in biological and biomedical research. Computational approaches utilizing next-generation sequencing (NGS) data offer promising diagnostics to assess the presence of contaminants. However, as host cells...


Bibliographic Details
Main Authors: Sung-Joon Park, Satoru Onizuka, Masahide Seki, Yutaka Suzuki, Takanori Iwata, Kenta Nakai
Format: Article
Language: English
Published: BMC 2019-09-01
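Series: BMC Biology
Subjects: Contamination; Mycoplasma; Host-microbe interaction; Next-generation sequencing; Non-negative matrix factorization
Online Access: http://link.springer.com/article/10.1186/s12915-019-0690-0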
id doaj-2b6b98a565334b96aa6eb1131800d2fb
record_format Article
spelling doaj-2b6b98a565334b96aa6eb1131800d2fb
2020-11-25T02:42:02Z
eng
BMC
BMC Biology
1741-7007
2019-09-01
Volume 17, issue 1, pages 1–15
10.1186/s12915-019-0690-0
A systematic sequencing-based approach for microbial contaminant detection and functional inference
Sung-Joon Park (Human Genome Center, The Institute of Medical Science, The University of Tokyo)
Satoru Onizuka (Institute of Advanced Biomedical Engineering and Science, Tokyo Women’s Medical University)
Masahide Seki (Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo)
Yutaka Suzuki (Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo)
Takanori Iwata (Institute of Advanced Biomedical Engineering and Science, Tokyo Women’s Medical University)
Kenta Nakai (Human Genome Center, The Institute of Medical Science, The University of Tokyo)
collection DOAJ
language English
format Article
sources DOAJ
author Sung-Joon Park
Satoru Onizuka
Masahide Seki
Yutaka Suzuki
Takanori Iwata
Kenta Nakai
title A systematic sequencing-based approach for microbial contaminant detection and functional inference
publisher BMC
series BMC Biology
issn 1741-7007
publishDate 2019-09-01
description Abstract
Background: Microbial contamination poses a major difficulty for successful data analysis in biological and biomedical research. Computational approaches utilizing next-generation sequencing (NGS) data offer promising diagnostics to assess the presence of contaminants. However, as host cells are often contaminated by multiple microorganisms, these approaches require careful attention to intra- and interspecies sequence similarities, which have not yet been fully addressed.
Results: We present a computational approach that rigorously investigates the genomic origins of sequenced reads, including those mapped to multiple species that have been discarded in previous studies. Through the analysis of large-scale synthetic and public NGS samples, we estimate that 1000–100,000 contaminating microbial reads are detected per million host reads sequenced by RNA-seq. The microbe catalog we established included Cutibacterium as a prevalent contaminant, suggesting that contamination mostly originates from the laboratory environment. Importantly, by applying a systematic method to infer the functional impact of contamination, we revealed that host-contaminant interactions cause profound changes in the host molecular landscapes, as exemplified by changes in inflammatory and apoptotic pathways during Mycoplasma infection of lymphoma cells.
Conclusions: We provide a computational method for profiling microbial contamination on NGS data and suggest that sources of contamination in laboratory reagents and the experimental environment alter the molecular landscape of host cells leading to phenotypic changes. These findings reinforce the concept that precise determination of the origins and functional impacts of contamination is imperative for quality research and illustrate the usefulness of the proposed approach to comprehensively characterize contamination landscapes.
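The abstract reports contamination as microbial reads per million host reads. A minimal sketch of that normalization follows, to make the unit concrete. The function name and the sample counts are hypothetical and are not taken from the article or its software.

```python
def reads_per_million_host(microbe_reads: int, host_reads: int) -> float:
    """Scale a microbial read count to a rate per million host-mapped reads.

    Illustrative helper only: the name and signature are assumptions,
    not part of the published method.
    """
    if host_reads <= 0:
        raise ValueError("host_reads must be positive")
    # Multiply before dividing to keep the arithmetic exact for typical counts.
    return microbe_reads * 1_000_000 / host_reads


# Hypothetical sample: 40 million host-mapped reads, 120,000 microbial reads.
rate = reads_per_million_host(120_000, 40_000_000)
print(rate)
```

A value computed this way can be compared directly with the 1000–100,000 range quoted in the abstract, regardless of sequencing depth.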
topic Contamination
Mycoplasma
Host-microbe interaction
Next-generation sequencing
Non-negative matrix factorization
url http://link.springer.com/article/10.1186/s12915-019-0690-0