Quantification of experimentally induced nucleotide conversions in high-throughput sequencing datasets

Abstract. Background: Methods to read out naturally occurring or experimentally introduced nucleic acid modifications are emerging as powerful tools to study dynamic cellular processes. The recovery, quantification and interpretation of such events in high-throughput sequencing datasets demand specia...

Bibliographic Details
Main Authors: Tobias Neumann, Veronika A. Herzog, Matthias Muhar, Arndt von Haeseler, Johannes Zuber, Stefan L. Ameres, Philipp Rescheneder
Format: Article
Language: English
Published: BMC 2019-05-01
Series: BMC Bioinformatics
Subjects: Mapping; Epitranscriptomics; Next generation sequencing; High-throughput sequencing
Online Access: http://link.springer.com/article/10.1186/s12859-019-2849-7
id doaj-39292a6c04f248eeb835ced54223a359
record_format Article
spelling doaj-39292a6c04f248eeb835ced54223a359
2020-11-25T03:33:18Z
eng
BMC
BMC Bioinformatics
1471-2105
2019-05-01
vol. 20, no. 1, pp. 1-16
10.1186/s12859-019-2849-7
Quantification of experimentally induced nucleotide conversions in high-throughput sequencing datasets
Tobias Neumann: Research Institute of Molecular Pathology (IMP), Campus-Vienna-Biocenter 1, Vienna BioCenter (VBC)
Veronika A. Herzog: Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA)
Matthias Muhar: Research Institute of Molecular Pathology (IMP), Campus-Vienna-Biocenter 1, Vienna BioCenter (VBC)
Arndt von Haeseler: Center for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, University of Vienna, Medical University of Vienna
Johannes Zuber: Research Institute of Molecular Pathology (IMP), Campus-Vienna-Biocenter 1, Vienna BioCenter (VBC)
Stefan L. Ameres: Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA)
Philipp Rescheneder: Center for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, University of Vienna, Medical University of Vienna
http://link.springer.com/article/10.1186/s12859-019-2849-7
Mapping
Epitranscriptomics
Next generation sequencing
High-throughput sequencing
collection DOAJ
language English
format Article
sources DOAJ
author Tobias Neumann
Veronika A. Herzog
Matthias Muhar
Arndt von Haeseler
Johannes Zuber
Stefan L. Ameres
Philipp Rescheneder
title Quantification of experimentally induced nucleotide conversions in high-throughput sequencing datasets
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2019-05-01
description Abstract. Background: Methods to read out naturally occurring or experimentally introduced nucleic acid modifications are emerging as powerful tools to study dynamic cellular processes. The recovery, quantification and interpretation of such events in high-throughput sequencing datasets demand specialized bioinformatics approaches. Results: Here, we present Digital Unmasking of Nucleotide conversions in K-mers (DUNK), a data analysis pipeline enabling the quantification of nucleotide conversions in high-throughput sequencing datasets. We demonstrate, using experimentally generated and simulated datasets, that DUNK allows constant mapping rates irrespective of nucleotide-conversion rates, promotes the recovery of multimapping reads, and employs Single Nucleotide Polymorphism (SNP) masking to uncouple true SNPs from nucleotide conversions, facilitating robust and sensitive quantification of nucleotide conversions. As a first application, we implement this strategy as SLAM-DUNK for the analysis of SLAMseq profiles, in which 4-thiouridine-labeled transcripts are detected based on T > C conversions. SLAM-DUNK provides both raw counts of nucleotide-conversion-containing reads and a base-content- and read-coverage-normalized approach for estimating the fractions of labeled transcripts as readout. Conclusion: Beyond providing a readily accessible tool for analyzing SLAMseq and related time-resolved RNA sequencing methods (TimeLapse-seq, TUC-seq), DUNK establishes a broadly applicable strategy for quantifying nucleotide conversions.
topic Mapping
Epitranscriptomics
Next generation sequencing
High-throughput sequencing
url http://link.springer.com/article/10.1186/s12859-019-2849-7