A Compositional Model to Predict the Aggregated Isotope Distribution for Average DNA and RNA Oligonucleotides

Structural modifications of DNA and RNA molecules play a pivotal role in epigenetic and posttranscriptional regulation. To characterise these modifications, a growing number of MS- and MS/MS-based tools for the analysis of nucleic acids are being developed. To identify an oligonucleotide in a mass spectru...


Bibliographic Details
Main Authors: Annelies Agten, Piotr Prostko, Melvin Geubbelmans, Youzhong Liu, Thomas De Vijlder, Dirk Valkenborg
Format: Article
Language: English
Published: MDPI AG 2021-06-01
Series: Metabolites
Subjects:
DNA
RNA
Online Access: https://www.mdpi.com/2218-1989/11/6/400
id doaj-264b08621d054998abed506b2734db2b
record_format Article
spelling doaj-264b08621d054998abed506b2734db2b 2021-07-01T00:32:16Z eng
MDPI AG, Metabolites, ISSN 2218-1989, 2021-06-01, vol. 11, 400
DOI 10.3390/metabo11060400
A Compositional Model to Predict the Aggregated Isotope Distribution for Average DNA and RNA Oligonucleotides
Annelies Agten, Piotr Prostko, Melvin Geubbelmans, Dirk Valkenborg: Data Science Institute, UHasselt - Hasselt University, Agoralaan 1, BE 3590 Diepenbeek, Belgium
Youzhong Liu, Thomas De Vijlder: Chemical & Pharmaceutical Development & Supply, Janssen Research & Development, Turnhoutseweg 30, BE 2340 Beerse, Belgium
collection DOAJ
language English
format Article
sources DOAJ
author Annelies Agten
Piotr Prostko
Melvin Geubbelmans
Youzhong Liu
Thomas De Vijlder
Dirk Valkenborg
publisher MDPI AG
series Metabolites
issn 2218-1989
publishDate 2021-06-01
description Structural modifications of DNA and RNA molecules play a pivotal role in epigenetic and posttranscriptional regulation. To characterise these modifications, a growing number of MS- and MS/MS-based tools for the analysis of nucleic acids are being developed. To identify an oligonucleotide in a mass spectrum, it is useful to compare the observed isotope pattern of the molecule of interest with the one that is theoretically expected from its elemental composition. However, this is not straightforward when the identity of the molecule under investigation is unknown. Here, we present a modelling approach for predicting the aggregated isotope distribution of an average DNA or RNA molecule when only a particular (monoisotopic) mass is available. For this purpose, a theoretical database of all possible DNA/RNA oligonucleotides up to a mass of 25 kDa is created, and the aggregated isotope distribution for the entire database of oligonucleotides is generated using the BRAIN algorithm. Since this isotope information is compositional in nature, the modelling method is based on the additive log-ratio analysis of Aitchison. As a result, a univariate weighted polynomial regression model of order 10 is fitted to predict the first 20 isotope peaks for DNA and RNA molecules. The performance of the prediction model is assessed on experimental data using a mean squared error approach and a modified Pearson's χ² goodness-of-fit measure. Our analysis indicated that the variability in spectral accuracy contributed more to the errors than the approximation of the theoretical isotope distribution by our proposed average DNA/RNA model. The prediction model is implemented as an online tool. An R function can be downloaded to incorporate the method in custom analysis workflows to process mass spectral data.
topic DNA
RNA
oligonucleotide
prediction
isotope distribution
mass spectrometry
url https://www.mdpi.com/2218-1989/11/6/400
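
The description above names three ingredients: an additive log-ratio (ALR) transform of the isotope peaks, a weighted polynomial regression on mass, and a mean squared error / Pearson χ² comparison. The Python sketch below illustrates how these pieces can fit together. It is not the authors' R function: the helper names, the synthetic `toy_distribution` (a stand-in for BRAIN output), the uniform weights, the 5 peaks, and the polynomial order 3 are all assumptions made for illustration, and the plain Pearson χ² shown here is not the modified measure used in the article.

```python
import math
import numpy as np
from numpy.polynomial import polynomial as P


def alr(p):
    """Additive log-ratio transform: log of each part relative to the last part."""
    p = np.asarray(p, dtype=float)
    return np.log(p[..., :-1] / p[..., -1:])


def alr_inv(y):
    """Inverse ALR: map D-1 real coordinates back to D positive parts summing to 1."""
    y = np.asarray(y, dtype=float)
    zeros = np.zeros(y.shape[:-1] + (1,))
    e = np.exp(np.concatenate([y, zeros], axis=-1))
    return e / e.sum(axis=-1, keepdims=True)


def toy_distribution(mass_kda, n_peaks=5):
    """Synthetic isotope-like pattern (hypothetical; NOT computed with BRAIN)."""
    lam = 0.5 * mass_kda
    raw = np.array([lam**k / math.factorial(k) for k in range(n_peaks)])
    return raw / raw.sum()


# Training set: synthetic compositions on a grid of masses (in kDa).
masses = np.linspace(1.0, 10.0, 40)
comps = np.array([toy_distribution(m) for m in masses])  # shape (40, 5)
coords = alr(comps)                                      # shape (40, 4)

# Weighted polynomial fit, one polynomial per ALR coordinate.
# Uniform weights are an assumption; the article's weights are not given here.
weights = np.ones_like(masses)
coef = P.polyfit(masses, coords, deg=3, w=weights)       # shape (4, 4)


def predict(mass_kda):
    """Predict a composition for a mass by back-transforming fitted ALR coordinates."""
    return alr_inv(P.polyval(mass_kda, coef))


def mse(observed, predicted):
    """Mean squared error between two peak-intensity vectors."""
    return float(np.mean((np.asarray(observed) - np.asarray(predicted)) ** 2))


def pearson_chi2(observed, expected):
    """Plain Pearson chi-square statistic (the article uses a modified variant)."""
    observed, expected = np.asarray(observed), np.asarray(expected)
    return float(np.sum((observed - expected) ** 2 / expected))


reference = toy_distribution(4.2)
predicted = predict(4.2)
print("MSE:", mse(reference, predicted))
print("Pearson chi2:", pearson_chi2(reference, predicted))
```

Fitting in ALR coordinates and back-transforming guarantees that every predicted pattern is strictly positive and sums to one, which an ordinary regression on the raw peak intensities would not.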