GPrank: an R package for detecting dynamic elements from genome-wide time series

Abstract. Background: Genome-wide high-throughput sequencing (HTS) time series experiments are a powerful tool for monitoring various genomic elements over time. They can be used to monitor, for example, gene or transcript expression with RNA sequencing (RNA-seq), DNA methylation levels with bisulfite...

Bibliographic Details
Main Authors:	Hande Topa, Antti Honkela
Format:	Article
Language:	English
Published:	BMC 2018-10-01
Series:	BMC Bioinformatics
Subjects:	Gaussian process; High-throughput sequencing; Time series; Ranking; Bayes factor; Visualization
Online Access:	http://link.springer.com/article/10.1186/s12859-018-2370-4

id	doaj-a382d5f11c42415f9b79e51a629bd554
record_format	Article
spelling	doaj-a382d5f11c42415f9b79e51a629bd554; 2020-11-25T02:05:57Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2018-10-01; 19116; 10.1186/s12859-018-2370-4; GPrank: an R package for detecting dynamic elements from genome-wide time series; Hande Topa (0), Antti Honkela (1); (0) Institute for Molecular Medicine Finland FIMM, University of Helsinki; (1) Helsinki Institute for Information Technology HIIT, Department of Mathematics and Statistics, University of Helsinki; http://link.springer.com/article/10.1186/s12859-018-2370-4; Gaussian process; High-throughput sequencing; Time series; Ranking; Bayes factor; Visualization
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Hande Topa; Antti Honkela
author_facet	Hande Topa; Antti Honkela
author_sort	Hande Topa
title	GPrank: an R package for detecting dynamic elements from genome-wide time series
title_sort	gprank: an r package for detecting dynamic elements from genome-wide time series
publisher	BMC
series	BMC Bioinformatics
issn	1471-2105
publishDate	2018-10-01
description	Background: Genome-wide high-throughput sequencing (HTS) time series experiments are a powerful tool for monitoring various genomic elements over time. They can be used to monitor, for example, gene or transcript expression with RNA sequencing (RNA-seq), DNA methylation levels with bisulfite sequencing (BS-seq), or abundances of genetic variants in populations with pooled sequencing (Pool-seq). However, because of high experimental costs, the time series data sets often consist of a very limited number of time points with very few or no biological replicates, posing challenges in the data analysis. Results: Here we present the GPrank R package for modelling genome-wide time series by incorporating variance information obtained during pre-processing of the HTS data using probabilistic quantification methods or from a beta-binomial model using sequencing depth. GPrank is well-suited for analysing both short and irregularly sampled time series. It is based on modelling each time series by two Gaussian process (GP) models, namely, time-dependent and time-independent GP models, and comparing the evidence provided by data under two models by computing their Bayes factor (BF). Genomic elements are then ranked by their BFs, and temporally most dynamic elements can be identified. Conclusions: Incorporating the variance information helps GPrank avoid false positives without compromising computational efficiency. Fitted models can be easily further explored in a browser. Detection and visualisation of temporally most active dynamic elements in the genome can provide a good starting point for further downstream analyses for increasing our understanding of the studied processes.
topic	Gaussian process; High-throughput sequencing; Time series; Ranking; Bayes factor; Visualization
url	http://link.springer.com/article/10.1186/s12859-018-2370-4
work_keys_str_mv	AT handetopa gprankanrpackagefordetectingdynamicelementsfromgenomewidetimeseries AT anttihonkela gprankanrpackagefordetectingdynamicelementsfromgenomewidetimeseries
_version_	1724935944963883008
