A new method for detecting signal regions in ordered sequences of real numbers, and application to viral genomic data.

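We present a fast, robust and parsimonious approach to detecting signals in an ordered sequence of numbers. Our motivation is in seeking a suitable method to take a sequence of scores corresponding to properties of positions in virus genomes, and find outlying regions of low scores. Suitable statistical methods without using complex models or making many assumptions are surprisingly lacking. We resolve this by developing a method that detects regions of low score within sequences of real numbers. The method makes no assumptions a priori about the length of such a region; it gives the explicit location of the region and scores it statistically. It does not use detailed mechanistic models so the method is fast and will be useful in a wide range of applications. We present our approach in detail, and test it on simulated sequences. We show that it is robust to a wide range of signal morphologies, and that it is able to capture multiple signals in the same sequence. Finally we apply it to viral genomic data to identify regions of evolutionary conservation within influenza and rotavirus.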
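
As a rough illustration of the task described here (finding an outlying region of low scores in an ordered sequence), the sketch below may help. It is not the authors' method, which this record does not specify. It swaps in two generic techniques as assumptions: a minimum-sum contiguous run search on mean-centred scores (Kadane's algorithm) to locate the region without fixing its length in advance, and a simple permutation test to attach a p-value. The function names `lowest_region` and `permutation_p_value` and the example data are hypothetical.

```python
import random


def lowest_region(scores):
    """Locate the contiguous run with the lowest mean-centred sum.

    Illustrative assumption, not the published method.
    Returns (start, end, total) with `end` exclusive.
    """
    mean = sum(scores) / len(scores)
    best_total = float("inf")
    best_start = best_end = 0
    run_total = 0.0
    run_start = 0
    for i, x in enumerate(scores):
        if run_total >= 0.0:
            # A non-negative running sum cannot help a minimum: restart here.
            run_total = x - mean
            run_start = i
        else:
            run_total += x - mean
        if run_total < best_total:
            best_total = run_total
            best_start, best_end = run_start, i + 1
    return best_start, best_end, best_total


def permutation_p_value(scores, n_shuffles=999, seed=0):
    """Estimate how often shuffled scores give a run at least as low.

    Shuffling destroys positional structure, so a small value suggests
    the observed low region is unlikely to arise by ordering alone.
    """
    rng = random.Random(seed)
    observed = lowest_region(scores)[2]
    shuffled = list(scores)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if lowest_region(shuffled)[2] <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_shuffles + 1)


# Hypothetical per-position scores with a dip at positions 5 to 8.
example_scores = [5, 6, 5, 7, 6, 1, 0, 1, 2, 6, 5, 7, 6, 5, 6]
start, end, total = lowest_region(example_scores)
p_value = permutation_p_value(example_scores)
print(f"lowest region: positions {start} to {end - 1}, p ~ {p_value:.3f}")
```

Centring on the mean is what lets the search choose the region length by itself: positions scoring above average add a positive term and so are excluded unless they bridge two low stretches. Capturing several signals in one sequence, as the abstract mentions, would need an extra step that this sketch does not attempt.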

Bibliographic Details
Main Authors: Julia R Gog, Andrew M L Lever, Jordan P Skittrall
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2018-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC5898753?pdf=render
id doaj-8c397e40868b46ad8d9e8869e346ffe5
record_format Article
spelling doaj-8c397e40868b46ad8d9e8869e346ffe5; 2020-11-25T01:37:00Z; eng; Public Library of Science (PLoS); PLoS ONE; ISSN 1932-6203; 2018-01-01; 13(4): e0195763; doi:10.1371/journal.pone.0195763; A new method for detecting signal regions in ordered sequences of real numbers, and application to viral genomic data.; Julia R Gog, Andrew M L Lever, Jordan P Skittrall; http://europepmc.org/articles/PMC5898753?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author Julia R Gog
Andrew M L Lever
Jordan P Skittrall
title A new method for detecting signal regions in ordered sequences of real numbers, and application to viral genomic data.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2018-01-01
description We present a fast, robust and parsimonious approach to detecting signals in an ordered sequence of numbers. Our motivation is in seeking a suitable method to take a sequence of scores corresponding to properties of positions in virus genomes, and find outlying regions of low scores. Suitable statistical methods without using complex models or making many assumptions are surprisingly lacking. We resolve this by developing a method that detects regions of low score within sequences of real numbers. The method makes no assumptions a priori about the length of such a region; it gives the explicit location of the region and scores it statistically. It does not use detailed mechanistic models so the method is fast and will be useful in a wide range of applications. We present our approach in detail, and test it on simulated sequences. We show that it is robust to a wide range of signal morphologies, and that it is able to capture multiple signals in the same sequence. Finally we apply it to viral genomic data to identify regions of evolutionary conservation within influenza and rotavirus.
url http://europepmc.org/articles/PMC5898753?pdf=render