Rare Events and Conditional Events on Random Strings

Some strings, the texts, are assumed to be randomly generated according to a probability model that is either a Bernoulli model or a Markov model. A rare event is the over- or under-representation of a word or a set of words. The aim of this paper is twofold. First, a single word is given. One...

Bibliographic Details
Main Authors: Mireille Régnier, Alain Denise
Format: Article
Language: English
Published: Discrete Mathematics & Theoretical Computer Science 2004-12-01
Series: Discrete Mathematics & Theoretical Computer Science
Online Access: http://www.dmtcs.org/dmtcs-ojs/index.php/dmtcs/article/view/186
ISSN: 1462-7264, 1365-8050
Description: Some strings, the texts, are assumed to be randomly generated according to a probability model that is either a Bernoulli model or a Markov model. A rare event is the over- or under-representation of a word or a set of words. The aim of this paper is twofold. First, a single word is given, and one studies the tail distribution of its number of occurrences; sharp large deviation estimates are derived. Second, one assumes that a given word is overrepresented and studies the distribution of a second word; formulae for the expectation and the variance are derived. In both cases, the formulae are accurate and actually computable. These results have applications in computational biology, where a genome is viewed as a text.