Rare Events and Conditional Events on Random Strings
Some strings -the texts- are assumed to be randomly generated, according to a probability model that is either a Bernoulli model or a Markov model. A rare event is the over or under-representation of a word or a set of words. The aim of this paper is twofold. First, a single word is given. One...
Main Authors: | Mireille Régnier, Alain Denise |
---|---|
Format: | Article |
Language: | English |
Published: | Discrete Mathematics & Theoretical Computer Science, 2004-12-01 |
Series: | Discrete Mathematics & Theoretical Computer Science |
Online Access: | http://www.dmtcs.org/dmtcs-ojs/index.php/dmtcs/article/view/186 |
id |
doaj-337652cdbfd24a93bab2348bd84fb550 |
---|---|
record_format |
Article |
collection |
DOAJ |
sources |
DOAJ |
author |
Mireille Régnier Alain Denise |
title |
Rare Events and Conditional Events on Random Strings |
publisher |
Discrete Mathematics & Theoretical Computer Science |
series |
Discrete Mathematics & Theoretical Computer Science |
issn |
1462-7264 1365-8050 |
publishDate |
2004-12-01 |
description |
Some strings -the texts- are assumed to be randomly generated, according to a probability model that is either a Bernoulli model or a Markov model. A rare event is the over or under-representation of a word or a set of words. The aim of this paper is twofold. First, a single word is given. One studies the tail distribution of the number of its occurrences. Sharp large deviation estimates are derived. Second, one assumes that a given word is overrepresented. The distribution of a second word is studied; formulae for the expectation and the variance are derived. In both cases, the formulae are accurate and actually computable. These results have applications in computational biology, where a genome is viewed as a text. |
url |
http://www.dmtcs.org/dmtcs-ojs/index.php/dmtcs/article/view/186 |