Computing distribution of scale independent motifs in biological sequences
Abstract: The use of Chaos Game Representation (CGR) or its generalization, Universal Sequence Maps (USM), to describe the distribution of biological sequences has been found objectionable because of the fractal structure of that coordinate system. Consequently, the invest...
Main Authors: | Vinga Susana, Almeida Jonas S |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2006-10-01 |
Series: | Algorithms for Molecular Biology |
Online Access: | http://www.almob.org/content/1/1/18 |
id | doaj-cdfccd1b26644d72827b21639211eb61 |
---|---|
record_format | Article |
spelling | doaj-cdfccd1b26644d72827b21639211eb61 2020-11-25T01:58:20Z eng BMC Algorithms for Molecular Biology 1748-7188 2006-10-01 1 1 18 10.1186/1748-7188-1-18 Computing distribution of scale independent motifs in biological sequences Vinga Susana Almeida Jonas S Abstract: The use of Chaos Game Representation (CGR) or its generalization, Universal Sequence Maps (USM), to describe the distribution of biological sequences has been found objectionable because of the fractal structure of that coordinate system. Consequently, the investigation of distribution of symbolic motifs at multiple scales is hampered by an inexact association between distance and sequence dissimilarity. A solution to this problem could unleash the use of iterative maps as phase-state representation of sequences where its statistical properties can be conveniently investigated. In this study a family of kernel density functions is described that accommodates the fractal nature of iterative function representations of symbolic sequences and, consequently, enables the exact investigation of sequence motifs of arbitrary lengths in that scale-independent representation. Furthermore, the proposed kernel density includes both Markovian succession and currently used alignment-free sequence dissimilarity metrics as special solutions. Therefore, the fractal kernel described is in fact a generalization that provides a common framework for a diverse suite of sequence analysis techniques. http://www.almob.org/content/1/1/18 |
collection | DOAJ |
language | English |
format | Article |
sources | DOAJ |
author | Vinga Susana Almeida Jonas S |
spellingShingle | Vinga Susana Almeida Jonas S Computing distribution of scale independent motifs in biological sequences Algorithms for Molecular Biology |
author_facet | Vinga Susana Almeida Jonas S |
author_sort | Vinga Susana |
title | Computing distribution of scale independent motifs in biological sequences |
publisher | BMC |
series | Algorithms for Molecular Biology |
issn | 1748-7188 |
publishDate | 2006-10-01 |
description | Abstract: The use of Chaos Game Representation (CGR) or its generalization, Universal Sequence Maps (USM), to describe the distribution of biological sequences has been found objectionable because of the fractal structure of that coordinate system. Consequently, the investigation of distribution of symbolic motifs at multiple scales is hampered by an inexact association between distance and sequence dissimilarity. A solution to this problem could unleash the use of iterative maps as phase-state representation of sequences where its statistical properties can be conveniently investigated. In this study a family of kernel density functions is described that accommodates the fractal nature of iterative function representations of symbolic sequences and, consequently, enables the exact investigation of sequence motifs of arbitrary lengths in that scale-independent representation. Furthermore, the proposed kernel density includes both Markovian succession and currently used alignment-free sequence dissimilarity metrics as special solutions. Therefore, the fractal kernel described is in fact a generalization that provides a common framework for a diverse suite of sequence analysis techniques. |
url | http://www.almob.org/content/1/1/18 |
work_keys_str_mv | AT vingasusana computingdistributionofscaleindependentmotifsinbiologicalsequences AT almeidajonass computingdistributionofscaleindependentmotifsinbiologicalsequences |
_version_ | 1724970280386822144 |
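
Illustrative note on the abstract: the Chaos Game Representation it mentions maps a DNA sequence into the unit square by starting at the centre and moving halfway toward the corner assigned to each successive nucleotide. The sketch below shows only that standard iterated map, not the kernel density functions proposed in the article. The corner layout and the names `CORNERS`, `cgr_points` and `shared_suffix_length` are assumptions made for illustration.

```python
from typing import List, Tuple

# Corner layout is a convention; this particular assignment is an assumption.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq: str) -> List[Tuple[float, float]]:
    """Return the CGR coordinate reached after each symbol of seq."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]  # raises KeyError on symbols outside A, C, G, T
        x, y = (x + cx) / 2, (y + cy) / 2  # move halfway toward that corner
        points.append((x, y))
    return points


def shared_suffix_length(a: str, b: str) -> int:
    """Count how many trailing symbols two sequences have in common."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


if __name__ == "__main__":
    a, b = "GATTACA", "CCCTACA"
    (xa, ya), (xb, yb) = cgr_points(a)[-1], cgr_points(b)[-1]
    k = shared_suffix_length(a, b)
    # Sequences sharing a suffix of length k end up within 2**-k of each
    # other along each axis.
    print(k, max(abs(xa - xb), abs(ya - yb)) < 2 ** -k)
```

The guarantee runs in one direction only: a shared suffix of length k forces the two end points into the same sub-square of side 2^-k, but two points lying on opposite sides of a quadrant boundary can be close without sharing any suffix. This is the inexact association between distance and sequence dissimilarity that the abstract refers to.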