uShuffle: A useful tool for shuffling biological sequences while preserving the k-let counts
Abstract. Background: Randomly shuffled sequences are routinely used in sequence analysis to evaluate the statistical significance of a biological sequence. In many cases, biologists need sophisticated shuffling tools that preserve not only the counts of...
Main Authors: | Gillespie Joel, Anderson James, Jiang Minghui, Mayne Martin |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2008-04-01 |
Series: | BMC Bioinformatics |
Online Access: | http://www.biomedcentral.com/1471-2105/9/192 |
id |
doaj-96704ac442bc4c23a491560888ea23f0 |
---|---|
record_format |
Article |
spelling |
doaj-96704ac442bc4c23a491560888ea23f0 2020-11-24T22:10:08Z eng BMC BMC Bioinformatics 1471-2105 2008-04-01 9 1 192 10.1186/1471-2105-9-192 uShuffle: A useful tool for shuffling biological sequences while preserving the k-let counts Gillespie Joel; Anderson James; Jiang Minghui; Mayne Martin http://www.biomedcentral.com/1471-2105/9/192 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Gillespie Joel; Anderson James; Jiang Minghui; Mayne Martin |
title |
uShuffle: A useful tool for shuffling biological sequences while preserving the k-let counts |
publisher |
BMC |
series |
BMC Bioinformatics |
issn |
1471-2105 |
publishDate |
2008-04-01 |
description |
Abstract. Background: Randomly shuffled sequences are routinely used in sequence analysis to evaluate the statistical significance of a biological sequence. In many cases, biologists need sophisticated shuffling tools that preserve not only the counts of distinct letters but also higher-order statistics such as doublet counts, triplet counts, and, in general, k-let counts. Results: We present a sequence analysis tool (named uShuffle) for generating uniform random permutations of biological sequences (such as DNAs, RNAs, and proteins) that preserve the exact k-let counts. The uShuffle tool implements the latest variant of the Euler algorithm and uses Wilson's algorithm in the crucial step of arborescence generation. It is carefully engineered and extremely efficient. The uShuffle tool achieves maximum flexibility by allowing arbitrary alphabet size and let size. It can be used as a command-line program, a web application, or a utility library. Source code in C, Java, and C#, and integration instructions for Perl and Python are provided. Conclusion: The uShuffle tool surpasses existing implementations of the Euler algorithm in both performance and flexibility. It is a useful tool for the bioinformatics community. |
url |
http://www.biomedcentral.com/1471-2105/9/192 |
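
The description above states that uShuffle generates permutations that preserve the exact k-let counts of a sequence. As a minimal sketch of what that property means, the Python snippet below counts overlapping k-lets and checks whether a candidate permutation keeps them. This is not uShuffle's own code and does not implement the Euler or Wilson algorithms; the function names `klet_counts` and `preserves_klets` and the example sequences are assumptions made for illustration.

```python
from collections import Counter


def klet_counts(seq: str, k: int) -> Counter:
    """Count every overlapping substring of length k (a k-let) in seq."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def preserves_klets(original: str, shuffled: str, k: int) -> bool:
    """Return True if shuffled has the same length and k-let counts as original.

    Hypothetical helper for illustration; it only verifies the property and
    does not generate shuffled sequences.
    """
    return (len(original) == len(shuffled)
            and klet_counts(original, k) == klet_counts(shuffled, k))


if __name__ == "__main__":
    original = "ACAGAT"
    # Both sequences contain the doublets AC, CA, AG, GA, AT exactly once.
    print(preserves_klets(original, "AGACAT", 2))
    # This permutation keeps the letter counts but changes the doublets.
    print(preserves_klets(original, "AACGAT", 2))
```

A check like this could be used to validate the output of any k-let-preserving shuffler. Note that preserving doublet counts does not imply preserving triplet counts: the pair above matches for k = 2 but not for k = 3.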