A degeneration-reducing criterion for optimal digital mapping of genetic codes

Bibliographic Details
Main Authors: Helena Skutkova, Denisa Maderankova, Karel Sedlar, Robin Jugas, Martin Vitek
Format: Article
Language: English
Published: Elsevier 2019-01-01
Series: Computational and Structural Biotechnology Journal
Online Access: http://www.sciencedirect.com/science/article/pii/S2001037018301557
Description
Summary: Bioinformatics may appear to be a field that primarily processes large string datasets, since nucleotides and amino acids are represented by dedicated characters. However, many computational tasks in bioinformatics are mathematical problems that can be understood as operations on numbers, and many of them are in fact solved this way in the background. One of the most widely used digital representations maps nucleotides and amino acids to the integers 0–3 and 0–20, respectively. The limitation of this mapping appears when the digital signal of nucleotides has to be translated into a digital signal of amino acids, because the genetic code is degenerate. This causes non-monotonicities in the mapping function. Although a map for reducing this undesirable effect has already been proposed, it is defined only theoretically and only for standard genetic codes. In this study, we derived a novel optimal criterion for reducing the influence of degeneration by utilizing a large dataset of real sequences with various genetic codes. As a result, we proposed a new robust, globally optimal map suitable for any genetic code, as well as specialized optimal maps for particular genetic codes.
ISSN: 2001-0370
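
The effect described in the summary can be illustrated with a minimal Python sketch. It is not the criterion derived in the article. It only counts how often the amino acid value drops while the codon value rises, under one arbitrarily chosen pair of maps. The standard genetic code is written as the usual 64-letter string with bases ordered T, C, A, G. The names `codon_value` and `count_descents`, the alphabetical nucleotide map, and the alphabetical amino acid map are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1); codons are enumerated
# with bases in the order T, C, A, G, first position varying slowest.
BASES = "TCAG"
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), STANDARD_AAS)
}


def codon_value(codon, nuc_map):
    """Read a codon as a three-digit base-4 number under the nucleotide map."""
    first, second, third = (nuc_map[n] for n in codon)
    return 16 * first + 4 * second + third


def count_descents(nuc_map, aa_map):
    """Count steps where the amino acid value drops as the codon value rises."""
    ordered = sorted(CODON_TABLE, key=lambda codon: codon_value(codon, nuc_map))
    values = [aa_map[CODON_TABLE[codon]] for codon in ordered]
    return sum(1 for prev, cur in zip(values, values[1:]) if cur < prev)


# Illustrative maps only (assumptions, not the maps proposed in the article).
nuc_map = {n: i for i, n in enumerate("ACGT")}  # A=0, C=1, G=2, T=3
aa_map = {aa: i for i, aa in enumerate("ACDEFGHIKLMNPQRSTVWY*")}  # 0..20, stop=20

print(count_descents(nuc_map, aa_map))
```

Trying other permutations of `nuc_map` and `aa_map` changes the count, which gives a rough sense of why the choice of map matters when a nucleotide signal is translated into an amino acid signal.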