IMPLEMENTATION OF THE BRANCH AND BOUND METHOD FOR THE PROBLEM OF RECOVERING A DISTANCES MATRIX BETWEEN DNA STRINGS
Two problems in bioinformatics are important for the study of evolutionary processes: calculating the distance between DNA sequences, and restoring a matrix of distances between the sequences of different genomes when not all of its elements are initially known. Because DNA chains are very long, heuristic algorithms...
Main Authors: | Mikhail E. Abramyan, Boris F. Melnikov, Marina A. Trenina |
---|---|
Format: | Article |
Language: | Russian |
Published: | The Fund for Promotion of Internet media, IT education, human development «League Internet Media», 2019-04-01 |
Series: | Современные информационные технологии и IT-образование |
Subjects: | branch and bound method; DNA sequences; distance matrix; partially filled matrix; recovery |
Online Access: | http://sitito.cs.msu.ru/index.php/SITITO/article/view/475 |
id |
doaj-03d8516bb1164c9492bf066230e7113f |
---|---|
record_format |
Article |
spelling |
doaj-03d8516bb1164c9492bf066230e7113f; 2020-12-02T01:17:36Z; rus; The Fund for Promotion of Internet media, IT education, human development «League Internet Media»; Современные информационные технологии и IT-образование; ISSN 2411-1473; 2019-04-01; vol. 15, no. 1, pp. 81-91; DOI 10.25559/SITITO.15.201901.81-91; IMPLEMENTATION OF THE BRANCH AND BOUND METHOD FOR THE PROBLEM OF RECOVERING A DISTANCES MATRIX BETWEEN DNA STRINGS; Mikhail E. Abramyan [0], Boris F. Melnikov [1], Marina A. Trenina [2]; [0] Southern Federal University (Russia), [1] Russian State Social University (Russia), [2] Togliatti State University (Russia); http://sitito.cs.msu.ru/index.php/SITITO/article/view/475; branch and bound method; DNA sequences; distance matrix; partially filled matrix; recovery |
collection |
DOAJ |
language |
Russian |
format |
Article |
sources |
DOAJ |
author |
Mikhail E. Abramyan Boris F. Melnikov Marina A. Trenina |
author_sort |
Mikhail E. Abramyan |
title |
IMPLEMENTATION OF THE BRANCH AND BOUND METHOD FOR THE PROBLEM OF RECOVERING A DISTANCES MATRIX BETWEEN DNA STRINGS |
publisher |
The Fund for Promotion of Internet media, IT education, human development «League Internet Media» |
series |
Современные информационные технологии и IT-образование |
issn |
2411-1473 |
publishDate |
2019-04-01 |
description |
Two problems in bioinformatics are important for the study of evolutionary processes: calculating the distance between DNA sequences, and restoring a matrix of distances between the sequences of different genomes when not all of its elements are initially known. Because DNA chains are very long, heuristic algorithms are used to solve the first problem. Their main drawback is that different heuristic algorithms produce different distance values for the same pair of DNA chains.
To compare these heuristic algorithms, a badness index was introduced for all the triangles formed in the DNA distance matrix; ideally it should be zero. This index was subsequently also used to solve the second problem. Based on the requirement that the badness index be zero, an algorithm for restoring the matrix of distances between DNA chains was developed. This algorithm is optimized using the branch and bound method, which selects the order of computing the unknown elements for which the badness of the matrix is smallest.
Part of the paper is devoted to a detailed description of the algorithms. The most important auxiliary algorithm selects the separating element, i.e. the unknown element to be recovered first. The main algorithm, i.e. the actual recovery of the DNA matrix by the branch and bound method, includes a description of the task, which consists of subtasks, each containing a transformed matrix, the sequence of already restored elements of the original matrix, and the set of not yet filled elements that cannot be selected at the next step of the algorithm. The paper also includes a program implementation of the described algorithms. |
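The abstract above does not give the badness formula or the branching rules, so the following Python sketch only illustrates the general idea under stated assumptions. The badness of a triangle is assumed here to be the relative gap between its two longest sides (zero for an isosceles triangle whose equal sides are the longest); the function names `triangle_badness`, `matrix_badness`, `fill_value`, and `recover` are hypothetical and are not taken from the paper. The search branches on which unknown element to fill next and prunes a branch once its accumulated badness reaches the best complete result found so far.

```python
from itertools import combinations

def triangle_badness(a, b, c):
    # ASSUMPTION: illustrative stand-in, not the paper's definition.
    # Relative gap between the two longest sides; 0 means "ideal".
    _, mid, longest = sorted((a, b, c))
    return 0.0 if longest == 0 else (longest - mid) / longest

def matrix_badness(d):
    # Sum of badness over all triangles whose three sides are known.
    total = 0.0
    for i, j, k in combinations(range(len(d)), 3):
        sides = (d[i][j], d[i][k], d[j][k])
        if None not in sides:
            total += triangle_badness(*sides)
    return total

def fill_value(d, i, j):
    # Candidate values come from triangles (i, j, k) with two known sides:
    # max of those sides gives that triangle zero badness.
    pairs = [(d[i][k], d[j][k]) for k in range(len(d))
             if k not in (i, j) and d[i][k] is not None and d[j][k] is not None]
    if not pairs:
        return None  # this element cannot be recovered at this step
    candidates = sorted({max(p) for p in pairs})
    return min(candidates,
               key=lambda v: sum(triangle_badness(v, x, y) for x, y in pairs))

def recover(d):
    # Branch and bound over the ORDER in which unknown elements are filled.
    n = len(d)
    best = {"score": float("inf"), "matrix": None}

    def search(m):
        score = matrix_badness(m)
        # Bound: filling more elements only adds non-negative terms.
        if score >= best["score"]:
            return
        unknown = [(i, j) for i in range(n) for j in range(i + 1, n)
                   if m[i][j] is None]
        if not unknown:
            best["score"], best["matrix"] = score, [row[:] for row in m]
            return
        for i, j in unknown:          # branch: choose the next element
            v = fill_value(m, i, j)
            if v is None:
                continue
            m[i][j] = m[j][i] = v
            search(m)
            m[i][j] = m[j][i] = None  # undo before trying another branch

    search([row[:] for row in d])
    return best["matrix"], best["score"]

# Toy 4x4 symmetric matrix with one unknown element (None).
d = [[0, 2, 4, 6],
     [2, 0, 4, 6],
     [4, 4, 0, None],
     [6, 6, None, 0]]
matrix, score = recover(d)
print(matrix[2][3], score)  # prints: 6 0.0
```

The pruning step is valid in this sketch because a filled element is never changed afterwards, so the summed badness of a partial matrix can only grow as more elements are recovered. A real implementation would also need a fallback for elements that no triangle constrains, which this sketch simply skips.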
topic |
branch and bound method DNA sequences distance matrix partially filled matrix recovery |
url |
http://sitito.cs.msu.ru/index.php/SITITO/article/view/475 |