IMPLEMENTATION OF THE BRANCH AND BOUND METHOD FOR THE PROBLEM OF RECOVERING A DISTANCES MATRIX BETWEEN DNA STRINGS

Bioinformatics faces two important problems in the study of evolutionary processes: calculating the distance between DNA sequences and restoring a matrix of distances between sequences of different genomes when not all of its elements are known. Due to the large dimension of DNA chains, heuristic algorithms...

Bibliographic Details
Main Authors: Mikhail E. Abramyan, Boris F. Melnikov, Marina A. Trenina
Format: Article
Language: Russian
Published: The Fund for Promotion of Internet media, IT education, human development «League Internet Media» 2019-04-01
Series: Современные информационные технологии и IT-образование
Subjects: branch and bound method; DNA sequences; distance matrix; partially filled matrix; recovery
Online Access: http://sitito.cs.msu.ru/index.php/SITITO/article/view/475
id doaj-03d8516bb1164c9492bf066230e7113f
record_format Article
spelling doaj-03d8516bb1164c9492bf066230e7113f 2020-12-02T01:17:36Z rus
doi 10.25559/SITITO.15.201901.81-91
volume 15, issue 1, pages 81-91
affiliation Mikhail E. Abramyan: Southern Federal University (Russia)
Boris F. Melnikov: Russian State Social University (Russia)
Marina A. Trenina: Togliatti State University (Russia)
collection DOAJ
language Russian
format Article
sources DOAJ
author Mikhail E. Abramyan
Boris F. Melnikov
Marina A. Trenina
title IMPLEMENTATION OF THE BRANCH AND BOUND METHOD FOR THE PROBLEM OF RECOVERING A DISTANCES MATRIX BETWEEN DNA STRINGS
publisher The Fund for Promotion of Internet media, IT education, human development «League Internet Media»
series Современные информационные технологии и IT-образование
issn 2411-1473
publishDate 2019-04-01
description Bioinformatics faces two important problems in the study of evolutionary processes: calculating the distance between DNA sequences and restoring a matrix of distances between sequences of different genomes when not all of its elements are known. Because DNA chains are very long, heuristic algorithms are used to solve the first problem. Their main drawback is that different heuristic algorithms yield different distance values for the same pair of DNA chains. To compare these heuristic algorithms, a badness index was introduced for all triangles formed in the DNA matrix; ideally it should be zero. This index was subsequently also used to solve the second problem. Based on the requirement that the badness index be equal to zero, an algorithm for restoring the matrix of distances between DNA chains was developed. This algorithm is optimized using the branch and bound method, which selects the order of computing the unknown elements for which the badness of the resulting matrix is the smallest. Part of the paper is devoted to a detailed description of the algorithms. The most important auxiliary algorithm is the one that selects the separating element, i.e. the unknown element that is recovered first. The main algorithm, i.e. the actual recovery of the DNA matrix by the branch and bound method, includes a description of the task, which consists of subtasks; each subtask contains a transformed matrix, a sequence of already restored elements of the original matrix, and the set of not yet filled elements that cannot be selected in the next step of the algorithm. The paper also includes a program implementation of the described algorithms.
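The idea summarized in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not give the badness formula or the rule for computing an unknown element, so both are assumptions here. `triangle_badness` uses a hypothetical measure (a - b) / a for sides a >= b >= c, which is zero when the two longest sides are equal, and `estimate` is a hypothetical fill rule that averages the longer known side over all triangles through the missing element. All function names are illustrative. Unknown elements are stored as `None`.

```python
from itertools import combinations

def triangle_badness(x, y, z):
    """Hypothetical badness: (a - b) / a for sides a >= b >= c (an assumption)."""
    a, b, _ = sorted((x, y, z), reverse=True)
    return (a - b) / a if a > 0 else 0.0

def matrix_badness(d):
    """Sum of badness over all triangles whose three sides are known."""
    total = 0.0
    for i, j, k in combinations(range(len(d)), 3):
        sides = (d[i][j], d[i][k], d[j][k])
        if None not in sides:
            total += triangle_badness(*sides)
    return total

def estimate(d, i, j):
    """Assumed fill rule: mean of the longer known side over triangles (i, j, k)."""
    cands = [max(d[i][k], d[k][j]) for k in range(len(d))
             if k not in (i, j) and d[i][k] is not None and d[k][j] is not None]
    return sum(cands) / len(cands) if cands else None

def recover(d):
    """Branch and bound over the order in which unknown elements are filled."""
    best = {"badness": float("inf"), "matrix": None, "order": None}

    def branch(m, order):
        # Completed triangles never change, so their badness is a lower bound.
        bound = matrix_badness(m)
        if bound >= best["badness"]:
            return  # prune: this subtask cannot beat the best known result
        unknown = [(i, j) for i, j in combinations(range(len(m)), 2)
                   if m[i][j] is None]
        if not unknown:
            best.update(badness=bound, matrix=m, order=order)
            return
        for i, j in unknown:
            value = estimate(m, i, j)
            if value is None:
                continue  # element cannot be selected at this step
            child = [row[:] for row in m]
            child[i][j] = child[j][i] = value
            branch(child, order + [(i, j)])

    branch([row[:] for row in d], [])
    return best

# Symmetric 4x4 example with two unknown distances (made-up numbers).
example = [
    [0, 4, 4, None],
    [4, 0, 3, 5],
    [4, 3, 0, None],
    [None, 5, None, 0],
]
result = recover(example)
```

The lower bound is valid because badness is non-negative and a triangle that is already complete keeps its value in every descendant subtask, so the partial sum can only grow as more elements are filled.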
topic branch and bound method
DNA sequences
distance matrix
partially filled matrix
recovery
url http://sitito.cs.msu.ru/index.php/SITITO/article/view/475