A Convolutional Code-Based Sequence Analysis Model and Its Application

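A new approach for encoding DNA sequences as input for DNA sequence analysis is proposed using the error correction coding theory of communication engineering. The encoder was designed as a convolutional code model whose generator matrix is designed based on the degeneracy of codons, with a codon treated in the model as an informational unit. The utility of the proposed model was demonstrated through the analysis of twelve prokaryote and nine eukaryote DNA sequences having different GC contents. Distinct differences in code distances were observed near the initiation and termination sites in the open reading frame, which provided a well-regulated characterization of the DNA sequences. Clearly distinguished period-3 features appeared in the coding regions, and the characteristic average code distances of the analyzed sequences were approximately proportional to their GC contents, particularly in the selected prokaryotic organisms, presenting the potential utility as an added taxonomic characteristic for use in studying the relationships of living organisms.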

Bibliographic Details
Main Authors: Xiaoli Geng, Xiao Liu
Format: Article
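Language: English
Published: MDPI AG 2013-04-01
Series: International Journal of Molecular Sciences
Subjects: convolutional code, degeneracy, codon, informational unit, code distance, characteristic average code distance, GC content, taxonomy
Online Access: http://www.mdpi.com/1422-0067/14/4/8393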
id doaj-c56b6e7f703b4637b769e9f1bed693c0
record_format Article
spelling doaj-c56b6e7f703b4637b769e9f1bed693c0
2020-11-24T21:45:06Z
eng
MDPI AG
International Journal of Molecular Sciences
1422-0067
2013-04-01
14
4
8393
8405
10.3390/ijms14048393
A Convolutional Code-Based Sequence Analysis Model and Its Application
Xiaoli Geng
Xiao Liu
A new approach for encoding DNA sequences as input for DNA sequence analysis is proposed using the error correction coding theory of communication engineering. The encoder was designed as a convolutional code model whose generator matrix is designed based on the degeneracy of codons, with a codon treated in the model as an informational unit. The utility of the proposed model was demonstrated through the analysis of twelve prokaryote and nine eukaryote DNA sequences having different GC contents. Distinct differences in code distances were observed near the initiation and termination sites in the open reading frame, which provided a well-regulated characterization of the DNA sequences. Clearly distinguished period-3 features appeared in the coding regions, and the characteristic average code distances of the analyzed sequences were approximately proportional to their GC contents, particularly in the selected prokaryotic organisms, presenting the potential utility as an added taxonomic characteristic for use in studying the relationships of living organisms.
http://www.mdpi.com/1422-0067/14/4/8393
convolutional code
degeneracy
codon
informational unit
code distance
characteristic average code distance
GC content
taxonomy
collection DOAJ
language English
format Article
sources DOAJ
author Xiaoli Geng
Xiao Liu
title A Convolutional Code-Based Sequence Analysis Model and Its Application
publisher MDPI AG
series International Journal of Molecular Sciences
issn 1422-0067
publishDate 2013-04-01
description A new approach for encoding DNA sequences as input for DNA sequence analysis is proposed using the error correction coding theory of communication engineering. The encoder was designed as a convolutional code model whose generator matrix is designed based on the degeneracy of codons, with a codon treated in the model as an informational unit. The utility of the proposed model was demonstrated through the analysis of twelve prokaryote and nine eukaryote DNA sequences having different GC contents. Distinct differences in code distances were observed near the initiation and termination sites in the open reading frame, which provided a well-regulated characterization of the DNA sequences. Clearly distinguished period-3 features appeared in the coding regions, and the characteristic average code distances of the analyzed sequences were approximately proportional to their GC contents, particularly in the selected prokaryotic organisms, presenting the potential utility as an added taxonomic characteristic for use in studying the relationships of living organisms.
topic convolutional code
degeneracy
codon
informational unit
code distance
characteristic average code distance
GC content
taxonomy
url http://www.mdpi.com/1422-0067/14/4/8393