A Convolutional Code-Based Sequence Analysis Model and Its Application
A new approach to encoding DNA sequences as input for DNA sequence analysis is proposed, using the error-correction coding theory of communication engineering. The encoder is a convolutional code model whose generator matrix is based on the degeneracy of codons, with a codon tr...
Main Authors: | Xiaoli Geng, Xiao Liu |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2013-04-01 |
Series: | International Journal of Molecular Sciences |
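Subjects: | convolutional code, degeneracy, codon, informational unit, code distance, characteristic average code distance, GC content, taxonomy |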
Online Access: | http://www.mdpi.com/1422-0067/14/4/8393 |
id |
doaj-c56b6e7f703b4637b769e9f1bed693c0 |
---|---|
record_format |
Article |
spelling |
doaj-c56b6e7f703b4637b769e9f1bed693c0; 2020-11-24T21:45:06Z; eng; MDPI AG; International Journal of Molecular Sciences; ISSN 1422-0067; 2013-04-01; vol. 14, no. 4, pp. 8393-8405; doi:10.3390/ijms14048393; A Convolutional Code-Based Sequence Analysis Model and Its Application; Xiaoli Geng; Xiao Liu; http://www.mdpi.com/1422-0067/14/4/8393; convolutional code; degeneracy; codon; informational unit; code distance; characteristic average code distance; GC content; taxonomy |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Xiaoli Geng, Xiao Liu |
author_sort |
Xiaoli Geng |
title |
A Convolutional Code-Based Sequence Analysis Model and Its Application |
publisher |
MDPI AG |
series |
International Journal of Molecular Sciences |
issn |
1422-0067 |
publishDate |
2013-04-01 |
description |
A new approach to encoding DNA sequences as input for DNA sequence analysis is proposed, using the error-correction coding theory of communication engineering. The encoder is a convolutional code model whose generator matrix is based on the degeneracy of codons, with each codon treated in the model as an informational unit. The utility of the proposed model was demonstrated through the analysis of twelve prokaryote and nine eukaryote DNA sequences with different GC contents. Distinct differences in code distances were observed near the initiation and termination sites of the open reading frame, which provided a well-regulated characterization of the DNA sequences. Clearly distinguished period-3 features appeared in the coding regions, and the characteristic average code distances of the analyzed sequences were approximately proportional to their GC contents, particularly in the selected prokaryotic organisms, suggesting potential utility as an additional taxonomic characteristic for studying the relationships of living organisms. |
topic |
convolutional code, degeneracy, codon, informational unit, code distance, characteristic average code distance, GC content, taxonomy |
url |
http://www.mdpi.com/1422-0067/14/4/8393 |