Entropy and Fractal Dimension Study of the TDP-43 Protein Low Complexity Domain Sequence in ALS Disease Severity and SARS-CoV-2 Gene Sequences in Virulence Variability
The low complexity domain (LCD) sequence has been defined in terms of entropy using a 12 amino acid sliding window along a protein sequence in the study of disease-related genes. The amyotrophic lateral sclerosis (ALS)-related TDP-43 protein sequence with intra-LCD structural information based on cr...
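Main Authors: | Sunil Dehipawala, Eric Cheung, George Tremberger, Tak Cheung |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-08-01 |
Series: | Entropy |
Subjects: | entropy, fractal dimension, TDP-43, low complexity domain sequence, SARS-CoV-2, HAR1 |
Online Access: | https://www.mdpi.com/1099-4300/23/8/1038 |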
id |
doaj-2469a46e206e43dc96f047dcb4e785b7 |
---|---|
record_format |
Article |
spelling |
doaj-2469a46e206e43dc96f047dcb4e785b7; 2021-08-26T13:44:16Z; eng; MDPI AG; Entropy; 1099-4300; 2021-08-01; 23; 1038; 1038; 10.3390/e23081038; Entropy and Fractal Dimension Study of the TDP-43 Protein Low Complexity Domain Sequence in ALS Disease Severity and SARS-CoV-2 Gene Sequences in Virulence Variability; Sunil Dehipawala (0), Eric Cheung (1), George Tremberger (2), Tak Cheung (3); (0) Physics Department, City University of New York Queensborough Community College, Bayside, NY 11364, USA; (1) Psychiatry Department, Montefiore Mount Vernon Hospital, Mount Vernon, NY 10550, USA; (2) Physics Department, City University of New York Queensborough Community College, Bayside, NY 11364, USA; (3) Physics Department, City University of New York Queensborough Community College, Bayside, NY 11364, USA; The low complexity domain (LCD) sequence has been defined in terms of entropy using a 12 amino acid sliding window along a protein sequence in the study of disease-related genes. The amyotrophic lateral sclerosis (ALS)-related TDP-43 protein sequence with intra-LCD structural information based on cryo-EM data was published recently. An application of entropy and Higuchi fractal dimension calculations was described using the Znf521 and HAR1 sequences. A computational analysis of the intra-LCD sequence entropy and Higuchi fractal dimension values at the amino acid level and at the ATCG nucleotide level was conducted without the sliding window requirement. The computational results were consistent in predicting the intermediate entropy/fractal dimension value produced when two subsequences at two different entropy/fractal dimension values were combined. The computational method without the application of a sliding window was extended to an analysis of the recently reported virulent genes (Orf6, Nsp6, and Orf7a) in SARS-CoV-2. The relationship between the virulence functionality and entropy values was found to have correlation coefficients between 0.84 and 0.99, using a 5% uncertainty on the cell viability data. The analysis found that the most virulent Orf6 gene sequence had the lowest nucleotide entropy and the highest protein fractal dimension, in line with extreme value theory. The Orf6 codon usage bias in relation to vaccine design was discussed.; https://www.mdpi.com/1099-4300/23/8/1038; entropy, fractal dimension, TDP-43, low complexity domain sequence, SARS-CoV-2, HAR1 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Sunil Dehipawala, Eric Cheung, George Tremberger, Tak Cheung |
spellingShingle |
Sunil Dehipawala Eric Cheung George Tremberger Tak Cheung Entropy and Fractal Dimension Study of the TDP-43 Protein Low Complexity Domain Sequence in ALS Disease Severity and SARS-CoV-2 Gene Sequences in Virulence Variability Entropy entropy fractal dimension TDP-43 low complexity domain sequence SARS-CoV-2 HAR1 |
author_facet |
Sunil Dehipawala, Eric Cheung, George Tremberger, Tak Cheung |
author_sort |
Sunil Dehipawala |
title |
Entropy and Fractal Dimension Study of the TDP-43 Protein Low Complexity Domain Sequence in ALS Disease Severity and SARS-CoV-2 Gene Sequences in Virulence Variability |
title_short |
Entropy and Fractal Dimension Study of the TDP-43 Protein Low Complexity Domain Sequence in ALS Disease Severity and SARS-CoV-2 Gene Sequences in Virulence Variability |
title_full |
Entropy and Fractal Dimension Study of the TDP-43 Protein Low Complexity Domain Sequence in ALS Disease Severity and SARS-CoV-2 Gene Sequences in Virulence Variability |
title_fullStr |
Entropy and Fractal Dimension Study of the TDP-43 Protein Low Complexity Domain Sequence in ALS Disease Severity and SARS-CoV-2 Gene Sequences in Virulence Variability |
title_full_unstemmed |
Entropy and Fractal Dimension Study of the TDP-43 Protein Low Complexity Domain Sequence in ALS Disease Severity and SARS-CoV-2 Gene Sequences in Virulence Variability |
title_sort |
entropy and fractal dimension study of the tdp-43 protein low complexity domain sequence in als disease severity and sars-cov-2 gene sequences in virulence variability |
publisher |
MDPI AG |
series |
Entropy |
issn |
1099-4300 |
publishDate |
2021-08-01 |
description |
The low complexity domain (LCD) sequence has been defined in terms of entropy using a 12 amino acid sliding window along a protein sequence in the study of disease-related genes. The amyotrophic lateral sclerosis (ALS)-related TDP-43 protein sequence with intra-LCD structural information based on cryo-EM data was published recently. An application of entropy and Higuchi fractal dimension calculations was described using the Znf521 and HAR1 sequences. A computational analysis of the intra-LCD sequence entropy and Higuchi fractal dimension values at the amino acid level and at the ATCG nucleotide level was conducted without the sliding window requirement. The computational results were consistent in predicting the intermediate entropy/fractal dimension value produced when two subsequences at two different entropy/fractal dimension values were combined. The computational method without the application of a sliding window was extended to an analysis of the recently reported virulent genes (Orf6, Nsp6, and Orf7a) in SARS-CoV-2. The relationship between the virulence functionality and entropy values was found to have correlation coefficients between 0.84 and 0.99, using a 5% uncertainty on the cell viability data. The analysis found that the most virulent Orf6 gene sequence had the lowest nucleotide entropy and the highest protein fractal dimension, in line with extreme value theory. The Orf6 codon usage bias in relation to vaccine design was discussed. |
topic |
entropy, fractal dimension, TDP-43, low complexity domain sequence, SARS-CoV-2, HAR1 |
url |
https://www.mdpi.com/1099-4300/23/8/1038 |
work_keys_str_mv |
AT sunildehipawala entropyandfractaldimensionstudyofthetdp43proteinlowcomplexitydomainsequenceinalsdiseaseseverityandsarscov2genesequencesinvirulencevariability AT ericcheung entropyandfractaldimensionstudyofthetdp43proteinlowcomplexitydomainsequenceinalsdiseaseseverityandsarscov2genesequencesinvirulencevariability AT georgetremberger entropyandfractaldimensionstudyofthetdp43proteinlowcomplexitydomainsequenceinalsdiseaseseverityandsarscov2genesequencesinvirulencevariability AT takcheung entropyandfractaldimensionstudyofthetdp43proteinlowcomplexitydomainsequenceinalsdiseaseseverityandsarscov2genesequencesinvirulencevariability |
_version_ |
1721193571159113728 |
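
The description in this record contrasts an entropy computed over a 12 amino acid sliding window with an entropy computed over a whole (sub)sequence. As a rough illustration only, the sketch below computes Shannon entropy from symbol frequencies in both ways. It is not the authors' code: the function names `shannon_entropy` and `sliding_window_entropy`, the use of base-2 logarithms, and the toy sequences are all assumptions made for this example, and the record does not say how the article normalizes its entropy values.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(seq: str) -> float:
    """Shannon entropy (bits per symbol) of the symbol frequencies in seq.

    Works for any alphabet, e.g. amino acid letters or A/T/C/G nucleotides.
    Assumption: base-2 logarithm; the article may use another convention.
    """
    if not seq:
        return 0.0
    n = len(seq)
    counts = Counter(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def sliding_window_entropy(seq: str, window: int = 12) -> List[float]:
    """Entropy of every length-`window` slice of seq (default 12 symbols).

    Returns an empty list when seq is shorter than the window.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        shannon_entropy(seq[i:i + window])
        for i in range(len(seq) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical toy sequences, NOT taken from TDP-43 or SARS-CoV-2.
    low_complexity = "GGGGSGGGGSGGGGSGGG"
    mixed = "MSEYIRVTEDENDEPIEI"

    # Whole-sequence entropy, with no sliding window.
    print(shannon_entropy(low_complexity))
    print(shannon_entropy(mixed))
    print(shannon_entropy(low_complexity + mixed))

    # Windowed entropy profile along the concatenated sequence.
    print(sliding_window_entropy(low_complexity + mixed, window=12))
```

The whole-sequence call gives one number per subsequence, which is the kind of value that can be compared across subsequences or genes; the windowed call gives a profile along the sequence, which is the form used to locate low complexity regions. The Higuchi fractal dimension mentioned in the description is not covered here, because it needs a numeric encoding of the sequence that this record does not specify.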