Entropy and Fractal Dimension Study of the TDP-43 Protein Low Complexity Domain Sequence in ALS Disease Severity and SARS-CoV-2 Gene Sequences in Virulence Variability

The low complexity domain (LCD) sequence has been defined in terms of entropy using a 12 amino acid sliding window along a protein sequence in the study of disease-related genes. The amyotrophic lateral sclerosis (ALS)-related TDP-43 protein sequence with intra-LCD structural information based on cr...

Bibliographic Details
Main Authors: Sunil Dehipawala, Eric Cheung, George Tremberger, Tak Cheung
Format: Article
Language: English
Published: MDPI AG 2021-08-01
Series: Entropy
Subjects: entropy; fractal dimension; TDP-43; low complexity domain sequence; SARS-CoV-2; HAR1
Online Access: https://www.mdpi.com/1099-4300/23/8/1038
id doaj-2469a46e206e43dc96f047dcb4e785b7
record_format Article
spelling doaj-2469a46e206e43dc96f047dcb4e785b7
2021-08-26T13:44:16Z
eng
MDPI AG
Entropy
1099-4300
2021-08-01
23
1038
1038
10.3390/e23081038
Sunil Dehipawala (Physics Department, City University of New York Queensborough Community College, Bayside, NY 11364, USA)
Eric Cheung (Psychiatry Department, Montefiore Mount Vernon Hospital, Mount Vernon, NY 10550, USA)
George Tremberger (Physics Department, City University of New York Queensborough Community College, Bayside, NY 11364, USA)
Tak Cheung (Physics Department, City University of New York Queensborough Community College, Bayside, NY 11364, USA)
collection DOAJ
language English
format Article
sources DOAJ
author Sunil Dehipawala
Eric Cheung
George Tremberger
Tak Cheung
title Entropy and Fractal Dimension Study of the TDP-43 Protein Low Complexity Domain Sequence in ALS Disease Severity and SARS-CoV-2 Gene Sequences in Virulence Variability
publisher MDPI AG
series Entropy
issn 1099-4300
publishDate 2021-08-01
description The low complexity domain (LCD) sequence has been defined in terms of entropy using a 12 amino acid sliding window along a protein sequence in the study of disease-related genes. The amyotrophic lateral sclerosis (ALS)-related TDP-43 protein sequence with intra-LCD structural information based on cryo-EM data was published recently. An application of entropy and Higuchi fractal dimension calculations was described using the Znf521 and HAR1 sequences. A computational analysis of the intra-LCD sequence entropy and Higuchi fractal dimension values at the amino acid level and at the ATCG nucleotide level was conducted without the sliding window requirement. The computational results were consistent in predicting the intermediate entropy/fractal dimension value produced when two subsequences at two different entropy/fractal dimension values were combined. The computational method without the application of a sliding window was extended to an analysis of the recently reported virulent genes Orf6, Nsp6, and Orf7a in SARS-CoV-2. The relationship between the virulence functionality and entropy values was found to have correlation coefficients between 0.84 and 0.99, using a 5% uncertainty on the cell viability data. The analysis found that the most virulent Orf6 gene sequence had the lowest nucleotide entropy and the highest protein fractal dimension, in line with extreme value theory. The Orf6 codon usage bias in relation to vaccine design was discussed.
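The two quantities named in the description can be illustrated with a minimal sketch. It is not the authors' code: it assumes Shannon entropy in bits over symbol frequencies and the standard Higuchi curve-length construction, and the function names, the `kmax` default, and the numeric nucleotide coding are all hypothetical choices made for illustration. The record does not say how the authors mapped symbols to numbers.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy (bits per symbol) of the symbol frequencies in seq."""
    n = len(seq)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(seq).values())


def sliding_window_entropy(seq, window=12):
    """Entropy of each length-`window` stretch of seq, one value per start position."""
    return [shannon_entropy(seq[i:i + window])
            for i in range(len(seq) - window + 1)]


def higuchi_fd(series, kmax=8):
    """Higuchi fractal dimension: least-squares slope of log L(k) vs log(1/k)."""
    n = len(series)
    xs, ys = [], []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(k):  # 0-based starting offset of each sub-series
            steps = (n - 1 - m) // k
            if steps < 1:
                continue
            dist = sum(abs(series[m + i * k] - series[m + (i - 1) * k])
                       for i in range(1, steps + 1))
            # Normalised curve length for this offset and step size k
            lengths.append(dist * (n - 1) / (steps * k) / k)
        mean_len = sum(lengths) / len(lengths) if lengths else 0.0
        if mean_len > 0:  # log is undefined for a flat sub-series
            xs.append(math.log(1.0 / k))
            ys.append(math.log(mean_len))
    if len(xs) < 2:
        raise ValueError("series too short or too flat for a slope estimate")
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


# Hypothetical numeric coding, chosen only so the example runs.
NUCLEOTIDE_CODE = {"A": 1, "C": 2, "G": 3, "T": 4}


def encode(seq):
    """Map a nucleotide string to a numeric series using the coding above."""
    return [NUCLEOTIDE_CODE[s] for s in seq]
```

Calling `shannon_entropy` on a whole sequence corresponds to the window-free calculation the description mentions, while `sliding_window_entropy(seq, window=12)` corresponds to the 12-residue window used to define an LCD. As a sanity check, a straight-line series such as `list(range(100))` should give a Higuchi dimension of about 1.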
topic entropy
fractal dimension
TDP-43
low complexity domain sequence
SARS-CoV-2
HAR1
url https://www.mdpi.com/1099-4300/23/8/1038