Identification and Analysis of Long Repeats of Proteins at the Domain Level

Amino acid repeats play an important role in the structure and function of proteins. Analysis of long repeats in protein sequences enables one to understand their abundance, structure and function in the protein universe. In the present study, amino acid repeats of length >50 (long repeats) w...

Bibliographic Details
Main Authors: David Mary Rajathei, Subbiah Parthasarathy, Samuel Selvaraj
Format: Article
Language: English
Published: Frontiers Media S.A. 2019-10-01
Series: Frontiers in Bioengineering and Biotechnology
Subjects: long repeats; protein; domain; protein family; enzyme and non-enzyme classes; structural fold
Online Access: https://www.frontiersin.org/article/10.3389/fbioe.2019.00250/full
id doaj-456f22f8cf664e38a17ed80978dbbdae
record_format Article
spelling doaj-456f22f8cf664e38a17ed80978dbbdae; 2020-11-25T01:38:43Z; eng; Frontiers Media S.A.; Frontiers in Bioengineering and Biotechnology; 2296-4185; 2019-10-01; 7; 10.3389/fbioe.2019.00250; 464697; Identification and Analysis of Long Repeats of Proteins at the Domain Level; David Mary Rajathei; Subbiah Parthasarathy; Samuel Selvaraj; https://www.frontiersin.org/article/10.3389/fbioe.2019.00250/full; long repeats; protein; domain; protein family; enzyme and non-enzyme classes; structural fold
collection DOAJ
language English
format Article
sources DOAJ
author David Mary Rajathei
Subbiah Parthasarathy
Samuel Selvaraj
title Identification and Analysis of Long Repeats of Proteins at the Domain Level
publisher Frontiers Media S.A.
series Frontiers in Bioengineering and Biotechnology
issn 2296-4185
publishDate 2019-10-01
description Amino acid repeats play an important role in the structure and function of proteins. Analyzing long repeats in protein sequences helps clarify their abundance, structure, and function across the protein universe. In the present study, amino acid repeats of length >50 (long repeats) were identified in a non-redundant set of UniProt sequences using the RADAR program. The underlying structures and functions of these long repeats were analyzed using Gene3D for structural domains, Pfam for functional domains, and an enzyme and non-enzyme functional classification for the catalytic and binding functions of the proteins. From a structural perspective, these long repeats seem to occur predominantly in certain architectures such as sandwich, bundle, barrel, and roll, and within these architectures they are abundant in the superfolds. The lengths of the repeats within each fold are not uniform, exhibiting different structures for different functions. We also observed that long repeats lie in the domain regions of the family and are involved in the function of the proteins. After grouping by enzyme and non-enzyme classes, we observed an abundant occurrence of long repeats in specific catalytic and binding functions of the proteins. In this study, we have analyzed the occurrence of long repeats in the protein sequence universe, apart from the well-characterized short tandem repeats, together with their structures and functions at the domain level. The present study suggests that long repeats may play an important role in the structure and function of protein domains.
topic long repeats
protein
domain
protein family
enzyme and non-enzyme classes
structural fold
url https://www.frontiersin.org/article/10.3389/fbioe.2019.00250/full