Summary: | Expansion of homo-repeats is a molecular basis of human neurological diseases. We are the first to study the influence of homo-repeats longer than four amino acid residues on the aggregation properties of 1,449,683 proteins across 122 eukaryotic and bacterial proteomes. Only 15% of these proteins (215,481) contain homo-repeats of such length. We demonstrate that RNA-binding proteins with a prion-like domain are enriched in homo-repeats compared with other non-redundant protein sequences and with those in the PDB. We performed a bioinformatics analysis of these proteins and found that proteins with homo-repeats are, on average, twice as long as proteins in the whole database. Moreover, we are the first to discover that, as a rule, homo-repeats appear in proteins not alone but in pairs: hydrophobic and aromatic homo-repeats appear with similar ones, while homo-repeats of small, polar, and charged amino acids pair with one another with differing preferences. We developed a new complementary approach to demonstrate the influence of homo-repeats on the aggregation properties of their host proteins. We show that adding artificial homo-repeats to natural and random proteins increases the aggregation propensity of those proteins. The maximal effect is observed for inserted artificial homo-repeats of 5-6 residues, which is consistent with the minimal length of an amyloidogenic region. We also demonstrate that the ability of proteins with homo-repeats to aggregate cannot be explained solely by the presence of long homo-repeats in them. Other characteristics of these proteins must also enhance aggregation, such as the appearance of homo-repeats in pairs within the same protein. We are the first to develop an approach for studying the influence of homo-repeats on protein aggregation properties and to apply it to a large number of proteomes and proteins.
|
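As an illustration of the definition used above (a homo-repeat is a run of one amino acid longer than four residues, that is, at least five), the following minimal Python sketch locates such runs in a sequence. The function name `find_homo_repeats`, the `min_len` parameter, and the example sequence are hypothetical and chosen only for illustration; this is not the authors' pipeline.

```python
import re


def find_homo_repeats(sequence, min_len=5):
    """Return (residue, start, length) for each run of one residue
    that is at least min_len long. Hypothetical helper, not the
    method used in the study."""
    # (.) captures one residue; \1{min_len-1,} requires it to repeat
    # at least min_len - 1 more times, giving a run of >= min_len.
    pattern = re.compile(r"(.)\1{%d,}" % (min_len - 1))
    return [(m.group(1), m.start(), len(m.group(0)))
            for m in pattern.finditer(sequence)]


# Made-up sequence: a Q run of 6, an A run of 5, and an L run of 4.
example = "MKQQQQQQAGSTAAAAALLLLV"
print(find_homo_repeats(example))
```

With the default threshold, the Q and A runs are reported, while the four-residue L run is skipped because it does not exceed four residues.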