A guild of 45 CRISPR-associated (Cas) protein families and multiple CRISPR/Cas subtypes exist in prokaryotic genomes.

Bibliographic Details
Main Authors: Daniel H Haft, Jeremy Selengut, Emmanuel F Mongodin, Karen E Nelson
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2005-11-01
Series: PLoS Computational Biology
Online Access: http://europepmc.org/articles/PMC1282333?pdf=render
id doaj-2e8aa03d7f3b476b924959dd7fda3bfc
record_format Article
timestamp 2020-11-25T01:34:04Z
volume 1
issue 6
article e60
doi 10.1371/journal.pcbi.0010060
collection DOAJ
language English
format Article
sources DOAJ
author Daniel H Haft
Jeremy Selengut
Emmanuel F Mongodin
Karen E Nelson
title A guild of 45 CRISPR-associated (Cas) protein families and multiple CRISPR/Cas subtypes exist in prokaryotic genomes.
publisher Public Library of Science (PLoS)
series PLoS Computational Biology
issn 1553-734X
1553-7358
publishDate 2005-11-01
description Clustered regularly interspaced short palindromic repeats (CRISPRs) are a family of DNA direct repeats found in many prokaryotic genomes. Repeats of 21-37 bp typically show weak dyad symmetry and are separated by regularly sized, nonrepetitive spacer sequences. Four CRISPR-associated (Cas) protein families, designated Cas1 to Cas4, are strictly associated with CRISPR elements and always occur near a repeat cluster. Some spacers originate from mobile genetic elements and are thought to confer "immunity" against the elements that harbor these sequences. In the present study, we have systematically investigated uncharacterized proteins encoded in the vicinity of these CRISPRs and found many additional protein families that are strictly associated with CRISPR loci across multiple prokaryotic species. Multiple sequence alignments and hidden Markov models have been built for 45 Cas protein families. These models identify family members with high sensitivity and selectivity and classify key regulators of development, DevR and DevS, in Myxococcus xanthus as Cas proteins. These identifications show that CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a repeat cluster or filling the region between two repeat clusters. Distinctive subsets of the collection of Cas proteins recur in phylogenetically distant species and correlate with characteristic repeat periodicity. The analyses presented here support initial proposals of mobility of these units, along with the likelihood that loci of different subtypes interact with one another as well as with host cell defensive, replicative, and regulatory systems. It is evident from this analysis that CRISPR/cas loci are larger, more complex, and more heterogeneous than previously appreciated.
url http://europepmc.org/articles/PMC1282333?pdf=render