ebook img

SURVEY AND SUMMARY Comparative genomics of defense systems in archaea and bacteria PDF

18 Pages·2013·1.4 MB·English
by  
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview SURVEY AND SUMMARY Comparative genomics of defense systems in archaea and bacteria

4360–4377 Nucleic Acids Research, 2013, Vol. 41, No. 8 Published online 6 March 2013 doi:10.1093/nar/gkt157 SURVEY AND SUMMARY Comparative genomics of defense systems in archaea and bacteria Kira S. Makarova, Yuri I. Wolf and Eugene V. Koonin* National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA Received December 27, 2012; Revised February 14, 2013; Accepted February 15, 2013 Do w n lo a d ABSTRACT genomicsandexperimentalstudyofvirus-hostinteraction e d Our knowledge of prokaryotic defense systems has h(5a–v8e).revealed many new antiviral defense mechanisms from vastly expanded as the result of comparative h The defense systems of prokaryotes can be classified ttp gateionon.mTichisaneaxlpysainss,iofonllioswbeodthbqyuaenxptietartimivee,ntianlcluvadliindg- iancttoiont.woThberofiardst ggrroouupps itnhcalutddesifftehrosine dtheefeirnsemosdysetsemosf s://ac a the discovery of diverse new examples of known thatfunctionontheself–non-selfdiscriminationprinciple, d e types of defense systems, such as restriction- with DNA usually being the target of the discriminatory m ic modification or toxin-antitoxin systems, and qualita- recognition; these defense mechanisms can be viewed as .o u tive, including the discovery of fundamentally new prokaryotic immunity. At least three types of defense p.c systems and their derivatives belong to this group. The o defense mechanisms, such as the CRISPR-Cas m immunity system. Large-scale statistical analysis best characterized of these are the extremely numerous /na reveals that the distribution of different defense and diverse restriction-modification (R-M) system that r/a systemsinbacterialandarchaealtaxaisnon-uniform, uosgenimzeetahnydlactlieoanvetoalnaybeulntmheod‘siefilefd’ g‘nenoonm-sieclfD’ DNNAAan(9d–1re1c)-. rticle with four groups of organisms distinguishable with -a Another defense system in this group is DNA phospho- b s respect to the overall abundance and the balance rothioation (known as the DND system), which labels tra between specific types of defense systems. The DNA by phosphothiolation and destroys unmodified ct/4 genes encoding defense system components in bac- DNA (8,12,13). The R-M and DND systems represent 1/8 terialandarchaeatypicallyclusterindefenseislands. the prokaryotic version of innate immunity. /4 3 In addition to genes encoding known defense Unlike R-M and DND systems, which attack non-self 60 systems, these islands contain numerous unchar- invaders indiscriminately, the CRISPR (Clustered /24 0 acterized genes, which are candidates for new types Regularly Interspaced Short Palindromic Repeats)-Cas 95 of defense systems. The tight association of the (tCheReInScPoRu-natsesroscwiaittehdingfeencetsio)ussysatgemenstiasnadbalettatockmitemspoerciizfe- 50 b genes encoding immunity systems and dormancy- ically afterwards (14–18). Thus, CRISPR-Cas is often y gu orcelldeath-inducingdefensesystemsinprokaryotic e viewed as a prokaryotic adaptive immunity system. s genomes suggests that these two major types of The second group of defense systems is generally based t on defensearefunctionallycoupled,providingforeffect- on programmedcell deathordormancy induced byinfec- 08 ive protection at the population level. tion. Numerous and diverse toxin-antitoxin (TA) systems A p belonginthiscategory.Dependingonthenatureoftoxins ril 2 andantitoxins,theTAsystemsarecurrentlyclassifiedinto 0 INTRODUCTION 1 three types: typeI with antisense RNA asantitoxin and a 9 Arms race between viruses and their hosts is arguably the protein, usually a small membrane holin-like protein as a most powerful and relentless driving force in evolution toxin; type II, in which both toxin and antitoxin are (1–3). As a result, numerous extremely diverse and elab- proteins, and type III, in which with the RNA antitoxin orateantiviraldefensesystemshaveevolvedandoccupya directly inactivates the protein toxin (7,19–28). Two add- substantial part of the genome especially in free-living itionaltypesofTAsystems(IVandV)havebeenrecently archaea and bacteria (4,5). Although some of these proposed based on distinct mechanisms of action of the systems have been known for many years and have been respective antitoxins (29,30). In addition to the TA thoroughly characterized, recent advances in comparative systems, abortive infection (ABI) or phage exclusion *To whom correspondence should be addressed. Tel:+1 301 496 2477, ext 294; Fax:+1 301 480 9241; Email: [email protected] PublishedbyOxfordUniversityPress2013. ThisisanOpenAccessarticledistributedunderthetermsoftheCreativeCommonsAttributionNon-CommercialLicense(http://creativecommons.org/licenses/ by-nc/3.0/),whichpermitsunrestrictednon-commercialuse,distribution,andreproductioninanymedium,providedtheoriginalworkisproperlycited. NucleicAcidsResearch,2013,Vol.41,No.8 4361 systems also often use the mechanism of cell death or The immediate outcome of the analysis of the distribu- dormancy. These systems have not been so far classified tion of defense genes is their pronounced enrichment in in detail, but some of them fit well into the TA systems archaea compared with bacteria and in thermophiles description (31). The vast majority of toxins in both TA (especially hyperthermophiles) compared with mesophiles systems and ABI systems interfere with the translation andpsychrophiles(5).Thetwotrends,thedependencyon process, mostly via mRNA or tRNA cleavage. taxonomy and temperature preference, seem to be inde- Numerousrecentcomparativegenomicstudiesnotonly pendent of each other. A deeper analysis of the distribu- revealedthehighabundanceoftheknowndefensesystem tion of the relative abundances of genes belonging to and predicted new ones whose molecular mechanisms of different defense systems reveals four distinct clusters of action remain to be characterized but also highlighted organisms in the principal component-like space several distinct properties of these systems. (Figure 2) as indicated by gap function analysis (32). This observation implies the existence of four distinct . The genes encoding different defense systems often ‘defense strategies’: (i) all defense systems are under-rep- Do cluster in genomic islands of larger than an operon w resented relative to their expected abundance: in the re- n size. lo spectiveorganisms,defenseiseitherabandonedaltogether a . The immunity systems are often encoded within the or reduced to bare-bones minimum; (ii) the total number de d same genomic loci with systems that cause cell death ofgenesdedicatedtodefenseisclosetotheexpectedvalue; fro or dormancy, and, at least in some cases, the two prevalence of R-M and ABI over TA and CRISPR; (iii) m classes of defense systems functionally cooperate. the total number of genes dedicated to defense is close to http . Dcoimffebriennettofamfoirlmies(aolfmotosxt)inasllapnodssaibnlteitToxAinpsaoirfst.en re- tRh-eMexapnedcteAdBvI;alauned; p(irve)vaallelndceefeonfseTAsysatenmdsCaRreISoPvRer-orevper- s://ac . Defense systems or their components sometimes resented,i.e.agreaterthanaveragefractionofthegenome ad e change their mode of actions. Thus, R-M systems m is dedicated to antivirus defense (Figure 2A). can switch to the functional mode characteristic of An overwhelming majority of bacterial thermophiles, ic.o u TA systems, whereas individual components of TA along with the archaea, regardless of the optimal growth p.c systems can act solo as ABI systems. temperature, follow strategies (iii) or (iv), including a om general over-representation of defense system genes /n The purpose of this article is to examine these recent a observations in some detail and to focus on several (Figure 2B and C). Bacteria are widely spread across the r/a rseycsetenmtlys opfrebdaicctteerdiaaannddsatrilclhpaoeao.rlTyhcehfaurnaccttieornizsedanddefceonmse- egnrotiurepspsahroamwientgeraspraanceg,ewoifthdemfeonssteosftrtahteegliaersgaemboancgtertihael rticle-a parative genomics of well-characterized prokaryotic repCreersteanitnaltyi,veongeenhoamsetso(kFeiegpurien2mCi)n.d that the aforemen- bstra defense systems such as R-M, TA and CRISPR-Cas c have been discussed in detail in multiple reviews; there- tioned partitioning of the archaeal and bacterial defense t/41 feonrtef,ehaeturer,eswoefotnhleyseinscylsutdeembs.rief summaries of the pertin- sstyrsatteemgsiesbyisgceonnodmiteioanneadlyosnis.ouInr apbairltiticyutloari,daesnstiigfynmdeefnetnosef /8/436 anorganismtothefirststrategy(noorlittledefense)could 0 /2 be somewhat naı¨ve in the sense that some of these organ- 4 0 isms might use completely uncharacterized novel defense 95 5 DISTRIBUTION OF DEFENSE SYSTEMS IN systems. This concern is minor when it comes to parasitic 0 b ARCHAEA AND BACTERIA AND FOUR orsymbioticorganismswithverysmallgenomestowhich y g DISTINCT DEFENSE STRATEGIES INFERRED this strategy (or perhaps more precisely, lack of defense u e FROM GENOME ANALYSIS strategy) trivially applies. However, extreme paucity of st o identifiable defense systems has been noted also for n Thefractionofbacterialandarchaealgenomesallottedto 0 some bacteria with large genomes, e.g. Paenibacillus sp, 8 defense systems varies broadly, from virtual absence to A (cid:2)bo1u0n%d f(oFrigeuarceh1Aty)p.eTohfesdeefdeinssteribsuytsitoenmssrbefleceacutstehemalonwy wthiethseacasgeesn,otmheepooftemntoiarel utnhkannowsenvsenloommeglaabragsee,san(5d).itIins pril 2 0 a question of major interest whether the lifestyle of these 1 more instances undoubtedly remain to be discovered as 9 organisms renders defense systems superfluous or favours discussedintherestofthisarticle.Theoverallabundance novel defense mechanisms. ofdefensesystemsshowsnearlyperfectlinearscalingwith genome size (5). The number of TA genes generally in- creases faster than linearly (as a power of (cid:2)1.3 of the DEFENSE ISLANDS totalnumberofgenes),ABIsystemgenestakeanapproxi- mately constant fraction of the genome ((cid:2)1 per 1000 Manycases ofclustering of defensegenes on thechromo- genes), and R-M genes scale sublinearly with the genome somes have been described (27,33,34) as well as involve- size (power of (cid:2)0.75) (Figure 1B). The CRISPR-Cas ment of transposable elements in horizontal transfer of system abundance is statistically the same in large and defense genes (35–37), indicating high mobility and pref- small genomes. The differential scaling with genome size erential attachment of these systems. Thus, unlike other implies that it is most appropriate to analyse the abun- functionalgroupsofbacterialandarchaealgenes(suchas dance of defense systems genes relative to the expected sugar metabolism, energy metabolism, etc.), defense abundance, given the host genome size. systems and mobilome-related genes, such as prophages, 4362 NucleicAcidsResearch,2013,Vol.41,No.8 A Pelodictyon phaeoclathratiforme. The detection of TA extreme abundance of defense systems in taxonomically R-M scattered bacteria implies that such over-representation is not lineage-specific but is perhaps dictated by the Abi ecology of the respective organisms that might be CRISPR subject to unusual massive assault by invasive agents (5). all defense Thissimpleoperationaldefinitionofdefenseislandshas islands proved extremely useful for the prediction of new defense systems (5) and understanding the cooperation between them (see later in the text). Figure 3 shows several examples of defense islands that are specifically enriched for genes from different defense systems and include 0.00001 0.0001 0.001 0.01 0.1 1 several still experimentally uncharacterized genes that D fraction of genome in defense systems o w are implicated in antivirus defense. n B 1000 lo a d e d 100 DEFENSE MECHANISMS IN BACTERIA fro m AND ARCHAEA m es, syste 10 ITnhneatRe-iMmmsuynsitteym: DsNarAemproodbifiacbaltyiotnhesysbteesmtsstudied phage https://a n c no. of ge 0.11100 1000 TRA-M 10000 dp(e9xeli–tfcre1aen1tms)io.eenBmdeoiecvfcaerhureasssienttriyosicmfitntiohitnnihsebepnagrdceatnocetonrimciuaacillcoeiwamosripengsogarinnttioazmnatcthoieoel,encaeuxsatlnwaednreslbpliviraoeosltoateghpinye- ademic.oup Abi domainarchitectureoftheR-Msystems,detailedrulesfor .co CRISPR m 0.01 no. of genes, total restriction enzyme classification and nomenclature have /n all defense been developed (38). This classification divides the R-M ar/a Figure 1. Themajortypesofdefensesystemsinbacterialandarchaeal systems into four major types (I–IV), on the basis of rtic genomes.(A)Distribution(probabilitydensityfunction)ofthegenome subunit composition, ATP(GTP) requirement and le fraction occupied by defense systems in bacteria and archaea. cleavage mechanism (39–41). All the R-M systems -ab (B) Scaling of the number of genes in defense systems with the total s number of genes. A data set of 572 genomes (the largest genome in a function on the same principle of self–non-self discrimin- tra genus with addition of E. coli K12 and B. subtilis subsp. subtilis) was ation, with one enzyme, a methyltransferase (MTase), ct/4 selectedtorepresent1516genomesthatwerecompletelysequencedand modifying the self DNA and the other one, restriction 1 available through the NCBI Genome database as of February 2012. endonuclease (REase), cleaving non-methylated foreign /8/4 3 DNA (38,42). Type II R-M systems are the simplest and 6 0 by far the most common and are mostly used for experi- /2 4 form clusters the size of which by far exceeds the size of mental applications owing to the fact that these enzymes 09 typicaloperonsandthatareunlikelytoappearbychance. cleavethetargetDNAathighlyspecificsites.TheTypeII 55 0 Statistically significantover-clustering ofdifferent defense R-M systems have been further classified into several b y systems has been demonstrated (5). Briefly, many defense subtypes, primarily on the basis of cleavage specificity g u operons tend to be in closer physical proximity to each (41). The Type II systems consist solely of the MTase– es other on the chromosome, compared with the random REase pair that is typically encoded within the same t o n expectation [see (5) for details]. This finding suggests the operon, although some cases ofapparent disjointed local- 0 8 possibility of synergistic interactions between different izationofthetwogeneshavebeenreported(43).Themost A p tuynpeeqsuoivfodcaeflednesfiensiytisotnemosf.tAhelthdoefuegnhseciusrlarenndtslyanthdenreoicslenaor cthormeeplgeexnAesT,Pw-hdiecphenendceondteTtyhpeeRI R(r-eMstrsicytsitoenm),sMenc(ommopdaifiss- ril 20 1 understanding of the mechanism(s) of their formation, cation)andS(specificity)subunitsoftheR-MAcomplex; 9 a simple operational definition has been proposed. theRsubunitalsocontainsadistinctATPasedomainthat A defense island is defined as a string of continuous belongstothehelicaseSuperfamilyII(42,44,45).TypeIII genes, at least one of which belongs to a known defense R-MsystemresembleTypeIIsystemsinthattheyconsist gene families, which are flanked by house-keeping genes; of only R and M subunit but, on the other hand, are such islands are significantly enriched by defense and similar to Type I systems in that the R subunit also mobilome-related genes, compared with analogous contains the helicase domain and the reaction is blocksformedbyothergenomicsystems(5).Thepercent- ATP-dependent (46,47). Type IV R-M systems are ageofgenesfoundindefenseislandsvariesfrom0to30% distinct two-subunit complex that consist of a across the current collection of prokaryotic genomes AAA+family GTPase and an endonuclease, and cleave (Figure 1A) (5). The greatest fraction of the genome the target DNA non-specifically (45,48). dedicated to antiviral defense was detected in the cyano- Many genomic loci that encompass R-M systems of all bacterium Microcystis aeruginosa, the proteobacterium fourmajortypesalsoincludevariablegroupsofadditional Bartonella tribocorum and the bacteroidetes bacterium genes that appear to be co-expressed with the genes for NucleicAcidsResearch,2013,Vol.41,No.8 4363 A 4 3 bi) 2 A + 1 M R- 0 (-6 -4 -2 0 2 4 - R) -1 P S -2 1 RI C + -3 2 A D (T -4 3 own 4 lo -5 a d TA + CRISPR + R-M + Abi ed fro m B 4 h ttp 3 s ://a bi) 2 cad A em M + 1 ic.o R- 0 up (-6 -4 -2 0 2 4 .c PR) - -1 om/n RIS -2 AM ar/a + C -3 BM rticle A -a (T -4 ABTT bstra -5 ct/4 TA + CRISPR + R-M + Abi 1 C /8/4 3 100% 6 0 14 /2 18 3 4 75% 19 17 34 4 17 33 35 09 7 20 55 14 3 1 8 0 b 50% 17 5 9 33 y g 25% 24 24 17 3 29 23 4 28 24 uest o 8 n 0% 1 9 7 2 2 6 14 3 2 16 141 08 A 4Crenarchaeota3 2Euryarchaeota 1 Actinobacteria Bacteroidetes/Chlorobi group Chlamydiae/Verrucomicrobia Cyanobacteria Firmicutes Alphaproteobacteria Betaproteobacteria delta/epsilon subdivisions Gammaproteobacteria other pril 2019 Figure 2. Distributionofknownandpredicteddefensesystemsinarchaealandbacterialgenomes.(A)Thefour‘defensestrategies’.Here,1–4refers tothefourstrategiesdiscussedinthetext.Theaxesshowlogsoftheratiosofthenumbersofgenesbelongingtoagiventypeofdefensesystemsto the number expected from the scaling shown in Figure 1B. The horizontal axis is the sum of the logs for all four types and the vertical axis is (TA+CRISPR)(cid:3)(R-M+ABI).(B)Defensestrategiesusedbybacterialandarchaealthermophilesandmesophiles.BT,AT,BMandAMstandfor bacterial thermophiles, archaeal thermophiles, bacterial mesophiles and archaeal mesophiles, respectively. The axes show logs of the ratios of the numbersofgenesbelongingtoagiventypeofdefensesystemstothenumberexpectedfromthescalingshowninFigure1B.Thehorizontalaxisis thesumofthelogsforallfourtypesandtheverticalaxisis(TA+CRISPR)(cid:3)(R-M+ABI).(C)Distributionofthedefensestrategiesamongmajor prokaryotictaxa.Here,1–4referstothefourstrategiesdiscussedinthetext.Thenumberofanalysedgenomesforeachtaxonisindicatedinsidethe respectivebar. Theexpectedabundanceofgenes belongingtothedefensesystemsofeachtypeinagivengenomewascalculatedfromthe genome sizeusingtheobservedscalingrelationships(Figure1B).Logarithmsoftheratiosoftheobservedandexpectedfrequenciesofdefensesystemgenes in genomes were analysed using Principal Component Analysis; then the data were projected into the space of two orthogonal axes with integer coefficients closest to the first principal components. 4364 NucleicAcidsResearch,2013,Vol.41,No.8 R-M system subunits (5) (Figure 3). Although most of The third DNA modification system, which is involved these genes have not been experimentally characterized, inPhageGrowthLimitation(Pgl)system,issofarpoorly one such case has been studied in considerable detail characterized experimentally. The Pgl system is centred and presents a remarkable example of the interplay around the PglZ protein family in which the only recog- between different defense mechanisms. The Escherichia nizabledomainbelongstothealkalinephosphatasesuper- coli anticodon nuclease (ACNase) prrC co-localizes with family(pfam08665)(60).Thescarceexperimentalevidence threegenesforR-MtypeIcsystemprrIandcontributesto indicates that PglZ confers protection against the temper- the T4 phage exclusion mechanism (49–51). This genomic atebacteriophagephiC31inStreptomycescoelicolorA3(2) association that is conserved in diverse bacteria implies (61,62). This system also includes the P-loop ATPase also a functional connection, and at least one case has domain-containing protein PglY, the methylase PglW been studied in detail. The PrrC nuclease, normally and the serine-threonine kinase PglX (the latter two inactive, can be allosterically activated either by unmodi- proteins are encoded in a different locus in S. coelicolor fiedDNAorbythesmallanti-restrictionpeptideencoded genome).ThebacteriathatpossessthePglsystemsupport Do by the T4-like enterobacteriophages. The activated PrrC a phage burst on initial infection, but subsequent phage wn ACNase cleaves the anticodon of tRNALys in a growth cycles are severely restricted (62). Although the loa d GTP-dependent manner; the GTP hydrolysis is catalysed molecular mechanism of the Pgl system has not been ex- e d by the N-terminal ABC NTPase domain of PrrC. The perimentally elucidated, it has been hypothesized that it fro cleavage of tRNALys inhibits the host translation and as methylatestheDNAofthephageprogenyratherthanthe m a consequence the reproduction of the T4 phage. The hostDNAsothatonre-infection,thesurvivingcellsinthe http RtolobCeelninzkymedettohaRt i-sMhosmysotleomgso,ushatos PsirmrCiladrobesioncohtemseiecmal sparmeveenSttrpehpatgoemgyrcoews tchol(o6n1y,62co).uTldhuasc,titvhaetePgthlesyssytesmtemmiagnhdt s://ac a properties and is activated under genotoxic stress function via a reverse R-M mechanism combining the d e (52,53). Recent analysis has shown that the ACNase self–non-self discrimination and virus-induced cell death m ic domain of both proteins belongs to the HEPN superfam- modesofantivirusdefenseinanoveldefensestrategy.The .o u ily that is merging as a major group of ribonucleases that recent comparative analysis of the neighbourhoods of the p .c are involved in various forms of defense and stress pglZ gene revealed a substantial complexity of genetic or- om response (54,55). ganizationofthissystemthatcouldbepossiblycompared /n a Site-specific DNA backbone S-modification and only with the CRISPR-Cas system (see later in the text) r/a cleavage of unmodified DNA and the dndABCDE genes (5).SupplementaryTableS2liststhegenefamiliesthatare rtic (after DNA degradation phenotype; alternatively, these associated with pglZ gene. One of these families is le-a genes are designated dpt, i.e. DNA phosphothiolation) COG1479 (or DUF262 or DGQHR domain) that has b s involved in this system have been first discovered in been previously identified within the Type I R-M system tra c Streptomyces lividans 1326. Five additional genes locusinCampylobacterjejuni(63).Thecoredomainofthe t/4 (dndFGHI) that are strongly linked to this system have COG1479familybelongstotheParB-likesuperfamilyand 1/8 been found by analysis of the genomic neighbourhoods is often fused to other nucleases such HNH-type nuclease /4 3 (12,13). Recently, the genes required for modification domain, PD-(D/E)xK-like nuclease and HEPN domain, 60 (dndABCDE) and restriction (dndFGH) have been suggestingthatitmightbeanothercaseofaprogrammed /24 identified in the related system from Salmonella enterica cell-death system associated with various DNA modifica- 09 5 serovar Cerro 87 (8). The structures and biochemical tion systems (5). Based on the presence of the pglZ gene, 50 activities of the DndA and DndC proteins that are this system is found in 174 of 1516 completely sequenced by directly involved in S-modification are relatively g genomes that represent most of the major bacterial u wgeenlle-sundaesrssotcoioatded(56w,5i7th), atnhdisthseysftuenmctioanrseofletshse coltehaerr. lineages and several methanogenic and halophilic est o archaea. The remarkable complexity of the Pgl system n Moreover, the neighbourhood around the genes that 0 seems to reflect a still poorly understood elaborate 8 comprise this system is highly flexible, including cysteine A molecular mechanism of self–non-self discrimination and p desulfurase dndA, which often is not linked to the other fine-tuned regulation. ril 2 dnd genes (8). Here, we present results of additional 0 1 sequence and gene context analysis for these genes that 9 Adaptive immunity: the CRISPR-Cas system show a strong link of several components of the DND systems with ABI and TA systems (Supplementary Table The CRISPR-Cas system uses a unique defense mechan- S1). For instance, DndB, the potential negative regulator ism that involves incorporation of virus DNA fragments of restriction (13,58), contains an N-terminal region that into CRISPR repeat arrays and subsequent utilization of belongs to the ABI protein family AbiU1/AIPR/ transcripts of these inserts (spacers) as guide RNAs to COG1479, which encompasses a ParB superfamily cleave the cognate virus genome (34,64–67). Thus, the nuclease domain often fused to other nuclease domains CRISPR-Cas system represents bona fide adaptive from different families and linked to R-M systems immunity that until recently has not been discovered in (55,59). In DndB, the ParB-like domain is additionally prokaryotes and, moreover, is the most clear-cut known fused to a HEPN domain. A distinct HEPN domain case of Lamarckian inheritance (68). The role in antiviral from a different subfamily (DUF4145) is fused to DndF defense that initially was predicted for this system on the NTPase. Domains of the latter subfamily are often fused basis of the detection of spacers identical to fragments of to REase components of Type I R-M systems (55). virus and plasmid genomes and comparative analysis of NucleicAcidsResearch,2013,Vol.41,No.8 4365 Ferroglobus placidus DSM 10642 (Ferp_1845-Ferp_1869) Cas4 Csm4 Small subunit Cas1 (COG4343) Cas2 (MTH323-family) (COG3337) Cmr4 Cas10 Cas4 HEPN (COG1895)MNT (COG1708) Cmr4 Cmr1 (COG1468) MNT (COG1708) HEPN (COG2445) Cas8a2 Cas3’’ Cas3’ Cas5 Cas7 Small subunit, Csa5 CasRa (COG1517) Cas6 Microcystis aeruginosa NIES 843 (MAE_04150-MAE_04400) D o w Transposase, Transposase, PHD RelE RelE RHH PHD RelE RelE n DDE supefamily (COG3415)DDE supefamily (COG3335) (COG2161) (COG4115) (COG2929) (COG5304) (COG2161)(COG4115)(COG4115) lo a d PIN PIN HTH PIN HTH HTH ed (COG2405) (COG1569) (COG2886) (COG2405) (COG2886) (COG4737) VPEP fro PHD PIN HTH MAE_04340MAE_04350 Transposase, Transposase, MAE_04390 MAE_04400 m (COG2161)(COG2405)(COG2886) -like -like DDE supefamily (COG3335)DDE supefamily (COG3415) -like -like h ttp s Aromatoleum aromaticum EbN1 (ebA5218-ebA5250) ://ac a d HTH (COG2378) DUF1819 DUF1788 e m ATPase PIN (COG1569) RHH ic.o u p .c Methylase (COG1002) o m ATPase, FS /na ATPase (COG3950) HNH nuclease AbiLi, ATPAse (COG1106) AbiLii, Toprim domain (COG4637) r/a GIY-YIG rtic nuclease DUF4276 Transposase, DDE supefamily (COG4584) DnaC (COG1484) le -a b pAgo (COG1431) PglZ-like phosphatase stra c Lon family ATPase (COG4930) t/4 1 /8 /4 3 6 Polymorphum gilvum SL003B 26A1 (SL003B_3746-SL003B_3759) 0 /2 4 Metal-dependent hydrolase 0 Membrane protein (COG1451) HsdR(COG0610) 95 5 0 R(DNUAFs3e8 H00 s)uperfamily HsdS (COG0732) HsdM (COG0286) HsdS (COG0732) Pasins oincviaetretads perotein by g RHH RelE ue DNA invertase Pin (COG1961) DUF2924 (COG3609) (COG3668) s t o n 0 8 Chlorobium phaeobacteroides BS1 (Cphamn1_1145-Cphamn1_1161) A p Recombinase, XerD family Helicase (COG0553) DUF4391 ril 20 1 9 Methylase, Mod (COG2189) Restriction nuclease, Res (COG3587) ParB-like nuclease, RloF (COG1479) PrrC, ATP-dependent ribonuclease (COG4694) PIN (COG5611) AbrB (COG2002) Methylase (fragment) Transposase, DDE supefamily ParB-like nuclease, RloF (COG1479) PD-(D/E)XK nuclease (DUF4263/COG3586) RHH (COG3609) RelE (COG3668) Figure 3. Examplesofdefenseislandsinarchaealandbacterialgenomes.Thegenesareshownbyblockarrowswiththesizeroughlyproportionalto the sizeofthe correspondinggene.The genomicposition ofeachregion isindicated givenin parenthesesafterthe speciesnamein theformofthe range of genes denoted using the systematic names for the respective species. Colour coding is the following: pink are components of TA systems, read, components of CRISPR-Cas systems; dark blue, Pgl system; light blue, regulatory components; green, R-M systems; yellow, ABI system; orange,pAgo;brown, componentsthatarespredictedtobeinvolved indefense;grey,unknownprotein.Theproteinfamily ordomainsnames are providedabovetherespectivearrows;someofthesefamilieswererecentlyintroducedanddescribedinthecourseofcomparativegenomicanalysisof defense islands (5); COG or Pfam families are indicated in parentheses. Pgl, Phage Growth Limitation; HTH, helix-turn-helix; RHH, ribbon-helix-helix; GIY-YIG, conserved motif in a nuclease family. 4366 NucleicAcidsResearch,2013,Vol.41,No.8 Casproteinsequenceshasbeensuccessfullyconfirmedex- domain that in particular comprises the core of diverse perimentally (69). Within the few years since this key DNA and RNA polymerases (where it is denoted the breakthrough, the CRISPR research evolved into a Palm domain). Among the Cas proteins, different distinct,highlydynamicfieldofmicrobiologywithconsid- variants of the RRM domain are present in Cas2 (a erable biotechnology potential (70–73). The recent toxin-like ribonuclease), Cas10 (the so-called CRISPR advances in the study of CRISPR-Cas systems are polymerase, a protein that is homologous to polymerases covered in many reviews (15,74–76); therefore, here we and cyclases but whose actual biochemical activity presentonlyabriefoutlineofthefunctionsandcompara- remains unknown) and in the largest group of Cas tive genomics of prokaryotic adaptive immunity and proteins known as the RAMP (Repeat-Associated discuss the likely scenarios for the evolution of the differ- Mysterious Proteins) superfamily (Figure 4B). In particu- ent types of CRISPR-Cas. lar, all CRISPR-Cas systems of Type I and most of the The CRISPR-Cas systems are classified into three systems of Type III include a dedicated ribonuclease for distinct types (I, II and III) (18) and several yet unclassi- the pre-crRNA processing that typically belongs to the D o fiedminor variants(77).Thisclassification was developed Cas6 family of the RAMPs (82,83). In some cases, e.g. wn throughacombinationofcomparisonsofthesequencesof in CRISPR-Cas systems of Type I-C, the function of loa d theCasproteins,casgenerepertoiresandgenomicorgan- Cas6 is displaced by a catalytically active RAMP of the e d ization of the CRISPR-Cas loci. For each type and Cas5 family (84). In contrast, Type II CRISPR-Cas fro subtype, a specific signature gene has been identified systems use an unrelated mechanism of pre-crRNA m allowing easy classification of the highly variable cleavage. This version of pre-crRNA processing requires http CRISPR-Cas loci in the course of genome analysis (18). the involvement of the double-stranded RNA-specific s The mechanism of CRISPR-Cas is usually divided into RNase III, a specialized trans-encoded small RNA, ://a c a threestages:(i)adaptation,whennewspacershomologous which is complementary to a single CRISPR repeat, and d e to protospacer sequences in viral genomes or other alien still unidentified domains of the Cas9 protein m ic DNA molecules are integrated into the CRISPR repeat (18,69,85,86). .o u cassettes; (ii) expression and processing of pre-crRNA In Type I-E and I-F CRISPR-Cas systems, the p .c into short guide crRNAs; and (iii) interference, when the endoribonuclease that catalyses the processing of the o m alienDNAorRNAistargetedbyacomplexcontaininga pre-crRNA is a subunit of a multisubunit (or /n a CRISPR RNA (crRNA) guide and a set of Cas proteins multidomain) complex known as CASCADE r/a [forreview,see(15)].Below,wefocusonthebasicbuilding (CRISPR-associated complex for antiviral defense) (87). rtic blocks of the distinct types of CRISPR-Cas systems and The mature crRNA remains associated with the le -a summarize the current considerations on the origin and CASCADE complex that scans the target DNA for a b s evolution of this system. match, and once one is found, recruits the Cas3 protein tra c MostoftheCasproteinsequencesevolveunderrelaxed that cleaves the target via its HD endonuclease domain t/4 purifyingselection(78)and/orundergoacceleratedevolu- (88). In Type III systems (at least the model system from 1 /8 tion resulting from the virus-host arms race [e.g. (79)]. the archaeon Pyrococcus furiosus), the Cas6 /4 3 Consequently, most of these sequences are weakly endoribonuclease does not belong to the CASCADE 6 0 conserved in evolution so that conventional sequence complex that is apparently not directly involved in the /2 4 comparison partitions the Cas proteins into >100 processing but instead binds the mature crRNA (89,90). 09 5 families (18). However, advanced sequence analysis Thisdistinctionapart,thearchitecturesoftheCASCADE 5 0 combined with structural comparison identifies conserved complexesinTypeIandTypeIIICRISPR-Casaresimilar b y domainsbetweenCasproteinfamiliesthatwereoriginally and include a large subunit, a small subunit and a pair of g u consideredunrelatedandthusenablestheidentificationof RAMPs that belong to the Cas5 and Cas7 families es the major building blocks that are shared by different (84,87,90–92) (Figure 4A). Despite the high level of t o n CRISPR-Cas types (Figure 4A) (18,34,64,77). The two sequence divergence and structural rearrangements that 0 8 proteins that are present in the great majority of the is typical of many Cas proteins, there appears to be a A p CRISPR-Cas systems are Cas1 and Cas2 that together direct homologous relationships between the respective ril 2 are required and sufficient for spacer integration (the subunits of the Type I and Type III CASCADEs (77). A 0 1 adaptation phase of the CRISPR-Cas response) (80). notable difference is that Type I CRISPR-Cas 9 The only CRISPR-Cas loci that lack Cas1 and Cas2 encompasses a single Cas7 protein that is present in genes are some Type III systems that co-exist with Type several copies in the CASCADE, whereas in Type III Isystemswithinthesamegenomeandapparentlyborrow systems, there are several paralogous Cas7-like proteins. Cas1 and Cas2 proteins from the latter (18). Although In Type II CRISPR-Cas, a single large multidomain both Cas1 and Cas2 are involved in adaptation, Cas1 protein, Cas9, is responsible for all the functions that in endonuclease that adopts a unique a-helical fold (81) Type I and Type III systems are performed by the appears to possess all the required enzymatic activities, CASCADE and the Cas3 protein (93). whereas Cas2 might perform a distinct function that is ThetargetDNAcleavageinTypeI(88)andmostlikely not mechanistically related to spacer acquisition (see dis- in Type III systems (77) is catalysed by homologous HD cussion later in the text). family nucleases. In many Type III systems, the HD With the exception of Cas1, most of the common Cas domain is fused to the cas10 gene, the large subunit of proteinscontainvariousversionsoftheRNARecognition the CASCADE-like complex, whereas in Type I systems, Motif (RRM) domain, a widespread RNA-binding the most common protein architecture is Cas3 in which NucleicAcidsResearch,2013,Vol.41,No.8 4367 A Type I Type II Type III Type U Cas6 or R NCAasse 9III Cas6 Cas5 tracrRNA Cas8 Cas9 Cas10 Cas8, Small subunit# tracrRNA:crRNA Small subunit Small subunit DinG Cas7 (one) and target Cas7-like (several) Cas7 (one) Cas5 (one) binding domains Cas5-like (one) Cas5 (one) HD domain Cas9 HD domain RuvC-like, Cas3’’ of Cas10 HNH domains Cas2 Cas3’ Cas1 Cas1 Cas4 Cas2 Cas1 Cas2 CRISPR Csn2 CRISPR COG1517* Cas4 D o CRISPR w n lo a crRNA and target d Tcrleaanvsacgriept binding complex Helicase Tcaleragveat ge ed dcoismpepnosnaebnlte C R ICSAPSRC ADE Associated Other optional from Spacer insertion repeat-spacer units immunity components h ttp s ://a B c CCAASSCCAADDEE ccoommppoonneennttss a d I-E Helicase * * * * * * * * em HD Cas3 Cse1 LS Cse2 Cas7 R Cas5 R Cas6 R Cas1 Cas2 ic.o SS u p CCAASSCCAADDEE ccoommppoonneennttss .c o III-A * * * * * m /n a Cas1 Cas2 HD Cas10 LS SS Cas7 R Cas5 R Cas7 R Cas6 R r/a CCAASSCCAADDEE ccoommppoonneennttss rtic le U -ab Csf1 1 LS SS Cas7 R Cas5 R stra c t/4 1 /8 /4 3 4 1 3 2 RRM fold Cas6, pdb: 3I4H 60 two RRM domains /2 4 0 9 C N 55 0 b y Figure 4. General principles of the structure and organization of four CRISPR-Cas types. (A) The building blocks of four distinct CRISPR-Cas gu systemtypes.Thecasgenesanddomaindescriptionforeachbuildingblockaregiven.Genenamesfollowthecurrentnomenclatureandclassification es (18). The symbol ‘#’ indicates the putative small subunit that appears to be fused to the large subunit in several Type I subtypes (77). Asterisk t o indicates that those COG1517 family proteins that contain a third effector (toxin) domain are implicated in immunity-dormancy/suicide coupling. n 0 (B)RRMdomain-containingproteinsinCRISPR-Cassystems.Generalorganizationofoperonsisshownbyarrowswithsizeroughlyproportional 8 tothesizeofrespectivegene.Homologousgenesareshownbythearrowsofsamecolourorhashing.Colourcodingisthesameasinthe(A).Gene Ap apnindkfraemctialyngnlaems,ewsitahresetamkietrnanfrsopmare(n1t8,r7e7c)t.anAgdledsitiionndaicladtiensgigdneatteiorinosr:atLeSd,RlaRrgMe sfuobldu.nTit;heSSp,rosmteainllrseupbruesneint;tiRng, RfaAmMiliPess.wRitRhMRRdMomdaoinmsaainressfohrowwnhibchy ril 20 structureshavebeensolvedaredenotedbyasterisks.AtopologydiagramoftheRRMfoldisshowninthebottomleft:betastrandsareshownby 19 red arrows; the purple shapes each denotes a single alpha helix in the typical RRM fold that, however, are replaced by more complex secondary structurearrangementsinsomevariantsincludingRAMPs.ThestructureofCas6,thetypicalRAMPsuperfamilyproteinwithtwoRRMdomains, isshowninthebottomright.ThecoloursofthecoreRRMelementsarethesameasinthetopologydiagram;inaddition,theglycine-richloop,the signature feature of the RAMP superfamily proteins, is shown in blue; amino acids involved in catalysis are rendered in yellow. theHDdomainisfusedtoadistincthelicasedomainthat stranded breaks. During this process, the HNH nuclease is essential for the interference stage (88,94). Type II domainofCas9cleavesthestrandofthetargetDNAthat systems use an unrelated mechanism that involves two is complementary to the crRNA, whereas the RuvC distinct nuclease domains, HNH and RuvC-like, both domain cleaves the second strand (95). contained within the Cas9 protein (95). This mechanism The Cas1 endonuclease, the CASCADE subunits and involves a unique two-RNA structure that consists of the the Cas3 helicase-nuclease are essential for the immune mature crRNA base-paired, which isbase-paired with the function of the respective CRISPR-Cas systems. In trans-encodedsmallRNAanddirectsCas9tothecognate addition, the CRISPR-Cas loci encompass many other DNA sequence where this protein introduces double- genes that encode proteins whose mechanistic role in 4368 NucleicAcidsResearch,2013,Vol.41,No.8 adaptive immunity remains unclear but that belong to Putative defense systems associated with prokaryotic protein families implicated in other defense systems. Argonaute homologs These CRISPR-associated gene products include the Anotherputativedefensesystemthatremainstobeexperi- ribonuclease Cas2, the RecB-like nuclease Cas4 and mentally characterized centres around prokaryotic homo- numerous representatives of the COG1517 superfamily loguesoftheslicernucleaseargonaute(pAgo),thecentral of helix-turn-helix and putative ligand-binding domain component of the eukaryotic RNAi system (100). In all, containingproteins(34,77).Mostoftheseproteins,inpar- 189 pAgo sequences have been identified in complete or ticular Cas2, contain domains that are predicted to be draft genomes that represent most of the major branches nucleases and toxins, suggesting a secondary role as of archaea and bacteria. For bacterial pAgos from associated immunity components [see details later in the Aquifex aeolicus and Thermus thermophiles, site-specific text and (55)]. Finally, the functions of several Cas DNA-guided endoribonuclease activity has been proteins remain completely obscure. demonstrated in vitro (101,102), but the natural target D Taken together, the results of comparative sequence and the source of the guide DNA molecule(s) remain to ow analysis, structural studies and experimental data suggest n be determined. The pAgos could be classified into two lo that despite the remarkable complexity and diversity, all a large monophyletic groups: the ‘long’ form that contains d CtioRnIaSlPpRri-nCcaipslseysstaenmds,ugsiveetnhethsaemcoenasrecrhviatetciotunraolfatnhdefpurninc-- ainaPcAtivZate(odligriobnounculecoletaidsee) bdionmdianign)s aanndd PthIeW‘Ish(oarctt’ivfeoromr ed fro m cipal building blocks, share a common ancestry that lacks the PAZ domain (100). Almost all pAgos that h (Figure 4A). It is notable, however, that some of the es- lack a PAZ domain appear to be inactivated, and the ttp s sential components of the CRISPR-Cas systems can be genes encoding for these proteins are associated with a ://a replaced either by homologous proteins, such as the sub- c variety of predicted deoxyribonucleases in putative a stitution of Cas5 for Cas6 in Type I-C CASCADE operons, including those from PD-(D/E)xK, Sir2 and dem complexes,orbynon-homologousbutfunctionallyanalo- phospholipaseDsuperfamilies.Furthermore,strongasso- ic gous proteins, such as the substitution of the HNH and ciation of the pAgo gene with defense islands has been .ou p RuvC-like domains of Cas9 for the HD nuclease. demonstrated (100). Thus, it can be the hypothesized .c o Undertherecentlyproposedparsimoniousevolutionary that the PAZ domain-containing pAgos directly destroy m scenario, only a few evolutionary events would suffice to virus or plasmid transcripts via their endoribonuclease /na esuxbptlayipnesth(e55e)m. eFrugertnhceermofoCreR, IcSoPmRp-aCriassonsysotfemthetypreecsenatnldy apcAtigvoitsyc,owuhlderbeaesstthruecatuprpaalrseunbtluyniintsacotfivpartoedteiPnAcZo-mlapclkeixnegs r/article solved structures of all major components of the thatcontainendonucleasestargetingDNA.Analternative -a b CsmAaSllCAsuDbuEnictosmmpilgexhtsuhgagveestesvtohlavtedthferoRmAMthPesaanncdesttrhael p(soesesilbaitleitryinistthheattepxAt)gothraetptraersgenettss ahodsitstninucctleAicBaIcisdysatenmd strac large subunit resembling the Cas10 protein that contains causes death or dormancy of the infected cell. t/41 two RRM domains and an alpha-helical domain Regardless of the specific mechanisms, it is likely that /8/4 resembling the small subunit (96,97). The Cas10 protein pAgos are key components of a novel defense system 36 (thelargesubunitofTypeIIICRISPR-Cassystems)could that uses guide DNA or RNA molecules to cleave target 0/2 have evolved from an ancestor RRM (Palm) domain- nucleic acids (100). 40 9 containing polymerase or cyclase and, combined with 5 5 the HD domain, might have originally functioned as a 0 b CRISPR-independent defense (innate immunity) system SYSTEMS INDUCING PERSISTENCE AND y g u (55). The Cas1–Cas2 module originally might have func- PROGRAMMED CELL DEATH es tioned independently as a TA system (see discussion later t o Toxins–antitoxins n inthetext). Joining this module withthe hypotheticalan- 0 8 cestral CASCADE-HD system might have led to the BothTypeIandTypeIITAsystemsoriginallyhavebeen A p etrmanersgfoenrmceatoiofnthoefaadnapintantaioten ismtamgeunaitnydmaceccohradniinsgmlyinthtoe cphlaasrmacidtesriaznedd aenss‘uardidnigcttihveeirmpoedrusilsetse’ntcheaitnaareheonsctoldineedagine ril 20 1 one for adaptive immunity. after a cell division (103,104). The toxin component of all 9 The ancestral Cas10-like protein and the entire ances- TAsystemsisaproteinthatkillscellsifexpressedabovea tral, subtype III-like CRISPR-Cas system most likely certain level, whereas the antitoxin component reversibly evolved in hyperthermophilic archaea and was subse- inactivates the toxin and/or regulates its expression, quently horizontally transferred to bacteria. Indeed, in thereby preventing cell killing. Unlike the toxin, the archaeal hyperthermophiles, this variant of the CRISPR- antitoxin is metabolically unstable so that, unless Cas system is (nearly) universal in these organisms, in a the antitoxin is continuously expressed, the free toxin sharp contrast to the presence of any form of CRISPR- can be accumulated in amounts sufficient to kill a cell Cas in <50% of archaeal and bacterial mesophiles (25,105–108).Oncethefirstgenomeshavebeensequenced, (18,77,98). In accord withthis scenario, a recent mathem- itbecameclearthatnumerousTAsystemsarepresentnot atical modelling study has shown that the benefits of onlyonplasmidsbutalsoonthechromosomesofbacteria adaptiveimmunityaresubstantiallygreaterunderthecon- and archaea (25,107). ditions of limited virus mutability that seems to be char- This surprising discovery stimulated a debate on the acteristic of hyperthermophilic habitats (99). functions of the chromosomal TA systems and NucleicAcidsResearch,2013,Vol.41,No.8 4369 promptedaseriesofcomparativegenomicandexperimen- associated with the interpretation of the guilt by associ- tal studies that resulted in the discovery of dozens of new ationpredictionsandthestandardvalidationexperiments TA systems. These findings and the current ideas on the indicate that additional experimental approaches are biological roles of TA systems are summarized in several required to determine whether some recently identified recentreviews(19,26,109–111).Briefly,itappearsthatthe systems are bona fide TA systems. TA systems provide a mechanism for cell persistence to Additionalexamplesofpoorlycharacterized(predicted) cope with various stress conditions (23,24,111). The TA systems are given in Supplementary Table S3. One of majority of Type II toxins target different components the most abundant of the predicted TA systems, that is of translation systems, especially mRNA (112,113), particularcommoninhyperthermophilicarchaea,consists whereas Type I toxins affect membrane integrity (114). of a HEPN domain-containing protein the minimal However, other targets of toxins have been identified as nucleotidyltransferase (MNT). Among the two compo- well, such as DNA gyrase (115) and the cell division nents of this TA system, the HEPN domain protein is GTPase FtsZ (116). Because Type I toxins have never likely the toxin (118) that is predicted to function as a Do been implicated in virus resistance and are not frequently RNAse probably targeting an RNA during translation wn observedindefenseislands,wedonotconsiderthemhere. (54,55), whereas the MNT is the antitoxin. Although the loa d Instead, we focus on Type II TA systems, particularly HEPN–MNTmodule sharesall thetypical characteristics e d poorly characterized variants (Supplementary Table S3), of TA systems (27), the molecular mechanism of this fro anddiscusstheresultsoftherecenteffortstoidentifynew system, and in particular the role of the nucleotidyl- m TA families using in silico approaches. transferase activity of the antitoxin, remains unclear. http TAThseystceommspucatantiboenacllaaspsipfireodacinhteos tfhorreeprgerdoiuctpios:n(io)f‘gnueiwlt TgrhoeupHs,EoPnNe opfrwotheiicnhsisinovtehre-sreeprseyssetnemtesd ibneltohnegrmtoophtiwleos s://ac a byassociation’whenanewtoxinorantitoxinispredicted and the other one in mesophiles (27). The HEPN and d e byvirtueoflinkage,inbacterialandarchaealgenomes,to MNT domains are often fused to each other, which is m ic genes that belong to known antitoxin or toxin families not typical of other TA pairs. Furthermore, the paRep1/ .o u (27,117);(ii)identificationofgenepairswithcharacteristic paRep8(Pyrobaculumaerophilumrepetitivefamily)family p .c features of TA systems such as tight linkage of genes of HEPN domains, which is represented almost exclu- om encoding small proteins, propensity for HGT and sively in thermophiles and is specifically expanded in /n a presence on plasmids or within genomic islands with crenarchaea, is not associated with MNT; therefore, it r/a other defense genes (5,27); and (iii) statistical analysis of remains to be determined whether these proteins are rtic wholegenomesequencingclonesaimedatidentificationof toxins of a distinct family of TA systems using a still le-a genes that are unclonable (toxic) in E. coli (118). unidentified antitoxin. b s The new predicted TA systems usually are validated Another two component system in which one of the tra c experimentally in E. coli by a kill/rescue assay in which proteins is a predicted nucleotidyltransferase is t/4 overexpressionofatoxinisexpectedtoinhibitcellgrowth DUF1814-COG5340. More than 700 occurrences of this 1/8 orkillthecell, whereasco-expressionofthetoxinandthe system were detected in 430 sequenced genomes of most /4 3 antitoxinrestoresgrowth(117).However,therecentcom- major lineages of archaea and bacteria including several 60 prehensive study revealed numerous genes that appear to Mycoplasma species with small genomes. Homology of /24 be unclonable in E. coli but do not meet the definition of the DUF1814 family with the ABI AbiG (123) and AbiE 09 5 TAsystems,includingmanymetabolicenzymesandinfor- families (124) has been demonstrated (5). In this case, 50 mational genes such as ribosomal proteins (118,119). however, the nucleotidyltransferase (DUF1814) appears by Although not all of these genes form two-gene operons to function as the toxin, whereas the COG5340 protein gu tthhaatt daoresagtyepiimcablaloafncTeAorstyosxteicmitsy, othfeasneifinntedrimngesdiaintediscuabte- t[(h5a)t, sceoenatalsionsSauppprleemdiecntetdaryHTTaHbledoSm4]a.iBnoitshtAheBIansytisttoexmins est on stratecanresultintoxicityofagenethatcanbemitigated 0 appear to act at the stage of phage DNA replication, but 8 byaproperregulationorco-expressionbyenzymeusinga A their molecular mechanisms remain unknown (22). p toxicproduct,mimickingtheTAbehaviour.Thus,predic- Yet another putative new toxin is COG2856, a ril 2 tion of new TA systems from experimental results metzincinsuperfamilyproteaseassociatedwithapotential 01 obtained with this approach requires caution and should 9 antitoxin,aHTH-domainproteinoftheXrefamily,often involve assessment of the known and predicted functions fused to the protease (125). These putative operons are andoperonicorganizationofthecandidategenes.Several abundant in bacterial and archaeal genomes, phages and experimentally validated TA systems (e.g. GinA and plasmids, with lineage-specific expansions in several GinC) do not form evolutionarily conserved two gene bacteria. Interestingly, in the bacterium Deinococcus operons, suggesting modes of actions distinct from the radiodurans, a COG2856 gene (irrE) is a major radiation typical toxin–antitoxin mechanism (120). For example, resistance determinant (126). GinA, a close homologue of the phage Mu host-nuclease Comprehensive comparative genomic analysis of the inhibitorproteinGam,whichinhibitsRecBCDbindingto distribution and co-occurrence of known and predicted dsDNAends(121),andits‘antitoxin’Sak,asingle-strand families of toxins and antitoxins leads to the following annealingprotein(122),areoftenlinkedtootherenzymes principal conclusions: involved in recombination and repair (120). Accordingly, itappearsmostlikelythatGinAandGinCareinvolvedin . The abundance of TA systems in the genomes scales repair-related functions as well. These complications superlinearly with the genome size (5,27).

Description:
archaea and bacteria. Kira S. Makarova, Yuri I. Wolf and Eugene V. Koonin* . creases faster than linearly (as a power of $1.3 of the . Distribution of known and predicted defense systems in archaeal and bacterial genomes. (A) The four network is due to the modularity of the TA systems whereby
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.