Published online January 10, 2006 66–77 Nucleic Acids Research, 2006, Vol. 34, No. 1 doi:10.1093/nar/gkj412 Dispersal and regulation of an adaptive mutagenesis cassette in the bacteria domain Ivan Erill, Susana Campoy1, Gerard Mazon1 and Jordi Barbe´1,2,* Biomedical Applications Group,CentroNacionalde Microelectro´nica, 08193 Bellaterra,Spain, 1Centrede Recerca en Sanitat Animal (CReSA), 08193Bellaterra, Spain and 2Departamentde Gene`tica iMicrobiologia,Universitat Auto`noma de Barcelona08193 Bellaterra, Spain D o ReceivedSeptember30,2005;RevisedandAcceptedDecember8,2005 w n lo a d ABSTRACT resistance generation and of bacterial co-evolution with the e d Recently, a multiple gene cassette with mutagenic host immune response. fro A major mechanism in adaptive mutation is translesion m translation synthesis activity was identified and h synthesis (TLS). Mediated by error-prone or lesion bypass ttp shown to be under LexA regulation in several pro- polymerases, TLS allows the cell to replicate past a variety s teobacteria species. In this work, we have traced of DNA lesions and distortions, thereby promoting survival ://a c a downinstancesofthismultiplegenecassetteacross under endogenous or environmental insults, but also drastic- d e the bacteria domain. Phylogenetic analyses show ally increasing the stationary-phase mutation rate (1). Mem- m ic thatthiscassettehasundergoneseveralreorganiza- bersoftherecentlydiscoveredYfamilyofDNApolymerases, .o u tions since its inception in the actinobacteria, and such as Escherichia coli polymerases IV (encoded by the p.c dinB gene) and V (encoded by the umuD and umuC genes), o that it has dispersed across the bacterial domain m throughacombinationofverticalinheritance,lateral have been shown to be error prone and poorly processive, /na gene transfer and duplication. In addition, our ana- lying at the core of TLS in most organisms (2). Indeed, Y r/a lyses show that LexA regulation of this multiple fhaamveilybepeonlylimnkereadsetso aareranpgreeseonftminutaagllenliicveacdtiovmitiaeisn:sfraonmd rticle gene cassette is persistent in all the phyla in which -a adaptive mutagenesis in bacteria (3) to genetic instability in b s ithasbeendetected,andsuggestthatthisregulation human cancer (4). tra is prompted by the combined activity of two of its Well before their true nature was established, both E.coli ct/3 constituent genes: a polymerase V homolog and dinB and umuDC products have already been linked to the 4/1 analphasubunitoftheDNApolymeraseIII. SOSresponse(5).FirstdescribedinE.coli,theSOSsystemis /6 6 a global response mechanism against DNA damage that in /2 4 E.coli regulates up to 40 genes involved in DNA repair and 0 1 5 cell survival (6,7). Induced by single-stranded fragments of 2 INTRODUCTION 6 DNA generated by either DNA damage-mediated replication b y Adaptivemutationinbacteriahasbecomeafieldofincreasing inhibition or enzymatic processing of broken DNA ends (8), g u interest in the last years. Also termed stationary-phase or the SOS response is governed by the LexA and RecA e s stress-induced mutation (1), adaptive mutation concerns the proteins, both of which are also members of this regulatory t o n mechanisms by which bacteria increase their mutation rate network. In normal conditions, LexA binds to operator sites 2 3 when placed under non-lethal selective conditions, thereby of regulated genes and effectively represses their expression. N o spontaneously generating mutations that relieve the selective Conversely, in the advent of DNA damage single-stranded v e pressure. Adaptive mutation is of special interest in micro- DNA (ssDNA) fragments bind and activate the RecA m b biology because it has a broad influence on the mechanisms protein, which acquires coprotease activity and promotes er 2 of evolution, and particularly on those taking place in closed the autocatalytic cleavage of LexA (9). Cleaved LexA is 0 1 andstressfulmediasuchasthehostsofbacterialpathogens.It unable to bind DNA, leading to the unrepressed expression 8 is therefore of importance to the study of pathogen evolution ofSOSgenes.OnceDNAdamagehasbeenaddressed,ssDNA insideahost,andmorespecificallytotheanalysisofantibiotic concentrationfallsandbothRecAandLexAregaintheirusual *TowhomcorrespondenceshouldbeaddressedatDepartamentdeGene`ticaiMicrobiologia,Ed.C,UniversitatAuto`nomadeBarcelona08193Bellaterra,Spain. Tel:+34935811837;Fax:+34935812387;Email:[email protected] Theauthorswishittobeknownthat,intheiropinion,thefirsttwoauthorsshouldberegardedasjointFirstAuthors (cid:2)TheAuthor2006.PublishedbyOxfordUniversityPress.Allrightsreserved. Theonlineversionofthisarticlehasbeenpublishedunderanopenaccessmodel.Usersareentitledtouse,reproduce,disseminate,ordisplaytheopenaccess versionofthisarticlefornon-commercialpurposesprovidedthat:theoriginalauthorshipisproperlyandfullyattributed;theJournalandOxfordUniversityPress areattributedastheoriginalplaceofpublicationwiththecorrectcitationdetailsgiven;ifanarticleissubsequentlyreproducedordisseminatednotinitsentiretybut onlyinpartorasaderivativeworkthismustbeclearlyindicated.Forcommercialre-use,[email protected] Nucleic Acids Research, 2006, Vol. 34, No. 1 67 conformation,repressingagainthesystem.TheLexAprotein C.crescentus and all the sequenced alpha proteobacteria, in is widespread among bacteria and several distinct LexA- which the presence of the imuA-imuB-dnaE2 cassette is ubi- binding sites have been reported in different phylogenetic quitous, lack an umuDC TLS system suggested that this groups. E.coli LexA, for instance, binds a 16 bp consensus gene cassette might be playing a TLS role in these species. sequence (CTGTN ACAG) (5) that is also found in most Thislineofreasoningwasstrengthenedwiththeidentification 8 gamma and beta proteobacteria (10). Similarly, Gram- of split cassette homologues in other species lacking an positivebacteriaLexAhasbeenshowntobindapalindromic umuDC system, such as the Planctomycetes Rhodopirellula motif with consensus GAACN GTTY (11), which is very baltica or the Actinomycetales M.tuberculosis and 4 similar to that observed in cyanobacteria and green non- Propionibacterium acnes, although the first gene of these sulfur bacteria (12,13). Additional LexA-binding sites have split cassette instances in the actinobacteria, which we will been defined in the alpha proteobacteria (GTTCN GTTC) here term imuA0, could not be shown to be an homolog of 7 (14,15), the Xanthomonadales (TTAN6TACTA) (16), its proteobacteria equivalent imuA. Do w dFeibltraobparcotteerosbuacccteinrioageMneysxo(TcoGcCcuNsCNxa4nGtThuGsCA(C)T(R17H)AaMndRtYhe- cibMleo,rLeerxeAce-nretgwuolartkedhaismaulAso-imidueBnt-idfineadEa2-DliNkeA-cdaassmeattgeeiinndthue- nloa d BYGTTCAGS) (18) and Bdellovibrio bacteriovorus delta proteobacterium B.bacteriovorus (19), and has shown e d (TTACN3GTAA) (19). that its imuA gene, although annotated as and sharing partial fro Recently, a multiple gene cassette encoding two error- homology with recA, does not trans-complement a recA m h prone polymerases was described in Pseudomonas putida defective E.coli strain. To further elucidate the nature and ttp aenradl hgoammomloag,sboeftathaisndsamalephcaaspserottteeowbearceteridiaenstipfieecdiesin(s2e0v)-. etevdolauntioenxtoefntshivisemdualttaibpalesegesneaerccahssoefttiem,hueAr,eiwmeuAha0,viemcuoBndauncd- s://ac a The original gene cassette in P.putida encoded a copy of the dnaE2 homologs across the bacteria domain, and we have d e LexA repressor (PP3116) recognizing a cyanobacteria-like made use of robust phylogenetic methods to infer the evolu- m ic LexA-binding sequence, a protein annotated in P.putida tionary history of this gene cassette, revealing its dispersal .o u as SulA (PP3117), a dinB homolog (PP3118) and an alpha through duplications, vertical inheritance and lateral gene p .c subunit of the DNA polymerase III (PP3119). This cassette transfer(LGT), anditscontinued regulationbyandinfluence o m was shown to be a DNA-damage inducible operon, self- on the SOS system. /n a regulated by its own encoded LexA protein, and homologs r/a of this full P.putida gene cassette were identified in the rtic genomesofPseudomonassyringae,Pseudomonasfluorescens MATERIALS AND METHODS le -a and all the sequenced Xanthomonadaceae (20). In addition, b Gene,proteinand genomic sequences s three-gene cassette derivatives lacking the lexA gene were tra also found in the close Pseudomonas aeruginosa, the beta ProteinandDNAsequencesforlexA,imuA0,imuA,imuBand ct/3 proteobacterium Ralstonia solanacearum and many alpha dnaE2 annotated homologs and their respective promoters 4/1 proteobacteria species such as Agrobacterium tumefaciens were downloaded from NCBI GenBank, TIGR Compre- /6 6 (20). Interestingly, all the reported cassette instances hensive Microbial Resource and JGI Integrated Microbial /2 4 harbored also an adapted LexA-binding site on the promoter Genomes database resources after identification of homologs 01 region of their first gene, and both DNA-damage inducib- by the on-site database textual or BLAST (22) search tools. 52 ility and LexA binding were experimentally confirmed Additional lexA, imuA0, imuA, imuB and dnaE2 DNA 6 b for cassette homologs in A.tumefaciens, P.aeruginosa, sequences from homologs in unfinished microbial genomes y g u Sinorhizobium meliloti and Xanthomonas campestris (20). were downloaded from either JGI Integrated Microbial e s Given the aforementioned divergence in LexA-binding Genomes or TIGR Unfinished Microbial Resource after t o n sequences among these bacterial groups, the persistence of their identification through these resources own BLAST ser- 2 3 LexA regulation in all theses species hinted at a positive vices, and DNA sequences were translated into their cor- N pressuretowardsLexAcontrolofthismultiplegenecassette. responding amino acid sequences with the EditSeq v5.0 ov e Later work on Caulobacter crescentus (21) confirmed that program (DNAStar). Complete and incomplete genome m iitnsduthcirbelee-geonpeercoans,setnteegahtoivmeolylogrewgualsataeldsobay DCN.cAr-edsacmenatgues aanssdemdbnlaieEs2fohrotmheoloorggsanwisemres hdaorbwonrlionagdeimduAal0,soimfurAo,mimtuhBe ber 20 1 LexA protein, and named its constituents as imuA, imuB aforementioned resources. 8 and dnaE2 (PP3117, PP3118 and PP3119 homologs, res- Identification ofLexA regulatory motifs pectively). Furthermore, experimental assays demonstrated that the imuA-imuB-dnaE2 cassette is responsible for most LexA regulatory motifs in the promoter of lexA, imuA0, DNA-damage induced mutations in C.crescentus, and is imuA, imuB and dnaE2 genes were searched using required for error-prone processing of DNA lesions in this RCGScanner v2.1 (10), a consensus-building software for species (21). In addition, phylogenetic analyses revealed the prediction of regulatory motifs, in those species with that the cassette dnaE2 gene was related to Mycobacterium known or inferred LexA-binding sequences. Taking tuberculosis dnaE2 gene, whose product had been shown advantageofthefactthatlexAisaself-regulatedgene,undes- previously to mediate SOS mutagenesis in this species. cribed LexA regulatory motifs were identified by first Further experiments confirmed that the imuB and dnaE2 looking for putative dyad motifs in the promoter region of gene products of C.crescentus cooperated in lesion bypass, lexA using custom MsWord macro routines to detect palin- with a mutational signature different from that observed dromic and direct repeat motifs with varying number of previously in the E.coli umuDC system. The fact that mismatches and spacer lengths. Candidate motifs were then 68 Nucleic Acids Research, 2006, Vol. 34, No. 1 sought in the promoter regions of recA and imuA0/imuA or actinobacteria, like the Streptomycineae, show evidence of dnaE2 genes with EditSeq’s Find function allowing for atwo-genecassetteconsistingofthednaE2andadinBhomo- ambiguity and, if found, the significance of these findings logwhichwehereascribetoimuB.Ontheotherhand,several was independently corroborated using the motif discovery actinobacteria present split cassette homologs in which the tool MEME (23) through its web-based interface at the San dnaE2 gene is isolated while two additional genes, the Diego Supercomputing Center. RecA homolog here labeled as imuA0 and imuB, make up a two-gene cassette. This isthe case of the previouslyreported Phylogenetic analyses M.tuberculosis, but also of all the sequenced Mycobacteri- aceae, Corynebacterineaeand of Nocardiafarcinica.Finally, Alignments of protein sequences were carried out using a instances of an imuA0-imuB-dnaE2 cassette can also be combined procedure to improve alignment quality. Protein found within the actinobacteria. P.acnes, Nocardioides sp. sequences were first aligned through CLUSTALW v1.83 (24) using Gonnet matrices and default [10], 25 and 5 gap- and Brevibacterium linens all possess a three-gene cassette, Do opening penalties for the multiple alignment stage, thus and this is very likely the layout of the imuA0-imuB-dnaE2 wn generating three different alignments. These three slightly cassettethatemergedfromtheactinobacteriaintosubsequent load phyla. e differentalignments,togetherwithalocalalignmentgenerated d by T-COFFEE Lalign method, were integrated as libraries Leaving the actinobacteria, an imuA0-imuB-dnaE2 like fro cassette can be detected in several bacterial phyla preced- m into T-COFFEE v1.37 (25) for optimization. T-COFFEE is h an alignment tool for the optimization of parametric consist- ing the proteobacteria, which is all the more remarkable ttp etoncryedinucmeutlhtiepliemapliagcntmoefnitnsitthiaaltderrarowrssoinndgifrfeeerdeyntparloiggnremsseinvtes dinuethteostehqeuesnccaendt rmepicrreosbeniatlatgioennommoesstdoaftatbhaessees.pAhynlaimshuaAr0e- s://ac a imuB-dnaE2 like cassette is found in the Chloroflexi d alignment methods. The optimized alignment was then visu- e Thermomicrobium roseum, the Planctomycetes R.baltica, m ally inspected with BioEdit v5.0.9 (26) and submitted to ic Gblocksv0.91b(27)withthehalf-gaps settingandotherwise the Verrucomicrobia Verrucomicrobium spinosum and the .ou two partially sequenced acidobacteria species (Acido- p default parameters to select conserved positions and discard .c bacterium capsulatum and Solibacter usitatus). In all these o poorly aligned and phylogenetically unreliable information. m Phylogenetic analyses of the T-COFFEE optimized instances, the imuA0-imuB-dnaE2 cassette presents the same /na alignments were carried out using both MrBayes v3.1.1 for configuration and apparently reflects a pattern of vertical r/a Bayesian (28) and PHYML v2.4.1 (29) for maximum- transmission from a common ancestor with the actino- rtic bacteria that ultimately leads to the delta proteobacteria, le likelihood (ML) inference of tree topologies. In both cases, -a where imuA-imuB-dnaE2 cassette instances can be identified b a mixed four-category gamma distributed rate plus propor- s tion of invariable sites model [invgamma] was applied and in B.bacteriovorus and Anaeromyxobacter dehalogenans. tra itsparameterswereestimatedindependentlybyeachprogram. Somewhere along this line, however, the imuA0 gene must ct/3 Four independent MrBayes Metropolis-Coupled Markov have undergone drastic changes, involving either extensive 4/1 mutationor,mostprobably,apartialorcompletesubstitution /6 ChainMonteCarlorunswerecarriedoutwithfourindepend- 6 ent chains for 106 generations, and 1000 bootstrap replicates by recombination that ultimately led to the imuA gene /24 were used for ML inference with PHYML. The resulting observed in the proteobacteria. A reliable phylogeny of the 01 phylogenetic trees were plotted with TreeView v1.6.6 (30) imuA0 and imuA genes cannot be reconstructed, since align- 526 and enhanced for presentation using CorelDraw Graphic ments generated with both sequences present almost no con- by Suite v12 (Corel Corporation). servedpositions,backinguptheprevioussuggestion(21)that gu imuA and imuA0 are not homologs, but could be functional e s analogs. Moreover, separate phylogenies of imuA and imuA0 t o n RESULTS AND DISCUSSION become only consistent when restricting the phylogeny to 2 3 well-defined groups such as the actinobacteria or the proteo- N Identificationof additional instances of o bacteria (data not shown), making it impossible to precisely v the imuA-imuB-dnaE2 cassette define the point at which the aforementioned recombination em b Using BLAST and PSI-BLAST searches with the products event took place. er 2 of each of the cassette genes as queries, homologs of this The imuA-imuB-dnaE2 configuration found in the delta 0 1 multiple gene cassette were located in most bacterial phyla, proteobacteria is also maintained in the alpha proteobacteria, 8 ranging from the actinobacteria to the proteobacteria. Cast where cassette instances can be found in all completely in the light of the accepted branching order of bacteria, as sequenced genomes and even in some plasmids (e.g. establishedby16SRNA(31),RecA(32)andproteinsignature A.tumefaciens). Similar three-gene cassettes are also found phylogenies (33), these findings reveal either a progressive in several gamma and beta proteobacteria species (e.g. accretion or fragmentation of the imuA-imuB-dnaE2 cassette Shewanella oneidensis, Vibrio parahaemolyticus), although in the actinobacteria, punctuated by several genetic reorg- cassetteprevalenceisnotsoregularinthesebacterialclasses. anizations that gave rise to a range of different cassette Neither imuA-imuB-dnaE2 cassettes, nor any of their con- configurations. stituent genes, can be found, for instance, in the Entero- As it can be seen in Figure 1, the simplest instance of an bacteriaceae and the Pasteurellaceae, although both are imuA-imuB-dnaE2 cassette constituent can be traced back richly sampled families in terms of available genome to some actinobacteria, such as Kineococcus radiotolerans, sequences. In addition, both gamma and beta proteobacteria Symbiobacterium thermophilum or Actinomyces naeslundii, presentseveralinstancesoflexA-imuA-imuB-dnaE2cassettes. in which only the dnaE2 gene can be found. Close Theseincludethepreviouslydescribedfour-genecassettesof Nucleic Acids Research, 2006, Vol. 34, No. 1 69 D o w n lo a d e d fro m h ttp s ://a c a d e m ic .o u p .c o m /n a r/a rtic le -a b s tra c t/3 4 /1 /6 6 /2 4 0 1 5 2 6 b y g u e s t o n 2 3 N o v e m b e r 2 0 1 8 70 Nucleic Acids Research, 2006, Vol. 34, No. 1 P.putida, P.fluorescens, P.syringae and the Xanthomon- imuB and dnaE2 genes of C.crescentus has already been adaceae (20), together with additional instances in Acidith- singled out as the source of most TLS mutagenic activity in iobacillus ferrooxidans, Methylococcus capsulatus and the thisspecies (21),and that unregulated TLS activity mediated beta proteobacteria Dechloromonas aromatica, Thiobacillus by dinB has been shown previously to yield mutator pheno- denitrificans and Azoarcus sp. types in E.coli (35), resulting in lowered adaptative fitness. As expected, LexA regulation is maintained in most actinobacteria presenting full imuA0-imuB-dnaE2 cassettes. LexAregulation ofimuA-imuB-dnaE2cassettes The sole exception concerns Rubrobacter xylanophilus, Asmentionedabove,astrikingcharacteristicofthepreviously but this bacterium does not present a Gram-positive LexA- identifiedimuA-imuB-dnaE2andlexA-imuA-imuB-dnaE2cas- binding motif upstream of its lexA gene and its lexA gene settes in the alpha, gamma and beta proteobacteria was their sequence shows substantial divergence from that of other persistent regulation by their host (or their own encoded) actinobacteria,suggestingthatitmayhaveevolvedseparately Do w LexA protein in spite of the drastic differences between the to recognize a new LexA-binding motif. Outside the n regulatorymotifsoftheseLexAproteins(10,15,20).Together actinobacteria, a Gram-positive LexA-binding motif can be loa d with the existence of lexA-imuA-imuB-dnaE2 cassettes, in identified upstream of both T.roseum lexA and imuA0 genes, e d which a lexA gene seems to be specifically set to control suggestingthat,inadditiontotheDehalococcoidetes(13),this fro the adjacent imuA-imuB-dnaE2 genes, and the complete is also the LexA-binding motif of the Thermomicrobiales. m h absenceofcassetteinstancesinthosephylaandclasseslacking Interestingly, a derivative of the Gram-positive LexA- ttp aLelexxAArgegenuelat(ieo.gn.oEfpismiluoAn-iPmroutBe-odbnaacEte2r-ilai)k,ethciasssseutgtegsesotuegdhtthtaot b(SinTdGinYgWmCoAtiWf wNiYthGrAemAiCnAisNce)nccaens aolfsothbeeofnoeunsdeeunpsitnreEam.coolfi s://ac a bemarkedlybeneficialfortheirbacterialhosts.Havingestab- V.spinosum lexA, recA and imuA0 genes, suggesting that this d e lished that the presence of imuA-imuB-dnaE2 cassettes was is the LexA-binding sequence in this species and that the m ic notlimitedtotheproteobacteria,andgiventhatanewcassette imuA0-imuB-dnaE2 cassette is regulated by V.spinosum .o u instance in B.bacteriovorus (19) had also been shown to be LexA. A similar situation is found also in the acidobacteria, p .c regulatedbyLexAwithyetanothermarkedlydivergentLexA- where NWTCN HTTC direct repeat motifs (close to the one o 7 m binding motif (TTACN3GTAA), we set about to examine associated with the alpha proteobacteria) (15) can be located /na LexA regulation in all the here-identified cassette instances. upstreamofthelexAandimuA0 genesinbothspecies.Thisis r/a Interestingly, even though no significant occurrences of all the more remarkable given that A.capsulatum presents rtic the Gram-positive LexA-binding motif (11) can be located two instances of the imuA0-imuB-dnaE2, and the afore- le -a upstream of dnaE2 genes in those species (K.radiotolerans, mentioned LexA-binding motif is present upstream of both b s S.thermophilum and A.naeslundii) presenting only this con- imuA0 genes. tra stituent of the imuA0-imuB-dnaE2 cassette, high-scoring and LexA regulation is also preserved in the delta proteo- ct/3 well-placedGram-positiveLexA-bindingsitescanalreadybe bacteria. Besides the imuA-imuB-dnaE2 cassette already 4/1 foundprecedingthednaE2geneinthednaE2-imuBcassettes shown to be LexA regulated and DNA damage inducible in /6 6 of the Streptomycineae. As shown in Figure 1, a similar B.bacteriovorus (19), derivatives of the M.xanthus LexA- /2 4 pattern of LexA regulation is conserved also in those actino- binding sequence (18) can also be found upstream of 01 bacteria presenting split cassette instances, with an imuA0- A.dehalogenans lexA, recA and imuA genes. Interestingly, 52 6 imuB tandem and an isolated dnaE2 gene. In these cases, of in M.xanthus, where only the dnaE2 gene can be found, b y whichM.tuberculosisisawell-knownrepresentative,apparent there is no evidence of a LexA-binding sequence upstream g u LexA regulation does not only concern the isolated dnaE2 of it, suggesting again that it is the combined presence of e s gene [which has been experimentally confirmed, (34)], but imuB and dnaE2 in a bacterial genome that prompts LexA t o n extends also to the imuA0-imuB tandem. Even though in regulation. As mentioned earlier, the presence of imuA- 2 3 some cases, like N.farcinica, evidence of LexA regulation imuB-dnaE2 cassettes among the alpha proteobacteria is N can be found only in the imuA0-imuB tandem, the systematic pervasive, with some species harboring up to three copies ov e regulation of at least one of the cassette components in all of this cassette (e.g. A.tumefaciens) and frequent evidence m b these species suggests that it is their concerted activity that of plasmid dissemination. Consistent with this widespread er 2 creates a positive pressure towards LexA regulation. In this distribution, LexA regulation and DNA-damage induction 0 1 context, it should be stressed that a combined activity of the of the imuA-imuB-dnaE2 cassettes have been reported in 8 Figure 1. Schematic representation of representative cassette configurations and similar instances found in complete and incomplete genome sequences. Allpositionsarerelativetogenome/contigstartposition.n.a.and(BLAST)indicatenon-annotatedcassetteinstancesofincompletegenomeslocatedthrough BLAST.EXPdenotesexperimentallyverifiedLexA-bindingmotifs.Bacterialnameabbreviationsforsimilarinstancesareasfollows:Aca—A.capsulatum,Acn— A.naeslundii, Afe—A.ferrooxidans, Atu—A.tumefaciens, Azo—Azoarcus sp., Bfu—Burkholderia fungorum Bja—Bradyrhizobium japonicum, Bma— Burkholderia mallei, Bme—Brucella melitensis, Bpa—Bordetella parapertussis, Bth—Burkholderia thailandensis, Bvi—Burkholderia vietnamiensis, Ccr—C.crescentus, Cdi—Corynebacterium diphtheriae, Cef—Corynebacterium efficiens, Hne—Hyphomonas neptunium, Ilo—I.loihiensis, Jan—Jannaschia sp., Mav—Mycobacterium avium, Mbo—Mycobacterium bovis, Mde—M.degradans, Mlo—Mesorhizobium loti, Msm—Mycobacterium smegmatis, Mxa—M.xanthus, Nar—Novosphingobium aromaticivorans, Nfa—N.farcinica, Nha—Nitrobacter hamburgensis, Noc—Nocardioides sp., Nwi—Nitrobacter winogradskyi,Pac—P.acnes,Pae—P.aeruginosa,Pde—Paracoccusdenitrificans,Pfl—P.fluorescens,Pol—Polaromonassp.,Ppu—P.putida,Psy—P.syringae, Reu—Ralstoniaeutropha,Rfe—Rhodoferaxferrireducens,Rge—Rubrivivaxgelatinosus,Rpa—Rhodopseudomonaspalustris,Rsp—Rhodobactersphaeroides Rso—R.solanacearum,Rxy—R.xylanophilus,Sal—Sphingopyxisalaskensis,Sam—Shewanellaamazonensis,Sav—Streptomycesavermitilis,She—Shewanella sp.,Sil—Silicibactersp.,Sme—S.meliloti,Spo—Silicibacterpomeroyi,Sth—S.thermophilum,Tde—T.denitrificans,Vvu—Vibriovulnificus,Wme—Wautersia metallidurans,Xax—XanthomonasaxonopodisandXor—Xanthomonasoryzae.AdditionalinformationforeachcassetteinstanceisprovidedinSupplementary TableS1,whichisincludedasSupplementaryData. Nucleic Acids Research, 2006, Vol. 34, No. 1 71 A.tumefaciens, C.crescentus and S.meliloti (20,21), and that have already been suggested to be part of to-date the corresponding LexA-binding motifs have been found undefined groups of actinobacteria (36–37). Both these upstream of imuA in other alpha proteobacteria species (15). species are positioned at the exit of the actinobacteria clade, Concerning the gamma and beta proteobacteria, LexA intermingling with the intermediate phyla (Planctomycetes, regulation has been reported (10) or is apparent in all the greennon-sulfurbacteria,acidobacteriaandVerrucomicrobia) identified instances of the imuA-imuB-dnaE2 cassette. It is that then lead to the proteobacteria, in which the natural interesting to note, however, that the previously reported branching order (delta, alpha, beta and gamma) can be (20) LexA-binding sequence of lexA-imuA-imuB-dnaE2 cas- observed. In agreement with previous phylogenies (38,39), settes(GTACN GTGC)isnotpresentinallthehere-identified the Xanthomonadaceae branch earlier than gamma and beta 4 four-gene cassette instances. A.ferrooxidans, T.denitrificans, proteobacteria,suggestingthattheyconstituteaseparategroup D.aromatica and Azoarcus all present an E.coli-like LexA- of early gamma/beta proteobacteria. bindingsequenceupstreamoftheircassettelexAgenewhich, AsimilarstoryistoldbytheDnaE2sequencetreeuptothe Do w in contrast to the Pseudomonadaceae and the Xanthomon- branchingpointofbetaandgammaproteobacteria(Figure3). n adaceae, is the sole lexA gene in their genome. A main cluster encompasses most of the actinobacteria, dis- loa d playing a branching order in which an imuA0-imuB-dnaE2 e d Reorganization and lateral gene transfer of cassette configuration leads to split cassette instances and fro thus suggesting that the later originated following a genetic m the imuA-imuB-dnaE2 cassette h reorganization of the former. In this context, the persistent ttp EcavsesnetttheouisghinthaepopvreorxailmldaitsetraibguretieomneonftthweiitmhutAhe-imeustBa-bdlinsahEed2 rgeegsutslataigoaninofaonpeoosirtibvoethprmesesmubreerstoowfatrhdesspLleitxAcasrseegttuelsatsiuogn-. s://ac a phylogeny of bacteria, its broad dispersal is punctuated by A separate group with long-branch lengths takes in those d e noticeable absences (e.g. cyanobacteria, epsilon proteo- species purporting only a dnaE2 gene (S.thermophilum, m ic bacteria) and seems to be intimately linked to the LexA net- R.xylanophilus, K.radiotolerans and T.roseum). The Strepto- .o u work,asevidencedbyitspersistentregulation,thepresenceof mycetaceae, which display a unique dnaE2-imuB cassette p .c four-gene cassettes including a lexA gene and the absence of configuration, also fall in this last group. Even though LGT o m cassettes in all classes and phyla lacking a lexA gene. More- events could be invoked to explain this clustering, its /n a over, the identification of several imuA-imuB-dnaE2 cassette origin seems to be a long-branch attraction artifact due to r/a duplication events (e.g. A.capsulatum, A.tumefaciens) and the substantial divergence in these sequences, a fact that rtic its recurring existence in plasmids (e.g. R.solanacearum, would be in agreement with a period of extensive genetic le -a S.meliloti) had suggested previously that the evolution of reorganization that led from three-gene imuA0-imuB-dnaE2 b s this cassette might have also been shaped by other factors, cassettes to the isolated dnaE2 and tandem dnaE2-imuB tra c such as LGT. Therefore, to ascertain whether there existed configurations seen in these species. t/3 relevant discrepancies in the particular phylogeny of the Incontrast,theconsistentplacementofthePlanctomycetes 4/1 imuA-imuB-dnaE2 cassette, a rigorous phylogenetic analysis R.baltica deep within the proteobacteria cluster is in clear /6 6 of this multiple gene cassette was conducted. Owing to the disagreementwithboththeestablishedandtheDnaE1phylo- /2 4 aforementioned lack of homology between imuA and imuA0, genies. This, together with the fact that R.baltica possesses 01 5 these sequences were discarded for the analysis. Similarly, a split cassette with imuA and imuB genes (RB11894 and 2 6 optimized alignments of imuB sequences following our RB11891) that present high identity with P.putida imuA b y astringent protocols yielded too few conserved positions andimuB(insteadofactinobacteriaimuA0andimuB),strongly g u (below 30) for a sound phylogenetic analysis over the suggeststhatR.balticaacquireditsimuA-imuB-dnaE2cassette e s whole bacteria domain. Therefore, and given that imuB through LGT from an ancestor of gamma and beta proteo- t o n genes were also absent in some of the identified cassette bacteria and later reorganized it into a split cassette with an 2 3 instances (e.g. K.radiotolerans), the phylogenetic analysis imuA-imuB tandem and an isolated dnaE2 gene. In this N o was carried out using the dnaE2 gene product sequence, respect, it should be noted that the clustering together of all v e which was both the most phylogenetically informative of imuA-imuB-dnaE2 cassette instances that are located in plas- m b the cassette gene sequences (623 conserved positions) and mids in the alpha proteobacteria strongly suggest that there er 2 alsotheonlyoneprovidingfullcoverageamongtheidentified has been extensive plasmid dissemination of the imuA- 0 1 cassette instances. Furthermore, to validate and reinforce imuB-dnaE2 cassette in this bacterial class. In the light of 8 the phylogenetic approach, the analysis was also conducted this, and given the close placement of R.baltica in the on the dnaE1 gene product sequence in all the species DnaE2 tree (between alpha and gamma/beta proteobacteria), studied. The dnaE1 gene, corresponding to the normal (non- plasmid dissemination could certainly be the mechanism cassette) alpha subunit of the DNA polymerase III, issimilar for the observed LGT event. in length to the dnaE2 gene and ought to be subjected to similar structural constraints, thus providing a reliable Duplication of the lexA-imuA-imuB-dnaE2 cassette contrast to the evolution of the dnaE2 gene and facilitating the identification of phylogenetic artifacts, such as those due Another major source of disagreement between the DnaE2 to fast-clocked evolution. and DnaE1 trees concerns the branching point of gamma AsitcanbeseeninFigure2,theDnaE1sequenceproduces and beta proteobacteria. While the DnaE1 tree supports a tree that is in broad agreement with the established a view that is in close agreement with the established phylogenies (31–33). A natural cluster groups all the actin- phylogenies, the DnaE2tree generatesthreeseparate clusters obacteria except those (R.xylanophilus and S.thermophilum) with low support values. Two of these clusters correspond, 72 Nucleic Acids Research, 2006, Vol. 34, No. 1 D o w n lo a d e d fro m h ttp s ://a c a d e m ic .o u p .c o m /n a r/a rtic le -a b s tra c t/3 4 /1 /6 6 /2 4 0 1 5 2 6 b y g u e s t o n 2 3 N o v e m b e r 2 0 1 8 Figure2.UnrootedDnaE1sequencetree.Shadedareascorrespondtoestablishedphylogeneticgroups.Branchsupportvaluesbelow0.85or50%forBayesian orMLinferencephylogeneticanalyses,respectively,arenotshown.Cassetteconfigurationsareshownusingconstituentgenesabbreviations:L—lexA,A0—imuA0, A—imuA, B—imuB, E—dnaE2. Name abbreviations not used in Figure 1 are as follows: Bba—B.bacteriovorus, Bbr—Bordetella bronchiseptica, Dar—D.aromatica,Kra—K.radiotolerans,Mca—M.capsulatus,Sco—Streptomycescoelicolor,Sus—S.usitatussp.,Tro—T.roseum,Vpa—V.parahaemolyticus andVsp—V.spinosum.AdditionalinformationforeachdnaE1locusisprovidedinSupplementaryTableS1,whichisincludedasSupplementaryData. Nucleic Acids Research, 2006, Vol. 34, No. 1 73 D o w n lo a d e d fro m h ttp s ://a c a d e m ic .o u p .c o m /n a r/a rtic le -a b s tra c t/3 4 /1 /6 6 /2 4 0 1 5 2 6 b y g u e s t o n 2 3 N o v e m b e r 2 0 1 8 74 Nucleic Acids Research, 2006, Vol. 34, No. 1 roughly,tothe beta andgamma proteobacteria classes, while cassette instances ofthe XanthomonadaceaeandthePseudo- the third groups together nearly all instances of the lexA- monadaceae (GTACN GTGC). Further evolution in the 4 imuA-imuB-dnaE2 cassette instances, intermingling beta Xanthomonadaceae would have then led to their own and gamma proteobacteria species. Owing to the limited lexA1 binding sequence [TTAN TACTA, (16)], following 6 resolution of the DnaE2 tree at this level, it is difficult to a deletion of the rest of the lexA1-imuA1-imuB1-dnaE21 cas- infer an accurate account on the evolutionary history of sette. Based on the DnaE1 tree, it seems clear that a major the cassette at this point. LGT of a primordial four-gene deletion of the lexA2-imuA2-imuB2-dnaE22 cassette took cassette could certainly be invoked to explain the observed place after the split between the Pseudomonadaceae and data, but repeated LGT events and ad hoc deletions would the rest of gamma proteobacteria, since there is no evidence be still required in the most plausible LGT scenario, which of either the lexA2 gene or the rest of its accompanying cannot be supported solely on the available phylogenetic cassette in the Vibrionaceae, the Pasteurellaceae, the data. In contrast, several facts point to an ancestral duplica- AlteromonadaceaeortheEnterobacteriaceae.Asimilardele- Do w tion of the lexA-imuA-imuB-dnaE2 cassette, which may have tion must have taken place early in the evolutionary history n then led through vertical inheritance and several deletions ofthebetaproteobacteria,sincethereisagainnoevidenceof loa to the observed distribution of three- and four-gene cassettes any component of the lexA2-imuA2-imuB2-dnaE22 cassette in de d among gamma proteobacteria. The consistent clustering of this bacterial subclass. fro Xanthomonadaceae and Pseudomonadaceae, for instance, is m h reminiscent of a vertical inheritance pattern with missing Reorganization ofduplicated lexA-imuA-imuB-dnaE2 ttp ifnactetrmtheadtiiatteisgoanmlymianpthroetseeobtwacotefraima iglireosupthsa.tMthoereloexvAer,getnhee cassettes inthe gamma and beta proteobacteria s://ac a of the lexA-imuA-imuB-dnaE2 cassette presents a markedly The duplication scenario outlined in Figure 4 raises some d e divergentLexA-bindingmotif(GTACN GTGC)notfoundin interesting points. On the one hand, a repeated feature in m 4 ic the other four-gene cassettes strongly suggests a common both gamma and beta proteobacteria concerns the genetic .o origin. Interestingly, lexA duplication has already been sug- reorganization of the lexA1-imuA1-imuB1-dnaE21 cassette, up gested in the Xanthomonadaceae (39), thus giving further involving the excision of the lexA1 gene from the cassette. .co m support to the duplication hypothesis. In fact, a phylogeny Thisseemstohaveoccurredearlyintheevolutionaryhistory /n a of the LexA protein (data not shown) also clusters together of the beta proteobacteria, but it must have also taken r/a both Xanthomonadaceae LexA proteins and one of the place independently in the gamma proteobacteria and the rtic Pseudomonadaceae LexA proteins in a separate branch, Xanthomonadaceae. Such a recurring pattern of convergent le -a while the remaining Pseudomonadaceae LexA protein clus- evolution points to some form of positive pressure towards b s ters naturally with Microbulbifer degradans in the gamma thesplitofthefour-genecassette,butitisdifficulttospeculate tra c proteobacteria cluster, thus suggesting again an ancestral on its nature. An appealing possibility concerns the need for t/3 duplication of lexA, and its attached cassette, at the root of more efficient regulation of the constituents of the imuA1- 4/1 the gamma and beta proteobacteria. imuB1-dnaE21 cassette. Even though operon organization is /6 6 Taken together, thus, the results from the DnaE1, DnaE2 a straight and effective means of regulation, the inclusion in /2 4 and LexA sequence trees support a scenario involving the the operon of its own governing gene (lexA1) reduces the 01 5 duplication of the full lexA-imuA-imuB-dnaE2 cassette. The ability to fine-tune the expression of downstream genes. It 2 6 proposed scenario, outlined in Figure 4, places the formation can be thus speculated that the observed convergent split of b y oftheoriginallexA-imuA-imuB-dnaE2cassetteinanancestor the lexA1-imuA1-imuB1-dnaE21 may be due to a requirement g u of the gamma and beta proteobacteria, following a genomic toimproverepressionofthemutagenicimuA1-imuB1-dnaE21 e s reorganization that brought together the extant imuA-imuB- cassetteinnormalconditions,ortoboostitsexpressionunder t o n dnaE2 cassette and lexA gene already seen in earlier proteo- adverse circumstances, without influencing negatively on the 2 3 bacteria. Drawing from the available data in the DnaE1 tree, behavior of the whole LexA regulon. In this context, it is N o whichplacesmostlexA-imuA-imuB-dnaE2cassetteharboring interesting to note a remarkable coincidence. The split of v e species at the root of gamma and beta proteobacteria, this the lexA1-imuA1-imuB1-dnaE21 in the gamma proteobacteria m b original lexA gene had already evolved or was on the way is concurrent with the first instance of a sulA gene in this er 2 toevolverecognitionofanE.coli-likemotif.Shortlyafter,or bacterial class (Figure 4). The sulA gene (40), encoding a 0 1 during the own reorganization event, this lexA-imuA-imuB- cell-division inhibitor and under mandatory regulation by 8 dnaE2cassetteunderwentaduplication,resultingintwocop- LexA in E.coli and Salmonella typhimurium, is absent from iesofthisfour-genecassettethatforthesakeofclaritywewill all other bacterial classes and phyla. Even though sulA is not here label as lexA1-imuA1-imuB1-dnaE21 and lexA2-imuA2- directly associated with lexA in these two species, its first imuB2-dnaE22.Followingduplication,thetwocassettecopies detectable homologs in Idiomarina loihiensis, the Shewanel- started to diverge, leading in their lexA gene to the full laceae or the Pseudomonadaceae are in a lexA-sulA tandem emergence ofdistinctLexA-binding motifs:the conventional operon configuration (20) and its transition into the E.coli E.coli one (CTGTN ACAG) in lexA1-imuA1-imuB1-dnaE21 configuration can still be traced through the analysis of its 8 cassettesandthatobservedinthelexA2-imuA2-imuB2-dnaE22 genomic surroundings in the Vibrionaceae (data not shown). Figure3.UnrootedDnaE2sequencetree.Shadedareascorrespondtoestablishedphylogeneticgroups.Branchsupportvaluesbelow0.85or50%forBayesianor MLinferencephylogeneticanalyses,respectively,arenotshown.Cassetteconfigurationsareshownusingconstituentgenesabbreviations:L—lexA,A0—imuA0, A—imuA,B—imuB,E—dnaE2.Pindicatesthatthecassetteislocatedinaplasmid.NameabbreviationsareasinFigure2.Additionalinformationforeach dnaE2locusisprovidedinSupplementaryTableS1,whichisincludedasSupplementaryData. Nucleic Acids Research, 2006, Vol. 34, No. 1 75 D o w n lo a d e d fro m h ttp s ://a c a d e m ic .o u p .c o m /n a r/a rtic le -a b s tra c t/3 4 /1 /6 6 /2 4 0 1 5 2 6 b y g u e s t o n 2 3 N o v e m b e r 2 0 1 8 Figure 4. Reconstruction of the proposed duplication hypothesis on a DnaE1 sequence tree of the gamma and beta proteobacteria classes, rooted using A.tumefaciensDnaE1(Atu)asanout-group.Branchsupportvaluesbelow0.85or50%forBayesianorMLinferencephylogeneticanalyses,respectively,are notshown.Cassetteconfigurationsandinstancesareshownontheright.Deletionandreorganizationeventsareindicatedbydifferentsymbols(seebottomleftof Figure4forsymbolexplanation).NameabbreviationsthatarenotusedinFigure2areasfollows:Eca—Erwiniacarotovora,Hdu—Haemophilusducreyi,Hin— Haemophilusinfluenzae,Msu—Mannheimiasucciniciproducens,Plu—Photorhabdusluminescenslaumondii,Pmu—Pasteurellamultocida,Stm—Salmonella entericaserovarTyphimurium,Vvu—Vibriovulnificus,Xfa—XylellafastidiosaandYpe—Yersiniapestis.AdditionalinformationforeachdnaE1locusandcassette configurationsisprovidedinSupplementaryTableS1,whichisincludedasSupplementaryData. It is thus reasonable to conjecture that the entrance of sulA sulA which has been found significant enough to incorrectly in the gamma proteobacteria may have played a role in the annotate this gene as sulA in some bacterial genomes (e.g. split of the lexA1-imuA1-imuB1-dnaE21 cassette, especially P.putida).Thislineofreasoningisfurthersupportedbywhat sincetheimuA1genesharesadegreeofhomologywithE.coli canbeobservedintheXanthomonadaceae,wherethesplitor
Description: