Downloaded from genome.cshlp.org on March 6, 2023 - Published by Cold Spring Harbor Laboratory Press

Article

The Bacterial Replicative Helicase DnaB Evolved from a RecA Duplication

Detlef D. Leipe,1 L. Aravind,2,3 Nick V. Grishin,1,4 and Eugene V. Koonin1,5

1National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894 USA; 2Department of Biology, Texas A&M University, College Station, Texas 70843 USA

The RecA/Rad51/DMC1 family of ATP-dependent recombinases plays a crucial role in genetic recombination and double-stranded DNA break repair in Archaea, Bacteria, and Eukaryota. DnaB is the replication fork helicase in all Bacteria. We show here that DnaB shares significant sequence similarity with RecA and Rad51/DMC1 and two other related families of ATPases, Sms and KaiC. The conserved region spans the entire ATP- and DNA-binding domain that consists of about 250 amino acid residues and includes 7 distinct motifs. Comparison with the three-dimensional structure of Escherichia coli RecA and phage T7 DnaB (gp4) reveals that the area of sequence conservation includes the central parallel β-sheet and most of the connecting helices and loops as well as a smaller domain that consists of an amino-terminal helix and a carboxy-terminal β-meander. Additionally, we show that animals, plants, and the malarial Plasmodium but not Saccharomyces cerevisiae encode a previously undetected DnaB homolog that might function in the mitochondria. The DnaB homolog from Arabidopsis also contains a DnaG–primase domain and the DnaB homolog from the nematode seems to contain an inactivated version of the primase. This domain organization is reminiscent of bacteriophage primases–helicases and suggests that DnaB might have been horizontally introduced into the nuclear eukaryotic genome via a phage vector.
We hypothesize that DnaB originated from a duplication of a RecA-like ancestor after the divergence of the bacteria from Archaea and eukaryotes, which indicates that the replication fork helicases in Bacteria and Archaea/Eukaryota have evolved independently.

3Present address: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894 USA.
4Present address: Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas 75235 USA.
5Corresponding author. [email protected]; FAX (301) 435-7794.
Genome Research 10:5–16 ©2000 by Cold Spring Harbor Laboratory Press ISSN 1054-9803/99 $5.00; www.genome.org

Genetic recombination is an essential process for both recombinational repair and sexual reproduction. In Bacteria, the central role in recombination is played by the RecA recombinase enzyme (Radding 1989; Kowalczykowski and Eggleston 1994; Seitz et al. 1998). RecA is a DNA-dependent ATPase that promotes homologous pairing and strand exchange between different double-stranded (ds) DNA molecules and is therefore necessary for homologous recombination and DNA repair (Kowalczykowski et al. 1994). The biochemical activities of RecA include the ability to form regular helical filaments, bind single-stranded (ss) and dsDNA, and bind and hydrolyze nucleoside triphosphates (Kowalczykowski et al. 1994). In addition to its direct role in recombination, RecA functions as a cofactor in the cleavage reaction for LexA, the repressor of the SOS regulon (Little and Mount 1982; Witkin 1991). There are two types of RecA-like proteins in many eukaryotes, namely Rad51 and DMC1/Lim15. Rad51 is expressed in both meiotic and mitotic cells and mainly participates in recombinational repair of double-strand breaks (Shinohara et al. 1992; Doutriaux et al. 1998). DMC1 is expressed in meiotic cells, its null mutants show a meiotic arrest phenotype, and it probably functions in the formation of synaptonemal complexes and also in double-strand break repair (Bishop et al. 1992; Dresser et al. 1997; Yoshida et al. 1998). Thus, there is functional overlap between Rad51 and DMC1 (Shinohara et al. 1997) and Caenorhabditis elegans seems to have only a single Rad51/DMC1 homolog (Takanami et al. 1998). A Rad51/DMC1 homolog (termed RadA) that catalyzes DNA pairing and strand exchange (Seitz et al. 1998) is also found in the Archaea (Sandler et al. 1996).

The RecA/RadA/DMC1 recombinases are closely related to three other groups of ATPases, namely bacterial Sms (also called RadA), bacterial DnaB, and archaeal and bacterial KaiC. The Sms protein is a poorly characterized bacterial homolog of RecA in which the RecA ATPase domain is fused to a Zn ribbon and a predicted serine protease domain (Koonin et al. 1996; Aravind et al. 1999) (hereafter we use the designation Sms to avoid confusion with the archaeal RadA). Escherichia coli sms mutants show increased sensitivity to X rays, UV radiation, and methyl methanesulfonate, suggesting a role in repair for the Sms protein (Neuwald et al. 1992; Song and Sargentini 1996).

The cyanobacterial KaiABC gene cluster constitutes the circadian clock in the cyanobacterium Synechococcus (Ishiura et al. 1998; Iwasaki et al. 1999). The KaiC protein generates a circadian oscillation by negative feedback control on its own expression (Ishiura et al. 1998). The Synechococcus KaiC protein is composed of two RecA-like domains joined head to tail. Highly conserved homologs of KaiC are found in the cyanobacterium Synechocystis, the bacterium Thermotoga, and in all Archaea but absent from other bacteria and eukaryotes (Makarova et al. 1999).

The DnaB helicase is a crucial protein in bacterial DNA replication. It unwinds the DNA duplex ahead of the replication fork and is also responsible for attracting the DnaG primase to the replication fork (Tougu et al. 1994; Lu et al. 1996). The active form of the protein is a hexamer of identical 52.3-kD subunits that can form rings with threefold (C3) and sixfold (C6) symmetry (Yu et al. 1996) and it has been hypothesized that the amino-terminal ATPase domains of two adjacent protomers dimerize to make the C6–C3 conversion (Fass et al. 1999). The crystal structure of the helicase domain of phage T7 helicase–primase (gp4) has recently been solved (Sawaya et al. 1999) and it has been found that the structure of the T7 helicase domain and its interactions with neighboring subunits in the crystal resemble those of the RecA and F1 ATPase (Sawaya et al. 1999). In addition to the ATPase domain, E. coli DnaB comprises a globular amino-terminal domain (proteolytic fragment III) that is essential for interaction with other proteins involved in DNA replication like DnaA, DnaC, and the DnaG primase (Nakayama et al. 1984; Biswas et al. 1994; Sutton et al. 1998). The domain consists of six α helices (Weigelt et al. 1998; Fass et al. 1999; Weigelt et al. 1999) that are attached to the carboxy-terminal ATPase domain by a flexible hinge (Miles et al. 1997).

In addition to RecA, DMC1/Rad51/RadA, DnaB, Sms, and KaiC, there is a large number of proteins with more limited phylogenetic distribution that contain the core RecA ATPase domain. These include, among others, Rad51-interacting proteins Rad55 and Rad57 in yeast (Game 1993), XRCC2 (Tambini et al. 1997), R5H2 and R5H3 (Cartwright et al. 1998), and TRAD (Kawabata and Sacki 1998) in mammals, and several other distinct RecA homologs found in Archaea and some bacteria (Aravind et al. 1999). Some of these orphan RecA homologs appear to contain an inactivated ATPase domain (Aravind et al. 1999). Additional domains associated with the RecA core include a modified amino-terminal helix–hairpin–helix (HhH) domain in the archaeoeukaryotic RadA/DMC1, an amino-terminal zinc finger and a carboxy-terminal Lon-type protease domain in Sms, and a GTPase in one of the archaeal RecA homologs (Aravind et al. 1999).

Here, using a combination of sequence database searches, sequence alignments, phylogenetic analysis, and structural comparison, we show that (1) DnaB, RecA, DMC1/RadA, Sms, and KaiC share significant sequence similarity along a region of 250 amino acids that includes both the ATP-binding domain and the DNA-binding site; (2) DnaB likely evolved from RecA by a gene duplication event at the onset of the evolution of the Bacteria; (3) RecA and DnaB are likely to perform their function by a similar mechanism of conformational change; (4) eukaryotes encode diverged homologs of DnaB, some of which also contain a DnaG-type primase domain; these genes might have been introduced into the eukaryotic genome by a horizontal transfer event involving a bacteriophage. We hypothesize that the common ancestor of the RecA/DnaB superfamily functioned as a recombinase in the last common ancestor (LCA) of all extant cells and that a RecA homolog (DnaB) was recruited for the helicase function at the replication fork once DNA replication evolved in bacteria. This interpretation lends further support to the hypothesis that the DNA replication machinery evolved independently in bacteria and archaea/eukaryotes (Leipe et al. 1999).

Figure 1 (See pages 7–9.) Multiple alignment of the core domain of the RecA/DnaB superfamily of ATPases.
From top to bottom (separated by horizontal lines) the alignment contains sequences from bacterial and chloroplast DnaB, DnaB proteins and primase–helicase proteins from bacteriophages and eukaryotes, bacterial Sms proteins, KaiC from Archaea and Bacteria, RecA recombinase from Bacteria and phage T4, and RadA and Rad51/DMC1 recombinases from Archaea and Eukaryota. The 80% consensus for these proteins is shown below the aligned sequences. Numbers indicate the distance to the amino-terminal methionine and the carboxyl terminus of each protein and residues omitted within the alignment. (&) The position of inteins that have not been included in the alignment. The secondary structure elements derived from the X-ray structures of phage T7 gp4 and E. coli RecA are shown above the respective sequence. Helices are represented as cylinders, strands as arrows, and the unordered or mobile loops 1 and 2 as lines. Key residues that are discussed in the text are marked by arrowheads; the numbers identify the position of the residue in gp4 and RecA according to the original publications (Story et al. 1993; Sawaya et al. 1999). Highly conserved residues are color coded and indicated in the consensus line for the following groups. (Purple) Negatively charged (D,E); (red) positively charged (H,K,R), charged (c=D,E,H,K,R); (green) tiny (u=G,A,S); (yellow) hydrophobic (h=A,C,F,I,L,M,V,W,Y) or aliphatic (l=I,L,V); (pale yellow) alcohol (o=S,T,Y); (light blue) polar (p=D,E,H,K,N,Q,R,S,T); (reddish-brown) small (s=A,C,D,G,N,P,S,T,V); (gray) big (b=not small). Also colored are residues conserved only within the DnaB family. Where applicable, source organisms are identified by four-letter abbreviations. (Aepe) Aeropyrum pernix; (Aqae) A. aeolicus; (Arfu) Archaeoglobus fulgidus; (Arth) A. thaliana; (Basu) Bacillus subtilis; (Bobu) Borrelia burgdorferi; (T7) bacteriophage T7; (T4) bacteriophage T4; (Cael) C. elegans; (CDnaB_Odsi) Odontella sinensis chloroplast; (CDnaB_Popu) Porphyra purpurea chloroplast; (Chtr) Chlamydia trachomatis; (Ecol) E. coli; (Glma) Glycine max; (Hain) Haemophilus influenzae; (Hepy) H. pylori; (Hosa) Homo sapiens; (Lema) Leishmania major; (Meja) Methanococcus jannaschii; (Meth) Methanobacterium thermoautotrophicum; (Mumu) Mus musculus; (Myge) Mycoplasma genitalium; (Mytu) Mycobacterium tuberculosis; (Plch) P. chabaudi; (Rhma) Rhodothermus marinus; (Sace) Saccharomyces cerevisiae; (SPP1) Bacillus subtilis bacteriophage SPP1; (Suso) Sulfolobus solfataricus; (Sy68) Synechocystis PCC6803; (Teth) Tetrahymena thermophila; (Thma) T. maritima; (Trpa) Treponema pallidum.

[Figure 1, alignment panels on pages 7–9; see legend above.]

RESULTS AND DISCUSSION

The Core ATPase Domains of RecA and DnaB Are Specifically Related

BLAST searches seeded with the E. coli DnaB sequence retrieve the replicative helicases from a wide range of Bacteria and several bacteriophages with highly significant E values (<10⁻¹⁴⁰) and the helicase–primase proteins from bacteriophages T3/T7 and T4 with less significant E values (between 10⁻¹⁵ and 10⁻¹³). The first iteration of the PSI-BLAST search unexpectedly retrieved, with highly significant E values, a number of members of the RecA superfamily, namely bacterial Sms proteins and archaeal and eukaryotic RadA/Rad51 proteins.
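The consensus line described in the Figure 1 legend can be illustrated with a short sketch. The legend gives the 80% threshold and the residue classes, but not the exact procedure, so the following is only one plausible reading: a column gets its residue if a single residue reaches the threshold, otherwise the most specific class that does. The function name `consensus`, the labels "-" and "+" for the negatively and positively charged classes, the "." placeholder, the ordering of classes of equal size, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter

# Residue classes as listed in the Figure 1 legend. The class symbols
# c, u, h, l, o, p, s, b come from the legend; "-" and "+" for the
# negatively and positively charged classes are assumed labels.
SMALL = set("ACDGNPSTV")
ALL_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")
CLASSES = [  # ordered from most to least specific (an assumed tie-break)
    ("-", set("DE")),
    ("+", set("HKR")),
    ("u", set("GAS")),
    ("l", set("ILV")),
    ("o", set("STY")),
    ("c", set("DEHKR")),
    ("h", set("ACFILMVWY")),
    ("p", set("DEHKNQRST")),
    ("s", SMALL),
    ("b", ALL_RESIDUES - SMALL),
]


def consensus(aligned, threshold=0.8):
    """Return one consensus symbol per alignment column ('.' if none)."""
    n = len(aligned)
    symbols = []
    for column in zip(*aligned):
        residue, count = Counter(column).most_common(1)[0]
        if residue in ALL_RESIDUES and count / n >= threshold:
            symbols.append(residue)  # a single residue reaches the threshold
            continue
        for label, members in CLASSES:
            if sum(r in members for r in column) / n >= threshold:
                symbols.append(label)  # most specific class that qualifies
                break
        else:
            symbols.append(".")  # no residue or class reaches the threshold
    return "".join(symbols)


toy = ["GKTE", "GKSD", "GRTE", "GKTE", "AKYD"]  # invented, not from Fig. 1
print(consensus(toy))  # prints GKo-
```

In the toy alignment the third column mixes T, S, and Y, so no single residue reaches 80% but the alcohol class (o) does; the fourth column mixes D and E and is reported as negatively charged.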
For example, the sequence of the Sms protein from the bacterium Aquifex aeolicus was detected with an E value of 10⁻¹⁹ and the sequence of the murine Trad protein with an E value of 7 × 10⁻¹⁹. In addition, previously undetected eukaryotic homologs of DnaB from C. elegans, Arabidopsis thaliana, and Plasmodium chabaudi were retrieved with E values between 10⁻¹⁷ and 10⁻¹⁵; a human homolog of these proteins was detected among EST products by searching the database of expressed sequence tags (dbEST) (see discussion below). Subsequent search iterations retrieve the entire RecA family. Conversely, searches seeded with E. coli RecA retrieve members of the DnaB family starting with an E value of 0.001 for Helicobacter pylori DnaB in the first PSI-BLAST iteration, with all the other members of the DnaB family retrieved in subsequent iterations. In all of these searches, RecA family members and DnaB family members, respectively, were consistently retrieved from the database before any other ATPases. This suggests that within the class of P-loop ATPases, there is a specific structural and, by inference, evolutionary, relationship between the RecA, DMC1/RadA, Sms, KaiC, and DnaB families; hereafter, we refer to them collectively as the RecA/DnaB superfamily.

Sequence and Structure Conservation in the RecA/DnaB Superfamily

The structure of the E. coli RecA protein consists of a major central domain flanked by two smaller domains at the amino and carboxy termini (Story et al. 1992; see also Fig. 2, below). The central domain can be subdivided into a large subdomain encompassing strands 1–5 and the connecting helices and loops and a small subdomain that represents two noncontiguous regions of the sequence including helix B and strands 6–8 (Figs. 1 and 2). As detailed above, we found that the sequence of this 250-amino-acid central domain is specifically conserved between DnaB, RecA, DMC1/RadA, KaiC, and Sms protein families. The importance of this core for RecA function is underscored by the fact that Pk-REC, a truncated, 210-amino-acid DMC1/RadA homolog from Pyrococcus, which consists of the core domain alone, can complement UV-sensitive RecA mutants in E. coli (Rashid et al. 1996).

Figure 2 A molscript diagram of E. coli RecA structure. Areas with sequence conservation between DnaB and RecA/DMC1/RadA are highlighted; the nonconserved carboxy- and amino-terminal domains are shown in light gray. The central parallel β-sheet is blue and the elements that are involved in coordinating loop 1 (between strand 4 and helix F) and loop 2 (between strand 5 and helix G) are green. Areas within the core domain that show no obvious sequence similarity between RecA and DnaB (helix D, strand 3 and helix E) are shown in light blue. The subdomain composed of helix B and strands 6–8 is shown in yellow. ADP diffused into the crystal (Story and Steitz 1992) is shown in ball-and-stick representation. Conserved amino acid residues that are discussed in the text and indicated in the alignment (Fig. 1) are shown in ball-and-stick representation: Lys-72 (in the P-loop), Glu-96, Asp-144 (Walker B), Gln-194, Arg-227, Lys-248, Lys-250, and Tyr-264 are at or near the carboxyl terminus of strands 2, 4, 5, 6, and 7, respectively. Amino acid coordinates are from PDB file 2REB, location of ADP is from PDB file 1REA. The orientation of the monomer, labels of strands, helices, loops, and residue enumeration are in accordance with the original publications (Story et al. 1992; Story et al. 1993).

A multiple sequence alignment of the RecA and DnaB sequences was constructed on the basis of the PSI-BLAST output and refined manually using structural information on RecA and DnaB (Fig. 1). The region of sequence conservation between RecA, RadA/DMC1, Sms, KaiC, and DnaB extends for ~250 amino acids and includes the P-loop and the Mg2+-binding site (Walker A and B motifs, respectively), which are involved in NTP binding and hydrolysis. Although the Walker A motif shows the typical G..GKT pattern conserved in a vast variety of ATPases and GTPases (Saraste et al. 1990), it is noteworthy that the second carboxylate typically found in the Walker B motif of several large groups of ATPases, for example, the AAA+ class of chaperone-like ATPases (Neuwald et al. 1999) and superfamily I and II helicases (Gorbalenya and Koonin 1993), is replaced by an alcohol residue in the RecA/DnaB superfamily (Fig. 1).

Motif 3 corresponds to E. coli RecA strand 2 and the following loop and is characterized by a completely conserved glutamate (hhh[SD].E) that has earlier been described as a conserved feature of the DnaB family (Ilyina et al. 1992). The conserved glutamate is assumed to activate the nucleophilic water molecule for an in-line attack of the ATP γ-phosphate (Story and Steitz 1992), and an E96D mutation in E. coli RecA results in a 100-fold reduction in the ATP hydrolysis rate (Campbell and Davis 1999a,b). The catalytic glutamate is highly conserved not only in the entire RecA/DnaB superfamily, but it is found in the same location (carboxy-terminal of the strand that follows the P-loop) in a large number of Walker-type ATPases, for example, F0/F1 ATPases and Rho helicase (Yoshida and Amano 1995). Interestingly, however, this motif is not detectable in NTPases, for example, the AAA+ class and the superfamily 1 and 2 helicases, where the conserved aspartate in the Walker B motif (motif 4) is followed by another negatively charged residue (so-called DEXX box). As the conserved aspartate in motif 4 is followed by a noncharged residue in the RecA/DnaB superfamily, it has been suggested that the second charged residue of the Walker B motif is functionally replaced by the conserved glutamate in motif 3 in the RecA/DnaB superfamily (Sawaya et al. 1999).

In addition to the catalytic glutamate in motif 3 and the Walker A and B motifs (motifs 2 and 4) that are found in a wide variety of ATPases, there are four other motifs (1, 5, 6, and 7 in Fig. 1) that show significant sequence conservation among the members of the RecA/DnaB superfamily and that can be correlated with elements known from the crystal structure of RecA and T7 gp4 (Story and Steitz 1992; Story et al. 1992; Sawaya et al. 1999) (Figs. 1 and 2).

Motif 1 is amino-terminal of the P-loop and corresponds to helix B and a glycine-rich loop containing a conserved negative charge with the consensus pattern h.[ST]G...h[DE]...G (where h stands for a hydrophobic residue, residues in square brackets are alternatives, and a dot stands for any residue). In E. coli RecA, the tight turn completed by helix B and the neighboring carboxy- and amino-terminal sequences is stabilized by hydrogen bonds between Thr-42 and Asp-48 side chains and Asp-48 and Gly-54 backbone atoms (Story et al. 1993); all four residues involved in these interactions are highly conserved within the entire RecA/DnaB superfamily (Fig. 1). No function has yet been assigned to motif 1, but it has been noted that this region points towards the outside of the RecA polymer and is thus distant from the (presumed) ATP and DNA binding sites (Story et al. 1993).

The most conserved RecA residue in motif 5 (Gln-194) is found at the carboxy-terminal end of strand 5. In the structure, this residue is adjacent to the ATP γ-phosphate and it has been proposed to mediate a structural change on binding of ATP that stabilizes a conformation in the following loop 2 and/or helix G with high affinity for DNA (Story and Steitz 1992). Similarly, the corresponding residue of phage T7 gp4 (His-465) is in a position to act as γ-phosphate sensor or conformational switch by forming a hydrogen bond with the ATP γ-phosphate (Sawaya et al. 1999). In addition to the conservation of the putative γ-phosphate sensor itself (glutamine in all bacterial DnaBs and histidine in the eukaryotic DnaB homologs, phage T7 gp4, phage T4 UvsX, and the Sms family), considerable sequence conservation is also found in the preceding helix F and strand 5 in all members of the RecA/DnaB superfamily (Fig. 1). This suggests that the general mode of ATP-binding/hydrolysis-mediated conformational change is conserved at least between RecA, RadA/DMC1, and DnaB. Whether that holds true for the entire superfamily is doubtful because the putative γ sensor (His-465/Gln-194) is not conserved in the double-domain KaiC proteins and because the loop between motifs 5 and 6 (loop 2) seems to be missing in KaiC and Sms (Fig. 1).

In addition to mediating a conformational change within a subunit, binding and hydrolysis of ATP is likely to induce the rotation of subunits within the T7 gp4 hexamer (Sawaya et al. 1999). It has been suggested that T7 gp4 residue Arg-522, which is close to the γ-phosphate of a bound ATP in a neighboring subunit, is responsible for coupling ATP hydrolysis to subunit rotation (Sawaya et al. 1999). The importance of the residue is underscored by the fact that Arg-522 is the third residue of a [KR].[KR] motif located between strands 7 and 8 that is completely conserved in the DnaB, RecA, Sms, and KaiC families (Fig. 1). Surprisingly, the [KR].[KR] motif appears to be missing in the archaeoeukaryotic RadA/DMC1 family (Fig. 1) although RadA/DMC1 shares the strand exchange function with RecA and shares the highest overall sequence similarity with RecA within the RecA/DnaB superfamily. There is a conserved positively charged residue nearby in the predicted strand 7 of the RadA/DMC1 family proteins (Fig. 1), but whether or not this residue is functionally equivalent to Arg-522 will have to await the first structure of a member of this family.

In T7 gp4, the base of the bound nucleotide is sandwiched between Arg-504 and Tyr-535 (Sawaya et al. 1999). Arg-504, at the carboxy-terminal end of strand 6 in motif 6 (Fig. 1), is conserved as either Arg or Lys in DnaB and RecA but not in most KaiC and Sms proteins. T7 gp4 Tyr-535, at the carboxy-terminal end of strand 8 in motif 7, seems conserved as an aromatic residue (Phe, Tyr, His) within the DnaB family although exact superposition would require a gap in the bacterial DnaB sequences (Fig. 1). In E. coli RecA, the base of the bound ADP stacks on Tyr-103 (Story and Steitz 1992), which is a residue carboxy-terminal of motif 3 that seems conserved only in RecA but not in any of the other members of the RecA/DnaB superfamily (Fig. 1). The other residues that are close to the adenine base in the E. coli RecA structure are Asp-100, Tyr-264, and Gly-265 (Story and Steitz 1992). Interestingly, E. coli RecA Tyr-264 is conserved as an aromatic residue in the RecA family and located at the carboxy-terminal end of strand 8 similar (but seemingly not identical) to the position of T7 gp4 Tyr-535. A conserved aromatic residue close to the carboxy-terminal end of strand 8 is also present in the RadA/DMC1, Sms, and KaiC families, but they do not seem to align exactly with the aromatic residues in either RecA or gp4/DnaB (Fig. 1). The lack of exact superposition could be caused by a suboptimal alignment or, alternatively, might indicate that the spatial orientation of the nucleoside with respect to the phosphate moiety differs between the various members of the RecA/DnaB superfamily.

Similarities between DnaB and RecA can also be found in the subunit interface. Hexamer formation in T7 gp4 depends on helix A that is located at the amino terminus of the helicase domain (Sawaya et al. 1999). It protrudes from the rest of the molecule and completes a three-helix bundle (helices D1, D2, and D3) on a neighboring subunit (Sawaya et al. 1999). Similarly, in the RecA polymer, large parts of the subunit interface are formed by a protruding amino-terminal helix A (Fig. 2) and strand 0 of one subunit packing against strand 3 and helix E in a neighboring subunit (Story et al. 1992). Thus, although no sequence similarity has been detected in either the protruding amino-terminal helix A or the other interface half around helix D, the structural similarities suggest that the subunit interface is homologous and was already present in the common ancestor of DnaB and RecA. In contrast, the amino terminus of KaiC is located immediately before motif 1 (Fig. 1) and a protruding helix is likely absent. It is therefore unlikely that the KaiC proteins have the ability to hexamerize and the head-to-tail fusion of two RecA-like ATPase domains in the two-domain KaiC genes suggests that they might function as dimers. Similarly, the amino-terminal region of Sms proteins is taken up by the Zn-binding module, which might be an alternative means of dimerization but also could be a DNA-binding domain.

Evolution of the KaiC Family

Among the proteins considered here, the evolution of the KaiC family is the most difficult to interpret. The gene seemingly has undergone multiple gene duplications and lateral transfers. The typical KaiC protein composed of two RecA-like domains joined head to tail is found in the Cyanobacteria and the Archaea Archaeoglobus, Pyrococcus, and Methanobacterium (Fig. 3), whereas it is absent from Methanococcus and Aeropyrum. As an additional complication, the Methanobacterium KaiC is more closely related to one of the Synechocystis KaiC paralogs than to the double-domain KaiC found in other Archaea like Archaeoglobus and Pyrococcus (Fig. 3). In addition to the double-domain KaiC proteins, there is a large number of single-domain KaiC homologs that are all archaeal with the exception of an apparent recent transfer into the hyperthermophilic bacterium Thermotoga maritima (Fig. 3). Indeed, whole-genome analysis has shown that almost a quarter of all T. maritima genes are likely acquired by lateral transfer from the Archaea (Logsdon and Fanny 1999; Nelson et al. 1999). The KaiC family as a whole seems to originate from the bacterial side of the RecA/DnaB superfamily and is identified as a sister group to the Sms family with varying statistical support in most phylogenetic analyses (results not shown). We hypothesize that the ancestral KaiC was a single-domain protein that has been laterally transferred from the Bacteria into the Archaea and that the two-domain KaiC originated by gene duplication and fusion within the Archaea. In this model, the occurrence of the double-domain KaiC in the Cyanobacteria and its lack in other Bacteria is interpreted as a secondary lateral transfer from the Archaea after the main bacterial lineages had been established.

Evolution of the Eukaryotic DnaB Proteins

There are two types of DnaB proteins in the Eukaryota. The DnaB sequences found in chloroplast genomes are highly similar to the bacterial sequences and the chloroplast DnaB of the red algae Porphyra also shares the intein position with Cyanobacteria and a few other bacteria (Pietrokovski 1996) (Fig. 1). There is therefore little doubt that these proteins are vertically inherited from the bacterial endosymbiont that gave rise to the plastids and that they are likely the functional helicases in chloroplast DNA replication. In contrast, the previously undetected nuclear eukaryotic DnaB homologs tend to group with the T-odd bacteriophage proteins (gp4) in which the DnaB helicase domain is fused to a DnaG-type primase domain, although there is no strong statistical support for this clade (Fig. 3). Also, when the nuclear eukaryotic DnaB sequences are used as queries for database searches, they typically show the greatest similarity to the bacteriophage DnaB homologs (data not shown). Furthermore, the DnaB homolog from Arabidopsis has the same domain architecture as the phage homologs, with the primase domain located upstream of the DnaB domain and containing all the diagnostic sequence motifs of the Toprim domains of the DnaG-type primases (Ilyina et al. 1992; Aravind et al. 1998) (data not shown). The DnaB homolog from the nematode C. elegans seems to contain a diverged counterpart of the DnaG domain with disrupted catalytic motifs, and no trace of the DnaG domain could be detected in the homolog from Plasmodium (the human coding sequence is incomplete and it remains unclear whether or not the protein contains a DnaG domain). This conservation of a unique domain architecture between nuclear eukaryotic and bacteriophage DnaB homologs, together with the apparent absence of DnaB homologs in Archaea, suggests that the gene coding for the DnaB homolog probably has been horizontally transferred into eukaryotes via a bacteriophage. Subsequent evolution of this gene in eukaryotes seemed to have involved degradation of the primase domain, at least in some lineages, whereas the helicase domain remained intact. The unexpected tree topology for the eukaryotic DnaB homologs, namely the strongly supported grouping of the Plasmodium protein with the human one and the lack of statistically significant grouping of the plant protein with the rest of the eukaryotes, suggests a complex evolutionary history of this gene, perhaps involving additional horizontal transfer events. The functions of the nuclear eukaryotic DnaB homologs remain unclear. The plant and animal DnaB homologs contain an amino-terminal extension that is likely to function as an organellar import peptide; thus, a role in mitochondrial DNA replication or repair seems a possibility. This possible use of the phage DnaB for organellar function is reminiscent of a similar adaptation of a T-odd phage RNA polymerase in organellar transcription in plants (Hedtke et al. 1997).

Figure 3 Unrooted phylogeny of the RecA/DnaB superfamily. The analysis is based on the alignment of the RecA/DnaB core domain shown in Fig. 1. The data matrix contains 221 residues, seven of which are invariant or parsimony uninformative. Support for individual branches is indicated by bootstrap values for 1000 resamplings of PAUP maximum parsimony (first number), PHYLIP distance analysis (second number), and the reliability value computed by the PUZZLE software (third number). Bootstrap values <50% are not recorded and branches without bootstrap numbers are derived from a distance tree computed with the PHYLIP programs protdist and fitch. Branch lengths are arbitrary and do not represent evolutionary distances. The two possible positions of the root as discussed in the text are indicated by black arrows. (Red) Eukaryota; (green) Archaea; (blue) Bacteria; (pink) Bacteriophages. Names in boxes identify the individual protein families. The sequence identifiers are the same as for Fig. 1 except that the GenBank identifier was omitted.

Evolution of the RecA/DnaB Superfamily

The sequence similarity between DnaB and RecA and their shared ability to form hexameric rings or helices of similar quaternary structure (Ogawa et al. 1993; Yu and Egelman 1993, 1977; Yu et al. 1996; Seitz et al. 1998) raise the question of whether the RecA/DnaB superfamily is related to other hexameric P-loop NTPases. There is no evidence of a specific relationship with the hexameric/dodecameric branch-migration helicase RuvB (Mitchell and West 1994) or SV40 large T antigen helicase (Mastrangelo et al. 1989; Weisshart et al. 1999), both of which belong to the AAA+ class, a distinct division of P-loop NTPases (Neuwald et al. 1999; L. Aravind and E.V. Koonin, unpubl.). In contrast, there are distinct similarities between the RecA/DnaB superfamily and the family of ATPases that includes transcription termination factor Rho and F1–ATPase (Dombroski and Platt 1988; Gorbalenya and Koonin 1993; Miwa et al. 1995; Washington et al. 1996). Within the core domain of RecA and F1–ATPase (corresponding to strands 1–8 of RecA and the associated helices and loops), ~130 residues can be superimposed with an Rmsd of <2.0 Å (Abrahams et al. 1994) and secondary structure elements also are largely congruent (Washington et al. 1996). Although this leaves little doubt that the RecA/DnaB superfamily and the Rho/F1 family share a common ancestor that already had a hexameric quaternary structure, it also indicates that hexameric NTPases as a whole (including RecA/DnaB, Rho/F1, and the AAA+ class) are not a monophyletic group.

Phylogenetic analysis based on the multiple alignment of the core RecA/DnaB domain (~250 residues) strongly supports the monophyly of six major groups, namely bacterial and chloroplast DnaB, eukaryotic DnaB homologs (with the exception of the plant one), bacterial Sms, KaiC, bacterial RecA, and the archaeal/eukaryotic Rad51/DMC1/RadA (Fig. 3). The most critical factor in interpreting this tree is the placement of the root. Unambiguous rooting is possible only when a reliable tree can be produced for two paralogous families resulting from a duplication known to be present in the last common ancestor (Gogarten et al. 1989; Iwabe et al. 1989; Brown and Doolittle 1995). To that end, we have used the Rho/F1 ATPase family as the paralogous group for the entire RecA/DnaB superfamily. However, the information contained in the overall alignment was insufficient to obtain a reliable rooting (data not shown). Thus, the topology of the tree allows for two principal, competing interpretations (Fig. 3). Placing the root between the RecA/Rad51/DMC1/RadA recombinases and the predominately bacterial assemblage of Sms, DnaB, and KaiC suggests an evolutionary scenario in which a gene duplication in the LCA produced the ancestor of DnaB/Sms/KaiC on the one hand and the RecA/Rad51/RadA recombinases on the other hand, and a later gene duplication in the bacterial lineage gave rise to DnaB and Sms. Consequently, the model has to assume that the ancestor of DnaB/Sms/KaiC has been secondarily lost from the archaeoeukaryotic lineage. Alternatively, the root can be placed between the archaeoeukaryotic proteins (Rad51/DMC1/RadA) and the bacterial families (RecA/Sms/DnaB/KaiC) (Fig. 3). In this scenario, the RecA/DnaB superfamily evolved from a single gene in the LCA and the bacterial subfamilies, namely RecA, DnaB, Sms (and possibly KaiC), are derived from successive gene duplication events within the bacterial lineage.

[...] might have been horizontally transferred into the eukaryotic lineage and is unlikely to play a critical role in eukaryotic nuclear DNA replication given its absence in yeast. Instead, the eukaryotic DnaB homologs are likely to function in organelles. These findings have consequences for our understanding of the evolution of DNA replication. Given the involvement of RecA/DMC1/RadA in recombinational processes in all domains of life, it seems likely that this particular family was already represented in the LCA of all extant cellular organisms. In contrast, DnaB, which is the principal helicase involved in bacterial DNA replication, has apparently been recruited for this function after the divergence of bacteria from the archaeal/eukaryotic lineage. Given that any replicative helicase has to be a highly processive enzyme, the ability of RecA to form hexameric rings (with the right diameter to encircle DNA) offers an explanation why a RecA derivative was a suitable candidate to be selected as the principal helicase for bacterial DNA replication. Conversely, eukaryotic replicative helicases might have been independently recruited from other classes of ATPases, such as the AAA+ class or the superfamily II helicases. The notion that the replicative DNA helicase of the Bacteria is not an ortholog of the corresponding replicative helicases in Archaea and Eukaryota is compatible with the recently discussed hypothesis that the modern-type system for the replication of dsDNA has evolved independently in the bacterial and archaeal/eukaryotic lineages (Leipe et al. 1999).

Methods

The nonredundant database of protein sequences at the NCBI (NR) was searched using the gapped BLASTP and PSI-BLAST programs (Altschul et al. 1997). Briefly, the PSI-BLAST program constructs a position-dependent weight matrix (profile) using multiple alignments generated from the BLAST hits above a certain expectation value (E value) and carries out iterative database searches using the information derived from the profile. The statistical evaluation of the PSI-BLAST results is based on the extreme value distribution statistics
The originally developed by Karlin and Altschul (1990) for local dataavailabledonotallowustodistinguishwithcer- alignmentswithoutgapsandsubsequentlyshownbyexten- tainty between these two scenarios, but we favor the sivecomputersimulationstoapplyalsotogappedalignments rootingbetweenRad51/DMC1/RadAandRecAbecause and to alignments obtained by using profiles (Altschul and it is the more parsimonious alternative that does not Gish1996;Altschuletal.1997).Ithasbeenemphasizedthat E values reported for each retrieved sequence at the point invokeasecondarygeneloss. whenitsalignmentwiththequerysequencepassesthecutoff for the first time are robust estimates of statistical signifi- cance.Onceasequencegetsincludedintheprofile,Evalues Conclusions reportedforitanditsclosehomologsatsubsequentiterations We show here that the DnaB and RecA/DMC1/RadA become inflated and do not represent the statistical signifi- proteinsformadistinctsuperfamilyofstructurallyand cance(AltschulandKoonin1998).HereweonlyreportEval- uesforthefirstappearanceofthegivensequenceabovethe evolutionarily related ATPases. Additionally, we de- cutoff. The dbEST was searched using the gapped TBLASTN scribe previously undetected DnaB homologs from program(Altschuletal.1997). phylogeneticallydivergenteukaryotes.Theeukaryotic Multiple sequence alignments were constructed using DnaBhomologthatsharesacommondomainorgani- thePSI-BLASToutputandmodifiedmanuallyonthebasisof zation with T-odd bacteriophage primases–helicases structuralconsiderations.Thealignmentswereformattedus- 14 Genome Research www.genome.org
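
The E-value statistics and the first-appearance reporting rule described in the Methods can be illustrated with a minimal Python sketch. It uses the standard Karlin-Altschul relation E = K·m·n·exp(−λS) for a raw score S, query length m, and database length n. The function names, the sequence identifiers, and the numeric values of λ and K below are illustrative assumptions, not values taken from the paper; in practice λ and K depend on the scoring matrix and gap penalties and are reported by the BLAST programs themselves.

```python
import math


def bit_score(raw_score, lam, k):
    """Normalized (bit) score: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(raw_score, query_len, db_len, lam, k):
    """Expected number of chance alignments scoring >= raw_score.

    Karlin-Altschul relation: E = K * m * n * exp(-lambda * S).
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


def p_value(e):
    """Probability of at least one chance hit: P = 1 - exp(-E)."""
    return -math.expm1(-e)


def first_appearance_evalues(iterations, cutoff):
    """Keep, for each sequence, the E value from the first iteration
    in which it passes the inclusion cutoff.

    `iterations` is a list of dicts (one per PSI-BLAST iteration) that
    map a sequence identifier to the E value reported in that iteration.
    Later, profile-inflated E values for the same sequence are ignored.
    """
    reported = {}
    for hits in iterations:
        for seq_id, e in hits.items():
            if e <= cutoff and seq_id not in reported:
                reported[seq_id] = e
    return reported


if __name__ == "__main__":
    # Illustrative parameters only (assumed, not from the paper).
    lam, k = 0.267, 0.041
    e = e_value(raw_score=100, query_len=250, db_len=1e8, lam=lam, k=k)
    print(f"E = {e:.3g}, P = {p_value(e):.3g}")

    # Hypothetical identifiers and E values for two iterations.
    runs = [
        {"seqA": 1e-5, "seqB": 0.5},
        {"seqA": 1e-40, "seqB": 1e-4},
    ]
    print(first_appearance_evalues(runs, cutoff=0.01))
```

In the hypothetical two-iteration run, `seqA` keeps the value from iteration 1 even though iteration 2 reports a far smaller one, which is the behavior the Methods describe for avoiding inflated significance estimates.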
