ebook img

Novel Predicted RNA-Binding Domains Associated with the Translation Machinery PDF

12 Pages·1999·1.72 MB·English
by  
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Novel Predicted RNA-Binding Domains Associated with the Translation Machinery

JMolEvol(1999)48:291–302 ©Springer-VerlagNewYorkInc.1999 Novel Predicted RNA-Binding Domains Associated with the Translation Machinery L.Aravind,1,2 EugeneV.Koonin2 1DepartmentofBiology,TexasA&MUniversity,CollegeStation,TX77843,USA 2NationalCenterforBiotechnologyInformation,NationalLibraryofMedicine,NationalInstitutesofHealth,Bethesda,MD20894,USA Received:15May1998/Accepted:20July1998 Abstract. Two previously undetected domains were enzymes. The evolution of the translation machinery identifiedinavarietyofRNA-bindingproteins,particu- componentscontainingtheS4,PUA,andSUI1domains larly RNA-modifying enzymes, using methods for se- musthaveincludedseveraleventsoflateralgenetransfer quence profile analysis. A small domain consisting of andgenelossaswellaslineage-specificdomainfusions. 60–65aminoacidresidueswasdetectedintheribosomal protein S4, two families of pseudouridine synthases, a Key words: RNA-binding domains — Ribosomal novel family of predicted RNA methylases, a yeast pro- proteinS4—Archaeosinetransglycosylase—Pseudou- tein containing a pseudouridine synthetase and a deam- ridine synthase — Translation machinery inase domain, bacterial tyrosyl-tRNA synthetases, and a number of uncharacterized, small proteins that may be involved in translation regulation. Another novel do- Introduction main,designatedPUAdomain,afterPseudoUridinesyn- NucleotidemodificationinrRNAsandtRNAsisamajor thaseandArchaeosinetransglycosylase,wasdetectedin venue of structural diversification and change of speci- archaeal and eukaryotic pseudouridine synthases, ar- ficity of these molecules (Limbach et al. 1994). These chaeal archaeosine synthases, a family of predicted posttranscriptional modifications include various forms ATPases that may be involved in RNA modification, a of base methylation, in situ base conversion as in pseu- family of predicted archaeal and bacterial rRNA meth- douridine (c) formation, or insertion of a modified base ylases. Additionally, the PUA domain was detected in a bytransglycosylationasinthecaseofqueuine(Limbach family of eukaryotic proteins that also contain a domain et al. 1994; Lane et al. 1995; Ofengand et al. 1995; homologous to the translation initiation factor eIF1/ Romieretal.1996).Whilemostcommoninthenoncod- SUI1;theseproteinsmaycompriseanoveltypeoftrans- ingRNAs(TollerveyandKiss1997),modificationsalso lation factors. Unexpectedly, the PUA domain was de- occur in mRNA and, in this case, may involve the more tected also in bacterial and yeast glutamate kinases; this dramatic RNA editing (Smith, Sowden 1996). Some of is compatible with the demonstrated role of these en- the RNA modifications, e.g., c formation, are universal zymesintheregulationoftheexpressionofothergenes. inallcells,whereasothersarerestrictedtoonedomainof We propose that the S4 domain and the PUA domain lifesuchasarchaeosineformationintheArchaea(Greg- bind RNA molecules with complex folded structures, son et al. 1993; Watanabe et al. 1997). addingtothegrowingcollectionofnucleicacid-binding AbroadspectrumofRNAmodificationenzymeshas domains associated with DNA and RNA modification beenidentified,includingvariousRNA-specificmethyl- ases, c synthases, tRNA-guanine transglycosylases in- volved in queuine (in Bacteria and eukaryotes) and ar- Correspondenceto:E.V.Koonin;e-mail:[email protected] chaeosine (in Archaea) formation, cytosine deaminases 292 involved in mRNA editing, and others (Rottman et al. enzymes to their target sites in RNA and, on other oc- 1994; Navarathan et al. 1995; Koonin 1996; Romier et casions, to act as translation regulators, via structure- al. 1997). Evidently, for each of these enzymes to per- specific RNA binding. A combined analysis of the phy- formtherequisitereactionsonRNAmolecules,anRNA- logenetic distribution and domain architectures of the bindingmoietyisrequired,eitherasabuilt-indomainof proteins containing these newly described domains pro- the enzyme or as a stand-alone protein or RNA. DNA vides for plausible evolutionary scenarios that are based and RNA-binding domains are central to genome repli- on the parsimony principle and include vertical inheri- cation,expression,andstabilitymaintenance.Thereisno tance,postulatedgenetransferevents,anddomainfusion singlestructuralequivalentofanucleicacid-bindingdo- and shuffling. mainbuttherepertoireisrelativelysmall,withthesame domains recurring in different systems and frequently embedded in multidomain proteins with different archi- Materials and Methods tectures. Some of the DNA- and RNA-binding domains tendtocombinewithenzymaticdomains,forwhichthey Thenonredundant(NR)proteinsequencedatabaseattheNationalCen- terforBiotechnologyInformation(NIH,Bethesda,MD)wassearched serve as a vehicle delivering the enzymes to the sites of usingthegappedBLASTPprogram(Altschuletal.1997).Thenucleo- their action on nucleic acids. Such mobile nucleic acid- tidedatabaseofexpressedsequencetags(dbEST)wassearchedusing binding modules associated with enzymatic domains in- the gapped version of the TBLASTN program, which uses a protein cludethehelix–hairpin–helix(HhH)domainidentifiedin sequenceasaqueryandsearchesthenucleotidedatabasetranslatedin all six frames. Further iterative searches were carried out using the a plethora of proteins involved in DNA replication and PSI-BLASTprogram,whichconstructsaposition-specificweightma- repair as well as ribosomal proteins (Doherty et al. trixbasedonthemultiplealignmentgeneratedinthefirstpassofthe 1996),thedouble-strandedRNA-bindingdomainpresent databasescreeningandsubsequentlyiteratesthesearchusingthispro- in RNAase III (St Johnston et al. 1992), and the S1 file(Altschuletal.1997).Additionally,separatelyconstructedmultiple domain found in ribosomal proteins and polynucleotide alignmentswereusedtoderiveaweightedprofileandseedthePSI- BLASTiterations(A.Schaffer,L.A.,andE.V.K.,unpublishedobser- phosphorylase (Bycroft et al. 1997). Other nucleic acid- vations).FurtherdatabasesearcheswerecarriedoutusingtheMoST binding domain do not have this tendency of combining program, which uses ungapped alignment blocks as an input to con- withenzymaticdomainsandoccurontheirownassingle struct a position-specific weight matrix and iteratively searches the or multiple copies or with other nonenzymatic domains. databaseuntilconvergenceusingtheratioofthenumberofobserved matches to the expected number as a cutoff (Tatusov et al. 1994). Examplesofsuch‘‘loners’’includetheRRM(RNArec- Multiple alignments were constructed by searching for similarity ognitionmotif)domain,whichiscommoninanumberof blocksusingGibbssamplingasimplementedintheMACAWprogram RNA-binding proteins (Shamoo et al. 1995), the so- (Schuleretal.1991;Neuwaldetal.1995);thealignmentsweresub- called KOW (Kyrpides–Ouzounis–Woese) domain sequentlyoptimizedgloballyusingtheClustalWprogram(Thompson found in NusG and ribosomal proteins (Kyrpides et al. etal.1994)andfurthermodifiedusingtheoutputsofthePSI-BLAST searches as a guide. The proteins were filtered for low-complexity 1996), and the cold shock domains of the OB (oligo- regions and decomposed into their predicted globular domains using nucleotide/oligosaccharide binding) fold class theSEGprogramwiththeparametersofwindowsize45,triggercom- (Schnuchel et al. 1993). plexityof3.4,andextensioncomplexityof3.75(WoottonandFeder- In the course of comparative analysis of the protein hen 1996). Secondary structure predictions were performed with the PHD program using the entire family alignment or subsets that pro- sequencesencodedinbacterial,archaeal,andeukaryotic duced most reliable alignments (Rost and Sander 1994). Structural genomes, we discovered two conserved and previously databasethreadingwasperformedonthebasisofthesecondarystruc- undetected domains. The first of these is an ancient do- turepredictions(Rostetal.1997). main found in ribosomal protein S4 (and designated the S4domainafterthiswell-studiedprotein),tyrosyl-tRNA synthetase, the RluA and RsuA families of c synthases, Results and Discussion several predicted RNA methylases, a putative novel RNA editing enzyme, and a number of uncharacterized The S4 Domain proteins. The second domain (designated PUA, after pseudouridine synthase and archaeosine transglyco- A number of c synthases, such as SFHB and YPUL of sylase) is seen primarily in eukaryotes and Archaea and the RSUA family and YCEC and YA32 of the RLUA isassociatedwithcsynthasesoftheTruBfamily,tRNA- family (Koonin, 1996), contain a predicted globular do- guanine transglycosylases, putative novel archaeal main,whichislocatedbetweentheNterminusandthec ATPases involved in RNA modification, a predicted ar- synthase catalytic domain. When the NR database was chaeal rRNA methylase, and a conserved family of pu- searched with the sequence of this domain using the tative novel eukaryotic translation factors. In addition PSI-BLAST program, ribosomal protein S4 sequences and unexpectedly, PUA domain has been detected in were retrieved at random expectation (e) values in the bacterialandyeastglutamatekinases.BothS4andPUA range of 10−3 to 10−4 in the first or second iteration. domains are predicted to deliver nucleotide-modifying When these searches were run to convergence, we also 293 detected several additional proteins at e values below about 460 nucleotides (Heilek et al. 1995). In addition, 10−3. These include bacterial tyrosyl-tRNA synthetases S4bindsacomplexpseudoknotinitsowntranscriptand and a conserved group of bacterial proteins with addi- represses its translation; the protein also shows nonspe- tional methyltransferase domains (Figs. 1 and 2), which cific RNA binding (Tang and Draper 1989). There ap- have been annotated in sequence databases as hemoly- pears to be no similarity between the two independent sinsonthebasisofthepropertiesofasingleproteinfrom RNAstructuresrecognizedbythisprotein.Earlystudies the spirochaete Serpulina hyodysenteriae (ter Huurne et yieldedconflictingresultsregardingthepartsofthepro- al.1994).Inordertoassessthestatisticalsignificanceof teinneededfortheinteractionwith16SRNA.However, thesesequencesimilaritiesmorerobustly,wecarriedout a more recent detailed analysis of deletions in the S4 reverse searches using the corresponding regions from protein,alongwithcirculardichroism(CD)studies,have thesedatabasehits.Ineachofthesecases,itwaspossible suggested that the region spanning residues 48 to 177 to reproducibly retrieve essentially the same set of pro- reproduces the binding specificity typical of the full teins at e values below 10−3 within four iterations, sug- length protein, indicating that the main RNA binding gesting a coherent evolutionary and structural relation- determinantslieinthisregion(BakerandDraper1995). ship between these proteins. Similar results were TheCDspectrasuggestthatthisfragmentoftheprotein obtainedinmotifsearchesusingtheMoSTprogramwith isstructuredsimilarlytotheintactproteinbutanyfurther r values of 0.005–0.01. deletions moving into the S4 domain defined above Thus, we defined a common globular domain in all cause the loss of structure (Baker and Draper 1995). these proteins, which we call S4 domain after the thor- Furthermore,toeprintassaysimplyabimodalRNAcon- oughlystudiedribosomalproteinthatappearstohavean tactbyS4protein,withoneofthesecontactsspecifically important conserved role in ribosomal structure and dependent on the region encompassing the S4 domain function (Tang and Draper 1989; Baker and Draper and the other one mediated by a predicted a helix-rich 1994; Heilek et al. 1995). The S4 domain consists of globulardomainwhichislocatedN-terminallyoftheS4 60–65 amino acid residues and typically occurs in a domain (Fig. 2). singlecopyatvariouspositionsindifferentproteins.The Independent evidence of the RNA-binding properties boundaries of the domain were determined on the basis of the S4 domain come from the studies on the C- ofbothsequenceconservationandthedomainorganiza- terminaldomainoftheBacillusstearothermophilustyro- tionoftherespectiveproteins.Morespecifically,intyro- syl-tRNA synthetase, which is required for tRNA bind- syl-tRNAsynthetasestheS4domainisintheextremeC ing by this enzyme (Carter et al. 1986). Though this terminus, defining the C-terminal boundary, and in the domainappearsdisorderedinthecrystalstructure(Brick predicted methyltransferases, it is at the extreme N ter- et al. 1989), CD spectra and fluorescence spectroscopy minus, indicating the N-terminal limit (Fig. 2). Notably, suggestthatitisstructuredinsolution(Guez-Ivanierand it is precisely the S4 domain in the boundaries defined Bedouelle 1996). It may be speculated that tRNA bind- herethatisduplicatedintheChlamydomonaschloroplast ing is required for the S4 domain in the tyrosyl-tRNA S4ribosomalprotein,providingadditionalsupportforits synthetasetoassumeitsproperconformation.Theseex- identificationasanindependent,structurallydistinctunit perimentalresults,togetherwithourobservationsonthe (Figs. 1 and 2). presenceoftheS4domaininRNA-modifyingenzymes, InspectionofthemultiplealignmentoftheS4domain suggest that this domain is capable of recognizing com- superfamily shows a characteristic pattern of conserved plex, unique 3D features in highly folded RNA mol- polar, hydrophobic, and small residues in its N-terminal ecules such as rRNAs, tRNAs, and untranslated regions and central parts; these residues are likely to define the of mRNAs. The presence of the S4 domain in c syn- conserved structural features of the domain. The C- thases such as RsuA, which pseudouridylates position terminal part does not show a high degree of conserva- 516locatedinahighlystructuredportionofE.coli16S tion between different families within the superfamily RNA(Wrzesinskietal.1995a),iscompatiblewiththese and may be important for interactions specific to each observations.OurdatabasesearchesshowthattheE.coli family(Fig.1).StructuralpredictionusingthePHDpro- RluA c synthase, which acts on both tRNA and 23S gramsuggeststhatthisdomainhasana/bstructurewith RNA (Wrzesinski et al. 1995b), lacks a S4 domain but thecentralhighlyconservedregioncorrespondingtothe homologous,highlyconservedproteinsfromE.coliand b elements (Fig. 1). a number of other bacterial species (e.g., the E. coli The evidence of the RNA-binding function of the S4 proteins YceC and SfhB) contain it, suggesting mecha- domaincomesfromseveralindependentstudiesonribo- nisticsimilaritiesbetweensubstraterecognitionbymany somalproteinS4itselfandontheC-terminaldomainof csynthases,tRNAbindingbytyrosyl-tRNAsynthetases, bacterial tyrosyl-tRNA synthetase. The E. coli S4 is a and rRNA binding by S4. The differences in the S4 do- multifunctional protein, which binds to a region of 16S mains of these enzymes, particularly in the variable C- rRNA that comprises most of the 58 domain and spans terminal portions, may contribute to the unique target Fig.1. TheS4domain.Themultiplealignmentwasconstructedas residues(L,F,W,Y,E,R,K,I;graybackground),polarresidues(S, describedinthetextusingMACAWandCLUSTALW.Namesofthe T,E,D,R,K,H,N,Q;browncoloring),hydroxylicresidues(S,T;blue proteinsareasassignedinSWISSPROTorasperdesignationsofthe coloring),andchargedresidues(R,K,H,D,E;purplecoloring).The individual genome sequencing projects, with the GenBank numbers multiplealignment-basedsecondarystructurepredictionisindicatedon shownnexttothem.Thenumbersflankingeachlineofthealignment topofthealignment;E(e)indicatesextendedconformation(b-strands), indicatetheboundariesoftheS4domaininthepolypeptide.These- andH(h)indicatesahelices;capitallettersindicatethemostconfident quences are grouped based on similarity and domain organization. predictions. Species abbreviations: Aae, Aquifex aeolicus; Af, Ar- Group1,sequencesoftheS4domainsfromtheribosomalproteinS4; chaeoglobusfulgidus;Bb,Borreliaburgdorferi;Bs,Bacillussubtilis; group 2, uncharacterized bacterial proteins; group 3, methyltransfer- Ce,Caenorhabditiselegans;Chre,Chlamydomonasrheinhardtii;Ec, ases; group 4, bacterial c synthases; group 5, bacterial stand-alone Escherischiacoli;Hi,Haemophilusinfluenzae;Hp,Helicobacterpy- versionsoftheS4domain;group6,theeukaryoticcsynthases;and lori;Hs,Homosapiens;Mj,Methanococcusjannaschii;Mpn,Myco- group7,bacterialtyrosyl-tRNAsynthetases.Theconservedpositions plasmapneumoniae;Mg,Mycoplasmagenetalium;Mta,Methanobac- areshadedusingthe80%consensusrule.Theaminoacidclassesused teriumthermoautotrophicum;Mtu,Mycobacteriumtuberculosis;Mle, inbuildingtheconsensuswerehydrophobicresidues(A,C,F,I,L,M, Mycobacteriumleprae;Sc,Saccharomycescerevisiae;Sp,Schizosac- V,W,Y;yellowbackground),smallresidues(A,C,S,T,D,N,V,G, charomycespombe;Thy,Treponemahyodysenteriae. P;bluebackground),tinyresidues(A,G,S;greenbackground),big 295 Fig.2. RepresentativedomainarchitecturesofS4domain-containing isthepredictedmetal-bindingdomainmentionedinthetext.Thead- proteins.Theproteinsaredrawntoscaleasindicated.PSUSstandsfor ditional a helix-rich globular domain in S4 that has been shown to thecatalyticdomainofthecsynthases,inthiscase,oftheRluAand contactRNAindependentlyisalsoshown. RsuAfamilies.TheadditionalglobulardomainintheeukaryoticPSUS specificities,giventhehighsequenceconservationinthe aeolicus,andthearchaeonArchaeoglobusfulgidus(Figs. catalytic domains. Furthermore, the presence of the S4 1 and 2). The member of this family from Treponema domain in bacterial tyrosyl-tRNA synthetases but not in hyodysenteriae has been described as a hemolysin (ter theirarchaealoreukaryoticorthologssuggestsmajordif- Huurne et al. 1994), but given the broad distribution of ferences in the modes of tRNATyr binding. the family in both parasitic and free-living species and The detection of the S4 domain in several other pro- their domain organization, this is unlikely to define the teins seems to be of considerable interest. A group of common function of these proteins. The combination of eukaryotic c synthases of the RluA family contains the the S4 domain, predicted to bind RNA, and the methyl- S4 domain as well as additional, uncharacterized, con- transferasedomain,suggeststhatthisisanovelfamilyof serveddomains,withthedistinctivedomainorganization rRNA methylases. Given their patchy phylogenetic dis- conservedatleastinthecrowngroupofeukaryotes(Fig. tribution, these enzymes may be responsible for taxon- 2). Remarkably, one of the four yeast c synthases— specific rRNA methylation. Rib2, contains, in addition to the S4 domain, a C- Several small bacterial proteins (e.g., YabO from B. terminal domain similar to a wide range of deaminases, subtilisandYrfHfromE.coli)consistalmostentirelyof including RNA editing cytidine deaminases (Fig. 2). the S4 domain (Fig. 2). An interesting possibility is that WhileithasbeeninitiallyproposedthatRib2hasafunc- theseputativeRNA-bindingproteinsmaybeinvolvedin tion in riboflavin biosynthesis (Tzermia et al. 1997), the translationalregulationsimilarlytoS4.Thesporadicoc- observed domain organization suggests that it may be a currence of these proteins in bacteria suggests that they novel RNA-editing enzyme that produces c from cyto- mayhavespecificfunctionsonlyinthecontextofcertain sine through deamination with subsequent pseudo- operons and their untranslated leaders. uridylation. Another previously undetected family includes pro- The PUA Domain teinsthat,inadditiontotheS4domain,containthecon- servedmotifstypicalofSAM-dependentmethyltransfer- Using the same iterative strategy for database searching ases (Kagan and Clarke 1994; Koonin et al. 1995), and that is described above for the S4 domain, significant are encoded by a variety of taxonomically diverse bac- sequence similarity (e value below 10−4 at the second teria, including Helicobacter pylori, Bacillus subtilis, PSI-BLAST iteration) was detected between the C- Spirochaetae, Mycobacteria, Synechocystis, Aquifex terminal regions of the archaeal tRNA guanine- 296 transglycosylase involved in archaeosine synthesis and gesting that a defect in rRNA maturation may be in- theeukaryoticCBF5proteins,whichbelongtotheTruB volved in the pathogenesis (Heiss et al., 1998). family of c synthetases (Koonin 1996). Both of these TheArchaeosinebiosynthesisenzymeshavecatalytic proteinsareinvolvedinRNAbasemodificationbuttheir domains that are distantly related to the family of trans- catalytic domains are completely unrelated, leading to glycosylases involved in queuine (Q) biosynthesis in the suggestion that the conserved region defines a novel bacteria and eukaryotes (Gregson et al. 1993; Watanabe RNA-bindingdomain.Comprehensiveiterativesearches et al. 1997). These Q-specific transglycosylases (TGT) resulted in the identification of this domain (hereafter lackthePUAdomainandarepresentonlyinbacteriaand PUA domain, after pseudouridine synthases and ar- eukarya, with the exception of the archaeon A. fulgidus, chaeosine-specific transglycosylases) in other archaeal whichappearstohaveacquiredasinglegeneencodinga andeukaryoticproteinswithadditionaldomainsandalso Q-specificTGTfrombacteria.Conversely,thearchaeal- as highly conserved stand-alone versions. The PUA do- typearchaeosine-specificTGTswerenotdetectedineu- mainclearlyislesscommoninbacteria,butquiteunex- karyotes (including ESTs) or in bacterial genomes pectedly,inadditiontopredictedRNAmethylasesfound sampled so far. The four sequenced archaeal genomes only in two species, it was detected at the C termini of each encode two distinct forms of the TGT with a C- bacterial and fungal glutamate kinases (Figs. 3 and 4). terminal PUA domain (Fig. 4), which may be specifi- LiketheS4domain,PUAisasmall,compactdomain callyresponsibleforarchaeosineincorporationatdiffer- that consists of 78–83 amino acid residues. The N- ent sites in tRNA molecules. terminal boundary was defined by the extreme N- TheuncharacterizedproteinscontainingthePUAdo- terminallocationofthePUAdomaininafamilyofpre- main possess interesting features that make them candi- dicted RNA methylases, whereas the C-terminal dates for further studies on RNA modification. One of boundary was easily determined given the C-terminal these is a small protein highly conserved in archaea and location of the PUA domain in CBF5-like c synthases eukarya(e.g.,yeastYER007).Thisisastand-alonever- and tRNA transglycosylases (Fig. 4). Inspection of the sion of the PUA domain (Fig. 4), and its extraordinary multiple alignment of the PUA domain reveals highly conservation suggests that it may function as a RNA- conserved motifs that center around stretches of hydro- binding cofactor for RNA modifying enzymes. phobic residues (Fig. 3). The domain also contains two Another protein family, which includes mammalian highlyconservedpositionsoccupiedbysmallaminoac- ligatin, a protein that, apparently erroneously, has been ids(primarilyglycines),whichmaydefineafunctionally describedasatraffickingreceptorforphosphoglycopro- important turn. Secondary structure prediction using the PHDprogramsuggeststhatPUAdomainhasab-strand- teins(Jakoietal.1989),ischaracterizedbyaN-terminal PUA domain and at least one additional C-terminal rich structure, with the predicted strands corresponding globulardomain(Fig.4).Thisfamilyishighlyconserved totheconservedhydrophobicpatches(Fig.3).Thispat- atleastinthecrowngroupofeukarya.Adatabasesearch tern of secondary structure elements resembles that in with the C-terminal domains of the ligatin family pro- some other RNA-binding domains, particularly the OB teinsunexpectedlyshowedamarginallysignificantsimi- fold(Murzin1993).GiventheprevalenceoftheOBfold larity(evalue,~ 0.07)withthesequencesoftheeukary- amongRNAbindingdomains,itcannotberuledoutthat PUA is a distinct version of this fold. otic translation initiation factor eIF-1/SUI1 and its UnlikethesituationwiththeS4domain,experimental orthologuesfromarchaeaandbacteria(Kasperaitisetal. details that could shed light on the specific functions of 1995;Tatusovetal.1997;KyrpidesandWoese1998)as thePUAdomainaresketchy.TheconservationofCBF5 wellasaconservedfamilyofuncharacterizedeukaryotic and the associated PUA domain in archaea and eukarya proteins. Iterative reverse searches with the proteins of suggests that this enzyme performs a fundamental pseu- the SUI1 family and the new conserved family showed douridylation function, for which PUA domain is re- statistically significant similarity to the ligatins (e.g., e quired. Interestingly, the second yeast c synthase of the value<10−4atthethirditerationinasearchinitiatedwith TruBfamilylacksaPUAdomainandhasbeenshownto theyeastproteinYJY4).MultiplealignmentoftheSUI1 catalyzecformationinbothcytoplasmicandmitochon- orthologuesandnewlydetectedsimilardomainscontains drialtRNAs(Beckeretal.1997);theabsenceofthePUA a number of conserved positions, with two prominent domain is compatible with the transfer of the gene cod- motifs located in the middle and near the C terminus of ing for this enzyme from bacteria to the eukaryotic the domain, and is predicted to possess a compact b/a nucleargenome.Incontrast,thePUAdomain-containing structure(Fig.5).SUI1isinvolvedintherecognitionof protein CBF5 is essential for rRNA maturation but does the initiation codon by the eIF2–GTP–Met–tRNA ter- i notaffecttRNA,suggestingthatPUAdomainmaycon- nary complex (Naranda et al. 1996; Yoon and Donahue tribute to the specific rRNA binding. Interestingly, the 1992)and,accordingly,islikelytopossessRNA-binding humanorthologueofCBF5,aproteinnameddyskerin,is activity. So far, the members of this family have been theproductofthegenemutatedinabonemarrowfailure identified only as standalone small proteins (Kyrpides disorder,X-linkedrecessivedyskeratosiscongenita,sug- and Woese 1998) but the findings presented here 297 dasindicatedinthelegendtoFig.1.Theconservedresiduesareshadedusingthe80%consensusFig.1.Group1,stand-alonePUAdomainproteinssuchasyeastYER007anditshomologues;groupfusedtothetRNAtransglycosylases;group4,PUAdomainsinligatin-likeproteins;group5,PUAases;group7,PUAdomainfusedtoglutamatekinases.ThespeciesabbreviationsareasinFig.1.m,Musmusculus;Rr,Rattusrattus;Ph,P.horikoshii. onstructeareasindomainsomethylactis;M main.ThemultiplealignmentofthePUAdomainwascentionsandsecondarystructurepredictionsdesignationsctotheTRUBfamilysynthasedomain;group3,PUASreductase-likedomains;group6,PUAdomainfusedtns:Dm,Drosophilamelanogaster;Kl,Kluyveromycesl Fig.3.ThePUAdorule.Theshadingconv2,PUAdomainsfuseddomainsfusedtoPAPAdditionalabbreviatio 298 Fig. 4. Representative domain architectures of PUA domain- the tRNA glycosylases. The PP-loop ATPase domain is the domain containingproteins.Theproteinsaredrawntoscaleasindicated.The similartoPAPSreductases.Theredbarinthisdomainspecifiesthe PSUSdomainisthecatalyticdomainofthecsynthasesoftheTruB intact PP loop, which is typically disrupted in the PAPS reductases familyandtheTGTdomainistheTIMbarrelfoldcatalyticdomainof (42).TheproteinsarenamedasinFig.3. strongly suggest that SUI1 is yet another mobile RNA- was completely unexpected, as the principal function of binding domain. The ligatin-like protein and the other this enzyme in proline biosynthesis has nothing to do conserved eukaryotic proteins typified by YJY4 may be with RNA binding. It has been shown, however, that a novel class of translation factors with two RNA- Bacillus subtilis glutamate kinase (the proB gene prod- binding domains of different specificity. uct) down-regulates the expression of sigma D-depen- Two proteins found in M. jannaschii, with an ortho- dent genes by an as yet unknown mechanism, possibly logue of at least one of them detected in Pyrococcus by inhibiting the translation of sigma D itself (Ogura, horikoshii,containaPUAdomainprecededbyaspecific Tanaka, 1996). It may be speculated that such a regula- ferredoxin-like domain and succeeded by a domain that tory mechanism is wide spread in bacteria and requires issimilartoPAPSreductase(Fig.4)butretainsanintact bindingofaspecificRNAstructurebythePUAdomain. PP-loop motif, suggesting ATPase activity (Bork and Experimental demonstration of the role of the PUA do- Koonin 1994). An interesting possibility is that these main in glutamate kinases will be of major interest. proteins are involved in taxa-specific RNA modifica- tions. In a number of archaeal species, tRNAs contain Evolutionary Implications of the Phylogenetic thiolated bases such as thiouridine derivatives and thio- Distribution of S4 and PUADomains thymine, which may play a role in the thermal stability (Edmonds et al. 1991). It seems possible that the PUA Examination of the phylogenetic distribution and se- domain–PAPS reductase proteins are uncharacterized quence relationships of S4 and PUA domains, together enzymes involved in the biosynthesis of such bases withthoseoftheRNAmodifyingenzymes,mayhelpin through sulfate activation and subsequent reduction. Al- unravelingtheevolutionaryhistoryoftherespectivepro- ternatively,theymaybeinvolvedinsulfatemodification tein families, which seem to be intimately linked to the of the sugar backbone analogous to the reaction cata- evolutionoftranslation(Fig.6).Theresultingevolution- lyzedbytheNodPprotein(Schwedocketal.1994).Ina ary scenarios are based on the parsimony principle and striking parallel with the S4 domain family, a unique accordingly are speculative inasmuch as it can never be fusionofthePUAdomainwithapredictedrRNAmeth- proved that evolution had followed the most parsimoni- ylaseofadistinctfamilythatincludeseukaryoticnucleo- ous scheme. Nevertheless, such scenarios seem to pro- lar proteins (e.g., P120) and their bacterial homologues vide the most likely explanation for the observed rela- (Koonin 1994) was observed in P. horikoshii (Fig. 4). tionships. Finally, the presence of a bona fide PUA domain in Obviously,ribosomalproteinS4hasbeenencodedby theglutamatekinasesfrommostbacteriaandfromyeasts thecenancestorgenome.Thisiscompatiblewiththeuni- 299 Fig. 5. The eIF1/SUI1 domain in translation factors and in PUA teins containing a C-terminal SUI1 domain and a predicted metal- domain-containing proteins. The alignment construction and all the binding domain at the N terminus; 3, the ligatin-like family of PUA designationsareasinFigs.1and3.Group1,eukaryoticandarchaeal domain-containingproteins.Additionalspeciesnames:At,Arabidopsis eIF1/SUI1factors;group2,uncharacterizedfamilyofeukaryoticpro- thaliana;Zm,Zeamays. Fig.6. HypotheticalscenariosfortheevolutionofthetranslationmachinerycomponentscontainingtheS4andPUAdomains.Arrowsshow postulatedhorizontalgenetransfers;‘‘×2’’indicatesapostulatedduplication. versalroleofS4inthetranslationaccuracycenterofthe proteins containing the S4 domain (Fig. 2) and also the ribosome, suggesting that the interactions between S4 largestnumberofcsynthasesoftheRsuAandtheRluA and 16S RNA (Heilek et al. 1995) had already been families. Taken together with the presence of the S4 established in the cenancestor. In contrast, the related domain in members of both of these families, this sug- RsuA and the RluA families of c synthases (Koonin geststhatearlyintheevolutionofthebacteria,theribo- 1996) and the small, stand-alone versions of the S4 do- somal S4 domain has been exapted for the related func- mainarelargelyrestrictedtoBacteriaandeukaryotes,to tionofrRNArecognitionbythecommonancestorofthe the exclusion of the Archaea. The bacteria show the RsuA and RluA families of c synthases (Fig. 6a). A greatestdiversityintermsofdomainorganizationofthe similar selective pressure for the recognition of specific 300 structuresinRNAappearstohavefavoreditsfusionwith the S4 and PUA domains, to identify novel domain ar- the predicted RNA methylases and with tyrosyl-tRNA chitectures in a variety of proteins associated with the synthetase. Given the involvement of S4 in translational translation machinery and to predict new translation regulationinbacteria,itislikelythatcertainversionsof regulators and RNA modification enzymes. We further this domain have been were recruited for solely regula- demonstrated that SUI1, previously identified only as a tory functions as proposed above. The bacterial S4 do- stand-alone translation factor, is yet another mobile main-containing c synthases appear to have entered the RNA-binding domain. eukaryotesfromtheirendosymbionts.Specifically,sofar The translation system is highly conserved in evolu- only RluA family members have been detected in fungi tion and is generally considered to be the paragon of and animals, whereas plants encode also at least one evolutionarystability(Dennis1997;Tatusovetal.1997; RsuAfamilyenzyme,whichsuggestsgenetransferfrom KyrpidesandWoese1998).Recentphylogeneticstudies, the mitochondrial genome and from the chloroplast ge- however, have revealed considerable complexity in the nome, respectively (Fig. 6a). Some of these proteins evolutionofaminoacyl-tRNAsynthetases,apparentlyin- clearly have undergone further evolution in eukaryotes cluding multiple events of horizontal gene transfer and asexemplifiedbythefusionofenzymaticdomainsinthe lineage-specific gene loss (Ibba et al. 1997a, b; Koonin putative editing enzyme Rib2 and the acquisition by eu- and Aravind 1998). These studies are complemented by karyotic c synthases of an additional predicted metal- complex evolutionary scenarios for RNA-binding do- bindingdomainCterminalofthecatalyticdomain(Figs. mains suggested by the present analysis. 2 and 6a). In the same vein, the S4 domain-containing methylase of Archaeoglobus fulgidus most likely has Note Added at Proof been acquired via horizontal transfer from bacteria. The fusion of the S4 domain to the tyrosyl-tRNA synthetase While this manuscript was being processed for publica- appearstohavebeenanancientevent,asitpredatesthe tion, the NMR and crystal structures of the ribosomal separationofallknownbacteriallines;interestingly,itis protein S4 from Bacillus stearothermophilus have been paralleled by a similar fusion of an unrelated RNA- published [Markus MA, Gerstner RB, Draper DE, Tor- binding domain of the OB fold class to animal tyrosyl- chia DA (1998)]. The solution structure of ribosomal tRNA synthetases (L.A., A.G. Murzin, and E.V.K., un- protein S4 d41 reveals two subdomains and a positively published observations). charged surface that may interact with RNA [EMBO J Similarly to the evolutionary history of the S4 do- 17:4559–4571; Davies C, Gerstner RB, Draper DE, Ra- main-containing proteins, the evolutionary scenario of makrishnan V, White SW (1998)]. The crystal structure the PUA-containing proteins should have included sev- of ribosomal protein S4 reveals a two-domain molecule eral gene fusion events as well as lateral gene transfers with an extensive RNA-binding surface: one domain (Fig. 6b). It appears most likely that the cenancestor shows structural homology to the ETS DNA-binding genome encoded a stand-alone version of the PUA do- motif [EMBO J 17:545–558]. The N-terminal boundary main,whereastwooftheextantPUA-containingprotein oftheconservedS4domainshowninourFig.1precisely families,namely,theTruBfamilyofcsynthasesandthe corresponds to the beginning of domain 2 in both struc- TGT family, were also represented, but lacked the PUA tures.TheC-terminalboundaryoftheconserveddomain, domain. The formative events in the evolution of the however, maps between a6 and b4 in domain which PUA-containing proteins should have been the early fu- showsthattheC-terminalb-hairpinofthisdomainisnot sion to the TruB domain in the common ancestor of required for the formation of a compact and functional Archaea and Eukarya and a parallel, independent fusion structure. The conserved domain consists of 3 a-helices toglutamatekinaseinanancestralbacterium.Alongthe (a4,a5anda6)and3b-strands(b1,b2andb3).Allthe archaean lineage, a variety of further fusions has oc- secondary structure elements were identified correctly curred, resulting in the transfer of the PUA domain to and almost precisely by the alignment-based prediction archaea-specificenzymesatdifferentpointsinevolution. (Fig.1),withtheexceptionofb2thatconsistsofonly2 Similarly, in the eukaryotic lineage, there has been a residues and was not predicted. Markus and co-workers unique fusion with the SUI1 domain, producing the li- notice the sequence similarity between domain 2, the gatin-relatedfamilyofputativenoveltranslationfactors. C-terminal region of bacterial tyrosyl-tRNA synthetases Finally,insomebacteriallineages,e.g.,thespirochaetes, andtheE.coliYrfHprotein.Additionally,however,they the PUA domain has been deleted from the glutamate claim a similarity to a region of viral reverse transcrip- kinase. tases that could not be corroborated by our analysis and is, in our opinion, spurious. Conclusions References Computer-aidedanalysesofcompleteproteomesenabled ustoidentifytwopreviouslyundetectedbuthighlycon- AltschulSF,MaddenTL,SchafferAA,ZhangJ,ZhangZ,MillerW, servedfamiliesoflikelyRNA-bindingdomains,namely, LipmanDJ(1997)GappedBLASTandPSI-BLAST:Anewgen-

Description:
main, designated PUA domain, after PseudoUridine syn- thase and Archaeosine transglycosylase components containing the S4, PUA, and SUI1 domains must have included several events of Naranda T, MacMillan E, Donahue T, Hershey J (1996) SUI1/p16 is required for the activity of eukaryotic
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.