doi:10.1006/jmbi.2001.4518, available online at http://www.idealibrary.com
J. Mol. Biol. (2001) 307, 755–769

Expanding the Genetic Code: Selection of Efficient Suppressors of Four-base Codons and Identification of “Shifty” Four-base Codons with a Library Approach in Escherichia coli

Thomas J. Magliery², J. Christopher Anderson¹ and Peter G. Schultz¹*

¹Department of Chemistry, The Scripps Research Institute, 10550 N. Torrey Pines Rd., La Jolla, CA 92037, USA
²Department of Chemistry, University of California, Berkeley, CA 94720, USA

Naturally occurring tRNA mutants are known that suppress +1 frameshift mutations by means of an extended anticodon loop, and a few have been used in protein mutagenesis. In an effort to expand the number of possible ways to uniquely and efficiently encode unnatural amino acids, we have devised a general strategy to select tRNAs with the ability to suppress four-base codons from a library of tRNAs with randomized 8 or 9 nt anticodon loops. Our selectants included both known and novel suppressible four-base codons and resulted in a set of very efficient, non-cross-reactive tRNA/four-base codon pairs for AGGA, UAGA, CCCU and CUAG. The most efficient four-base codon suppressors had Watson-Crick complementary anticodons, and the sequences of the anticodon loops outside of the anticodons varied with the anticodon. Additionally, four-base codon reporter libraries were used to identify “shifty” sites at which +1 frameshifting is most favorable in the absence of suppressor tRNAs in Escherichia coli. We intend to use these tRNAs to explore the limits of unnatural polypeptide biosynthesis, both in vitro and eventually in vivo. In addition, this selection strategy is being extended to identify novel five- and six-base codon suppressors.
© 2001 Academic Press

*Corresponding author

Keywords: four-base codon; suppressors; genetic code; shifty codons

Present address: T. J. Magliery, Department of Chemistry, The Scripps Research Institute, 10550 N. Torrey Pines Road, La Jolla, CA 92037, USA.
T.J.M. and J.C.A. contributed equally to this work.
E-mail address of the corresponding author: [email protected]
0022-2836/01/030755–15 $35.00/0

Introduction

Pioneering work by Crick, Brenner and co-workers in the early 1960s established the triplet nature of the genetic code (Crick et al., 1961), and 40 years later only rare exceptions are known to the universal correspondence of mRNA codon to amino acid (Fox, 1987). However, it has also become clear that frame maintenance during translation is not absolute (Atkins et al., 1991; Kurland, 1992; Parker, 1989). In addition to the inherent accuracy limits of the ribosome in maintaining frame, shifts can be promoted by mutant tRNAs and can occur with high frequency at “programmed” sites in mRNA. Frameshifts can be stimulated by such elements as upstream rRNA-binding motifs, stem-loop structures and underused codons. Indeed, these programmed shifts can be critical to protein production, e.g. the gag-pol polyprotein of Rous Sarcoma virus, the Ty1 and Ty3 elements of yeast, the dnaX gene of DNA Pol III tau and gamma units, and the self-regulating RF2 protein in Escherichia coli (Gesteland et al., 1992).

As early as 1968 it was recognized that frameshift mutations could be externally suppressed (Riyasaty & Atkins, 1968). Shortly thereafter, Yourno & Tanemura (1970) showed that protein from a suppressed frameshift was of wild-type sequence, and Riddle & Carbon (1973) showed that the external frameshift suppressor sufD in Salmonella is tRNA^Gly with a four-base anticodon CCCC instead of the wild-type CCC. (Note that we adopt the convention herein of writing both codon and anticodon sequences in the 5′ to 3′ direction.) These experiments conclusively demonstrated that certain tRNAs with extended anticodons can read non-triplet codons. Most of the known four-base codon suppressors have subsequently been isolated from Salmonella or yeast and are largely single-base insertion mutants of tRNA^Pro, tRNA^Gly or tRNA^Lys reading CCCX, GGGX or AAAX, where it is not always known which X can be read. The known four-base codon suppressors tend to act with efficiencies from very poor to approaching 20% (Atkins et al., 1991).

Less well known are +1 suppressors with more extended anticodon loops. For example, the Salmonella sufT621 derivative of tRNA^Arg causes insertion of arginine in response to CCGU. A synthetic mutant of this tRNA, which bears a spontaneous mutation in the DHU loop of unknown importance, has a nine-base anticodon loop that causes +1 frameshifting (Tuohy et al., 1992). The only other known tRNAs with nine-base anticodon loops mediating +1 frameshifting are the related tRNA^Pro-derived yeast suppressors SUF7, SUF8, SUF9 and SUF11, which lack a 31:39 anticodon stem pair (Atkins et al., 1991).

The specificity of pairing is not entirely predictable for extended-anticodon tRNAs. For example, when the suppression of GGGN codons by tRNAs bearing NCCC anticodons (SUF16 derivatives) was analyzed, it was found that the only non-functional N:N pair was G:G. This result indicates that Watson-Crick pairing at the fourth codon nucleotide is even less strictly enforced than at the third “wobble” position for normal codons. However, the identity of the N:N pair does affect suppression efficiency, and canonical pairings in the fourth position gave the strongest suppression (Gaber & Culbertson, 1984). Additionally, it has been found that sufJ, a modified tRNA^Thr with anticodon UGGU, is capable of decoding ACCC and ACCU in addition to ACCA (ACCG was not tested; Bossi & Roth, 1981).

In an engineered system, Curran & Yarus (1987) tested the ability of derivatives of the glutamine-inserting amber suppressor Su7 with NCUA anticodons to suppress UAGN codons. While the canonical pairs tended to suppress well, some non-canonical pairs were also efficient. For example, the best pair arose from a CCUA anticodon and a UAGA codon; also, the GCUA tRNA preferred UAGA and UAGG to its canonical UAGC. The ability to read a four-base codon versus a three-base codon for these extended-anticodon tRNAs was also assessed, by using tRNAs that decode four-base codons whose first three bases encode a stop codon. In this case there is no competitive translation with a canonical tRNA. In general, four-base decoding predominated when a fourth Watson-Crick pair was possible and was less likely to predominate otherwise. The authors proposed that this is a consequence of mRNA-induced stabilization of a tRNA conformation that favors four-base decoding, possible only when a fourth canonical base-pair is present (Curran & Yarus, 1987).

In contrast, Qian et al. (1998) have proposed that, at least in some cases, extended-anticodon tRNAs merely render the tRNA less capable of three-base decoding and in the process make room for other tRNAs to promiscuously interact with the codon by a slippage mechanism. For example in Salmonella, mutant tRNA^Pro with a GGGG anticodon is methylated in such a way (underlined G37) that pairing with the mRNA cytosine is obstructed. Apparently, the tRNA^Pro with cmo⁵UGG anticodon, which typically reads CCA, CCG and CCU, can read CCC in the absence of its normal cognate tRNA_GGG, but favors slippage since there are only two canonical base-pairs.

Four-base codon suppression has begun to be used for in vitro protein engineering experiments, initially by Ma et al., to insert alanine at UAGG and AGGU with complementary anticodons engineered into tRNA^Ala (Kramer et al., 1998; Ma et al., 1993). Sisido’s group has used extended-anticodon tRNAs to insert unnatural amino acids into proteins in vitro, employing known suppressible four-base codons such as AGGU and CGGG (Hohsaka et al., 1996; Hohsaka et al., 1999b; Murakami et al., 1998). Significantly, Hohsaka et al. (1999a) used both of these codons together in a single transcript to insert two different unnatural amino acid residues into the same protein site-specifically.

In an effort to develop improved tRNA/four-base codon pairs for protein mutagenesis, Moore et al. examined the ability of Su6 (amber tRNA^Leu) derivatives with NCUA anticodons to suppress UAGN codons and also monitored decoding in the −1 and 0 frames. In an RF1-deficient E. coli strain, UCUA suppressed the UAGA codon with up to 26% efficiency with little decoding in the 0 or −1 frames (Moore et al., 2000). This pair was also fairly efficient in the Curran & Yarus study, and differences between the results of independent experiments are likely due, in part, to the extent of aminoacylation and maturation with different tRNA scaffolds. It is clear that decoding four-base codons with complementary anticodons is generally favorable, but the rules that govern the comparative suppression efficiencies for different four-base codons remain unclear.

Likewise, it is not exhaustively known what governs the likelihood of frameshift in the absence of suppressor tRNAs. Atkins and co-workers established the generality of “leakiness” in frame maintenance by measuring the weak activity from insertion and deletion mutants of the gene for β-galactosidase induced by the acridine derivative ICR-191D. Sixteen mutants varied in activity by 100-fold, with the strongest 0.06% of wild-type (Atkins et al., 1972). This, and studies of immunoprecipitated truncation products of β-galactosidase (Manley, 1978), place the average frequency of frameshift in the range of 5 × 10⁻⁴ per translocation event. However, factors such as ribosomal mutation (rpsL and ram), streptomycin treatment, amino acid starvation and “hungry” codons greatly increase the frequency of frame errors (Atkins et al., 1972; Weiss & Gallant, 1983; Weiss et al., 1988a). For example, in E. coli, a tandem repeat of AGG, a rare arginine codon read by a minor tRNA, results in +1 frameshift up to 50% of the time (Spanjaard & van Duin, 1988).

The best-known (+1 and −1) frameshifting events, both programmed and non-programmed, occur because of single-base repeats constituting a “slippery run” of mRNA. For example, the +1 frameshift that avoids the UGA stop codon at the programmed site in RF2 occurs at the sequence CUUU, mediated by the CUU-decoding tRNA^Leu (Weiss et al., 1987). While it is easy to understand how a long repeat is a frame-maintenance conundrum for the ribosome, it is equally striking that high levels of shifting are observed only in the presence of other “stimulatory” elements (Atkins et al., 1990), such as a Shine-Dalgarno-like sequence upstream and a UGA stop codon downstream of the RF2 shift site (Weiss et al., 1987, 1988b); an RNA-pseudoknot downstream of the shift site for the F2 protein of coronavirus IBV (Brierley et al., 1989); and the rare AGG codon adjacent downstream to the CUU slip site in yeast Ty elements (Belcourt & Farabaugh, 1990).

We were interested in employing a library approach to examine all possible four-base codons for their propensity to shift and, moreover, to exhaustively identify tRNAs that will efficiently suppress four-base codons. To this end, we have installed the sequence NNNN into various codons of the gene for β-lactamase and selected with ampicillin in the presence and absence of derivatives of tRNA^Ser_2 with randomized, extended (eight or nine nucleotide) anticodon loops. We have successfully identified at least four non-interacting sets of efficient four-base codon-tRNA pairs, and we have defined a strikingly simple pattern that underlies “shiftiness” in four-base codons. These pairs will be used in our ongoing efforts to engineer bacteria capable of inserting unnatural amino acid residues into proteins.

Results

Library construction and characterization

Since the basis of four-base codon suppression is poorly understood, we adopted a combinatorial approach to examine the suppression of all possible four-base codons by all possible appropriately sized tRNA anticodon loops. Two types of libraries were generated: four-base codon reporters and tRNA suppressors (Figure 1). The reporter system involved replacement of a conserved (S70) or permissive (S124) codon in the gene for β-lactamase with four random nucleotides. These libraries, which each contain 4⁴ = 256 unique members, bear an insertion mutation at the site of interest, producing an inactive enzyme as a result of the frame alteration that renders transformed bacteria incapable of survival on ampicillin. To our surprise, however, approximately 0.5% of the members of these reporter libraries survived on moderate to high concentrations of ampicillin (100-2000 µg ml⁻¹). Invariably, sequencing of these clones showed that the β-lactamase genes contained only three nucleotides at the sites of interest, corresponding to Ser and Cys at the conserved S70 site and corresponding to all amino acids except Cys, Met, Pro, Trp and Tyr at the permissive S124 site (although this is probably largely due to undersampling of three-base sequence space, since the amount of contamination was small). Presumably, these “deletion” mutants, which occur at rates much higher than the 10⁻⁶ upper-limit estimated for deletion in vivo (Cupples et al., 1990), are the result of inefficiency in the coupling and capping reactions employed during synthesis of the oligonucleotides used to generate the libraries.

To remove this contamination, cells transformed with the reporter libraries were grown in small batches and analyzed on agar plates for moderate levels of ampicillin resistance (100 µg ml⁻¹). By pooling the cultures that contained no ampicillin resistance at this level, uncontaminated, oversampled reporter libraries were obtained (about 5000 clones for the S124(N4) library and 10,000 clones for the S70(N4) library). Sequencing of 48-96 clones from each reporter library showed that the libraries were random at each nucleotide and sufficiently unbiased that unique four-base codons were pulled out only once or twice among these clones.

Because the β-lactamase reporter libraries were constructed from pBR322-derived vectors containing ColE1 origins, tRNA suppressor libraries were constructed with pACYC184-derived vectors with p15a origins to permit co-maintenance within a single bacterium. These libraries consisted of derivatives of tRNA^Ser_2 with the anticodon loop (7 nt) replaced with eight or nine random nucleotides, transcribed under the control of the strong lpp promoter and efficient rrnC terminator. This tRNA was chosen because seryl-tRNA synthetase (SerRS) does not recognize the anticodon loop of tRNA^Ser. Major recognition elements are, instead, in the long variable arm, acceptor stem, and the D and TΨC loops, where the SerRS contacts the tRNA (Biou et al., 1994; Price et al., 1993). The choice of a tRNA scaffold and cognate synthetase that is permissive to anticodon loop structure ensures that suppression efficiency is related to codon-dependent effects rather than the effects of variable aminoacylation with different anticodon sequences. For each of these libraries, 24 clones were sequenced and found to be unique and unbiased.

Selected four-base codons at S124 and S70

The libraries were crossed by the transformation of the appropriate tRNA library into competent cells of the reporter strains, followed by selection on media containing various concentrations of ampicillin. Survival on higher concentrations of ampicillin requires that more β-lactamase be present in the cell, which is related to the extent to which the tRNA in the selectant is able to mediate frameshift at the site of mutation in the reporter. Thus, survival at the highest levels of ampicillin indicates the presence of tRNAs that read the co-transformed four-base codons most efficiently.

Figure 1. Schematic of β-lactamase libraries and tRNA^Ser libraries. Insertion of NNNN at Ser70 or Ser124 results in abortive translation unless the frameshift is suppressed.

The library of 8 nt anticodon loop tRNAs was first crossed with the S124 and S70 reporter libraries, choosing two sites to examine the effects of mRNA context. In addition to different general contexts in the gene, these sites also have different local contexts with different nucleotides 5′ and 3′ to the four-base codon (Figure 1). At the permissive S124 site, four-base codons selected at moderate ampicillin concentrations (200-300 µg ml⁻¹) included a subset of AGGN, CCCN, CGGN, GGGN and UAGN, for which four-base codon suppressors are known, as well as CUAG, UCAA, CCUU, CUCU, GGAC and UGCG (Table 1). Only one of these, AGGA, could be suppressed at 1000 µg ml⁻¹ ampicillin. At the S70 site, however, AGGG, CUCU and UAGA were also suppressed at 1000 µg ml⁻¹, and more codons were represented at the 300 µg ml⁻¹ level than for the S124 site. Suppression of the S70 site, which is the catalytic Ser of TEM-1 β-lactamase and requires Ser or Cys for activity, confirms that these four-base codons are suppressed by our modified seryl-tRNA.

Table 1. Selected four-base codons at S124 and S70 at high and moderate ampicillin concentrations

S124, 1000 µg ml⁻¹: AGGA
S124, 300 µg ml⁻¹: AGGA, AGGG, AGGC, CCCU, UAGA, CGGG, UCAA
S70, 1000 µg ml⁻¹: AGGA, AGGG, CUCU, UAGA
S70, 300 µg ml⁻¹: AGGA, AGGC, AGGG, AGGU, CCAU, CCCC, CCCU, CGGC, CUAC, CUAG, CUCU, UAGA

Since there was no apparent pattern to the sequences of the suppressed four-base codons, we devised an experiment to examine whether or not any four-base codon could be suppressed, at least at low levels. Six clones from the S70(N4) reporter library were selected at random and crossed against the tRNA^Ser(N8) library. From these clones (Table 2), tRNAs that efficiently suppress UUCU, GGAU and CGGA to at least 200 µg ml⁻¹ ampicillin could be found. AAUG and ACGC suppressors were generally much weaker, between 5 and 10 µg ml⁻¹ ampicillin. A suppressor for UGAA, based on the UGA stop codon, could not be found, however, even at 5 µg ml⁻¹ ampicillin.

Table 2. Selection of tRNA suppressors of randomly chosen codons

Codon   Maximum suppression level   tRNA sequences
UGAA    >5                          None detected
AAUG    5-10                        CUCGUUAC, CUCGUUAA
ACGC    5-10                        CUGCGUAU, CUACGUAU, GGACGUAA
UUCU    >200                        UUAGAAAG, AUAGAAAG, ACGGAAAC, UUAGAAAC, CUAGAAAC, ACAGAAAC, CGGGAAAA
GGAU    >200                        GUGUCCAU, CAGUCCAU, UGAUCCAU, CCAUCCAG, UAGUCCAC, UCAUCCAC, CCAUCCAC, UAAUCCAC, AAGUCCAA
CGGA    >200                        CGGCCCGG, UUUCCGUA, CCUCCGUA, GUUCCGGC, CUUCCGCU, ACUCCGCG, GUUCCGGC, ACUCCGUA, AAUCCGCU, ACUCCGUA, ACUCCGUU

Suppression levels are in ampicillin concentration, µg ml⁻¹.

The procedure for removal of the three-base contamination at the catalytic S70 site cannot remove those clones that contain a random three-base codon that corresponds to a missense mutation (i.e. anything other than Ser or Cys). As a result, upon crossing this library with the tRNA library, a number of clones were isolated that contained missense three-base codons that were presumably being suppressed by the Ser-inserting tRNA library member. Interestingly, the most efficiently suppressed, AGG and CGG at 1000 µg ml⁻¹, correspond to the first three nucleotides of some of the most efficiently suppressed four-base codons (Table 3). It was not determined whether the tRNA suppressors contain 8 nt anticodon loops or 7 nt (i.e. normal) loops as a result of the same type of “deletion” seen in the reporter libraries.

Table 3. Codons for which Ser missense suppressors were selected at S70

[Amp]   Codon   Acid
1000    AGG     Arg
        CGC     Arg
300     ACA     Thr
        AGG     Arg
        AUA     Ile
        CCA     Pro
        CCU     Pro
        CGA     Arg
        CGG     Arg
        UUG     Leu
100     ACG     Thr
        AUA     Ile
        CCU     Pro
        CGA     Arg
        CGG     Arg
        CUC     Leu
        GCC     Ala
        GGC     Gly
        GUG     Val
        UUA     Tyr

Ampicillin concentrations are in µg ml⁻¹.

Sequence variation of tRNA suppressors of four-base codons

In order to examine the diversity of tRNA anticodon loops that could suppress each selected codon, the reporter plasmid was isolated and the tRNA^Ser(N8) library was re-crossed against individual reporter sequences at S124. These were then selected at a variety of levels of ampicillin for each reporter, and the tRNA sequences were examined. In each case, suppression at the highest levels was mediated by a tRNA bearing the complementary anticodon (Table 4). Interestingly, the sequences external to the anticodon (i.e. the two nucleotides on either side of the anticodon) converged on different sequences depending upon the anticodon.

Table 4. Selectants from re-cross of individual four-base codons against tRNA^Ser(N8)

Ampicillin survival levels are in µg ml⁻¹ for codons at S124. ND = not determined.
a Even at the highest levels of ampicillin selection, three tRNAs were found for the AGGA anticodon: A...A, C...A, C...C.
b Determined at S70.

To examine the effect of sequence variation in the anticodon loop outside of the anticodon per se, tRNAs from the tRNA^Ser(N8) library were selected at different levels of ampicillin (20 µg ml⁻¹ to 1500 µg ml⁻¹) against a single reporter codon at S124 (Table 5). For this experiment, we selected the most efficiently suppressed codon (AGGA) to examine the widest array of suppression efficiency. At 1500 µg ml⁻¹ ampicillin, the sequences converged on MUUCCUAM, where M = A or C. At 300 µg ml⁻¹, the convergence was reduced to NBUCCURN, where B = C or G or U and R = A or G. At 20 µg ml⁻¹, the consensus was reduced to NNNCCUDN, where D = A or G or U, with loss of canonical recognition of the fourth base of the codon but maintenance of the other three bases. This allows us to rank the different anticodon positions in importance for suppression efficiency: Watson-Crick complementarity to the first three bases of the codon is most critical, followed by complementarity at the fourth base, the presence of a purine at position 37 (3′ to the anticodon) and the presence of a U at position 33 (5′ to the anticodon).

Table 5. Selected eight-nt anticodon loop tRNAs to suppress AGGA at moderate and low ampicillin levels

300 µg ml⁻¹:
ACUCCUAA, ACUCCUAC, AGUCCUAC, AGUCCUGC, AUUCCUAG, CCUCCUAA, CCUCCUAC, CGUCCUAU, CUUCCUAA,
CUUCCUAC, GGUCCUAU, GUUCCUAA, UCUCCUAA, UCUCCUAC, UUUCCUAC, UUUCCUAG, UUUCCUAU

20 µg ml⁻¹:
AAUCCUAG, AAUCCUAU, ACUCCUAU, ACGCCUAU, ACUCCUAU, AGGCCUGG, AGUCCUAC, AGUCCUUG, AUUCCUAG, CAACCUAA, CAUCCUAA,
CAUCCUAU, CGGCCUAU, CGUCCUGU, CGUCCUUU, CUACCUAU, CUUCCUAA, GGUCCUGA, GUUCCUAA, GUUCCUAU, GUUCCUGU, UAUCCUAA,
UAUCCUGU, UGUCCUAG, UGUCCUAU, UGUCCUGC, UUUCCUAU, UUACCUAA, UUUCCUAG, UUUCCUAU, UUUCCUGU

Sequences were found one to six times each. Only the anticodon loop sequence is shown.

Analogously, to examine how efficiently AGGA could be suppressed by 9 nt anticodon loop tRNAs, the AGGA reporter at S124 was crossed against tRNA^Ser(N9) and selected at 20 and 300 µg ml⁻¹ ampicillin. At 300 µg ml⁻¹ ampicillin, the consensus sequence was BNKUCCUAH, where K = G or U and H = A or C or U (Table 6). However, at 20 µg ml⁻¹, the consensus sequence degenerated to NNNUCCUAN. In a single case out of 38 clones, CUUCCUAUU, the extra nucleotide in the loop was on the 3′ side of the anticodon (UCCU). At both levels of ampicillin selection, tRNAs with 8 nt anticodon loops were among the selectants, presumably as a result of “deletion” from inefficient capping during oligonucleotide synthesis seen in the reporter libraries. These deletions dominate the selection at higher ampicillin levels, indicating that suppression of four-base codons (or at least AGGA) is more robust with tRNAs bearing 8 nt anticodon loops rather than 9 nt loops, overall.

Table 6. Selected nine-nt anticodon loop tRNAs to suppress AGGA at moderate and low ampicillin levels

300 µg ml⁻¹:
CAUUCCUAU, CCUUCCUAU, CUGUCCUAA, GCUUCCUAU, GGUUCCUAU, UUGUCCUAC

20 µg ml⁻¹:
AAUUCCUAC, ACGUCCUAC, ACUUCCUAC, ACUUCCUAU, AGUUCCUAC, AUUUCCUAA, AUUUCCUAC, CACUCCUAC, CCAUCCUAU,
CCUUCCUAA, CCUUCCUAU, CUGUCCUAC, CUGUCCUAU, CUUUCCUAA, CUUUCCUAU, GACUCCUAC, GAGUCCUAC, GAUUCCUAC,
GAUUCCUAU, GCUUCCUAU, UCGUCCUAG, UCUUCCUAA, UCUUCCUAG, UUGUCCUAC, UUUUCCUAC, UUUUCCUAU, CUUCCUAUU

Sequences were found one to four times each. Only the anticodon loop sequence is shown.

Specificity of tRNA suppressors of four-base codons

The most robust suppressors of each of the most favorable codons were isolated and re-crossed against the S70(N4) reporter library with selection at very low ampicillin (10 µg ml⁻¹). Here, S70 was selected, since the site is slightly easier to suppress, and we were interested in examining extremely low levels of suppression, if they existed. Selectants were then transferred to increasingly higher concentrations of ampicillin and sequenced to identify the extent to which each tRNA is capable of suppressing other four-base codons. In each case, suppression of the canonical four-base codon was highly favored, with suppression of related codons with non-canonical pairing at the fourth codon position possible at fivefold lower ampicillin at best (Table 7). The tRNA that efficiently reads AGGA (1500 µg ml⁻¹ ampicillin), for example, reads AGGU to only 100 µg ml⁻¹, and AGGG and AGGC to only 50 µg ml⁻¹. In contrast, suppression of no other codon could be detected for the tRNA that suppresses CUAG.

Table 7. Codons selected by most efficient four-base codon suppressors

Sequences for the anticodon loop are shown. Codon in parentheses is that which elicited the tRNA in selection against a single reporter (Table 3). Suppression levels are indicated for each tRNA/four-base codon pair in µg ml⁻¹ ampicillin.

Efficiency of frameshift mediated by four-base codon suppressors

The level of readthrough of the four-base codons and the efficiency of suppression of these codons were quantified by determining the amount of β-lactamase per cell with the chromogenic substrate nitrocefin. Convenient arbitrary units were defined to normalize the activity for the number of cells assayed and to subtract out the effects of the slow background hydrolysis of nitrocefin (see Materials and Methods). Spontaneous frameshifting at the four-base codons at S124 was no more than twice the background hydrolysis by TOP10 E. coli (0.8-1.7 units versus 0.9 unit), which was comparable to readthrough of the amber stop codon (1.2 units, Figure 2). Suppression of the four-base codons was in the range of 17 to 240 units, or 2.5-35% of wild-type (AGT or β-lactamase on pBR322). However, in this assay, the very efficient supD suppressor of the amber stop codon exhibited virtually the same β-lactamase activity as observed in cells with AGT (Ser) at the S124 site. This may indicate that it is possible to saturate the amount of β-lactamase that is exported to the periplasm in this system. Therefore, while the activities allow us to compare frameshift efficiencies, we cannot strictly correlate these activity data with suppression efficiencies as compared to translation of wild-type β-lactamase.

Figure 2. Suppression efficiency of four-base codon/tRNA pairs measured by nitrocefin turnover. See Materials and Methods for definition of units. Readthrough of UAG, AGGA, CCCU, CUAG and UAGA was 0.8-1.7 units, while TOP10 cells alone exhibit about 0.9 unit. Suppression of four-base codons resulted in 17-240 units of activity, compared to 970 for suppression of UAG by supD. Results are the average of three to five independent trials per data point.

Identification of “shifty” codons

It was found empirically during the process of removing three-base contamination from the libraries that no four-base codon was read through by the natural translational machinery at levels sufficient to permit survival at 100 µg ml⁻¹ ampicillin. However, at lower levels of ampicillin, some four-base codons can be read at low levels, or are inherently shifty (Table 8). Strikingly, each of these four-base codons has a tandem repeat in the central two nucleotides (a purine repeat in one case). It is also noteworthy that some of the best suppressed codons (e.g. AGGC, CGGN and CGAU) but not all of them (e.g. AGGA) are among this shifty set. For some codons (e.g. AGGN, AAAN, and GGGN), only a subset of the family is present, indicating that slipping at a particular three-base codon is strongly influenced by the base that follows. Some of these, such as AGGA, are probably only slightly less shifty (5 µg ml⁻¹), but it becomes difficult to examine libraries at this low level of selection due to background survival of satellites on ampicillin.

Table 8. “Shifty” four-base codons detected at the S124 site

25 µg ml⁻¹: AAAA, AGGC, CGGC, CUUA, CUUU, GGGA, UUUC, UUUU
10 µg ml⁻¹: AAAA, AGGC, CGAU, CGGA, CGGC, CGGG, CGGU, CUUA, CUUC, CUUU, GGGA, UUUC, UUUU

Selection levels are given in ampicillin concentration.

Discussion

Suppression of four-base codons

When the first frameshift suppressors were isolated, it was thought that only repeating codons like GGGG could be suppressed. However, a number of exceptions, like ACCN and UAGN, were later found or engineered (Atkins et al., 1991). Among those efficient suppressors identified in this study, it is clear that there is no sequence pattern that generally underlies efficient four-base suppression. For example, CUAG is completely non-repetitive, and GGGG, AAAA and UUUU are not among the best suppressors. However, the most efficiently suppressed four-base codons correspond to three-base codons (plus one 3′ nucleotide) with low usage and low tRNA abundance in E. coli (Table 9). For example, AGG is the least used codon in the E. coli genome (Inokuchi & Yamao, 1995). Nevertheless, the best suppressors do not simply correspond to the least represented tRNAs or codons, since we found no robust suppressors of highly underrepresented codons AGAN, AUAN or CGAN. Also, among randomly selected codons that vary in usage from overrepresented (AAUG and UUCU) to moderate representation (ACGC) to underrepresented (GGAU and CGGA), and that lack any evident sequence pattern, weak to moderate suppressors could be found (Table 9). Of course, the procedure employed here selects for growth on ampicillin and therefore requires both an efficient four-base codon/tRNA interaction and a lack of toxicity to the cell. We cannot rule out the possibility of efficient but toxic suppressors of highly represented four-base codons.

Table 9. Codon usage and tRNA abundance for codons suppressible with 8 nt anticodon loop tRNAs and “shifty” codons

Codon (acid)   Codon usage (0.1%)a   tRNA abundance (relative)b   S124 suppression with tRNA   S124 readthrough (without tRNA)
AGG (Arg)      0.9                   ND                           +                            +
CCA (Pro)      7.4                   ND                           +
CCC (Pro)      3.4                   ND                           +
CGG (Arg)      3.6                   Minor                        +                            +
CUA (Leu)      2.5                   Minor                        +
CUC (Leu)      9.3                   0.30                         +
UAG (amber)    ND                    N/A                          +
UCA (Ser)      5.6                   0.25                         +
AAA (Lys)      38.4                  1.00                                                      +
CGA (Arg)      2.6                   0.90                                                      +
CUU (Leu)      9.9                   0.30                                                      +
GGG (Gly)      8.6                   0.10                                                      +
UUU (Phe)      16.9                  0.35                                                      +
AAU (Asn)      14.8                  0.60                         weak
ACG (Thr)      11.4                  ND                           weak
GGA (Gly)      5.5                   0.20                         weak
UUC (Phe)      19.1                  0.35                         weak

a Usage in E. coli. If all sense codons were used equally, usage would be 16.4.
b Relative to tRNA^Leu_3 with codon usage 56.3.
ND, no data. N/A, not applicable. Minor, abundance too low for accurate measurement (Inokuchi & Yamao, 1995).

The identity of the fourth base has a profound effect on the ability of the four-base codon to be suppressed. For the AGGN family, the series appears to be AGGA > AGGG > AGGC ≈ AGGU for canonical suppressors. However, Hohsaka et al. (1996) saw little difference in the in vitro suppression levels among these four codons using a yeast tRNA^Phe scaffold, perhaps due to the extreme overexpression conditions they employed in contrast to the moderate, natural β-lactamase promoter employed here. Likewise, only UAGA of UAGN, and CGGC and CGGG of CGGN, are among the most strongly suppressed codons.

Some of the four-base codons that we identified were already known, including AGGN, used in protein engineering (Hohsaka et al., 1996; Ma et al., 1993), CCCU, which is SUF8 (Cummins et al., 1985) and UAGA (Curran & Yarus, 1987; Moore et al., 2000). However, CUAG is apparently a completely novel efficiently suppressed four-base codon, consistent with the fact that CTAG is underrepresented in the bacterial genome (Burge et al., 1992). These suppressors acted with efficiencies of 2.5-35% of the maximum activity detectable with a nitrocefin-based assay (AGU or supD/UAG), which is consistent with activities from the most efficient natural and engineered suppressors known. Also, fairly efficient suppressors for the four-base codons CUAC, CCAU, UCAA, CCUU and CUCU were found (Tables 1 and 4). Examination of the anticodon loop sequences shows that the best tRNAs for each codon have a canonical anticodon, but the rest of the loop does not necessarily converge on the sequence consensus observed in E. coli tRNAs. For example, the tRNA that suppresses CUAG has G33 and G37 instead of U33 and A37, and the tRNA that suppresses CCCU has a U37. (See Figure 1 for numbering of nucleotides in the 8 nt and 9 nt anticodon loops.) Presumably, these mutations adjust the loop conformation to allow maximally favorable interaction between the codon and anticodon, which is quite possibly different from the most favorable conformation for a three-base anticodon. These data support a model in which the extended-codon tRNA directly reads all four bases of the codon (as opposed to a slippage model), since (1) the most efficient suppressors invariably had Watson-Crick complementarity at all four bases of the codon, (2) the suppression at S70 required insertion of a serine or cysteine residue, demonstrating that the modified tRNA^Ser was delivering the amino acid, and (3) there is no special pattern underlying the suppressed codons, as might be expected if slippage were occurring.

Diversity among tRNAs that suppress four-base codons can also be assessed by comparing the tRNA sequences selected with a single reporter codon. This approach gives us some idea about the tolerance of the E. coli translational apparatus to variation in the tRNAs for suppression of four-base codons (presumably combined with tRNA maturation and aminoacylation differences, which are probably minimal here). Figure 3 shows the consensus sequences for tRNAs able to suppress AGGA at various levels of ampicillin compared with the known E. coli tRNAs. The best suppressors of this codon maintain the universal U33 and highly favored A37 seen in E. coli tRNAs. At the 300 µg ml⁻¹ ampicillin level, U33 or C33 and A37 are favored, but G is present in some clones at both sites. When much weaker suppressors were selected (20 µg ml⁻¹ ampicillin), no base is conserved at position 33 and, though A37 is favored, G37 and U37 are seen. This calls into question the necessity of the U33 to make the sharp turn seen in the X-ray crystal structure of tRNA^Phe (Holbrook et al., 1978; Sussman & Kim, 1976). Even the 33.5 position corresponding to AGGA is not entirely conserved in these weak suppressors, though U33.5 is still strongly favored. Some 5% of the clones selected at 300 µg ml⁻¹ and 19% at 20 µg ml⁻¹ have the possibility of an A:U or U:A pair at the 32:38 site, which would formally extend the anticodon stem by one pair and leave a 6 nt loop. Other work has suggested that it is unlikely that a 6 nt loop can present a four-base anticodon (Atkins et al., 1991), and five of the seven clones of this type retain complementary UCCU anticodons, perhaps suggesting why only the weaker A:U pair is seen at this site and not the C:G pair. Nevertheless, the possibility of 6 nt anticodon loop tRNAs decoding four-base codons cannot be ruled out.

Figure 3. Nucleotide representation at anticodon loop sites in E. coli tRNAs and moderate and weak tRNA suppressors of AGGA with 8 nt and 9 nt anticodon loops. Suppression is at S124 and listed in ampicillin concentration of µg ml⁻¹.

When this selection scheme was applied to 9 nt anticodon loop tRNAs that suppress AGGA (Figure 3), a UCCU anticodon was found in

32:38 pairing with the favored U38. At 300 µg ml⁻¹ ampicillin, three of the nine clones have the possibility of a weak G32:U38 wobble pair (resulting in a 7 nt anticodon loop and extended anticodon stem), but none can canonically pair. However, among the weaker suppressors, two of 37 clones can have a U:A or A:U pair and five can have a G:C pair.
Particularly in the latter each tRNA and, in every clone but one, the case, it is possible that the 7 nt anticodon loop extra nucleotide is added 50 to the anticodon (as with an extended anticodon stem is the active was always the case with the 8 nt anticodon conformation suppressing the four-base codon. loop tRNAs). At 300 mg ml(cid:255)1 ampicillin, U38 is Indeed, recent work by Auffinger & Westhof favored, while either pyrimidine is favored at (1999) suggests structural conservation at the 20 mg ml(cid:255)1. This differs both from the E. coli 32:38 position in all tRNAs and excludes G:C preference for A38 and lack of preference in 8 pairs, perhaps lending credence to the view that nt tRNAs. U33 is strongly preferred at both this pair is part of the stem, not the loop, in ampicillin concentrations, much more so than in these cases. the 8 nt anticodon loops. Also, while there is In addition to exhibiting high suppression effi- degeneracy at the 32 position, the stronger sup- ciency, any useful four-base codon suppressor pressors do not have A32, possibly to avoid must also not cross-react with other codons (three