Manfred T. Reetz

Directed Evolution of Selective Enzymes
Catalysts for Organic Chemistry and Biotechnology

Author
Manfred T. Reetz
MPI für Kohlenforschung
Kaiser-Wilhelm-Platz 1
45470 Mülheim
Germany
and
Philipps-Universität Marburg
Fachbereich Chemie
Hans-Meerwein-Straße 4
35032 Marburg
Germany

Cover
Enzyme structure: http://dx.doi.org/10.2210/pdb3g02/pdb

All books published by Wiley-VCH are carefully produced. Nevertheless, authors, editors, and publisher do not warrant the information contained in these books, including this book, to be free of errors. Readers are advised to keep in mind that statements, data, illustrations, procedural details or other items may inadvertently be inaccurate.

Library of Congress Card No.: applied for

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at <http://dnb.d-nb.de>.

© 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Boschstr. 12, 69469 Weinheim, Germany

All rights reserved (including those of translation into other languages). No part of this book may be reproduced in any form, by photoprinting, microfilm, or any other means, nor transmitted or translated into a machine language without written permission from the publishers. Registered names, trademarks, etc. used in this book, even when not specifically marked as such, are not to be considered unprotected by law.
Print ISBN: 978-3-527-31660-1
ePDF ISBN: 978-3-527-65549-6
ePub ISBN: 978-3-527-65548-9
Mobi ISBN: 978-3-527-65547-2
oBook ISBN: 978-3-527-65546-5

Cover Design: Schulz Grafik-Design, Fußgönheim, Germany
Typesetting: SPi Global, Chennai, India
Printing and Binding: Printed on acid-free paper

Contents

Preface
1 Introduction to Directed Evolution
1.1 General Definition and Purpose of Directed Evolution of Enzymes
1.2 Brief Account of the History of Directed Evolution
1.3 Applications of Directed Evolution of Enzymes
References
2 Selection versus Screening in Directed Evolution
2.1 Selection Systems
2.2 Screening Systems
2.3 Conclusions and Perspectives
References
3 Gene Mutagenesis Methods
3.1 Introductory Remarks
3.2 Error-Prone Polymerase Chain Reaction (epPCR) and Other Whole-Gene Mutagenesis Techniques
3.3 Saturation Mutagenesis: Away from Blind Directed Evolution
3.4 Recombinant Gene Mutagenesis Methods
3.5 Circular Permutation and Other Domain Swapping Techniques
3.6 Solid-Phase Combinatorial Gene Synthesis for Library Creation
3.7 Computational Tools
References
4 Strategies for Applying Gene Mutagenesis Methods
4.1 General Guidelines
4.2 Rare Cases of Comparative Studies
4.3 Choosing the Best Strategy when Applying Saturation Mutagenesis
4.3.1 General Guidelines
4.3.2 Choosing Optimal Pathways in Iterative Saturation Mutagenesis (ISM)
4.3.3 Systematization of Saturation Mutagenesis
4.3.4 Single Code Saturation Mutagenesis (SCSM): Use of a Single Amino Acid as Building Block
4.3.5 Triple Code Saturation Mutagenesis (TCSM): A Viable Compromise when Choosing the Optimal Reduced Amino Acid Alphabet
4.4 Techno-Economical Analyses of Saturation Mutagenesis Strategies
4.5 Combinatorial Solid-Phase Gene Synthesis: An Alternative for the Future?
References
5 Selected Examples of Directed Evolution of Enzymes with Emphasis on Stereo- and Regioselectivity, Substrate Scope, and/or Activity
5.1 Explanatory Remarks
5.2 Collection of Selected Examples from the Literature 2010 up to 2016
References
6 Directed Evolution of Enzyme Robustness
6.1 Introduction
6.2 Application of epPCR and DNA Shuffling
6.3 B-FIT Approach
6.4 Iterative Saturation Mutagenesis (ISM) at Protein–Protein Interfacial Sites for Multimeric Enzymes
6.5 Ancestral and Consensus Approaches and their Structure-Guided Extensions
6.6 Computationally Guided Methods
6.6.1 SCHEMA Approach
6.6.2 FRESCO Approach
6.6.3 FireProt Approach
6.6.4 Constrained Network Analysis (CNA) Approach
6.6.5 Alternative Approaches
References
7 Directed Evolution of Promiscuity: Artificial Enzymes as Catalysts in Organic Chemistry
7.1 Introductory Background Information
7.2 Tuning the Catalytic Profile of Promiscuous Enzymes by Directed Evolution
7.3 Conclusions and Perspectives
References
8 Learning from Directed Evolution
8.1 Background Information
8.2 Case Studies Featuring Mechanistic, Structural, and/or Computational Analyses of the Source of Evolved Stereo- and/or Regioselectivity
8.2.1 Epoxide Hydrolase
8.2.2 Ene-Reductase of the Old Yellow Enzyme (OYE)
8.2.3 Esterase
8.2.4 Cytochrome P450 Monooxygenase
8.3 Additive versus Non-additive Mutational Effects in Fitness Landscapes
References
Index

Preface

Directed evolution is a term that is used in two distinctly different research areas: (i) the genetic manipulation of functional RNAs, a discipline initiated by S. Spiegelman half a century ago and extending to the present day in the laboratories of J. W. Szostak, G. F. Joyce, and others, and (ii) the genetic manipulation of genes (DNA) with the aim of engineering the catalytic profiles of enzymes as catalysts in organic chemistry and biotechnology, especially stereoselectivity. This monograph focuses on the latter field. It begins with an introductory chapter that
features the basic principles of directed evolution, and is followed by a chapter on screening and selection methods. Critical analyses of recent developments constitute the heart of the monograph. Rather than being comprehensive, emphasis is placed on methodology development in the quest to maximize efficiency, reliability, and speed when performing this type of protein engineering. The primary applications concern the synthesis of chiral pharmaceuticals, fragrances, and plant-protecting agents.

The directed evolution methods and strategies featured in this book can also be used when engineering metabolic pathways, developing vaccines, engineering antibodies, creating genetically modified yeasts for the food industry, engineering proteins for pollution control, developing photosynthetic CO2 fixation, genetically modifying plants for agricultural and medicinal purposes, engineering CRISPR-Cas9 nucleases for genome editing, and modifying DNA polymerases for forensic purposes and for accepting non-natural nucleotides. A few studies of these applications are included here.

This monograph is intended not only for those who are interested in learning the basics of directed evolution of enzymes, but also for advanced researchers in academia and industry who seek guidelines for performing protein engineering efficiently.

I wish to thank Dr Zhoutong Sun for reading Chapters 3 and 4 and discussing some of the issues related to molecular biology. Thanks also go to Dr Gheorghe-Doru Roiban and Dr Adriana Ilie for editing all the chapters and constructing some of the figures. Any errors that may remain are the responsibility of the author.
Marburg, January 2016
Manfred T. Reetz

1 Introduction to Directed Evolution

1.1 General Definition and Purpose of Directed Evolution of Enzymes

Enzymes have been used as catalysts in organic chemistry for more than a century [1a], but the general use of biocatalysis in academia and, particularly, in industry has suffered from the following often encountered limitations [1b–d]:

• Limited substrate scope
• Insufficient activity
• Insufficient or wrong stereoselectivity
• Insufficient or wrong regioselectivity
• Insufficient robustness under operating conditions.

Sometimes, product inhibition also limits the use of enzymes. All of these problems can be addressed and generally solved by applying directed evolution (or laboratory evolution as it is sometimes called) [2]. It mimics Darwinian evolution as it occurs in Nature, but it does not constitute real natural evolution. The process consists of several steps, beginning with mutagenesis of the gene encoding the enzyme of interest. The library of mutated genes is then inserted into a bacterial or yeast host such as Escherichia coli or Pichia pastoris, respectively, which is plated out on agar plates. After a growth period, single colonies appear, each originating from a single cell, which now begin to express the respective protein variants. Multiple copies of transformants as well as wild-type (WT) appear, which unfortunately decrease the quality of libraries and increase the screening effort. Colony harvesting must be performed carefully, because cross-contamination leads to the formation of inseparable mixtures of mutants with concomitant misinterpretations. The colonies are picked by a robotic colony picker (or manually using toothpicks), and placed individually in the wells of 96- or 384-format microtiter plates that contain nutrient broth. Portions of each well-content are then placed in the respective wells of another microtiter plate where the screening for a given catalytic property ensues.
In some (fortunate) cases, an improved variant (hit) is identified in such an initial library, which fulfills all the requirements for practical application as defined by the experimenter. If this does not happen, which generally proves to be the case, then the gene of the best variant is extracted and used as a template in the next cycle of mutagenesis/expression/screening (Scheme 1.1). This mimics “evolutionary pressure,” which is the heart of directed evolution.

[Scheme 1.1: flow diagram. Target gene → Mutagenesis → Transformation → Bacterial colonies on agar plate → Expression of the target protein → Enzyme variants → Biocatalysis → Identification of improved variants → Repeat the whole process.]

Scheme 1.1 The basic steps in directed evolution of enzymes. The rectangles represent 96-well microtiter plates that contain enzyme variants, the red dots symbolizing hits.

Directed Evolution of Selective Enzymes: Catalysts for Organic Chemistry and Biotechnology, First Edition. Manfred T. Reetz. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA. Published 2017 by Wiley-VCH Verlag GmbH & Co. KGaA.

In most directed evolution studies further cycles are necessary for obtaining the optimal catalyst, each time relying on the Darwinian character of the overall process. A crucial feature necessary for successful directed evolution is the linkage between phenotype and genotype. If a library in a recursive mode fails to harbor an improved mutant/variant, the Darwinian process ends abruptly in a local minimum on the fitness landscape. Fortunately, researchers have developed ways to escape from such local minima (“dead ends”) (see Section 4.3).

Directed evolution is thus an alternative to so-called “rational design” in which the researcher utilizes structural, mechanistic, and sequence information, possibly flanked by computational aids, in order to perform site-directed mutagenesis at a given position in a protein [3].
The molecular biological technique of site-specific mutagenesis, in which an amino acid at a specific position in a protein is exchanged for one of the other 19 canonical amino acids, was established by Michael Smith in the late 1970s [4a], which led to the Nobel Prize [4b]. The method is based on designed synthetic oligonucleotides and has been used extensively by Fersht [4c] as well as numerous other researchers in the study of enzyme mechanisms [4b]. This approach to protein engineering has also been fairly successful in thermostabilization experiments in which, for example, mutations leading to stabilizing disulfide bridges or intramolecular H-bridges are introduced “rationally” [5]. Nevertheless, in a vast number of other cases, directed evolution of protein robustness constitutes the superior strategy [6]. Moreover, when aiming for enhanced or reversed enantioselectivity, diastereoselectivity, and/or regioselectivity, rational design is much more difficult [3], in which case directed evolution is generally the preferred strategy [7]. In some cases, researchers engaging in rational design actually prepare a set of mutants, test such a “library,” and even combine the designed mutations, a process that resembles “real” laboratory evolution, as shown by Bornscheuer and coworkers, who generated 28 rationally designed variants of a lipase, one of them showing an improved catalytic profile [8]. Other examples are listed in Table 5.1 in Chapter 5. However, this technique has limitations, and standard directed evolution approaches are more general and more reliable.

Directed evolution of enzymes is not as straightforward as it may appear to be at this point.
The challenge in putting the above principles into practice has to do with the vastness of protein sequence space. High structural diversity is easily designed in mutagenesis, but the experimenter is quickly confronted by the so-called “numbers problem,” which in turn relates to the screening effort (bottleneck). When mutagenizing a given protein, the theoretical number of variants N is described by Eq. (1.1), which is based on the use of all 20 canonical amino acids as building blocks [2]:

$$N = 19^{M}\,\frac{X!}{(X-M)!\,M!} \qquad (1.1)$$

where M denotes the total number of amino acid substitutions per enzyme molecule and X is the total number of residues (size of protein in terms of amino acids). For example, when considering an enzyme composed of 300 amino acids, 5700 different mutants are possible if one amino acid is exchanged randomly, 16 million if two substitutions occur simultaneously, and about 30 billion if three amino acids are substituted simultaneously [2].

Such calculations pinpoint a dilemma that accompanies directed evolution to this day, namely how to probe the astronomically large protein sequence space efficiently. One strategy is to limit diversity to a point at which screening can be handled within a reasonable time, but excessive diversity reduction should be avoided because then the frequency of hits in a library diminishes and may tend toward zero in extreme cases. Finding the optimal compromise constitutes the primary issue of this monograph. A very different strategy is to develop selection systems rather than experimental platforms that require screening. In a selection system, the host organism thrives and survives because it expresses a variant having the catalytic characteristics that the researcher wants to evolve. A third approach is based on the use of various types of display systems, which are sometimes called “selection systems,” although they are more related to screening. These issues are delineated in Chapter 2, which serves as a guide for choosing the appropriate system.
Since it is extremely difficult to develop genuine selection systems or display platforms for directed evolution of stereo- and regioselective enzymes, researchers had to devise medium- and high-throughput screening systems (Chapter 2).
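
The “numbers problem” expressed by Eq. (1.1) can be checked numerically. The following is a minimal sketch in Python, not part of the original text; the function name variant_count and its parameter names are illustrative assumptions. It evaluates N as 19^M multiplied by the binomial coefficient X!/[(X−M)! M!], that is, the number of ways of choosing M positions out of X, each filled with one of the 19 alternative canonical amino acids.

```python
from math import comb


def variant_count(num_residues: int, num_substitutions: int,
                  alphabet_size: int = 20) -> int:
    """Theoretical number of variants N according to Eq. (1.1).

    num_residues      -- X, the length of the protein in amino acids
    num_substitutions -- M, the number of simultaneous substitutions
    alphabet_size     -- number of building blocks (20 canonical amino
                         acids by default; hypothetical parameter added
                         here for illustration)
    """
    if not 0 <= num_substitutions <= num_residues:
        raise ValueError("M must lie between 0 and X")
    # Each substituted position can take any residue except the original.
    alternatives_per_position = alphabet_size - 1
    # comb(X, M) equals X! / [(X - M)! M!], the number of position sets.
    position_sets = comb(num_residues, num_substitutions)
    return alternatives_per_position ** num_substitutions * position_sets


if __name__ == "__main__":
    # Reproduce the example of a 300-residue enzyme from the text.
    for m in (1, 2, 3):
        print(f"M = {m}: N = {variant_count(300, m):,}")
```

For X = 300 this gives 5,700 variants for M = 1, 16,190,850 for M = 2, and 30,557,530,900 for M = 3, in line with the rounded figures quoted above. Integer arithmetic is used throughout, so no rounding error arises even for larger M.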