ebook img

Dynamics of Competitive Evolution on a Smooth Landscape PDF

4 Pages·0.43 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Dynamics of Competitive Evolution on a Smooth Landscape

Dynamics of Competitive Evolution on a Smooth Landscape 3 0 0 Weiqun Peng, Ulrich Gerland, Terence Hwa, and Herbert Levine 2 Center for Theoretical Biological Physics and Department of Physics, University of California at San Diego, La Jolla, CA 92093-0319 n a (Dated: February 1, 2008) J WestudycompetitiveDNAsequenceevolution directedbyin vitroprotein binding. Thesteady- 7 state dynamics of this process is well described by a shape-preserving pulse which decelerates and 1 eventually reaches equilibrium. We explain this dynamical behavior within a continuum mean- field framework. Analytical results obtained on the motion of the pulse agree with simulations. ] h Furthermore,finite population correction to the mean-field results are found to beinsignificant. c e PACSnumbers: 87.10+e, 87.23.Kg m - t Competitiveevolutionsuchasbreedinghasbeenprac- We consider a pool of N DNA sequences of length L. ta ticedforages. Withrecentadvancesinmolecularbiology, EachsequenceS~ =(b1,b2,...,bL)ofnucleotidesbi issub- s this method is widely usedto developnovelproteins and jectto independent single-nucleotidemutationsata rate . t DNA sequences for a variety of applications [1]. The ba- ν 1pernucleotidepergeneration. Selectionisaccom- a 0 ≪ m sicideaofcompetitivemolecularevolutionisstraightfor- plished through protein-DNA binding. Let the binding - ward: ineachgeneration,anumberofmoleculeswithcer- energy of a sequence S~ to the protein be ES~ and let the d taindesiredcharacteristicsare selectedfromthe popula- fraction of such sequences in the pool be nS~. Assuming n tion;theyarethendiversified(viapointmutationand/or thermodynamic equilibrium for the binding process, the o recombination[2])andamplifiedbacktotheoriginalpop- selectionfunctionissimplythebindingprobability,given c [ ulationsize. The“speed”ofevolutionaswellasthefinal bytheFermifunction[7]P(ES~,µ)=1/[1+exp(ES~ µ)], − equilibrium distribution depend on a variety of factors where mu is the chemical potential and all energies are 2 such as the mutation rate, selection strength, molecule expressed in units of k T. Here µ serves as a (soft) se- v B 7 length, and population size. A systematic quantitative lection threshold. and is determined by the fractionφ of 1 understanding of these dependencies is lacking thus far. DNA sequences that remain bound to the proteins after 1 Suchunderstandingisnotonlyoftheoreticalinterest,but selection, i.e., S~ P(ES~,µ)nS~ =φ. It canbe controlled 4 also helpful in improving the efficiency of the breeding by either the nPumber of available proteins or, as in the 0 processes. In this study, we develop a theoretical model experiment [4], by the washing strength. The fraction φ 2 0 for the simplest type of competitive evolution involving can be varied from φ < 1 (weak competition) to φ > 0 / onlypointmutationsonasmoothlandscape. Weachieve (strongcompetition). W∼edefinetheevolutionprocess∼it- t a an understanding of this model with concepts and tech- eratively whereby in each round, N daughter sequences m niques developed in the study of front propagation[3]. arechosenfromthe existing poolaccordingto P(ES~,µ), - and then point mutation are introduced with rate ν to d To make the discussion concrete, we focus on the in 0 generate the new sequence pool. n vitro evolution of DNA sequences due to competitive o binding to proteins. An example of such a system is the Finally we need to specify the binding energy ES~. We c recentexperimentofDubertretet.al.[4],whereDNAse- assume that each nucleotide taking part in the bind- : v quencesareselectedcompetitivelyaccordingtotheirrel- ing contributes independently, and adopt a “two-state” Xi ativeaffinitiesforthelac-repressorprotein. Inthisexper- model[7]whichassignsanenergypenaltyǫ(ofthe order iment,selectionisaccomplishedbycoatingabeakerwith ofafewk T’s)foreachnucleotidewhichdoesnotmatch r B a lac-repressor molecules followed by subsequent washing, the one the protein prefers. This form of binding energy sothatonlythestrongly-boundsequencesremain. Muta- has been shown to work reasonably well for specific sys- tionandamplificationarethenaccomplishedbymultiple tems [8] and has been argued to hold for a wide class of stagesofpolymerasechainreaction[5]. Whiletheexper- regulatory proteins [9, 10]. Given this energy model, a iment of Ref. [4] easily accomplished the goal of finding DNA sequence with r mismatches has a binding proba- the best binding sequence starting from a pool of ran- bility P(r,r0)=1 1+eǫ(r−r0) , where r0 µ/ǫ is the dom sequences in a few generations,the shortness of the selection threshold.in(cid:2) the “mism(cid:3)atch space”≡r. [11]. bindingsequence[20basepairs(bp)]makesitdifficultto explore the interesting dynamics of the competitive evo- The above evolution model is easily implemented on lution process. In our study we consider the evolution a computer. We fix three of the model parameters at process of Ref. [4] applied to much longer sequences so N = 5 105, L = 170 and ν = 0.01 from here on, and 0 × thatthatthesteadystatedynamicscanbeexamined. An vary only the selection strength through the choice of example of such a system might be the histone-octamer, theselectionstringencyφ. Atypicalsimulationresultfor which is known to bind DNA sequences of 147bp [6]. strongselection(φ=0.1)isshownasthespace-timeplot 2 0.3 being the “diffusion coefficient” and “drift velocity” re- n t=10 spectively [11], ν ν L. The r-dependences of v and 0 t=40 ≡ 0.2 0.15 t=70 D reflect the different phase space volume for the differ- n t=100 ent mismatches. For example, the form of v(r) ensures 0.1 0 that the the distribution approaches the maximum en- 0 −5 0 5 10 tropy point with r = A−1L mismatches by mutation 150 r−r (t) A 100 150 0 alone. The third term in Eq. (1) represents the effect 100 50 of the selection/amplification, controlled by the growth 50 t 0 0 r function U defined in Eq. (2). (The factor τ O(1) de- ∼ notes generation time.) Competition is explicitly mani- FIG.1: Thespace-timetrajectory ofthemismatch distribu- festedin the n dependence ofthe growthfunctionU, via tionn(r,t)accordingtothecompetitiveevolutionmodelwith the threshold r (t) which is determined from the con- 0 φ = 0.1. The inset shows the distribution n(r,t) at genera- dition φ = drP(r,r (t))n(r,t). In Eq. (2), an over- 0 tions t=10,40,70,100, after the initial transient period and all shift in UR by the constant 1 has been included to before the distribution reaches equilibrium at r ≈ 0. These − ensure that the population size N is conserved after se- distributions overlap upon shift by theirrespective threshold r0(t), indicating theshape-invariance of thepulse. lection/amplification, in accordance with the evolution process. This shift produces the desired competitive ef- fectthatindividualswhichbindbetterthanthethreshold of the mismatchdistribution n(r,t)≡ S~ nS~δ(ES~−rǫ) r0(t)arereproducedandthosenotmeetingthethreshold in Fig. 1. We see that the distributioPn quickly forms a decay away. In the actual analysis, we will approximate shape-preserving pulse (see the inset of Fig. 1), which the Fermi function P(r,r ) by a step function Θ(r r), 0 0 − moves, decelerates, and eventually reaches equilibrium which turns out not to affect the qualitative behavior. in the neighborhood of the optimal sequence (at r = Wewillseethatthesimplicityofthecontinuummean- 0). Basically, the selection eliminates weak binders in field equation provides an analytic understanding of the population to improve the average binding energy, generic features of the evolutionary dynamics, including hence the selection threshold r0 is decreased in the next theexistenceofthedecelerating,shape-preservingpopu- round, while the change of r0 further selects sequences lationpulse;italsoprovidesananalyticalestimateofthe with better binding energies. Along with new variety smallnessofthe finite-N correction. Howeverthatquan- generated by mutation, a propagating pulse results. titative differences do exist between our simplified de- We next investigate the dynamical behavior of the scriptionandthe breeding schemesemployedin the sim- aboveevolutionmodelanalyticallyusingamean-fieldde- ulations and experiments, due to the phenomenological scription. It will be convenientto describe the dynamics nature of simplified description, and the continuum ap- in the mismatch space r. Let us first consider the con- proximationsusedinboththe mismatchspaceandtime. tribution from point mutation. For a sequence of length (A quantitatively more accurate approach has recently L, “alphabet size” ( =4 for nucleotides) and r mis- been developed by Kloster and Tang [14].) A A matches, there are L ( 1) ways to mutate to a new We start with the simplest case of infinite sequence · A− sequenceviaasinglepointmutation. Amongthem,there length (while keeping ν a finite constant), yielding con- are (L r)( 1)waysto increaser by one,and r ways stant coefficients D(r) = D and v = ν. Making the − A− to reduce r by one. Hence, a standard master equation AnsatzinEq.(1)thatthedistributionn(r,t)=n[y(r,t)] can be written to describe the mutational dynamics of where y r r (t) and r (t) = ct for some constant 0 0 ≡ − − the distribution n(r,t) in the mean field limit N 1 speed c, we obtain the ODE ≫ [11, 12, 13]. The effect of selection/amplificationprocess ′′ ′ can be phenomenologicallymodeled by an additive term Dn (y) [(c+ν)n(y)] +u(y)n(y)=0, (4) − proportional to φ−1P(r,r ) for weak selection. Further 0 where u(y) φ−1Θ( y) 1 /τ. A physically allowed takingthecontinuumlimitinr(validinthelimitoflarge ≡ − − L and smooth population distribution), we arrive at the solution of Eq.(cid:0)(4) exists for e(cid:1)very c≥c0−ν, where following mean-field description for n(r,t) [11]: c 4D(φ−1 1)/τ. 0 ≡ − ∂tn=∂r[∂r(D(r)n) v(r)n]+ U[r;n] n(r,t)(1) p − · In fact, the smallest possible speed c c ν is se- U[r;n]= φ−1P(r,r (t)) 1 /τ, (2) min ≡ 0 − 0 − lectedbythedynamicsgivenareasonablycompactinitial (cid:2) (cid:3) distribution. Here, velocity selection follows the famil- The first two terms on the right-hand side of (1) result iar marginal stability mechanism [3]: The selected solu- from the (conservative) mutational processes, with tionn∗(y)(withthe velocityc )is the onethatdecays min ν 2 r r most sharply at the pulse front (i.e., the r < r end) D(r)= 1 A− , v(r)=ν 1 A 0 2 (cid:18) − 1L(cid:19) (cid:18) − 1L(cid:19) amongalltheallowedsolutions. Thus,asthefrontofthe A− A− (3) distribution broadens from the initial condition, it first 3 reaches the asymptotic decay of n∗. From then on, the for r (t): 0 distribution stops broadening and moves with the speed c . Standard arguments show that this is equivalent γr (t)+r˙ (t)=γrEQ (5) min 0 0 0 to the condition that the front be marginally stable in the frame of reference moving with c . Note that as where rEQ is the equilibrium threshold, so that a static min 0 c √D √ν, c < 0 when the mutation rate ν is distribution can be achieved in the moving frame. The 0 min ∝ ∝ sufficiently large, indicating the worsening of the overall population mean r follows exactly the same motion (in affinity of the sequences due to accumulation of delete- fact,r(t) r (t)exceptwhenselectionisveryweak),i.e., 0 ≈ rious mutations despite the presence of selection. These asingleexponentialwithtimeconstantγ (whichdepends results apply also to the more generalFermi function, as only on the point mutation rate ν ). This is a generic 0 it is only the asymptotic behavior of the growth term result independent of the details of the fitness function, [i.e., U(r r )] that governs velocity selection. as long as a pulse solution exists for Eq. (1). The decay 0 ≪ constantγobtainedfromsimulationofthediscretemodel In evolutionarydynamics,the populationsizeN often is in quantitative agreement with the expectation (4ν ) playsaveryimportantrole[12,13]. ToseehowN enters, for weak selection (1 > φ > 0.25); an example is sh3ow0n we note that the mean-field equation (1) has an incon- inFig.2(a). Infact, the sa∼mequalitative result(i.e., the sistency in that at the very front of the moving pulse, existence of a shape-preserving pulse) holds for strong arbitrarily small n gets the benefit of exponential am- selection as shown already in Fig. 1 where φ=0.1. plification. But in reality, the number of individuals is The shape of the pulse, i.e. the equilibrium distribu- discretesothatnshouldalwaysbegreaterthan1/N. To tionnEQ,isgovernedbythesameODEasEq.(4),except deal with this problem, a cutoff procedure was proposed that the constant velocity c is replaced by γ(y+rEQ). within the mean-field framework [15, 16]. Here we em- − 0 The resulting equation again has a continuum of phys- ploy this procedureto estimate the effect ofa finite pop- ically allowed solutions, each having a different shape ulationonthe evolutionaryvelocity [18]. Specifically, we andcorrespondingtoadifferentequilibriumpositionrEQ modify the selection/amplification term u(y) in Eq. (4) 0 (hencedifferentrEQ). Herewehaveaninterestinggener- tou(y)Θ(n N−1)(fory <0). Adirectextensionofthe approachin−Ref. [16] leads to δc /c π2/ln2N for the alizationofvelocityselectiontotheselectioninsteadfrom 0 0 ∼ 2 a continuumofdeceleratingpulses. Again,starting from fractional change in c , which has the same scaling form 0 a compact initial distribution, the dynamics selects the as that for the Fisher equation [16]. To test this result, solution[19]whosefront(y <0)decaysmostrapidly,(in we ran simulations (with a modified mutational scheme this case a Gaussian falloff), whereas the other solutions to achieve a constant drift) to measure the propagating have a power-law front. speed of the pulse for different population sizes N. Our The rEQ extracted from the selected solution agrees findingofδc/c 0.06betweenN =5 103and5 108is well with its heuristic approximationof (ν c )/γ when ≈ × × 0 inlinewiththeexpectationandindicatesthatundertyp- − γ 1; see Fig. 2(b). The theory is quantitatively ac- icalexperimentalconditions,thefluctuationeffectdueto ≪ curate [20] when the selection is not too strong (e.g., finite population is insignificant. φ−1 < 2.5). For very strong selection, the equilibrium thresholdpositionrEQ approachesr =0 andthe bound- We next examine the more realistic situation of finite 0 aryconditionthereneedstobetakenintoaccount. When sequence length L. The important new effect is due to c and γ are expressed in terms of original parameters, the r-dependence in the drift velocity v(r) [see Eq. (3)], 0 which, as the population approaches towards r = 0, in- creasingly hinders its advance. This can already be ap- 200 150 preciatedifweassumeaquasisteady-statedynamicsand 150 Simulation Simulation replaceνintheformulaforcminwithv(r)=ν γr[where r fitting 100 Theory − γ ≡ 34ν0accordingto(3)]: Wefindastablestationarypo- 100 rEQ sition,rEQ (ν c )/γ wherec =0. Hereweidentify 50 ≈ − 0 min 50 this position naturally with the mean of the population rEQ rnEQ(r)dr. 00 200 400 600 00 2 14/φ−1 6 8 ≡ Generation R To proceed with a more rigorous analysis, we neglect FIG.2: (a)Evolutionofthemeanmismatchr(t)forφ=0.5. the r dependence in D which has only a small quantita- The equilibrium distribution is used as the initial condi- tiveeffect. Also,weassumethattheequilibriumposition tion to mitigate transient effects. The solid line is a single- rEQ ≫ 1 so that the boundary condition at r = 0 can exponentialfitusingthetheoreticalvalueofγ = 43ν0 =.0133. be safely ignored. Returning to the mean-field equation (b) Equilibrium positions as a function of selection pressure (1), we use the same moving-pulse Ansatz as before ex- φ−1−1. ThelineisthetheoreticalestimaterEQ=(ν−c0)/γ, cept that we no longer fix a linear time dependence to using a generation time τ =2.77 (obtained by calibrating c0 from theory with that from simulation). the threshold r (t). This Ansatz produces a linear ODE 0 4 rEQ =(ν c )/γ suggests that for a population with se- tobemoreefficientthanitscompetitorstowinthebattle 0 − quence length L 1, the population pulse could stall at for evolutionary survival. ≫ rEQ 1. In order for the population to reach the opti- ≫ We thank M. Kloster for detail comments and discus- mal at r 0, we need to increase the selection strength ≈ sions. We also acknowledge helpful discussions with D. (i.e., lowering φ) or decrease the mutation rate so that φ−1 > 1+ν τL/2, to overcome the bigger entropic bar- Kessler, Q. Ouyang, L. Tang and J. Widom. This re- 0 search is supported by the NSF through Grants DMR- rier a∼ssociated with longer sequences. 0211308and MCB-0083704. As there have been extensive studies of evolution on various landscapes in the context of population genet- ics [12, 13, 21], it is worth comparing the dynamical be- havior of competitive evolution with that of more com- mon evolutionary models. The traditional study of evo- [1] E. T. Farinas, T. Bulter and F. H. Arnold, Curr. Opin. lution focuses on fixed fitness landscapes, where every Biotechnol. 12, 545 (2001). genotype(e.g.,sequenceS~)hasapredeterminedabsolute [2] W. P. C. Stemmer, Nature 370, 389 (1994). fitness value (i.e., the reproductive rate of the sequence [3] See, e.g., P. Collet and J. -P. Eckmann, Instabilities and S~). Competitive evolution is different in that it is sub- Fronts inExtended Systems (PrincetonUniversityPress, Princeton, 1990). jecttoadynamicfitnesslandscape. Thatis,thefitnessis [4] B.Dubertret,S.Liu,Q.OuyangandA.Libchaber,Phys. measured relative to a dynamic selection threshold and Rev. Lett.86, 6022 (2001). progress towards the best binding sequence occurs via [5] See, e.g., H.Lodish et. al.,Molecular cell biology, 4th ed competition among the currently existing genotypes — (W.H. Freeman & Co., New York,c2000). beingbetterisall-important,notbeingbest. Thisaspect [6] A. Thastrom et al.,J. Mol. Biol. 288, 213 (1999). [7] P. H.von Hippeland O.G. Berg, Proc. Natl.Acad. Sci. of competitive evolution leads to qualitatively different 83, 1608 (1986). dynamicalbehavior. Forcomparison,wecanconsiderthe [8] D.S. Fields, Y. He, A.Y. Al-Uzri and G.D. Stormo, J. simplest and most widely studied fixed landscape, i.e., Mol. Biol. 271, 178 (1997). the smooth “Mt. Fuji” landscape [12, 13, 21, 22], where [9] U. Gerland, J.D. Moroz and T. Hwa, Proc. Natl. Acad. eachnucleotidecontributesindependentlyandadditively Sci. 99, 12105 (2002). tothefitnessofthesequence,thusformingalandscapeon [10] To focus on the steady state dynamics, we have ignored which fitness rises steadily toward a single peak. For in- “nonspecific”protein-DNAbinding[7,9].Thiswouldaf- fecttheinitialstageofevolutionwhichplaysadominant finite sequence length,the mean-fieldtheory fails inthat role in theexperiment of Ref. [4]. it produces an unphysical, run-away solution [15] due to [11] U. Gerland and T. Hwa, J. Mol. Evol. 55, 386 (2002). the unlimited growth rate of n at the high fitness states [12] L. Peliti, cond-mat/9712027. [15,17,22],andafinite populationhasatravelingspeed [13] B. Drossel, Adv.Phys. 50, 209 (2001). that is essentially proportional to population size [17]. [14] M. Kloster and C. Tang, manuscript in preparation. For finite sequence length, the finite population dynam- [15] L. S.Tsimring, H.LevineandD.A.Kessler, Phys.Rev. ics is orders-of-magnitudeslower in reaching equilibrium Lett. 76, 4440 (1996). [16] E.BrunetandB.Derrida,Phys.Rev.E56,2597(1997). than the (now non-divergent) mean-field prediction [17]. [17] D. A. Kessler, H. Levine, D. Ridgway and L. Tsimring, Incontrast,finitepopulationeffectsmerelycauseasmall J. Stats. Phys.87, 519 (1997); ibid., 87, 519 (1997). correction for the competitive evolutionary process. [18] Strictlyspeaking,1/N isthecutoffvalueinthesequence To summarize, we investigated the dynamics of com- space (S~) rather than the mismatch space (r). As there petitive evolution in the context of molecular evolu- is only selection in the mismatch space, the distribution tion experiments. The major result concerns the ex- forms a compact packet and undergoes random walk in istence and properties of a shape-invariant population theorthogonaldirection.Hence,weexpectthecutoff1/N pulse which propagates towards an eventual equilibrium to be valid up to a constant even in mismatch space. [19] The selected solution can beexpressed via theparabolic configuration. Analytical results on the motion of the pulse obtained from the mean-field equation are in good cylinder function Dp(y˜), with nEQ ∝ Dp+(y˜)e−y˜2/4 agreement with simulations. Also, corrections due to fi- [Dp−(−y˜)e−y˜2/4] for y >0 (y ≤0), where p+ =−1/γτ nite population size are shown to be insignificant. An [p− = (φ−1−1)/γτ] and y˜ ≡ (y+r0EQ−ν0/γ) γ/D. interesting aspect of our findings is the convergence of rEQ is determined by matchingconditions at y=p0. 0 the evolutionprocessto a solutionfar fromoptimal(i.e., [20] Although the analytical solution gives a fair description rEQ 1), if the selection strength is not sufficiently of the motion of the pulse, it offers a pulse shape much ≫ broaderthanobservedinsimulation.Abetterdescription strong or mutation rate not sufficiently low. In general, of theshape is presented in [14]. competitive evolution is rather different from the usual [21] See, e.g, R. Bu¨rger, The Mathematical Theory of Selec- picture of climbing a fixed fitness landscape. This ap- tion, Recombination and Mutation (John Wiley & Sons, proachmay be applicable more generally,e.g. to natural Chichester, UK,2000). evolutionincaseswherecompetitionforscarceresources [22] G. Woodcock and P. G. Higgs, J. Theor. Biol. 179, 61 is the primary driving force, as an organism only needs (1996).

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.