TheoreticalPopulationBiology62,9–46(2002) doi:10.1006/tpbi.2002.1582,availableonlineathttp://www.idealibrary.comon Mutation–Selection Balance: Ancestry, Load, and Maximum Principle Joachim Hermisson Departmentof EcologyandEvolutionary Biology, YaleUniversity, New Haven,Connecticut06520 Oliver Redner Institutfuu.r MathematikundInformatik, Universitaa.tGreifswald, Friedrich-Ludwig-Jahn-Str. 15a, 17487Greifswald, Germany Holger Wagner TechnischeFakultaa.t,Universitaa.tBielefeld, Postfach 100131,33501Bielefeld,Germany and Ellen Baake Institutfuu.r MathematikundInformatik, Universitaa.tGreifswald, Friedrich-Ludwig-Jahn-Str. 15a, 17487Greifswald, Germany ReceivedAugust6,2001 We analyze the equilibrium behavior of deterministic haploid mutation–selection models. To this end, both the forward and the time-reversed evolution processes are considered. The stationary state of the latter is called the ancestral distribution, which turns out as a key for the study of mutation–selection balance.Wefindthattheancestralgenotypefrequenciesdeterminethesensitivityoftheequilibriummean fitness to changes in the corresponding fitness values and discuss implications for the evolution of mutational robustness. We further show that the difference between the ancestral and the population meanfitness,termedmutationalloss,providesameasureforthesensitivityoftheequilibriummeanfitness tochangesinthemutationrate.Theinterrelationofthelossandthemutationloadisdiscussed.Foraclass of models in which the number of mutations in an individual is taken as the trait value, and fitness is a functionofthetrait,weusetheancestorformulationtoderiveasimplemaximumprinciple,fromwhichthe mean and variance of fitness and the trait may be derived; the results are exact for a number of limiting cases,andotherwiseyieldapproximationswhichareaccurateforawiderangeofparameters.Theseresults are applied to threshold phenomena caused by the interplay of selection and mutation (known as error thresholds). They lead to a clarification of concepts, as well as criteria for the existence of error thresholds. &2002ElsevierScience(USA) Key Words: mutation–selection model; clonal reproduction; branching process; backward processes; mutationload;epistasis;mutationalrobustness;errorthreshold;statisticalphysics. 1. INTRODUCTION usually interested in the statistical properties of the equilibrium distribution of genotypes, like the means and variances of fitness and trait(s). The standard A lot of research in theoretical population genetics approach to determine these starts out from the hasbeendirected towardsmutation–selectionmodelsin equilibrium condition for the genotype frequencies multilocus systems and infinite populations. One is (which takes the form of an eigenvalue equation if the 9 0040-5809/02$35.00 #2002ElsevierScience(USA) Allrightsreserved. 10 Hermissonet al. population is haploid). On this basis, a wealth of TABLEI methods andresultshasbeencreated;foracomprehen- sive and up-to-date account, see Buu.rger (2000). GlossaryofRepeatedlyUsedNotation In this article, we present an alternative route, which a ancestorfrequencies 2.2 relies on a time-reversed version of the mutation– G;g mutationalloss (24) selection process and its stationary distribution}to be g mutationallossfunction (31) called the ancestral distribution, as opposed to the H evolutionmatrix (4) equilibriumdistributionoftheforwardprocess.Wewill I identitymatrix 2.1 apply this approach to tackle a rather general class of i;j;k;‘ genotype/classlabels 2.1 L;l mutationload 2.4 models for haploids, or diploids without dominance. It m mutationrates 2.1 is only assumed that fitness is a function of a trait, and M mutationmatrix 2.1 genotypes with equal trait values have equivalent N numberofmutationclasses/sequencelength 2.1 mutation patterns. In fact, this is a standard assump- p populationfrequencies 2.1 tion, and is often implied without special mention. It Q generatorofreversedprocess 2.2 R;r reproductionrates 2.1 applies, for example, if (geno)types are identified with a r fitnessfunction (27) collection of loci with two alleles each (wildtype and R reproductionmatrix 2.1 mutant), which mutate independently and according to s(cid:1) mutationaleffects 2.4 the same process, and the number of (deleterious) s (binary)sequence 2.1 mutations plays the role of the trait. The assumption T timeevolutionmatrix 2.1 t time 2.1 of permutation invariance (with respect to the loci) is U(cid:1);u(cid:1) genomicmutationrates 2.1 certainly a distortion of biological reality, but, even in u(cid:1) mutationfunctions (27) this simplified setting, general answers have previously V;v variances 2.4 been considered impossible (Charlesworth, 1990), and X;x mutationaldistance 2.4 researchershaveresortedtomore specificchoicesofthe X mutationclasses 2.1 Y;y arbitrarytrait 2.7 fitness functionand themutation model (e.g.,quadratic z relativereproductivesuccess 2.2 fitness functions and unidirectional mutation). g overallfactorforreproductionrates 5.3 With the helpof the ancestral distribution, we will be k biallelicmutationasymmetryparameter 2.1 able to tackle general fitness functions (with arbitrary l largesteigenvalueofH 2.1 max epistasis), as well as general mutation schemes (includ- m overallmutationrate 2.1,(56) ing arbitrary amounts of back mutation), from the Note. Symbols are given together with the section or equation in permutation invariant class. The mutation–selection whichtheyaredefined.Symbolswhosescopeisonlyasinglesection equilibrium will be characterized through a maximum arenotshown. principlewhichrelatestheequilibriumpopulationtothe ancestral one, and may be evaluated explicitly to yield Since the article treats quite a number of topics, we expressions for the mean fitness and the mean trait start out with a brief reader’s guide to the main results value, as well as the variances of these quantities. The here and give hints on what parts can be skipped at a results are exact for a number of limiting cases, and firstreading.LetusalsomentionthatTableIcontainsa otherwiseyieldapproximationswhichareaccuratefora glossary of repeatedly used notation. wide range of parameters. The scene is set in Section 2, where we will introduce The results will then be used to settle the long- the model (the continuous-time mutation–selection standing issue of characterization and classification of model) and establish its relationship with a multitype errorthresholdphenomenainthismodelclass.Anerror branchingprocess.Twoconceptsthatarecentraltothis threshold may be generally, but vaguely, circumscribed paper, the ancestor distribution and the mutation class as a critical mutation rate beyond which mutation can limit, are developed in this section. Section 2.2 nolongerbecontrolledbyselectionandleadstogenetic introduces the ancestor distribution as the stationary degeneration; for review, see Eigen et al. (1989). Some, distribution of the time-reversed branching process and but by no means all, mutation–selection models display links the algebraic properties of the model to a suchbehavior.Itturnsoutthataconsistentdefinitionof probabilistic picture that also helps to shape biological an error threshold is rather subtle and must be sorted intuition.Inordertoallow quickprogresstotheresults out first. On this basis, we will classify mutation– in the remainder of the article, however, we have selection models according to their threshold behavior summarized the main points in Section 2.3. In Section (if any). 2.4, the means and variances of the trait and of fitness Mutation–Selection Balance 11 with respect to the equilibrium population and with Appendix A describes the connection between our respect to the ancestors are introduced. In Section 2.5, mutation–selection model and a system of quantum- the difference between the ancestral and the population statistical mechanics, which had been used previously mean fitness, termed mutational loss, is shown to (e.g., Baake et al., 1997; Baake and Wagner, 2001) to provide a measure for the sensitivity of the equilibrium solveamorerestrictedmodelclass,andwhichalsoserved meanfitnesstochangesinthemutationrate.Thisresult as the source of concepts and methods for the current is used and expanded in some of the applications in article. However, the body of the paper does not require Sections 5 and 6, but can be skipped at first reading. anyknowledgeofphysicsandremainsentirelywithinthe Sections 2.6 and 2.7 are mainly concerned with the framework of population genetics and classical prob- mutation class limit, along with the proper scaling of ability theory. Appendices B and C, finally, contain the fitness functions and mutation schemes. Like the well- proofs from Sections 4 and 6, respectively. known infinite-sites limit, this limit assumes an infinite number of types, but uses a different scaling. As a consequence,itisvalidifthetotalmutationrateislarge relative to typical fitness differences between types. In 2. MODEL SETUP this paper, the mutation class limit is used to derive analytic expressions for means and variances of fitness 2.1. The Model and the trait for the general case with back mutations and a non-linear fitness function. It is also crucial for Weconsidertheevolutionofaninfinitepopulationof our discussion of threshold behavior in Section 6. haploid individuals (or diploids without dominance) Section3isacondensedsummaryofthemainresults under mutation and selection. Disregarding environ- related to the maximum principle. The mean fitness at mental effects, we take individuals to be fully described mutation–selection balance equals the maximum of the bytheirgenotypes,whicharelabeledbytheelementsof difference between thefitnessfunctionandthe so-called f1;...;Mg:LetpðtÞbetherelativefrequencyoftypeiat i mutational loss function, where the position of the time t; so that P pðtÞ¼1; and let pðtÞ¼ðp ðtÞ;...; i i 1 maximum determines the mean ancestral trait. Once p ðtÞÞT with T denoting transposition. Throughout this M these means are known, explicit expressions are avail- article we will use the formalism for overlapping able for the mean trait and the variances of fitness and generations, which works in continuous time, and only trait.Thederivations arepostponedtoSection4,which comment on extensions to the analogous model for may be skipped at first reading. discretegenerations.Thestandardsystemofdifferential The following two sections are devoted to applica- equations which describes the evolution of the vector tions. Both are, to a large extent, independent of each pðtÞ is (Crow and Kimura, 1970; Hofbauer, 1985; see otherandrelyonlyonthematterintroducedinSections also Buu.rger, 2000) 2 and 3. In Section 5, we first discuss the evolutionary pp’ðtÞ¼½R (cid:8)RR%ðtÞ(cid:9)pðtÞþX ½m p ðtÞ(cid:8)m pðtÞ(cid:9): ð1Þ significance of the mutational loss, and then turn to the i i i ij j ji i j mutationload.Explicitexpressionsarederivedforsmall (back) mutation rates; but arbitrary mutation rates are Here, R is the Malthusian fitness of type i; which is i covered by the maximum principle, which may be connectedtotherespectivebirth and deathratesasR ¼ i % P interpreted as a generalized version of Haldane’s B (cid:8)D; and RRðtÞ¼ RpðtÞ designates the mean i i i i i principle.Consequences fortheevolutionofmutational fitness. Further, m is the rate at which a j-individual ij effects and for mutational robustness are discussed. mutates to i; and the dot denotes the time derivative. In Finally, a note is added as to the accuracy of the this model, mutation and selection are assumed to be expressions for the means and variances. independentprocesseswhichgooninparallel.However, In Section 6, we first analyze the definitions available mutation may also be viewed as occurring during for the error threshold. It will turn out that various reproduction. In this case, the mutation rate is given by notions must be distinguished, which coincide only in m :¼v B ; where v is the respective mutation prob- ij ij j ij special cases. For each of these notions, a criterion for ability during a reproduction event. Since, formally, this the existence of an error threshold is derived from the leadstothesamemodel,itneednotbediscussedfurther. maximum principle. Furthermore, the phenomena are For some of the main results of this article, further illustrated by means of examples and discussed with assumptions on the mutation scheme are required. To respect to their biological implications. Section 7 this end, we collect genotypes into classes X of equal k provides a summary and an outlook. fitness, 04k4N; and assume mutations only to occur 12 Hermissonet al. betweenneighboringclasses.LetR denotethefitnessof k classk andUk(cid:1) themutationratefromclassXk toXk(cid:1)1 (i.e., thetotal rate foreach genotypeinX tomutateto k some genotype in class Xk(cid:1)1), with the convention U0(cid:8) FIG. 1. Rates for mutations and back mutations at each site or ¼Uþ ¼0: Thus, we obtain a variant of the so-called locusofabiallelicsequence. N single-step mutation model: pp’ ðtÞ¼½R (cid:8)RR%ðtÞ(cid:8)Uþ(cid:8)U(cid:8)(cid:9)p ðtÞ k k k k k mutation rates. However, monotonic fitness is never þ Uþ p ðtÞþU(cid:8) p ðtÞ: ð2Þ k(cid:8)1 k(cid:8)1 kþ1 kþ1 assumed, unless this is stated explicitly. In much of the following, we will treat the general (Here, theconvention p ðtÞ¼p ðtÞ¼0 isused.) We (cid:8)1 Nþ1 model (1), which builds on single genotypes, and the can, for example, think of X as the wildtype class with 0 single-step mutation model (2), in which the units are maximum fitness and fitness only depending on the genotypeclasses,withthehelpofacommonformalism. number of mutations carried by an individual. If, Tothisend,notethatbothmodelscanberecastintothe further, mutation is modeled as a continuous point following general form using matrices of dimension M; process (or if multiple mutations during reproduction respectively N þ1: can be ignored), Eq.(1) reduces to Eq.(2), with an appropriate choice of mutation classes. Depending on pp’ðtÞ¼ðH(cid:8)RR%ðtÞIÞpðtÞ: ð4Þ therealizationonehasinmind,theU thendescribethe k Here,Iistheidentity.TheevolutionmatrixH¼RþM total mutation rate affecting the whole genome or just is composed of a diagonal matrix R that holds the some trait or function. Malthusianfitnessvalues,andthemutationmatrixM¼ In most of our examples, we will use the Hamming ðM Þwitheitheroff-diagonalentriesreadingm ;orwith graphasourgenotypespace.Here,genotypesarerepre- ij ij sented as binary sequences s¼s1s2...sN 2fþ;(cid:8)gN; Uk(cid:1) on the secondary diagPonals. The diagonal elements thus M ¼2N: The two possible values at each site, + in each case are Mii ¼(cid:8) jai Mji; hence the column sums vanish. Where the more restrictive form of the and(cid:8);maybeunderstoodeitherinamolecularcontext single-stepmodelisneeded,thiswillbestatedexplicitly. asnucleotides(purinesandpyrimidines)or,onacoarser Unless we talk about unidirectional mutation (U(cid:8) (cid:13)0 level, as wildtype and mutant alleles of a biallelic k for the single-step mutation model), we will always multilocus model. We will assume equal mutation rates assumethatMisirreducible(i.e.,eachentryisnon-zero at all sites, but allow for different rates, mð1þkÞ and for a suitable power of M). mð1(cid:8)kÞ; for mutations from + to (cid:8) and for back Let now TðtÞ:¼expðtHÞ; with matrix elements T ðtÞ: mutations, respectively, according to the scheme de- ij Then, the solution of (4) is given by (see, e.g., Buu.rger, picted in Fig. 1. 2000, Chapter III.1) Clearly, the biallelic model reduces to a single-step mutation model (with the same N) if the fitness TðtÞpð0Þ pðtÞ¼ ð5Þ landscape1 is invariant under permutation of sites. To P T ðtÞp ð0Þ i;j ij j this end, we distinguish a reference genotype s ¼ þ as can easily be established by differentiating and using þþ(cid:12)(cid:12)(cid:12)þ; in most cases the wildtype or master P P % H p ðtÞ¼ RpðtÞ¼RRðtÞ: Due to irreducibility, sequence, and assume that the fitness Rs of sequence s i;j ij j i i i the population vector converges to a unique, globally depends only on the Hamming distance k ¼d ðs;s Þ to H þ stable equilibrium distribution p:¼lim pðtÞ with s (i.e., the number of mutations, or ‘‘(cid:8)’’ signs in the t!1 þ p >0 for all i; which describes mutation–selection sequence). The resulting total mutation rates between i balance. By the Perron–Frobenius theorem, p is the the Hamming classes Xk and Xk(cid:1)1 read (right) eigenvector corresponding to the largest eigenva- Uþ ¼mð1þkÞðN (cid:8)kÞ and U(cid:8) ¼mð1(cid:8)kÞk ð3Þ k k lue, lmax; of H: if mutation is assumed to be an independent point process at all sites. We usually have the situation in 2.2. The Branching Process}Forward and mindinwhichfitnessdecreaseswithk andwilltherefore Backward in Time speakofUþ andU(cid:8) asthedeleteriousandadvantageous k k Our approach will heavily rely on genealogical relationships, which contain a more detailed informa- 1We use the notion of a fitness landscape (Kauffman and Levin, tion than the time course of the relative frequencies (5) 1987) as synonymous with fitness function for the mapping from genotypestoindividualfitnessvalues. alone. Let us, therefore, reconsider the mutation– Mutation–Selection Balance 13 selection model as a branching process. Branching Secondly,takingexpectationsofY andmarginalizing i processes have been classical tools in population over all other variables, one obtains the differential genetics to approximate the fixation rates of a single equation for the conditional expectations mutant type in a finite population. This approach goes d back to Haldane (1927) (see also Crow and Kimura, EðYðtÞjnð0ÞÞ¼ðB (cid:8)DÞEðYðtÞjnð0ÞÞ dt i i i i 1970),andhasbeenusedinmanyrecentapplicationsas X þ ½M EðY ðtÞjnð0ÞÞ(cid:8)M EðYðtÞjnð0ÞÞ(cid:9): ð7Þ well (e.g., Barton, 1995; Otto and Barton, 1997). ij j ji i We pursue a different route here by considering the j process of mutation, reproduction and death as a Clearly, our evolution matrix H appears as the (continuous-time) multitype branching process, as de- infinitesimal generator here, and the solution is given scribed previously for the quasi-species model (Deme- by TðtÞnð0Þ; where TðtÞ:¼expðtHÞ (see also Hofbauer triusetal.,1985;HofbauerandSigmund,1988,Chapter and Sigmund, 1988, Chapter 11.5). In particular, we 11.5).Letusstartwithafinitepopulationofindividuals, have EðYðtÞje Þ¼T ðtÞ for the expected number of i- i j ij which reproduce (at rates Bi), die (at rates Di;), or individuals at time t; in a population started by a single change type (at rates Mij) independently of each other, j-individual at time 0 (a ‘‘j-clone’’). In the same way, withoutanyrestrictiononpopulationsize.LetYiðtÞbethe TijðtÞ is the expected number of descendantsof type i at random variable denoting the number of individuals of time tþt in a j-clone started at an arbitrary time t; cf. type i at time t; and niðtÞ the corresponding realization; left panelof Fig. 2. (Note that due totheindependence collect the components into vectors Y and n; and let ei of individuals and the Markov property, the progeny betheithunitvector.Thetransitionprobabilitiesforthe distribution depends only on the age of the clone, and jointdistribution,PrðYðtÞ¼nðtÞjYð0Þ¼nð0ÞÞ;whichwe on the founder type.) Further, the expected total size of willabbreviateasPrðnðtÞjnð0ÞÞbyabuseofnotation,are aj-cloneofaget;irrespectiveofthedescendants’types, governed by the differential equation2 is P T ðtÞ: i ij Initial conditions come into play if we consider the d PrðnðtÞjnð0ÞÞ reproductive success of a clone relative to the whole dt population. A population of independent individuals, ! ! X X withinitialcompositionpðtÞ;hasanexpectedmeanclone ¼ (cid:8) BiþDiþ Mji niðtÞ PrðnðtÞjnð0ÞÞ size P T ðtÞp ðtÞ at time tþt (note that t always i jai i;j ij j X means ‘‘absolute’’ time, whereas t denotes a time þ BðnðtÞ(cid:8)1ÞPrðnðtÞ(cid:8)e jnð0ÞÞ i i i increment). The expected size of a single j-clone at time i tþt; relative to the expected mean clone size of the X þ DiðniðtÞþ1ÞPrðnðtÞþeijnð0ÞÞ whole population, then is i X , þ M ðn ðtÞþ1ÞPrðnðtÞ(cid:8)e þe jnð0ÞÞ: ð6Þ X X ij j i j z ðt;tÞ:¼ T ðtÞ T ðtÞp ðtÞ: ð8Þ j ij k‘ ‘ i;j iaj i k;‘ The z express the expected relative success of a type The connection of this stochastic process with the j after evolution for a time interval t; in the sense that, if deterministicmodeldescribedinSection2.1istwo-fold. z ðt;tÞ>1 ð51Þ; we can expect the clone to flourish Firstly, in the limit of an infinite number of individuals j P more (less) than average (this does in general not mean ðn:¼ nð0Þ!1Þ; the sequence of random variables i i that type j is expected to increase (decrease) in YðnÞðtÞ=n converges almost surely to the solution yðtÞ of abundance relative to the initial population). Clearly, yy’ ¼Hy with initial condition yð0Þ¼nð0Þ=n (Ethier and the values of z depend on the fitness of type j; Kurtz, 1986, Chapter 11, Theorem 2.1). That is, j but also on its mutation rate and the fitness of its Prðlim YðnÞðtÞ=n¼yðtÞÞ¼1; and the superscript ðnÞ n!1 (mutated) offspring. (If there is only mutation, but no denotes the dependence on the number of individuals. P reproduction or death, one has a Markov chain and The connection is now clear since pðtÞ:¼yðtÞ= yðtÞ i i z ðt;tÞ(cid:13)1:) solves the mutation–selection equation (1). j Wenowconsiderlinesofdescent,asintherightpanel of Fig. 2. To this end, we randomly pick an individual 2Note that differentiability of the transition probabilities is alive at time tþt; and trace its ancestry back in time; guaranteedinafinite-state,continuous-timeMarkovchain,provided this results in an unbranched line (in contrast to the thetransitionratesarefinite,cf.KarlinandTaylor(1975,Chapter4) andKarlinandTaylor(1981,Chapter14). lineage forward in time). Let ZtþtðtÞ denote the type 14 Hermissonet al. FIG. 2. The multitype branching process. Individuals reproduce (branching lines), die (ending lines), or mutate (lines changing type) independentlyofeachother;thevarioustypesareindicatedbydifferentlinestyles.Left:Thefatlinesmarktheclonefoundedbyasingleindividual (bullet)attimet:Right:Thefatlinesmarkthelinesofdescentdefinedbythreeindividuals(bullets)attimetþt:Aftercoalescenceoftwolines,their ancestorreceivestwicethe‘‘weight’’,asindicatedbyextrafatlines. found at time t4tþt; where we will drop the index for On the other hand, easier readability. We seek its probability distribution PrðZðtþtÞ¼i;ZðtÞ¼jÞ PrðZðtÞ¼jÞ: Since the (relative) clone size z ðt;tÞ also j ¼PrðZðtÞ¼jjZðtþtÞ¼iÞPrðZðtþtÞ¼iÞ determines the expected (relative) frequency of lines * present at time tþt that contain a j-type ancestor at ¼PP ðt;tÞpðtþtÞ; ð12Þ ji i time t; we have * where PP ðt;tÞ:¼PrðZðtÞ¼jjZðtþtÞ¼iÞ is the transi- ji PrðZðtÞ¼jÞ¼z ðt;tÞp ðtÞ¼:a ðt;tÞ: ð9Þ tion probability of the time-reversed process and is j j j * obtained from (10) and (12) as PP ðt;tÞ¼a ðt;tÞP ðtÞ(cid:17) ji j ij The a ðt;tÞ define a probability distribution ðpðtþtÞÞ(cid:8)1: With Eqs.(8), (9), and (11), one therefore j i P ð a ðt;tÞ(cid:13)1Þ; which will be of major importance, obtains the elements of the backward transition matrix j j * and may be interpreted in two ways. Forward in time, PP as a ðt;tÞ is the frequency of j-individuals at time t; wjeighted by their relative number of descendants after PP*jiðt;tÞ¼pjðtÞP TTijððttÞÞp ðtÞðpiðtþtÞÞ(cid:8)1: ð13Þ evolution for some time t: Looking backward in time, k;‘ k‘ ‘ * a ðt;tÞisthefractionofthe(p-distributed)populationat BydifferentiatingPPðt;tÞwithrespecttotandevaluating j timetþtwhose ancestorattimet isoftypej:Weshall it at t¼0; one obtains the matrix QðtÞ governing the therefore refer to aðt;tÞ as the ancestral distribution at correspondingbackwardprocessincontinuoustime.Its * the earlier time, t: elements read Q ðtÞ¼ðd=dtÞPP ðt;tÞj ¼p ðtÞðH Let us, at this point, expand a little further on this (cid:8)d RR%ðtÞÞðpðtÞÞ(cid:8)1(cid:8)djipp’ðtÞ=pðtÞ:Usijnig(4)t¼th0issimjplifieisj ij i ij i i backward picture by explicitly constructing the time- to reversed process. This is done in the usual way, by (p ðtÞH ðpðtÞÞ(cid:8)1; iaj; writing the joint distribution of parent–offspring pairs Q ðtÞ¼ j ij i ð14Þ (i.e., pairs ZðtÞ and ZðtþtÞ) in terms of forward and ji (cid:8)Pkai pkðtÞHikðpiðtÞÞ(cid:8)1; i¼j: backward transition probabilities. On the one hand, Note that the backward process is, in general, state- PrðZðtþtÞ¼i;ZðtÞ¼jÞ dependent (it does not generate a Markov chain). Note also that time reversal works in the same way if sets of ¼PrðZðtþtÞ¼ijZðtÞ¼jÞPrðZðtÞ¼jÞ types X instead of single types are considered, as long k ¼P ðtÞa ðt;tÞ: ð10Þ ij j as mutation and reproduction rates are the same within classes.Furthermore,ananalogoustreatmentispossible Here,P ðtÞ:¼PrðZðtþtÞ¼ijZðtÞ¼jÞmaybeobtained ij both for mutation coupled to reproduction, as well as by rewriting the (conditional) expectations defining the for discrete generations. P (forward) branching process as T ðtÞ¼P ðtÞ T ðtÞ; ij ij k kj As to the asymptotic behavior of our branching which gives process, it is well known that, for irreducible H and , t!1; the time-evolution matrix expðtðH(cid:8)l IÞÞ X max P ðtÞ¼T ðtÞ T ðtÞ: ð11Þ becomes a projector onto the equilibrium distribution ij ij kj k p; with matrix elements pizj (e.g., Karlin and Taylor, Mutation–Selection Balance 15 1981, the appendix). Here, z is the Perron–Frobenius as HH* is symmetric, we have zz* (cid:18)pp* (where (cid:18) means (PF) left eigenvector of H; normalized such that proportional to). Hence, due to a ¼z p ¼zz* pp* (cid:18)pp*2; k k k k k k P zp ¼1:Assuggestedbyournotation,onealsohas one has pp* (cid:18)pffiaffiffiffiffi: Thus, we obtain the following i i i k k * explicit form of the eigenvalue equation for HH; lim zðt;tÞ¼z; ð15Þ t;t!1 which will be crucial for the derivation of our main which may be seen from (8).3 We therefore term z the results: i relative reproductive success of type i: ½R (cid:8)Uþ(cid:8)U(cid:8)(cid:9)pffiaffiffiffiffiþqffiUffiffiffiþffiffiffiffiffiffiUffiffiffiffi(cid:8)ffiffiffipffiaffiffiffiffiffiffiffiffi At stationarity, the matrix governing the backward k k k k k(cid:8)1 k k(cid:8)1 cparnocneosswsbimeipnltifierepsrettoedQajsi ¼aMpjaðrHkijo(cid:8)vgdeijnlemraaxtÞopri(cid:8).1F;uwrthhiecrh, þqffiUffiffiffikþffiffiffiUffiffiffiffik(cid:8)ffiþffiffiffi1ffiffipffiaffiffikffiffiþffiffi1ffiffi¼lmaxpffiaffiffikffiffi: ð16Þ the (asymptotic) ancestor distribution, given by a ¼ Note that Eq.(16) relates the mean fitness of the i % zp; turns out to be the stationary distribution of equilibrium population ðRR¼l Þ to the ancestor i i max P P the backward process, since Q a ¼ p ðH frequencies a : i ji i i j ij k (cid:8)d l Þp(cid:8)1zp ¼P p zðH (cid:8)d l Þ¼0: Due to ij max i i i i j i ij ij max ergodicity of the backward process (Q is irreducible if 2.4. Observables and Averages H is), a is, at the same time, the distribution of types along each line of descent (with probability 1). In this section we define the observables, i.e., measurable quantities, that are used to describe the 2.3. The Equilibrium Ancestor Distribution populationonitsevolutionarycourse.Besidestheusual populationmean,weshallalsointroducethemeanwith Aswesawinthelastsubsection,thereisasimplelink respect to the ancestor distribution (see Section 2.3). between the algebraic properties of H and the probabil- We shall consider means and variances of two istic structure of the mutation–selection process at observables in the following. To this end, we character- equilibrium, which may be summarized as follows. The ize each type (or class) i by its fitness value R and its i P PF right eigenvector p (with i pi ¼1) determines the mutational distance Xi from the reference genotype (or composition of the population at mutation–selection the class X ). For the biallelic model in particular, 0 balance; the corresponding left eigenvector z (normal- mutational distance corresponds to the Hamming P ized so that i zipi ¼1) contains the asymptotic distance to sþ: If, in addition, this is the fittest type, Xi offspring expectation (or relative reproductive success) just gives the number of deleterious mutations. But in of the various types; and the ancestral distribution, general it can also be used to describe the value of any definedbyai ¼pizi;givestheasymptoticdistributionof additive trait with equal contributions of sites or loci. types that are met when lines of descent are followed Similarly, for single-step mutation, we define X to be k backwardintime(cf.Fig. 2).Figure3showsp;a;andz thedistancefromtheclassX ;thusX ¼k forclassX : 0 k k for a single-step mutation model with a linear fitness Again,X maybeviewedas(thegeneticcontributionto) k function. One sees that z decreases exponentially. any character with discrete values that depends linearly For the single-step mutation model, we may directly on the mutation classes. transform the eigenvalue equation Hp¼lmaxp into an Population average: Representing an arbitrary obser- equation for a: To this end, we define a diagonal vable as ðOÞ; such as ðRÞ or ðXÞ; we will denote its i i i transformation matrix S with non-zero elements population average as Sbkyk ¼HH* Q:¼k‘¼S1HpS(cid:8)ffiUffiffi1ffi‘(cid:8)ffi:ffiffi=ffiTffiffiUffiffihffi‘ffiþeffi(cid:8)ffiffi1ffifficoarnrdesopbotnadininga PsyFmrmigehtrticanmdatlreifxt OO%ðtÞ:¼ X OipiðtÞ: ð17Þ eigenvectorsaregivenbypp* ¼Spandzz* ¼S(cid:8)1z:Butnow, i Byomissionofthetimedependencewewillindicatethe corresponding equilibrium average. 3Both z and p also admit a more stochastic interpretation. If the % population does not go to extinction, one has limt!1Yi=ðPj YjðtÞÞ Astomeanfi%tness,RRðtÞdeterminesthemutationload, ¼pi almostsurely,i.e.,thestochasticityisinthepopulationsize,not LðtÞ:¼Rmax(cid:8)RRðtÞ:Here,Rmax ¼maxiRi isthefitness of in the relative frequencies (Kesten and Stigum, 1966; Hofbauer and the fittest genotype, in line with the usual convention Sgeignmeruantedd, 1b9y88,HC(cid:8)hlampatxeIr; 1o1n.5e). hFausrthliemr,t!f1ortPrtðhYeðtÞcarit0icjaYlð0pÞr¼oceesjsÞ (see, e.g., Ewens, 1979; Buu.rg%er, 2000). It%is well known ¼zj=C and limt!1ð1=tÞEðYiðtÞjYð0Þ¼ej;YðtÞa0Þ¼Cpi; where C is that the equilibrium value RR:¼limt!1RRðtÞ is given by a constant; this is the continuous-time analog of a result by Jagers the largest eigenvalue, l ; of the evolution matrix H: max (1975,p.94).Notethat,inthelongrun,theexpectedoffspringdepend For the variance of fitness, V ðtÞ¼P ðR (cid:8)RR%ðtÞÞ2(cid:17) onthefoundertypeonlythroughtheprobabilityofnon-extinctionof % R i i itsprogeny. piðtÞ; we differentiate RRðtÞ according to (1), i.e., 16 Hermissonet al. FIG. 3. Equilibriumvaluesofpopulationfrequenciesp (dottedline),ancestorfrequenciesa (dashedline),andrelativereproductivesuccessz k k k (solid line) for the biallelic model with additive fitness R ¼gðN(cid:8)kÞ (where g is the loss in reproduction rate due to a single mutation), point k mutationratem¼0:2g;mutationasymmetryparameterk¼1;andsequencelengthN ¼100:Thelogarithmicrightaxisreferstoz only. 2 k ðd=dtÞRR%ðtÞ¼P Rpp’ðtÞ¼V ðtÞþP RM p ðtÞ; and Just as for the fitness distribution, we define the hence i i i R i;j i ij j population mean, XX%ðtÞ¼PN XpðtÞ; and variance, d % X VXðtÞ¼Pi ðXi(cid:8)XX%ðtÞÞ2piðtÞ; oif¼0theimiutational distance. VRðtÞ¼dtRRðtÞ(cid:8) RiMijpjðtÞ Ancestral average: We will also need the ancestral i;j average of our observables, that is, the average with ! d % X X respect to the ancestral distribution defined in Eq.(9): ¼dtRRðtÞþ ðRj(cid:8)RiÞMij pjðtÞ: ð18Þ OO#ðt;tÞ:¼Pi Oiaiðt;tÞ¼Piziðt;tÞOipiðtÞ: In the follow- j i ing, we will only be concerned with the ancestral The interpretation of this completely general formula is distribution in equilibrium, i.e., with both t and t going as follows: In the absence of mutation, Eq.(18) just to infinity. We obtain the ancestral average of any reproduces Fisher’s fundamental theorem, i.e., the observable ðOÞ in this limit as i variance in fitness equals the change in mean fitness, aslongasthereisnodominance(see,e.g.,Ewens,1979). OO# :¼ X Oa ¼X zOp: ð20Þ i i i i i If mutation is present, however, a second component i i emerges, which is given by the population mean of the mutational effects on fitness, weighted by the corre- These averages may be read forward in time sponding rates. It may be understood as the rate of (correspondingtoaweightingofthecurrentpopulation change in mean fitness due to mutation alone. At withexpectedoffspringnumbers),andbackwardintime mutation–selection balance, this second term is ob- (correspondingtoanaveragingw.r.t.thedistributionof viously the only contribution to variance in fitness. the ancestors). A third interpretation is available if the For the single-step mutation model in particular, we mutation matrix is irreducible, which entails that the can define deleterious and advantageous mutational equilibrium backward process defined by Q is ergodic. effects separately as sþk ¼Rk (cid:8)Rkþ1 and s(cid:8)k ¼Rk(cid:8)1(cid:8) Then, with probability 1, the equilibrium ancestral Rk; respectively. For decreasing fitness values (which is average also coincides with the average of the obser- the usual case, but not strictly presupposed here) these vable over a lineage backwards in time. Note that the are positive. This way we obtain information so obtained is not contained in the V ¼sþUþ(cid:8)s(cid:8)U(cid:8) ¼sþUþ(cid:8)s(cid:8)U(cid:8) population average, which is merely a ‘‘time-slice’’ R average. The ancestral mean adds a time component þ Covðsþ;UþÞ(cid:8)Covðs(cid:8);U(cid:8)Þ ð19Þ to the averaging procedure, which provides extra for the equilibrium variance, a result we will rely on in informationontheevolutionarydynamics.InAppendix the following. A, we shall show that our ancestral averaging coincides Mutation–Selection Balance 17 with the way observables are evaluated in a system of fitness values, or mutation rates, may be due to quantum statistical mechanics. environmental changes, or changes in the genetic back- % % ground.) Clearly, HþDH has RRþDRR as the largest % P % P 2.5. Linear Response and Mutational Loss eig%envalue, with DRR’ iDRið@RR=@RiÞþ i;jDMij(cid:17) ð@RR=@M Þ to linear order in DR and DM : If only ij i ij We now come to another interpretation of the fitness values are affected, (21) yields equilibrium ancestor frequencies introduced in Section % X DRR’ DRa; ð25Þ 2.2. Consider the derivative of the equilibrium mean i i i fitness with respect to the ith fitness value in a general where the a belong to the original system. If only the system of parallel mutation and selection (1): i mutationrateschangebyvariationsinacommonfactor % " # @RR @ X m; (24) leads to ¼ z H p @Ri @Ri j;k j jk k DRR% ’(cid:8)DmG: ð26Þ " # m % @ X ¼a þRR z p ¼a; ð21Þ i @R j j i Wewillcometofurtherinterpretationanddiscussionof i j themutationallossandtheresponserelationsinSection where we made use of the normalization condition 5.1. P P z p ¼ a (cid:13)1: The ancestor frequency a there- j j j j j i fore measures the linear response (or sensitivity) of the 2.6. Fitness Functions and Mutation Models equilibrium mean fitness to changes in the ith fitness value.4Asimilarcalculationfortheresponsetochanges For many of the results and all of our examples, we in the mutation rates results in will restrict our treatment to the case of the single-step % mutation modelas described by Eq.(2). Although most @RR ¼ðz (cid:8)z Þp : ð22Þ ofourresultsdonotdependonthisparticularchoicewe @M i j j ij will,forsimplicity,concentrateonthisschemehere,and Using (21) and (22), we can express the equilibrium onlybrieflydiscussthepossibleextensions.Wewillstart mean fitness as follows: out with a discussion of fitness functions and mutation % % schemes in this context. Depending on whether the % # X X @RR X @RR RR¼RRþ ziMijpj ¼ Ri@R þ Mij@M : ð23Þ phenotype or the genotype is considered the primary i;j i i i;j ij quantity for the model, the inherent approximation Let us give a variational interpretation for the mainly concerns the mutation or the fitness part, ancestor mean fitness as well. To this end, we define respectively. the mutational loss G of the system as the difference If Xk ð04k4NÞ are the values of a quantitative trait between ancestor and population mean fitness in on which selection acts, fitness may be taken as an equilibrium. Assume now that we change all mutation arbitraryfunctionofit.Theessentialassumption,inthis ratesM byvariationsinacommonfactorm:From(23) case, is that genotypes with equal trait values have ij and (21) we then find that the mutational loss relates to equivalent mutation patterns, with mutation in single the linear response of the equilibrium mean fitness to steps asanadditional simplification. Thisistheoriginal changes in the mutation rates as viewinwhichthisassumptionfirstappeared,withXk as % theelectricchargeofproteins(OhtaandKimura,1973). # % @RR G:¼RR(cid:8)RR¼(cid:8)m : ð24Þ The numerous papers to follow have been reviewed by @m Buu.rger (1998, 2000). Actually, this relation holds for arbitrary (clonal) If, on the other hand, X is the number of mutations k mutation–selection systems, in particular also if muta- with respect to the wildtype (i.e., X ¼k as in the k tion and reproduction are coupled (in which case the biallelic model), single-step mutation is a natural mutation rates are replaced by mutation probabilities). approximation and directly emerges if mutation and Equations (21) and (24) may also be used to reproductionaremodeledasindependentprocesses.The determine the change in mean fitness if H changes to essential simplification, in this case, consists in the HþDH; to linear order in DH: (Small changes in the choiceofthegenotypefitnessvalues,whichdependonly on k: This way, only the average epistatic effect is included in the model, whereas any variance among 4If mutation is coupled to reproduction, the linear response to variationsinthedeathrateD isgivenby(cid:8)a: epistatically interacting mutations is disregarded. i i 18 Hermissonet al. Fitness functions of this kind, although undoubtedly bythemselves,twoofthemarewellstudied,andwewill lacking much of the biological complexity, have been show that the formulas reduce to well-known results used as standard landscapes throughout population there. genetics literature. While the principal reason for this The first case is the limit of vanishing back mutations, seems to lie in the large simplifications due to permuta- defined by U(cid:8) (cid:13)0 in our model. The second one is a k tion invariance, they already take full account of the limit of linearity, in which fitness and mutation rates limited information on fitness provided by mutation depend linearly on some trait Y ¼Ny ¼Nyðx Þ with k k k accumulation experiments (e.g., Crow and Simmons Y ¼0 and Y ¼N; such as 0 N (1983), but see also the discussion by Phillips et al., rðxÞ¼r (cid:8)ayðxÞ; uþðxÞ¼bþð1(cid:8)yðxÞÞ; 0 2000). Further, they include a broad range of examples u(cid:8)ðxÞ¼b(cid:8)yðxÞ: ð29Þ with vastly diverging properties, ranging from simple additivefitnessoverquadratic}orotherwisepolynomial Note that, if Y is proportional to the mutational k or exponential}landscapes with smoothly varying distanceX ¼k;thefitnessfunctionislinearwhereasthe k fitness values (e.g. Charlesworth, 1990) to truncation mutation functions u(cid:1) reproduce the mutation scheme selection (e.g., Kondrashov, 1988) and Eigen’s sharply ofthebiallelicmodelifb(cid:1) ¼mð1(cid:1)kÞ:Thislimitcanbe peaked landscape (Eigen et al., 1989). understood as the limit of vanishing epistasis, in which For a consistent treatment of our model in the the system is known as the Fujiyama model in the mutation class limit N !1 (to be defined in the next sequence space literature (cf. Kauffman, 1993). subsection), it will be advantageous to think of the The third case is the limit of an infinite number of fitness values and mutation rates as being determined mutation classes, N !1; which we will call mutation by the mutational distance per class (or site), classlimitforshort.Inthecaseofthebiallelicmultilocus x :¼X =N 2½0;1(cid:9): model,thislimithasbeenusedanddiscussedinarecent k k R ¼Nr ¼Nrðx Þ; U(cid:1) ¼Nu(cid:1) ¼Nu(cid:1)ðx Þ: ð27Þ publication (Baake and Wagner, 2001). It addresses the k k k k k k situation of weak or almost neutral mutations, where Here, r and u(cid:1) are also introduced as fitness and total the average mutational effect (over the mutation k k mutationratesperclass.Theycannowbethoughtofas classes) is small compared to the mean total mutation being defined, without loss of generality, by three rate,U (cid:20)s:Thelimitfurtherassumesthatdifferencesin functions r and u(cid:1) on the compact interval [0,1]. We mutation rate between neighboring (pairs of) classes are willrefertorasthefitnessfunction,andtouþ andu(cid:8) as small compared to the mean rate itself. In this case, the(deleteriousandadvantageous)mutationfunctionsof genetic change by mutation proceeds in many steps of the model. Both uþ and u(cid:8) are assumed to be smallaverageeffectandthemodelisagenuinemulticlass continuous and positive, with boundary conditions model in the sense that typically a large number of u(cid:8)ð0Þ¼uþð1Þ¼0; and r to have at most finitely many classes are relevant in mutation–selection equilibrium. discontinuities, being either left or right continuous at Note that only the average mutational effect must be each discontinuity in ]0,1[. This should include all small; this includes the possibility of single steps with biologically relevant examples. For the biallelic model, much larger effect (such as in truncation selection). the mutation functions are simple linear functions of x: Technically, the limit N !1 is performed such that uþðxÞ¼mð1þkÞð1(cid:8)xÞ; u(cid:8)ðxÞ¼mð1(cid:8)kÞx: ð28Þ the mutational effects s(cid:1) and the fitness values and mutation rates per class, r and u(cid:1); remain constant. If Note that the classical stepwise mutation model (Ohta fitnessvaluesandmutationratesaredefinedbythethree and Kimura, 1973) is not covered by this framework, functionsrandu(cid:1) asdescribedabove(27),increasingN since its genotype space Z is inherently non-compact. simply leads to finer ‘‘sampling’’ of the functions. Withthiskindofscaling,themeansandvariancesper class of the observables defined in Section 2.4 approach 2.7. Three Limiting Cases well-defined limits, which then serve as approximations Our primary aim in the following sections is to fortheoriginalmodelwithfiniteN:Wewilldenotethem establishsimplerelationsfortheequilibriummeansand by the corresponding lower case letters, i.e., rr#:¼RR#=N; variances of mutational distance and fitness that lend v :¼V =N; etc.; an additional subscript will indicate X X themselves to biological interpretation. Whereas these the limit value, e.g., xx% :¼lim xx%: Note that it is, in 1 N!1 relations are approximations in the general case, they general,thevarianceperclassofagivenquantitythatis restonthreelimitingcasesaspillars,forwhichtheyhold meaningfulinthislimit,notthevarianceofthequantity as exact identities. All three are biologically meaningful per class (e.g., VarðX=NÞ), which tends to zero
Description: