Adv.Appl.Prob. 31,1002–1026(1999) PrintedinNorthernIreland AppliedProbabilityTrust1999 RECENT COMMON ANCESTORS OF ALL PRESENT-DAY INDIVIDUALS ∗ JOSEPHT.CHANG, YaleUniversity Abstract Previous study of the time to a common ancestor of all present-day individuals has focused on models in which each individual has just one parent in the previous generation. For example, ‘mitochondrial Eve’ is the most recent common ancestor (MRCA)whenancestryisdefinedonlythroughmaternallines. InthestandardWright– Fishermodelwithpopulationsizen,theexpectednumberofgenerationstotheMRCA isabout2n,andthestandarddeviationofthistimeisalsoofordern. Herewestudya two-parentanalogoftheWright–Fishermodelthatdefinesancestryusingbothparents. Inthismodel,ifthepopulationsizenislarge,thenumberofgenerations,Tn,backto a MRCA has a distribution that is concentrated around lgn (where lg denotes base-2 logarithm),inthesensethattheratioTn/(lgn)convergesinprobabilityto1asn→∞. Also,continuingtotracebackfurtherintothepast,atabout1.77lgngenerationsbefore thepresent, allpartialancestryofthecurrentpopulationends, inthefollowingsense: withhighprobabilityforlargen,ineachgenerationatleast1.77lgngenerationsbefore thepresent,allindividualswhohaveanydescendantsamongthepresent-dayindividuals areactuallyancestorsofallpresent-dayindividuals. Keywords: Coalescent; Wright–Fisher model; Galton–Watson process; genealogical models;populationgenetics AMS1991SubjectClassification: Primary92D25 Secondary60J85 1. Introduction Startingwiththesetofallofuspresent-dayhumans,imaginetracingbackintimethrough our mothers, our mothers’mothers, and so on. This is the maternal family tree of mankind, andweareatitsleaves. Recentresearchhassuggestedthatthewomanattherootofthistree lived roughly 100000 or 200000 years ago, perhaps inAfrica (Cann et al. (1987); Vigilant etal. (1991)). Thiswomanhasbeendubbed‘mitochondrialEve’,sinceallpresent-dayhuman mitochondrial DNA descended from hers. Mitochondrial Eve was undoubtedly not the only woman alive at her time, so the name ‘Eve’ is misleading, as has been pointed out by a number of authors; see, for example, Ayala (1995). However, this misunderstanding aside, questionsoftheoriginsofmankindandthenatureofourrelationshipstoeachotherarestill ofkeeninterest,andtheresearchonmitochondrialEvehasreceivedagreatdealofpublicity, generating headlines in the popular press as well as in scientific publications. Svante Pa¨a¨bo (1995)explains: Received9October1997;revisionreceived19June1998. ∗ Postaladdress: StatisticsDepartment,YaleUniversity,Box208290,NewHaven,CT06520,USA. Emailaddress:[email protected] 1002 Recentcommonancestors 1003 ...therecentdateofourmitochondrialancestorisinasensethereallycontrover- sialconclusionfromthesestudies. Everyoneagreesthatwetraceourancestryto Homoerectus,whoemergedinAfricaandfromtherecolonizedmostofEurasia about a million years ago or even earlier. What the mitochondrial data seem to show,however,isthatwehaveamuchmorerecentancestor,onewholivedsome 100000or200000yearsago. Whatcapturestheimaginationisnottheparticularchoicetotracebackthroughthematernal line,butratheritistheideathatallofpresent-dayhumanitymayhaveacommonancestorwho livedaslittleas100000yearsago,atimethatseemstomanytobesurprisinglyrecent. Ifwe retain this idea while removing the restriction to the maternal line, the question becomes: Howfarbackintimedoweneedtotracethefullgenealogyofmankindinordertofindany individualwhoisacommonancestorofallpresent-dayindividuals? Inthispaperweaddress thissortofquestioninasimplemathematicalmodel. The coalescent model of Kingman (1982a) forms the basis of many of the calculations, formal and informal, used in recent treatments of questions about mitochondrial Eve and relatedtopics. Thecoalescentisalarge-populationlimitofanumberofthefundamentalmod- els of population genetics, including the Wright–Fisher process. These models are haploid, with each individual in a given generation having a single parent in the previous generation. The Wright–Fisher model assumes ‘random mating’, in the sense that the parent of a given individualisequallylikelytobeanyoftheindividualsinthepreviousgeneration. Thestandard modelalsopostulatesaconstantpopulationsize,whichmaybean‘effectivepopulationsize’ whenmodelingmoregeneralsituations. Anumberofimportantpropertiesofthecoalescent modelareusedinapplications. Forexample,themodelimpliesarelationshipbetweencoales- cencetimesandpopulationsize: theexpectedcoalescencetime(measuredingenerations)of alargesampleisabouttwicethepopulationsize. Hudson(1990)givesasurveyofthetheory andapplicationsofthecoalescent. Herewestudyanaturaltwo-parentanalogoftheWright–Fisherprocess. (Thisprocesswas previouslyconsideredbyKa¨mmerle(1991)andMo¨hle(1994);seetheendofthissectionfor a discussion of related work.) We assume the population size is constant at n. Generations are discrete and non-overlapping. The genealogy is formed by this random process: in each generation,eachindividualchoosestwoparentsatrandomfromthepreviousgeneration. The choices are made just as in the standardWright–Fisher model—randomly and equally likely over the n possibilities—the only difference being that here each individual chooses twice instead of once. All choices are made independently. Thus, for example, it is possible that whenanindividualchooseshistwoparents, hechoosesthesameindividualtwice, sothatin factheendsupwithjustoneparent;thishappenswithprobability1/n. Thismodelisdesignedonlyasasimplestartingpointforthought;ofcourseitisnotmeant tobeparticularlyrealistic. Still,onemightworrythatthissimplemodelignoresconsiderations ofsexandallowsimpossiblegenealogies. Ifthisseemsbothersome,analternativeinterpreta- tionofthesameprocessisthateach‘individual’isactuallyacouple,andthatthepopulation consists of n monogamous couples. Then the random choices cause no contradictions: the husbandandwifeeachwereborntoacouplefromthepreviousgeneration. Theycouldeven comefromthesamecoupleinthepreviousgeneration. Our interest here is in finding individuals who are common ancestors of all present-day individuals. Forconvenience,weusetheabbreviation‘CA’torefertoacommonancestorof allpresent-dayindividuals,and‘MRCA’standsfor‘mostrecentcommonancestor.’ 1004 J.T.CHANG Itturnsoutthatmixingoccursextremelyrapidlyinthetwo-parentmodel,sothatCAsmay befoundwithinanumberofgenerationsthatdependslogarithmicallyonthepopulationsize. In particular, our first main result says that the number of generations back to an MRCA is aboutlgn,wherelgdenoteslogarithmtobase2. Theorem1. LetTndenotethenumberofgenerations,countingbackintimefromthepresent, toanMRCAofallpresent-dayindividuals,inapopulationofsizen. Then T n P →1 asn→∞. lgn Thiscontrastsdramaticallywiththeone-parentsituation. Forexampleifnis1million,then theone-parentMRCA(‘Eve’)isexpectedtooccurabout2milliongenerationsago,whereas atwo-parentMRCAoccurswithhighprobabilitywithinthelast20generationsorso. Also, thevariabilityintheone-parentsituationissuchthattheactualtimetotheMRCAmayeasily be as small as half the expected time or as large as double the expected time, say, even in arbitrarily large populations. In contrast, the time to an MRCA for the two-parent model is much less variable. For example, if the population is large enough, it is very unlikely that a randomrealizationofthetwo-parentMRCAtimewilldifferfromlgnbyevenonepercent. This paper also addresses a second related question. Imagine tracing back through the two-parent genealogy. According to Theorem 1, after about lgn generations, we will reach the most recent generation that contains a CA. That generation might contain just one CA, or it might contain more than one. In any case, if we continue tracing back further through successivegenerations,thenthetitle‘CA’becomesmuchlessofaprestigiousdistinction. For example,bothparentsofaCAwillbeCAs,andallgrandparentsofaCAwillbeCAs,andso on. Eventually,inagivengeneration,many(andinfactmost)oftheindividualswillbeCAs. AtsomepointwereachagenerationinwhichsomeindividualsareCAs(havingallpresent- dayindividualsasdescendants)andsomeare‘extinct’(havingnopresent-dayindividualsas descendants),butnoindividualisintermediate(havingsomebutnotallpresent-dayindividuals as descendants). That is, at this point, everyone who is not extinct is a CA. This condition persistsforeveraswetracebackintime: everyindividualisaCAorextinct. Thenextresult showsthatthisconditionisreachedveryrapidlyinthemodelstudiedhere. Theorem2. Let Un denote the number of generations, counting back in time before the present,toagenerationinwhicheachindividualiseitheraCAofallpresent-dayindividuals or an ancestor of no present-day individual. Let γ denote the smaller of the two numbers satisfyingtheequationγ e−γ =2e−2,andletζ =−1/(lgγ)≈0.7698. Then U n P →1 asn→∞. (1+ζ)lgn Thus, within about 1.77lgn generations, a tiny amount of time in comparison with the orderntimerequiredtogetaone-parentCA,everyoneinthepopulationiseitheraCAofall present-dayindividualsorextinct. Figure1showsasmallexampletoillustratethedefinitionsandstatements. Thepopulation size is 5. At the bottom of the figure is generation 0, the present. Going up in the graph correspondstogoingbackintime,sothatthetoprowisgeneration−5. Foreachindividual I in each previous generation, we calculate the set of present-day individuals (individuals in generation 0) that are descendants of I. For example, the set of present-day descendants of individual #1 in generation −1 is {3,5}. The calculations propagate backward in time Recentcommonancestors 1005 / / S 0 0 S S S / 0 S {4,5} S S / 0 S {4,5} {1,2,3,4} {1,2,3,4} {2,3} / {4,5} S 0 {2,3} {4} {5} {3,5} {1,2,4} 1 2 3 4 5 Figure1:Anexampleillustratingthemodel. Herethefourthindividualingeneration−2isaCA.By generation−5,allindividualsareCAsorextinct: individuals1,4,and5areCAs,andindividuals2and 3areextinct. according to the rule: the set of descendants of an individual I is the union of the sets of descendantsofthechildrenofI. Forexample,thesetofpresent-daydescendantsofindividual #4 in generation −2 is the union {3,5}∪{5}∪{1,2,4}, which is the whole population S. Thus, individual #4 in generation −2 is a CA of the set S of all present-day individuals. Continuing backward in time, at generation −5 we reach the stage where each individual hasasdescendantseitherthewholepopulationS ortheemptyset∅. Thatis,eachindividual ingeneration−5iseitheraCAorextinct,havingasdescendantseithereverybodyornobody from the set of present-day individuals, and all generations prior to generation −5 also have thisproperty. Intheexampleshown,T =2andU =5. 5 5 Whatisthesignificanceoftheseresults?Anapplicationtotheworldpopulationofhumans would be an obvious misuse. For example, we would not claim that a common ancestor of every present-day human may be found within the last lgn generations. Even if we took n to be 5 billion, this would imply a CA just about 32 generations ago—perhaps 500 years or so. An important source of the inapplicability of the model to this situation is the obvious non-randomnatureofmatinginthehistoryofmankind. Forexample,parentsaremuchmore likely to live within a few miles of their children than a thousand miles away or halfway around the world. So the model studied here is too simple to be directly applicable to the evolution of mankind as a whole. In such complicated situations, the results sound a note of caution: if the logarithmic time to CAs seems patently implausible, then at least one of the assumptions of the model, such as the random mating assumption, must be causing a great deal of trouble. On the other hand, it would be interesting to know whether there are simpler real-life situations in which the assumptions of the model do apply reasonably well and the theorems provide reasonably accurate quantitative descriptions. Perhaps a relatively homogeneouspopulationlackingdiscerniblestructures(geographicorotherwise)thatinteract stronglywithreproductionwouldbeapromisingcandidate. 1006 J.T.CHANG TherandomtimeanalysedinTheorem2seemsofnaturalinterestinthisprocessandmay alsobepertinenttocertainquestionsabout‘speciestrees’or‘populationtrees’(asopposedto ‘genetrees’). Inmanycontextsthespeciestreeisconsideredtobetherealobjectofinterest, andweusegeneticdataandgenetreestoattempttolearnaboutthespeciestree. Forexample, forhumans,chimpanzees,andgorillas,isthe‘truespeciestree’(HC)G,(HG)C,or(CG)H? Roughly,theconceptualframeworkofthisquestionisasfollows. Thereweretwo‘speciation events’thatsplitasinglespeciesancestraltohumans,chimpanzees,andgorillasintothethree separatemodernspecies. Thetree(HC)G,forexample,saysthatthefirstsuchsplitseparated the subpopulation that eventually became modern gorillas from the remainder, which later splittobecomemodernhumansandchimpanzees. Unfortunately,moreprecisedefinitionsof theconceptofspeciestreethatremainusefulindifficultorunclearcasesseemhardtocome by. One might adopt the viewpoint that the proper starting point for a definition of ‘species tree’isthefulltwo-parentgenealogyofallpresent-dayindividuals. Givensuchadefinition, if we knew all details of this genealogy, then we could read off an answer to the H, C, and G question (the answer might be ‘none of the 3 choices above’—that is, the species tree is notwelldefinedoratleastnotbifurcating). OneinterpretationofthetimeUn isasfollows. Supposeweimagineacasewhereevolutionreallyproceededaccordingtoaneatsuccession of ‘speciation events’. Under a certain reasonable definition of a species tree, if the times between those speciation events exceed Un, then the species tree is guaranteed to be well defined and coincide with the history of speciation events. This idea will be discussed more fullyelsewhere. Acaveattoforestallpotentialmisunderstanding: thispaperisnotaboutgenetics. Thatis,it isnotaboutwhogetswhatgenes;itisaboutsomethingmoreprimitive,namely,theancestor– descendantrelationship. One-parentmodelsareappropriateintracingthehistoryofasample of nonrecombining genes or small bits of DNA; a single nucleotide descends from a single nucleotidefromeitherthemotherorfather,butnotboth. Hereweareconsideringancestryin themorecommon, demographicsenseoftheword, asappliedtopeople, forexample, rather thangenes. Previous genetics research that is somewhat related, although still very different from the present study, considers models incorporating recombination. This type of model has been investigated in a number of papers, including those of Hudson (1983) and Griffiths and Marjoram(1997). ThehistoryofasampleofDNAsequencesmaybedescribedbyacollection ofgenealogies,witheachnucleotidepositionintheDNAhavingitsownone-parentgenealogy. The genealogies for two positions that experience no recombination between them will be congruent, with the paths of the two genealogies going back through the same individuals, whereas a recombination between two positions causes the genealogies of those positions to differ. Each of the genealogies in the collection will have its own MRCA (the nucleotide at its root), which may occur in different individuals. Each of these individuals will be a CA in the sense considered in this paper, but the most recent of these individuals is generally not an MRCA in our sense. Our MRCA is more recent, since the paths from ancestors to descendants consist of all potential paths for genes to be transmitted, and may include paths that did not happen to be taken by any genes. No previous results about these genetic models have been similar to the results here, for example, in getting times of order logn. This is not surprising, since the asymptotics would require an assumption that the sequence lengths and the number of recombinations tend to infinity. This is another manifestation of the statement that the questions we are investigating here are not fundamentally genetics questions. Recentcommonancestors 1007 There is some previous work on the process we study here and related processes. Two papers of Ka¨mmerle (1989, 1991) introduce a general class of two-parent (called ‘bisexual’ in those papers) versions of the Wright–Fisher and other processes. These papers focus on two main questions. First, they analyse the probability of extinction of a set of individuals in the present generation, that is, the probability that the set of individuals eventually has no descendants in some future generation. Second, in a two-parent version of the Moran model, they study the number Rn(t) of individuals t generations ago who have at least one descendantinthepresentgeneration. Ka¨mmerle(1989)findsthattheMarkovchain{Rn(t) : t = 0,1,...}, suitably normalized and suitably initialized (with the initialization essentially requiringthatthechainisstartedinsteadystate),convergesweaklyasn → ∞toadiscrete- timeOrnstein–Uhlenbeckprocess. Mo¨hle (1994) both generalizes and refines the results of Ka¨mmerle. In particular, Mo¨hle providesadetailedanalysisoftheextinctionprobabilitiesinatwo-parentWright–Fishermodel thatapproximatestheprobabilitiesuptoo(1/n). Healsoestablishesweakconvergenceina generalclassoftwo-parentmodels,includinganOrnstein–Uhlenbecklimitforthetwo-parent Wright–Fisherprocess. Mo¨hlealsohasanumberofotherpapersinpress,includingonethat relaxestheassumptionofconstantpopulationsize. Thesepreviousresultsarecomplementarytotheresultsinthispaper. Thepreviouspapers consideredindividualswhohaveatleastonedescendantinagivenfuturegeneration. Herewe consider CAs, who have as descendants all members of the future generation. The previous resultsabouttheprocess{Rn(t)}applytolarget,thatis,tothebehavioroftheprocessmany generationsbeforethepresent,withtheprocessinsteadystate. Herewefocusonthebehavior ofarelatedprocessatsmall(i.e.recent)times,startingfarawayfromsteadystate. Weshow that at about t = 1.77lgn generations before the present, with high probability the Rn(t) individualswhohaveatleastonepresent-daydescendantareallinfactCAs. 2. Simulations Table1presentsasmallsimulationstudyconsistingof25trialseachforn=500,n=1000, n = 2000, and n = 4000. Two numbers are reported for each trial: Tn, the number of generations back to an MRCA, and Un, the generation at which every individual is either a CAorextinct. InthesesimulationsthedistributionofthetimebacktoanMRCAisindeedquiteconcen- trated around the value lgn, which is nearly 9 for n = 500, nearly 10 for n = 1000, and so on. Thus,thesimulationresultsshowthattheasymptotic(n→∞)statementofTheorem1is ‘notsoasymptotic’,inthatitdescribesthesituationwellevenforrathersmallvaluesofn. The behaviorpredictedbyTheorem2isalsoreflectedreasonablywellinthesimulations,although onemighthaveguessedanumericalconstantcloserto2ratherthan1.77fromthissmallstudy. 3. Proofs 3.1. Generalideasandtools We start with the observation that although Theorems 1 and 2 are phrased in terms of countinggenerationsbackintimefromthepresentuntilsomeconditionobtains,theseresults maybeprovedbycountingforwardintimefromafixedgeneration. Forexample, theevent {Tn ≤m}requiresthataCAofallindividualsingeneration0maybefoundamonggenerations −1,−2,...,−m. Thisisequivalenttorequiringthatifwestartwithgeneration−mandtrace forward in time, then some individual in generation −m becomes a CA of all individuals in somegenerationt ∈{−m+1,−m+2,...,0}. 1008 J.T.CHANG Table1:Asmallsimulationstudy. Foreachoffourpopulationsizesn,thetwotimesTn andUn are reportedfor25trials. n=500 n=1000 n=2000 n=4000 1 10 18 11 19 12 24 13 24 2 9 18 11 21 12 22 13 23 3 10 21 10 20 11 23 12 24 4 9 18 11 20 12 22 13 24 5 10 19 10 20 12 21 13 24 6 9 19 11 31 12 23 13 24 7 10 19 11 20 12 21 13 24 8 10 21 10 23 12 27 13 24 9 9 19 11 20 11 24 13 24 10 9 17 11 20 12 24 13 31 11 9 19 10 21 12 26 13 23 12 9 18 11 21 11 22 13 23 13 9 19 11 21 12 21 13 25 14 9 20 11 20 12 25 13 22 15 9 19 10 20 11 24 13 24 16 10 21 10 21 11 23 13 23 17 9 19 11 19 12 24 13 23 18 9 17 10 26 12 22 13 24 19 9 19 11 21 11 23 13 25 20 10 18 10 21 12 22 13 25 21 9 19 11 19 12 22 12 25 22 10 19 11 26 12 22 13 26 23 9 19 10 20 12 24 13 27 24 10 17 11 21 12 22 13 26 25 10 19 11 23 12 23 13 25 Sowewillcountgenerationsforwardintime,andforconvenienceletusrenumbergenera- tionssothattheinitialgenerationis‘generation0’. Thepopulationatgenerationt ≥0consists ofnindividualsdenotedbyIt,1,It,2,...,It,n. WecanpictureIt,1,It,2,...,It,n asdotsinan array as in Figure 1, with It,j being the jth dot in row t. The association of a number j to individual It,j is an arbitrary labeling of the individuals within generation t. Assigned only as a means of referring to individuals, the labels have no significance in the model, which does not order the individuals within a generation. Let µt,1,νt,1,µt,2,νt,2,...,µt,n,νt,n be independent and uniformly distributed on the set {1,...,n}. We interpret µt,j and νt,j as labels of the parents of individual It,j; that is, the parents of It,j are It−1,µt,j and It−1,νt,j. DefiningasequenceofrandomsetsGi,Gi,... recursivelybyGi ={i}and 0 1 0 Git ={j ≤n:µt,j ∈Git−1orνt,j ∈Git−1}, Git isthesetoflabelsofthedescendantsofI0,i ingenerationt. LetGit denotethecardinality ofGit. TheconditionalprobabilitythatindividualIt+1,j hasatleastoneparentamongtheGit membersofGit is P({µt+1,j ∈Git}∪{νt+1,j ∈Git}|Git)=(Git/n)+(Git/n)−(Git/n)(Git/n). Recentcommonancestors 1009 Theprocess{Git :t =0,1,...}isaMarkovchainwithtransitionprobabilities (cid:1) (cid:2) (cid:3) (cid:4) (Git+1 |Git)∼Bin n,2Gnit − Gnit 2 , (1) whereBin(n,p)denotesthebinomialdistributionforthenumberofsuccessesinnindepen- denttrialseachhavingsuccessprobabilityp. Throughout the proof, {Gt} will denote a Markov chain with transition probabilities as in (1), although in different parts of the proof we will consider different possible initial values G . For example, taking G = 1 corresponds to following the descendants of a particular 0 0 individualingeneration0. Intheearlystagesoftheprocess,whileGt remainssmallrelative ton,inviewof(1)theconditionaldistributionofGt+1givenGt isnearlyPoisson(2Gt),thatis, thePoissondistributionwithmean2Gt. Inotherwords,while{Gt}remainssmall,itevolves nearly as a Galton–Watson branching process {Yt} with offspring distribution Poisson(2). Ka¨mmerle (1991) gave a formal statement of a result of this nature. A special case of his result says that for fixed u, the joint distribution of (G0,G1,...,Gu) converges to that of (Y0,Y1,...,Yu)asn→∞. Forourpurposes,wewillusethefollowingresultthatallowsus toapproximateprobabilitiesfortheGprocessbythosefortheY processuptoahigherorder ofaccuracyandoverlongerintervalsoftimethatmayhaverandomlengths. Lemma3. Let Y ,Y ,... denote a Galton–Watson branching process with offspring dis- 0 1 tribution Poisson(2). Suppose that Y0 = G0 = 1. Define τbY = inf{t : Yt ≥ b} and τ0Yb =inf{t :Yt =0orYt ≥b},withcorrespondingdefinitionsforτbGandτ0Gb. Asn→∞,if mandbsatisfymb2 =o(n),then P{τG >m}=P{τY >m}(1+o(1)) (2) b b and P{τG >m}=P{τY >m}(1+o(1)). (3) 0b 0b Proof.Astraightforwardcalculationboundsthelikelihoodratio L(y |x):= P{Gt+1 =y |Gt =x} P{Yt+1 =y |Yt =x} P{Bin(n,(2x/n)−(x2/n2))=y} = ≤e2x(1−(2x/n)+(x2/n2))n−y, P{Poisson(2x)=y} sothat (cid:2) (cid:3) 2x x2 logL(y |x)≤2x+(n−y) − + ≤(x2+2xy)/n. n n2 ThisholdswheneverthedenominatorP{Yt+1 = y | Yt = x}ispositive,thatis,forallx > 0 andy ≥ 0,andalsoforx = y = 0. Thus,forallsuchpairsofx andy satisfyingx < band y <b,wehave logL(y |x)≤3b2/n. Asimilarcalculationgivesthelowerbound logL(y |x)≥−5b2/(2n)[1+O(b/n)], 1010 J.T.CHANG so that logL(y | x) ≥ −3b2/n for sufficiently large n. So if x1,...,xm are all less than b, then P{G1 =x1,...,Gm =xm}=P{G1 =x1 |G0 =1}···P{Gm =xm |Gm−1 =xm−1} =P{Y1 =x1,...,Ym =xm}L(x1 |1)···L(xm |xm−1) ≤P{Y1 =x1,...,Ym =xm}e3mb2/n and P{G1 =x1,...,Gm =xm}≥P{Y1 =x1,...,Ym =xm}e−3mb2/n. Thus, (cid:5) (cid:5) P{τbG >m}= ··· P{G1 =x1,...,Gm =xm} 0≤(cid:5)x1<b 0≤(cid:5)xm<b ≤ ··· P{Y1 =x1,...,Ym =xm}e3mb2/n 0≤x1<b 0≤xm<b =P{τY >m}e3mb2/n (4) b and, similarly, P{τG > m} ≥ P{τY > m}e−3mb2/n, so that, by the assumption that mb2 = b b o(n), we obtain P{τG > m} = P{τY > m}(1+o(1)). This proves (2). The proof of (3) b b uses the same reasoning, with the summations in (4) ranging over 0 < xt < b rather than 0≤xt <b. The previous result will be useful because the Poisson Galton–Watson process is simple andwellunderstood. Thenextlemmarecordsafewwellknownitemsforfuturereference. Lemma4. Let Y ,Y ,... denote a Galton–Watson process with offspring distribution 0 1 Poisson(2). Definethemomentgeneratingfunctionψ(z) = E(zY1) = e−2+2z. Theextinction probability ρ = P{Yt = 0 forsomet} ≈ 0.20319 is the smaller of the two solutions of ψ(ρ) = ρ, and ρ = γ/2, where γ is as defined in Theorem 2. The t-fold composition ψt =ψ ◦···◦ψ satisfiesψt(z)↑ρ forall0≤z≤ρ. Therelationρ =γ/2isconfirmedbycomparingthedefinitionsofρandγ. Despitethesimple relationship,wewillkeepthetwodifferentlettersinournotationforconceptualclarity. Defininggt =Gt/n,wehave E(gt+1 |gt)=2gt −gt2 =gt(2−gt). (5) That is, if the fraction of descendants of a given individual is currently gt, it is expected to multiplybyafactorof2−gt inthenextgeneration. Forexample, intheearlystagesofthe processwhenthefractiongt issmall,itnearlydoublesinexpectationinthenextgeneration. Forverysmallgt (oftheorder1/n,forexample)therandomvariabilityislarge;forexample, theprocesscouldeasilygoextinct. ThisiswhenitismostusefultoapproximatetheGprocess bythePoisson(2)Galton–Watsonprocess. On the other hand, for larger values of gt, the multiplication factor gt+1/gt, although expected to be somewhat smaller, has much less variability. The deviations of this factor from its expected value are bounded probabilistically by large deviations inequalities for the binomialdistribution. WewillusethefollowinginequalityofBernstein(1946)asabasictool. Recentcommonancestors 1011 Lemma5. (Bernstein’sinequality.) IfX ∼Bin(n,p)andr >0,then (cid:6) (cid:7) −r2 P{X ≥np+r}≤exp . (6) 2np(1−p)+(2/3)r Since n − X ∼ Bin(n,1 − p), the right-hand side of (6) is also an upper bound for the probabilityP{X ≤np−r}. 3.2. ProofofTheorem1 Outline.The proof will be divided into several parts. We start from generation 0 and trace forwardintime. Stage 1: By the end of Stage 1, we identify an individual I in generation 0 who has a numberofdescendantsthatissmallcomparedton,butlargeenoughsothatI isunlikelyever to become extinct. In particular, we look for a generation t such that some individual I in generation0hasatleastlg2(n)descendantsingenerationt. Withprobabilityapproaching1, thishappensintimeo(lgn),negligiblecomparedwithlgn; thisisshownbyusingLemma3 to approximate our process by a Poisson Galton–Watson process. The rest of the proof will show that with probability approaching 1, individual I becomes a CA within (1+()(lgn) generations,where( isanarbitrarypositivenumber. Stage2: Letβ ∈ (0,1). Stage2followsthedescendantsofI untilreachingageneration containingatleastnβ descendants. Inviewof(5),sincenβ isasmallfractionofnforlargen, throughoutStage2thenumberofdescendantsinagenerationisexpectedtobenearlydouble thenumberofdescendantsinthepreviousgeneration. Andlg2(n)islargeenoughsothatthe multiplicationfactorwillbeveryclosetoitsexpectedvalue,withhighprobability. SoStage2 shouldnottakemuchmorethanaboutlg(nβ)=βlg(n)generations. Stage 3: This stage brings the count of descendants of I up from nβ to (1/2)n. Since the fraction of descendants during Stage 3 stays below 1, the expected multiplication factor 2 is at least 2 − 1 = 3. Again, this multiplication factor is very reliable, so that with high 2 2 probability Stage 3 takes no more than about log3/2{(n/2)/(nβ)} generations. We can make thisanarbitrarilysmallfractionoflgnbychoosingβ closeenoughto1. Stage 4: Now we switch to looking at the fraction Bt of individuals in a generation who arenot descendantsofindividualI. Thisfractionisexpectedtosquareeachgeneration. This causesBt todecreaseveryquickly. Fixingα ∈(1,2),weshowthatStage4,whichtakesthe 2 3 fractionBt from 1 downton−α,takesonlyorderlglgntime. 2 Stage 5: This completes the process, ending when the B process hits 0, and individual I hasbecomeaCA.Weshowthatthistakesjustonegenerationwithhighprobability. Upper bound: Combining the results of Stages 1 through 5 gives the probabilistic upper boundlimn→∞P{Tn ≤(1+()lgn}=1. Lowerbound: Hereweshowthatlimn→∞P{Tn ≥(1−()lgn}=1. Thisisdonebyusing Bernstein’sinequalitytoproveanassertionofthefollowingform: forpositiver andδ,once theprocessofdescendantsofanygivenindividualreachesapowernr ofn,itisveryunlikely toincreasebyafactorofmorethan2+δ inageneration,whereasitwouldhavetodosoin ordertohaveTn <(1−()lgn. 3.2.1. Stage1.Herewewillshowthatwithhighprobability,withinanumberofgenerations negligiblecomparedtolgn,wecanfindagenerationwithatleastlg2nindividualswhoshare a common ancestor. For simplicity we give a crude argument that circumvents the need to
Description: