RESEARCH ARTICLE

Calibrating the Human Mutation Rate via Ancestral Recombination Density in Diploid Genomes

Mark Lipson¹*, Po-Ru Loh²,³, Sriram Sankararaman¹,³, Nick Patterson³, Bonnie Berger³,⁴, David Reich¹,³,⁵*

1 Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America, 2 Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, United States of America, 3 Medical and Population Genetics Program, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America, 4 Department of Mathematics and Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America, 5 Howard Hughes Medical Institute, Harvard Medical School, Boston, Massachusetts, United States of America

* [email protected] (ML), [email protected] (DR)

OPEN ACCESS

Citation: Lipson M, Loh P-R, Sankararaman S, Patterson N, Berger B, Reich D (2015) Calibrating the Human Mutation Rate via Ancestral Recombination Density in Diploid Genomes. PLoS Genet 11(11): e1005550. doi:10.1371/journal.pgen.1005550

Editor: Graham Coop, University of California Davis, UNITED STATES

Received: February 18, 2015
Accepted: September 3, 2015
Published: November 12, 2015

Copyright: © 2015 Lipson et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability Statement: All data have previously been made available as part of refs. 21 and 24.

Funding: ML acknowledges support from the Simons Foundation (www.simonsfoundation.org) and National Institutes of Health (www.nih.gov; grant R01GM108348, to BB). PRL was supported by National Institutes of Health fellowship F32 HG007805 and SS by National Institutes of Health grant K99GM111744. NP and DR were supported by National Science Foundation (www.nsf.gov) HOMINID grant #1032255 and National Institutes of Health grant GM100233. DR is an Investigator at the Howard Hughes Medical Institute (www.hhmi.org). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have declared that no competing interests exist.

PLOS Genetics | DOI:10.1371/journal.pgen.1005550 November 12, 2015

Abstract

The human mutation rate is an essential parameter for studying the evolution of our species, interpreting present-day genetic variation, and understanding the incidence of genetic disease. Nevertheless, our current estimates of the rate are uncertain. Most notably, recent approaches based on counting de novo mutations in family pedigrees have yielded significantly smaller values than classical methods based on sequence divergence. Here, we propose a new method that uses the fine-scale human recombination map to calibrate the rate of accumulation of mutations. By comparing local heterozygosity levels in diploid genomes to the genetic distance scale over which these levels change, we are able to estimate a long-term mutation rate averaged over hundreds or thousands of generations. We infer a rate of 1.61 ± 0.13 × 10⁻⁸ mutations per base per generation, which falls in between phylogenetic and pedigree-based estimates, and we suggest possible mechanisms to reconcile our estimate with previous studies. Our results support intermediate-age divergences among human populations and between humans and other great apes.

Author Summary

The rate at which new heritable mutations occur in the human genome is a fundamental parameter in population and evolutionary genetics. However, recent direct family-based estimates of the mutation rate have consistently been much lower than previous results from comparisons with other great ape species. Because split times of species and populations estimated from genetic data are often inversely proportional to the mutation rate, resolving the disagreement would have important implications for understanding human evolution. In our work, we apply a new technique that uses mutations that have accumulated over many generations on either copy of a chromosome in an individual’s genome. Instead of an external reference point, we rely on fine-scale knowledge of the human recombination rate to calibrate the long-term mutation rate. Our procedure accounts for possible errors found in real data, and we also show that it is robust to a range of model violations. Using eight diploid genomes from non-African individuals, we infer a rate of 1.61 ± 0.13 × 10⁻⁸ single-nucleotide changes per base per generation, which is intermediate between most phylogenetic and pedigree-based estimates. Thus, our estimate implies reasonable, intermediate-age population split times across a range of time scales.

Introduction

All genetic variation, the substrate for evolution, is ultimately due to spontaneous heritable mutations in the genomes of individual germline cells. The most commonly studied mutations are point mutations, which consist of single-nucleotide changes from one base to another. The rate at which these changes occur, in combination with other forces, determines the frequency with which homologous nucleotides differ from one individual’s genome to another.

A number of different approaches have previously been used to estimate the human mutation rate [1–3], of which we mention four categories here. The first method is to count the number of fixed genetic changes between humans and another species, such as chimpanzees [4]. Population genetic theory implies that if the mutation rate remains constant, then neutral mutations (those that do not affect an organism’s fitness) should accumulate between two genomes at a constant rate (the well-known “molecular clock” [5]). Thus, the mutation rate can be estimated based on the divergence time of the genomes, if this can be confidently inferred from fossil evidence. However, even if the age of fossil remains can be accurately determined, assigning their proper phylogenetic positions is often difficult. Moreover, because of shared ancestral polymorphism, the time to the most recent common ancestor is always older (and sometimes far older) than the time of species divergence, meaning that split-time calibrations cannot always be directly applied to genetic divergences.
A second common approach, which has only become possible within the last few years, is to count newly occurring mutations in deep sequencing data from family pedigrees, especially parent-child trios [6–10]. This approach provides a direct estimate but can be technically challenging, as it is sensitive to genotype accuracy and data processing from high-throughput sequencing. In particular, sporadic sequencing and alignment errors can be difficult to distinguish from true de novo mutations. Surprisingly, these sequencing-based estimates have consistently been much lower than those based on the first approach: in the neighborhood of 1–1.2 × 10⁻⁸ per base per generation, as opposed to 2–2.5 × 10⁻⁸ for those from long-term divergence [1–3].

A third method, and another that is only now becoming possible, is to make direct comparisons between present-day samples and precisely dated ancient genomes. This method is similar to the first one, but by using two time-separated samples from the same species, it avoids the difficulty of needing an externally inferred split time. A recent study of a high-coverage genome sequence from a 45,000-year-old Upper Paleolithic modern human produced two estimates of this type [11]. Direct measurement of decreased mutational accumulation in this sample led to rate estimates of 0.44–0.63 × 10⁻⁹ per base per year (range of 14 estimates), or 1.3–1.8 × 10⁻⁸ per base per generation (assuming 29 years per generation [12]). An alternative technique, leveraging time shifts in historical population sizes, yielded an estimate of 0.38–0.49 × 10⁻⁹ per base per year (95% confidence interval), or 1.1–1.4 × 10⁻⁸ per base per generation, although a re-analysis of different mutational classes led to a total estimate of 0.44–0.59 × 10⁻⁹ per base per year (1.3–1.7 × 10⁻⁸), in better agreement with the first approach [11].
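The per-year and per-generation figures quoted for the ancient-genome study are related by a single multiplication by the assumed generation time. The following minimal sketch reproduces that conversion; the function name and dictionary labels are illustrative choices, not taken from the original study.

```python
# Convert per-base, per-year mutation rates into per-base, per-generation rates.
YEARS_PER_GENERATION = 29  # generation time assumed in the text [12]


def per_generation(rate_per_year, years_per_generation=YEARS_PER_GENERATION):
    """Per-generation rate implied by a per-year rate and a generation time."""
    return rate_per_year * years_per_generation


# (low, high) per-year ranges quoted in the text; labels are illustrative.
ranges_per_year = {
    "direct": (0.44e-9, 0.63e-9),
    "population_size_shifts": (0.38e-9, 0.49e-9),
    "reanalysis": (0.44e-9, 0.59e-9),
}

converted = {
    label: tuple(per_generation(x) for x in bounds)
    for label, bounds in ranges_per_year.items()
}

for label, (low, high) in converted.items():
    # Report in units of 1e-8 per base per generation, to one decimal place.
    print(f"{label}: {low / 1e-8:.1f}-{high / 1e-8:.1f} x 1e-8 per generation")
```

Rounded to one decimal place, the converted bounds agree with the per-generation ranges given in the text.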
Finally, a fourth technique is to calibrate the rate of accumulation of mutations using a separate evolutionary rate that is better measured. In one such study, the authors used a model coupling single-nucleotide mutations to mutations in nearby microsatellite alleles to infer a single-nucleotide rate of 1.4–2.3 × 10⁻⁸ per base per generation (90% confidence interval) [13]. In principle, this general technique is appealing because it only involves intrinsic information, without any reference points, and yet can leverage the signal of mutations that have occurred over many generations.

In this study, we present a new approach that falls into this fourth category: we calibrate the mutation rate against the rate of meiotic recombination events, which has been measured with high precision in humans [14–17]. Intuitively, our method makes use of the following relationship between the mutation and recombination rates. At every site i in a diploid genome, the two copies of the base have some time to most recent common ancestor (TMRCA) T_i, measured in generations. The genome can be divided into blocks of sequence that have been inherited together from the same common ancestor, with different blocks separated by ancestral recombinations. If a given block has a TMRCA of T and a length of L bases, and if μ is the per-generation mutation rate per base, then the expected number of mutations that have accumulated in either copy of that block since the TMRCA is 2TLμ. This is the expected number of heterozygous sites that we observe in the block today (disregarding the possibility of repeat mutations). We also know that if the per-generation recombination rate is r per base, then the expected length of the block is (2Tr)⁻¹. Thus, the expected number of heterozygous sites per block (regardless of age) is μ/r.
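The cancellation of T in this argument can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' procedure: block TMRCAs are drawn from an arbitrary uniform range, block lengths are drawn as exponential with mean (2Tr)⁻¹, and the recombination rate of 10⁻⁸ per base is a hypothetical round value. Averaging the conditional expectation 2TLμ over blocks should recover μ/r whatever distribution of T is chosen.

```python
import random


def mean_hets_per_block(mu, r, n_blocks=100_000, seed=1):
    """Monte Carlo average of the expected heterozygous sites per block."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_blocks):
        # TMRCA in generations; the uniform range is an arbitrary assumption.
        t = rng.uniform(500, 5000)
        # Block length in bases: exponential with rate 2*T*r, mean 1/(2*T*r).
        length = rng.expovariate(2 * t * r)
        # Expected number of mutations on either copy since the TMRCA.
        total += 2 * t * length * mu
    return total / n_blocks


mu = 1.61e-8  # per base per generation (the estimate reported in the paper)
r = 1.0e-8    # per base per generation (hypothetical round value)
estimate = mean_hets_per_block(mu, r)
print(f"simulated mean: {estimate:.3f}, mu / r: {mu / r:.3f}")
```

Changing the range from which T is drawn leaves the average essentially unchanged, which is the sense in which the result holds "regardless of age".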
This relationship allows us to estimate μ given a good prior knowledge of r. Our full method is more complex but is based on the same principle. We show below how we can capture the signal of heterozygosity per recombination to infer the historical per-generation mutation rate for non-African populations over approximately the last 50–100 thousand years (ky). A broadly similar idea is also applied in an independent study [18], but over a more recent time scale (up to ~3 ky, via mutations present in inferred identical-by-descent segments), and the two final estimates are in very good agreement.

Results

Overview of methods

One difficulty of the simple method outlined above is that in practice we cannot accurately reconstruct the breakpoints between adjacent non-recombined blocks. Instead, we use an indirect statistic that captures information about the presence of breakpoints but can be computed in a simple way (without directly inferring blocks) and averaged over many loci (Fig 1).

Starting from a certain position in the genome, the TMRCA of the two haploid chromosomes as a function of distance in either direction is a step function, with changes at ancestral recombination points (Fig 1A). Heterozygosity, being proportional to TMRCA in expectation (and directly observable), follows the same pattern on average (Fig 1B).

If we consider a collection of starting positions having similar local heterozygosities, then as a function of the genetic distance d away from them, the average heterozygosity displays a smooth relaxation from the common starting value toward the global mean heterozygosity H̄ as the probability increases of having encountered recombination points (Fig 1C). We define a statistic H_S(d) that equals this average heterozygosity, where S is a set of starting points indexed by the local number of heterozygous sites per 100 kb (we also use S at times to refer to the heterozygosity range itself). The TMRCAs of these points determine the time scale over which our inferred value of μ is measured. Our default choice is to use starting points with a local total of 5–10 heterozygous sites per 100 kb (see Methods).
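To make the definition of H_S(d) concrete, the following toy sketch computes a coarse analogue of the statistic. It is a deliberate simplification and not the procedure described in Methods: it uses non-overlapping 100 kb windows, measures distance in whole windows rather than in genetic distance (which amounts to assuming a uniform recombination rate), and the function names and example counts are hypothetical.

```python
def window_counts(het_positions, chrom_length, window=100_000):
    """Count heterozygous sites in consecutive non-overlapping windows."""
    counts = [0] * (chrom_length // window)
    for pos in het_positions:
        idx = pos // window
        if idx < len(counts):
            counts[idx] += 1
    return counts


def h_s_curve(counts, s_range=(5, 10), max_offset=5, window=100_000):
    """Average heterozygosity at each window offset from the starting windows.

    Starting windows are those whose het count falls inside s_range (inclusive).
    """
    low, high = s_range
    starts = [i for i, c in enumerate(counts) if low <= c <= high]
    curve = []
    for k in range(max_offset + 1):
        values = []
        for i in starts:
            # Look both upstream and downstream; the set avoids counting
            # the starting window twice when k == 0.
            for j in {i - k, i + k}:
                if 0 <= j < len(counts):
                    values.append(counts[j] / window)
        curve.append(sum(values) / len(values) if values else float("nan"))
    return curve


# Hypothetical het counts per 100 kb window along one chromosome.
example_counts = [6, 8, 30, 30, 30, 30]
curve = h_s_curve(example_counts, max_offset=2)
print(curve)
```

In this toy example the curve starts at the mean heterozygosity of the selected windows and moves toward the higher heterozygosity of the surrounding windows as the offset grows, which is the qualitative relaxation that the statistic is designed to capture.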
Fig 1. Explanation of the statistic H_S(d). (A) Ancestral recombinations separate chromosomes into blocks of piecewise-constant TMRCA (and hence expected heterozygosity). (B) From the data, we measure local heterozygosity as a function of genetic distance; red and blue circles represent heterozygous and homozygous sites, respectively, along a diploid genome. (C) Our statistic H_S(d) is an average heterozygosity as a function of genetic distance over many starting points with similar local heterozygosities, yielding a smooth relaxation toward the genome-wide average.
doi:10.1371/journal.pgen.1005550.g001

To estimate μ, we use the fact that the probability of having encountered a recombination as one moves away from a starting point is a function of both d and the starting heterozygosity H_S(0), since smaller values of H_S(0) correspond to smaller TMRCAs, with less time for recombination to have occurred, and hence longer unbroken blocks. This relationship allows us to calibrate μ against the recombination rate r via the relaxation rate of H_S(d). Our inference procedure involves using coalescent simulations to create matching “calibration data” with known values of μ and then solving for the best-fit mutation rate for the test data (see Methods and Fig 2). We note that when comparing H_S(d) for real data to the calibration curves, a larger value of μ will correspond to a lower curve. This is because H_S(0) is fixed, which means that the TMRCAs at the starting points are proportionally lower for larger values of μ. Thus, recombinations are less frequent as a function of d, leading to a slower relaxation.

In order for our inferences to be accurate, the calibration curves must recapitulate as closely as possible all aspects of the real data that could affect H_S(d) (see Methods, S1 Text, and Fig 2). First, because coalescent probabilities depend on ancestral population sizes, we use PSMC [19] to learn the demographic history of our samples. Next, we adapt a previously developed technique [20] to infer the fine-scale uncertainty of our genetic map. Finally, we correct our raw inferred values of μ for three additional factors in order to isolate the desired mutational signal: (1) we multiply by a correction for genotype errors; (2) we subtract the contribution of non-crossover gene conversion, using a result from [21] adjusted for local recombination rate; and (3) we scale the final value to correspond to genome-wide base content and mutability (see Methods and S1 Text). We also test additional potential model violations through simulations (see S1 Text and S1 Fig). We account for statistical uncertainty using a block jackknife and incorporate confidence intervals for model parameters; all results are given as mean ± standard error.

Simulations

First, for seven different scenarios, including a range of possible model violations, we generated 20 simulated diploid genomes with a known true mutation rate (μ = 2.5 × 10⁻⁸ per generation except where otherwise specified) and ran our procedure as we would for real data, with perturbed genetic maps for both the test data and calibration data (variance parameter α = 3000 M⁻¹; see Methods). To measure the uncertainty in our estimates, we performed 25 independent trials of each simulation, and we also compared the standard deviations of the estimates across trials with jackknife-based standard errors (as we would measure uncertainty for real data). Full details of the simulation procedures can be found in Methods and S1 Text.
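The jackknife-based standard errors mentioned above can be illustrated with a generic delete-one block jackknife. The sketch below assumes equally weighted blocks and a user-supplied estimator; the blocking scheme and any weighting used in the actual analysis are described in Methods and are not reproduced here. The per-block values are made-up numbers for illustration only.

```python
import math
import statistics


def delete_one_jackknife(block_values, estimator):
    """Return (full-data estimate, jackknife standard error).

    Each block is left out in turn and the estimator is recomputed on the rest.
    Equal block weights are assumed.
    """
    n = len(block_values)
    full = estimator(block_values)
    leave_one_out = [
        estimator(block_values[:i] + block_values[i + 1:]) for i in range(n)
    ]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((x - mean_loo) ** 2 for x in leave_one_out)
    return full, math.sqrt(variance)


# Hypothetical per-block estimates (arbitrary units), for illustration only.
values = [1.4, 1.7, 1.6, 1.8, 1.5]
estimate, std_err = delete_one_jackknife(values, lambda xs: sum(xs) / len(xs))

# For the sample mean, the jackknife reduces to the usual standard error.
expected = statistics.stdev(values) / math.sqrt(len(values))
print(estimate, std_err, expected)
```

When the estimator is the sample mean, the jackknife standard error coincides with the familiar s/√n, which gives a simple sanity check before applying the same routine to a nonlinear estimator.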
In all cases, the H_{5–10}(d) curves matched quite well between the test data and the calibration data, and our final results were within two standard errors of the true rate (Fig 3). Furthermore, our jackknife estimates of the standard error were comparable to the realized standard deviations and on average conservative, especially for the most complex simulation (g), despite not incorporating PSMC uncertainty (see Methods): 0.08 × 10⁻⁸, 0.04 × 10⁻⁸, 0.04 × 10⁻⁸, 0.06 × 10⁻⁸, 0.09 × 10⁻⁸, 0.05 × 10⁻⁸, and 0.11 × 10⁻⁸, respectively, for the seven scenarios (see Fig 3 for empirical standard deviations). The fact that all of the inferred rates are close to the true values leads us to conclude that none of the aspects of the basic procedure or the tested model violations create a substantial bias.

Fig 2. Illustration of the steps of our inference procedure. (A) Overview: from the data, we compute both the statistic H_S(d) and other parameters necessary to create matching calibration curves with known values of μ. (B) Details of capturing aspects of the real data for the calibration data. (C) Computation of H_S(d): the statistic captures the average heterozygosity as a function of genetic distance d from a starting point with heterozygosity in a defined range S, averaged over many such points. (D) For the final inferred value of μ, we compare matched H_S(d) curves for the real data and calibration data (with known values of μ).
doi:10.1371/journal.pgen.1005550.g002

Fig 3. Results for simulated data. Means and standard deviations of 25 independent trials are given, and the curves displayed are for representative runs matching the 25-trial means. The true simulated rate is μ = 2.5 × 10⁻⁸ unless otherwise specified. (A) Baseline simulated data; the inferred rate is μ = 2.47 ± 0.05 × 10⁻⁸. (B) Basic simulated data with a true rate of 1.5 × 10⁻⁸; the inferred rate is μ = 1.57 ± 0.04 × 10⁻⁸. (C) Data with a true rate of 1.5 × 10⁻⁸ plus gene conversion; the inferred rate is μ = 1.49 ± 0.05 × 10⁻⁸ (corrected from a raw value of 1.70 × 10⁻⁸ with gene conversion included). (D) Data with simulated genotype errors; the inferred rate is μ = 2.39 ± 0.06 × 10⁻⁸ (corrected from a raw value of 2.71 × 10⁻⁸ with genotype errors included). (E) Data simulated with variable mutation rate; the inferred rate is μ = 2.61 ± 0.08 × 10⁻⁸. (F) Data from a simulated admixed population; the inferred rate is μ = 2.57 ± 0.07 × 10⁻⁸. (G) Simulated data with all three complications as in (D)–(F); the inferred rate is μ = 2.53 ± 0.06 × 10⁻⁸ (corrected from a raw value of 2.77 × 10⁻⁸).
doi:10.1371/journal.pgen.1005550.g003

Error parameters

Before obtaining mutation rate estimates from real data, we quantified two important error parameters: the rate of false heterozygous genotype calls and the degree of inaccuracy in our genetic map.

We estimated the genotype error rate by taking advantage of the fact that methylated cytosines at CpG dinucleotides are roughly an order of magnitude more mutable than other bases [3, 7, 8, 10] (see Methods). Thus, such mutations are strongly over-represented among true heterozygous sites as compared to falsely called heterozygous sites. By counting the proportion of CpG mutations out of all heterozygous sites around our ascertained starting points, we inferred an error rate of approximately 1 per 100 kb (1.08 ± 0.28 × 10⁻⁵ per base; see Methods and S1 Text), consistent with previous results [22].
It was also necessary for us to estimate the accuracy of our genetic map. We used the “shared” version of the African-American (AA) map from [17] as our base map and a modified version of the error model of [20]: Z ~ Gamma(αγ(g + πp), α), where Z is the true genetic length of a map interval, g is the observed genetic length, p is the physical length, α is the parameter measuring the accuracy of the map, and γ and π are constants (see Methods). Based on pedigree crossover data from [23], we estimated α = 2802 ± 14 M⁻¹ for the full AA map and α = 3414 ± 13 M⁻¹ for the “shared” map, which should serve as lower and upper bounds (see Methods). For our analyses, we took α = 3100 M⁻¹ (with a standard error of 300 M⁻¹ to account for our uncertainty in the precise value). This means that 1/α ≈ 0.03 cM can be thought of as the length scale for the accuracy of genetic distances according to the base map (see Methods for details). In order to translate the uncertainty in α into its effect on the inferred μ, we repeated our primary analysis with a range of alternative values of α (S2 Fig).

We note that the values of α reported in [20] are substantially lower than ours, which we suspect is because our validation data have much finer resolution than those used previously. (When using the same validation data, the “shared” and HapMap LD [15] maps appear to be relatively similar in accuracy.) If we substitute our new α values for the original application of inferring the date of Neanderthal gene flow into modern humans, we obtain a less distant time in the past, 28–65 ky (most likely 35–49 ky), versus 37–86 ky (most likely 47–65 ky) reported in [20]. While relatively recent, this date range is not in conflict with archaeological evidence or with an estimate of 49–60 ky (95% confidence interval) based on an Upper Paleolithic genome [11].
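The map error model introduced earlier in this section, Z ~ Gamma(αγ(g + πp), α), can be sampled directly to see what a given α implies. In the sketch below the gamma distribution is read as having shape αγ(g + πp) and rate α, so that its mean is γ(g + πp). The constants γ and π are given only in Methods; the values γ = 1 and π = 0 used here are placeholders chosen purely for illustration, as are the interval lengths.

```python
import random

ALPHA = 3100.0   # map accuracy parameter from the text, in 1/Morgan
GAMMA = 1.0      # placeholder value; the actual constant is given in Methods
PI_CONST = 0.0   # placeholder value; the actual constant is given in Methods


def sample_true_length(g, p, rng, alpha=ALPHA, gamma=GAMMA, pi_const=PI_CONST):
    """Draw a 'true' genetic length Z (Morgans) for one map interval.

    g is the observed genetic length in Morgans and p the physical length in
    bases. random.gammavariate takes (shape, scale), so the rate alpha enters
    as scale = 1 / alpha.
    """
    shape = alpha * gamma * (g + pi_const * p)
    return rng.gammavariate(shape, 1.0 / alpha)


rng = random.Random(0)
g_obs, p_len = 0.001, 100_000  # a hypothetical 0.1 cM, 100 kb interval
draws = [sample_true_length(g_obs, p_len, rng) for _ in range(100_000)]
mean_z = sum(draws) / len(draws)

length_scale_cm = 100.0 / ALPHA  # 1/alpha converted from Morgans to cM
print(f"mean Z = {mean_z:.6f} M, 1/alpha = {length_scale_cm:.4f} cM")
```

With these placeholder constants the sampled lengths average to the observed length, and 1/α evaluates to roughly 0.03 cM, matching the length scale quoted in the text.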
Estimates for Europeans and East Asians

Our primary results (Fig 4) were obtained from eight diploid genomes of European and East Asian individuals (two each French, Sardinian, Han, and Dai) using our standard parameter settings (see above and Methods). For all real-data applications, to minimize noise from the randomized elements of the procedure (namely, coalescent simulation and generation of the perturbed calibration map), we averaged 25 independent calibrations of the data to obtain our final point estimate. With all eight individuals combined, we estimated a mutation rate of μ = 1.61 ± 0.13 × 10⁻⁸ per generation (Fig 4A). Using this value of μ, our starting heterozygosity H_S(0) ≈ 7.4 × 10⁻⁵ corresponds to a TMRCA of approximately 1550–3100 generations, or 45–90 ky, assuming an average generation time of 29 years [12].

It is possible that our full estimate could be slightly inaccurate due to population-level differences in either the fine-scale genetic map or demographic history (see S1 Text). However, we expect Europeans and East Asians to be compatible in our procedure both because they are not too distantly related and because they have similar population size histories [19, 24]. To test empirically the effects of combining the populations, we estimated rates for the four Europeans and four East Asians separately (Fig 4B and 4C). Using the same genotype error corrections, we found that the H_{5–10}(d) curves as well as the final inferred values were similar to those for the full data: μ = 1.72 ± 0.14 × 10⁻⁸ for Europeans and μ = 1.55 ± 0.14 × 10⁻⁸ for East Asians. Thus, in conjunction with our simulation results, it appears that the full eight-genome estimate is robust to the effects of population heterogeneity.
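The TMRCA range quoted earlier in this section can be recovered from the relation between heterozygosity and coalescence time, H = 2Tμ. The sketch below applies T = H/(2μ) to the bounds of the default starting range (5–10 heterozygous sites per 100 kb). Treating the quoted range as arising in exactly this way, and ignoring the small contribution of genotype errors to observed heterozygosity, are assumptions made for illustration.

```python
MU = 1.61e-8               # per base per generation (combined estimate)
YEARS_PER_GENERATION = 29  # generation time assumed in the text [12]
WINDOW = 100_000           # bases per window


def tmrca_generations(heterozygosity, mu=MU):
    """TMRCA implied by a per-base heterozygosity, using H = 2 * T * mu."""
    return heterozygosity / (2 * mu)


results = {}
for het_sites in (5, 10):
    h = het_sites / WINDOW
    gens = tmrca_generations(h)
    results[het_sites] = (gens, gens * YEARS_PER_GENERATION / 1000)
    print(f"{het_sites} het sites per 100 kb: "
          f"{gens:.0f} generations, {results[het_sites][1]:.0f} ky")

# The average starting heterozygosity quoted in the text lies inside this range.
mid_gens = tmrca_generations(7.4e-5)
print(f"H_S(0) = 7.4e-5: {mid_gens:.0f} generations")
```

The two bounds land close to 1550 and 3100 generations (about 45 and 90 ky), and the quoted average starting heterozygosity falls between them.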
Fig 4. Results for Europeans and East Asians. (A) All eight individuals together; the inferred rate is μ = 1.61 ± 0.13 × 10⁻⁸ per generation. (B) Results for the four Europeans; the inferred rate is μ = 1.72 ± 0.14 × 10⁻⁸. (C) Results for the four East Asians; the inferred rate is μ = 1.55 ± 0.14 × 10⁻⁸. For all real-data results, the curves displayed are for representative calibrations matching the overall means. The reported values are also corrected for gene conversion, genotype error, and base content, which explains the apparent discrepancy between the final estimates and the curves (for example, the estimate in (A) is corrected from a raw value of 2.00 × 10⁻⁸).
doi:10.1371/journal.pgen.1005550.g004

Additionally, to investigate the influence of different mutational types, we estimated rates separately for CpG transitions and all other mutations (see Methods). We inferred values of μ = 0.50 ± 0.06 × 10⁻⁸ for CpGs and μ = 1.36 ± 0.13 × 10⁻⁸ for non-CpGs (S3 Fig), with a sum (1.87 ± 0.14 × 10⁻⁸) that is somewhat higher than our full-data estimate. Since CpG transitions are known to comprise approximately 17–18% of all mutations [8], our full-data and non-CpG estimates appear to be in very good agreement, whereas the CpG-only estimate is likely inflated, perhaps because our method performs poorly with the low density of heterozygous sites (only 1 per 100 kb window for our CpG-only starting points). As a result, we believe that our value of μ = 1.61 × 10⁻⁸ is accurate, or at most slightly underestimated, as a total mutation rate for all sites.
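The consistency argument for the CpG and non-CpG estimates can be followed with simple arithmetic. The sketch below sums the two rounded estimates, combines their standard errors in quadrature (which assumes the two errors are independent, an assumption not stated in the text), and asks what total rate each class would imply if CpG transitions make up 17–18% of all mutations.

```python
import math

# Rounded estimates from the text, in units of 1e-8 per base per generation.
cpg, cpg_se = 0.50, 0.06
non_cpg, non_cpg_se = 1.36, 0.13

total = cpg + non_cpg
# Quadrature sum of standard errors; assumes independent errors.
total_se = math.hypot(cpg_se, non_cpg_se)

# Total rate implied by each class if CpG transitions are 17-18% of mutations.
cpg_fractions = (0.17, 0.18)
implied_from_non_cpg = [non_cpg / (1 - f) for f in cpg_fractions]
implied_from_cpg = [cpg / f for f in cpg_fractions]

print(f"sum = {total:.2f} +/- {total_se:.2f}")
print("total implied by non-CpG:", [f"{x:.2f}" for x in implied_from_non_cpg])
print("total implied by CpG:", [f"{x:.2f}" for x in implied_from_cpg])
```

The sum of the rounded values differs from the quoted 1.87 in the last digit, presumably because the text sums unrounded estimates. The total implied by the non-CpG estimate lies close to the full-data value of 1.61, while the total implied by the CpG-only estimate is much larger, consistent with the interpretation that the CpG-only estimate is inflated.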
Estimates for other populations

We also ran our procedure for three other non-African populations: aboriginal Australians, Karitiana (an indigenous group from Brazil), and Papua New Guineans. Using two genomes per population and computing curves for starting regions with 1–15 heterozygous sites per 100 kb (to increase the number of test regions, with a potential trade-off in accuracy), we inferred rates of μ = 1.86 ± 0.19 × 10⁻⁸, μ = 1.37 ± 0.19 × 10⁻⁸, and μ = 1.62 ± 0.17 × 10⁻⁸ for Australian, Karitiana, and Papuan, respectively (Fig 5). We note that the relatively high (but not statistically significantly different) per-generation value for Australians is consistent with the high average ages of fathers in many aboriginal Australian societies [12, 25]. Overall, given the expected small differences for historical, cultural, or biological reasons (including, as mentioned above, our use of the same “shared” genetic map for all groups), we do not see evidence of substantial errors or biases in our procedure when applied to diverse populations.

Discussion

Using a new method for estimating the human mutation rate, we have obtained a genome-wide estimate of μ = 1.61 ± 0.13 × 10⁻⁸ single-nucleotide mutations per generation. Our approach counts mutations that have arisen over many generations (a few thousand, i.e., several tens of thousands of years) and relies on our excellent knowledge of the human recombination rate to calibrate the length of the relevant time period.

We have shown that our estimate is robust to many possible confounding factors (S1 Fig).
In addition to statistical noise in the data, our method directly accounts for ancestral gene conversion and for errors in genotype calls and in the genetic map. We have also demonstrated, based on simulations, that heterogeneity in demographic and genetic parameters, including the mutation rate itself, does not cause an appreciable bias. However, we acknowledge that our estimate requires a large number of modeling assumptions, and while we have attempted to justify each step of our procedure and to incorporate uncertainty at each stage into our final standard error, it is possible that we have not precisely captured the influence of every confounder. Similarly, while we consider a broad range of possible sources of error, we cannot guarantee that there might not be others that we have neglected.

The meaning of an average rate

It is important to note that the mutation rate is not constant at all sites in the genome [26]. As we have discussed, we believe that this variability does not cause a substantial bias in our inferences, but to the extent that some bases mutate faster than others, a rate is only meaningful when associated with the set of sites for which it is estimated. For example, methylated cytosines at CpG positions accumulate point mutations roughly an order of magnitude faster than other bases because of spontaneous deamination [3, 7, 8, 10]. Such effects can lead to larger-scale patterns, such as the higher mutability of exons as compared to the genome as a whole [27].

In our work, we filter the data substantially, removing more than a third of the sites in the genome. The filters tend to reduce the heterozygosity of the remaining portions [24, 28], which is to be expected if they have the effect of preferentially removing false heterozygous sites. We also make a small adjustment to our final value of μ to account for differences in base composition between our ascertained starting points and the (filtered) genome as a whole (see Methods). For reference, in S1 Table, we give heterozygosity levels and human–chimpanzee divergence statistics for sites passing our filters, i.e., the subset of the genome for which our inferred rates are applicable.
