RESEARCH ARTICLE
Calibrating the Human Mutation Rate via Ancestral Recombination Density in Diploid Genomes
Mark Lipson¹*, Po-Ru Loh²,³, Sriram Sankararaman¹,³, Nick Patterson³, Bonnie Berger³,⁴, David Reich¹,³,⁵*
1 Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America, 2 Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts, United States of America, 3 Medical and Population Genetics Program, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America, 4 Department of Mathematics and Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America, 5 Howard Hughes Medical Institute, Harvard Medical School, Boston, Massachusetts, United States of America
* mlipson@genetics.med.harvard.edu (ML), reich@genetics.med.harvard.edu (DR)
OPEN ACCESS
Abstract
The human mutation rate is an essential parameter for studying the evolution of our species, interpreting present-day genetic variation, and understanding the incidence of genetic disease. Nevertheless, our current estimates of the rate are uncertain. Most notably, recent approaches based on counting de novo mutations in family pedigrees have yielded significantly smaller values than classical methods based on sequence divergence. Here, we propose a new method that uses the fine-scale human recombination map to calibrate the rate of accumulation of mutations. By comparing local heterozygosity levels in diploid genomes to the genetic distance scale over which these levels change, we are able to estimate a long-term mutation rate averaged over hundreds or thousands of generations. We infer a rate of 1.61 ± 0.13 × 10⁻⁸ mutations per base per generation, which falls in between phylogenetic and pedigree-based estimates, and we suggest possible mechanisms to reconcile our estimate with previous studies. Our results support intermediate-age divergences among human populations and between humans and other great apes.
Citation: Lipson M, Loh P-R, Sankararaman S, Patterson N, Berger B, Reich D (2015) Calibrating the Human Mutation Rate via Ancestral Recombination Density in Diploid Genomes. PLoS Genet 11(11): e1005550. doi:10.1371/journal.pgen.1005550
Editor: Graham Coop, University of California Davis, UNITED STATES
Received: February 18, 2015
Accepted: September 3, 2015
Published: November 12, 2015
Copyright: © 2015 Lipson et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability Statement: All data have previously been made available as part of refs. 21 and 24.
Author Summary
The rate at which new heritable mutations occur in the human genome is a fundamental parameter in population and evolutionary genetics. However, recent direct family-based estimates of the mutation rate have consistently been much lower than previous results from comparisons with other great ape species. Because split times of species and populations estimated from genetic data are often inversely proportional to the mutation rate, resolving the disagreement would have important implications for understanding human evolution. In our work, we apply a new technique that uses mutations that have accumulated over many generations on either copy of a chromosome in an individual’s genome. Instead of an external reference point, we rely on fine-scale knowledge of the human recombination rate to calibrate the long-term mutation rate. Our procedure accounts for possible errors found in real data, and we also show that it is robust to a range of model violations. Using eight diploid genomes from non-African individuals, we infer a rate of 1.61 ± 0.13 × 10⁻⁸ single-nucleotide changes per base per generation, which is intermediate between most phylogenetic and pedigree-based estimates. Thus, our estimate implies reasonable, intermediate-age population split times across a range of time scales.
Funding: ML acknowledges support from the Simons Foundation (www.simonsfoundation.org) and National Institutes of Health (www.nih.gov; grant R01GM108348, to BB). PRL was supported by National Institutes of Health fellowship F32 HG007805 and SS by National Institutes of Health grant K99GM111744. NP and DR were supported by National Science Foundation (www.nsf.gov) HOMINID grant #1032255 and National Institutes of Health grant GM100233. DR is an Investigator at the Howard Hughes Medical Institute (www.hhmi.org). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.
PLOS Genetics | DOI:10.1371/journal.pgen.1005550 November 12, 2015
Introduction
All genetic variation, the substrate for evolution, is ultimately due to spontaneous heritable mutations in the genomes of individual germline cells. The most commonly studied mutations are point mutations, which consist of single-nucleotide changes from one base to another. The rate at which these changes occur, in combination with other forces, determines the frequency with which homologous nucleotides differ from one individual’s genome to another.
A number of different approaches have previously been used to estimate the human mutation rate [1–3], of which we mention four categories here. The first method is to count the number of fixed genetic changes between humans and another species, such as chimpanzees [4]. Population genetic theory implies that if the mutation rate remains constant, then neutral mutations (those that do not affect an organism’s fitness) should accumulate between two genomes at a constant rate (the well-known “molecular clock” [5]). Thus, the mutation rate can be estimated based on the divergence time of the genomes, if this can be confidently inferred from fossil evidence. However, even if the age of fossil remains can be accurately determined, assigning their proper phylogenetic positions is often difficult. Moreover, because of shared ancestral polymorphism, the time to the most recent common ancestor is always older (and sometimes far older) than the time of species divergence, meaning that split-time calibrations cannot always be directly applied to genetic divergences.
A second common approach, which has only become possible within the last few years, is to count newly occurring mutations in deep sequencing data from family pedigrees, especially parent-child trios [6–10]. This approach provides a direct estimate but can be technically challenging, as it is sensitive to genotype accuracy and data processing from high-throughput sequencing. In particular, sporadic sequencing and alignment errors can be difficult to distinguish from true de novo mutations. Surprisingly, these sequencing-based estimates have consistently been much lower than those based on the first approach: in the neighborhood of 1–1.2 × 10⁻⁸ per base per generation, as opposed to 2–2.5 × 10⁻⁸ for those from long-term divergence [1–3].
A third method, and another that is only now becoming possible, is to make direct comparisons between present-day samples and precisely-dated ancient genomes. This method is similar to the first one, but by using two time-separated samples from the same species, it avoids the difficulty of needing an externally inferred split time. A recent study of a high-coverage genome sequence from a 45,000-year-old Upper Paleolithic modern human produced two estimates of this type [11]. Direct measurement of decreased mutational accumulation in this sample led to rate estimates of 0.44–0.63 × 10⁻⁹ per base per year (range of 14 estimates), or 1.3–1.8 × 10⁻⁸ per base per generation (assuming 29 years per generation [12]). An alternative technique, leveraging time shifts in historical population sizes, yielded an estimate of 0.38–0.49 × 10⁻⁹ per base per year (95% confidence interval), or 1.1–1.4 × 10⁻⁸ per base per generation, although a re-analysis of different mutational classes led to a total estimate of 0.44–0.59 × 10⁻⁹ per base per year (1.3–1.7 × 10⁻⁸), in better agreement with the first approach [11].
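The per-generation ranges quoted for the ancient-genome estimates follow from multiplying each per-year rate by the assumed 29-year generation time. A minimal Python sketch of that conversion, with a function name chosen here purely for illustration:

```python
GENERATION_YEARS = 29  # generation time assumed in the text [12]

def per_generation(rate_per_year, generation_years=GENERATION_YEARS):
    """Convert a per-base, per-year mutation rate to per base per generation."""
    return rate_per_year * generation_years

# Per-year ranges quoted in the text (mutations per base per year).
quoted_ranges = [
    ("direct measurement", 0.44e-9, 0.63e-9),
    ("population-size time shifts", 0.38e-9, 0.49e-9),
    ("re-analysis by mutational class", 0.44e-9, 0.59e-9),
]

for label, low, high in quoted_ranges:
    print(f"{label}: {per_generation(low):.1e} to {per_generation(high):.1e}")
```

Rounded to two significant figures, the printed values reproduce the per-generation ranges given in the paragraph.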
Finally, a fourth technique is to calibrate the rate of accumulation of mutations using a separate evolutionary rate that is better measured. In one such study, the authors used a model coupling single-nucleotide mutations to mutations in nearby microsatellite alleles to infer a single-nucleotide rate of 1.4–2.3 × 10⁻⁸ per base per generation (90% confidence interval) [13]. In principle, this general technique is appealing because it only involves intrinsic information, without any reference points, and yet can leverage the signal of mutations that have occurred over many generations.
In this study, we present a new approach that falls into this fourth category: we calibrate the mutation rate against the rate of meiotic recombination events, which has been measured with high precision in humans [14–17]. Intuitively, our method makes use of the following relationship between the mutation and recombination rates. At every site i in a diploid genome, the two copies of the base have some time to most recent common ancestor (TMRCA) T_i, measured in generations. The genome can be divided into blocks of sequence that have been inherited together from the same common ancestor, with different blocks separated by ancestral recombinations. If a given block has a TMRCA of T and a length of L bases, and if μ is the per-generation mutation rate per base, then the expected number of mutations that have accumulated in either copy of that block since the TMRCA is 2TLμ. This is the expected number of heterozygous sites that we observe in the block today (disregarding the possibility of repeat mutations). We also know that if the per-generation recombination rate is r per base, then the expected length of the block is (2Tr)⁻¹. Thus, the expected number of heterozygous sites per block (regardless of age) is μ/r.
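The cancellation of T in this argument can be checked numerically. The sketch below is illustrative only: the function name is ours, and the recombination rate r = 1.2 × 10⁻⁸ per base per generation is an assumed round figure rather than a value taken from the study.

```python
import math

def expected_hets_per_block(tmrca, mu, r):
    """Expected heterozygous sites in one non-recombined block.

    tmrca: time to most recent common ancestor, in generations
    mu:    mutation rate per base per generation
    r:     recombination rate per base per generation
    """
    expected_length = 1.0 / (2.0 * tmrca * r)   # (2Tr)^-1 bases
    return 2.0 * tmrca * expected_length * mu   # 2TL*mu with L at its expectation

# mu is the estimate reported in the abstract; r is an assumed placeholder.
mu, r = 1.61e-8, 1.2e-8
for tmrca in (500, 2000, 10000):
    print(tmrca, expected_hets_per_block(tmrca, mu, r), mu / r)
```

Whatever TMRCA is supplied, the result equals μ/r up to floating-point rounding, which is why the ratio can be used without knowing the age of each block.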
This relationship allows us to estimate μ given a good prior knowledge of r. Our full method is more complex but is based on the same principle. We show below how we can capture the signal of heterozygosity per recombination to infer the historical per-generation mutation rate for non-African populations over approximately the last 50–100 thousand years (ky). A broadly similar idea is also applied in an independent study [18], but over a more recent time scale (up to ~3 ky, via mutations present in inferred identical-by-descent segments), and the two final estimates are in very good agreement.
Results
Overview of methods
One difficulty of the simple method outlined above is that in practice we cannot accurately reconstruct the breakpoints between adjacent non-recombined blocks. Instead, we use an indirect statistic that captures information about the presence of breakpoints but can be computed in a simple way (without directly inferring blocks) and averaged over many loci (Fig 1).
Starting from a certain position in the genome, the TMRCA of the two haploid chromosomes as a function of distance in either direction is a step function, with changes at ancestral recombination points (Fig 1A). Heterozygosity, being proportional to TMRCA in expectation (and directly observable), follows the same pattern on average (Fig 1B).
If we consider a collection of starting positions having similar local heterozygosities, then as a function of the genetic distance d away from them, the average heterozygosity displays a smooth relaxation from the common starting value toward the global mean heterozygosity H̄ as the probability increases of having encountered recombination points (Fig 1C). We define a statistic H_S(d) that equals this average heterozygosity, where S is a set of starting points indexed by the local number of heterozygous sites per 100 kb (we also use S at times to refer to the heterozygosity range itself). The TMRCAs of these points determine the time scale over which our inferred value of μ is measured. Our default choice is to use starting points with a local total of 5–10 heterozygous sites per 100 kb (see Methods).
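To make the definition of H_S(d) concrete, the following Python sketch computes a simplified version of the statistic from pre-binned data. It is an illustration under stated assumptions, not the procedure described in Methods: the inputs (one heterozygous-site count and one genetic position per 100 kb window) and both function names are hypothetical, and heterozygosity at distance d is read off the nearest window rather than measured in a dedicated local interval.

```python
from bisect import bisect_left

WINDOW_BP = 100_000  # window size used to index starting points

def nearest_index(sorted_vals, x):
    """Index of the value closest to x, or None if x lies outside the range."""
    if x < sorted_vals[0] or x > sorted_vals[-1]:
        return None
    j = bisect_left(sorted_vals, x)
    if j > 0 and x - sorted_vals[j - 1] <= sorted_vals[j] - x:
        return j - 1
    return j

def h_s_curve(window_cm, window_hets, distances_cm, lo=5, hi=10):
    """Average per-base heterozygosity at genetic distance d from starting windows.

    window_cm:   genetic position (cM) of each window, sorted ascending
    window_hets: heterozygous-site count in each window
    lo, hi:      heterozygosity range S (sites per window) defining starting points
    """
    starts = [i for i, h in enumerate(window_hets) if lo <= h <= hi]
    curve = []
    for d in distances_cm:
        total, n = 0.0, 0
        for i in starts:
            # Look in both directions along the chromosome.
            for target in (window_cm[i] - d, window_cm[i] + d):
                j = nearest_index(window_cm, target)
                if j is not None:
                    total += window_hets[j] / WINDOW_BP
                    n += 1
        curve.append(total / n if n else float("nan"))
    return curve
```

On real data the curve would start near the chosen range (5–10 sites per 100 kb) at d = 0 and relax toward the genome-wide mean as d grows.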
Fig 1. Explanation of the statistic H_S(d). (A) Ancestral recombinations separate chromosomes into blocks of piecewise-constant TMRCA (and hence expected heterozygosity). (B) From the data, we measure local heterozygosity as a function of genetic distance; red and blue circles represent heterozygous and homozygous sites, respectively, along a diploid genome. (C) Our statistic H_S(d) is an average heterozygosity as a function of genetic distance over many starting points with similar local heterozygosities, yielding a smooth relaxation toward the genome-wide average.
doi:10.1371/journal.pgen.1005550.g001
To estimate μ, we use the fact that the probability of having encountered a recombination as one moves away from a starting point is a function of both d and the starting heterozygosity H_S(0), since smaller values of H_S(0) correspond to smaller TMRCAs, with less time for recombination to have occurred, and hence longer unbroken blocks. This relationship allows us to calibrate μ against the recombination rate r via the relaxation rate of H_S(d). Our inference procedure involves using coalescent simulations to create matching “calibration data” with known values of μ and then solving for the best-fit mutation rate for the test data (see Methods and Fig 2). We note that when comparing H_S(d) for real data to the calibration curves, a larger value of μ will correspond to a lower curve. This is because H_S(0) is fixed, which means that the TMRCAs at the starting points are proportionally lower for larger values of μ. Thus, recombinations are less frequent as a function of d, leading to a slower relaxation.
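The idea of solving for μ by comparing a real curve against calibration curves with known rates can be sketched as follows. This is a stand-in for the fitting step, not the authors' algorithm (which is described in Methods): here we assume, for illustration, that each calibration curve is summarized by its mean signed offset from the real curve and that μ is obtained by linear interpolation between the two calibration rates that bracket a zero offset. The function name and data layout are hypothetical.

```python
def best_fit_mu(real_curve, calibration):
    """Interpolate the mutation rate whose calibration curve matches the real one.

    real_curve:  H_S(d) values for the real data on a fixed grid of d
    calibration: dict mapping a known mu to its simulated H_S(d) curve
                 on the same grid
    """
    points = []
    for mu, curve in sorted(calibration.items()):
        diffs = [c - r for c, r in zip(curve, real_curve)]
        points.append((mu, sum(diffs) / len(diffs)))  # mean signed offset

    for (mu0, d0), (mu1, d1) in zip(points, points[1:]):
        if d0 == 0:
            return mu0
        if d1 == 0:
            return mu1
        if d0 * d1 < 0:  # offsets change sign, so the real curve lies between
            return mu0 + (mu1 - mu0) * d0 / (d0 - d1)
    raise ValueError("real curve is not bracketed by the calibration curves")

# Toy curves: the larger mu gives the lower curve, as noted in the text.
toy_calibration = {1.0e-8: [1.5, 2.5], 2.0e-8: [0.5, 1.5]}
print(best_fit_mu([1.0, 2.0], toy_calibration))
```

In the toy example the real curve sits midway between the two calibration curves, so the interpolated rate is midway between the two known rates.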
In order for our inferences to be accurate, the calibration curves must recapitulate as closely as possible all aspects of the real data that could affect H_S(d) (see Methods, S1 Text, and Fig 2). First, because coalescent probabilities depend on ancestral population sizes, we use PSMC [19] to learn the demographic history of our samples. Next, we adapt a previously developed technique [20] to infer the fine-scale uncertainty of our genetic map. Finally, we correct our raw inferred values of μ for three additional factors in order to isolate the desired mutational signal: (1) we multiply by a correction for genotype errors; (2) we subtract the contribution of non-crossover gene conversion, using a result from [21] adjusted for local recombination rate; and (3) we scale the final value to correspond to genome-wide base content and mutability (see Methods and S1 Text). We also test additional potential model violations through simulations (see S1 Text and S1 Fig). We account for statistical uncertainty using a block jackknife and incorporate confidence intervals for model parameters; all results are given as mean ± standard error.
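For readers unfamiliar with the block jackknife, here is a minimal sketch of the delete-one version with equally weighted blocks, applied to the simplest possible estimator (a mean of per-block values). It is only meant to show the mechanics; in the study the quantity recomputed with each block removed is the inferred μ, and the block definition and any weighting are described in Methods.

```python
import math

def block_jackknife(block_values):
    """Delete-one block jackknife for the mean of per-block values.

    Returns (estimate, standard_error). Blocks are weighted equally,
    which is a simplifying assumption of this sketch.
    """
    n = len(block_values)
    total = sum(block_values)
    estimate = total / n
    # Re-estimate with each block left out in turn.
    leave_one_out = [(total - v) / (n - 1) for v in block_values]
    loo_mean = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((x - loo_mean) ** 2 for x in leave_one_out)
    return estimate, math.sqrt(variance)

estimate, stderr = block_jackknife([1.0, 2.0, 3.0, 4.0, 5.0])
print(estimate, stderr)
```

For a plain mean, the jackknife standard error coincides with the usual standard error of the mean; its value lies in handling estimators, such as a fitted rate, for which no closed-form error is available.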
Simulations
First, for seven different scenarios, including a range of possible model violations, we generated 20 simulated diploid genomes with a known true mutation rate (μ = 2.5 × 10⁻⁸ per generation except where otherwise specified) and ran our procedure as we would for real data, with perturbed genetic maps for both the test data and calibration data (variance parameter α = 3000 M⁻¹; see Methods). To measure the uncertainty in our estimates, we performed 25 independent trials of each simulation, and we also compared the standard deviations of the estimates across trials with jackknife-based standard errors (as we would measure uncertainty for real data). Full details of the simulation procedures can be found in Methods and S1 Text.
In all cases, the H_{5–10}(d) curves matched quite well between the test data and the calibration data, and our final results were within two standard errors of the true rate (Fig 3). Furthermore, our jackknife estimates of the standard error were comparable to the realized standard deviations and on average conservative, especially for the most complex simulation (G), despite not incorporating PSMC uncertainty (see Methods): 0.08 × 10⁻⁸, 0.04 × 10⁻⁸, 0.04 × 10⁻⁸, 0.06 × 10⁻⁸, 0.09 × 10⁻⁸, 0.05 × 10⁻⁸, and 0.11 × 10⁻⁸, respectively, for the seven scenarios (see Fig 3 for empirical standard deviations). The fact that all of the inferred rates are close to the true values leads us to conclude that none of the aspects of the basic procedure or the tested model violations create a substantial bias.
Error parameters
Before obtaining mutation rate estimates from real data, we quantified two important error parameters: the rate of false heterozygous genotype calls and the degree of inaccuracy in our genetic map.
Fig 2. Illustration of the steps of our inference procedure. (A) Overview: from the data, we compute both the statistic H_S(d) and other parameters necessary to create matching calibration curves with known values of μ. (B) Details of capturing aspects of the real data for the calibration data. (C) Computation of H_S(d): the statistic captures the average heterozygosity as a function of genetic distance d from a starting point with heterozygosity in a defined range S, averaged over many such points. (D) For the final inferred value of μ, we compare matched H_S(d) curves for the real data and calibration data (with known values of μ).
doi:10.1371/journal.pgen.1005550.g002
Fig 3. Results for simulated data. Means and standard deviations of 25 independent trials are given, and the curves displayed are for representative runs matching the 25-trial means. The true simulated rate is μ = 2.5 × 10⁻⁸ unless otherwise specified. (A) Baseline simulated data; the inferred rate is μ = 2.47 ± 0.05 × 10⁻⁸. (B) Basic simulated data with a true rate of 1.5 × 10⁻⁸; the inferred rate is μ = 1.57 ± 0.04 × 10⁻⁸. (C) Data with a true rate of 1.5 × 10⁻⁸ plus gene conversion; the inferred rate is μ = 1.49 ± 0.05 × 10⁻⁸ (corrected from a raw value of 1.70 × 10⁻⁸ with gene conversion included). (D) Data with simulated genotype errors; the inferred rate is μ = 2.39 ± 0.06 × 10⁻⁸ (corrected from a raw value of 2.71 × 10⁻⁸ with genotype errors included). (E) Data simulated with variable mutation rate; the inferred rate is μ = 2.61 ± 0.08 × 10⁻⁸. (F) Data from a simulated admixed population; the inferred rate is μ = 2.57 ± 0.07 × 10⁻⁸. (G) Simulated data with all three complications as in (D)–(F); the inferred rate is μ = 2.53 ± 0.06 × 10⁻⁸ (corrected from a raw value of 2.77 × 10⁻⁸).
doi:10.1371/journal.pgen.1005550.g003
We estimated the genotype error rate by taking advantage of the fact that methylated cytosines at CpG dinucleotides are roughly an order of magnitude more mutable than other bases [3, 7, 8, 10] (see Methods). Thus, such mutations are strongly over-represented among true heterozygous sites as compared to falsely called heterozygous sites. By counting the proportion of CpG mutations out of all heterozygous sites around our ascertained starting points, we inferred an error rate of approximately 1 per 100 kb (1.08 ± 0.28 × 10⁻⁵ per base; see Methods and S1 Text), consistent with previous results [22].
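The logic of the CpG-based error estimate can be illustrated with a two-component mixture: if true and false heterozygous calls contain CpG mutations at different known proportions, the observed proportion pins down the share of false calls. The sketch below is our own simplified reading of that idea, not the calculation in Methods, and all three input proportions are hypothetical placeholders.

```python
def false_het_fraction(observed_cpg_frac, true_cpg_frac, false_cpg_frac):
    """Fraction f of called heterozygous sites that are errors.

    Solves observed = (1 - f) * true + f * false for f, where each argument
    is the proportion of CpG mutations in the corresponding set of sites.
    """
    return (true_cpg_frac - observed_cpg_frac) / (true_cpg_frac - false_cpg_frac)

# Hypothetical proportions chosen only to show the arithmetic.
f = false_het_fraction(observed_cpg_frac=0.16, true_cpg_frac=0.18, false_cpg_frac=0.02)
print(f)
```

Multiplying such a fraction by the local density of called heterozygous sites would give an error rate per base, which is the form in which the estimate above is reported.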
It was also necessary for us to estimate the accuracy of our genetic map. We used the “shared” version of the African-American (AA) map from [17] as our base map and a modified version of the error model of [20]: Z ~ Gamma(αγ(g + πp), α), where Z is the true genetic length of a map interval, g is the observed genetic length, p is the physical length, α is the parameter measuring the accuracy of the map, and γ and π are constants (see Methods). Based on pedigree crossover data from [23], we estimated α = 2802 ± 14 M⁻¹ for the full AA map and α = 3414 ± 13 M⁻¹ for the “shared” map, which should serve as lower and upper bounds (see Methods). For our analyses, we took α = 3100 M⁻¹ (with a standard error of 300 M⁻¹ to account for our uncertainty in the precise value). This means that 1/α ≈ 0.03 cM can be thought of as the length scale for the accuracy of genetic distances according to the base map (see Methods for details). In order to translate the uncertainty in α into its effect on the inferred μ, we repeated our primary analysis with a range of alternative values of α (S2 Fig).
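A short sketch may help in reading the Gamma error model. It assumes that the second Gamma parameter is a rate, so that Z has mean γ(g + πp) and variance γ(g + πp)/α, which is consistent with 1/α acting as a length scale. The constants γ = 1 and π = 0 used below are placeholders, since the values used in the study are given in its Methods; the function name is ours.

```python
import random

def perturb_interval(g, p, alpha, gamma_const, pi_const, rng):
    """Draw a 'true' genetic length Z ~ Gamma(shape=alpha*gamma*(g + pi*p), rate=alpha).

    g: observed genetic length (Morgans); p: physical length; alpha in 1/Morgans.
    """
    shape = alpha * gamma_const * (g + pi_const * p)
    # random.gammavariate takes a scale parameter, i.e. 1 / rate.
    return rng.gammavariate(shape, 1.0 / alpha)

rng = random.Random(0)
# A 1 cM (0.01 M) interval with alpha = 3100 per Morgan; gamma and pi are placeholders.
draws = [perturb_interval(0.01, 10_000, 3100, 1.0, 0.0, rng) for _ in range(20_000)]
mean = sum(draws) / len(draws)
variance = sum((z - mean) ** 2 for z in draws) / (len(draws) - 1)
print(mean, variance, 0.01 / 3100)
```

With these placeholder constants the sample mean stays close to the observed length g, and the sample variance is close to g/α, so larger α corresponds to a more accurate map.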
We note that the values of α reported in [20] are substantially lower than ours, which we suspect is because our validation data have much finer resolution than those used previously. (When using the same validation data, the “shared” and HapMap LD [15] maps appear to be relatively similar in accuracy.) If we substitute our new α values for the original application of inferring the date of Neanderthal gene flow into modern humans, we obtain a less distant time in the past, 28–65 ky (most likely 35–49 ky), versus 37–86 ky (most likely 47–65 ky) reported in [20]. While relatively recent, this date range is not in conflict with archaeological evidence or with an estimate of 49–60 ky (95% confidence interval) based on an Upper Paleolithic genome [11].
Estimates for Europeans and East Asians
Our primary results (Fig 4) were obtained from eight diploid genomes of European and East Asian individuals (two each French, Sardinian, Han, and Dai) using our standard parameter settings (see above and Methods). For all real-data applications, to minimize noise from the randomized elements of the procedure (namely, coalescent simulation and generation of the perturbed calibration map), we averaged 25 independent calibrations of the data to obtain our final point estimate. With all eight individuals combined, we estimated a mutation rate of μ = 1.61 ± 0.13 × 10⁻⁸ per generation (Fig 4A). Using this value of μ, our starting heterozygosity H_S(0) ≈ 7.4 × 10⁻⁵ corresponds to a TMRCA of approximately 1550–3100 generations, or 45–90 ky, assuming an average generation time of 29 years [12].
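The quoted TMRCA range follows from the relation H = 2Tμ applied to the two ends of the default starting range (5 and 10 heterozygous sites per 100 kb). A minimal check in Python, with names chosen here for illustration:

```python
MU = 1.61e-8           # inferred mutation rate per base per generation
GENERATION_YEARS = 29  # generation time assumed in the text [12]

def tmrca_generations(heterozygosity, mu=MU):
    """Expected TMRCA (generations) implied by per-base heterozygosity H = 2 * T * mu."""
    return heterozygosity / (2.0 * mu)

for hets_per_100kb in (5, 10):
    t = tmrca_generations(hets_per_100kb / 100_000)
    print(hets_per_100kb, round(t), "generations,", round(t * GENERATION_YEARS / 1000), "ky")
```

The two ends of the range give roughly 1550 and 3100 generations, or about 45 and 90 ky at 29 years per generation, matching the figures in the text.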
It is possible that our full estimate could be slightly inaccurate due to population-level differences in either the fine-scale genetic map or demographic history (see S1 Text). However, we expect Europeans and East Asians to be compatible in our procedure both because they are not too distantly related and because they have similar population size histories [19, 24]. To test empirically the effects of combining the populations, we estimated rates for the four Europeans and four East Asians separately (Fig 4B and 4C). Using the same genotype error corrections, we found that the H_{5–10}(d) curves as well as the final inferred values were similar to those for the full data: μ = 1.72 ± 0.14 × 10⁻⁸ for Europeans and μ = 1.55 ± 0.14 × 10⁻⁸ for East Asians. Thus, in conjunction with our simulation results, it appears that the full eight-genome estimate is robust to the effects of population heterogeneity.
Fig 4. Results for Europeans and East Asians. (A) All eight individuals together; the inferred rate is μ = 1.61 ± 0.13 × 10⁻⁸ per generation. (B) Results for the four Europeans; the inferred rate is μ = 1.72 ± 0.14 × 10⁻⁸. (C) Results for the four East Asians; the inferred rate is μ = 1.55 ± 0.14 × 10⁻⁸. For all real-data results, the curves displayed are for representative calibrations matching the overall means. The reported values are also corrected for gene conversion, genotype error, and base content, which explains the apparent discrepancy between the final estimates and the curves (for example, the estimate in (A) is corrected from a raw value of 2.00 × 10⁻⁸).
doi:10.1371/journal.pgen.1005550.g004
Additionally, to investigate the influence of different mutational types, we estimated rates separately for CpG transitions and all other mutations (see Methods). We inferred values of μ = 0.50 ± 0.06 × 10⁻⁸ for CpGs and μ = 1.36 ± 0.13 × 10⁻⁸ for non-CpGs (S3 Fig), with a sum (1.87 ± 0.14 × 10⁻⁸) that is somewhat higher than our full-data estimate. Since CpG transitions are known to comprise approximately 17–18% of all mutations [8], our full-data and non-CpG estimates appear to be in very good agreement, whereas the CpG-only estimate is likely inflated, perhaps because our method performs poorly with the low density of heterozygous sites (only 1 per 100 kb window for our CpG-only starting points). As a result, we believe that our value of μ = 1.61 × 10⁻⁸ is accurate, or at most slightly underestimated, as a total mutation rate for all sites.
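The consistency argument can be made explicit with a few lines of arithmetic. The sketch assumes that the 17–18% CpG share from [8] applies to the sites analyzed here; the function names are ours.

```python
NON_CPG_RATE = 1.36e-8  # inferred non-CpG rate
CPG_RATE = 0.50e-8      # inferred CpG-only rate
FULL_RATE = 1.61e-8     # full-data estimate

def total_implied_by_non_cpg(non_cpg_rate, cpg_share):
    """Total rate implied if non-CpG mutations make up (1 - cpg_share) of the total."""
    return non_cpg_rate / (1.0 - cpg_share)

def cpg_rate_implied_by_total(total_rate, cpg_share):
    """CpG rate implied if CpG transitions make up cpg_share of the total."""
    return cpg_share * total_rate

for share in (0.17, 0.18):
    print(share,
          f"{total_implied_by_non_cpg(NON_CPG_RATE, share):.2e}",
          f"{cpg_rate_implied_by_total(FULL_RATE, share):.2e}")
```

Under that assumption, the non-CpG estimate implies a total close to the full-data value of 1.61 × 10⁻⁸, while the full-data value implies a CpG rate well below the 0.50 × 10⁻⁸ inferred from CpG sites alone, which is the sense in which the CpG-only estimate appears inflated.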
Estimates for other populations
We also ran our procedure for three other non-African populations: aboriginal Australians, Karitiana (an indigenous group from Brazil), and Papua New Guineans. Using two genomes per population and computing curves for starting regions with 1–15 heterozygous sites per 100 kb (to increase the number of test regions, with a potential trade-off in accuracy), we inferred rates of μ = 1.86 ± 0.19 × 10⁻⁸, μ = 1.37 ± 0.19 × 10⁻⁸, and μ = 1.62 ± 0.17 × 10⁻⁸ for Australian, Karitiana, and Papuan, respectively (Fig 5). We note that the relatively high (but not statistically significantly different) per-generation value for Australians is consistent with the high average ages of fathers in many aboriginal Australian societies [12, 25]. Overall, given the expected small differences for historical, cultural, or biological reasons (including, as mentioned above, our use of the same “shared” genetic map for all groups), we do not see evidence of substantial errors or biases in our procedure when applied to diverse populations.
Discussion
Using a new method for estimating the human mutation rate, we have obtained a genome-wide estimate of μ = 1.61 ± 0.13 × 10⁻⁸ single-nucleotide mutations per base per generation. Our approach counts mutations that have arisen over many generations (a few thousand, i.e., several tens of thousands of years) and relies on our excellent knowledge of the human recombination rate to calibrate the length of the relevant time period.
We have shown that our estimate is robust to many possible confounding factors (S1 Fig). In addition to statistical noise in the data, our method directly accounts for ancestral gene conversion and for errors in genotype calls and in the genetic map. We have also demonstrated, based on simulations, that heterogeneity in demographic and genetic parameters, including the mutation rate itself, does not cause an appreciable bias. However, we acknowledge that our estimate requires a large number of modeling assumptions, and while we have attempted to justify each step of our procedure and to incorporate uncertainty at each stage into our final standard error, it is possible that we have not precisely captured the influence of every confounder. Similarly, while we consider a broad range of possible sources of error, we cannot guarantee that there might not be others that we have neglected.
The meaning of an average rate
It is important to note that the mutation rate is not constant at all sites in the genome [26]. As we have discussed, we believe that this variability does not cause a substantial bias in our inferences, but to the extent that some bases mutate faster than others, a rate is only meaningful when associated with the set of sites for which it is estimated. For example, methylated cytosines at CpG positions accumulate point mutations roughly an order of magnitude faster than other bases because of spontaneous deamination [3, 7, 8, 10]. Such effects can lead to larger-scale patterns, such as the higher mutability of exons as compared to the genome as a whole [27].
In our work, we filter the data substantially, removing more than a third of the sites in the genome. The filters tend to reduce the heterozygosity of the remaining portions [24, 28], which is to be expected if they have the effect of preferentially removing false heterozygous sites. We also make a small adjustment to our final value of μ to account for differences in base composition between our ascertained starting points and the (filtered) genome as a whole (see Methods). For reference, in S1 Table, we give heterozygosity levels and human–chimpanzee divergence statistics for sites passing our filters, i.e., the subset of the genome for which our inferred rates are applicable.