ARTICLE doi:10.1038/nature13908 The contribution of de novo coding mutations to autism spectrum disorder IvanIossifov1*,BrianJ.O’Roak2,3*,StephanJ.Sanders4,5*,MichaelRonemus1*,NiklasKrumm2,DanLevy1,HollyA.Stessman2, KaliT.Witherspoon2,LauraVives2,KarynneE.Patterson2,JoshuaD.Smith2,BryanPaeper2,DeborahA.Nickerson2, JeanselleDea4,ShanDong5,6,LuisE.Gonzalez7,JeffreyD.Mandell4,ShrikantM.Mane8,MichaelT.Murtha7, CatherineA.Sullivan7,MichaelF.Walker4,ZainulabedinWaqar7,LipingWei6,9,A.JeremyWillsey4,5,BorisYamrom1, Yoon-haLee1,EwaGrabowska1,10,ErtugrulDalkic1,11,ZihuaWang1,StevenMarks1,PeterAndrews1,AnthonyLeotta1, JudeKendall1,InessaHakker1,JulieRosenbaum1,BeicongMa1,LindaRodgers1,JenniferTroge1,GiuseppeNarzisi1,10, SeungtaiYoon1,MichaelC.Schatz1,KennyYe12,W.RichardMcCombie1,JayShendure2,EvanE.Eichler2,13, MatthewW.State4,5,7,14&MichaelWigler1 Wholeexomesequencinghasproventobeapowerfultoolforunderstandingthegeneticarchitectureofhumandisease. Hereweapplyittomorethan2,500simplexfamilies,eachhavingachildwithanautisticspectrumdisorder.Bycom- paringaffectedtounaffectedsiblings,weshowthat13%ofdenovomissensemutationsand43%ofdenovolikelygene- disrupting(LGD)mutationscontributeto12%and9%ofdiagnoses,respectively.Includingcopynumbervariants,coding denovomutationscontributetoabout30%ofallsimplexand45%offemalediagnoses.AlmostallLGDmutationsoccur oppositewild-typealleles.LGDtargetsinaffectedfemalessignificantlyoverlapthetargetsinmalesoflowerintelligence quotient(IQ),butneitheroverlapssignificantlywithtargetsinmalesofhigherIQ.WeestimatethatLGDmutationinabout 400genescancontributetothejointclassofaffectedfemalesandmalesoflowerIQ,withanoverlappingandsimilar numberofgenesvulnerable tocontributorymissensemutation. LGDtargetsinthejointclassoverlap with published targetsforintellectualdisabilityandschizophrenia,andareenrichedforchromatinmodifiers,FMRP-associatedgenes andembryonicallyexpressedgenes.Mostofthesignificanceforthelattercomesfromaffectedfemales. Autismspectrumdisorder(ASD)ischaracterizedbyimpairedsocial unaffectedsiblingsandtheparentsofeachfamily.WithintheSSC,the interactionandcommunication,repetitivebehaviourandrestrictedinter- overallgenderbiasinaffectedindividuals,7malesto1female,isnearly ests.Ithasastrongmalebias,especiallyinhigh-functioningaffected twicethattypicallyreported.ExomeswereanalysedatColdSpringHar- individuals.Thecontributionfromtransmissionhaslongbeensuspected borLaboratory(CSHL),YaleSchoolofMedicine,andUniversityof fromincreasedsiblingrisk1,butmorerecentlytheroleofgermlinede Washington(ExtendedDataFigs1and2andSupplementaryTable1). novo(DN)mutationhasbeenestablished,firstfromlarge-scalecopy Pipelineswereblindwithrespecttoaffectedstatus.Foruniformity,all numbervariationinsimplexfamilies2–5,andsubsequentlyfromexome datawerereanalysedwiththeCSHLpipeline,allowingcomparisonof sequencing.ThesmallerDNvariantsobservedbyDNAsequencingpin- analysistools.Allcallswerevalidatedorstronglysupported,aslisted pointcandidategenetargets6–8.Thesedevelopmentshavepromoteda (SupplementaryTable2)anddescribed(Methods). newmodelforcausation,andre-evaluationofsiblingrisk9,10. HerewereportwholeexomesequencingoftheSimonsSimplexCol- RatesandtargetsofDNmutation lection(SSC)11andanextensivelistofDNmutatedtargets,including ForgreatestprecisionwemeasuredDNratesinquadfamilies(oneaffec- 27recurrentLGD(nonsense,frameshiftandsplicesite)targets.Thesize tedandoneunaffectedchild)overgenomicpositionsatwhichallfamily anduniformityofthisstudyallowanunprecedentedevaluationofgen- membershad$403sequencecoverage(MethodsandSupplementary eticvulnerabilitytoASD.Wesubdividetargetsetsbymutationtype Table3).This‘joint403region’intheSSCwas32gigabases(Gb)intotal, (missenseandLGD)andaffectedchildstatus(genderandnon-verbal or48%ofthetargetedexome,from1,867quads.DNeventswereshared IQ,towhichwereferthroughoutas‘IQ’),andexploretheoverlapbe- bysiblings1%ofthetime(SupplementaryTable2);and1%ofmutations tweentargetsetsandtheirenrichmentforcertaingenecategories.We hadnearbynucleotidepositionsaltered,presumablybysinglemutagenic makeestimatesofthenumberofgenesvulnerabletoagivenmutation events12–14(SupplementaryTable4).Theoverallrateofbasesubstitu- typeandtheproportionofsimplexautismresultingfromDNmutation tionis1.831028(61029)perbasepair(SupplementaryTable5). foreachaffectedsubpopulation. RatesofDNsynonymousmutationinaffected(0.34perchild)and unaffected(0.33perchild)siblingsdonotdiffersignificantly(Fig.1). SSCsequencingandvalidation Bycontrast,LGDmutationsoccuratsignificantlyhigherratesinaffec- Wereporton2,517of,2,800SSCfamiliesincluding,800thatwere tedversusunaffectedsiblings(Fig.1andExtendedDataFig.3).The previouslypublished6–8.Wesequenced2,508affectedchildren,1,911 rateofLGDmutationsis0.12inunaffectedsiblingsand0.21inaffected 1ColdSpringHarborLaboratory,ColdSpringHarbor,NewYork11724,USA.2DepartmentofGenomeSciences,UniversityofWashingtonSchoolofMedicine,Seattle,Washington98195,USA.3Molecular& MedicalGenetics,OregonHealth&ScienceUniversity,Portland,Oregon97208,USA.4DepartmentofPsychiatry,UniversityofCalifornia,SanFrancisco,SanFrancisco,California94158,USA.5Department ofGenetics,YaleUniversitySchoolofMedicine,NewHaven,Connecticut06520,USA.6CenterforBioinformatics,StateKeyLaboratoryofProteinandPlantGeneResearch,SchoolofLifeSciences,Peking University,Beijing100871,China.7ChildStudyCenter,YaleUniversitySchoolofMedicine,NewHaven,Connecticut06520,USA.8YaleCenterforGenomicAnalysis,YaleUniversitySchoolofMedicine,New Haven,Connecticut06520,USA.9NationalInstituteofBiologicalSciences,Beijing102206,China.10NewYorkGenomeCenter,NewYork,NewYork10013,USA.11DepartmentofMedicalBiology,Bulent EcevitUniversitySchoolofMedicine,67600Zonguldak,Turkey.12DepartmentofEpidemiologyandPopulationHealth,AlbertEinsteinCollegeofMedicine,Bronx,NewYork10461,USA.13HowardHughes MedicalInstitute,Seattle,Washington98195,USA.14DepartmentofPsychiatry,YaleUniversitySchoolofMedicine,NewHaven,Connecticut06520,USA. *Theseauthorscontributedequallytothiswork. 00 MONTH 2014 | VOL 000 | NATURE | 1 ©2014Macmillan Publishers Limited. All rights reserved RESEARCH ARTICLE 1.5 overlapofDNLGDtargetsinaffectedprobandswithFMRPtargets(55 d hil 0.10 observedversus34.1expected;P5431024)andchromatinmodifiers er c 1.0 (26observedversus11.8expected;P5331024).Wealsoobservedsig- ents p 0.5 0.05 nobalsefrrovmedmveurtsautsio4n5i.0negxepneecsteexdp;rPes5se2d3in1e0m2b3r)y.oTnhiecldaettveerlospigmnaelncto23m(6e5s v E 0.0 0.00 mainlyfromthesmallnumberoffemaleaffectedindividuals(23ob- SubstitutionsSynonymous Missense Nonsense Splice site servedversus8.5expectedfrom67LGDtargets;P5531026).The27 P = 1 × 10–3 P = 0.56 P = 0.01 P = 9 × 10–3 P = 0.03 geneswithrecurrentLGDsshowstrongenrichmentforFMRPtargets 0.2 0.3 (14observedversus2.6expected;P5431028)andchromatinmodi- child ASD fiers(6observedversus0.9expected;P5231024).Bycontrast,no er 0.2 Sib significantenrichmentforthesegenesetsisseenfortheDNLGDtar- ents p 0.1 0.1 getTshineu1,n5a0f0feDctNedmsiibslsienngsse. targetsinprobandsarealsoenrichedfor v E FMRP targets and embryonically expressed genes. We observe 171 0.0 0.0 Indels No frameshift Frameshift LGDs FMRPtargets(144.8expected;P50.03),and220embryonicallyex- P = 0.03 P = 0.47 P = 7 × 10–3 P = 2 × 10–5 pressedgenes(191.4expected;P50.03).Asbefore,thesignalforem- Figure1|RatesofdenovoeventsbymutationaltypeintheSSC. Rates bryonicallyexpressedgenescomesalmostentirelyfromthesmallnumber perchildareestimatedfromthe403jointcoveragetargetregion,then offemaleaffectedindividuals(48observed,31.1expectedfrom244tar- extrapolatedfortheentireexome.Mutationtypesaredisplayedbyclass, gets;P50.002).Withtheexceptionofchromatinmodifiers,contrib- andthecombinedrateforallLGDsisshownatthebottomright.Foreachevent utoryDNmissenseandLGDmutationstendtostrikesimilarfunctional type,thesignificancebetweenprobandsandunaffectedsiblingsisgiven. classesofgenes. Sib,unaffectedsiblings.Theerrorsbarrepresent95%confidenceintervalfor themeanrates. DenovomutationandIQ HigherIQprobandsareheavilyskewedtowardsmales24.Forfurther probands, an ‘ascertainment differential’ of 0.2120.1250.09 (P5 analyses,wechosetodividetheaffectedmalepopulationroughlyin 231025).Thus,weestimate,43%(0.09outof0.21)ofLGDevents halfintohigherandlowerIQsets.WeinvestigatedwhetherhigherIQ inprobandscontributetoASDdiagnoses.ForDNmissense,therateis (.90)malescompriseapopulationwithadistinguishablegeneticsig- 0.82forunaffectedsiblingsand0.94foraffectedprobands,anascer- nature.ThereisadecreasedascertainmentdifferentialforDNLGD tainmentdifferentialof0.12(P50.01).Weestimateonly,13%(0.12 mutationsinmalechildrenwithhigherIQrelativetootheraffected out of 0.94)of DN missense eventsin probands contribute to ASD individuals(ExtendedDataFig.3andSupplementaryTable6).Thisis diagnoses.Thereisawideconfidenceintervalforthemissenseascer- notstatisticallysignificantoverthejoint403region.However,over tainmentdifferential(SupplementaryTable6);forthisreason,wecon- theentiredataset,thedropinIQis5pointsformaleswithDNLGD siderprimarilytheLGDeventsforouranalysisandlookonmissense mutationcomparedtothosewithoutmutation(P50.01;Fig.2).The dataassupporting. meanIQofaffectedmaleswithrecurrentDNLGDsdrops20points ToidentifygenetargetsforDNmutation,weexaminedallfamily (P50.00001,Fig.2).Significanceisalsoevidentasweexaminetargets dataincludingtrios.Weprovideacompletelistofallmutations(Sup- byfunctionalclass.MaleswithLGDmutationsinFMRPtargetshave plementaryTable2)alongwiththenumberofmutationsofeachtype anaverage14-pointdrop(P50.001).ThistrendcontinueswithLGD ineachgene(SupplementaryTable7).Atotalof391DNLGDmuta- targetsintheotherfunctionalclasses—chromatinmodifiersandembry- tionsin353targetgeneswereidentifiedandvalidatedinautismpro- onicallyexpressedgenes—butwithreducedsignificance.Weobserve bands.Ofthese,27targetgeneswererecurrent(Fig.2).Among1,500 littlesignalfromDNmissensemutation,eveninrecurrenttargets,either missensetargetsinprobands,145wererecurrent. becausetheseeventsarelesslikelytocontributeorbecausetheyareless WeexaminedallallelestransmittedoppositeaDNLGDtarget.We severe.Femaleprobandsshowthesametrendsasmales,butasthey sawnoinstancein391observationsinwhichthealleleoppositean compriseasmallerpopulation,thesignificanceisweak(Fig.2). LGDtargetcarriedararetransmittedLGDvariant(in,1%ofparental Furtherevidenceforadistinguishablesignatureamongthehigher exomes),andonlyfourinwhichsuchanallelecarriedararemissense IQcomesfromthefunctionalenrichmentwithinDNtargetgenesets. variant.Thus,theDNmutationsdonotgenerallycausehomozygous LGDtargetsinfemalesareenrichedforallthreefunctionalgeneclasses. loss-of-functionoftheirtarget(SupplementaryTable8). LGDtargetsinlowerIQaffectedmalesaresignificantlyenrichedfor Confirmingpreviousresults7,8,15,observedDNmutationsarisethree theFMRP-associatedandchromatin-modifiergeneclasses(Supplemen- timesasofteninthepaternalbackground,andmutationratesrisewith taryTable6).However,forLGDtargetsinhigherIQmalesweseeno ageofeitherparent(ExtendedDataFig.4andMethods).Thelattermay statisticallysignificantenrichmentforanyofthegenecategories. provideapartialexplanationforincreasedautismratesinchildrenborn ofolderparents. Targetoverlapsinchildrenandmutation-typegroups Wepartitionedchildrenintofourprimarygroups:unaffectedsiblings, Functionalclusteringintargetgenes affected females, affected males with higher IQ, and affected males Previousstudiespresentedevidenceoffunctionalclusteringintargets withlowerIQ.Weanalysedtheseandvariouscombinationsforthree ofDNLGDmutationinaffectedindividuals6–8,16.Ourlargerdataset typesofDNmutations:LGDs,missenseandsynonymous(Supplemen- wasexaminedwithanimprovednull‘lengthmodel’formutationin tary Table6).Targets ofsynonymousmutationsinallchildrenand whichtheprobabilityofDNmutationinageneisproportionaltoits targetsofLGDandmissensemutationsinunaffectedsiblingshaveno length(MethodsandExtendedDataFig.5).Wetestedforenrichment significantoverlapwithtargetsfromanyothergroup.Weseenosigni- withinDNLGDandmissensetargetsinprobandsandsiblingsforthe ficantoverlapbetweentargetsinhigherIQmaleswithtargetsfromother followingsixclasses:(1)FMRPtargetgenes,withtranscriptsboundby groups.Instrongcontrast,the67LGDtargetsfromaffectedfemales thefragileXmentalretardationprotein8,17;(2)genesencodingchromatin overlapsignificantlywiththe166LGDtargetsfromlowerIQaffected modifiers;(3)genesexpressedpreferentiallyinembryos18,19;(4)genes males(10observed,1.3expected,P5731027).Wethereforereferto encodingpostsynapticdensityproteins20;(5)essentialgenes21;and(6) thegroupoflowerIQmalesandaffectedfemalesasa‘joint’class.Inthis genesidentifiedasMendeliandiseasegenes22(Table1,Supplementary class,the874missenseand223LGDtargetsalsooverlapsignificantly(39 Table6andMethods).Thesedataprovidethestrongestevidenceyetfor observed,22.1expected,P50.0008).Thus,notonlydomissenseand 2 | NATURE | VOL 000 | 00 MONTH 2014 ©2014Macmillan Publishers Limited. All rights reserved ARTICLE RESEARCH Frameshift Male F FMRP targets Nonsense Female C Chromatin modifiers Splice site E Embryonically expressed P Postsynaptic density proteins 20 90 100 160 Proband Unaffected Non-verbal IQ Gene Categories LGD MS LGD MS CHD8 FC 7 DYRK1A E 4 ANK2 F P 3 1 GRIN2B F P 3 1 DSCAM F 3 1 CHD2 C 3 ARID1B FC 2 1 KDM6B FC 2 ADNP F E 2 MED13L F E 2 1 NCKAP1 F P 2 ANKRD11 F 2 DIP2A F 2 SCN2A F 2 4 TNRC6B F 2 WDFY3 F 2 1 PHF2 CE 2 WAC C 2 KDM5B E 2 2 2 2 POGZ E 2 2 1 RIMS1 P 2 FOXP1 2 GIGYF1 2 KATNAL2 2 KMT2E 2 1 TBR1 2 1 TCF7L2 2 100 LGDs LGDs LGDs LGDs LGDs Missense Missense Synonymous 90 all recurrent in FXG in CHM in EMB all recurrent all 1.0 al IQ80 0.01 0.01 0.2 0.4 n-verb70 0.5 0.001 0.2 0.5 0.9 0.9 0.4 No60 0.00001 0.4 0.06 0.10 Male with de novo type 50 MFeamlea wlei twhoituht d dee n noovvoo t tyyppee Female without de novo type 40 Figure2|Recurrentlyhitgenesandnon-verbalIQ. Affectedfemales DNmutationsareconsidered:allLGDs,recurrentLGDs,LGDsinFMRP accountfor13.5%oftheSSCwithameanIQof78,whereasaffectedmaleshave targets(FXG),LGDsinchromatinmodifiers(CHM),LGDsinembryonically ameanIQof86(top,P51027byStudent’st-test).Verticaldashedline expressedgenes(EMB),allmissensemutations,recurrentmissensemutations indicatesanIQof90.Middle(left)showsIQforaffectedchildrenwithLGD andsynonymousmutations.ProbandsaredividedbythepresenceofDN mutationsingeneshitrecurrently(right).Recurrentlymutatedgenesare mutationsandgender.Means,95%confidenceintervalsandPvalues clusteredintofourcategoriesasshown.Thelastfourcolumnsgiveoverall (Student’st-test)areshown. numbersofDNLGDandmissense(MS)mutations.Bottom,eightclassesof LGDmutationtargetgeneswithsharedfunctionality,thesamegenes setsofvulnerablegenes.Wenextsoughtevidenceforexcessrecurrence aresometimestargeted. oftargets.Wefirstexaminedsynonymousmutationsandmutationsin unaffectedchildren.Amongthe647synonymouseventsinprobands, Numberofvulnerablegenes thereare25genetargetsfoundinmorethanonechild,closetothenull Ouranalysisoffunctionalclusteringandoverlapswithintargetclasses expectationof19.9(P50.13).RecurrentLGD(n53outof179events) suggeststhatthemutationsascertainedinprobandstargetrestricted ormissensetargets(70outof1,143events)inunaffectedsiblingsare Table1|EnrichmentofDNmutationsinsixgeneclasses rDNLGD(ASD) DNLGD(ASD) DNmiss(ASD) DNLGD(sib) DNmiss(sib) No.ofgenes Overlap(27) Overlap(353) Overlap(1,513) Overlap(176) Overlap(1,066) Geneclass Obs Exp P Obs Exp P Obs Exp P Obs Exp P Obs Exp P FMRP 842 14 2.6 431028 55 34.1 431024 171 144.8 0.03 14 17.0 0.52 117 102.9 0.15 Chromatin 428 6 0.9 231024 26 11.8 331024 57 50.0 0.31 5 5.9 1.00 37 35.6 0.80 Embryonic 1,912 6 3.4 0.15 65 45.0 231023 220 191.4 0.03 20 22.5 0.65 142 136.0 0.58 PSD 1,445 4 2.5 0.31 34 32.5 0.78 159 138.1 0.07 22 16.2 0.15 113 98.1 0.12 Essential 1,750 7 3.2 0.04 50 42.4 0.22 201 180.3 0.10 20 21.2 0.91 127 128.1 0.96 Mendelian 256 0 0.6 1.00 3 8.0 0.07 31 34.0 0.66 5 4.0 0.61 20 24.1 0.47 DNLGD(Scz) 93 2 0.3 0.03 9 3.7 0.01 16 15.7 0.90 2 1.8 0.71 8 11.2 0.45 DNLGD(ID) 30 3 0.1 131024 8 1.2 331025 10 4.9 0.04 0 0.6 1.00 5 3.5 0.41 Wetestedeightclasses(Methods)forenrichmentagainstfivelistsoftargetsofDNmutations.Theseincludegeneswith(1)recurrentDNLGDmutationsinprobands(rDNLGD(ASD));(2)DNLGDmutationsin probands(DNLGD(ASD));(3)DNmissensemutationsinprobands(DNmiss(ASD));(4)DNLGDsinsiblings(DNLGD(sib));and(5)DNmissensemutationsinsiblings(DNmiss(sib)).Observed(obs)and expected(exp)numbersareshownwithPvaluesobtainedfromtwo-sidedbinomialtests.ExpectednumbersandPvaluesarebasedonalengthmodelinwhichDNmutationsoccurrandomlyinallgenes, proportionaltolength. 00 MONTH 2014 | VOL 000 | NATURE | 3 ©2014Macmillan Publishers Limited. All rights reserved RESEARCH ARTICLE alsoclosetonullexpectations(P50.2and0.04,respectively).Inaffected Weusetheascertainmentdifferentialasanestimateofcontribution. maleswithhigherIQtherearenoexcessrecurrenttargetsamong137 Thesumoftheascertainmentdifferentialsformissense,nonsense,con- LGDs mutations (2 observed, 1.0 expected, P5 0.3) or among 728 sensussplicesitedisruptionandframeshiftDNmutationsis0.21per missensemutations(26observed,24.7expected,P50.4).Bycontrast, affectedchild.Adding0.06,theascertainmentdifferentialfromlarge among probands the number of recurrent LGD (n527 out of 391 DNcopynumbervariants2,3,bringsthetotalto0.27(Fig.4).Excluding events)andmissensetargets(145outof1,675events)arenotcompat- higherIQmales,thevalueis0.33.Inaffectedfemalesitis0.45.Thisisa iblewiththenullexpectationof7.6(P,0.0001)and115.0(P50.001), conservativeestimatefortheroleofDNmutationintheSSCfamilies respectively.Giventhesefindings,aswellasthelackofoverlapbetween becausewehavenotyetascertainedintermediate-sizeDNcopynumber targetsofhigherandlowerIQmales,wefocusedonthejointclassof variants(CNVs),copy-neutralrearrangements,regulatorymutations femaleprobandsandaffectedmalesoflowerIQ.Forthejointclass,there ormutationsofnon-codinggenes. were22recurrentLGDtargetsamong254eventswith3.3expected AlthoughtheSSCisasimplexcollection,itisprobablyonlymar- (P,0.0001).Forthe944missenseevents,60recurrenttargetsareobserved ginallydepletedforhigh-riskfamiliesbecausesmallbroodsizepre- with40.2expected(P50.0005). ventsthebirthofseveralaffectedchildren,especiallyiftheunaffected Wenextusedrecurrenceanalysisandthelengthmodeltoestimate siblingisfemale.Weestimate10andconfirm9bygenderbiasinunaffected thenumberofvulnerablegenes(Fig.3)andtheprobabilitythatarecur- siblings(1,400femalesand1,264males,P50.0089)that,40%ofthe rentmutationofagiventypeiscontributory(Methods).Themostlikely SSCfamiliesarehigh-risk.Inasimplegeneticmodel,DNmutationhas numberofgenesvulnerabletoDNmutationsinthejointclassisesti- noroleinhigh-riskfamiliesbutisobligatoryforlow-riskfamilies10,so matedtobe387forLGDtargetswitha95%confidenceintervalof149– DNmutationwouldcontributeto,60%oftheSSC.Thesumofthe 915,and404forDNmissensetargets(confidenceinterval71–3,050). ascertainmentdifferentialforallobservableDNtypesinallthepro- Fromthelengthmodelandourestimatethat43%ofLGDmutationsare bandsisabout30%,abouthalfofthat.Ifthenumberofunobservedand contributory,wehave90%confidencethatagivenLGDmutationcontri- consequentialDNmutationsissimilartothenumberofobservedand butestoautisminagenerecurrentlyhitbyanLGDmutation(Methods). consequentialDNexomemutations,theactualcontributionisnotfar Bythesamemethods,wecompute35%confidenceincontributionfrom fromthatpredictedbythissimplemodel. missensemutationsinrecurrenttargets.Usingexistingmodelsforpri- oritizingtargets7,welistalltargetsofrecurrentDNcodingmutation Targetsandcognitivedefects accordingtotheirrank(SupplementaryTable9). We examined the incidence and targets of DN LGD mutations for childrenwithlowerandhigherIQs.AffectedchildrenwithhigherIQs Discussion haveagreaterincidenceofLGDmutationsthanunaffectedsiblings, TheSSCwasassembledwiththeexplicithypothesisthatfindingtargets butalowerincidencethanaffectedfemalesormaleswithlowerIQ. ofDNmutationwouldbeapathtogenediscovery.Wenowhave353 Moreover, there are few recurrently hit genes among the DN LGD candidateLGDgenetargets,27genesrecurrentlyhitbyLGDevents, targetsofaffectedmaleswithhigherIQ,andlittleoverlapwiththeDN and145recurrentmissensetargets,eachwithabout40%,90%and35% LGDtargetsofaffectedmaleswithlowerIQorfemales.LGDtargetsin chanceofbeingcontributory,respectively. higherIQmalesarenotenrichedfortheFMRP-associatedgenes.These LGD in all probands 80 LGD in female and low-nvIQ male probands Observed nsity Missense in female and low-nvIQ male probands 70 Model e d y bilit 60 a b o Pr 50 n o 0 500 1,000 1,500 2,000 2,500 3,000 3,500 4,000 buti40 Number of vulnerable genes ntri o C LGD in affected females 30 LGD in affected males with low nvIQ sity Missense in affected males with low nvIQ 20 n e d bility 10 a b o Pr 0 CNVs LGDs Missense Total Total Total Total Total Model all all all all female male low-IQ high-IQ male male 0.0 0.2 0.4 0.6 0.8 1.0 Figure4|EstimatedcontributionsofCNVs,LGDsandmissenseDN Class vulnerability mutationstosimplexASD. AscertainmentdifferentialsforthreetypesofDN Figure3|Numberofvulnerablegenesandclassvulnerability. Weassume mutation(CNVs,LGDsandmissense)areinterpretedasameasureof thepropertyofbeingvulnerableisindependentofgenelength,butthe ‘contribution,’thepercentageofprobandsinwhomthemutationcontributed probabilityofbeinghitbymutationisproportionaltogenelength.Weusethe todiagnosis.Wecombinethethreemutationtypesin‘total’ontheassumption observedratesofmutationofagiventypeinspecifiedpopulationsandnumber ofadditivity.Wepresentthismeasurefor‘all’probandsandselected ofrecurrentmutationstoestimatethenumberofgenesvulnerabletothose subpopulationsasindicated.WealsoshowtheexpectedcontributionofallDN mutations(top).Thedegreesofvulnerabilityinthoseclassesarethe mutationinasimplexcollectioncomputedfromasimplegeneticmodel9 distributionsshowninthebottompanel(Methods).nvIQ,non-verbalIQ. (‘model’).Errorbarsrepresent95%confidenceintervals. 4 | NATURE | VOL 000 | 00 MONTH 2014 ©2014Macmillan Publishers Limited. All rights reserved ARTICLE RESEARCH observationssuggest a differentdistribution ofgenetic mechanisms IQautism(Fig.3).ThemodeforLGDvulnerabilityinfemalesisfour- underlyingASDinhigherIQmales. foldlowerthanforlowerIQmales,mainlybecausetheprevalenceis WecanexamineoverlapbetweenLGDtargetsforautism,withpub- fourfoldlower.Reducedpenetranceinfemalesisnotwellunderstood, lishedtargetsforintellectualdisabilityandschizophrenia25–29.Weapplied butmaybeconsequenttosexuallydimorphicdevelopment.Supportfor ourlengthmodelformutationincidenceandfoundsignificantoverlap thisisseenintherelativeenrichmentofembryonicallyexpressedgenes ofintellectualdisabilityandschizophreniatargetswithASDtargets astargetsinfemales. (Table1),butonlyinthejointclassofaffectedmaleswithlowerIQand Partial genevulnerabilitycanbeexplainedinseveral ways:some females(SupplementaryTable6).Theoverlapcanhavemanyexplana- LGDmutationsresultinautism,somehavelittleeffect,andsomepro- tions:diagnosticconflation;pleiotropyforthesamemutation;differ- duceotherdiagnosesorevenlethality.Regardless,manyLGDmuta- entconsequencesfordifferentmutationsinthesamegene;andvarying tionswillstronglypredisposetoASD.Weexpectthistobereflectedin genetic or environmental background. The DN targets of affected decreased functional variation in the human gene pool, as we have maleswithhigherIQdonotoverlapthesesets,againsuggestingdis- previouslyshownforFMRP-associatedgenes8. tinctmechanisms. Givenouranalysisofgenevulnerabilityandthelackofevidencefor compound heterozygosity, damage to a single allele will often have Propertiesoftargetclasses severeconsequencesfordevelopment.Whatunderliesthevulnerabil- Thisstudyissufficientlylargeanduniformtoenableinferencesabout itytohaploinsufficiencyisunclear.Halfthenormalgenedosagecan targets,distinguishedbymutationtypes,propertiesofaffectedchildren resultinhalfthelevelofgeneproducts,andtherearemanyexamples andtargetfunctions.Weobserveasignificantcontributionfrommis- inwhichphysiologyrequiresproperdosage32–37.Also,havingtwocopies sensemutations,withanoverallmagnitudecomparabletothatfromLGD ofagenewillreducevariabilityofexpression38.Withonlyonefunctional mutations.BothLGDandmissensemutationtargetsareenrichedin allele,therecouldbeincreasedvariationinlevelsofexpression,includ- thesamefunctionalgenesets,especiallyamonglowerIQmales(Sup- ingdangerouslylowlevelsatcrucialmomentsinlineagedevelopment, plementaryTable6).ExcludinghigherIQmales,weestimatethemost alteringthecompositionoftissues.Monoallelicexpressionalsoneeds likelynumberofgenesvulnerabletoLGDsisabout400,withasimilar tobeconsidered39.Finally,sometruncationeventsmightleadtodom- numberofgenesvulnerabletomissense.Thetwosetsoverlapsubstantially. inantnegativealleles. Targetsinautismareenrichedincertainfunctionalcategories,pro- vidingdeepersupportforpreviouslypublishedobservations6–8.FMRP- Presentandfutureimplications associatedgenesandchromatinmodifiersareprominenttargetsinall Fromtheclinicalperspective,earlydiagnosisandfamilycounsellingare groupsexcepthigherIQmales.Theformerarethoughttofunctionin complicatediftherearehundredsofgenetictargets,especiallyiffeware neuroplasticity.Embryonicallyexpressedgenesaresignificantlyenriched knownwithcertainty.Sequencingofmorecohortsisthusclearlywar- asLGDormissensetargets,butonlyinfemales.Enrichmentinthese ranted.Fromthetherapeuticperspective,thegoodnewsisthatinalmost genesmayreflectthatthesecontributorymutationscausealterations allcasesDNmutationsoccurinprobandsinwhomanormalalleleisalso beforeafemaleprotectiveeffecttakesplace. present.Itistheoreticallypossiblethatenhancingactivityoftheremain- RecurrentLGDtargetsencodereceptors,ionchannelsandsynaptic ing alleles might alleviate symptoms. So in our view, the long-term proteinslikelytofunctiondirectlyinneuro-circuitry(forexample,SCN2A, prognosisfortreatingASDispositive.Moreover,ASDtargetsoverlap GRIN2BandRIMS1),butalsoproteinsfunctioningincytoskeletalremod- withtargetsforintellectualdisabilityandschizophrenia,somechanism- elling(forexample,ANK2andMED13L)andtranscriptionalregulation. basedtreatmentsmightworkfordifferentdiagnosticcategories.Inthe Chromodomainhelicasegenefamilymemberscarrymanyrecurrent intermediateterm,functionalclusteringsuggeststhattreatmentsmight LGDs.ThemostfrequentlyhitgeneisCHD8(ref.30),followedbyCHD2 betailoredtoasmallernumberofconvergentpathways. (threeLGDs)andfourothermembers(oneLGDeach)ofthatfamily. CHD8isatranscriptionalregulatorthoughttobeimportantforsup- OnlineContentMethods,alongwithanyadditionalExtendedDatadisplayitems pressionoftheWnt–b-cateninsignallingpathwaythroughhistoneH1 andSourceData,areavailableintheonlineversionofthepaper;referencesunique tothesesectionsappearonlyintheonlinepaper. recruitment31.AnotherintriguingtargetistheproteinkinaseDYRK1A, hitfourtimesandlocatedintheDown’ssyndromecriticalregion7. Received4July;accepted3October2014. Publishedonline29October2014. Genevulnerabilityandmolecularmechanisms Wecannotdeterminethepenetranceofspecificmutationsobserved 1. Jeste,S.S.&Geschwind,D.H.Disentanglingtheheterogeneityofautismspectrum disorderthroughgeneticfindings.NatureRev.Neurol.10,74–81(2014). here,aswedonotseethemoftenenoughinanunselectedpopulation. 2. Sanders,S.J.etal.MultiplerecurrentdenovoCNVs,includingduplicationsofthe Nevertheless,weintroducetheterm‘genevulnerability’astheprobabil- 7q11.23Williamssyndromeregion,arestronglyassociatedwithautism.Neuron itythatagiventypeofmutationinagivengenecontributestoagiven 70,863–885(2011). 3. Levy,D.etal.Raredenovoandtransmittedcopy-numbervariationinautistic condition.Geneswithnon-zerovulnerabilitydefinethevulnerableclass. spectrumdisorders.Neuron70,886–897(2011). Wecanextendthisconceptto‘classvulnerability’,definedasthemean 4. Marshall,C.R.etal.Structuralvariationofchromosomesinautismspectrum genevulnerabilityoveraclassofgenes.Mathematically,classvulner- disorder.Am.J.Hum.Genet.82,477–488(2008). ability,V,iscomputedbysolvingthefollowingequationforV:F3A 5. Sebat,J.etal.Strongassociationofdenovocopynumbermutationswithautism. Science316,445–449(2007). 5P3H3V,inwhichFistheprevalenceofthecondition,Aisthe 6. Sanders,S.J.etal.Denovomutationsrevealedbywhole-exomesequencingare ascertainmentdifferentialforDNmutationsofagiventypeinthegene stronglyassociatedwithautism.Nature485,237–241(2012). class,PistheexpectedproportionofthepopulationwithDNmuta- 7. O’Roak,B.J.etal.Sporadicautismexomesrevealahighlyinterconnectedprotein tionsofthatgiventype,andHistheprobabilitythatsuchmutationshit networkofdenovomutations.Nature485,246–250(2012). 8. Iossifov,I.etal.Denovogenedisruptionsinchildrenontheautisticspectrum. thegeneclass. Neuron74,285–299(2012). Wecancomputeadistributionofclassvulnerabilityforallvulner- 9. Ronemus,M.,Iossifov,I.,Levy,D.&Wigler,M.Theroleofdenovomutationsinthe ablegenestargetedbyagivenmutationaltype(Methods)becauseF,A geneticsofautismspectrumdisorders.NatureRev.Genet.15,133–141(2014). 10. Zhao,X.etal.Aunifiedgenetictheoryforsporadicandinheritedautism.Proc.Natl andPhaveempiricallysampleddistributionsandHhasadistribution Acad.Sci.USA104,12831–12836(2007). inferredfromthetotallengthofthegeneclass.Thedistributionofclass 11. Fischbach,G.D.&Lord,C.TheSimonsSimplexCollection:aresourcefor vulnerabilityforDNLGDsinmaleswithlowerIQhasamodearound identificationofautismgeneticriskfactors.Neuron68,192–195(2010). 0.4(Fig.3).Inotherwords,,40%ofDNLGDsinvulnerablegenesin 12. Campbell,C.D.etal.Estimatingthehumanmutationrateusingautozygosityina founderpopulation.NatureGenet.44,1277–1281(2012). amalecontributetodiagnosesoflowerIQASD.Similarly,,10%of 13. Michaelson,J.J.etal.Whole-genomesequencinginautismidentifieshotspotsfor missensemutationsinvulnerablegenescontributetodiagnosesoflower denovogermlinemutation.Cell151,1431–1442(2012). 00 MONTH 2014 | VOL 000 | NATURE | 5 ©2014Macmillan Publishers Limited. All rights reserved RESEARCH ARTICLE 14. Schrider,D.R.,Hourmozdi,J.N.&Hahn,M.W.Pervasivemultinucleotide 37. Zhang,F.,Gu,W.,Hurles,M.E.&Lupski,J.R.Copynumbervariationinhuman mutationaleventsineukaryotes.Curr.Biol.21,1051–1054(2011). health,disease,andevolution.Annu.Rev.GenomicsHum.Genet.10,451–481 15. Kong,A.etal.Rateofdenovomutationsandtheimportanceoffather’sageto (2009). diseaserisk.Nature488,471–475(2012). 38. Eckersley-Maslin,M.A.&Spector,D.L.Randommonoallelicexpression: 16. Neale,B.M.etal.Patternsandratesofexonicdenovomutationsinautism regulatinggeneexpressiononealleleatatime.TrendsGenet.30,237–244 spectrumdisorders.Nature485,242–245(2012). (2014). 17. Darnell,J.C.etal.FMRPstallsribosomaltranslocationonmRNAslinkedto 39. Jeffries,A.R.etal.Randomorstochasticmonoallelicexpressedgenesareenriched synapticfunctionandautism.Cell146,247–261(2011). forneurodevelopmentaldisordercandidategenes.PLoSONE8,e85093(2013). 18. Kang,H.J.etal.Spatio-temporaltranscriptomeofthehumanbrain.Nature478, 483–489(2011). SupplementaryInformationisavailableintheonlineversionofthepaper. 19. Voineagu,I.etal.Transcriptomicanalysisofautisticbrainrevealsconvergent molecularpathology.Nature474,380–384(2011). AcknowledgementsSimonsFoundationAutismResearchInitiativegrantstoE.E.E. 20. Baye´s,A.etal.Characterizationoftheproteome,diseasesandevolutionofthe (SF191889),M.W.S.(M144095R11154)andM.W.(SF235988)supportedthiswork. humanpostsynapticdensity.NatureNeurosci.14,19–21(2011). AdditionalsupportwasprovidedbytheHowardHughesMedicalInstitute 21. Blake,J.A.,Bult,C.J.,Kadin,J.A.,Richardson,J.E.&Eppig,J.T.TheMouseGenome (InternationalStudentResearchFellowshiptoS.J.S.)andtheCanadianInstitutesof Database(MGD):premiermodelorganismresourceformammaliangenomics HealthResearch(DoctoralForeignStudyAwardtoA.J.W.).E.E.E.isanInvestigatorofthe andgenetics.NucleicAcidsRes.39,D842–D848(2011). HowardHughesMedicalInstitute.WethankallthefamiliesattheparticipatingSSC 22. Feldman,I.,Rzhetsky,A.&Vitkup,D.Networkpropertiesofgenesharboring sites,aswellastheprincipalinvestigators(A.L.Beaudet,R.Bernier,J.Constantino, inheriteddiseasemutations.Proc.NatlAcad.Sci.USA105,4323–4328(2008). E.H.CookJr,E.Fombonne,D.Geschwind,D.E.Grice,A.Klin,D.H.Ledbetter,C.Lord, 23. Willsey,A.J.etal.Coexpressionnetworksimplicatehumanmidfetaldeepcortical C.L.Martin,D.M.Martin,R.Maxim,J.Miles,O.Ousley,B.Peterson,J.Piggot,C.Saulnier, projectionneuronsinthepathogenesisofautism.Cell155,997–1007(2013). M.W.State,W.Stone,J.S.Sutcliffe,C.A.WalshandE.Wijsman)andthecoordinators 24. Newschaffer,C.J.etal.Theepidemiologyofautismspectrumdisorders.Annu.Rev. andstaffattheSSCsitesfortherecruitmentandcomprehensiveassessmentof PublicHealth28,235–258(2007). simplexfamilies;theSFARIstaffforfacilitatingaccesstotheSSC;andtheRutgers 25. deLigt,J.etal.Diagnosticexomesequencinginpersonswithsevereintellectual UniversityCellandDNARepository(RUCDR)foraccessingbiomaterials.Wewouldalso disability.N.Engl.J.Med.367,1921–1929(2012). liketothanktheCSHLWoodburySequencingCenter,theGenomeInstituteatthe 26. Fromer,M.etal.Denovomutationsinschizophreniaimplicatesynapticnetworks. WashingtonUniversitySchoolofMedicine,andYaleCenterforGenomicAnalysis(in Nature506,179–184(2014). particularJ.Overton)forgeneratingsequencingdata;E.AntoniouandE.Ghibanfor 27. Lee,S.H.etal.Geneticrelationshipbetweenfivepsychiatricdisordersestimated theirassistanceindataproductionatCSHL;andT.Brooks-Boone,N.Wright-Davisand fromgenome-wideSNPs.NatureGenet.45,984–994(2013). M.WojciechowskifortheirhelpinadministeringtheprojectatYale.TheNHLBIGO 28. McCarthy,S.E.etal.Denovomutationsinschizophreniaimplicatechromatin ExomeSequencingProjectanditsongoingstudiesproducedandprovidedexome remodelingandsupportageneticoverlapwithautismandintellectualdisability. variantcallsforcomparison:theLungGOSequencingProject(HL-102923),theWHI Mol.Psychiatry19,652–658(2014). SequencingProject(HL-102924),theBroadGOSequencingProject(HL-102925),the 29. Rauch,A.etal.Rangeofgeneticmutationsassociatedwithseverenon-syndromic SeattleGOSequencingProject(HL-102926)andtheHeartGOSequencingProject sporadicintellectualdisability:anexomesequencingstudy.Lancet380, (HL-103010). 1674–1682(2012). 30. O’Roak,B.J.etal.Multiplextargetedsequencingidentifiesrecurrentlymutated AuthorContributionsCSHL:I.I.,M.R.andM.W.designedthestudy;I.I.,D.L.,B.Y.,Y.L., genesinautismspectrumdisorders.Science338,1619–1622(2012). E.G.,E.D.,P.A.,A.L.,J.K.,G.N.,S.Y.,M.C.S.,K.Y.andM.W.analysedthedata;M.R.,I.H.,J.R., 31. Nishiyama,M.,Skoultchi,A.I.&Nakayama,K.I.HistoneH1recruitmentbyCHD8is B.M.,L.R.,J.T.andW.R.M.generatedtheexomedataatColdSpringHarborLaboratory; essentialforsuppressionoftheWnt-b-cateninsignalingpathway.Mol.Cell.Biol. I.I.,Z.W.,S.M.andJ.T.confirmedthevariants;I.I.,M.R.andM.W.wrotethepaper.UCSF/ 32,501–512(2012). Yale:S.J.S.andM.W.S.designedthestudy;S.J.S.,S.D.,L.W.andA.J.W.analysedthe 32. Birchler,J.A.&Veitia,R.A.Genebalancehypothesis:connectingissuesofdosage data;S.J.S.,J.D.,L.E.G.,J.D.M.,C.A.S.,M.F.W.andZ.W.confirmedthevariants;S.M.M.and sensitivityacrossbiologicaldisciplines.Proc.NatlAcad.Sci.USA109, M.T.M.generatedtheexomedataatYaleMedicalCenter.UW:B.J.O.,J.S.andE.E.E. 14746–14753(2012). designedthestudy;B.J.O.andN.K.analysedthedata;B.J.O.,H.A.S.,K.T.W.andL.V. 33. Cooper,D.N.,Krawczak,M.,Polychronakos,C.,Tyler-Smith,C.&Kehrer-Sawatzki, confirmedthevariants;E.E.E.andJ.S.revisedthemanuscript;K.E.P,J.D.S.,B.P.and H.Wheregenotypeisnotpredictiveofphenotype:towardsanunderstandingof D.A.N.generatedtheexomedataattheUniversityofWashington. themolecularbasisofreducedpenetranceinhumaninheriteddisease.Hum. Genet.132,1077–1130(2013). AuthorInformationSequencedatausedintheseworkareavailablefromtheNational 34. Darnell,J.C.Defectsintranslationalregulationcontributingtohumancognitive DatabaseforAutismResearch(http://ndar.nih.gov/),understudyDOI:10.15154/ andbehavioraldisease.Curr.Opin.Genet.Dev.21,465–473(2011). 1149697.Reprintsandpermissionsinformationisavailableatwww.nature.com/ 35. Veitia,R.A.,Bottani,S.&Birchler,J.A.Genedosageeffects:nonlinearities,genetic reprints.Theauthorsdeclarecompetingfinancialinterests:detailsareavailable interactions,anddosagecompensation.TrendsGenet.29,385–393(2013). intheonlineversionofthepaper.Readersarewelcometocommentontheonline 36. Weischenfeldt,J.,Symmons,O.,Spitz,F.&Korbel,J.O.Phenotypicimpactof versionofthepaper.Correspondenceandrequestsformaterialsshouldbe genomicstructuralvariation:insightsfromandforhumandisease.NatureRev. addressedtoJ.S.([email protected]),E.E.E.([email protected]), Genet.14,125–138(2013). M.W.S.([email protected]),orM.W([email protected]). 6 | NATURE | VOL 000 | 00 MONTH 2014 ©2014Macmillan Publishers Limited. All rights reserved ARTICLE RESEARCH METHODS Wherepossible,allfourfamilymembersweresequencedonthesamelaneto minimizebatcheffects.AllstrongandweakLGDcandidatevariantsfromtheCSHL Samplecollection.Mostfamilies(2,517)camefromcurrentorformermembers pipeline,alongwithanadditionalsetofLGDcandidatesfromtheYalepipeline, oftheSSC.TheSSCwasassembledat13clinicalcentres,accompaniedbydetailed andstandardizedphenotypicanalysisasreportedpreviously11.SeveralIQmea- weresubjectedtoexperimentalvalidationasfollows:variant-specificprimerswere designedforPCRamplificationofcandidateSNVsandindelsfromallfamily sures(verbal,non-verbalandfullspectrum)wererecorded;inthiswork,westrati- members,andampliconsweresentforSangersequencing. fiedprobandsbynon-verbalIQ,whichwerefertoassimply‘IQ’throughoutthe text.Familieswithsingleprobandsandunaffectedsiblingswerepreferentially Sequenceanalysispipelines.Sequencedatawereinterpretedasfamilygenotypes recruited,whereasfamilieswithtwoprobandswerespecificallyexcluded11.Families usingpipelinetoolsateachrespectivedatacentre.Almostallofthedatawere fromtwoassociatedcollectionswerealsosequenced:theSimonsAncillaryCollec- reanalysedwiththeCSHLpipeline.Weshowthecoverage(ExtendedDataFig.6) tion(SAC,n5123),andtheSimonsTwinCollection(STC,n513).TheSACincludes andyields(ExtendedDataFig.7)fordenovocallsfromeachcentre.The24fam- familiesthatfailedinclusioncriteriafortheSSC,typicallybecauseaparent,sibling iliessequencedatallthreecentresdemonstratedgoodagreementbetweenpipe- orsecond-orthird-degreerelativeoftheaffectedparticipanthasbeendiagnosed linesandplatforms(SupplementaryTables11and12). withASD,orforcasesinwhichtheproband’sASDdiagnosiswasquestionable. Theanalysispipelinesgeneratedcandidatedenovoevents,definedasvariants TheSTCconsistsoffamiliesofmonozygotictwinsinwhichatleastoneco-twinis presentinthechildandabsentinbothparents.Wefilteredoutvariantsseen affectedbyASD.TheinstitutionalreviewboardsofColdSpringHarborLaboratory, frequentlyintheparentsofthecollection(allelefrequency.0.3%),reasoning YaleMedicalCenterandUniversityofWashington,Seattleapprovedthisstudy. thatmostofthesewouldbefalsepositivesowingtounevencoverageinaparent. WritteninformedconsentfromallsubjectswasobtainedbySFARI.Bloodsamples CandidatesgeneratedbylocalpipelinesorbythecommonCSHLpipelinewere weredrawnfromparentsandchildren(affectedandunaffected)andsenttothe validatedattherespectivecentreswithre-sequencingand2,504wereverified.In RutgersUniversityCellandDNARepository(RUCDR)forDNApreparation.DNAs ourfinalcallset,weincludeallverifiedcallsfromeachcentre,andomitanycall from2,517families(of,2,800totalintheSSC)wereusedinthisstudy.Results thatwasrejected.Inaddition,becausealmostall(1,640of1,644)strongpoint from774oftheSSCfamiliesincludedherewerepublishedinearlierwork6–8.The mutationsgeneratedbythecommonCSHLpipelinewereverifiedwhensuccess- samplesweresplitacrossthethreecentres:ColdSpringHarborLaboratory(CSHL), fullytested(SupplementaryTable11),suchstrongcandidatesareincludedinour theDepartmentofGeneticsattheYaleSchoolofMedicine(YALE),andDepartment callsetevenifthevalidationtestfailedorifthecandidateeventwasnottested.All ofGenomeSciencesattheUniversityofWashington(UW).Thesplitwasnot frameshiftmutationswerevalidated,andweexcludeallthatwererejected.Allde uniformwithrespecttonumberoffamiliesortheproportionsoffemaleprobands novocallsusedinthesubsequentanalysis,alongwiththeirvalidationstatus,are andprobandswithlowerIQ(ExtendedDataFig.2andSupplementaryTable1). listedinSupplementaryTable2.Pipelinesforanalysisandvalidationwereblind Severalfamiliesweresequencedatmultiplecentres,with24familiessequencedin withrespecttoaffectedandunaffectedstatus. allthreecentres(ExtendedDataFig.1andSupplementaryTable1). CSHL(uniform)pipeline.Sequencedatafromthethreecentreswereanalysed Exomecapture,sequencingandvalidation.Thethreecentresdifferedinthe withthecomputationalpipelinedescribedpreviously8.Inbrief,theIlluminaanal- preciseexomecaptureplatform,readlengthandvalidationprotocols. ysispipeline(CASAVA1.8)wasusedforbasecalls,andcustomsoftwarewasused CSHL.Theprotocolsdescribedpreviously8wereappliedtothefamiliesnewlysequenced tode-multiplexreadsandtrimbarcodesfromCSHLderiveddata.DatafromYale atCSHL.Inbrief,SeqCapEZHumanExomeLibraryv2.0(RocheNimbleGen) andUWwerede-multiplexedattherespectivecentresbeforeanalysisthroughthe reagentswereusedwithacustombarcodingprotocolthatenabledsimultaneous CSHLpipeline.BWA42wasusedtoalignsequencereadstothehg19referencegenome, exomeenrichmentof#4genomesandthesequencingof#8individualsperIllumina andbothPicard(http://broadinstitute.github.io/picard/)andGATK43wereused HiSeq2000lane.Allexomesequencingwasperformedusingpaired-end100-base formarkingPCRduplicates,family-basedsequencerealignmentandqualityscore pair(bp)reads.AllstrongandweakLGDcandidatevariantsaswellasadditional recalibration.Asdescribedpreviously,amultinomialmodel-basedfamilygenotyper variantsfromfamiliessequencedatCSHLweresubjectedtoexperimentalvalid- wasusedtogeneratecandidateSNVandindel‘Mendelviolators,’eachannotated ation.Gene-specificprimersweredesignedforPCRamplificationofcandidatesingle with:(1)aconfidencescore(denovoScr)thatreflectstheposteriorprobabilityof nucleotidevariables(SNVs)andindels,andampliconswerepooledandsequenced Mendelviolationatthelocus;(2)agoodness-of-fitscore(chi2Score)showingthe onanIlluminaMiSeq.Approximately100variantswerevalidatedperlanewithpaired- degreetowhichtheassumptionsofthemultinomialmodelareapplicabletothe end150-bpreads.Wherepossible,theparentaloriginwasdeterminedbyphasing observeddata;(3)countsofreadsperalleleandperfamilymember;and(4)allele oflinkedtransmittedSNVs. frequencyandnoiseratesforthecandidatepositionbasedonthewholecollection. UW.Sampleswerecapturedandsequencedbyoneofthreemethods.Inthepilot CandidatesSNVswithdenovoScr$60andchi2Score.0.0001werelabelled‘strong’ set(19quads),sampleswerecapturedusingSeqCapEZHumanExomeLibrary providedthatthepositionwasnotpolymorphicornoisyinthepopulation,andthat v1.0(RocheNimbleGen)reagents(UW-M1)7,40.Theremainingsampleswerecap- theparentswerehomozygousforthereferenceallele. turedusingSeqCapEZHumanExomeLibraryv2.0(RocheNimbleGen)reagents7. ForSNVs,acut-offdenovoScrvalueof60wasdictatedbythedesiretokeep Newlysequencedsampleswereeitherprocessedasdescribedpreviously7(UW-M2) falsepositivestoaminimum,andwaschosenaftercomputingtheproportionof orwithamodified(UW-M3)protocol(SupplementaryTable10).ForUW-M2, denovocandidatesthatappearatpolymorphicloci(asurrogateforfalsepositives) single-plexcapturesandsingle-plexsequencingruns(non-pooled)wereperformed asafunctionofthescore(seetheSupplementofref.8).Thelowfalsepositiverate asdescribedpreviously7.ForUW-M3,single-plexcapturewasperformedasin (,5%)wasalsoconfirmedthroughexperimentalvalidation(SupplementaryTable11). UW-M2;however,inthepost-capturePCR,an8-bpindexbarcodewasadded. Inaddition,weobservethatonly1%ofDNmutationsaresharedbetweentwo Post-PCRlibrarieswerequantifiedandpooledinsetsof,96.Thesepoolswere siblings(SupplementaryTable2),puttinga3%caponfalsepositivesowingto thensequencedontheIlluminaMiSeqplatformtoevaluatelibrarycomplexityand failuretoobserveparentscorrectly.Atstringentthresholdsthefalsenegativerateis sampledistribution.Poolswererebalancedonthebasisofperformance,thense- generallyhigh,butthroughsimulationswedeterminedthat,evenwithstringent quencedacrossmultipleHiSeq2000lanesusingpaired-end50-bpreads.Additional thresholds,regionswithdeepcoverage(403orhigherjointcoverage)hadlowfalse laneswereaddeduntilsamplesreachedtargetcoverage(203:,80%;83:,90%). negativerates(,5%). Ifadditionalcoveragewasrequiredforsomesamples,subpoolswerealsogener- IndelsweretreateddifferentlythanSNVs.Themultinomialmodelassumesa ated.ForsamplesprocessedwithUW-M1andUW-M2,predicteddenovocallswere smallallelebias,appropriatewhencallingSNVs,butnotfordenovoindels— validatedusingstandardPCRandSangersequencing7.ForUW-M3processed particularlyforlongevents(.10bp).Toaddressthis,cut-offsfor‘strong’indels samples,customMolecularInversionProbe(MIP)captureprobesweredesigned werelowered(denovoScr.30andchi2Score.1029).Toreducenoise,weadded withtargetingarmsflankingregionsofinterest.Probesweredesignedwithoutor requirementsfor‘clean’readcounts:parentswerenotallowedtohaveanyreads withdegeneratetags,andpoolsof,50–100probesweregenerated30,41.Asdescribed containingthecandidateindel,andwererequiredtohaveatleast15readssup- earlier41,setsoffamilies(,96samples)werecapturedusingthesepoolswith50– portingthereferenceallele.Atleastoneofthechildrenhadtohave$6readswith 100ngofgenomicDNAastemplate.Captureproductswerethenpooledandse- thecandidatevariant,andthosereadshadtocomprise$5%ofreads.Experimental quencedonanIlluminaMiSeq.CandidatesitesfailingMIPQCorcapture,or validationdemonstratedthatthefalsepositiverateinthestrongindelsis,10%, showingevidenceofsignificantshiftsinallelebalance,werevalidatedusingthe andsimulationsforindelswithoutextremeallelebias(mostofthose,10bp) standardPCR/Sangermethod.Ifsitesrepeatedlyfailedtheassay,theyweredis- revealthatthefalsenegativerateinwell-coveredregions(403)is,5%. carded.NovelsitescalledbytheCSHLpipelinewerevalidatedusingthesame All‘strong’SNVsandindelsarereportedhereunlessrejectedbyvalidation.To methodsasUW-M3. addressthehighfalsenegativerates,wedefinedaclassof‘weak’SNVandindels YALE.Wholeblood-derivedgenomicDNAwasenrichedforexonicsequences drawnfromthresholdslowerthanstrongcandidates.AllweakLGDcandidates usingSeqCapEZHumanExomeLibraryv2.0(RocheNimbleGen)reagents.All weresubjectedtovalidation,andonlythosesuccessfullyvalidatedarereported.In familymemberswerebarcodedandeachpooloffoursampleswassequencedusing addition,duringmethoddevelopment(forexample,throughmanualinspection 75-bppaired-endreadsonsinglelanesoftheIlluminaHiSeq2000instrument. orseeref.44),wevalidatedalargenumberofcandidatesthatdidnotmeeteven ©2014Macmillan Publishers Limited. All rights reserved RESEARCH ARTICLE theweakdefinition.Candidatevariantsfoundasvalidunderthesecircumstances Testforexcessiverecurrence.IfwehaveRrecurrentgenesinKeventsina arereportedhereandlabelledas‘notcalled’.Thislabelisalsousedwhenthe mutation-child-typeclass,wetestforexcessrecurrencebycomparingRtothe CSHLuniformpipelinemissedacallfromtheUWandYALEdatathatwas numberofrecurrentgenesexpectedunderthegenelength-basednullmodel.We successfullyvalidated. buildtheexpectationbyperforming10,000simulations.Ineachsimulation,we UWpipeline.AllsamplesusingUW-M1andUW-M2protocolswereprocessed sampleKgeneswithreplacementwheretheprobabilityofsamplingageneis asdescribedearlier7.ForUW-M3,updatedversionsofBWA(0.5.9-r16),Picard- proportionaltoitslength.Wethencountthenumberofrecurrences. tools(1.48)andGATK(1.0-6125)wereused.TheGATKUnifiedGenotyperwas Estimationofthenumberofvulnerablegenes.Toestimatethenumberofvul- usedinsinglesamplemodewithfilterflags(AB.0.75,lowquality,QD,5.0, nerablegenesforagivenmutation-child-type,westartwiththeobservednumber QUAL#50.0) andin parallelwith theSAMtoolspipelineas describedprev- ofevents(K),theobservednumberofrecurrentevents(R),theestimatedposterior iously40(seetheGATKUnifiedGenotyperfordefinitionsofthesescores).Only distributionsfortheratesofmutationsofthegiventypeintheascertained(Mdist) positionswith$8-foldcoveragewereconsidered.Childgenotypecallswerecom- andfortheunaffected(Pdist)population.Wethenexplorepossiblenumberofvul- paredtotheparentalgenotypestoidentifypossibledenovoevents.PredictedDN nerablegenes(T)from1to4,000.ForeachT,weestimate(throughasimulation SNVswereanalysedagainstasetof946exomestoremoverecurrentartefactsand describedinthenextparagraph)thelikelihoodL(T)5P(RjT,Mdist,Pdist,K).As- likelyundercalledsites.IndelswerealsocalledwiththeGATKUnifiedGenotyper sumingallnumbersofvulnerablegenesfrom1to4,000areequallylikely,wecom- andSAMtools45,andincludedonlythosewith$25%ofreadsshowingavariantat puteaposteriordistributionofthenumberofgenesp(T)proportionaltoL(T)and aminimumdepthof83.Thesewerethenfilteredagainstalargersetof1,779 determinethemaximumvalueand95%confidenceintervals. exomes(aswithSNVs).Thosesitespassing(thatis,notpresent)intheexome Toestimatethelikelihood,L(T)5P(RjT,Mdist,Pdist,K),weperform10,000 screenandalsonotpresentinmultipleUW-M3processedfamiliesweremanually simulationsforeveryT.Ineachsimulation: evaluatedbyinspectingalignmentsintheIntegrativeGenomicsViewer(http:// (1)WerandomlyselectTdistinctvulnerablegenesfromallgenes,withoutrespect www.broadinstitute.org/igv/home).Siteswithobviousmisalignments(forexample, tolength.Unlikemutation,whichstrikesageneaccordingtoitslength,weassume non-gappedindelsorsoft-clippedonlyreads)wereremoved.Moreover,ifreads thatthechanceagenecancauseautismifmutatedisindependentofitslength. supportingthepredictedDNmutationwerepresentin$5%of20(ormore)reads (2)WeselectthenumberNofcontributoryeventsbysamplingfromabino- inoneoftheparents,thesitewasexcluded.Forsiteswithlowercoverage,avariant mialdistributionBinom(K,A/M),whereParandomlyselectedratefromPdist,M wasexcludedifpresentin$10%(forexample,1in10,or2in20)ofparentreadsor isarandomlyselectedfromMdist,A5M2Pisasampledascertainmentdif- (forquads)ifatleastonevariantreadwaspresentinoneparentandtheotherchild. ferential,andA/Misanestimateoftheproportionofcontributoryevents. Yalepipeline.TheYaledatawereanalysedasdescribedpreviously6.Inbrief, (3)WesimulateNcontributorymutationeventsbyselectingNeventswith CASAVA1.8wasusedfordemultiplexingandbasecalling,readswerealignedto replacementfromtheTvulnerablegenesproportionaltotheirlength. hg19withBWA42,andSAMtools45wasusedformarkingPCRduplicatesandgeno- (4)Tosimulaterandomevents,weselectK2Ngenesfromallwell-covered typing.In-housescriptswereusedforfamily-basedassessmentofdenovomutations geneswithreplacementproportionaltotheirlength. andannotationagainstgenesandtheexomevariantserver(http://varianttools. (5)WerecordthenumberofrecurrenteventsintheKselectedeventsfromabove. sourceforge.net/Annotation/EVS). WesetL(T)5P(RjT,Mdist,Pdist,K)to betheproportionofsimulationsin Nullmodelsfortargetoverlapsandrecurrence.Weintroducethetermmuta- whichthenumberofrecurrenteventsisexactlyR.P(T)isobtainedbynormal- tion-child-typetorefertoasetofeventsofacertainmutationaltype(forexample, izing L(T). For every simulation in which the number of recurrent events is missenseorLGD)inchildrenofacertaintype(forexample,maleaffectedindivi- exactlyR,wealsorecordtheproportionofcontributoryeventsamongtherecur- dualswithhigherIQorunaffectedsiblings).Weobservetargetenrichmentingene rentevents,andthevulnerabilitypointestimateasdiscussedinthenextsection. classes,anddocumentoverlapsandrecurrencebetweenandwithinmutation-child- Vulnerability.Weusetheequationdescribedinthetext:F3A5P3H3V,in types.Tomeasuresignificance,weuseanullmodelinwhichtheprobabilitythata whichFistheprevalenceofthegivenconditioninthepopulation,Aistheascer- geneishitbymutationisproportionaltoitslength,amodelsupportedbyobser- tainmentdifferentialforDNmutationsofagiventypeinpersonsascertainedfor vation(ExtendedDataFig.5).Weexaminethedistributionsoflengthsofgene thatcondition,PistheexpectedproportionofthepopulationwithsuchDNmu- targetsofdenovosynonymous,missenseandLGDmutationinaffectedchildren tations,Histheprobabilitythatsuchamutationhitsthetarget,andVisthemean andsiblings.Thesedistributionsarecomparedtosimulationsofgenespickedat classvulnerability.Thesevariablesareinfactrandomvariableswithempirically randomorinproportiontotheirlength.Thedatafitwellwiththemodelthat deriveddistributions. mutationfrequencyislinearlydependentongenelength.Thegroupwiththe largestdeviationfromthisruleisthesetofDNtargetsinaffectedchildren,both Wefirstdemonstratethemethodforcomputingtheclassvulnerabilitypoint for missense (P50.001) and for LGDs (P50.001, Supplementary Table 13). estimateforgenesvulnerabletoLGDmutationsfortheASDmalesoflowerIQ, ThesePvaluesaredefinedastheprobabilitythatthemedianlengthofthetarget assumingthatthevariablesarefixed.Onein75malesisdiagnosedwithautism, classcanariseunderthenullmodel,andarecomputedbysimulationsofequal andweestimate(fromempiricallyderivedgenderbiases)thatthree-quartersof numberofgenesweightedbylength.Althoughthedeviationisstatisticallysignifi- thesemalesareoflowerIQ,yieldingaprevalenceF51/100.Fromourstudy,0.23 cant,itisofsuchaminoramountthatweignoreitforthenullmodel. ofthesehaveanLGD.BecausetheexpectedproportionofpeoplewithaDNLGD Measuringoverlaps.Wetestforoverlapsbetweentargetsofagivenmutation- mutationis P50.11, only A50.12of this subpopulationhave an LGDin a child-typeandothersetsofgenes(forexample,overlapofDNLGDtargetsin vulnerablegenethatcontributestoascertainment.ThusF3A51.231023is affectedgirlswithFMRP-associatedgenes)aswellasoverlapsbetweentargetsof theproportionofmalesthathavelowerIQandautismresultingatleastpartially twodifferentmutation-child-types(forexample,overlapbetweenthetargetsof fromaDNLGD.ThisproportionisalsogivenbyP3H3V,inwhichHisthe DNmissenseinallprobandsandthetargetsofDNLGDsinallprobands).Inboth probabilitythattheLGDhitswithinthegenesvulnerabletoLGDmutations,and cases,observedoverlapsarecomparedtothoseexpectedunderthelength-based Visthemeanclassvulnerabilityforthesegenes.P,asalreadystated,is0.11.We nullmodeldiscussedabove. havecomputedthenumberofgenesvulnerabletoLGDmutations,N,forthe LetTbethetargetsofmutationofagiventypeinachildofagiventype,Sa affectedmaleswithlowerIQtobeabout400genes(SupplementaryTable6). predefinedgeneset,andOtheintersectionofTandS.WeaskforanygeneGthat Assumingmembershipinthetargetclassisindependentofgenelength,and carriesasinglemutation,whattheprobabilityp(S)isthatthemutation(and about20,000genes,wecalculateH5400/20,00050.02,andsolveVtobe0.55. henceG)fallsinS.Weestimatep(S)bycollapsingallrecurrenthitstoone,and Weassumethefollowingprevalence:F51/75forASDinmales,F51/100for applyingthelength-basednullmodeltoS.Thusp(S)istheratioof(1),thesumof ASDwithlowerIQinmales,F51/300forASDwithhigherIQinmales,and exome-capturedlengthsofthegenesofS,dividedby(2),thesumoftheexome- F51/300forASDingirls.AandPareempiricallyderivedgammadistributions capturedlengthsofallgenes.SupplementaryTable7showsthelengthofthe fromthesampledPoissonratesofDNLGDmutationsinaffectedandunaffected capturedportionofallgenesintheexomeweanalyse.Using‘jj’todesignatethe siblings.BykeepingtheobservednumberofLGDeventsandtheobservedpro- numberofmembersinaset,wethenperformatwo-sidedbinomialtestofjOj portionofLGDeventsconstant,wesamplefromthedistributionoftargetnum- outcomesinjTjopportunitiesgiventheprobabilityofsuccessp(S). berNandthedistributionsonAandPasdescribedintheprevioussection.Weset Whenwetestoverlapsbetweentargetsoftwodifferentmutation-child-types, Htobetheratioofthetotallengthofuniformlysampledvulnerablegenestothe wetakeoneofthetargetsasTandcomparetheothertargetsasS.However,before totallengthoftheanalysedcapturedexome,andcomputeavulnerabilitypoint constructingTandS,wecleanuptargetssharedbyTandSthatresultfrom estimateasdescribedjustabove.ThesesampledvaluesaredisplayedinFig.3, mutationssharedbetweensiblingsinthesamefamily,orfrommultiplemutations bottom.ThemodeforVis0.4formalesoflowerIQ. ofdifferenttypesaffectingasinglegeneinonechild.Wethenapplythemethodof ParentalageandphasingofDNmutations.Weusedtwodifferentstrategiesfor theparagraphabove.Finally,wereversetheprocedureforcreatingofSandT,and modellingtherelationshipbetweenratesofDNsubstitutionsandtheagesofthe reportbothresults(SupplementaryTable6). parents. ©2014Macmillan Publishers Limited. All rights reserved ARTICLE RESEARCH ThefirststrategydoesnotdependonknowledgeoftheparentoforiginforDN download.html).Thisdatasetprovidesnormalizedexpressionlevelsfor,17,000 substitutions,whichwedonotknowforthevastmajorityofDNsubstitutions. genesacrossbrainregionsfrom36individuals,18ofwhichwerefromembryos. Becausetheagesofthemotherandthefatherarestronglycorrelated,wecan Eachbrainwasfurthersubdividedinto14anatomicalregionsforatotalof508 effectivelyusethisstrategyonlytoexploretherelationshipbetweenthefather’s regions.Wecomputedcorrelationvaluesforthe17,000genes,andgenerateda ageandtheratesofDNsubstitutions.Overprobandsandsiblingsinthe403 graphbyconnectinggenesthathadcorrelations.0.85,thenidentifiedconnected -jointfamilytarget,wemodelthenumberofmutationsperchildassampledfrom componentsandaveragedtheexpressionofgeneswithinthesecomponentsasa aPoissondistributionwithrateRc5Tc3(A3Fc1B),inwhichRcistherateof functionoftheannotatedageofthebrainandbyregion.Eachregionissortedfirst DNsubstitutionsperchild,Fcistheageofthefatheratthebirthofthechild,Tcis byage,thenbytype(ExtendedDataFig.8).Theaveragednormalizedexpressionof theratioofthelengthofthe403-targetinthatchildtothetotalexomelength,and the1,912genesinthefirstcomponentdecreasesafterbirth,andhencewecallthis AandBarewholepopulationparameters,estimatedbymaximizingthelike- setembryonic. lihoodoverallchildren. SupplementaryTable7showsthegenesintheeightfunctionalclassesthatare ThesecondstrategyisapplicableonlytoDNmutationsforwhichwehave withinthecapturedexomeregionsandwereusedinallanalyses. successfully‘phased’theparentoforiginbyproximitytoalinkedpolymorphism. Foreachparentalgender,weseparatelyperformatwo-sidedone-samplet-testto 40. O’Roak,B.J.etal.Exomesequencinginsporadicautismspectrum comparetheparentalagesofeachphasedDNmutationtothemeanofparental disordersidentifiesseveredenovomutations.NatureGenet.43,585–589 (2011). agesinourpopulation. 41. Boyle,E.A.,O’Roak,B.J.,Martin,B.K.,Kumar,A.&Shendure,J.MIPgen:optimized DNsubstitutionsincrease,0.4perpaternaldecade(ExtendedDataFig.4), modelinganddesignofmolecularinversionprobesfortargetedresequencing. consistentwithpreviousstudies15andtheincreaseinautismasafunctionof Bioinformatics30,2670–2672(2014). paternalage46,47.Wherewecoulddetermineparentalphase,DNsubstitutionsarose 42. Li,H.&Durbin,R.FastandaccurateshortreadalignmentwithBurrows–Wheeler morefrequentlyinthepaternal(287)thaninthematernal(80)background.Among transform.Bioinformatics25,1754–1760(2009). phasedDNevents,themeanageatbirthwas34.6forthefatherand32.0yearsfor 43. McKenna,A.etal.TheGenomeAnalysisToolkit:aMapReduceframeworkfor analyzingnext-generationDNAsequencingdata.GenomeRes.20,1297–1303 themother,whereastherespectivemeanageswere33.2and31.1yearsforfathers (2010). andmothersinthewholepopulation(P50.0001and0.047,respectively,thatthese 44. Narzisi,G.etal.Accuratedenovoandtransmittedindeldetectionin differencesarisebychance). exome-capturedatausingmicroassembly.NatureMethods11,1033–1036 Geneclassdefinition.Fordeterminingoverlapwithdenovomutations,func- (2014). tionalgeneclassesweredefinedasfollows.‘FMRP’aregenesencodingtranscripts 45. Li,H.etal.TheSequenceAlignment/MapformatandSAMtools.Bioinformatics25, thatbindtoFMRP17.‘Chromatin’indicateschromatinmodifiersasdefinedby 2078–2079(2009). 46. Reichenberg,A.etal.Advancingpaternalageandautism.Arch.Gen.Psychiatry63, GeneOntology(GO;http://www.geneontology.org/).‘PSD’isasetofgenesencod- 1026–1032(2006). ing proteins that have been identified in postsynaptic densities20. ‘Mendelian’ 47. Croen,L.A.,Najjar,D.V.,Fireman,B.&Grether,J.K.Maternalandpaternalageand representpositionallyidentifiedhumandiseasegenes22,and‘essential’genesare riskofautismspectrumdisorders.Arch.Pediatr.Adolesc.Med.161,334–340 humanorthologuesofmousegenesassociatedwithlethalityintheMouseGenome (2007). Database21.‘dnLGD(Scz)’aredenovoLGDsinschizophrenia26,48,49and‘dnLGD 48. Gulsuner,S.etal.Spatialandtemporalmappingofdenovomutationsin schizophreniatoafetalprefrontalcorticalnetwork.Cell154,518–529 (ID)’aredenovoLGDsinintellectualdisability25,29. (2013). ‘Embryonic’genesarethoseexpressedinpost-mortemhumanembryonicbrains19, 49. Xu,B.etal.Exomesequencingsupportsadenovomutationalparadigmfor derived from downloaded expression data18 (http://www.brainspan.org/static/ schizophrenia.NatureGenet.43,864–868(2011). ©2014Macmillan Publishers Limited. All rights reserved RESEARCH ARTICLE ExtendedDataFigure1|Numberoffamiliessequencedbycentre. The numbersoffamiliessequencedatthethreecentresareplottedasaVenn diagram.Familiessequencedatmorethanonecentreareindicatedbythe overlappingregionsbetweencircles.CSHL,ColdSpringHarborLaboratory; UW,UniversityofWashington,Seattle;YALE,YaleMedicalCenter. ©2014Macmillan Publishers Limited. All rights reserved