Human Molecular Genetics, 2017, Vol. 26, No. 18, 3639–3650
doi: 10.1093/hmg/ddx280
Advance Access Publication Date: 17 July 2017
Association Studies Article

ASSOCIATION STUDIES ARTICLE

Trans-ethnic meta-regression of genome-wide association studies accounting for ancestry increases power for discovery and improves fine-mapping resolution

Reedik Mägi1, Momoko Horikoshi2,3, Tamar Sofer4, Anubha Mahajan2, Hidetoshi Kitajima2, Nora Franceschini5, Mark I. McCarthy2,6,7, COGENT-Kidney Consortium, T2D-GENES Consortium and Andrew P. Morris1,2,8,9,*

1 Estonian Genome Center, University of Tartu, Tartu, Estonia
2 Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK
3 Laboratory for Endocrinology, Metabolism and Kidney Diseases, RIKEN, Center for Integrative Medical Sciences, Yokohama, Japan
4 Department of Biostatistics, University of Washington, Seattle, WA, USA
5 Department of Epidemiology, University of North Carolina, Chapel Hill, NC, USA
6 Oxford Centre for Diabetes, Endocrinology and Metabolism, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
7 Oxford NIHR Biomedical Research Centre, Oxford University Hospitals Trust, Oxford, UK
8 Department of Biostatistics and 9 Department of Molecular and Clinical Pharmacology, University of Liverpool, Liverpool, UK

*To whom correspondence should be addressed at: Department of Biostatistics, University of Liverpool, Block F Waterhouse Building, 1-5 Brownlow Street, Liverpool L69 3GL, UK. Tel: +44 1517949756; Fax: +44 1517065932; Email: [email protected]

Abstract

Trans-ethnic meta-analysis of genome-wide association studies (GWAS) across diverse populations can increase power to detect complex trait loci when the underlying causal variants are shared between ancestry groups. However, heterogeneity in allelic effects between GWAS at these loci can occur that is correlated with ancestry. Here, a novel approach is presented to detect SNP association and quantify the extent of heterogeneity in allelic effects that is correlated with ancestry. We employ trans-ethnic meta-regression to model allelic effects as a function of axes of genetic variation, derived from a matrix of mean pairwise allele frequency differences between GWAS, and implemented in the MR-MEGA software. Through detailed simulations, we demonstrate increased power to detect association for MR-MEGA over fixed- and random-effects meta-analysis across a range of scenarios of heterogeneity in allelic effects between ethnic groups. We also demonstrate improved fine-mapping resolution, in loci containing a single causal variant, compared to these meta-analysis approaches and PAINTOR, and equivalent performance to MANTRA at reduced computational cost. Application of MR-MEGA to trans-ethnic GWAS of kidney function in 71,461 individuals indicates stronger signals of association than fixed-effects meta-analysis when heterogeneity in allelic effects is correlated with ancestry. Application of MR-MEGA to fine-mapping four type 2 diabetes susceptibility loci in 22,086 cases and 42,539 controls highlights: (i) strong evidence for heterogeneity in allelic effects that is correlated with ancestry only at the index SNP for the association signal at the CDKAL1 locus; and (ii) 99% credible sets with six or fewer variants for five distinct association signals.

Received: May 15, 2017. Revised: June 28, 2017. Accepted: July 13, 2017
© The Author 2017. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Introduction

There is increasing evidence from genome-wide association studies (GWAS) that common SNPs driving complex human trait associations are shared across diverse populations (1,2), and furthermore, that alleles at these signals demonstrate concordant directions of effect across ethnicities (3). This observation is consistent with a model in which causal variants are shared across diverse populations, for which trans-ethnic meta-analysis offers an opportunity to increase power to detect novel loci through increased sample size. However, heterogeneity in allelic effects between GWAS at SNPs in these loci, which cannot be accommodated through traditional fixed-effects meta-analysis, but which is correlated with ancestry, can occur for several reasons. First, variability in patterns of linkage disequilibrium (LD) with the causal variant(s) between ethnic groups will propagate heterogeneity between populations in the allelic effects of SNPs, which has the advantage of enabling high-resolution fine-mapping (4–6). Second, the causal variant(s) may interact with an environmental risk factor that differs in exposure across populations, or with SNPs that differ in allele frequency between ethnic groups, thereby generating heterogeneity in allelic main effects unless accounted for in the analysis. Third, the quality of imputation might vary between populations, dependent on the reference panel used, leading to downward bias in allelic effect estimates within ethnic groups in which genotypes are less well predicted.

One approach to allow for heterogeneity in allelic effects between GWAS is to utilise meta-analysis under a random-effects model. The RE2 meta-analysis increases power over the traditional random-effects model by taking account of the expected homogeneity of allelic effects between GWAS under the null hypothesis of no association for which all allelic effects are zero (7). However, these models do not assume any structure to the heterogeneity in allelic effects between populations that would be expected in trans-ethnic meta-analysis. To account for this structure, MANTRA implements a Bayesian partition model that clusters GWAS using a prior model of similarity between them, assessed by mean pairwise genome-wide allele frequency differences (8). Compared to fixed- and random-effects meta-analysis, MANTRA has been demonstrated to increase power to detect association and improve the resolution of trans-ethnic fine-mapping across a range of heterogeneity scenarios (8,9). Nevertheless, MANTRA utilises Markov chain Monte Carlo methods to approximate the posterior distribution of model parameters, which can be computationally intractable for meta-analysis of large numbers of GWAS and SNPs. For trans-ethnic fine-mapping, methodology integrating association summary statistics and functional annotation to improve localisation of causal variants has been implemented in PAINTOR (10), although this approach does not take account of the genetic similarity between GWAS to inform the structure of heterogeneity in allelic effects.

To address the shortcomings of existing methodologies for aggregating GWAS from diverse populations, we have developed a novel approach to detect and fine-map complex trait association signals via trans-ethnic meta-regression. This approach uses genome-wide metrics of diversity between populations to derive axes of genetic variation via multi-dimensional scaling. Allelic effects of a variant across GWAS, weighted by their corresponding standard errors, can then be modelled in a linear regression framework, including the axes of genetic variation as covariates. The flexibility of this model enables partitioning of the heterogeneity into components that are correlated with ancestry and residual variation, which would be expected to improve fine-mapping resolution. Here, we present the results of a detailed simulation study to investigate the properties of trans-ethnic meta-regression for the detection and fine-mapping of loci containing a single causal variant contributing to a binary phenotype over a range of scenarios for heterogeneity in allelic effects between diverse populations. We compare the performance of the meta-regression with fixed- and random-effects (RE2) meta-analysis, implemented in METASOFT (7), and with MANTRA (8) and PAINTOR (10) in the context of fine-mapping. We also present the results of an application of trans-ethnic meta-regression to: (i) GWAS of kidney function in 71,461 individuals of African American, East Asian, European and Hispanic/Latino ancestry from the COGENT-Kidney Consortium (11); and (ii) fine-mapping four type 2 diabetes (T2D) susceptibility loci in 22,086 cases and 42,539 controls of East Asian, European, South Asian, African American and Mexican American ancestry from the T2D-GENES Consortium (12).

Results

We have developed a novel approach to aggregate association summary statistics across GWAS from diverse populations to account for heterogeneity in allelic effects that is correlated with ancestry (Materials and Methods). Briefly, we employ trans-ethnic meta-regression to model allelic effects as a function of axes of genetic variation, derived from a matrix of mean pairwise allele frequency differences between GWAS. The meta-regression model partitions heterogeneity in allelic effects between GWAS into two components: (i) heterogeneity that is correlated with ancestry; and (ii) residual heterogeneity. Bayes' factors in favour of association can be derived from the meta-regression model for each variant, enabling fine-mapping and construction of credible sets. The meta-regression methodology has been implemented in the MR-MEGA (Meta-Regression of Multi-Ethnic Genetic Association) software (http://www.geenivaramu.ee/en/tools/mr-mega).

Simulation study design

We began by undertaking a detailed simulation study to compare the performance of the meta-regression methodology with existing approaches for discovery and fine-mapping of GWAS loci across diverse populations. We considered the 26 reference populations from Phase 3 of the 1000 Genomes Project (13), incorporating haplotypes of African, East Asian, European, Native American and South Asian ancestry (Supplementary Material, Table S1). We used a subset of 13,189 autosomal variants from the reference panel with minor allele frequency (MAF) > 5% in all populations, and separated by at least 1 Mb, to derive the matrix of pairwise Euclidean distances between the populations. We then implemented multi-dimensional scaling of the distance matrix to derive three axes of genetic variation to separate populations between ancestry groups (Supplementary Material, Fig. S1).

We considered a range of models of association of a causal variant with a binary phenotype across ancestry groups, parameterised in terms of the allelic effect (odds-ratio, ψ) in each population (Supplementary Material, Table S1). These scenarios incorporated heterogeneity in allelic effects of the causal variant between ancestry groups: (i) homogeneous; (ii) African-specific; (iii) Eurasian; (iv) Native American; (v) random (non-ancestral). Under model (i), the allelic effect of the causal variant is homogeneous across all populations. Under model (ii), the allelic effect of the causal variant is specific to populations of African ancestry. Under model (iii), the allelic effect of the causal variant is zero in populations of African ancestry, and heterogeneous between populations of East Asian ancestry and those of European, South Asian and Native American ancestry. Under model (iv), the allelic effect of the causal variant is specific to, but heterogeneous between, populations of Native American ancestry. Finally, under model (v), the allelic effect of the causal variant is specific to one population in each ancestry group.

[Figure 1: five panels (homogeneous, African-specific, Eurasian, Native American and non-ancestral heterogeneity scenarios), each plotting power (0 to 1) against odds-ratio (1 to 1.25) for meta-regression, random-effects and fixed-effects meta-analysis.]

Figure 1. Power to detect association, at genome-wide significance (P < 5×10^-8), using alternative approaches to aggregate GWAS across diverse populations: fixed-effects meta-analysis; random-effects (RE2) meta-analysis; and meta-regression including axes of genetic variation as covariates as implemented in MR-MEGA. Power is presented as a function of the allelic odds-ratio for each of five scenarios for heterogeneity in effects between populations, described in Supplementary Material, Table S1.

Simulation study: false positive error rate and power

To assess false positive error rates and power for each scenario, we generated 1,000 replicates of genotype data for the causal variant in 1,000 cases and 1,000 controls from each population (Materials and Methods). Association summary statistics for the causal variant were aggregated across populations using the meta-regression model, implemented in MR-MEGA, including three axes of genetic variation as covariates to separate ancestry groups. For comparison, we also aggregated association summary statistics via fixed-effects (inverse-variance weighted log-odds ratios) and random-effects (RE2) meta-analysis implemented in METASOFT (7). We have not included MANTRA in our comparisons of methods for false positive error rates and power because: (i) the increased computational burden makes simulations intractable; and (ii) the required derivation of nominal and genome-wide significance thresholds for Bayes' factors in favour of association across the allele frequency spectrum is not straightforward.

False positive error rates for detecting association were consistent with the nominal significance threshold (P < 0.05), across all heterogeneity scenarios considered, for fixed- and random-effects meta-analysis, and for meta-regression including three axes of genetic variation to account for heterogeneity in allelic effects between ancestry groups (Supplementary Material, Table S2).

For scenarios in which heterogeneity in allelic effects between populations was correlated with ancestry (African-specific, Eurasian and Native American), greatest power to detect association was attained for the meta-regression including three axes of genetic variation as covariates (Fig. 1). The gains in power over fixed- and random-effects meta-analysis were greatest when the effect of the variant was specific to one ancestry group (African-specific and Native American). For all three of these scenarios, power to detect heterogeneity in allelic effects that is correlated with ancestry in the meta-regression model is greater than that obtained from Cochran's Q statistic in the fixed-effects meta-analysis (Supplementary Material, Fig. S2).
3642 | HumanMolecularGenetics,2017,Vol.26,No.18 3.5 modelattainedthenominalsignificance threshold(P<0.05), as expected(SupplementaryMaterial,Fig.S2).Powertodetectresid- 3 ualheterogeneityinalleliceffectsinthemeta-regressionmodel e) og scal 2.5 aotrtvaiianeCdocthheranno’smQinsatlastiisgtnicifiicnatnhceefithxereds-hefofledc.tsmeta-analysisalso et size (l 2 e s 1.5 dibl Simulationstudy:fine-mappinglociwithasinglecausal e % cr 1 variant 9 9 0.5 Toassessfine-mappingresolutionwithinlocicontainingasingle causalvariant,foreachscenario,wegenerated500replicatesof 0 genotype data for variation in a 2Mb genomic region, in 1,000 Homogeneous African-specific Eurasian Na(cid:3)ve American Non-ancestral D cases and 1,000 controls for each population (Materials and o Fixed-effects Random-effects Meta regression MANTRA PAINTOR Methods).Foreachreplicate,weconsideredtwosettings:(i)‘per- wn lo fect’data,whereallvariantsintheregionwerecaptured,withno a d 0.6 missing genotypes or errors, for benchmarking purposes; and ed (ii)‘imperfect’data,whereonly100randomlyselectedvariantsin fro 0.5 the2Mbregionwereretained,torepresentatypicalGWASarray, m erior probability 00..34 at(‘iMymnpadeptsteehrrfiferaeorlcmeta’snutddhlatetiMna1ge0ste0sht0cotaiGdnffesgo)ns.lod,FwmoorefegesoeaPbnctrhaooitjrneyecepptdelPsitchhwaaetasepes,ofi3mosrtreepbfrueoiotrteerhdnp‘curpeopebprtafaoebnchitelia’ltpay(1lnoo3d-f) https://acade ost drivingtheassociationforeachvariantfromthemeta-regression m an p 0.2 model, implemented in MR-MEGA, including three axes of ge- ic.o Me netic variation as covariates to separate ancestry groups. 
For up 0.1 comparison, posterior probabilities of driving the association .c o werederived,foreachvariant,from:(i)fixed-andrandom-effects m 0 meta-analysis, implemented inMETASOFT (7);(ii)MANTRA(8); /hm Homogeneous African-specific Eurasian Na(cid:3)ve American Non-ancestral g and(iii)PAINTOR(10),assumingasinglecausalvariantatthelo- /a Fixed-effects Random-effects Meta regression MANTRA PAINTOR cusandapproximatingLDbetweenvariantsineachpopulation rtic fromhaplotypesinthe1000GenomesProjectPhase3reference le Figure2.Metricsoffine-mappingresolution,with‘perfectdata’,acrossalterna- -a panel(13).NotethatwedidnotrunPAINTORinamodetoinfer b tive approaches to aggregate GWASacrossdiverse populations:fixed-effects s meta-analysis;random-effectsmeta-analysis;meta-regressionincludingaxes functionalenrichmentbecauseoursimulationsdidnotusean- tra c ofgeneticvariationascovariatesasimplementedinMR-MEGA;MANTRA;and notationtoweighttheselectionofthecausalvariantinthere- t/2 PAINTOR.Twometricsarepresented:(i)themediannumberofvariantsinthe gion.Ineachreplicate,weusedposteriorprobabilitiesfromeach 6 99%crediblesetonalog10-scale;and(ii)themeanposteriorprobabilityascribed ofthefivemethodstoconstructthe99%crediblesetdrivingthe /18 tothecausalvariant.Metricsarepresentedforeachoffivescenariosforhetero- /3 associationsignalatthelocus(MaterialsandMethods). 6 geneityineffectsbetweenpopulations,describedinSupplementaryMaterial, 3 TableS1.Ineachscenario,theoddsratiohasbeenfixedtoobtainapproximately We considered three metrics of fine-mapping performance 9/3 80%powertodetectassociationatgenome-widesignificance(P<5(cid:2)10(cid:3)8)inthe acrosssimulations:(i)thenumberofvariantsinthe99%credible 97 meta-regressionanalysis. 
set;(ii)themeanposteriorprobabilityascribedtothecausalvari- 65 6 ant;and(iii)thecoverageofthecausalvariantbythe99%credible 9 Forthescenarioinwhichheterogeneityinalleliceffectsbe- set.Smallercrediblesetscorrespondtofine-mappingathigherres- by tween populations is random (non-ancestral), power was low olution,whilstthemeanposteriorprobabilityforthecausalvari- gu e forallmethods,butgreatestforrandom-effectsmeta-analysis ant measures accuracy. For each heterogeneity scenario, we s (Fig.1).Asexpected,powertodetectheterogeneityinallelicef- considered population-specific odds-ratios with approximately t on fects that is correlated with ancestry in the meta regression 80%powertodetectassociationwiththemeta-regressionmodel 09 modelattainedthenominalsignificancethreshold(P<0.05)for (Fig.1):homogeneous,w¼1.10;African-specific,w¼1.25;Eurasian, A p thisscenario(SupplementaryMaterial,Fig.S2).Powertodetect w¼1.10;NativeAmerican,w¼1.30;andnon-ancestral,w¼1.35. ril 2 residual heterogeneity in allelic effects in the meta-regression Wefirstconsideredthe‘perfect’datasetting,whereallvari- 0 1 model or via Cochran’s Q statistic in the fixed-effects meta- antsintheregionwerecaptured,withnomissinggenotypesor 9 analysiswasequivalent. errors (Fig. 2, Table 1). First, we note that, across the range of Finally,forthescenarioofhomogenousalleliceffectsacross scenariosconsidered,thecoverageofthecausalvariantbythe populations, greatest power to detect association was attained credible set obtained from PAINTOR was not consistent with through fixed-effects meta-analysis, as expected (Fig. 1). There 99%, suggesting that this method is not well calibrated in our was only a small reduction in power for random-effects (RE2) simulations. 
Only meta-regression, as implemented in MR- meta-analysis,whichappropriatelyaccountsforthelackofhet- MEGA,attainedcoverageratesforthecausalvariantthatwere erogeneityunderthenullhypothesisofnoassociation(7).There consistentwith99%acrossallheterogeneityscenarios.Forsce- wasafurthersmallreductioninpowerforthemeta-regression narios in which heterogeneity in allelic effects was correlated model, which was penalised for the additional parameters re- withancestry(African-specific,EurasianandNativeAmerican), quiredfortheaxesofgeneticvariationthatdonotcontributeto the resolution and accuracy of fine-mapping was always sub- heterogeneityinalleliceffectsbetweenpopulationsinthissce- stantiallyworseforthefixed-orrandom-effectsmeta-analysis, nario.Forthisscenario,powertodetectheterogeneityinallelic withthemeta-regressionmodelandMANTRAperformingbet- effects that is correlated with ancestry in the meta regression ter than PAINTOR. For example, for the Native American HumanMolecularGenetics,2017,Vol.26,No.18 | 3643 Table1.Coverageofthecausalvariantbythe99%crediblesetacross500simulationsofeachscenariowith‘perfect’dataforfivefine-mapping approaches:(i)fixed-effectsmeta-analysis;(ii)random-effectsmeta-analysis;(iii)meta-regressionaccountingforheterogeneityinallelicef- fectsimplementedinMR-MEGA;(iv)MANTRA;and(v)PAINTOR Fine-mappingmethod Heterogeneityscenario Homogeneous African-specific Eurasian NativeAmerican Non-ancestral Fixed-effects 0.998 0.986 0.660 0.944 0.996 Random-effects 1.000 0.990 0.966 1.000 0.998 Meta-regression 0.992 0.998 0.978 0.992 0.980 MANTRA 0.994 0.982 0.992 0.918 0.878 PAINTOR 0.692 0.772 0.972 0.916 0.922 Coveragerateshighlightedinboldareconsistentwith99%(basedon500simulationsofeachofthefivescenarios). 
D o w 3.5 allelic effects between populations is random (non-ancestral), nlo PAINTOR outperformed all other methods in terms of fine- a d 3 mappingresolutionandaccuracy.Forthisscenario,axesofge- e g scale)2.5 nmeettica-rveagrrieastisoionntmhaotdedlisctainngnuoitshfulblyroaacdcoeutnhtnifcorgnroounp-asncinesttrhael d from edible set size (lo1.52 hmamneoetttgesaer-oinangoneutanhsleyeaistliy9lse9lb%miceetecwtfrhfeeeodecdnitbsslGeacoWcsrneoAstsiSsdw.epFraoeisnpdaus.lliHlamyto,iiwolfanoerrsv,aettchrh,reeothssnsecuetmmnhabeereairronaonopfgfovesahtrooei---f https://acad % cr 1 rior probability for the causal variant was substantially lower em 990.5 forPAINTORthantheotherfine-mappingmethods. ic.o Wethenconsideredthemorerealistic‘imperfect’dataset- up 0 ting,inwhichasubsetofgeneticvariationacrossalocuswas .co Homogeneous African-specific Eurasian Na(cid:3)ve American Non-ancestral assayeddirectlywithaGWASarray,withsubsequentimputa- m /h Fixed-effects Random-effects Meta regression MANTRA PAINTOR tion up to haplotypes from the 1000 Genomes Project Phase 3 m g referencepanel(13).Coverageofthecausalvariantbythe99% /a 0.25 crediblesetwasreducedforallmethodsacrosstherangeofsce- rtic nariosconsidered(SupplementaryMaterial,TableS3).Thisre- le -a ducedcoveragereflectsthatthecausalvariantmaynotalways b Mean posterior probability000.1..125 bdcprscaeureueinrsrbcvwfg,eeoeededrrdelmaltosfosuifaomstlnrothspic‘ccnepeuiegaecnttraeoaiifduofnernsicaotaethcsslxdrievcgoaclansuotmrsaansila’eia,socntilahondltlmoeptwhfdrorpeaosopdasumurwlgelowahdiwtttahhiwteosehrnieitc(cmshFor,pienpogaodsut.nshii3ttdbseee)tl.drreetihnvosudatrersatpiwt.amranioTtabthhasyaceabrthohtirlaestiavthlstyaeettoairlhvobesee---- stract/26/18/3639/397 0.05 We also compared, across simulations, the computational 65 6 burdenofeachofthetrans-ethnicmeta-analysisapproachesto 9 b 0 assess association with variants within the locus y Homogeneous African-specific Eurasian Na(cid:3)ve American Non-ancestral (Supplementary Material, 
Table S4). Using a dedicated single gu e Fixed-effects Random-effects Meta regression MANTRA PAINTOR coreprocessor,MANTRAwasthemostcomputationallyexpen- s sive(meanruntimeof66minutes),comparedtolessthantwo t on Figure3.Metricsoffine-mappingresolution,withimputeddata,acrossalterna- minutesforallothermethods. 09 tive approaches to aggregate GWASacrossdiverse populations:fixed-effects A meta-analysis;random-effectsmeta-analysis;meta-regressionincludingaxes p ofgeneticvariationascovariatesasimplementedinMR-MEGA;MANTRA;and ril 2 PAINTOR.Twometricsarepresented:(i)themediannumberofSNPsinthe99% Trans-ethnicmeta-analysisofGWASofkidneyfunction 0 1 crediblesetonalog10-scale;and(ii)themeanposteriorprobabilityascribedto WeconsiderednineGWASofkidneyfunction,assessedbythe 9 thecausalvariant.Metricsarepresentedforeachoffivescenariosforheteroge- estimatedglomerularfiltrationrate(eGFR),in71,461individuals neity in effects between populations, described in Supplementary Material, TableS1.Ineachscenario,theoddsratiohasbeenfixedtoobtainapproximately ofAfricanAmerican,EastAsian,EuropeanandHispanic/Latino 80%powertodetectassociationatgenome-widesignificance(P<5(cid:2)10(cid:3)8)inthe ancestry(SupplementaryMaterial,TableS5).Analysesofthese meta-regressionanalysis. GWAS, including 71,638 individuals, have been previously re- portedbytheCOGENT-KidneyConsortium(11).However,since scenario, the median number of SNPs in the 99% credible set publication of these results, 177 individuals from HCHS/SOL was1,156and2,063forfixed-andrandom-effects,respectively, have withdrawn consent, and association analyses have been whilstforthemeta-regression,MANTRAandPAINTORwasjust repeated for this cohort. 
Each GWAS was imputed up to the 7,10and15,respectively.Thisimprovedfine-mappingresolu- 1000 Genomes Project Phase 1 reference panel (14), and each tionreflectstheincreasedpowerobtainedthroughmodellingof variantpassingqualitycontrolwastestedforassociationwith heterogeneityinalleliceffectsbetweenGWASthatiscorrelated eGFR (Materials and Methods). Association summary statistics with ancestry. For the scenario in which heterogeneity in for each variant were aggregated across studies via: (i) fixed- 3644 | HumanMolecularGenetics,2017,Vol.26,No.18 Table2.Lociattaininggenome-widesignificantevidenceofassociation(P<5(cid:2)10(cid:3)8)witheGFRinMR-MEGAmeta-regressionof71,461individuals Locus LeadSNP Chr Position(bp,b37) Alleles Fixed-effectsmeta-analysis MR-MEGAmeta-regression Effect Other Beta SE P-value pQ P-value pHET-ANC pHET-RES SLC43A1 rs35716097 5 176,806,636 T C (cid:3)1.092 0.128 3.5(cid:2)10(cid:3)17 0.13 3.0(cid:2)10(cid:3)17 0.016 0.66 SHROOM3 rs28394165 4 77,394,018 C T (cid:3)0.949 0.117 1.0(cid:2)10(cid:3)15 0.0028 1.8(cid:2)10(cid:3)15 0.041 0.036 UNCX rs62435145 7 1,286,567 T G (cid:3)1.097 0.138 4.0(cid:2)10(cid:3)15 0.16 8.3(cid:2)10(cid:3)15 0.042 0.58 PDILT-UMOD rs77924615 16 20,392,332 G A (cid:3)1.184 0.147 1.9(cid:2)10(cid:3)15 0.010 9.7(cid:2)10(cid:3)15 0.10 0.017 BCAS3 rs9895661 17 59,456,589 C T (cid:3)0.990 0.132 1.7(cid:2)10(cid:3)13 0.18 6.5(cid:2)10(cid:3)13 0.085 0.38 GCKR rs1260326 2 27,730,940 C T (cid:3)0.867 0.115 9.0(cid:2)10(cid:3)14 0.072 2.0(cid:2)10(cid:3)12 0.54 0.041 WDR72 rs690428 15 53,950,578 A C (cid:3)0.688 0.115 4.1(cid:2)10(cid:3)9 7.8(cid:2)10(cid:3)5 8.7(cid:2)10(cid:3)11 0.00053 0.0081 CPS1 rs715 2 211,543,055 C T (cid:3)0.880 0.128 1.1(cid:2)10(cid:3)11 0.21 1.3(cid:2)10(cid:3)10 0.31 0.21 SPATA5L1-GATM rs2486288 15 45,712,339 C T (cid:3)0.875 0.126 7.5(cid:2)10(cid:3)12 0.73 1.8(cid:2)10(cid:3)10 0.65 0.63 D ALMS1 rs11884776 2 73,746,923 C T (cid:3)0.929 0.141 7.6(cid:2)10(cid:3)11 0.15 3.3(cid:2)10(cid:3)10 0.59 0.035 o 
w LRP2 rs57989581 2 170,194,459 C A (cid:3)1.961 0.315 8.6(cid:2)10(cid:3)10 0.19 7.7(cid:2)10(cid:3)10 0.025 0.70 n PIP5K1B rs4744712 9 71,434,707 A C (cid:3)0.756 0.112 2.8(cid:2)10(cid:3)11 0.90 8.1(cid:2)10(cid:3)10 0.80 0.80 loa PRKAG2 rs10265221 7 151,414,329 C T (cid:3)0.952 0.146 1.2(cid:2)10(cid:3)10 0.24 1.9(cid:2)10(cid:3)9 0.44 0.19 de d DSLACB222-AC29 crsh3r156:30904904526:D 56 16309,,460745,,572664 DC RT (cid:3)(cid:3)01..812920 00..112963 11..22(cid:2)(cid:2)1100(cid:3)(cid:3)190 00..7498 21..24(cid:2)(cid:2)1100(cid:3)(cid:3)98 00..5366 00..7449 from LOC100132354-VEGFA rs881858 6 43,806,609 A G (cid:3)0.777 0.127 1.6(cid:2)10(cid:3)9 0.0019 2.1(cid:2)10(cid:3)8 0.40 0.00092 h ttp s Chr:chromosome.SE:standarderror.pQ:Cochran’sQP-value.pHET-ANC:P-valueforheterogeneitycorrelatedwithancestry.pHET-RES:P-valueforresidualheterogeneity. ://a c a d effects meta-analysis, implemented in METASOFT (7); and step-wiseconditionalanalysesrevealedatotalofsevendistinct e m (ii)trans-ethnicmeta-regression,implementedinMR-MEGA,in- signals of association across the four loci, three mapping to ic cluding the two axes of genetic variation as covariates KCNQ1,twotoCDKN2A-B,andoneeachatIGF2BP2andCDKAL1. .ou p (MaterialsandMethods,SupplementaryMaterial,Fig.S3). For each distinct association signal, we applied the meta- .c o Genome-wide,weobservedstrongcorrelationinassociation regressionmodel,implementedinMR-MEGA,includingthreeaxes m P-valuesforeGFRfromthetrans-ethnicmeta-regressionandthe of genetic variation as covariates (Materials and Methods, /hm fixed-effects meta-analysis (Supplementary Material, Fig. S4). 
SupplementaryMaterial,Fig.S6).Weobservedgenome-widesignif- g /a StrongersignalsofassociationwitheGFRwereobservedfromthe icantevidenceofT2Dassociation(P<5(cid:2)10(cid:3)8)forindexSNPsfor rtic meta-regressionwhentherewasheterogeneityinalleliceffects eachdistinctsignalacrossthefoursusceptibilitylocifrommeta- le betweenGWASthatwascorrelatedwithancestry.Atotalof16 regressionaccountingforancestrywiththreeaxesofgeneticvaria- -ab lociattainedgenome-widesignificantevidence(P<5(cid:2)10(cid:3)8)ofas- tion as covariates (Table 3, Supplementary Material, Fig.S7). We stra sociation with eGFR from the trans-ethnic meta-regression observedstrongevidenceforheterogeneityinalleliceffectsthatis c (Table 2), with the strongest signals observed at/near SLC34A1 correlatedwithancestryonlyattheindexSNPfortheassociation t/26 (rs35716097,P¼3.0(cid:2)10(cid:3)17),SHROOM3(rs28394165,P¼1.8(cid:2)10(cid:3)15), signalattheCDKAL1locus(rs9368222,P¼0.00042).Theheteroge- /18 UNCX (rs62435145, P¼8.3(cid:2)10(cid:3)15) and PDILT-UMOD (rs77924615, neitywasprimarilyaccountedforbythethirdaxisofgeneticvaria- /36 P¼9.7(cid:2)10(cid:3)15).Signalsofassociationattheselociwerestronger tion(P¼0.0046), whichseparatesGWASofSouthAsianancestry 39 from the fixed-effects meta-analysis than the meta-regression from those of African American, East Asian, European and /39 7 whentheleadSNPdemonstratedlittleevidenceofheterogeneity Mexican American descent (Supplementary Material, Fig. S6). 
6 5 inalleliceffectsbetweenGWAS.Thestrongestevidenceofhet- Allelic effect sizes increased along this axis (log-odds ratio 2.69, 6 9 erogeneityinalleliceffectsinthefixedeffectsmeta-analysis,as standarderror0.81),suggestingthatrs9368222hasweakereffects b y assessedbyCochran’sQstatistic,wasobservedfortheleadSNP onT2DsusceptibilityinSouthAsianpopulations(Supplementary g u atWDR72(rs690428,P¼7.8(cid:2)10(cid:3)5).Inthemeta-regressionanaly- Material,Fig.S8).Thesedataareconsistentwithpreviousreports es sis, the heterogeneity was partially correlated with ancestry ofheterogeneityattheCDKAL1locus(15,16),wherealleliceffects t o n (P¼0.00053), where allelic effects of the lead SNP on eGFR are arestrongerinEuropeanandEastAsianancestrypopulationsthan 0 9 specific to populations of European and East Asian descent inotherethnicgroups. A p (SupplementaryMaterial,Fig.S5). assoCcoinastitoruncstiiognnaolfs9a9c%rocsrsedthibelefosuetrssoufsvcaerpitainbitlsitdyrilvoicnigredvisetainlecdt ril 2 0 thattheresolutionoffine-mappingattainedfrommetaregres- 19 Fine-mappingoffourT2Dsusceptibilityloci:CDKAL1, sionwasequivalenttothatpreviouslyreportedusingMANTRA CDKN2A-B,IGF2BP2andKCNQ1 (12) (Table 4). The most precise localisation was observed for We considered 18 GWAS of T2D susceptibility in 22,086 T2D two of the association signalsat the KCNQ1 locus, indexed by cases and 42,539 controls of East Asian, European, South rs2237897(4variantsmappingto342bpofanintronofKCNQ1) Asian, African American and Mexican American ancestry and rs231353 (4 variants mapping to 38.5kb of KCNQ1-OT1). (Supplementary Material, Table S6), analyses of which have AttheCDKN2A-Blocus,the99%crediblesetsforbothassocia- beenpreviouslyreportedbytheT2D-GENESConsortium(12).In tionsignalsincorporateatotalof12non-overlapping variants theirstudy,eachGWASwasimputeduptothe1000Genomes that map to the same<5kb interval. 
Annotation of the 99% ProjectPhase1referencepanel(14)forthefourloci,andeach credible sets revealed inclusion of no coding variants, consis- variantpassingqualitycontrolwastestedforassociationwith tent with previous reports that T2D association signals at T2Dsusceptibility.Associationsummarystatisticsforeachvari- thesefourlociaremostlikelytobemediatedthroughgenereg- antwerethenaggregatedacrossGWASusingMANTRA(8),and ulation(12,17). HumanMolecularGenetics,2017,Vol.26,No.18 | 3645 Table3.IndexSNPsfordistinctT2Dassociationsignalsatfoursusceptibilitylocionthebasisofaggregationofsummarystatisticsfrom18 GWAS(22,086casesand42,539controls)fromdiversepopulationsusing:(i)MR-MEGAmeta-regressionaccountingforancestrywiththreeaxes ofgeneticvariationascovariates;and(ii)reportedresultsfromfixed-effectsmeta-analysis Locus IndexSNP Chr Position Alleles MR-MEGAmeta-regression Fixed-effectsmeta-analysis Risk Other P-value pHET-ANC pHET-RES OR(95%CI) P-value pQ IGF2BP2 rs11705729 3 185,507,299 T C 2.1(cid:2)10(cid:3)19 0.50 0.44 1.14(1.11–1.17) 1.3(cid:2)10(cid:3)21 0.49 CDKAL1 rs9368222 6 20,686,996 A C 5.1(cid:2)10(cid:3)31 0.00042 0.23 1.17(1.14–1.21) 4.1(cid:2)10(cid:3)30 0.0058 CDKN2A-B rs10965246 9 22,132,698 T C 4.8(cid:2)10(cid:3)37 0.62 0.0012 1.31(1.26–1.36) 8.4(cid:2)10(cid:3)40 0.0029 rs10757282 9 22,133,984 C T 3.9(cid:2)10(cid:3)11 0.16 0.24 1.12(1.09–1.16) 2.0(cid:2)10(cid:3)12 0.17 KCNQ1 rs231353 11 2,709,019 G A 2.7(cid:2)10(cid:3)9 0.92 0.64 1.11(1.07–1.14) 1.7(cid:2)10(cid:3)11 0.79 rs233448 11 2,840,424 C T 3.9(cid:2)10(cid:3)10 0.34 0.17 1.12(1.09–1.16) 9.5(cid:2)10(cid:3)12 0.18 rs2237897 11 2,858,546 C T 2.9(cid:2)10(cid:3)10 0.33 0.36 1.19(1.14–1.26) 7.7(cid:2)10(cid:3)12 0.35 D o Chr:chromosome.OR:odds-ratio.CI:confidenceinterval.pHET-ANC:P-valueforheterogeneitycorrelatedwithancestry.pHET-RES:P-valueforresidualheterogeneity. wn pQ:Cochran’sQstatisticP-value. 
Table 4. Properties of 99% credible sets of variants underlying distinct T2D association signals at four susceptibility loci on the basis of aggregation of association summary statistics from 18 GWAS (22,086 cases and 42,539 controls) from diverse populations using: (i) MR-MEGA meta-regression accounting for ancestry with three axes of genetic variation as covariates; and (ii) MANTRA

Locus | Index SNP | MR-MEGA SNPs | MR-MEGA distance (bp) | MR-MEGA interval (bp) | MANTRA SNPs | MANTRA distance (bp) | MANTRA interval (bp)
IGF2BP2 | rs11705729 | 40 | 39,163 | 185,495,320–185,534,483 | 36 | 31,027 | 185,503,456–185,534,482
CDKAL1 | rs9368222 | 6 | 12,330 | 20,675,792–20,688,121 | 5 | 12,330 | 20,675,792–20,688,121
CDKN2A-B | rs10965246 | 6 | 1,556 | 22,132,698–22,134,253 | 5 | 1,371 | 22,132,698–22,134,068
CDKN2A-B | rs10757282 | 6 | 4,041 | 22,133,645–22,137,685 | 7 | 4,435 | 22,133,251–22,137,685
KCNQ1 | rs231353 | 4 | 38,477 | 2,691,471–2,729,947 | 3 | 17,549 | 2,691,471–2,709,019
KCNQ1 | rs233448 | 13 | 20,175 | 2,837,723–2,857,897 | 11 | 20,273 | 2,837,625–2,857,897
KCNQ1 | rs2237897 | 4 | 342 | 2,858,295–2,858,636 | 3 | 197 | 2,858,440–2,858,636

Discussion

We have developed a novel approach to aggregating association summary statistics across GWAS from diverse populations through trans-ethnic meta-regression. The approach models allelic effects, weighted by their standard errors, as a function of axes of genetic variation, derived from pairwise allele frequency differences, genome-wide, between studies. Across a range of scenarios of heterogeneity in allelic effects between ancestry groups, meta-regression has increased power to detect association over fixed- and random-effects meta-analysis, whilst maintaining false positive error rates.

Axes of genetic variation are generated via multi-dimensional scaling of the mean allele frequency difference, genome-wide, between each pair of GWAS contributing to the meta-regression. In most consortia meta-analysis settings, allele frequencies would be expected to be provided as one of the association summary statistics for each SNP, in addition to the allelic effect size and corresponding standard error, for example. If contributing GWAS do not provide allele frequency information, one solution is to use data from reference populations, such as those from the 1000 Genomes Project (13,14). GWAS from the same broad ethnic group would be matched to the same reference population, and would therefore be located at the same position on axes of genetic variation. Consequently, MR-MEGA would be able to detect heterogeneity in allelic effects between ancestry groups, but would not be able to recognise more subtle differences, due to admixture for example, within ethnicities. We would therefore expect there to be a relative loss in power to detect association in settings where heterogeneity in allelic effects was correlated with admixture proportions, for example in the 'Native American' scenario in our simulation study. However, we would still expect increased power over fixed- and random-effects analysis by allowing for heterogeneity between ethnic groups.

Alternative metrics to the genome-wide mean allele frequency difference exist for quantifying the extent of genetic differences between GWAS. We investigated the impact of an alternative metric, the fixation index (FST) (18), on multi-dimensional scaling of the 26 populations from the 1000 Genomes Project Phase 3 reference panel (13) used in our simulation study. Whilst the absolute projection of populations onto the first three principal components changed from those obtained from mean allele frequency differences, their relative positions on these axes of genetic variation were highly correlated (Supplementary Material, Fig. S9). Consequently, the use of FST as a distance metric, instead of mean allele frequency differences, has no impact on our downstream meta-regression analysis results.

The meta-regression model assumes a linear trend in allelic effects with each axis of genetic variation included as a covariate. Whilst it is unlikely that this linear trend will hold exactly, we have demonstrated that axes of genetic variation are sufficient to cluster GWAS of similar ancestry, but also distinguish populations within the same ethnic group (Supplementary Materials, Figs S1, S3, S6). Consequently, if the allelic effect of a variant is specific to one ancestry, or varies between diverse populations according to their genetic similarity (within or between ethnic groups), including axes of genetic variation as covariates in the meta-regression model can account for this heterogeneity. Indeed, the heterogeneity scenarios considered in our simulation study do not assume a linear trend on the allelic effect of the causal variant in any of the axes of genetic variation (Supplementary Material, Table S1). However, in those scenarios for which heterogeneity is correlated (non-linearly) with ancestry (African specific, Eurasian and Native American), meta-regression including three axes of genetic variation as covariates offered improved power to detect association over fixed- and random-effects meta-analysis (Fig. 1). Only when heterogeneity is completely uncorrelated with ethnicity (non-ancestral scenario) did the power of the random-effects meta-analysis exceed that of the meta-regression.

The meta-regression model enables partitioning of heterogeneity in allelic effects between GWAS that is correlated with ancestry from residual variation due to other sources (such as variable phenotype definition). Heterogeneity in allelic effects due to ancestry is of particular relevance to fine-mapping, since it can occur as a result of differences in patterns of LD between diverse populations, which we model in the meta-regression framework by including axes of genetic variation as covariates. Consequently, the meta-regression model offers substantial gains in fine-mapping resolution over fixed- and random-effects meta-analysis, even for heterogeneity scenarios in which allelic effects do not follow a linear trend in the axes of genetic variation (Figs 2 and 3). We also compared the meta-regression approach with MANTRA, which models heterogeneity in allelic effects between GWAS according to a prior model of genetic similarity between them. The fine-mapping resolution achieved by the meta-regression model was greater than that for MANTRA, except in the scenario in which heterogeneity in allelic effects between studies was random, irrespective of ancestry, and cannot be accounted for by axes of genetic variation that distinguish broad ethnic groups. Similar performance between the methods was also observed through application to fine-mapping of association signals for T2D in four established susceptibility loci.

There has been recent development of novel methods for fine-mapping that utilise meta-analysis summary statistics and a reference panel of LD between variants across a locus, including CAVIAR (19), PAINTOR (10) and FINEMAP (20). By modelling LD between variants across a locus, these approaches have the advantage that they can allow for fine-mapping of multiple causal variants, simultaneously. However, CAVIAR and FINEMAP allow for specification of a single LD reference across the locus, which is not appropriate in the context of trans-ethnic fine-mapping because the correlation between variants is not the same for diverse populations. PAINTOR, on the other hand, overcomes this problem by allowing for specification of ethnic- or population-specific association summary statistics and LD references. Previously reported simulation highlighted substantial improvements in fine-mapping resolution for PAINTOR over an application of CAVIAR using an 'average' LD reference across all ethnic groups (10). PAINTOR also has the advantage that it can incorporate a prior model of causality based on genomic annotation, allowing a boost in the posterior probability that coding variants drive association signals, for example, as observed in genome-wide enrichment analyses (21). Nevertheless, the results of our simulation study of loci with a single causal variant highlight that PAINTOR is not well calibrated across the scenarios considered, even in the 'perfect' data setting, and has lower resolution than MR-MEGA and MANTRA (larger 99% credible sets and less posterior probability ascribed to the causal variant) when heterogeneity in allelic effects is correlated with ancestry.

An alternative approach to allow for multiple causal variants is to first dissect 'distinct' association signals at a GWAS locus through (approximate) conditional analysis (22). Conditional analyses can be performed using backward elimination to identify index variants for each distinct association signal, for example as implemented in GCTA (23), until association at the locus is fully explained. Fine-mapping is then undertaken for each distinct association signal by conditioning on all other index variants at the locus. Each of these distinct signals is assumed to represent a different underlying causal variant, acting in isolation or through haplotype effects. Such an approach has been widely employed for fine-mapping association signals for a range of complex human traits and diseases, in the context of both trans-ethnic and ancestry-specific meta-analyses (11,12,17,24–32).

Unfortunately, the results of our simulation study highlight that there is no single optimal approach to the aggregation of GWAS from diverse populations across the range of scenarios for heterogeneity in allelic effects we have considered. For example, under a scenario in which allelic effects are homogeneous across ethnic groups, there is a small loss in power for the meta-regression model compared to fixed- and random-effects meta-analysis that is due to the inclusion of axes of genetic variation as covariates that are not predictive of heterogeneity. Our analyses have focussed on three axes that distinguish populations of African, East Asian, European, Native American and South Asian ancestry. Reducing the number of axes of genetic variation included as covariates in the meta-regression model would decrease the loss in power, compared to fixed- or random-effects meta-analysis, under a scenario of homogeneous allelic effects across populations. However, the power of the meta-regression model to detect SNP association would then be decreased when heterogeneity in allelic effects between GWAS is driven by ancestry. One solution to this dilemma is to use both fixed-effects meta-analysis and meta-regression for aggregation of GWAS from diverse populations, although thresholds of significance should be adjusted to account for multiple testing at each SNP.

One of the advantages of the meta-regression approach is that we can assess the contribution of each axis of genetic variation to heterogeneity in allelic effects between GWAS. For example, we observed strong evidence for heterogeneity in allelic effects on T2D susceptibility due to ancestry at the CDKAL1 locus, which was accounted for by one axis of genetic variation. Allelic effect sizes increased along this axis, separating those of South Asian ancestry from other ethnic groups, consistent with previous reports that this locus has greater impact on populations of European and East Asian descent.

A second advantage of the meta-regression approach is that additional covariates can be included to investigate other sources of potential heterogeneity in allelic effects between studies. For example, where sex-specific association summary statistics are available, inclusion of sex as a covariate provides an assessment of differences in allelic effects between males and females, after accounting for ancestry. Inclusion of imputation quality metrics as a covariate enables confirmation that apparent heterogeneity in allelic effects between studies is not a reflection of variable imputation success, which may vary according to ancestry because of the availability of closely matched population haplotypes in the reference panel, for example.

In conclusion, trans-ethnic meta-regression, as implemented in the MR-MEGA software, offers a powerful approach for the discovery and fine-mapping of complex trait loci across GWAS from diverse populations. With the increasing availability of multi-ancestry GWAS of complex human traits, powerful statistical methodology for trans-ethnic meta-analysis, such as that implemented in MR-MEGA, shows great promise for future improvements in our understanding of the genetic basis of common diseases.

Materials and Methods

Consider a series of K GWAS of a complex trait. At each variant, we assume that all GWAS are aligned to the same reference allele. We denote the reference allele frequency of the jth SNP in the kth GWAS by $p_{kj}$.

Fine-mapping

Consider a locus encompassing a pre-specified interval from an index variant. For the jth variant in the locus, we approximate the Bayes' factor in favour of association (33) by

$$\Lambda_j = \exp\left[\frac{X_j - (T+1)\ln K}{2}\right].$$
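As a minimal sketch of how such approximate Bayes' factors translate into posterior probabilities and a 99% credible set of the kind summarised in Table 4: the Bayes' factors are normalised across the variants at the locus, variants are ranked, and ranked variants are accumulated until the cumulative posterior probability reaches 0.99. The function name `credible_set` and the four test statistics are hypothetical; this is an illustration under those assumptions, not the MR-MEGA source code.

```python
import numpy as np

def credible_set(chisq, n_axes, n_studies, coverage=0.99):
    """Return (indices of variants in the credible set, posterior probabilities).

    chisq     : association test statistic X_j for each variant at the locus
    n_axes    : number of axes of genetic variation, T
    n_studies : number of contributing GWAS, K
    """
    chisq = np.asarray(chisq, dtype=float)
    # Log of the approximate Bayes' factor: (X_j - (T + 1) ln K) / 2.
    log_bf = (chisq - (n_axes + 1) * np.log(n_studies)) / 2.0
    # Normalise on the log scale so large statistics do not overflow exp().
    scaled = np.exp(log_bf - log_bf.max())
    posterior = scaled / scaled.sum()
    # Rank by Bayes' factor (equivalently, by posterior probability).
    order = np.argsort(-posterior, kind="stable")
    cumulative = np.cumsum(posterior[order])
    # Smallest number of ranked variants whose cumulative posterior >= coverage.
    n_keep = min(int(np.searchsorted(cumulative, coverage)) + 1, len(order))
    return order[:n_keep], posterior

# Hypothetical statistics for four variants, with T = 3 axes and K = 18 GWAS.
chisq = [40.0, 38.0, 20.0, 5.0]
members, posterior = credible_set(chisq, n_axes=3, n_studies=18)
```

Because every variant at the locus shares the same penalty term, (T + 1) ln K, that term cancels in the posterior probabilities; it matters only when the Bayes' factor itself is interpreted.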
We construct a matrix of pairwise Euclidean distances between GWAS across autosomal variants, denoted $D = [d_{kk'}]$, where

$$d_{kk'} = \frac{\sum_j I_j \left(p_{kj} - p_{k'j}\right)^2}{\sum_j I_j}.$$

In this expression, $I_j$ is a binary indicator variable of the inclusion of the jth variant in the distance calculation. We recommend dividing the genome into 1 Mb bins, and utilising one variant with MAF of at least 5% in all GWAS from each bin to minimise the impact of LD. We then implement multi-dimensional scaling of the distance matrix, $D$, to derive $T$ axes of genetic variation, denoted $x_k$ for the kth GWAS. Note that the choice of the number of axes of genetic variation will depend on the population diversity of GWAS, but is restricted to $T \le K - 2$.

For the jth variant, we denote the estimated effect of the reference allele in the kth GWAS, and the corresponding variance, by $b_{kj}$ and $v_{kj}$, respectively. We then model the reference allele effect across GWAS in a linear regression model, given by

$$E(b_{kj}) = \alpha_j + \sum_{t=1}^{T} \beta_{tj} x_{kt}, \qquad (1)$$

where $\alpha_j$ is the intercept and $\beta_{tj}$ is the effect of the tth axis of genetic variation for the jth variant. The contribution of the kth GWAS is weighted by the estimated inverse variance of the reference allele effect at the jth variant, denoted $v_{kj}^{-1}$. We can interpret the intercept, $\alpha_j$, as the expected allelic effect of the jth variant for a population of ancestry represented by zero on each of the $T$ axes of genetic variation.

We test the null hypothesis of no association of the jth variant across GWAS by comparing the deviance of model (1) with $\alpha_j = \beta_{1j} = \dots = \beta_{Tj} = 0$ to that for which the parameters are unconstrained, with the resulting test statistic, denoted $X_j$, having an approximate chi-squared distribution with $T+1$ degrees of freedom. We can also test for the presence of heterogeneity in allelic effects between GWAS that is correlated with ancestry by comparing the deviance of model (1) with $\beta_{1j} = \dots = \beta_{Tj} = 0$ to that for which the parameters are unconstrained, with the resulting test statistic having an approximate chi-squared distribution with $T$ degrees of freedom. Finally, the deviance of model (1), with all parameters unconstrained, provides a test of residual heterogeneity in allelic effects between GWAS after accounting for ancestry, having an approximate chi-squared distribution with $K-T-1$ degrees of freedom.

We can also assess the contribution of the tth axis of genetic variation to heterogeneity in allelic effects by comparing the deviance of model (1) with $\beta_{tj} = 0$ to that for which the parameters are unconstrained, with the resulting test statistic having an approximate chi-squared distribution with one degree of freedom.

We then calculate the posterior probability that the jth variant is driving the association signal at the locus by

$$\pi_j = \frac{\Lambda_j}{\sum_i \Lambda_i}.$$

In this expression, the summation in the denominator is over all variants across the locus. Finally, we derive a 99% credible set (34) for the association signal by: (i) ranking all variants according to their Bayes' factor, $\Lambda_j$; and (ii) including ranked variants until their cumulative posterior probability of driving the association attains or exceeds 0.99.

Software

We have implemented the methodology in the MR-MEGA (Meta-Regression of Multi-Ethnic Genetic Association) software (http://www.geenivaramu.ee/en/tools/mr-mega). For each study, a flat file of association summary statistics is required, including one row per variant, and columns for the variant name and position in the genome, effect and other alleles, effect allele frequency, allelic effect and standard error, and sample size. For each variant, MR-MEGA aligns studies to the same effect allele, and flips the allele frequency and allelic effect if required. MR-MEGA then performs multi-dimensional scaling of mean genome-wide allele frequency differences between each pair of GWAS. Meta-regression is undertaken in a linear regression framework, as described above, including axes of genetic variation as covariates in the model. MR-MEGA can perform genomic control at the study level, and/or after meta-regression. For each variant, MR-MEGA provides: (i) P-value and Bayes' factor in favour of association, accounting for heterogeneity that is correlated with ancestry; (ii) P-value for heterogeneity that is correlated with ancestry; and (iii) P-value for residual heterogeneity.

Simulation study: false positive error rate and power

For each replicate, the causal variant was selected at random from those reported in the reference panel from Phase 3 of the 1000 Genomes Project (13) with MAF > 1% in all populations. Genotypes in each population were then simulated using the causal variant population-specific odds-ratio (Supplementary Material, Table S1) and 1000 Genomes Project allele frequency, under an assumption of Hardy-Weinberg equilibrium. For each replicate of data, in each population, we tested for association of the causal variant with case-control status in a logistic regression framework under an additive model in the log-odds ratio in PLINK (35), and obtained estimated allelic effect sizes, corresponding standard errors and Z-scores.

We then tested for association of the causal variant with case-control status across populations using the meta-regression model, implemented in MR-MEGA, including three axes of genetic variation as covariates to separate ancestry groups. For comparison, we also tested for association using

prior N(0, 0.2²) for log-odds ratios; and (ii) the Bayes' factor from MANTRA (8) using the matrix of pairwise Euclidean distances between the reference populations to model heterogeneity.
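The deviance comparisons for model (1) can be sketched for a single variant as follows. This is an illustration under stated assumptions, not the MR-MEGA implementation: the reported variances are treated as known, so that the deviance of each model is its inverse-variance weighted residual sum of squares; the axes of genetic variation are taken as already derived; and the function names and the five-study example are hypothetical. Only test statistics and degrees of freedom are returned, since converting them to P-values needs a chi-squared tail function.

```python
import numpy as np

def weighted_deviance(design, effects, weights):
    """Weighted residual sum of squares after a weighted least-squares fit.
    A design matrix with no columns corresponds to all parameters fixed at zero."""
    if design.shape[1] == 0:
        return float(np.sum(weights * effects ** 2))
    root_w = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(design * root_w[:, None], effects * root_w, rcond=None)
    resid = effects - design @ coef
    return float(np.sum(weights * resid ** 2))

def meta_regression_tests(effects, variances, axes):
    """Chi-squared statistics and degrees of freedom for one variant.

    effects, variances : per-study allelic effects b_k and variances v_k
    axes               : (K, T) array of axes of genetic variation x_kt
    """
    effects = np.asarray(effects, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)
    axes = np.asarray(axes, dtype=float)
    n_studies, n_axes = axes.shape
    if n_axes > n_studies - 2:
        raise ValueError("number of axes is restricted to T <= K - 2")
    ones = np.ones((n_studies, 1))
    dev_zero = weighted_deviance(np.empty((n_studies, 0)), effects, weights)
    dev_intercept = weighted_deviance(ones, effects, weights)
    dev_full = weighted_deviance(np.hstack([ones, axes]), effects, weights)
    return {
        "association": (dev_zero - dev_full, n_axes + 1),
        "ancestry_heterogeneity": (dev_intercept - dev_full, n_axes),
        "residual_heterogeneity": (dev_full, n_studies - n_axes - 1),
    }

# Hypothetical example: K = 5 studies on T = 1 axis, effects exactly linear in the axis.
axes = np.array([[-2.0], [-1.0], [0.0], [1.0], [2.0]])
effects = 0.1 + 0.05 * axes[:, 0]
variances = np.full(5, 0.01)
tests = meta_regression_tests(effects, variances, axes)
```

In this toy example all heterogeneity lies along the axis, so the ancestry-correlated statistic is positive while the residual statistic is essentially zero, which is the partitioning that the pHET-ANC and pHET-RES columns are meant to report.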
fixed-effects (inverse-variance weighted log-odds ratios) and random-effects (RE2) meta-analysis implemented in METASOFT (7). False positive error rates were assessed at a nominal significance threshold (P < 0.05), whilst power was evaluated at genome-wide significance (P < 5 × 10⁻⁸). Next, we tested for heterogeneity, at nominal significance (P < 0.05), in allelic effects between populations that is correlated with ancestry using the meta-regression model implemented in MR-MEGA. Finally, we tested for residual heterogeneity, at nominal significance (P < 0.05), in allelic effects between populations: (i) from the meta-regression model implemented in MR-MEGA after accounting for ancestry; and (ii) using Cochran's Q statistic from the fixed-effects meta-analysis implemented in METASOFT (7).

Simulation study: fine-mapping loci with a single causal variant

For each replicate, the 2 Mb region was centred on a single causal variant, selected at random from those reported in the reference panel from Phase 3 of the 1000 Genomes Project (13) with MAF > 1% in all reference populations. Genotypes in each population were then simulated, using HAPGEN2 (36), using the causal variant population-specific odds-ratio (Supplementary Material, Table S1) and haplotypes from the 1000 Genomes Project Phase 3 reference panel (13).

We first considered the 'perfect' data setting, where all variants in the region are captured, with no missing genotype data, or errors, for benchmarking purposes. For each replicate of data, we considered the 1 Mb region centred on the causal variant. Within each population, we tested all variants in this region for association with case-control status in a logistic regression framework under an additive model in the log-odds ratio in PLINK (35), and obtained estimated allelic effect sizes, corresponding standard errors and Z-scores.

We then considered the more realistic 'imperfect' data setting. For each replicate of data, genotypes at only 100 randomly selected variants in the 2 Mb region were retained, to represent a typical GWAS array. Within each population, separately, this scaffold of genotypes was imputed up to haplotypes from the 1000 Genomes Project Phase 3 reference panel (13) using IMPUTEv2 (37,38). Imputation was performed in the 1 Mb region centred on the causal variant, with the remaining 500 kb regions up- and down-stream retained as buffers. Within each population, we then tested for association of all variants with case-control status in a logistic regression framework under an additive model in the log-odds ratio in SNPTEST (39), taking account of uncertainty in the imputation process with the genotype dosage ('expected' option), and obtained estimated allelic effect sizes, corresponding standard errors and Z-scores. We performed post-imputation quality control, and excluded variants with IMPUTEv2 info < 0.4 from downstream analyses (40).

For each replicate of data, for both 'perfect' and 'imperfect' data settings, we obtained Bayes' factors in favour of association for each variant from the meta-regression model, implemented in MR-MEGA, including three axes of genetic variation as covariates to separate ancestry groups. For comparison, we also obtained, for each variant: (i) approximate Bayes' factors (41) on the basis of allelic effect estimates, and corresponding standard errors, from fixed-effects and random-effects meta-analysis implemented in METASOFT (7), assuming a Gaussian

These (approximate) Bayes' factors were used to obtain the posterior probability of driving the association for each variant across the locus. Finally, we undertook trans-ethnic meta-analysis across populations using PAINTOR (10), assuming a single causal variant at the locus (option '-enumerate 1'), and approximating LD between variants in each population from haplotypes in the 1000 Genomes Project Phase 3 reference panel (13). Under the assumption of a uniform prior model of causality (no functional enrichment), we used PAINTOR to generate the posterior probability of driving the association for each variant across the locus. In each replicate, we constructed the 99% credible set driving the association signal at the locus for each method by: (i) ranking all variants by their posterior probability; and (ii) including ranked variants until their cumulative posterior probability attains or exceeds 0.99.

Trans-ethnic meta-analysis of GWAS of kidney function

Each GWAS was pre-phased and imputed up to the 1000 Genomes Project Phase 1 reference panel (14) using IMPUTEv2 (37,38) or minimac (37). Variants were retained for analysis in each GWAS if: (i) MAF ≥ 0.5%; and (ii) IMPUTEv2 info ≥ 0.4 or minimac r² ≥ 0.3 (40). Kidney function was assessed by eGFR, calculated from serum creatinine (mg/dL), with adjustment for age, sex and ethnicity by means of the four-variable Modification of Diet in Renal Disease equation (42). Within each study, association of eGFR with each variant was tested in a linear regression framework, under an additive dosage model, and with adjustment for study-specific covariates to account for confounding due to population structure (Supplementary Material, Table S5). Within each study, association summary statistics were corrected for residual population structure by genomic control (43) (Supplementary Material, Table S5).

Association summary statistics for each variant passing quality control in at least 50% of the total sample size were aggregated across studies via fixed-effects meta-analysis, with inverse-variance weighting, implemented in METASOFT (7). Association summary statistics from the meta-analysis were then corrected for a second round of genomic control (43) ($\lambda_{GC} = 1.029$). Heterogeneity in allelic effects between studies at each variant was assessed by means of Cochran's Q-statistic from the fixed-effects meta-analysis implemented in METASOFT (7). We implemented multi-dimensional scaling of the matrix of pairwise Euclidean distances between studies to derive two axes of genetic variation that were sufficient to separate GWAS between ancestry groups (Supplementary Material, Fig. S3). We then applied the meta-regression model, implemented in MR-MEGA, to each variant passing quality control in at least 50% of the total sample size, including the two axes of genetic variation as covariates. Association summary statistics from the meta-analysis were then corrected for a second round of genomic control (43) ($\lambda_{GC} = 1.017$).

Fine-mapping of four T2D susceptibility loci: CDKAL1, CDKN2A-B, IGF2BP2 and KCNQ1

We made use of summary statistics derived by the T2D-GENES Consortium (12) for seven distinct signals of T2D association at the four loci. Briefly, at each locus, the scaffold of genome-wide
