GBE

The Missing Link of Jewish European Ancestry: Contrasting the Rhineland and the Khazarian Hypotheses

Eran Elhaik 1,2,*
1 Department of Mental Health, Johns Hopkins University Bloomberg School of Public Health
2 McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine
*Corresponding author: E-mail: [email protected].
Accepted: December 5, 2012

Abstract

The question of Jewish ancestry has been the subject of controversy for over two centuries and has yet to be resolved. The “Rhineland hypothesis” depicts Eastern European Jews as a “population isolate” that emerged from a small group of German Jews who migrated eastward and expanded rapidly. Alternatively, the “Khazarian hypothesis” suggests that Eastern European Jews descended from the Khazars, an amalgam of Turkic clans that settled the Caucasus in the early centuries CE and converted to Judaism in the 8th century. Mesopotamian and Greco–Roman Jews continuously reinforced the Judaized empire until the 13th century. Following the collapse of their empire, the Judeo–Khazars fled to Eastern Europe. The rise of European Jewry is therefore explained by the contribution of the Judeo–Khazars. Thus far, however, the Khazars’ contribution has been estimated only empirically, as the absence of genome-wide data from Caucasus populations precluded testing the Khazarian hypothesis. Recent sequencing of modern Caucasus populations prompted us to revisit the Khazarian hypothesis and compare it with the Rhineland hypothesis. We applied a wide range of population genetic analyses to compare these two hypotheses. Our findings support the Khazarian hypothesis and portray the European Jewish genome as a mosaic of Near Eastern-Caucasus, European, and Semitic ancestries, thereby consolidating previous contradictory reports of Jewish ancestry. We further describe a major difference among Caucasus populations explained by the early presence of Judeans in the Southern and Central Caucasus. Our results have important implications for the demographic forces that shaped the genetic diversity in the Caucasus and for medical studies.
Key words: Jewish genome, Khazars, Rhineland, Ashkenazi Jews, population isolate, Eastern European Jews, Central European Jews, population structure.

Introduction

Contemporary Eastern European Jews comprise the largest ethno-religious aggregate of modern Jewish communities, accounting for approximately 90% of over 13 million Jews worldwide (Ostrer 2001). Speculated to have emerged from a small Central European founder group and thought to have maintained high endogamy, Eastern European Jews are considered a “population isolate” and invaluable subjects in disease studies (Carmeli 2004), although their ancestry remains debated among geneticists, historians, and linguists (Wexler 1993; Brook 2006; Sand 2009; Behar et al. 2010).

Recently, several large-scale studies have attempted to chart the genetic diversity of Jewish populations by genotyping Eurasian Jewish and non-Jewish populations (Conrad et al. 2006; Kopelman et al. 2009; Behar et al. 2010). Interestingly, some of these studies linked Caucasus populations with Eastern European Jews, at odds with the narrative of a Central European founder group. Because correcting for population structure and using suitable controls are critical in medical studies, it is vital to examine the hypotheses purporting to explain the ancestry of Eastern and Central European Jews. One of the major challenges for any hypothesis is to explain the massive presence of Jews in Eastern Europe, estimated at eight million people at the beginning of the 20th century. We investigate the genetic structure of European Jews by applying a wide range of analyses, including the three population test, principal component, biogeographical origin, admixture, identity by descent (IBD), allele sharing distance, and uniparental analyses, and test their veracity in light of the two dominant hypotheses depicting either a sole Middle Eastern ancestry or a mixed Middle Eastern–Caucasus–European ancestry to explain the ancestry of Eastern European Jews.
© The Author(s) 2012. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Genome Biol. Evol. 5(1):61–74. doi:10.1093/gbe/evs119 Advance Access publication December 14, 2012

The “Rhineland hypothesis” envisions modern European Jews to be the descendants of the Judeans, an assortment of Israelite–Canaanite tribes of Semitic origin (figs. 1 and 2) (supplementary note S1, Supplementary Material online). It proposes two mass migratory waves. The first occurred over the 200 years following the Muslim conquest of Palestine (638 CE) and consisted of devoted Judeans who left Muslim Palestine for Europe (Dinur 1961). Whether these migrants joined the existing Judaized Greco–Roman communities is unclear, as is the extent of their contribution to the Southern European gene pool. The second wave, at the beginning of the 15th century, consisted of a group of 50,000 German Jews who migrated eastward and ushered in an apparent hyper-baby-boom era lasting half a millennium (Atzmon et al. 2010). The Rhineland hypothesis predicts a Middle Eastern ancestry for European Jews and high genetic similarity among European Jews (Ostrer 2001; Atzmon et al. 2010; Behar et al. 2010).

The competing “Khazarian hypothesis” considers Eastern European Jews to be the descendants of Khazars (supplementary note S1, Supplementary Material online). The Khazars were a confederation of Slavic, Scythian, Hunnic–Bulgar, Iranian, Alan, and Turkish tribes who formed in the central–northern Caucasus one of the most powerful empires during the late Iron Age and converted to Judaism in the 8th century CE (figs. 1 and 2) (Polak 1951; Brook 2006; Sand 2009). The Khazarian, Armenian, and Georgian populations forged from this amalgamation of tribes (Polak 1951) subsequently underwent relative isolation, differentiation, and genetic drift in situ (Balanovsky et al. 2011). Biblical and archeological records allude to active trade relationships between Proto-Judeans and Armenians in the late centuries BCE (Polak 1951; Finkelstein and Silberman 2002), which likely resulted in a small-scale admixture between these populations and a Judean presence in the Caucasus. After their conversion to Judaism, the population structure of the Judeo–Khazars was further reshaped by multiple migrations of Jews from the Byzantine Empire and Caliphate to the Khazarian Empire (fig. 1). Following the collapse of their empire and the Black Death (1347–1348), the Judeo–Khazars fled westward (Baron 1993), settling in the rising Polish Kingdom and Hungary (Polak 1951) and eventually spreading to Central and Western Europe. The Khazarian hypothesis posits that European Jews comprise Caucasus, European, and Middle Eastern ancestries. Moreover, European Jewish communities are expected to differ from one another in both ancestry and genetic heterogeneity. The Khazarian hypothesis also offers two explanations for the genetic diversity in Caucasus groups: first, the multiple migration waves to Khazaria during the 6th–10th centuries and, second, the Judeo–Khazars who remained in the Caucasus.

FIG. 1. Map of Eurasia. A map of Khazaria and Judah is shown with the state of origin of the studied groups. Eurasian Jewish and non-Jewish populations used in all analyses are shown in square and round bullets, respectively (supplementary table S3, Supplementary Material online). The major migrations that formed Eastern European Jewry according to the Khazarian and Rhineland hypotheses are shown in yellow and brown, respectively.

FIG. 2. An illustrated timeline for the relevant historical events. The horizontal dashed lines represent controversial historical events explained by the different hypotheses, whereas solid black lines represent undisputed historical events.

Genetic studies attempting to infer the ancestry of European Jews yielded inconsistent results. Some studies pointed to the genetic similarity between European Jews and Caucasus populations like Adygei (Behar et al. 2003; Levy-Coffman 2005; Kopelman et al. 2009), whereas some pointed to the similarity to Middle Eastern populations such as Palestinians (Hammer et al. 2000; Nebel et al. 2000), and others pointed to the similarity to Southern European populations like Italians (Atzmon et al. 2010; Zoossmann-Diskin 2010). Most of these studies were done in the pregenome-wide era using uniparental markers and including different reference populations, which makes it difficult to compare their results. More recent studies employing whole genome data reported high genetic similarity of European Jews to Druze, Italian, and Middle Eastern populations (Atzmon et al. 2010; Behar et al. 2010).

Although both the Rhineland and Khazarian hypotheses depict a Judean ancestry and are not mutually exclusive, they are well distinguished, as Caucasus and Semitic populations are considered ethnically and linguistically distinct (Patai and Patai 1975; Wexler 1993; Balanovsky et al. 2011). Jews, according to either hypothesis, are an assortment of tribes who accepted Judaism, migrated elsewhere, and maintained their religion up to this date and are, therefore, expected to exhibit certain differences from their neighboring populations. Because both hypotheses posit that Eastern European Jews arrived in Eastern Europe at roughly the same time (13th and 15th centuries), we assumed that they experienced similar low and fixed admixture rates with the neighboring populations, estimated at 0.5% per generation over the past 50 generations (Ostrer 2001). These relatively recent admixtures have likely reshaped the population structure of all European Jews and increased the genetic distances from the Caucasus or Middle Eastern populations. Therefore, we do not expect to achieve perfect matching with the surrogate Khazarian and Judean populations but rather to estimate their relatedness.

Materials and Methods

Data Collection

The complete data set contained 1,287 unrelated individuals of 8 Jewish and 74 non-Jewish populations genotyped over 531,315 autosomal single nucleotide polymorphisms (SNPs). A linkage disequilibrium (LD)-pruned data set was created by removing one member of any pair of SNPs in strong LD (r² > 0.4) in windows of 200 SNPs (sliding the window by 25 SNPs at a time) using indep-pairwise in PLINK (Purcell et al. 2007). This yielded a total of 221,558 autosomal SNPs that were chosen for all autosomal analyses except the identical by descent (IBD) analysis, which utilized the complete data set. Both data sets were obtained from http://www.evolutsioon.ut.ee/MAIT/jew_data/ (last accessed December 19, 2012) (Behar et al. 2010). Mitochondrial DNA (mtDNA) and Y-chromosomal data were obtained from previously published data sets as they appeared in Behar et al. (2010). These markers were chosen to match the phylogenetic level of resolution achieved in previously reported data sets and represent a diversified set of markers. A total of 11,392 samples were assembled for mtDNA (6,089) and Y-chromosomal (5,303) analyses from 27 populations (supplementary tables S1 and S2, Supplementary Material online).

Terminology

In common parlance, Eastern and Central European Jews are practically synonymous with Ashkenazi Jews and are considered a single entity (Tian et al. 2008; Atzmon et al. 2010; Behar et al. 2010). However, the term is misleading, for the Hebrew word “Ashkenaz” was applied to Germany in medieval rabbinical literature, contributing to the narrative that modern Eastern European Jewry originated on the Rhine. We thus refrained from using the term “Ashkenazi Jews.” Jews were roughly subdivided into Eastern (Belorussia, Latvia, Poland, and Romania) and Central (Germany, Netherlands, and Austria) European Jews. In congruence with the literature that considers “Ashkenazi Jews” distinct from “Sephardic Jews,” we excluded the latter. Complete population notation is described in supplementary table S3, Supplementary Material online.

Choice of Surrogate Populations

As the ancient Judeans and Khazars have been vanquished and their remains have yet to be sequenced, contemporary Middle Eastern and Caucasus populations were used as surrogates, in accordance with previous studies (Levy-Coffman 2005; Kopelman et al. 2009; Atzmon et al. 2010; Behar et al. 2010). Palestinians were considered proto-Judeans because they are assumed to share a similar linguistic, ethnic, and geographic background with the Judeans and were shown to share common ancestry with European Jews (Bonné-Tamir and Adam 1992; Nebel et al. 2000; Atzmon et al. 2010; Behar et al. 2010). Similarly, Caucasus Georgians and Armenians were considered proto-Khazars because they are believed to have emerged from the same genetic cohort as the Khazars (Polak 1951; Dvornik 1962; Brook 2006).

The Three Population Test

The f3 statistic uses allele frequency differences to assess the presence of admixture in a population X from two other populations A and B, and is denoted f3(X; A, B) (Reich et al. 2009). If X is a mixture of A and B, rather than the result of genetic drift, f3 would be negative. A significant negative f3 indicates that the ancestors of group X experienced a history of admixture subsequent to their divergence from A and B. The f3 statistics were calculated with the threepop program of TreeMix (Pickrell and Pritchard 2012) with k = 500 over the set of 221,558 SNPs. This test differs from ADMIXTURE (Alexander et al. 2009), which reports the proportions of admixture with the most likely ancestor.

Principal Component Analysis

Although the commonly used “multipopulation” principal component analysis (PCA) has many attractive properties, it should be practiced with caution to avoid biases due to the choice of populations and varying sample sizes (Price et al. 2006; McVean 2009). To circumvent these biases, we developed a simple “dual population” framework consisting of two populations of interest and three “outgroup” populations that are available in large sample sizes and are the least admixed: Mbuti and Biaka Pygmies (Central Africa), French Basques (Europe), and Han Chinese (East Asia). All five populations were of equal sample sizes. The cornerstone of this framework is that it minimizes the number of significant PCs to four or fewer (Tracy-Widom test, P < 0.01) and maximizes the portion of explained variance to over 20% for the first two PCs. PCA calculations were carried out using smartpca of the EIGENSOFT package (Patterson et al. 2006). Convex hulls were calculated using the Matlab “convhull” function and plotted around the cluster centroids. Relatedness between two populations of interest was estimated by the commensurate overlap of their clusters. Small populations (<7 samples) were excluded from the analysis.

Estimating the Biogeographical Origins of Population

Novembre et al. (2008) proposed a PCA-based approach, accurate to a few hundred kilometers within Europe, to identify the current biogeographical origin of a population. Although this approach has no implied historical model, it correlates genetic diversity with geography and can thus be a useful tool to study biogeography. To decrease the bias caused by multiple populations of uneven sizes (Patterson et al. 2006; McVean 2009), we adopted the dual-population framework with three outgroup populations and two populations of interest: a population of known geographical origin during the relevant time period shown to cluster with the population in question (e.g., Armenians) and the population in question (e.g., Eastern European Jews). The first four populations were used as a training set for the population in question. PCA calculations were carried out as described earlier. The rotation angle of PC1–PC2 coordinates was calculated as described by Novembre et al. (2008). Briefly, in each figure the PC axes were rotated to find the angle that maximizes the summed correlation of the median PC1 and PC2 values of the training populations with the latitude and longitude of their countries. Latitudinal and longitudinal data were obtained from the literature or by the country’s approximate centroid. Geodesic distances were calculated in kilometers using the Matlab function “distance.”

Admixture Analysis

A structure-like approach was applied in a supervised learning mode as implemented in ADMIXTURE (Alexander et al. 2009). ADMIXTURE provides an estimation of the individual’s ancestries from the allele frequencies of the designated ancestral populations. ADMIXTURE’s bootstrapping procedure with default parameters was used to calculate the standard errors. We observed low (<0.05) standard errors in all our analyses. With the exception of Southern Europeans, populations were sorted by their mean African and Asian ancestries. In this analysis, the three Netherland Jews were grouped with Eastern European Jews.

IBD Analysis

To detect IBD segments, we ran fastIBD 10 times using different random seeds and combined the results as described by Browning and Browning (2011). Segments were considered to be IBD only if the fastIBD score of the combined analysis was less than e–10. This low threshold corresponds to long shared haplotypes (on the order of 1 cM) that are likely to be IBD. Short gaps (<50 indexes) separating long domains were assumed to be false-negative and concatenated (Browning and Browning 2011). Pairwise-IBD segments between European Jews and different populations were obtained by finding the maximum total IBD sharing between each European Jew and all other individuals of a particular population.

Allele Sharing Distances

Allele sharing distance (ASD) was used for measuring genetic distances between populations, as it is less sensitive to small sample sizes than other methods. Pairwise ASD was calculated using PLINK (Purcell et al. 2007), and the average ASD between populations I and J was computed as

$$W_{IJ} = \frac{1}{nm} \sum_{i \in I} \sum_{j \in J} W_{ij}, \tag{1}$$

where $W_{ij}$ is the distance between individuals i and j from populations I and J of sizes n and m, respectively. To verify that these ASD differences are significant, a bootstrap approach was used with the null hypothesis H0: ASD(p1, p2) = ASD(p1, p3), where the ASD between populations p1 and p2 is compared with the ASD between populations p1 and p3 (supplementary note S2, Supplementary Material online). To compare continental Jewish communities, individuals were grouped by their continent and the comparison was carried out as described.

Uniparental Analysis

To infer the migration patterns of European Jews, we integrated haplogroup data from over 11,300 uniparental chromosomes with geographical data. The haplogroup frequencies were compared between populations to obtain a measure of distance between populations. Pairwise genetic distances between population haplogroups (supplementary tables S1 and S2, Supplementary Material online) were estimated by applying the Kronecker function as implemented in Arlequin version 3.1 (Excoffier et al. 2005). In brief, similarity between populations x and y was defined as the fraction of I haplogroups that the two populations shared, as measured by the Kronecker function $\delta_{xy}(i)$:

$$d_{xy} = \sum_{i=1}^{I} \delta_{xy}(i), \tag{2}$$

where $\delta_{xy}(i)$ equals 1 if the haplogroup frequency of the ith haplogroup is nonzero for both populations and equals 0 otherwise. In other words, populations sharing the same exact haplogroups or their mutual absence are considered more genetically similar than populations with different haplogroups. For brevity, we considered only haplogroups with frequencies higher than 0.5%. This measure has several desirable properties that make it an excellent measure for estimating genetic distance between populations, such as a simple interpretation in terms of homogeneity and applicability to both mtDNA and Y-chromosomal data.

Results

To confirm that the Rhineland and Khazarian hypotheses indeed portray distinct ancestries, we assessed the degree of background admixture between Caucasus and Semitic populations. We calculated the f3 statistics between Palestinians and six Caucasus and Eurasian populations using African San as an outgroup, for example, f3(Palestinians, San, Armenians). The f3 results for Turks (−0.0013), Armenians and Georgians (−0.0019), Lezgins and Adygei (−0.0015), and Russians (−0.0011) indicated a minor but significant admixture (−26 < Z-score < −13) between Palestinians and the populations tested. Because Armenians and Georgians diverged from Turks 600 generations ago (Schonberg et al. 2011), we can assume that the lion’s share of their admixture derived from that ancestry and falls within the expected levels of background admixture typical of the region, rather than reflecting recent admixture with Semitic populations. Therefore, similarities between European Jews and Caucasus populations are unlikely to be due to a shared Semitic ancestry.

PCA was next used to identify independent dimensions that capture most of the information in the data. PCA was applied using two frameworks: the “multipopulation” framework, carried out for all populations (fig. 3) and separately for Eurasian populations along with Pygmies and Han Chinese (supplementary fig. S2, Supplementary Material online), and our novel “dual-population” framework (supplementary fig. S3, Supplementary Material online). In all analyses, the studied samples aligned along the two well-established geographic axes of global genetic variation: PC1 (sub-Saharan Africa vs. the rest of the Old World) and PC2 (east vs. west Eurasia) (Li et al. 2008). Our results reveal geographically refined groupings, such as the nearly symmetrical continuous European rim extending from Western to Eastern Europeans, the parallel
Caucasus rim, and the Near Eastern populations (supplementary fig. S1, Supplementary Material online) organized in Turk–Iranian and Druze clusters (fig. 3). Middle Eastern populations form a gradient along the diagonal line between Bedouins and Near Eastern populations that resembles their geographical distribution. The remaining Egyptians and the bulk of Saudis distribute separately from Middle Eastern populations.

FIG. 3. Scatter plot of all populations along the first two principal components. For brevity, we show only the populations relevant to this study. The inset magnifies Eurasian and Middle Eastern individuals. Each letter code corresponds to one individual (supplementary table S3, Supplementary Material online). A polygon surrounding all of the individual samples belonging to a group designation highlights several population groups.

European Jews are expected to cluster with native Middle Eastern or Caucasus populations according to the Rhineland or Khazarian hypotheses, respectively. The results of all PC analyses (fig. 3, supplementary figs. S2 and S3, Supplementary Material online) show that over 70% of European Jews and almost all Eastern European Jews cluster with Georgian, Armenian, and Azerbaijani Jews within the Caucasus rim (fig. 3 and supplementary fig. S3, Supplementary Material online). Approximately 15% of Central European Jews cluster with Druze and the rest cluster with Cypriots. All European Jews cluster distinctly from the Middle Eastern cluster. Strong evidence for the Khazarian hypothesis is the clustering of European Jews with the populations that reside on opposite ends of ancient Khazaria: Armenians, Georgians, and Azerbaijani Jews (fig. 1). Because Caucasus populations remained relatively isolated in the Caucasus region and because there are no records of Caucasus populations mass-migrating to Eastern and Central Europe prior to the fall of Khazaria (Balanovsky et al. 2011), these findings imply a shared origin for European Jews and Caucasus populations.

To assess the ability of our PCA-based approach to identify the biogeographical origins of a population, we first sought to identify the biogeographical origin of Druze. The Druze religion originated in the 11th century, but the people’s origins remain a source of much confusion and debate (Hitti 1928). We traced the Druze biogeographical origin to the geographical coordinates 38.6±3.45° N, 36.25±1.41° E (supplementary fig. S4, Supplementary Material online) in the Near East (supplementary fig. S1, Supplementary Material online). Half of the Druze clustered tightly in Southeast Turkey, and the remainder were scattered along northern Syria and Iraq. These results are in agreement with the findings of Shlush et al. (2008) using mtDNA analysis. The inferred geographical positions of Druze were used in the subsequent analyses.

The geographical origins of European Jews varied for different reference populations (fig. 4 and supplementary fig. S5, Supplementary Material online), but all the results converged to Southern Khazaria along modern Turkey, Armenia, Georgia, and Azerbaijan. Eastern European Jews clustered tightly compared with Central European Jews in all analyses. The smallest deviations in the geographical coordinates were obtained with Armenians for both Eastern (38±2.7° N, 39.9±0.4° E) and Central (35±5° N, 39.7±1.1° E) European Jews (fig. 4). Similar results were obtained for Georgians (supplementary fig. S5, Supplementary Material online). Remarkably, the mean coordinates of Eastern European Jews are 560 km from Khazaria’s southern border (42.77° N, 42.56° E) near Samandar, the capital city of Khazaria from 720 to 750 CE (Polak 1951).

FIG. 4. Biogeographical origin of European Jews. First two principal components were calculated for Pygmies, French Basques, Han Chinese (black), Armenians (blue), and Eastern or Central European Jews (red), all of equal size. PCA was calculated separately for Eastern and Central European Jews and the results were merged. Using the first four populations as a training set, Eastern (squares) and Central (circles) European Jews were assigned to geographical locations by fitting independent linear models for latitude and longitude as predicted by PC1 and PC2. Each shape represents an individual. Major cities are marked in cyan.

The duration, direction, and rate of gene flow between populations determine the proportion of admixture and the total length of chromosomal segments that are identical by descent. Admixture calculations were carried out using a supervised learning approach in a structure-like analysis. This approach has many advantages over the unsupervised approach, which not only traces ancestry to K abstract unmixed populations under the assumption that they evolved independently (Chakravarti 2009; Weiss and Long 2009) but is also problematic when applied to the study of Jewish ancestry, which can be dated only as far back as 3,000 years (fig. 2). Moreover, the results of the unsupervised approach vary based on the particular populations used for the analysis and the choice of K, rendering the results incomparable between studies. Admixture was calculated with a reference set of seven populations representing largely genetically distinct regions: Pygmies (Central Africa), Palestinians (Middle East), Armenians (Caucasus), Turk–Iranians (Near East), French Basque (West Europe), Chuvash (East Europe), and Han Chinese (East Asia) (fig. 5). The ancestral components grouped all populations by their geographical regions, with European Jews clustering with Caucasus populations. As expected, Eastern and Western European ancestries exhibit opposite gradients among European populations. The Near Eastern–Caucasus ancestries are dominant among Central (38%) and Eastern (32%) European Jews, followed by Western European ancestry (30%). Among non-Caucasus populations, the Caucasus ancestry is the largest among European Jews (26%) and Cypriots (31%). These populations also exhibit the largest fraction of Middle Eastern ancestry among non–Middle Eastern populations. As both Caucasus and Middle Eastern ancestries are absent in Eastern European populations, our findings suggest that Eastern European Jews acquired these ancestries prior to their arrival in Eastern Europe. Although the Rhineland hypothesis explains the Middle Eastern ancestry by stating that Jews migrated from Palestine to Europe in the 7th century, it fails to explain the large Caucasus ancestry, which is nearly endemic to Caucasus populations.

Although they cluster with Caucasus populations (fig. 5), Eastern and Central European Jews share a large fraction of Western European and Middle Eastern ancestries, both absent in Caucasus populations. According to the Khazarian hypothesis, the Western European ancestry was imported to Khazaria by Greco–Roman Jews, whereas the Middle Eastern ancestry alludes to the contribution of both early Israelite Proto-Judeans and Mesopotamian Jews (Polak 1951; Koestler 1976; Sand 2009). Central and Eastern European Jews differ mostly in their Middle Eastern (30% and 25%, respectively) and Eastern European ancestries (3% and 12%, respectively), probably due to late admixture.

Druze exhibit a large Turk–Iranian ancestry (83%) in accordance with their Near Eastern origin (supplementary fig. S4, Supplementary Material online). Druze and Cypriot
appear similar to European Jews in their Middle Eastern and Western European ancestries, though they differ largely in the proportion of Caucasus ancestry. These results can explain the genetic similarity between European Jews, Southern Europeans, and Druze reported in studies that excluded Caucasus populations (Price et al. 2008; Atzmon et al. 2010; Zoossmann-Diskin 2010). Overall, our results portray the European Jewish genome as a mosaic of Near Eastern-Caucasus, Western European, Middle Eastern, and Eastern European ancestries in decreasing proportions.

FIG. 5. Admixture analysis of European, Caucasus, Near Eastern, and Middle Eastern populations. The x axis represents individuals from populations sorted according to their ancestries and arrayed geographically roughly from North to South. Each individual is represented by a vertical stacked column (100%) of color-coded admixture proportions of the ancestral populations.

To glean further details of the genomic regions contributing to the genetic similarity between European Jews and the prospective populations, we compared their total genomic regions shared by IBD. If European Jews emerged from Caucasus populations, the two would share longer IBD regions than with Middle Eastern populations. The IBD analysis exhibits a skewed bimodal distribution embodying a major Caucasus ancestry with a minor Middle Eastern ancestry (fig. 6), consistent with the admixture results (fig. 5). The total IBD regions shared between European Jews and Caucasus populations (9.5 cM on average) are significantly larger than regions shared with Palestinians (5.5 cM) (Kolmogorov–Smirnov goodness-of-fit test, P < 0.001). To the best of our knowledge, these are the largest IBD regions ever reported between European Jews and non-Jewish populations. The decrease in total IBD between European Jews and other populations, combined with the increase in distance from the Caucasus, supports the Khazarian hypothesis.

FIG. 6. Proportion of total IBD sharing between European Jews and different populations. Populations are sorted by decreasing distance from the Caucasus. The maximal IBD between each European Jew and an individual from each population is summarized in box plots. Lines pass through the mean values.

We next estimated the level of endogamy among Eurasian Jewish communities and compared their genetic distances with non-Jewish neighbors, Caucasus, and Middle Eastern populations. Our results expand the previous report of high endogamy in Jewish populations (Behar et al. 2010) and narrow the endogamy to regional Jewish communities (table 1, left panel). Jews are significantly more similar to members of their own community than to other Jewish populations (P < 0.01, bootstrap t test), with the conspicuous exception of Bulgarian, Turkish, and Georgian Jews. These results stress the high heterogeneity among Jewish communities across Eurasia and even within communities, as in the case of the Balkan and Caucasus Jews.

When compared with non-Jewish populations, all Jewish communities were significantly (P < 0.01, bootstrap t test) distant from Middle Eastern populations and, with the exception of Central European Jews, significantly closer to Caucasus populations (table 1, right panel). Similar findings were reported by Behar et al. (2010), although they were dismissed as “a bias inherent in our calculations.” However, we found no such bias. The close genetic distance between Central European Jews and Southern European populations can be attributed to a late admixture. The results are consistent with our previous findings in support of the Khazarian hypothesis. As the only commonality among all Jewish communities is their dissimilarity from Middle Eastern populations (table 1, right panel), grouping different Jewish communities without correcting for their country of origin, as is commonly done, would increase their genetic heterogeneity.

Finally, we carried out uniparental analyses on mtDNA and Y-chromosome, comparing the haplogroup frequencies between European Jews and other populations. The Rhineland hypothesis depicts Middle Eastern origins for both the paternal and maternal ancestries of European Jews, whereas the Khazarian hypothesis depicts a Caucasus ancestry along with Southern European and Near Eastern contributions of migrants from Byzantium and the Caliphate, respectively. Because Judaism was maternally inherited only since the 3rd century CE (Patai and Patai 1975), the mtDNA is expected to show a stronger local female-biased founder effect compared with the Y-chromosome. Haplogroup similarities between European Jews and other populations were plotted as heat maps on the background of their geographical locations (fig. 7). The pairwise distances between all studied populations are shown in supplementary fig. S6, Supplementary Material online.

Our results shed light on sex-specific processes that, although not evident from the autosomal data, are analogous to those obtained from the biparental analyses. Both mtDNA and Y-chromosomal analyses yield high similarities between European Jews and Caucasus populations rooted in the Caucasus (fig. 7), in support of the Khazarian hypothesis. Interestingly, the maternal analysis depicts a specific Caucasus founding lineage with a weak Southern European ancestry (fig. 7A), whereas the paternal ancestry reveals a dual Caucasus–Southern European origin (fig. 7B). As expected, the maternal ancestry exhibits a higher relatedness scale with narrow dispersal compared with the paternal ancestry.

Dissecting uniparental haplogroups allows us to delve further into European Jews’ migration routes. As the results neither specify whether the Southern Europe–Caucasus migration was ancient or recent nor indicate the migration’s direction, that is, from Southern Europe to the Caucasus or the opposite, there are four possible scenarios. Of these, the only historically supportable scenarios are ancient migrations from Southern Europe toward Khazaria (6th–13th centuries) and more recent migrations from the Caucasus to Central and Southern Europe (13th–15th centuries) (Polak 1951; Patai and Patai 1975; Straten 2003; Brook 2006; Sand 2009). A westward migration from the diminished Khazaria toward Central and Southern Europe would have exhibited a gradient from the Caucasus toward Europe for both matrilineal and patrilineal lines. Such a gradient was not observed. By contrast, Judaized Greco–Roman male-driven migration directly to Khazaria is consistent with historical demographic migrations and could have created the observed pattern. Moreover, we found little genetic similarity between European Jews and populations

Table 1
Genetic Distances (ASD) between Regional and Continental Jewish Communities (Left Panel) and between Regional Jewish Communities and Their Non-Jewish Neighboring Populations, Caucasus, and Middle Eastern Populations (Right Panel)

Regional Jewish    Jewish Populations                    Non-Jewish Populations
Community          Self    European  Asian   African     Neighboring Population  Caucasus  Middle Eastern
Eastern European   0.2318  0.2328    0.2381  0.2446      Hungarian  0.2346       0.2340    0.2387
Central European   0.2312  0.2326    0.2378  0.2445      Italians   0.2335       0.2338    0.2385
Bulgarian          0.2326  0.2331    0.2376  0.2439      Romanian   0.2347       0.2337    0.2380
Turkish            0.2336  0.2336    0.2376  0.2439      Turkish    0.2353       0.2337    0.2379
Iraqi              0.2303  0.2351    0.2375  0.2447      Iranian    0.2363       0.2338    0.2381
Georgian           0.2304  0.2345    0.2372  0.2442      Georgian   0.2332       0.2332    0.2378
Azerbaijani        0.2304  0.2365    0.2386  0.2465      Lezgins    0.2367       0.2352    0.2398
Iranian            0.2310  0.2364    0.2391  0.2434      Iranian    0.2414       0.2361    0.2383
NOTE.—Underlinedentriesaresignificantlysmallerthroughouteachpanel.Thegeographicallynearestnon-Jewishpopulationswereconsideredneighboringpopulations. ad ThedistancesinthelasttwocolumnsarebetweenaJewishcommunityandoneCaucasus(ArmeniansorGeorgians)orMiddleEastern(Palestinians,Bedouins,orJordanians) ed populationthatexhibitedthelowestmeanASD. fro m h eastwardtotheCaspianSeaandsouthwardtotheBlackSea, hypotheses.Weshowedthatthehypothesesarealsogenet- ttp s delineatingthegeographicalboundariesofKhazaria(table1 ically distinct and that the miniscule Semitic ancestry in ://a andfig.1). Caucasus populations cannot account for the similarity be- c a d tweenEuropeanJewsandCaucasuspopulations.Therecent e m Discussion availability of genomic data from Caucasus populations ic.o allowed testing the Khazarian hypothesis for the first time u p Eastern and Central European Jews comprise the largest andpromptedustocontrastitwiththeRhinelandhypothesis. .c o group of contemporary Jews, accounting for approximately To evaluate the two hypotheses, we carried out a series of m/g 90% of over 13 million worldwide Jews. Eastern European comparativeanalysesbetweenEuropeanJewsandsurrogate be JWewarsImI. aDdeespuitpeothveerir9c0o%ntroovferEsuiarolpaenacnestJreyw,sEubreofpoereanWJeowrlds Keahcahzatirmiaen:aanredEJuasdteearnnapnodpuClaetniotrnaslEpuorsoinpgeatnheJeswamsgeeqnueetsictaiollny /article are an attractive group for genetic and medical studies closer to Khazarian or Judean populations? Under the -ab s due to their presumed genetic history (Ostrer 2001). Rhineland hypothesis, European Jews are also expected to tra c Correcting for population structure and using suitable con- exhibithighendogamy,particularlyacrosstheirEurasiancom- t/5 trols are critical in medical studies, thus it is vital to deter- munities,andbemoresimilartoMiddleEasternpopulations /1/6 mine whether European Jews are of Semitic, Caucasus, or compared with their neighboring non-Jewish populations, 1/7 other ancestry. 
whereastheKhazarianhypothesispredictstheoppositescen- 28 1 ThoughJudaismwasbornencasedintheological–historical ario. We emphasize that these hypotheses are not exclusive 17 myth,noJewishhistoriographywasproducedfromthetime andthatsomeEuropeanJewsmayhaveotherancestries. by g ofJosephusFlavius(1stcenturyCE)tothe19thcentury(Sand OurPC,biogeographicalestimation,admixture,IBD,ASD, u e 2009). Early historians bridged the historical gap simply by and uniparental analyses were consistent in depicting a st o linking modern Jews directly to the ancient Judeans (fig. 2), CaucasusancestryforEuropeanJews.Ourfirstanalysesre- n 1 aparadigmthatwaslaterembeddedinmedicalscienceand vealed tight genetic relationship of European Jews and 6 A c(Kryosetastllliezred19a7s6a;nSatrraratetinve2.0M0a7n),ymhaavinelychbayllsehnogwedintghitshnaatrarastoivlee CoraiguicnasoufsEpuorpoupleaatnionJeswasndtoptinhpeosinotuetdhtohfeKbhioagzaeroiagr(afipghsi.ca3l pril 20 1 Judean ancestry cannot account for the vast population of and 4). Our later analyses yielded a complex ancestry with 9 EasternEuropeanJewsinthebeginningofthe20thcentury a slightly dominant Near Eastern–Caucasus ancestry, large without the major contribution of Judaized Khazars and by Southern European and Middle Eastern ancestries, and a demonstrating that it is in conflict with anthropological, his- minor Eastern European contribution; the latter two differ- torical, and genetic evidence (Patai and Patai 1975; Baron entiated Central and Eastern European Jews (figs. 4 and 5 1993;Sand2009). andtable1).AlthoughtheMiddleEasternancestryfadedin With uniparental and whole genome analyses providing the ASD and uniparental analyses, the Southern European ambiguous answers (Levy-Coffman 2005; Atzmon et al. ancestry was upheld, probably attesting to its later time 2010; Behar et al. 2010), the question of European Jewish period(table1andfig.7). 
ancestry remained debated mainly between the supporters WeshowthattheKhazarianhypothesisoffersacompre- of the Rhineland and Khazarian hypotheses. Although both hensive explanation for the results, including the reported theoriesoversimplifycomplexhistoricalprocessestheyareat- Southern European (Atzmon et al. 2010; Zoossmann-Diskin tractive due to their distinct predictions and testable 2010)andMiddleEasternancestries(Nebeletal.2000;Behar 70 GenomeBiol.Evol.5(1):61–74. doi:10.1093/gbe/evs119 AdvanceAccesspublicationDecember14,2012