COMMENTARY

Inferring Genetic Ancestry: Opportunities, Challenges, and Implications

Charmaine D. Royal,1,* John Novembre,2 Stephanie M. Fullerton,3 David B. Goldstein,1 Jeffrey C. Long,4 Michael J. Bamshad,5 and Andrew G. Clark6

1 Institute for Genome Sciences & Policy, Duke University, Durham, NC 27708, USA
2 Department of Ecology and Evolutionary Biology, University of California Los Angeles, Los Angeles, CA 90095, USA
3 Department of Bioethics & Humanities, University of Washington School of Medicine, Seattle, WA 98195, USA
4 Department of Anthropology, University of New Mexico, Albuquerque, NM 87131, USA
5 Department of Pediatrics, University of Washington School of Medicine, Seattle, WA 98195, USA
6 Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
*Correspondence: [email protected]
DOI 10.1016/j.ajhg.2010.03.011. © 2010 by The American Society of Human Genetics. All rights reserved.
The American Journal of Human Genetics 86, 661–673, May 14, 2010

Increasing public interest in direct-to-consumer (DTC) genetic ancestry testing has been accompanied by growing concern about issues ranging from the personal and societal implications of the testing to the scientific validity of ancestry inference. The very concept of "ancestry" is subject to misunderstanding in both the general and scientific communities. What do we mean by ancestry? How exactly is ancestry measured? How far back can such ancestry be defined and by which genetic tools? How do we validate inferences about ancestry in genetic research? What are the data that demonstrate our ability to do this correctly? What can we say and what can we not say from our research findings and the test results that we generate? This white paper from the American Society of Human Genetics (ASHG) Ancestry and Ancestry Testing Task Force builds upon the 2008 ASHG Ancestry Testing Summary Statement in providing a more in-depth analysis of key scientific and non-scientific aspects of genetic ancestry inference in academia and industry. It culminates with recommendations for advancing the current debate and facilitating the development of scientifically based, ethically sound, and socially attentive guidelines concerning the use of these continually evolving technologies.

Introduction

In recent years, advances in genetics and genomics have brought new dimensions to the commercial genetics enterprise in the form of DTC genetic testing. With the click of a mouse, the public now has direct access to personal genetic and genomic information related to health, ancestry, nutrition, physical traits, athletic ability, dating compatibility, and a seemingly infinite list of other attributes. Although health-related DTC genetic testing appropriately continues to receive a substantial amount of attention [1,2], discourse regarding other DTC applications of genetics, in particular the echoing of concern about genetic ancestry testing, is increasing [3–6]. There are approximately 40 companies, based in various countries, that currently provide genetic ancestry testing to the public (Table 1). The companies differ in both the range of genetic testing services and the types of ancestry tests that they offer.

The marketing of genetic tests directly to consumers is a priority area for the ASHG, as demonstrated by its statements on DTC health-related testing [7] and ancestry testing [8]. This white paper, commissioned by the ASHG, expands discussion of the issues highlighted in the Society's 2008 Ancestry Testing Summary Statement [8] and introduces additional pertinent issues. The purpose of our report is twofold: (1) to enlighten and engage ASHG members and the broader scientific and general communities and (2) to assist in determining the appropriate course of action for the Society in responding to the critical concerns.

It is important to note that the genetic tools employed by ancestry testing companies, as well as many of the scientists involved in DTC ancestry testing, have their roots in, and are still a part of, academia. Therefore, in this report, as in the summary statement, we evaluate application of ancestry estimation technologies in both environments because the many overlapping scientific and nonscientific issues in academic research have not, to date, been adequately addressed.

Table 1. Companies Providing Direct-to-Consumer Genetic Ancestry Testing
Company (a) | URL | Start Date, Location (b) | Genetic Testing Services | Ancestry Tests Offered
1. African Ancestry | http://www.africanancestry.com/ | 2002 | Ancestry | mtDNA, Y chromosome
2. African DNA | http://www.africandna.com/ | 2007 | Ancestry | mtDNA, Y
3. Ancestry.com | http://ancestry.com | 2007 | Ancestry | mtDNA, Y
4. Ancestry by DNA | http://ancestrybydna.com | 2009 | Ancestry | mtDNA, Y, admixture
5. ARGUS Biosciences | http://www.argusbio.com/ | 2003 | Ancestry, cancer tissue screening, personal DNA sequencing | mtDNA-HV, mtDNA-FL, Y
6. Cambridge DNA Services | http://www.cambridgedna.com/ | 2007, UK | Ancestry | mtDNA, Y, admixture
7. deCODEme | http://www.decodeme.com/ | 2007, Iceland | R&D, complete scan, cancer scan, cardio scan | mtDNA, Y, map of kinship, genetic atlas
8. Determigene | http://www.determigene.com/ | 2002 | Paternity, immigration DNA testing, infidelity testing, twin zygosity, ancestry, DNA safeguarding | Ancestral origins DNA ancestry map (population matches, native region matches, strength indicators)
9. DNA Ancestry | http://www.easternbiotech.com | 2006, Dubai, UAE | Ancestry | mtDNA, Y
10. DNA Direct | http://www.dnadirect.com/web/ | 2003 | Screening tests, genetic disorders, drug response, DNA storage, paternity and family tests | Y
11. DNA Identity Testing Center | http://www.800dnaexam.com/ | 2006 | Paternity, family relationship DNA tests, immigration, adoption, forensic, ancestry, identity | mtDNA, Y
12. DNA Heritage | http://www.dnaheritage.com/ | 2003, UK | Ancestry | mtDNA, Y, surname projects
13. DNA Reference Laboratory | http://www.dnareferencelab.com | 2006 | Paternity, immigration paternity testing, ancestry, forensic, infectious-disease testing, R&D | mtDNA, Y, ethnicity DNA makeup, European ancestry DNA test, Native American ethnicity DNA test
14. DNA Solutions | http://www.dnasolutions.us/ | 2000 | R&D, paternity, sibling DNA, grandparent, twins test, ancestry, bird sexing | mtDNA, Y
15. DNA Testing Systems | http://dnaconsultants.com/ | 2003 | Ancestry, paternity, linkage disequilibrium | mtDNA, Y, admixture, Native American, African, Melungeon test, Hindu single & double for males
16. DNA Tribes | http://www.dnatribes.com/ | 2006 | Ancestry | Autosomal analysis
17. DNA Worldwide | http://www.dna-worldwide.com/ | 2005, UK | Paternity, relationship, immigration, ancestry, forensic, DNA storage, pet DNA | mtDNA, Y, world DNA match
18. easyDNA | http://www.easy-dna.com/ | 2006 | Paternity, legal DNA test, relationship, DNA profiles, twin zygosity, forensic, ancestry, immigration, maternity | Ancestral origins DNA ancestry map (population matches, native region matches, strength indicators), mtDNA, Y
19. EthnoAncestry | http://www.ethnoancestry.com/ | 2004, Scotland & Ireland | Ancestry | mtDNA, Y
20. Family Builder | https://dna.familybuilder.com | 2007 | Ancestry | mtDNA, Y
21. Family Genetics | http://www.familygenetics.co.uk | 2005, UK | Ancestry | mtDNA, Y
22. Family Tree DNA | http://www.familytreedna.com | 2000 | Ancestry | mtDNA, Y, autosomal, X-STR
23. Genebase | http://www.genebase.com/ | 2005, Canada | Ancestry | mtDNA, Y, autosomal DNA STR
24. Genelex | http://www.healthanddna.com | 2000 | Paternity, drug sensitivity, ancestry, predictive genetics | mtDNA, Y, Native American DNA, Jewish DNA, African ancestry DNA testing
25. Genetic Testing Laboratories, Inc. | http://www.gtldna.com/ | 2002 | Paternity, ancestry, infidelity, DNA maternity, siblingship, twin zygosity, grandparentage, missing parent, immigration, prenatal | mtDNA, Y, ancestral origins DNA ancestry map (population matches, native region matches, strength indicators)
26. Genetree | http://www.genetree.com/ | 2007 | Ancestry | mtDNA, Y
27. homeDNAdirect | http://www.homednadirect.com/ | 2006 | Paternity, legal DNA testing, relationship, forensic, ancestry | mtDNA, Y, ancestral origins DNA ancestry map (population matches, native region matches, strength indicators)
28. International Biosciences | http://www.ibdna.com | 2007, UK | Paternity, ancestry, siblingship | mtDNA, Y, ancestral origins DNA ancestry map (population matches, native region matches, strength indicators)
29. Metaphase Paternity Test | http://www.metaphasegenetics.com/ | 2002 | Paternity, siblingship, grandparentage, twins, prenatal, forensic, ancestry | Y
30. Oxford Ancestors | http://www.oxfordancestors.com | 2000, UK | Ancestry | mtDNA, Y
31. Paternity Experts | http://www.paternityexperts.com | 2004 | Paternity, sibling, ancestry, forensic paternity | Admixture
32. Pathway Genomics | http://www.pathway.com/ | 2009 | Health conditions, ancestry, carrier status, personal traits, monogenic dominants, drug responses | mtDNA, Y
33. Roots for Real | http://www.rootsforreal.com/ | 2002, UK | Ancestry | mtDNA, Y, admixture
34. Test Country | http://www.testcountry.com | 2001 | Health conditions, substance abuse, health & wellness, pregnancy/fertility, early disease detection, ancestry | Ancestral origins DNA ancestry map (population matches, native region matches, strength indicators)
35. The Genographic Project | https://genographic.nationalgeographic.com/ | 2005 | Research, ancestry | mtDNA, Y
36. Universal Genetics; DNA Testing Laboratory | http://www.dnatestingforpaternity.com/ | 2009 | Paternity, forensic, ancestry | mtDNA, Y
37. Warrior Roots | http://www.warriorroots.com/ | 2009 | Ancestry, linking ancestry to ancient warrior groups, athletic profile | mtDNA, Y
38. 23andMe | https://www.23andme.com/ | 2006 | Complete scan | mtDNA, Y, global similarity
Last updated February 23, 2010.
(a) This list does not include companies or sites that only promote genetic ancestry testing services. On November 17, 2009 deCODE genetics (owner of deCODEme) announced that it has filed a voluntary Chapter 11 bankruptcy petition but will continue to offer services during its restructuring process. On January 21, 2010, deCODE genetics was purchased by Saga Investments LLC, a consortium that includes Polaris Ventures and ARCH Venture Partners. Since the initial preparation of this table, two companies have been removed from the list: DNA Ancestors has gone out of business, and DNA Diagnostics Center has become a promoter site.
(b) Start date for offering genetic ancestry testing. Some companies were in existence prior to this date. Locations are listed for non-U.S. locations only.

Definition(s) of Ancestry

Our common origin as a species implies that we, as individuals, are all related to one another by varying degrees [9], so it is important to be clear about what frame of reference is being used in discussions of "ancestry" and relationship (Figure 1). For example, because of recombination, each segment of the genome has its own ancestral history, and various segments of an individual's genome may have ancestral histories that trace to different populations.

One commonly employed concept of ancestry is continental ancestry, which assumes the existence of four or five major "parental" populations that gave rise within the last 100,000 years to existing populations [10]. This conception of ancestry is frequently equated with that of "race," and the terms are often used interchangeably; however, this is problematic because in the history of science there have been many "racial" taxonomic schemes [11,12]. A related view of ancestry is biogeographic ancestry, in which a person's origin is associated with the geographic location(s) of presumed ancestors inferred by comparison with contemporary populations living in these locations [13,14]. There is also lineage or family history, which typically represents a generational narrative about one's relatives through his or her maternal and paternal lines of descent [15]. It is this notion of ancestry that people often focus on when considering their genealogy. However, we note that genealogy does not necessarily confer genetic similarity in all biological systems; for example, even one's offspring may not be the ideal match for a kidney transplant or blood transfusion.

In addition to the concepts described above, there are sociopolitical rules about ancestry that guide membership in certain groups. In the United States (US) these include the legal and historic utility of hypodescent ("one-drop" rule) for African Americans and blood quantum laws for Native Americans [16–18]. In reality, however, neither are these rules absolute nor have they been consistently applied [18]. This further illustrates that ancestry-related social identity (how a society may see or define a person or group in relationship to real or putative ancestry) is to be distinguished from personal or group interpretations of such identity or actual knowledge of genealogy.

Figure 1. Global Ancestry
The arrows symbolize migration of early human ancestors out of Africa. The color mosaic denotes global population diversity resulting from various subsequent inter- and intra-continental and regional migrations. The pedigree represents the complex network of intermediate and recent ancestors that is the subject of individual genetic genealogy testing.
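As a rough sense of scale for the pedigree in Figure 1, the following minimal sketch counts pedigree positions per generation and the expected autosomal share contributed by any one of them. It assumes an idealized pedigree in which no ancestor appears twice (real pedigrees overlap, so the number of distinct people is smaller), and the function names are illustrative rather than taken from the paper.

```python
def ancestor_slots(generations: int) -> int:
    """Number of positions in an idealized pedigree `generations` steps back.

    Each person has two parents, so the count doubles per generation.
    This is an upper bound on distinct ancestors, not an exact count.
    """
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return 2 ** generations


def expected_autosomal_share(generations: int) -> float:
    """Expected fraction of the autosomal genome traced to one ancestor
    `generations` steps back. The realized fraction varies around this
    value because of recombination and can be zero for distant ancestors.
    """
    return 1.0 / ancestor_slots(generations)


if __name__ == "__main__":
    for g in (1, 2, 3, 5, 10):
        print(g, ancestor_slots(g), expected_autosomal_share(g))
```

The halving is only an expectation: what a descendant actually retains from a given ancestor is stochastic, which is one reason a lineage or segment-based test samples only part of a person's genealogy.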
Interest in Ancestry Estimation

Consumers and researchers are interested in using genetic information to infer ancestry for a variety of reasons, including genealogical, anthropological, and epidemiological. Most consumers are interested in using ancestry testing to gain, confirm, or extend knowledge about their recent family genealogy [3,19]. To permit inferences about shared recent ancestry, commercial testers of genetic ancestry employ a variety of genetic-marker systems to make comparisons between a customer's DNA and genetic databases of individuals sampled from diverse populations and geographic regions. However, although the concept of "ancestry" is least ambiguous when it refers to either very close ancestors (i.e., parents or grandparents) or our most distant ancestors (i.e., the earliest hominids), genetic ancestry tests typically address more intermediate levels of ancestry that are imprecisely defined and identified [10,20] (Figure 1). Given this intrinsic imprecision, the power of commercial genetic tests to address specific genealogical questions is contingent on several factors that we will discuss later in this paper.

Population geneticists and anthropologists use genetic markers and comparative datasets similar to those used in commercial ancestry testing to make inferences about population histories and relationships. Ancestry estimation has enormous value in this regard because it has the potential to illuminate patterns of past human migration and provide background information about human genetic variation that is essential for distinguishing the impact of demographic processes from the effects of natural selection [21–23]. Unlike commercial ancestry testing, such inferences are nearly always made at the level of populations or groups, rather than at the individual level. As a consequence of this plural focus, these ancestry inferences are more robust with respect to their fundamentally probabilistic nature, and the limitations of ancestry estimation for individuals are comparatively less apparent.

Genetic epidemiologists with an interest in identifying genetic associations with disease employ methods of ancestry inference for specific analytical reasons: either to control for statistical biases related to population stratification among cases and controls [24–26] or as a strategy to map susceptibility variants that might be differentially distributed with respect to ancestry within groups whose histories more clearly demonstrate the "mixing" of two or more peoples. Recently admixed groups (such as African Americans or Hispanic Americans) provide the opportunity to perform mapping by admixture linkage disequilibrium, commonly referred to as admixture mapping [27–29]. However, epidemiological inferences of genetic ancestry are typically applied to individuals and are nearly always based on the analysis of large collections of single nucleotide polymorphisms (SNPs) or ancestry informative markers (AIMs, described below). Each individual's genome is then mapped as a mosaic of segments inferred to be derived from one or the other ancestral population (or both, in the case where maternal and paternal alleles in the individual are each derived from different ancestral populations).

Human History and Variation

Genetic ancestry estimation is based on an understanding of the distribution of diversity among human populations that reflects the demographic and evolutionary history of our species. Genetic and archaeological evidence indicates that, over the past 100,000 years or so, as the population size of humans increased markedly, humans dispersed from East Africa to populate other parts of the world [30,31] (Figure 1). The number of migration events, their magnitude, and the routes that migrants took are still active areas of research. Nevertheless, it is apparent that the dispersal of anatomically modern humans affected the geographic distribution of diversity in at least two important ways. First, founder populations typically carried with them only a subset of the genetic variation found in their most immediate ancestral population while simultaneously developing new mutations and genetic profiles. Second, as founder populations became more widely separated from one another, the probability that two randomly chosen individuals would mate with each other became even lower, and matings were even more likely to occur between people living close to each other, accentuating the divergence between geographically isolated populations.

Over the past two decades, geneticists have characterized the geographic pattern of variation in great detail by using both haploid and diploid genetic markers (described below) [32–37]. Because different parts of the genome can have different ancestral histories, different marker systems often provide somewhat different information about population history and individual ancestry.

Currently, we only have partial knowledge of how human genetic diversity is distributed across the globe, but initial studies [38] are revealing the degree of resolution possible in testing the relationship between genetic ancestry and geographic origins. A number of these studies have used a collection of ~1100 DNA samples obtained from 51 populations living in different parts of the world; these samples constitute the Human Genetic Diversity Panel (HGDP) [39]. Analysis of 987 microsatellites typed in the HGDP collection, for example, inferred six population clusters that correspond to continental regions (i.e., Africa, America, Central/South Asia, East Asia, and Oceania) [38,40]. Analysis of ~642,000 autosomal SNPs in the HGDP collection enabled clustering of individuals not only to these large geographic regions, but also to specific populations within these regions [38,40]. Although the HGDP collection is a useful collection of widely distributed human populations, it is a convenient sample and does not sample densely within any one geographic region; hence, there are limitations to the accuracy of ancestry inference within and among regions. Several studies that sampled populations deeply across Europe have shown that population structure can be inferred even at fine spatial scales (i.e., the scale of several hundreds of kilometers) within Europe [41–45]. There is rapidly escalating interest in ancestral histories of other continents and geographic regions, and the deeper timescales of populations from Africa [46] and India [47] have identified deep splits among geographic regions with complex patterns of past migration and admixture. Further studies of human genetic diversity are clearly needed; these will determine whether such fine-scale geographic patterns can be detected in other parts of the world and assign interpretive values to these patterns.

It is important to note that the diversity of human social structures, intermarriage patterns, and demographic histories makes it likely that the resolution of population structure will be challenging and that the extent of the resolution will vary considerably among populations. The inclusion of individuals with recent migration among ancestors creates the more complex problem of disentangling recently mixed ancestry.

Tools for Inferring Ancestry

Estimates of genetic ancestry are based either on the use of haploid markers (mitochondrial DNA [mtDNA] or Y chromosome haplotypes) or on the use of multiple unlinked autosomal markers that are diploid and sometimes preselected to be "ancestry informative." As uniparentally inherited haploid markers, mtDNA provides information about the female-to-female transmitted lineage (male children also inherit mtDNA from their mothers but do not transmit it to their offspring), whereas the Y chromosome is informative about male-to-male transmitted lineage. More recently, autosomal markers, which are inherited from both parents, have been used for assessing patterns of genetic variation in worldwide human populations. Commercial genetic ancestry testing primarily utilizes haploid markers to make ancestry inferences (Table 1), whereas estimates of genetic ancestry in epidemiological applications rely almost exclusively on the consideration of allele frequencies of autosomal SNPs. Population geneticists and anthropologists employ both types of markers; which type they use depends on the availability of funds and the questions being addressed.

mtDNA and Y Chromosome Markers

Haploid genetic markers such as mtDNA D-loop region sequences or Y chromosome SNP haplotypes permit direct comparison of the lineages of sampled and reference individuals. As such, and unlike probabilistic estimates of population ancestry, matches among haploid genetic markers are intuitively easy to understand: an exact match of a male's Y chromosome haplotype to a man living in Australia implies that these two men share a common paternal ancestor.

An important issue with regard to lineage-based genetic estimates is that they reflect only a fraction of any person's total genetic ancestry. For example, the Parsis have Y chromosome information that indicates an origin in Iran, consistent with the historical record, whereas the mtDNA originates in Gujarat, a region in northwestern India where the Parsis arrived in approximately 900 AD, before moving eventually to Mumbai, India, and Karachi, Pakistan [48,49]. This asymmetry of maternal and paternal ancestry is not a matter of test inconsistency; rather, it reflects the high likelihood that nearly everyone will have ancestors from different geographic locations.

Another problem related to lineage-based comparisons involves the interpretation of exact genetic matches between individuals. Although it is biologically justified to infer that two individuals with the same mtDNA haplotype share a common ancestor, moving from this inference of common ancestry to the conclusion that the match implies something about the biogeographical ancestry of both individuals can be problematic. For example, if someone lives in North America and his or her mtDNA haplotype exactly matches an individual living in Indonesia, the only thing that can be inferred with confidence is that they share a common ancestor. Without more information about family history and/or the geographic distribution of closely related mtDNA haplotypes, it is impossible to say whether this match arises via recent Indonesian ancestors in the North American's family tree, whether both share distant ancestors who lived in an entirely different part of the world, or whether the Indonesian match has recent North American heritage. Similarly, it is difficult to arrive at a robust interpretation of an mtDNA haplotype that exactly matches those sampled from multiple geographic locations, e.g., Indonesia, Thailand, and Papua New Guinea.

Autosomal Variants

In comparison to mtDNA and Y chromosome markers, autosomal markers provide much more comprehensive information on individual ancestry because cumulatively they represent a much greater proportion of genome history (i.e., multiple biparentally inherited loci versus a single locus, as inherited through mtDNA or the Y chromosome). However, because the genome is finite, only a small fraction of ancestors are represented by each given genomic segment in an individual, and every ancestor does not necessarily pass on his or her DNA at any given genomic segment to a descendant, so one can only ever have limited information on the origins of a given individual's ancestors [10].

Autosomal variation can be measured by whole-genome sequencing approaches, with genome-wide genotyping panels, or via an assessment of AIMs. Whole-genome sequencing, although ideal and likely to usher in a renaissance in genetic anthropology, is still prohibitively expensive and so is beyond the reach of most academic researchers or commercial testing companies. Genome-wide genotyping arrays, the next-most comprehensive approach, include SNPs that are common in a select subset of populations and lead to ascertainment biases that can impact ancestry estimation [50]. AIMs, most often developed to estimate admixture proportions originating from African, European, Asian, and Native American populations [51], offer increased power for ancestry inference in comparison to a random set of autosomal markers. Accordingly, a smaller set of markers can be used, reducing genotyping costs and increasing throughput.

The use of AIMs has facilitated efforts to control for admixture and population stratification in genetic association studies. Specifically, knowing the proportion of an individual's ancestry that originated in different populations and to what degree a group is divided into genetic subpopulations can be useful for both reducing false-positive associations and uncovering true associations. We note, however, that not all people from a given population have the AIM(s) identified with that population, and people from different populations can have the same AIM(s). Gene mapping with AIMs, or admixture mapping, has also been used successfully for identifying genomic regions associated with diseases and health-related traits such as prostate cancer (MIM #176807), hypertension (EHT [MIM #145500]), and white blood cell count [29,52–54]. Admixture mapping is most effective for identifying genetic variants associated with health conditions that differ between recently admixed populations (e.g., tropical African and European in the case of most African Americans; and Native American, European, and African populations in the case of Hispanic Americans) and for which this difference has not yet been fully explained by nongenetic factors.

Although most DTC tests for ancestry offer lineage testing that uses mtDNA and Y-chromosome markers, DTC testing with autosomal markers, especially with whole-genome SNP chips, is becoming more common. The results reported to the consumer typically estimate admixture proportions from several populations, most often Africans, Europeans, Asians, and Native Americans. However, the interpretation of such estimates by both the scientist and the consumer is unclear.

Accuracy of Ancestry Inferences

Ideally, any quantitative claims about ancestry should have an easily interpreted assessment of confidence or accuracy associated with them. Our interest in accuracy is to assess not only what the accuracy estimate is but also how well we can describe our confidence in the inferences. We also stress the difference between accuracy of a particular individual's ancestry versus the inference of ancestry of a population sample. The former is particularly important in the case of DTC ancestry testing, whereas most scientific research on ancestry inference deals with the latter. The accuracy of ancestry inference methods is a function of (1) how the underlying patterns of human genetic variation are distributed across the geographic range of human habitation, (2) how that diversity is surveyed (i.e., the type and number of genetic markers used) and who was sampled, (3) which populations are used as references, and (4) the statistical methods used for interpreting patterns of variation.

Distribution of Genetic Variation

Accuracy is limited by the fact that every person has hundreds of ancestors going back even a few centuries and thousands of ancestors in just a millennium. There is enormous stochastic variation to the portion of the genome retained in a descendant from a given ancestor, and there is a rough expectation that it halves every generation. Genetic ancestry tests can access only a fraction of these ancestral contributions. Furthermore, the genomic segments contributed by a particular ancestor are far from all being uniquely identifiable, so even if one's genome has those specific contributions, identification of particular ancestry is always uncertain and statistical.

Geneticists also make specific choices about which levels of ancestry to examine. For example, many estimations of genetic ancestry are designed to distinguish contributions from reference populations that live in particular geographic regions (e.g., West Africa, Europe, East Asia, and the Americas) that were prominent in colonial-era population movements. This creates a bias that might lead us to define ancestry in reference to particular sociopolitical groups. Moreover, our knowledge of diversity, and hence the genetic contributions to ancestry, of populations in many other parts of the world (e.g., East Africa, South Asia, Arabian Peninsula, and Southeast Asia) is limited.

Lineage Identification with Uniparental Markers

While it is now possible to identify related groups of Y-chromosome and mtDNA lineages with high accuracy, population-level inferences that have been made from these uniparental systems are substantially less accurate. Two simple examples help to illustrate this point. A large number of single-site changes have served as the basis for breaking Y chromosomes into different "haplogroups," and it is accurate to say that Y chromosomes within, say, haplogroup C are more closely related to one another than to a Y chromosome from haplogroup J. Thus, if two men both carry haplogroup C Y chromosomes, they are more likely to share a paternal lineage than if they had different haplogroups. Even so, this relationship does not mean that they are more genetically similar overall.

On the other hand, in the scientific literature there has been a connection drawn between one subset of haplogroup C and Genghis Khan on the basis of the commonness of that branch of the Y chromosome genealogy in parts of the world conquered by Genghis Khan [55]. Although such a connection is by no means impossible, we currently have no way of assessing how much confidence to place in such a connection. We emphasize, however, that whenever formal inferences about population history have been attempted with uniparental systems, the statistical power is generally low. Claims of connections, therefore, between specific uniparental lineages and historical figures or historical migrations of peoples are merely speculative.

Admixture Estimation

For autosomal markers, ancestry inference is most often performed under a discrete-deme admixture model where there is a set of discrete demes (usually or always referred to as "ancestral" or "parental" populations), and each individual inherits proportions of his or her genome from each of these demes. The goal of the method is to estimate this list of admixture proportions for each individual. Strictly speaking, a deme is a breeding population, defined on the basis of population genetic inference of intermixed genetic variation, and it is unlike classical anthropology's "races," which are defined by morphology [11]. Nonetheless, the emphasis of admixture estimation on differences over similarities can be misleading about the overall genetic structure of the human species.

Admixture estimation has greatly advanced the field of ancestry inference; however, there are caveats to the interpretation of its results. First, the "ancestral populations" are not directly observed, although in many applications, samples from related populations are used as a proxy. For example, present-day Yoruba are the most frequently used proxy for inferring African American ancestry, despite the fact that most African Americans derive their ancestry from diverse West African (and other African) populations that existed over a span of several centuries and that might not all be well represented by present-day proxy populations [46,56,57]. Second, if some ancestral populations are missing altogether from the analyses, programs such as STRUCTURE [58] and FRAPPE [59] will force the results into a composite of the reference samples used; therefore, the results will be skewed simply because of how the algorithms work. If a poor proxy is used for one ancestral population, the method might compensate by adding admixture from other ancestral populations. Consider genetic ancestry testing performed on an individual we will call Joe, whose eight great-grandparents were from southern Europe. The HapMap populations are used as references for testing Joe's genetic ancestry. The HapMap's European samples consist of "northern" Europeans. In regions of Joe's genome that vary between northern and southern Europe (such regions might include the lactase gene, LCT [MIM #603202]), the genetic ancestry test using the HapMap reference populations is likely to incorrectly assign the ancestry of that portion of the genome to a non-European population because that genomic region will appear to be more similar to the HapMap's Yoruba or Han samples than to its (northern) European samples.

Although the discrete-deme admixture method is informative about ancestry in settings where individuals have recent admixture from diverse continental populations, it does not perform well in settings where individuals have more ancestors from across a continuous gradient of genetic diversity. European populations, for example, despite revealing genetic differences, have been shown (as described above) to exhibit mainly continuous spatial patterns of variation. When admixture is estimated for European individuals under the assumption of two ancestral populations [...] between the source populations, which is incorrect (e.g., an individual with an East Asian and European parent will be indistinguishable from an individual from Central Asia). This reinforces the need for models that take these and other limiting factors into account and recognize that in some cases accurate social identifications cannot be made.

Reference Samples

To infer ancestry, researchers rely on comparing any individual's particular genetic profile to that of reference populations. Research geneticists benefit from various publicly available databases such as the HapMap, Human Genome Diversity Panel, Perlegen Human Genome Resources, POPRES project, and SeattleSNPs projects. However, even the databases that researchers consider the most applicable reflect a woefully incomplete sampling of human genetic diversity, and this has important consequences for the accuracy of ancestry inference. One problem is that the "ancestral populations" assumed by some methods are not explicitly represented in databases, and indeed [...] collaborations might be helpful and commendable, vigilance is needed in identifying and addressing potential conflicts of interest. The scientific claims of companies that choose not to disclose the contents of their proprietary databases cannot be assessed; therefore, the reliability of the information they provide to consumers cannot be verified [64].

Statistical Methods

Regardless of the methods used or samples referenced, steps should be taken to adequately convey the amount of uncertainty in the inferences about ancestry, whether in the research or commercial setting. Population genetic inference is ultimately a statistical exercise, and rarely can definitive conclusions about ancestry be made beyond the assessment of whether putative close relatives are or are not related. Because ancestry inferences for less simple questions require reliance on complex statistical procedures with inherent uncertainty, both producers and consumers of genetic ancestry estimates need to have a fairly sophisticated understanding of probability.
tions,28 the method chooses admix- cannotberepresentedassuchbecause Therearetwolevelstotheinherent ture proportions that make individ- we do not have the ability to sample uncertainty of these statistical infer- uals a mixture of ‘‘northern’’ and ancestral populations. A second ences. First, there is uncertainty in ‘‘southern’’ ancestral populations problem is that populations that are parameter estimates (for instance, eventhoughthereisnoindependent mixtures of the ‘‘typical’’ reference howlargearetheconfidenceintervals evidencethattwosuchancestralpop- populations (e.g., Africans, Asians, of admixture coefficients for an indi- ulationseverexisted. and Europeans) are substantially vidual?). Second, there is uncertainty Methodsforaddressingcontinuous, under-representedinthesedatabases. in how to interpret these parameters spatial population structure are still Recent sampling efforts, such as (e.g., what do the admixture coeffi- under development, but principal- HapMap Phase III samples, are help- cients mean—what does an individ- components analysis (PCA) has been ingtoremedythisproblem;however, ual’s haplogroup say about his or her widely applied in this context.60,61 continued attention to diverse sam- past?).Thecontextinwhichancestry The expected behavior of PCA on pling will be an important aspect of estimation is being used determines evenly spaced samples from spatially any subsequent surveys of human the importance of these sources of structured data is to return coordi- geneticvariation. error. 
In some research contexts (e.g., nates that are related to the geo- Some commercial scientists and whenancestryisusedasacovariatein graphic origin of each individual.62 privategroupshavetheirownunpub- genome-wide association studies), it Moreover,thereisaclearlyestablished lished databases with the potential might be sufficient to have some relationshipbetweenthegenealogical to provide more refined information quantitative variable that represents structureofasampleandtheprincipal than that available from publicly ancestry. In commercial ancestry- components, grounding PCA in firm available resources. In some cases, testing applications, however, inter- principles of population genetics.63 the commercial interest in ancestry pretationofteniskeybecausetheinfor- One caveat of PCA-based approaches testing is indirectly benefiting public mation that is presented might have is that if individuals are a product of research. For example, the company directpsychosocialandotherpersonal ‘‘recent admixture’’ from disparate 23andMe partially funded the geno- implicationsfortheindividual. origins, it will assign individuals to a typingfortheHumanGenomeDiver- The statistical methods used to single origin that is intermediate sityProjectsamples.38Althoughsuch perform ancestry inference vary with 668 TheAmericanJournalofHumanGenetics86,661–673,May14,2010 regardtotheassumptionstheymake, between genetic and environmental prompts a host of psychological, how much of the information avail- factors.71–73 social,legal,political,andethicalcon- able in the genetic data is extracted, There are circumstances in which cernsfromtheindividualtotheglobal andhowtheirstatementsaboutinfer- genetic factors influencing heath- level.Theseactualorpotentialconse- ence are summarized for the re- related traits are associated with quences have received increasing searcher or the consumer receiving specific genetic variations that tend attention3,6,81–86andmustbeconsid- the information. 
The ease in under- to be more prevalent in a particular eredalongsiderelevanttechnicaland standing the statistical confidence in racial or ethnic group than in the analyticalissues. the ancestry inference also varies restofthepopulation.Certaingenetic Knowledge about genetic ancestry, widely among methods. The most variants associated with hyperten- particularly if undesirable and unex- important aspect of reporting confi- sion,52 type 2 diabetes mellitus (T2D pected, can lead to the reshaping of dence in ancestry determinations is [MIM #125853]),74 end-stage renal group, familial, or personal iden- to accurately convey the level of un- disease (FSGS4 [MIM #612551]),75,76 tity.87–91 Anthropological andpopula- certainty in the interpretations and prostate cancer,53,77 and some treat- tion-genetics research that postulates to convey the real meaning of that mentresponses78,79havebeenshown or casts doubt on ancestral relation- uncertainty. to differ significantly in frequency ships has historically incited varying amonggroups.Therefore,diseaserisk degrees of identity-related conflict. AncestryandHealth or treatment response is associated Some of the most notable examples (Note:unlikeinothersectionsofthis with and, in some situations, influ- includethecaseofKennewickMan,92 report, where we mention ‘‘race’’ to enced by genetic factors that vary researchlinkingtheLembaandcertain make specific points, in this section among racial or ethnic groups. It is Jews,93,94 and the discovery of family we use ‘‘race’’ (and ethnicity) as con- notclearhowmuchofthisisactually ties between Thomas Jefferson and structed by the US Office of Manage- gene expression versus DNA SallyHemings.95Theoccurrenceof,or ment and Budget (OMB) and as used sequence. potential for, emotional distress in in US social, government, and bio- Given the complexity and limited people, families, and groups after medical research parlance. 
We realize understanding of the relationships receipt of conflicting information that there are various connotations among genetic variation, ancestry, about their identity through DTC and limitations of these terms, but race, ethnicity, and health and treat- ancestry testing has also been dis- ourgoalhereisonlytoprovideabrief ment outcomes, the translation of cussed.3,15,87,89–91 Nonetheless, some overview of some important issues geneticepidemiologicalresearchfind- research focused on consumers of pertaining to health outcomes and ings to clinical application requires ancestry testing has revealed that health differences within and among ample consideration of a variety of althoughancestrytestsmightpromote the referent ancestry-linked sociopo- factors, including personal, social, genetic thinking about ancestry and liticalgroups.) and other nongenetic factors. This ‘‘race,’’ test takers also were able to Researchers still poorly understand issue might be highlighted in the construct meaningful narratives of the relationship of genetic ancestry contextofDTCgenetictesting,where their identity.5 Clearly, additional to individual and population health, consumers might share ancestry test empirical research will need to but this relationship is a potentially results or ancestry-related estimates adequately explore the relationship important area for investigation in of disease risk with their healthcare between genetic ancestry testing and that it might have social and politi- providersandexpectthattheinforma- theidentitiesandoverallpsychological cal consequences.65–68 In the US, it tionbefactoredintotheircare.3,68In well-beingoftesttakers,theirfamilies, has been commonplace to report view of the ongoing national efforts andtheircommunities. 
disease prevalence for each racial or toincreasethepublic’sexplorationof Questions have been raised about ethnic group separately, and these family(health)history,80itispossible privacyandaboutthesecurityofrefer- prevalence estimates often vary thatthispracticecouldbecomewide- encedatabasesthatsupportancestry- amonggroups.69Thishasledtowide- spread as people seek to exhaust the estimation endeavors. For example, spread speculation that racial or availablesourcesofinformationabout for genetic-ancestry-testing compa- ethnic differences in individual or their family history and associated niesthataresoldorgobankrupt,there population health are primarily due health risks. As such, the healthcare are concerns about the future of the to genetic factors, including genetic community must be recognized as a privacypoliciesandothertermsunder ancestry.68,70 Yet, racial or ethnic key stakeholder in decision making which data were collected.96 Some identity could be associated with the concerning genetic ancestry infer- people also fear that commercial health of an individual or group in ence. ancestry-testing databases might be several ways. 
It might co-vary with more vulnerable than other genetics different environmental or genetic PersonalandSocietalImplications databases to alternate and inappro- factors that underlie risk or with Ancestry inference—in both its re- priateuses.64,97 The problemof alter- different interactions within and searchandcommercialapplications— nate uses of data in the context of TheAmericanJournalofHumanGenetics86,661–673,May14,2010 669 ancestry estimation might also be and discussion about the potentially typing procedures, but it might not extended to the unauthorized inclu- reifying effects of current ancestry- be very useful to do that and also sion of population-based genetic estimationpractices.3,16,86,106Beyond claim that the inferences from the research data or samples in ancestry- ancestryestimationitself,theroutine data are not validated or certified in related studies6,98 or in commercial treatment, in science, of ancestral, any way. Determination of feasibility ancestry-testing databases. These ethnic, and so-called racial groups as or of mechanisms for certification practices bring to the fore consider- bounded biological entities perpetu- and validation, as well as specific ation about evolving notions of ates an inaccurate concept of human approaches for enhancing consumer consent, anonymity, respect for variationandincreasesthepossibility understanding of the scientific and communities, group risk-benefit of stigmatization and discrimination nonscientific issues, will require assessment, and benefit sharing, and of the groups and the people within thoughtful deliberation beyond the these issues must also be addressed themonthebasisoftraits,behaviors, scopeofworkofthistaskforce. 
within the current broader discourse diseases, and other attributes.64,70,107 The academic research community on the sharing and secondary use of Scientists and the scientific establish- cannot afford to be exempt from genetic and genomic data and ment as a whole must attend to this similar efforts to increase scientific samples.99,100 longstanding and pervasive problem rigor and overall accountability in Acommonconcernaboutscientific of conveying conflicting messages genetic ancestry estimation. Indeed, effortstoexplainoriginsisthealleged pertainingtohumanvariation. thepeer-reviewprocessesforfunding diminished regard for important Although genetic ancestry infer- andjournalpublicationsaredesigned cultural, religious, social, historical, enceinresearchand themarketplace to assist in such efforts, but their and political processes that inform isthefocusofthisreport,wearewell effectiveness is compromised by the origin as well as group membership, awarethatthetechnologiesarebeing inadequacy and inconsistent applica- identity, and rights.3,16,101 Reports of employed in other arenas. For ex- tion of existing guidelines in this the use (or intended use) of ancestry ample, since 2003, the forensic use area. 
Because of the intrinsic uncer- testresultstomakeclaimsforbenefits ofDNAtomakedeterminationsabout taintiesofthescienceandthepoten- throughaffirmativeactionorforrights ancestryincriminalcaseshasbecome tial societal ramifications, the field of perceived to be associated with their more widespread.107–110 More re- population genetics as a whole could new-found Native American status cently,the‘‘HumanProvenancepilot benefit from improved and enforced have increased unease over the loss project’’proposedusingDNAancestry standardswithrespecttoterminology or gain of certain rights or entitle- testing to identify the nationalities and methodologies, as well as inter- ments.91,102,103 Entitlement could of people seeking asylum in the pretation and communication of also be viewed in terms of interest UK.111,112Theseandothersuchappli- researchfindings. amongsomeDTC-ancestry-testtakers cationsofgeneticancestryestimation Recently,Leeandcolleagues6called in seeking dual citizenship in coun- alsomeritscrutinybecausetheyhave for federal regulation of genetic triesidentifiedastheirancestralhome- the same technical problems dis- ancestry testing. At this juncture, we lands.104 This trend is similar to that cussed above and may pose palpable offer an alternate approach, one that discussedinrelationtosomepopula- threatstohumanwelfare. might itself lead to federal oversight, tion-geneticsresearchconnectingthe if subsequently deemed appropriate, Lemba and certain Jews.87,105 It Conclusionsand necessary, or practical. We believe remains to be seen what tangible Recommendations that effective decision making re- effects (if any) genetic ancestry infer- Concerns about analytical proce- garding genetic ancestry inference, ence will have on these pre-existing dures, interpretation, and the per- in particular DTC genetic ancestry entitlementissues. 
sonal and social implications of ge- testing, willbe best initiated through Genetic ancestry inference (in neticancestryinferencemakeitclear cooperative interaction among a va- particular,theuseofAIMsandadmix- that enormous care is required in riety of stakeholders, including suit- ture mapping techniques) could theapplicationofancestryestimation able federal agencies. Considering reveal the nuances of ancestry and in both research and commercial that such collective engagement has dispel the notion of race in humans settings.Amajorissueregardingcom- not yet occurred, it is premature to and/or the practice of equating race mercialancestrytestingisthatthereis assume reticence or resistance on the with ancestry. Paradoxically, it is no quality assurance guarantee. This part of any of the players or that equally capable of giving credence to gives rise to the question of whether federalregulationistheonlyrecourse. the idea that humans subdivide into there is a need for lab certification or On the basis of our review of the distinctbiologicalracesandimplying accreditation.Wetendtoleanagainst state ofthe science and the personal, that there are the clear-cut connec- anything so formal because it would societal, and health-related implica- tions between DNA and specific provide a stamp of approval by any tions of genetic ancestry inference in geographic regions or ethnic groups. designated accrediting body. It isone academia and industry, we make the There has been substantial anxiety thing tocertifyaccuracyofthegeno- followingrecommendations: 670 TheAmericanJournalofHumanGenetics86,661–673,May14,2010
