Molecular Ecology (2013) 22, 1917–1932 doi: 10.1111/mec.12215

Phylogeography and adaptation genetics of stickleback from the Haida Gwaii archipelago revealed using genome-wide single nucleotide polymorphism genotyping

BRUCE E. DEAGLE,*1 FELICITY C. JONES,†2 DEVIN M. ABSHER,‡ DAVID M. KINGSLEY†§ and THOMAS E. REIMCHEN*

*Department of Biology, University of Victoria, Victoria, British Columbia, Canada, V8W 3N5, †Department of Developmental Biology, Stanford University, Stanford, CA 94305-5329, USA, ‡HudsonAlpha Institute for Biotechnology, Huntsville, AL, 35806, USA, §Howard Hughes Medical Institute, Stanford University, Stanford, CA 94305-5329, USA

Abstract

Threespine stickleback populations are model systems for studying adaptive evolution and the underlying genetics. In lakes on the Haida Gwaii archipelago (off western Canada), stickleback have undergone a remarkable local radiation and show phenotypic diversity matching that seen throughout the species distribution. To provide a historical context for this radiation, we surveyed genetic variation at >1000 single nucleotide polymorphism (SNP) loci in stickleback from over 100 populations. SNPs included markers evenly distributed throughout the genome and candidate SNPs tagging adaptive genomic regions. Based on evenly distributed SNPs, the phylogeographic pattern differs substantially from the disjunct pattern previously observed between two highly divergent mtDNA lineages. The SNP tree instead shows extensive within-watershed population clustering and different watersheds separated by short branches deep in the tree. These data are consistent with separate colonizations of most watersheds, despite underlying genetic connections between some independent drainages. This supports previous suppositions that morphological diversity observed between watersheds has been shaped independently, with populations exhibiting complete loss of lateral plates and giant size each occurring in several distinct clades.
Throughout the archipelago, we see repeated selection of SNPs tagging candidate freshwater adaptive variants at several genomic regions differentiated between marine–freshwater populations on a global scale (e.g. EDA, Na/K ATPase). In estuarine sites, both marine and freshwater allelic variants were commonly detected. We also found typically marine alleles present in a few freshwater lakes, especially those with completely plated morphology. These results provide a general model for postglacial colonization of freshwater habitat by sticklebacks and illustrate the tremendous potential genome-wide SNP data sets hold for resolving patterns and processes underlying recent adaptive divergences.

Keywords: Gasterosteus, population structure, single nucleotide polymorphism

Received 12 October 2012; revision received 12 December 2012; accepted 13 December 2012

Correspondence: Bruce E. Deagle, Fax: +61362323288; E-mail: [email protected]
1Present address: Australian Antarctic Division, Kingston, Tas., 7050, Australia.
2Present address: Friedrich Miescher Laboratory, Max Planck Institute, Tuebingen, 72076, Germany.

© 2013 Blackwell Publishing Ltd

Introduction

Observations of species' phenotypic adaptations to local environments have inspired and informed generations of evolutionary biologists. Parallel adaptive changes under replicated ecological conditions, such as the repeated evolution of ecomorphs in Anolis lizards (Losos 2009) or the parallel diversification of cichlid fish (Kocher 2004), have been particularly valuable for understanding evolutionary processes. Many of these adaptive divergences have been extensively studied at the genetic level, with most of the focus on collecting phylogenetic information to clarify their historical context (e.g. Losos et al. 1998; Allender et al. 2003). Phylogenetic data can reveal how many independent times various phenotypic transitions occurred, the directionality of change, and provide a timeline for the adaptations (Avise 2006). At the intra-specific level, phylogeographic surveys have typically employed quickly evolving mitochondrial DNA (mtDNA) or microsatellite markers. However, with recent advances in DNA sequencing and high-throughput genotyping, genome-wide data sets are beginning to be collected from wild populations allowing more robust reconstruction of the structure and demographic histories of populations (e.g. Willing et al. 2010). By contrasting the levels of divergence across the genome, these population analyses also allow areas of the genome under selection to be identified (Beaumont & Balding 2004). With this ability to identify outlier loci, and through the use of classical genetic approaches (such as QTL mapping), the genetic basis of the traits involved in adaptive divergences is increasingly being revealed (Elmer & Meyer 2011). Phylogeographic surveys incorporating adaptive genetic markers in addition to putatively neutral markers will allow insight into whether parallel adaptive genetic changes are occurring across many populations and will allow new understanding of environment–phenotype–genotype interactions.

Radiation in form, physiology and behaviour of threespine stickleback (Gasterosteus aculeatus) has been the focus of a diverse range of studies examining evolutionary processes (reviewed in Wootton 1976; Bell & Foster 1994). This small fish is widely distributed in marine and coastal fresh waters of the northern hemisphere (Wootton 1976). Most of the diversification of stickleback occurred after morphologically conservative anadromous stickleback colonized freshwater habitats that were created when ice sheets of the most recent glaciation receded (<15 000 ybp). One of the most striking examples of morphological differentiation in vertebrates occurs in threespine stickleback from lakes on the Haida Gwaii archipelago, off western Canada. Among islands and within watersheds, these fish display a remarkable diversity in adult body size, longevity, defence morphology, trophic structures and nuptial pigmentation which equals or exceeds that found throughout the entire species distribution (Moodie & Reimchen 1976b; Reimchen et al. 1985; Reimchen 1994; Spoljaric & Reimchen 2007). Cases of extensive armour loss (i.e. populations lacking all lateral plates or completely missing the pelvic girdle) and major differentiation in adult body size are particularly well-studied (Moodie & Reimchen 1976b; Reimchen 1983, 1994; Reimchen et al. 1985; T. E. Reimchen, C. A. Bergstrom and P. Nosil unpublished data; Gambling & Reimchen 2012). Natural selection drives much of the differentiation seen in these insular populations (Moodie 1972; Reimchen 1980, 1983, 1994, 2000; Reimchen & Nosil 2002). However, as the populations in adjacent watersheds, or over small geographic areas, may not represent independent colonization events, it is not clear how many separate times various divergent stickleback phenotypes have arisen on Haida Gwaii.

Phylogeographic data based on mtDNA have been collected for Haida Gwaii stickleback revealing two highly divergent mitochondrial lineages estimated to have separated over a million years ago (O'Reilly et al. 1993; Orti et al. 1994; Deagle et al. 1996). The distribution of one lineage, only in freshwater populations near postulated ice-free refugia, was initially interpreted as evidence of extended existence of this lineage in the region (O'Reilly et al. 1993). A subsequent global study showed the divergent mtDNA 'refugia' lineage was ubiquitous in Western Pacific samples collected around Japan (Japan Sea lineage) and formed a clade distinct from European and most North American populations (ENA lineage) (Orti et al. 1994). The presence of this ancient mtDNA polymorphism and otherwise low level of sequence divergence between Haida Gwaii populations has obscured the relationships among populations.

Over the last 10 years, stickleback research has been brought to the forefront of ecological genomics through the development of a suite of resources, including genome sequences of 21 individuals from diverse marine and freshwater populations and a high-quality threespine stickleback reference genome (Jones et al. 2012b). This has allowed the genetic basis for several adaptive traits to be determined (Colosimo et al. 2005; Miller et al. 2007; Chan et al. 2010). One of the best studied is the sticklebacks' lateral plate armour. Marine fish are characterized by a row of ~35 bony plates running down each side of the body (completely plated); this contrasts with freshwater populations which generally retain <10 plates (low plated). Intermediate (partially plated) fish also occur. The difference in plate morphs has a relatively simple genetic basis with more than 70% of variation in plate number controlled by the Ectodysplasin gene (EDA) (Colosimo et al. 2004). Selection for lower lateral plate number in freshwater populations has been attributed to many potential mechanisms (reviewed in Bell 2001) and is almost universally due to shifts between two allelic forms of EDA (Colosimo et al. 2005). Freshwater populations of Haida Gwaii are mostly low plated (generally six or seven plates), but variation encompasses the complete range with lake populations having means from 0 to 30 lateral plates (T. E. Reimchen, C. A. Bergstrom and P. Nosil 2013).

Beyond the allelic shift seen at the EDA locus during colonization of freshwater, a large number of other parallel genetic changes differentiate marine and freshwater stickleback (Hohenlohe et al. 2010; Jones et al. 2012a,b). These parallel genome-wide changes have been identified using genome scan approaches in a limited range of marine and freshwater populations and therefore their generality across populations is not clear. Recent work on Haida Gwaii stickleback has also identified some genomic regions under divergent selection between adjoining stream and lake populations (Deagle et al. 2012). Somewhat paradoxically some of these stream-lake outlier loci are also highly differentiated in studies considering marine–freshwater divergence. This indicates that certain marine adaptive variants are retained in at least some freshwater populations. By documenting the distribution of these genetic variants, it may be possible to identify commonalities between the marine and freshwater populations which share adaptive genomic regions and narrow the search for candidate genes.

Here, we use a stickleback genome-wide genotyping array with >1000 single nucleotide polymorphism (SNP) markers (Jones et al. 2012a) to document genetic variation in a comprehensive geographic survey of Haida Gwaii stickleback. Most markers on the array were chosen to be evenly distributed across the stickleback genome. Data from these SNPs provide a multilocus view of relationships between populations producing a historical framework in which to examine this remarkable morphological radiation. Within a phylogeographic context, it will be possible to determine whether cases of extreme morphological variation (e.g. lake populations with body gigantism or major loss of body armour) are due to a convergence or common ancestry. These data will also be useful to evaluate potential extended residency of freshwater stickleback in glacial refugia that existed between Haida Gwaii and the mainland (see Reimchen & Byun 2005 for discussion). This geographic region has a complex Pleistocene history and is at the centre of debate surrounding the human coastal migration theory (Josenhans et al. 1997). The phylogeographic picture that emerges will also provide a general model for patterns of postglacial colonization of freshwater habitat by sticklebacks on a local scale and should provide insight into drivers of genetic diversity within freshwater.

In addition to evenly spaced SNPs, candidate SNPs linked to potentially adaptive genomic regions were also genotyped. These SNPs primarily tag genomic regions identified as being differentiated between marine and freshwater populations (e.g. EDA, Na/K ATPase; Jones et al. 2012a) and their inclusion allows us to map their distribution throughout a large number of populations on the archipelago. We address some specific questions regarding these adaptive variants: do all low plated populations share the characteristic freshwater EDA haplotype found in most other surveyed populations? Do the completely plated freshwater populations retain the typical marine EDA haplotype and/or do they retain other marine-like genomic regions? More generally we compare marine–freshwater divergence within our data set with previous genome-wide analyses (Hohenlohe et al. 2010; Jones et al. 2012b). We also further document the distribution of haplotypes identified as outliers between stream-lake populations from Haida Gwaii (Deagle et al. 2012).

Materials and methods

Stickleback samples

From the Haida Gwaii archipelago, a total of 462 stickleback from 115 localities representing 54 watersheds were genotyped (Fig. 1; Table S1, Supporting information). Specimens (n = 5) from the mid-Pacific Ocean (45°31′N, 179°24′W) were also genotyped as archetypal Pacific marine fish. We maximized the number of populations sampled to obtain a broad survey comparable in scope to previous morphological analyses (T. E. Reimchen, C. A. Bergstrom and P. Nosil 2013). Due to the large number of populations considered, only two individuals were genotyped for most locations; however, in 15 populations ≥10 fish were analysed [including eight populations from a previous study on adjacent stream-lake pairs (Deagle et al. 2012)]. The low number of individuals genotyped per site meant that some common population genetic analyses based on population allele frequency estimates were inappropriate. Sample localities covered three major physiographic regions (lowland, plateau and mountain; see Sutherland-Brown & Yorath 1989) and were classified as lake (n = 77), stream (n = 28) or marine/estuarine (n = 10). Morphological variation between sampled populations encompassed the extremes seen within the species. Here, we have highlighted (i) 'unarmoured' populations with extensive loss of bony lateral plates [12 populations with a mean of less than one lateral plate on the left side of fish (T. E. Reimchen, C. A. Bergstrom and P. Nosil 2013), Fig. 1] and (ii) 'giant' populations with the largest recorded body lengths [eight populations identified in (Gambling & Reimchen 2012), Fig. 1]. Collections were made using minnow traps primarily in spring/summer of 2009 and 2010 (samples stored in 95%
ethanol). Additional samples were from collections made in 1993 (see Deagle et al. 1996). For one location (Harelda), stickleback from both 1993 (n = 6) and 2009 (n = 6) were genotyped to confirm samples were comparable.

Fig. 1 Haida Gwaii localities where threespine stickleback were collected. Populations which drain into Masset Inlet and those from watersheds with greater than two collection localities are colour coded to illustrate connections. Symbols identify marine/estuarine sampling sites and morphologically distinct populations (completely plated, unarmoured and giant). An asterisk beside the population name indicates a stream population. [Map of Graham Island and Moresby Island; scale bar 20 km.]

SNP genotyping

Genomic DNA was extracted from muscle tissue and 1536 biallelic SNP loci genotyped using Illumina's BeadArray Technology and GoldenGate assay (Illumina, San Diego, USA) following Jones et al. (2012a). SNPs were originally identified in two marine and three freshwater populations distant from Haida Gwaii (>800 km) and are distributed across all 21 linkage groups, mtDNA and unassembled scaffolds (see Jones et al. 2012a). The SNPs can be classified into three groups: (i) SNPs chosen to be evenly distributed across the genome based on local recombination rate; (ii) SNPs chosen to tag unoriented or unassembled genomic regions; and (iii) candidate SNPs targeting regions differentiated between marine and freshwater populations identified in previous studies (Colosimo et al. 2005; Jones et al. 2012a,b) or potentially linked to traits of interest based on published studies on homologous traits in other diverse organisms. The SNPs genotyped here are the same as those in Deagle et al. (2012); these include those SNPs with good genotyping signals from Jones et al. (2012a) along with additional candidate SNPs. GenomeStudio software (v 2010.2; Illumina, San Diego, USA) was used to visualize intensity signals. Genotypes were initially called automatically, then positions of all intensity clusters were visually inspected and adjusted manually. SNPs with poorly separated loose clusters or exhibiting low signals were excluded from further analysis. SNPs missing >10% of genotype calls and any individuals with >5% missing data were excluded. Repeatability of calls was >99% for individual DNA samples genotyped multiple times. The final data set included 1170 SNPs (773 evenly distributed, 117 genome assembly and 280 candidate SNPs; Table S2, Supporting information) from 467 stickleback (462 Haida Gwaii, five mid-Pacific Ocean).

Population heterozygosity

Population heterozygosity was calculated as the mean of individual observed multilocus heterozygosities based on all evenly distributed SNPs (excluding sex-linked loci; n = 760, hereafter referred to as the evenly spaced SNP data subset). Given the large number of loci, individual heterozygosity estimates are precise (randomly dividing the SNP loci in half and calculating individual heterozygosity for both sets of loci yields a median coefficient of variation (CV) of 5.0%). Variance between individuals within a population was also small (in populations where at least 4 individuals were genotyped the mean CV, within populations, was 8.5%). This suggests estimates of relative population heterozygosities are robust even with SNP data from only a few individuals.

We examined population heterozygosity as a function of habitat type (lake, stream, marine). For lake populations, we also assessed correlations between heterozygosity and three physical parameters (distance from ocean via outlet, elevation and lake area) by fitting linear models. The relative importance of the independent variables (and confidence intervals calculated based on 1000 bootstraps) was determined using the R package relaimpo (Gromping 2006).

Tree-based analysis

Individual-based distance trees were produced with two arbitrarily selected stickleback from each locality and using data from the evenly spaced SNP data subset (760 loci). These trees were constructed in MEGA version 5 (Tamura et al. 2011) using the neighbour-joining (NJ) algorithm based on a pairwise uncorrected P distance matrix (equivalent to allele sharing distance; Gao & Starmer 2007) calculated from an artificial nucleotide sequence created by concatenating each individual's diploid SNP data (missing data coded as N). Substitution of different individual stickleback from the same populations (where more than two individuals were genotyped) had only minor impact on tree branching patterns at nodes with low bootstrap support.

Principal component analysis

Principal component analysis (PCA) is an effective approach for dimension reduction of multivariate data sets and has been widely adopted in the analysis of SNP data sets as an unsupervised method to identify underlying structure (Patterson et al. 2006). For PCA of archipelago-wide genetic structure, we used data from the evenly spaced SNP data subset and carried out separate analyses using two arbitrarily selected individual stickleback per population and using population allele frequencies based on all individuals. PCA requires a data set without missing values so we filled in missing entries (0.7% of individual data, 0.1% of population data) by randomly sampling data for that locus across all localities (separate re-samplings had very minor impact on PCA clustering). We used the function prcomp in R statistical software (v 2.9.0) to perform the PCA (R 2009). A k-means clustering algorithm, also implemented in R, was used to assign individuals/populations to clusters based on the SNP data (10 independent runs were used to confirm stability of clustering).

MtDNA analysis

The Haida Gwaii data set included 10 mtDNA SNPs, including two SNPs known to differentiate the Japan Sea and ENA lineages (cytochrome b gene position 564 and 690 from Orti et al. 1994). We used these SNPs to further map the distribution of the Japan Sea lineage on Haida Gwaii. We also genotyped a larger sample of individuals from two lakes (Serendipity and Harelda) known to contain both mtDNA lineages (O'Reilly et al. 1993) to investigate whether intra-population mtDNA clustering was reflected in nuclear DNA markers, or whether any evidence could be found for selection on the joint mitochondrial-nuclear genotype. Fish from these two lakes were typed with a mtDNA lineage diagnostic restriction enzyme test (see Deagle et al. 1996) prior to genome-wide SNP genotyping ensuring approximately equal numbers of fish from each lineage (Serendipity n = 17: eight ENA, nine Japan Sea and Harelda n = 16: eight ENA, eight Japan Sea).

Adaptive genetic variation

Inclusion of SNPs that are linked to alternate allelic forms of various adaptive loci allows us to map the geographic distribution of these alleles in Haida Gwaii populations. Our broad survey design and resultant limited sample sizes within populations precludes detection of local adaptive variation (i.e. only occurring in one or a few populations). Instead, we focused on allelic variants tagged by multiple SNPs on adaptive haplotypes documented across several populations in previous studies. These include SNPs tagging EDA (chr4: 12.8 Mb) and Na/K ATPase (chr1: 21.7 Mb; candidate gene for salinity tolerance differences); both loci are highly differentiated between marine and freshwater stickleback in several populations (Hohenlohe et al. 2010; Jones et al. 2012a,b). Allelic forms of these regions were each assessed at 6 tightly linked SNPs (EDA, chrIV: 12811933, 12814920, 12815024, 12815271, 12816360, 12831803 and Na/K ATPase, chrI: 21662413, 21672254, 21683350, 21689292, 21694776, 21701627). To examine the association between allelic forms of EDA and lateral plate phenotype, we scored all stickleback for lateral plate number (left side). We also looked for potentially novel genomic regions that are highly differentiated between marine and freshwater localities in our data set. To do this, we calculated the difference in allele frequency between a pool of all freshwater populations (n = 104) vs. a pool of all the marine/estuarine populations (n = 10) and compared the most divergent SNPs (>50% difference) to previously identified outlier regions (Hohenlohe et al. 2010; Jones et al. 2012b). We also carried out PCA of stickleback from all populations using just these divergent, habitat-associated SNPs to identify freshwater populations containing marine-associated alleles, or marine populations containing freshwater-associated alleles.

Finally, we consider SNPs identified by Deagle et al. (2012) as being highly differentiated between multiple Haida Gwaii stream-lake pairs of stickleback. As the same SNP genotyping array was used in both studies, all outliers could be geographically mapped based on the current data set. However, most of these are single SNPs representing large genomic regions and therefore are not ideal cross-population markers for the adaptive variants (i.e. they can become unlinked due to recombination or allelic variation). Here we consider two outlier regions (chr4: 19.8 Mb and chr19: 14.8 Mb; Deagle et al. 2012) identified in stream-lake analysis and each defined here by three SNPs (chr4: 19881291, 19881370 and 19881515 and chr19: 14796728, 14798132, 14799088). Both these regions are also outliers in marine–freshwater comparisons (Hohenlohe et al. 2010; Jones et al. 2012b).

Results

Heterozygosity

Heterozygosity of Haida Gwaii populations for evenly spaced SNPs ranged from 0.002 to 0.343 (mean = 0.206 ± 0.078 SD) and was higher in marine localities compared to streams and lakes (Fig. 2a). The lowest heterozygosity was found in small ponds and headwater creeks; in one creek (Blackwater), the two fish were homozygous for the same allele in 758 of 760 loci. In lake populations, heterozygosity was correlated with three independent variables considered (log values of elevation, lake area and distance from ocean) (Fig. 2b). In a full multiple regression model, there were no significant interaction terms; with interaction terms removed, all variables were significant (overall R² = 0.47). Percentage of variation in heterozygosity explained by each variable in the model (averaged over orderings) was as follows: lake elevation 33.2% (95% CI = 21.5–45.5%); lake area 8.7% (95% CI = 2.2–20.6%); distance from ocean 5.4% (95% CI = 3.7–8.8%).

Fig. 2 Population level heterozygosity of Haida Gwaii localities for evenly spaced single nucleotide polymorphisms. (a) Boxplots (median, range, upper/lower quartiles) showing heterozygosity in populations collected in different habitats. (b) Heterozygosity of lake populations plotted against their elevations; size of plotting symbol is proportional to lake area (fourth root). Inset shows a summary of a multiple regression model with three significant biophysical predictor variables for heterozygosity of lake populations. Overall R² was 0.47, and percent of variation in heterozygosity explained by each variable independently is shown [calculated using a relative importance measure averaged over predictor orderings (Gromping 2006)], confidence intervals based on 1000 bootstraps.

Tree-based analysis of population structure

A distance-based tree constructed using two fish per collection site reveals several levels of genetic structuring in Haida Gwaii populations (Fig. 3). Genetic distances between individuals account for most separation within the tree, although there is considerable variation (i.e. some populations harbour very low levels of genetic diversity; see Heterozygosity section above). Individuals from the same population are almost universally grouped together at terminal nodes. The few cases where fish collected at the same locality do not cluster together occur at marine/estuarine sites, or when individuals from adjacent populations are interspersed (Fig. 3).

The next group of well-supported clusters primarily joins fish collected from common watersheds (Fig. 3). For example, in the Sangan watershed, which contains a great deal of morphological diversity (see Reimchen et al. 1985), fish from Skonun Lake are grouped together, along with adjoining streams and several nearby ponds (99% bootstrap support; Fig. 3; detailed map in Fig. S1, Supporting information). There are several exceptions to the watershed-driven structuring. First, there are many cases where populations from within the same watershed are separated. For example, again within the Sangan watershed, fish from Drizzle Lake, two isolated lakes and stickleback collected near the river mouth fall in separate or weakly supported clusters (Fig. 3; Fig. S1, Supporting information). Second, there are examples in which adjacent freshwater watersheds are joined in the tree. This is most prevalent in populations draining into a large saltwater inlet (Masset Inlet) on Graham Island (Fig. 3; Fig. S1, Supporting information).

The basal region of the tree is characterized by a large number of short branches that are poorly supported by bootstrap values. Despite being poorly supported, these branches deep within the tree still tend to group populations by geographic region. When a condensed tree is generated (50% bootstrap support cut-off value), the basal region of the tree collapses into an unresolved node with 80 independent branches (Fig. S2, Supporting information). Many of these branches contain only fish from one population (most often these are sole representatives for the watershed or marine/estuarine populations).

The 12 populations containing predominantly unarmoured stickleback are distributed across eight genetic clusters which each branch independently from the basal node of the tree (based on condensed tree). This is consistent with armour loss occurring independently in separate watersheds. These populations include unarmoured stickleback in two groups of headwater lakes in adjacent watersheds that are geographically close (<750 m apart) but genetically distant (Juno vs. Blowdown/Nuphar and Serendipity vs. Gosling/Naked; detailed map in Fig. S1, Supporting information). Populations of giant stickleback show a similar pattern of independence in different watersheds, with the eight giant populations coming from seven distinct genetic clusters (based on condensed tree).

Fig. 3 Neighbour-joining distance tree constructed with single nucleotide polymorphism (SNP) data from two stickleback from each of 115 Haida Gwaii localities (n = 227 individuals; three populations with single fish). Colours and symbols follow scheme in Fig. 1. Tree constructed based on a pairwise uncorrected P distance matrix calculated using evenly spaced SNPs (n = 760); missing data were removed in a pairwise manner. Bootstrap values >50% are shown next to branches and values >96% are marked with an asterisk.

PCA of population structure

Principal component analysis partitioned genetic variation into three broadly congruent clusters regardless of whether individual genotypes or population level allele frequencies are considered (population level data presented here). Based on population allele frequencies, the first two PCs account for 8.2% and 5.5% of the variation respectively (Fig. 4; see Fig. S3 for population labels, Supporting information). With membership defined using k-means clustering (k = 3), the first cluster contains marine localities as well as freshwater populations from the entire western and southern regions of the archipelago. The second cluster is limited to localities on the north-east tip of Graham Island, including watersheds that flow both north and east into the ocean. The final cluster consists of a large number of Graham Island populations distributed from the Sangan watershed in the north through the central area and to the east coast (Fig. 4). Further PCs, and k-means clustering with higher values of k, tend to cluster small groups of populations from a single watershed. For example, PC3 (and k-means clustering k = 4) separates out three closely linked lakes in the headwaters of the Oeanda River (Parkes, Middle, Richter).

To investigate the number of SNPs that are driving separation of the three PCA clusters, we considered small groups of SNPs categorized according to their informativeness (relative weightings on eigenvector); by examining these SNPs, in turn, we evaluated how quickly the ability to approximate the observed clustering declined (see Fig. S3, Supporting information). PC1 and PC2 from the overall data set were highly correlated with their top ten SNPs indicating that the overall clustering observed can be obtained with 20 SNPs. However, it is not only these SNPs driving the clustering. Reasonable approximations could be obtained (r² > 0.5) for any of the 10 SNP data subsets created from the 200 top-weighted SNPs on either of the first two principal components. This analysis suggests that a broad range of SNPs rather than a few selected SNPs are driving the observed PCA clustering.

Fig. 4 Principal component analysis reveals clustering of geographic regions based on population single nucleotide polymorphism (SNP) allele frequency data (evenly spaced SNPs; n = 760). Each point represents a sampling location, colours follow scheme in Fig. 1 with the mid-Pacific sample labelled white. [Axes: PC1 (8.2%) and PC2 (5.5%). Cluster labels: Central/East Coast Graham Island; North-eastern Graham Island; Marine/Estuarine, West Coast Graham Island, Moresby Island.]

Comparison with previous mtDNA studies of population structure

Analysis of mtDNA SNPs identified 39 fish from 12 populations containing the Japan Sea mtDNA (Table S1, Supporting information). These included 10 populations where the lineage had previously been identified (O'Reilly et al. 1993; Deagle et al.

Geography of adaptive genomic regions

Changes in lateral plate phenotype, which generally occur following colonization of freshwater habitats, are almost universally attributed to two allelic forms of EDA (alleles: C = complete, L = low). All low plated fish we genotyped were homozygous for the L allele (as assessed at six tightly linked SNPs; Fig. 5a). All fish with complete or partial lateral plate phenotypes had at least one C allele, and this includes stickleback from marine/estuarine localities with fish exhibiting a range of plate phenotypes. It also includes completely and partially plated fish from several freshwater lakes (Fig. 5a). Of the four lake populations containing completely or partially plated fish, two (Darwin and Hidden) could potentially have had recent gene flow with the marine environment (heterozygosity = 0.291 and 0.208 respectively). Two other lake populations (Stiu and Lower Victoria) are isolated from the ocean by high-gradient streams and have correspondingly lower heterozygosity (0.151 and 0.056 respectively). Despite very low overall heterozygosity, Lower Victoria contains both the C and L allelic forms of the EDA gene.

The global marine–freshwater outlier genomic region near the Na/K ATPase gene (defined by six SNPs in the current analysis) was also highly divergent between these habitats in our data set (Fig. 5a). The marine/estuarine populations contained many fish heterozygous for the alternate forms (consistent with variable plate morphology and the EDA locus pattern) and the freshwater haplotype was generally fixed in lakes and streams (Fig. 5a). Exceptions in which marine SNPs are retained in freshwater primarily occur in completely plated freshwater populations (Fig. 5a).

To examine general patterns of SNP divergences between marine and freshwater on the archipelago, we identified SNPs that diverged most in frequency between these habitats (86 SNPs from 28 genomic regions; for details see Fig. S5, Supporting information). Most of these divergent SNPs were in genomic regions
1996) and two west coast characterized as outliers in previously marine–freshwater populations not included in prior studies (Menyanthes comparisons (23 of 28 regions were also outliers in and Stiu). The distribution of the Japan Sea mtDNA did either Hohenlohe et al. 2010; Jones et al. 2012b). A PCA not correspond to overall clustering of the SNP data based on these divergent SNPs (excluding SNPs linked (i.e.this mtDNA lineage wasdistributed throughout the to EDA to reduce the direct influence of plate morphol- tree and in different PCA clusters). In the two mixed ogy) produces a gradient of marine-like to freshwater- mtDNA populations, from which larger numbers of like fish along PC1 (Fig. 5b). The extremes of PC1 samplesweregenotyped(SerendipityandHarelda),fish (explaining 57% of the variance) are represented by did not cluster according to mtDNA lineage in NJ trees strong negative scores for the mid-Pacific samples and constructedbasedonevenlyspacednuclearSNPs(Fig.S4, marine waters of Haida Gwaii (Dawson Marine) and Supporting information). No strong associations were strong positive loadings for most freshwater individuals found between any nuclear SNP and mtDNA lineage, (Fig 5b). Several marine/estuarine collected individuals so we have no evidence of co-adapted mitochondrial- clustered with freshwater fish, indicating they were nuclear genes. freshwater residents. Other fish from these sites were ©2013BlackwellPublishingLtd 1926 B. E. DEAGLE ET AL. (a) (b) Fig.5 Populationdistributionof singlenucleotide polymorphisms(SNPs)linked toalternateallelicformsof candidateadaptiveloci (a) Boxplots showingfrequencyof four candidate loci in low-plated freshwater, completelyplated freshwater and marine/estuarine populations.Allfourgenomicregionshavepreviouslybeenshowntobedivergentbetweenmarineandfreshwaterlocationsinglo- bal comparisons. Chr4:19.8 and Chr19:14.8Mb were also previously identified as outliers between some Haida Gwaii parapatric stream-lake populations. 
(b) Principal component analysis plot showing positioning of individual stickleback based on subset of SNPs most divergent in allele frequency between marine and freshwater localities (n=78; excluding SNPs linked to EDA). Stickle- backfrommarine/estuarinecollectionsites areshownasdiamonds,completelyplated individuals arecircled.See Fig.S5 (Support- inginformation)forlistofSNPsandplotwithpopulationslabelled. intermediate, suggesting varying levels of admixture. lake habitats in the archipelago-wide data set. The chr4: Freshwater localities containing completely plated fish 19.8 Mb ‘lake haplotype’ (defined by 3 SNPs) was gener- generally cluster closer to marine fish, indicating that ally at a low frequency in freshwater populations; how- these populations tended to retain a suite of marine-like ever, it was prevalent in the marine/estuarine samples alleles across the genome in addition to the SNPs at (Fig. 5a).Somelakes wherethis haplotype wascommon EDA and Na/K ATPase (for details see Fig. S5, Support- contained giant stickleback (e.g. Laurel, Coates, Awun) ing information). However, this retained association a trait shared with lakes in which the outliers were with EDA C alleles is not universal. One notable exam- originally described, although not all giant populations ple is Poque lake, a lake population that is low plated sharedthishaplotype.Thech19:4.8 Mbshowsasimilar (both fish homozygous for EDA L), but otherwise con- pattern (i.e. the lake haplotype is common in marine/ tainedmanyallelesusuallyfound inmarinestickleback. estuarine sites and less common in freshwater), but the The chr11 5.7 Mb haplotype, a previously identified lake haplotype is present at an intermediate frequency inversion differing in orientation between marine and in many freshwater populations (Fig. 5a). The retention freshwater ecotypes (Jones et al. 
2012b), is almost invariant of these marine-like haplotypes in relatively few lakes infreshwater,withPoquelakeandmostcompletelyplated explains how these loci can be outliers in different freshwater populations (e.g. Stiu and Hidden) matching studies comparing stream-lake and marine–freshwater theFWgenotypes(Fig.S5,Supportinginformation). populations. The two previously identified stream-lake outlier regions (chr4: 19.8 Mb and ch19: 4.8 Mb, Deagle et al. Discussion 2012) that are also outliers in marine–freshwater comparisons (Hohenlohe et al. 2010; Jones et al. 2012b) Our survey of genetic variation in Haida Gwaii stickle- were not strongly differentiated between stream and back using a genome-wide SNP array helps refine ©2013BlackwellPublishingLtd
