Guoetal.BMCEvolutionaryBiology2012,12:2 http://www.biomedcentral.com/1471-2148/12/2 RESEARCH ARTICLE Open Access Nuclear and plastid haplotypes suggest rapid diploid and polyploid speciation in the N Achillea millefolium Hemisphere complex (Asteraceae) Yan-Ping Guo1*, Shuai-Zhen Wang1, Claus Vogl2 and Friedrich Ehrendorfer3 Abstract Background: Species complexes or aggregates consist of a set of closely related species often of different ploidy levels, whose relationships are difficult to reconstruct. The N Hemisphere Achillea millefolium aggregate exhibits complex morphological and genetic variation and a broad ecological amplitude. To understand its evolutionary history, we study sequence variation at two nuclear genes and three plastid loci across the natural distribution of this species complex and compare the patterns of such variations to the species tree inferred earlier from AFLP data. Results: Among the diploid species of A. millefolium agg., gene trees of the two nuclear loci, ncpGS and SBP, and the combined plastid fragments are incongruent with each other and with the AFLP tree likely due to incomplete lineage sorting or secondary introgression. In spite of the large distributional range, no isolation by distance is found. Furthermore, there is evidence for intragenic recombination in the ncpGS gene. An analysis using a probabilistic model for population demographic history indicates large ancestral effective population sizes and short intervals between speciation events. Such a scenario explains the incongruence of the gene trees and species tree we observe. The relationships are particularly complex in the polyploid members of A. millefolium agg. Conclusions: The present study indicates that the diploid members of A. millefolium agg. share a large part of their molecular genetic variation. The findings of little lineage sorting and lack of isolation by distance is likely due to short intervals between speciation events and close proximity of ancestral populations. While previous AFLP data provide species trees congruent with earlier morphological classification and phylogeographic considerations, the present sequence data are not suited to recover the relationships of diploid species in A. millefolium agg. For the polyploid taxa many hybrid links and introgression from the diploids are suggested. Background Hemisphere common yarrow taxa form such a complex, Speciescomplexesoraggregatesconsistofasetofclosely i.e., the Achillea millefolium aggregate. Centered in SE relatedspeciesoftenofdifferentploidylevels,whoserela- Europe and SW to C Asia, its diploid species are limited tionships are difficult to reconstruct. Such species com- toEurasia,whereasthepolyploidshavespreadthroughout plexes are common in angiosperms [1-5]. Rapid genetic, the N Hemisphere [10,11]. In N America, the 4x and 6x phenotypic and ecological differentiation on one hand, cytotypes form a complex of ecological races adapted to andhybridization/polyploidyontheother,playimportant manydifferent nicheswith marked genotypicdiversifica- rolesin theirevolutionary bursts[6-9]. Thetemperate N tion [6,12,13]. By cultivation in experimental gardens, Clausenetal.[6]documentedlocaladaptationofA.mille- folium populations to environments along an altitudinal *Correspondence:[email protected] 1MinistryofEducationKeyLaboratoryforBiodiversityScienceandEcological transectinCaliforniafromsealeveltoalpineregions.This Engineering,andCollegeofLifeSciences,BeijingNormalUniversity,Beijing hasbecomeaclassicexampleofrapidadaptivedivergence 100875,China ofplantpopulations[12-18]. Fulllistofauthorinformationisavailableattheendofthearticle ©2012Guoetal;licenseeBioMedCentralLtd.ThisisanOpenAccessarticledistributedunderthetermsoftheCreativeCommons AttributionLicense(http://creativecommons.org/licenses/by/2.0),whichpermitsunrestricteduse,distribution,andreproductionin anymedium,providedtheoriginalworkisproperlycited. Guoetal.BMCEvolutionaryBiology2012,12:2 Page2of18 http://www.biomedcentral.com/1471-2148/12/2 Our earlier AFLP data have characterized the Achillea individuals from each population were analyzed. Broadly millefoliumcomplexasaclade[10].Theinferredspecies the same individuals were sequenced for the two nuclear relationshipsofthediploidmembersconformedtoabin- loci and three cpDNA fragments; minor exceptions were ary bifurcating tree and generally agreed withtraditional due to repeated failures in sequencing a certain locus species delimitations and taxonomic arrangements. The from a certain individual (see Table 1). Three diploid polyploid members appeared to be polytomic and poly- species, taxonomically outside the A. millefolium aggre- phyletic although geographic patterns can be recognized gate but included in previous AFLP analyses [10], were [10,11]. Available data show that frequent exchange of also sampled for this study. They are the W-Eurasian genetic materials have been involved in the origins of A. nobilis-2x, the C-Mediterranean A. ligustica-2x and many polyploidy taxa. Meanwhile, it seems also to have the East Asian A. acuminata-2x (Table 1). In addition, occurred during the divergence of the diploid species sequences of two cpDNA regions, trnH-psbA and trnC- [10,11,19]. So far,we are stilluncertain about the demo- ycf6, of 43 North American populations available from graphichistoryofA.millefoliumagg.andtheprogenitors the NCBI data base (GenBank accessions EU128982- ofmanypolyploidtaxa. EU129456) [12] are incorporated into our plastid haplo- Due totheirdominantnature, AFLP markersare diffi- type network analysis. culttouseforinferringgeneticparametersofpopulations, To check ploidy levels of the populations studied, two particularly ancestral population sizes, split times, and methods, either chromosome counting or DNA ploidy migrationrates.Inprinciple,thiscanbeaccomplishedbet- leveldetermination,wereappliedusingyoungflowerbuds ter with DNA sequences either from organelles or from or fresh or silica gel-dried leaves, respectively. Young thenucleargenome[20,21].Yet,we stillmeetchallenges flower buds were collected in the field and fixed in Car- inpractice:PlastidDNAvariationisoftentoolowtoinfer noy’sfluid(ethanol:aceticacid=3:1).Tocountthechro- relevant gene trees with confidence. In addition, despite mosome number, fixed flower buds were stained and thelowereffectivepopulationsizeofplastids,plastidgene squashed in 4% acetocarmine and observed under the trees may still not reflect the species tree due to incom- microscope.DNAploidylevelswereinvestigatedwithpro- plete lineage sorting. With nuclear genes, frequent birth pidium iodide flow cytometry [28,29] from the prepared anddeathofgenecopies,lesslineagesortingduetohigher leaves. Information for ploidy levels obtained with the effective populationsize,secondary introgressionamong above two methods are marked in Table 1 by c and f, split species as well as intragenic recombination tend to respectively,whilethoseinferredfrompreviousstudies[12 hamper interpretation of the patterns of polymorphism andEhrendorferetal.,unpubl.Data]aremarkedbyi. [22-25]. Voucherspecimensaredepositedintheherbariaofthe Here, we survey DNA sequence variation sampled at Institute of Botany (WU) and the Department of Phar- twonuclearlociandthreeplastidfragmentsfrompopula- macognosy (HBPh), both at the University of Vienna, tions across diploid, tetraploid, and hexaploid species Austria,andoftheCollegeofLifeSciences,BeijingNor- throughoutthenaturaldistributionrangeofA.millefolium malUniversity(BNU). agg.andfromthreediploidcongenericspeciesoutsidethe aggregate.Onthediploidlevel,weinferthedemographic Data sampling history among species and populations using the newly Total genomic DNA was extracted from ca. 0.02 g silica generatedDNAsequencedataincomparisontotherela- geldesiccatedleafmaterialsfollowingthe2×CTABpro- tionships inferred from the previous AFLP data. To this tocol [30] with slight modifications: Before the normal end,wealsoapplyaprobabilisticmodel(IMa2)[26,27]to extraction process, sorbitol washing buffer was used to threewidespreaddiploidspecies to shedlighton thekey removepolysaccharidesintheleafmaterials(add 800μL parameters, i.e.,ancestral effective populationsizes,time sorbitolbuffertothegroundleafpowder®incubate the of speciation, and rates of gene flow. For the polyploid sample in ice for 10 min®centrifuge at10,000 g for 10 populations,wetrytountangletheirpolytomicandpoly- min at 4°C and then follow the established 2 × CTAB phyleticrelationshipswhichareprobablycomplicatedby protocol). geneflowonthesameandbetweendifferentploidylevels Two nuclear genes were sampled and partially usingtheco-dominantsingle-genehaplotypedata. sequenced for this study. They are the chloroplast- expressed Glutamine Synthase gene (ncpGS) and the Methods Sedoheptulose-Bisphosphatase (SBP) gene. The ncpGS Plant sampling gene has been used in many plant phylogenetic studies We sampled thirty populations of seven diploid, seven andshowntobesingle-copy inallangiospermspeciesso tetraploid and four hexaploid taxa or cytotypes of the far studied [31] and especially in Achillea [19]. We Achillea millefolium aggregate throughout the temperate sequencedpartofitscodingandnoncodingregionsfrom N Hemisphere (Table 1). On average, two to three exon 7 through to exon 11. The SBP gene has been Table1Taxa, populations and DNAloci sampled hG ttpuo Taxa Taabtaixboonrensvi- Pcoodpe. Ploidy Geographiclocalityofpopulations Canodlledcatoterss Ngnueunmcelebsae:rrof cinpdDivN.Aa:nnaulymzebderof ://wwwetal.B indiv./ .bM clones iomCE analyzed edcvolu Achilleamillefoliumagg. ncpGS SBP entio A.asiaticaSerg.s.lat.(=A. asi-2x NM 2xf China:DaqingMt.,41°04’52”N,112°35’56”E;2010m GR, 3/8 3/9 4 trana sergievskianaShaulo&Shmakov) ARX 2xf China:ArxanMt.,47°17’39”N,120°27’11”E;1130m 2Y0G0,6.08.25 3/14 3/ 3 l.com/1ryBiolo 2007.10.08 14 47gy SHB 2xf China:Hebei,42°26’N,117°15’E;1500m YG, - - 2 1-220 11 2007.07.27 42 8, AL1 2xc Russia:Altai,51°02’52”N,85°36’47”E;1100m M20S0,2.07.30 3/9 311/ 3 /12/212:2 A.asiaticaSerg.s.str. asi-4x AL9563 4xc Russia:Altai,49°32’66”N,88°13’35”E;2350m AT, 3/13 1/5 3 2003.08.02 AL3 4xc Russia:SSiberianlowlandnearNovosibirsk;220m MS, 3/13 3/ 2 2002.08.16 15 UT 4xc Uzbekistan:Tashkent,TschimganMt. HG,2002.11 - 3/ 3 15 A.asplenifoliaVent. asp BZ 2xc Austria:Burgenland,ZitzmannsdorferWiesen Tod, 3/9 2/8 2 2001.01.10 NS2 2xc Austria:Burgenland,Rust,nearlake“NeusiedlerSee” FE,JS,YG, 3/9 3/ 3 2003.05.27 15 Ta 2xf CzechRepublic:SouthMoravia,Terezin FE&LE, 3/11 2/ 2 2002.07.12 10 A.borealisBong.s.lat.(A.lanulosa bor-alp US2 4xi USA:Washington,Mt.RainierNationalPark;1350-2070m PS&AT, 2/12 2/ 2 Nutt.var.alpicolaRydb.) 2002.08.18 10 A.borealisBong.s.lat.(A.lanulosa bor-lan US5 4xi USA:Connecticut,HopevilleStatePark JE, 2/12 2/ 2 Nutt.var.?) 2004.07.17 10 A.borealisBong.s.lat.(A.lanulosa bor-lan US6 4xi USA:Utah,ascentfromSnowbirdAltatoLakeSecret KT, 3/15 3/ 3 Nutt.var.lanulosa) 2004.07.31 18 A.ceretanicaSennens.str. cer-2x 10240 2xc France:EPyrenees JS,2001 2/10 2/9 2 A.ceretanicaSennens.lat. cer-4x 10222 4xc France:MassifCentral JS,2001 2/10 2/9 2 A.cuspidataWall. cus ID 2xi Kashmir:34°25.80’N,75°44.80’E;3200m LK, 1/4 1/5 1 2004.09.14 A.distansWaldst.&Kit.exWilld. dis DIKF 6xi Austria,Kaltenleutgeben,Flösslberge JS,s.n. 2/16 2/ 2 13 A.inundataKondr. inu K13 4xi Ukraine:Kiev,SofDesnamouthintoDnjepr;100m FE&YG, 2/9 2/9 2 2003.07.28 A.latilobaLedeb.exNordm. lat Geo 2xi Georgia:Adjara,41°29’55″N,42°31’46″E;2006m DK, 3/12 1/4 3 Pa g 2004.07.18 e 3 A.millefoliumL.s.lat.(A.apiculata mil-api Ra-c 6xi Russia:Karelia,Louhskiregion,(a)KandalakshaNaturalReserve;(b)Kivbay, OA,2003.08 5/32 5/ 5 o N.I.Orlova) nearMedvezhijpeninsula;(c)nearcapeIvanovNavolok 37 f1 8 Table1Taxa, populations and DNAloci sampled (Continued) hG ttpuo A.millefoliumL.s.lat.(A.sudetica mil-sud STms 6xi Austria:Salzburg,HoheTauern;ca.2300m PS, 2/10 2/ 2 ://wet OA.prioz)seoalbaEhrend. ros-2x/ros-4x Si3 2x Slovenia:Ljubljana 2FE0,02.08.31 2/8 115/5 2 ww.bal.BM +4xc 2002.07.31 ioC mE Si6 2+x4xc Slovenia:Ljubljana,Podpec FE,2002.7.31 2/8 1/5 2 edcenvolutio V 2x Italy:Udine,Kanalta,MalborghettoValbruna JS,2002.07 3/15 2/ 3 trana +4xc 10 l.cry A.schmakoviiKupr. sch AL5 6xc Russia:Altai,51°02’52’’N,85°36’47’’E;1700m MS, 2/9 2/ 2 omBio 2002.07.30 15 /14log 7y A.setaceaWaldst.&Kit. set GR 2xi Greece:Thessaloniki,drainfromlakeLimniKoronia FE,2001 1/3 1/5 1 1-220 K4 2xc Ukraine:Kiev,BaldMt.,LisaGora FE&YG 3/7 2/ 2 1412 8, NS1 2xc Austria:Burgenland,EofSt.Margarethen;ca200m 2FE0,0J3S.0,7Y.2G2, 2/8 120/7 2 /12/212:2 2003.05.27 SeAA 2xc Turkey:Anatolia,Aksaray FE, 3/10 2/ 2 2002.03.26 10 A.styriacaJ.Saukel&J.Danihelka, sty StE 4xi Austria:Styria,Einach,Wald JS,s.n. 1/5 1/4 1 ined. SpeciesoutsideA.millefoliumagg. A.acuminata(Ledeb.)SchultzBip. acu CB1 2xf China:Jilin,ChangbaiMt.,HancongValley,680-620m YG&GR, 3/8 3/8 2002.07.24 ARX2 2xc China:InnerMongolia,Arxan,N47°17’39.5”,E120°27’09.9";865m YG, 3/12 2/ 2007.10.08 11 TB5 2xc China:Shanxi,TaibaiMt.,N34°01’17”,E107°18’21";1700m. YG, 3/9 3/7 2006.09.09 A.ligusticaVis.exNym. lig SN 2xi Italy:Sicily,NebrodiMts. FE, 1/3 1/5 1 2001.09.21 A.nobilisL. nob ZN 2xi CzechRep.:Znoimo LE&FE, 3/24 1/5 2002.07.13 cploidylevelcheckedbychromosomecounting;fploidylevelcheckedbyflowcytometry;iploidylevelinferredfromFE’spreviousstudiesorfromliterature. Namesofcollectors:AT=A.Tribsch;DK=D.Kharazishvili;FE=F.Ehrendorfer;GR=G.-Y.Rao;HG=H.Greger;JE=J.Ehrendorfer;JS=J.Saukel;KT=K.Tremetsberger;LE=L.Ehrendorfer-Schratt;LK=L.Klimes; MS=M.Staudinger;OA=O.Alexandrova;PS=P.Schönswetter;YG=Y.-P.Guo P a g e 4 o f 1 8 Guoetal.BMCEvolutionaryBiology2012,12:2 Page5of18 http://www.biomedcentral.com/1471-2148/12/2 studied in representative taxa of the family Asteraceae were eliminated to avoid influence of PCR-mediated [32]. It was shown to be single-copy by our preliminary recombination [19,35,36]. The final numbers of indivi- analysesinseveraldiploidspeciesofAchillea.Readersare duals/clones analyzed at each locus for each population referredto Maetal.[19] forprimersused foramplifying arelistedinTable1.Allthesequencesanalyzedweresub- the ncpGS locus and to Chapman et al. [32] for the SBP mitted to the NCBI GenBank under accession numbers locus(itsexon5throughto7,i.e.thelocusB12inChap- HQ601971-HQ602593 (the nuclear ncpGS and SBP manetal.2007). genes)andHQ450864-HQ451071(theplastidloci). ThreenoncodingchloroplastDNAregions,trnH-psbA, Theallelicdatasetsofthetwonucleargenes,ncpGSand trnC-ycf6 (including partial ycf6-psbM) andrpL16 were SBP,wereanalyzedseparately,whereassequencesofthree sequenced.PCRreactionswereconductedwithuniversal cpDNAfragmentswerecombinedasonelocus. primerpairs[33]. Gaps in the nuclear data sets were treated as missing Theamplificationwascarriedoutinavolumeof20μL data, whereas each indel position (no matter how many withfinalconcentrationof1×PCRbuffer,0.05UexTaq nucleotide sites it contained) of the plastid data set was (TaKaRa,Shiga,Japan)orHiFi(TransTaqDNApolymer- codedasabinarycharacter(0/1=A/C)usingtheprogram ase High Fidelity, TransGen Biotech), 200 μM of each GapCoder[37]. dNTP,1%DMSO,0.5μMof eachprimer,andwith 1μL AsA.millefoliumagg.consistsofspecieswithshortevo- templateDNAandddH Oaddedtothefinalvolume.The lutionaryhistory[10,11],NeighborJoining(NJ),Maximum 2 amplification was conducted on a Peltier thermocycler Parsimony (MP) and Median-Joining network were (Bio-RAD)initiatedwith5minofpre-denaturingat94°C applied to the present data. For the nuclear sequences, followed by 30 cycles of 1 min at 94°C, 30s at 48-55°C, Neighbour Joining and Parsimony analyses were per- and 1.5 minat 72°C. A final extensionwas then taken at formedwithMEGA5.05andPAUP*4.0b10a,respectively. 72°Cfor15minfollowedbyaholdat4°C.ThePCRpro- All nucleotide substitutions were equally weighted.Gaps ductswereelectrophoresedonandexcisedfromthe1.0% weretreatedasmissingdata.Wefirstanalyzeddataofthe agarosegelinTAEbuffer.Theywerethenpurifiedusinga diploidspeciestoshowdiversificationofthegenelineages, DNAPurificationkit(TianGenBiotechorTransGenBio- andthenofallthetaxatoinvestigaterelationshipsamong tech, Beijing, China). The purified PCR products were thepolyploidsanddiploidswithinA.millefoliumagg.The either used for direct sequencing (for the cpDNA frag- NJanalysiswasconductedwithKimura’s2-parameterdis- ments) or ligated into a pGEM-T Vector (for nuclear tances[38]andbootstrappedwith1000replicates.Forthe genes) withaPromegaKit(Promega Corporation,Madi- MP method, heuristic searches were performed using son,USA).Forsequencingthenucleargenes,aboutfiveto 1000 random taxon addition replicates with ACCTRAN eight positive clonesfromeach diploidandtentofifteen optimization and TBR branch swapping. Up to 10 trees fromeachpolyploidindividualwererandomlyselectedfor with scores larger than 10 were saved per replicate. The sequencing. The plasmid wasextracted with an Axyprep stabilityofinternalnodesoftheMPtreewasassessedby Kit (Axygene Biotechnology, Hangzhou, China). Cycle bootstrapping with 1000 replicates (MulTrees option in Sequencing wasconducted using ABIPRISM® BigDye™ effect, TBR branch swapping and simple sequence Terminator.Thesameprimersusedforamplification(for addition). cpDNA fragment) or the vector primers T7/Sp6 (for Median-Joining networkanalysis implemented inNet- nucleargenes)wereappliedhere.Thesequencedproducts workver.4.5.1.6availableathttp://www.fluxus-engineer- were run on an ABI PRISM™ 3700 DNASequencer (PE ing.com/sharenet.htm[39]wasappliedtothecpDNAdata AppliedBiosystems). set.Allvariablesiteswereequallyweightedandthehomo- plasylevel parameter (ε)wassettozerogiven thatvaria- Data analyses tionratesofthecloselyrelatedspeciesislow,especiallyin Sequences were assembled with the ContigExpress pro- theirplastidDNA. gram(InformaxInc.2000, NorthBethesda,MD),aligned To understandthe populationdemography at the time with ClustalX 1.81, and then manually improved with ofspeciationofthediploidspecies ofA.millefoliumagg., BioEdit version 7.0.1. To prevent possible sequencing weappliedaprobabilisticmodel,theIsolationwithMigra- errors,singlemutationsinthenucleargenedatasetslikely tionModelformultiplepopulationsimplementedinIMa2 generated by the cloning sequencing method were [27], to three widespread and closely related species A. excluded from the analyses. Furthermore, unique asplenifolia-2x and A. setacea-2x and A. asiatica-2x. sequencesinthe nuclear gene data matrix, which donot Thesespeciesarehereregardedasthreedivergedpopula- fall into any majority-rule consensus sequence group tions which share nuclear sequence variation. Shared [19,34]orshowinconstantbranchpositionsintreesbased allelescouldreflectancestralpolymorphismorgeneflow ondifferentsubsetsofdata,i.e.,withpartialcharactersor after separation of the populations or species. Assuming randomly selected sequences, during the initial analyses neutrality,retentionofancestralpolymorphismsislikelyif Guoetal.BMCEvolutionaryBiology2012,12:2 Page6of18 http://www.biomedcentral.com/1471-2148/12/2 speciationisfastrelativetodrift,whichisinverseininten- rates(2NM=4Nu ×m/2).The priorswere finally set as sity to the effective population size. As a rule of thumb, follows: the upper bound of population sizes q = 100, speciesarewellseparatedwithlittleancestralpolymorph- splittingtimest=5andmigrationratesm=2.0,respec- ism andthusalmostcompletelineagesorting,ifthetime tively. We ran the Markov-chain Monte Carlo (MCMC) ofseparationisatleastaslongasfourtimestheeffective simulationswith1,000,000burn-instepsand20,000gen- populationsize[40].Secondarygeneticexchangebetween ealogies sampled per locus. The analysis was done with the diverging species can also lead to shared alleles 10independentrunsintheMmode,eachusingidentical observed [41]. The multipopulation model IMa2 allows priors and 20 Metropolis-Coupled chains with different bothancestralpolymorphismandgeneflowsubsequentto random number seeds. The genealogies sampled from divergence. It assumes a known history of the sampled the M mode runs were combined in an L mode run to populations, whichcanbe representedbyarootedbifur- buildanestimateofthejointposteriorprobabilityofthe cating tree. In earlier analysis using AFLP data [10], we parameters[26,27]. inferredtherootedspeciestreeas:((A.asiatica,A.aspleni- folia), A. setacea). We note that Ima2 provides posterior Results distributions of parameters, such that the confidence in Nuclear gene trees with allelic haplotypesequences the inference of each parameter can be obtained from Amplification of the partial ncpGS and SBP genes pro- observingthespreadoftheposteriordistribution.TheIM duced a single clear band for each amplification. This model also assumes neutral genetic variation, freely and the results from earlier work (see “Methods”) sug- recombining unlinked loci andno intragenic recombina- gest sequences of the two nuclear loci obtained here tionorgeneconversion[42].Sequencesofthetwonuclear each as belonging to a set of orthologs. loci, the ncpGS and the SBP genes, and of three plastid AftereliminatingsomesequenceslikelycontainingPCR- fragments were used for this analysis. The polymorphic recombination (about 10% of the total), 303 sequences sitesofthesequencednuclearandplastidlociaremostly (clones) of the ncpGS and 313 of the SBP gene from of introns or intergenic spacers and thus should fit the broadlythesame70individualsof29populationsbelong- neutral variationmodel. Usingthe four-gamete criterion ingtoA.millefoliumagg.wereusedforthedata analyses [43], we do not find intragenic recombination in the (Table 1). In addition, 56 ncpGS and 36 SBP sequences nuclearsequencesamongthesethreespecies.Thedataof fromfive populationsofthreecongenericspeciesoutside the three plastid fragments were combined because the the A. millefolium agg., A. nobilis-2x, A. ligustica-2xand chloroplastgenesaregenerallylinkedandnoevidenceof A. acuminata-2x, were also analyzed here. The ncpGS recombinationbetweenthethreeregionsisfound. alignmentcontains918nucleotidepositionswithsequence To run IMa2, one random haplotype per plant indivi- length varying from 795 to 861 bps. The SBP alignment dual was chosen for the nuclear gene data sets, and the contains 420 nucleotide positions with sequence length plastid data set was composed of sequences from the varyingfrom386to405bps. same plant individuals. This avoids bias but decreases Prior to the analyses of all the diploid and polyploid the amount of information and thus leads to broader taxa, we present the gene trees at the diploid level first posterior distributions. The IS (Infinite Sites) model [44] (Figures1&2). of sequence evolution was chosen for the plastid locus, The diploid-only ncpGS data come fromseven species whereas, the HKY model [45] which allows for multiple withinandthreeoutsideA.millefoliumagg..Theycontain substitutions was selected for the two nuclear loci 31 haplotypes with 134 substitution sites from 186 because double mutations were found for a few poly- sequences (Additional file 1: S-Figure 1). Out of the 134 morphic sites at both loci. The inheritance scalar was polymorphicsites,122areinintrons.Intragenicrecombi- set to 1.0 for the nuclear and to 0.25 for the plastid loci, nationamongsomesamplesislikely:Adiscordanceinthe respectively. alignmentcanberesolvedbypostulatingarecombination To set upper bounds on the prior distributions of the in three of the four ncpGS haplotypes of A. nobilis-2x parameters, we estimated for each of the three species aroundthe89thpolymorphicsite(Additionalfile1:S-Fig- the geometric means of the population mutation rate ure 1). Another discordance can be resolved in the two 4Nu across all three loci using Watterson’s estimator θ haplotypesofA.cuspidata-2x(IIIinFigure1A)bypostu- (per sequence not per site). The largest mean value was lating a recombination around the 26th polymorphic site foundwithA.asiatica-2x(anestimateof4Nu=9.8205), betweenhaplotypegroupsofA.millefoliumagg.andofA. and this was used to set the upper bound on uniform ligustica-2x(Additionalfile1:S-Figure1). priorforeachofthethreepopulationdemographicpara- Figure1AshowsanunrootedNeighbourJoiningphylo- meters:populationsize(θ=4Nu),splittingtime(t=Tu, gram based on the diploid-only ncpGS data. The topol- where T is the time in generations since the common ogy of the MP tree on the same data set is broadly ancestry,anditisofthesameorderof4N)andmigration comparable,andthusonlybootstrapvaluesfromtheMP Guoetal.BMCEvolutionaryBiology2012,12:2 Page7of18 http://www.biomedcentral.com/1471-2148/12/2 B A.asplenifolia A . r o s e o alb a A . m A . c e r e tanica ille fo liu A m A.asiatica agg 3029 31 . A.acuminata 29=acu (ARX2-3/12; CB1-3/8) 30=acu (TB5-1/2) 31=acu (TB5-3/7) A.setacea A . n o b i l is A .crithmifolia A . c l y p e olata A . l i g u s t ica A . m o s c hata 100/100 A . o x y l o ba A.acuminata 10=nob (ZN-2/9) 1 3 = a s i ( A L 1 - 2 / 7 ) A.wilhelmsii 11=nob (Zn-1/1) 1 4 = a s i ( A R X - 2 / 8 ) 12=nob (Zn-3/9) 15=asp (NS2-1/3) 12 16=asp (BZ-1/1; Ta-1/1); A.nobilis 10 cer (10240-2/10) 21 A.cusIpIiIdata 11 17=asp (BZ-1/2) 20 18=asi (ARX-1/1) 19=asi (NM-3/5) 2201==ccuuss ((IIDD--11//25)) 100/100 100 /100 15 100/100 1 6 1 9 99/99 1 8 17 100/100 56/80 93/99 A.ceretanica 70/85 A . a s p l e nifolia 23 A.ligustica 0.01 72/74 97/99 14 A . a s i a tica 22 substitutions/site 13 IIb 22=lig (SN-1/1) 88/90 23=lig (SN-1/2) 8 6 72/95 7 31 IIa 542 A.asplenifolia A . r o s eoalba A.asiatica 26 Ib A.latiloba 24 25 9 A.nobilis 1=ros (Si6-1/1) A.setacea 2=ros (Si3-1/5; Si6-1/2) 24=set (K4-1/1) 100/100 3=ros (Si6-1/1) 25=set (K4-2/5) 4=ros (V-1/2) 26=set (K4-1/1; SeAA-3/8) 5=asp (BZ-2/6) 2278==sseett ((SGeRA-1A/-32;/ 2N)S1-2/8) Ia 67==aasspi ( (ANLS12-1-2/2/6; A; TRaX-3-1/1/40;) ;N rMos- 2(/V3-)1/3) 8=lat (Geo-3/12) 27 28 9=nob (Zn-1/5) Figure1ThencpGSgenetreeandtheAFLPtree.A.UnrootedNeighbourJoiningphylogramof10diploidAchilleaspecies(sevenofand threeoutsideA.millefoliumagg.)basedonthencpGSgenesequencedata.Thetreecontains31ncpGShaplotypesgeneratedfrom186clones (sequences)from49individualsof20populations.Bootstrapsupports(>50%)frombothmethods(NJ/MP)areshownnexttothemajor branches.Labelsofterminalbranchesarewrittenas“taxonabbreviation(populationcode-numberofindividuals/numberofclones)”.Fortaxa abbreviations,seeTable1.B.TheAFLPtreefromapreviousstudy(Guoetal.,2005[10])forcomparison. Guoetal.BMCEvolutionaryBiology2012,12:2 Page8of18 http://www.biomedcentral.com/1471-2148/12/2 Ia Ib A.asiatica, A.asplenifolia, 1=set (NS1-1/1; SeAA-2/8) A.roseoalba, A.ceretanica, 2=set (NS1-1/2) A.asplenifolia, A.roseoalba, 10 3=asp (BZ-2/3; NS2-1/1) A.ceretanica, A.setacea 963/62 A.setacea 4=asp (NS2-2/7) 4 7 7=asp (BZ-2/2; Ta-2/10); 5 = acsepr ((1B0Z2-410/1-;2 N/2S);2-1/2); 1 2 5 97/97 62/62 8 8 = acseir ((A1L012-410/3-1);/2); ros (Si3-1/5) ros (Si6-1/5); 3 set (GR-1/5; K4-1/5; NS1-1/1; SeAA-1/2) 6 = caseserpt ( ((1KB04Z2--4110//5-11;; /NN2S)S12--11//35)); 71/736 --/62 190==aassi i( (AARLX1--21//22;) ;N cMer- 1(1/10)2; 4a0s-p1 (/B3)Z-1/1) 61/66 79/96 79/96 13 A.ligustica1112 14A.acuminata 11=lig (SN-1/4) 13=acu (ARX2-2/11; CB1-2/6; TB5-1/2) 12=lig (SN-1/1) 14=acu (CB1-1/2; TB5-2/5) 97/99 III A.asiatica 16 67/91 51/-- 17 A.latiloba 15 II 16=asi (ARX-2/10) 17=asi (ARX-1/2) 15= lat (Geo-1/4) 98/98 --/98 A.asiatica 19 A.cuspidata IV 18 18=asi (AL1-1/2) 97/100 19=cus (ID-1/5); asi (AL1-2/4; NM-3/8) A.nobilis 20=nob (ZN-1/1) 0.05 substitutions/site 20 21=nob (ZN-1/4) 21 Figure2UnrootedNeighbourJoiningphylogramof10diploidAchilleaspecies(sevenofandthreeoutsideA.millefoliumagg.)based ontheSBPgenesequencedata.Thetreecontains21SBPhaplotypesgeneratedfrom163clones(sequences)from35individualsof19 populations.Bootstrapsupports(>50%)frombothmethods(NJ/MP)areshownnexttothemajorbranches.Labelsofterminalbranchesare writtenas“taxonabbreviation(populationcode-numberofindividuals/numberofclones)”.Fortaxaabbreviations,seeTable1. Guoetal.BMCEvolutionaryBiology2012,12:2 Page9of18 http://www.biomedcentral.com/1471-2148/12/2 analysis are presented in Figure 1A. In this gene tree, Figure3showsthencpGSgenetree (unrootedNJtree) haplotypes of A. millefolium agg. fall into two major ofallthediploidandpolyploidtaxaofA.millefoliumagg. groups, group I corresponding to A. setacea-2x, and andofthreecongenericdiploidspeices.Itisbasedon110 group II those of the other Eurasian diploid species haplotypesgeneratedfrom 359sequenceswith 155 poly- exceptA.cuspidata-2x.Thisrelationshipagreeswiththat morphicsubstitutionsites.InFigure3,allelesofeachpoly- fromthepreviousAFLPanalysis(Figure1Binferredfrom ploidindividual,populationortaxon(markedindifferent Guo et al., 2005). In haplotype group II, both the sub- colors)scatteramonghaplotypesofdifferentdiploidspe- groupsIIaandIIbharborsequencesofA.asplenifolia-2x cies (all in black letters) of A. millefolium agg. except and A. asiatiaca-2x. This is not in line with the AFLP A.cuspidata-2x.A.roseoalba-4xandA.asiatica-4xshare treeandismostlikelyduetoretentionofancestralpoly- some oftheir alleleswiththeir diploidcytotypes,respec- morphism or secondary contacts between the two spe- tively, whereas A. ceretanica-4x is quite differentiated cies. A. asplenifolia-2x shares its diverged alleles with fromA.ceretanica-2x.AllelesfromthetetraploidA.bor- A.ceretanica-2xandA.roseoalba-2x,respectively,corre- ealisvar.alpicolaandvar.lanulosa-4xinNAmerica,the sponding to the previous AFLP tree (Figure 1B). tetraploid A. asiatica-4x in C Asia and the hexaploid SequencesofA.ceretanica-2x(relictintheEasternPyre- A.millefoliumsubsp.apiculata-6xinNEEuropearemore nees) only appear in IIb and those of A. roseoalba-2x oftenassociatedwitheachotherthanwiththoseofother (from the meadows in the S-Alps) only in IIa. The polyploid taxa. Only the Ukrainian A. inundata-4x and sequences of A. cuspidata-2x (in the Himalayas) appear the C Asian A. schmacovii-6x share nuclear haplotypes distanttoothermembersofA.millefoliumagg.butclose withA.setacea-2x. toA.ligustica-2x(Figure1A).Thesingleplantsampleof The date set of SBP gene of all the diploids and poly- A.cuspidata-2xwascollectedrecently and thuswas not ploids contains 68 haplotypes generated from 349 includedinthe previousAFLP study.Clearlymore sam- sequenceswith68polymorphicsubstitutionsites.Dueto ples of this species should be investigated. Out of the the severe conflicts between the SBP gene tree and the four haplotypes of A. nobilis-2x, one falls into clade IIa species tree inferred from the AFLP and morphological together with several members of A. millefolium agg., data,sequencesofthisgenearenotsuitableforthephylo- and three form a group between IIa and IIb. The latter geneticinference,butcouldprovidesomecluesaboutthe canbe explained by the intragenic recombinationvisible progenitorsofthepolyploidtaxa.Wethereforeonlypre- inthealignment(Additionalfile1:S-Figure1).Consider- sent the SBP gene tree of all the diploid and polyploid ingrelationshipsofA.nobiliswithA.millefoliumagg.,the samples in the supplementary materials as Additional ncpGSgenetreeisneithercongruentwiththeAFLPtree file3:S-Figure3. (Figure1B)norwiththeSBPtree(seebelow,Figure2). The diploid-only SBP data also include seven species Phylogenetic networks based on plastid haplotypes withinandthreeoutsideA.millefoliumagg.Theycontain Thirty populations with broadly the same 70 individuals 21 haplotypes with 60 substitution sites based on 163 analyzed with the nuclear genes were sequenced at sequences.Outofthe60polymorphicsites, 48belongto three plastid loci, trnH-psbA, trnC-ycb6 and rpl16. The the intron regions. The alignment of the SBP sequences length variation and number of polymorphic sites of does not show obvious intragenic recombination (Addi- each fragment are listed in Table 2. tional file 2: S-Figure 2). The topology of the NJ tree is ForthediploidmembersofA.millefoliumagg.together broadly comparable with that of the MP tree and thus with their sister species A. ligustica, three plastid frag- onlybootstrapvaluesfromtheMPanalysisarepresented mentsfrom18populationsand34individualsgenerateda (Figure 2). The SBP gene tree (Figure 2) is remarkably combinedmatrixwith1855(varying from 1814 to1842) incongruentwiththencpGStree(Figure1A)andwiththe nucleotide positions and 26 variable sites. Out of the 26 AFLPtree(Figure 1B).InFigure2, haplotypes belonging variable sites, 18 are substitution sites and 8 are indels tomembersofA.millefoliumagg.donotgrouptogether. (Table 2). The polymorphic sitesallow the identification SomeofA.asiatica-2x,alloftheCaucasusA.latiloba-2x of 13 plastid haplotypes named as dH1-13, where “d” andtheHimalayanA.cuspidata-2x(heredefinedasAsian standsfordiploidstobedistinguishedfromthoseusedfor types)aredistantlyrelatedtotheothersofA.millefolium thediploid-polyploidcombineddata asdescribedbelow. agg.Surprisingly,sequencesoftheCEuropeanA.nobilis- AsshowninFigure4,polymorphicplastidhaplotypesare 2xareclosetotheAsiantype,whereas,haplotypesofthe foundwithin eachof the three relativelywidespread spe- EAsianA.acuminata-2xareclosetothemajorhaplotype cies A. setacea-2x, A. asplenifolia-2x and A. asiatica-2x. group of A. millefolium agg., which is mostly of the Furthermore,distributionoftheplastidpolymorphism is European members. We thus observe little sorting of not even among the diploid species. Among the three ancestralpolymorphismsoftheSBPgeneduringthespe- widespreadspecies,theEuropeanA.setaceaandA.asple- ciationprocessesofthediploidspeciesofAchillea. nifoliaeachharboursarelativelyfrequenthaplotype,dH7 Guoetal.BMCEvolutionaryBiology2012,12:2 Page10of18 http://www.biomedcentral.com/1471-2148/12/2 72/63 alp (US2-1/1); lan (US5-1/1) alp (US2-1/1) alp (US2-1/4); lan (US5-1/1) alp (US2-1/1) alp (US2-1/3) ros (V-1/2); api (Rb-1/1); lan (US5-1/1) dis (DIKF-1/1) lan (US5-1/1) api (Rc-1/1) lan (US6-1/1) inu (K13-1/1) lan (US6-1/1) lan (US6-1/1) 67/-- api (Rc-1/1); lan (US5-1/1; US6-1/5) api (Rc-1/1) ros-4x (V-1/1) sch (AL5-1/1) asp (BZ-2/6); api (Ra-1/2); asi-4x (AL9563-2/4); inu (K13-1/2); sud (STms-1/3); ros-4x (V-1/1); sch (AL5-2/3) inu (K13-1/1) 72/61 ros-4x (V-1/1) 57/-- 53/-- rrooss ((SSii63--11//15); Si6-1/2); ros-4x (Si6-1/3) ros (Si6-1/1) inu (K13-1/1) sud (STms-1/1) asi-4x (AL3-1/1) asi (AL1-1/2; ARX-1/4; NM-2/3); asi-4x(AL3-1/3; AL9563-1/2) ros-4x (Si6-1/1) 63/-- cer-4x (10222-1/1) 78/-- cer-4x (10222-1/1) api (Ra-1/3; Rc-1/2) api (Rb-1/5); cer-4x (10222-2/7); sud (STms-1/3); ros-4x (Si3-1/2) 72/67 ros-4x (V-1/1) sty (StE-1/1) 61/-- asp (NS2-2/6; Ta-3/10); ros (V-1/3); ros-4x (V-1/2) sud (STms-1/1) dis (DIKF-1/1) dis (DIKF-1/1) 66/63 dis (DIKF-1/1) asi-4x (AL3-1/2); dis (DIKF-1/2) 65/-- dis (DIKF-1/1) asi-4x (AL3-1/1) alp (US2-1/1) dis (DIKF-1/1) 98/8770/-- aassii- (4AxL (1A-L23/7-1);/ 5as) i - 4 x (AL9563-1/3) 70/-- asi (ARX-2/8); asi-4x (AL9563-1/2) asi-4x (AL9563-1/1) asi-4x (AL9563-1/1) api (Ra-1/1) 74/-- api (Ra-1/1) lan (US6-1/1) 89/87 lan (US6-1/1) 59/55 --/57 93/-- aasspp ((NBZS2-1-1/1/3; )T a - 1 / 1 ) ; c e r ( 1 0 2 4 0 - 2 / 1 0 ) ; ros-4x (Si3-1/1) api (Ra-1/1) 85/80 54/67 aappii ((RRaa--11//11)) asi (ARX-1/1) asp (BZ-1/2) api (Ra-1/1; Rc-1/1) asi (NM-3/5); api (Ra-1/1); lan (US6-1/1); dis (DIKF-1/3) api (Rc-1/1) ros-4x (V-1/1) sud (STms-1/1) 50/-- ddiiss ((DDIIKKFF--12//12)) alp (US2-1/1) 57/-- sscchh ((AALL55--11//11)) nob (ZN-1/5) api (Ra-1/2); lan (US5-1/4; US6-1/3); asi-4x (AL3-1/1) dis (DIKF-1/1) lan (US5-1/1) lan (US5-1/1) dis (DIKF-1/1) lat (Geo-3/12) 68/68 sscchh ((AALL55--11//11)) cer-4x (10222-1/1) sud (STms-1/1) ros-4x (V-1/2) api (Ra-1/1) 98/7856/52 aappii ((RRaa--11//12)); lan (US5-1/1) sty (StE-1/1) 99/100 sty (StE-1/1) 53/52 ros-4x (V-1/1) --/50 73/70 sty (StE-1/2) inu (K13-1/1) 51/-- api (Ra-1/1) 81/77 lan (US6-1/1) nob (ZN-1/1) 100/100 54/-- nnoobb ((ZZNN--23//99)) 80/-- set (K4-2/5) 96/90 set (K4-1/1; SeAA-3/8) 90/90 sseett ((KG4R--11//13); N S 1 - 2 / 8 ); inu (K13-1/2) 91/97 100/10068/-- sineut ((SKe1A3A-1-2/1/2)) sch (AL5-1/1) 76/-- 99/101000/100 ccliuugss ( ((SIINDD---111///252))) 76/-- 100/100 lig (SN-1/1) acu (ARX2-3/12, CB1-3/8) 100/10080/-- aaccuu ((TTBB55--13//27)) Figure3UnrootedNeighbourJoiningcladogramofthencpGSgeneofallthediploidandpolyploidtaxaanalyzedinthisstudy.The treecontains109allelichaplotypesgeneratedfrom359sequenceswith155substitutionsites.TopologyoftheMPtreeonthesamedatasetis broadlycomparablewiththatoftheNJtree.Bootstrapsupports(>50%)fromNJ/MPanalysesareshownnexttothebranches.Labelofeach terminalbranchiswrittenas“taxaabbreviation(populationcode-numberofindividuals/numberofclones)”.Fortaxaabbreviations,seeTable1. Diploidtaxaareinblack,polyploidtaxaindifferentcolours.