ActaBot. Need.39(4), December 1992,p. 369-384 Review Nuclear DNA markers in angiosperm taxonomy K. Bachmann HugodeVriesLaboratory,University ofAmsterdam, Kruislaan318,NL-I098 SMAmsterdam, TheNetherlands CONTENTS Chloroplast vs. nuclearDNA 369 Approaches to thenucleargenome 370 Restriction fragmentlength polymorphisms (RFLPs) 371 RibosomalRNA genesas anexample 372 RFLPanalysis withrandomgenomicandcDNA probes 374 Mini-satellitevariation:VNTR fingerprints 375 Polymorphisms inPCR amplification products: RAPDs 376 ConclusionsandOutlook 378 Key-words: angiosperms, molecularphylogeny, RAPD, reticulate evolution, RFLP,VNTR. CHLOROPLAST VS. NUCLEAR DNA New methods for the analysis ofpolymorphisms in DNA sequences have a profound influence on systematic botany. Several recent books dealwith molecular methods in evolutionandtaxonomy(Clegg &O'Brien 1990;Doolittle1990;Hillis& Moritz 1990;Li & Graur 1991),someofthemspecificallywithplants(Crawford 1990;Soltisetal. 1992).A general revisionoftheevolutionary relationships ofplants withmolecularmethodsseems tobewellunderway(Palmeretal. 1988).Chloroplast DNA(cpDNA) hasturnedouttobe themostconvenientcommonmoleculeforthe reconstructionofplant phylogeny. Inthis review, the use ofmarkersin the nuclear genomewill be examined.Obtaining reliable phylogenetic information from nuclear DNA tends to be considerably less straight- forward than working wth cpDNA. This pertains both to the collection and to the interpretation ofthe data. However, the featuresofcpDNA that make it such a useful molecule for phylogenetic analysis also limitits applicability. A brief and simplified comparison ofthese featureswith thoseofnuclearDNA can help point out whereand why nuclearmarkersbecomeimportant. ChloroplastscontainclosedcircularDNA moleculesabout 150000 basepairs(150kb) inlength.Thesmallestnucleargenomesinangiosperms areabout500-foldbigger,andthe variationinhaploid nucleargenomesizes(‘C values’)amongtheangiosperms isconsider- ablefromArabidopsis thaliana(Brassicaceae) withabout 105kb in thefivechromosomes ofthehaploid genometoFritillariaassyriaca (Liliaceae) with1-23x 108kb in 12chromo- somes (Bennett et al. 1982; Price 1988). There can be considerable differencesin the genomesizes ofclosely relatedcongeneric species and thereis anincreasing numberof reports on intraspecific variationin genomesize (Price 1991). A variationin C values from 1-84pg to 7-14pg (1 pg=9-5 x 105kb) among populations of Collinsia verna 369 370 K. BACHMANN (Scrophulariaceae) seems to be thewidest intraspecific range on record(Greenlee et al. 1984).Therangeofabout2:1betweenthe largest andthesmallestcpDNA inlandplants (Palmer 1985)issmallincomparison andeasyto explain. Theavailability ofthecomplete sequences of several cpDNAs facilitates considerably the evolutionary comparison (Ohyamacta/. 1986;Shinozakietal. 1986). Allchloroplast genomescontainmoreorlessthesamesetofgenes.Eventheorderofthe genes on the chloroplast map changes relatively slowly. A segmentcontaining, among others, thegenes forthe ribosomalRNAs typically ispresenttwice in thecpDNA inthe formofaninvertedrepeat.Thiscp-rDNAshouldnotbeconfusedwiththenuclearrDNA, which I shall discuss below, or withmitochondrialrDNA, which up to now has played only aminorroleinplant evolutionary studies(Grabau 1985).Thetwo repeatcopies in the cpDNA separate a large single-copy region from a small single-copy region. The problem ofmultiple copies ofidenticalor similar sequencesthereforeplays virtually no role in cpDNA evolution while it is ever present in comparisons of nuclear genomes (Doyle 1991, 1992).DifferencesamongcpDNAs are mostlybasesubstitutionsatstrictly homologous (orthologous) sites. Major changes in genome architecture such as inser- tions, deletions, inversionsand transpositions involvingseveral genes,arerelatively rare. Whentheyoccur,they canbecrucialmarkersfor majormonophyletic groups(Downie & Palmer 1992). Typically, the inheritanceofchloroplasts is uniparental. TheevolutionofcpDNA is essentially thatofasexual haploid clones. Eachmutationinitiatesa monophyletic line, andcladisticanalysis isstrictly applicable atalltaxonomiclevels. Acladogram basedon cpDNA should parallel that ofthe plants containing the chloroplasts. Usually it does (e.g. Crowhurst et al. 1990). However, recombination in the nuclear genomes within species and reticulateevolution (hybridization) between species are essential aspects oforganismal evolution, and discrepancies between thephylogenies ofchloroplast and nuclear DNAs can reveal crucial evolutionary events. Moreover, chloroplast DNA is essentially a neutral independent marker of phylogeny, while the vast majority of evolutionary change relevant to the living plant is encoded in the nuclear DNA. Ifwe want to go beyond phylogenetic reconstruction and analyse the interactions between organismic andmolecularevolution,wehaveto study thenucleargenome. One of the advantages of cpDNA for phylogenetic analysis is the relatively slow sequenceevolution. Thetaxonomiclevel at which cpDNA sequencesare optimally in- formativeis the differentiationof genera. With an increasing amount ofnoise (homo- plasy), cpDNA can be compared at higher taxonomic levels (Palmer 1985). At lower taxonomic levels, especially for studies on population genetics andspeciation, cpDNA usually provides littleinformativevariation.Here, ofcourse, recombinationandreticu- lateevolution play an essential role, and nuclear DNA markers reveal their fullvalue (Schaal et al. 1991). However, there is informativesequence variation for all levels of phylogenetic analysis in the nuclear genome, and we may state that, aside from fossils documenting extinctlineages, an event inphylogenetic history that has not left a recognizable trace inthe nuclearDNA is notdocumentedanywhere. APPROACHES TO THE NUCLEAR GENOME Therecan be considerablevariationin genomesizes amongclosely related species, and thisshowsthattheevolutionofnucleargenomesislessconstrainedandfasterthanthatof theorganisms harbouring thesegenomes.Theselectiveamplification ofsmallparts ofthe DNA MARKERS AND ANGIOSPERMTAXONOMY 371 genomecan create large fractions of repetitive DNA that may be specific for young evolutionary lineages (McIntyre etal. 1988).Asignificant fractionofthe DNA foundin onespecies maybeabsentinaclosely relatedone(Ganal & Hemleben 1986b; Schmidtet al. 1991; Unfriedet al. 1991). SuchDNA sequencescannot playa roleincoding forthe phenotype, eventhough theirbulkmaywell havea phenotypic effect viacell-cycle time and cell size (Bennett 1985; Price 1991). Thecoding sequencesshared by all plants are embeddedinavastand variable majorityofnon-coding DNA,much ofitconsisting of very many repeats ofa few basic sequences. Efforts to derive oneoverall measure ofa genetic similarityamong organisms fromthecommoncoding sequencesin nuclearDNA become technically difficultwhengenomeswithlargeandvariableamounts ofDNAare compared. The most general method, DNA-DNA hybridization (Bendich & Bolton 1967),is nowonly ofhistoricalinterestinangiosperm taxonomy. The nearest we can come to an overall measure of genetic relatedness is a random sampling of variation atmany strictly homologous sites all across the genomeas it is attempted in iso-enzyme studies (Gottlieb 1977). A typical problem with iso-enzyme studies is the limited numberof loci that are available for analysis. A more extensive randomsampling ofgenetic variationallacross the genomecanbeachieved withDNA fingerprinting, RFLP and RAPD methods. I shall discuss these methods and their limitationsbelow. Alternatively, ofcourse, thereare manygenes commonto all plant genomes,and the nucleotidesequenceofeachonecanbe usedforphylogenetic reconstruction. Thisyields the phylogeny ofthat onegene, and (as in the caseofcpDNA phylogenies) it will be necessary to check ifthis reflects the phylogeny ofthe organisms (Pamilo & Nei 1988; Doyle 1992). The key concept here is that of concordance(Avise & Ball 1990) of gene genealogies (Hudson 1990). A general consensus seems toemerge that the best possible molecularapproach to angiosperm phylogeny is the precise sequence comparison of severalgenetically unlinkedandfunctionally diversegenes,wherethechloroplast genome may be consideredone linkage group(Martin & Dowd 1991). There is, however, no consensus yet about the nuclear genes that should be includedin a general taxonomic survey.Theeffortinvolvedincharacterizing oneplantthisway excludesthisapproach for thetimebeing frompopulation sampling. Implicitly thiscreatesa typological approach (i.e., the individualsequence from one specimen represents a species, genus or even family).This seems to bean acceptable practice forthephylogeny ofhigher categories, whereintraspecific sequencevariationhastobetreatedas noiseunlessitcanberecorded statistically as allele frequencies. The continuous improvement in the methods for isolating strictly homologous genes from genomesand for determining their sequences willsoonmakephylogenetic analysis viagenetrees generally available. Othermethodswillhavetobeusedatthespecies level.Here,arepresentative sampling ofindividuals and of variation throughout the genome requires simpler methods for determining polymorphisms. Theavailability ofallkindsofdirectandindirectmethods forthe comparison ofDNA sequencevariationmakes it possible to choose the proper combinationoftechniques foreachprojectandeachorganism.Inthefollowingsections, I shalltry to illustratesometypical approaches. RESTRICTION FRAGMENT LENGTH POLYMORPHISMS (RFLPs) Manybacterialrestrictionendonucleasesarenowcommerciallyavailable.Eachrecognizes ashort nucleotidesequence, its specific restriction site, atwhich itcuts double-stranded 372 K. BACHMANN DNA. Restriction sites 4 or6 bp long are typical. Such sites occur statistically in any longer stretch of DNA unless it consists ofrepetitive simple sequences. A restriction enzyme thereforewilldigest DNA so that homologous sequencesare cutinto identical restrictionfragments. A nucleotidesubstitutionin a restriction site will prevent recog- nitionandcutting, andwill replace tworestrictionfragmentswithoneoftheircombined length. It will create a restriction fragment length polymorphism (RFLP). Restriction fragments can besortedaccording to length by gelelectrophoresis. All fragments ofone length then form a band in the gel. Banding patterns on gels based on the restriction fragments fromapiece ofDNA areroutinely used foridentificationandcomparison of DNA sequences.Ifapiece ofDNA is digested withseveral differentrestriction enzymes singly and together, therelative positions of the various cutting sites canbe determined andtheir distancesin basepairs canbe shownona physical mapofthe DNA (Fig. 4of Lanaudetal. 1992).Sucha physical mapis an extractofthe actual nucleotidesequence andcanbepredicted ifthesequenceisknown. Restriction fragments ofidentical length need not be homologous. The morebands thereareinapattern,thelesseasywillitbetodeterminetheirhomology.Thisisextremein the digest ofanentirenucleargenomecontaining thousandsoffragments whichforma smearallalongthegel. Inthiscase,theDNA fragments canbeblottedfromthegelontoa membranefilter(Southern 1975) andindividualbands homologous to a(radioactively) labelledclonedDNA-probecanbedetectedselectively by DNA/DNA hybridization. The probe may be cloned from the species itself (a ‘homologous’ probe) or from another species witha sufficiently similarsequencesothatthe probe findsits homologue by base pairing(a ‘heterologous’ probe). Forthechloroplast genomesofseveralspecies fromdiverseangiosperm familiesthere existcomplete setsofprobes covering the entiremolecule(e.g. the dicot, lettuce: Jansen & Palmer 1987; and the monocot, Oncidium excavatum: Chase & Palmer 1989). The conservativeevolution ofcpDNA facilitatesthe cross-hybridization with heterologous probes, and phylogenetic comparisons ofcpDNA can be based on complete physical maps. Much ofthe variationamongthe mapsfromrelated species is dueto thegain or loss of a restriction site due to a point mutation.Such mutationscan be arranged as character-statechanges inastrictcladistic analysis. Inthisway, RFLPsare astatistically characterizedrandom sample (based on the random occurrence ofrestriction sites) of sequencevariationallacross thecpDNA. RFLPscanalsoarisebytheinsertion, deletion, or inversionofDNA. Suchevents canbe recognized and differentiatedfromrestriction sitemutationsincpDNA analyses. RFLPanalysis ofnuclearDNA is neveras complete andusually notas precise. Itinvolveseitherthedetailedcomparison ofa gene(essentially asimplified sequencecomparison) orthe searchforRFLPsatarandomsetoflociacross thegenome.Thefirstis exemplified by RFLPanalysis oftheribosomalRNA genes.The latterdepends ontheavailability ofasetofappropriate clonedprobes for variousregions ofthegenome. RIBOSOMAL RNA GENES AS AN EXAMPLE The nuclear genes coding for ribosomal RNAs (rDNA) occur hundredsof times as tandem(one afterthe other) repeats atthe nucleolarorganizer regions on one or more chromosomesofa haploid set. This and the abundanceoftheir RNA products in the cellhas facilitatedtheir isolationand they have been used for a variety of molecular studiesincludingtheearliestattempts atmolecularphylogeny ofangiosperms (Vodkin & DNA MARKERS AND ANGIOSPERM TAXONOMY 373 Fig. 1.Structureandterminologyofthenuclearribosomal genes(rDNA).Eachoftherepeatunits(itoi)consists ofatranscriptionunitbetweenatranscriptioninitiation site(T1S,i)andatranscriptionterminationsite(TTS;I) and anon-transcribed spacer (NTS).Black bars mark theregionscodingforribosomal RNAs, 18S,5.8S, and 25S.Theseareprecededbyanexternal transcribed spacer(ETS)and separatedbythetwo internal transcribed spacers,ITS1and ITS2.Theentire regionbetweenthe endofone25Scodingsequence and thestartofan18S codingsequence istheintergenicspacer(IGS). Katterman1971).EachrDNA geneistranscribedintoone continuousprimary transcript (precursor rRNA), fromwhich theribosomal18S, 5-8S,and 25S RNAs arecut,whereby sequencescorresponding toan‘externaltranscribedspacer’ (ETS)andtwoshort‘internal transcribed spacers’ (ITS1 andITS2) arediscarded(Fig. 1). Theribosomal5S RNAs are coded by a separate,unlinkedtandemarray(Gerlach &Dyer 1980;Hemleben& Werts 1988; Dvorak etal. 1989; Lapitan etal. 1991).TherDNA transcription unitsare separ- ated by ‘non-transcribedspacers’ (NTSs, intergenic spacers: IGSs). In general, all the numerouscopiesofthe 18Sand25SrRNA genesina genomehavethesamesequence,and thesequencesofthesetwo RNAsevolveslowly (Zimmeretal. 1989;Martin&Dowd 1991; Hamby & Zimmer 1992). Thisconservationofthe coding sequencescontrasts with the nearly unrestrainedevolutionofthenon-transcribedspacerregion (Jorgensen &Cluster 1988).DNAprobes fromtheIGSofCucurbitapepo(courgette) donoteven hybridizewith DNA fromother generaofthe Cucurbitaceaesuch as Cucumis(cucumber and melon; Torres etal. 1989; Torres& Hemleben, 1991).Each intergenic spacertypically contains tandemrepeatsofsimplesequences,80-325bp inlengthindifferentspecies. Thesesimple- sequencesub-repeats areusuallyconservedwithina genusoramongclosely relatedgenera (Ganal& Hemleben1986a).Length heterogeneityamongtherDNArepeatunitswithina species can arise when the numberofsimple-sequence repeats in the IGSregion varies (Yakura etal. 1984; Rogers & Bendich 1987; Hemlebenet al. 1988; Reddy etal. 1990). Enzymes thatdonotcutwithinthesimple-sequence repeatsoftheIGS transformspacer length heterogeneity into RFLPs (e.g. Bellarosa et al. 1990; Reddy et al. 1990). This, ratherthanmutationsinrestrictionsites, is thetypical originofRFLPs inthe rDNA. As (heterologous) probes fortherDNAcoding sequenceshavebeenavailablefor sometime, there are quite a few taxonomicand evolutionary studies using RFLP analysis ofthe rDNA. Thevalueofanypolymorphism forphylogenetic analysis dependsonourknowledge of theway inwhichthispolymorphism evolves. Evenwherewehaveageneral understanding fortheevolutionofatypeofpolymorphism (suchas forbasepair replacements), wehave tocheckthespecificparametersempirically ineach case. Thereare severalpossible mech- anismsleading tochanges intherepeatnumberoftandemlyrepetitiveDNA,but thereare also mechanismsthateliminatevariationamongtherepeatsinonelinkedcluster(Appels & Dvorak 1982; Dover 1986; Sytsma & Schaal 1990). Ingeneral, recombinationtends to promote sequence homogeneity within populations, and the magnitude of fragment length differences among species neednot be a measure of thephylogenetic distance. 374 K. BACHMANN RFLPsbasedonsimple-sequence repeatsinspacersare beingusedforthedetermination of (phylo)genetic relatedness, mostly among populations of one species and among congeneric species. Thereare considerabledifferencesamong the levels ofvariability in differentcases. Twenty lengthvariants were detectedamongtherDNA repeatsofasingle plant ofViciafaba (Yakura etal. 1984)andupto fiveperindividualin Clematisfremontii (Learn & Schaal 1987). Nointraspecific spacer length variation, but some variationin restriction sites was observedinanextensive survey ofRudbeckiamissouriensis(King & Schaal 1989). Inmany species restrictionsite variationoccurs only amongraces (Doyle et al. 1984) or atthe interspecific level (Sytsma & Schaal, 1985; Doyle & Beachy 1985; Rieseberg et al. 1988). Qualitative and quantitative differentiationamong populations and subspecies was found in wild barley (Saghai-Maroof el al. 1984), Phloxdivaricata (Schaaletal. 1987)andClematisfremontii(Learn& Schaal 1987). Sytsma & Schaal(1985) haveanalysed theLisianthiusskinnericomplex inPanama.Lisianthiusisoneoftheappar- ently rareexamplesoflengthvariationintheITSratherthantheintergenicspacer(Sytsma & Schaal 1990). Bellarosa etal. (1990) have used RFLPs in the rDNA to determinethe relationship amongmediterraneanoaks (Quercus spp.), D’Ovidio et al. (1990a)that of poplars (Populus spp.),Reddy etal. (1990)havecompared 23populations from12species ofSecale, andZentgrafetal. (1992) haveused diagnostic RFLPs inthe rDNA clusterto distinguish theAsianandAfrican species groupswithinCucumis. Thereis nouniformmethodfortreating thedatainthevariousstudiesusing RFLPs in the nuclear rDNA. While Reddy et al. (1990) employ an essentially phenetic approach basedonfrequencies ofspacer-length variants, theclassificationby Bellarosaetal.(1990) isbasedonthesharedpresenceoflengthvariantsandis thereforeessentially cladistic.The recent analysisofnuclearrRNA variationinKrigia (Asteraceae) by Kim& Mabry(1991) is aninstructiveexample fortheevaluationofrDNA RFLPdata. RFLP ANALYSIS WITH RANDOM GENOMIC AND cDNA PROBES RFLPsallacross thegenomecanbeusedto determinetherelatednessandthephylogeny ofgenomes.Thecrucialfactoris theavailabilityofclonedDNA-probes forthedetection of homologous sequences by DNA/DNA hybridization on blots ofrestriction digests. RFLPs inthe nuclear DNA can beusedas co-dominantmarkersingenetic crosses, and their segregation and linkage relationships can be determinedby Mendeliananalysis. Theycanbeusedtoproduce anRFLPmapofthegenome.Incontrast tothephysical map ofa segment ofa genomein which restriction sitesare mapped in base-pair distances, distanceson an RFLP mapofthe genomeare in recombinationunits. Thegenetic loci detectedby RFLPs can be incorporated in the general genetic mapofthe species and greatly facilitate Mendeliananalysis as they show a simple monogenic co-dominant inheritance pattern (Helentjaris el al. 1985; Landry et al. 1987; Paterson et al. 1988). NuclearRFLPs forma directlink betweenthegeneticanalysis ofthegenomeandphylo- genetic analysis. Ideally theirmappositions areknown, and theircontribution to a truly representative sampling of the genome can be determined(Kesseli et al. 1991). The sequenceresponsible foranRFLP andthe relevantchromosomalregion are experimen- tally accessible via the cloned probe, and a direct link can be laid between molecular evolutionandthe geneticbasis ofmorphological evolution. Unfortunately,theexpenses intimeandmoneyforthefulluseofnuclearRFLPanalysis can become limiting. This primarily concerns the production of a set of informative DNA MARKERS AND ANGIOSPERMTAXONOMY 375 probes, the preparation and the characterizationofa clone‘library’. Theseprobes are eithercDNA, i.e. DNA that hasbeensynthesized in vitroby reverse transcriptionfrom mRNA (Helentjaris et al. 1985; Bernatzky & Tanksley 1986), or pieces of DNA cut directlyfrom thenuclear genome(Kochert et al. 1991;Neuhausen 1992). cDNA clones containonly thecoding sequencesofexpressed genes,whilerandomgenomic clonesalso containregulatory and non-coding repetitive sequences, which require a moreintensive screening for useful probes (Xu & Sleper 1991) butprovide access to a wider rangeof variation. Dueto theconsiderabletechnicalinvestment, nuclearRFLPanalysis hasbeenapplied mainlytocropspecies. Thesestudies,however, frequentlyinvolveinterestingphylogenetic problems, mostly on the level of species withina genus. The levels of intraspecific or interspecific variationfornuclearRFLPs areratherunpredictable inplants fromvarious generaor families(Shattuck-Eidens et al. 1990). Neuhausen(1992) found that RFLPs withinspecies ofmelons(Cucumis)mostlyweredueto mutationsinrestrictionsites,while RFLPs between species frequently involved chromosome restructuring. Song et al. (1988a,b, 1990) have constructed cladograms of Brassica species and cultivars from nuclearRFLPs,whichagreewellwithotherdataandprovidenewinsightsintheevolution ofthis group. Their investigation raises the difficult question of the incorporation of polyploid hybrid taxaamongdiploids in a cladisticanalysis. Brummeretal.(1991) con- structed separatecladograms for diploid and tetraploid cultivars ofalfalfa(Medicago), but also faced the problem ofreticulateevolution via hybridization at one ploidy level (‘homoploid reticulateevolution’). Kochert etal. (1991) compared 8 cultivars and 14 species of peanuts (Arachis) and found that they could recognize specific restriction fragments of the diploid parental species in the banding pattern of the tetraploids. Two cottonwoodspecies (Populus fremontiiand P.angustifolia) along the Weber River drainage in Utah couldbe identifiedby interspecific RFLPs and the parentageofindi- vidualplants ofahybrid swarm occurringintheiroverlap zonecouldbe determined.The hybrids were eitherFI plants orbackcrosses to the P. angustifolia parent (Keim etal. 1989).Thetopology oftherelationship amongfourspecies ofActinidiabasedonnuclear RFLPs withcDNA and randomgenomic clonesagreeswiththatbasedon RFLPs in the cpDNA (Crowhurst etal. 1990). MINI-SATELLITE VARIATION: VNTR FINGERPRINTS Various relatively short DNA sequencesseem to cause frequent irregularities in DNA replication and/orrecombinationwhereverthey occur (Levinson & Gutman1987). Such sequencesare foundasmoderately repetitivetandemrepeats(‘mini-satellites’) ofmoreor less identicalunits thatare up toabout60bp in length (Jeffreys etal. 1985).Many such mini-satellites seem to occur at various positions throughout all eukaryote genomes. Martienssen& Baulcombe (1989) illustrateamini-satellitein the wheat genomethat consists of 21 repeats of 15-26bp in length each containing a 14bp ‘core sequence’ (averaged overall 21 repeats, thereare 121of 14bp identicalwiththecore consensus). Thenumberofrepeatsinaparticularmini-satellitecanvaryamongtheindividualsofa population (variable number tandem repeats: VNTRs). This variation is detected as restriction fragmentlength variationwhenDNAis digested withrestriction enzymesthat donotcutwithinthe repeat.As VNTRs basedon onebasic sequenceregularly occur at various places in the genome, a probe detecting the repeat sequence usually reveals severalor many restrictionfragments perindividual.Someofthesemaybealleles from loci heterozygous for repeat numbers. Due to the high rate of length variation, the 376 K. BACHMANN corresponding pattern ofbands often is specific for anindividualand is calleda ‘DNA fingerprint’. Severalprobes containing coresequencesofVNTRs havebeenusedforfingerprinting invariousorganisms. Thefirst demonstrationsoftheexistenceofmini-satellitesinplants madeuseofahumanmini-satelliteprobe (Dallas 1988;Rogstad etal. 1988a) orofaprobe derived from bacteriophage M13(Rogstad et al. 1988b, 1991a,b; Ryskov et al. 1988; Nybom et al. 1989;Zimmermanet al. 1989). Relatively simple and reliabletechniques fordetectingfingerprints inplantsmakeuseofsynthetic oligonucleotides thatcanbe four- baserepeatssuch as (GATA)4or(GACA)5(Weisingetal. 1989, 1990, 1991).Suchprobes are easilyavailableandlikelytorevealRFLPs atmanylocithroughout thegenome(Tautz & Renz 1984;Tautz 1989;Condit& Hubble 1991).Thegreatadvantage ofthismethodis setagainst two limitationsto its usein phylogenetic analysis. Themost severe limitationis thehighrate ofchange ofVNTR lengths. VanHouten et al. (1991), using (GATA)4 as a probe, foundthat very few plants in a population of Microserispygmaea(Asteraceae) had identicalfingerprintpatternsand thatthe patterns couldchange unpredictablybetween aplantand itsoffspring fromselling. Inthe closely relatedspecies, M.elegans, fingerprint patternswere considerably morestableandcould be used to determinethegenetic relationship of genotypesisolatedfrom variouspopu- lations(Van Heusden& Bachmann, 1992a). Inacomparison ofthreespeciesofPlantago, Wolffetal.(in press)foundnointrapopulation variationwiththeM13 probeintheselling species P. majorand high levels ofmini-satellitevariationinthe outcrossing species P. lanceolata. Population differentiationformini-satelliteswas foundin Acer negundo by Nybom &Rogstad (1990) andinAsiminatriloba(Annonaceae; Rogstad etal. 1991a). The interpretation of fingerprint patterns confronts us with the second limitationof the method.Itisimpossible to determinethehomology (or theallelism)ofthebandswithout performing breeding experiments. Thecomparison is usually limitedto the presenceor absenceofabandata certainposition in thegel, andit isassumed thatthis comes from identical allelesofone locus in allthe plants. The typical datamatrix will list, foreach plant, the absence orpresence ofallbands foundin the entire sample. Thesedata can eitherbe treatedas cladistic charactersor, using anindexofband sharing, ina distance matrix(Lynch 1988, 1990). Themostuseful applicationofmini-satelliteprofiles inplant evolution seems tobe for thedeterminationofclonalidentityandevolutioninasexually reproducing plantspecies such as apomictic blackberries (Rubus, Nybom & Schaal 1990; Nybom & Hall 1991)or clones of quaking aspen (.Populus tremuloides, Rogstad et al. 1991b). Fingerprinting of apomictic dandelions(Taraxacum) has shown that plants of the ‘microspecies’ T. hollandicum from The Netherlands, Czechoslovakia and France are indeed members ofone clone, but has thrown doubton the clonal nature of some other Taraxacum microspecies (Van Heusdenetal. 1991). Theterm‘DNAfingerprinting’ is notexclusively applied to banding patternsbased on mini-satellitelength variation, butmaysignify any individual-or strain-specific banding pattern on a gel based on length polymorphisms of DNA fragments (Gebhardt et al. 1989). This may include banding patterns based not on restriction fragments but on amplification products(D’Ovidioetal. 1990b;Welsh& McClelland 1990). POLYMORPHISMS IN PCR AMPLIFICATION PRODUCTS, RAPDs Theintroductionofthepolymerase chainreaction(PCR) forthein-vitroamplification of specific DNA segmentsout ofa mixtureofmany others(Saiki et al. 1985) has greatly DNA MARKERS AND ANGIOSPERM TAXONOMY 377 simplified many molecular protocols. PCR has made comparative sequencing of homologous stretches of DNA accessible to systematists, because the purification of homologous sequencesfromnew material,whichusedtobeforbiddingly complex, isnow asimple routine.Themethod dependson sufficientknowledge ofthedesiredsequenceto synthesize stretchesofabout30bp lengthflanking thepiece tobeamplifiedwithopposite complementarity. These hybridize to the homologous sequences on denatured, single- strand (template) DNA and serve as starting points (primers) for the synthesis of the complementary strand including the site ofthe opposite primer. A heat-stableDNA polymerase is usedthat survives thetemperaturesused for denaturing DNA. Repeated denaturing, primerattachmentandcomplementary strandsynthesis eventually produces anenormous amountof double-strandedDNA corresponding precisely to the stretch between the opposing primers. Sequencing the amplification product will thenreveal polymorphisms atthislocusamongthevariousspecimens fromwhichtemplateDNA has beenisolated(Barbier etal. 1991). PCR canalsobeusedforarandomsamplingofsequencevariationalloverthegenome. Forthis, in place oftwo long (c. 30bp) primers flanking aspecific sequence,oneshorter (10bp) primer with an arbitrary sequence is used. Amplification will depend on the existencesomewhereinthegenomeoftwosites thatarecomplementary to theprimerand lieon opposite strands ata distancesothattheintervening sequencecanbefilledinby in vitro DNA synthesis. The statistical chance for such a constellation is minute, but genomesare largeandthereis a virtually unlimitedvarietyofpossible 10bp primers. In fact, quiteregularlyseveralamplificationproductsofdifferentlengthsareobtainedfroma plantgenomewithanarbitraryprimeranddetectedasbandsonagelafterelectrophoresis (Williams el al. 1990; Quiros et al. 1991; Van Heusden & Bachmann 1992a). These bandingpatternsareusually polymorphicinpopulations (randomamplifiedpolymorphic DNAs: RAPDs; Williamsetal. 1990), and thevariouspatterns obtainedwithdifferent primers forma detailedDNA ‘fingerprint’ for genomerecognition (Welsh & McClelland 1990; Klein-Lankhorstet al. 1991). Mendeliananalysis shows that the polymorphisms usually consist ofthe absence or presence of an amplification product from a site on the genome,possibly reflecting base substitutions in the primer attachmentsites. The dominantinheritanceofRAPDs makes them less convenientgenetic markersthan the co-dominantly inheritedRFLPs, but RAPD amplification products can be mapped (Williams et al. 1990; Quiros et al. 1991). They can also be used as presence/absence charactersforphylogenetic analysis. Populations of theCalifornian autogamousannual, Microseriselegans (Asteraceae), are geographically isolatedfromeachotherandconsist essentially ofonelocal ‘biotype’ each.Themorphologicalandgeneticdifferencesamongthesebiotypeshavebeenanalysed ininbredstrainsderivedfromvariouspopulations. Thevirtualabsenceofheterozygosity inthesestrainssimplifiestheircomparison withdominantRAPDmarkers(Van Heusden & Bachmann 1992a). These revealed monophyletic groups of strains, of which one represented populations that were not geographically nearest neighbours. Apparently, populations can arise after long-distance fruit dispersal to unoccupied sites withinthe range ofthe species and then resemble theirsource populations more than theirneigh- bours. Theclose genetic relationship among thesepopulations could not beconcluded frommorphologicalcharacters, butonceitwasknown,itcouldbe detectedinthefirsttwo factors ofaprincipalcomponentsanalysis ofthemorphological variation(Bachmann & van Heusden 1992). Similaranalyses of the closely related species, M. bigelovii (Van Heusden&Bachmann 1992b)and M.pygmaea(Van Heusden& Bachmann 1992c) have 378 K. BACHMANN revealedsubtledifferencesamongthe species bothin the degree ofrecombination(gene flow) amongpopulations and in the overall variationwithin the species. The method seems to be very useful for the analysis ofintraspecific variation, atleast in inbreeding species. However,VanHeusden& Bachmann(1992c) haveshownthatthereare RAPDs forvarioustaxonomiclevelsofanalysis. Inacomparison ofthetwo mostdivergentstrains ofeachofthreespecies ofMicroseriswith22 primers, 143bands were sharedby atleast two strains.Ofthese,6, 12,and 14amplification products were sharedby thetwo strains ofeach ofthe three species and are thereforecandidatesfor species identificationwith DNA.Forty-nine amplification products werepresentin all six strainsand are possibly informativeathigher taxonomiclevels. RAPD polymorphisms are typical molecularmarkers. They are individually oflittle value, but can become immensely valuablewhen they give informationabout many uniquelocirandomly distributedacross thegenome.Themethodisrelativelynewandnot yetfullytested. Infact,theoreticalpredictions basedonthe assumption that theprimers recognize and reveal allcomplementary sites inthe genomeare frequently not met. For instance, combinationsoftwo primersusually donotproduce thebandsofthetwo single- primerpatternsplusnewones(Klein-Lankhorst etal. 1991;Williamsetal.inpress), anda combinationoftwo template DNAs doesnot necessarily produce thecombined bands amplified fromthe single templates(Williamset al. in press). Moreover,ifamplification products are isolated froma gel, labelledandhybridized to ablot ofthis gel, morebands are labelledthanare seenin theoriginal ethidiumbromidestainedgel (Williams etal. in press; K.Buchmann, unpublished data).Thissuggeststhatthereis somesortofselection amongvarious potential sites foramplification during the amplification process. More- over, it pays to rememberthat the bands revealed by RAPD primers (and unspecific fingerprint probes) may come fromany genomeincluding thoseof insects and molds infecting theplants under study. Withinlimits and with relevant controls, though, the methodis idealfora quickand easy genetic survey. As theamplification products canbe isolatedandcloned,themethodcanbe convertedintoa rigorous RFLPtest. CONCLUSIONS AND OUTLOOK Thegenetic variationinnuclear-DNAmarkers covers theentirerangefrompolymorphisms amongrepeated sequenceswithinasingle plantto evolutionary stability overlong geolo- gical periods. The discussion ofthe ribosomalRNA genes has shown that all levels of variability can occur within differentparts of a single gene.Various genomic fractions revealed by the differentmethods have their typical levels of variability. The examples citedabovehaveshownmuchvariationwithintheexpected rangeofeachfraction, andthe level at whicha particular marker is informativein a specific plant groupneeds to be empirically determined. Most ofthe recent studies in plant evolution employing nuclear-DNA markers use them atthe level ofspecies and populations. There they complement cpDNA databy revealing variationat all levels of genetic relationship, by their biparental inheritance and bytheirgenetic linkage withthe genes involvedin morphological andphysiological evolution.Withnuclearmarkers, theparental genomesin diploid andpolyploid hybrids canberecognized (Arnold etal. 1990;Kochertetal. 1991; Zhang& Dvorak 1991).When polymorphisms in the nuclearDNA ofa diploid parental species are foundback inthe allotetraploid derivative, they suggesta multiple origin ofthe tetraploid (Soltis & Soltis 1991).
Description: