
Functional Constraints and rbcL Evidence for Land Plant Phylogeny PDF

34 Pages·1994·13.1 MB·


FUNCTIONAL CONSTRAINTS AND rbcL EVIDENCE FOR LAND PLANT PHYLOGENY 1

Victor A. Albert,2,3 Anders Backlund,2 Kåre Bremer,2 Mark W. Chase,4 James R. Manhart,5 Brent D. Mishler,6 and Kevin C. Nixon7

Abstract

Although the proportion of "functional" DNA in eukaryotic genomes is both debatable and subject to definition, most sequences gathered for phylogenetic purposes are indisputably functional. For example, patterns of variation are likely to be strongly constrained in ribosomal RNAs because of their structural and catalytic roles in protein translation, and in protein-coding genes, because of protein function itself. Although seemingly obvious, these concerns are usually ignored by workers producing gene trees. We have examined the extent of functional constraints in land-plant rbcL sequences. Not only do rbcL sequences appear to change with essentially clocklike regularity, but nucleotide-based cladograms imply that approximately 97.5% of codon changes on internal branches are functionally neutral (i.e., synonymous or functionally labile). From this perspective, rbcL evolution appears to be strongly constrained by function. Transforming nucleotide data into ad hoc string recognitions alters the size of the unit character sufficiently to highlight "blocks" of conservative information that may or may not be functionally constrained. Simultaneous cladistic analysis of all available evidence will highlight the proportion of congruent information, despite diverse functional constraints among the characters analyzed. We demonstrate the strength of this approach using different forms of the same rbcL evidence (i.e., nucleotides, strings, or amino acids) in combination with the seed-plant data of Nixon et al.

1 We thank [...], Peter Engström, [...], and Diana Lipscomb for ideas, advice, and discussion, and Mick Richardson for comments on an earlier draft of the manuscript; Crepet, Else Marie Friis, and Stevenson for permission to use unpublished versions of their data matrix produced in collaboration with KCN; and Bill Anderson for last-minute discussion of Malpighiaceae biogeography. All interpretations are, of course, our own. The program for constructing random strings was written by Karl-König Königsson and Rolf Staflin. Support from [...] Research Council (to VAA [...]) and the U.S. National Science Foundation (BSR-8906496 to MWC [...] VAA, BSR-8906126 to JRM, BSR-9107484 to BDM) is gratefully acknowledged. Lastly, thanks to the symposium organizers, travel arrangement staff, and editorial office of the Missouri Botanical Garden for their courteous and patient assistance.
2 Department of Systematic Botany, Uppsala University, Villavägen 6, S-752 36 Uppsala, Sweden.
3 Department of Physiological Botany, Uppsala University, Villavägen 6, S-752 36 Uppsala, Sweden.
4 [...] of Molecular Systematics, Royal Botanic Gardens, Kew, Richmond, Surrey TW9 3AB, United Kingdom.
5 Department of Biology, Texas A&M University, College Station, Texas 77843, U.S.A.
6 Department of Integrative Biology, University of California, Berkeley, California 94720, U.S.A.
7 L. H. Bailey Hortorium, Cornell University, Ithaca, New York 14853, U.S.A.

Ann. Missouri Bot. Gard. 81(3). 1994.

Diversification of the major clades of extant land plants probably dates from the Silurian to Cretaceous. During the Silurian-Devonian, liverworts, hornworts, mosses, and tracheophytes formed distinct lineages. Differentiation of the tracheophyte clades, notably angiosperms and other seed plants, began by the Devonian. The estimation of land-plant phylogeny, a research goal spanning over 400 million years of cladogenesis and extinction, is no simple task. For example, many groups lack strong morphological similarities that might suggest patterns of relationship.

Recent years have seen an explosion of interest in molecular information, with its promise of easily interpreted similarities for bridging otherwise large phenotypic gaps. In particular, the plastid rbcL gene (which encodes the large subunit of RuBisCO: ribulose-1,5-bisphosphate carboxylase/oxygenase, a primary enzyme in carbon fixation) has been sequenced extensively, with primary emphasis on the angiosperms (Clegg, 1993; Chase et al., 1993). Arguing from expected synonymous substitutions per site under a particular rate assumption, Clegg (1993) suggested that rbcL sequences should be phylogenetically informative for the time interval 400-100 million years before present. We argue here that this and similar assertions are incomplete. From direct estimation of total substitutions (as optimized on cladograms; see Albert et al., 1992a, 1993; Albert & Mishler, 1992) we will demonstrate that divergence-time asymmetries among taxa restrict rbcL-based hypotheses of land-plant phylogeny far more than do rate asymmetries.

We have examined the internal stability of land-plant rbcL evidence through conversion of nucleotide information into different data forms, including presence/absence of ad hoc nucleotide strings. Cladograms produced from nucleotide, string, and translated amino acid data are only partially congruent. Character optimization on both nucleotide and string trees reveals extensive functional conservation through the predominance of silent changes and labile (function-conserving) amino acid replacements. Hence, rbcL nucleotides are no less functionally constrained than morphological characters (contra Olmstead, 1989; Sytsma et al., 1991; Clegg, 1993).

Although the separation of protein-functional from cladogenetic history may not be entirely possible, the extent to which functional history reflects phylogeny might be assessed through congruence studies with characters expected to carry diverse patterns of functional constraints. As such, we have performed total-evidence analyses at the seed-plant level using, as a "constant," a new matrix of primarily morphological data (Nixon et al., 1994, this issue). It emerges that combination of rbcL nucleotide, amino acid, or string data with this matrix produces highly compatible hypotheses. These studies point to (i) the commonality of information in different data forms representing the same evidence, and (ii) the power of simultaneous evaluation of all available evidence and weakness of further production of rbcL gene trees (cf. Kluge, 1989; Barrett et al., 1991; Donoghue & Sanderson, 1992; Jones et al., 1993; Mishler, 1994).

The Rate "Problem"

As has been pointed out in several recent papers, sequence change in the rbcL gene is not strictly clocklike (Albert et al., 1992a; Bousquet et al., 1992; Gaut et al., 1992; Clegg, 1993). Here, we provide a number of new comparisons (Table 1) based on patristic distances between woody taxon pairs from Search II of Chase et al. (1993). It is clear that our own estimates and those of other workers all fall within a very narrow range of absolutely low values. The mean rate per taxon pair investigated here is approximately 2 x 10^-10 total substitutions per site per year; Wendel & Albert (1992) estimated 5-7 x 10^-10 for three herbaceous-pair comparisons. Lineage-specific rate differences were found by Bousquet et al. (1992) and in the relative-rate tests of Gaut et al. (1992), but absolute rate estimates do not differ substantially from our own findings. Thus, whereas rbcL data cannot be considered perfectly ultrametric (i.e., satisfying a clock assumption), the small range of absolute variation suggests that some predictions of the clock hypothesis still apply. For example, the relationship between time and the accumulation of nucleotide substitutions may be nearly linear. We term this condition, apparently characterizing rbcL sequence data, "quasi-ultrametric."

Table 1. "Phylogenetic" estimation of total substitution rate for 19 woody-taxon pairs. The rate of sequence divergence was calculated as per-site divergence (the patristic distance, Dp, divided by the number of nucleotides compared) divided by time since cladogenesis (Albert et al., 1992a). Average rates for individual taxa are half of the values shown. Data are from Search II of Chase et al. (1993); systematic error associated with that analysis can be expected to affect all calculations equally. Divergence time assumptions are based upon geologic dates associated with vicariant disjunctions (with the exception of all Arecaceae comparisons, which follow from the arguments of Wilson et al., 1990).

Taxon pair (family) | Area | Divergence time assumption | Dp | Divergence rate (subst./site/year, taxon pair)
Callitris rhomboidea R. Br. ex Rich. / Widdringtonia cedarbergensis Marsh (Cupressaceae) | Australia / Africa | 100 My (a) | 55 | 3.85 x 10^-10
Metasequoia glyptostroboides Hu & W. C. Cheng / Sequoiadendron giganteum (Lindl.) J. Buchholz (Taxodiaceae) | Asia / N. America | 40 My (b) | 16 | 2.80 x 10^-10
Illicium parviflorum Michx. ex Vent. / Austrobaileya scandens C. T. White (Illiciaceae/Austrobaileyaceae) | N. America/Asia / Australia | 200 My (c) | 54 | 1.89 x 10^-10
Drimys winteri J. R. & G. Forst. / Belliolum sp. (Winteraceae) | S. America / New Caledonia | 100 My | 21 | 1.47 x 10^-10
Drimys winteri J. R. & G. Forst. / Tasmannia insipida DC. (Winteraceae) | S. America / Tasmania | 100 My | 14 | 0.98 x 10^-10
Canella winteriana (L.) Gaertn. / Belliolum sp. (Canellaceae/Winteraceae) | N. America / New Caledonia | 200 My | 78 | 2.73 x 10^-10
Canella winteriana (L.) Gaertn. / Tasmannia insipida DC. (Canellaceae/Winteraceae) | N. America / Tasmania | 200 My | 67 | 2.35 x 10^-10
Liriodendron tulipifera L. / Liriodendron chinense (Hemsl.) Sarg. (Magnoliaceae) | N. America / Asia | 40 My | 10 | 1.75 x 10^-10
Calycanthus chinensis Cheng & S. T. Chang / Idiospermum australiense (Diels) S. T. Blake (Calycanthaceae/Idiospermaceae) | Asia/N. America / Australia | 200 My | 28 | 0.98 x 10^-10
Chimonanthus praecox (L.) Link / Idiospermum australiense (Diels) S. T. Blake (Calycanthaceae/Idiospermaceae) | Asia / Australia | 200 My | 24 | 0.84 x 10^-10
Chamaedorea costaricana Oerst. / Drymophloeus subdistichus (H. E. Moore) H. E. Moore (Arecaceae) | Americas / S. Pacific | 60 My (d) | 15 | 1.75 x 10^-10
Chamaedorea costaricana Oerst. / Nypa fruticans Wurmb (Arecaceae) | Americas / S. Pacific/India | 60 My | 20 | 2.33 x 10^-10
Serenoa repens (Bartram) Small / Drymophloeus subdistichus (H. E. Moore) H. E. Moore (Arecaceae) | Americas / S. Pacific | 60 My | 18 | 2.10 x 10^-10
Serenoa repens (Bartram) Small / Nypa fruticans Wurmb (Arecaceae) | Americas / S. Pacific/India | 60 My | 23 | 2.68 x 10^-10
Betula nigra L. / Casuarina litorea L. (Betulaceae/Casuarinaceae) | N. Hemisphere / Australia | 200 My | 35 | 1.23 x 10^-10
Nothofagus dombeyi (Mirb.) Oerst. / Nothofagus balansae (Baill.) Steenis (Nothofagaceae) | S. America / New Caledonia | 100 My | 30 | 2.10 x 10^-10
Galphimia gracilis Bartl. / Acridocarpus natalitius A. Juss. (Malpighiaceae) | S.-N. America (e) / Africa/Madagascar/New Caledonia | 100 My | 34 | 2.38 x 10^-10
Dicella nucifera Chodat / Acridocarpus natalitius A. Juss. (Malpighiaceae) | S. America / Africa/Madagascar/New Caledonia | 100 My | 33 | 2.31 x 10^-10
Mascagnia stannea (Griseb.) Nied. / Acridocarpus natalitius A. Juss. (Malpighiaceae) | S.-N. America (e) / Africa/Madagascar/New Caledonia | 100 My | 34 | 2.38 x 10^-10
Range: 3.01 x 10^-10. Mean: 2.05 x 10^-10. S.D.: 0.75 x 10^-10.

(a) Standard time figure used to represent the breakup of Gondwana (rounded to the nearest 100 My (million years) from 130 My, as estimated using Terra Mobilis 2.1 by C. R. Denham and C. R. Scotese; see Wendel & Albert, 1992: 137).
(b) Standard time figure (ca. early Oligocene) used to represent disruption of the boreotropical interchange between North America and Eurasia (see Lavin & Luckow, 1993).
(c) Standard time figure used to represent separation of the Northern and Southern Hemispheres upon the breakup of Pangaea (rounded to the nearest 100 My from 160 My, as estimated using Terra Mobilis 2.1 by C. R. Denham and C. R. Scotese; see Wendel & Albert, 1992: 137).
(d) Divergence date used by Wilson et al. (1990), based on the fossil record.
(e) North American Malpighiaceae are here interpreted as representing range expansion from South America.

Quasi-ultrametricity has several important implications. One is that the extent of sequence divergence in a given taxon sampling should roughly reflect the timing of underlying cladogenetic events. If all such events are ancient, extensive sequence differences among all taxa are to be expected (Fig. 1; cf. Donoghue & Sanderson, 1992, fig. 15.3). If some cladogenetic events are ancient whereas others are much more recent, expected sequence divergence in a data set would be prominently skewed (Fig. 2). As these properties become extreme, parsimony analysis will be hampered by the increased probability of parallel changes among either anciently diverged or divergence-time-asymmetric sequences (Figs. 1, 2; cf. Donoghue & Sanderson, 1992: 347-349). Given that A, T, G, and C are the only character-state alternatives, either scenario is likely to produce patterns of cladistic similarity that may be nonhomologous and therefore cladograms that are ahistorical. This is precisely the "long branches attract" issue raised by Felsenstein (1978) and others.

Although asymmetrical rates of sequence change are often invoked to explain branch attraction behavior (see Clegg and Zurawski, 1992: 10, with reference to rbcL), the problem is better defined in terms of both rate and divergence time as their product, per-character change: the λ of Albert et al. (1992a, 1993; Albert & Mishler, 1992; cf. Hendy & Penny, 1989). With quasi-ultrametric data, rate asymmetry is unimportant in this regard: the time through which a branch exists becomes the central factor. As such, our expectation of the performance of parsimony analysis on rbcL data must include our ability to estimate both the absolute and relative timing of cladogenetic events inherent to particular data matrices. Of course, this may not always be possible.

An additional implication of quasi-ultrametricity is the near satisfaction of selective neutrality. A molecular clock is predicted by the neutral theory of molecular evolution; equal rates of mutation and fixation are the expectation (see Kimura, 1983; Nei, 1987). Quasi-ultrametric data may imply selection coefficients very close to neutrality. Remembering that the underlying premise of selective neutrality is the neutral effect of point mutations, nearly clocklike sequence evolution should involve a large proportion of such changes, fixed as effectively neutral substitutions. Such substitutions would be expected to be mainly synonymous (i.e., silent with respect to amino acids) and, with regard to amino acid replacements, functionally conservative (i.e., labile). Quasi-ultrametricity in rbcL nucleotide sequences is thus an expected manifestation of strong constraints on protein function.9

Unit Characters and Functional Constraints

As recently reviewed by Clegg (1993), a number of systematic and evolutionary studies have relied solely on rbcL sequence variation. Such analyses make the implicit assumption that rbcL nucleotides are independent and potentially informative markers of cladogenetic events. As discussed above with respect to total rates of change,8 if all branching events under consideration are relatively recent, parsimony analysis may be expected to proceed with a reduced probability of spurious branch attraction because of the absolutely lower expected sequence divergence and relatively lower associated likelihood of character-state parallelism.

8 Clegg (1993) [...] only on synonymous rates for rbcL; note that total substitution rates are relevant to cladistic methods because all informative variation is considered.
9 Assuming that purifying selection eliminates mutations deleterious to protein function and that f is the fraction of such mutations, the neutral theory may be reformulated as S = (1 - f)v. Here S is the substitution rate and v the mutation rate (after Nei, 1987: 52, 1).

FIGURES 1 and 2. Patterns of historical versus spurious similarity resulting from symmetrically ancient and asymmetrical time-samples. In both cases, time-sample refers to the nodes on these imaginary trees. In (1), all nodes are essentially time-coincident at 400 My, so the "true tree" appears polytomous. In (2), the cladogenetic events indicated occur asymmetrically with respect to time, ranging from 400 to 50 My since divergence. Possible patterns of nucleotide change are indicated by the filled and open rectangles; the former represent unadulterated markers of cladogenetic history, whereas the latter represent spurious character-state similarity resulting, e.g., from multiple nucleotide substitutions. In (1), these patterns of similarity are approximately equal in extent (because of nearly clocklike substitutional behavior) but are in partial conflict with each other; parsimony analysis may include resolutions containing some proportion of ahistorical evidence or even alternatives comprising totally spurious patterns. This might be the expectation if taxa A through E were, e.g., Isoetes, Selaginella, Psilotum, Equisetum, and Angiopteris.
In (2), which approximates the situation in simultaneous studies of sporing and seed plants, the problems of (1) are only partially alleviated. Patterns of convergent similarity between the oldest taxa, A and B, will result in most parsimonious reconstructions that pair these taxa spuriously. As divergence time becomes shallower, the reduced likelihood of multiple changes at sites will insure that D and E are paired historically. Although C is linked with (D, E) by "true" similarity, this relationship may be broken by false similarities between B and C as well as between [...] and (C, D, E). In summary, comparing only anciently diverged lineages with rbcL may suggest patterns of relationship that represent a hopelessly even mixture of historically reliable and nonreliable similarity. Likewise, comparison of ancient and recently diverged clades may have the same problem near the base while being relatively more consistent near the tips. This condition may characterize the rbcL-based results shown in this paper.

This "time-sampling" strategy has been employed in circumscribed studies ranging from particular angiosperm groups (e.g., Conti et al., 1993; Kron & Chase, 1993; Rodman et al., 1993) to seed plants as a whole (Chase et al., 1993). Here, a "time sample" refers to the nodes rather than the terminals on an imaginary tree; as such, a time sampling is the collection of absolute and relative timings of underlying cladogenetic events in a data matrix. Of course, the nodes of a cladogram are not discernible a priori to analysis, but their absolute and relative timing may be estimated by external criteria (e.g., the fossil record; cf. Norell & Novacek, 1992).

Initial attempts to analyze time samples beyond angiosperms and other seed plants (i.e., including sporing plants; Albert et al., 1992b) resulted in cladistic patterns familiar from studies based on ribosomal DNA (rDNA) (e.g., monophyletic gymnosperms or combinations of gymnosperm lineages, a seed-plant "root" at the Gnetales, an angiosperm "root" at the monocots; see Troitsky et al., 1991; Zimmer et al., 1989; Hamby & Zimmer, 1992). These results, however, are in conflict with cladistic studies based on morphological characters (see below). Ribosomal RNAs, with their structural and catalytic roles in protein translation, are obviously under enormous functional constraints. Like rbcL, rDNAs may also exhibit nearly clocklike substitutional behavior in those positions that are "free" to vary. If absolute rates of change approximate the low [...] estimated for rbcL, analysis of corresponding time samples might be expected to result in corresponding patterns of homologous and [...] similarity, and therefore similar hierarchic reconstructions (cf. Donoghue & Sanderson, 1992: 349).

To gain insight into the topological effects of vastly asymmetrical time samples (see Fig. 2), we have combined rbcL information from "bryophytes," "pteridophytes," "gymnosperms," and angiosperms (Table 2). If the substitutional process is effectively clocklike among these taxa, the effects of functional constraints in land-plant rbcL evolution should be discernible (as may be spurious branch attractions; see The Rate "Problem," above); we explore this cladistically using both primary nucleotide data as well as ad hoc nucleotide strings. The rbcL data are also examined at the amino acid level for hierarchic compatibility with the nucleotide and string evidence.

Table 2. rbcL sequences used for data transformation and cladistic analysis. These are listed by taxon and by GenBank accession number and/or literature reference where sequence data first appeared. Voucher information, where available, is given by these sources.

Taxon | GenBank accession or literature reference
Conocephalum conicum (L.) Lindb. | Mishler et al., 1994
Lophocolea heterophylla (Schrad.) Dumort. | Mishler et al., 1994
Anthoceros punctatus L. | Mishler et al., 1994
Andreaeobryum macrosporum Steere & B. Murray | Mishler et al., 1994
Ophioglossum engelmannii Prantl | L11058 (J. R. Manhart, in press)
Psilotum nudum (L.) P. Beauv. | L11059 (J. R. Manhart, in press)
Isoetes melanopoda J. Gay & Durieu | L11054 (J. R. Manhart, in press)
Lycopodium digitatum A. Br. | L11055 (J. R. Manhart, in press)
Angiopteris evecta (G. Forst.) Hoffm. | L11052 (J. R. Manhart, in press)
Equisetum arvense L. | L11053 (J. R. Manhart, in press)
Selaginella sp. | L11280 (J. R. Manhart, in press)
Botrychium biternatum (Sav.) Underwood | L13474 (J. R. Manhart, in press)
Taxus x media | Chase et al., 1993
Taxodium distichum (L.) Rich. | Soltis et al., 1992
Podocarpus gracilior Pilg. | X58135 (Bousquet et al., 1992)
Ginkgo biloba L. | Chase et al., 1993
Cycas revoluta | B. Schutzman s.n., FLAS (M. W. Chase, unpublished)
Stangeria eriopus (Kunze) Baill. | Chase et al., 1993
Zamia inermis Vovides, J. D. Rees & M. Vasquez-Torres | L12683 (Chase et al., 1993)
Ephedra tweediana C. A. Mey. | L12677 (Chase et al., 1993)
Welwitschia mirabilis Hook. f. | Chase et al., 1993 (G. R. Furnier)
Gnetum gnemon L. | L12680 (Chase et al., 1993)
Chloranthus japonicus Siebold | L12640 (Chase et al., 1993)
Piper betle L. | L12660 (Chase et al., 1993)
Tasmannia (Drimys) insipida DC. | L01957 (Albert et al., 1992c)
Calycanthus chinensis Cheng & S. T. Chang | L12635 (Chase et al., 1993)
Eupomatia bennettii F. Muell. | L12644 (Chase et al., 1993)
Magnolia macrophylla L. | Golenberg et al., 1990
Persea americana Mill. | Golenberg et al., 1990
Trochodendron aralioides Siebold & Zucc. | L01958 (Albert et al., 1992c)
Ceratophyllum demersum L. | M77030 (Les et al., 1991) plus nucleotides 1184-1428 from Qiu et al., 1993
Nymphaea odorata Aiton | M77035 (Les et al., 1991) plus nucleotides 1184-1428 from Qiu et al., 1993
Lilium superbum L. | L12682 (Albert et al., 1992a)
Platanus occidentalis L. | L01943 (Albert et al., 1992c)
Caltha palustris L. | L02431 (Albert et al., 1992c)
Dillenia indica L. | L01903 (Albert et al., 1992c)
Chrysolepis (Castanopsis) sempervirens (Kellogg) Hjelmq. | Chase et al., 1993
Betula nigra L. | L01889 (Albert et al., 1992c)
Casuarina litorea L. | L01893 (Albert et al., 1992c)
Hamamelis mollis Oliv. | L01922 (Albert et al., 1992c)

NUCLEOTIDES

The nucleotide is the smallest unit character available in DNA information. With only four states possible at any given site, nucleotide data are subject to parallelism among sequences when the number of changes per site, λ (= rate x time), becomes large. Unlike some morphological characters, nucleotide data are usually analyzed cladistically with no assumed transformation series (i.e., nonadditive steps; Fitch, 1971). For such procedures, Albert et al. (1993) examined the potential for spurious branch attraction under Felsenstein's (1978) simplified four-taxon scenario. State-change probabilities with Jukes-Cantor (Jukes & Cantor, 1969) and Kimura 2-parameter (Kimura, 1980) corrections for multiple changes at sites were considered in addition to observed changes only because of the prospect of reducing character-state parallelisms. All calculations indicated a very small parameter region under which branch attraction could be expected, provided that λ values remained small (i.e., less than approximately 0.1; see Albert et al., 1992a). For quasi-ultrametric data, differences in λ values must principally result from divergence time differences.

The bryophyte lineages examined here could easily be pre-Silurian; the pteridophytes no later than Devonian; the seed-plants appearing by the Carboniferous; the angiosperms by the Cretaceous, followed by their diversification through the Tertiary, a time range potentially spanning 500-5 million years before present. Thus, even without a priori knowledge of precise divergence times, it is reasonable to approximate upper and lower λ-bounds from this range and our estimates of total sequence divergence. The mean rate for woody taxa (Table 1), averaged for single lineages by halving the divergence value, is approximately 1.0 x 10^-10 nucleotide substitutions per site per year. Similarly, the estimates for herbaceous taxa (Wendel & Albert, 1992) range between 2.5-3.5 x 10^-10. Assuming that bryophytes and pteridophytes fall into the range 1.0-3.5 x 10^-10 as well, λ values are estimated to lie between 0.05-0.175 (500 My) and 0.0005-0.00175 (5 My). On a four-taxon tree, some combinations of these values would yield spurious branch attractions (see Albert et al., 1993). Here, we are working with 40 taxa and a greater potential for inconsistent results (see Penny et al., 1991).

Data analysis. Nucleotide sequences (unambiguously aligned by sight and excluding the 30 5'-most positions, which incorporated only primer information for some taxa; Table 2) were analyzed with PAUP 3.1.1 (Swofford, 1993) using the Fitch criterion (Fitch, 1971; cf. Albert et al., 1993) with ACCTRAN (accelerated transformation) optimization (Farris, 1970; Swofford & Maddison, 1987). The heuristic search option was used with 100 random replicates of data addition sequence, COLLAPSE, MULPARS, and TBR (tree bisection-reconnection) branch-swapping. The consistency and retention indices (C and R, respectively; Kluge & Farris, 1969; Farris, 1989a) were also calculated. Five hundred fifteen nucleotide positions showed patterns of similarity among taxa.

Eight equally parsimonious cladograms were found (C = 0.362 (including all data), R = 0.523). The strict and combinable component consensus trees (Bremer, 1990) were identical (see Fig. 3). All trees indicate that (i) hornworts are nested inside the tracheophyte clade, (ii) lycopods rather than ferns plus Equisetum represent the sister group to seed plants, (iii) Gnetales represent the sister group of all other seed plants, (iv) conifers, Ginkgo, and cycads form the monophyletic sister group to angiosperms, and (v) monocots are basalmost in the angiosperms, followed by Piper. Characteristics (iii) and (iv) are shared with the rDNA analysis of Hamby & Zimmer (1992) but not with the morphological analyses of Crane (1985), Doyle & Donoghue (1986, 1992), Loconte & Stevenson (1990), and Nixon et al. (1994). Characteristic (i) is in conflict with both morphological and molecular cladistic studies (Mishler & Churchill, 1985; Mishler et al., 1994, this issue). Characteristic (ii) contrasts both with morphological data (Bremer, 1985) and with the chloroplast genome structural findings of Raubeson & Jansen (1992) that link all tracheophytes except the lycopods, which have the plesiomorphic (i.e., liverwortlike) state. Characteristic (v) contrasts with the results of morphological (Donoghue & Doyle, 1989; Loconte & Stevenson, 1991; Taylor & Hickey, 1992) and some rDNA (Hamby & Zimmer, 1992; cf. Zimmer et al., 1989) analyses.

FIGURES 3-5. Combinable component consensus trees summarizing the results of parsimony analyses of rbcL evidence as (3) nucleotide, (4) string, and (5) amino acid data. For (3), the strict consensus is identical [...].

Function and phylogeny. Needless to say, not all of the above observations can represent the truth about land-plant history. The groups found in the nucleotide-based parsimony analysis (Fig. 3) may well reflect historical reality, but the nature of that reality could be other than strictly phylogenetic. From our argument about nearly clocklike rates and the functional constraints that may produce them, it is reasonable to suppose that some or even all of the branchings depicted in Figure 3 may reflect primarily spurious similarities rather than phylogenetic homologies. We have examined possible constraints on rbcL evolution by the amino acid changes implied on the branches of one of the eight equally most-parsimonious trees (Appendix I).
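The arithmetic behind Table 1 and the λ bounds quoted under NUCLEOTIDES can be retraced with a short script. The sketch below is illustrative only and is not the authors' program: the function names are hypothetical, and the figure of 1428 compared nucleotides is an assumption inferred from the table values (it reproduces every rate in Table 1) rather than a number stated in the caption.

```python
# Sketch of the Table 1 rate calculation and the lambda (= rate x time) bounds.
# SITES_COMPARED is an assumption inferred from the published rates.
SITES_COMPARED = 1428

# (taxon pair, patristic distance Dp, assumed divergence time in My), from Table 1
PAIRS = [
    ("Callitris / Widdringtonia", 55, 100),
    ("Metasequoia / Sequoiadendron", 16, 40),
    ("Illicium / Austrobaileya", 54, 200),
    ("Drimys / Belliolum", 21, 100),
    ("Drimys / Tasmannia", 14, 100),
    ("Canella / Belliolum", 78, 200),
    ("Canella / Tasmannia", 67, 200),
    ("Liriodendron tulipifera / L. chinense", 10, 40),
    ("Calycanthus / Idiospermum", 28, 200),
    ("Chimonanthus / Idiospermum", 24, 200),
    ("Chamaedorea / Drymophloeus", 15, 60),
    ("Chamaedorea / Nypa", 20, 60),
    ("Serenoa / Drymophloeus", 18, 60),
    ("Serenoa / Nypa", 23, 60),
    ("Betula / Casuarina", 35, 200),
    ("Nothofagus dombeyi / N. balansae", 30, 100),
    ("Galphimia / Acridocarpus", 34, 100),
    ("Dicella / Acridocarpus", 33, 100),
    ("Mascagnia / Acridocarpus", 34, 100),
]


def pair_rate(distance, time_my, sites=SITES_COMPARED):
    """Per-site divergence divided by time: substitutions/site/year for a pair."""
    return distance / sites / (time_my * 1e6)


def lambda_bounds(rate_low, rate_high, time_my):
    """Expected changes per site (rate x time) for a range of per-lineage rates."""
    years = time_my * 1e6
    return rate_low * years, rate_high * years


rates = [pair_rate(d, t) for _, d, t in PAIRS]
mean_pair_rate = sum(rates) / len(rates)
mean_lineage_rate = mean_pair_rate / 2  # single lineages: half the pair value

if __name__ == "__main__":
    for (name, _, _), rate in zip(PAIRS, rates):
        print(f"{name}: {rate:.2e}")
    print(f"mean per pair: {mean_pair_rate:.2e}")
    print(f"mean per lineage: {mean_lineage_rate:.2e}")
    for time_my in (500, 5):
        low, high = lambda_bounds(1.0e-10, 3.5e-10, time_my)
        print(f"lambda at {time_my} My: {low:g} to {high:g}")
```

Under these assumptions the mean pairwise rate comes out at about 2.05 x 10^-10, matching the table, and the per-lineage range of 1.0-3.5 x 10^-10 gives the quoted λ bounds at 500 My and 5 My.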
As summarized in Table 3, over 84% of the inferred nucleotide substitutions on internal branches are silent with regard to amino acid identity. The percentage of nucleotide changes incurring functionally labile amino acid replacements (judged using the PAM-250 log-odds matrix of Dayhoff et al., 1978: 352; see Table 3) amounts to an additional 13%. Viewed as a whole, 97.5% of all synapomorphous nucleotide changes are expected to have little or no effect on protein function. With a maximum of only 2.5% of these changes incurring non-labile amino acid replacements of potential structural/functional distinction (see Table 3), rbcL sequences appear heavily burdened by forces leading to functional conservation.10 Thus, the challenge for land-plant cladistics is to determine how strongly functionally constrained variation may also reflect phylogenetic patterns.

STRINGS

The ideal "unit" character in phylogenetic analysis is one that truly evolves as an independent unit, meaning one that independently undergoes transformations from one condition to another that are hierarchically correlated (i.e., congruent; cf. Farris, 1969) with those of other such characters. For molecular data, this may often be the individual nucleotide, but possibly also a contiguous length of DNA in an insertion/deletion event, several non-contiguous nucleotide positions that are functionally associated (e.g., because of higher order RNA or protein structure), a unique codon for a functionally constrained amino acid, or a whole chromosome in a karyological change. It is of course difficult to assess such possibilities a priori, but it is nonetheless important to begin to develop methods to examine the issue empirically.

We have thus examined some means by which the functional/phylogenetic evidence manifest in a given set of rbcL sequences might be represented by data forms other than nucleotide character states. The nucleotide is indeed the smallest unit character in rbcL evidence, but it is not necessarily the most informative nor most consistent. First, nonadditive optimization of multistate characters may restrict potential topological resolution (e.g., a 4-state, nonadditive character can have minimum homoplasy if optimized as three autapomorphies). Additionally, direct analysis of nucleotide sequences from protein-coding genes ignores constraints imposed both by the genetic code and protein function; codon positions may be both intra- and inter-correlated (Fitch & Markowitz, 1970; Fitch, 1986).

A data transformation that may overcome these shortcomings stems from the early comparison of oligonucleotide catalogues (and even whole chromosomes; see Farris, 1978; Fox et al., 1980; Bremer & Bremer, 1989) prior to the DNA sequencing revolution: production of ad hoc nucleotide strings. Our procedure (analogous to generating mapped restriction site data) may be outlined thus: (i) generate strings of random A, T, G, and C content varying randomly in size between 6 and 21 base pairs (so that a minimum and maximum of two and seven codons are included), (ii) scan rbcL sequence data for the presence/absence of given strings, (iii) record recognitions by both base position and taxon, (iv) treat multiple positional recognitions by a given search string separately, and (v) treat all recognitions found in two or more taxa as binary characters for cladistic analysis (sequences that have missing information at a string position are coded accordingly). Another procedure for producing string data from nucleotide sequences has been developed by J. S. Farris (unpublished); sequences are subdivided into a prespecified number of string characters ("supersites"), each of which is assigned as many states as is necessary to explain observed variation. Farris's method guarantees both a complete transformation of the entire sequence as well as the non-overlap of string characters, unlike the approach used here (see below and Appendix II).

The net effect of transforming sequences into strings is twofold: (i) it incorporates more information (in terms of nucleotides or codons spanned) in a larger unit character, and (ii) it decreases the probability that a given set of independent gains of the same character-state are represented in the data (although, in parsimony analyses, binary characters are more subject to spurious branch attraction than are nonadditive multistate characters; Albert et al., 1993). As with mapped restriction site data, the probabilities of gain versus loss of a recognition are highly asymmetrical, with parallel gain the least likely transformation series (Templeton, 1983; DeBry & Slade, 1985; Albert et al., [...]). Therefore, string data may contain historical markers much less likely to engage in branch attraction (which occurs because of accumulated parallelisms; cf. Felsenstein, 1978; Hendy & Penny,

10 Patterns of codon usage intrinsic to the primary nucleotide matrix are also suggestive of functional constraints; these are discussed in a separate paper (Albert, Backlund & Bremer, in press).

Table 3. Analysis of character support for internal branches of tree #1 (of 8) from the nucleotide analysis. "Node" refers to the node numbers on the reference tree of Appendix I.
changes" refers to the total number of nucleotide changes optimized onto a branch. "Constant" indicates that the nucleotide site belongs in a codon position that codes for the same amino acid throughout the entire matrix. "No change" indicates that the nucleotide belongs in a codon position that codes for two or more amino acids throughout the matrix, but that the particular site change indicated at this node does not cause a change in amino acid sequence. "Labile" means that the inferred change in amino acid due to the observed change in nucleotide sequence likely to happen by random chance or is PAM-250 better (according to the log-odds matrix of Dayhoff et al., 1978: 352). "Potentially nonlabile" indicates that at least one of the potential amino acid changes inferred from a particular nucleotide position is not likely to happen by random, but that there also are some changes in the same character that are likely to happen by random rhance or better. "Nonlabile" means that inferred acid changes (often only one) occur at less than random chance. all Potentially # Node No changes Constant change Labile nonlabile Nonlabile 78-77 42 22 4 8 5 3 77 76 24 13 6 4 1 76-71 27 13 9 3 2 71 70 29 19 9 1 70-42 40 24 5 11 42-41 33 26 5 1 1 70-69 42 17 16 8 1 69-66 29 21 8 66-48 34 15 13 5 1 48-44 25 10 12 2 1 44-43 29 19 8 2 48-47 15 8 7 47-46 24 14 7 3 46-45 11 4 4 3 66-65 56 34 15 7 65-64 26 13 10 3 64-63 18 11 6 1 63-54 5 2 3 54-53 4 3 1 53-51 10 3 5 1 1 51-49 9 4 2 3 51-50 8 5 2 1 53-52 4 11 5 2 63-62 16 5 11 62-61 14 6 7 1 61-59 8 2 4 2 59-58 4 17 8 5 58-57 13 6 4 3 57-56 33 20 6 7 56-55 6 3 2 1 6] 60 8 5 2 1 69 68 8 3 58 29 18 68 67 45 24 17 4 3 76 75 34 20 4 7 75 74 38 23 12 2 1 74 73 45 28 14 3 73- 72 65 43 12 9 1 2 13 11 951 529 272 126 1.37% 1.16 100.00% 55.63% 28.60% 13.25% 84 .23% 97.48%
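The five-step string procedure outlined under STRINGS can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' software: the function names (`random_strings`, `find_positions`, `string_characters`) are hypothetical, the sequences are assumed to be pre-aligned and of equal length so that base positions are comparable across taxa, and coding a taxon as "?" when its window contains a missing-data symbol is one possible reading of "coded accordingly."

```python
import random

BASES = "ATGC"
MISSING = set("N?-")  # assumed symbols for missing or ambiguous data


def random_strings(count, min_len=6, max_len=21, seed=0):
    """Step (i): random search strings of 6-21 bases (two to seven codons)."""
    rng = random.Random(seed)
    return [
        "".join(rng.choice(BASES) for _ in range(rng.randint(min_len, max_len)))
        for _ in range(count)
    ]


def find_positions(sequence, probe):
    """Step (ii): every 0-based start offset of probe, overlaps included."""
    hits, start = [], sequence.find(probe)
    while start != -1:
        hits.append(start)
        start = sequence.find(probe, start + 1)
    return hits


def string_characters(alignment, probes):
    """Steps (iii)-(v): binary characters keyed by (probe, position)."""
    # (iii) record each recognition by base position and by taxon;
    # (iv) the same probe at different positions yields separate keys.
    recognitions = {}
    for taxon, sequence in alignment.items():
        for probe in probes:
            for pos in find_positions(sequence, probe):
                recognitions.setdefault((probe, pos), set()).add(taxon)

    # (v) keep only recognitions shared by two or more taxa.
    characters = {}
    for (probe, pos), taxa in sorted(recognitions.items()):
        if len(taxa) < 2:
            continue
        column = {}
        for taxon, sequence in alignment.items():
            window = sequence[pos:pos + len(probe)]
            if taxon in taxa:
                column[taxon] = "1"
            elif MISSING & set(window):
                column[taxon] = "?"  # assumption: missing data scored as unknown
            else:
                column[taxon] = "0"
        characters[(probe, pos)] = column
    return characters


# Toy data (invented for illustration; not real rbcL sequences).
toy_alignment = {
    "taxonA": "ATGTCACCACAAACAGAG",
    "taxonB": "ATGTCACCACAAACAGAA",
    "taxonC": "ATGNNNCCACAAACAGAG",
}
toy_probes = ["TCACCA", "ACAGAG", "AACAGAA"]
for key, column in string_characters(toy_alignment, toy_probes).items():
    print(key, column)
```

In practice the probes would come from `random_strings`; fixed probes are used in the toy run only so that the recognitions are easy to follow by eye. Because probes are drawn independently, recognitions may overlap, which is the property that distinguishes this approach from Farris's non-overlapping "supersites."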

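The percentages quoted in the text (over 84% silent, an additional 13% labile, 97.5% overall, and at most 2.5% non-labile) follow directly from the totals row of Table 3. A minimal check, using only those totals; the variable and function names are ours:

```python
# Column totals from the last rows of Table 3.
totals = {
    "Constant": 529,
    "No change": 272,
    "Labile": 126,
    "Potentially nonlabile": 13,
    "Nonlabile": 11,
}
all_changes = sum(totals.values())  # should equal the "# changes" total


def percent(count):
    """Share of all optimized nucleotide changes, rounded to two decimals."""
    return round(100 * count / all_changes, 2)


silent = totals["Constant"] + totals["No change"]
neutral = silent + totals["Labile"]
constrained = totals["Potentially nonlabile"] + totals["Nonlabile"]

print(all_changes, percent(silent), percent(neutral), percent(constrained))
```

The last figure combines the "Potentially nonlabile" and "Nonlabile" columns, which is why the text describes it as a maximum.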