Research article

Collodictyon—An Ancient Lineage in the Tree of Eukaryotes

Sen Zhao,†,1 Fabien Burki,†,2 Jon Bråte,1 Patrick J. Keeling,2 Dag Klaveness,1 and Kamran Shalchian-Tabrizi*,1

1 Microbial Evolution Research Group, Department of Biology, University of Oslo, Oslo, Norway
2 Canadian Institute for Advanced Research, Botany Department, University of British Columbia, Vancouver, British Columbia, Canada
† These authors contributed equally to this work.
* Corresponding author: E-mail: [email protected].
Associate editor: Hervé Philippe

Abstract

The current consensus for the eukaryote tree of life consists of several large assemblages (supergroups) that are hypothesized to describe the existing diversity. Phylogenomic analyses have shed light on the evolutionary relationships within and between supergroups as well as placed newly sequenced enigmatic species close to known lineages. Yet, a few eukaryote species remain of unknown origin and could represent key evolutionary forms for inferring ancient genomic and cellular characteristics of eukaryotes. Here, we investigate the evolutionary origin of the poorly studied protist Collodictyon (subphylum Diphyllatia) by sequencing a cDNA library as well as the 18S and 28S ribosomal DNA (rDNA) genes. Phylogenomic trees inferred from 124 genes placed Collodictyon close to the bifurcation of the "unikont" and "bikont" groups, either alone or as sister to the potentially contentious excavate Malawimonas. Phylogenies based on rDNA genes confirmed that Collodictyon is closely related to another genus, Diphylleia, and revealed a very low diversity in environmental DNA samples. The early and distinct origin of Collodictyon suggests that it constitutes a new lineage in the global eukaryote phylogeny. Collodictyon shares cellular characteristics with Excavata and Amoebozoa, such as ventral feeding groove supported by microtubular structures and the ability to form thin and broad pseudopods. These may therefore be ancient morphological features among eukaryotes. Overall, this shows that Collodictyon is a key lineage to understand early eukaryote evolution.

Key words: 18S and 28S rDNA, Collodictyon, Diphyllatia, tree of life, phylogenomics, cDNA, pyrosequencing.

© The Author(s) 2012. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Mol. Biol. Evol. 29(6):1557–1568. 2012 doi:10.1093/molbev/mss001 Advance Access publication January 6, 2012

Introduction

Over the last few years, molecular sequence data have addressed some of the most intriguing questions about the eukaryote tree of life. Phylogenomic analyses have confirmed the existence of several major eukaryote groups (supergroups) as well as shown various levels of evidence for the relationships among them (Burki et al. 2007; Parfrey et al. 2010). Recently, two new large assemblages, SAR (Stramenopila, Alveolata, and Rhizaria) and CCTH (Cryptophyta, Centrohelida, Telonemia, and Haptophyta), were proposed to encompass a large fraction of the eukaryote diversity, together with the other supergroups Opisthokonta, Amoebozoa, Archaeplastida, and Excavata (Patron et al. 2007; Burki et al. 2009). Solid phylogenomic evidence supports the monophyly of Amoebozoa, Opisthokonta, Archaeplastida, and SAR (Rodriguez-Ezpeleta et al. 2007; Burki et al. 2009; Minge et al. 2009), but the monophyly of Excavata and CCTH (also called Hacrobia; Okamoto et al. 2009) remains controversial, often dependent on the selection of taxa and gene data set (Burki et al. 2009; Hampl et al. 2009; Baurain et al. 2010). Despite several attempts, the evolutionary relationships between these supergroups are still uncertain because of the ancient and complex genome histories (Simpson and Roger 2004; Parfrey et al. 2006; Roger and Simpson 2009).

Identification of sister lineages to these supergroups is crucial for resolving the eukaryote tree and understanding the early history of eukaryotes. If these key lineages exist, they may be found among the few species that harbor distinct morphological features but are of unknown evolutionary origin in single-gene phylogenies (Patterson 1999; Shalchian-Tabrizi et al. 2006; Kim et al. 2011). Indications that such enigmatic species can be placed in the eukaryote tree come from recent phylogenomic analyses. For instance, Ministeria (Opisthokonta), Breviata (Amoebozoa) and Telonemia, Centroheliozoa, and Picobiliphyta have been shown to constitute deep lineages within their respective supergroups (Shalchian-Tabrizi, Minge, et al. 2008; Burki et al. 2009; Minge et al. 2009; Yoon et al. 2011).

Here, we investigate a member of such a key lineage, Collodictyon, which was first described in 1865 (Carter 1865), but its cellular structure and outer morphology were analyzed only recently (Klaveness 1995; Brugerolle et al. 2002). Collodictyon was originally proposed to be closely related to Diphylleia and Sulcomonas and classified in the family Diphylleidae (Cavalier-Smith 1993; the synonymous family Collodictyonidae in Brugerolle et al. 2002) and subphylum Diphyllatia (Cavalier-Smith 2003). Collodictyon is an omnivorous amoeba-flagellate with a mix of cellular features that makes it unique among eukaryotes. The cell has an egg- or heart-like outline without walls or any other external ornamentation in spite of a highly vacuolated cytoplasm (Rhodes 1917; Klaveness 1995). It possesses four equally long flagella and mitochondria with unconventional tubular-shaped cristae. An important character of Collodictyon is a broad ventral feeding groove dividing the cell longitudinally. This groove is supported by both left and right microtubular roots along the entire length of the lips, similar to comparable structures in other eukaryotes such as in Excavata (Simpson 2003). It also forms pseudopods typical of Amoebozoa at the base of the groove, which are actively used for catching prey.

Despite its interesting morphological features, it remains unclear whether Collodictyon is closely related to either Excavata or Amoebozoa or to any of the other supergroups because no molecular data are available. Furthermore, the position of the closely related Diphylleia is totally unresolved in 18S ribosomal DNA (rDNA) phylogenies (Brugerolle et al. 2002; Shalchian-Tabrizi et al. 2006). In order to explore the origin of Collodictyon, we established a culture of Collodictyon triciliatum, sequenced the 18S and 28S rDNA genes, and carried out a deep survey of a cDNA library with 454 pyrosequencing. About 300,000 sequence reads were generated and used to assemble an alignment of 124 genes (27,638 amino acid characters) that covered a taxon-rich sampling of eukaryotes (79 species). To further understand the evolutionary history of this lineage, we also screened the cDNA library for the dihydrofolate reductase (DHFR) and thymidylate synthase (TS) genes and extended the DHFR gene by 3′ Rapid Amplification of cDNA Ends (RACE) and polymerase chain reaction (PCR).

Materials and Methods

Culturing, Harvesting, and cDNA Library Construction

Collodictyon triciliatum was isolated from Lake Årungen, Norway, and cultured on a modified Guillard and Lorenzen medium (Guillard and Lorenzen 1972). Collodictyon triciliatum was inoculated in a culture of the cryptomonad Plagioselmis nannoplanktica (Klaveness 1995; Shalchian-Tabrizi, Bråte, et al. 2008). cDNA libraries were constructed by Vertis Biotechnology AG (Freising, Germany) according to their random-primed cDNA protocol: Total RNA was extracted with the mirVana RNA isolation kit (Ambion, Austin, TX), and poly(A)+ RNA was isolated from the total RNA. First-strand cDNA synthesis was performed with randomized primers, and second-strand cDNA was synthesized using the Gubler and Hoffman protocol (Gubler and Hoffman 1983). Double-stranded DNA (dsDNA) was blunted, and 454 GS FLX adapters A and B were ligated to its 5′ and 3′ ends. dsDNA carrying both adapters was selected and amplified with PCR (24 cycles). Differently expressed genes were normalized with a method developed by Vertis Biotechnology AG. cDNA in the size range of 250–600 bp was eluted from a preparative agarose gel and sequenced by the Norwegian ultra-high throughput sequencing service unit at the University of Oslo and Macrogen Inc (South Korea), yielding a total of 300,000 sequence reads.

Sequence Analysis

All the 454 pyrosequencing reads were assembled into contigs using Newbler v2.5 (Margulies et al. 2005) with default parameters. We retrieved contigs larger than 200 bp with significant similarity to genes recently used in a multigene phylogeny (Burki et al. 2010). The translated contigs were screened by BlastP using our single-gene sequences as queries, and the homologous copies (e value < 1 × 10⁻²⁰) were added to the single-gene data set. These new sequences were automatically aligned by Mafft with the linsi algorithm (Katoh et al. 2002), and ambiguously aligned positions were removed using Gblocks (Castresana 2000) with half of the gapped positions allowed, the minimum number of sequences for a conserved and a flank position set to 50% of the number of taxa, the maximum of contiguous nonconserved positions set to 12, and the minimum length of a block set to 5. The orthology and possible contamination in each single-gene alignment were assessed by maximum likelihood (ML) reconstructions with 100 bootstrap replicates using RAxML v7.2.6 under the PROTCATLGF substitution model (Stamatakis 2006), followed by visual evaluation of the resulting individual trees. For several single genes (i.e., prmt8, tubb, rpsa, suclg1, tcp1-beta, hsp90, ubc, and crfg), the PROTGAMMALGF model was used in addition to the PROTCATLGF model for better identification of the orthology. We used published global eukaryotic trees such as in Rodriguez-Ezpeleta et al. (2007) and Burki et al. (2009) as framework to identify and remove the sequences that showed unexpected grouping and were supported with more than 70% bootstrap in the single-gene trees. In order to identify hidden paralogs in the data, we added more taxa in the single-gene phylogenetic analyses than in analyses of the supermatrix. Deletion of long-branch taxa (i.e., Trichomonas, Giardia, and Spironucleus) was done in a subsample of the single-gene alignments, but it did not change the phylogeny or the bootstrap values significantly. Hence, although inclusion of fast-evolving species could potentially introduce systematic errors in the trees, these types of taxa seemed not to strongly impact our paralog identification. Importantly, we included gene sequences from the cryptomonad Guillardia theta in all alignments in order to phylogenetically distinguish sequences from Collodictyon and its prey (P. nannoplanktica). This left in total 124 single-gene alignments containing Collodictyon sequences that were used for further analyses. The concatenation of the 124 single genes was done by Scafos (Roure et al. 2007) and amounted to 27,638 amino acid positions with average missing characters 34.4% (for detail, see supplementary table S2, Supplementary Material online). The sequences generated here were submitted to GenBank with accession number JN618831–JN618979. The single-gene trees and alignments as well as the concatenated alignment are available at http://www.mn.uio.no/bio/english/people/aca/kamran/.

Phylogeny of rDNA and Multigene Alignments

Reconstructions of ML phylogenies from 18S and 28S rDNA sequence alignments were done using RAxML v7.2.6. The best tree was determined after 100 heuristic searches starting from different random trees under the general time reversible (GTR) + GAMMA + I model. Bootstrap analyses were performed with 100 pseudoreplicates using the same model as in the initial tree search. Bayesian analyses were done with MrBayes v3.1.2 (Huelsenbeck and Ronquist 2001) under the GTR + GAMMA + I + COV evolutionary model that accounts for covarion substitution pattern across the sequences. Two independent runs, each starting from a random tree for Markov chain Monte Carlo (MCMC) chains, were run for 6,000,000 (18S rDNA) and 4,000,000 (18S + 28S rDNA) generations and sampled every 100 generations. Posterior probabilities and average branch lengths were calculated from the consensus of trees sampled after burn-in set to 3,000,000 (18S rDNA) and 1,000,000 (18S + 28S rDNA) generations. Chains were considered to be convergent when the average split frequency was lower than 0.01.

Several concatenated protein alignments with different taxonomic compositions were constructed to investigate the influence of species sampling and missing data on the phylogeny of Collodictyon. Phylogenies were inferred by ML and Bayesian approaches, as implemented in RAxML v7.2.6 and Phylobayes v3.2 (Lartillot and Philippe 2004), respectively. Following both the Akaike information criterion and the likelihood ratio test computed with ProtTest 3.0 (Darriba et al. 2011), the optimal model LG + GAMMA + F available in RAxML v.7.2.6 was chosen to infer ML trees. The best ML topology was determined in heuristic searches from ten random starting trees. Due to computational burden, statistical support was evaluated with 100 bootstrap replicates under the PROTCATLGF model that approximates the gamma distribution for site-rate variation (Stamatakis et al. 2008). Bayesian inferences were done with the CAT site-heterogeneous mixture model. Two independent MCMC chains in PhyloBayes starting from random trees were run for 24,000 cycles with trees being sampled every cycle. Consensus topology and posterior probability (PP) values were calculated from saved trees after burn-in. Convergence between the two chains was ascertained by examining the difference in frequency for all their bipartitions (maxdiff < 0.15). In addition, a bootstrap analysis under the CAT model was performed on 100 pseudoreplicates generated by Seqboot (Phylip package; Felsenstein 2001). For each replicate, two Phylobayes MCMC chains were run for 5,000 cycles with a conservative burn-in of 2,000 cycles. Manual verification of 10% randomly chosen replicates showed that the burn-in was optimal between 1,000 and 2,000 cycles. Consense (Phylip package) was used to calculate the bootstrap support based on these 100 Bayesian consensus trees.

Testing Robustness of Trees by Removal of Fast-Evolving Sites

We applied the AIR package (Kumar et al. 2009; Yang 2007) to estimate evolutionary rates of sites under the Whelan and Goldman + GAMMA model. The ML topology constructed from a sample of 76 taxa (i.e., removal of two Malawimonas species and Collodictyon) was used as starting tree for the estimate of site rates. The rationale for choosing this topology was to ensure that the site rates were calculated independently of the evolutionary affinity between these two lineages and their positions in the tree. The sites were then removed in 5% intervals (i.e., removal of the 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, and 50% fastest evolving sites) from a full alignment that contained the two Malawimonas species and Collodictyon (i.e., 79 taxa) and an alignment where only the two Malawimonas species were removed (i.e., 77 taxa). The bootstrap values (BP) for the nodes defining the supergroups as well as for the position of Collodictyon and Malawimonas were inferred from each of these processed alignments by RAxML v7.2.6 under the PROTCATLGF model (with 100 bootstrap replicates). These trimmed alignments were then used for the estimation of amino acid composition (see supplementary materials and methods, Supplementary Material online). All bioinformatics analyses were done on the Bioportal at the University of Oslo (www.bioportal.uio.no; Kumar et al. 2009).

Topology Comparisons

Topology testing was performed using the approximately unbiased (AU) test (Shimodaira 2002). For each tested tree, site likelihoods were calculated using RAxML v7.2.6 with the PROTGAMMALGF model, and the AU test was performed using CONSEL (Shimodaira and Hasegawa 2001).

3′ RACE and Sequencing of the DHFR-TS Genes

All assembled contigs were used as queries in BLAST search against the nonredundant protein sequences database available at NCBI. Three contigs (contig15348, contig15349, and contig06264) showed a significant similarity to the DHFR gene (e value < 1 × 10⁻¹⁰). In order to verify that these contigs belong to Collodictyon and not the prey, we designed forward and reverse primers, then different combinations of primers were used to amplify genomic DNA from three cultures: 1) P. nannoplanktica (PN), 2) P. nannoplanktica + C. triciliatum (PN + CT), and 3) Chlorella pyrenoidosa + C. triciliatum (CP + CT). Bands were observed on the agarose gel solely when using forward primer in contig15348 and reverse primer in contig15349 for PCR amplification from PN + CT and CP + CT cultures. Both sequences were identical and matched the 3′-end region of contig15348 and the 5′-end region of contig15349. Since identical sequences were only obtained in the cultures containing Collodictyon, it confirmed that these two contigs corresponded to the Collodictyon gene, not the Plagioselmis or Chlorella one. Total RNA was isolated from PN + CT cultures with
Reticulomyxa filosa Allogromia sp. Minchinia teredinis Urosporidium crescens Gromia oviformis 1.0/77 Phagomyxa oLdEoMnDte0ll5a2e Plasmodiophora brassicae 1.0/62 Bodomorpha RmTin5iimina20 -/84 LKM30 BOLA383 Lecythium sp. LKM48 1.0/63 Gymnophrys cometa 1.0/63 RT5iin4 N-Por Acanthometra sp. Arthracanthida sp. Chaunacanthida sp. 0.84/- SyDmHp1h4y7aEcKanDt1h7id a 1.0/56 AT4.94 OLI11032 SAR Acrosphaera sp. CR6A Sphaerozoum punctatum 0.87/- AT8.54 Bacillaria paxillifer 0.63/- 0.83/- RT7iinO2LIC1i1li0o2p5h rys infusionum 0.99/- Achlya bisexualis BOLA515 0.69/- DH148EKD53 BAQA072 OLI51105 0.88/- CCaSfe.Ete0r4ia5 roenbergensis OLI11066 OLI11150 0.78/59 0.79/75 RT 5iin25 CS.E036 Diplophry s sp. 0.66/61 Labyrinthuloides minutaUlkenia profunda Gonyaulax spinifera 0.88/- 1.0/- ColpNoodcetillluac pao snctiicnatillans Loxophyllum utriculare Oxytricha nova 0.55/- Cryptosporidium parvum OLI11056 Emiliania hOuLxIl1e1y0i72 Haptophyta OLI11007 -/- Pavlova salina Telonema subtilis RCC404.5 0.73/- RA001219.10 0.63/- TReAlo0n1e0m41a2 s.u1b7tilis RCC358.7 Telonemia RA000412.136 0.81/- BL010625.25 Telonema antarcticum 0.79/- AT4.11 Amastigomonas debruynei Apusozoa -/- AT4.50 Apusomonas proboscidea 0.98/- Chlamydaster sterni Raphidiophrys ambigua Centrohelida TCS 2002 0.70/- GlaucCocyyasntoisp tnyocshteo cghloineeoacryusmtis Glaucophyta 0.81/- Cyanophora paradoxa Goniomonas truncata Cryptophyta 1.0/74 GuillarCdoiam tphseotap ogon coeruleus Rhodophyta Glaucosphaera vacuolata 0.84/- RT1n14cul 0.51/- SRchTe5riifnfVe2lo ialv doux bciaarteri Viridiplantae OLI11059 OLI11305 RT5iin8 Helianthus annuus LKM101 Trypanosoma cruzi AT4.56 AT1.3 -/- 0.69/57 Ichthyobodo necator AT 4.96 DH148EKB1 Diplonema ambulator 0.79/- Euglena gracilis RT8n7 Paravahlkampfia ustiana RT5in38 0.96/- BOLA212 Naegleria gruberi BOLA458 C2.E026 0.59/- C3.E012 CS.R003 0.87/- DH148EKD18 0.99/- BOLA048 DH145EKD11 0.68/- 1.0/- CS.E022 Jakoba liberaReclinomonas americana Excavata 0.99/68 UJankcoubltuar iendc aeruckeararytaote clone 
BOLA187 Uncultured eukaryote clone BOLA366 Breviata anathema Malawimonas jakobiformis 0.89/- OStxryembloonmaass stipx. strix 0.40/- 0.89/- Trimastix pyrifTorrmimisastix marina C1.E027 0.53/76 DUipnhcyullletuiare rdo tCaonlslodictyonidae partial Retortamonas sp. Diphyllatia Collodictyon triciliatum Amoeba proteus 0.53/- Leptomyxa retiBcuOlaLtAa868 Hartmannella vermiformis LKM74 0.95/- Mayorella sp. Amoebozoa Acanthamoeba castellanii 1.0/78 Platyamoeba stenopodia RT5iin44 LEMD267 Filamoeba nolandi 0.99/- Phalansterium solitarium 0.63/- Filobasidiella neoformans Saccharomyces cerevisiae 0.57/50 Monosiga brevicollis Opisthokonta Nuclearia simplex 0.99/68 Lumbricus rubellus Podocoryne carnea 0.2 FIG.1.18SrDNAphylogenyoftheDiphyllatiaspeciesCollodictyontriciliatum(highlightedbyblackbox)andDiphylleiarotans.Thetopology wasreconstructedbyMrBayesv3.1.2undertheGTRþGAMMAþIþcovarionmodel.Posteriorprobabilities(PP)andMLbootstrapsupports (BP,inferredbyRAxMLv7.1.2underGTRþGAMMAþImodel)areshownatthenodes.ThicklinesindicatePP.0.90andBP.80%.Dashes ‘‘-’’ indicate PP, 0.5 orBP , 50%. Afew long branches are shortened by50% (/) or 75%(//). 1560 MBE Phylogenomics of Collodictyon · doi:10.1093/molbev/mss001 Bigelowiella sp. Thaumatomonas sp. Paracercomonas marina Aureococcus anophagefferens 0.97/57 Pelagomonas calceolata Apedinella radians 0.81/- Rhizochromulina cf. 
marina Dictyocha speculum 0.98/- Glossomastix chrysoplasta Pinguiococcus pyrenoidosus 1.00/68 Tribonema aequale 0.75/85 Dictyota dichotoma 0.63/- Fucus gardneri Laminaria angustata Scytosiphon lomentaria SAR Choristocarpus tenellus 0.84/59 Skeletonema pseudocostatum Cylindrotheca closterium 0.97/63 Rhizosolenia setigera Ochromonas danica Chrysolepidomonas dendrolepidota Synura sphagnicola Mallomonas rasilis Hyphochytrium catenoides 0.90/- Phytophthora megasperma Cryptosporidium parvum Toxoplasma gondii 0.83/- CochPleordkininiusums paotllaynkrtiickuosides Paramecium tetraurelia Phaeocystis antarctica Haptophyta Centroheliozoa sp. Prymnesium patelliferum Centrohelida 0.97/- Cyanophora paradoxa Glaucophyta 0.84/- Glaucocystis nostochinearum Roombia truncata Goniomonas sp. Goniomonas truncata Cryptophyta Guillardia theta 0.80/- Cryptomonas paramecium Bangia fuscopurpurea Rhodophyta Cyanidioschyzon merolae 0.76/- Chlamydomonas pulsatilla 0.99/- 0.80/- 0.91/68 PeOdoiagsatrmuomc hdluapmleyxs zimbabwiensis Chlorella vulgaris Viridiplantae Pseudochlorella sp. 0.98/74 Arabidopsis thaliana 0.97/- Marchantia polymorpha Closterium selenastrum 0.90/0.60 Closterium ehrenbergii Chara globularis Histiona aroides Excavata 0.99/45 Reclinomonas americana Trimastix pyriformis Diphyllatia Collodictyon triciliatum 1.00/56 RhizamoebaA scaaxnothnaicmaoeba castellanii Amoebozoa Hartmannella vermiformis Apusomonas proboscidea Amastigomonas bermudensis Apusozoa Ancyromonas sigmoides Planomonas micra 0.85/- Filobasidiella neoformans Amanita bisporigera 1.00/79 Mucor racemosus Saccharomyces cerevisiae 1.00/- 0.65/61 Aspergillus oryzae Schizosaccharomyces pombe Pneumocystis carinii Nuclearia simplex Glomus mosseae Opisthokonta 0.66/- Antipathes galapagensis 0.90/68 Chironex fleckeri Beroe ovata Leucosolenia sp. 
Suberites ficus 0.96/61 Monosiga brevicollis Capsaspora owczarzaki 0.53/61 Ichthyophonus hoferi 0.03 FIG.2.18Sþ28SrDNAphylogenyofCollodictyontriciliatum(highlightedbyblackbox)reconstructedwithMrBayesv3.1.2undertheGTRþ GAMMAþIþcovarionmodel.NumbersatnodesarePPandMLbootstrapvalues(BP,inferredbyRAxMLv7.2.6undertheGTRþGAMMAþ I model). Thick lines show PP . 0.9 and BP . 80%. Nodes marked with symbol ‘‘-’’ indicate BP , 50% or PP , 0.5. Some branches are shortened byhalf in ordertosave space(marked with‘‘/’’). theRNAqueous-MicroKit(Ambion,Austin,TX)following Results and Discussion thestandardprotocol.The3#RACEsystemfromInvitro- Collodictyon Is an Ancient and Distinct Eukaryote gen (Carlsbad, CA) was performed to obtain the full- length 3#-end of the DHFR cDNA. Two specific forward Lineage primers (DHFR1F: 5#-CGAGTGCGTTGAATGATTCGT- InordertoclarifytheoriginofCollodictyon,wefirstobtained CAAA-3# and DHFR2F: 5#-CTCAATGTTATTGTCAG- the18SrDNAsequenceforC.triciliatum.Phylogeneticanal- CAGCACT-3#), together with a universal reverse ysisrecoveredmostoftheeukaryotesupergroupsasmono- primer (AUAP: 5#-GGCCACGCGTCGACTAGTAC-3#), phyleticclades,exceptCCTHandArchaeplastida,congruent were used in a two-step protocol to improve the speci- with several recent reports (fig. 1; Burki et al. 2007, 2008; ficityoftheamplificationprocess.ThePCRproductswere Yoonetal.2008;Hampletal.2009).Moreinterestingly,this sequencedtovalidatewhethertheDHFRgeneandtheTS phylogenyrobustlysupportedCollodictyonandDiphylleiaas gene were fused or not (GenBank accession number: sister lineages with 100% bootstrap support (BP) and 1.00 JN618830). posterior probabilities (PP), confirming that these two 1561 MBE Zhao et al. · doi:10.1093/molbev/mss001 Table 1.Maximum likelihood bootstrapvalues (ML)andbayesian posteriorprobabilities (Bayes) ofthe EukaryoteSupergroups in the Phylogenomic Trees. 
79Taxa 74Taxaa 77Taxab 72Taxac 20% 20% AllSites Removedd AllSites Removedd AllSites AllSites Nodee Groups ML Bayes ML Bayes ML ML Bayes ML Bayes ML A Opisthokonta 100 1.00 100 1.00 100 100 1.00 100 1.00 100 B Unikonts 79 0.99 99 1.00 87 57 0.99 96 1.00 58 C Amoebozoa 86 1.00f 100 1.00 100 84 1.00f 100 1.00 100 D Collodictyon1Malawimonas 86 0.79 98 0.63 94 NA NA NA NA NA E Excavata 100 1.00 100 1.00 100 100 1.00 100 1.00 100 F Bikonts 98 1.00 98 1.00 95 98 1.00 100 1.00 93 G Archaeplastida - 0.98 63 0.84 - - 0.99 71 0.95 - H Archaeplastida1CCTH1SAR - 1.00 81 1.00 - - 1.00 88 1.00 - I CCTH - * 54 * - 50 * 60 * - J SAR 98 1.00 100 1.00 96 99 1.00 100 1.00 96 NOTE.—‘‘-’’indicatebootstrapvalues,50%orPP,0.5;‘‘*’’indicatethatCCTH(Cryptophyta,Centrohelida,Telonemia,andHaptophyta)isnotmonophyletic. aFivetaxa(Leishmania,Trypanosoma,Sawyeria,Entamoeba,andBreviata)wereremoved. bTwoMalawimonastaxawereremoved. cTwoMalawimonastaxaandfivetaxa(Leishmania,Trypanosoma,Sawyeria,Entamoeba,andBreviata)wereremoved. dRemovalofthe20%fastestevolvingsitesfromthealignment. eThecapitalletterscorrespondtosupergroupsmarkedinfigure3. fBreviataissistertoOpisthokonta(fig.3). speciesindeedarecloselyrelated.Inanattempttoenrichthe CCTH, but these were instead placed as a sister to Opis- speciesdiversityforthisgroupandestimatetheirpotential thokonta (0.75 PP) and SAR (0.91 PP). Of much interest, abundance and diversity in nature, we searched for our analyses showed that Collodictyon branched outside Collodictyon-like 18S rDNA sequences by blastn against any of the major lineages (fig. 3A and supplementary fig. 
theenvironmentaldatabaseinNCBI.TwentyofthetopBlast S1A,SupplementaryMaterialonline),morespecificallyat hits were used for phylogenetic analysis, but only a single the bifurcation of the so-called ‘‘unikonts’’ (Amoebozoa partial sequence grouped with Diphylleia (results not and Opisthokonta) and ‘‘bikonts’’ (Archaeplastida, SAR, shown),suggestingalowdiversityandabundanceoftheDi- Excavata,CCTH;thetermsunikontsandbikontsareused phyllatia in the environment. This partial sequence was in- here for simplicity and do not refer to their original cluded in the 18S phylogeny (fig. 1). description; Stechmann and Cavalier-Smith 2002; Roger To improve the rDNA tree, we also sequenced the 28S and Simpson 2009). Although Collodictyon did not fall rDNAgeneforCollodictyonandreconstructedacombined within any of the supergroups, an affinity to another 18S þ 28S rDNA phylogeny (fig. 2). This tree showed Col- enigmatic genus Malawimonas was recovered with 0.79 lodictyonasadeeplineagewithpossibleaffinitytoExcavata PP and 86% BP. with 45% BP and 0.99 PP. Interestingly, our data did not To test whether the deep position of Collodictyonwas show any affiliation to Apusozoa, even though this group stable or instead sensitive to taxonomic sampling, we has been proposed to be closely related to Collodictyon performed several taxon removal experiments, but (Cavalier-Smith 2003). Instead, the 18S þ 28S rDNA tree Collodictyon was consistently recovered in the same po- suggested Apusomonas to be sister to Amoebozoa (56% sition. Most interestingly, the position of Collodictyon in BP and 1.00 PP), although Ancyromonas grouped with the global eukaryote phylogeny remained identical when the Opisthokonta (,50% BP and 1.00 PP). Malawimonas was removed from our alignment (fig. 3B Because our 18S and 18S þ 28S rDNA trees suggested and supplementary fig. 
S1B, Supplementary Material on- that Collodictyon might have diverged very early in eu- line).Itwasstillplacedclosetothesplitbetweenunikonts karyote evolution and that these two genes alone were andbikonts,suggestingthatthispositionwasnotcaused not sufficient to infer ancient relationships, we sought by erroneous attraction to Malawimonas or other Exca- to increase the phylogenetic signal by constructing an vataspecies(i.e.,Trimastix;seesupplementaryfig.S2,Sup- alignment of 124 protein-coding genes and 79 taxa. Phy- plementary Material online). The high statistical support logenomic trees inferred with both Bayesian and ML forthebikontgrouprecoveredwiththisreduceddataset methods consistently recovered most eukaryote super- stronglyexcludedCollodictyonfrombeingmemberofthis groups as in recent studies (Rodriguez-Ezpeleta et al. assemblage (bikonts: BP 5 98% and PP 5 1.00). On the 2007; Burki et al. 2009; Hampl et al. 2009), generally with other hand, removing Malawimonas lowered the boot- highstatisticalsupport(table1).Differingfrompublished strap support for the unikonts (BP 5 57% and PP 5 phylogenies (Burki et al. 2009; Minge et al. 2009), the 0.99; table 1), pointing to a possible attraction between Bayesian inference (fig. 
3A) did not recover Breviata as Collodictyonandthisothermajorgroup.Inordertoeval- sistertoAmoebozoaandTelonemadidnotbranchwithin uatethepotentialimpactofmissingdataontheposition 1562 MBE Phylogenomics of Collodictyon · doi:10.1093/molbev/mss001 A B Missing number Missing of genes characters (%) Homo Homo Mus Mus Gallus Gallus Danio Danio Drosophila Drosophila Strongylocentrotus Strongylocentrotus Nematostella Nematostella Monosiga Monosiga Opisthokonta A 0.99 CapSsapshpaoerraoforma A CapSspahspaoerraoforma Ustilago Ustilago Cryptococcus Cryptococcus Schizosaccharomyces Schizosaccharomyces Neurospora 0.93 Neurospora B 0.75 PBhaytcraocmhyoccehsytrium B PBhaytcraocmhyoccehsytrium * C 0.990.88 AcanthaHmaoretmbaanBnreelvlaiata C 0.990.92 AcanthaHmaortembaaBnrneevlliaata 0.86 PhysaruDMmicatsytoigsatemliouemba 0.89 PhysarDMumiacstytiogsatmeliouemba Amoebozoa D CollodictyonEntamoeba Entamoeba * 0.79 ‘Malawimonas californiana’ Collodictyon Diphyllatia Malawimonas jakobiformis Histiona * Histiona Reclinomonas E ReclJinaokmoboanas E SecuJlaakmoobnaas Seculamonas Stachyamoeba Excavata * Stachyamoeba 0.98 Naegleria 0.99 Naegleria Sawyeria Euglena Sawyeria Euglena Trypanosoma Trypanosoma Leishmania Leishmania Arabidopsis Arabidopsis Solanum Solanum Oryza Oryza Pinus Pinus Physcomitrella Physcomitrella Mesostigma * Mesostigma Chlamydomonas Chlamydomonas F Volvox F Volvox Micromonas Archaeplastida Micromonas 0.99 Ostreococcus ** G 0.99 GOrCashctrioleanordicaroucscus G PoGrCprahhcyoirlnaadriraus * 0.98 0.99 PGoarlpdhieyrriaa Cyanidioschyzon 0.99 0.9C9yanophoGraaldieria Cyanidioschyzon Cyanophora Glaucocystis Glaucocystis Guillardia * H 0.51 GPuRlialalagprihodisidaeiolmphisrys H 0.58 EmPiRlilaaangpiihaoisdeiolmphisrys ****** I 0.09.383 IPPmarIaEysvnmolmotconvihlnaeiarTisyCaniesuieailsmorcnBoePigmmheoaalonewaosdieallcatylum I 0.09.182 PPIrmyaImavslnonotcevoThasnCeriiuyaleosmrniBscePoigmmheaaoloenwoadiseallcatylum CCTH 0.91 AurTeohcaolacscsuiossira AurTehoacloacscsuiossira * Laminaria 
Laminaria * J PPhyythtoiupmhStchhoirzaochytrium J PPhyythtoiupmShtchhoirzaochytrium SAR Alexandrium Alexandrium * Karlodinium Karlodinium * Oxyrrhis Oxyrrhis Perkinsus Perkinsus Toxoplasma Toxoplasma Cryptosporidium Cryptosporidium Tetrahymena Tetrahymena Paramecium Paramecium 100 75 50 250255075100 0.1 0.1 FIG.3.PhylogenomicpositionofCollodictyoninferredfrom124genesundertheCATmixturemodelinPhyloBayesv3.2.Branchesthatreceived 1.00PParemarkedbyfilledcircles.ThebranchlengthofEntamoebaisshortenedby50%tosavespace.(A)Treetopologyconstructedwith79 taxafromthesaved18,000treesafterdiscardingthefirst6,000cyclesasburn-in(maxdiff50.137).Missingdataforeachtaxonisshownas a color barplot (left bar: missing number of genes; right bar: missing percentage of characters). Bars marked by ‘‘ ’’ indicate the missing * percentageofcharactersisover60%ofthefull-lengthalignment.(B)Treetopologyconstructedwith77taxa(i.e.,twoMalawimonasexcluded) from the saved 16,000 trees after discarding first 8,000 cycles as burn-in (maxdiff 5 0.083). CCTH is the abbreviation of Cryptophyta, Centrohelida, Telonemia, and Haptophyta. Additional statistical support values for the main nodes in the tree marked by capital letters in boxes arelisted intable1. of Collodictyon, we removed taxa with more than 60% branching within unikonts or bikonts using similar taxo- missingcharacters(fig.3A).Thephylogeniesinferredfrom nomic sampling as reported by Hampl et al. 2009 and this data set showed Collodictyon in the same position, Rodriguez-Ezpeletaetal.2007(i.e.,Leishmania,Trypanosoma, which indicated that taxa with low sequence coverage Sawyeria,Entamoeba,andBreviataremoved).Again,noal- didnotaffecttheconstructionofCollodictyonphylogeny ternativepositionwasobservedforCollodictyon(seetable (supplementary figs. S3 and S4, Supplementary Material 1 and supplementary fig. S5, Supplementary Material online). Finally, we tested the possibility of Collodictyon online). 1563 MBE Zhao et al. 
FIG. 4. Changes in bootstrap support for key nodes in the inferred trees as fast-evolving sites were removed. Site rates were estimated from an alignment without two Malawimonas and Collodictyon species (76 taxa). Sites were then removed in 5% increments from alignments consisting of (A) 79 taxa (including Collodictyon and Malawimonas) and (B) 77 taxa (including Collodictyon). ML bootstrap values (BP) for Collodictyon + Malawimonas, unikonts, bikonts, and Opisthokonta (used as a reference) were calculated under the PROTCATLGF model in RAxML v7.2.6. BP values shaded by gray rectangles are listed in table 1 and supplementary figure S6 (Supplementary Material online).

All phylogenetic analyses described above were done based on a "concatenated model," without considering the evolutionary tempo and mode of each protein composing the concatenated alignment. We therefore assessed the impact of using a "separate model" that takes into account the evolutionary specificity of each gene (see supplementary materials and methods, Supplementary Material online). The topologies inferred from the separate model again recovered Collodictyon in the same position near the bifurcation of unikonts and bikonts, either alone or as sister to Malawimonas (supplementary figs. S1 and S5, Supplementary Material online). Furthermore, the separate model generated bootstrap support values similar to those of the concatenated model (see supplementary table S1, Supplementary Material online), altogether demonstrating that the phylogenetic position of Collodictyon is not an artifact caused by oversimplification of the concatenated model.

To further investigate the evolutionary origin of Collodictyon, we attempted to increase the phylogenetic versus nonphylogenetic signal ratio by removing the fastest evolving sites, which have been shown to bear the highest degree of homoplasy (Brinkmann and Philippe 1999). Because our analyses suggested that Collodictyon is excluded from the known eukaryote supergroups, we successively monitored the statistical support for unikonts and bikonts. Most notably, the bootstrap support for unikonts increased as the fastest evolving sites were removed, reaching a peak value of 96% after removing 20% of sites (table 1 and fig. 4B), whereas the bikonts remained highly supported (BP > 95%) during this experiment. Moreover, a Bayesian phylogeny constructed from the alignment with the 20% fastest evolving sites removed showed strong evidence for excluding Collodictyon from unikonts (PP = 1.00; CAT-BP = 93%) or bikonts (PP = 1.00; CAT-BP = 100%) (fig. 5 and table 1). A cross-validation test showed that the CAT model fits our data better than the LG model, with a score averaged over 10 replicates of 2451.36 ± 132.9 (all replicates favored the "CAT" model). The global phylogeny inferred from the CAT model should be favored, although both models recovered the same position of Collodictyon (fig. 5B and supplementary fig. S6B, Supplementary Material online). Hence, after the removal of the noisiest positions in our alignment, Collodictyon was robustly placed close to the bifurcation of unikonts and bikonts.

Consistent with the phylogenetic analyses mentioned above, the AU test based on the data set without the 20% fastest evolving sites rejected topologies where Collodictyon was placed within unikonts or bikonts. The same results hold true for the bikonts when the full-length alignment was used, but the possibility of Collodictyon branching within unikonts, that is, sister to Amoebozoa (P = 0.372) or Opisthokonta (P = 0.076), could not be discarded at the 5% level of significance (table 2). These two alternative trees were evaluated by comparing them with the optimal likelihood topology (supplementary fig. S1B, Supplementary Material online) under a covarion model in ProCov (Wang et al. 2009). The alternative topologies obtained substantially lower likelihood values (ΔlnL = −31 and ΔlnL = −15) than the optimal topology. Nevertheless, in order to examine other possible affinities of Collodictyon within Amoebozoa or Opisthokonta, 24 topologies where Collodictyon branched with basal lineages of unikonts were compared. Strikingly, all of them were rejected (P < 0.05), thus weakening the suspicion of a closer relationship between Collodictyon and unikonts (supplementary fig. S7, Supplementary Material online).

Relationship between Collodictyon and Malawimonas

Malawimonas has proven to be particularly challenging to place in the eukaryote tree, even with very large alignments, but it has typically been associated with Excavata based on its ultrastructure (Simpson 2003). In our analyses, Malawimonas generally branched outside of Excavata (fig. 3A, supplementary figs. S1A, S3A, and S3C, Supplementary Material online), in agreement with previous observations (Rodriguez-Ezpeleta et al. 2007; Hampl et al. 2009). Because Malawimonas grouped with Collodictyon and not with Excavata in our Bayesian and ML trees, we took a closer look at this relationship by applying several strategies. One model violation that is known to cause tree reconstruction artifacts is bias in the amino acid (AA) composition. Interestingly, our heatmap analyses showed a weak deviation from amino acid homogeneity that could partially account for the grouping of Collodictyon and Malawimonas, together with a few other taxa (supplementary fig. S8 and table S3, Supplementary Material online). Removing up to 20% of the fastest evolving sites seemed not to overcome the amino acid compositional bias (supplementary fig. S8, Supplementary Material online). However, recoding the amino acids into functional categories (Hrdy et al. 2004) still recovered the grouping of Malawimonas and Collodictyon (supplementary fig. S9, Supplementary Material online), suggesting that the bias may not significantly affect the phylogeny.

FIG. 5. Bayesian phylogeny of Collodictyon constructed from 124 genes after removal of the fastest evolving sites. The consensus topology was calculated under the CAT model from 18,000 saved trees after discarding the first 6,000 cycles as burn-in. Branches showing 1.00 PP are marked by filled circles. The branch length of Entamoeba is shortened by 50% to save space. (A) Tree topology inferred from the trimmed alignment with the 20% fastest evolving sites removed (marked by gray rectangles in fig. 4A). Chains were considered to have converged (maxdiff = 0.104). (B) Tree topology inferred from the trimmed alignment (i.e., two Malawimonas excluded) with the 20% fastest evolving sites removed (marked by gray rectangles in fig. 4B). Chains were considered to have converged (maxdiff = 0.065). Numbers at the nodes in (B) indicate PP/bootstrap values calculated from 100 pseudoreplicates with PhyloBayes under the CAT mixture model. Dashes "-" indicate bootstrap supports < 50%. CCTH is the abbreviation of Cryptophyta, Centrohelida, Telonemia, and Haptophyta. Additional statistical support values for the supergroups are shown in table 1.

Despite this apparent close relationship between them, it is important to note that the Bayesian tree inferred under the better fitted CAT model from the alignment after removing the 20% fastest evolving sites only weakly recovered Collodictyon and Malawimonas as a group (PP = 0.63; fig. 5A and table 1). Moreover, when Collodictyon and five other taxa (i.e., Leishmania, Trypanosoma, Sawyeria, Entamoeba, and Breviata) were removed from the data set, Malawimonas grouped as sister to Excavata in our ML tree (BP = 60%; supplementary fig. S5B, Supplementary Material online), in agreement with recent

Table 2. AU Test of Tree Topologies.
                                                                              P value(b)
Rank  Tree topology based on a sample of 79 taxa(a)                           All(c)      20%(d)
1     (((Opst,Amoe),(Mala,Coll)),(Exca,(Plan,(SAR,((Cryp,Hapt),TelRap)))))    0.982       0.995
2     (((Opst,Amoe),Coll),((Exca,Mala),(Plan,(SAR,((Cryp,Hapt),TelRap)))))    0.064       0.045
3     (((Amoe,(Mala,Coll)),Opst),(Exca,(Plan,(SAR,((Cryp,Hapt),TelRap)))))    0.067       0.006
4     (((Opst,(Mala,Coll)),Amoe),(Exca,(Plan,(SAR,((Cryp,Hapt),TelRap)))))    0.009       0.005
5     (((Opst,Mala),(Coll,Amoe)),(Exca,(Plan,(SAR,((Cryp,Hapt),TelRap)))))    0.007       0.010
6     (((Opst,Coll),(Mala,Amoe)),(Exca,(Plan,(SAR,((Cryp,Hapt),TelRap)))))    0.006       0.006
7     ((((Opst,Amoe),Coll),Mala),(Exca,(Plan,(SAR,((Cryp,Hapt),TelRap)))))    0.005       0.014
8     (((Opst,(Coll,Amoe)),Mala),(Exca,(Plan,(SAR,((Cryp,Hapt),TelRap)))))    0.003       2 × 10^-4
9     ((((Opst,Coll),Amoe),Mala),(Exca,(Plan,(SAR,((Cryp,Hapt),TelRap)))))    0.002       0.001
10    (((Opst,Amoe),Mala),((Exca,Coll),(Plan,(SAR,((Cryp,Hapt),TelRap)))))    3 × 10^-4   7 × 10^-5
11    (((Opst,Amoe),Mala),(Exca,((Plan,Coll),(SAR,((Cryp,Hapt),TelRap)))))    2 × 10^-4   8 × 10^-5
12    (((Opst,Amoe),Mala),(Exca,(Plan,(SAR,((Cryp,Hapt),(Coll,TelRap))))))    1 × 10^-4   0.001
13    (((Opst,Amoe),Mala),(Exca,(Plan,(SAR,(((Cryp,Coll),Hapt),TelRap)))))    8 × 10^-5   8 × 10^-6
14    (((Opst,Amoe),Mala),(Exca,(Plan,(SAR,((Cryp,(Coll,Hapt)),TelRap)))))    2 × 10^-5   6 × 10^-7
15    ((((Opst,Amoe),Coll),Mala),(Exca,(Plan,(SAR,((Cryp,Hapt),TelRap)))))    1 × 10^-5   2 × 10^-7
16    ((Opst,Amoe),((Exca,(Mala,Coll)),(Plan,(SAR,((Cryp,Hapt),TelRap)))))    6 × 10^-15  5 × 10^-14
17    ((Opst,Amoe),((Exca,Mala),Coll)),(Plan,(SAR,((Cryp,Hapt),TelRap)))))    4 × 10^-12  3 × 10^-11

                                                                              P value(b)
Rank  Tree topology based on a sample of 77 taxa(a)                           All(c)      20%(e)
      (i.e., two Malawimonas excluded)
1     (((Opst,Amoe),Coll),(Exca,(Plan,(SAR,((Cryp,Hapt),TelRap)))))           0.532       0.614
2     (((Opst,Amoe),Coll),(Exca,(SAR,(Plan,((Cryp,Hapt),TelRap)))))           0.630       0.529
3     (((Opst,Coll),Amoe),(Exca,(Plan,(SAR,((Cryp,Hapt),TelRap)))))           0.071       0.046
4     (((Opst,Coll),Amoe),(Exca,(SAR,(Plan,((Cryp,Hapt),TelRap)))))           0.076       0.045
5     ((Opst,(Amoe,Coll)),(Exca,(Plan,(SAR,((Cryp,Hapt),TelRap))))            0.284       0.040
6     ((Opst,(Amoe,Coll)),(Exca,(SAR,(Plan,((Cryp,Hapt),TelRap))))            0.372       0.037
7     ((Opst,Amoe),((Exca,Coll),(Plan,(SAR,((Cryp,Hapt),TelRap)))))           0.001       0.001
8     ((Opst,Amoe),((Exca,Coll),(SAR,(Plan,((Cryp,Hapt),TelRap)))))           0.003       2 × 10^-4
9     ((Opst,Amoe),(Exca,((Plan,Coll),(SAR,((Cryp,Hapt),TelRap)))))           2 × 10^-6   4 × 10^-7
10    ((Opst,Amoe),(Exca,((SAR,Coll),(Plan,((Cryp,Hapt),TelRap)))))           7 × 10^-6   3 × 10^-8
11    ((Opst,Amoe),(Exca,(Plan,(SAR,(((Cryp,Coll),Hapt),TelRap)))))           6 × 10^-7   1 × 10^-4
12    ((Opst,Amoe),(Exca,(SAR,(Plan,(((Cryp,Coll),Hapt),TelRap)))))           2 × 10^-5   6 × 10^-5
13    ((Opst,Amoe),(Exca,(Plan,(SAR,((Cryp,(Coll,Hapt)),TelRap)))))           6 × 10^-9   7 × 10^-39
14    ((Opst,Amoe),(Exca,(SAR,(Plan,((Cryp,(Coll,Hapt)),TelRap)))))           8 × 10^-5   2 × 10^-47
15    ((Opst,Amoe),(Exca,(Plan,(SAR,((Cryp,Hapt),(Coll,TelRap))))))           1 × 10^-74  5 × 10^-51
16    ((Opst,Amoe),(Exca,(SAR,(Plan,((Cryp,Hapt),(Coll,TelRap))))))           1 × 10^-69  8 × 10^-54
17    ((Opst,Amoe),(Exca,(Plan,((SAR,Coll),((Cryp,Hapt),TelRap)))))           2 × 10^-63  7 × 10^-40
18    ((Opst,Amoe),(Exca,(SAR,((Plan,Coll),((Cryp,Hapt),TelRap)))))           2 × 10^-51  2 × 10^-43

a The abbreviation of major groups: Opst, Opisthokonta; Amoe, Amoebozoa; Exca, Excavata; Plan, Archaeplastida; SAR, Stramenopila + Alveolata + Rhizaria; Cryp, Guillardia + Plagioselmis; Hapt, Haptophyta; TelRap, Telonemia + Raphidiophrys; Mala, Malawimonas; and Coll, Collodictyon.
b P values in which the topologies cannot be rejected at the 5% level of significance were underlined.
c P values were calculated from the original alignment (i.e., no sites removed).
d P values were calculated from the trimmed alignment with removal of the 20% fastest evolving sites (marked by gray rectangles in fig. 4A).
e P values were calculated from the trimmed alignment (i.e., two Malawimonas excluded) with removal of the 20% fastest evolving sites (marked by gray rectangles in fig. 4B).
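As a reading aid for table 2, the following Python sketch applies the 5% significance threshold to the first four topologies of the 79-taxon data set. The P values are copied from the table; the short topology descriptions, the variable names, and the helper function are informal assumptions added for illustration and are not part of the original analysis.

```python
ALPHA = 0.05  # significance level used for the AU test in the text

# (rank, informal topology label, P with all sites, P after removing the
# 20% fastest evolving sites). The labels are paraphrases, not original wording.
rows = [
    (1, "Malawimonas + Collodictyon sister to unikonts", 0.982, 0.995),
    (2, "Collodictyon sister to unikonts, Malawimonas with Excavata", 0.064, 0.045),
    (3, "Malawimonas + Collodictyon sister to Amoebozoa", 0.067, 0.006),
    (4, "Malawimonas + Collodictyon sister to Opisthokonta", 0.009, 0.005),
]


def rejected(p_value: float, alpha: float = ALPHA) -> bool:
    """A topology is rejected when its AU test P value falls below alpha."""
    return p_value < alpha


# Map each rank to (rejected with all sites, rejected after trimming).
status = {rank: (rejected(p_all), rejected(p_trim))
          for rank, _label, p_all, p_trim in rows}

for rank, label, p_all, p_trim in rows:
    all_txt = "rejected" if status[rank][0] else "not rejected"
    trim_txt = "rejected" if status[rank][1] else "not rejected"
    print(f"{rank}: {label}: all sites {all_txt}; trimmed {trim_txt}")
```

Under this threshold, the rank 2 and rank 3 topologies are not rejected with the full alignment but are rejected once the fastest evolving sites are removed, whereas the rank 1 topology is retained in both cases.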
examination of the Excavata phylogeny (Rodriguez-Ezpeleta et al. 2007; Hampl et al. 2009). In addition, the alternative position of Malawimonas within Excavata was not rejected by the AU test (P = 0.064; table 2), altogether suggesting that the position of Malawimonas was not stable and was highly sensitive to taxonomic sampling. Hence, although the grouping of Collodictyon and Malawimonas remains unclear after our analyses, the unstable position of Malawimonas and the low support in Bayesian analyses applying the CAT model indicate that these two lineages may belong to different groups of eukaryotes.

Collodictyon Is Placed Near the "Unikont–Bikont" Bifurcation

Our phylogenetic inferences suggest that Collodictyon diverged near the unikont–bikont bifurcation. Although the root of the eukaryote tree is controversial and no clear evidence exists for its position, a lineage that is not included within either unikonts or bikonts is likely of early origin. The poor diversity of known Diphyllatia (Collodictyon and Diphylleia) is striking in this respect, as one would expect to find more related lineages along its branch, but it remains to be seen whether Diphyllatia in fact represents a larger group: it could be closely related to other groups that are yet to be sequenced or discovered. Regardless of these possible sister groups, interpretations of the evolutionary origin of Collodictyon are largely dependent on the position of the root of the eukaryote tree.

Two rare genomic changes have suggested an ancient split between the unikonts and bikonts; the bikonts have been shown to share a fusion of the dihydrofolate reductase (DHFR) and thymidylate synthase (TS) genes, whereas all