GBE Tracing the Evolution of Streptophyte Algae and Their Mitochondrial Genome Monique Turmel*, Christian Otis, and Claude Lemieux InstitutdeBiologieInte´grativeetdesSyste`mes,De´partementdeBiochimie,deMicrobiologieetdeBio-Informatique,Universite´ Laval, Que´bec,Canada *Correspondingauthor:E-mail:[email protected]. Accepted:September 2, 2013 D o Datadeposition:ThisprojecthasbeendepositedatGenBankundertheaccessionnumbersKF060939,KF060940,KF060941,KF060942,and w n KF060943. loa d e d fro m Abstract h ttp s SixmonophyleticgroupsofcharophyceangreenalgaearerecognizedwithintheStreptophyta.Althoughincongruentwithearlier ://a studiesbasedongenesfromthreecellularcompartments,chloroplastandnuclearphylogenomicanalyseshaveresolvedidentical c a d relationshipsamongthesegroups,placingtheZygnematalesortheZygnematales+Coleochaetalesassistertolandplants.The e m presentinvestigationaimedatdeterminingwhetherthisconsensusviewissupportedbythemitochondrialgenomeandatgaining ic .o insightintomitochondrialDNA(mtDNA)evolutionwithinandacrossstreptophytealgallineagesandduringthetransitiontowardthe u p firstlandplants.WepresentherethenewlysequencedmtDNAsofrepresentativesoftheKlebsormidiales(Entransiafimbriataand .c o m Klebsormidiumspec.)andZygnematales(ClosteriumbaillyanumandRoyaobtusa)andcomparethemwiththeirhomologsinother /g charophyceanlineagesaswellasinselectedembryophyteandchlorophytelineages.Ourresultsindicatethatimportantchanges be /a occurredatthelevelsofgenomesize,geneorder,andintroncontentwithintheZygnematales.Althoughtherepresentativesofthe rtic Klebsormidialesdisplay moresimilarityingenomesizeandintroncontent, geneorderseemsmorefluidandgenelossesmore le -a frequentthaninothercharophyceanlineages.Incontrast,thetwomembersoftheCharalesdisplayanextremelyconservativepattern b s ofmtDNAevolution.Collectively,ouranalysesofgeneorderandgenecontentandthephylogeniesweinferredfrom40mtDNA- tra c encodedproteinsfailedtoresolvetherelationshipsamongtheZygnematales,Coleochaetales,andCharales;however,theyare t/5 consistentwithpreviousphylogenomicstudiesinfavoringthatthemorphologicallycomplexCharalesarenotsistertolandplants. /10 /1 Key words: Streptophyta, Charophyceae, charophytes, mitochondrial DNA, phylogenomics, genome evolution. 8 1 7 /5 2 1 0 2 Introduction 5 (atpBandrbcL),andmitochondria(nad5)of25charophycean b y Between500and450Ma,thefirstlandplantsemergedon greenalgae,8landplants,and5chlorophytesidentifiedwith g u Earthfromgreenalgalancestorsrelatedtothecontemporary greatsupporttheCharalesastheclosestgreenalgalrelatives es freshwatergreenalgaeclassifiedintheCharophyceae(Gensel oflandplants;however,thepositionsoftheothercharophy- t on 2008; Kenrick et al. 2012). Charophycean green algae and cean groups received moderate support (Karol et al. 2001; 30 M land plants form the Streptophyta (Bremer et al. 1987), McCourt et al. 2004). The best trees inferred in this four- a whereas all other green algae belong to the sister lineage geneanalysissupportedanevolutionarytrendtowardincreas- rch 2 Chlorophyta (Lewis and McCourt 2004). Six monophyletic ing cellular complexity; undeniably, this was an appealing 0 1 groupsofcharophyceangreenalgaearecurrentlyrecognized: result.Thesametopologywasrecoveredinanotheranalysis 9 the Mesostigmatales, Chlorokybales, Klebsormidiales, of combined genes from the three cellular compartments Zygnematales, Coleochaetales, and Charales, given here in (Qiuetal.2006).Thelattertwostudiesturnedouttoconflict orderofincreasingcellularcomplexity. with analyses of multiple chloroplast and nuclear genes. In Whichcharophyceangroupgaverisetothelandplants?As agreement with trees inferred from the chloroplast small indicated by several phylogenetic studies of multigene data andlargesubunitrRNAgenes(Turmel,Eharaetal.2002),a sets, this is a difficult question to address. Analyses of four whole-chloroplast genome study by our group resolved the genes from the nucleus (18S rRNA gene), chloroplast Charales before the divergence of the Coleochaetales and (cid:2)TheAuthor(s)2013.PublishedbyOxfordUniversityPressonbehalfoftheSocietyforMolecularBiologyandEvolution. ThisisanOpenAccessarticledistributedunderthetermsoftheCreativeCommonsAttributionNon-CommercialLicense(http://creativecommons.org/licenses/by-nc/3.0/),whichpermits non-commercialre-use,distribution,andreproductioninanymedium,providedtheoriginalworkisproperlycited.Forcommercialre-use,[email protected] GenomeBiol.Evol.5(10):1817–1835. doi:10.1093/gbe/evt135 AdvanceAccesspublicationSeptember10,2013 1817 GBE Turmeletal. ZygnematalesandshowedthateithertheZygnematalesora concatenated proteins were complicated by systematic cladecomposedoftheZygnematalesandColeochaetalesare errors causing the affiliation of the Charales with land theclosestrelativesoflandplants(Turmeletal.2006,2007b). plants, but when we minimized the impact of these errors, Recently,thisconclusionreceivedsupportfromthreeindepen- treesweaklysupportingthenotionthattheCharalesarenot dent nuclear phylogenomic analyses (Wodniok et al. 2011; sistertolandplantswererecovered. Laurin-Lemayetal.2012;Timmeetal.2012),thusreinforcing confidenceinthebasalpositionoftheCharales.Furthermore, Materials and Methods Laurin-Lemayetal.(2012)haveprovidedstrongevidencethat the Coleochaetales+Zygnematales are sister to land plants. StrainsandCultureConditions Why the analyses of combined genes yielded a different StrainsofKlebsormidiumspec.(SAG51.86),Closteriumbail- D topologyisunclear,butitisworthmentioningherethatthe lyanum (SAG 50.89), and Roya obtusa (SAG 168.80) were ow n misleading signal supporting the strong affiliation of the obtainedfromtheSammlungvonAlgenkulturenGo¨ttingen, lo a Charales with the land plants in the four-gene analysis was d whereasEntransiafimbriata(UTEXLB2353)originatedfrom e d mainly contributed by nad5, the sole mitochondrial gene in the Culture Collection of Algae at the University of Texas at fro the data set (Turmel et al. 2006). When using multigene m Austin. All four strains were grown in C medium (Andersen sequence alignments for resolving deeply diverging relation- 2005)underalternating12-hlight/darkperiods. http smhoipdse,lms,ibslieaasdiningseiqnufoernmceatcioonmdpuoesittoionvi,oolartisoenqsuoefnecevohluettieornoatary- s://ac DNASequencingandSequenceAnalyses a chy may yield the wrong tree topology (Phillips et al. 2004; de Jeffroyetal.2006;Philippeetal.2011b). For each strain, an A+T rich organellar DNA fraction was mic Inthepresentinvestigation,weundertookadetailedanal- obtained by CsCl–bisbenzimide isopycnic centrifugation of .ou p ysis of streptophyte mitochondrial genomes in order to ex- total cellular DNA as described earlier (Turmel et al. 1999). .c o plore the evolutionary patterns of mitochondrial DNA TheEntransiaandRoyaorganellarfractionsweresequenced m /g (mtDNA)andalsotodeterminewhetheramitochondrialphy- using 454 GS FLX Titanium technology at the Plate-forme b e lcohgloernoyplacsotnganrudenntucwleiathr gtehnoesse cpanrevbioeusrlyecoinnfsetrrurecdtedf.roTmo dul’aAvnaal.lcyase/sseqG/ee´nn/oinmdeiqxu.pehspo,flaLastvalacUcneisvseerdsitySe(hptttepm://bpearg.ib2i7s,. /article date,themitochondrialgenomesoffivecharophyceansrep- 2013).The454pyrosequencingreadswereassembledusing -a b s resenting four distinct lineages, i.e., the genomes of Newblerv2.5(Marguliesetal.2005)withdefaultparameters. tra Mesostigma viride (Mesostigmatales), Chlorokybus atmophy- For the Klebsormidium and Closterium mitochondrial ge- ct/5 ticus (Chlorokybales), Chaetosphaeridium globosum nomes,randomclonelibrarieswerepreparedfrom1,500to /1 0 (Coleochaetales), and Chara vulgaris (Charales), have been 2,000bpfragmentsderivedfromtheA+TrichDNAfractions /1 8 described in the literature (Turmel et al. 2002a, 2002b, usingthepSMART-HCKan(LucigenCorporation,Middleton, 17 /5 2003,2007a). Thesegenomesshowsubstantialvariabilityin WI)plasmid.Positivecloneswereselectedbyhybridizationof 2 1 size,geneorder,genedensity,andintroncontent.TheChara each plasmid library with the original DNA used for cloning. 02 5 mtDNA was found to be the most similar to its bryophyte DNA templates were amplified using the Illustra TempliPhi b y counterparts at the levels of gene order and intron content, Amplification Kit (GE Healthcare, Baie d’Urfe´, Canada) and g u e anditwasalsotheclosesttolandplantsmtDNAsinphyloge- sequencedwiththePRISMBigDyeterminatorcyclesequenc- s netic analyses of multiple genes (Turmel et al. 2003). More ingreadyreactionkit(AppliedBiosystems,FosterCity,CA)on t on 3 recently, the mtDNA sequence of another charalean green ABImodel373or377DNAsequencers(AppliedBiosystems), 0 M alga,Nitellahyalina,hasbecomeavailable(GenBankaccession usingT3andT7primersaswellasoligonucleotidescomple- a NC_017598). mentary to internal regions of the plasmid DNA inserts. The rch 2 Here, we report the mtDNA sequences of two charophy- resulting sequences were edited and assembled using 0 1 ceans representing the Klebsormidiales, Entransia fimbriata SEQUENCHER 4.8 (Gene Codes Corporation, Ann Arbor, 9 and Klebsormidium spec., and of two others representing MI).Genomicregionsnotrepresentedinthesequenceassem- the Zygnematales, Closterium baillyanum and Roya obtusa. bliesorplasmidclonesweredirectlysequencedfrompolymer- These genome sequences were compared with those previ- asechainreaction-amplifiedfragmentsusinginternalprimers. ouslyreportedfortheabovementionedcharophyceangreen Genes and open reading frames (ORFs) were identified algae and for selected embryophytes and chlorophytes. The usingacustom-builtsuiteofbioinformatictoolsasdescribed results unveiled the dynamic nature of the mitochondrial previously (Pombert et al. 2005). tRNA genes were localized genome in both the Klebsormidiales and Zygnematales. At usingtRNAscan-SE(LoweandEddy1997).Intronboundaries the levels of gene order and gene content, we found that were determined by modeling intron secondary structures the Zygnematales and the Zygnematales+Coleochaetales (Micheletal.1989;MichelandWesthof1990)andbycom- uniquelysharemorestructuralgenomiccharacterswithbryo- paring intron-containing genes with intronless homologs phytes than the Charales. Our phylogenetic studies of 40 using FRAMEALIGN of the Genetics Computer Group 1818 GenomeBiol.Evol.5(10):1817–1835. doi:10.1093/gbe/evt135 AdvanceAccesspublicationSeptember10,2013 GBE TracingtheEvolutionofStreptophyteAlgae software(version10.3)package(Accelrys,SanDiego,CA).To Confidenceofbranchpointswasestimatedbyfast-bootstrap estimate the proportion of repeated sequences, repeats analysis(foption¼a)with100pseudoreplicates.BIanalyses (cid:2)30bp were retrieved using REPFIND of the REPuter 2.74 were performed with PhyloBayes 3.3e (Lartillot et al. 2009) program (Kurtz et al. 2001) with the options -f (forward) -p using the site-heterogeneous CAT+(cid:2)4 and CATGTR+(cid:2)4 (palindromic) -l (minimum length¼30bp) -allmax and then models (Lartillot and Philippe 2004). To establish the appro- masked on the genome sequence using REPEATMASKER priateconditionsfortheseanalyses,twoindependentchains (http://www.repeatmasker.org/, last accessed September 27, wererunfor10,000cyclesundertheCAT+(cid:2)4modelandfor 2013) running under the Crossmatch search engine (http:// 2,000 cycles under the CATGTR+(cid:2)4 model, and consensus www.phrap.org/,lastaccessedSeptember27,2013). topologies were calculated from the saved trees using the BPCOMP program of PhyloBayes after a burn-in of 2,000 D o AnalysesofGeneOrderData and500cycles,respectively.Undertheseconditions,thelarg- w n estdiscrepancyweobservedacrossallbipartitionsinthecon- lo Weusedacustom-builtprogramtoidentifytheregionsthat a displaythesamegeneorderinselectedpairsofstreptophyte sensus topologies (maxdiff) was lower than 0.10, indicating ded mtDNAs. This program was also employed to convert gene thatconvergencebetweenthetwochainswasachieved.For fro thebootstrapanalyses,100pseudoreplicatesweregenerated m orderineachof16streptophytemtDNAstoallpossiblepairs h ofsignedgenes(i.e.,takingintoaccountgenepolarity).The using the SEQBOOT program of the PHYLIP package ttp presence/absenceofthesignedgenepairsintwoormorege- (Felsenstein 1989), and BI chains were run on these s://a pseudoreplicates using the conditions just described. In the c nomeswascodedasbinaryDollocharactersusingMacClade a d CAT+(cid:2)4 analyses, one chain was thus run for 10,000 e 4.08(MaddisonandMaddison2000).Theresultinggeneorder m datasetwassubjectedtomaximumparsimony(MP)analysis cycles (with each cycle sampled) for each pseudoreplicate ic.o and a consensus tree was computed with the READPB pro- u using PAUP 4.0b10 (Swofford 2003). Confidence of branch p gram of PhyloBayes after eliminationof 2,000 burn-intrees. .c pointswasestimatedby1,000bootstrapreplications. o m Phylogenies based on inversion medians were inferred Thesameprocedurewasappliedtoeachpseudoreplicatein /g with GRAPPA 2.0 (http://www.cs.unm.edu/~moret/ theanalysisdoneundertheCATGTR+(cid:2)4model,exceptthat be/a GRAPPA/,lastaccessedSeptember27,2013)usingthealgo- the chain was run for 2,000 cycles with a burn-in of 500 rtic rithm of Caprara (2003) and a data set of 55 gene/pseudo- cycles.Inbothanalyses,abootstrapconsensustreewasgen- le-a gene positions. Although this software can use breakpoint erated from the 100 resulting consensus trees using the bs medians to compute trees, we used inversion medians be- CONSENSEprogramofthePHYLIPpackage. trac causethismethodhasbeenshowntooutperformbreakpoint Cross-validation testswere conducted to evaluate the fits t/5 mediansinphylogenyreconstructions(Moretetal.2002). ofthefourmodelsofaminoacidsubstitutionstothedataset. /10/1 They were carried out with PhyloBayes using ten randomly 8 1 PhylogeneticReconstructionsfromSequenceData generatedreplicates.Cross-validationisaverygeneralstatis- 7/5 2 tical method for comparing models. The procedure can be 1 Mitochondrial genome sequences were retrieved from summarizedasfollows.Thedataset israndomlypartitioned 025 GenBank for the 25 green plant taxa listed in table 1. We into two unequal subsets, the learning set (also called the by selected for analysis the protein-coding genes that are g training set) and the test set. The learning set serves to esti- u e schriatereridonb:yatapt1l,e4as,t61,48,o9f,tchoeb,25cotxa1x,a2.,F3o,rtmytgtBen,ensadm1e,t2t,h3is, mthaetneutsheedptaoracommetpeurtseotfhtehleikmeliohdoeoldaonfdththeetseestpsaerta.mToetreerdsuacree st on 3 4,4L,5,6,7,9,rpl2,5,6,10,16,rps1,2,3,4,7,10,11,12, variability, multiple rounds of cross-validation are performed 0 M p13re,p1a4r,e1d9a,ssdfohl3lo,w4s,.aTnhdeydejeRd,uUc,eVd.aAmninamoaincoidasceidqudeantacesestfrwoams uscsoinrgesd(wifhfeicrehnmtepaasrutriteiohnoswawndelltthheetreesstuslteintsgwleorge-lpikreeldihicotoedd arch 2 the 40 individual genes were aligned using MUSCLE 3.7 0 bythemodel)areaveragedovertherounds. 1 (Edgar 2004), and the ambiguously aligned regions in each 9 Toanalyzetheaminoacidcompositionofthedataset,we alignmentwereremovedusingTRIMAL1.3(Capella-Gutierrez firstassembleda20(cid:3)25matrixcontainingthefrequencyof et al. 2009) with the following options: block¼7, gt¼0.7, eachaminoacidperspeciesusingtheprogramPepstatsofthe st¼0.001,andsw¼3,andtheproteinalignmentswerecon- EMBOSSpackage(Riceetal.2000).Acorrespondenceanal- catenated. Missing characters represented 12.5% of the ysisofthisdatasetwasthenperformedusingtheRpackage amino acid data set. This data set is available upon request ca(NenadicandGreenacre2007). tothecorrespondingauthor. Phylogenies were inferred from the amino acid data set TestingRobustnessofTreesbyRemovalofFast-Evolving using the maximum likelihood (ML) and Bayesian inference Sites (BI) methods. ML analyses were carried out using RAxML 7.2.8 (Stamatakis 2006) and the site-homogeneous The influence of removing increasing proportions of fast- LG+(cid:2)4+F and GTR+(cid:2)4 models of sequence evolution. evolvingsitesintheaminoaciddatasetwasinvestigatedas GenomeBiol.Evol.5(10):1817–1835. doi:10.1093/gbe/evt135 AdvanceAccesspublicationSeptember10,2013 1819 GBE Turmeletal. Table1 General Featuresof theStreptophyteandChlorophytemtDNAsComparedin ThisStudy Taxona Introns Name Abbreviation Accession Size(bp) A+T(%) Genes GroupI GroupIIb Repeatsc (%) Charophyceae Mesostigmaviride MESO NC_008240 42,424 67.79 65 4 3(2) 0.1 Chlorokybusatmophyticus CHLO NC_009630 201,763 60.20 71 6 14 7.1 Entransiafimbriata ENTR KF060941 61,645 57.30 66 3 2 0.0 Klebsormidiumspec.d KLEB KF060942 87,405 59.11 65 7 2 7.3 D o Chaetosphaeridiumglobosum CHAE NC_004118 56,574 65.59 68 9 2 0.1 w n Charavulgaris CHAR NC_005255 67,737 59.10 69 14 13 2.6 lo a d Nitellahyalina NITE NC_017598 80,183 59.17 69 7 13 1.1 e d Closteriumbaillyanum CLOS KF060940 152,089 60.37 71 20 11 3.7 fro Royaobtusa ROYA KF060943 69,465 73.17 69 0 2 3.0 m Embryophyta http Marchantiapolymorpha MARC NC_001660 186,609 57.59 70 7 25 6.9 s Treubialacunosa TREU NC_016122 151,983 56.62 66 7 21 5.5 ://a c Pleuroziapurpurea PLEU NC_013444 168,526 54.63 68 7 21 5.0 ad e Physcomitrellapatens PHYS NC_007945 105,340 59.43 66 3 24 1.0 m ic Anomodonrugelli ANOM NC_016121 104,239 58.80 66 3 24 1.0 .o Megacerosaenigmaticuse MEGA NC_012651 184,908 53.99 42 0 28 2.0 up .c Phaeceroslaevis PHAE NC_013765 209,482 55.40 43 0 32 2.5 o m Cycastaitungensis CYCA NC_010303 414,903 53.08 64 0 25(5) 24.9 /g b Betavulgaris BETA NC_002511 368,801 56.14 50 0 20(6) 22.1 e /a Nicotianatabacum NICO NC_006581 430,597 55.04 57 0 23(6) 17.6 rtic Arabidopsisthaliana ARAB NC_001284 366,924 55.23 49 0 23(5) 11.6 le Oryzasativa ORYZ NC_007886 490,521 56.15 54 0 23(6) 55.2 -a b s Chlorophyta tra Nephroselmisolivacea NEPH NC_008239 45,223 67.20 66 4 0 0.1 c t/5 Ostreococcustauri OSTR NC_008290 44,237 61.78 63 0 0 0.1 /1 0 Monomastixsp. MONO KF060939 60,883 59.87 65 8 0 4.8 /1 8 Protothecawickerhamii PROT NC_001613 55,328 74.18 62 5 0 5.3 1 7 aDifferentcolors,alsousedinfigures,denotethedistinctcharophyceanandbryophytelineages. /52 bNumberoftrans-splicedintronsisgiveninparentheses. 10 cNonoverlappingrepeatelementsweremappedoneachgenomewithRepeatMaskerusingtherepeats(cid:2)30bpidentifiedwithREPuterasinputsequences. 25 dThisspecieswasformallycalledMicrosporastagnorum(Mikhailyuketal.2008). b y eThisspeciesnamehasbeenchangedtoNothoceros(Villarrealetal.2010). g u e s t o n follows. Substitution rates among sites in the data set were in the analyses under the CATGTR+(cid:2)4 model. Consensus 30 estimated with CODEML in PAML 4.6 (Yang 2007) for the trees and posterior probability values were computed from M a four alternative topologies observed for the streptophyte saved trees after burn-in, using the BPCOMP program of rch green algal lineages in bootstrap analyses with the PhyloBayes. 20 1 CAT+(cid:2)4andCATGTR+(cid:2)4models.Theserates wereaver- Bootstrapanalysesofthetrimmeddatasetlacking20%of 9 agedforeachsite,andthefastestevolvingsiteswereincre- the fastest evolving sites were carried out with RAxML and mentally removed in 5% intervals using MESQUITE 2.75 PhyloBayesessentiallyasdescribedfortheoriginalaminoacid (MaddisonandMaddison2011)inordertogeneratefivesub- dataset.FortheBIanalysesundertheCAT+(cid:2)4andCATGT setsofdata.RAxMLanalysesofthesedatasubsetswereper- R+(cid:2)4 models, we confirmed that the largest discrepancy formed as described above for the original amino acid data across all bipartitions in the consensus trees obtained from set. The five trimmed data sets were also analyzed with twoindependentchainswaslowerthan0.10. PhyloBayes under the CAT+(cid:2)4 and CATGTR+(cid:2)4 models. Saturationlevelsoftheoriginalaminoaciddatasetandof Two independent chains were run for 5,000 cycles (with thetrimmeddatasetlacking20%ofthefastestevolvingsites each cycle sampled) with a burn-in of 1,000 cycles in each were estimated by computing the slopes of the regression oftheanalysesundertheCAT+(cid:2)4model,whereasthetwo plots of patristic distances versus observed distances using chainswererunfor2,000cycleswithaburn-inof500cycles KaleidaGraph 4.1.3 (Synergy Software, Reading, PA). For 1820 GenomeBiol.Evol.5(10):1817–1835. doi:10.1093/gbe/evt135 AdvanceAccesspublicationSeptember10,2013 GBE TracingtheEvolutionofStreptophyteAlgae each data set, patristic distances were calculated using PATRISTIC 1.0 (Fourment and Gibbs 2006) from the branch lengths of the best tree inferred under the CATGTR+(cid:2)4 model, whereas the observed distances were derived from pair-wise comparisons between the raw sequences of the datasetusingMEGA5(Tamuraetal.2011). Results ThemembersoftheKlebsormidialesweselectedformtDNA D sequencing, Entransia and Klebsormidium, represent distinct o w majorlineages(Karoletal.2001;Mikhailyuketal.2008).In nlo a theZygnematales,wealsosampledtwodistantlyrelatedspe- d e cbieelso,nCglotostethrieumDeasnmdidRiaoleyas,.iAtrltehmouaignhsuCnlocsletearriuwmheisthkenroRwonyatios d from part of the other major lineage recognized in the h ttp Zygnematales(McCourtetal.2000;Gontcharovetal.2004; s Halletal.2008).Concerningthephylogeneticpositionsofthe ://a c a members of the Charales whose mtDNAs were compared d e here, it has been shown that the Chara and Nitella genera FIG.1.—Totallengthsofcoding,intronic,andintergenicsequencesin m arepartofseparatemajorlineages(Karoletal.2001). the streptophyte mtDNAs examined in this study. Note that intron- ic.o encodedgeneswerenotconsideredascodingsequencesbutratheras up intronsequencesandthattrans-splicedintronswerenottakenintoac- .co ComparisonofStructuralGenomicFeatures m counttoestimatethetotallengthsofintronsequencesbecausetheirsizes /g The gene maps of the klebsormidialean and zygnematalean areusuallynotannotatedintheGenBankaccessions.Speciesnamesare be mitochondrial genomes are shown in supplementary figures abbreviatedasintable1. /artic S1 and S2, Supplementary Material online, respectively. The le sequencesaccountforlessthan7.5%ofthetotalmitochon- -a swtreurcetucoramlpfeaaretudrewsitohftthhoesseepforeuvrionueswlylyobsseeqruveedncfeodrfigveenocmhaers- drial genome size in charophyceans, the observed variation bstra being comparable to that found in bryophyte mtDNAs c ophyceans and seven bryophytes (table 1). The latter land t/5 (table1). /1 plants represent all three known lineages of bryophytes and 0 Among the 16 charophycean and bryophyte mtDNAs /1 consist of three liverworts (Marchantia polymorpha, Treubia 8 examined in this study, the total number of standard genes 1 7 lacunosa,andPleuroziapurpurea),twohornworts(Megaceros ranges from 42 (in the hornwort Megaceros) to 71 /52 aenigmaticus and Phaeceros laevis), and two mosses 1 (in Chlorokybus and the zygnematelean Closterium), with 0 2 (Physcomitrella patens and Anomodon rugelii). For certain 5 most genomes exhibiting between 65 and 71 genes b analyses, we have also included representatives of chloro- y (table 1). As in other charophycean mtDNAs, all protein- g phytesandseedplants(table1). coding genes in the newly sequenced genomes were anno- ues tatedusingthestandardgeneticcode,thusprovidingnoev- t o n GenomeSizeandGeneContent idenceforRNAediting.Thefewgenesthatwerelostfromthe 30 All compared charophycean mitochondrial genomes, except mitochondrial genome during charophycean evolution code Ma the mtDNAs of Chlorokybus and of the zygnematelean mainlyfortRNAs(trngenes),ribosomalproteins(rpsandrpl), rch Closterium, are smaller in size than their bryophyte counter- and subunits involved in cytochrome c biogenesis (yej, also 20 1 parts(table1andfig.1).Importantdifferencesinmitochon- designatedas ccm) (fig. 2). Inlandplant mtDNAs, the same 9 drial genome size are found among distinct charophycean categoriesofgenesarepronetoextinction(seefig.2)(Adams lineages as well as within individual lineages, in particular etal.2002;Knoop2013).Certainmitochondrialgeneswere theMesostigma/ChlorokybuscladeandtheZygnematales.In lostinalineage-specificfashion,whereasotherswerelostin this regard, it is noteworthy that the two representatives of twoormorelineages.Lineage-specificgenelossesoccurredat the Charales, Chara and Nitella, carry very similar mitochon- highest frequency in the Klebsormidiales and in the horn- drialgenomes;thesemtDNAs,whichdifferbyonly12,447bp, worts. The genes whose absence was recorded in multiple contain the same gene complement and display exactly the lineagesareofspecialinterestastheycanbephylogenetically same gene order. Overall, the total lengths of both the informative.Inthiscontext,wenotethattrnI(gau)isuniformly intergenic and intronic regions in mtDNA changed substan- absent from the three bryophyte lineages and the tially during charophycean evolution, even though the total Zygnematales, whereas rpl14 is missing from the bryophyte lengthsofcodingregionsremainedsimilar(fig.1).Repeated lineages,theZygnematalesandtheColeochaetales. GenomeBiol.Evol.5(10):1817–1835. doi:10.1093/gbe/evt135 AdvanceAccesspublicationSeptember10,2013 1821 GBE Turmeletal. D o w n lo a d e d fro m h ttp s ://a c a d e m ic .o u p .c o m /g b e /a rtic le -a b s tra c t/5 /1 0 /1 8 1 7 /5 2 1 0 2 5 b y g u e s t o n 3 0 M a rc h 2 0 1 9 FIG.2.—Generepertoiresofthestreptophytemitochondrialgenomesexaminedinthisstudy.Thepresenceofastandardgeneisindicatedbyadark blueboxandthepresenceofapseudogenebyalightbluebox.Thedifferentcolorsontheleftofthefigurerefertogenedistributionssupportingdistinct hypothesesconcerningthesistergroupoflandplants:orange,Zygnematales;brown,Coleochaetales+Zygnematales. 1822 GenomeBiol.Evol.5(10):1817–1835. doi:10.1093/gbe/evt135 AdvanceAccesspublicationSeptember10,2013 GBE TracingtheEvolutionofStreptophyteAlgae Two tRNA genes not previously identified in streptophyte IntrontContent mitochondria,trnL(aag)andtrnT(ugu),wereannotatedinthe In contrast to their land plant counterparts, charophycean course of this study. The trnT(ugu) gene is present in the mtDNAs are extremely variable in intron content (table 1 klebsormidialean Entransia and the zygnematalean and figs. 3 and 4). All the introns identified so far in charo- Closterium, whereas trnL(aag) is found only in Entransia. phycean and bryophyte genomes, except for two group II Given that these mitochondrial genes have been described introns in the tightly packed mtDNA of Mesostigma viride, so far in only a few chlorophytes and are also rarely found arecis-spliced.CharophyceangroupIandgroupIIintronslie orabsentinotheralgalgroups,theywerealmostcertainlynot at 37 and 44 distinct genomic sites (in 8 and 24 genes), inheritedverticallyfromthecommonancestorofallstrepto- respectively. Even within individual charophycean lineages, phytes. Our similarity searches against the nonredundant there are sharp differences in intron content. For instance, D o database of the National Center for Biotechnology within the Zygnematales, the Closterium and Roya mtDNAs w n Information using Entransia trnL(aag) as the query sequence harbor as many as 31 introns and as few as two introns, loa d failed to reveal any tRNA gene sharing substantial sequence respectively; the latter, which are both inserted into tRNA e d identity. However, both the Entransia and Closterium genes and belong to the group II family, are conserved be- fro m trnT(ugu)geneswerefoundtoexhibithighsequencesimilar- tween the two zygnematalean taxa and are also present in h itieswithtrnT(ugu)genesoriginatingfromthechloroplastsof other charophycean or bryophyte lineages. Within the ttp s red algae and glaucocystophytes. Two other tRNA genes Klebsormidiales, the five Entransia and nine Klebsormidium ://a showing a restricted distribution in streptophytes and other introns share no homolog, although similar introns reside at ca d algalgroups,trnS(acu)ettrnR(ucg),havepresumablynotbeen identical insertion sites in other charophycean mtDNAs. em inherited through vertical inheritance from the last common Moreover, the mtDNAs of the charalean Chara (27 introns) ic.o ancestorofstreptophytes.Oursimilaritysearchessuggestthat and Nitella (20 introns), both of which contain numerous up .c trnS(acu), documented uniquely in Chaetosphaeridium groupIandgroupIIintrons,haveincommonahighpropor- o m mtDNA, originates in part from themitochondrial trnT(ggu), tionofgroupIIintronsbutshareonlytwogroupIintronsat /g b aswedetectedsubstantialsequencesimilaritybetweenthe30 thesamesites. e/a portions of these two genes from Chaetosphaeridium. The Inapreviousstudy(Turmeletal.2003),wefoundthatfour rtic evolutionary history of trnR(ucg) is more complex, given the group II introns within cox2 (site 71), nad3 (site 140), nad4 le-a evidence that it arose in chlorophyte and streptophyte (site 949), and rps3 (site 77) are shared between Chara and bs mtDNAs through duplication and mutation of either some land plant lineages, including bryophytes (fig. 4). The trac trnR(acg) (in the Mesostigma and Nephroselmis lineages) or expandeddatareportedhereconfirmthatthesesitesareoc- t/5 /1 trnR(ucu)(intheChlorokybusandMarchantialineages)(Oda cupied by introns that are shared solely by members of the 0/1 etal.1992;Turmeletal.2003;Wangetal.2009).Ouranal- Charales(threeofthefourintronsarepresentinbothChara 81 7 yses showed that the mitochondrial trnR(ucg) genes of the andNitella) and oneor morelandplantlinages. Inaddition, /5 2 two klebsormidialeans and the zygnematalean Closterium weidentifiedanextrasiteincox2(site343)thatisoccupiedby 10 2 arehighlysimilartooneanotherandalsosharehighsimilarity agroupIIintronsharedbetweenNitella,mosses,hornworts, 5 b withthemitochondrialtrnR(acg)gene. andsomeseedplants.Moreover,agroupIIintronatanother y g In the Klebsormidium and Roya mtDNAs, we discovered siteincox2(site220)aswellasagroupIintroninnad5(site ue s potential coding sequences that are not usually found in 723) were found only in the zygnematalean Closterium and t o n green plant mitochondrial genomes. The orf209 of somebryophytelineages(figs.3and4). 3 0 Klebsormidium, which is embedded within a 7.1-kb region M a containing only three tRNA genes (supplementary fig. S1, GeneOrder rc h Supplementary Material online), shows strong sequence ho- Tocomparethegeneorganizationsofthecharophyceanand 20 1 mology (E¼e(cid:4)50) with bacterial and phycodnavirus genes bryophytemtDNAsinourstudygroup,weexaminedthegene 9 encoding C-5 cytosine-specific DNA methyltransferases. The pairsandgeneclusterstheyshareandalsoinferredascenario nonstandardgene(orf235)intheRoyamtDNAalsoresidesin ofgenomerearrangements.Figure5presentsoneofthetwo a relatively long intergenic region (supplementary fig. S2, gene-pair analyses that we performed. The mtDNAs of four Supplementary Material online) and its predicted product is green algae representing early diverging lineages of the related to phage integrases/recombinases (E¼1e(cid:4)09) and Chlorophyta (i.e., three members of the Prasinophyceae to putative proteins encoded by the mtDNAs of and Prototheca wickerhamii, a representative of the Chaetosphaeridium(orf202,E¼9e(cid:4)31)andthechlorophyte Trebouxiophyceae)wereincludedinthisanalysisinanattempt Protothecawickerhamii(orf304,E¼4e(cid:4)24).InboththeRoya todistinguishconservedgeneclustersthatpredatethediver- and Chaetosphaeridium mtDNAs, the putative integrase/ genceofchlorophytesandstreptophytes(i.e.,ancestralclus- recombinasegeneislocateddownstreamofthelargesubunit ters) from those that arose during charophycean evolution. rRNAgene(rnl). Theancestralgenepairsappeartobemoreprevalentinthe GenomeBiol.Evol.5(10):1817–1835. doi:10.1093/gbe/evt135 AdvanceAccesspublicationSeptember10,2013 1823 GBE Turmeletal. D o w n lo a d e d fro m h ttp s ://a c a d e m ic .o u p .c o m /g b e /a rtic le -a b s tra c t/5 /1 0 /1 8 1 7 /5 2 1 0 2 5 FIG.3.—DistributionofgroupIintronsamongthechlorophyteandstreptophytemtDNAsexaminedinthisstudy.Thepresenceofanintroncontaining b y anORFisdenotedbyadarkbluebox,whereasthepresenceofanintronlackinganORFisdenotedbyalighterbluebox.Introninsertionsitesinprotein- g u codingandtRNAgenesaregivenrelativetothecorrespondinggenesinMesostigmamtDNA;insertionsitesinrnsandrnlaregivenrelativetotheEscherichia e s coli16Sand23SrRNAs,respectively.Foreachinsertionsite,thepositioncorrespondingtothenucleotideimmediatelyprecedingtheintronisreported. t o n 3 0 M Klebsormidiales andCharalesthan inthe other streptophyte pointing out that several gene pairs are uniquely shared a rc lineages.Withregardtothegenepairsthatarestreptophyte- between the Charales, Zygnematales, and land plants h 2 specific, they are most abundant in the Zygnematales and (fig. 5), thus providing support for a basal position of the 0 1 Charalesandtheirproportionsintheselineagesarecompara- Coleochaetalesrelativetotheselineages.Inaseparateanaly- 9 bletothoseobservedinliverwortsandmosses.Weidentified sis,weassembledadatasetof134genepairsfrom16strep- thestreptophyte-specificgenepairs(synapomorphies)thatare tophytes (128 phylogenetically informative characters), in shared exclusively between the Charales and bryophytes; which the gene pairs were coded as binary Dollo characters betweentheColeochaetales,Zygnematales,andbryophytes; (presence/absence),andusedthisdatasettoinferaphylogeny andalsobetweentheZygnematalesandbryophytesinorder using MP (fig. 6). The majority-rule consensus tree of the toevaluatewhichofthethreecurrenthypothesesconcerning bootstrap replicates resolved with weak bootstrap (BP) the identity of the charophycean lineage(s) being sister to support the Coleochaetales and Zygnematales as sister landplantsisbestsupported(fig.5,comparethegenepairs to land plants and the Charales as sister to the shown in different colors). The hypothesis that the Coleochaetales+Zygnematales+landplants. Coleochaetales+Zygnematalesaresistertothelandplantlin- ConservedgenepairsinstreptophytemtDNAsformlarger eagereceivedthestrongestsupport.Inthiscontext,itisworth conserved clusters whose numbers, sizes, and arrangements 1824 GenomeBiol.Evol.5(10):1817–1835. doi:10.1093/gbe/evt135 AdvanceAccesspublicationSeptember10,2013 GBE TracingtheEvolutionofStreptophyteAlgae D o w n lo a d e d fro m h ttp s ://a c a d e m ic .o u p .c o m /g b e /a rtic le -a b s tra c t/5 /1 0 /1 8 1 7 /5 2 1 0 2 5 b y g u e s t o n 3 0 M a rc h 2 0 1 9 FIG.4.—DistributionofgroupIIintronsamongthechlorophyteandstreptophytemtDNAsexaminedinthisstudy.Thepresenceofacis-splicedintron containinganORFisdenotedbyadarkbluebox,whereasthepresenceofacis-splicedintronlackinganORFisdenotedbyalighterbluebox.Trans-spliced intronsarerepresentedbypurpleboxes.Introninsertionsitesinprotein-codingandtRNAgenesaregivenrelativetothecorrespondinggenesinMesostigma mtDNA; insertion sites in rns and rnl are given relative to the Escherichia coli 16S and 23S rRNAs, respectively. For each insertion site, the position correspondingtothenucleotideimmediatelyprecedingtheintronisreported. GenomeBiol.Evol.5(10):1817–1835. doi:10.1093/gbe/evt135 AdvanceAccesspublicationSeptember10,2013 1825 GBE Turmeletal. D o w n lo a d e d fro m h ttp s ://a c a d e m ic .o u p .c o m /g b e /a rtic le -a b s tra c t/5 /1 0 /1 8 1 7 /5 2 1 0 2 5 b y g u e s t o n 3 0 M a rc h 2 0 1 9 FIG.5.—Distributionofmitochondrialgene/pseudogenepairsthataresharedbyatleasttwocharophyceanandtwochlorophytetaxa,byatleastthree charophyceantaxa,orbyatleastonecharophyceanandtwobryophytelineages.Thepresenceofagenepairisdenotedbyadarkbluebox.Alightgreen boxreferstoagenepairinwhichatleastonegeneismissinginsomelineagesduetogeneloss.Ancestralgenepairs,thatis,genepairscommonto 1826 GenomeBiol.Evol.5(10):1817–1835. doi:10.1093/gbe/evt135 AdvanceAccesspublicationSeptember10,2013
Description: