Phylotranscriptomics Reveals Discordance in the Phylogeny of Hawaiian Drosophila and Scaptomyza (Diptera: Drosophilidae) Samuel H. Church*,1,2 and Cassandra G. Extavour1,3,4 1DepartmentofOrganismicandEvolutionaryBiology,HarvardUniversity,Cambridge,MA,USA 2DepartmentofEcologyandEvolutionaryBiology,YaleUniversity,NewHaven,CT,USA 3DepartmentofMolecularandCellularBiology,HarvardUniversity,Cambridge,MA,USA 4HowardHughesMedicalInstitute,ChevyChase,MD,USA *Correspondingauthor:E-mail:[email protected]. Associateeditor:MichaelRosenberg Abstract D o w Islandradiationspresentnaturallaboratoriesforstudyingtheevolutionaryprocess.TheHawaiianDrosophilidaeareone nlo a such radiation, with nearly 600 described species and substantial morphological and ecological diversification. These d e sppreecseienstanreewlaarsgseelmybdlievdidterdanisnctroipatofmewesmfraojmor1c2lasdpeesc,ibesutactrhoessrethlaetsieoncslahdipesb,eatnwdeuensectlhaedseestrreamnsacirnipstuonmceesrttaoinr.eHsoelrvee,twhee d from baseoftheevolutionaryradiation.Werecoveranewhypothesisfortherelationshipbetweenclades,anddemonstrateits h ttp supportoverpreviouslypublishedhypotheses.Wethenusetheevolutionaryradiationtoexploredynamicsofconcor- s danceinphylogeneticsupport,byanalyzingthegeneandsiteconcordancefactorsforeverypossible topologicalcom- ://a c a binationofmajorgroups.Weshowthathighbootstrapvaluesmasklowevolutionaryconcordance,andwedemonstrate d e m that the most likely topology is distinct from the topology with the highest support across gene trees and from the ic topology with highest support across sites. We then combine all previously published genetic data for the group to .o u p estimate a time-calibrated tree for over 300 species of drosophilids. Finally, we digitize dozens of published Hawaiian .c o Drosophilidaedescriptions,andusethistopinpointprobableevolutionaryshiftsinreproductiveecologyaswellasbody, m /m wing,andeggsize.Weshowthatbyexaminingtheentirelandscapeoftreeandtraitspace,wecangainamorecomplete b e understandingofhowevolutionarydynamicsplayoutacrossanislandradiation. /a Keywords:Drosophila,phylogenetics,discordance,Hawaii,Diptera. rticle /3 9 /3 /m s Introduction show that the relationships between the major groups of a c 0 these flies are best understood by using methods that em- 1 Intheeraofgenome-scaledata,wehaveanopportunityto braceevolutionarydiscordance. A2/6 5 unpack the biological meaning of phylogenetic support. In The Hawaiian Drosophila have a long history as a model r12 pbehtywloegeenneotrigcaannisamlyss,essutphpatorsteeiskotfotednisdcoefvienredthaesrtehlaetipornosphoiprs- c(Ola’dGeradfoyranthdeDiemSapllleem2e0n1t8a)t.iMonoroefthpahnyl2o0geyneeatriscagmo,etBhaokdesr ticl066 by tionofinformationthatfavorsaparticularbranchinanevo- e g andDeSalle(1997)usedtheHawaiianradiationofDrosophila u e lutionary tree (Simon 2020). Methods have been developed to perform one of the first analyses to demonstrate incon- st o thatemphasizeextractingthetreewiththegreatestamount gruencebetweenanoverallspeciestreeandunderlyinggene n 0 of support from out of an otherwise rugged landscape of trees. Their study focused on the resolution between major 5 A Htreoewsepvaecre, a(Sgwroowffoinrdg neutmable.r1o9f96s;tuHdieedstkheaveetemalp.h2as0i0ze6d). cwlaodrkesdoofneHabwyaCiiaarnsoDnroinsotphheila19a7n0ds ibnufeilrtrinogntthheeplahnyldomgeanrky pril 20 thebiologicalrelevanceofthatlandscapetoourunderstand- ofasubgroupofHawaiianDrosophila,thepicture-wingflies, 22 ing of the evolutionary process (Kellogg et al. 1996; Baum based on the banding pattern of polytene chromosomes 2007; Smith et al. 2015). For example, many new studies (CarsonandKaneshiro1976),amongotherearlyphylogenetic have contributed evidence that, even with trees with high studies(BeverleyandWilson1985;ThomasandHunt1991). measures of conventional support, we can expect large During the past 20 years, the relationships between major amounts of discordance among sites and genes, especially groups have been revisited several times (Kambysellis et al. when examining speciation events with short internodes or 1995; Bonacum 2001). Most recently, O’Grady et al. (2011) withalikelihoodofintrogression(Weisrocketal.2012;Pease usedmitochondrialgenesandexpandedtaxonsampling,and et al. 2016). Here, we use the island radiation of Hawaiian Magnaccaand Price(2015) used an expandednuclear gene drosophilid flies to study the landscape of treespace, and set. The study presented here builds on this foundational (cid:2)TheAuthor(s)2022.PublishedbyOxfordUniversityPressonbehalfoftheSocietyforMolecularBiologyandEvolution. ThisisanOpenAccessarticledistributedunderthetermsoftheCreativeCommonsAttributionLicense(https://creativecommons. org/licenses/by/4.0/),whichpermitsunrestrictedreuse,distribution,andreproductioninanymedium,providedtheoriginalworkis Open Access properlycited. Mol.Biol.Evol.39(3):msac012 doi:10.1093/molbev/msac012 AdvanceAccesspublicationJanuary20,2022 1 MBE ChurchandExtavour . doi:10.1093/molbev/msac012 work, presenting the first phylogenetic analysis of genome- usingtheBayesiansoftwarepackageBEASTshowedanalter- scaledataforthegroup. nativerelationship,withhaleakalaefliesasthesistercladeto TheHawaiianDrosophilidaeconsistof566describedspe- all other Hawaiian Drosophila, a clade uniting the cies (O’Grady et al. 2010; Magnacca and Price 2012), with MMþPNAþD. primaeva, and closer affinity between hundreds more estimated to be awaiting description D. primaeva and PNA species than between D. primaeva (O’Grady et al. 2010). These species have been divided into and MM species (fig. 1C). This latter arrangement is largely the following major clades (O’Grady et al. 2010): 1) the pic- consistent with relationships proposed by Throckmorton ture-wing,nudidrosophila,ateledrosophila(PNA)clade,which (1966) and reiteratedinseveral subsequent studies (supple- hasservedasamodelcladeforthestudyofsexualselection mentaryfig.S1,SupplementaryMaterialonline)(Kambysellis (KaneshiroandBoake1987)andspeciation(Kangetal.2016); etal.1995;BakerandDeSalle1997;Bonacum2001). 2) the antopocerus, modified-tarsus, ciliated-tarsus (AMC) Resolvingtheserelationshipsiscriticalforourunderstand- clade, first proposed by Heed (1968) and (O’Grady et al. ing of the morphological and ecological evolution of these 2010) and confirmed by subsequent phylogenetic studies flies (Kambysellis et al. 1995; Bonacum 2001; O’Grady et al. D (O’Grady et al. 2011; Lapoint et al. 2014); 3) the modified- 2011). Hawaiian Drosophila demonstrate a large diversity in o w mouthparts(MM)clade;and4)thehaleakalaeclade,anenig- body size (Stevenson et al. 1995), wing size (Edwards et al. n lo matic group in need of further study (Hardy et al. 2001). 2007),andeggsize(Montagueetal.1981);inthenumberand a d e Several other smaller clades have been suggested as falling position of structural features such as wing spots (Edwards d outsideofthesemajorgroups,includingtherusticagroupof etal.2007);inthenumberofegg-producingunitsintheovary fro m threespecies(O’Gradyetal.2001),andthemonotypicline- (ovarioles)(KambysellisandHeed1971;Sarikayaetal.2019); h ages of D. primaeva and D. adventitia. The position of and in the type of substrate used for oviposition and larval ttp s D.primaevahasbeensomewhatuncertain,butseveralstud- feeding(Kambysellisetal.1995;Magnaccaetal.2008).Some ://a c ies have suggested it is the sister taxon to picture-wing flies clades demonstrate unique suites of morphological and be- ad e (Bonacum 2001), including the work on polytene chromo- havioraltraits,whoseevolutionaryhistoryisunclearbecause m ic somesbyCarsonandStalker(1969).ThespeciesD.adventitia ofuncertaintiesinthephylogeny.Forexample,thehaleakalae .o u wasoriginallysuggestedtobepartoftheMMclade(Hardy fliesexclusivelyusefungalovipositionsubstratesandarecon- p .c 1965), but recent studies placed it as the sister taxon to sidered to have less complex mating behaviors than other, om D.primaeva(Bonacum2001)orpossiblyothermajorclades. morewell-studiedgroups(e.g.,picture-wingflies)(Hardyetal. /m b Additionally,theHawaiianDrosophilaarethesistercladeof 2001). It is unclear whether this suite of traits represents a e/a the genus Scaptomyza, which is nested within the broader secondarytransitionrelativetotheancestralstate,becauseit rtic paraphyletic genus Drosophila and is hypothesized to have isnotknownwhetherhaleakalaefliesarethesistercladetoall le/3 9 colonized the island independently (Throckmorton 1966; other Hawaiian Drosophila or nested within the radiation. /3 Lapoint et al. 2013), possibly more than once (Katoh et al. Resolution in the relationships at the base of this lineage /m s a 2017). Throughout this manuscript, we use Hawaiian will be key in identifying which branches experienced sub- c 0 Drosophila to refer to non-Scaptomyza Hawaiian species, stantial trait diversification, and especially in identifying 12 /6 and Hawaiian Drosophilidae to refer to the clade of whetheranyofthesetraitsdemonstratepredictablepatterns 5 1 HawaiianDrosophilaþScaptomyza. ofcoevolution. 20 6 Many phylogenetic studies have been performed which Here,wepresentthefirstphylogenomicrelationshipsbe- 6 b haveconfirmedthemonophyly of eachofthesecladesand tweenthemajorgroupsofHawaiianDrosophilidae.Wecom- y g providedresolutionforinternalrelationships(PNA,Bonacum bine 12 new transcriptomes sequenced in this study with ue s et al. 2005; Magnacca and Price 2015; AMC, Lapoint et al. recentlypublishedgenomesfortwoHawaiianDrosophilaspe- t o n 2011, 2014; haleakalae, O’Grady and Zilversmit 2004; and cies (Kim et al. 2021), four non-Hawaiian Scaptomyza (Kim 0 5 Scaptomyza,Lapointetal.2013;Katohetal.2017).Previous etal.2021),andsixoutgroupspecies(Larkinetal.2021).By A p phylogeneticstudies,however,havenotresultedinaconsen- increasingthenumberofgenesusedtoinferrelationships,we ril 2 sus relationship between the major clades within Hawaiian begintounpacktheevolutionaryhistoryintheshortintern- 0 2 2 Drosophila (supplementary fig. S1, Supplementary Material odes at the base of the Hawaiian Drosophila radiation. online) (Magnacca and Price 2015). Magnacca and Price Following up on the critical study by Baker and DeSalle (2015)showedthatdifferentphylogeneticmethodsofanal- (1997) 25years ago, we explore the landscape of treespace ysis (e.g., using software based on Bayesian statistics rather and the discordance between species and gene trees using thanmaximumlikelihoodforinference)producedhighlyin- ourphylotranscriptomicdataset.Wethenusetheresultsof congruent topologies (fig. 1) (Magnacca and Price 2015). In ouranalysisasinitialconstraintsonsubsequentphylogenetic thatstudy,themostlikelytopologyhadD.primaevaasthe analyses using a data set of 316 species and 44 genes, com- sistertaxontoallotherHawaiianDrosophila,andincludeda piled using all previous phylogenetic studies of Hawaiian clade uniting MMþAMCþhaleakalae, with the haleakalae Drosophilidae. Finally, we estimate the age of the radiation, clade showing greater affinity to AMC species relative to andusethistime-calibratedtreetoidentifybrancheswhere MMspecies(fig.1B).Thistopology wasconsistentwiththe shiftsintraitevolutionlikelyoccurred.Ourfindingssuggesta treesuggestedbyO’Gradyetal.(2011)analyzingmitochon- relationshipbetweenmajorcladesthatisdistinctfromboth drialdataandusingmaximumlikelihoodandBayesiananal- previously hypothesized topologies, and that is well sup- yses. However, the analyses of Magnacca and Price (2015) ported by both maximum likelihood and Bayesian analyses. 2 MBE DiscordanceinthePhylogenyofHawaiianDrosophilaandScaptomyza . doi:10.1093/molbev/msac012 D o w n lo a d e d fro m h ttp s ://a c a d e m ic .o u p .c o m /m b e /a rtic le /3 9 /3 /m s a c 0 1 2 /6 5 1 FIG.1.Phylotranscriptomicanalysisindicatesrelationshipsbetweenmajorcladesdistinctfromthosepreviouslyhypothesized.Photosshowsixof 20 6 the12specieswithdenovotranscriptomespresentedinthisstudy,listingtheirparentcladeandtheHawaiianislandonwhichtheyarefound.(A) 6 b Resultsnoveltothisstudy,showingbestsupportedtreeacrossmaximumlikelihoodandBayesiananalyses.Nodelabelsindicateultrafastbootstrap y g values,genetreeconcordancefactors(gCF),andsiteconcordancefactors(sCF),seeconcordancefactoranalysisbelow.Drosophilaadventitiawas u e notpresentinphylotranscriptomicanalyses;seefigure3forinformationonitsplacement.(B,C)Previouslyhypothesizedrelationshipsbetween st o thepicturewing-nudidrosophila-ateledrosophila(PNA),modified-mouthparts(MM),antopocerus-modifiedtarsus-ciliatedtarsus(AMC),halea- n 0 kalae,andScaptomyzaclades,aswellastwomonotypicclades,D.primaevaandD.adventitia.TopologyBwasrecoveredinO’Gradyetal.(2011) 5 A andMagnaccaandPrice(2015).TopologyCwasrecoveredusingtheBayesiansoftwareBEASTinMagnaccaandPrice(2015),showingincongruent p relationshipsbetweencladesatthebaseoftheradiationofHawaiianDrosophila. ril 2 0 2 2 Weshowthatexaminingacomprehensivelandscapeoftree (Stamatakis2014),aswellastheconsensustreewithhighest andtraitspacecanallowforamorecompleteunderstanding posterior probability estimated using PhyloBayes (Lartillot ofevolutionarydynamicsinthisremarkableislandradiation. et al. 2013) (fig. 1A and supplementary figs. S2 and S3, Supplementary Material online). Bootstrap support for all Results branches was 100 and posterior probability was 1, with the Phylotranscriptomics SuggestaNewPhylogenyof exception of the branch subtending the clade uniting HawaiianDrosophilidae MMþAMC(IQtreeultrafastbootstrap:66,RAxMLbootstrap: Using a phylotranscriptomic approach, werecovered a new 57,PhyloBayesposteriorprobability:0.52).Wealsoestimated topology between the major clades of Hawaiian the phylogeny using a multispecies coalescent model with Drosophilidae, distinct from those previously hypothesized ASTRAL(Zhangetal.2018),andrecoveredthesametopol- (fig. 1 and supplementary fig. S1, Supplementary Material ogywiththeexceptionoftheplacementofD.primaeva(as online).Thistopologywasthemostlikelytreeestimatedus- the sister taxon to PNA, supplementary fig. S4, ing IQtree (Minh, Schmidt, et al. 2020) and RAxML Supplementary Material online). Each of these analyses 3 MBE ChurchandExtavour . doi:10.1093/molbev/msac012 wereperformedonasupermatrixof10,949putativelyorthol- expectation based on chance (Minh, Hahn, et al. 2020). We ogousgenes,alignedandassembledusingtheagalmapipeline foundthatformanybranchesinourtreebothgCFandsCF (Dunnetal.2013)withnofilteringbasedonoccupancy(ac- are high, indicating these relationships are supported by a tualgeneoccupancywas41.7%).Totestthesenstivityofour majorityofgenesandsitesinourdataset.Forexample,the resultstomissingdata,werepeatedtheIQtreeanalysisona branchunitingHawaiianDrosophilahasagCFof91.2,andsCF datasetreducedusinganoccupancythresholdthatensures of72.1(fig.1A).However,forthebranchessubtendingmost representationof80%oftaxaateachgene(1,926genes),and relationships between the major clades of Hawaiian recovered the same topology as with the full set of genes Drosophila, gCF and sCF are low. For example, the branch (supplementaryfig.S5,SupplementaryMaterialonline). uniting D. primaevaþMMþAMCþhaleakalae to the exclu- ThemostlikelytreeindicatesthatthePNAclade,including sionofPNAhasabootstrapvalueof100,butagCFof19.3 picture-wingspecies, isthesister cladeto allotherHawaiian andsCFof32.2. Drosophila. Drosophila primaeva is found to be the sister Wealsotestedtheextenttowhichpotentialerrorinmul- taxon to a clade containing non-PNA Hawaiian Drosophila, tiplesequencealignmentaffectedconcordancevaluesbyfil- D thoughthiscladereceivedlowersupportwhenusingthedata teringoutpoorlyalignedsequencefragments,andrepeating o w set reduced by occupancy (supplementary fig. S5, the tree inference and concordance analyses. After filtering n lo SupplementaryMaterialonline,ultrafastbootstrapof85).A poorlyalignedsequences,werecoveredthesametopologyas a d e secondmonotypiclineage,D.adventitia,wasnotsampledfor figure 1A, with the exception of the arrangement of MM, d phylotranscriptomicanalyses,butusingspecificgenemarkers, AMC, and haleakalae, here showing a MMþhaleakalae as fro m we recover this as the sister taxon to a clade including monophyletic(IQtreeultrafastbootstrap:97,supplementary h MMþAMCþhaleakalae (seesection onexpanded phyloge- fig.S6,SupplementaryMaterialonline).Concordancefactors ttps neticanalysisbelow).Thislattercladewasrecoveredinpre- betweenanalysesonfilteredandnonfiltereddatawerenearly ://a c vious phylogenetic analyses (O’Grady et al. 2011; Magnacca equivalent (e.g., the branch separating PNA from the other ad e andPrice2015).Incontrasttothosestudies,whichsuggested Hawaiian Drosophila received a gCF of 19.0 and sCF of 31.6 m ic a monophyletic clade of AMCþhaleakalae, we do not re- afterfiltering,andagCFof19.3andsCFof32.2usingalldata). .o u cover sufficient support for any particular arrangement of Furthermore, using a series of stringency thresholds to filter p .c MM, AMC, and haleakalae (ultrafast bootstrap from both thedata,weobservednopatternofincreasingordecreasing om thefullandreducedoccupancymatrixis<95). concordancefactorsacrossbranches(supplementaryfig.S7, /m b Wetestedthemostlikelytreeemergingfromouranalysis Supplementary Material online). These results suggest that e/a (fig.1A)againsttwopreviouslysuggestedalternativehypoth- values of discordance in this phylogeny are not artificially rtic eses(fig.1BandC)usingtheSwofford–Olsen–Waddell–Hillis inflatedduetotechnicalerrorsfromthealignmentstep. le/3 9 (SOWH) test (Swofford et al. 1996), a parametric bootstrap Weinterpretthemeasuresofdiscordanceasreflectingreal /3 approach for comparing phylogenetic hypotheses. In both variation in the phylogenetic signal of different genes and /m s a cases, the difference in likelihood between the most likely sites, which is not unexpected for a radiation such as this c 0 treeandthesealternativeswaslargerthanwewouldexpect with short internodes subtending major clades (Minh, 12 /6 bychance(Pvalueforboth<0.01,withasamplesizeof100). Hahn,etal.2020).Furthermore,thepresenceofdiscordance 5 1 Betweenfigure1AandBthedifferenceinlog-likelihoodwas doesnotmeanthatthereislittlethatcanbesaidaboutthe 20 6 1,774.1,andbetweenfigure1AandCwas6,132.1,whereasthe relationshipsbetweenthesegroups.Incontrast,byunpacking 6 b nulldistributionaccordingtotheSOWHtesthadnodiffer- this discordance, we can begin to qualitatively describe the y g encesgreaterthan15foreithercomparison.Takentogether, amount and distribution of phylogenetic signal for multiple ue s our results suggest a new phylogeny for Hawaiian alternative,plausiblebipartitions. t o n DrosophilidaerelationshipswhereinMM,AMC,andhaleaka- To this end, we first visualized hotspots of concordance 0 5 lae represent a monophyletic group, and the PNA clade, acrosstreespace(fig.2).Wecreatedall105topologicalcom- A p rather than either the haleakalae clade or D. primaeva, is binationsofthepossiblearrangementsbetweenmajorclades, ril 2 thesistercladetoallothers(fig.1A). andthenre-estimatedgCFandsCFforeach.Visualizingthe 0 2 2 meanvaluesforgCFandsCFplottedintreespaceshowsthat IdentifyingHotspotsofGeneandSiteConcordancein themostlikelytree,asestimatedwithIQtree,isnotthetree Treespace withthehighestmeangCFandsCF,butitisnearahotspotof Weanalyzedthestrengthofphylogeneticconcordanceinour alternativearrangementsfor whichbothofthesevaluesare phylotranscriptomicdatasetbyestimatingthegeneandsite high(fig.2,treespace,mostlikelytreeindicatedbydarkred concordancefactorsforeachbranchinourtree.Genecon- outline).Incontrasttothemostlikelytopology,thetreeswith cordance factors (gCF) are calculated as the proportion of the top three mean gCF values and two of the three trees informativegenetreesthatcontainagivenbranchbetween withthetopmeansCFvaluesuniteD.primaevaþPNAtothe taxa,andcanrangefrom0to100(Baum2007;Minh,Hahn, exclusion of other Hawaiian Drosophila. Variation between etal.2020).Siteconcordancefactors(sCF)arecalculatedas these top trees largely depends on the placement of halea- the average proportion of informative sites that support a kalaerelativetootherclades(fig.2,topgCFandsCFtrees). givenbranchbetweentaxa.Becauseonesitecanonlysupport Calculating the mean gCF and sCF across branches may oneofthreearrangementsforaquartetoftaxa,sCFtypically not always provide an informative metric, given that some ranges from (cid:2)33.3 to 100, with 33.3 representing our null topologies may contain one highly supported branch and 4 MBE DiscordanceinthePhylogenyofHawaiianDrosophilaandScaptomyza . doi:10.1093/molbev/msac012 D o w n lo a d e d fro m h ttp s ://a c a d e m ic .o u p .c o m /m b e /a FIG.2.Thelandscapeoftreespaceshowshotpotsofconcordanceamonggenesandsites.Thelandscapeoftreespaceforallpossibletopological rtic combinationsofthefivecladesofHawaiianDrosophilastudiedhere:PNA,D.primaeva,haleakalae,MM,andAMC.Individualpointsrepresent le/3 differentarrangementsofthefiveclades,labeledrandomlywithtwo-letterIDsfromaathroughea.Thedistancebetweenpointsindicatestree 9/3 similarity(calculatedfromRobinson–Fouldsdistances).Thesizeofpointsrepresentsthemeangeneconcordancefactor(gCF)acrossrelevant /m s branches,andthecolorrepresentsthemeansiteconcordancefactor(sCF;purple,low;yellow,high).Thepointoutlinedinred(treedm)indicates a c thebesttopologyfoundwithIQtree,RAxML,andPhyloBayes,whichisdistinctfromthetoptreesaccordingtomeangCF(av,ah,andap)ormean 01 2 sCF(av,az,andcf).Concordancemeasurementsforalltopologiesareavailable,seeMaterialsandMethodsandDataAvailability. /6 5 1 2 0 others with very low support. Therefore, we also analyzed Haeseler1997; Minh,Schmidt,etal. 2020). Thisanalysis cal- 6 6 concordanceforalltheuniquebipartitionsacrossthesetof culatesthelikelihoodsupportforthethreepossiblearrange- b y possible topologies (supplementary figs. S8 and S9, ments of each quartet of taxa in an alignment, and then gu e Supplementary Material online; see supplementary concor- countsthenumberofinformativequartetsthatstronglysup- st o dance factor analysis, Supplementary Material online). We portonearrangementovertheothertwo(StrimmerandVon n 0 foundthatforgCF,thereisclearsignalsupportingbipartitions Haeseler 1997). Here, we performed likelihood mapping on 5 A tMhaMtþuAniMteCDþ.hparliemakaaelvaaeþPNA(s,uapsplwemellenatsartyhose ftihga.t unSit8e, eqaucahrtoeftsthreele1v0a,9n4t9togetnheespinosoituiornfuollfdthateaPsNetA,rgersouultpinagnind23,,108725 pril 20 2 SupplementaryMaterialonline).WefoundthatforsCF,con- quartetsrelevanttothepositionofthehaleakalaegroup.Our 2 cordance values across bipartitions are more variable, but resultsshowedthat,inbothcases,thevastmajorityofquar- those that unite PNAþhaleakalae show less support than tetsareuninformative,withnostronglikelihoodsupportfor we might expect by chance, whereas those that unite anyonearrangement(supplementaryfig.S10,Supplementary D. primaevaþPNA and AMCþMM show more support Material online). Although support was for the most part (supplementary fig. S9, Supplementary Material online). In evenly divided among possible arrangements, we observed addition,betweengCFandsCF,wefoundconflictingsignals morequartetsthatunitehaleakalaeþMM,totheexclusion for bipartitions that define one clade as sister to the rest of of AMC, PNA, and D. primaeva, as well more quartets that Hawaiian Drosophila, with gCF indicating support for PNA uniteD.primaevaþPNAtotheexclusionofotherHawaiian (consistentwiththemostlikelytopology),andsCFindicating Drosophilidae, thanotherarrangements(supplementary fig. supportforhaleakalae. S10AandC,SupplementaryMaterialonline).Wealsotested Toinvestigatethesourceofdiscordanceacrossgenes,we whether genes that support one topology over another at performed a likelihood mapping analysis that assesses the thesenodeswereenrichedforeitherlongorshortgenes,or phylogenetic information in each gene (Strimmer and Von fastorslow-evolvinggenes.Ourresultsshowednocorrelation 5 MBE ChurchandExtavour . doi:10.1093/molbev/msac012 betweengenetreetopologyandgenelengthorevolutionary estimates for the ages at which Hawaiian islands became conservation (supplementary fig. S11, Supplementary habitable, based on models of island emergence, growth, Material online), with the exception of genes supporting and decline via erosion and subsidence (Lim and Marshall D. primaeva as the sister to all other Hawaiian Drosophila, 2017) (table 1). Similar results were obtained using the cali- which was supported by somewhat slower evolving genes brationschemefromRussoetal.(2013)thatincludesasingle than alternative arrangements (supplementary fig. S11D, fossil calibration point for the clade, based on the taxon SupplementaryMaterialonline). S. dominicana recovered from dominican amber (median In summary, across all analyses, we found consistent evi- rootage22.9mya).However,withbothschemes,uncertainty denceforabipartitionthatseparatesPNAandacladethat aroundtherootageremainssubstantial(95%highestposte- includes MM and AMC. Although the placement of rior density confidence interval 17.4–29 mya), and small D. primaeva with MMþAMCþhaleakalae received strong changesinthecalibrationtimesusedcanleadtosubstantial support in our maximum likelihood and Bayesian analyses, differences in this estimate. When calibrating the tree using weobservedsubstantialdiscordanceinthisarrangement,and the same island age estimates as in Magnacca and Price D detectsignalsuggestingasignificantamountofsharedhistory (2015),whicharemarginallyyounger(table1),weestimated o w betweenD.primaevaandPNA.Similarly,althoughtheclade theageofHawaiianDrosophilidaetobe(cid:2)15Myold(median n lo uniting MMþAMCþhaleakalae received strong bootstrap root age 15.5 mya). Furthermore, we note that calibrating a d e support, we observed substantial discordance in the place- usingprimarilyvicariancebasedestimatesoftimeisconsid- d mentofhaleakalae,andsuggestthatfurtherresolutioninits ered to be imprecise and should be avoided fro m placementwillbepossiblewithadditionaltaxonsamplingin (Kodandaramaiah 2011). Taken together, we consider this h thatclade. estimateoftheageofHawaiianDrosophila,aswellasthose ttps previouslypublished,tobetentative,andsuggestthatfurther ://a c CalibratinganExpandedPhylogenytoTime data(e.g.,newfossilevidence)willbenecessarytodetermine ad e Buildingonthephylotranscriptomicanalysesabove,wecol- the age of diversification relative to island emergence with m ic lectedallpubliclyavailablegenomicandtranscriptomicdata greatercertainty. .o u forspeciesfromHawaiianDrosophilaandScaptomyza.These According to this estimate, we find that the division be- p .c datawereaccessionedinnineanalysespublishedsince1997, tween major Hawaiian Drosophila clades occurred around om mostofwhichfocusedonresolvingthephylogeneticrelation- 10 mya (fig. 3), prior to the estimated time when the /m b shipswithinamajorclade(BakerandDeSalle1997;O’Grady Hawaiian island Kaua’i became habitable (between 6.3 and e/a andZilversmit2004;Bonacumetal.2005;Lapointetal.2011, 6.0mya;LimandMarshall2017).Ourresultsshowthatthe rtic 2013,2014;MagnaccaandPrice2015;Katohetal.2017).The diversification of lineages within MM also occurred around le/3 datasetwecompiledcontained44genes(sixmitochondrial thattime,whereasthelineageswithintheAMC,haleakalae, 9/3 and 38 nuclear) from 316 species (including 271 described andgrimshawigroups(PNA)allarosewithinthelast5million /m s and 45 undescribed putative species), with an overall occu- years,aroundthetimeOahubecamehabitable.Wenotethat ac 0 pancy of 17.3% (supplementary fig. S12, Supplementary theMMgroupssufferfromlowerrepresentationacrossgenes 12 Materialonline).Weusedthisdatasettoinferthephylogeny used to calibrate the tree to time (supplementary fig. S12, /65 1 with IQtree, constraining the relationships between major SupplementaryMaterialonline),andsuggestthatmoredata 20 6 cladestoconformtothetopologyshowninfigure1A. may help shed light on differences in the age of this clade 6 b Theresultingtopologyistoourknowledgethemostspe- relativetoothers. y g ciesrichphylogenetictreeoftheHawaiianDrosophilidaeto ue s date(supplementaryfig.S13,SupplementaryMaterialonline). AncestralStateReconstructionofOvipositionand t o n Severalsupportvaluesarelow(ultrafastbootstrap<95),es- LarvalFeedingEcology 0 5 peciallyfornodesnearthebaseoftheradiation.However,this With this time-calibrated tree for 316 species, we have an A p is not unexpected, given that this phylogeny is estimated opportunitytoinvestigatetheevolutionarydynamicsoftrait ril 2 primarilyfromthesamedatapreviouslyanalyzed,whichre- diversification.Bymodelingtheevolutionofthediversesuite 0 2 2 coveredalternativerelationshipsatthosenodes.Ofnoteare of ecological and morphological features across the phylog- thelowsupportvaluesfortherelationshipswithintheMM eny, wecanidentify which lineageshaveexperienced major and haleakalae clades (supplementary fig. S13, shiftsintraitevolution.Predictingthenumberandphyloge- Supplementary Material online, polytomies), emphasizing neticpositionoftheseshiftswillinturnbecriticalforinform- theneedforfurtherstudyinthesegroups. ingfuturestudiesondevelopment,life-history,andevolution Weusedthisexpandedgeneticdatasetandtopologyto oftheseflies.Inthefollowinganalysesoftraitevolution,we estimatetheageoftheHawaiianDrosophilidaebycalibrating used the maximum clade credibility tree from the con- this tree to time using the software package BEAST strained BEAST analysis described above. Using this tree (Bouckaert et al. 2014). Consistent with recent publications allowsustomaximizethenumberoftaxaforwhichgenetic (Obbard et al. 2012; Magnacca and Price 2015; Katoh et al. data are available, painting the most complete picture of 2017),our resultsindicatethat theageofthesplit between ecological and morphological evolution in these flies up to Hawaiian Drosophila and Scaptomyza occurred between 20 thisdate.However,duetothefractionofgeneticdatamissing and 25 million years ago (fig. 3, median root age 22.8 mya). acrosstaxa,italsoincludesnodeswithlowbootstrapsupport The results shown here were calibrated using updated (supplementary fig. S13, Supplementary Material online, 6 MBE DiscordanceinthePhylogenyofHawaiianDrosophilaandScaptomyza . doi:10.1093/molbev/msac012 D o w n lo a d e d fro m h ttp s ://a c a d e m ic .o u p .c o m /m b e /a rtic le /3 9 /3 /m s a c 0 1 2 /6 5 1 2 0 6 6 b y g u e s t o n 0 5 A p ril 2 0 2 2 FIG.3. Time-calibratingthephylogenyof316Drosophilidaespecies.ThisphylogenywasinferredusingIQtreetoanalyzeallpubliclyavailable geneticdataforHawaiianDrosophilaandScaptomyza.ItwasthencalibratedtotimeusingthesoftwareBEAST,withfourcalibrationpointsat nodesthatshowabiogeographicprogressionrule(MagnaccaandPrice2015).Similarresultswereobtainedusingacalibrationscheme(Russoetal. 2013)thattakesintoaccountasinglefossiltaxonforthegroup(table1).The95%highestposteriordensityintervalsforeachnodeareshownas graybars,indicatingthecredibleintervalfortheageofthatgroup.TheageatwhichfourHawaiianislandsareestimatedtohavebecomehabitable isshowningreen.Coloredlabelsindicatethecladetowhichtaxabelong,andcolorscorrespondtofigure1;taxawithoutacoloredlabelarespecies withgeneticdatathatareasofyetundescribed.SeesupplementaryfigureS13,SupplementaryMaterialonline,forbootstrapsupport.Calibration usingonlyislandbiogeographyisknowntobeimprecise(Kodandaramaiah2011);therefore,thedivergencetimesshownhereareconsidered tentative. 7 MBE ChurchandExtavour . doi:10.1093/molbev/msac012 Table1. CalibrationPointsforDatingwithBEAST. Group SpeciesIncluded Magnacca Magnacca Russoetal. andPrice andPrice (2013) (2015) (2015), UpdatedAges Scaptomyzagenus AllScaptomyzaspecies 25.7(3.0) planitibiagroup D.anomalipes,D.quasianomialipes,D.oahuensis,D. 3(0.5) 4.135(0.5) obscuripes,D.hemipeza,D.melanocephala,D.pla- nitibia,D.heteroneura,D.silvestris,D.neoperkinsi, D.neopicta,D.ingens,D.differens,D.substenoptera, D.cyrtoloma,D.hanaulae,D.nigribasis lanaiensissubgroup D.lanaiensis,D.hexachaetae,D.digressa,D.moli 3(0.5) 4.135(0.5) picticornissubgroup D.picticornis,D.setosifrons,D.pilipa 4.7(0.1) planitibiasubgroup D.differens,D.hemipeza,D.planitibia,D.silvestris,D. 2.8(0.1) heteroneura D o cyrtolomasubgroup D.neoperkinsi,D.obscuripes,D.melanocephala,D. 2.8(0.1) w n cyrtoloma,D.ingens,D.hanaulae,D.oahuensis lo sobrina1orthofascia1ciliaticrus D.sobrina,D.orthofascia,D.ciliaticrus 1.7(0.3) 2.55(0.3) ad e silvestris1heteroneura D.silvestris,D.heteroneura 0.5(0.2) 1.2(0.2) 0.6(0.04) d fro NOTE.—Valuesaremean(standarddeviation)ageinmillionyearsfornormallydistributedtimepriors. m h ttp s polytomies).Therefore,forinternallineagesforwhichevolu- Werecoveratransitionfrombarktoleafbreedingatthebase ://a tionaryrelationshipsremainuncertain,thepositionofthese oftheAMCcladethathasgenerallypersistedthroughoutthe ca d evolutionaryshiftsaresubjecttochangeasmoregeneticdata diversification of that group. As previously reported em become available and further phylogenetic resolution is (Magnaccaetal. 2008),wefindseveralgroupsthatdemon- ic .o achieved. strate no reported variation in substrate type (e.g., fungus up .c The Hawaiian Drosophilidae use a wide variety of plant, breedinghaleakalae,fig.5B). o m animal, and fungal species for egg laying and larval feeding Over 1,000 stochastic character maps, we recovered an /m b (fig.4)(Heed1968;Montgomery1975;Magnaccaetal.2008). average of 44 transitions in oviposition substrate over the e /a Themajorityofspeciesbreedinrottingsubstrates,withvar- evolutionaryhistoryofHawaiianDrosophilidae.Themajority rtic iationinthepartoftheplantorfungusinquestion,including of thesechanges occurred along branches leading to extant le /3 rottingbark,leaves,flowers,andfruit.Afewspeciesbreedon tips,withfewtransitionsatinternalnodes(onthesummary 9 /3 live tissue, and one notable Scaptomyza subgenus, tree,8outof36totalchanges).Onaverage,70%oftransitions /m s Titanochaeta, have been reared exclusively from spider egg werebetweenusingasinglesubstratetypeasaprimaryhost a c 0 masses(Knab1914).In2008,Magnaccaetal.(2008)reviewed (“specialist” species) and using multiple types (“generalist” 1 2 host plant and substrate records and found that, whereas species,definedasusinganytwosubstratesthateachcom- /65 1 manyspeciescanbeconsideredspecialiststospeciesorsub- prise >1 of all rearing records, or with no substrate that 2 strate,hostshiftingwascommonandmanyspeciesoccasion- comprise4s >2 of rearing records; Magnacca et al. 2008). 066 3 b ally use nonpreferred substrates. The type of oviposition Other transitions were primarily between using rotting y g substrate has been suggested as a driver for diversification bark,leaves,orsap.Pinpointingbranchesoflikelytransitions ue s of the reproductive traits ovariole number and egg size showsthatsomegroupshaveexperiencedmanymoretran- t o n (Kambysellis and Heed 1971; Kambysellis et al. 1995; sitions than others, especially MM and non-flower/non-spi- 0 5 Sarikaya et al. 2019). However, the previous reconstruction der egg breeding Scaptomyza. Most generalist species fall in A p ofovipositionsubstratebyKambysellisetal.(1995)wasper- oneofthesetwoclades,whichalsoincludespecialistbarkand ril 2 formedwithaphylogenythatincludedonlythreenon-PNA leafbreeders,amongothersubstrates. 0 2 species, and was therefore unable to resolve the ancestral 2 oviposition substrate for Hawaiian Drosophila or to identify EvolutionofWing,Thorax,andBodyLength when evolutionary shifts in substrate outside of PNA were Alongside ecological diversification, the Hawaiian likelytohaveoccurred. Drosophilidae show substantial diversity in adult body size. We combined the phylogenetic results presented here Weusedthetime-calibratedphylogenytomodelthenumber with the data summarized in Magnacca et al. (2008), to re- andtimingofmajorchangesintheevolutionarydynamicsof constructtheancestralovipositionsubstratefortheHawaiian sizeacrossthephylogeny.First,wedigitized795recordsfrom Drosophilidae (fig. 5A and supplementary fig. S14, 26publications(GrimshawandSpeiser1901;Grimshaw1902; Supplementary Material online). Using stochastic character Frederick1914;Bryan1934,1938;Wirth1952;Hackman1959; mapping (Huelsenbeck et al. 2003), we recover the most Hardy 1965, 1966, 1969, 1977; Hardy and Kaneshiro 1968, probable ancestral oviposition substrate for the Hawaiian 1969, 1971, 1972, 1975, 1979; Kambysellis and Heed 1971; Drosophila as bark breeding (defined as including rearing Hardy et al. 2001; O’Grady et al. 2001, 2003; Starmer et al. recordsfrombark, stems, branches,roots, and fernrachises, 2003; Magnacca and O’Grady 2008, 2009; Craddock et al. seesupplementarytableS5,SupplementaryMaterialonline). 2016; Sarikaya et al. 2019), including descriptions of body, 8 MBE DiscordanceinthePhylogenyofHawaiianDrosophilaandScaptomyza . doi:10.1093/molbev/msac012 D o w n lo a d e d fro m h ttp s ://a c a d e m ic .o u p .c o m /m b e /a rtic le /3 9 /3 /m s a c 0 1 2 /6 5 1 2 0 6 6 b y g u e s t o n 0 5 A p ril 2 0 2 2 FIG.4. Ancestralstatereconstructionofovipositionsubstrateindicatesdozensofevolutionarytransitions.(A)Weusedstochasticcharacter mappingtoreconstructtheancestralsubstrateusedforovipositionandlarvalfeeding,andidentifieddozensoflikelytransitionsinsubstrate(gray circles). Branch color indicates the ancestral substrate type with highest probability, and tip box indicates extant oviposition substrate. (B) Ovipositionsubstratecategorywasdefinedbasedonrearingrecords,usingthedatasummarizedinMagnaccaetal.(2008).Generalistspeciesare definedasthosewithanytwosubstratesthateachcomprise>1ofrearingrecords,oranyspecieswithnosubstratethatcomprises>2ofrearing 4 3 records. wing,andthoraxlengthacross552species.Thenwemapped In the case of wing length, we find evidence for several these traits onto our phylogenetic results, and used the R highlysupportedregimeshiftsintheevolutionaryhistoryof package bayou (Uyeda and Harmon 2014) to identify Hawaiian Drosophilidae (fig. 5). Some of these are indepen- branchesthatrepresentprobableshiftsintraitdiversification. dentshiftsonbranchessubtendinggroupswithlargerwings ThispackageusesOrnstein–Uhlenbeck(OU)modelstode- thantheirnearbyrelatives,includingfliesintheantopocerus scribeshiftsinevolutionaryregimes,definedaslineagesthat group (AMC) and in the EngiscaptomyzaþGrimshawomyia shareanOUoptimumtraitvalue. subgenera (Scaptomyza). Others are independent shifts on 9 MBE ChurchandExtavour . doi:10.1093/molbev/msac012 D o w n lo a d e d fro m h ttp s ://a c a d e m ic .o u p .c o m /m b e /a rtic le /3 9 /3 /m s a c 0 1 2 /6 5 1 2 0 6 FIG.5.Multipleshiftsinevolutionaryregimeshelpexplainthediversityofwinglength.(A)UsingtheRpackagebayou(UyedaandHarmon2014), 6 b wemodeledtheevolutionofwinglength(mm)onthephylogenyanddetectedseveralprobableshiftsinevolutionaryregimes(graycircles,larger y g indicates greater posterior probability that a shift occurred on that branch). Locations of probable shifts include at the base of the u e s MMþAMCþhaleakalaeclade,subtendingtheantopocerusgroup(AMC),andsubtendingthenudidrosophila(PNA),amongothers.(B)The t o distributionofwinglengthsacrossthephylogenyofHawaiianDrosophilaandScaptomyza. n 0 5 A p branchessubtendinglineageswithsmallerwingsthannearby antopocerus, and nudidrosophila, consistent with the shifts ril 2 relativessuchasthenanellaþischnotrix(MM)andthenudi- recovered for wing size. In the case of body size, the most 0 2 drosophilasubgroups(PNA).Thissuggeststhattheevolution- probable shifts are located at the base of the Hawaiian 2 ary history of Hawaiian Drosophila has included multiple DrosophilaandtheEngiscaptomyzaþGrimshawomyiasubge- convergent transitions to both larger and smaller wings. In nera. However, for body length, no regime shifts received thecaseofnudidrosophila(PNA),wenotethatthetopology substantially more support than others, despite running recovered in the summary tree dividing this subgroup into bayou for an extra million generations and achieving afinal twolineageshasverylowbootstrapsupport(supplementary effectivesizeforlog-likelihoodof401.9. fig.S13,SupplementaryMaterialonline,polytomies),andwe suggestthatthetwoshiftstosmallerwingsrecoveredwithin EvidenceforConvergentEvolutionofOvariole PNA may represent a single shift if this group is indeed NumberandEggSize monophyletic. Wealsoperformedtheseanalysesonreproductivetraits,in- Wefoundsimilarresultswhenconsideringthoraxlength cluding egg size, egg shape (aspect ratio, calculated as egg (supplementaryfig.S15,SupplementaryMaterialonline)and length/width), and the number of egg-producing compart- bodylength(supplementaryfig.S16,SupplementaryMaterial ments in the ovary (ovarioles). These traits have been the online).Inthecaseoftheformer,wefindshiftsatthebaseof subject of several life-history studies regarding the 10