Laneetal.BMCGenomics (2016) 17:702 DOI10.1186/s12864-016-3052-0 RESEARCH ARTICLE Open Access The green ash transcriptome and identification of genes responding to abiotic and biotic stresses Thomas Lane1, Teodora Best2, Nicole Zembower2, Jack Davitt1, Nathan Henry1, Yi Xu3,4, Jennifer Koch5, Haiying Liang3, John McGraw6, Stephan Schuster6, Donghwan Shim2, Mark V. Coggeshall7, John E. Carlson2 and Margaret E. Staton1* Abstract Background: Todevelop a set of transcriptomesequences to support research on environmental stress responses ingreen ash (Fraxinuspennsylvanica), weundertookdeep RNA sequencing of green ash tissues under various stress treatments.Thetreatments,includingemeraldashborer(EAB)feeding,heat,drought,coldandozone,wereselected tomimictheincreasingthreatsofclimatechangeandinvasivepestsfacedbygreenashacrossitsnativehabitat. Results:WereportthegenerationandassemblyofRNAsequencesfrom55greenashsamplesinto107,611putative uniquetranscripts(PUTs).52,899openreadingframeswereidentified.FunctionalannotationofthePUTsby comparisontotheUniprotproteindatabaseidentifiedmatchesfor63%oftranscriptsandfor98%oftranscriptswith ORFs.Furtherfunctionalannotationidentifiedconservedproteindomainsandassignedgeneontologytermstothe PUTs.ExaminationoftranscriptexpressionacrossdifferentRNAlibrariesrevealedthatexpressionpatternsclustered basedontissuesregardlessofstresstreatment.Thetranscriptsfromstresstreatmentswerefurtherexaminedtoidentify differentialexpression.TenstohundredsofdifferentiallyexpressedPUTswereidentifiedforeachstresstreatment.Aset of109PUTswerefoundtobeconsistentlyupordownregulatedacrossthreeormoredifferentstresstreatments, representingbasalstressresponsecandidategenesingreenash.Inaddition,1956simplesequencerepeatswere identifiedinthePUTs,ofwhichweidentified465highqualityDNAmarkersanddesignedflankingPCRprimers. Conclusions:NorthAmericannativeashtreeshavesufferedextensivemortalityduetoEABinfestation,creatinga needtobreedorselectforresistantgreenashgenotypes.Stressfromclimatechangeisanadditionalconcernfor longevityofnativeashpopulations.Theuseofgenomicscouldacceleratemanagementefforts.Thegreenash transcriptomewehavedevelopedprovidesimportantsequenceinformation,geneticmarkersandstress-response candidategenes. Keywords:Transcriptome,RNASeq,Assembly,Fraxinus,Emeraldashborer,Heat,Drought,Cold,Ozone,Stressresponse Background tree, in parks, and in residential areas due to fast growth Green ash (Fraxinus pennsylvanica Marsh.) is the most andadaptabilitytourbanconditions.Both natural stands widely distributed species in the Fraxinus genus in and urban plantings of green ash are now seriously North America. Green ash is valuable both economically threatened by the emerald ash borer (EAB, Agrilus and ecologically. Green ash produces a large number of planipennis Fairmaire), a pest of Asian ash species acci- seeds, an important source of food for a diverse array of dentallyintroducedintoNorthAmerica[2].EABwasori- wildlife species [1]. It has been widely planted as a street ginally identified as the cause of widespread death and decline of native ash trees in Michigan and Ontario in 2002 [2]. EAB has since spread quickly and is currently *Correspondence:[email protected] 1DepartmentofEntomologyandPlantPathology,UniversityofTennessee, found in 26 U.S. states and two Canadian provinces Knoxville,TN37966,USA [3]. All native North American ash trees are considered Fulllistofauthorinformationisavailableattheendofthearticle ©2016TheAuthor(s).OpenAccessThisarticleisdistributedunderthetermsoftheCreativeCommonsAttribution4.0 InternationalLicense(http://creativecommons.org/licenses/by/4.0/),whichpermitsunrestricteduse,distribution,and reproductioninanymedium,providedyougiveappropriatecredittotheoriginalauthor(s)andthesource,providealinkto theCreativeCommonslicense,andindicateifchangesweremade.TheCreativeCommonsPublicDomainDedicationwaiver (http://creativecommons.org/publicdomain/zero/1.0/)appliestothedatamadeavailableinthisarticle,unlessotherwisestated. Laneetal.BMCGenomics (2016) 17:702 Page2of16 susceptible to this pest, and mortality rates of up to 99 % a diversityoftissues and stressconditions, includingcold, have been observed in forest stands 6 years after infest- heat, drought, ozone, wounding, and EAB-feeding. The ation[4].EABhaskilledmillionsofashtreesinMichigan, sequences were assembled into a reference transcriptome Ohio and Indiana, and is spreading rapidly across North and differential gene expression between libraries was America[5].Economicandecologicaldamageisexpected examined to identify general stress-response genes con- to occur as the pest spreads [6–8] with estimates of costs servedacrosstissuesandstresstypes. due to lost tree value, removal and replacement ranging from$10.7billionto$26.0billion[9,10]. Methods Abiotic stresses induced by climate change, including Samplingandtreatments droughtand heat, pose an increasing threat for allNorth Parenttreetreatmentsandtissuecollection American trees, including green ash [11]. In northeastern Seed with wings, axial buds, terminal buds, leaflets, one- U.S.forests,greaterprecipitationandwarmertemperatures year-old twigs, xylem from three-year-old twigs and open havebeenrecorded[12],andspeciesrangesareexpectedto pollinatedseedswerecollectedfromahealthyadultgreen shift northward in response [13]. Climate change can also ash tree located at the University of Missouri’s Agrofor- affect the spread of invasive pests [14, 15]. Changing estry Research Center. From the same tree, attached leaf- climate may help the EAB to move further north into lets were wounded by punching multiple holes with a territorywherewintersare currentlytoocoldtoallowthe paperpunchacrosstheleaftissue,and one-year-old twigs larvae to survive [16], possibly more rapidly than the nat- were wounded by snapping the end of the twigs off. urallyslowrangeshiftsexpectedofthetreesthemselves. Tissues from the wound sites (broken twig end or leaf Molecular tools and genomic resources are less well holemargin)werecollectedafter5handagainafter24h. established for most forest trees than for agricultural DNA samples from this tree have been banked and are crops, although forestry applications show great promise availableuponrequest. [17, 18]. Green ash is one of the economically and eco- logically important tree species facing devastating losses Seedlinggrowthforstresstreatments from pests and other environmental stresses for which Open pollinated seeds from the above parent tree genomic resources are needed to gain a greater under- were germinated in the greenhouse at the University standingofmolecularresponsestostress.Transcriptome ofMissouri.Attheageof1year,seedlingswereshippedto studies have been an effective means for identifying Clemson University and to Pennsylvania State University candidate genes utilized by trees to combat stress, in for abiotic stress treatments. One-year-oldopen-pollinated species such as cork oak (Quercus suber) [19], chestnut seedlingswereacclimatedtothenormalgreenhouseenvir- (Castanea mollissima) [20] and Douglas fir (Pseudotsuga onment for at least 1 month prior to abiotic stress treat- menziesii) [21].Transcriptome data also serve as a source ments.Sixbiologicalreplicateswereusedforalltreatments of sequence-based genetic markers, enabling studies levels for cold, heat, wounding, drought and ozone stress of population structure, genetic linkage mapping, quanti- experimentsonseedlings. tative trait loci (QTL) identification, and associations of phenotype with genotype. Such information provides Seedlingtemperaturestresstreatmentsandtissuecollection powerful tools to advance pedigree-based breeding Heat and cold treatments were conducted in a growth andselectionprogramsaswellasmanagementofstanding chamber with a cycle of cool fluorescent light followed populations[17]. by 8 h of dark. Heat-stressed tissues were collected after Currently, there is insufficient genomic information 24 h of exposure to 40 °C. Cold stress was induced by forgreenashtoundertakemolecular-basedtreeimprove- exposing 12 seedlings to 4 °C for 24 h. Tissues were ment approaches. Little is known about ash response to collected from half of the seedlings immediately after stress at the gene level. A previously reported RNA- cold stress (“cold stressed”). Tissues were collected Seq resource pooling phloem tissue samples from green, from the other half of the seedlings 24 h after they white, black, blue and Manchurian ash species yielded were returned to the normal greenhouse environment over58,000assembledtranscriptsequences,avaluablere- (“cold stressed, recovery”). source for genetic marker design and initial functional characterization of genes expressed in ash phloem tissue, Seedlingdroughtstresstreatmentsandtissuecollection onwhichEABfeeds[22].However,thesequenceobtained Drought and mechanical wounding were performed in in that study was all from healthy tissues pooled from all the greenhouse facility. For drought treatment, watering five species, making it impossible to assign sequences was withdrawn from two sets of seedlings: one set was to individual species or to identify genes activated in fortissuecollection,andtheothersetforpetiolepre-dawn response to stress. To expand the resources for green waterpotentialmeasurements.Whenwaterpotentialinsix ashwe conducted high-throughputRNA sequencingwith out of seven surveyed seedlings dropped below −0.1Mpa, Laneetal.BMCGenomics (2016) 17:702 Page3of16 tissueswerecollectedfromsixseedlingsgoingthroughthe NRS16-243-2016 (tree PE21), FS-NRS16-244-2016 (tree samedroughtschemeasthesurveyedones. PE22), and FS-NRS16-245-2016 (tree PE24). Tree PE36 was collected by Kathleen Knight and David Carey. Trees Seedlingmechanicalwoundingtreatmentsandtissue PE19, PE21, PE22 and PE24 were collected by Mary collection Mason and Dan Herms.Green ash treeSUM is the culti- Mechanical wounding was introduced by punching four var‘Summit’andwasaccessionedfromDawesArboretum holes per leaf. Wounded leaves were then collected ei- (D1991-0541). A voucher for this tree is also available at ther 5 or 24 h after wounding. All tissues were collected theUSDAForestService,NorthernResearchStation,Pro- in the morning, except for the 5 h after wounding time jectNRS-16(FS-NRS16-240-2016). point, for which tissues were collected in the afternoon. Leaf, petiole, and root tissues were collected for heat, RNAextractionandlibrarypreparation cold, and drought treatments; leaf and petiole tissues Tissue samples collected at the time points mentioned were collected formechanicalwounding asabove. above were immediately flash frozen in liquid nitrogen and stored at −80 °C until RNA extraction. For stress Seedlingozonestresstreatmentsandtissuecollection experiments of seedlings with biological replicates, tissue Greenhouse-acclimated seedlings (<10 parts per billion samplesforRNAextractionwerepooledinequalamounts (ppb) ozone) were placed into Continuously Stirred from each replicates per treatment per time point. Total Tank Reactor (CSTR) chambers. After at least 3 days of RNAwasisolatedfrom~1goffrozenpooledtissue,using acclimation to the chambers, ozone was delivered as de- a modified CTAB method with lithium chloride precipita- scribedbyHecketal.(1978)[23].Ozoneexposureswere tion [25]. RNA quality was assessed using an Agilent conducted for 28 days: <10 ppb ozone as control, Bioanalyzer (Agilent technologies). cDNA libraries were 80 ppb, 125 ppb, and 225 ppb. Ozone was delivered in prepared from the pooled RNAs for each treatment and square-wave fashion, for 8 hr, 7 day/wk, with exposures timepointforatotalof55libraries.Foreachsample,1ug beginning at 0900 h, ending at 1659, via a controllable of RNA was converted to cDNA using the Illumina micro metering system. Concentrations were monitored TruSeq kit. The cDNA samples were sheared on a with a TECO Model 49 O analyzer and data logger/ Covaris S2 to ~300 bp, following the manufacturer’s 3 computer recording system. Leaf samples were collected recommendation (Covaris, Woburn, MA). Size selec- at 3 time points (7 hr, 14 days, 28 days) after stress initi- tion was performed on the Biomek FXp using the ation with six biological replicates (seedlings) for each SPRIworks HT Reagent Kit. Each library was uniquely time point/treatment, for a total of 24 seedlings. After taggedwithoneofIllumina’sTruSeqLTDNAbarcodesto sampling on the 28th day, mechanical wounding was allow library pooling for sequencing. Library quantitation conducted in situ, with three leaves of each plant being was performed using Invitrogen’s Picogreen assay and the wounded by multiple hole punches. Leaf samples from averagelibrarysizewasdeterminedbyrunningthelibrar- the hole margin were collected 24 h post-wounding ies on a Bioanalyzer DNA 1000 chip (Agilent). Library (i.e. “29-day wounding”). concentration was validated by qPCR on a StepOne Plus realtime thermocycler (Applied Biosystems, Grand Island EABlarvaetreatments NY), using qPCR primers, standards and reagents from Tissues from four putatively EAB-resistant (PE19, PE21, KapaBiosystems(Wilmington,MA). PE22, and PE24) and two confirmed EAB-susceptible (PE36 and SUM) green ash genotypes were collected Transcriptomesequencinganddenovoassembly before and after exposure to EAB larvae. The EAB- Of the 55 RNASeq libraries for green ash, 41 were se- feeding bioassay is described in [24]. Briefly, the two quenced on both the Illumina MiSeq Desktop and the EAB-susceptible and four EAB-resistant genotypes Illumina HiSeq 2000 sequencers (San Diego, CA), 12 were grafted and grown in the greenhouse for 2 years. were sequenced only on the MiSeq, and two were se- Samples of bark and phloem of each genotype were quencedonlyusingHiSeq2000.Formostlibraries,qual- acquired in the summer as untreated control tissues. ity was assessed by running the samples on an Illumina Subsequently EAB eggs were placed under the bark MiSeq sequencer and high throughput sequencing was aroundthe stem of each graft for hatching. After 8 weeks carried out on an Illumina HiSeq 2000 sequencer at a of EAB larvae feeding, the insects were removed and the read-length of101bp paired-end. All rawreadswere de- EAB-damaged phloem and bark tissues were sampled. posited in the NCBI Short Read Archive (SRA) under Vouchers for the wild-collected green ash trees are avail- the bioproject accession PRJNA273266. A summary of able at USDA Forest Service, Northern Research Station, readstatistics perlibraryisprovidedinAdditionalfile1. ProjectNRS-16undervoucheraccessionsFS-NRS16-241- Raw sequences were trimmed using trimmomatic ver- 2016 (tree PE36), FS-NRS16-242-2016 (tree PE19), FS- sion0.32[26].TrimmedreadsgeneratedfromtheMiSeq Laneetal.BMCGenomics (2016) 17:702 Page4of16 platform were assembled using Trinity pipeline version 100–450, primer_min_tm=55.0, primer_max_tm=65.0, r20121005 [27]. Outputs from theTrinity assembly were primer_min_gc=40, primer_max_gc=60, primer_max_ furtherassembledwithcd-hitversion4.6.1tocollapseiso- poly_x=3, primer_gc_clamp=2. The perl script used to forms [28]. TheTrinity plugin TransDecoder was used to extract these sequences and run Primer3 is available at predictopenreadingframes(ORFs)intheassembly[29]. via GitHub [38]. The spreadsheet output by this script is In supplemental files and public repositories, all tran- available as Additional file 3 and contains summary sta- scriptnamesbeginwith“Fraxinus_pennsylvanica_120313_” tistics,SSRlocationsandprimersequences. to indicate transcriptome origin and version. This part of the transcript name this has been removed from the text Geneexpressionacrosstissuesandtreatments for brevity. For example, transcript “Fraxinus_pennsylvani- HTSeq version 0.6.1p1 was used to produce raw read ca_120313_comp52211_c0_seq2” is referred to in the text counts for each transcript per library [39]. To account as“comp52211_c0_seq2”. for variations among gene length and library size, these counts were converted to the metric RPKM (reads per Assemblyqualityassessment kilobase per million mapped reads) [40]. To calculate Three methods were used for assessing whether the as- the number of transcripts expressed in each sample, a sembly contains all or most ofthe green ash genes. First, minimum expression cut-off of RPKM>0.1 was used. CEGMA (Core Eukaryotic Genes Mapping Approach) As previously described [41–43], log2(RPKM +1) nor- version 2.5 was used to compare the assembled green malized values were used for clustering; adding one was ash transcriptome against the core set of eukaryotic necessary to prevent a log2 transformation from calcu- genes [30]. Next, nine incremental assemblies were con- lating undefined values in cases of zero values. Pearson ducted with subsets of data to build a saturation curve correlations were calculated using the result of the and predict if new gene discovery would be likely with log2-transformations. A distance matrix was constructed additional sequencing. The nine assemblies were per- using these values. Hierarchical clustering was performed formed with the same methodology previously described usingthehclustfunctionwithaveragedistance. for the full assembly. Each additional assembly used all of the data from the previous assembly plus an add- Differentialexpressionanalysis itional setoftissues ortreatments.Thelibrariesincluded The R package DESeq2 version 1.63 was used to deter- in each assembly and assembly statistics are provided in mine statistically significant differentially expression Additional file 2. A third quality assessment was per- [44]. Raw counts from HTSeq were provided as input. formed by aligning all trimmed read pairs to the All comparisons used the default Wald test except the transcriptome with bowtie2 version 2.2.1 [31]. Ozone libraries where the likelihood ratio test (LRT) was utilized. Principal component analyses (PCA) were FunctionalannotationandSSRdiscovery also calculated with the DESeq2 package. The R code Both the transcript sequences and the amino acid se- utilized to generate the results is available at GitHub quences from the predicted ORFs were queried against [45] and the lists of differentially expressed putative the Swiss-Prot protein database and the plant taxonomic uniquetranscripts(PUTs)areavailableinAdditionalfile4. division of the TrEMBL protein database [32] using The set of up and down differentially expressed PUTs BLAST+ version 2.2.22 [33].Aminoacid sequences were were each assessed for GO term enrichment using the subjected to InterProScan version 5.4–47.0 searches to Cytoscape application BiNGO v3.03 [46], an often used predict protein family membership and identify con- tool for assessing GO enrichment in transcriptome served domains [34]. Gene ontology (GO) terms [35] studies [47–49]. The R scripts and raw data files used were assignedusingtheInterProScan software[36]. to generate figures, including the cluster analysis and Simple sequence repeats (SSRs) were identified from GO term enrichment in BiNGO, are archived publicly transcripts. Di-, tri-, and tetra- nucleotide repeats were athttps://github.com/statonlab/green_ash_rnaseq.AllGO only reported if they met the following criteria: di- enrichmentresultsarelistedinAdditionalfile5. nucleotide repeats with 8–200 copies, tri-nucleotide re- peats with 7–133 copies, and tetra-nucleotide repeats Results and discussion with 6–100 copies. SSRs were flagged as compound if Transcriptomesequencinganddenovoassembly they were adjacent or separated by less than 15 bases. Transcriptome sequencing of 55green ash RNA samples Primer3 v2.3.6 was used to design primers flanking the spanning a variety of tissues and treatments yielded over SSRs, excluding all compound SSRs. Sequences were 99 Gb of sequence data. Sample libraries encompass masked for low complexity regions with dustmasker [37] EAB damage, specialized tissues, mechanical wounding, prior to primer design. The following parameters were heat exposure, drought exposure, cold exposure, ozone altered from the default: primer_product_size_range= exposure, and ozone exposure plus mechanical wounding Laneetal.BMCGenomics (2016) 17:702 Page5of16 (Table1).Astringentfilteringanddenovoassemblypipe- additional libraries added to each subsequent assembly line was used to produce 107,611 PUTs. The PUTs have until all libraries were included. The incremental assem- anaveragelengthof818nucleotidebasepairsandanN50 bly size ranged from 14,723 transcripts to 107,363 tran- of 1327 bases. A total of 52,899 open reading frames scriptsdepending ontotal data includedintheassembly; (ORFs) could be identified from 47,069 PUTs (43 %). more sequencing libraries resulted in more transcripts PUTs with more than one ORF may represent operons with each addition of data (Additional file 2). While the fromplastidormitochondrialgenomes,chimeras,ormis- overall number of transcripts and ORFs are increasing assemblieswherea1–2baseinsertion/deletionshiftedthe with the addition of data, the total number of identified ORF. PUTs without an ORF were of considerably shorter ORFs increased much more slowly, indicating that length:409basesinaveragelengthofPUTswithoutORFs additional sequencing would likely yield few new ORFs vs. 1342 bases average length of PUTs with ORFs. This (Fig. 2a). From an assembly with input data of 51.8 M may indicate that lack of a captured start or stop codon reads to 54.0 M reads, 3149 new transcripts were found impeded ORF identification or that the reads originated but only 1053 new ORFs were discovered. Regarding the fromnoncodingRNAs. lengthofthetranscripts,bothN50andaveragelengthare increasing only slightly at the largest data input sizes Assemblycompleteness (Fig. 2b): one additional base pair in N50 length per Three strategies were employed to test the completeness addition of 200,000 input reads and one additional base of the transcriptome: comparison to CEGMA (Core pair in average transcript length per addition of 500,000 Eukaryotic Genes Mapping Approach), alignment of input reads.Saturationwasreached forpeptidelengthfor reads to the transcriptome, and saturation analysis. both N50 as well as average length. From the fourth CEGMA includes a database containing 248 highly con- largestassemblywithaninputof26.0Mreadstothefinal served eukaryotic genes and a computationalmethod for assemblyof54.0Mreads,theaverageandN50lengthsin- assessing the presence of these genes in a dataset [30]. creasedbyonlyfourandfiveaminoacids,respectively. Comparison of the green ash transcriptome to the The three quality assessment metrics indicate that the CEGMA dataset indicates that all core eukaryotic genes green ash de novo transcriptome represents the majority are present in the final assembly. The majority, 238 of expressed genes based on presence of conserved genes or 96 %, were found to be complete in length in eukaryotic genes, the alignment of the majority of reads the green ash transcriptome. For the remaining ten backtotheassembly,andasaturationanalysisindicating genes, green ash transcripts were found but span only a that few new ORFs are likely to be discovered with add- portion oftheexpected gene length. itional sequencing. The strategy of sequencing RNA To assess how well the final assembly represented all from avariety ofgreenash tissues and treatments effect- of the sequenced reads, the reads were aligned to the ivelymaximizedthesamplingofallgenes,yieldingarich PUTs.A slight difference inrates ofalignmentfor librar- transcriptome sequence resource for further genomic ies from two different sequencing instruments was de- andgeneticwork inash. tected. For reads from the MiSeq instrument, an average of89%ofreadsalignedwitharangeforthe53individual FunctionalannotationandSSRdiscovery libraries from 83 to 91 %. Reads from the HiSeq aligned Toprovide functionalannotation for the green ashPUTs, onaverageatarateof87%,witharangeof83to90%for a combination of sequence similarity, protein domain individual libraries. Lessthan3 % ofall read pairs aligned searches and GO term assignments were conducted. discordantly,i.e.twopairedreadsalignedtodifferenttran- BLAST searches [50] against the Swiss-Prot and plant scripts. For all but two of the libraries sequenced on both TrEMBL databases [32] were conducted to compare the platforms, the MiSeq reads aligned at a slightly higher green ash PUTs to previously sequenced and annotated rate, about 1.6 % more often, than the reads from the proteins. For transcript sequences, 46 % matched at least HiSeqforthesamelibrary(Fig.1).However,forindividual one Swiss-Prot accession and 63 % matched at least one libraries sequenced on both platforms, the rate of align- plant protein from TrEMBL. The inferred homology re- ment from the MiSeq is correlated to the rate for the sults support the ORF-predictions; 98 % of PUTs with an HiSeq (R2=0.68), confirming that library quality can be ORF matched a known protein while only 36 % of PUTs assessed effectively on a MiSeq platform prior to higher- withoutapredictedORFmatchedaknownprotein.PUTs throughputsequencingonaHiSeq. without a match to known proteins may be non-coding Saturation analysis was carried out to detect the incre- RNAs, genes that have significantly diverged from avail- mental new gene discovery from the addition of new able reference sequences, or erroneous sequence data. RNAlibraries.Ararefactioncurvewasgeneratedbypro- The predicted protein sequences from the ORFs were ducing nine assemblies using subsets of data. The smal- characterized for homology to protein families and do- lest data set assembled was a single RNA library, with mains by InterProScan [36] (Additional file 7), yielding Laneetal.BMCGenomics (2016) 17:702 Page6of16 Table1Samplesforsequencing Tissue Source #Reads Leaves,ambientozonefor7h 6seedlings,pooled 419,064 Leaves,80ppbozonefor7h 6seedlings,pooled 457,118 Leaves,125ppbozonefor7h 6seedlings,pooled 495,728 Leaves,225ppbozonefor7h 6seedlings,pooled 500,118 Leaves,ambientozonefor14days 6seedlings,pooled 450,932 Leaves,80ppbozonefor14days 6seedlings,pooled 427,298 Leaves,125ppbozonefor14days 6seedlings,pooled 507,634 Leaves,225ppbozonefor14days 6seedlings,pooled 483,742 Leaves,ambientozonefor28days 6seedlings,pooled 440,864 Leaves,80ppbozonefor28days 6seedlings,pooled 486,950 Leaves,125ppbozonefor28days 6seedlings,pooled 516,644 Leaves,225ppbozonefor28days 6seedlings,pooled 318,888 Leaves,80ppbozonefor28days,woundingafter28thday,29daystotal 6seedlings,pooled 13,160,726 Leaves,125ppbozonefor28days,woundingafter28thday,29daystotal 6seedlings,pooled 11,583,960 Leaves,225ppbozonefor28days,woundingafter28thday,29daystotal 6seedlings,pooled 12,506,320 Leaves,ambientozonefor28days,woundingafter28thday,29daystotal 6seedlings,pooled 10,723,914 Unstressedleaves 6seedlings,pooled 24,953,504 Unstressedpetioles 6seedlings,pooled 30,230,264 Unstressedroots 6seedlings,pooled 29,378,320 Woundedleaves5h 6seedlings,pooled 24,619,536 Woundedleaves24h 6seedlings,pooled 27,899,560 Woundedpetioles5h 6seedlings,pooled 23,312,492 Woundedpetioles24h 6seedlings,pooled 25,669,498 barkandphloemafterEABfeeding Tree19 23,628,074 barkandphloemcontrol Tree19 52,011,186 barkandphloemafterEABfeeding Tree21 27,585,376 barkandphloemcontrol Tree21 25,016,884 barkandphloemafterEABfeeding Tree22 21,570,304 barkandphloemcontrol Tree22 29,090,224 barkandphloemafterEABfeeding Tree24 26,814,566 barkandphloemcontrol Tree24 26,043,308 barkandphloemafterEABfeeding Tree36 26,342,646 barkandphloemcontrol Tree36 26,495,904 barkandphloemafterEABfeeding TreeSummit 32,053,344 barkandphloemcontrol TreeSummit 59,097,222 Coldstressedleaves(4Cfor24hr,recoveryfor24hr) 6seedlings,pooled 29,361,894 Coldstressedpetioles(4Cfor24hr,recoveryfor24hr) 6seedlings,pooled 15,174,806 Coldstressedroots(4Cfor24hr,recoveryfor24hr) 6seedlings,pooled 21,691,182 Coldstressedleaves(4Cfor24hr) 6seedlings,pooled 15,625,080 Coldstressedpetioles(4Cfor24hr) 6seedlings,pooled 18,679,338 Coldstressedroots(4Cfor24hr) 6seedlings,pooled 18,870,828 Droughtstressedleaves(<1.0Mpa) 6seedlings,pooled 15,984,282 Droughtstressedpetioles(<1.0Mpa) 6seedlings,pooled 20,242,284 Droughtstressedroots(<1.0Mpa) 6seedlings,pooled 18,864,744 Laneetal.BMCGenomics (2016) 17:702 Page7of16 Table1Samplesforsequencing(Continued) Heatstressedleaves(40Cfor24Hr) 6seedlings,pooled 14,167,240 Heatstressedpetioles(40Cfor24Hr) 6seedlings,pooled 13,049,768 Heatstressedroots(40Cfor24Hr) 6seedlings,pooled 13,415,506 Unstressedleaves(controlforwounded) Adulttree 10,829,406 Woundedleaves Adulttree 14,083,066 Unstressed1yearoldtwigs Adulttree 11,524,160 Wounded1yearoldtwigs Adulttree 14,157,548 3-year-oldxylem Adulttree 8,191,208 Seedwings Adulttree 36,521,672 Axialbuds Adulttree 13,411,952 Terminalbuds Adulttree 14,531,836 additional functional information for 45,893 protein se- designedtoflank486oftherepeats(Additionalfile3).Re- quences (87 %). InterProScan also assigned Gene Ontol- peats with primers were cataloged for 431 di-nucleotide ogy (GO) terms; 29,666 proteins (56 %) were assigned at SSRs with a range of eight to 11 motif copies, 54 tri- least one GO term. The GO terms indicate that a variety nucleotide repeats with a range of seven to 12 motif cop- of different genes were captured, with 679 biological ies,andasingletetra-nucleotidewithsixmotifcopies. process, 914 molecular function and 227 cellular compo- nentGOtermsassigned. Expressionacrosstissuesandtreatments Extraction of SSRs yielded a total of 1956 individual ThehighdepthofreadsobtainedduringRNAsequencing SSRs and 5 compound SSRs from 1937 transcript se- enables comparison of transcriptome expression patterns quences. SSRs were relatively rare, with less than 2 % of across different tissues and treatments. Libraries with transcriptsyieldinganSSRandanaverageofoneSSRper fewerthan1millionsequencedreadswereexcludedfrom 44.8 kilobase (kbp) of transcript. Excluding compound this analysis due to possibly insufficient depth to capture SSRs, di-nucleotides were the most common making up rarely expressed transcripts, leaving 43 libraries for ana- 87.7 % of the total. The second most common was lysis.IndividualRNAsampleswerefoundtoexpressfrom tri-nucleotides at 11.9 %. Tetra-nucleotides make up 76,861to99,706PUTswithtwolibrariesfoundasoutliers less than 1 % of the total. Primers were successfully with significantly lower counts: 3-year-old xylem tissue Fig.1Readalignmentsbysequencingplatform.Thepercentofreadsthatsuccessfullyalignedtothefinaltranscriptomeforeachsequencing runrangedfrom83to92%.ReadsproducedfromtheIlluminaMiSeq(bluesquares)wereslightlymorelikelytoalignthanreadsfromthe IlluminaHiSeq(reddiamonds).Ifthesamelibrarywasrunonbothplatforms,thenthetwoexperimentsarelinkedbyaline Laneetal.BMCGenomics (2016) 17:702 Page8of16 Fig.2Genediscoverysaturationcurve.Astep-upmethodofassemblywascompletedinordertodetermineifadditionalsequencingislikelyto yieldnewtranscripts.Forincreasingnumbersofinputreads,thetotalnumberoftranscriptsandproteinscontinuedtoincrease(a).Fortheaverage lengthandN50ofthetranscripts,additionalreadsinducedsmallincreasesfortranscriptswhilepredictedORFswereunchangingafter30millioninput reads(b) with 52,419 expressed PUTs and EAB fed bark and increased or decreased transcript abundance between phloem from Tree19 with48,605expressed PUTs.Fewer stressed tissues and control conditions were determined identified transcripts may be indicative of library prepar- with a cutoff of p<0.05 using the package DESeq2 for ation variation or an actual lower number of genes each type of stress. For the EAB test, the six genotypes expressedbiologically. were sequenced invidually and used as biological repli- Hierarchical clustering of all libraries was performed cates. For the remaining stress tests, six biological repli- to determine which samples had similar expression pat- cates were produced in the greenhouse, and the RNA terns (Fig. 3). The tissue type was found to be the stron- from eachreplicatewaspooledpriortosequencing. gest common element for clustering, indicating that EAB is a primary threat to green ash trees. To under- tissue-specific transcription patterns, even under stress, stand defense responses on a molecular level, six geno- are conserved. Cluster Aincludesmostlyleaftissuesam- types of ash were assessed at two time points: pre-EAB ples, including control, wounding, cold and ozone plus feeding and 8-weeks post-EAB larval hatch and feeding. wounding conditions. Cluster B includes all of the peti- A Wald test of the data found 13,275 differentially ole tissue samples and the heat stressed leaf sample. expressed PUTs, 6884 of which had lower expression Cluster C includes the majority of the bark and phloem after feeding, and 6391 of which had higher expression tissues; the exceptions include Tree 19 samples as well after feeding. This large difference includes the response as the Tree Summit control sample. Cluster D includes to EAB feeding but also includes seasonal changes and primarily root samples, and cluster E includes twig and other environmental factors that impacted the seedlings bud tissue samples exclusively. Interestingly, twigs and duringthe8weeksoffeeding. buds from a mature green ash did not group with the The genotypes utilized for the EAB feeding bioassay bark and phloem samples from two-year-old grafts. were selected for their differing response to EAB; four Cluster Fincludes the majority of ozone stressed leaf tis- were identified as putatively resistant to EAB. These ‘lin- sue samples with a single exception, the ozone stressed gering ash’ trees were found to have a healthy canopy leaf tissue samples with wounding on the 28th day. They after EAB had caused over 95 % mortality of surround- did not group with other leaves, but this may be due to ing ash trees. The remaining two genotypes are both a batch effect; all samples in cluster F have only MiSeq known to be susceptible to EAB. For statistical analysis, data, resulting in overall lower depth than all other the resistant genotypes represented four biological repli- samples sequenced. cates and the susceptible genotypes were used as two biological replicates. This allows for an additional com- Differentialexpressionanalysis parison of interest, i.e. the difference in response by four The RNA sequencing of both control and stressed green EAB-resistant green ash genotypes in comparison to two ash tissues enables an initial inquiry into the regulation EAB-susceptible genotypes. A principal components of transcripts under each of six types of stress: EAB analysis (PCA) of the PUT expression patterns among feeding, cold, drought, heat, mechanical wounding and the twelve samples suggests that susceptibility has a de- ozone (Table 2). Statistically significant changes of either tectable association with expression; 63 % of variance Laneetal.BMCGenomics (2016) 17:702 Page9of16 Fig.3Hierarchicalclusteringoftissues.The55greenashRNAsampleswereclusteredbynormalizedreadcountsacrossallPUTs.Clusters(a–e)highlight groupsofsamplesoriginatingfromsimilartissuesand/orexperimentaltreatments across samples corresponds to treatment. A secondary genotypes post-feeding to identify active plant defense re- variance component of 13 % separates resistant versus sponse patterns. In post-feeding samples, 580 PUTs are susceptiblegenotypes(Fig. 4). lessexpressedand545PUTsaremoreexpressedinresist- Weperformeda testforexpressiondifferencesbetween ant genotypes. The candidate PUTs identified in this ex- resistant and susceptible genotypes before larval feeding periment may prove useful for further studies regarding todetermineifthetree’stranscriptomepatternsarediffer- themolecularmechanismsofnaturalresistancetoEAB. entpriortoinsectexposure.Forpre-feedingsamples,899 Plant response to insect herbivory is complex and may PUTs are down-regulated and 742 PUTs are up-regulated be induced by detection of nonself compounds as well in the resistant genotypes relative to the susceptible. We as by signals sent from damaged cells [51]. Tissue dam- also tested for differences in resistant and susceptible age may also be caused wind, hail, and other mechanical Laneetal.BMCGenomics (2016) 17:702 Page10of16 Table2DifferentialexpressionandGOtermenrichmentresults Transcriptsincreasinginexpression Transcriptsdecreasinginexpression Test #transcripts #EnrichedGO #transcriptswith #transcripts #EnrichedGO #transcriptswith plantslimterms putativefunction plantslimterms putativefunction EAB-ControltissuesvsInfestedTissues 6391 12 5346 6884 10 5534 EAB-SusceptiblevsResistant,Pre-EABFeeding 750 5 460 899 2 497 EAB-SusceptiblevsResistant,Post-EABFeeding 545 1 336 580 2 351 Cold-stressedtissuesvscontroltissues 3196 9 2336 456 0 342 Drought-stressedtissuesvscontroltissues 13 2 12 82 7 66 Heat-stressedtissuesvscontroltissues 502 7 386 1114 8 984 Mechanically-woundedtissuesafter5hvs 237 5 208 252 3 217 controltissues Mechanically-woundedtissuesafter24hvs 307 1 244 653 3 544 controltissues Tissuesat4levelsofozoneacross3timepointsa 350a 15a 342a a Statisticaltestswereconductedforeachstressconditiontodeterminegeneswithincreasedordecreasedexpression(adjustedp-value<0.01).Thesegeneswere assessedforsharedbiologicalprocessesormolecularfunctionsviaontologyenrichmentbasedonthesubsetofGOplantslimterms aTranscriptsresponsivetoozone.Alikelihoodratiotestwasusedtoidentifyanytranscriptsresponsivetoozonetreatmentacrossmultipletimepoints,allowing PUTswithmorecomplexpatterns,forexampleinitiallyupregulated,thendownregulated,tobeincluded factors. To explore ash response specifically to wound- PUTS increasing in expression and 82 PUTs decreasing. ing, experiments were conducted on leaves, petioles and Heat stress induced up-regulation for 502 PUTs and twigs. Experimentation included single biological repli- down-regulation for 1114 PUTs. Cold stress induced the cates of tissues (leaf, twig) on an adult tree 24 h after mostchangesof the threestresstypes, with3196increas- damage and six biological replicates of seedling tissues ing PUTs and 456 decreasing. An additional oxidative (petiole, leaf) 5 and 24 h after being damaged. Using all stress, increasing ozone, was assayed only for leaves, but tissues types in the Wald statistical test, we identified sampled across three time points (7 h, 14 days, 28 days) genes with significantly different abundances correlated and four ozone concentrations (atmosphere, 80 ppb, to mechanical wounding after 5 h and after 24 h. For 5 h 125 ppb, 225 ppb). We used a likelihood ratio test (LRT) post mechanical wounding, we found 237 up-regulated to identify 350 PUTs that responded to changes in ozone PUTs and 252 down-regulated PUTs. Additional dif- levels. Four experimental treatments involved a combin- ferentially expressed PUTs were found after 24 h: 307 ation ofmechanicalwoundingand ozone stress; neither a up-regulated and 653 down-regulated. Some genes Wald test nor a LRT yielded statistically significant gene were identified as differentially expressed at both time associations for these samples versus control tissues. This points: 109 genes were down regulated at both time is surprising given that each treatment independently ponits and 16 were up regulated at both time points. showeddifferentialgeneexpression.Possibly,crosstalkbe- tween ozonestressand mechanical woundingdiffers with Climatechangestressors different ozone levels, and the differences in gene expres- The alterations of Earth’s climate will impose increased sion response in each library led to a lack of statistical abiotic stresses with implications for the adaptation and power in detecting additional differentially expressed survival of ash species. We conducted analyses of four genesafterwounding. stressorsexpectedtoincreaseinseverityinnativeforests as part of climate change: heat, cold, drought, and ozone GOtermenrichmentacrossstressresponses (which also serves as a general oxidative stress for which Foreach listofdifferentially expressedPUTsfrom differ- accurate dose-response investigations can be conducted ent stress types, GO term enrichment was performed in under controlled conditions). Experiments were con- order to identify molecular pathways and processes in- ductedwithsixbiologicalreplicatespercondition:control, volved in stress response in green ash (Fig. 5). Expected heat, cold and drought conditions acrossleaf, petiole, and functions were found in many experiments, such as root tissues of green ash seedlings. For all statistical tests genes in the category “response to abiotic stimulus” dur- ofdifferentialgeneexpression,thethreetissueswerecon- ing drought and cold and “increased response to stress” sideredtogethertoprovideincreasedstatisticalpowerand under mechanical wounding and heat stress conditions. todiscover transcripts implicated in whole plant response Many GO terms were identified in more than one stress to stress. Drought produced significantly fewer differen- condition, indicating overlap in the stress response path- tially expressed PUTs than the other stressors, with 13 ways despite different stimuli. Unfortunately, many of
Description: