RNA-Seq Analysis and De Novo Transcriptome Assembly of Jerusalem Artichoke (Helianthus tuberosus Linne)

Won Yong Jung1,2,†, Sang Sook Lee1,†, Chul Wook Kim2, Hyun-Soon Kim1, Sung Ran Min1, Jae Sun Moon1, Suk-Yoon Kwon1, Jae-Heung Jeon1*, Hye Sun Cho1*

1 Plant Systems Engineering Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea, 2 Animal Material Engineering, Gyeongnam National University of Science and Technology, Jinju, Korea

Abstract

Jerusalem artichoke (Helianthus tuberosus L.) has long been cultivated as a vegetable and as a source of fructans (inulin) for pharmaceutical applications in diabetes and obesity prevention. However, transcriptomic and genomic data for Jerusalem artichoke remain scarce. In this study, Illumina RNA sequencing (RNA-Seq) was performed on samples from Jerusalem artichoke leaves, roots, stems and two different tuber tissues (early and late tuber development). Data were used for de novo assembly and characterization of the transcriptome. In total 206,215,632 paired-end reads were generated. These were assembled into 66,322 loci with 272,548 transcripts. Loci were annotated by querying against the NCBI non-redundant, Phytozome and UniProt databases, and 40,215 loci were homologous to existing database sequences. Gene Ontology terms were assigned to 19,848 loci, 15,434 loci were matched to 25 Clusters of Eukaryotic Orthologous Groups classifications, and 11,844 loci were classified into 142 Kyoto Encyclopedia of Genes and Genomes pathways. The assembled loci also contained 10,778 potential simple sequence repeats. The newly assembled transcriptome was used to identify loci with tissue-specific differential expression patterns. In total, 670 loci exhibited tissue-specific expression, and a subset of these were confirmed using RT-PCR and qRT-PCR. Gene expression related to inulin biosynthesis in tuber tissue was also investigated. Existing genetic and genomic data for H. tuberosus are scarce.
The sequence resources developed in this study will enable the analysis of thousands of transcripts and will thus accelerate marker-assisted breeding studies and studies of inulin biosynthesis in Jerusalem artichoke.

Citation: Jung WY, Lee SS, Kim CW, Kim H-S, Min SR, et al. (2014) RNA-Seq Analysis and De Novo Transcriptome Assembly of Jerusalem Artichoke (Helianthus tuberosus Linne). PLoS ONE 9(11): e111982. doi:10.1371/journal.pone.0111982

Editor: Hao Sun, The Chinese University of Hong Kong, Hong Kong

Received May 9, 2014; Accepted October 9, 2014; Published November 6, 2014

Copyright: © 2014 Jung et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The authors confirm that all data underlying the findings are fully available without restriction. All relevant data are within the paper and its Supporting Information files.

Funding: This work was supported by KRIBB Research Initiative Program, The Cabbage Genomics assisted breeding supporting center (CGsC) research programs funded by the Ministry for Food, Agriculture, Forestry and Fisheries of the Korean Government, The Next Generation of BioGreen21 Project, The National Center for GM Crops (PJ009043) from RDA to HSC, and Bio-industry Technology Development Program (No. 310006-5), Ministry for Food, Agriculture, Forestry and Fisheries, Republic of Korea to J-HJ. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have declared that no competing interests exist.

* Email: [email protected] (J-HJ); [email protected] (HSC)

† These authors contributed equally to this work.

PLOS ONE | www.plosone.org | November 2014 | Volume 9 | Issue 11 | e111982

Introduction

The sunflower species Jerusalem artichoke (Helianthus tuberosus L.), in the family Asteraceae of the order Asterales, has been cultivated as a vegetable, a fodder crop, and a source of inulin for food and industrial purposes [1–4]. Jerusalem artichoke, which has been cultivated since the 17th century, can grow well in nutritionally poor soil and has good resistance to frost and plant diseases [5,6]. In the early 1900s, systematic breeding programs began to explore the use of H. tuberosus tubers for industrial applications such as the production of ethanol [4]. Jerusalem artichoke is a hexaploid with 102 chromosomes (2n = 6x = 102) [7] that is thought to have originated in the north-central U.S., although the exact origins remain a subject of debate [8,9]. Despite its cultural and economic significance, few studies have investigated the genetic origins of Jerusalem artichoke and its various cultivars. A recent study assessed the origin of Jerusalem artichoke using genome skimming [10], a new technique for assembling and analyzing the complete plastome, partial mitochondrial genome, and nuclear ribosomal DNA genomes. This analysis showed that the genome of Jerusalem artichoke was not derived from Helianthus annuus (an annual) but instead originated from perennial sunflowers through hybridization of the tetraploid Hairy Sunflower (Helianthus hirsutus) with the diploid Sawtooth Sunflower (Helianthus grosseserratus) [11,12]. These results indicate that H. tuberosus is an alloploid species, having a set of chromosomes from each progenitor and double the chromosome number of the two parental species.

Many members of the Asteraceae family accumulate fructans (fructose polymers) in underground storage organs [13]. One such fructan is inulin, which is stored in the vacuole in approximately 15% of flowering plant species [14]. Jerusalem artichoke and chicory (Cichorium intybus L.) are the most important cultivated sources of inulin [15–17]. Inulin molecules are much smaller than starch molecules, and have 2–70 linked fructose moieties terminated by a glucose residue [7]. The average number of fructose subunits depends on the species, production conditions, and developmental timing [18]. Inulin has many uses in the production of food [19,20] and pharmaceuticals [21–23], and can be used as a storage carbohydrate for bioethanol production [24]. The inulin produced by Jerusalem artichoke is therefore a commercially valuable resource [7].

Recent advances in next-generation sequencing technology have enabled gene discovery, analysis of gene content, and measurement of gene expression in non-model organisms that lack a published genome sequence. For example, transcriptome sequencing can be used for genome-wide determination of absolute transcript levels, identification of transcripts, and delineation of transcript structure (including 5′ and 3′ ends, introns, and exons) [25–28]. Transcriptome sequencing can also identify genetic variations such as single nucleotide polymorphisms (SNPs) and simple sequence repeats (SSRs) [29]. In recent years, RNA-Seq analysis has facilitated transcriptome characterization in hundreds of plant species lacking sequenced genomes [30–34].

In this study, we used RNA-Seq technology to develop the first H. tuberosus transcriptome dataset. De novo transcriptome sequencing was performed on RNA from five different H. tuberosus tissues. We identified 66,322 loci, annotated 40,215 loci, and mapped 11,844 loci to 237 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. We also identified 670 tissue-specific candidate loci and 10,778 SSRs. This novel dataset will be an important resource in the further genetic characterization of Jerusalem artichoke and will be particularly valuable in marker-assisted breeding and investigation of traits related to inulin biosynthesis.

Materials and Methods

2.1 Plant Materials and RNA Isolation

A widely-cultivated Jerusalem artichoke cultivar, Purple Jerusalem Artichoke (PJA), was used for transcriptome analysis. PJA tubers were planted in January 2012 and were grown under normal conditions until harvesting. Stems, leaves, and tubers (stages 1 and 2; tuber1 and tuber2, respectively) were collected 6 months after planting. To avoid contamination with pathogens, roots were collected from in vitro-cultivated PJA. Tissues were snap-frozen in nitrogen upon harvest and were stored at −80 °C until further processing. Total RNAs were extracted using Trizol Reagent (Invitrogen, Carlsbad, CA, USA), and were then treated with DNase I (Fermentas, Pittsburgh, PA, USA) according to the manufacturers' instructions. The OD260/230 ratio was determined using a NanoDrop ND-1000 Spectrophotometer (Thermo Fisher Scientific, Wilmington, DE, USA) and was used for assessment of RNA quality and purity.

2.2 Transcriptome Sequencing

An equal amount of total RNA from each tissue was pooled for transcriptome sequencing in order to obtain a comprehensive range of transcripts. Poly(A)+ RNAs were purified from the pooled total RNA (20 µg) using oligo(dT) Dynabeads. Impurities were removed from the hybridized sample using a series of low-salt washes. First-strand cDNAs were synthesized using oligo(dT) primers. RNA was then degraded with RNase H (Invitrogen, Carlsbad, CA, USA) and second-strand cDNAs were synthesized using DNA polymerase I (New England BioLabs, Ipswich, MA, USA). Double-stranded cDNAs were randomly fragmented using a nebulizer. The fragments were then repaired and extended at the 3′ end by addition of a single adenine, and different adapters were ligated to the 5′ and 3′ ends. The ligated fragments were separated on a gel, and fragments of approximately 200 bp were isolated. After amplification by polymerase chain reaction (PCR), fragments were separated using electrophoresis, purified, and subjected to Illumina HiSeq2000 sequencing. Raw sequence data were generated by the Illumina analysis pipeline. Sequence data are deposited in the NCBI Sequence Read Archive (SRA, http://www.ncbi.nlm.nih.gov/Traces/sra) under study number PRJNA258432.

2.3 De novo Transcriptome Assembly

Raw sequence data were filtered using standard RNA-Seq parameters. Briefly, low-quality and N-base reads were trimmed from the raw reads and reads were filtered by Phred quality score (Q ≥ 20 for all bases) and read length (≥ 25 bp). The 3′ ends of the clean reads were trimmed to form five sets of reads from the five different tissues. These datasets were then pooled and assembled using de novo assemblers (Velvet v1.2.07 [35] and Oases v0.2.08 [36]) based on the de Bruijn graph algorithm. Reads were assembled into contigs at distinct k-mer values (45, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69 and 75) using Velvet. Contigs at each k-mer value were assembled into transcripts using Oases. Finally, the transcripts assembled at k-mer values 63 and 65 were merged using Oases with a minimum length of 200 bp and other default settings. Hash length (k-mer = 65) was considered for selection of the optimal de novo assembly as described previously [37]. The cleaned reads were also assembled using Trinity release_2011-11-26 [38] with a k-mer of 25 and a minimum k-mer coverage of 1. Default settings were used for all other parameters. The performance of the two assembly tools was assessed by N50 value, mean length, maximum length and transcript number. Datasets produced using Velvet-Oases were selected for subsequent analyses. Singletons and the longest sequence in each cluster were designated as loci and were then translated in all six frames. Putative transcripts were validated by comparison with gene sequences in the Phytozome database (http://www.phytozome.net/) using BLASTX (E-value ≤ 1E-05, BLAST v. 2.2.28+). In addition, the assembled loci were compared with expressed sequence tag (EST) sequences from H. tuberosus (a total of 40,388 ESTs) and H. annuus (a total of 134,474 ESTs) in NCBI GenBank (ftp://ftp.ncbi.nih.gov/pub/TraceDB/helianthus_tuberosus/ and http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=4232, respectively) using BLASTN [39] with an E-value cut-off of 1E-20.

Table 1. Summary of H. tuberosus de novo assembly using Velvet-Oases.

                                    Locus      Transcripts
Number of sequences                 66,322     272,548
Sequence statistics
  Minimum                           200        200
  Maximum                           15,368     16,437
  Mean length                       761        1,176
  N50                               1,249      1,703
Distribution of sequence lengths
  ≤ 500 bp                          36,383     79,718
  501–1,000 bp                      13,926     68,089
  1,001–1,500 bp                    7,027      47,719
  1,501–2,000 bp                    4,216      33,486
  ≥ 2,001 bp                        4,770      43,536

doi:10.1371/journal.pone.0111982.t001

Figure 1. Comparison of assembled H. tuberosus loci with database sequences. Species, E-value, and similarity distributions of the assembled loci against database sequences are shown. (A) Species distribution of the top BLAST hits for the assembled loci (cut-off, E-value = 0). (B) E-value distribution of BLAST hits for the assembled loci (E-value ≤ 1.0e-05). (C) Similarity distribution of BLAST hits for the assembled loci.
doi:10.1371/journal.pone.0111982.g001

2.4 Functional Annotation and Classification

BLASTX and Blast2GO software v2.4.4 [40] were used to compare the assembled loci (≥ 200 bp) to the Nr, Phytozome, and UniProt databases at a threshold E-value ≤ 1.0E-05. For Gene Ontology analysis, the gene ontology (GO) database (http://www.geneontology.org/) was downloaded and the assembled loci were annotated to the GO database using BLASTP (E-value ≤ 1.0E-06). GO term annotation was determined using GO classification results from the Map2Slim.pl script [37]. Protein sequences with the highest sequence similarities and cut-offs were retrieved for analysis. Further functional enrichment analysis was carried out using DAVID [41,42] and AgriGO (plant GO slim, FDR ≤ 0.01) [43]. Gene lists were annotated by TAIR ID, and were analyzed with default criteria (counts ≥ 2 and EASE score ≤ 0.1) for GO terms [44], Clusters of Eukaryotic Orthologous Groups (KOGs) [45], and KEGG pathways [46]. In addition, KEGG pathways were assigned to the locus sequences using the single-directional best hit method on the KEGG Automatic Annotation Server [47,48].

Coding sequences were predicted through BLAST comparisons with public protein databases. Sequences were compared with the Phytozome and Nr protein databases using BLASTX (E-value ≤ 1.0E-5). Loci that matched sequences in the Phytozome database were not examined further. Coding sequences were derived from loci sequences according to BLASTX outcomes (≥ 200 bp). In addition, full-length transcripts were predicted using BLASTP with the following parameters to ensure similarity of transcripts: orthologous gene of 99% similarity, minimum 90% identity.

Table 2. Summary of annotations of assembled H. tuberosus sequences.

Assembly             Number of transcripts   Number of loci
Total sequences      272,548                 66,322
Phytozome            185,145                 32,746
Nr                   41,334                  32,156
Uniprot              208,271                 38,668
KOG                  95,980                  15,434
KEGG                 35,180                  11,844
TAIR                 153,531                 28,787
GO                   120,320                 19,848
Total annotations    192,965                 40,215

doi:10.1371/journal.pone.0111982.t002

2.5 Analysis of Differential Expression and Tissue-Specific Loci

Five mRNA libraries were generated from separate tissues using Illumina sequencing. Reads for each sequenced tag were mapped to the assembled loci using Bowtie (mismatch ≤ 2 bp, other parameters as default), and the number of clean mapped reads for each locus was counted. The DEGseq package [49] was used to identify differentially expressed genes. The five different libraries were compared pairwise using a greater than two-fold difference as the criterion for differential expression. Significant differential expression between tissues was defined by p-value < 0.001, FDR < 0.01, and log2 > 2. Differential expression analysis between tissues was used to identify candidate loci with tissue-specific expression, and to determine functionally enriched loci, as described above.

Tissue-specific loci were selected based on the read counts from leaf, root, stem, tuber1 and tuber2 samples of H. tuberosus. Tissue-specific candidates were those with > 200 reads from the target tissue and < 50 reads from other tissues.

2.6 Identification of SSRs

SSRs were detected using the MIcroSAtellite Identification Tool (MISA, http://pgrc.ipk-gatersleben.de/misa/) Perl script. The assembled unigene sequences were screened for mono-, di-, tri-, tetra-, penta- and hexa-nucleotide repeat motifs with a minimum repeat number of 10, 6, 5, 5, 5 and 5, respectively. A maximum distance of 100 nucleotides was allowed between two SSRs.

2.7 Reverse Transcription (RT) and Quantitative Reverse Transcription (qRT) PCR Analysis

Total RNA was isolated from five H. tuberosus tissues using RNAiso Plus (Takara, Tokyo, Japan). The cDNAs were synthesized with M-MLV reverse transcriptase and an oligo(dT) primer in a 20 µL volume according to the manufacturer's instructions (Invitrogen, Carlsbad, CA, USA). Twenty putative tissue-specific genes (five per tissue type) were selected for RT-PCR. Quantitative RT-PCR was performed in 10 µL reactions containing gene-specific primers, 1 µL cDNA as template, and SYBR Premix Ex Taq. Reactions were performed using a CFX96 Real-Time PCR system (BioRad, Hercules, CA, USA). The thermal profile for qRT-PCR was as follows: 3 min at 95 °C, followed by 40 cycles each consisting of 95 °C for 25 sec, 60 °C for 25 sec and 72 °C for 25 sec. Primer specificities and the formation of primer-dimers were monitored by dissociation curve analysis. The expression level of H. tuberosus Actin2 (HtActin2) was used as an internal standard for normalization of cDNA template quantity. RT-PCR and qRT-PCR reactions were performed in triplicate.

Figure 2. Gene Ontology (GO) classification of the assembled loci. The results of BLASTX searches against the Phytozome database were used for GO term mapping and annotation. The number and ratio of sequences assigned to level 2 GO terms from the GO subcategories biological process, molecular function, and cellular component are shown (BP: Biological Process, CC: Cellular Component, MF: Molecular Function).
doi:10.1371/journal.pone.0111982.g002

Results and Discussion

3.1 RNA-sequencing and de novo Transcriptome Assembly of H. tuberosus

Total RNAs were isolated from five different tissues of the PJA cultivar: leaves, stems, roots, tuberous initial stage 1 (tuber1) and mature stage 2 (tuber2). The extracted RNAs were then mixed in equal proportions for mRNA isolation, fragmentation, cDNA synthesis, and sequencing. RNA sequencing with the Illumina HiSeq2000 produced 244,101,906 paired-end 101 bp reads corresponding to more than 24.4 billion base pairs of sequence. The raw reads were subjected to quality control using FastQC, and reads were trimmed (Table S1). The total number of high-quality reads was 206,215,632, and these contained a total of 16,675,072,220 nucleotides. Of these, 68.37% reached a strict quality score threshold of Q ≥ 20 bases and read length ≥ 25 bp, and these were used for de novo assembly [31].

The clean RNA-Seq reads were assembled de novo into contigs using two assemblers with optimal parameters. First, the reads were assembled using Velvet-Oases (k-mer = 65) [35,36] to reduce redundancy and generate longer sequences: 66,322 loci and 272,548 transcripts with lengths ≥ 200 bp were produced. Second, the reads were assembled using the Trinity program [38]: 246,155 transcripts with lengths ≥ 200 bp were produced. A comparison of transcript length distribution between the two assemblies is shown in Figure S1. Overall, the mean length, maximum length, and N50 were longer for the Velvet-Oases assembled sequences than for the Trinity assembled sequences and we therefore used the Velvet-Oases assembly for subsequent analyses.

The sequences assembled by Velvet-Oases were ≥ 200 bp and had an average length of 761 bp (a total of 4,083,193,637 bp), N50 length of 1,249 bp, and maximal length of 15,368 bp. Transcript sequences were also ≥ 200 bp and had an average length of 1,176 bp (a total of 16,675,072,220 bp), N50 length of 1,703 bp, and maximal length of 16,437 bp (Table 1). A substantial number of transcripts (124,741) had lengths > 1 kb. These transcripts were clustered, resulting in 66,322 loci that included 16,013 loci (24.1%) > 1 kb in length (Table 1). The assembled sequences are deposited at http://112.220.192.2/htu and are summarized in Table S2. In summary, we generated genome-wide locus sequences of H. tuberosus, a resource that will promote functional genomics approaches in Jerusalem artichoke.

Figure 3. Eukaryotic Orthologous Groups (KOG) classification of the assembled loci. Of 66,322 loci with Nr, Phytozome and UniProt hits, 15,434 sequences with significant homologies in the KOG database (E-value ≤ 1.0E-5) were classified into 25 categories.
doi:10.1371/journal.pone.0111982.g003

3.2 Validation of Assembled Loci Against Publicly Available ESTs from H. tuberosus

We used publicly available EST data to validate the loci identified by our RNA-Seq and assembly. Sequence information for ESTs from H. tuberosus was retrieved from the NCBI GenBank database (most recently accessed in January, 2014). BLASTN analysis of the assembled loci was performed against the H. tuberosus ESTs (40,388 ESTs) and the best hit for each locus was selected. Of the H. tuberosus ESTs, 35,402 sequences (87.65%) matched a locus from our assembly, but no match was found for 4,986 ESTs (12.35%). Most of the loci with hits matched the ESTs with good coverage and assembly quality (Figure S2A). Of our 66,322 loci, 52,174 loci showed no BLAST hits to the H. tuberosus ESTs and were thus considered to be putative transcripts newly identified by our RNA-Seq analysis.

Transcriptome information is not available for the direct progenitors of H. tuberosus, Helianthus hirsutus and Helianthus grosseserratus; however, a curated unigene collection for sunflower (Helianthus annuus L.) was recently generated by EST assembly analysis [50]. We used BLASTN to compare our assembled H. tuberosus loci against the ESTs of H. annuus and found that 81.04% of H. annuus ESTs (108,984 out of 134,474) had matches among the H. tuberosus loci (Figure S2B).

3.3 Functional Annotation of H. tuberosus Loci

After filtering out short-length and low-quality sequences, we used our assembled locus sequences to perform similarity searches against public protein databases (Phytozome [51], Nr [52], and UniProt [53]). Firstly, we searched all six frame translations of our loci against the Phytozome protein database using BLASTX (E-value ≤ 1.0E-05). Database matches were found for 32,746 loci (49.4%). The unmatched loci were further analyzed against the NCBI non-redundant (Nr) and UniProt databases. Additionally, databases were searched using BLASTN and BLASTX to identify homologous genes. Overall, 40,215 loci (60.64%) matched significantly similar sequences within the databases. The 39.36% of sequences (26,107 loci) without hits may represent novel loci specific to H. tuberosus. Alternatively, these sequences may have been too short to produce significant hits. Similar search outcomes have been observed in previous non-model plant studies [54–56] (Table 2). Based on the top BLASTX hits against the Phytozome database, H. tuberosus loci were most similar to sequences from Vitis vinifera (3,556 loci, 12.02%) followed by Solanum tuberosum (2,869 loci, 9.7%) and Solanum lycopersicum (2,500 loci, 8.45%) (Figure 1A). The E-value distribution of the top matches showed that 23.52% of the sequences had extremely strong matches (E-value = 0) and 76.48% of the homologous sequences had values in the range 1.0E-05 to 1.0E-180 (Figure 1B). The similarity distribution showed that 18.93% of these sequences had similarities greater than 80%, 42.21% had similarities of 60%–80%, and 38.86% had similarities < 60% (Figure 1C).

Loci with matches in the protein databases were examined further. The translated coding sequences of these loci had ≥ 90% identity with the matched sequences. Of the annotated 40,215 loci, 10,066 contained a putative full-length transcript (with 3′ and 5′ untranslated regions). BLAST analysis using those loci indicated that information from other species was sufficient to allow annotation of the H. tuberosus loci.

Figure 4. Kyoto Encyclopedia of Genes and Genomes (KEGG) classification of the assembled loci. Locus sequences were compared using BLASTX with an E-value cut-off ≤ 1.0E-05 against the KEGG biological pathways database. The loci were mapped to 237 KEGG pathways. M: Metabolism, GIP: Genetic Information Processing, EIP: Environmental Information Processing, CP: Cellular Processes, OS: Organismal Systems.
doi:10.1371/journal.pone.0111982.g004

3.4 Classification of H. tuberosus Loci

We used GO term enrichment analysis to classify the functions of the assembled H. tuberosus loci [44]. The BLASTX similarity search results for the 66,322 H. tuberosus loci were imported into the Phytozome database for GO mapping and annotation with TAIR information. Sequence annotations associated with 19,848 loci (29.93%) were categorized into the three main GO ontologies: biological process (BP), cellular component (CC), and molecular function (MF) (Figure 2). In total, 7,589, 8,685 and 8,510 loci were assigned GO terms from the BP, CC, and MF categories, respectively. The GO terms were summarized into 49 subcategories with GO classifications at level 2. In the BP category, the dominant subcategories assigned to H. tuberosus loci were as follows: 'Primary metabolic process' (15.19%), 'Cellular metabolic process' (14.75%), 'Response to stress' (6.76%), 'Nitrogen compound metabolic process' (5.33%) and 'multicellular organismal development' (4.08%). In the CC category, 'Cell part' (21.61%), 'Intracellular' (13.81%), 'Intracellular part' (13.49%), 'Intracellular organelle' (12.33%), and 'Membrane-bounded organelle' (11.89%) were the dominant subcategories. Finally, 'Nucleotide binding' (22.12%), 'Protein binding' (20.45%), 'Nucleoside binding' (18.94%), 'Transferase activity' (15.80%), and 'Hydrolase activity' (12.17%) were dominant in the MF category. These annotations indicated that extensive membrane metabolic activity occurred in H. tuberosus in the sampled tissues.

The loci were analyzed further for GO-category enrichment relative to Plant GO slim categories using AgriGO [43]. The H. tuberosus loci contained 71 significantly enriched (FDR ≤ 0.01) functional GO terms in the BP category, including the top five terms ("cellular process", GO:0009987; "cellular metabolic process", GO:0044237; "metabolic process", GO:0008152; "primary metabolic process", GO:0044238; and "response to stimulus", GO:0050896). The GO term "cellular, macromolecule, nitrogen compound and primary metabolic process" was highly enriched (FDR ≤ 1.0E-40), and enriched daughter terms included "nucleobase, nucleoside, nucleotide and nucleic acid metabolic process" (GO:0006139), "cellular macromolecule metabolic process" (GO:0044260), "macromolecule modification" (GO:0043412), "carbohydrate metabolic process" (GO:0005975; including several loci with fructan 1,2-beta-fructan 1-fructosyltransferase, invertase, hexokinase, sucrose synthase, sucrose phosphate synthase, starch synthase, starch branching enzyme, and beta glucosidase sequences), and "cellular biosynthetic process" (GO:0044249; sucrose 1F-beta-D-fructosyltransferase). These results suggest that gene expression in H. tuberosus is geared towards carbohydrate metabolism, cellular biosynthetic processes, and macromolecule modification functions. This expression enrichment concurs with biosynthetic analysis results indicating that inulin accumulation occurs at the time of tuber initiation [4,19]. An additional enriched GO term was "protein modification process" (GO:0006464). This included loci with cyclophilin, FKBP-type peptidyl-prolyl cis-trans isomerase, CONSTANS-like 4, heat shock protein 7, chaperone, and transferase sequences.

In the MF category, loci were associated with 16 significantly enriched GO terms. These included the level two terms "catalytic activity" (GO:0003824), "binding" (GO:0005488), "transporter activity" (GO:0005215), and "receptor activity" (GO:0004872), the level three terms "protein binding" (GO:0005515), "transferase activity" (GO:0016740), and "hydrolase activity" (GO:0016787), and the level four terms "transferase activity, transferring phosphorus-containing groups" (GO:0016772) and "hydrolase activity, acting on acid anhydrides" (GO:0016817, including several fructosyltransferase loci). The most significantly enriched of these was the level two term "catalytic activity". In the CC category, the GO terms "cytoplasmic part" (GO:0044444), "intracellular membrane-bounded organelle" (GO:0043231), "intracellular organelle part" (GO:0044446) and their daughter terms ("plastid", "Golgi apparatus", "cytosol" and "vacuole") were highly enriched (FDR ≤ 1.0E-60). These enrichments correspond with the involvement of storage organelles in tuber inulin accumulation. The "vacuole" term was also found to be significantly enriched in tuber samples. The H. tuberosus annotation results were similar to those from the potato and sweet potato transcriptomes [57–60]. The majority of the sequenced H. tuberosus loci were associated with fundamental regulatory and metabolic processes in the membrane.

To assess the functionality of the H. tuberosus transcriptome, the annotated loci were matched to the Eukaryotic Orthologous Groups (KOGs) database to find homologous genes. The search outcomes were used to determine sequence directions within loci [45]. Of the 66,322 loci, 15,434 were assigned KOG terms in 25 classifications (Figure 3). Each KOG term represents a conserved domain; therefore, these results indicated that a large proportion of the putative proteins encoded by the assembled locus sequences had protein domains with existing functional annotations [45]. The cluster for 'General function' prediction (19.77%) was the most frequently identified group, followed by 'Signal transduction mechanisms' (16.34%), 'Post translational modification, protein turnover, chaperones' (7.37%), 'Function unknown' (7.03%), 'Transcription' (6.78%), 'Carbohydrate transport and metabolism' (5.53%), and 'Secondary metabolites biosynthesis, transport and catabolism' (3.67%).

In addition, to identify active biochemical pathways, we mapped the H. tuberosus loci onto the KEGG pathways using BLASTX and the KEGG Automatic Annotation Server [47,48]. KO identifiers were assigned to 11,844 loci, using the KEGG orthology that contains 4,531 Enzyme Codes [46]. A number of KEGG pathways (237) were associated with > 5 loci. The prevalent pathways represented were 'Ribosome' (408 loci), 'Plant hormone signal transduction' (365 loci), 'Plant-pathogen interaction' (365 loci), 'Protein processing in endoplasmic reticulum' (354 loci), 'Spliceosome' (329 loci), 'Neurotrophin signaling pathway' (285 loci), and 'Starch and sucrose metabolism' (276 loci) (Table S3). The numbers of sequences associated with subcategories in the top five KO categories are shown in Figure 4. Among the identified functional categories, 'Signal transduction' (1,252 loci), 'Translation' (1,029 loci), 'Carbohydrate metabolism' (1,023 loci), and 'Folding, sorting and degradation' (913 loci) were the most highly represented. These results showed that loci involved in processing of genetic information, pathogen resistance, and carbohydrate metabolism were active in H. tuberosus in the sampled tissues. The KEGG annotations provided valuable information for investigation of metabolic processes, functions and pathways involved in H. tuberosus metabolism.

Figure 5. Loci differentially expressed between tissues in H. tuberosus. Loci were quantified and up- and down-regulated loci are shown as black and grey bars, respectively. Pairwise comparisons between tissues are shown.
doi:10.1371/journal.pone.0111982.g005

Figure 6. qRT-PCR validation of loci expressed specifically in five H. tuberosus tissues. The qRT-PCR results of root-specific (A), stem-specific (B), leaf-specific (C), and tuber-specific (D) candidate loci are shown.
doi:10.1371/journal.pone.0111982.g006

3.5 Identification of Differentially Expressed Loci using RNA-Seq Data

RNA-Seq data were used for the identification of differentially expressed genes (DEGs) in different H. tuberosus tissues. More than 4.8 million raw reads were obtained from the libraries for each tissue (roots, stems, tuber1, tuber2, and leaves) (Table S1). To create a unified library, the reads were normalized by the total read count for gene expression in each tissue library (Figure S3). Next, Likelihood Ratio Tests were used to correct p-values, and libraries were median normalized. DEGs were identified using the

Table 3. Identification of genes involved in inulin biosynthesis in H. tuberosus.

Enzyme                                                    EC number
Hexokinase                                                2.7.1.1
Sucrose Phosphate Synthase                                2.4.1.14
Sucrose Synthase                                          2.4.1.13
Sucrose Phosphate Phosphatase                             3.1.3.24
Fructan:fructan 1,2-beta-fructan 1-fructosyltransferase   2.4.1.100
Sucrose:sucrose 1F-beta-D-fructosyltransferase            2.4.1.99
Sucrose 6-fructosyltransferase                            2.4.1.10
Fructan 1-exohydrolase IIa                                3.2.1.153
Soluble acid Invertase                                    3.2.1.26

Read Count (log2)
Locus ID   Root    Stem    Tuber2   Tuber1   Leaf
01162      10.41   9.97    8.38     8.67     9.46
05274      10.28   8.45    7.84     8.06     9.79
07028      7.97    10.24   8.56     8.42     9.00
12657      7.77    6.55    5.70     6.29     6.58
49519      8.80    6.30    6.29     5.25     7.04
49904      5.17    4.46    3.58     4.17     5.25
02369      7.13    7.71    5.17     5.73     10.08
21074      11.07   12.00   12.44    12.39    11.17
22465      8.67    6.25    6.83     6.55     6.98
37941      6.55    2.00    2.00     3.00     4.70
61418      8.95    10.03   10.17    10.28    9.56
61923      7.17    8.07    8.35     8.28     7.23
01943      14.89   13.14   13.67    14.21    11.91
04075      11.89   13.29   12.27    11.93    11.50
06006      4.75    7.75    5.70     7.13     8.16
13509      7.55    6.04    6.04     4.86     5.64
20925      3.81    4.00    3.00     1.00     4.58
23505      3.70    1.00    1.00     1.00     1.00
35585      4.58    3.00    1.00     0.00     2.58
38812      4.39    3.58    1.00     1.58     2.58
47531      6.83    6.36    5.64     3.70     6.78
48850      4.46    2.58    2.32     2.32     3.58
11010      8.70    8.03    8.43     8.26     8.22
44752      8.61    8.62    6.48     6.02     8.83
01768      11.86   13.88   15.01    13.76    10.80
33971      14.51   16.13   17.44    15.35    11.56
53619      5.64    7.25    8.46     6.39     3.32
14816      7.55    7.98    6.94     6.39     7.43
17745      8.25    6.73    5.81     6.00     8.81
18463      4.91    4.32    2.58     3.00     5.78
00707      11.55   10.70   8.69     11.65    11.90
32746      8.73    7.54    5.81     7.60     3.81
34040      9.18    7.92    8.08     8.48     5.52
06728      10.38   10.95   7.92     7.85     9.66
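The tissue-specific selection rule in Section 2.5 (more than 200 reads in the target tissue and fewer than 50 reads in every other tissue) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the input layout (a dictionary of per-tissue read counts per locus) and the locus names are hypothetical, and each of the five libraries is treated as its own tissue, which may differ from how the authors handled the two tuber stages.

```python
from typing import Dict

# The five libraries named in Section 2.5.
TISSUES = ("leaf", "root", "stem", "tuber1", "tuber2")


def select_tissue_specific(
    counts: Dict[str, Dict[str, int]],
    target_min: int = 200,
    other_max: int = 50,
) -> Dict[str, str]:
    """Map each tissue-specific locus to its tissue.

    A locus qualifies when it has more than `target_min` reads in one
    tissue and fewer than `other_max` reads in all remaining tissues.
    Missing tissues are counted as 0 reads (an assumption of this sketch).
    """
    selected = {}
    for locus, by_tissue in counts.items():
        for tissue in TISSUES:
            target = by_tissue.get(tissue, 0)
            others = [by_tissue.get(t, 0) for t in TISSUES if t != tissue]
            if target > target_min and all(c < other_max for c in others):
                selected[locus] = tissue
                break  # at most one tissue can satisfy the rule
    return selected


# Hypothetical read counts, for illustration only.
example_counts = {
    "locus_A": {"leaf": 350, "root": 10, "stem": 20, "tuber1": 0, "tuber2": 5},
    "locus_B": {"leaf": 350, "root": 60, "stem": 20, "tuber1": 0, "tuber2": 5},
    "locus_C": {"leaf": 49, "root": 201, "stem": 49, "tuber1": 49, "tuber2": 49},
    "locus_D": {"leaf": 200, "root": 0, "stem": 0, "tuber1": 0, "tuber2": 0},
}

print(select_tissue_specific(example_counts))
```

In this toy input, locus_B is rejected because its root count is not below 50, and locus_D is rejected because 200 reads does not exceed the threshold.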

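Section 2.3 compares assemblies by N50, mean length, maximum length and sequence number, the statistics reported in Table 1. As a minimal sketch of how such a summary could be computed from a list of sequence lengths, the Python below uses the common definition of N50 (the length L such that sequences of length at least L contain half of all assembled bases). The function names and the toy lengths are assumptions for illustration; this is not the authors' script.

```python
from typing import Dict, Sequence, Union


def n50(lengths: Sequence[int]) -> int:
    """Return the N50 of a non-empty collection of sequence lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


def summarize(lengths: Sequence[int]) -> Dict[str, Union[int, float]]:
    """Collect the assembly statistics used to compare assemblers."""
    return {
        "count": len(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "mean": sum(lengths) / len(lengths),
        "n50": n50(lengths),
    }


# Toy lengths in bp, for illustration only.
toy_lengths = [200, 300, 1000]
print(summarize(toy_lengths))
```

Applied to the real locus and transcript lengths, the same summary would be expected to give values of the kind listed in Table 1.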