ebook img

Measuring Absolute RNA Copy Numbers at High Temporal Resolution Reveals Transcriptome PDF

17 Pages·2016·8.97 MB·English
by  
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Measuring Absolute RNA Copy Numbers at High Temporal Resolution Reveals Transcriptome

Resource Measuring Absolute RNA Copy Numbers at High Temporal Resolution Reveals Transcriptome Kinetics in Development Graphical Abstract Authors NickD.L.Owens,IraL.Blitz,MauraA. Lane,...,MichaelJ.Gilchrist,KenW.Y. Cho,MustafaK.Khokha Correspondence [email protected](M.J.G.), [email protected](K.W.Y.C.), [email protected](M.K.K.) In Brief Owensetal.presentanultra-high- resolutionRNA-seqtimecourseduring embryonicdevelopment.Usingspike-ins tocalibratetranscriptreads,they measureabsolutemRNAtranscript numbersandcalculatetranscriptome kinetics,insomecases,inkilobasesper minuteperallele.Withthesedata,they findthattemporalsynexpressionand characteristictimescalesdistinguish genefunctionsacrossthetranscriptome. Highlights Accession Numbers d Ultra-hightemporalresolutionRNA-seqrevealssmooth GSE65785 expressiontrajectories d Absolutetranscriptmeasurementsenablecalculationof transcriptomekinetics d Transcriptaccumulationratesvaryovereightordersof magnitudeduringdevelopment d Temporalsynexpressionandtimescalesdistinguishgene functions Owensetal.,2016,CellReports14,632–647 January26,2016ª2016TheAuthors http://dx.doi.org/10.1016/j.celrep.2015.12.050 Cell Reports Resource Measuring Absolute RNA Copy Numbers at High Temporal Resolution Reveals Transcriptome Kinetics in Development NickD.L.Owens,1,6IraL.Blitz,2,6MauraA.Lane,3,4IlyaPatrushev,1JohnD.Overton,4,5MichaelJ.Gilchrist,1,* KenW.Y.Cho,2,*andMustafaK.Khokha3,4,* 1TheFrancisCrickInstitute,MillHillLaboratory,TheRidgewayMillHill,LondonNW71AA,UK 2DepartmentofDevelopmentalandCellBiology,UniversityofCalifornia,Irvine,CA92697USA 3PrograminVertebrateDevelopmentalBiology,DepartmentofPediatrics 4DepartmentofGenetics 5YaleCenterforGenomeAnalysis YaleUniversitySchoolofMedicine,333CedarStreet,NewHaven,CT06520,USA 6Co-firstauthor *Correspondence:[email protected](M.J.G.),[email protected](K.W.Y.C.),[email protected](M.K.K.) http://dx.doi.org/10.1016/j.celrep.2015.12.050 ThisisanopenaccessarticleundertheCCBYlicense(http://creativecommons.org/licenses/by/4.0/). SUMMARY typical events that occur in rapid succession. A quantitative understanding of the transcriptome during embryogenesis will Transcript regulation is essential for cell function, have important applications for congenital malformation re- andmisregulationcanleadtodisease.Despitetech- searchandregenerativemedicine(Fakhroetal.,2011;Tebben- nologies to survey the transcriptome, we lack a kampetal.,2014;Zaidietal.,2013).Withtheabilitytomeasure comprehensiveunderstandingoftranscriptkinetics, globaltranscriptkinetics,wecaneffectivelystudytheimpactof which limits quantitative biology. This is an acute different transcript regulation strategies on gene expression; for example, dynamics of chromatin modifications, utilization challenge in embryonic development, where rapid ofcis-regulatorysequences,kineticsoftranscriptionfactorbind- changes in gene expression dictate cell fate deci- ing,andtranscriptstability.Becausethestudyofgeneregulatory sions.Byultra-high-frequencysamplingofXenopus networksislimitedbythelackofgenome-widekineticdata(Kar- embryos and absolute normalization of sequence lebachandShamir,2008),akinetictranscriptomedatasetwillbe reads, we present smooth gene expression trajec- transformativetobuildandtestgeneregulatorynetworkmodels toriesinabsolutetranscriptnumbers.Duringadevel- indevelopment. opmental period approximating the first 8 weeks of Themeasurementoftranscriptkineticshastworequirements: human gestation, transcript kinetics vary by eight (1) measurements of gene expression in absolute numbers of orders of magnitude. Ordering genes by expression transcriptspercellorembryoand(2)samplingatasufficiently dynamics, we find that ‘‘temporal synexpression’’ high temporal resolution to properly calculate rates of change predicts common gene function. Remarkably, a of transcript numbers. While there is great potential to use RNAsequencing(RNA-seq)tostudytranscriptkinetics,theda- single parameter, the characteristic timescale, can tasetsthatarecurrentlyavailablelackoneorbothoftheseattri- classify transcript kinetics globally and distinguish butes, and suitable methodologies have yet to be developed genes regulating development from those involved (Stegleetal.,2015).Inmostdatasets,samplingtimepointsare incellularmetabolism.Overall,ouranalysisprovides too widely spaced. Data points are effectively isolated ‘‘snap- unprecedented insight into the reorganization of shots’’ in developmental time, and analysis is restricted to maternal and embryonic transcripts and redefines pairwisecomparisonsofdifferencesinrelativegeneexpression ourabilitytoperformquantitativebiology. (Aanes et al., 2011; Fang et al., 2010; Kalinka et al., 2010; Paranjpeetal.,2013;Spenceretal.,2011;Tanetal.,2013;Ves- terlundetal.,2011;Wangetal.,2004;Yanaietal.,2011).Inaddi- INTRODUCTION tion,ineachofthesedatasets,relativenormalizationprecludes kinetic calculations. Relative normalization is ubiquitous within Gene expression is dynamic and tightly regulated. To build a thefieldandprovidesexpressionlevelsinarbitraryunitsrelated quantitativeunderstandingofgeneregulation,directmeasure- tothenumberofreadsthatmaptoatranscriptrelativetotheto- ment of transcript kinetics is necessary. Transcript kinetics talnumberofreadsfromthelibrary.Theserelativevaluesdonot describe the rate of change of transcript copy numbers with relatedirectlytotranscriptnumbersandarenotreliableindica- time.Indevelopingsystemssuchastheembryo,dynamictran- torsofeitherthemagnitudeorthedirectionofchangeingene script expression precisely coordinates a sequence of stereo- expressionbetweensampleswhenRNAcontentchanges.For 632 CellReports14,632–647,January26,2016ª2016TheAuthors A Cleavage Gastrulation Neurulation Organogenesis Egg 7 9 10 12 1314 2224 252627 28 32 35 38 39 40 41 42 Stage Samples collected from two crosses Sample PolyA+ rdRNA Clutch A Clutch B 0 5 10 15 20 25 30 35 40 45 50 55 60 65 Time (hours post fertilization, hpf) B M) 8000 Relative Normalization 8× 107 Absolute Normalization P o eef1a1o A+on (T 6000 eEeRfC1aC1-o00171 A+mbry 6 ERCC-00171 Polymilli Polyer E E(SRpikCeCd -T0ra0n1sc7r1ipts) Clutch A scripts per 24000000 Clutch A anscripts p 24 an Tr Tr 0 0 0 20 40 60 0 20 40 60 Time (hpf) Time (hpf) C 1 12 24 32 40 42 Stage 100 XX.. ttrrooppiiccaalliisss CClluuttcchh AAA ppoollyyAA++ 90 XX.. ttrrooppiiccaalliisss CClluuttcchh BB ppoollyyAA++ TThhiiss ssttuuddyy 80 XX.. ttrrooppiiccaalliiss CClluuttcchh AAA rrddRRNNAA XX.. llaaeevviisss PPoollyyAA++ yo 70 XX.. llaaeevviiss PPoollyyAA++//33..44 SSaaggaattaa eett.. aall,, 11998800 br m e 60 g / n 50 A N R 40 m 30 20 10 0 0 10 20 30 40 50 60 Time (hpf) D 6 × 10 12 24 32 40 42 Stage 6 +) XXeettrroo..II0011771199 rraallbbpp11 XXeettrroo..II0011771199 A hh11ffoooo y 5 mmiixx11 mmyyoodd11 mmyyll22 ol p tt hh11ffoooo yo ( 4 tt mmyyoodd11 mmeepp11aa mmyyll22 br mmeepp11bb CClluuttcchh AA m 3 CClluuttcchh BB e mmiixx11 mmeepp11bbbbbb pts / 2 cri rraallbbpp11 mmmmmeepp11aa ns 1 a Tr 0 0 10 20 30 40 50 60 Time (hpf) (legendonnextpage) CellReports14,632–647,January26,2016ª2016TheAuthors 633 example,asthetotalamountofRNAincreasesinthedeveloping fertilizationat30-minintervalsforthefirst24hrofdevelopment, embryo,transcriptswithaconstantcopynumberwillappearto followedbyhourlysamplingthereafterto66hr(Figure1A).We decreaseinexpressionunderrelativenormalization.Therefore, sampledRNAfromtwocrossescollectedinparallel(Figure1A), eveninthosestudiesinwhichsamplingratesarehigh(Collart Clutch A and Clutch B. For Clutch A, we sequenced poly(A)+ etal.,2014;Tomancaketal.,2002),relativenormalizationcom- RNA for the full 66 hr and also total RNA depleted of rRNA binedwithvaryingtotalmRNAlevelsconfoundcomparisonsso (rdRNA), containing both poly(A)+ and poly(A)(cid:1) RNAs, for the that kinetic calculations cannot be made even in simple fold first24hr.ForClutchB,wesequencedpoly(A)+RNAforthefirst changeterms. 24hronly. Absolutetranscriptmeasurementscanbeachievedbyspiking To calculate absolute transcript numbers and kinetics, we in RNAs at known abundances for use as quantification stan- calibrated read counts of all transcripts using spike-in RNAs dards.NativetranscriptsarecalibratedagainstRNAstandards asquantification standards. Afterembryohomogenization and tocalculateabsolutetranscriptcopynumberestimates.Previ- priortoRNAextraction,weaddedaknownamountofspike-in ous reports have used spike-ins to normalize for sequencing RNAsperembryotoeachsampleindependently.Thisensured depth, tocontrol fortechnical variability,andtocalculateesti- thatRNAstandardsunderwentthesamevariationsinrecovery mates of absolute transcript numbers (Brennecke et al., 2013; as the endogenous embryonic transcripts during RNA purifi- Islametal.,2014;Junkeretal.,2014;Rissoetal.,2014).How- cation, library preparation, and sequencing. To Clutch A and ever,recentreportshavesuggestedthatsuchadirectabsolute Clutch B, we added the 92 synthetic RNAs available from the normalization with currently available RNA standards is infea- External RNA Control Consortium (ERCC) (Baker et al., 2005). sible due to significant sequencing biases (Risso et al., 2014; Inaddition,weindependentlyaddedthreeoftheE.coli-derived SEQC/MAQC-IIIConsortium,2014).Todate,afewRNA-seqda- ArrayControlspike-instoClutchB. tasetshavebeencalibratedindirectly,whereasmallnumberof We aligned reads to the X. tropicalis genome, known off- transcriptsarefirstquantifiedusinganon-sequencingtechnol- genomeexpressedsequencetags(ESTs)sequencesandspike ogythat,inturn,isusedtonormalizeRNA-seqdata(Marguerat RNAsequences,andcalculatedrelativetranscriptabundances etal.,2012;Tuetal.,2014).Here,weshowthatdirectabsolute of an augmented version of X. tropicalis v7.2 models in tran- quantificationusingsequencedataaloneisfeasibleandyields scripts per million (TPM). The overall quality of our datasets accuratemeasurementsofembryoRNAcontent. was excellent, according to a number of metrics (Figures S1A UsingRNA-seqandspike-inRNAsforabsolutenormalization, and S1B). Importantly, for capturing transcript kinetics, tran- wecreatedagenome-widedatasettopreciselymeasuretran- scriptome-wide comparisons of neighboring time points were script kinetics, which can be linked to biological processes ascorrelatedasbiologicalreplicates,asmeasuredbytranscrip- occurringduringembryogenesis(i.e.,gastrulationororganogen- tome-wideSpearmancorrelation(FigureS1A).Fourtimepoints esissuchasforeye,muscle,andheart).Toaccomplishthis,we (among 139) and one RNA standard that performed poorly sampledrapidlydevelopingembryosatsuchhightemporalres- (Qing et al., 2013; SEQC/MAQC-III Consortium, 2014) were olutionthatdatafromneighboringtimepointswerefoundtobe excludedfromallsubsequentanalysis(FiguresS1C–S1E;Sup- assimilarasbiologicalreplicates,ensuringthatwecapturedpre- plementalExperimentalProcedures).Wealsofoundthatasmall cisetranscriptdynamics. fractionofnon-rRNAtranscriptsaredepletedintherdRNAdata- Thisrichdatasetenablesthevisualizationofgenedynamicsof set(FigureS2F;TableS2),indicatingthatourribosomal-deple- the entire transcriptome over an extended developmental tionprotocolmayhaveasmallnumberofoff-targeteffects. period.Inthefollowingtext,weoutlineanumberofdiscoveries Commonly, in large-scale genomics studies, unsupervised enabledbycombiningabsolutenormalizationandhightemporal machine learning tools such as clustering methods (e.g., resolutionsampling. weighted correlation network analysis; WGCNA) (Langfelder and Horvath, 2008) or dimensionality reduction methods (e.g., RESULTS principal-componentanalysis;PCA)(Jolliffe,2002)candiscover relationshipswithinthedata.Inourcase,PCAconfirmedastrong DataCollection,Quality,andOverviewofAnalysis temporal correlation inherent in our experimental design (Fig- Toprofilegeneexpressionathightimeresolutioninthehuman ure1A;FigureS1F).Therefore,weoptedforadataanalysisframe- disease model, Xenopus tropicalis, we collected eggs (time 0) work that can explicitly describe these temporal correlations. and then synchronously developing embryos from an in vitro Gaussianprocessesareusedcommonlyinthephysicalsciences Figure1. DataGeneration,mRNAContentoftheEmbryo,andGeneExpressionDynamics (A)Experimentaldesignandsamplecollection. (B)Left:eef1a1oandERCC-00171abundanceinrelativenormalization(TPM),ClutchApoly(A)+.RNAstandarddecreaseswithtime.Right:absolutenormali- zationofleftpanel.RedlineindicatesERCC-00171transcriptsaddedbasedonthemanufacturer’sdatasheet.TheRNAstandardisconstantatthecorrect abundance. (C)TotalmRNA(poly(A)+andrdRNA)intheembryoinnanogramswithtime.Grayregionandline,respectively,markGaussianprocess95%confidenceintervals (CI)andmedianofClutchesAandBpoly(A)+data. (D)Dynamicsofgeneexpressionofpoly(A)+RNAintranscriptsperembryo.CirclesmarkClutchApoly(A)+,andtrianglesmarkClutchBpoly(A)+.Thehorizontal axismarkstime(bottom)andNieuwkoopandFaber(NF)developmentalstage(top).ShadedregionsmarkGaussianprocess95%CIsofthedata. SeealsoFiguresS1andS2. 634 CellReports14,632–647,January26,2016ª2016TheAuthors andhavereceivedsignificantattentionasgeneraltoolsforma- allsamples,thedetectionlimitis(cid:3)1,300transcriptsperembryo, chine learning (Rasmussen and Williams, 2006). They offer a whichislessthan1transcriptpercelloncetheembryohasat- statisticalframeworkforrepresentingandexaminingthestrong tainedthecellnumberpresentintheblastula. temporal correlations present in our data. Gaussian processes have been successfully applied to a range of biological data TotalmRNAContentoftheEmbryoValidatesAbsolute (A¨ijo¨etal.,2014;Heinonenetal.,2014;Honkelaetal.,2010;Stegle Normalization etal.,2010; waMaina et al., 2014). Here,Gaussian processes WeinvestigatedthelevelsoftotalmRNAintheembryoduring allowustoaddressmultipletopicswithinasingleframeworkto development.Wesummedtheabsoluteabundanceofalltran- elucidatetranscriptdynamics.Weusetheminabsolutenormali- scriptsmeasuredtoderivethetotalmRNAinnanogramsperem- zation,regression,estimationofconfidenceintervals,identifica- bryoateachtimepointforbothpoly(A)+andrdRNA(Figure1C). tionofdifferentialexpression,andthecalculationofkinetics. Althoughthereisanotablewavebetween0and10hourspost- fertilization (hpf) (discussed later), the total amount of poly(A)+ DirectAbsoluteNormalizationofRNA-SeqData mRNAincreaseswithtimefrom10–15ngofmRNAperembryo Weachievedabsolutenormalizationwithatwo-stepprocedure. atfertilizationto30–50ngofmRNAperembryoat66hpf(swim- Ourfirststepisrelativenormalizationforsequencingdepth,re- ming tadpole, stage 42). These values validate our absolute sultingintranscriptabundancesinTPM.Wenotethattherelative normalization, as they agree well with total RNA yields after normalizedabundancesofourRNAstandardsdecreasedwith tissue homogenization ((cid:3)1.3 mg per embryo and (cid:3)2.4 mg per timeduetoaccumulatingRNAintheembryo(Figure1Bleft;Fig- embryo at the earliest and latest stages of our time course, ureS2A).Inoursecondstep,wetransformedtheseRNAstan- respectively), where (cid:3)1%–2% of total RNA is polyadenylated dardabundancestobeconstantwithtimeattheknownamount (Davidson,1986).Asfurthervalidationoftheabsolutenormaliza- spiked into each sample. To minimize the effects of technical tion,wecompareourX.tropicalismRNApredictionstoexperi- noiseoftheRNAstandards,weuseGaussianprocessestolearn mentallymeasuredpoly(A)+mRNAyieldsfromX.laevis(Sagata afunctionthatvariessmoothlywithtimetotransformtheRNA et al., 1980). Accounting for the uncertainty in our absolute standards. This smooth mapping is critical, as it ensures that normalization, the mean ratio between these X. laevis mRNA absolute normalized copy numbers are not contaminated by yields and our data is 3.38 ± 0.02 (mean ± SD) (Figure 1C). sample-to-sampletechnicalnoiseintheRNAstandards. This is in good agreement with the ratio in volume between RecentworkhasdemonstratedthatERCCspikesexhibitsig- X. laevis and X. tropicalis eggs; their respective diameters are nificantsequencingbiases,andtheirappropriatenessfordirect 1.19±0.07mmand0.80±0.05mm(Crowderetal.,2015),giving absolutenormalizationinRNA-seqhasbeenquestioned(Risso avolumeratioof3.31±0.76.Therefore,ourabsolutequantifica- etal.,2014;SEQC/MAQC-IIIConsortium,2014).Toinvestigate tion agrees well with measurements of in vivo mRNA levels theseconcerns,weevaluatedtheconsistencyofourproposed determinedbyanindependentexperimentalmethod. absolute normalization on our three datasets independently: ClutchApoly(A)+,ClutchArdRNA,andClutchBpoly(A)+(Fig- TranscriptDynamicsintheDevelopingEmbryo ureS2B).Inagreementwithpreviousobservations(Qingetal., Examining the expression of individual genes, we observe 2013; Risso et al., 2014; SEQC/MAQC-III Consortium, 2014), smoothtransitionsingeneexpression(Figure1D).Oursampling we find that (1) ERCC spikes perform poorly in poly(A)+ seq- rateissufficienttocapturethedynamicsoftheexpressedgenes uencing when compared to rdRNA sequencing (Figures S1F (see timescale analysis given later, Figure 7). We find genes andS2B),(2)spike-invariationbetweenthetwoindependently whose expression peaks at different times in development, spiked poly(A)+ datasets (Clutches A and B) is smaller than demonstrating transcript kinetics that are highly dynamic and between Clutch A poly(A)+ and Clutch A rdRNA (Figure S2B), suggestingspecificrolesforthesegenesduringdefineddevel- and (3) spike-in performance varies across spike-in species opmentalperiods(Figure1D).Forexample,inthecaseofmuscle butisconsistentoverreplicates(FigureS2C).Weapplycorrec- induction,aseriesofgenescriticalformesodermandthenso- tionsforthefirsttwosourcesofvariation(FigureS2B;Supple- mitespecificationaccumulateandextinguishatdifferenttimes mental Experimental Procedures) and then calculate absolute asdevelopmentprogresses(FigureS3A).Thetemporalexpres- transcriptsperembryofornativetranscriptsandRNAstandards. sionprofilesofthesegenesagreewithRNAdetectionbyinsitu We capture the Clutch A/B variation and the spike-in species hybridizationinwhole-mountembryos(FigureS3A).Wefurther sequencingbiasesinasingleuncertaintymodelfortheabsolute examined genes expressed later in development, which also normalization(FigureS2D).Ourabsolutenormalizationperforms correlatedwithinsituhybridizationdata(FiguresS3BandS3C). wellwithR2=0.97–0.98(FigureS2C)andhasa1.11-to1.25-fold Thereisutilityinidentifyinggeneswithconstantexpression,as error when comparing the ERCC and ArrayControl spike-ins thesemayserveasoptimalloadingcontrols invariousexperi- that were added independently to Clutch B samples (Fig- mental regimens (Figure S4). We found 109 genes (<0.7%) in ure S2C). We discuss the limitations of this normalization, the poly(A)+(0–66hpf)and1,078genes(6.9%)inrdRNA(0–24hpf) validity of our Gaussian process models, and influences of that had less than a 2-fold change (Figure S4A; Table S3). gene model quality in Supplemental Experimental Procedures Commonly used loading controls (gapdh, odc1, eef1a1) have andTableS2. morevariableexpression(FigureS4B).Ourdatawouldsuggest Thedetectionlimit,whichisthenumberoftranscriptsrequired thattheRNAhelicaseddx3xhasthebestloadingcontrolperfor- to produce a single read on average, increases with time as mance across developmental time and RNA-seq preparations mRNAaccumulatesintheembryo(FigureS2E).Averagedover (FigureS4C). CellReports14,632–647,January26,2016ª2016TheAuthors 635 DifferentialExpressionbetweenBiologically sequence differences between isoforms were long enough to IndependentReplicatesRevealsPreciseRegulationof generateeffective in situhybridization probes. Remarkably, all Transcripts threeshoweddifferentspatialexpressiondomains(Figure3D). Our dense temporal sampling offers a unique ability to detect This demonstrates the biological importance of isoform dy- differentialexpressionovertheentiretimecourse.Webeganby namics during embryogenesis and suggests specific mecha- exploring differential expression between our Clutch A/B nisms to regulate isoforms in different tissues. Moreover, we biologicalreplicates(Figure2;SupplementalExperimentalProce- demonstratethepowerofhigh-resolutionsamplingfortranscript dures).Wefoundthat12,062of16,914genes(71%)hadidentical discoveryandgenemodeling.Thesmallfragmentsizesassoci- expression dynamics between the two clutches. Additionally, ated with high-throughput sequencing present challenges for 4,561 genes (27%) showed differential expression with a small identifyingtheco-expressionofdistantexonsinlongtranscripts. statistical effect (Figures 2A and 2B); their expression profiles The temporal statistics here can aid in the resolution of long- show only minor differences (Figure 2C). Thus, expression dy- rangeorder;distantexonsthatareexpressedinthesameiso- namicsof98%ofthetranscriptomearesimilarbetweentwoinde- formwillexhibitrelatedtemporaldynamics(Figure3C). pendentbiologicalreplicates.Thishighlightsboththehighdegree ofreproducibilityinourdatasetandtheexceptionallyrobustcon- TemporalSynexpressionPredictsCommonGene trolofgeneexpressionduringdevelopment.Theremaining291 Function genes(1.7%)hadalargestatisticaleffect,indicatingstrongdiffer- To explore the expression dynamics of the entire dataset, we ential expression. The majority of these genes have differential createdaheatmapoftranscriptsorganizedaccordingtoahier- maternalexpression,butmostClutchAandClutchBdifferences archical clustering of their normalized expression profiles (Fig- converge by 10–12 hpf, displaying identical expression later in ure 4A). The heatmap is ordered so that nearest neighbors development (e.g., dhx32; Figure 2D). Interestingly, this set of havesimilarexpressionprofiles,whichweterm‘‘temporalsyn- genesisenrichedforGeneOntology(GO)termsrelatingtocelldi- expression.’’Synexpressionsuggeststhatgenesarecontrolled vision(TableS4),suggestingrolesintherapidcelldivisionofthe bysharedregulatorynetworksandmayhavecommonfunction earlyembryo.Notably,asmallernumberofgeneshaveexpres- (Niehrs and Pollet, 1999). To investigate temporal synexpres- sion profiles that are divergent between replicates (Figure 2D). sion,weusedaslidingwindowtoassesslocalGOenrichment From these data, we conclude that gene expression is tightly across the heatmap (Figure S6A; Table S4). Interestingly, we regulated throughout embryogenesis and highly reproducible found many blocks of genes enriched for specific functions whenmeasuredusinghigh-throughputsequencing. rangingfromcellularbiologytodevelopmentalpatterningevents (Figure4A).Inthesetofgenesshowingearlytransientexpres- DifferentialExpressionandSwitchingofTranscript sion,wefoundGOtermsassociatedwithearlydevelopmental IsoformsduringDevelopment steps,includinggermlayerspecification,mesoderm/endoderm Next,weanalyzedthedifferentialexpressionofalternativetran- development,andaxispatterning(Figure4A,right,topofheat- scriptisoforms.Wecanclassifydifferentialisoformexpression map). Genes transiently expressed later in development are as either ‘‘differential abundance’’ or ‘‘differential temporal enrichedforGOtermsassociatedwithorganogenesis(Figure4A, dynamics.’’Transcriptsaredefinedashavingdifferentialabun- right,bottomofheatmap).Therefore,proximitywithintheheat- danceifonetranscriptisexpressedatadifferentlevelthanthe map enriches for GO terms that are consistent with develop- other,withaconstantratiobetweenthem,orashavingdifferen- mentalevents,demonstratingthattemporalsynexpressionisa tial dynamics if the ratio varies with time (Figure S5). Here, powerfulpredictorofgenefunction(seebelow). werestrictourattentiontothoseisoformsthathavedifferential Whileexploringtemporalsynexpression,wefoundtwodistinct temporal dynamics(Figure 3;Figure S5).Wefound761genes setsofgenes(S1andS2inFigures4Aand4B,left;Supplemental withisoformsshowingdifferentialdynamics,withdifferencesin ExperimentalProcedures)thathavesimilaroscillatorybehaviors promoters,exons,and30 UTRs.Isoformexpressionprofilesfor between34and66hrofdevelopment.Thesesetsarenotimme- 147 genes show a switchpoint where their expression profiles diatelyadjacentintheoverallheatmapduetodifferentexpres- cross, marking a change in which isoform is most abundant sion prior to 34 hr. Postulating that there may be more genes (Figure 3A). The timing of these switchpoints was not evenly acrosstheheatmapwithsimilaroscillationsand,hence,similar distributedoverdevelopment;twothirdsaretransitionsfroma function,weusedoneofthesegenes,ckb(Figure3D,right),as maternaltoazygoticisoformconcentratedaround10hpf(Fig- a‘‘seed’’toidentifyothersimilarlyoscillatinggenesanddiscov- ure3A).Interestingly,wefoundexamplesofisoformswitching ered a much larger set (Table S5, 150 genes). To determine duetopost-transcriptionalregulation;twomaternalisoformsof whether this local temporal synexpression identified common bicd2 exhibit differential polyadenylation (Figure 3B). One iso- functions, we examined spatial expression patterns. Our seed form bicd2(1) has a long 30 UTR and is polyadenylated upon gene,ckb,haspronouncedexpressioninthesomiticmesoderm fertilization,theotherbicd2(2)hasashort30UTRandisdeade- andlieswithinablockofgenesenrichedformusclecontraction nylateduponfertilization(Figure3C).The30UTRofbicd2(1)must GOterms.Remarkably,ourlargerlocaltemporalsynexpression containcontrolelementstoremaindeadenylatedduringoocyte setalsohassignificantenrichmentinsomiticmesoderm.38of maturationandbecomepolyadenylatedafterfertilization. thesegeneshaveexpressionpatternsavailableinpublicdata- Wepredictedthatisoformswithdifferenttemporaldynamics bases(Bowesetal.,2010),and34/38(89%)havesomiticexpres- may show different spatial as well as temporal expression, sion(FigureS6B).Thesegenesappeartooscillatewithaperiodof implyingdifferentfunction.Weselectedthreegeneswherethe approximately20hr,whichissignificantlylongerthanthesomitic 636 CellReports14,632–647,January26,2016ª2016TheAuthors A B Measure of Effect Size 2500 Nexop dreifffsfesrioenntial DSmiffffearlle Enffftfieacl tExpression DLaifffrfgeere Entfffifaelc Etxpression B 0 60% overlap & 10% overlap Genes12500000 1721,.03%62 genes 42,576.01% Genes 21.971% genes een ClutchA Ame average)--24 2.5% overlap # 1000 p betw ti(log10-6 a 500 verl -8 efNfeoct Sefmfeacltl Leaffregcet O 0 -10 -50 0 50 100 150 -50 0 50 100 150 C Difffferential Expression Score - Bayesian Information Criterion (BIC) BIC × 105 pnocc 188.5 × 106 Xetro.K05169 169.5 × 106 sox77 148.7 × 105 dhx32 135.6 Transcripts / Embryo 10500 10 AB 20 Transcripts / Embryo 0120 AB 10 20 Transcripts / Embryo 01230 10 20AB Transcripts / Embryo 02460 10 20AB hpf hpf hpf hpf × 105 ccdc188 118.7 × 106 top1.22 103.5 × 106 cep1522 92.5 × 105 brca22 81.0 Transcripts / Embryo 10500 10 20AB Transcripts / Embryo 2400 10 20AB Transcripts / Embryo 0120 10 20AB Transcripts / Embryo 0240 10 20AB hpf hpf hpf hpf × 105 cenpc1 70.1 × 105 dock88 60.4 × 106 top2b 51.8 × 105 rpl32 40.1 Transcripts / Embryo 02460 10AB 20 Transcripts / Embryo 0120 10 AB 20 Transcripts / Embryo 01230 AB 10 20 Transcripts / Embryo 0120 AB 10 20 hpf hpf hpf hpf × 105 xrn1 30.6 × 106 dnd1 20.8 × 105 hgf 10.7 × 105 mov10l1 0.0 Transcripts / Embryo 1105050 10 AB 20 Transcripts / Embryo 0120 10 20AB Transcripts / Embryo 0240 AB 10 20 Transcripts / Embryo 1105500 10 20AB hpf hpf hpf hpf D yo15 Clutch A XXXetttrrro.K05169 30 Convergences rmb Clutch B dddhx32 149 genes ipts / ecr150 Xetro.K05169 divergence dhx32 convergence G#enes1200 D1iv9e rggeennecses s n Tra 00 5 10 15 20 0000 5 10 15 20 Time (hpf) Time (hpf) Figure2. ClutchAversusClutchBPoly(A)+DifferentialExpression (A)ClutchAversusClutchBhistogramofdifferentialexpressionscoresasmeasuredbytheBayesianinformationcriterion(BIC);largerscoresindicategreater differentialexpression(SupplementalExperimentalProcedures).Threeregionsmarked:(1)nodifferentialexpression,BIC%0;(2)differentialexpressionsmall effect,0<BIC%60,and(3)differentialexpressionlargeeffect,BIC>60.See(B)forexplanationofthresholds. (B)Differentialexpression effect size as measured bythe log mean overlap betweenClutch AandClutch BGaussian process models (Supplemental 10 ExperimentalProcedures).MeanoverlapdecreaseswithincreasingBIC.ThresholdsatBIC=0,60inadditiontomeanoverlapsof60%,10%,and2.5%are marked.AtBIC=60,approximatelyallgeneshavelessthan10%overlap,andallgeneswithlessthan2.5%overlaphaveaBIC>60. (C)DifferentialexpressionexampleswithdecreasingBIC(topright).Genesontheboundaryforstrongdifferentialexpressionhavehighlycorrelatedexpression profilesin(A)and(B).Verticallinesmarkconvergences(red)ordivergences(black). (D)Left:examplesofconvergence(dhx32)anddivergence(Xetro.K05169).Right:histogramofconvergenceanddivergencetimes. SeealsoSupplementalExperimentalProcedures. CellReports14,632–647,January26,2016ª2016TheAuthors 637 A × 106 Differential Expression of Isoforms Isoform SwitchTime o 20 Clutch Acripts / embry 123 bbiiccdd22((21)) nneekk22((21)) ssllcc66aa88((13)) #Genes10 MMZyaagttoeetrrinncaa tllo tt ooZ yZMgyaogttoeictri:nc 5a: 3l9: 22g eggneeennseess s n a Tr 0 0 0 5 10 15 20 0 10 20 30 40 50 60 Maternal to Maternal Switch (2 hpf) Zygotic to Zygotic Switch (13.7 hpf) Time (hpf) Time (hpf) Maternal to Zygotic Switch (6.9 hpf) B C bicd2 isoforms PolyA+/rdRNA PolyA+ read depth on bicd2 locus with time (Clutch A) × 106 2 0 0 Clutch A nscript / embryo 01..155 bbbbiiiiccccdddd2222((((1212)))) PPrrddooRRllyyNNAAAA++ Time (hpf)0424..05 hhppff 01.5 Tra 0 5’ bicd2(2) bicd2(1) 0 5 10 15 20 Maternal to Maternal switch (2hpf) Time (hpf) 3’ UTR D × 105 calhm2 calhm2(1) calhm2(2) × 107 ckb ckb(1) ckb(2) Clutch A PolyA+Transcripts / Embryo 11050504 cell 20 St4a0ge 33-34 60 calhm2(2)calhm2(1) 4 cell Stage 33-34 Clutch A PolyA+ranscipts / EmbryoTr 10500 Stage 920 Stage 3410-32 60 ckb(1)ckb(2) Stage 9 Stage 31-32 Time (hpf) Time (hpf) × 105 btbd3 btbd3(1) btbd3(2) o 15 4 cell Stage 33-34 A+mbry 4 cell Stage 33-34 3(2) Clutch A Polynscripts / E 150 btbdbtbd3(1) a Tr 0 0 20 40 60 Time (hpf) Figure3. IsoformDifferentialExpressioninClutchA (A)Isoformswitchingeventexamples:maternaltomaternal;maternaltozygotic;andzygotictozygoticswitchingevents(left).Histogramofisoformswitching eventsbycategoryandtime(right). (B)bicd2isoformsswitchinpoly(A)+data(circles)butdonotswitchinrdRNAdata(squares),indicatingdifferentialpolyadenylation.Poly(A)+andrdRNA abundancesagreewithinuncertaintyoftheabsolutenormalization(SupplementalExperimentalProcedures). (C)Normalizedreaddepthonbicd2locusinClutchApoly(A)+between0and4.5hpf.Notetemporaldynamicssharedbythefinalthreebicd2(1)exons.Heatmap anddepthofpile-upscorrectedforchangingtotalmRNA. (D)Spatialexpressionofthreegeneswithisoformswithdifferentialdynamics.Allisoformsshowdifferenttemporalandspatialexpressiondomains.Graylinesand pointsmarktotalexpressionforeachgene(sumoftheredandblueisoforms). SeealsoFigureS5andSupplementalExperimentalProcedures. clock (Pourquie´, 2003) and more reminiscent of the circadian expression patterns from 40 to 66 hpf that enrich for visual rhythm.Interestingly,afewcircadianclockgenesareexpressed perception (GO:0007601). Withthe exception oftmem145, all inthesomites(Curranetal.,2008,2014),andouranalysisindi- V1and V2genesarewell characterizedashavingrolesin rod cates that there may be additional unexplored links between andconecellsandareassociatedwithhumanretinaldiseases clockgenesandsomiteregulatorynetworks. (Table S5). Therefore, we predicted that the uncharacterized Inanothercase,wefoundtwosetsofgenes(V1andV2;Sup- tmem145 also plays a role in vision; and, indeed, tmem145 plemental Experimental Procedures) with similar temporal does have spatial expression in the eye (XDB3 [Bowes et al., 638 CellReports14,632–647,January26,2016ª2016TheAuthors A Clutch A Synexpression with Clutch A Synexpression with Time (hpf) GO Enrichment Time (hpf) GO Enrichment 1000 genes 0 20 40 60 100 genes 0 20 40 60 al 116 n 62 cell cycle er DNA repair Mat RNA processing 33132 regulation of Wnt receptor 55 21 signaling pathway Onset Transient 18241 Gednoeevnse itlnroa penmasrieleynnttly 39285217652423313417 Tndsoruoamcrnsleisatoocl/sgrvioeepmnntiteoersan ail s Fspsaaecttmteorbrnsly *formation Early nsient 10858 ptrraontseliant ilooncalization 41geannnaesdtuetorrraduiole ltarru/tmpiboo edns deteverevioleorlop ppmamettneentrnt specification n-tra 178125 cperollt ecoyclylesis dmeetseormdeinrmat idoenv oefl olepfmt/reignhtt symmetry o 67 BMP signaling pathway N RNA processing growth factor activity 9765 cell cycle 53 skeletal system devlopment nt protein localization siste 272 111180 pcerollt ecoyclylesis 30 cytoskeletal protein binding Per 154528 RchNrAo mpraoticne ossrginagnization contractile fiber 60 rDeNgAu-ladteiopnen odre tnratnscription, 12129 DNA repair 93 Transcription Factors* Somite S1 17075 39 asnpteecriifoicr/aptoiosnteriod pattern 19 Synexpression S2 Transcription Factors* et 7439126 cdmaeurvsdeciloolevp acmsoecnnuttrlaarc tsiyosntem 13 adsnkeetteelerrimtoariln /spayotsisottenem roio fd rle epvfatl/otrtipegmrhnte snspytemcmifiectarytion ns synaptic transmission epithelium development d o 220 signal transduction eliamrb d deevveeloloppmmeenntt Mi 41 translation cardiovascular system development central nervous system development synaptic transmission pancreas development et 20292 cell surface receptor proximal/distal pattern formation e onsV2 124 ssdiiigggnneaasllt iitnorgna npsadthuwctaioyn 90472277 3992 24 40 kmidunsecyle dteisvseuloep dmeveenltopment Lat V1 nviesruvaolu pse srcyesptetmio ndevelopment Vision 210300254137muscle contraction Synexpression B Somite Synexpression VisionSynexpression dn 1 dn 1 Aeo Aeo Clutch ormalizxpressi 0.5 (Sct.k 3b1(-23)2) (Satc. t238) S1 S2 Clutch ormalizxpressi0.5 V2 V1 NE 0 NE 0 0 10 20 30 40 50 60 0 10 20 30 40 50 60 Time (hpf) Time (hpf) Figure4. TemporalSynexpressioninClutchAPoly(A)+ (A)Temporalmapofthetranscriptome:anenumerationofallgeneexpressiondynamicsintheembryo.Heatmapsdisplayallgenesnormalizedbymaximum expressionandorderedbysimilarity.Inset(right)expandsongenestransientlyexpressedwithearlyonset.VerticalbarsmarkGOenrichment(colorscorrespond toGOterms),numbersappendedtoGObarsindicatethenumberofgenesofgivencategory.AllreportedGOblocksareenrichedwithp<2310(cid:1)4(Fisher’s exacttest)forentireblock.S1,S2,V1,andV2markthelocationsofsomiteandvisiontemporalsynexpressiongenes,respectively,in(B).‘‘TranscriptionFactors*’’ labelsanannotationoftranscriptionfactorsseparatetoGO(SupplementalExperimentalProcedures). (B)Somite(left)andvision(right)synexpressiongroups. SeealsoFigureS6andSupplementalExperimentalProcedures. 2010] and XGC EST database [Klein et al., 2002; Morin et al., anticipatethatfutureinvestigationsoftemporalsynexpression 2006]). inthisdatamaybeabletoassignputativefunctionstomany We conclude that temporal synexpression can be used of the 5,718 genes ((cid:3)28%) that remain unnamed. effectively to predict gene function and predictions can be refined depending on the local developmental period of TheReorganizationofMaternalandZygotic interest. Remarkably, we uncovered a large cohort of genes Transcriptomes expressed in the somites, simply by ranking their similarity InouranalysisoftotalmRNAduringdevelopment(Figure1C),we in expression to a gene with known somite expression. We notedadifferencebetweentotalrdRNAlevelsandpoly(A)+mRNA CellReports14,632–647,January26,2016ª2016TheAuthors 639 A enes (Clutch A) 468000000 Detection Limit~720 transcripts 0.0 hpf ~850 transcripts 4.5 hpf ~1000 transcripts 10.0 hpf ~2900 transcripts 66.0 hpf 468000000 G200 200 # 0 0 102 104 106 108 102 104 106 108 102 104 106 108 102 104 106 108 Transcripts / Embryo B C PolyA+ 0-23.5 hpf PolyA+ 0-23.5 hpf eef1a1o 1000 rdRNA 0-23.5 hpf cirbp 1000 rdRNA 0-23.5 hpf 4x107kb/h (Clutch A) 800 PolyA+ 24-66 hpf (P3oxly1A+0 /7 rtdrR/hNA) (Clutch A) 800 PolyA+ 24-66 hpf 7x1m0y7hk1b/h s 600 s 600 e e n n e cox3 e G 400 3x107 tr/h G 400 # # pri-mir427 200 200 7x108kb/h 0 0 102 104 106 108 102 104 106 108 Max. rate of increase (transcripts / hour) Max. rate increase (kb / hour) D 0 - 3.5 hpf 4 - 23.5 hpf 0 - 3.5 hpf 4 - 23.5 hpf h A)1000 PolyA+ PolyA+ PolyA+ PolyA+ 1000 Clutc rdRNA rdRNA rdRNA rdRNA s ( ne500 500 e G # 0 0 102 104 106 108 102 104 106 108 102 104 106 108 102 104 106 108 Max. rate of increase (transcripts / hour) Max. rate increase (kb / hour) Figure5. TranscriptCopyNumbersandKineticsperEmbryoinClutchA (A)Transcriptcopynumberhistograminpoly(A)+sequencingattime0.0(egg),4.5hpf(earlystage9,blastula),10hpf(stage12.5,endofgastrulation),and66hpf (tadpolestage42).Detectionlimitsgivenumberoftranscriptsrequiredtogenerateasinglereadaveragedoveralltranscriptlengths(FigureS2E).Transcriptcopy numberdistributionstransitionsmoothlybetweenthesetimepoints(datanotshown). (B)HistogramofmaximumratesofincreaseintranscriptsperhourbetweenanyconsecutivemeasurementpointscalculatedfromGaussianprocessmediansfor eachgene.Blueindicatespoly(A)+0–23.5hpf;redindicatesrdRNA0–23.5hpf;greenindicatespoly(A)+24–66hpf.Verticallinesmarkthegenewithmaximum rateofincreaseforeachcategory. (C)Asin(B),butinkilobasesperhour. (D)Maximumrateofincreaseintranscriptsperhour(left)andkilobasesperhour(right)for0–3.5hpfand4–23.5hpftimeintervals.Poly(A)+andrdRNAdistri- butionsarediscrepantfor0–3.5hpf,reflectingpolyadenylationofmaternalRNAs.Distributionsareidenticalover4–23.5hpf. SeealsoFigureS2E. levelspriorto4.5hpf(earlystage9blastula),thepointatwhichzy- maternalmRNAstargetedforclearancearelargelyextinguished gotictranscriptionbecomeswidespread.From0to4.5hpf,there (seetopleft,Figure4A),andzygotictranscriptioncausesmRNA isanincreaseinpoly(A)+transcripts,whilerdRNAlevelsshowlit- levelstoincreasefortheremainderofourtimecourse. tlechange,reflectingthepolyadenylationofmaternaltranscripts Toexaminethesetransitionsmoreclosely,weconsideredthe (Collartetal.,2014).Then,from4.5to10hpf,bothpoly(A)+and distribution of the number of transcripts per gene at different rdRNAlevelsfall,astheclearanceofmaternalmRNAexceeds timepoints(Figure5A).Initially,intheegg,thedistributionhas nascent zygotic transcription. The fall in RNA constitutes an abroadpeakat(cid:3)100,000transcripts,whichchangesdramat- (cid:3)20%lossoftheembryo’smRNAcontent,supportingthenotion icallyat4.5hpfintoabimodaldistributionwithamarkedpeakat thatveryearlyembryonicdevelopmentreliesheavilyonmater- (cid:3)600,000 transcripts. This change in distribution reflects the nallystoredtranscripts.By10hpf(lategastrulastage12),those rapid increase in polyadenylated maternal transcripts. The 640 CellReports14,632–647,January26,2016ª2016TheAuthors

Description:
any transcript in the early frog embryo, at 7 3 108 kb/hr per em- bryo (Figure 5C), an order of magnitude faster than the very abundantly expressed
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.