
Micro: RNA Methods PDF

285 Pages·2007·3.16 MB·English


SECTION ONE
Identifying MicroRNAs and Their Targets

CHAPTER ONE
Identification of Viral MicroRNAs

Christopher S. Sullivan* and Adam Grundhoff†

* Department of Molecular Genetics and Microbiology, University of Texas at Austin, Austin, Texas
† Heinrich-Pette-Institute for Experimental Virology and Immunology, Hamburg, Germany

Methods in Enzymology, Volume 427. © 2007 Elsevier Inc. ISSN 0076-6879, DOI: 10.1016/S0076-6879(07)27001-6. All rights reserved.

Contents
1. Introduction
2. Computational Prediction of Viral miRNA Candidates
2.1. Principles of computational miRNA identification
2.2. Predicting viral miRNA candidates with VMir
3. Array Confirmation of Viral miRNA Candidates
3.1. Array design
3.2. Microarray hybridization
3.3. Data analysis
4. Concluding Remarks
References

Abstract
Given the important function of microRNAs (miRNAs) in the control of gene expression, it is not surprising to find that some viruses encode their own miRNAs. Although the function of the overwhelming majority of these miRNAs remains unknown, at least some of them are expected to play crucial roles in the viral life cycle, and hence there is great interest in identifying novel viral miRNAs. The majority of currently known viral (and host) miRNAs have been identified by cloning of small RNAs, but, due to their small size, viral genomes are especially amenable to alternative methods based on the computational prediction of miRNA candidates. Here, we provide a detailed protocol on how to use computational prediction methods in conjunction with high-throughput microarray analysis to detect miRNAs in viral genomes.

1. Introduction
With the realization that miRNAs play a profound role in regulating the gene expression of most multicellular eukaryotes comes the understanding that some viruses that infect these hosts encode their own miRNAs. In 2004,
Pfeffer and colleagues were the first to report that viruses encode such molecules when they cloned miRNAs of viral origin from a B-cell line latently infected with Epstein-Barr virus (EBV) (Pfeffer et al., 2004). Since then, over 80 virally derived miRNAs have been discovered, mostly in viruses that are members of the herpesvirus and polyomavirus families (Table 1.1). Both families contain viruses with deoxyribonucleic acid (DNA) genomes that can establish lifelong, persistent infections in their hosts. In addition, there is a report that a single strain of human immunodeficiency virus (HIV), a member of the retroviral family, encodes an miRNA (Omoto et al., 2004). (However, whether the processing of this small RNA is dependent on the miRNA machinery, how prevalent this small RNA is in other HIV strains, and whether this small RNA is relevant to the life cycle of the virus await further study.)

As with their host counterparts, the functions of the vast majority of viral miRNAs are unknown. Clearly defined functions of three viral miRNAs have been established (Cullen, 2006; Pfeffer and Voinnet, 2006; Samols and Renne, 2006; Sullivan and Ganem, 2005). SV40 expresses a pre-miRNA late during infection that is processed into an miRNA and an miRNA*, both of which actively direct the cleavage of early gene products (Sullivan et al., 2005). This causes reduced early protein levels at late time points of infection, resulting in less susceptibility of infected cells to T-cell-mediated lysis in in vitro assays. Additionally, EBV encodes miR-BART-2, which can direct the cleavage of the transcript encoding the BALF5 DNA polymerase (Pfeffer et al., 2004); this may suggest a common theme of viral utilization of miRNAs for autoregulation of gene expression. However, Fraser and colleagues have reported that HSV-1 encodes a miRNA expressed during latent infection that negatively regulates certain cellular transcripts involved in a proapoptotic response (Gupta et al., 2006).
This suggests that at least some viral miRNAs will have a profound effect on the viral life cycle by targeting host transcripts.

The majority of known miRNAs (including those encoded by viruses) have been identified via cloning. In this procedure, small RNAs are size-fractionated, and successive rounds of single-stranded ligation incorporate flanking regions on the RNA to facilitate reverse transcription, PCR-mediated amplification, and cloning. The methodology for cloning viral small RNAs is identical to that for host-derived RNAs, and, therefore, we direct readers interested in cloning viral miRNAs to several established and well-written protocols (Pfeffer et al., 2003; http://web.wi.mit.edu/bartel/pub/protocols/miRNAcloning.pdf). However, the relatively small size of viral genomes makes them particularly amenable to computational-based miRNA discovery methods. Indeed, a sizable fraction (~15%) of virally derived miRNAs were identified solely by noncloning computational methods. Here we present the methodology behind our experimental strategy of using computational prediction combined with microarray detection to identify virally encoded miRNAs.
Table 1.1 Viruses known to encode microRNAs

Virus | Family | Number of known miRNAs | Method of identification | Reference(s)
Epstein-Barr virus | Herpesviridae | 23 | 17 by cloning, 6 by a combination of computational prediction and microarray analysis | Cai et al., 2006; Grundhoff et al., 2006; Pfeffer et al., 2004
Kaposi's sarcoma herpesvirus | Herpesviridae | 12 | 11 by cloning, 1 by computational prediction and microarray analysis | Cai et al., 2005; Grundhoff et al., 2006; Pfeffer et al., 2005; Samols et al., 2005
Human cytomegalovirus | Herpesviridae | 11 | 9 by cloning, 2 by computational prediction | Dunn et al., 2005; Grey et al., 2005; Pfeffer et al., 2005
Murine herpesvirus 68 | Herpesviridae | 9 | All by cloning | Pfeffer et al., 2005
Rhesus lymphocryptovirus | Herpesviridae | 16 | All by cloning | Cai et al., 2006
Herpes simplex virus 1 | Herpesviridae | 2 | 1 by computational prediction, 1 by a combination of computational prediction, cloning, and colony hybridization | Cui et al., 2006; Gupta et al., 2006
Marek's disease virus | Herpesviridae | 8 | All by cloning and deep sequencing | Burnside et al., 2006
Simian virus 40 | Polyomaviridae | 1 | Computational prediction | Sullivan et al., 2005
SA12 | Polyomaviridae | 1 | Computational prediction | Cantalupo et al., 2005; Sullivan et al., 2005
Murine polyomavirus | Polyomaviridae | 1 | Computational prediction | Sullivan et al., unpublished
Human immunodeficiency virus | Retroviridae | 1? | Cloning | Omoto and Fujii, 2005; Omoto et al., 2004

2. Computational Prediction of Viral miRNA Candidates

2.1. Principles of computational miRNA identification
Soon after the initial discovery of miRNAs, it was realized that bioinformatic methods hold great potential to aid in the identification of novel miRNAs, and a large number of miRNA prediction algorithms have been developed since then. By definition, miRNAs are small, noncoding RNAs generated by the RNase III-like enzymes Drosha and Dicer (Ambros et al., 2003).
They are initially transcribed as part of much longer precursor transcripts (pri-miRNAs), and structural analysis has shown that the region encoding the mature miRNA folds into a hairpin structure (the pre-miRNA) of approximately 60 to 80 nucleotides. Neither pre-miRNAs nor pri-miRNAs share any recognizable sequence motifs, and it is thus thought that the hairpin structures themselves serve as the primary signal that recruits Drosha and thereby initiates miRNA maturation. Accordingly, all miRNA-prediction algorithms use secondary structure analysis to identify potential pre-miRNA fold-back structures. Such predictions, however, are greatly complicated by the fact that hairpins are extremely abundant structures within any sequence. The problem, then, is to define features that distinguish pre-miRNAs from random hairpins or other structured RNA elements. According to a number of miRNA-defining criteria set forth by a consortium of researchers in 2003 (Ambros et al., 2003), miRNA precursor hairpins "should not contain large internal loops or bulges, particularly not large asymmetric bulges." Although fulfillment of these minimal criteria is considered sufficient to prove that a small RNA identified via cloning indeed represents a miRNA (as opposed to a small interfering RNA [siRNA]), they are certainly not stringent enough to serve as a basis for computational prediction methods. Therefore, most miRNA prediction programs employ scoring algorithms based on the statistical comparison of a reference set of known pre-miRNA hairpins versus a control collection of unrelated hairpins. Minimally, such comparisons include consideration of bulge size and symmetry, but many algorithms also analyze additional features (e.g., sequence composition, position of internal bulges, length of helices, hairpin symmetry). The number of analyzed features as well as how they are weighted in the calculation of a final score differs considerably between the various prediction methods, which can result in little overlap between the predictions obtained with different programs.
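The structural features mentioned above (bulge size and symmetry, helix length) can be read directly from a predicted secondary structure. The following is a minimal Python sketch, assuming the structure is supplied in dot-bracket notation (the output format of folding programs such as RNAfold) and forms a single unbranched hairpin. The function names and the choice of features are illustrative assumptions; they are not taken from VMir or any other published prediction program, and no scoring is attempted.

```python
def pair_table(structure):
    """Return sorted (i, j) base-pair positions from a dot-bracket string."""
    stack, pairs = [], []
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced structure")
    return sorted(pairs)


def hairpin_features(structure):
    """Summarize a single unbranched hairpin (hypothetical feature set)."""
    pairs = pair_table(structure)
    if not pairs:
        raise ValueError("no base pairs found")
    interruptions = []  # (unpaired on 5' arm, unpaired on 3' arm)
    for (i, j), (k, l) in zip(pairs, pairs[1:]):
        # Each pair must be nested inside the previous one.
        if not (i < k < l < j):
            raise ValueError("not a single unbranched hairpin")
        left, right = k - i - 1, j - l - 1
        if left or right:
            interruptions.append((left, right))
    inner_i, inner_j = pairs[-1]
    return {
        "base_pairs": len(pairs),
        "terminal_loop": inner_j - inner_i - 1,
        # Size of the largest bulge or internal loop (both arms summed).
        "largest_internal_loop": max((a + b for a, b in interruptions), default=0),
        # Largest difference between the two arms of any interruption.
        "largest_asymmetry": max((abs(a - b) for a, b in interruptions), default=0),
    }


features = hairpin_features("((((..((((....))))...))))")
print(features)
```

A real prediction program would compare such values against a reference set of known pre-miRNA hairpins and weight them into a score; which features are used and how they are weighted is exactly where the published methods differ.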
Although the predictions of some programs might be more accurate than others, it remains a fact that none of the currently available computational algorithms is able to reliably identify miRNAs based on structural analysis alone. Therefore, most miRNA prediction programs employ additional filtering methods to eliminate false-positive candidates. By far the most frequently used filter is evolutionary conservation: because the overwhelming majority of miRNAs are located in non-protein-coding regions, pre-miRNAs often register as isolated islands of high sequence conservation against the background of nonconserved DNA. Although such filters are very powerful, they have the obvious disadvantage of being unable to identify nonconserved miRNAs. Depending on the scope of the particular analysis, this trade-off of a gain in specificity at the cost of sensitivity is often an acceptable one, especially given that the majority of hitherto known animal and plant miRNAs indeed appear to be conserved.

However, for several reasons, such approaches are less suited for viral miRNAs. First, for many viruses, only very distant relatives are known, and thus suitable sequence information to conduct a comparative genome analysis is often not available. Second, viruses are highly adapted to their hosts, and even closely related viruses have frequently adopted different strategies to infect their hosts, for example by targeting different cell types or tissues. As a result, many viruses harbor genes or genomic segments that are not conserved or only very poorly conserved (e.g., the latency genes of many herpesviruses). It is reasonable to assume this is also true for viral miRNAs, and focusing only on evolutionarily conserved sequences thus might miss many miRNAs. These concerns notwithstanding, evolutionarily conserved viral miRNAs certainly exist: for example, EBV and rhesus lymphocryptovirus (RLV) share seven miRNAs (Cai et al., 2006).
However, the remaining EBV and lymphocryptovirus (LCV) miRNAs appear to be unique, and although a study that used phylogenetic comparison identified five miRNAs conserved between chimpanzee and human cytomegalovirus (CMV), it missed several others that are apparently not conserved (Grey et al., 2005). Lastly, viruses often use their coding capacity to the maximum, and it is not unusual to find overlapping open reading frames (ORFs) on both strands of a given region or even in different reading frames on the same strand. Thus, it can be expected that at least some viral miRNAs overlap with ORFs (as indeed some of the known ones do), and because coding regions are masked in searches for evolutionarily conserved miRNAs, such candidates would be missed.

For these reasons, it is preferable to use ab initio prediction methods to identify viral miRNAs. However, such methods are naturally less accurate in their predictions. Two strategies are generally used to cope with this problem. The first is to increase the stringency of the algorithm scoring the hairpin structure. This will result in a decreased rate of false-positive predictions at the expense of false negatives, such that bona fide miRNAs might be missed. The alternative is to perform a low-stringency primary prediction, thereby allowing relatively large numbers of false positives, followed by a high-throughput experimental screen to eliminate the false-positive candidates. The latter method is especially attractive for the identification of viral miRNAs because the small genome sizes of viruses will, even with minimal filtering, generate only a limited number of predictions that can be fitted easily on today's high-density microarrays. With these considerations in mind, we have previously developed a program called VMir to predict viral miRNAs.

2.2. Predicting viral miRNA candidates with VMir
VMir is a low-stringency prediction method for the identification of miRNAs in viral genomes (or other sequences up to approximately 2 Mb in size) and has an easy-to-use graphical user interface. The program is described in detail by Grundhoff et al. (2006). Briefly, VMir performs an analysis by sliding a 500-nt window in steps of 10 nt over the sequences of interest. Within each window, the program performs a structure prediction by minimal free energy folding (using the RNAfold algorithm [Hofacker et al., 1994]) and identifies individual hairpins above a given size cutoff (by default 45 nt). These hairpins are then scored based on a statistical comparison to a reference set of known pre-miRNA hairpins (see Grundhoff et al., 2006 for details). The scoring algorithm of the VMir program was designed to overpredict rather than underpredict potential pre-miRNA candidates in viral genomes. As indicated previously, the rationale behind this design was that the relatively small size of viral genomes would permit such nonstringent filtering, thereby minimizing the risk of false-negative predictions. The VMir program nevertheless incorporates several user-adjustable quality filters, which can be used to reduce the complexity of the prediction. As an example, Fig. 1.1A shows the primary output from an analysis of SV40, a relatively small virus with a genome of approximately 5 kbp in which a total of 109 hairpins are detected. The results can reasonably be filtered by setting a score cutoff of 115 because 95% of the known pre-miRNAs in the training set reach or exceed this value. Under these conditions, only 16 hairpins would remain, a number that can be easily tested by Northern blotting. Another method to filter the results is independent of the scoring algorithm and is instead based on the robustness with which the structures fold in different sequence contexts.
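The 500-nt window and 10-nt step described above determine how many windows are folded for a given sequence and how many of those windows can fully contain a hairpin at a given position. The following Python sketch illustrates that arithmetic only. It is not VMir code: the function names are hypothetical, the sequence is treated as linear, and only full-length windows are enumerated (how VMir treats sequence ends or circular genomes is not described here).

```python
WINDOW = 500  # window size in nt, as stated in the text
STEP = 10     # step size in nt, as stated in the text


def window_starts(seq_len, window=WINDOW, step=STEP):
    """0-based start positions of all full-length windows (assumed linear)."""
    if seq_len <= 0:
        return []
    if seq_len < window:
        return [0]  # a single, shorter window covering the whole sequence
    return list(range(0, seq_len - window + 1, step))


def windows_containing(start, length, seq_len, window=WINDOW, step=STEP):
    """Count windows that fully contain the hairpin [start, start + length)."""
    end = start + length
    return sum(
        1
        for s in window_starts(seq_len, window, step)
        if s <= start and end <= s + window
    )


# A hypothetical 5000-nt sequence and an 80-nt hairpin starting at position 1000.
print(len(window_starts(5000)))
print(windows_containing(1000, 80, 5000))
```

For a hairpin of length h that starts on the step grid and lies away from the sequence ends, the count is (500 - h)/10 + 1, which gives 43 for h = 80. A hairpin that starts off the grid fits in one window fewer, and a hairpin near either end of a linear sequence fits in fewer still.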
VMir analyzes windows that are significantly larger than the expected size of a pre-miRNA hairpin (60-80 nt). Furthermore, each sliding window overlaps with the previous one by 490 nt, and thus a hairpin may fold in multiple windows (e.g., provided it is not located at the extreme ends of the analyzed genome, a hairpin of 80 nt will have the opportunity to fold in up to 43 windows). Because significantly stable structures are expected to fold in the majority of sequence contexts represented by the different windows, the number of windows in which a given hairpin is detected (referred to as the window count) can be used as a quality criterion. We consider this method preferable to a filter based on minimal free folding energy values (MFEs) alone because, depending on their nucleotide composition and length, even random sequences can have quite low, and therefore seemingly significant, MFEs. With a window size of 500 nt, we typically use window count cutoffs between 10 (least stringent) and 35 (most stringent). Under the most stringent conditions, only two candidates (one of them the bona fide SV40-miR-S1 [Sullivan et al., 2005]) remain in the analysis of the SV40 genome (Fig. 1.1B, filled triangle).

Of course, larger genomes will produce significantly more candidates, and even stringent filtering might be insufficient to reduce the number of candidates such that each of them can be verified individually by Northern blotting. Figure 1.1C shows the output from a VMir analysis of the KSHV long unique region (LUR), which is ~140 kbp in size. A total of 3046 hairpins register in the primary analysis, and 146 of these survive the stringent filtering method (Fig. 1.1D). Out of the 12 pre-miRNAs known to be encoded by KSHV (Cai et al., 2005; Grundhoff et al., 2006; Pfeffer et al., 2005; Samols et al., 2005), 10 are contained in the filtered prediction (shown as black diamonds in Fig. 1.1D), and 8 of these map to the top 20 scoring hairpins. Although the latter would surely have been identified in an experimental verification attempt based on the top scoring candidates alone, two pre-miRNAs with lower scores of 134 and 144 would likely have been missed. Likewise, two pre-miRNA hairpins that fold in fewer than 35 windows, and thus were eliminated from the analysis, would also have evaded detection (their positions are indicated by asterisks marked with arrows in Fig. 1.1D). Thus, if Northern blotting were used as the sole confirmatory test, bona fide miRNAs might be missed due to the stringent filtering necessary to produce manageable numbers of candidates. In contrast, microarray analysis provides the opportunity to investigate much larger numbers of miRNA candidates. Depending on the size of the viral genome and the capacity of the microarray format used, it is possible to screen the totality of all hairpin structures within a viral genome (which VMir will report if no filter is used). Thus, such an analysis can serve as a confirmation method that has the capacity to detect all miRNAs, regardless of the score awarded by the prediction algorithm. In the following section, we will provide a detailed protocol on how to design and analyze such microarrays.

Figure 1.1 VMir analysis of the SV40 (A, B) and KSHV (C, D) genomes. Hairpins are plotted according to genomic location (x-axis: position) and VMir score (y-axis). A and C represent the unfiltered output from the VMir prediction, whereas the results were subjected to stringent filtering (score and window count cutoffs of 115 and 35, respectively) to produce the plots shown in (B) and (D). Triangles and diamonds represent hairpins in direct or reverse orientation, respectively. Confirmed pre-miRNAs are shown as filled black symbols. In (D), the positions of two pre-miRNA hairpins that did not pass the stringent filtering are indicated by asterisks marked with arrows. The analysis of the KSHV genome represents data reported in Grundhoff et al. (2006), from which (D) was partially reproduced (with permission). Analysis of the SV40 genome was performed with the current version of VMir (v1.5).

3. Array Confirmation of Viral miRNA Candidates
The procedure described here can be divided into three steps: First, custom arrays are designed with the aid of the VMir software. These arrays are then hybridized to small RNAs isolated from infected cells, and the
