Janice et al. Biology Direct 2013, 8:4
http://www.biology-direct.com/content/8/1/4

DISCOVERY NOTES   Open Access

Surprisingly high number of Twintrons in vertebrates

Jessin Janice, Marcin Jąkalski and Wojciech Makałowski*

Abstract

Twintrons represent a special intronic arrangement in which introns of two different types occupy the same gene position. Consequently, alternative splicing of these introns requires two different spliceosomes competing for the same RNA molecule. So far, only two twintrons have been described, both in insects. Surprisingly, we discovered several such arrangements in vertebrate genomes, which are quite conserved throughout the lineages.

Reviewers: This article was reviewed by Fyodor Kondrashov and Eugene Koonin.

Keywords: Twintrons, Vertebrate genomes, Gene expression

Findings

Most eukaryotic protein coding genes are interrupted by non-coding regions called introns [1], which are removed from pre-mRNA by a complex macromolecular machinery called the spliceosome [2]. Interestingly, two types of spliceosomal introns exist that are processed by two distinct complexes. The major spliceosome recognizes and excises most of the introns (in humans, about 99.5% of the introns), while the rest are processed by the minor spliceosome. The two classes of introns are named after major RNA components of these spliceosomes: U2-type and U12-type introns, respectively [3]. Although the overall splicing mechanism of the two types of introns is very similar and the two spliceosomes share some components, it is believed that the two systems originated independently at different points in eukaryotic evolution [4]. It is intriguing that two types of introns can coexist in the same gene, which means that two large nucleoprotein complexes must operate simultaneously on a single pre-mRNA molecule. Even more surprising is the existence of so-called twintrons. We define a twintron as an arrangement in which alternatively spliced U12-type and U2-type introns occupy the same genomic location and are processed by different spliceosomes. Consequently, two spliceosomes must compete over the same RNA region to process a pre-mRNA (see Figure 1). This definition doesn't imply any specific spatial relation between the two types of introns, e.g. they don't need to be nested one into another. In fact, one-third of the twintrons reported here are shifted, meaning that, for instance, both the 5' and 3' splice sites of the U12-type intron lie upstream of the U2-type splice sites (Figure 1 insert).

The spliceosomal twintron system was described for the first time in the gene prospero of Drosophila melanogaster [5]. The second intron of the gene contains two sets of splice sites (SS): a U12-type intron with AT-AC termini is nested within a U2-type intron with GT-AG termini, resulting in a protein that is twenty-nine amino acids longer [6]. The U12-type intron of the prospero gene is an ancestral one, while the U2-type splice sites appeared early in the evolution of insects [7]. Recently, we have reported another insect-specific twintron in the ZRSR2 gene. However, in this case, the two sets of splice sites are not nested but shifted by several dozen nucleotides and consequently, the two protein isoforms are of a similar size [7]. Interestingly, the ancestral intron in this position was of the U2-type and the U12-type one is the first known case of de novo origination of a minor type intron. Nevertheless, we have hypothesized that the twintron arrangement is a safe pathway of intron type switching in either case. To further test our hypothesis, we expanded the search for twintrons into well-annotated vertebrate genomes.
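The nested and shifted arrangements described above reduce to a comparison of two intervals. The following is a minimal sketch in Python, not the procedure used in this study: the `Intron` record, its field names, the coordinate convention and the `classify_arrangement` function are all hypothetical and introduced only for illustration. It assumes that both introns lie on the same transcript, that coordinates increase from 5' to 3', and that two introns sharing one splice site count as nested.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intron:
    """Hypothetical intron record (coordinates increase from 5' to 3')."""
    kind: str       # "U2" or "U12"
    five_ss: int    # position of the 5' splice site
    three_ss: int   # position of the 3' splice site


def contains(outer: Intron, inner: Intron) -> bool:
    """True if both splice sites of `inner` lie within `outer`."""
    return outer.five_ss <= inner.five_ss and inner.three_ss <= outer.three_ss


def classify_arrangement(a: Intron, b: Intron) -> str:
    """Return 'nested', 'shifted' or 'not a twintron' for a pair of introns."""
    # A twintron pairs introns processed by different spliceosomes.
    if a.kind == b.kind:
        return "not a twintron"
    # The two introns must overlap to occupy the same position.
    if a.three_ss <= b.five_ss or b.three_ss <= a.five_ss:
        return "not a twintron"
    # Nested: one intron lies entirely inside the other.
    if contains(a, b) or contains(b, a):
        return "nested"
    # Otherwise both splice sites of one intron lie upstream of the
    # corresponding splice sites of the other.
    return "shifted"


if __name__ == "__main__":
    # Illustrative coordinates only; they are not taken from any real gene.
    print(classify_arrangement(Intron("U2", 100, 900), Intron("U12", 300, 700)))
    print(classify_arrangement(Intron("U12", 100, 500), Intron("U2", 160, 560)))
```

The sketch checks only the spatial relation and the intron type labels; establishing that both introns are actually used as alternatives would still require transcript evidence.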
Figure 1 Schematic representation of a twintron. In most cases a set of splicing sites for one type of spliceosome is nested in a set of splicing signals of another type of spliceosome (major cartoon). However, in a number of cases the splicing signals of the two spliceosomes are shifted, i.e. both the 5' and 3' splicing signals of one spliceosome lie upstream of the splicing signals of another spliceosome (embedded cartoon).

A comprehensive scan of the human genome revealed eighteen twintronic arrangements within different genes (see Table 1). Interestingly, six of these twintrons consist of multiple alternative U2-type introns, with as many as seven U2-type intron variants in the PRMT1 gene. Phylogenetic analysis revealed that for all these eighteen twintrons, the ancestral intron was of the U12-type and their presence is highly conserved throughout the vertebrate genomes (Additional file 1: Table S1). Two of the U12-type introns are even seen in the genome of D. melanogaster.

Table 1 Details of the human genes harboring a twintron

| Gene | Function | Length of U12 variant (in aa) | Length of U2 variant (in aa)* |
|---|---|---|---|
| ACTR10 | This gene encodes actin involved in microtubule-based movement. | 417 | 219 (U2-a), 219 (U2-b) |
| C19orf54 | This gene encodes an uncharacterized phosphoprotein. | 351 | 139 |
| C1orf112 | Function unknown. | 718 | 853 (U2-a), 606 (U2-b) |
| C3orf17 | Function unknown. | 567 | 392 (U2-a), 400 (U2-b) |
| CTNNBL1 | Although the function of this protein has not been determined, the C-terminal portion of the protein has been shown to possess apoptosis-inducing activity. | 563 | 376 |
| CUL4A | This gene encodes a ubiquitin ligase component of a multimeric complex involved in the degradation of DNA damage-response proteins. | 789 | 149 |
| ESRP1 | This gene encodes an RNA-binding protein that is an epithelial cell-type-specific splicing regulator. | 742 | 206 |
| HNRPLL | This gene encodes an RNA-binding protein regulating activation-induced alternative splicing in T cells. | 537 | 536 |
| NCBP2 | Component of the cap-binding complex (CBC), which binds to the monomethylated 5' cap of nascent pre-mRNA in the nucleoplasm. The encoded protein has an RNP domain commonly found in RNA binding proteins, and contains the cap-binding activity. The CBC promotes pre-mRNA splicing, 3'-end processing, RNA nuclear export, and nonsense-mediated mRNA decay. | 156 | 103 |
| PCID2 | This gene is expressed in immature and early-stage B lymphocytes and regulates expression of the mitotic checkpoint protein MAD2. | 399 | 453 (U2-a), 292 (U2-b) |
| PRMT1 | This gene encodes an arginine methyltransferase that is responsible for the majority of cellular arginine methylation activity. Increased expression of this gene may play a role in many types of cancer. | NMD | 371 (U2-a), 353 (U2-b), 346 (U2-c), 325 (U2-d), 213 (U2-e), 192 (U2-f) |
| SLC9A7 | This gene encodes a sodium and potassium/proton antiporter that is a member of the solute carrier family 9 protein family. It is primarily localized to the trans-Golgi network and is involved in maintaining pH homeostasis in organelles along the secretory and endocytic pathways. | 725 | 727 |
| SPAG16 | This gene encodes a protein kinase binding protein associated with the axoneme of the sperm tail. | 631 | 577 |
| SSR3 | This gene encodes the gamma subunit of a glycosylated endoplasmic reticulum (ER) membrane receptor associated with protein translocation across the ER membrane. | 198 | 174 |
| TAPT1 | This gene encodes a highly conserved, putative transmembrane protein. A mutation in the mouse ortholog of this gene results in homeotic, posterior-to-anterior transformations of the axial skeleton, which are similar to the phenotype of mouse homeobox C8 gene mutants. | 567 | 338 |
| TTLL9 | This gene encodes a tubulin tyrosine ligase-like protein that forms polyglutamate side chains on tubulin. | 347 | 439 |
| UBE2H | This gene encodes a member of the ubiquitin-conjugating enzyme family. The modification of proteins with ubiquitin is an important cellular mechanism for targeting abnormal or short-lived proteins for degradation. | 183 | 149 |
| ZNF207 | This gene encodes an uncharacterized zinc finger protein expressed in cultured breast cancer cells. | 494 | 95 (U2-a), 74 (U2-b) |

*In some cases, multiple U2-type introns are spliced from a single intronic position.

We investigated several chordate genomes for twintron presence at the genomic and transcript levels. Surprisingly, comparative genomic analysis revealed a high evolutionary depth of the twintronic arrangements in vertebrates (Additional file 1: Table S1). In four cases, twintrons are apparent as far back as the lamprey genome, and in a few cases, they are also evident in amphibians.

One of the interesting genes harboring a twintron is PRMT1 (protein arginine N-methyltransferase 1), which functions as a histone methyltransferase specific for H4 [8]. There are more than twenty different splice variants reported for this gene. The second intronic position in most of the transcripts harbors a twintron where a U12-type intron is nested within a U2-type intron. This intronic region is excised in seven different ways, including an AT-AC U12-type intron. Although the 3' SS is similar for all the introns except the U12-type intron, the 5' SS varies extensively. The length of the introns also differs, ranging from 209 nt for the shortest intron to 4,226 nt for the longest one. Interestingly, the U12-type intron belongs to the AT-AC type with an unusual AA terminus at the 3' SS. Upon splicing, it produces a splice variant that contains a premature termination codon (PTC) and consequently is subjected to nonsense-mediated mRNA decay (NMD) in both the human and mouse genomes. Although the conserved motifs of the minor intron are present in several vertebrate genomes, including opossum and Xenopus, we found solid evidence of U12-type intron splicing only in humans and mice. Interestingly, in the platypus genome the splicing signals for the U12-type spliceosome have been muted and cannot be recognized by a minor spliceosome any longer. This may suggest that a twintron arrangement was a mediator of U12-type intron elimination from the host gene, in agreement with our original hypothesis [7].

To elucidate the role of alternative SSs in protein architecture, we scanned protein splice variants with InterProScan. Most of the protein isoforms of twintrons did not show any changes in the conserved motifs and structures, except for HNRPLL and NCBP2. Although the protein product of HNRPLL shows slight variations in the RNA recognition motif (RRM) for major and minor intron splice variants, the changes in the protein sequence and structure are insignificant (Additional file 2: Figure S1). The only gene that shows key structural variation is Nuclear Cap Binding Protein 2 (NCBP2), which has an RRM from the 42nd to the 112th amino acid of the U12-type splice variant (PDB ID: 3FEX) [9]. When the U2-type intron is spliced, a major portion of the RRM is removed as a part of the U2-type intron (Additional file 3: Figure S2), most likely leading to a failure in binding with CAP80 to form the Cap-Binding Complex (CBC).

To comprehend the effect of twintrons on gene function and regulation, we looked at the expression patterns of all the twintronic protein isoforms. In many cases, the newly synthesized U2 splice variants are associated with cancerous tissues and in a few cases show tissue-specific expression. This is especially evident in testicular tissue, as most newly evolved genes seem to be preferentially expressed in testes [10,11] (Additional file 4: Table S2). The U12-type splice variant of the gene NCBP2 is expressed mainly in brain, thymus, uterus, lungs, testis, and several other tissues, whereas the U2-type splice variant is expressed mostly in tumors and cancerous tissues (Additional file 4: Table S2). NCBP2 forms a heterodimer with CAP80 and plays a key role in the biogenesis of mRNAs, snRNAs, and microRNAs, and also in NMD. By characterizing the U2 variant of NCBP2, Pabis et al. have discovered its physiological function in RNA processing [12]. U2 isoforms show a precise subcellular distribution and associations with active transcription sites and RNA processing proteins, thus displaying several properties of RNA processing factors. Hence, the U2 splice variant may also play vital roles in RNA polymerase II transcription and/or co-transcriptional mRNA processing [12]. This gene serves as one of the best examples of twintron regulation and utility.

Only two spliceosomal twintrons have been reported previously [3,6,7]. Moreover, both are limited to insect genomes. Surprisingly, our scan of vertebrate genomes resulted in the discovery of several twintrons in higher animals. We expect that with the increase in transcriptomic and expression data, more twintrons will be found in the near future. As hypothesized previously, a twintron arrangement may serve as a safe pathway in intron type switching. Although we did not find clear evidence that this pathway was actually utilized, the PRMT1 gene case may suggest that such a process happens in intron evolution. While there is a high chance of a splicing error in a gene with signals for both U2 and U12-type introns at the same position, the described twintrons are phylogenetically conserved, indicating their vital, yet elusive, role in the cell. Further analyses of the twintronic system should shed more light on the evolutionary importance of this fascinating phenomenon.

Reviewers' comments

Reviewer 1 (Dr. Fyodor Kondrashov, Centre for Genomic Regulation, Spain)

This is a quaint study of the distribution of an interesting genomic element: nested introns where one of the introns is excised by the U-2 splicing system and the other by the U-12 system. It appears that over a dozen of such cases can be found throughout genes found in vertebrate genomes, some of them conserved throughout the clade.

I have two points and a question.

First, it is the definition of the term twintron. I am not a fan of this word, I think nested introns would have been a more descriptive term. Unfortunately for my sense of semantic taste I found that this term is defined enough to appear in Wikipedia. It appears that the term was introduced in 1991 by authors that discovered nested group II introns in Euglena. Thus, according to the original (and Wikipedia) definition a twintron is any set of nested introns, belonging to the same splicing mechanism or not. This is at odds with the definition used by the authors and perhaps should be resolved. Perhaps a figure that demonstrates what a twintron looks like is called for: it would have been clearer to me what the authors mean.

Authors' response: Our definition of twintrons differs slightly from the original one and includes both nested and shifted arrangements. Although we provide a short twintron definition in the abstract, the full one is now provided in the body of the paper and accompanied by a figure.

Second, the authors suggest that having such nested introns that are excised by two different spliceosomes can be an evolutionary mechanism of switching between the two intron types. However, perhaps this is at odds with the apparent conservation of such a setup - if this is a "safe pathway for intron switching" then certainly it does not appear to be a neutral one. Additionally, the mechanism that could turn an internal exon into an external one in a nested situation is not immediately clear to me.

Authors' response: We think that either of the two introns can be switched off and this might be a random process. We agree that the presented examples don't provide direct evidence that such a switch occurs. However, the fact that the twintron arrangement is a more common phenomenon than anticipated provides indirect evidence that such a mechanism could be used during gene structure evolution.

Is there a preference for U-2 or U-12 introns to be the external ones in the nested setup?

Authors' response: No, there is no bias in the arrangement of the two intron types. In six cases the U12-type intron is the internal one, while in four cases the arrangement is reversed. The rest of the twintrons display a shifted arrangement.

Reviewer 2 (Dr. Eugene Koonin, National Institutes of Health, USA)

This is quite an interesting short paper that reports a number of previously unnoticed twintrons and most importantly demonstrates their evolutionary conservation at considerable phylogenetic depths, with the implication of functional importance of the twintron structure itself. This is the major finding of the work, and it is certainly valuable. I am less enthusiastic of the two hypotheses proposed in the article, namely that twintrons could be an important intermediate along the path of elimination of U12 introns and that multiple protein isoforms produced by expression of twintron-containing genes might play a role in carcinogenesis. The first hypothesis, which is one of the main themes in the article, is of interest but I find the evidence quite limited. The authors might wish to expand the discussion. The idea about carcinogenesis, to me, is sheer, unwarranted speculation. The presence of additional splice variants in cancer samples might be caused by a variety of factors, above all the general deterioration of regulatory processes in tumors, and have nothing to do with carcinogenesis. I am not sure this is even worth a mention.

Authors' response: We agree that the two hypotheses are highly speculative. As suggested, we have expanded the discussion of the first hypothesis and removed the second one from the manuscript.

Additional files

Additional file 1: Table S1. Conservation of major and minor splice sites in vertebrates.

Additional file 2: Figure S1. Superimposed 3D structures of U12 and U2-type splice variants of the gene HNRPLL. The structure is colored based on the secondary structure: red color for alpha-helices and yellow for beta-sheets. A black arrow indicates the variable amino acids.

Additional file 3: Figure S2. Superimposed 3D structures of U12 and U2-type splice variants of the gene NCBP2. The 3D structure is colored based on the secondary structure: red color for alpha-helices and yellow for beta-sheets. The RRM domain missing in the U2 splice variant is shown in blue.

Additional file 4: Table S2. Expression data of splice variants of twintrons.

Competing interests

The authors declare that they have no competing interests.

Authors' contribution

WM conceived and designed the project. JJ and MJ performed genomic analysis and JJ drafted the manuscript. All the authors contributed to the final version of the manuscript. All authors read and approved the final manuscript.

References

5. Otake LR, Scamborova P, Hashimoto C, Steitz JA: The divergent U12-type spliceosome is required for pre-mRNA splicing and is essential for development in Drosophila. Mol Cell 2002, 9:439–446.
6. Scamborova P, Wong A, Steitz JA: An intronic enhancer regulates splicing of the twintron of Drosophila melanogaster prospero pre-mRNA by two different spliceosomes. Mol Cell Biol 2004, 24:1855–1869.
7. Lin CF, Mount SM, Jarmolowski A, Makalowski W: Evolutionary dynamics of U12-type spliceosomal introns. BMC Evol Biol 2010, 10:47.
8. Wang H, Huang ZQ, Xia L, et al: Methylation of histone H4 at arginine 3 facilitating transcriptional activation by nuclear hormone receptor. Science 2001, 293:853–857.
9. Dias SM, Wilson KF, Rojas KS, Ambrosio AL, Cerione RA: The molecular basis for the regulation of the cap-binding complex by the importins. Nat Struct Mol Biol 2009, 16:930–937.
10. Kaessmann H: Origins, evolution, and phenotypic impact of new genes. Genome Res 2010, 20:1313–1326.
11. Szczesniak MW, Ciomborowska J, Nowak W, Rogozin IB, Makalowska I: Primate and rodent specific intron gains and the origin of retrogenes with splice variants. Mol Biol Evol 2011, 28:33–37.
12. Pabis M, Neufeld N, Shav-Tal Y, Neugebauer KM: Binding properties and dynamic localization of an alternative isoform of the cap-binding complex subunit CBP20. Nucleus 2010, 1:412–421.

doi:10.1186/1745-6150-8-4
Cite this article as: Janice et al.: Surprisingly high number of Twintrons in vertebrates. Biology Direct 2013 8:4.