Academic Press is an imprint of Elsevier
The Boulevard, Langford Lane, Kidlington, Oxford, OX5 1GB, UK
32 Jamestown Road, London NW1 7BY, UK
Radarweg 29, PO Box 211, 1000 AE Amsterdam, The Netherlands
225 Wyman Street, Waltham, MA 02451, USA
525 B Street, Suite 1800, San Diego, CA 92101-4495, USA

First edition 2013
Copyright © 2013 Elsevier Inc. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means electronic, mechanical, photocopying, recording or otherwise without the prior written permission of the publisher.

Permissions may be sought directly from Elsevier’s Science & Technology Rights Department in Oxford, UK: phone: (+44) (0) 1865 843830; fax: (+44) (0) 1865 853333; email: permissions@elsevier.com. Alternatively you can submit your request online by visiting the Elsevier website at http://elsevier.com/locate/permissions, and selecting, Obtaining permission to use Elsevier material.

Notice
No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein. Because of rapid advances in the medical sciences, in particular, independent verification of diagnoses and drug dosages should be made.

ISBN: 978-0-12-410523-2
ISSN: 1876-1623

For information on all Academic Press publications visit our website at store.elsevier.com

Printed and bound in USA

PREFACE

Chromosomes undergo structural reorganizations throughout the cell cycle to enable their phase-specific functions. These reorganizations are very complex and numerous. Thus, in this volume of the Advances in Protein Chemistry and Structural Biology dedicated to the organization of chromosomes, we discuss only a few aspects of chromosome and cytoskeleton structural alterations enabling chromosome rearrangements.

Chapters 1 and 2 discuss the role of tandemly repeated DNA sequences in chromosome organization and function. The tandemly organized highly repetitive satellite DNA is the main DNA component of centromeric/pericentromeric constitutive heterochromatin.
For a very long time, these sequences were considered “junk DNA.” However, it is now established that each chromosome possesses a unique code made up of different large tandem repeats. Tandem repeat multiformity tunes the developed nuclei 3D pattern by sequential steps of associations. Some satellite DNAs are also found to be transcribed during different phases of development and the cell cycle, suggesting their importance for chromosome reorganization.

In Chapters 3 and 4, detailed reviews on the role of histone variants and zinc-finger proteins in elucidating the structural dynamics of nucleosomes and chromosomes are presented. The presence of variant histone proteins in the nucleosome core raises the functional diversity of the nucleosomes in gene regulation and has profound epigenetic consequences, which are of great importance for understanding fundamental issues such as the assembly of variant nucleosomes and chromatin remodeling. Zinc-finger DNA-binding proteins have been shown to be important organizers of the 3D structure of chromosomes and as such are called the master weavers of the genome.

Chapter 5 focuses on the latest developments in unconventional actin configurations. The existence of actin in the nucleus was first reported more than 40 years ago. However, the idea of a nuclear actin met with skepticism for decades, because nuclear preparations were thought to be contaminated with actin from the cytoplasm. After the identification of actin in several nuclear complexes, which implicates it in diverse nuclear activities including transcription, chromosome remodeling, and nucleocytoplasmic trafficking, skepticism has finally given way to the challenging search for specific states and/or unique conformations that actin may adopt to fulfill its nuclear functions.
Finally, Chapter 6 presents our current understanding of chromatin condensation during mitosis, with particular attention to the major molecular players that trigger and maintain this condensation. Furthermore, the mechanisms that ensure that mitotic chromosome condensation and the block of transcription during mitosis do not wipe out the cell identity are also discussed.

It is my sincere hope that this overview of the basic mechanisms involved in the regulation of the dynamic chromosome organization will inspire future translational studies focusing on the fine regulation of the dysregulated cell cycle in diseases such as cancer, Alzheimer’s, etc.

ROSSEN DONEV
Institute of Life Sciences, College of Medicine
Swansea University, United Kingdom
Clinic and Polyclinic for Psychiatry and Psychotherapy
Rostock University, Germany

CHAPTER ONE

Large Tandem Repeats Make up the Chromosome Bar Code: A Hypothesis

Olga Podgornaya*,†,1, Ekaterina Gavrilova†, Vera Stephanova*, Sergey Demin*, Aleksey Komissarov*
* Institute of Cytology RAS, St. Petersburg, Russia
† Faculty of Biology and Soil Sciences, St. Petersburg State University, St. Petersburg, Russia
1 Corresponding author: e-mail address: [email protected]

Contents
1. Introduction
2. Satellite DNA in the Mouse Genome
2.1 Satellite DNA and tandem repeats
2.2 Mouse genome as a model for the tandem repeat search
2.3 Lack of tandem repeats in the reference genome
2.4 Large tandem repeats in WGS assemblies
2.5 Large tandem repeat graphic representation
2.6 TEs-related tandem repeats
2.7 Family TRPC-21A-MM
3. Tandem Repeat Position Defined by FISH
3.1 Chromosomes of bone marrow cells: first set of probes
3.2 Chromosomes of embryonic fibroblasts: second set of probes
3.3 Hybridization summary
4. “Bar Code” Hypothesis
4.1 Internal large tandem repeats in human and mouse reference genomes
4.2 Proteins specific in binding to tandem repeats
4.3 RNA-mediated mechanism of tandem repeat association
4.4 “Bar code” as morphogenetic program
Acknowledgments
References

Advances in Protein Chemistry and Structural Biology, Volume 90
© 2013 Elsevier Inc. All rights reserved.
ISSN 1876-1623
http://dx.doi.org/10.1016/B978-0-12-410523-2.00001-8

Abstract

Much of tandem repeats’ functional nature in any genome remains enigmatic because only a few tools are available for dissecting and elucidating the functions of repeated DNA. The large tandem repeat arrays (satellite DNA) found in two mouse whole-genome shotgun assemblies were classified into 4 superfamilies, 8 families, and 62 subfamilies. With a simplified variant of chromosome positioning of different tandem repeats, we observed a nonuniform distribution rather than the positions reported for mouse major and minor satellites. Each chromosome appears to possess a kind of unique code made up of different large tandem repeats. The reference genomes allow marking of internal tandem repeats only, and even with such limited data, the colored “bar code” made up of tandem repeats is visible. We suppose that tandem repeats bear the mechanism by which chromosomes recognize the regions to be associated. The associations, initially established via RNA, become fixed by histone modifications (the histone or chromatin code) and specific proteins. In this way, associations that are flexible and regulated (that is, adjustable) at the beginning become irreversible and heritable through cell generations. Tandem repeat multiformity tunes the developed nuclei 3D pattern by sequential steps of associations. A tandem repeat-based chromosome bar code could be the carrier of genome structural information; that is, the order of precise tandem repeat association is the DNA morphogenetic program. Tandem repeats are the cores of the distinct 3D structures postulated in the “gene gating” hypothesis.

1. INTRODUCTION

Tandemly repeated sequences (TRs) are common in higher eukaryotes and can comprise up to 10% of the genomes. Much of TRs’ functional nature in any genome remains enigmatic because only a few tools are available for dissecting and elucidating the functions of repeated DNA (Blattes et al., 2006). It is supposed that a certain amount of constitutive heterochromatin is essential in multicellular organisms at two levels of organization: chromosomal and nuclear. At the chromosomal level, TRs can be of profound structural as well as evolutionary importance, as genomic regions with a high density of TRs (e.g., telomeric, centromeric, and heterochromatic regions) often have specific properties such as alternative DNA structure and packaging (Kobliakova, Zatsepina, Stefanova, Polyakov, & Kireev, 2005; Podgornaya, Voronin, Enukashvily, Matveev, & Lobov, 2003; Ushiki & Hoshi, 2008; Vogt, 1990).

At the nuclear level of organization, constitutive heterochromatin may help maintain the proper spatial relationships necessary for the efficient operation of the cell through the stages of mitosis and meiosis. The association of satellite (or other highly repetitive) DNA with constitutive heterochromatin is understandable, as it stresses the importance of the structural rather than transcriptional roles of these entities. In the interphase nucleus, satellite DNAs (satDNAs) have one property in common despite their species specificity, namely, heterochromatization (Yunis & Yasmineh, 1971). Heterochromatization appears to involve RNA interference-mediated chromatin modifications (Alleman et al., 2006; Martienssen, 2003). The RNAi pathway is required for the formation of heterochromatin and silencing of repetitive sequences in Drosophila melanogaster (Grewal & Elgin, 2007). In mammalian cells, an RNA component is required for the association of HP1 with pericentric heterochromatin (Lu & Gilbert, 2007; Maison et al., 2002; Muchardt et al., 2002).
Moreover, the strand-specific burst in transcription of pericentric satellites is required for chromocenter formation and early mouse development. Specific expression dynamics of major satellite (MaSat) repeats, together with their strand-specific control, represent necessary mechanisms during a critical time window in preimplantation development that are of key importance to consolidate the maternal and to set up the paternal heterochromatic state at pericentric domains (Probst et al., 2010). This crucial finding rests on the known sequence of the mouse MaSat. Most of the other mouse TRs could not be tested in similar experiments because they remain undescribed and unclassified.

The analysis of the TR content of genomes is important for understanding the evolution and organization of genomes as well as gene expression. Computational approaches to TR analysis at the genome level have gradually appeared as genome sequencing has advanced (Alkan et al., 2007; Ames, Murphy, Helentjaris, Sun, & Chandler, 2008; Mayer, Leese, & Tollrian, 2010; Warburton et al., 2008). The current review is dedicated to large TRs, or satellite DNA.

2. SATELLITE DNA IN THE MOUSE GENOME

2.1. Satellite DNA and tandem repeats

SatDNAs were identified about 50 years ago as an additional, “satellite,” fraction of genomic DNA during equilibrium centrifugation in a CsCl gradient (Kit, 1961, 1962). SatDNAs have been found in all the higher eukaryotic species investigated by advanced centrifugation methods (Beridze, 1986). Reassociation kinetics tests show that satDNA is a highly repetitive sequence that makes up to 10% of the total DNA (Britten & Kohne, 1968; Corneo, Ginelli, & Polli, 1967, 1970). Cytologically distinct domains for satDNAs were first shown for mouse MaSat, whose localization was initially determined as centromeric (Pardue & Gall, 1970). Simple sequence satellites with monomers of 5–15 bp have been found with the same method of equilibrium centrifugation of human DNA (Gosden, Mitchell, Buckland, Clayton, & Evans, 1975).
The invention of restriction endonucleases allowed other satDNA families to be found. The main human alpha-satDNA was found with the EcoRI and XbaI restriction endonucleases (Jabs, Wolf, & Migeon, 1984; Manuelidis, 1976, 1978a, 1978b). Long satDNA blocks of up to millions of base pairs are built of relatively short monomers, from several up to hundreds of base pairs, tandemly arranged “head to tail.” In due time, the term “satDNA” was broadened to comprise nearly all tandemly organized sequences found, including mini- and microsatellites, which are located in euchromatin. This causes some confusion, because mini- and microsatellites possess features quite different from classical satDNA (López-Flores & Garrido-Ramos, 2012; Vogt, 1990).

Tandemly repeated DNA makes up a significant portion of the mammalian genome. However, in the assembled genomes there is nothing for satDNA to abut or to be a “satellite” of; the arrays of tandem repeats simply continue the euchromatic gene-containing regions. The term “large tandem repeats” now seems more adequate, and we use it in this text. The term “satDNA” cannot be abandoned entirely, because some specific large tandem repeats carry it in their names for historical reasons (e.g., mouse MaSat, human satellite 3 (HS3)). On the other hand, the bulk of the information obtained about classical satDNA by the scientific community is quite applicable to the large tandem repeats.

Sequences of satDNAs differ between species, and the level of their evolutionary variability is high (Beridze, 1986; Podgornaia, Ostromyshenskiĭ, Kuznetsova, Matveev, & Komissarov, 2009). However, they have some common structural features (Fitzgerald, Dryden, Bronson, Williams, & Anderson, 1994; Lobov, Tsutsui, Mitchell, & Podgornaya, 2001; Martínez-Balbás et al., 1990), and their positions on the chromosomes are fixed. A great part of the satDNAs of different families reside in the pericentromeric or subtelomeric chromosome regions.
All the eukaryotic centromeric sequences investigated up to now, with the exception of the simple Saccharomyces cerevisiae centromeres, contain satDNA as the main component (Choo, 1997; Pezer, Brajković, Feliciello, & Ugarković, 2012).

2.2. Mouse genome as a model for the tandem repeat search

Tandem repeat content at the genome level is well investigated in the human genome and shows a wide range of repeat sizes and organizations, ranging from microsatellites of a few base pairs to megasatellites of up to several kilobases. Microsatellites and variable number of tandem repeats (called VNTRs or minisatellites) can be highly polymorphic and have an important use as genetic markers (Ames et al., 2008; Warburton et al., 2008).

The centromeric region of human chromosomes contains alpha-satDNA, the largest TR family in the human genome. This has been extensively studied and provides a paradigm for understanding the genomic organization of TRs (Schueler, Higgins, Rudd, Gustashaw, & Willard, 2001). Its arrays are composed either of diverged monomers with no detectable higher-order structure or of chromosome-specific higher-order repeat (HOR) units characterized by distinct repeating linear arrangements of an integral set of basic monomers (Rudd & Willard, 2004). The HOR structure of human centromeric alpha-satellite has been implicated as important in centromere function (Schueler et al., 2001).

In humans, the pericentromeric regions consist of alpha-satDNA arrays that are surrounded by arrays of “classical” satellites (e.g., human satDNA 1–4) (Choo, 1997; Lee, Wevrick, Fisher, Ferguson-Smith, & Lin, 1997; Moyzis et al., 1987; Podgornaya, Bugaeva, Voronin, Gilson, & Mitchell, 2000; Prosser, Frommer, Paul, & Vincent, 1986). These pericentromeric regions have a specific higher-order chromatin structure and might be responsible for chromatin spatial organization.
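The distinction drawn above between monomeric arrays and higher-order repeat (HOR) arrays can be illustrated with a toy model. The following Python sketch is only an illustration under stated assumptions: each monomer is reduced to a single letter standing for a sequence variant, variants are compared by exact identity, and the function name `repeat_unit_size` is hypothetical. Real alpha-satellite monomers are diverged, so actual HOR detection relies on sequence alignment rather than exact matching.

```python
from typing import Optional, Sequence


def repeat_unit_size(monomers: Sequence[str]) -> Optional[int]:
    """Return the smallest number of consecutive monomers that repeats
    along the whole array, or None if no such block occurs at least twice.

    Toy model (assumption): monomer variants are plain labels and two
    monomers match only when their labels are identical.
    """
    n = len(monomers)
    # A block must fit into the array at least twice, hence n // 2.
    for size in range(1, n // 2 + 1):
        if all(monomers[i] == monomers[i + size] for i in range(n - size)):
            return size
    return None


# Hypothetical arrays: letters stand for diverged monomer variants.
hor_like_array = list("ABCABCABCABC")   # a block of three monomers repeats
monomeric_array = list("ABCDEFGHIJKL")  # diverged monomers, no repeating block

print(repeat_unit_size(hor_like_array))
print(repeat_unit_size(monomeric_array))
```

In this simplified picture, a result greater than 1 corresponds to an HOR-like organization, a result of 1 to an array of identical monomers, and `None` to diverged monomers without detectable higher-order structure.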
In the house mouse, Mus musculus, there are two highly conserved TRs known as centromeric minor and pericentromeric major satellites (MiSat and MaSat; SATMIN and GSAT_MM, respectively, in Repbase nomenclature). MiSat comprises an AT-rich, 120 bp monomer that occupies 300–600 kb of the terminal region of all mouse telocentric (single-armed) chromosomes and is the site of kinetochore formation and spindle microtubule attachment (Kalitsis, Griffiths, & Choo, 2006; Kipling, Ackford, Taylor, & Cooke, 1991; Wong & Rattner, 1988). MaSat (234 bp) is more abundant, resides adjacent to MiSat, and has a role in heterochromatin formation and sister chromatid cohesion (Broccoli, Miller, & Miller, 1990; Broccoli, Trevor, Miller, & Miller, 1991; Guenatri, Bailly, Maison, & Almouzni, 2004). Neither of these satDNAs has been identified at the centromere of the morphologically distinct acrocentric Y chromosome (Hörz & Altenburger, 1981), which has a very small short arm that distinguishes it from the telocentric autosomes and X chromosome. Recently, the chromosome Y centromere was shown to comprise a highly diverged MiSat-like sequence (designated Ymin) with an HOR organization not previously described for mouse MiSat arrays (Pertile, Graham, Choo, & Kalitsis, 2009).

2.3. Lack of tandem repeats in the reference genome

The large regions of classical heterochromatin are poorly covered by assembled sequences (Warburton et al., 2008), and for the mouse genome even less is known. Mouse acrocentric chromosomes have prolonged TR arrays at the ends, which is why these regions are difficult to assemble.
The mouse chromosomes end abruptly in 3 Mb gaps reserved for centromeric regions. We found that only eight chromosome ends contain distinct TR arrays and only two of them contain MaSat (Table 1.1). This result clearly illustrates the poor assembly of the heterochromatic regions: the rest of the chromosomes contain genes at the ends. The assemblies did not cover even the beginning of the regions of constitutive heterochromatin (Komissarov, Gavrilova, Demin, Ishov, & Podgornaya, 2011).

Table 1.1 Large TRs in the region adjacent to the centromeric gap

Chromosome  TR subfamily  Array length (kb)  Coordinates (bp)
3           TRPC-21A-MM   33.6               3,000,001–3,033,629
4           TRPC-21A-MM   7.0                3,006,469–3,013,522
4           TR-22A-MM     4.9                3,104,899–3,109,811
6           TR-22A-MM     9.9                3,082,006–3,091,879
9           MaSat         38.4               3,000,003–3,038,419
11          MaSat         3.9                3,000,004–3,003,872
16          TRPC-21A-MM   9.0                3,232,335–3,241,336
17          TRPC-21A-MM   32.5               3,006,399–3,038,945
17          TR-27A-MM     4.6                3,070,530–3,075,093
18          TR-22A-MM     8.0                3,112,790–3,120,776

Only TRs with an array longer than 3 kb located within 2 Mb of the centromeric gap are shown. Coordinates, the array position on chromosomes.

The sequences from those regions can be found in whole-genome shotgun (WGS) contigs. The advantage of WGS assemblies is that they include the entire set of shotgun sequencing reads, assembled into contigs. Both assemblies represent euchromatic and heterochromatic regions, even when these are not assembled into continuous contigs and/or not yet anchored on chromosomes. The regions enriched in TRs are mostly not anchored, although TRs and, in particular, large TRs are present in WGS assemblies owing to their abundance in the genome.

2.4. Large tandem repeats in WGS assemblies

In WGS assemblies, the amount of all TRs is less than the experimentally determined amount of the MaSat alone (~8%) (Abdurashitov, Chernukhin, Gonchar, & Degtyarev, 2009), indicating that even in WGS assemblies TRs remain underrepresented (Table 1.2). All large TRs found in the mouse WGS assemblies were classified into 4 superfamilies, 8 families, and 62 subfamilies, including 60 not yet described, using a classification based on array similarity, monomer length, the degree of unit similarity, position on the reference genome chromosome assemblies, and GC content. Each new subfamily was named according to the suggested cytogenetic-based nomenclature (the classification method is shown in Fig. 1.1).

Table 1.2 Tandem repeats in mouse WGS assemblies

Assembly    Size (bp)      Contigs    TRs (all)  % of assembly  TRs (>3 kb)
MGSC WGS    2,477,633,597  224,713    849,466    2.9            157
Celera WGS  3,003,109,157  837,963    1,084,552  5.0            784
Total WGS   5,480,742,754  1,062,676  1,944,018  3.8            941

TRs (all), total amount found in assembly; MGSC, the Mouse Genome Sequencing Consortium.

Figure 1.1 Schematic representation of the large tandem repeat classification workflow. Overview of the large tandem repeat analysis. For each program, only parameters that were changed are shown. Family names are given according to Table 1.2. The complete description of the workflow is given in Komissarov et al. (2011).