Downloaded from genome.cshlp.org on January 22, 2023 - Published by Cold Spring Harbor Laboratory Press Research Genome-wide DNA methylation profiles in hematopoietic stem and progenitor cells reveal overrepresentation of ETS transcription factor binding sites Amber Hogart,1 Jens Lichtenberg,1 Subramanian S. Ajay,2 Stacie Anderson,1,3 NIH Intramural Sequencing Center,4 Elliott H. Margulies,2 and David M. Bodine1,5 1GeneticsandMolecularBiologyBranch,NationalHumanGenomeResearchInstitute,NationalInstitutesofHealth,Bethesda, Maryland20892,USA;2GenomeTechnologyBranch,NationalHumanGenomeResearchInstitute,NationalInstitutesofHealth, Bethesda,Maryland20892,USA;3FlowCytometryCoreFacility,NationalHumanGenomeResearchInstitute,NationalInstitutes ofHealth,Bethesda,Maryland20892,USA;4NIHIntramuralSequencingCenter,NationalHumanGenomeResearchInstitute, NIH,Rockville,Maryland20852,USA DNAmethylationisanessentialepigeneticmarkthatisrequiredfornormaldevelopment.KnockoutoftheDNAmethyl- transferaseenzymesinthemousehematopoieticcompartmentrevealsthatmethylationiscriticalforhematopoieticdiffer- entiation. To better understand the role of DNA methylation in hematopoiesis, we characterized genome-wide DNA methylationin primarymouse hematopoieticstem cells (HSCs), commonmyeloid progenitors(CMPs),and erythroblasts (ERYs).Methylbindingdomainprotein2(MBD)enrichmentofDNAfollowedbymassivelyparallelsequencing(MBD-seq) was used to map genome-wideDNAmethylation. Globally,DNA methylationwas most abundant in HSCs, witha 40% reductioninCMPs,anda67%reductioninERYs.Only3%ofpeaksariseduringdifferentiation,demonstratingagenome- widedeclineinDNAmethylationduringerythroiddevelopment.Analysisofgenomicfeaturesrevealedthat98%ofpro- moterCpGislandsarehypomethylated,while20%–25%ofnon-promoterCpGislandsaremethylated.Proximalpromoter sequencesofexpressedgenesarehypomethylatedinallcelltypes,whilegenebodymethylationpositivelycorrelateswith geneexpressioninHSCsandCMPs.Elevatedgenome-wideDNAmethylationinHSCsandthepositiveassociationbetween methylationandgeneexpressiondemonstratesthatDNAmethylationisamarkofcellularplasticityinHSCs.Usingdenovo motifdiscovery,weidentifiedoverrepresentedtranscriptionfactorconsensusbindingmotifsinmethylatedsequences.Motifs forseveralETStranscriptionfactors,includingGABPAandELF1,areoverrepresentedinmethylatedregions.Ourgenome- widesurveydemonstratesthatDNAmethylationismarkedlyalteredduringmyeloiddifferentiationandidentifiescritical regionsofthegenomeandtranscriptionfactorprogramsthatcontributetohematopoiesis. [Supplementalmaterialisavailableforthisarticle.] Epigeneticmechanismsofgeneregulationareheritable,reversible Methyl-binding domain proteins are a family of DNA-binding modificationsthatarecriticalfortheorganizationofchromatin proteins that recognize methylated DNA and modify gene ex- and regulation of tissue-specific geneexpression.DNA methyla- pression by forming complexes with other regulatory proteins. tionisadynamicepigeneticmarkprimarilylocalizedtocytosine StudiesofmouseknockoutmodelsoftheMBDproteinsdemon- residuesinthecontextofaCpGdinucleotideinmammals.Tar- strate unique but nonessential roles for most of these proteins geteddisruptionofthegenesresponsiblefordenovomethylation (Bogdanovic and Veenstra 2009). Of the MBD proteins, MBD2 and maintenance of DNA methylation during replication dem- appearstoplayanimportantroleinhematopoiesis,withspecific onstratethatDNAmethylationisessentialforproperdevelopment rolesinglobingeneswitching(Ruponetal.2006). inthemouse(Lagetetal.2010;Maetal.2010).Whilethecritical Thehematopoieticsystemisidealforthestudyofmethyla- role for DNAmethylationin early development is clearlyestab- tionduringdifferentiationbecauseprimarycellsatspecificstages lished,theroleforDNAmethylationintissuespecificationisless canbeseparatedfromotherhematopoieticcellsbyflowcytometry. understood. Thehematopoieticstemcell(HSC)givesrisetoallcellsinthepe- DNAmethylationhaslongbeenrecognizedasanimportant ripheralblood.Thecommonmyeloidprogenitor(CMP)generates markinestablishingandmaintainingimprintedgeneexpression onlymyeloidcells(redcells,platelets,granulocytes,monocytes, and X-chromosome inactivation. Apart from these specialized andeosinophils),butnotlymphoidcells(T-andB-lymphocytes). roles for DNA methylation, little is known about how DNA Erythroblasts(ERYs)arenucleatedredbloodcellsthathavecom- methylationleadstomoregeneralalterationsingeneexpression. mittedtoterminaldifferentiation.Conditionalknockoutmicein whichthegenesforthedenovomethyltransferasesDnmt3aand Dnmt3baredeletedinHSCsretaintheabilitytodifferentiateinto 5Correspondingauthor bothmyeloidandlymphoidlineages,butlong-termrepopulation [email protected] of the hematopoietic system is impaired (Tadokoro et al. 2007). Articlepublishedonlinebeforeprint.Article,supplementalmaterial,andpub- licationdateareathttp://www.genome.org/cgi/doi/10.1101/gr.132878.111. SerialtransplantationofDnmt3adeficientHSCsrevealedimpaired 22:1407–1418 (cid:2)2012,PublishedbyColdSpringHarborLaboratoryPress;ISSN1088-9051/12;www.genome.org GenomeResearch 1407 www.genome.org Downloaded from genome.cshlp.org on January 22, 2023 - Published by Cold Spring Harbor Laboratory Press Hogart et al. differentiation as well asimpaired repopulation (Challen et al. 2012).Similarly,conditionalknockoutmiceinwhichthegene forthemaintenanceDNAmethyltransferaseDnmt1isdeletedin HSCs demonstrated severe impairment of repopulating ability and inappropriate enhancement of mature myeloid lineages (Broske et al. 2009; Trowbridge et al. 2009). Together these studies demonstrate a profound role for DNA methylation in hematopoiesis. WhiletheimportanceofDNAmethylationinhematopoietic differentiationhasbeenwellestablished,thegenome-widelocal- izationofmethylatedDNAatspecificstagesofmyeloiddifferen- tiationremainstobeelucidated.Recentadvancesinsequencing technology have allowed comprehensive surveys of DNA meth- ylationwithvaryingdegreesofresolution.Thehighest-resolution techniquesusebisulfitesequencingapproachesthathavethead- vantageofsingle-baseresolutionbutdonotdistinguishbetween 5-methylcytosineand5-hydroxymethylcytosine(Kriaucionisand Figure1. OverviewofMBDsequencingresults.(A)Schematicrepre- Heintz2009;Tahilianietal.2009).Inthisstudy,weusedarecombi- sentationofhematopoiesishighlightingthethreecelltypesenrichedfor nant methyl-binding domain protein to enrich 5-methylcytosine methylation analysis. HSC and CMP cell populations were lineage de- modified regions of the genome for massively parallel sequence pleted(Lin(cid:2))andthenpositivelyselectedwiththecellsurfacemarkers Sca1andc-kit.Erythroblastcellswerepositivelyselectedwithantibodies analysis(MBD-seq).Usingthisapproach,wecomparedgenome- forTer119andCD71.(B)Totalnumberofmethylationpeakscalledineach widemethylationinpurifiedpopulationsofmurineHSCs,CMPs, ofthethreecellpopulations.(C)Totalmethylationpeakcountperchro- andERYs.Byfocusingonmethylationchangesdefinedbypeaks mosomeforeachcellpopulation:HSC(blue);CMP(orange);ERY(red). asopposedtosite-specificmethylation,wewereabletoidentify discrete regions of the genome where dynamic methylation changesoccurduringhematopoiesis.Ourstudyrevealsthatthe Toquantifygenome-wideDNAmethylationlevels,mapped greatestnumberof methylation peaksoccursinHSCsandthat sequencingreadswereanalyzedtodeterminestatisticallysignifi- thesepeaksarespecificallyandsuccessivelylostduringmyeloid cant peaks of methylation. In a conservative filteringapproach, differentiation,consistentwiththebiasinmyeloidlineagesseen MBD-seqpeakswerecallediftheywerepresentinbothbiological in Dnmt1 knockout mice (Broske et al. 2009; Trowbridge et al. replicatesand overlappedby at least200 bp.The average distri- 2009).Theidentificationofregionswheremethylationchanges bution of peak lengths was also investigated, with the average duringhematopoiesiswillfacilitatemechanisticstudiesofhow peaks occupying slightly more than 800 bp in each cell type DNAmethylationregulateshematopoieticdifferentiation. (SupplementalTableS2).Comparisonofoverallpeakcountfrom eachcelltyperevealedthatDNAmethylationpeaksweremost abundant in HSCs (85,797), with decreasing levels in CMPs Results (50,638) and fewer in ERYs (27,839) (Fig. 1B). The decrease in methylation reflects a global genomic decrease that is demon- Genome-wideDNAmethylationdeclinesduringmyeloid stratedbythesimilarrelativedistributionofmethylationpeakson differentiation eachchromosomebetweencelltypesdespitethereductioninthe Whole-genome sequencing in human embryonic stem cells has total number of methylation peaks (Fig. 1C). Among the auto- demonstratedthatasmuchas80%ofallCpGdinucleotidesinthe somes,chromosome5hasthegreatestnumberofDNAmethyla- genomearemethylated(Laurentetal.2010),yetitisunlikelythat tionpeaksinallcelltypes,whilethefewestnumberofpeakswas allofthesesitesarerelevanttodifferentiation.MBD-seqidentifies foundonchromosome19.TheXchromosomeissubjecttoran- regions of the genome bound by a DNA methylation-binding domX-chromosomeinactivationinfemales,aprocessassociated protein, highlighting loci containing multiple methylated CpG withDNAmethylation.WhileallanimalsincludedintheHSCand sites.RecombinantMBD2pull-downofmethylatedgenomicDNA CMP data sets were female, the fetal tissues used for ERY also combinedwithmassivelyparallelsequencingwasusedtoidentify includedmalesresultinginaslightlylowerdensityofpeaksonthe methylatedgenomiclociinenrichedpopulationsoflineageneg- XchromosomeinERYs. ativeSca1+c-kit+cells(apopulationofcellsenrichedforlong-and BothinsilicovalidationofCpGcontentwithinpeaksaswell short-term HSCs and multipotent progenitors, which we have asbisulfitesequencingofselectedpeakswereperformedtovalidate designatedasHSCs),commonmyeloidprogenitors(CMPs),and that the MBD-seq approach specifically recognizes methylated erythroblasts(ERYs)(Fig.1A).Twobiologicalreplicateswerein- DNA.ThedistributionofpeaklengthisshowninSupplemental cludedforeachofthethreecelltypeswiththetotalnumberof TableS2,andthedistributionofCpGcontentisshowninTableS3. readsrangingfrom32to46million.Acompletesummaryofthe Atleast99.96%ofallpeakscalledcontainedoneCpGsitewiththe sequencing reads is provided in Supplemental Table S1. Non- average CpG count rangingfrom 21 to 24 CpGs per peak (Sup- enrichedgenomicDNAwassequencedtodetermineastandard plemental Table S3). Random sampling of sequences of similar backgroundforsubsequentanalysis.SupplementalFigureS2il- length revealed that the MBD-enrichment approachresults in a lustratesthesequencingresultsforaregioncontainingtheim- statisticallysignificanthigherpercentageofpeakswithCpGsites printingcontrolregionoftheimprintedgeneSnrpn.Consistent thanisexpectedbychance(P-value=0.00507)andwithasignifi- with the observation that Snrpn is imprinted within all cells, cantly higher average CG content (P-value = 0.00391). Bisulfite specificpeaksofDNAmethylationweredetectedinallthreecell sequencingof10genomiclocicontainingMBD-seqpeaks(fivecom- types. montoallthreedatasets,fourpeaksuniquetoHSCsorprogenitors 1408 GenomeResearch www.genome.org Downloaded from genome.cshlp.org on January 22, 2023 - Published by Cold Spring Harbor Laboratory Press Genome-wide DNA methylation in hematopoiesis [HSCs+CMPs],andonepeakuniquetoERYs)werefoundtohave largestcategorywaspeakspresentinHSCsandCMPs,butabsentin atleast47%methylationattheCpGsitesperregioninvestigated ERYs(progenitor;27%).Approximately3%ofthetotalmethyl- with an average of ;80% methylation (Supplemental Fig. S5; ation peaks were unique to either ERYs or CMPs, and an even SupplementalTableS4).Bisulfitesequencingatgenomicregions smallerfractionofpeaks(0.3%)wereabsentinHSCsbutpresentin thatwerenotidentifiedasstatisticallysignificantpeaksrevealedan the more differentiated cell types (myeloid). Figure 3A demon- averageof30.7%methylatedCpGsites,consistentwiththeob- strates examples of HSCs and progenitor-specific methylation servationthatCpGmethylationisrelativelycommonthroughout peaks upstream of the important hematopoietic transcription thegenome.Thepresenceoflow-to-intermediatelevelsofmeth- factorgeneGata2.AnexampleofanERY-specificpeakintheMeis1 ylationinregionsexcludedfrompeaksconfirmsthattheMBD-seq locus is shown in Figure 3B. Overall, the distribution of peaks approach identifies regions of the genome with high-density withineachofthecelltypesdemonstratesaprogressiveglobalde- methylation.AnadditionalvalidationofourMBD-seqdatasetwas creaseinmethylationduringdifferentiationwithlittleacquisitionof achieved by comparing our methylation peaks to the recently newmethylationpeaksduringthelaterstagesofdifferentiation. published(Shearstoneetal.2011)reducedrepresentationbisulfite sequencing(RRBS)datafrommouseerythroblasts.Supplemental DNAmethylationisoverrepresentedinthecodingportion Figure S6 shows the degree of overlap between the erythroblast ofthegenome MBD-seqpeaksandtheRRBSdata.Consistentwiththeconven- tionalbisulfitesequencingvalidation,theoverlapbetweenthetwo We next investigated the genomic distribution of methylation datasetsincreasesdramaticallywhentheRRBSvaluesexceed30%– peaks. The genome was divided into five partitions defined by 40%methylation(SupplementalFig.S6). RefSeq genes. The proximal promoter was defined as sequences Becauseglobal DNA methylation peaks decrease with mye- 1kbupstreamofthetranscriptionstartsite(TSS)throughthefirst loid differentiation, we determined whether methylation peaks 50bppastthetranscriptionalstartsite.Genebodymethylation were shared among the cell types or whether the peaks were wasdefinedbytheRefSeqgenecoordinatesminusthefirst50bp unique to each of the three cell types. Seven categories of DNA pasttheTSS.Additionally,welookedatthe10-kbregionsupstream methylationpeaksbasedonpresenceinoneormorecelltypesare of and downstream from gene boundaries because we expected showninFigure2.ThelargestcategoryofDNAmethylationpeaks, theseflankingsequencestoalsocontaincisregulatoryelements. composingnearly40%ofallpeaks,wereuniquetoHSCs.Meth- Theremainderofthegenomeconstitutestheintergenicpartition. ylationpeakscommontoallthreecelltypeswerethenextlargest Thedistributionofmethylationpeaksineachcell-typecategory category, representing 28% of all methylation peaks. The third was compared with the relative percentage of sequence in each partitiontothetotalnonrepetitiveportionofthegenome(geno- micaverage).Significantdeviationfromtheexpecteddistribution wasobservedforcommonpeaksandforpeaksinallmultipotent categories(Table1).Anoverrepresentationofmethylationpeaks wasseenintheRefSeqportionofthegenome,andmethylation peakswereunderrepresentedinintergenicsequences.Methylation peaks were overrepresented in the downstream flanking regions and were similar to the expected percentage in both the distal promoterand59-flankingregionsinallpeakcategories. WenextinvestigatedthedistributionofpeakswithintheRefSeq portionofthegenome.TheRefSeqsequencesweredividedintofour categories: 59-untranslated sequences (8.7%), 39-untranslated se- quences(6.8%),exons(5.3%),andintrons(79.2%).Althoughexons represent;5%ofthegenesequence,methylationpeaksweresig- nificantlyoverrepresentedin theexons(7%–25%ofall thegenic DNAmethylationpeaks)(Table1)andunderrepresentedinthein- tronsinmostcategories.Sincethenumberofpotentialmethylation sitesisvariablewithingenicregions,wenextdeterminedtherelative contributionofCpGdinucleotidesineachpartitionandcompared this with peak distributions. Table 2 demonstrates that MBD-seq peakdensitydoesnotfollowCpGdistribution,becausemethylation peaksexceedtherelativecontributionofCGinbothexonsandin- trons,whilemethylationpeaksarefoundatlessthantheexpected frequencyinboththe59and39UTRcompartments(Table2). CpGislandmethylationisrareinhematopoieticstem cell–specificpeaks Figure2. Analysisofpeaksharingbetweencellpopulations.(A)Venn diagramillustratingthedegreeofoverlapinmethylationpeaksbetween Previous studies have shown that CpG islands in characterized cellpopulations.(B)Totalpeakcountforeachcellpopulation,thenumber promotersofgenesremainunmethylatedinmostinstances,while ofpeakssharedbetweeneachcelltype,andthepercentagethateach CpGislandsnotassociatedwithknownpromoters,orphanCpG categorycontributestothetotalnumber.Progenitorpeaksaredefinedas islands,aremuchmorelikelytobemethylated(Illingworthetal. presentinHSCsandCMPsbutabsentinERYs.Myeloidpeaksareabsentin HSCsbutpresentinCMPsandERYs.(Other)Peaksthatwereabsentinthe 2010).ConsistentwithapreviousreportofmethylatedCpGislands CMPbutpresentinHSCsandERYs. inwholemouseblood(Illingworthetal.2010),wefound;2%of GenomeResearch 1409 www.genome.org Downloaded from genome.cshlp.org on January 22, 2023 - Published by Cold Spring Harbor Laboratory Press Hogart et al. Figure 3. MBD-seq peaks at selected loci. UCSC Genome Browser view of the Gata2locusonChr6(A)andtheMeis1locusonChr11(B).Raw sequencingdataforeachcellpopulation:blue(HSC),orange(CMP),andred(ERY),withsignificantpeaksshownasblackbarsbeloweachsample.The y-axisindicatespeakheight,definedbythemaximumnumberofsequencingtagsseeninthehighestvisiblepeakineachwindow.Transcriptionfactor consensussites(TRANSFACIDs)areindicatedabovetheMBD-seqdatainregionsofsignificantpeaks.ThegenestructureisindicatedbelowtheMBD-seq datawithCpGislandsshowningreen.Bisulfitesequencingdataconfirmingthecell-type-specificmethylationareshownbelow.(Blackcircles)Methylated CpGsites;(opencircles)unmethylatedCpGsites.Eachhorizontallineindicatesauniqueclone. promoterCpGislandsmethylatedinHSCs,andasmallerpercent in the progenitor category (HSCs + CMPs) compared with com- methylated in the more differentiated hematopoietic cell types monandHSC-specificpromoters(Fig.4B).Incontrast,themeth- (Fig.4A).OrphanCpGislandswerearound10-foldmorelikelyto ylation of orphan CpG islands varied between peak-type cate- bemethylatedineachofthecelltypes,consistentwiththegreater gories.ThelargestpercentageofmethylatedorphanCpGislands roleintissue-specificmethylation. wasamongthecommonmethylation peaks(20%) (Fig.4B).Al- Investigation of cell-type-specific methylation peaks revealed thoughHSCisthemostmethylatedcelltype,thepercentageof thatthevastmajorityofallpromoterCpGislandsareunmethylated HSC-specificmethylationpeaksinorphanCpGislandswas10-fold inallcategories.Increasedpromoter-associatedmethylationobserved less than common peaks and around threefold lower than the 1410 GenomeResearch www.genome.org Downloaded from genome.cshlp.org on January 22, 2023 - Published by Cold Spring Harbor Laboratory Press Genome-wide DNA methylation in hematopoiesis Table1. Distributionofmethylationpeaksingenomicpartitions(%inpartition) Distal Proximal promoter promoter RefSeq 59UTR Exon Intron 39UTR Downstream Intergenic P-value Genomicaverage 7.22 0.84 36.12 8.72 5.29 79.15 6.84 7.69 48.13 Common 9.13 2.11 52.16 8.13 24.72 61.39 5.76 13.36 23.24 1.61303310(cid:2)18 HSC 8.46 1.40 47.90 7.89 11.14 74.84 6.13 11.00 31.23 0.016898107 Progenitor 8.47 2.93 45.51 8.88 14.83 69.77 6.52 10.87 32.23 7.02049310(cid:2)05 CMP 6.82 2.35 42.65 8.36 11.84 73.40 6.41 10.76 37.41 0.04120891 Myeloid 7.41 1.54 32.10 12.75 6.86 78.43 1.96 9.57 49.38 0.5006945 ERY 8.62 1.55 40.95 7.30 7.51 78.90 6.29 11.72 37.16 0.501189395 progenitor category (Fig. 4B). These results suggest that the re- Furtheranalysisofgeneswithmethylationinthreeormorepar- quirementofDNAmethylationformaintenanceofHSCsisnot titionsrevealedthat10-foldmoreexpressedgeneshadmethyla- primarilyduetosilencingofgenesviamethylatedCpGislands. tioninthegenebodyandflankingregionsthaninallfourparti- tions in each of the three cell types. Therefore, while proximal DNAmethylationinexpressedgenesvaries promotermethylationisrareinexpressedgenes,methylationinthe bygenomicpartition genebodyandflankingregionsoutsideoftheproximalpromoter occursofteninexpressedgenes. To investigate the role that DNA methylation plays in gene ex- pression,wecomparedthegenomicdistributionofDNAmethyl- Denovomotifdiscoveryrevealsdifferentiallymethylated ationpeakstogeneexpressiondataobtainedfromBloodExpress, adatabasecontaininggeneexpressionprofilesfromalargenum- transcriptionfactorbindingsitesignaturesofhematopoietic ber of mouse hematopoietic cell types (Miranda-Saavedra et al. differentiation 2009).BasedongeneexpressionprofilesformouseHSC,CMP,and Toelucidatedevelopmentalprogramspotentiallymodifiedbythe ERY cell types, thegreatestnumberof genesis expressedin the presenceofDNAmethylation,weinvestigatedthesequencesun- HSCs(8594),followedbyCMPs(7122),andthefewestnumberof derneath the methylation peaks for recurring motifs. De novo expressed genes are seen in ERYs (4223). Peaks assigned to the motifdiscoverywasperformedonalldatasetsandgenomicpar- upstream10kb,proximalpromoter,RefSeqgeneboundary,and titions independently. The top 25 overrepresented motifs were downstream10kbwerecomparedinexpressedgenes.Consistent queried againstthe TRANSFAC (Matys et al.2006) transcription with the observations of Hodges et al. (2011), the majority of factordatabase,andtranscriptionfactorswithcharacterizedbinding expressed genes lacked methylation peaks in the proximal pro- sites were identified. Figure 6 contains a summary of the tran- moter, with a similar trend seen in the more upstream 10 kb scriptionfactorswithoverrepresentedmotifsoccurringatsitesof (Fig.5A,B).Incontrast,themajorityofexpressedgenesinHSCs DNAmethylationwithinthefivegenomicpartitions(identified and CMPs contained methylation peaks within the gene body ontheleft)ineachpeakcategory(shownontheright).Theheatmap (RefSeq) (Fig. 5A). Methylation peaks were also investigated in highlightstranscriptionfactorsthatoccurinonlyonepeakcategory genesthatwerenotexpressedineachcelltype.Consistentwith (red)suchasMATH1inHSCproximalpromoterpeaks,twocate- thelargerabsolutenumberofnonexpressedgenes,thenumberof gories(green),threecategories(teal),andtranscriptionfactorsthat methylated nonexpressed genes exceeds the numberof methyl- are overrepresented in almost all data sets (dark blue) such as ated expressed genes for each partition (Supplemental Fig. S7). GABPA. While the ratio of unexpressed methylated genes to expressed Comparison of the overrepresented transcription factor methylatedgenesissimilarbetweentheupstream,downstream, binding sites revealed that >10% of all overrepresented motifs andRefSeqpartitions,thisratioiselevatedintheproximalpro- corresponded to ETS transcription factors. Although the ETS moterforeachcelltypeandmarkedlyelevatedinERYs.Thesedata transcription factor (ELF1, ETS2, FLI1, and GABPA) binding furthersupporttherolethatproximalpromotermethylationplays sites were overrepresented in all five genomic partitions, the ingenesilencing. distribution within cell-type-specific categories was quite dif- Onaverage,geneshadmultiplemethylationpeaks,withthe ferent. Common methylation peaks had overrepresentation of RefSeqpartitionspecificallyaveraging3.75peakspergeneinHSCs, allETSfactorsinallpartitionsexceptintheintergenicpartition. 2.7 peaks in CMPs, and 2.1 peaks in ERYs. We investigated the In contrast, HSC-unique peaks lacked overrepresentation of associationofmethylationpeaksinmultiplegenomicpartitions GABPAin the promoter and flanking partitions but exhibited simultaneously to ascertain if specific patterns were associated withpositivegeneexpression.Figure5Cshowsthepercentageof Table2. Peakpartitiondistributionwithingenescomparedwith expressedgeneswithmethylationpeaksintwoormoreofthefour CGcontent partitions:upstream10kb(UP),proximalpromoter(PP),RefSeq geneboundaries(CDS),and10kbdownstream(DS).Consistent 59UTR Exon Intron 39UTR P-value withthesinglepartitionanalysis,expressedgenesareunlikelyto %oftotalCG 27.94 7.38 40.9 23.78 havemethylationintheproximalpromoterincombinationwith Common 8.13 24.72 61.39 5.76 5.80995310(cid:2)17 any other genomic region. Of the two-partition correlations, HSC 7.89 11.14 74.84 6.13 1.94352310(cid:2)12 methylationmostfrequentlyoccurredtogetherinthegenebodies Progenitor 8.88 14.83 69.77 6.52 1.48518310(cid:2)11 (CDS)andthe10kbdownstream(DS)partitions,ineachofthe CMP 8.36 11.84 73.4 6.41 7.10713310(cid:2)12 Myeloid 12.75 6.86 78.43 1.96 1.51578310(cid:2)13 celltypes,withtheminimaltwo-partitioncorrelationoccurringin ERY 7.3 7.51 78.9 6.29 1.09267310(cid:2)13 theproximalpromoterandthedownstreampartitions(Fig.5C). GenomeResearch 1411 www.genome.org Downloaded from genome.cshlp.org on January 22, 2023 - Published by Cold Spring Harbor Laboratory Press Hogart et al. etic cell line, a multipotent cell that can be differentiated into variousmyeloidlineages(PintodoO´ etal.1998).Overlapbetween transcriptionfactorpeaksandMBD-seqpeaksinCMPsandHSCs areshowninTable3.Thegenomiccoverageofeachtranscription factordatasetandmethylationpeakdatasetwasusedtogenerate randomdatasetstocalculatetheexpectedgenomicoverlapsbased onchance.Asignificantunderrepresentationofoverlapbetween transcriptionfactorsitesandmethylationpeaksoccurredinboth CMPs and HSCs (Table 3). The only transcription factor with significant co-occurrence with methylation was FLI1 in HSC methylation peaks. Of the 10 key hematopoietic transcrip- tionfactorsincludedinthestudybyWilsonetal.(2010),FLI1 is the only factor with an annotated consensus site that was overrepresentedinourDNAmethylationpeaks. Figure4. OverlapofMBD-seqpeakswithCpGislands.(A)MBD-seq peaksinthethreecelltypescomparedwiththecompletegenomicCpG ChromatinimmunoprecipitationofFLI1andELF1reveals islandsclassifiedbyIllingworthetal.(2010).Thepercentofmethylated co-occupancyoftranscriptionfactorsandmethylation promoterCpGislands(blue)ororphan(non-promoter)CpGislands(red) isshownforeachcelltype.(B)MethylatedpromoterandorphanCpG TofurtherinvestigatetherelationshipbetweenDNAmethylation islandsincommonandcell-type-specificpeaks,asapercentageoftotal andtranscriptionfactorbinding,wenextidentifiedthepresence CpGislands. of CpG sites in the overrepresented transcription factor motifs. Table 4 contains a list of all transcription factors with known overrepresentationofmultipleETSsiteswithintheintergenicpar- binding sites that were overrepresented in our analysis of the tition.Thesecomplementarypatternsoftranscriptionfactormotifs proximalpromoterpartition.Ofthe19transcriptionfactormotifs suggest that DNA methylation may modulate ETS transcription associatedwithsequencesoverrepresentedinmethylatedproximal factor occupancy during hematopoietic development. Other transcription factor motifs suchas the NFAT (nuclearfactorof acti- vatedT-cells)werehighlyspecifictosingle partitionsorcelltypes.TheNFAT1-binding site is overrepresented in CMP-unique methylation and myeloid-specific peaks located in the proximal promoter. The NFAT2motifisoverrepresentedinCMP- uniquemethylationpeakslocatedinthe proximal promoter and downstream re- gions,andinthedistalpromoterofmy- eloidpeaks.Incontrast,theNFAT3motif is overrepresented in common and pro- genitormethylationpeakslocatedinthe proximal promoter. The striking differ- ence between overrepresented motifs in cell-type-specific methylation profiles demonstrates that hematopoietic line- agescontainuniquesignaturesofmethyl- ation patterns that may regulate changes thatoccurduringdifferentiation. Minimalgenome-wideco-occupancy ofhematopoietictranscriptionfactors withDNAmethylationpeaks To assess the relationship between ge- nome-wide DNA methylation and tran- Figure5. Methylationingenicpartitionsofexpressedgenes.Geneexpressionwasobtainedfromthe scription factor binding, we compared BloodExpress(Miranda-Saavedraetal.2009)geneexpressiondatabaseforeachcelltypeandcompared ourmethylationdatawithgenome-wide withMBD-seqpeaks.(A)Thenumberofexpressedgenes(y-axis)ineachcelltypewithmethylation occupancydataforasetof10keyhema- peaksinthedistalpromoter,proximalpromoter,RefSeqgenebody,anddownstreamregion.(B)The topoietic transcription factors (Wilson numberofexpressedgenes(y-axis)lackingmethylationpeaksineachofthegenicpartitionsforeachcell type.(C)Combinatorialanalysisofmethylationpeaksoccurringinallfourgenicpartitionsinexpressed etal.2010).Transcriptionfactorbinding genes.(UP)Upstream10kb;(PP)proximalpromoter;(CDS)RefSeqgeneboundary;(DS)downstream. in Wilson et al. was ascertained in the Thepresence(+)orabsence((cid:2))ofmethylationintheindicatedpartitionisshownbelowthex-axis.The immortalized HPC-7 mouse hematopoi- y-axisshowsthepercentageofexpressedgeneswitheachmethylationpattern. 1412 GenomeResearch www.genome.org Downloaded from genome.cshlp.org on January 22, 2023 - Published by Cold Spring Harbor Laboratory Press Genome-wide DNA methylation in hematopoiesis differentiallymethylatedsiteswereobservedintwointronicpeaks ofEng.Themethylationpeakinthefirstintron(site2)doesnot overlap a functional conserved element, while the methylation peakinintron11(site3)correspondstoaconservedelementthat hasnotbeenassayedforfunction.Inbothcases,themoremeth- ylatedsitewasoccupiedbyFLI1andELF1(Fig.7B,C).Overall,we foundthatfourofsixregionswithstatisticallysignificantdiffer- encesintranscriptionfactoroccupancyshowedincreasedbinding inthemethylatedcelltype,whiletwoofsixregionshadsignifi- cantlymore bindingin the less methylatedcell type (Fig. 7B,C; Supplemental Fig. S8). We conclude that methylation does not preventthebindingofELF1orFLI1. Discussion Thewidespreaduseofmassivelyparallelsequencingapproacheshas ledtobreakthroughsingenomics,especiallyintheareaofepige- neticcontrolofgeneregulation.Avarietyofapproacheshavebeen developed to map DNA methylation including whole-genome bisulfite sequencing (MethylC-seq), reduced representation bi- Figure6. Heatmapofoverrepresentedmethylatedtranscriptionfactor sulfitesequencing(RRBS),andtheenrichment-basedsequencing binding motifs. The top 25 overrepresented sequence motifs for each peakcategoryandeachgenomicpartitionarecompiledintoaheatmap. methods MeDIP-seq and MBD-seq. MethylC-seq requires more (Top) Transcription factors (TRANSFAC IDs) with confirmed consensus sequencing reads (503) than enrichment approaches to yield sequences.(Right)Eachgenomicpartitionwithcell-typecategories.(Red a comparable level of genome coverage, but provides single- boxes) Transcription factor binding sites that are unique to one cell- nucleotide resolution (Harris et al. 2010). Reduced representation typepeakinaparticularpartition.Greenboxesarepresentintwocell-type categories,tealboxesrepresentthreecategories,andblueboxesindicate bisulfitesequencingprovidesanalternativetowhole-genomebi- presenceinthreeormorecell-typespecificcategories.Grayindicatesthat sulfitesequencing;however,itprovideshigh-resolutiondatafor thetranscriptionfactorisnotoverrepresentedinthedataset. onlyalimitedportionofthegenome(Meissneretal.2005).The enrichmentapproachesdetectmethylatedDNAfragmentsgenome- wideandassumemethylationofnearbyCpGsites.Comprehensive promoters, five contained CpG siteswithinthe consensus rec- comparison of the genome-wide method of MethylC-seq to the ognitionsequence,andfivecontainedaCpGdinucleotideim- MBD-seq and MeDIP-seq enrichment methods revealed a >97% mediately adjacent to the binding sequence. Many consensus concordance rate for methylation calls throughout the genome bindingsitesinour datasetlack CpGsites;therefore,itisun- (Harrisetal.2010).WeconcludethattheMBD-seqisanaccurate clearifmethylationdirectlyimpactsbindingofthetranscription andcost-effectiveapproachtosurveygenome-wideDNAmethyl- factors. ation in small numbers of primary cells. In this study, we used Totestthehypothesisthatdifferentialmethylationdirectly MBD-seqcombinedwithpeakcallingalgorithmstodescribeDNA impactsthebindingoftranscriptionfac- tors revealed by our de novo motif dis- Table3. Overlapbetweentranscriptionfactoroccupancyandmethylationpeaks covery, we performed chromatin immu- noprecipitation of the ETS factors ELF1 Occupied Number Exp.overlapping and FLI1. Methylation peaks primarily Transcriptionfactors sites overlapping (mean) Z-score P-value localized to promoter regions of 10 dis- MethylationCMP(50,639) tinctgenes,aswellasseveralgenicpeaks ERG 36,166 966 1983 (cid:2)20.80 2.16310(cid:2)96 were selected for validation. ELF1 and FLI1 19,601 348 1075 (cid:2)21.32 3.70310(cid:2)101 FLI1 occupancy was compared in these GATA2 9234 278 507 (cid:2)9.87 2.81310(cid:2)23 regions of differential methylation in GFI1B 8853 235 486 (cid:2)11.04 1.23310(cid:2)28 LMO2 9604 174 528 (cid:2)14.66 5.81310(cid:2)49 chromatin from CMPs and ERYs. Figure LYL1 4350 101 238 (cid:2)8.52 7.98310(cid:2)18 7Ashowsthedatafortheendoglin(Eng) MEIS1 8401 207 461 (cid:2)11.17 2.86310(cid:2)29 locus,whereseveralconservedfunctional PU1 22,743 973 1248 (cid:2)7.35 9.91310(cid:2)14 elements have been identified (Pimanda RUNX1 5269 97 290 (cid:2)11.11 5.61310(cid:2)29 etal.2006,2008).Consistentwithpro- SCL 7096 146 389 (cid:2)12.26 7.42310(cid:2)35 moter methylation corresponding to MethylationHSC(85,797) genesilencing,thisgeneisunmethylated ERG 36,166 1357 3204 (cid:2)29.95 2.20310(cid:2)197 and expressed in CMPs, and uniquely FLI1 19,601 5333 1737 80.61 0 methylated and silenced in ERYs, as GATA2 9234 468 819 (cid:2)11.51 5.87310(cid:2)31 GFI1B 8853 340 784 (cid:2)15.51 1.48310(cid:2)54 determined by BloodExpress (Miranda- LMO2 9604 296 850 (cid:2)18.27 7.17310(cid:2)75 Saavedraetal.2009).Althoughsilenced, LYL1 4350 156 386 (cid:2)11.50 6.60310(cid:2)31 significantlyincreasedenrichmentofboth MEIS1 8401 343 744 (cid:2)13.97 1.19310(cid:2)44 FLI1 and ELF1 was observed in the ERY PU1 22743 1315 2015 (cid:2)14.69 3.74310(cid:2)49 RUNX1 5269 166 467 (cid:2)12.80 8.20310(cid:2)38 promoter methylation peak (Fig. 7B,C). SCL 7096 195 629 (cid:2)16.83 7.36310(cid:2)64 Significantdifferencesinbindingbetween GenomeResearch 1413 www.genome.org Downloaded from genome.cshlp.org on January 22, 2023 - Published by Cold Spring Harbor Laboratory Press Hogart et al. Table4. Overrepresentedmotifsinproximalpromoter differential methylation at CpG islands ismoreimportantforearlystagesofline- Transcriptionfactor TRANSFACbindingsite Identifiedmotif CpGproximity age specification than for final cell fate determination. CKROX(ZBTB7B) cccctcccc CCCTCCC CpGwithinbindingsite gccctcccc CCTCCCC A comparison of DNA methylation cccctcccg CCTCCC inpluripotentembryonicstemcellsand ELF1 aggaag AGGAAG lineage-committed cells has demon- cggaag CACGGAAGC strated a positive association between ETS2 cttcccg CTTCCC cttcctg CTTCCTG global DNA methylation levels and the attcctg TTCCTG ability to differentiate into multiple cell cttcctc CTTCCT types (Lister et al. 2009; Laurent et al. LMAF(MAFA) acacagcag CAGCAG 2010).Similartowhatwehaveobserved cgtcagcag in hematopoietic stem cells, studies of ggtcagcag SP4 gcccctccccc CCCTCCC embryonic stem cells have shown that tcccctccccg CCTCCC methylation is not restricted to tran- gcccctcccac GCCCCGCCC scriptional repression, because many gccccgcccct GCCCCGCCCC expressed genes in stem cells contain GABPALPHA(GABPA) cttccc CTTCCC CpGimmediatelyadjacent Cttcct CTTCCTG DNAmethylationwithinthegenebody MAFA Acagcag CAGCAG (Lister et al. 2009; Laurent et al. 2010). Tcagcag Additionally, the correlation between MATH1(ATOH1) gcagctggtg CAGCTG DNAmethylationandtherepressivehis- NFAT2(NFATC1) Tttcctg TTCCTG WT1 ccctcctcc CTCCTCC tonemodificationH3K27me3isnotob- cacacatac CACACATACA served in pluripotent embryonic stem ccctcctcc TCCTCC cells(Laurentetal.2010).Thesefindings IK tctgggagga CTGGGA CpGwithin1bp suggest that DNA methylation in stem cctgggagag CCTGGG cttgggaggc GGGAGG cells is not directly involved in in- cttgggaggt activation of genes, but may instead be gttgggagga amarkenablinggenomicplasticity.Our KAISO(ZBTB33) ctcctgctgt CTCCTGC genome-widesurveyofDNAmethylation CTGCTG demonstrating high levels of methyla- NFAT1(NFATC2) ggaaag GGAAAG ggaagc CACGGAAGC tion in the multipotent hematopoietic FLI1 aaggaaggaag AGGAAG CpG2+bp stem cell with progressive loss of meth- NFAT3(NFATC4) ggaatttctt TTTCTT ylation during erythroid specification PARP(PARP1) agagaaagag AGAAA supports the positive role for DNA GAGAAA SF1 tgacctccc CCTCCC methylationincellularplasticity. TR4(NR2C2) tctgacctctg CCTCTG DNA methylation dramatically de- TERALPHA(T3A) ctggaggtgac CTGGAG creases as differentiation proceeds from hematopoietic stem cells to lineage-re- Where different from TRANSFAC IDs, Mouse Genome Informatics-approved symbols are listed in strictedcommonmyeloidprogenitorcells parentheses. and the terminally committed erythro- blast.Thesedataareconsistentwiththe observationsofShearstoneetal.(2011), methylation. Direct comparison of our MBD-seq data with the whohavealsodemonstratedglobaldemethylationinterminally reduced representation bisulfite sequencing (RRBS) data in pri- differentiating erythroid cells. These genome-wide profiles marymouseerythroblastsdemonstratedthatourMBD-seqpeaks complementseveralotherrecenthigh-throughputsurveysofDNA arebiasedtowardgenomicregionswith>30%–40%CpGmethyl- methylationinprimarylymphoid(B-andT-)cells(Jietal.2010; ation(SupplementalFig.S6). Deatonetal.2011;Shearstoneetal.2011).Jietal.(2010)usedan Inmousehematopoieticcells,wefound2%–3%methylated array-based method to investigate methylation at ;4.6 million promoterCpGislandsinallcelltypes,consistentwithMBD-seq CpGsitesineightdistincthematopoieticcelltypesincludingHSC/ andMeDIP-seqstudiesinothermouseandhumantissues(Weber multipotent progenitor cells, CMPs, lymphoid progenitors, and etal.2007;Illingworthetal.2010).Althoughonlyafewpercentof differentiated T-cells. In contrast to the decreasing global DNA promoterCpGislandsaremethylated,CpGislandswithingenes methylationweobservedindifferentiatingmyeloidanderythroid are more likely to be methylated in a tissue-specific manner cells, both Ji et al. (2010) and Deaton et al. (2011) found little (Maunakeaetal.2010),andinourstudy,weidentifiedthat20%of differenceinthemethylationofprimitiveandmaturelymphoid non-promoterCpGislandsweremethylatedinallhematopoietic cells.Weconcludethatmethylationalterationsplayamuchlarger cells.DifferentialmethylationofCpGislands,bothpromoterand roleinthedifferentiationofmyeloidanderythroidcellsthanin non-promoter,waslesscommonincell-type-specificpeaks.This thedifferentiationoflymphocytes.Thisconclusionissupported observationisinagreementwiththestudyrecentlypublishedby bytheobservationthatmicedeficientinDnmt1haveincreased Deatonetal.(2011)inwhichdifferentialmethylationofCpGis- production of myeloid and erythroid cells (Broske et al. 2009; lands between differentiated lymphoid cells was much less pro- Trowbridgeetal.2009). foundthanthedifferencesobservedbetweenlymphoidcellsand We observed that regions of DNA methylation were brain cells. Our results are consistent with the hypothesis that overrepresentedforETStranscriptionfactorbindingsitesandthat 1414 GenomeResearch www.genome.org Downloaded from genome.cshlp.org on January 22, 2023 - Published by Cold Spring Harbor Laboratory Press Genome-wide DNA methylation in hematopoiesis (Sunetal.2005),andtheETStranscrip- tion factor ELF1 were overrepresented intherelativelyrareerythroblast-unique methylationpeaks.Westudiedtwogenes thataresilencedinerythroidcells,Meis1 and Eng, which have erythroid-specific methylation peaksthat containconsen- sussitesforELF1,andbothofthesesites are bound by ELF1 in erythroblasts. Al- thoughELF1hasnotyetbeenreportedto function as a transcriptional repressor, the redundant binding of ETS factors (Hollenhorst et al. 2007; Okada et al. 2011) combined with the recent de- scriptionsofrepressivefunctionsofre- latedETSfactorsleadsustohypothesize thatETSfactorsmaysilencecriticalgenes forpropererythroiddevelopment. Insummary,weprovideacompre- hensive whole-genome map of DNA methylationthatcanbeeasilyintegrated withotheroccupancydata.Thesequences subjecttodifferentialmethylationduring myeloid development lead us to suggest thatDNAmethylationmayofferanaddi- Figure7. FLI1andELF1occupancyofmethylationpeaksintheEnggene.(A)UCSCGenomeBrowser tional layer of modulation of key hema- viewoftheEnglocusonchromosome2.Transcriptionfactorconsensussites(TRANSFACIDs)arein- dicatedabovetheMBD-seqdatainregionsofsignificantpeaks.QPCRdataverifyingenrichedChIP topoietictranscriptionfactors.Thisstudy bindingofFLI1(B)andELF1(C)atthreesitesofmethylationpeaksthroughoutEng.Site1,anERY- provides insight into the interplay be- specificpeak,iswithintheproximalpromoter;site2,anERY-specificintronicpeak;andsite3,apro- tween DNA methylation and transcrip- genitor(HSCandCMP)peakinthe39codingregion.ComparisonofbindinginCMPsandERYsrevealed tionalnetworksthatareinvolvedinhe- statisticallysignificantdifferencesofP=0.004(site1,FLI1),P=0.009(site1,ELF1),P=0.021(site2, ELF1),andP=0.012(site3,FLI1). matopoieticdifferentiation. Methods therewaslessthanexpectedoverlapbetweenmethylationpeaks andtranscriptionfactoroccupancy.TheETStranscriptionfactors Enrichmentofprimarymousecells arerequiredforboththemaintenanceofhematopoieticstemcells andformyeloiddevelopment(Yangetal.2011;Yuetal.2011). Hematopoieticstemcellsandcommonmyeloidprogenitorcells Whiletypicallyviewedasactivatorsoftranscription,severalrecent were harvested from adult female C57BL/6 bone marrow. After studies have characterized repressive functions of ETS factors lysisofredcellsinACKlysingbuffer(BioWhittaker),;109cells (Drydenetal.2012;Leeetal.2012),suggestingthatmodulationof weresubjectedtolineagedepletionusingthefollowingratanti- the binding of these factors may have complex effects on gene mouseantibodies:CD8a,CD4,CD11b,Ly-6G/Ly-6C,CD45R,and expression.Ournegativecorrelationbetweenmethylationpeaks Ter119(BDBiosciences).Lineagedepletionwasperformedaspre- andglobaltranscriptionfactorbindingsuggeststhatmethylation viously described (Nemeth et al. 2003). Lineage-negative cells negatively impacts the binding of these transcription factors. (lin(cid:2))werestainedwithPEanti-mouseSca1(cloneD7)andAPC anti-mouseCD117(clone2B8;BDbiosciences).Lin(cid:2)Sca-1+c-kit+ Alternatively,thelackofco-occurrenceofmethylationandtran- (HSC; 1%–2% of lineage depleted cells) and lin(cid:2) Sca-1(cid:2) c-kit+ scriptionfactoroccupancycouldbeaconsequenceoftranscription (CMP;10%oflineagedepletedcells)weresortedonaBDFACSAria factorsbindingprimarilyinmethylation-deficientregionssuchas instrument.ImagesfromarepresentativesortareshowninSup- promoters.OurchromatinimmunoprecipitationanalysisofFLI1 plemental Figure S1. Cells from five sorts (HSC) and two sorts andELF1bindinginpromotersofmultiplegeneswithdifferential (CMP) were combined to generate sufficient cells (;1 3 106 to methylationpeaksinCMPsandERYsdemonstratedthatthemore 23106)foranalysis.ErythroblastswereobtainedfromC57BL/6d methylatedcelltypeoftenexhibitedsignificantlyenrichedbind- 13.5embryonicfetalliversdisruptedintoasingle-cellsuspension ing,whichindicatesthatneitherinterpretationisentirelycorrect. witha21-gaugeneedleandstainedwithAPCanti-mouseTer119 Sinceourmethylationpeaksspanregionsofseveralhundredbase andFITCanti-mouseCD71antibodies(BDBiosciences).Thecells pairs, we do notknowwhetherthe exactnucleotidesboundby were sorted into the five populations described by Zhang et al. transcriptionfactorsaremethylated.Despitetheselimitations,we (2003). Cytospins were stained with May-Grunwald Giemsa to concludethatmethylatedregionsmaybeimportantfordifferen- demonstratethattheR3population(CD71+Ter119+)contained tialregulationofgenesbytranscriptionfactors. latebasophilicerythroblasts(SupplementalFig.S1). Whilegenomicmethylationismostreducedinerythroblasts, some functionally important sites of methylation are retained, MBD2enrichmentofmethylatedDNA includingtheimprintedlociSnrpn(SupplementalFig.S2)andH19 (Shearstoneetal.2011).Bindingsitesfortranscriptionfactorssuch GenomicDNAwasisolatedfromenrichedcellswiththeQIAGEN as CKROX, a factor important for specification of CD4 T-cells Puregene kit and sonicated to 200- to 400-bp fragments. MBD2 GenomeResearch 1415 www.genome.org Downloaded from genome.cshlp.org on January 22, 2023 - Published by Cold Spring Harbor Laboratory Press Hogart et al. enrichmentwasperformedwiththeActiveMotifMethylCollector obtained from similar purification strategies were selected and kit.Approximately1mgofsonicatedgenomicDNAwasincubated compiledforeachcelltype.HSCgeneexpressionrepresentedthe withMBD2-His-conjugatedproteinandmagneticbeadsaccording union of theLT-HSC, ST-HSC,and MPP datasetsfrom multiple to the manufacturer’s protocol. Between four and eight MBD2- geneexpressionstudies(Akashietal.2003;Chambersetal.2007; genomicDNAreactionsforeachbiologicalreplicatewerepurified Mansson et al. 2007; Ficara et al. 2008). CMP gene expression simultaneously and pooled post-enrichment. After enrichment, representsthecompilationofsimilarlypurifiedc-kit+,sca1(cid:2)pro- boththemethylatedfractionandsupernatantfractionswerepu- genitorexpressedgenes(Akashietal.2003;Jankovicetal.2007) rifiedwithQIAGENDNApurificationcolumns.QuantitativePCR and ERY gene expression was determined based from Ter119+ amplification of the differentially methylated regionsregulating normoblasts(Chambersetal.2007).Fulldetailsofthecellsusedin theimprintingofSnrpnandRasgrf1andtheunmethylatedCpG theBloodExpressdatabaseareavailablefromhttp://hscl.cimr.cam. island promoter of Actb was performed with SYBR Green PCR ac.uk/bloodexpress/. master mix (Applied Biosystems), and was used to validate the enrichment of methylated DNA using the MBD2-pull-down ap- Motifdiscovery proach(SupplementalFig.S3). The complete set of putative binding sites of length 5–10 were Bisulfitesequencingvalidation enumerated using the WordSeeker approach (Lichtenberg et al. 2010).Eachsitewasevaluatedforitsoverrepresentationinregard GenomicDNAfromHSCs,CMPs,andERYswasbisulfite-converted toitssequencecoverageusingMarkovchainbackgroundmodels withtheEZDNAMethylation-Directkit(ZymoResearch).Briefly, (Lichtenbergetal.2009)ofvaryingorder,rangingfrom0tothe 500ngofDNAwasconvertedaccordingtoastandardprotocol. maximumallowedorderforagivenbindingsitelength(maximum The converted DNA was PCR-amplified with bisulfite primers order=sitelength(cid:2)2). designedusingMethPrimer(LiandDahiya2002).Theprimerse- Thesetofenumeratedandscoredbindingsiteswassortedin quences are listed in Supplemental Table S5. PCR-amplified bi- decreasingorderbasedontheoverrepresentationscoreandfiltered sulfite products were cloned using the TOPO TA Cloning kit according to a word factor–based maximization [a site y is dis- (Invitrogen) and sequenced. Bisulfite sequences were analyzed cardedifitiscompletelycontainedinasitexofequalorlonger withtheQuantificationToolforMethylationAnalysis(Kumaki lengthandlargeroverrepresentation:x>yifx=uyvandscore(x)> etal.2008). score(y),withuandvbeingnon-emptyfactorsthemselves].The top25putativebindingsiteswereretainedforfurtheranalysis. Next-generationsequencingofMBD2-enrichedgenomicDNA Thesetofpredictedbindingsiteswascomparedwiththesetof knowntranscriptionfactorbindingsitesannotatedinTRANSFAC Twobiologicalreplicatesofeachenrichedcellpopulationandone (Matysetal.2006).Overlapsbetweenpredictedandknownbinding supernatant sample per cell type were submitted for high- siteswerecorrectedforoverlaptoensurethattheTRANSFACsites throughput sequencing analysis. Between 225 and 540 ng of colocalizedtothepredictedsitesintheinputsequences. MBD2-enrichedDNAand1mgofsupernatantforeachcelltype were used to construct DNA libraries according to the Illumina protocol.ThelibrariesweresequencedontheIlluminaGenome Signaturegeneration Analyzer platform, and 36-bp single-end reads were used to Foreachknownandputativelyinvolvedtranscriptionfactor,the uniquelyidentifytheMBD2-boundfractionofthemousegenome. numberofinputdatasetsinwhichitoccurswascomputed.Using thematrix2pngapplication(PavlidisandNoble2003),aheatmap MappingandpeakcallingforMBD2enrichment wascreatedcorrelatingeachofthetranscriptionfactorswiththe Sequencedreadswere mappedto themousegenome(UCSC as- input data sets characterizing the different stages of differentia- semblymm9,NCBIbuild37)usingtheELAND(AJCox,unpubl.) tion.Thetemperatureofeachmatchisannotatedasthenumberof short-readalignmentprogram.Peaksforeachcelltypewerecalled datasetsthatsharethetranscriptionfactorinvolvementtoallow usingMACS(Zhangetal.2008)withaP-valuethresholdofP< highlightingofuniquecombinations. 10(cid:2)5.Sequencedreadsfromthematchedsupernatantwereusedas a controlforeachcell type. WhenMACSwasunabletobuilda Occupation-methylationcorrelationanalysis modelfor agiventreatment/controlpair, themodelrestrictions wereloweredtocompensate.Peakcallsmadeonreplicateswere ChIP-seqlocalizationdataobtainedbyWilsonetal.(2010)were post-processed to generate a confident set. Replicate peaks that overlappedwiththemethylationpeaks.Randompeaksofequal overlapped by >200 bp were considered for further analysis; at cardinality and size were generated and used to quantify the thiscutoffthemajorityofoverlapswereretained(Supplemental overlapexpectedbychance.Basedon1000bootstrappingitera- Fig.S2). tions, the tabulated expected average overlap and standard de- Thefinalsetsofpeakswerepartitionedaccordingtothege- viationwerecomputed.Incongruencewiththeobservedoverlap, nomicsegmentstheyoccupybasedontheRefSeqgenecoordinate Z-scoreswereinferredandsubsequentlyusedtocomputeP-values mapavailableforthemm9genomebuild.SinceDNAmethylation usingtheRtoolkit(IhakaandGentleman1996). peaklengthsaverage800bp(SupplementalTableS2),peaksoften spanintronandexonsegmentboundaries.Toassessthedistribu- Chromatinimmunoprecipitation tionofDNAmethylationwithinthecontextofmorediscretese- quenceswithingenes,themeanvalueofapeakrangewasusedto PrimarymouseCMPandERYcellsweresortedasdescribedabove assignittoapartition. andfixedwith1%formaldehydefor10minatroomtemperature. Fixationwasquenchedwith0.125Mglycinefor10minatroom temperature.Cellswerewashedtwicein13PBScontainingpro- Insilicogeneexpressionanalysis teaseinhibitorandfrozenat(cid:2)80°C.Chromatinimmunoprecipi- GeneexpressionwasdeterminedusingtheBloodExpressdatabase tationwasperformedusingtheMagnaChIPA/Gkit(Millipore). (Miranda-Saavedraetal.2009).Briefly,datasetsrepresentingcells Briefly,cellswerelysedandsonicatedto200-to400-bpfragments; 1416 GenomeResearch www.genome.org
Description: