Downloaded from genome.cshlp.org on February 22, 2023 - Published by Cold Spring Harbor Laboratory Press Article Reconstruction of the vertebrate ancestral genome reveals dynamic genome reorganization in early vertebrates Yoichiro Nakatani,1,5 Hiroyuki Takeda,2 Yuji Kohara,3 and Shinichi Morishita1,4,5 1DepartmentofComputationalBiology,GraduateSchoolofFrontierSciences,TheUniversityofTokyo,Kashiwa277-0882, Japan;2DepartmentofBiologicalSciences,GraduateSchoolofScience,TheUniversityofTokyo,Tokyo113-0033,Japan; 3CenterforGeneticResourceInformation,NationalInstituteofGenetics,Mishima411-8540,Japan;4BioinformaticsResearch andDevelopment(BIRD),JapanScienceandTechnologyAgency(JST),Tokyo102-8666,Japan Although several vertebrate genomes have been sequenced, little is known about the genome evolution of early vertebrates and how large-scale genomic changes such as the two rounds of whole-genome duplications (2R WGD) affected evolutionary complexity and novelty in vertebrates. Reconstructing the ancestral vertebrate genome is highly nontrivial because of the difficulty in identifying traces originating from the 2R WGD. To resolve this problem, we developed a novel method capable of pinning down remains of the 2R WGD in the human and medaka fish genomes using invertebrate tunicate and sea urchin genes to define ohnologs, i.e., paralogs produced by the 2R WGD. We validated the reconstruction using the chicken genome, which was not considered in the reconstruction step, and observed that many ancestral proto-chromosomes were retained in the chicken genome and had one-to-one correspondence to chicken microchromosomes, thereby confirming the reconstructed ancestral genomes. Our reconstruction revealed a contrast between the slow karyotype evolution after the second WGD and the rapid, lineage-specific genome reorganizations that occurred in the ancestral lineages of major taxonomic groups such as teleostfishes,amphibians,reptiles,andmarsupials. [Supplementalmaterialisavailableonlineatwww.genome.org.] Earlyvertebrategenomeevolutionhaslongbeeninneedofclari- identified groups of human genes (called “ohnologs” [Wolfe fication,anditisnowofparticularinterestbecauseseveraldis- 2001]) duplicated by the 2R WGD by ensuring that individual tantlyrelatedvertebrategenomeswererecentlysequenced.The genesinagroupweremostsimilartotheidenticalorthologous 2Rhypothesispostulatesthattworoundsofwhole-genomedu- deuterostome gene of the sea urchin (Sea Urchin Genome Se- plication(2RWGD)occurredatthebaseofthevertebratelineage quencing Consortium 2006) or tunicate (Dehal et al. 2002; (Ohno1970;Hollandetal.1994)becauseoftheobservationthat Shoguchietal.2006).Theidentificationofohnologsisadifficult invertebrates have one Hox gene cluster, whereas lobe-finned task because all ohnologs and their corresponding Ciona genes fishes and land vertebrates have four clusters. However, the 2R arerarelyconservedduetonumerouslossesandduplicationsof hypothesishasbeenquitecontroversialuntilrecently(Skrabanek Ciona, human, and medaka genes. Given these difficulties, andWolfe1998;GibsonandSpring2000;FriedmanandHughes ohnologsweregroupedbasedonthemethodofDehalandBoore 2001;Wolfe2001;Abi-Rachedetal.2002;FurlongandHolland (2005), which is detailed in Supplemental Figure S1 and the 2002;Guetal.2002;McLysaghtetal.2002;Durand2003;Pan- Supplemental Document. These ohnologs typically occur con- opoulou et al. 2003; Seoighe 2003; Panopoulou and Poustka secutivelyinparalogouschromosomalregionsinthehumange- 2005;Vandepoeleetal.2004)becauseitleavesopenthepossi- nomeandarelikelytorepresentaremainingblockderivedfrom bilityofoneroundofWGDfollowedbylarge-scaleduplications asinglegnathostome(jawedvertebrate,seephylogenetictreeof suchassegmentalandchromosomalduplications.Recently,De- vertebrates in Fig. 6, below) proto-chromosome. To test if they hal and Boore (2005) showed that a large part of the human are really remaining blocks of the gnathostome proto- genome contains four-way paralogous chromosomal regions, chromosomes,theirsynteniesinthemedakagenome(Kasahara whicharetracesofthe2RWGD. etal.2007)wereinvestigated(fordetails,seetheSupplemental The task of reconstructing ancestral vertebrate proto- Document).Thefinalstepinournovelanalysiscombinedquali- chromosomesbeforethe2RWGDisverydifferentfromordinary fied remaining blocks into vertebrate and gnathostome proto- synteny analysis using orthologs because the effect of the 2R chromosomesusinginformationonohnologdistributionamong WGD must be carefully examined. Moreover, it is necessary to theblocks.Thisseriesofstepswasnewlydevelopedforthisre- determine human chromosome regions originating from the construction(seeMethods). same ancestral chromosome at the second WGD and integrate Next,weattemptedtovalidatethereconstructedgnathos- these regions to rebuild the ancestral karyotype. Thus, we first tome proto-chromosomes. If extant genomes have experienced 5Correspondingauthors. intensive interchromosomal rearrangements, they are hardly [email protected];fax81-47-136-3977. syntenictothereconstructedancestralgenomeandarenotuse- [email protected];fax81-47-136-3977. fulinthevalidation.Amongsequencedvertebrategenomes,we Articlepublishedonlinebeforeprint.Articleandpublicationdateareathttp:// searched for a genome that was not used in the reconstruction www.genome.org/cgi/doi/10.1101/gr.6316407. Freely available online throughtheGenomeResearchOpenAccessoption. stepbutpreservedtheproto-karyotype.Thechickengenomewas 1254 Genome Research 17:1254–1265©2007byColdSpringHarborLaboratoryPress;ISSN1088-9051/07;www.genome.org www.genome.org Downloaded from genome.cshlp.org on February 22, 2023 - Published by Cold Spring Harbor Laboratory Press Vertebrate genome evolution mostpromising,giventhehypothesisthatavianmicrochromo- humanchromosomeswerecomparedwithCionaandseaurchin somes represent archaic linkage groups of ancestral vertebrates genesasanoutgroupbecauseoftheintensiverearrangementsin (Burt 2002). Indeed, we observed that many ancestral gnathos- the mammalian lineage. We therefore needed additional infor- tomeproto-chromosomesinourreconstructionhadone-to-one mation.Thus,wefullyusedthemedakachromosomesasanout- correspondence with microchromosomes in the chicken ge- group of mammalians to clarify the boundaries (see details in nome,thusprovidingastrongvalidationofourreconstruction. Supplemental Fig. S2). The conserved vertebrate linkage (CVL) We reconstructed the gnathostome (jawed vertebrate), os- blockswereessentialinreconstructingancestralvertebrateproto- teichthyan(bonyvertebrate),andamniote(thegroupincluding chromosomes. reptiles, birds, dinosaurs, and mammals) ancestral genomes, which are located at key phylogenetic positions, leading to a Reconstructionofthevertebrateancestralgenome novelscenarioofgenomeevolutioninearlyvertebrates.Thetwo roundsofWGDeventsduplicated10–13proto-chromosomesin Here,weassumedasimplifiedmodelinwhichnomajorgenome thevertebrateancestor,producingthegnathostome(jawedver- rearrangementsoccurredbetweenthe2RWGDevents,although tebrate) ancestor (n≈40). Subsequent chromosome fusions re- wewillpresentamoreelaboratemodellaterinthispaper.The duced the number of chromosomes in the osteichthyan proto- expectedsignatureoftworoundsofWGDisillustratedinFigure karyotype(n≈31)andintheamnioteproto-karyotype(n≈26). 2,A–D.The2RWGDquadruplicatedtheredandbluechromo- Theseestimatesofchromosomenumberareconsiderablylarger somes,producingsisterchromosomeseachhavingthesameset than previous estimates (Jaillon et al. 2004; Naruse et al. 2004; ofgenesarrangedinthesameorder.Inreality,genelossesand Woodsetal.2005;Kohnetal.2006)andcontradictthewidely chromosomerearrangementsmusthavetakenplacebetweenthe held hypothesis that the osteichthyan proto-karyotype was 2RWGD,butforsimplicity,thesedetailsareomittedfromFigure n≈12(Postlethwaitetal.2000;Jaillonetal.2004;Naruseetal. 2A. Wolfe (2001) proposed calling these duplicated gene pairs 2004; Woods et al. 2005; Kohn et al. 2006) and that land- “ohnologs”afterSusumuOhno.Oneohnologisrepresentedbya vertebrategenomeswereshapedbylineage-specificchromosome dotinatrianglewhoseX-andY-axiscoordinatesrepresentthe “fissions”(Postlethwaitetal.2000).Onthecontrary,ourresults positions of duplicated genes in the sister chromosomes pro- demonstrate that many lineage-specific chromosome “fusions” ducedbythe2RWGD.Becauseohnologsareproducedbychro- shaped the ancestral karyotypes of major taxonomic groups, mosome-wide duplication and not by smaller-scale gene dupli- suchasteleostfishes,amphibians,reptiles,andmarsupials. cation,theyareobservedbetweenpairsofdistinctsisterchromo- somes, as illustrated by dots in the red region. In contrast, no ohnologswouldbefoundwithinonesisterchromosome,which Results is shown by the absence of ohnolog dots in the green regions. Afterthe2RWGD,chromosomebreaks(fissionsandtransloca- Evolutionofvertebratechromosomes tions),fusions,orinversionsmusthavealteredsomeofthesister Figure1presentsourscenarioofvertebratechromosomeevolu- chromosomes (Fig. 2B). The accumulation of rearrangements tion.Althoughtheancestralgenomewouldhavemorethantwo gradually disrupted the ancestral gene order and scattered the proto-chromosomesbeforethetworoundsofWGD,forsimplic- ohnolog dots (Fig. 2C). Furthermore, intensive interchromo- itywedisplaytwoofthereconstructedancestralvertebratechro- somal rearrangements in the mammalian lineage (Burt et al. mosomes,coloredredandblue(Fig.1A).ThefirstroundofWGD 1999) must have distributed blocks across the human genome doubledthesetwochromosomes,andfissionoccurredinoneof (Fig. 2D). Reconstruction of ancestral proto-chromosomes is a theduplicatedcopiesoftheredchromosome.Afterthesecond task of reordering the CVL blocks in Figure 2D to estimate the roundofWGD(Fig.1B),sisterchromosomesweregraduallydis- orderingsinFigure2,BandC. rupted by genome rearrangements and broken into smaller Figure2Epresentstheactualdistributionofohnologsinthe blocksofchromosomalsegmentsduringearlyvertebrateevolu- humangenome,inwhichCVLblocksareplacedinorderfrom tion(Fig.1C).Eventually,theseblocksweredistributedoversev- humanchromosome1toX.Thereconstructionstepstartedwith eral human chromosomes (Fig. 1D) because of intensive inter- categorizing the blocks into groups that were derived from a chromosomalrearrangementsinthemammalianlineage(Burtet singleancestralvertebratechromosomebeforethe2RWGD.For al.1999).However,intheteleostfishlineage,intensivechromo- thispurpose,anytworedCVLblockssharingseveralohnologs some fusions and another WGD occurred after the divergence wereplacedinthesame“vertebrate”group(Fig.2F).Thesubse- fromtheosteichthyanancestor. quentstepdividedtheblockswithineachgroupintosubgroups As illustrated in Figure 1, conserved vertebrate linkage thatrepresentedduplicatedproto-chromosomesinthegnathos- groups, which are groups of genes located on a single chromo- tomeancestor.Tothisend,weperformedanexhaustivesearch someafterthesecondroundofWGD,constitutechromosomal fortheoptimalsubgroupingsothatsignificantlymoreohnologs blocksinthehumangenome.Theseblockscanalsobeidentified weresharedbetweendistinctsubgroups,asillustratedbythered by doubly conserved synteny (DCS) analysis with the medaka regionsinFigure2G,whereasfewohnologswereobservedwithin genome(Jaillonetal.2004;Kellisetal.2004).Forexample,al- anysinglesubgroup,asindicatedbythegreenregions.Thede- thoughblocks19a,19b,and19carelocatedconsecutivelyinthe tailsofthereconstructionprocedurearedescribedintheMeth- human genome, both 19a and 19c are syntenic to each of the ods. Our reconstruction in Figure 2G shows that in five of 10 duplicatedmedakachromosomes4and17,whereasduplicatesof ancestral vertebrate chromosomes, estimating four sister chro- 19barefoundintwoduplicatedmedakachromosomes(1and8). mosomeswasmoresignificantthaninferringtwo,three,orfive Otherneighboringblocks,i.e.,1b–1c,3a–3b,7a–7b,and9a–9b, sister chromosomes. CVL blocks with a small number of arelocatedondistinctmedakachromosomes.Thisimpliesthat ohnologsfailtoberecognizedassisterchromosomesbecauseof decidingtheboundariesofsuchneighboringblocksisextremely theirlowstatisticalsignificance,therebyyieldingareconstructed important; however, the task was difficult when genes on the vertebrategroupwithlessthanfourduplicatedchromosomes.In Genome Research 1255 www.genome.org Downloaded from genome.cshlp.org on February 22, 2023 - Published by Cold Spring Harbor Laboratory Press Nakatani et al. Figure1. Vertebratechromosomeevolutionscenario.(A)Forsimplicity,weillustratetwoproto-chromosomes(redandbluebars)duplicatedbythe firstroundofWGD.Subsequently,fissiondividedoneoftheduplicatedchromosomes.(B)ThesecondroundofWGDdoubledtheproto-chromosomes. Blocksinchromosomesarelabeledwiththeirrespectivechromosomepositionsinthehumangenome.(C)AfterthesecondWGD,earlyvertebrates underwentslowchangesinkaryotypeoveralongevolutionaryprocess.(D)Intheancestralmammalianlineage,intensiveinterchromosomalrear- rangementsoccurredandtheancestralchromosomeswerebrokenintosmallersegmentsthatweredistributedacrossmanyhumanchromosomes.(E) Intheancientray-finnedfishlineage,intensivechromosomefusionsmergedtheancestralchromosomalsegmentsintoancestralteleostchromosomes. (F)AnotherroundofWGDintheancestralteleostdoubledproto-chromosomes,butafterward,fewglobalrearrangementsshapedthepresentmedaka genome. 1256 Genome Research www.genome.org Downloaded from genome.cshlp.org on February 22, 2023 - Published by Cold Spring Harbor Laboratory Press Vertebrate genome evolution Figure2. (Legendonnextpage) Genome Research 1257 www.genome.org Downloaded from genome.cshlp.org on February 22, 2023 - Published by Cold Spring Harbor Laboratory Press Nakatani et al. contrast,fiveancestralvertebratechromosomes(A–E)hadfairly proto-chromosomesbeforethedivergencefromvertebrates,pre- largegroupsofCVLblockswith>250ohnologgenepairs,pro- sumably because of numerous chromosome rearrangements. vidingstrongsupportforthe2Rhypothesis. Therefore,atthisstage,weestimatedthattheancestralvertebrate had 10–13 proto-chromosomes before the 2R WGD. We assert Assumingfusionsandfissionsbetweenthe2RWGD thattheancestralgnathostomehad40proto-chromosomes,and Forsimplicity,wehavethusfarassumednomajorchromosome wasnotaffectedbytheselectionofafusionorfissionscenario. rearrangementsbetweenthe2RWGD.Here,weextendthesim- plifiedmodeltoconsiderthepossibilityofchromosomefusions Validationofreconstructedproto-chromosomes andfissionsbetweenthe2RWGDevents,thoughhowtodetect Tocheckthevalidityofthereconstructedproto-chromosomes, remains of these events is a nontrivial question. Figure 3A wesearchedforagenomethatpreservedtheancestralchromo- presents vertebrate groups A, B, and F in Figure 2F, and it is somes among available deuterostome, chordate, and vertebrate remarkabletoobservepairsoflightblueboxeswithfewohnologs genomes.Theseaurchindraftgenome(SeaUrchinGenomeSe- inindividualtrianglesbecausetheseboxesareexpectedtocon- quencingConsortium2006)islargelyfragmentedintoverysmall tainmanyohnologsaccordingtothesimplifiedmodelassumed pieces,whereastheCionagenome(Shoguchietal.2006)experi- sofar.Wethinkthatthesephenomenaprovidecluestotheprob- enced numerous genome rearrangements after the divergence lemofinferringgenomerearrangementsbetweenthe2RWGD; from vertebrates >500 million yr ago; for example, even Hox here,weshowthreealternativescenariosthatmayproducethese genes are separated onto multiple chromosomes (Ikuta et al. phenomena. 2004). Thus, these two genomes fail to preserve proto- Figure3Bshowsthefirstscenarioinwhichduplicatesoftwo chromosomes.Wealsoexaminedvertebrategenomesandfound ancestralchromosomesarefused,whereasFigure3Cshowsthe that rat, mouse, and dog genomes did not preserve the proto- second scenario, in which one duplicate of a single ancestral karyotypebecauseofseveralinterchromosomalrearrangements chromosomeisdividedbyafissionevent.Bothscenariosgener- in the mammalian lineage (Burt et al. 1999). Among tested ge- atethesameohnologdistributionpatternamongthesixdescen- nomes, the chicken genome significantly preserved the recon- dent chromosomes, as illustrated in Figure 3E; note that pairs structedancestralgnathostomechromosomes.Moreprecisely,a A21–B22andB21–A22havenoohnologs.Thethirdscenarioin series of CVL blocks that occur both in one ancestral gnathos- Figure 3D indicates that neither fissions nor fusions take place tome chromosome and in one chicken chromosome presents between the 2R WGD events, but independent fission events strongevidenceforthereconstruction;incontrast,occurrences occur for duplicated sister chromosomes after the 2R WGD. As of CVL blocks of one chicken chromosome in different sister illustratedinFigure3F,thepairA21a–A22bisfreeofohnologs; chromosomesofthegnathostomeancestorsuggestintensivein- however, some ohnologs are observed in the pair A21b–A22a. terchromosomalrearrangementsintheavianlineageorpossible Bothpairswouldhavenoohnologsifthetwoindependentfis- mistakes in the reconstruction. We observed that CVL blocks sion events took place at the identical chromosomal positions, located in the same chicken microchromosomes belonged pri- butthiscaseislesslikely.Amongthethreescenarios,theformer marilytothesameancestralgnathostomechromosomes(Fig.4). twoscenariosofonefusionorfissioneventaremoreparsimoni- Tobeprecise,chickenchromosomes7,8,11,15,19,20,21,22, ousthanthethirdscenariooftwoindependentfissions,andthey 23,24,27,and28hadone-to-onecorrespondencetoE0,A2,B4, better explain pairs of light blue boxes with few ohnologs in H0,H1,B1,F0,C3,B3,J0,E2,andA1,respectively(Fig.5).Thus, Figure3A.Thus,weestimatedthatamong10vertebrategroups, thisclearcorrespondencebetweenchickenandgnathostomean- A, B, and F had fusions or fissions, whereas the other seven cestor chromosomes provides a firm grounding for our recon- groupswerefreeofthesechromosomerearrangements. struction. However, it remains difficult to judge whether a fusion or fission happened between the 2R WGD; each of the vertebrate Reconstructionofancestralosteichthyanandamniote groups A, B, and F may have had two ancestral chromosomes proto-chromosomes before the 2R WGD according to the fusion scenario, or one proto-chromosomeiffissiontookplace.Thisargumentmaybe The reconstruction of proto-chromosomes in the gnathostome settled by using the genomes of some invertebrates as an out- ancestorallowedustoinferthekaryotypesofevolutionarilyim- group.However,thedraftgenomeoftheseaurchin(SeaUrchin portantancestorssuchasthebonyvertebrate(osteichthyan)an- Genome Sequencing Consortium 2006) consists of very small cestorandtheamnioteancestor,leadingtonovelinsightsinto genomicfragments,andtheCionadraftgenome(Shoguchietal. genomeevolutionafterthe2RWGD.Previousstudieshavesug- 2006),withanearlycompletechromosomemap,failstopreserve gestedthattheosteichthyanproto-karyotypewasn≈12bytreat- Figure2. Modelofvertebrategenomeevolutionandreconstructionoftheancestralgenome.(A)Forsimplicity,supposethattheancestralchro- mosomehad10genes.The2RWGDproducedohnologs(bluedotsalongthediagonallineinthetriangulardotplot)intheduplicatedchromosomes. (B)Chromosomebreaksandinversionsmayhavealteredtheorderofohnologsonthesisterchromosomes.(C)Inthecourseofearlyvertebrategenome evolution,theancestralgeneorderwasdisruptedbymanyinversions,resultinginscatteredohnologdots.(D)Eventually,CVLblocksweredistributed acrossseveralhumanchromosomesbyintensiveinterchromosomalrearrangements.(a–d)Atypicalmodelofgenomeevolutioninvolvingthe2RWGD. In the next step, we handle real human genome data. (E) This is a real instance of the dot plot in D. CVL blocks were ordered from the human chromosomes1toX,andohnologssharedamongtheseCVLblockswereplotted.(Red)RegionsrepresentingpairsofparalogousCVLblockswitha greatnumberofohnologs(P<10(cid:1)4,seeMethods).(F)ThiscorrespondstothestateinC.CVLblockswerereorderedinsuchawaythatparalogous CVLblocksweregroupedsothateachgrouprepresentedoneancestralvertebratechromosome(seeMethods).(G)Thisstatecorrespondstothatin B.CVLblockswithinindividualvertebrategroupswerefurtherreorderedtoobtainancestralgnathostomesubgroups(namely,chromosomes),which wereduplicatedfromasingleancestralvertebratechromosomebythe2RWGDevents.Thepartitionofsubgroupsthatoptimizesthesignificance definedintheMethods.RectanglesandtrianglesarecoloredinaccordancewiththoseinBtomakethecorrespondenceclear.(H)Thevertebrategroup Awasdecomposedintofourgnathostomesubgroupsbystatisticalanalysis,indicatingthattheancestralchromosomeunderwent2RWGD. 1258 Genome Research www.genome.org Downloaded from genome.cshlp.org on February 22, 2023 - Published by Cold Spring Harbor Laboratory Press Vertebrate genome evolution Figure3. Modelsofchromosomeevolutionduring2RWGD.(A)VertebrategroupsA,B,andEinFig.2F.Pairsoflightblueboxeswithfewohnologs maybetheremainsofgenomerearrangementsbetweenthetwoWGDevents.(B)Twodistinctancestorchromosomes(AandB)wereduplicatedby thefirstWGD,andduplicatedchromosomesA1andB1underwentachromosomefusion.(C)Oneancestorchromosomeconsistingoftwoparts(Aand B)wasduplicatedbythefirstWGD,andacopyoftheduplicatedchromosomeswassplitintotwochromosomes(A1andB1)byachromosomefission event.(D)AfterthesecondroundofWGD,twoofthefoursisterchromosomesunderwenttwoindependentchromosomefissions.Becausethetwo fissionssplittwochromosomes(A11andA12)atdifferentchromosomepositions,blocksA21bandA22ahavesomeohnologsincommon.(E)Bothof thefusionandfissionmodelsinBandCproducethesamedistributionofohnologs,especiallypairsoflightblueboxeswithoutohnologs.(F)The distributionoriginatingfromindependentfissioneventsinDisdistinguishablefromthatinEasindicatedbyonelightbluebox. Genome Research 1259 www.genome.org Downloaded from genome.cshlp.org on February 22, 2023 - Published by Cold Spring Harbor Laboratory Press Nakatani et al. Figure4. Reconstructedancestralchromosomes.AncestralvertebratechromosomesA,B,andFhadtwoalternativescenarios,fusionsorfissions, betweenthe2RWGDevents,asshowninFig.3.Thus,thenumberofproto-chromosomesrangesfrom10to13dependingonthechoiceoftwo alternatives.Thefigureillustratesthescenarioinwhichonlyfissionstookplace.Tenreconstructedproto-chromosomesinthevertebrateancestorshown atthetopareassigneddistinctcolors,andtheirdaughterchromosomesinthegnathostomeancestoraredistinguishedbytheirrespectiveverticalbars. Inthegenomesoftheosteichthyan,teleost,andamnioteancestors,andhuman,chicken,andmedakagenomes,genomicregionsareassignedcolors andverticalbarsthatrepresentcorrespondencesofindividualregionstotheproto-chromosomesinthegnathostomeancestorfromwhichrespective regionsoriginated.Unassignedblocksareshownintherightmostchromosome(Un)intheosteichthyanandamnioteancestors. 1260 Genome Research www.genome.org Downloaded from genome.cshlp.org on February 22, 2023 - Published by Cold Spring Harbor Laboratory Press Vertebrate genome evolution Figure5. Syntenicchromosomecorrespondencebetweenthechickenandgnathostomeancestorchromosomes.Inthetable,thenumberinacell indicatesthenumberoforthologousgenesinsyntenicregions(seeSupplementalmaterials)betweenthechickenchromosomeinthecolumnandthe reconstructedgnathostomeproto-chromosomeintherow.Toemphasizechickenchromosomesthatwereprimarilyderivedfromasingleancestral gnathostomechromosome,cellswiththemaximumnumberoforthologsarecoloredyelloworred.Inparticular,redcellsimplychickenchromosomes thathaveone-to-onecorrespondencetotheirancestralgnathostomechromosomes. ingtheteleostancestorofmedaka,Tetraodon,andzebrafishbe- teleostancestor,andgnathostomeancestor.Wecallthisrulethe foreteleostWGDastheosteichthyanancestor(Postlethwaitetal. 2-of-3ruleandwedescribethedetailsandrationaleoftherulein 2000;Jaillonetal.2004;Naruseetal.2004;Woodsetal.2005)or the Methods. The resulting osteichthyan proto-karyotype was byleavingthepossibilityofchromosomefusionsoutofconsid- n≈31, which confirms our postulate that the number of oste- eration(Kohnetal.2006).However,thishypothesis(n≈12)ap- ichthyanproto-chromosomesismorethanthepreviousestima- pears to disagree with our observations because many chromo- tionofn≈12. somes in the chicken genome (n=39) have one-to-one corre- Regarding the amniote ancestor, Kohn et al. (2006) esti- sponding proto-chromosomes in the gnathostome ancestor mated n≈18 using the teleost ancestor as an outgroup of the (n≈40),asshowninFigure5,whereasmostchromosomesinthe human and chicken genomes. However, in our reconstruction, teleostancestor(n≈13)havesyntenyblockswithmultiplechro- theteleostancestorgenomeunderwentseveralrearrangements, mosomesinthegnathostomeancestor.Theseobservationsledus especiallyfusions,afterdivergencefromtheosteichthyanances- to postulate that the osteichthyan proto-karyotype had more tor. Thus, we used the reconstructed osteichthyan ancestor, in chromosomes than the previous estimate (n≈12), and that place of the teleost ancestor, together with the human and manygenomerearrangementstookplacefromtheosteichthyan chicken genomes to reconstruct the amniote ancestor chromo- proto-karyotypetotheteleostancestor. somes. Using the 2-of-3 rule, we estimated that the number of Totestthishypothesis,weattemptedtoreconstructtheos- amniote proto-chromosomes was n≈26, which is significantly teichthyan proto-karyotype by comparing the chicken and an- largerthanthepreviousestimateofn≈18.Overall,weconclude cestralteleostkaryotypeswiththatofthegnathostomeancestor. thatchromosomenumbersinearlyvertebratesareconsiderably Thisapproachissomewhatdifferentfromthetraditionalmethod larger than previous estimates (Jaillon et al. 2004; Naruse et al. ofusinganoutgroupofthechickenandteleostancestorbecause 2004;Woodsetal.2005;Kohnetal.2006). weinsteadusedthegnathostomeancestor,whichisanancestor of the osteichthyan ancestor. In the reconstruction, two CVL Karyotypeevolutionandtaxon-specificfusionmodel blocks were expected to occur in the same osteichthyan proto- chromosomeifthetwoCVLblockswerelocatedontheidentical ThereconstructedancestralkaryotypesinFigure4provideasce- chromosome of at least two of the three karyotypes: chicken, narioforkaryotypeevolutioninvertebrates,asdepictedinFigure Genome Research 1261 www.genome.org Downloaded from genome.cshlp.org on February 22, 2023 - Published by Cold Spring Harbor Laboratory Press Nakatani et al. Figure6. Changesinchromosomenumberduringvertebratekaryotypeevolution.Thereconstructedproto-chromosomesinFig.4allowustodiscuss howthenumberofchromosomeshaschangedinindividualvertebratelineages.(Left)Thephylogenetictreeofvertebrates,(right)thedistributionof chromosome number in individual lineages (Gregory et al. 2007; Animal Genome Size Database, http://www.genomesize.com). Considering the numbers of proto-chromosomes in Fig. 4, two ancient whole-genome duplication events almost quadrupled the number of chromosomes; subse- quently,chromosomenumbersinindividuallineagestendedtodecreaseinmanylineages,althoughnotintheavianlineage. 6.BeforethefirstroundofWGD,thevertebrateancestorkaryo- model is that the ancestors of major vertebrate groups such as typewasn≈10–13,andthesubsequent2RWGDandsomege- teleosts, amphibians, squamates, and metatherians underwent nome rearrangements yielded the gnathostome ancestor of numerous chromosome fusions (except for birds), leading to n≈40. After the divergence of bony vertebrates (Osteichthyes) slow changes in karyotype, as indicated by the distribution of and cartilaginous fishes (Chondrichthyes), genome rearrange- chromosome number. Similar speculations have been made re- mentsreducedthenumberofchromosomesintheosteichthyan gardingamphibiansandreptiles;Morescalchipostulatedthatthe ancestorton≈31.Inthemeantime,moderncartilaginousfishes number of chromosomes decreased in the ancestral amphibian havegenomescomprisingavarietyofkaryotypes;thenumberof based on the large number of chromosomes in some primitive chromosomes in the haploid set, n, ranges from 25 to 50. We amphibians(GreenandSessions1991).Withrespecttoreptiles, speculatethatsomeofextantcartilaginousfishesmayretainthe Olmo(2005)indicatedthatancestralreptileshadahigherrateof gnathostomeancestralkaryotype(n≈40),assumingnogenome karyotypicchanges.Ourkaryotypeevolutionmodelisconsistent duplicationeventstookplaceinthecartilaginouslineage.After withthesehypotheses. the divergence of ray-finned and lobe-finned fishes, in the lin- Incontrast,alargeincreaseinthenumberofchromosomes eage of ray-finned fishes (Actinopterygii), chromosome fusions was found in the avian lineage. One explanation is that fewer reducedthenumberofchromosomesandproducedtheteleost repeatsequencesinaviangenomeswerelikelytoresultinchro- ancestorwithn≈13.Ourfusionmodelmadeitnecessarytore- mosome fissions rather than fusions and interchromosomal re- vise the n≈12 ancestral osteichthyan proto-karyotype hypoth- arrangements(Burtetal.1999;Burt2002).Anotherinteresting esis,whichtreatstheteleostancestorof∼350Myaastherepre- finding involved microchromosomes in the avian lineage. Burt sentativeoftheosteichthyanancestorof∼450Mya(Postlethwait arguedthatmanyavianmicrochromosomeswerealreadypresent etal.2000;Jaillonetal.2004;Naruseetal.2004;Woodsetal. inthecommonancestorofbirdsandotherterrestrialvertebrates 2005;Kohnetal.2006).Subsequently,thewhole-genomedupli- andwereconservedfor>400millionyr(Burt2002).Ouranalysis cation in the teleost ancestor doubled the number of chromo- extendsthisideaconsiderablybyshowingthatmanyofthemi- someston≈26.Thenumberofchromosomesintheteleostlin- crochromosomes had one-to-one correspondence with proto- eagehasremainednearlyunchangedduringevolution,andthe chromosomesinthegnathostomeancestor. chromosome numbers of extant teleost species peak at n=24 Weconcludethatinearlyvertebrategenomeevolution,two or25. ancientwhole-genomeduplicationeventsincreasedthenumber Inthelineageoflobe-finnedfishes(Sarcopterygii),thedis- of chromosomes, but subsequent chromosome fusions reduced tributionofchromosomenumberdifferslargelyacrossvertebrate the number of chromosomes in individual lineages, with the taxa (Fig. 6). In particular, an interesting observation in our exceptionoftheavianlineage. 1262 Genome Research www.genome.org Downloaded from genome.cshlp.org on February 22, 2023 - Published by Cold Spring Harbor Laboratory Press Vertebrate genome evolution Conclusions blocks were produced by independent gene duplications and concludedthatthetwoCVLblockswereparalogous.Thecruxis We present the first study to reconstruct the vertebrate proto- howtoselectameaningfulvalueasthethreshold.Weidentified karyotype before the 2R-WGD events. The reconstruction was 118 CVL blocks (for details, see the Supplemental Document). made possible by the newly developed computational method BecausethetotalnumberofallpairsofCVLblocksamountedto outlined in Figures 2 and 3. The reconstructed genome organi- 6903,themaximumthresholdoftheprobabilitywassetto10(cid:1)4 zationofthevertebrateancestorprovidesnewinsightsintoearly so that the expected number of false-positive paralogous pairs vertebrate genome evolution. First, some rearrangements oc- was<1.InFigure2,EandF,regionsrepresentingparalogousCVL curred during the 2R WGD in the ancestral vertebrate. Second, blocksarecoloredrediftheP-value<10(cid:1)4. afterthesecondWGD,thegnathostome,osteichthyan,andam- Reconstructionofsisterchromosomesinthegnathostome niote ancestors underwent slow karyotype evolution. Third, ancestorbygroupingparalogousCVLblocks rapid,lineage-specificchromosomefusionsshapedtheancestral genomesofmajorvertebratetaxasuchasteleosts,amphibians, We began by constructing an undirected graph in which the reptiles,andmarsupials. nodesareCVLblocksandedgesaredrawnbetweentwoparalo- Ourreconstructionwillalsointegratenewlysequencedge- gous CVL blocks. The graph was then divided into connected nomessuchasthepipidfrog(Xenopustropicalis),grayshort-tailed componentssothattherewasapathbetweenanytwonodesin opossum(Monodelphisdomestica),andamphioxus(Branchiostoma aconnectedcomponent,asillustratedinSupplementalFigureS7. floridae) into the global picture of vertebrate and chordate ge- CVLblocksrepresentedbynodesinoneconnectedcomponent nome evolution. Our presumption is that the X. tropicalis ge- were thought to be derived from one vertebrate proto- nomeevolvedfromtheosteichthyanancestorgenomeafterun- chromosomebeforethe2RWGD,asillustratedinFigure2H. Subsequently, to estimate proto-chromosomes in the gna- dergoingnumerouschromosomefusions.Becausethosefusions thostomeancestorthatwereproducedbyWGDevents,nodesin tookplacespecificallyintheancestralamphibianlineage,they oneconnectedcomponentwerefurtherpartitionedintoseveral wouldbelargelydifferentfromfusionsthattookplacefromthe subgroupsofnodessothateachsubgrouprepresentedoneproto- osteichthyanancestortotheteleostancestor,whichwillmakeit chromosomeinthegnathostomeancestor.Thenumberofgna- difficulttoidentifyone-to-onecorrespondencebetweenchromo- thostome proto-chromosomes originating from one vertebrate somesofX.tropicalisandtheteleostancestor. proto-chromosomewasexpectedtobefourassumingthatthe2R WGDeventstookplace(DehalandBoore2005),butweenumer- Methods atedpatternsofpartitioningCVLblocksintotwo,three,four,or fivesubgroupstosearchforthemostsignificantpattern.Specifi- Proteinsequencesforgenesandidentificationofohnologs cally, for each connected component, all combinations of CVL blockswereenumeratedwiththeconstraintthatanyparalogous Human, mouse, dog, chicken, and Tetraodon protein sequences CVLblockscouldnotbecombinedintothesamesubgroup.To wereobtainedfromEnsembl(Birneyetal.2006)(v.36–Dec2005); definethesignificanceofonepartition,CVLblocksinonesub- Takifugu(version2)fromEnsembl;Cionaintestinalis(version1) group (one gnathostome proto-chromosome) are unlikely to from JGI (http://www.jgi.doe.gov); and Strongylocentrotus purpu- share ohnologs, as illustrated in the green regions of Figure 2; ratus(version2.1)fromNCBI.Ifmultiplesequencesoverlapped however,incontrast,manypairsofCVLblocksindistinctgna- inthegenome,theywererepresentedbythelongestsequence. thostome proto-chromosomes are likely to share a significant We conducted all-against-all BLASTP searches (Altschul et al. 1997) (E-value<1(cid:2)10(cid:1)10). Special treatments were necessary number of ohnologs, as shown by red boxes in Figure 2. We therefore measured the significance of one partition with n toidentifyohnologsproducedbythe2RWGD(seedetailsinthe ohnologsamongdifferentsubgroups(i.e.,nohnologdotsinthe Supplementalmaterials). redregionsinFigure2H)bytheprobabilitythat(cid:1)nohnologsare IdentificationofparalogousCVLblocksforgroupingCVL sharedbydifferentsubgroupsundertherandomdistributionof blocks ohnologs in the whole triangular region. More precisely, the probability, denoted by p, is defined by assuming that c is the We evaluated the significance of the number of ohnologs, de- totalnumberofpairsofgenesinthefocalconnectedcomponent, notedbyn,thattwoCVLblocksshareincommonbyevaluating misthetotalnumberofohnologsintheconnectedcomponent, the probability that the two CVL blocks have (cid:1)n ohnologs in andristhetotalnumberofpairsofgenesindifferentsubgroups commonundertheconditionthatallohnologsaredistributedat ofthepartition.Intuitively,cistheareaofthewholetriangular randominallCVLblocks.Theprobabilityisdefinedasfollows: region in Figure 2H, and r is the total area of red regions. The LetNbethetotalnumberofgenesinallCVLblocks,mbethe probability p is the total number of cases when each pair of totalnumberofohnologsamongallCVLblocks,andxandybe paralogous CVL blocks in different subgroups of the partition therespectivenumberofgenesinCVLblocksXandY.Assuming shares(cid:1)nohnologsdividedbythetotalnumberofcaseswhenm arandomdistributionofohnologsamongCVLblocks,theprob- ohnologsappearinpairsoftwoarbitraryCVLblocks: abilityofobserving(cid:1)nohnologsisapproximatedby (cid:1) (cid:2)(cid:1) (cid:2)(cid:3)(cid:1) (cid:2) p=(cid:1)m(cid:1)mi(cid:2)qi(cid:2)1−q(cid:3)m−1,whereq=N(cid:2)N2x−y1(cid:3). p=(cid:1)im=n ri mc−−ri mc . i=n Here,pisthesummationoftheprobabilitythati((cid:1)n)amongm Thepartitionwiththeminimumprobabilityisthemostsignifi- ohnologsaremembersofXandYwiththeprobabilityqthatan cantone,andthesubgroupsinthepartitionaretreatedasgna- arbitrarypairofgenesselectedfromallN(N(cid:1)1)/2pairsbecomes thostomeproto-chromosomes. anohnologsharedbyXandY. In the literature, some algorithms have been proposed for If the probability was at most a reasonable threshold, we reconstructingpre-duplicatedgenomes(El-MabroukandSankoff rejectedthenullhypothesisthatohnologsbetweenthetwoCVL 2003;Kellisetal.2004).Theideaofdoublyconservedsynteny Genome Research 1263 www.genome.org
Description: