ebook img

Inferring ancient metabolism using ancestral core metabolic models PDF

17 Pages·2013·1.28 MB·English
by  
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Inferring ancient metabolism using ancestral core metabolic models

Baumleretal.BMCSystemsBiology2013,7:46 http://www.biomedcentral.com/1752-0509/7/46 RESEARCH ARTICLE Open Access Inferring ancient metabolism using ancestral core metabolic models of enterobacteria David J Baumler1*, Bing Ma1,4, Jennifer L Reed2 and Nicole T Perna1,3 Abstract Background: Enterobacteriaceae diversified from an ancestral lineage ~300-500 million years ago (mya) intoa wide variety of free-living and host-associated lifestyles.Nutrient availability varies across niches, and evolution of metabolic networks likely played a key role inadaptation. Results: Here weuse a paleo systems biology approach to reconstruct and model metabolic networks of ancestral nodes oftheenterobacteria phylogeny to investigate metabolism of ancient microorganisms and evolution ofthe networks. Specifically, we identified orthologous genes across genomes of 72 free-livingenterobacteria (16 genera), and constructed core metabolic networks capturingconserved components for ancestral lineagesleading to E.coli/ Shigella (~10 mya), E.coli/Shigella/Salmonella (~100 mya), and allenterobacteria (~300-500mya). Using these models we analyzed thecapacity for carbon, nitrogen, phosphorous, sulfur, and iron utilization in aerobic and anaerobic conditions, identified conserved and differentiating catabolic phenotypes, and validated predictions by comparison to experimental data from extant organisms. Conclusions: This is a novel approach using quantitative ancestral modelsto study metabolicnetwork evolution and may be useful for identification of new targets to control infectious diseases causedby enterobacteria. Keywords: Constraint-based modeling, Enterobacteria, Metabolic network reconstruction, Ancient metabolism, Paleo systems biology, Ancestral core Background has revealed some of the genomic variations linked to Initially named for a group of intestinal bacteria, mem- host/niche specialization. The metabolic gene content of bers of the family Enterobacteriaceae are distributed these genomes is complex, with each strain predicted to worldwide and are found in soil, water, agronomic crops contain over 800 genes encoding metabolic enzymes and and produce, plants and trees, and in animals ranging transporters.Onemethodtoinvestigatethecomplexityof from insectsto humans. Pathogenic enterobacteriacause genome-scale metabolic networks is through the con- biomedically and agriculturally significant diseases, and structionofcomputationalmodels. historically have resulted in numerous pandemics, food- Computational modeling of bacterial metabolism of- borne outbreaks, and nosocomial infections, arguably fers a promising approach to predict strain-to-strain impacting human health more than any other microbial variation in metabolic capabilities and microbial strat- family. Enterobacteria have been extensively studied in egies used in different environments, including host tis- the laboratory due to their importance to human health sues. The number of available genome-scale metabolic and as standard laboratory strains for molecular biology. models (GEMs) has grown in the last ten years to over The family includes 44 distinct genera and 176 named 50 GEMs, and they capture the metabolic capabilities of species [1], and there are over 150 complete or nearly numerous microbial taxa important to human health, complete genomes currently available for enterobacteria. biotechnology and bioengineering [2,3]. Systems biology Extensive comparative analysis between these genomes combines computational and experimental approaches to study the complexity of biological networks at a systems *Correspondence:[email protected] level,wherethecellularcomponentsandtheirinteractions 1GenomeCenterofWisconsin,UniversityofWisconsin-Madison,Madison, lead to complex cellular behaviors. Genome-scale bio- Wisconsin,USA logical networks have proven usefulfor interpretinghigh- Fulllistofauthorinformationisavailableattheendofthearticle ©2013Baumleretal.;licenseeBioMedCentralLtd.ThisisanOpenAccessarticledistributedunderthetermsoftheCreative CommonsAttributionLicense(http://creativecommons.org/licenses/by/2.0),whichpermitsunrestricteduse,distribution,and reproductioninanymedium,providedtheoriginalworkisproperlycited. Baumleretal.BMCSystemsBiology2013,7:46 Page2of17 http://www.biomedcentral.com/1752-0509/7/46 throughput data and generating computational models. Results Mathematical models are constructed from network re- The first GEM for E. coli K-12 MG1655, was developed constructions, and they include variables, parameters, 10 years ago and has undergone numerous improve- and equations to describe the potential behavior of ments and updates. It is now a sophisticated compart- these networks. Starting with E. coli K-12 numerous mentalized model containing over 1,300 genes and 2,400 types of genome-scale biological networks have been reactions [4,7]. It has been used extensively for biotech- constructed including metabolic, regulatory, and tran- nology, discovery applications, and to study evolutionar- scriptional and translational machinery [4-9], and add- ily related enterobacteria. Here we generated ancestral itionalGEMsforadditionalenterobacteriahaverecently core metabolic GEMs at three phylogenetic branching beenconstructed[4,10-15]. points within the family Enterobacteriaceae from a E. coli To date, GEMs of enterobacteria have been constructed K-12 MG1655 GEM [4] based on the retained metabolic for three standard laboratory E. coli strains [4,6-8,10], four capability determined through a comparative genomic pathogenicE.colistrains[4],oneSalmonellastrain[14,16], analysisof72enterobacterialgenomes.Wevalidatedthese oneKlebsiella strain [12], two Yersinia strains [10,13], and models by comparing in silico carbon source utilization one insect endosymbiont, Buchnera [15]. These GEMs predictions to experimental data spanning 36 extant have been used to bioengineer strains for valuable end strains from 16 genera to examine the shared metabolic product formation [17-22], to conduct simulations to capabilities of modern-day enterobacteria and the impact investigate metabolic processes during host-pathogen ofchangesinthemetabolicnetworkonphenotypictraits. interactions [14], to identify differentiating metabolic properties between commensal and pathogenic E. coli Phylogeneticreconstructionforthefamily strains [4], and to provide insight into the genome evo- Enterobacteriaceae lution of other enterobacteria [23-25]. In addition to Atotalevidencetreewasconstructedfortheenterobacteria strain-specific enterobacterial GEMs, recently 16 E. coli with available genome sequence data (Figure 1). The genomes were used to construct models from the total evidence tree is extremely robust with full support combined genomic content of these E. coli strains, at every internal tree node. Trees were concordant be- representing the intersection (ancestral core) and union tween the neighbor joining and maximum likelihood (pangenome) and revealed new insight into the evolu- methods,aswellasbetweenthetotalevidencetreesand tionofthisspecies[4]. consensus trees. This phylogeny in phylogram form Members of the family Enterobacteriaceae diversified from a common ancestor ~300-500 million years ago Citrobacter (mya) into a wide variety of free-living and host- Salmonella associated lifestyles [26,27], yet based on conserved Escherichia metabolic phenotypes of all modern enterobacteria, lit- Shigella Enterobacter tle is known about ancestral traits of metabolism be- Leclercia yond that theywereableto catabolize glucose and grow Klebsiella in the presence or absence of oxygen [1]. Here the me- Yokenella Cronobacter tabolism of ancient microorganisms has been investi- Cedecea gated by identifying orthologous genes shared in the Erwinia genomes of 72 free-living enterobacteria from 16 gen- Pantoea era, and constructing metabolic networks representing Brenneria Pectobacterium the ancestral core at three phylogenetic points: the E. Dickeya coli/Shigella ancestral core (~10 mya), the E. coli/ Serratia Shigella/Salmonella ancestral core (~100 mya), and the Yersinia Ewingella enterobacterial ancestral core (~300-500 mya). Using Edwardsiella these metabolic models we have analyzed the metabolic Hafnia capacity for carbon, nitrogen, phosphorous, sulfur, and Photorhabdus Xenorhabdus iron utilization in aerobic and anaerobic conditions and Proteus have identified conserved and differentiating catabolic Pseudomonas phenotypesandvalidatedthesepredictionsbycomparison Vibrio toexperimentaldata.Apartfromourpreviouspublication 10000 on E. coli, this is the first study to use constraint-based Figure1Phylogeneticreconstructionforthefamilyof modelingtoexaminethemetabolicpropertiesofancestral Enterobacteriaceae.TotalevidenceMaximumLikelihoodtree (*indicatesphylogeneticbranchingpointforancestralcore bacteria and provides new insight into the evolution of metabolicmodels). metabolismforthefamilyEnterobacteriaceae. Baumleretal.BMCSystemsBiology2013,7:46 Page3of17 http://www.biomedcentral.com/1752-0509/7/46 indicates a general trend that plant-associated clades thatgenesconservedacrossallstrainsrepresentaconser- have a relatively deeper split including both the soft- vative estimate of the core genome of the ancestor of rotting clade with Dickeya and Pectobacterium, as well modernE. coli strains.From the 16 E. coli and seven Shi- as the Erwinia-Pantoea clade, suggesting ancient speci- gellaspp.genomesweidentifiedacollectionof2,073con- ation events for these genera. The clade including served genes to construct iEcoli_core (Figure 2). Of these Escherichia,Salmonellaandotheranimal-associatedor- conservedgenes,790havebeenexperimentallycharacter- ganisms shows relatively shorter intra-clade branches ized for metabolic function and are present in the E. coli length, indicating their relatively recent speciation. This K-12 MG1655 GEM (iEco1339_MG1655) [4]. As previ- phylogenetic tree for the enterobacteria was used to de- ously described for construction of ancestral core meta- termine the order of genomes used in subsequent ana- bolic models [4], a metabolic network for the E. coli/ lysisforancestralcoremetabolicgenedetermination. Shigella ancestral core was made by removing reactions from the iEco1339_MG1655 network if orthologs to the GenerationofanE.coli/Shigellacoremetabolicnetwork associated gene(s) were absent in one or more of the 23 E. coli and Shigella strains are thought to have diverged genomesandifthereactionsdidnothaveanyisozymes.If from a common ancestor ~10 mya [27]. Although this is removing a reaction prevented biomass production the most recent ancestral state we investigate here, gene (predicted using flux balance analysis) then the reaction lossesandacquisitionofgenesviahorizontaltransferhave was added back to the metabolic model without a gene ledtoextensivedifferencesingenomecontentamongde- associated with it and the reaction was classified as an scendants of this node with some E. coli strains differing orphan reaction (i.e. a reaction without any associated byasmuchas25%oftheirgenecontent[28].Itisofinter- genes).Usingthisapproach549ORFSassociatedwith454 est to understand the extent to which this has impacted reactionswereremovedfromiEco1339_MG1655resulting the metabolic network over this time frame. We assume in an E. coli core metabolic network (iEcoli_core) 4500 4000 3500 3000 2500 la t o T 2000 1500 1000 500 0 colrnK2istai1 Escerichiah glShiela Salonelaml Lelerccia eelaKlbsil Yokeneall Cedecea riEwnia Pantaoe Serratia wEingella Dicyake uectobacterim Yersinia aHafni Phoabdustorh E. P Enterobacteria member = Number of ancestral core genes = Number of genes in ancestral core metabolic network = Number of ancestral core reactions = Number of orphan reactions Figure2EnterobacterialcoreevolutionaccordingtothenumberofsequencedgenomeslistedinTable5. Baumleretal.BMCSystemsBiology2013,7:46 Page4of17 http://www.biomedcentral.com/1752-0509/7/46 consisting of a total of 790 ORFs and 1,674 reactions andSalmonellastrains.Weassumethatgenesconserved (Table 1), and the reactions retained in the core model across all strains represent a conservative estimate of wereclassifiedbasedonmetabolicsubsystem(Figure3). the core genome of the ancestor of modern E. coli and Salmonella strains and from these 16 E. coli, seven Shi- GenerationofanE.coli/Shigella/Salmonellacore gella spp., and 16 Salmonella spp. genomes the collec- metabolicnetwork tion of 1,703 conserved genes were used to construct E. coli and Salmonella strains are thought to have di- the ancestral metabolic model iSalcoli_core (Figure 2). verged from a common ancestor ~100 mya [27] and it is There are 683 of these genes that have characterized of interest to understand how the metabolic networks metabolic function in the E. coli K-12 MG1655 GEM have evolved over time to have an estimate of the meta- (iEco1339_MG1655) [4]. A metabolic network for the E. bolic capabilities of an ancestor to modern day E. coli coli/Shigella/Salmonella ancestral core was made as Table1Metabolicmodelinformationandreactionsubsystemclassification Model iEco1339_MG1655 iEcoli_core iSalcoli_core iEntero_core Genomesincludedinanalysis 1 23 39 72 Genes 1339 790 683 325 ReactionsTotal 2,128 1,674 1,601 1,191 OrphanReactions 100 207 272 677 Reactionsbysubsystem AlternateCarbonMetabolism 192 73 64 37 AminoAcidMetabolism 170 144 139 121 AnapleroticReactions 8 6 6 4 CarnitineDegradation 1 0 0 0 CellEnvelopeBiosynthesis 134 118 118 104 CitricAcidCycle 13 10 10 9 CofactorandProstheticGroupBiosynthesis 164 148 147 131 FolateMetabolism 6 6 5 4 GlycerophospholipidMetabolism 225 191 191 75 GlycineBetaineBiosynthesis 1 1 1 1 Glycolysis/Gluconeogenesis 22 20 19 15 GlyoxylateMetabolism 4 2 2 2 InorganicIonTransportandMetabolism 105 97 91 63 LipopolysaccharideBiosynthesis/Recycling 68 52 49 46 MembraneLipidMetabolism 46 34 33 15 MethylglyoxalMetabolism 8 7 7 5 MureinBiosynthesis 15 15 15 15 MureinRecycling 38 34 31 17 NitrogenMetabolism 13 4 3 0 NucleotideSalvagePathway 131 101 92 77 OxidativePhosphorylation 55 39 36 10 PentosePhosphatePathway 10 9 9 6 PurineandPyrimidineBiosynthesis 26 24 24 22 PyruvateMetabolism 10 8 7 6 Transport,InnerMembrane 307 198 174 103 Transport,OuterMembrane 39 28 25 9 Transport,OuterMembranePorin 247 247 247 247 tRNACharging 22 18 18 14 Unassigned 37 28 26 21 Baumleretal.BMCSystemsBiology2013,7:46 Page5of17 http://www.biomedcentral.com/1752-0509/7/46 350 300 s n250 o itc200 a e150 R 100 50 0 AmiAlntoerAncaitdeMCeatrabboonAlinsMaemptCClaeeolfrblaootElciitncsovrmRelaeonapcdteiPorBinoossstyhCientttirihccesGiArscoiudGplCyyBcicelorFesooylpnathtIhoenesosMpriGLelghistyaoaplcniboioplopcilIiyodlssoiyMnsms/eatTGrcalacubnhcolsaiorpisnodermeto gaBienondseysMMineestthamebbsroilaisns/emRLMeiecpityhdclyilMngelgtyaobxoallisMMumerteaibnoliBisomsyMnutrheieNnisiNtrsRuoeclgceeyonctliidMneegtOaSxaibldovliaatsigPveemePnPtuPraoithsnhoeeswPpaahynhoodrsyPplyahrtiaitomeindPiantehPBwiyaorTyrsuayvnanttsehpeoMrsTiter,staIannbsnolpeirosrtM,meOmutbreranMeemtbrRaNnAeChargiUnnagssigned = iEco1339_MG1655 = iEcoli_core = iSalcoli_core = iEntero_core (All reactions) = iEntero_core (Orphan reactions) Figure3E.coli/Shigella,E.coli/Shigella/Salmonella,andenterobacterialancestralcoremetabolicmodelcompositionandorphan reactionscontainedintheenterobacterialancestralcoremetabolicmodelclassifiedintosubsystems. before by removing reactions from the iEco1339_MG1655 genewasfoundthatspansall72enterobacterialgenomes, network if orthologs for the associated genes were absent yet four genes (b0241, b0929, b1377, or b2215) have in one or more of the 39 genomes. Reactions were added equivalent function for 244 of these reactions in the backasorphanreactionsiftheywereessentialforbiomass iEco1339_MG1655 network. For all 72 genomes of production during growth simulation in minimal media enterobacteria examined, one or more of these genes en- withglucoseasthesolecarbonsource.Usingthisapproach coding functionally equivalent proteins were found, and 657 ORFS associated with 527 reactions were removed led to the retention of these 244 reactions in the entero- from iEco1339_MG1655resulting inan E. coli/Salmonella bacterial core network classified as orphan reactions. coremetabolicnetwork(iSalcoli_core)consistingofatotal Usingthisapproach1,014ORFSassociatedwith937reac- of 683 ORFs and 1,601 reactions (Table 1), and the tions were removed from iEco1339_MG1655 resulting in reactions remaining were classified based on metabolic an enterobacterial core metabolic network (iEntero_core) subsystem(Figure3). consisting of a total of 325 ORFs and 1,191 reactions (Table 1), and the reactions remaining were classified Generationofanenterobacterialcoremetabolicnetwork basedonmetabolicsubsystem(Figure3). All members of the family of enterobacteria are thought to have diverged from a common ancestor ~300-500 Assessmentandvalidationofmodelsforcarbonsource mya [26]. The metabolic network present at that time utilization represents the backbone on which the metabolism of all To evaluate the accuracy of three ancestral core GEMs, modern enterobacteria was built. We again assume that we predicted if these ancestral strains could use 190 genes conserved across all strains represent a conserva- different carbon sources under aerobic and anaerobic tive estimate of the core genome of the ancestor of conditions. These predictions were then compared to ex- modern enterobacterial strains and from these 72 ge- perimental growth phenotypes for current strains mea- nomes the collectionof756conservedgenestoconstruct sured using Biolog phenotypic arrays or published carbon iEntero_core(Figure2).Thereare325ofthesegenesthat source utilization data for 38 enterobacteria spanning 23 have characterized metabolic function in the E. coli K-12 genera listed in Table 2 [1,4]. There are numerous strain- MG1655 GEM (iEco1339_MG1655)[4]. A metabolicnet- specific differences in carbon source utilization in both work for the enterobacterial ancestral core was made by aerobic (Additional file 1) and anaerobic conditions removing reactions from the iEco1339_MG1655 network (Additional file 2). For the experimental growth pheno- if one or more of the 72 genomes did not have an types, if any of the current strains could not grow on a orthologousgenetoassociatetothereactionandifthere- carbon source then we assumed the ancestral core model action was not essential. In the case of transport outer couldalsonotgrowonthecarbonsource.Theseexpected membraneporinreactions(n=247),nosingleothologous experimental results for the ancestral strains were then Baumleretal.BMCSystemsBiology2013,7:46 Page6of17 http://www.biomedcentral.com/1752-0509/7/46 Table2Sourcesforexperimentalcarbonsource Comparisons were made for all three ancestral meta- utilizationdataformoderndayenterobacterialstrains bolic models (Figure 4, Additional file 3). We compared Genusspeciesstrain Column Sourceor the accuracy of the ancestral models for carbon source number reference predictions to all other microbial GEMs that were vali- EscherichiacoliK-12MG1655 1 [4]andthisstudy datedthroughacomparisontocarbonsourceutilization EscherichiacoliK-12W3110 2 [4]andthisstudy data(Figure5),anddeterminedthattheaccuracyofcar- bon source utilization predictions for ancestral meta- EscherichiacoliEDL933 3 [4]andthisstudy bolic models was similar to the range of accuracy for EscherichiacoliSakai 4 [4]andthisstudy GEMs of extant bacteria under both aerobic and anaer- EscherichiacoliCFT073 5 [4]andthisstudy obicconditions[4,6,10-12,14,16,29-32]. EscherichiacoliUTI89 6 [4]andthisstudy Shigellaflexneri2457T 7 Thisstudy Insilicopredictionsfornitrogen,phosphorous,iron,and Shigelladysenteriae 8 [1] sulfurutilizationpredictions Once all three ancestral core models were validated Shigellaboydii 9 [1] through comparison to experimental data for determin- Shigellasonnei 10 [1] ing accuracy of in silico carbon source utilization using SalmonellatyphimuriumLT2 11 [4]andthisstudy FBA, predictions were generated for utilization of sole SalmonellaArizonae 12 [1] nitrogen, phosphorous, iron, and sulfur compounds. As SalmonellaCholeraesuis 13 [1] the number of genomes used for the generation of the SalmonellaGallinarum 14 [1] three ancestral core models increased, the number of metabolites decreased that were predicted as useable ni- SalmonellaParatyphi 15 [1] trogen, phosphorous, iron, or sulfur source under aer- SalmonellaTyphi 16 [1] obic (Figure 6A)oranaerobicconditions(Figure 6B). Citrobacterkoseri 17 [1] Enterobactercloacae 18 [1] Analysisofgeneessentiality Leclerciaadecarboxylata 19 [1] To further explore the metabolic similarities between all Klebsiellapneumoniae 20 [1] enterobacteria, we determined reaction essentiality pre- dictions of the enterobacterial core metabolic network Yokenellaregensburgei 21 [1] (iEntero_core) for conditions simulating aerobic and an- Cronobactersakazakii 22 [1] aerobic growth in glucose containing minimal media Cedeceadavisae 23 [1] (Table 3). Of the 325 genes included in the enterobacte- ErwiniaamylovoraATCC49946 24 Thisstudy rial ancestral core model, we compared the 169 genes PantoeastewartiiDC283 25 Thisstudy predicted in silico as essential to orthologous genes in Brenneriasalicis 26 [1] other enterobacteria strains for which GEMs have been constructed [4,11,12,14], to experimentally determined Serratiamarcescens 27 [1] essential genes [14,33,34], and to “superessential” gene Ewingellaamericana 28 [1] predictions (required in all metabolic networks analyzed Dickeyadadantii3937 29 Thisstudy [35]) (Figure 7). 39% of genes predicted as essential (66/ PectobacteriumatrosepticumSCRI1043 30 Thisstudy 169) using the enterobacterial ancestral core metabolic Yersiniaenterocolitica 31 [1] networkwerealsopredictedasessentialinsilicoinoneor Yersiniapestis 32 [1] more GEMs generated from genomes of extant entero- bacteria.Ofthe325genescontainedintheenterobacterial Yersiniapseudotuberculosis 33 [1] ancestral core metabolic network, 156 genes were pre- Edwardsiellatarda 34 [1] dicted as non-essential and 98% (154/156) of these Hafniaalvei 35 [1] predictions matched non-essential gene predictions from Photorhabdusluminescens 36 [1] orthologousgenescontainedinGEMsgeneratedfromge- Xenorhabdusnematophila 37 [1] nomesofextantenterobacteria. Proteusvulgaris 38 [1] When the predicted 169 genes predicted as essential using the enterobacterial ancestral core metabolic model compared to FBA predictions of growth for the different were compared to experimental data for modern day en- ancestral core GEMs using different carbon sources. For terobacterial strains, 39% of essential gene predictions those compounds included in the Biolog plates that (66/169)wereinagreementwiththeexperimentallydeter- havetransportersinthemodel,FBAwasusedtopredict mined essential genes for E. coli and Salmonella strains, if they could be used for growth as sole carbon source. and out of the 156 non-essential gene predictions Baumleretal.BMCSystemsBiology2013,7:46 Page7of17 http://www.biomedcentral.com/1752-0509/7/46 E. colicore Salcolicore Entero core O2 No O2 O2 No O2 O2 No O2 E S A E S A E S A E S A E S A E S A 1,2-Propanediol 2-Aminoethanol 2-Deoxy Adenosine 4-Amino ButyricAcid 5-Keto-D-GluconicAcid Acetic Acid Acetoacetic Acid Adenosine α-D-Glucose α-D-Lactose α-Keto-Glutaric Acid Butyric Acid Capric Acid Caproic Acid Citric Acid D,L-α-Glycerol-Phosphate D-Alanine D-Fructose D-Fructose-6-Phosphate D-Galactonic acid-γ-lactone D-Galactose D-Galacturonic Acid D-Gluconic Acid D-Glucose-1-Phosphate D-Glucose-6-Phosphate D-Glucuronic Acid Dihydroxy Acetone D-Malic Acid D-Mannitol D-Mannose E = Experimental D-Melibiose D-Ribose S = In Silico D-Saccharic Acid D-Serine A = Agreement D-Sorbitol D-Trehalose = No growth Dulcitol D-Xylose Formic Acid = Yes growth Fumaric Acid Glycerol = Agreement Glycine Glycolic Acid Inosine = Not in Agreement L-Alanine L-Arabinose L-Arginine L-Asparagine L-Aspartic Acid L-Fucose L-Galactonic Acid-γ-Lactone L-Glutamic Acid L-Glutamine L-Histidine L-Homoserine L-Isoleucine L-Lactic Acid L-Leucine L-Lysine L-Lyxose L-Malic Acid L-Methionine L-Ornithine L-Phenylalanine L-Proline L-Rhamnose L-Serine L-Tartaric Acid L-Threonine L-Valine Maltose Maltotriose M-Inositol Mucic Acid N-Acetyl-D-Glucosamine N-Acetyl-Neuraminic Acid N-Acetyl-β-D-Mannosamine Phenylethylamine Propionic Acid Putrescine Pyruvic Acid β-D-Allose Succinic Acid Sucrose Thymidine Tyramine Uridine Figure4ExperimentalandinsilicocarbonsourceresultsforE.coli/Shigella,E.coli/Shigella/Salmonella,andtheenterobacterial ancestralcore. generated using the enterobacterial ancestral core meta- in agreement for essential gene predictions and 96.7% bolic network 74.3% were in agreement with experimen- were in agreement for non-essential gene predictions tally determined non-essential genes (116/156) (Figure 7). (151/156).For eachof these comparativegeneessentiality When gene essentiality predictions generated using the data sets, overall predictions using the enterobacterialan- enterobacterial ancestral core metabolic network were cestralcoremetabolicmodelwereinagreementwithgene compared to “superessential” genes, 43.7% (74/169) were essentiality predictions from in silico enterobacterial Baumleretal.BMCSystemsBiology2013,7:46 Page8of17 http://www.biomedcentral.com/1752-0509/7/46 100 90 ) 80 % 70 ( tn 60 e m 50 e 40 e rg 30 A 20 10 0 E. coli KE1. 2cMoliG E1K. 61c52o5l Mi( iG JK11R62950M5E4 E.).G( icc1Ao6olFli5i 1 52 K(W61i (02Ei)cECW. o A31c113o2l13i7 09 3(_)OI1ME5cG7Eo1.H 167c35 (o35il)i8 E_cOEoW.1 135c317o41lHi40 7) _U(iEPEDEcLECo.9 (13ic33Eo)4lci5 o_U1S2Pa8Ek8aCi_K l)(iCSeaEFblcTsimoOe1ol7l3n3a0) el1lp_San aUelLTuTIm28mo 9(onI)neiRllaRae 1 L0YMT8er23G )s(iHI n7iM8aA 5P9p7s4eP8e5 ss)(tu iiedsYu oLdC1omS2Oohm29eno82 a)wn(siaa EPsnp ceCualotl8lieia1/rd E5uSaco) gh(oniiilienJg/ ieoPSldls8aelaa1 n(5mai)sEionnMsnct OeeelMr1lsto0aRr -5aba1l6 a)n(iccctoSeersrOeti 7r(ai 8ala3 E)ncc coolreies_(ticrEoalrc eoc)oSrale _(icEornte)ero_core) = Aerobic = Anaerobic Figure5ComparisonofinsilicocarbonsourceutilizationaccuracyforE.coli/Shigella,E.coli/Shigella/Salmonella,andenterobacterial ancestralcoremetabolicmodelsincomparisontoallotherexistingGEMsvalidatedwithcarbonsourceutilizationdata. GEMs for 68% (221/325) of gene essentiality predictions, addition on the number of genes and reactions retained. 76%(241/325)whencomparedtoexperimentaldatafrom ThenumberoforphanreactionsisalsoshowninFigure2 enterobacterialmembers,and70.4%(229/325)whencom- andincreaseswithdivergencetimeoftheancestralnode. pared to “superessential” gene predictions. In summary, For the model representing the ancestor of all entero- essential gene predictions from the enterobacterial ances- bacteria, over 600 of the approximately 1,200 reactions tral core were 80.3% in agreement with essentiality data in the model are orphans (Figure 2). There are several determined previously in one or more published studies possible explanations. These orphans may arise from fromindividualenterobacterialmembers(Figure7). false-negative ortholog predictions that support remov- ing a reaction that appears to be missing from a taxon, Discussion but for which the genes are truly present in the anno- This study describes the generation of three computa- tated genome. Similarly, orphans could arise from false- tional models designed as a first approximation of the negative gene prediction errors. Both these mundane metabolic capacity of ancestral enterobacteria. We se- error sources could be corrected manually, but this lected three nodes in a well resolved phylogenetic tree of would require a good deal of effort. A more interesting enterobacteriawithcompletegenomesequences(Figure1). explanation is that these reactions might be carried out The first node represents the common ancestor of E. coli by non-orthologous isofunctional equivalents in the or- and Shigella (~10 mya), the second represents the ances- ganisms that suggested removing them from the model. tor of Ecoli/Shigella and Salmonella (~100 mya), and the This was the case for 244 transport outer membrane third represents the ancestor of all enterobacteria (~300- porin reactions that were retained in the enterobacterial 500 mya). Each model includes all identifiable metabolic ancestral core model that did not have a single ortholog reactions from a previous GEM for E. coli K-12 that are conserved across all 72 genomes, yet each of these ge- linkedtogenesconservedamong(23,39and72genomes, nomes contained one or more functionally equivalent respectively) descendants of the phylogenetic node, plus genes encoding isozymes for these reactions. Further orphanreactionsfromtheE.coliK-12modelthatmustbe examination of these cases could advance understanding retained because removal would prevent production of of the rates and patterns of non-orthologous displace- biomass. Thus, these three models are progressively ment in the evolution of metabolic processes. It is also smaller subsets of the E. coli K-12 model. The models likely that we will see a reduction of orphan reactions in were created in a step-wise fashion by deleting reactions the next generation of ancestral models that makes use missing from one taxon at a time selected approximately of parsimony or maximum likelihood based ancestral in order of increasing phylogenetic distance from E. coli state prediction to determine which genes are present at K-12. Figure 2 shows the impact of each successive each internal node, because this approach will be more Baumleretal.BMCSystemsBiology2013,7:46 Page9of17 http://www.biomedcentral.com/1752-0509/7/46 A 200 180 160 140 Aerobic 120 100 80 60 40 20 0 Carbon Nitrogen Phosphorous Iron Sulfur B 160 140 120 100 Anaerobic 80 60 40 20 0 Carbon Nitrogen Phosphorous Iron Sulfur = iEco1339_MG1655 = iEcoli_core = iSalcoli_core = iEntero_core Figure6Carbon,Nitrogen,Phosphorous,Iron,andSulfurutilizationinsilicopredictionsforE.coli/Shigella,E.coli/Shigella/Salmonella, andtheenterobacterialancestralcoremetabolicmodelsin(A)aerobicand(B)anaerobicconditions. tolerant of annotation or orthology errors affecting a category (Figure 3). Some subsystems are particularly subset of taxa. It will also expand the models to include highlyconserved. Mureinbiosynthesis andtransportouter reactions that were legitimately lost in individual taxa membrane porin reactions are almost entirely conserved and lineages allowing us to further probe the evolution- across all phylogenetic depths assayed. Other categories, ary dynamics of metabolic networks and perhaps dis- like alternate carbon metabolism, transport inner mem- cover system-scale emergent phenotypes linked with brane and glycerophospholipid metabolism are highly losses ofparticular reactions. variableacrossthemodels.Forexample,119reactionsin- Here, we focus on these models representing core ab- volved with alternate carbon metabolism were deleted solutely conserved reactions to examine a conservative from the E. coli K-12 model in order to create the E. coli estimate of the metabolic capacity of each ancestral ancestralcoremodel,9additionaldeletionswererequired lineage. We compared the reactions retained in each an- for the E. coli/Salmonella ancestral core model, and 27 cestralmodeltothetotalsetofreactionsinthematureE. more were required for the enterobacteria ancestral core coliK-12model for each reaction subsystem classification model.Thispatternsuggeststhatmuchofthevariationin Baumleretal.BMCSystemsBiology2013,7:46 Page10of17 http://www.biomedcentral.com/1752-0509/7/46 Table3Subsystemclassificationforessentialreactions models in this study. The iAF1260 biomass equation is predictedforallmetabolicmodelsunderanaerobic the most extensive biomass equation of all enterobacte- conditions rial GEMs containing 92 metabolites including many Source iEcoli_core iSalcoli_core iEntero_core trace metals and elements such as iron and sulfur that Essentialreactions 284 286 326 are known to be essential for supporting microbial life. Since unique biomass equations exist for three other Essentialreactions subsystemclassification strains of enterobacteria with GEMs [11,12,14], we com- AlternateCarbon 2 2 2 pared the metabolites present in all strain-specific entero- Metabolism bacterial GEMs and determined that there are 34 present AminoAcidMetabolism 82 82 84 inall fourbiomass equations, 8 in three of the four, 18 in two of the four, and that there are metabolites specific to CellEnvelopeBiosynthesis 45 45 45 each strain-specific biomass equation for E. coli (23), Sal- CitricAcidCycle 4 4 5 monella (8), Klebsiella (12), and Yersinia (6). As more CofactorandProsthetic 66 66 69 enterobacterial members have GEMs constructed with GroupBiosynthesis strain-specific biomass equations, a future direction may FolateMetabolism 2 2 4 be to generate an average biomass composition from all Glycerophospholipid 10 10 10 modern-day GEMs of enterobacteria to use in the en- Metabolism terobacterial ancestral core metabolic model, or to use Glycolysis 1 1 10 parsimony-based ancestral state reconstruction to esti- InorganicIonTransport 8 9 12 mate node-specific ancestral biomass composition. Fu- andMetabolism ture studies are warranted to investigate whether and Lipopolysaccharide 11 11 11 howthesealternativeapproachesimpactinsightsgained Biosynthesis/Recycling throughsimulations. MembraneLipid 2 2 4 We used the models to predict whether each ancestral Metabolism “strain” would grow on a large number of carbon, nitro- MureinBiosynthesis 2 2 2 gen,phosphorous,iron,andsulfursourcesunderaerobic PentosePhosphate 2 2 3 and anaerobic conditions. We compared these predic- pathway tions to previously published experimental growth data PurineandPyrimidine 26 26 34 for 38 extant enterobacteria representing the 23 genera Metabolism used in the phylogenetic analysis (Figure 1) (several of Transport,InnerMembrane 4 5 8 which were not represented among the genomes used to Transport,OuterMembrane 15 15 19 construct our models) to investigate the accuracy of the Unassigned 2 2 4 models (Figure 4). For each node corresponding to one of our models, we first compiled published experimental thisbiologicalprocessliesatthespecieslevel.Incontrast, data recording whether each nutrient could serve as a for the glycerophospholipid metabolism subsystem, the E. sole source for growth in all reported wild-type descen- coli/Shigella core and E. coli/Salmonella core models are dantsofthe node. Ifeven a singlereport indicated thata identical, but 116 reaction deletions wererequired to cre- strain was unable to utilize the nutrient as a sole source, ate the enterobacteria core model. This tells us that this we recorded it as a negative. For the E. coli/Shigella, metabolic process was largely intact by the time E. coli Ecoli/Shigella/Salmonella, and the enterobacteria core and Salmonella diverged, but these metabolic capabilities model reconstructions, the total number of experimen- were likely acquired since the ancestor of all entero- tally reported carbon sources utilized by all relevant bacteria.Futuremodelingofadditionalnodeswithincorp- strains in anaerobic conditions were 81, 65, and 33, oration of ancestral state reconstruction will further and in anaerobic conditions were 77, 46, and 16, re- illuminatethetimingofthesemetabolicinnovations.This spectively (Figure 4). Previously it was appreciated that in turn, can be used to simulate metabolic evolution in all free-living enterobacteria could utilize glucose as a theforwarddirection,addingcapabilitiesinstep-wiseevo- sole carbon source [1]. Our compilation of experimen- lutionary order. With each addition, it will be possible to tal data identified 15 additional carbon sources that competeeach“evolved”strainagainstitsancestorinsilico are utilized by all free-living enterobacteria (alpha- alteringtheenvironmentalconditionsinthesimulationto D-Glucose, D, L-Malic Acid, D-Alanine,D-Fructose, examinetherelativepredictedbiomass(i.e.fitness). D-Fructose-6-Phosphate, D-Galactose, D-Glucose-1- Since the biomass of ancestral cells is impossible to Phosphate,D-Glucose-6-Phosphate,D-Ribose,Inosine, determine empirically, we used the E. coli Biomass equa- L-Aspartic Acid, L-Glutamic Acid, L-Serine, N-Acetyl- tion from iAF1260 [6] for all ancestral core metabolic D-Glucosamine, Pyruvic Acid, and Uridine). Catabolism of

Description:
RESEARCH ARTICLE Open Access Inferring ancient metabolism using ancestral core metabolic models of enterobacteria David J Baumler1*, Bing Ma1,4, Jennifer L Reed2 and
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.