ebook img

Genomic analyses of the ancestral Manila family of Mycobacterium tuberculosis PDF

15 Pages·2017·1.22 MB·English
by  
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Genomic analyses of the ancestral Manila family of Mycobacterium tuberculosis

RESEARCHARTICLE Genomic analyses of the ancestral Manila family of Mycobacterium tuberculosis XuehuaWan1,KentKoster2,LishiQian2,EdwardDesmond3,RichardBrostrom4, ShaobinHou1,JamesT.Douglas2* 1 AdvancedStudiesinGenomics,ProteomicsandBioinformatics,UniversityofHawaii,Honolulu,Hawaii, UnitedStatesofAmerica,2 DepartmentofMicrobiology,UniversityofHawaii,Honolulu,Hawaii,United StatesofAmerica,3 MicrobialDiseasesLaboratory,CaliforniaDepartmentofPublicHealth,Richmond, California,UnitedStatesofAmerica,4 TBControlProgram,HawaiiStateDepartmentofHealth,Honolulu, Hawaii,UnitedStatesofAmerica a1111111111 a1111111111 *[email protected] a1111111111 a1111111111 a1111111111 Abstract Withitsairbornetransmissionandprolongedlatencyperiod,Mycobacteriumtuberculosis spreadsworldwideasoneofthemostsuccessfulbacterialpathogensandcontinuestokill millionsofpeopleeveryyear.M.tuberculosislineage1isinferredtooriginateancestrally OPENACCESS basedonthepresenceofthe52-bpTbD1sequenceandanalysisofsinglenucleotidepoly- Citation:WanX,KosterK,QianL,DesmondE, morphisms.Previously,webrieflyreportedthecompletegenomesequencingofM.tubercu- BrostromR,HouS,etal.(2017)Genomicanalyses oftheancestralManilafamilyofMycobacterium losisstrains96121and96075,whichbelongtotheancientManilafamilyandmodern tuberculosis.PLoSONE12(4):e0175330.https:// Beijingfamilyrespectively.Herewepresentthecomprehensivegenomicanalysesofthe doi.org/10.1371/journal.pone.0175330 Manilafamilyinlineage1comparedtocompletegenomesinlineages2–4.Principalcompo- Editor:ShihuiYang,NationalRenewableEnergy nentanalysisofthepresenceandabsenceofCRISPRspacerssuggeststhatManilaisolate Laboratory,UNITEDSTATES 96121isdistinctlydistantfromlineages2–4.WefurtheridentifyatruncatedwhiB5geneand Received:December15,2016 aputativeoperonconsistingofgenesencodingaputativeserine/threoninekinasePknH Accepted:March25,2017 andaputativeABCtransporter,whichareonlyfoundinthegenomesofManilafamilyiso- lates.Sixsinglenucleotidepolymorphismsareuniquelyconservedin38Manilastrains. Published:April10,2017 Moreover,whencomparedtoM.tuberculosisH37Rv,59proteinsareunderpositiveselec- Copyright:©2017Wanetal.Thisisanopen tioninManilafamilyisolate96121butnotinBeijingfamilyisolate96075.Theuniquefea- accessarticledistributedunderthetermsofthe CreativeCommonsAttributionLicense,which turesfurtherserveasbiomarkersforManilastrainsandmayshedlightonthelimited permitsunrestricteduse,distribution,and transmissionofthisancestrallineageoutsideofitsFilipinohostpopulation. reproductioninanymedium,providedtheoriginal authorandsourcearecredited. DataAvailabilityStatement:Allrelevantdataand accessionnumbersarewithinthepaperandits SupportingInformationfiles. Funding:Thisresearchwassupportedbythe Introduction UniversityofHawai’iOfficeofResearchServices andtheLeahiTrustoftheHawai’iCommunity Theglobalepidemictuberculosis(TB)accountsformillionsofdeathsworldwideeveryyear, Foundation,Honolulu,Hawaii.Partialfundingfor andithasbeenrecognizedasaWorldHealthOrganization(WHO)emergencysince1993[1]. theworkdescribedherewasprovidedbytheDean One-thirdoftheworldpopulationislatentlyinfectedbytuberculosis-causingbacteria[2]. oftheCollegeofNaturalSciences,Dr.William Onemajorcauseofdeathamonghumanimmunodeficiencyvirus(HIV)carryingpopulations Ditto,toJTD.Thefundershadnoroleinstudy isTB,andmorethan10%ofTBcasesareassociatedwithHIV[1].Inaddition,multidrug- design,datacollectionandanalysis,decisionto publish,orpreparationofthemanuscript. resistantTB(MDR-TB)andextensivelydrug-resistantTB(XDR-TB)occurglobally.Thus, PLOSONE|https://doi.org/10.1371/journal.pone.0175330 April10,2017 1/15 GenomicanalysesofMycobacteriumtuberculosis Competinginterests:Theauthorshavedeclared identificationofnovelbiomarkersofglobalTB-causingbacteriaisneededforimprovingclini- thatnocompetinginterestsexist. caldetectionanddevelopingnewtreatments[3]. MycobacteriumtuberculosisisonepathogenicbacterialspeciesintheMycobacteriumtuber- culosiscomplex(MTBC).Sevenhuman-adaptedMTBClineagesarecharacterizedbasedon thephylogeneticanalysis;lineages1–4and7areM.tuberculosisstrains,andlineages5and6 areMycobacteriumafricanum[4,5].Lineages1,5and6areclassifiedasancientlineagesdue tothepresenceofa52-bpregionnamedTbD1,whichisalsoidentifiedinthecattleTB-causing bacteriumMycobacteriumbovis,whiletheremaininglineagesareclassifiedasmodernlineages duetotheabsenceofTbD1[6].SNPs(singlenucleotidepolymorphisms)basedphylogenetic analysisfurthersupportsthisscenarioandlineage1isestimatedtodiverge~67,000yearsago [4,5].Inaddition,modernlineagesaremoresuccessfulthanthoseancientoneswhencom- paredinvirulenceandgeographicalspread[7]. TheManilafamilyofM.tuberculosiswasoriginallydefinedbyinvestigatingIS6110poly- morphism,spoligotyping,andthreegeneloci(oxyR,gyrA,andkatG)in48M.tuberculosis strains,isolatedfrompatientslivinginManila,RepublicofthePhilippines[8].Basedongeo- graphicaldistributionandourunpublisheddata,theManilafamilybelongstolineage1which includesM.tuberculosisstrainscirculatinginthePhilippinesandaroundtherimoftheIndian Ocean.Inourpreviouswork,weperformedwholegenomesequencing,denovoassembly,and gapclosingoftwodrugsensitiveM.tuberculosisstrainsfromtheManilaandBeijingfamilies respectively[9].However,theManilafamilyhasnotbeenfullycharacterizedatthecomplete genomelevel.HerewepresentcomparativeanalysesofthecompletegenomesofM.tuberculo- sisManilafamilyisolate96121,strainsinlineages2–4,andtwooutgroupstrainsincluding M.bovisandMycobacteriumcanettii.Wealsoinvestigatewhethertheuniquefeaturesin Manilafamilyisolate96121areprevalentin38draftgenomesoftheManilafamily.Ourfind- ingsprovideevidencetorevealtheuniqueevolutionofthisancestrallyderivedfamilyandmay shedlightonthelimitedtransmissionofthisfamilyofM.tuberculosis. Materialsandmethods Genomesequencing,assembly,anddatacollection WholegenomeshotgunsequencingofM.tuberculosisManilafamilyisolate11L4601wascar- riedoutontheIonTorrentPGMplatform(ThermoFisherScientific,USA).Thecomplete genomesequenceofManilafamilyisolate96121wasusedasthereferencegenometoassemble PGMreadsusing454gsMappersoftware(Roche).ThisWholeGenomeShotgunprojecthas beendepositedatDDBJ/ENA/GenBankundertheaccessionLSFJ00000000(TableAinS1 File).TheversiondescribedinthispaperisversionLSFJ01000000.MUMmer3packagewas usedforwholegenomealignments[10].TheIlluminareadssequencedfor37Manilastrains weredownloadedfromNCBIdatabase(TableAinS1File).Readsweretrimmedusingtheea- utilsprogram[11]andthenassembledusingthereferenceassemblymethodasabove.Addi- tionalgenomesequencesweredownloadedfromtheNCBIFTPsiteasdescribedinTable1 andTableAinS1File. ClusteringofCRISPR(ClusteredRegularlyInterspacedShort PalindromicRepeats)spacers CRISPRregionsandspacerswereidentifiedusingCRISPRFinder[12,13].Atotalof716 spacersequencesfrom21completegenomesofM.tuberculosiswereloadedforall-vs-all BLASTN-shorttoidentifyhomologousspacersequences.90%and95%weresetasstringent cut-offvaluesofminimumalignedlengthandsequenceidentity.TheMCLprogram[14]was PLOSONE|https://doi.org/10.1371/journal.pone.0175330 April10,2017 2/15 GenomicanalysesofMycobacteriumtuberculosis Table1. Straininformationlist. Strainname Accessionnumber Re-annotatedproteincodinggenenumber Isolatedplace TbD1 UT205 NC_016934 4109 Colombia,SouthAmerica - Erdman_ATCC35801 NC_020559 4085 TrudeauMycobacterialCultureCollection - Beijing_96075 NZ_CP009426 4054 Beijing,China - Manila_96121 NZ_CP009427 4088 Manila,Philippines + CCDC5180 NC_017522 4091 Beijing,China - RGTB327 NC_017026 4689 Kerala,SouthIndia - CTRI-2 NC_017524 4057 Russia - NITR204 NC_021193 5237 TamilNadu,SouthIndia - CDC1551 NC_002755 4067 Kentucky/Tennessee,USA - KZN605 NC_018078 4064 KwaZulu-Natal,SouthAfrica - CCDC5079 NC_017523 4176 Beijing,China - KZN4207 NC_016768 4051 KwaZulu-Natal,SouthAfrica - KZN1435 NC_012943 4064 KwaZulu-Natal,SouthAfrica - NITR202 NC_021192 - TamilNadu,SouthIndia - RGTB423 NC_017528 4793 Kerala,SouthIndia - NITR206 NC_021194 4137 TamilNadu,SouthIndia - H37Rv NC_000962 4069 virulentlabstrain - H37Ra NC_009525 4079 attenuatedlabstrain - NITR203 NC_021054 4158 Beijing,China - F11 NC_009565 4068 SouthAfrica - 7199–99 NC_020089 4061 Europe - https://doi.org/10.1371/journal.pone.0175330.t001 usedtoclusterhomologousspacersequencesandidentify50groups.Theaccumulationcurve ofspacersandprincipalcomponentanalysiswereanalyzedandvisualizedwithRpackage. Clusteringoforthologgroups OrthologgroupsofproteinswereidentifiedusingorthoMCL[15]withseveraladditionalsteps tofilterall-vs-allBLASTPresults[16].Ifthealignedlengthwaslessthan80%ofthelonger lengthoftwoproteinsequences,theBLASTPresultwasfilteredout.Thenifthealignedlength wasequaltoorabove150aminoacids,theBLASTPresultwithpairwiseidentitybelow30% wasfilteredout;ifthealignedlengthwaslessthan150aminoacids,theBLASTPresultwasfil- teredbytheformula(100×(0.06+4.8×L^(-0.32×(1+e^(-L/1000))))),inwhichLstandsfor thealignedlength[16].ThepipelineProkka1.11[17]wasusedtore-annotategenomesand thenthepan-genomewasfurtheranalyzedusingtheRoarypipeline[18].Geneaccumulation curveswereplottedusingRpackageveganorggplot2[19,20]. AnalysisofSNPsandsynonymous/nonsynonymoussubstitutionrates Core-SNPswereidentifiedusingthekSNPv2.13package[21].IndelsandSNPswerealso identifiedusingMUMer3package[10].Synonymoussubstitutionrate(Ks)andtheratioof nonsynonymoussubstitutionratetosynonymoussubstitutionrate(Ka/Ksratio)werecalcu- latedbyParaATandKaKs_Calculatorpackages[22,23].DensityplotsofKa/KsandKsvalues werecreatedusingRcommandsandheatscatterplotsofKaandKsvaluesweregeneratedby LSDpackage[24]. PLOSONE|https://doi.org/10.1371/journal.pone.0175330 April10,2017 3/15 GenomicanalysesofMycobacteriumtuberculosis Functionalannotation RNAfamilieswereannotatedusingRfamdatabase[25].ThesecondarystructuresofRNA sequenceswerecalculatedandpredictedusingthestand-aloneViennaRNApackage[26].The RNAstructureswerevisualizedusingVARNA3.92[27]. Results Genomesequencing,datacollection,andassembly WecarriedoutwholegenomeshotgunsequencingofM.tuberculosisManilafamilyisolate 11L4601,whichwascollectedfromafemalepatientlivingintheCommonwealthoftheNorth- ernMarianaIslandsin2010.Wegenerated116.5Mbandtheassemblytotaled4,356,842bases, covering98.8%ofthegenomeofManilafamilyisolate96121(TableBinS1File).Theassem- bledcontigsalignedwellwiththegenomesequenceofManilafamily96121,onlyyieldingran- domsimplerepeatslocatedawayfromtheplotteddiagonalasseeninFigAinS1File.To obtainpopulationinformationofgenomesofManilafamily,wedownloadedIlluminareads sequencedfor37Manilastrains,whichweremappedtothegenomesequenceofH37Rvstrain forSNP-basedsubgroupingofManilastrains[28].Wetrimmedthelow-qualityendsofIllu- minareadsusingPhredqualityscoreQ30andperformedreferenceassemblytoobtainthe37 draftgenomesofManilastrainsprovidedinTableBinS1File. ForcomparativegenomicanalysestounveiluniquesignaturesofancestralManilafamily, 19completegenomesofM.tuberculosisstrainsweredownloadedfromtheNCBIdatabase (Table1).ByBLASTNanalysis,Manilafamily96121wastheonlycompletegenomecontain- ingthe52-bpTbD1regionwhichdesignateditasanancestralstrain,whiletheother20com- pletegenomesonlycontainedapartialregion(~27bp)ofTbD1whichdesignatedthemas modernstrains[6].Allthe38draftgenomesofManilafamilystrainscontained52-bpTbD1 region. GainandlosspatternofCRISPRspacersrevealedthattheManilafamily wasdistinctlydifferentfrommodernstrains CRISPRisarepetitiveregionidentifiedinbacterialandarchaealgenomes,anditisproposed asimmunesystemresistanttohorizontalgenetransfer(HGT)inprokaryotes[29,30].We usedCRISPRFindertoidentifyCRISPRregionsin21completegenomesofM.tuberculosis and2genomesofoutgroupstrains.ThenumberofCRISPRarraysinM.tuberculosisgenomes rangedfrom1to3.TwoCRISPRregionswereidentifiedinManilafamilyisolate96121,and oneCRISPRregionwasfoundinBeijingfamilyisolate96075.Thepresenceandabsenceof homologousspacersequencesofeachstrainwerecountedfrom50clusteredgroupsand recordedinabinarymatrix.Thespaceraccumulationcurveof21strainsindicatedthatthe pan-genomespacernumberfor21strainswas104andtherewasapotentiallylargerpan- genomespacersize(Fig1A).UsingtheEigenvectormethod,principalcomponentanalysis (PCA)indicatedthatManilafamilyisolate96121wasobviouslydifferentfromtheother20 strains(Fig1B).AddingCRISPRspacersofManilafamily11L4601,T17,T92,andoutgroup speciesforPCA,wefoundthatManilafamily11L4601locatedcloselytoManilafamily96121. Furthermore,T17,T92,andM.bovislocatedatpointsbetweenManilafamily96121andline- ages2–4(FigBinS1File). WefurthercomparedspacerregionrearrangementsamongManilafamilyisolate96121, Beijingfamilyisolate96075,andH37Rvtorevealthespacerbirth-deathpatternofM.tubercu- losisstrainsisolatedfromvariousniches(Fig2).CRISPRarray1inManilafamily96121lacked theacquisitionoffivespacersinH37Rv,ofwhichtheacquisitionofspacer7occurredin PLOSONE|https://doi.org/10.1371/journal.pone.0175330 April10,2017 4/15 GenomicanalysesofMycobacteriumtuberculosis Fig1.EvolutionofCRISPRspacersincompletegenomesof21M.tuberculosisstrains.(A) Accumulationcurveofspacersincompletegenomesof21M.tuberculosisstrains.(B)PCAofpresenceand absenceofCRISPRspacersincompletegenomesof21M.tuberculosisstrains.Thebluearrowindicatesthe positionofManilafamily96121,illustratingadramaticdifferenceinthecompositionofspacersinManila family96121. https://doi.org/10.1371/journal.pone.0175330.g001 Beijingfamilyisolate96075.However,itmaintainedeighthomologousspacerswithhighsimi- laritytothoseofotherspeciesinMTBCincludingM.bovis,Mycobacteriummicroti,andM. africanum.FourspacersinCRISPRarray1ofManilafamilyisolate96121werestrain-specific. ComparisonofCRISPRarray2betweenManilafamilyisolate96121andH37Rvalsosug- gestedthateightspacerssharedwithMTBCweremaintainedinManilafamily96121but absentfromH37Rv. Core-genomeandpan-genomeanalysisofcompletegenomesrevealed geneticdiversityintheManilafamily WeonlyconsideredcompletegenomesofM.tuberculosisforcoreandpan-genomeanalysis inthisstudy.Atotalof82,395proteinsequencesfrom21completegenomesofM.tubercu- losiswereloadedforall-vs-allBLASTPanalysistoidentifyhomologousproteinsequences. Fig2.CRISPRspacerrearrangementsinManilafamily96121,Beijingfamily96075,andH37Rv.Green highlightsspacersconservedwithotherMTBCspeciesbutlostinmodernBeijingfamilyisolate96075and H37Rv.RedhighlightsspacersspecificinManilafamilyisolate96121.Bluehighlightsspacerspresentin H37RvbutabsentinManilafamilyisolate96121andBeijingfamilyisolate96075. https://doi.org/10.1371/journal.pone.0175330.g002 PLOSONE|https://doi.org/10.1371/journal.pone.0175330 April10,2017 5/15 GenomicanalysesofMycobacteriumtuberculosis OrthoMCL[15]wasfurtherusedtocluster4,899groupsamongstrains.Only736ortholog groupswereidentifiedtocontaincoregenesacrossthe21completegenomesofstrains, ofwhich711orthologousgroupscontainedsingle-copyorthologousgenesfromeach strain. ConsideringextremelyhighidentitiesofgenomenucleotidesofM.tuberculosisstrains,we furtherquestionedwhethervariousgenepredictionalgorithmscausedartificialgeneticdiver- sityorwhetherindels/mutationscausedframeshift,prematurestopcodon,orstopcodonread- through.Weusedthestand-aloneProkkaprogram[17]tore-annotate20completegenome sequences(Table1),excludingNITR202strainduetounexpectednucleotidedesignations. BasedonRoaryresults[18],thecoregenomeincluded2,623orthologousgenesandthepan- genomeincluded7,591genes(Fig3Aand3B),indicatingthatdifferentannotationprocesses affectedcoreandpan-genomeanalysisandthatconsiderablegeneticdiversitywithinM.tuber- culosisexisted.264genesinManilafamilyisolate96121werenotclusteredwithgenesin H37RvbyRoary.Furthermore,58genesinManilafamilyisolate96121werenotclustered withgenesin19re-annotatedgenomes(Table1),inwhich46geneswereannotatedashypo- theticalproteinsand12geneswereassignedwithfunctions.WefurtherretrievedcodingDNA sequences(CDS)ofthe58genesandranBLASTNagainstH37Rvgenomesequencetodouble checkwhethertheyweresingletons.Wemanuallyconfirmedthat13singletonswereformed byframeshiftorstopcodonformation/readthroughand2singletonswereabsentinthe19 completegenomes. ComparingthegenomesequenceofManilaisolate96121toH37Rv,weidentified983 indelsand2,155SNPs.83genesintheManilaisolate96121wereidentifiedtocontainindels. Wesuggestthatthoughgenomesequencesarerelativelyconservedwithhighidentity,encoded functionalproteomesarere-shapedbyindelsandSNPs,andthesemicroevolutionarychanges resultinthedynamicsanddiversityofmycobacterialgenomes. AtruncatedWhiB5proteinintheManilafamily WeidentifiedoneCtoTmutationinManilafamilyisolate96121thatcausedtheformationof thestopcodonTAGinthewhiB5gene.ThisCtoTmutationwasalsopresentin38draft genomesofManilafamily,leadingtoatruncatedN-terminalsensordomainthatcontained fourconservedcysteineresiduestocoordinatetheFe-SclusterbutlosttheC-terminalalpha helixesthatmayfunctiontobindDNAorinteractwithotherproteins.TheWhiB5proteinin Fig3.Core-genomeandpan-genomeofM.tuberculosisinlineages1–4,basedonanalysisof20 completedgenomes.(A)Accumulationcurveforconservedgenesin20re-annotatedgenomes.(B) Accumulationcurvefortotalgenenumbersin20re-annotatedgenomes. https://doi.org/10.1371/journal.pone.0175330.g003 PLOSONE|https://doi.org/10.1371/journal.pone.0175330 April10,2017 6/15 GenomicanalysesofMycobacteriumtuberculosis H37Rvisrequiredforexpressionof58genesandisrequiredforprogressiveandchronic mouseinfections[31].ThefullCDSofthewhiB5genewaspresentinM.canettii,M.bovis,and theother20M.tuberculosiscompletegenomesbyBLASTNsearches. CoupledancestralABCtransporterandSTPKPknHintheManilafamily WeidentifiedauniqueMTM_01342geneinthecompletegenomeofManilafamilyisolate 96121,whichwasabsentfrom19completegenomesofmodernstrains.MTM_01342genewas theparalogofMTM_01852gene,whichencodedtheforkhead-associated(FHA)domaincon- tainingATP-bindingcassette(ABC)transporterintheManilafamilyisolate96121(TableCin S1File).TheCDSofMTM_01342hadnohomologousDNAsequencein19complete genomesofM.tuberculosisbyBLASTNsearches.Incontrast,M.canettiicontainedthehomol- ogousCDSofMTM_01342genewith99%identity.Wefurtherconfirmedthatthehomolo- gousCDSofMTM_01342genewaspresentin37Manilastrainswith100%identityandthat itsabsenceinoneManilastrainwasduetothefragmentedassemblyofthedraftgenome.The abovedatasuggestthelossoftheorthologofputativeMTM_01342geneinmodernstrains. InH37Rv,Rv1747(theorthologofMTM_01852),theFHAdomain-containingABCtrans- porter,isphosphorylatedbySer/Thrproteinkinase(STPK)PknF,anditsencodinggene, Rv1747,locatesadjacenttopknFinthegenome[32–34].InthegenomeofManilafamilyiso- late96121,MTM_01342locatedat~561kbupstreamofMTM_01852,anditwasflankedby twopknHgenes(TableCinS1File).CouplingthelossoftheorthologofMTM_01342,the paralogouspknHgene(MTM_01343)wasalsoabsentinmostofthemodernstrainsexcept NITR203andRGTB423.TheintergenicregionbetweenMTM_01342andMTM_01343con- tainedonly10bases,suggestingthatthesetwogenesmaybeco-transcribedandforman operon.STPKPknHwasreportedtophosphorylatetheFHAdomainofEmbRandhelixα10 ofDosR,bothofwhicharetranscriptionalregulators[35,36].DeletionofthepknHgeneinM. tuberculosiscausedhypervirulenceinBALB/cmice[37].Takentogether,thegenomicloca- tionssuggestthattheputativeFHAdomain-containingABCtransportermaybephosphory- latedbytheputativeSTPKPknH,andtheymayfunctionasanadditionalindependent signalingtransducerandreceptortoregulatevirulenceintheManilafamily. SNPsidentifiedinfunctionalcodingregions ToinvestigateSNPsinM.tuberculosisstrains,weusedthekSNPv2.13package[21]toidentify SNPsin21completegenomesofM.tuberculosisand7draftgenomesofBeijingfamilystrains, usingthek-mercountinganalysismethod.Afteroptimization,11-merswerechosenandgen- eratedforSNPsanalysis.158coreSNPsand2386non-coreSNPswereidentifiedthrough28 genomes.Outof158coreSNPs,118SNPswereannotatedtobelocatedinproteincoding regionsofH37Rv,andanadditional17SNPswereannotatedtobelocatedinproteincoding regionsofstrainsotherthanH37Rv.SearchingagainstH37Rvannotationinintergenic regions,wefoundonlytwoSNPswithfunctions.ToexploreSNPsinRNAfamilies,wefurther annotated28genomeswiththeRfamdatabase[25].SNPsinacobalaminriboswitchandthe ASpkssmallRNAwereidentifiedinManilafamilyisolate96121.Wefurtherusedthestand- aloneRNAfoldprograminViennaRNApackage[26]topredicttheSNPs’effectsonRNA structurefolding.Asingle-baseGtoAmutationinthecobalaminriboswitchofManilafamily isolate96121maycausealocalconfirmationchange(FigCinS1File).Itwasalsofoundin IndianstrainsRGTB423andNITR206.Asingle-baseGtoCmutationintheASpkssmall RNAofManilafamilyisolate96121maycauseaglobalconfirmationchange,causingadouble stem-loopstructuretoformasinglestem-loop(FigCinS1File).Again,thismutationwasalso PLOSONE|https://doi.org/10.1371/journal.pone.0175330 April10,2017 7/15 GenomicanalysesofMycobacteriumtuberculosis Table2. ListofManila96121nonsynonymousSNPsinproteincodinggenes. SNP H37Rv H37Rv H37Rvamino Manila96121 Manila96121 Manila96121amino H37Rv H37Rvlocus ID Location base acid Location base acid name tag T189 3504418 G P 3510068 A L pflA Rv3138 T337 4013388 G M 4014984 A I kshB Rv3571 T444 734116 T M 732013 C T secE1 Rv0638 T530 1689349 G R 1687692 A H Rv1498c Rv1498c T678 2569593 T H 2584176 C R Rv2298 Rv2298 T857 3296721 G G 3305059 C R pks15 Rv2947c T907 15117 C I 15117 G M trpG Rv0013 T1565 895082 A F 890641 G L Rv0802c Rv0802c T1594 885689 C T 884148 A K Rv0791c Rv0791c T2042 3299413 G A 3307751 A V fadD22 Rv2948c T2046 3035533 G D 3044018 A N Rv2723 Rv2723 T2264 1113290 G E 1108916 C Q Rv0996 Rv0996 https://doi.org/10.1371/journal.pone.0175330.t002 inIndianstrainRGTB423andNITR206.ASpkswasidentifiedasanantisenseregulatorof pks12mRNA,whichisinvolvedinpolyketidebiosynthesisandpathogenesis[38]. ComparedtoH37Rv,29outof158coreSNPswereidentifiedinManilastrain96121,and 20SNPswerelocatedinproteincodingregionsofH37Rv.8SNPsweresynonymousmuta- tions,while12SNPswerenonsynonymous.Annotatedfunctionalgeneswiththesenonsynon- ymousSNPsincludedthepossibleproteinexporterSecE,methyltransferase,oxidoreductase, polyketidesynthasepks15,andothers(Table2). SixuniqueSNPswereidentifiedinthecompletegenomeofManilafamilyisolate96121 anddraftgenomesof38Manilafamilyisolates.FourSNPswereinproteincodinggenes, includingpflA,kshB,Rv2723(alx),andRv0925c,whiletheothertwoSNPswereinintergenic regions.TheseuniqueSNPsmayserveasadditionalbiomarkersforManilafamilystrains. Interestingly,anotherSNPinthetreZgenewasinManilafamilyisolate96121and6other (15.8%)Manilastrains,butnotintheremaining32Manilastrains. Purifyingandpositiveselection BothpurifyingselectionandpositiveselectionhavebeenreportedinM.tuberculosisgenome- widestudies[39–41].ToexaminenaturalselectiondirectionsinManilafamilyisolate96121 andBeijingfamilyisolate96075,wescanned711single-copyorthologstocalculateandplot densitiesofKa/Ksratios(theratioofnonsynonymoussubstitutionrate(Ka)tosynonymous substitutionrate(Ks)),comparingeachstraintoH37Rv(Fig4Aand4B,FigDinS1File).The densitiesofKa/Ksvaluesshowedtwopeaks,onecenteredatlessthan1andtheothercentered above1,indicatingthatonesetofgenesareunderpurifyingselection,whiletheothersetof genesareunderpositiveselection.DensityplotofthesynonymoussubstitutionrateKsindi- catedadifferentpatternfortheattenuatedstrainH37Ra. Atotalof108proteinswereunderpositiveselectioninManilafamilyisolate96121.16pro- teinswereannotatedashypotheticalproteinsbytheNCBIProkaryoticGenomeAnnotation Pipeline(PGAP).59proteinsunderpositiveselectionwereidentifiedinManilafamilyisolate 96121butnotinBeijingfamilyisolate96075(Table3).Theseproteinswereencodedbygenes involvedindiversefunctions.Thesenonsynonymoussubstitutionsmaycausechangesin proteinstructureandfunctionalactivity.ComparedtostrainH37Rv,11aminoacidswere mutatedfromnonpolaraminoacidstopolaraminoacidsinManilafamilyisolate96121, includingtheribosomalproteinS9andthePhoUfamilytranscription/regulationprotein. PLOSONE|https://doi.org/10.1371/journal.pone.0175330 April10,2017 8/15 GenomicanalysesofMycobacteriumtuberculosis Fig4.DistributionsofKa/KsandKsvaluesforsingle-copyorthologsamongM.tuberculosisstrains. (A)Theblack,red,yellow,blue,andgreenlinesrepresentKa/Ksdistributionsofsingle-copyorthologousgene pairsinH37Ra-H37Rv,Manilafamilyisolate96121-H37Rv,Beijingfamilyisolate96075-H37Rv, NITR206-H37RvandRGTB423-H37Rv,respectively.(B)Theblack,red,yellow,blue,andgreenlines representKsdistributionsofsingle-copyorthologousgenepairsinH37Ra-H37Rv,Manilafamilyisolate 96121-H37Rv,Beijingfamilyisolate96075-H37Rv,NITR206-H37RvandRGTB423-H37Rv,respectively. https://doi.org/10.1371/journal.pone.0175330.g004 Discussion Usingcomprehensivecomparative-genomicanalysismethods,weidentifiedgenomicfeatures oftheM.tuberculosisManilafamilyisolate96121.Ouranalysissuggestedthatthepan-genome derivedfrom20completeM.tuberculosisgenomeswasestimatedtocontainmorethan7000 genes,whichismuchlessthanthereportedpan-genomeofE.coli[42].Thisdifferencemaybe duetothelongerevolutiontimeofE.colistrains,whichhasamorerapidgrowthrateandhas takentensofmillionsofyears[43]toaccumulatemutations,indels,andrecombinations,and duetothemorediversifiedecologicalnichesofE.colistrainsthatleadtohigheroddsofacqui- sitionofxenologsviaHGT. TheCRISPRregionswerefullyconservedinManilafamilyisolates96121and11L4601, andmaintained16spacersequenceswhichwerelostinmodernlineagesbutsharedhighiden- titieswithotherancientMTBCspecies(M.bovis,M.microti,andM.africanum).Theinterna- tionalstandardmethodofspoligotypingidentifiesthepresenceorabsenceofCRISPRspacers againstasetofpreselectedoliogosthroughreverse-linehybridization,butcannotdetectmuta- tionsorindels.Genomicanalysiscanidentify57spacersinthecompletegenomeofManila familyisolate96121andshowtheaccuratesimilaritybetweenspacersfromdifferentstrains. Positiveselectionshavebeenreportedinviruses,pathogenicbacteria,andotherM.tubercu- losisstrains[40,41,44].Inthiswork,thedistributionofgenome-wideKa/Ksratiossuggested thatonegroup(75orthologousgenesinManila96121-H37Rv)wasunderpurifyingselection (Ka/Ks<1)andanothergroup(108orthologousgenesinManila96121-H37Rv)wasunder positiveselection(Ka/Ks>1).Fordecades,synonymoussubstitutionwasregardedassilent andneutralandwasusedtocalculatespeciesdiversificationtime.However,recentdiscoveries suggestthatsynonymousmutationscancausegeneticcodebiasandaffectmRNAstability,and mayalsobeunderevolutionarypressureinsteadofonlyunderneutralevolution[45].Thus, proteinsdefinedinthepurifyingselectiongroupmayalsobetheresultofnaturaladaptation. Inthisstudy,wecomparedManilafamilygenomestocompletegenomesinlineages2–4 andtwooutgroupspecies.WeidentifiedthetruncatedtranscriptionfactorwhiB5geneandan additionalpotentialoperoncontainingtheSTPKpknHandABCtransportergene.The PLOSONE|https://doi.org/10.1371/journal.pone.0175330 April10,2017 9/15 GenomicanalysesofMycobacteriumtuberculosis Table3. ListofproteinsunderpositiveselectioninManilafamily96121butnotBeijingfamily96075. ProteinID Start End Strand Product AIQ06604.1 5240 7267 + DNAgyrasesubunitB AIQ06635.1 39871 41196 - MFStransporter AIQ06685.1 92390 93340 + formatehydrogenlyase AIQ06723.1 147972 148406 - F420-dependentprotein AIQ06735.1 160933 161538 + acetyltransferase AIQ06768.1 193734 194174 + cyclase AIQ06771.1 196926 197723 + ABCtransporterpermease AIQ06915.1 377093 377749 + hypotheticalprotein AIQ06932.1 391858 393207 - cytochromeP450 AIQ06965.1 438910 440292 + magnesiumtransporter AIQ07080.1 562596 563966 + membraneprotein AIQ07164.1 643362 644150 + bromoperoxidase AIQ07252.1 731634 732119 + preproteintranslocasesubunitSecE AIQ07264.1 745467 746375 + ROKfamilytranscriptionalregulator AIQ07313.1 795363 796802 + dehydrogenase AIQ07411.1 882531 883259 - transglutaminase-likeenzyme AIQ07419.1 890531 891187 - succinyl-CoAtransferase AIQ07421.1 893640 894269 + abortivephageinfectionprotein AIQ07440.1 909816 911870 - LytRfamilytranscriptionalregulator AIQ07471.1 944559 945386 - short-chaindehydrogenase AIQ07554.1 1036753 1037865 - phosphate-bindingprotein AIQ07573.1 1058781 1059944 + succinyl-CoAsynthetasesubunitbeta AIQ07602.1 1092448 1093134 + transcriptionalregulator AIQ07613.1 1104204 1104797 - 5-formyltetrahydrofolatecyclo-ligase AIQ07936.1 1459675 1460427 + ATPsynthaseF0F1subunitA AIQ07961.1 1493984 1496575 + glycogenphosphorylase AIQ08030.1 1572174 1572575 - ribonuclease AIQ08344.1 1934517 1935977 + sulfatetransporter AIQ08345.1 1936088 1936951 + chromosomepartitioningproteinParA AIQ08437.1 2038829 2040049 + secretionproteinEccC AIQ08693.1 2298175 2299170 + NAD(P)Hnitroreductase AIQ08698.1 2301793 2302758 - membraneprotein AIQ08731.1 2350042 2350449 + hypotheticalprotein AIQ08827.1 2449181 2449840 + conservedlipoproteinLppM AIQ08912.1 2544083 2544586 + hypotheticalprotein AIQ08916.1 2546103 2546921 - beta-lactamase AIQ08928.1 2560636 2561163 + lipoproteinLppN AIQ08953.1 2583665 2584636 + oxidoreductase AIQ08998.1 2627687 2628454 - molybdopterinbiosynthesisproteinmoeW AIQ09020.1 2654411 2654803 + Furfamilytranscriptionalregulator AIQ09095.1 2739621 2740154 + alkylhydroperoxidereductase AIQ09342.1 3000837 3001532 - hypotheticalprotein AIQ09443.1 3094683 3095222 - AsnCfamilytranscriptionalregulator AIQ09460.1 3114882 3115445 - hypotheticalprotein AIQ09573.1 3226665 3227540 + D-alanyl-D-alaninecarboxypeptidase AIQ09609.1 3306175 3308292 - acyl-CoAsynthetase AIQ09658.1 3360602 3361612 - 3-isopropylmalatedehydrogenase (Continued) PLOSONE|https://doi.org/10.1371/journal.pone.0175330 April10,2017 10/15

Description:
spreads worldwide as one of the most successful bacterial pathogens and continues to kill millions of Mycobacterium tuberculosis is one pathogenic bacterial species in the .. not for in vitro growth [32], suggesting that it plays a role in stress response during host- . Darty K, Denise A, Ponty Y.
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.