Protein Function Prediction: Methods and Protocols PDF

243 Pages·2017·8.378 MB·English

Preview Protein Function Prediction: Methods and Protocols

Methods in Molecular Biology 1611

Protein Function Prediction: Methods and Protocols

Edited by
Daisuke Kihara
Department of Biological Sciences and Computer Science, Purdue University, West Lafayette, Indiana, USA

Methods in Molecular Biology
Series Editor: John M. Walker, School of Life and Medical Sciences, University of Hertfordshire, Hatfield, Hertfordshire, AL10 9AB, UK
For further volumes: http://www.springer.com/series/7651

ISSN 1064-3745
ISSN 1940-6029 (electronic)
Methods in Molecular Biology
ISBN 978-1-4939-7013-1
ISBN 978-1-4939-7015-5 (eBook)
DOI 10.1007/978-1-4939-7015-5
Library of Congress Control Number: 2017937538
© Springer Science+Business Media LLC 2017

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Printed on acid-free paper

This Humana Press imprint is published by Springer Nature. The registered company is Springer Science+Business Media LLC. The registered company address is: 233 Spring Street, New York, NY 10013, U.S.A.

Preface

Knowing the function of a protein and understanding how it is carried out are the ultimate goals of molecular biology and biochemistry. From the early stage of bioinformatics in the 1980s, the development of computational tools to aid in elucidating protein function was a major focus of the field. Numerous methods have been developed since then. Computationally, protein function can be predicted through similarity searches because similarity implies homology from an evolutionary standpoint, and also because it indicates that the proteins have the same physical structures where the function takes place. Thus, based on this similarity principle, methods were developed to compare global or local sequences and the structures of proteins. Databases were also developed, which organize function information of proteins and serve as references to be queried against. In this book, well-established sequence- and structure-based tools and databases are introduced, which are very useful for biology labs. In addition, this book introduces software which addresses function beyond its conventional meaning, reflecting the diversity of the current active research field.

This book begins by introducing two sequence-based function prediction methods, PFP and ESG, in Chapter 1. The chapter also describes a web server, NaviGO, which can analyze Gene Ontology annotations. Then, Chapters 2, 3, and 4 discuss tools suitable for the functional analysis of metagenomics data. The tools in these three chapters are based on sequence database searches faster than conventional homology search methods, a necessity when processing the large amounts of sequence data which typify metagenome sequences. Chapter 2 introduces GhostX, which uses a suffix array for fast sequence comparison.
Fun4Me in Chapter 3 is a pipeline that combines protein coding gene detection in query sequences and a fast sequence database search utilizing a hashing technique. SUPER-FOCUS in Chapter 4 combines fast search algorithms with preclustered reference sequence databases. In Chapter 5, we have MPFit, a program that detects when query proteins are moonlighting proteins, i.e., a protein with dual functions.

The next chapter (Chapter 6) describes SignalP, a well-established web server that predicts subcellular localization by recognizing a signal peptide in a query sequence. Subcellular localization is one of the three functional categories in the Gene Ontology (Cellular Component), and it can be a clue for other biological functions of a protein since localization and biological function are closely correlated.

The following four chapters deal with protein structures. ProFunc in Chapter 7 is a popular web server that performs multiple different analyses on a query protein structure, including global and local structure matching to known proteins. Chapter 8 describes G-LoSA, which finds ligand binding sites similar to a query binding site within a reference database. eMatchSite, the following chapter (Chapter 9), aligns two ligand binding sites to quantify similarities between them. In Chapter 10, WATsite2.0 is introduced, which predicts bound water molecules in a ligand binding site. Water molecules bound to proteins mediate ligand-protein interactions and are thus important in protein function.

The subsequent five chapters cover resources that address protein function through pathways, networks, and genomes. Chapter 11 discusses recent updates of KEGG, focusing on enzymes and pathways. KEGG is one of the most comprehensive databases of pathways, genomes, and other biomolecules and is a fundamental resource for understanding protein function at a systems level. Chapter 12 is about the Microbial Genome Database, a valuable resource to perform comparative genomics. The Saccharomyces Genome Database (SGD) is described in Chapter 13. S. cerevisiae is one of the most extensively studied organisms. SGD has long served as a reliable source for protein function and other resources, including gene expression and phenotypes, in S. cerevisiae. Chapter 14 introduces MouseNet, which predicts gene function in mice from a gene expression network. FANTOM5 in Chapter 15 is a database of human and mouse genomes. Transcription start sites and promoter activities of various cells can be browsed and searched. The last chapter (Chapter 16) introduces Spatiocyte, a software for simulating the diffusion and localization of proteins in a cell. Results from the simulation, i.e., a phenotype, can be compared against microscope observations. Proteins exhibit their function through dynamic interactions in a cell environment. Thus, ultimately functions must be considered in a dynamic system, which this software aims to do.

I hope readers enjoy this book as a practical guide for using bioinformatics tools related to protein function prediction. Moreover, I also hope that this compilation itself exhibits a snapshot of the current research field and our understanding of the concept of protein function, while indicating the future direction of the field.

Editing of this book was greatly aided by Mr. Joshua McGraw, Ms. Sarah Rodenbeck, Ms. Lenna X. Peterson, and Mr. Charles Christoffer of my research group. I would like to conclude this preface by recognizing and acknowledging their help as a happy memory of my research activities.

West Lafayette, IN, USA
Daisuke Kihara

Contents

Preface
Contributors
1 Using PFP and ESG Protein Function Prediction Web Servers
  Qing Wei, Joshua McGraw, Ishita Khan, and Daisuke Kihara
2 GHOSTX: A Fast Sequence Homology Search Tool for Functional Annotation of Metagenomic Data
  Shuji Suzuki, Takashi Ishida, Masahito Ohue, Masanori Kakuta, and Yutaka Akiyama
3 From Gene Annotation to Function Prediction for Metagenomics
  Fatemeh Sharifi and Yuzhen Ye
4 An Agile Functional Analysis of Metagenomic Data Using SUPER-FOCUS
  Genivaldo Gueiros Z. Silva, Fabyano A.C. Lopes, and Robert A. Edwards
5 MPFit: Computational Tool for Predicting Moonlighting Proteins
  Ishita Khan, Joshua McGraw, and Daisuke Kihara
6 Predicting Secretory Proteins with SignalP
  Henrik Nielsen
7 The ProFunc Function Prediction Server
  Roman A. Laskowski
8 G-LoSA for Prediction of Protein-Ligand Binding Sites and Structures
  Hui Sun Lee and Wonpil Im
9 Local Alignment of Ligand Binding Sites in Proteins for Polypharmacology and Drug Repositioning
  Michal Brylinski
10 WATsite2.0 with PyMOL Plugin: Hydration Site Prediction and Visualization
  Ying Yang, Bingjie Hu, and Markus A. Lill
11 Enzyme Annotation and Metabolic Reconstruction Using KEGG
  Minoru Kanehisa
12 Ortholog Identification and Comparative Analysis of Microbial Genomes Using MBGD and RECOG
  Ikuo Uchiyama
13 Exploring Protein Function Using the Saccharomyces Genome Database
  Edith D. Wong
14 Network-Based Gene Function Prediction in Mouse and Other Model Vertebrates Using MouseNet Server
  Eiru Kim and Insuk Lee
15 The FANTOM5 Computation Ecosystem: Genomic Information Hub for Promoters and Active Enhancers
  Imad Abugessaisa, Shuhei Noguchi, Piero Carninci, and Takeya Kasukawa
16 Multi-Algorithm Particle Simulations with Spatiocyte
  Satya N.V. Arjunan and Koichi Takahashi
Index

List of Contributors

IMAD ABUGESSAISA • Division of Genomics Technologies, RIKEN Center for Life Science Technologies, Yokohama, Kanagawa, Japan
YUTAKA AKIYAMA • Department of Computer Science, School of Computing, Tokyo Institute of Technology, Tokyo, Japan; Education Academy of Computational Life Sciences (ACLS), Tokyo Institute of Technology, Yokohama, Japan; Department of Computer Science, Graduate School of Information Science and Engineering, Tokyo Institute of Technology, Tokyo, Japan
SATYA N.V. ARJUNAN • Laboratory for Biochemical Simulation, RIKEN Quantitative Biology Center, Suita, Osaka, Japan
MICHAL BRYLINSKI • Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA; Center for Computation & Technology, Louisiana State University, Baton Rouge, LA, USA
PIERO CARNINCI • Division of Genomics Technologies, RIKEN Center for Life Science Technologies, Yokohama, Kanagawa, Japan
ROBERT A. EDWARDS • Computational Science Research Center, San Diego State University, San Diego, CA, USA; Department of Biology, San Diego State University, San Diego, CA, USA; Department of Computer Science, San Diego State University, San Diego, CA, USA
BINGJIE HU • Department of Medicinal Chemistry and Molecular Pharmacology, College of Pharmacy, Purdue University, West Lafayette, IN, USA; Computational ADME, Drug Disposition, Lilly Research Laboratories, Eli Lilly and Company, Indianapolis, IN, USA
WONPIL IM • Department of Biological Sciences and Bioengineering Program, Lehigh University, Bethlehem, PA, USA
TAKASHI ISHIDA • Department of Computer Science, School of Computing, Tokyo Institute of Technology, Tokyo, Japan; Education Academy of Computational Life Sciences (ACLS), Tokyo Institute of Technology, Yokohama, Japan; Department of Computer Science, Graduate School of Information Science and Engineering, Tokyo Institute of Technology, Tokyo, Japan
TAKEYA KASUKAWA • Division of Genomics Technologies, RIKEN Center for Life Science Technologies, Yokohama, Kanagawa, Japan
MASANORI KAKUTA • Department of Computer Science, Graduate School of Information Science and Engineering, Tokyo Institute of Technology, Tokyo, Japan
MINORU KANEHISA • Institute for Chemical Research, Kyoto University, Uji, Kyoto, Japan
ISHITA KHAN • Department of Computer Science, Purdue University, West Lafayette, IN, USA
DAISUKE KIHARA • Department of Biological Sciences and Computer Science, Purdue University, West Lafayette, IN, USA
EIRU KIM • Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
ROMAN A. LASKOWSKI • European Bioinformatics Institute, Hinxton, Cambridge, UK
HUI SUN LEE • Department of Biological Sciences and Bioengineering Program, Lehigh University, Bethlehem, PA, USA
INSUK LEE • Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, Korea
MARKUS A. LILL • Department of Medicinal Chemistry and Molecular Pharmacology, College of Pharmacy, Purdue University, West Lafayette, IN, USA
JOSHUA MCGRAW • Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
FABYANO A.C. LOPES • Cellular Biology Department, Universidade de Brasília (UnB), Brasília, DF, Brazil
HENRIK NIELSEN • Department of Bio and Health Informatics, Technical University of Denmark, Lyngby, Denmark
SHUHEI NOGUCHI • Division of Genomics Technologies, RIKEN Center for Life Science Technologies, Yokohama, Kanagawa, Japan
MASAHITO OHUE • Department of Computer Science, Graduate School of Information Science and Engineering, Tokyo Institute of Technology, Tokyo, Japan; Department of Computer Science, School of Computing, Tokyo Institute of Technology, Tokyo, Japan
FATEMEH SHARIFI • School of Informatics and Computing, Indiana University, Bloomington, IN, USA
GENIVALDO GUEIROS Z. SILVA • Computational Science Research Center, San Diego State University, San Diego, CA, USA
SHUJI SUZUKI • Department of Computer Science, Graduate School of Information Science and Engineering, Tokyo Institute of Technology, Tokyo, Japan; Education Academy of Computational Life Sciences (ACLS), Tokyo Institute of Technology, Yokohama, Japan
KOICHI TAKAHASHI • Laboratory for Biochemical Simulation, RIKEN Quantitative Biology Center, Suita, Osaka, Japan
IKUO UCHIYAMA • Laboratory of Genome Informatics, National Institute for Basic Biology, National Institutes of Natural Sciences, Okazaki, Aichi, Japan
QING WEI • Department of Computer Science, Purdue University, West Lafayette, IN, USA
EDITH D. WONG • Department of Genetics, Stanford University, Stanford, CA, USA
YING YANG • Department of Medicinal Chemistry and Molecular Pharmacology, College of Pharmacy, Purdue University, West Lafayette, IN, USA
YUZHEN YE • School of Informatics and Computing, Indiana University, Bloomington, IN, USA
