
Gene Prediction: Methods and Protocols PDF

286 Pages·2019·7.317 MB·English

Preview Gene Prediction: Methods and Protocols

Methods in Molecular Biology 1962

Gene Prediction: Methods and Protocols

Edited by Martin Kollmar, Group Systems Biology of Motor Proteins, Department NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Goettingen, Germany

Methods in Molecular Biology
Series Editor: John M. Walker, School of Life and Medical Sciences, University of Hertfordshire, Hatfield, Hertfordshire, AL10 9AB, UK
For further volumes: http://www.springer.com/series/7651

ISSN 1064-3745
ISSN 1940-6029 (electronic)
ISBN 978-1-4939-9172-3
ISBN 978-1-4939-9173-0 (eBook)
https://doi.org/10.1007/978-1-4939-9173-0
Library of Congress Control Number: 2019935814
© Springer Science+Business Media, LLC, part of Springer Nature 2019

Chapter 3 is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/). For further details see license information in the chapter.

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Humana Press imprint is published by the registered company Springer Science+Business Media, LLC, part of Springer Nature. The registered company address is: 233 Spring Street, New York, NY 10013, U.S.A.

Preface

Since the application of new high-throughput sequencing methods (next-generation sequencing) to genome sequencing, the number of sequenced eukaryotic genomes is increasing more and more rapidly. Several large-scale sequencing projects of thousands of species have already been started years ago, such as the 1k fungi project, the “Genome 10k” project (10,000 vertebrates), the i5k project (5000 insects), and the 959 nematodes project. While genome sequencing steps from record to record, progress in genome annotation is developing slowly. Despite considerable experimental and bioinformatics efforts, even the annotation of the human genome has not been finished yet. For other species, the quality of the annotations is significantly worse. In addition to the intrinsic incompleteness of genome annotations, annotations are currently only available for a small part of all sequenced genomes. Annotations are usually only updated for the most important model species. For many species, only the initial genome annotations are available, although incorporation of new data (e.g., transcript sequencing) and application of new approaches could considerably improve annotations.

Improvements in genome annotations have a direct positive effect on all downstream analyses as well as diverse applications in medicine, biology, biotechnology, and agriculture.
This volume introduces software for gene prediction with focus on eukaryotic genomes. The primary audience are researchers and research groups working on the assembly and annotation of single species or small groups of species. Such groups usually do not have access to advanced and complex annotation pipelines, which are in use by some large-scale sequencing centers, and usually do not have particular expertise in gene prediction software. Also, the focus of such groups is often on a particular biological aspect that can well be explained with just a very preliminary and partially incomplete genome annotation. The protocols described in this volume should enable these groups to considerably improve and complete their genome annotations to be useful for a wider research community. Re-annotation of long available genome assemblies should also be simplified. Available software also contains options and parameters that are often hidden to the novice or occasional user. Chapters will explain software and web server usage as applied in typical use cases, written in the spirit of the series, which aims to provide practical guidance and troubleshooting advice.

Goettingen, Germany
Martin Kollmar

Contents

Preface
Contributors
1 tRNAscan-SE: Searching for tRNA Genes in Genomic Sequences
   Patricia P. Chan and Todd M. Lowe
2 Predicting RNA Families in Nucleotide Sequences Using StructRNAfinder
   Vinicius Maracaja-Coutinho, Raúl Arias-Carrasco, Helder I. Nakaya, and Victor Aliaga-Tobar
3 Structural and Functional Annotation of Eukaryotic Genomes with GenSAS
   Jodi L. Humann, Taein Lee, Stephen Ficklin, and Dorrie Main
4 Practical Guide for Fungal Gene Prediction from Genome Assembly and RNA-Seq Reads by FunGAP
   Byoungnam Min and In-Geol Choi
5 Whole-Genome Annotation with BRAKER
   Katharina J. Hoff, Alexandre Lomsadze, Mark Borodovsky, and Mario Stanke
6 EuGene: An Automated Integrative Gene Finder for Eukaryotes and Prokaryotes
   Erika Sallet, Jérôme Gouzy, and Thomas Schiex
7 ChemGenome2.1: An Ab Initio Gene Prediction Software
   Akhilesh Mishra, Priyanka Siwach, Poonam Singhal, and B. Jayaram
8 Multi-Genome Annotation with AUGUSTUS
   Stefanie Nachtweide and Mario Stanke
9 GeMoMa: Homology-Based Gene Prediction Utilizing Intron Position Conservation and RNA-seq Data
   Jens Keilwagen, Frank Hartung, and Jan Grau
10 Coding Exon-Structure Aware Realigner (CESAR): Utilizing Genome Alignments for Comparative Gene Annotation
   Virag Sharma and Michael Hiller
11 Predicting Genes in Closely Related Species with Scipio and WebScipio
   Martin Kollmar
12 AnABlast: Re-searching for Protein-Coding Sequences in Genomic Regions
   Alejandro Rubio, Carlos S. Casimiro-Soriguer, Pablo Mier, Miguel A. Andrade-Navarro, Andrés Garzón, Juan Jimenez, and Antonio J. Pérez-Pulido
13 Generating Publication-Ready Prokaryotic Genome Annotations with DFAST
   Yasuhiro Tanizawa, Takatomo Fujisawa, Masanori Arita, and Yasukazu Nakamura
14 BUSCO: Assessing Genome Assembly and Annotation Completeness
   Mathieu Seppey, Mosè Manni, and Evgeny M. Zdobnov
15 Evaluating Genome Assemblies and Gene Models Using gVolante
   Osamu Nishimura, Yuichiro Hara, and Shigehiro Kuraku
16 Choosing the Best Gene Predictions with GeneValidator
   Ismail Moghul, Anurag Priyam, and Yannick Wurm
17 COGNATE: Comparative Gene Annotation Characterizer
   Jeanne Wilbrandt
Index

Contributors

Victor Aliaga-Tobar · Facultad de Ciencias Químicas y Farmacéuticas, Advanced Center for Chronic Diseases (ACCDiS), Universidad de Chile, Santiago, Chile; Programa de Doctorado en Genómica Integrativa, Vicerrectoría de Investigación, Universidad Mayor, Santiago, Chile
Miguel A. Andrade-Navarro · Faculty of Biology, Johannes Gutenberg University Mainz, Mainz, Germany
Raúl Arias-Carrasco · Facultad de Ciencias Químicas y Farmacéuticas, Advanced Center for Chronic Diseases (ACCDiS), Universidad de Chile, Santiago, Chile; Programa de Doctorado en Genómica Integrativa, Vicerrectoría de Investigación, Universidad Mayor, Santiago, Chile
Masanori Arita · Department of Informatics, National Institute of Genetics, Shizuoka, Japan; RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, Japan
Mark Borodovsky · Joint Georgia Tech and Emory University Wallace H Coulter Department of Biomedical Engineering, Atlanta, GA, USA; School of Computational Science and Engineering, Atlanta, GA, USA; Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russia
Carlos S. Casimiro-Soriguer · Facultad de Ciencias Experimentales (Área de Genética), Centro Andaluz de Biología del Desarrollo (CABD, UPO-CSIC), Universidad Pablo de Olavide, Sevilla, Spain
Patricia P. Chan · Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
In-Geol Choi · Department of Biotechnology, College of Life Sciences and Biotechnology, Korea University, Seoul, South Korea
Stephen Ficklin · Department of Horticulture, Washington State University, Pullman, WA, USA
Takatomo Fujisawa · Department of Informatics, National Institute of Genetics, Shizuoka, Japan
Andrés Garzón · Facultad de Ciencias Experimentales (Área de Genética), Centro Andaluz de Biología del Desarrollo (CABD, UPO-CSIC), Universidad Pablo de Olavide, Sevilla, Spain
Jérôme Gouzy · Laboratoire des Interactions Plantes-Microorganismes (LIPM), Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
Jan Grau · Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle (Saale), Germany
Yuichiro Hara · Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research (BDR), Kobe, Japan
Frank Hartung · Institute for Biosafety in Plant Biotechnology, Julius Kühn-Institut (JKI), Federal Research Centre for Cultivated Plants, Quedlinburg, Germany
Michael Hiller · Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany; Max Planck Institute for the Physics of Complex Systems, Dresden, Germany; Center for Systems Biology, Dresden, Germany
Katharina J. Hoff · University of Greifswald, Institute of Mathematics and Computer Science, Greifswald, Germany
Jodi L. Humann · Department of Horticulture, Washington State University, Pullman, WA, USA
B. Jayaram · Supercomputing Facility for Bioinformatics and Computational Biology, Indian Institute of Technology Delhi, New Delhi, India; Kusuma School of Biological Sciences, Indian Institute of Technology Delhi, New Delhi, India; Department of Chemistry, Indian Institute of Technology Delhi, New Delhi, India
Juan Jimenez · Facultad de Ciencias Experimentales (Área de Genética), Centro Andaluz de Biología del Desarrollo (CABD, UPO-CSIC), Universidad Pablo de Olavide, Sevilla, Spain
Jens Keilwagen · Institute for Biosafety in Plant Biotechnology, Julius Kühn-Institut (JKI), Federal Research Centre for Cultivated Plants, Quedlinburg, Germany
Martin Kollmar · Group Systems Biology of Motor Proteins, Department of NMR-Based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Goettingen, Germany
Shigehiro Kuraku · Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research (BDR), Kobe, Japan
Taein Lee · Department of Horticulture, Washington State University, Pullman, WA, USA
Alexandre Lomsadze · Joint Georgia Tech and Emory University Wallace H Coulter Department of Biomedical Engineering, Atlanta, GA, USA
Todd M. Lowe · Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
Dorrie Main · Department of Horticulture, Washington State University, Pullman, WA, USA
Mosè Manni · Department of Genetic Medicine and Development, Swiss Institute of Bioinformatics, University of Geneva Medical School, Geneva, Switzerland
Vinicius Maracaja-Coutinho · Facultad de Ciencias Químicas y Farmacéuticas, Advanced Center for Chronic Diseases (ACCDiS), Universidad de Chile, Santiago, Chile; Beagle Bioinformatics, Santiago, Chile; Instituto Vandique, João Pessoa, Brazil
Pablo Mier · Faculty of Biology, Johannes Gutenberg University Mainz, Mainz, Germany
Byoungnam Min · Department of Biotechnology, College of Life Sciences and Biotechnology, Korea University, Seoul, South Korea
Akhilesh Mishra · Supercomputing Facility for Bioinformatics and Computational Biology, Indian Institute of Technology Delhi, New Delhi, India; Kusuma School of Biological Sciences, Indian Institute of Technology Delhi, New Delhi, India
Ismail Moghul · UCL Cancer Institute, University College London, London, UK
Stefanie Nachtweide · Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, Germany
Yasukazu Nakamura · Department of Informatics, National Institute of Genetics, Shizuoka, Japan
Helder I. Nakaya · Department of Clinical and Toxicological Analyses, School of Pharmaceutical Sciences, University of São Paulo, São Paulo, Brazil
Osamu Nishimura · Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research (BDR), Kobe, Japan
Antonio J. Pérez-Pulido · Facultad de Ciencias Experimentales (Área de Genética), Centro Andaluz de Biología del Desarrollo (CABD, UPO-CSIC), Universidad Pablo de Olavide, Sevilla, Spain
Anurag Priyam · School of Biological and Chemical Sciences, Queen Mary University of London, London, UK
Alejandro Rubio · Facultad de Ciencias Experimentales (Área de Genética), Centro Andaluz de Biología del Desarrollo (CABD, UPO-CSIC), Universidad Pablo de Olavide, Sevilla, Spain
Erika Sallet · Laboratoire des Interactions Plantes-Microorganismes (LIPM), Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
Thomas Schiex · Unité de Mathématiques et Informatique Appliquées de Toulouse (MIAT), Université de Toulouse, INRA, Castanet-Tolosan, France
Mathieu Seppey · Department of Genetic Medicine and Development, Swiss Institute of Bioinformatics, University of Geneva Medical School, Geneva, Switzerland
Virag Sharma · Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany; Max Planck Institute for the Physics of Complex Systems, Dresden, Germany; Center for Systems Biology, Dresden, Germany; CRTD-DFG Center for Regenerative Therapies Dresden, Carl Gustav Carus Faculty of Medicine, Technische Universität Dresden, Dresden, Germany; Paul Langerhans Institute Dresden (PLID) of the Helmholtz Center Munich at University Hospital Carl Gustav Carus and Faculty of Medicine, Technische Universität Dresden, Dresden, Germany; German Center for Diabetes Research (DZD), Munich, Germany
Poonam Singhal · Supercomputing Facility for Bioinformatics and Computational Biology, Indian Institute of Technology Delhi, New Delhi, India
Priyanka Siwach · Supercomputing Facility for Bioinformatics and Computational Biology, Indian Institute of Technology Delhi, New Delhi, India; Department of Biotechnology, Chaudhary Devi Lal University, Sirsa, Haryana, India
Mario Stanke · Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, Germany
Yasuhiro Tanizawa · Department of Informatics, National Institute of Genetics, Shizuoka, Japan
Jeanne Wilbrandt · Center for Molecular Biodiversity Research, Zoological Research Museum Alexander Koenig (ZFMK), Bonn, Germany
Yannick Wurm · School of Biological and Chemical Sciences, Queen Mary University of London, London, UK
Evgeny M. Zdobnov · Department of Genetic Medicine and Development, Swiss Institute of Bioinformatics, University of Geneva Medical School, Geneva, Switzerland
