
Variant Calling: Methods and Protocols PDF

352 Pages·2022·8.149 MB·English

Preview Variant Calling: Methods and Protocols

Methods in Molecular Biology 2493

Charlotte K. Y. Ng, Salvatore Piscuoglio (Editors)

Variant Calling: Methods and Protocols

METHODS IN MOLECULAR BIOLOGY

Series Editor
John M. Walker
School of Life and Medical Sciences
University of Hertfordshire
Hatfield, Hertfordshire, UK

For further volumes: http://www.springer.com/series/7651

For over 35 years, biological scientists have come to rely on the research protocols and methodologies in the critically acclaimed Methods in Molecular Biology series. The series was the first to introduce the step-by-step protocols approach that has become the standard in all biomedical protocol publishing. Each protocol is provided in readily-reproducible step-by-step fashion, opening with an introductory overview, a list of the materials and reagents needed to complete the experiment, and followed by a detailed procedure that is supported with a helpful notes section offering tips and tricks of the trade as well as troubleshooting advice. These hallmark features were introduced by series editor Dr. John Walker and constitute the key ingredient in each and every volume of the Methods in Molecular Biology series. Tested and trusted, comprehensive and reliable, all protocols from the series are indexed in PubMed.

Variant Calling
Methods and Protocols

Edited by

Charlotte K. Y. Ng
Department for BioMedical Research, University of Bern, Bern, Switzerland

Salvatore Piscuoglio
Department of Biomedicine, University of Basel, Basel, Switzerland

Editors
Charlotte K. Y. Ng, Department for BioMedical Research, University of Bern, Bern, Switzerland
Salvatore Piscuoglio, Department of Biomedicine, University of Basel, Basel, Switzerland

ISSN 1064-3745
ISSN 1940-6029 (electronic)
Methods in Molecular Biology
ISBN 978-1-0716-2292-6
ISBN 978-1-0716-2293-3 (eBook)
https://doi.org/10.1007/978-1-0716-2293-3

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2022

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Humana imprint is published by the registered company Springer Science+Business Media, LLC, part of Springer Nature.
The registered company address is: 1 New York Plaza, New York, NY 10004, U.S.A.

Preface

Genetic diversity is fundamental to species adaptability and survival.
The identification of sequence variations provides insights not only into genetic diversity but also to the genetic component of diseases and phenotypes. The advances of next-generation sequencing over the past two decades have offered the opportunity to comprehensively screen and detect various types of variants in large populations. While the Human Genome Project took 13 years to complete, now entire genomes can be screened for genetic variations at single base-pair resolution in a matter of days. The identification of variants in the nearly half-million sequenced species has enabled significant advances in many facets, from medicine to agriculture to industrial processes. The number of known variants will only continue to increase exponentially with the Earth BioGenome Project aiming to sequence and annotate ~1.5 million known eukaryotic species by 2028.

This book was compiled in the midst of one of the great challenges in modern medicine: the SARS-CoV-2 pandemic. Being able to rapidly sequence the pathogen and trace its variants as they spread across the world has been critical to pandemic response planning aiming to contain viral spread. While the initial vaccines developed at breakneck speed have proven effective, the identification of novel variants that could compromise their potency is critical to the further development of vaccines. Next-generation sequencing and variant detection, together with other pandemic response measures, have undoubtedly saved many, many lives.

The evolution of next-generation sequencing has necessitated the continual development of algorithms to address specific questions and to overcome computational challenges. The many types of sequencing protocols and technologies enable the discovery of many different types of variants, each requiring specialist algorithms to accurately model the underlying statistical properties.
In this volume, we cover a diverse range of methods relevant to variant calling. In the first part, Chapters 1–11 cover general methods for variant calling, with Chapters 1–5 covering approaches to calling “classical” variants (single nucleotide substitutions, insertions, and deletions) and Chapters 6–11 describing methods for the identification of other, but no less biologically relevant, variant types (copy number variants, structural variants, somatic mitochondrial variants, and splice variants). The second part covers variant calling in specialized data types and use cases. Chapters 12–14 describe three approaches to variant calling in specialized data types: Ion Torrent sequencing data, RNA-seq data, and UMI-tagged paired-end sequencing data. Chapters 15 and 16 describe alignment-free genotyping and SNP calling. Chapters 17 and 18 describe single nucleotide and copy number variant detection in single-cell DNA sequencing data. The third part covers topics relevant and important to variant calling, with Chapters 19 and 20 dedicated to variant annotation, which is critical to biological interpretation, and Chapter 21 devoted to preanalytical quality control to ensure successful variant calling. The chapters should be of interest to bioinformatics students and researchers alike looking to broaden and deepen their knowledge on variant calling.

We would like to thank the authors for sharing their methods. We thank Springer Nature and Series Editor Prof. John M Walker for giving us the opportunity to compile this collection and for their support throughout the process.

Charlotte K. Y. Ng, Bern, Switzerland
Salvatore Piscuoglio, Basel, Switzerland

Contents

Preface
Contributors

1 Data Processing and Germline Variant Calling with the Sentieon Pipeline
  Rafael Aldana and Donald Freed
2 MuSE: A Novel Approach to Mutation Calling with Sample-Specific Error Modeling
  Shuangxi Ji, Matthew D. Montierth, and Wenyi Wang
3 Octopus: Genotyping and Haplotyping in Diverse Experimental Designs
  Daniel P. Cooke
4 Accurate Ensemble Prediction of Somatic Mutations with SMuRF2
  Weitai Huang, Ngak Leng Sim, and Anders J. Skanderup
5 Detecting Medium and Large Insertions and Deletions with transIndel
  Ting-You Wang and Rendong Yang
6 DECoN: A Detection and Visualization Tool for Exonic Copy Number Variants
  Anna Fowler
7 FACETS: Fraction and Allele-Specific Copy Number Estimates from Tumor Sequencing
  Arshi Arora, Ronglai Shen, and Venkatraman E. Seshan
8 Meerkat: An Algorithm to Reliably Identify Structural Variations and Predict Their Forming Mechanisms
  Lixing Yang
9 Structural Variant Detection from Long-Read Sequencing Data with cuteSV
  Tao Jiang, Shiqi Liu, Shuqi Cao, and Yadong Wang
10 Identifying Somatic Mitochondrial DNA Mutations
  Jisong An, Kyoung Il Min, and Young Seok Ju
11 Identification, Quantification, and Testing of Alternative Splicing Events from RNA-Seq Data Using SplAdder
  Philipp Markolin, Gunnar Rätsch, and André Kahles
12 PipeIT2: Somatic Variant Calling Workflow for Ion Torrent Sequencing Data
  Andrea Garofoli, Désirée Schnidrig, and Charlotte K. Y. Ng
13 Variant Calling from RNA-seq Data Using the GATK Joint Genotyping Workflow
  Jean-Simon Brouard and Nathalie Bissonnette
14 UMI-Varcal: A Low-Frequency Variant Caller for UMI-Tagged Paired-End Sequencing Data
  Vincent Sater, Pierre-Julien Viailly, Thierry Lecroq, Élise Prieur-Gaston, Élodie Bohers, Mathieu Viennot, Philippe Ruminy, Hélène Dauchel, Pierre Vera, and Fabrice Jardin
15 Alignment-Free Genotyping of Known Variations with MALVA
  Giulia Bernardini, Luca Denti, and Marco Previtali
16 Kmer2SNP: Reference-Free Heterozygous SNP Calling Using k-mer Frequency Distributions
  Yanbo Li, Hardip Patel, and Yu Lin
17 Somatic Single-Nucleotide Variant Calling from Single-Cell DNA Sequencing Data Using SCAN-SNV
  Sajedeh Bahonar and Hesam Montazeri
18 Copy Number Variation Detection by Single-Cell DNA Sequencing with SCOPE
  Rujin Wang and Yuchao Jiang
19 Variant Annotation and Functional Prediction: SnpEff
  Pablo Cingolani
20 Annotating Cancer-Related Variants at Protein–Protein Interface with Structure-PPi
  Miguel Vazquez and Tirso Pons
21 Preanalytical Variables and Sample Quality Control for Clinical Variant Analysis
  Ilaria Alborelli and Philip M. Jermann

Index
Contributors

Ilaria Alborelli • Institute of Medical Genetics and Pathology, University Hospital Basel, Basel, Switzerland
Rafael Aldana • Sentieon® Inc., San Jose, CA, USA
Jisong An • Graduate School of Medical Science and Engineering (GSMSE), Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
Arshi Arora • Department of Epidemiology & Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
Sajedeh Bahonar • Department of Bioinformatics, Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran, Iran
Giulia Bernardini • Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
Nathalie Bissonnette • Agriculture and Agri-Food Canada, Sherbrooke, QC, Canada
Élodie Bohers • Department of Pathology, Centre Henri Becquerel, Rouen, France; INSERM U1245, University of Normandie UNIROUEN, Rouen, France
Jean-Simon Brouard • Agriculture and Agri-Food Canada, Sherbrooke, QC, Canada
Shuqi Cao • Harbin Institute of Technology, Harbin, Heilongjiang, China
Pablo Cingolani • AstraZeneca, Oncology R&D, Arlington, MA, USA
Daniel P. Cooke • MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK
Hélène Dauchel • Normandie Univ, UNIROUEN, LITIS EA 4108, Rouen, France
Luca Denti • Department of Computational Biology, C3BI USR 3756 CNRS, Institut Pasteur, Paris, France
Anna Fowler • Department of Health Data Science, Institute of Population Health, University of Liverpool, Liverpool, UK
Donald Freed • Sentieon® Inc., San Jose, CA, USA
Andrea Garofoli • Institute of Medical Genetics and Pathology, University Hospital Basel, Basel, Switzerland
Weitai Huang • Laboratory of Computational Cancer Genomics, Genome Institute of Singapore, A*STAR (Agency for Science, Technology and Research), Singapore, Singapore
Fabrice Jardin • Department of Pathology, Centre Henri Becquerel, Rouen, France; INSERM U1245, University of Normandie UNIROUEN, Rouen, France
Philip M. Jermann • Institute of Medical Genetics and Pathology, University Hospital Basel, Basel, Switzerland
Shuangxi Ji • Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
Tao Jiang • Harbin Institute of Technology, Harbin, Heilongjiang, China
Yuchao Jiang • Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, USA; Department of Genetics, School of Medicine, University of North Carolina, Chapel Hill, NC, USA; Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC, USA
Young Seok Ju • Graduate School of Medical Science and Engineering (GSMSE), Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
