Handbook of Statistical Bioinformatics PDF

406 Pages·2022·10.143 MB·English

Preview Handbook of Statistical Bioinformatics

Springer Handbooks of Computational Statistics

Henry Horng-Shing Lu, Bernhard Schölkopf, Martin T. Wells, Hongyu Zhao
Editors

Handbook of Statistical Bioinformatics
Second Edition

Series Editors
James E. Gentle, George Mason University, Fairfax, VA, USA
Wolfgang Karl Härdle, Humboldt-Universität zu Berlin, Berlin, Germany
Yuichi Mori, Okayama University of Science, Okayama, Japan

Editors
Henry Horng-Shing Lu, Institute of Statistics, National Yang Ming Chiao Tung University, Hsinchu, Taiwan, ROC
Bernhard Schölkopf, Department of Empirical Inference, Max Planck Institute for Intelligent Systems, Tübingen, Germany
Martin T. Wells, Department of Statistics and Data Science, Cornell University, Ithaca, NY, USA
Hongyu Zhao, Department of Biostatistics, Yale University, New Haven, CT, USA

ISSN 2197-9790, ISSN 2197-9804 (electronic)
Springer Handbooks of Computational Statistics
ISBN 978-3-662-65901-4, ISBN 978-3-662-65902-1 (eBook)
https://doi.org/10.1007/978-3-662-65902-1

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer-Verlag GmbH, DE, part of Springer Nature 2011, 2022

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer-Verlag GmbH, DE, part of Springer Nature.
The registered company address is: Heidelberger Platz 3, 14197 Berlin, Germany

Preface

Numerous fascinating and important breakthroughs in biotechnology have generated massive volumes of high throughput data with diverse types that demand novel developments of efficient and appropriate tools in computational statistics that are integrated with biological knowledge and computational algorithms. This updated volume collects contributed chapters from leading researchers to survey many recent active research topics that have developed since the previous edition of the Handbook of Statistical Bioinformatics. This updated handbook is intended to serve as both an introductory and reference monograph for students and researchers who are interested in learning the state-of-the-art developments in computational statistics as applied to computational biology.
This collection of articles, from the leading scholars in the field, is primarily a monograph which will be of interest to the educational, academic, and professional organizations related to statisticians, computer scientists, biological and biomedical researchers with strong interests in computational biology. Although there are other volumes available for computational statistics and bioinformatics on the market, there are few books such as this that focus on the interface between computational statistics and cutting-edge developments in computational biology. Seeing this need, this completely updated collection is aimed to establish this bridge. This handbook covers many significant up-to-date topics in probabilistic and statistical modeling as well as the analysis of massive data sets generated from modern biotechnology. These methods and technologies will change the perspectives of biology, healthcare, and medicine in the twenty-first century! This collection is an extended version of the previous edited handbook. The advanced research topics cover statistical methods for single-cell analysis, network analysis, and systems biology.

During the editing process of this handbook, the world has been upended by the massive influence of COVID-19 pandemic and other challenges. The editors would like to thank the contributing authors, Springer management team members, supporting colleagues and family members for their incredible support and patience during this challenging time period in order for this handbook to be made available to the related scholarly communities!

Hsinchu, Taiwan, ROC: Henry Horng-Shing Lu
Tübingen, Germany: Bernhard Schölkopf
Ithaca, NY, USA: Martin T. Wells
New Haven, CT, USA: Hongyu Zhao
May 8, 2022

Contents

Part I Single-Cell Analysis

Computational and Statistical Methods for Single-Cell RNA Sequencing Data
    Zuoheng Wang and Xiting Yan
Pre-processing, Dimension Reduction, and Clustering for Single-Cell RNA-seq Data
    Jialu Hu, Yiran Wang, Xiang Zhou, and Mengjie Chen
Integrative Analyses of Single-Cell Multi-Omics Data: A Review from a Statistical Perspective
    Zhixiang Lin
Approaches to Marker Gene Identification from Single-Cell RNA-Sequencing Data
    Ronnie Y. Li, Wenjing Ma, and Zhaohui S. Qin
Model-Based Clustering of Single-Cell Omics Data
    Xinjun Wang, Haoran Hu, and Wei Chen
Deep Learning Methods for Single-Cell Omics Data
    Jingshu Wang and Tianyu Chen

Part II Network Analysis

Probabilistic Graphical Models for Gene Regulatory Networks
    Zhenwei Zhou, Xiaoyu Zhang, Peitao Wu, and Ching-Ti Liu
Additive Conditional Independence for Large and Complex Biological Structures
    Kuang-Yao Lee, Bing Li, and Hongyu Zhao
Integration of Boolean and Bayesian Networks
    Meng-Yuan Tsai and Henry Horng-Shing Lu
Computational Methods for Identifying MicroRNA-Gene Regulatory Modules
    Yin Liu
Causal Inference in Biostatistics
    Shasha Han and Xiao-Hua Zhou
Bayesian Balance Mediation Analysis in Microbiome Studies
    Lu Huang and Hongzhe Li

Part III Systems Biology

Identifying Genetic Loci Associated with Complex Trait Variability
    Jiacheng Miao and Qiongshi Lu
Cell Type-Specific Analysis for High-throughput Data
    Ziyi Li and Hao Wu
Recent Development of Computational Methods in the Field of Epitranscriptomics
    Zijie Zhang, Shun Liu, Chuan He, and Mengjie Chen
Estimation of Tumor Immune Signatures from Transcriptomics Data
    Xiaoqing Yu
Cross-Linking Mass Spectrometry Data Analysis
    Chen Zhou and Weichuan Yu
Cis-regulatory Element Frequency Modules and their Phase Transition across Hominidae
    Lei M Li, Mengtian Li, and Liang Li
Improved Method for Rooting and Tip-Dating a Viral Phylogeny
    Xuhua Xia

Part I Single-Cell Analysis

Computational and Statistical Methods for Single-Cell RNA Sequencing Data

Zuoheng Wang and Xiting Yan

Abstract

In recent years, advances in droplet-based technology have boosted the popularity of using single-cell RNA sequencing (scRNA-seq) technology to investigate transcriptomic and cell population composition changes in various tissues and diseases. Despite the potential of these technologies in understanding disease pathogenesis and developing novel personalized therapeutics, analyses of the generated scRNA-seq data are challenging, mainly due to high noise level, prevalent dropout events, heterogeneous sources of variation confounding phenotype of interest, and so on. In this chapter, we introduce these challenges in analyses of scRNA-seq data and the corresponding computational and statistical methods developed to address them. The topics include data preprocessing, data normalization, dropout imputation, and differential expression analysis.

1 Introduction

Gene expression profiling measures levels of mRNA to understand transcriptomic changes due to disease, treatment, environment, time, and so on. Traditional bulk RNA gene expression profiling using microarrays and RNA sequencing pools RNAs from a large population of cells consisting of various and often unknown cell types. It measures the average expression profile in mixed cell populations with unknown contribution from different cells or cell types.
Thus, bulk RNA gene expression data is unable to precisely identify the cellular source of transcriptomic changes of interest, especially when high cell-to-cell heterogeneity exists [1–5]. To investigate transcriptomic changes at single-cell resolution, two major challenges exist including (1) isolating cells from each other without strong perturbations to cells that lead to systematic transcriptomic changes and (2) amplification of extremely low

Z. Wang · X. Yan
Yale University, New Haven, CT, USA
e-mail: [email protected]; [email protected]

© The Author(s), under exclusive license to Springer-Verlag GmbH, DE, part of Springer Nature 2022
H. H.-S. Lu et al. (eds.), Handbook of Statistical Bioinformatics, Springer Handbooks of Computational Statistics, https://doi.org/10.1007/978-3-662-65902-1_1
