
Nonparametric Bayesian Inference in Biostatistics PDF

448 Pages·2015·9.991 MB·English

Preview Nonparametric Bayesian Inference in Biostatistics

Frontiers in Probability and the Statistical Sciences

Riten Mitra and Peter Müller, Editors

Nonparametric Bayesian Inference in Biostatistics

Editor-in-Chief:
Somnath Datta, Department of Bioinformatics & Biostatistics, University of Louisville, Louisville, Kentucky, USA

Series Editors:
Frederi G. Viens, Department of Mathematics & Department of Statistics, Purdue University, West Lafayette, Indiana, USA
Dimitris N. Politis, Department of Mathematics, University of California, San Diego, La Jolla, California, USA
Hannu Oja, Department of Mathematics and Statistics, University of Turku, Turku, Finland
Michael Daniels, Section of Integrative Biology, Division of Statistics & Scientific Computation, University of Texas, Austin, Texas, USA

More information about this series at http://www.springer.com/series/11957

Editors
Riten Mitra, Department of Bioinformatics and Biostatistics, University of Louisville, Louisville, KY, USA
Peter Müller, Department of Mathematics, University of Texas, Austin, TX, USA

ISBN 978-3-319-19517-9
ISBN 978-3-319-19518-6 (eBook)
DOI 10.1007/978-3-319-19518-6
Library of Congress Control Number: 2015945621
Springer Cham Heidelberg New York Dordrecht London
© Springer International Publishing Switzerland 2015

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper

Springer International Publishing AG Switzerland is part of Springer Science+Business Media (www.springer.com)

Preface

Nonparametric Bayesian (BNP) approaches are becoming increasingly common in biostatistical inference. Many problems involve an abundance of data that allows the use of more flexible and complex probability models beyond traditional parametric families. One of the most traditional application areas for BNP is survival analysis, including in particular survival regression. The nature of the recorded outcomes makes it natural to target inference on an entire unknown distribution, rather than focus on just a mean function. Many more recent applications of BNP in biostatistics and bioinformatics involve inference on unknown partitions. For example, this could be an arrangement of patients into clinically meaningful subpopulations. This volume covers these and some more applications of BNP in biomedical inference problems. The intention of this book is to provide a good review of and introduction to related application areas.

Part I starts with two introductory chapters. Chapter 1 provides a brief review of the most commonly used basic BNP models, including the Dirichlet process, Dirichlet process mixtures, the dependent Dirichlet process, Polya trees and Gaussian processes. These models and related variations act as the workhorses of BNP methods in biostatistics applications. Chapter 2 discusses some related examples, spanning a wide range of applications.
Part II includes several chapters that consider BNP methods for inference with genomic data, including gene expression, mutations, copy number aberrations and more. Many chapters in this part involve a notion of clustering biomolecular entities, e.g. genes, cells, genomic locations or proteins. The inferred clusters provide insights into the underlying biological structure and functionality of these entities. Another major feature of most chapters in this part is that the data are obtained from state-of-the-art experimental platforms, e.g. next generation sequencing counts in Chapter 4. Although the exact form of the raw data varies across chapters, all chapters are perfect illustrations of the ubiquitous applicability of the common BNP priors in bioinformatics. For example, Chapter 3 discusses curve clustering methods, focusing on Chinese restaurant process priors to cluster proteins based on their shapes. The inference is centered around an inner product matrix that is built using a special metric on the shape space. In this way, it efficiently reduces the computational scale of the problem from infinite-dimensional curves to clustering patterns in a finite matrix. Chapter 4 uses a variation of the Indian buffet process prior to address the challenging issue of estimating tumor heterogeneity. Inferring subclones of tumor cells based on genomic profile has important implications in genomics and cancer research. Chapters 9 and 6 deal with clustering not as a goal in itself but, interestingly, as an efficient tool for dimension reduction. Chapter 9 clusters a large number of genes based on their expression and eventually uses these clusters to build regression models for predicting severity of multiple myeloma. Chapter 6 addresses the classical problem of SNP disease association through a normalized generalized gamma (NGG) process. This is one of the first applications of this prior to large scale biostatistical inference. A thorough review of Bayesian network models can be found in Chapter 8.
In addition, the chapter introduces novel and flexible semi-parametric extensions of current approaches to inference for high dimensional gene networks. Chapter 7 reviews the recent foray of BNP priors into the classical field of population genetics. The chapter discusses the problem of detecting population clusters based on allele frequencies using hierarchical DP priors. Population admixture models are not only relevant for understanding the pattern of human migration and evolution, but are also critical to account for confounding in gene association studies. Finally, we want readers to note that all clustering priors discussed in this part are related to exchangeable distributions. That is, they assume that there is no natural ordering of the variables, which restricts their straightforward application to segmentation problems where time or, say, genomic locations can be of great importance. Chapter 5 discusses an elegant way to circumvent this issue by generalizing an exchangeable prior through latent variables. The method is applied to detect regions of copy number variations in the genome and is compared with standard segmentation methods like hidden Markov models.

Chapters in Part III discuss applications of BNP to survival analysis. Inference for survival data was one of the first problems that motivated the early BNP literature. This is the case because for event time data it is natural to focus on the entire distribution, including all detailed features, rather than just location and scale. Chapter 10 motivates Markov processes as a flexible class of priors on hazard rates. It discusses how such priors can be easily adapted to cure rate models and complex multivariate settings, like competing risks and recurrent events. Chapter 11 links the relevant priors for baseline hazard rates with the standard parametric models to provide a comprehensive review of some common semi-parametric approaches in the literature. The chapter concludes with a discussion on spatially correlated survival data. Chapter 12 introduces a fully nonparametric prior to address the challenging and complex issues of interval censoring and misclassification.
Part IV groups together chapters that include a notion of modeling some random function (or response surface). Chapter 13 uses Gaussian process (GP) priors to model the temporal evolution of the firing patterns of a group of neurons. The two main themes of this chapter are adapting GP priors to signaling data and multivariate extensions to capture complex spatio-temporal effects. Next, in Chapter 14 we find an extensive review of general curve registration techniques that can account for phase variability in functional data. These approaches are critical to many public health applications, as illustrated in their application to growth data and pharmacokinetics. Chapter 15 delivers yet another flavor of BNP priors in the context of new age adaptive clinical trials. The chapter focuses on a flexible modeling approach for a clinical outcome over a large covariate space, where the covariates are biomarkers. The complexity of the related response surface is particularly relevant for clinical trial settings where disease status and progression are related to higher order interactions between biomarkers. Part IV concludes with a review of non-parametric modeling of ROC curves in Chapter 16. ROC curves are ubiquitous in classification problems, especially in medical diagnostic tests. While providing another illustration of the utility of the DPM prior, this chapter also provides a lucid description of Bayesian bootstrap methods.

Inference for spatio-temporal data gives rise to a specific set of challenges and modeling needs. Part V discusses such models in considerable detail. Chapter 17 starts by providing an in-depth theoretical introduction to Gaussian processes and other BNP priors for spatial data.
This review starts from semi-parametric models and gradually builds up to fully non-parametric methods. Chapter 18 discusses two specific priors for spatial data. One of the highlights of this chapter is a clever method of adapting conventional product partition priors to the spatial context. The final chapter in this part, Chapter 19, deals with the specialized problem of detecting boundaries in spatial data. The chapter discusses some specific BNP priors for this problem. In addition, it formulates this problem in the framework of Bayesian multiple hypothesis testing, advocating ways to extract multiplicity-adjusted posterior probabilities.

Increasingly strict ethical standards and the increasing cost and complexity of clinical studies make it ever more difficult to carry out randomized studies. This leads to an increased use of non-randomized data. Deriving meaningful conclusions from such data requires adjustment for the lack of randomization by techniques known as “causal inference.” Many of these methods deal with missing data as well. Chapters in Part VI discuss this class of methods, including a BNP prior for random discontinuation designs in Chapter 20 and BNP priors for inference with missing data in Chapter 21.

Finally, besides popularizing the use of BNP methods in biostatistics and bioinformatics, this volume supports BNP research by donating any royalties to the International Society for Bayesian Analysis (ISBA) in support of travel awards for young researchers for upcoming meetings of the biennial workshop in nonparametric Bayesian inference.

Louisville, KY, USA: Riten Mitra
Austin, TX, USA: Peter Müller

Contents

Part I Introduction

1 Bayesian Nonparametric Models
Peter Müller and Riten Mitra

2 Bayesian Nonparametric Biostatistics
Wesley O. Johnson and Miguel de Carvalho

Part II Genomics and Proteomics

3 Bayesian Shape Clustering
Zhengwu Zhang, Debdeep Pati, and Anuj Srivastava

4 Estimating Latent Cell Subpopulations with Bayesian Feature Allocation Models
Yuan Ji, Subhajit Sengupta, Juhee Lee, Peter Müller, and Kamalakar Gulukota

5 Species Sampling Priors for Modeling Dependence: An Application to the Detection of Chromosomal Aberrations
Federico Bassetti, Fabrizio Leisen, Edoardo Airoldi, and Michele Guindani

6 Modeling the Association Between Clusters of SNPs and Disease Responses
Raffaele Argiento, Alessandra Guglielmi, Chuhsing Kate Hsiao, Fabrizio Ruggeri, and Charlotte Wang

7 Bayesian Inference on Population Structure: From Parametric to Nonparametric Modeling
Maria De Iorio, Stefano Favaro, and Yee Whye Teh
