Statistical Rethinking: A Bayesian Course with Examples in R and Stan (draft)

603 Pages·2020·39.683 MB·English

Statistical Rethinking
A Bayesian Course with Examples in R and Stan
Second Edition
Richard McElreath
This version compiled October 21, 2019

Contents

Preface to the Second Edition
Preface
    Audience
    Teaching strategy
    How to use this book
    Installing the rethinking R package
    Acknowledgments
Chapter 1. The Golem of Prague
    1.1. Statistical golems
    1.2. Statistical rethinking
    1.3. Tools for golem engineering
    1.4. Summary
Chapter 2. Small Worlds and Large Worlds
    2.1. The garden of forking data
    2.2. Building a model
    2.3. Components of the model
    2.4. Making the model go
    2.5. Summary
    2.6. Practice
Chapter 3. Sampling the Imaginary
    3.1. Sampling from a grid-approximate posterior
    3.2. Sampling to summarize
    3.3. Sampling to simulate prediction
    3.4. Summary
    3.5. Practice
Chapter 4. Geocentric Models
    4.1. Why normal distributions are normal
    4.2. A language for describing models
    4.3. Gaussian model of height
    4.4. Linear prediction
    4.5. Curves from lines
    4.6. Summary
    4.7. Practice
Chapter 5. The Many Variables & The Spurious Waffles
    5.1. Spurious association
    5.2. Masked relationship
    5.3. Categorical variables
    5.4. Summary
    5.5. Practice
Chapter 6. The Haunted DAG & The Causal Terror
    6.1. Multicollinearity
    6.2. Post-treatment bias
    6.3. Collider bias
    6.4. Confronting confounding
    6.5. Summary
    6.6. Practice
Chapter 7. Ulysses’ Compass
    7.1. The problem with parameters
    7.2. Entropy and accuracy
    7.3. Golem Taming: Regularization
    7.4. Predicting predictive accuracy
    7.5. Using cross-validation and information criteria
    7.6. Summary
    7.7. Practice
Chapter 8. Conditional Manatees
    8.1. Building an interaction
    8.2. Symmetry of interactions
    8.3. Continuous interactions
    8.4. Summary
    8.5. Practice
Chapter 9. Markov Chain Monte Carlo
    9.1. Good King Markov and His island kingdom
    9.2. Metropolis, Gibbs, and Sadness
    9.3. Hamiltonian Monte Carlo
    9.4. Easy HMC: ulam
    9.5. Care and feeding of your Markov chain
    9.6. Summary
    9.7. Practice
Chapter 10. Big Entropy and the Generalized Linear Model
    10.1. Maximum entropy
    10.2. Generalized linear models
    10.3. Maximum entropy priors
    10.4. Summary
Chapter 11. God Spiked the Integers
    11.1. Binomial regression
    11.2. Poisson regression
    11.3. Censoring and survival
    11.4. Summary
    11.5. Practice
Chapter 12. Monsters and Mixtures
    12.1. Over-dispersed outcomes
    12.2. Zero-inflated outcomes
    12.3. Ordered categorical outcomes
    12.4. Ordered categorical predictors
    12.5. Summary
    12.6. Practice
Chapter 13. Models With Memory
    13.1. Example: Multilevel tadpoles
    13.2. Varying effects and the underfitting/overfitting trade-off
    13.3. More than one type of cluster
    13.4. Divergent transitions and non-centered priors
    13.5. Multilevel posterior predictions
    13.6. Summary
    13.7. Practice
Chapter 14. Adventures in Covariance
    14.1. Varying slopes by construction
    14.2. Advanced varying slopes
    14.3. Instrumental variables and front doors
    14.4. Social relations as correlated varying effects
    14.5. Continuous categories and the Gaussian process
    14.6. Summary
    14.7. Practice
Chapter 15. Missing Data and Other Opportunities
    15.1. Measurement error
    15.2. Missing data
    15.3. Categorical errors and discrete absences
    15.4. Summary
    15.5. Practice
Chapter 16. Generalized Linear Madness
    16.1. Geometric people
    16.2. Hidden minds and observed behavior
    16.3. Ordinary differential nutcracking
    16.4. Population dynamics
    16.5. Summary
    16.6. Practice
Chapter 17. Horoscopes
Endnotes
Bibliography

Preface to the Second Edition

It came as a complete surprise to me that I wrote a statistics book. It is even more surprising how popular the book has become. But I had set out to write the statistics book that I wish I could have had in graduate school. No one should have to learn this stuff the way I did. I am glad there is an audience to benefit from the book. It consumed 5 years to write it.
There was an initial set of course notes, melted down and hammered into a first 200-page manuscript. I discarded that first manuscript. But it taught me the outline of the book I really wanted to write. Then several years of teaching with the manuscript further refined it.

Really I could have continued refining it every year. Going to press carries the penalty of freezing a dynamic process of both learning how to teach the material and keeping up with changes in the material. As time goes on, I see more elements of the book that I wish I had done differently. I’ve also received a lot of feedback on the book, and that feedback has given me ideas for improving it.

So in the second edition, I put those ideas into action. The goal with a second edition is only to refine the strategy that made the first edition a success. The major changes are:

The R package has some new tools. The map tool from the first edition is still here, but now it is named quap. This renaming is just to avoid misunderstanding. We just used it to get a quadratic approximation to the posterior. So now it is named as such. A bigger change is that map2stan has been replaced by ulam. The new ulam is very similar to map2stan, and in many cases can be used identically. But it is also much more flexible, mainly because it does not make any assumptions about GLM structure and allows explicit variable types within the formula list. All the map2stan code is still in the package and will continue to work. But now ulam allows for much more, especially in later chapters. Both of these tools allow sampling from the prior distribution, using extract.prior, as well as the posterior. This helps with the next change.

Much more prior predictive simulation. A prior predictive simulation means simulating predictions from a model, using only the prior distribution instead of the posterior distribution. This is very useful for understanding the implications of a prior. There was only a vestigial amount of this in the first edition. Now most modeling examples have some prior predictive simulation. I think this is the most useful addition to the second edition, since it helps so much with understanding not only priors but also the model itself.

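To make the idea of prior predictive simulation concrete, here is a minimal sketch in plain Python, not the book's R code and not the rethinking package's extract.prior. The model (a linear regression of height on centered weight), the priors, and the function name prior_predictive are all hypothetical choices made for illustration. The sketch draws parameters from the priors only, pushes each draw through the model, and then checks how often the simulated outcomes are physically implausible.

```python
import random


def prior_predictive(n_sims, x_values, seed=0):
    """Simulate outcomes using only the priors; no data are used for fitting."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sims):
        # Draw one set of parameters from the (hypothetical) priors.
        a = rng.gauss(170.0, 20.0)      # intercept: height in cm at average weight
        b = rng.gauss(0.0, 10.0)        # slope: cm per kg of centered weight
        sigma = rng.uniform(0.0, 50.0)  # residual standard deviation in cm
        # Push the draw through the model to get one simulated outcome per x.
        sims.append([rng.gauss(a + b * x, sigma) for x in x_values])
    return sims


# Centered weights (kg) at which to simulate heights; assumed values.
x_values = [-10.0, 0.0, 10.0]
sims = prior_predictive(1000, x_values)

# Share of simulated heights that are impossible: negative, or above 272 cm
# (roughly the tallest recorded human height).
flat = [y for row in sims for y in row]
implausible = sum(1 for y in flat if y < 0.0 or y > 272.0) / len(flat)
print(f"share of implausible simulated heights: {implausible:.3f}")
```

If a noticeable share of simulated heights is impossible before any data are seen, that suggests the priors are looser than what is actually known, which is the kind of implication such a simulation is meant to expose.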
More emphasis on the distinction between prediction and inference. Chapter 5, the chapter on multiple regression, has been split into two chapters. The first chapter focuses on helpful aspects of regression. The second focuses on ways that it can mislead. This allows as well a more direct discussion of causal inference. This means that DAGs (directed acyclic graphs) make an appearance. The chapter on overfitting, Chapter 7 now, is also more direct in cautioning about the predictive nature of information criteria and cross-validation. Cross-validation and importance sampling approximations of it (PSIS-LOO) are now discussed explicitly.

New model types. Chapter 4 now ends with B-splines. The chapter on count models, Chapter 11, now includes an item-response (factor analytic) example. Chapter 12 contains a survival analysis with censoring. Chapter 14 has an example of a phylogenetic distance regression. And there is an entirely new chapter, Chapter 16, that focuses on models that are not easily conceived of as GLMMs.

Some new data examples. There are some new data examples, such as the Japanese cherry blossoms historical time series and a larger primate evolution data set with 300 species and a matching phylogeny.

More presentation of raw Stan models. There are several places now where raw Stan model code is explained, inside optional boxes. I hope this makes a transition to working directly in Stan easier. But the main text remains R script, using the rethinking package’s teaching tools.

Kindness and persistence. As in the first edition, I have tried to make the material as kind as possible. None of this stuff is easy, and the journey into understanding is long and haunted. It is important that readers expect that confusion and partial understanding are normal. This is also the reason that I have not changed the basic modeling strategy in the book.

First, I force the reader to explicitly specify every assumption of the model. Some readers of the first edition lobbied me to use simplified formula tools like brms or rstanarm. Those are fantastic packages, and graduating to use them after this book is recommended.
But I don’t see how a person can come to understand the model when using those tools. The priors being hidden isn’t the most limiting part. Instead, since linear model formulas like y ~ (1|x) + z don’t show the parameters, nor even all of the terms, it is not easy to see how the mathematical model relates to the code. It is ultimately kinder to be a bit cruel and require more work. So the formula lists remain. In this book, you are programming the log-posterior, down to the exact relationship between each variable and coefficient. You’ll thank me later.

Second, half the book goes by before MCMC appears. Some readers of the first edition wanted me to start instead with MCMC. I do not do this because Bayes is not about MCMC. We seek the posterior distribution, but there are many legitimate approximations of it, and MCMC is just one set of strategies. Using quadratic approximation in the first half also allows a clearer tie to non-Bayesian algorithms. And since finding the quadratic approximation is fast, it means readers don’t have to struggle with too many things at once. Again, it is about being kind.

Richard McElreath
Leipzig, 10 August 2019

Preface

    Masons, when they start upon a building,
    Are careful to test out the scaffolding;
    Make sure that planks won’t slip at busy points,
    Secure all ladders, tighten bolted joints.
    And yet all this comes down when the job’s done
    Showing off walls of sure and solid stone.
    So if, my dear, there sometimes seem to be
    Old bridges breaking between you and me
    Never fear. We may let the scaffolds fall
    Confident that we have built our wall.

    (“Scaffolding” by Seamus Heaney, 1939–2013)

This book means to help you raise your knowledge of and confidence in statistical modeling. It is meant as a scaffold, one that will allow you to construct the wall that you need, even though you will discard it afterwards. As a result, this book teaches the material in often inconvenient fashion, forcing you to perform step-by-step calculations that are usually automated. The reason for all the algorithmic fuss is to ensure that you understand enough of the details to make reasonable choices and interpretations in your own modeling work.
So although you will move on to use more automation, it’s important to take things slow at first. Put up your wall, and then let the scaffolding fall.

Audience

The principal audience is researchers in the natural and social sciences, whether new PhD students or seasoned professionals, who have had a basic course on regression but nevertheless remain uneasy about statistical modeling. This audience accepts that there is something vaguely wrong about typical statistical practice in the early 21st century, dominated as it is by p-values and a confusing menagerie of testing procedures. They see alternative methods in journals and books. But these people are not sure where to go to learn about these methods.

As a consequence, this book doesn’t really argue against p-values and the like. The problem in my opinion isn’t so much p-values as the set of odd rituals that have evolved around them, in the wilds of the sciences, as well as the exclusion of so many other useful tools. So the book assumes the reader is ready to try doing statistical inference without p-values. This isn’t the ideal situation. It would be better to have material that helps you spot common mistakes and misunderstandings of p-values and tests in general, as all of us have to understand such things, even if we don’t use them. So I’ve tried to sneak in a little material of that kind, but unfortunately cannot devote much space to it. The book would be too long, and it would disrupt the teaching flow of the material.

It’s important to realize, however, that the disregard paid to p-values is not a uniquely Bayesian attitude. Indeed, significance testing can be, and has been, formulated as a Bayesian procedure as well. So the choice to avoid significance testing is stimulated instead by epistemological concerns, some of which are briefly discussed in the first chapter.

Teaching strategy

The book uses much more computer code than formal mathematics. Even excellent mathematicians can have trouble understanding an approach, until they see a working algorithm. This is because implementation in code form removes all ambiguities. So material of this sort is easier to learn, if you also learn how to implement it.
In addition to any pedagogical value of presenting code, so much of statistics is now computational that a purely mathematical approach is anyways insufficient. As you’ll see in later parts of this book, the same mathematical statistical model can sometimes be implemented in different ways, and the differences matter. So when you move beyond this book to more advanced or specialized statistical modeling, the computational emphasis here will help you recognize and cope with all manner of practical troubles.

Every section of the book is really just the tip of an iceberg. I’ve made no attempt to be exhaustive. Rather I’ve tried to explain something well. In this attempt, I’ve woven a lot of concepts and material into data analysis examples. So instead of having traditional units on, for example, centering predictor variables, I’ve developed those concepts in the context of a narrative about data analysis. This is certainly not a style that works for all readers. But it has worked for a lot of my students. I suspect it fails dramatically for those who are being forced to learn this information. For the internally motivated, it reflects how we really learn these skills in the context of our research.

How to use this book

This book is not a reference, but a course. It doesn’t try to support random access. Rather, it expects sequential access. This has immense pedagogical advantages, but it has the disadvantage of violating how most scientists actually read books.

This book has a lot of code in it, integrated fully into the main text. The reason for this is that doing model-based statistics in the 21st century really requires programming, of at least a minor sort. The code is not optional. Everyplace, I have erred on the side of including too much code, rather than too little. In my experience teaching scientific programming, novices learn more quickly when they have working code to modify, rather than needing to write an algorithm from scratch. My generation was probably the last to have to learn some programming to use a computer, and so coding has gotten harder and harder to teach as time goes on. My students are very computer literate, but they sometimes have no idea what computer code looks like.
