Principles of Statistical Analysis
This compact course is written for the mathematically literate reader who wants to
learn to analyze data in a principled fashion. The language of mathematics enables
clear exposition that can go quite deep, quite quickly, and naturally supports an
axiomatic and inductive approach to data analysis. Starting with a good grounding in
probability, the reader moves to statistical inference via topics of great practical
importance – simulation and sampling, as well as experimental design and data
collection – that are typically displaced from introductory accounts. The core of the
book then covers both standard methods and such advanced topics as multiple testing,
meta-analysis, and causal inference.
ERY ARIAS-CASTRO is a professor in the Department of Mathematics and in the
Halıcıoğlu Data Science Institute at the University of California, San Diego, where he
specializes in theoretical statistics and machine learning. His education includes a
bachelor’s degree in mathematics and a master’s degree in artificial intelligence, both
from École Normale Supérieure de Cachan (now École Normale Supérieure
Paris-Saclay) in France, as well as a Ph.D. in statistics from Stanford University in the
United States.
Published online by Cambridge University Press
“With the rapid development of data-driven decision making, statistical methods have
become indispensable in countless domains of science, engineering, and management
science, to name a few. Ery Arias-Castro’s excellent text gives a self-contained
and remarkably broad exposition of the current diversity of concepts and methods
developed to tackle the challenges of data science. Simply put, everyone serious about
understanding the theory behind data science should be exposed to the topics covered
in this book.”
– Philippe Rigollet, Professor
Department of Mathematics, Massachusetts Institute of Technology
“A course on statistical modeling and inference has been a staple of many first-year
graduate engineering programs. While there are many excellent textbooks on this
subject, much of the material is inspired by models of physical systems, and as such
these books deal extensively with parametric inference. The emerging data revolution,
on the other hand, requires an engineering student to develop an understanding of
statistical inference rooted in problems inspired by data-driven applications, and this
book fills that need. Arias-Castro weaves together diverse concepts such as data
collection, sampling, and inference in a unified manner. He lucidly presents the
mathematical foundations of statistical data analysis, and covers advanced topics on
data analysis. With over 700 problems and computer exercises, this book will serve the
needs of beginner and advanced engineering students alike.”
– Venkatesh Saligrama, Professor
Data Science Faculty Fellow, Department of Electrical and Computer Engineering,
Department of Computer Science (by courtesy), Boston University
“In this book, aimed at senior undergraduates or beginning graduate students with
a reasonable mathematical background, the author proposes a self-contained and
yet concise introduction to statistical analysis. By putting a strong emphasis on the
randomization principle, he provides a coherent and elegant perspective on modern
statistical practice. Some of the later chapters also form a good basis for a reading
group. I will be recommending this excellent book to my collaborators.”
– Nicolas Verzelen, Associate Professor
Mathematics, Computer Science, Physics, and Systems Department,
University of Montpellier
“This text is highly recommended for undergraduate students wanting to grasp the key
ideas of modern data analysis. Arias-Castro achieves something that is rare in the art
of teaching statistical science – he uses mathematical language in an intelligible and
highly helpful way, without surrendering key intuitions of statistics to formalism and
proof. In this way, the reader can get through an impressive amount of material without,
however, ever getting into muddy waters.”
– Richard Nickl, Professor
Statistical Laboratory, Cambridge University
INSTITUTE OF MATHEMATICAL STATISTICS
TEXTBOOKS
Editorial Board
Nancy Reid (University of Toronto)
John Aston (University of Cambridge)
Arnaud Doucet (University of Oxford)
Ramon van Handel (Princeton University)
ISBA Editorial Representative
Peter Müller (University of Texas at Austin)
IMS Textbooks give introductory accounts of topics of current concern suitable for
advanced courses at master’s level, for doctoral students and for individual study.
They are typically shorter than a fully developed textbook, often arising from material
created for a topical course. Lengths of 100–290 pages are envisaged. The books
typically contain exercises.
In collaboration with the International Society for Bayesian Analysis (ISBA),
selected volumes in the IMS Textbooks series carry the “with ISBA” designation at
the recommendation of the ISBA editorial representative.
Other Books in the Series (*with ISBA)
1. Probability on Graphs, by Geoffrey Grimmett
2. Stochastic Networks, by Frank Kelly and Elena Yudovina
3. Bayesian Filtering and Smoothing, by Simo Särkkä
4. The Surprising Mathematics of Longest Increasing Subsequences, by Dan Romik
5. Noise Sensitivity of Boolean Functions and Percolation, by Christophe Garban and Jeffrey E. Steif
6. Core Statistics, by Simon N. Wood
7. Lectures on the Poisson Process, by Günter Last and Mathew Penrose
8. Probability on Graphs (Second Edition), by Geoffrey Grimmett
9. Introduction to Malliavin Calculus, by David Nualart and Eulàlia Nualart
10. Applied Stochastic Differential Equations, by Simo Särkkä and Arno Solin
11. *Computational Bayesian Statistics, by M. Antónia Amaral Turkman, Carlos Daniel Paulino, and Peter Müller
12. Statistical Modelling by Exponential Families, by Rolf Sundberg
13. Two-Dimensional Random Walk: From Path Counting to Random Interlacements, by Serguei Popov
14. Scheduling and Control of Queueing Networks, by Gideon Weiss
Principles of Statistical Analysis
Learning from Randomized Experiments
ERY ARIAS-CASTRO
University of California, San Diego
University Printing House, Cambridge CB2 8BS, United Kingdom
One Liberty Plaza, 20th Floor, New York, NY 10006, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
314–321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre,
New Delhi – 110025, India
103 Penang Road, #05–06/07, Visioncrest Commercial, Singapore 238467
Cambridge University Press is part of the University of Cambridge.
It furthers the University’s mission by disseminating knowledge in the pursuit of
education, learning, and research at the highest international levels of excellence.
www.cambridge.org
Information on this title: www.cambridge.org/9781108489676
DOI: 10.1017/9781108779197
© Ery Arias-Castro 2022
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2022
A catalogue record for this publication is available from the British Library.
ISBN 978-1-108-48967-6 Hardback
ISBN 978-1-108-74744-8 Paperback
Cambridge University Press has no responsibility for the persistence or accuracy of
URLs for external or third-party internet websites referred to in this publication
and does not guarantee that any content on such websites is, or will remain,
accurate or appropriate.
I would like to dedicate this book to some professors that have, along the
way, inspired, supported, and mentored me in my studies and academic
career, and to whom I am eternally grateful:
David L. Donoho
my doctoral thesis advisor
Persi Diaconis
my first co-author on a research article
Yves Meyer
my master’s thesis advisor
Robert Azencott
my undergraduate thesis advisor
In controlled experimentation it has been found not difficult to introduce
explicit and objective randomization in such a way that the tests of
significance are demonstrably correct. In other cases we must still act
in the faith that Nature has done the randomization for us. [...] We now
recognize randomization as a postulate necessary to the validity of our
conclusions, and the modern experimenter is careful to make sure that this
postulate is justified.
Ronald A. Fisher
International Statistical Conferences, 1947
Contents
Preface xiv
Acknowledgements xvii
Part I Elements of Probability Theory
1 Axioms of Probability Theory 3
1.1 Elements of Set Theory 3
1.2 Outcomes and Events 5
1.3 Probability Axioms 8
1.4 Inclusion-Exclusion Formula 10
1.5 Conditional Probability and Independence 12
1.6 Additional Problems 17
2 Discrete Probability Spaces 19
2.1 Probability Mass Functions 19
2.2 Uniform Distributions 20
2.3 Bernoulli Trials 21
2.4 Urn Models 25
2.5 Further Topics 29
2.6 Additional Problems 31
3 Distributions on the Real Line 34
3.1 Random Variables 34
3.2 Borel σ-Algebra 34
3.3 Distributions on the Real Line 36
3.4 Distribution Function 36
3.5 Survival Function 38
3.6 Quantile Function 39