Mathematical Statistics with Applications in R Third Edition Kandethody M. Ramachandran Professor of Mathematics and Statistics University of South Florida Tampa, Florida Chris P. Tsokos Distinguished University Professor of Mathematics and Statistics University of South Florida Tampa, Florida Academic PressisanimprintofElsevier 125London Wall,LondonEC2Y5AS,UnitedKingdom 525BStreet,Suite1650,SanDiego,CA92101,UnitedStates 50HampshireStreet,5thFloor,Cambridge,MA02139,UnitedStates TheBoulevard,Langford Lane,Kidlington,OxfordOX5 1GB,UnitedKingdom Copyright©2021ElsevierInc.Allrightsreserved. Nopart ofthispublicationmay bereproduced ortransmitted inanyform orbyanymeans, electronicor mechanical,including photocopying, recording,oranyinformation storageandretrieval system,withoutpermission inwritingfromthePublisher. Details onhowtoseek permission, furtherinformation aboutthePublisher’spermissions policiesandourarrangements with organizations suchastheCopyrightClearanceCenterandtheCopyrightLicensing Agency,canbefoundatourwebsite: www.elsevier.com/permissions. Thisbookandtheindividual contributionscontainedinitareprotected undercopyrightbythePublisher (otherthanasmay benotedherein). Notices Knowledgeandbestpracticeinthisfieldareconstantlychanging. As newresearchandexperiencebroadenourunderstanding, changesinresearch methods,professional practices,ormedical treatmentmay becomenecessary. Practitionersandresearchers mustalwaysrelyontheir ownexperience andknowledgeinevaluatingandusingany information,methods,compounds,orexperiments describedherein. Inusingsuchinformation ormethodsthey shouldbe mindfuloftheirown safetyandthesafetyofothers,including partiesforwhom theyhaveaprofessional responsibility. Tothefullestextentofthelaw,neither northePublisher,northeauthors, contributors, oreditors,assume anyliabilityforany injuryand/ordamagetopersonsorpropertyasamatterofproductsliability,negligence orotherwise,or fromanyuseor operation ofanymethods,products, instructions,or ideascontainedinthematerialherein. LibraryofCongressCataloging-in-Publication Data Acatalogrecord forthisbook isavailablefromtheLibrary ofCongress BritishLibraryCataloguing-in-Publication Data Acataloguerecord forthisbook isavailablefromtheBritishLibrary ISBN:978-0-12-817815-7 Forinformation onallAcademic Presspublications visitourwebsite at https://www.elsevier.com/books-and-journals Publisher:Katey Birtcher AcquisitionEditor: Katey Birtcher EditorialProjectManager:Peter J.Llewellyn/Danielle McLean ProductionProjectManager:Beula Christopher CoverDesigner: BrianSalisbury TypesetbyTNQTechnologies Dedication Dedicated to our families: Usha, Vikas, Vilas, and Varsha Ramachandran and Debbie, Matthew, Jonathan, and Maria Tsokos Acknowledgments Weexpressoursincereappreciationtoourlatecolleague,coworker,anddearfriend,ProfessorA.N.V.Rao,forhishelpful suggestionsandideasfortheinitialversionofthesubjecttextbook.Inaddition,wethankBong-jinChoiandYongXufor theirkindassistanceinthepreparationofthefirsteditionofthebook.Wewouldliketothankthefollowingfortheirhelpin the preparation of second edition: A.K.M.R. Bashar, Jason Burgess, Doo Young Kim, Taysseer Sharaf, Bhikhari Tharu, Ram Kafle, Dr. Rebecca Wooten, and Dr. Olga Savchuk. For this edition, we would like to give a special thanks to Dr. OlgaSavchukforhermanycorrectionstothesecondedition,andtoMr.JayantaPokharelforwritingthesolutionmanual. WealsowouldliketothankallthosewhocommentedonourbookontheInternetsitessuchasAmazonandGoogle.We acknowledgeourstudents attheUniversityofSouth Florida fortheirusefulcomments throughtheyears. Toallofthem, we are very thankful. Finally, we would thank the entire Elsevier team for putting together this edition as well as the previous editions. Kandethody M. Ramachandran Chris P. Tsokos Tampa, Florida xiii About the authors Kandethody M. Ramachandran is a Professor of Mathematics and Statistics at the University of South Florida. He receivedhisBSandMSdegreesinmathematicsfromCalicutUniversity,India.Later,heworkedasaresearcherattheTata Institute of Fundamental Research, Bangalore Center, at its Applied Mathematics Division. Professor Ramachandran got his PhD in applied mathematics from Brown University. Hisresearchinterestsareconcentratedintheareasofappliedprobabilityandstatistics.Hisresearchpublicationsspana variety of areas, such as control of heavy traffic queues, stochastic delay equations and control problems, stochastic differential games and applications, reinforcement learning methods applied to game theory and other areas, software reliability problems, applications of statistical methods to microarray data analysis, mathematical finance, and various machine learning applications. He is also coauthor with Chris Tsokos of a book titled Stochastic Differential Games Theory and Applications, Atlantis Press. Professor Ramachandranisextensivelyinvolved inactivities toimprove statisticsandmathematics education.He isa recipient of the Teaching Incentive Program Award at the University of South Florida. He is a member of the MEME Collaborative, which is a partnership among mathematics education, mathematics, and engineering faculty to address issues related to mathematics and mathematics education. He was also involved in the calculus reform efforts at the UniversityofSouthFlorida. He istherecipient ofa$2milliongrant from theNSF,astheprincipalinvestigator, andisa co-principal investigator on a Howard Hughes Medical Institute grant of 1.2 million to improve STEM (science, tech- nology, engineering, and mathematics) education at the University of South Florida. ChrisP.TsokosisaDistinguishedUniversityProfessorofMathematicsandStatisticsattheUniversityofSouthFlorida. ProfessorTsokosreceivedhisBSinengineeringsciences/mathematicsandhisMAinmathematicsfromtheUniversityof Rhode Island and his PhD in statistics and probability from the University of Connecticut. Professor Tsokos has also served on the faculties at Virginia Polytechnic Institute and State University and the University of Rhode Island. Professor Tsokos’s research has extended into a variety of areas, including stochastic systems, statistical models, reliabilityanalysis,ecologicalsystems,operationsresearch,timeseries,Bayesiananalysis,andmathematicalandstatistical modeling of global warming, among others. He is the author of more than 400 research publications in these areas. ProfessorTsokosistheauthorofmorethan25researchmonographsandbooksinmathematicalandstatisticalsciences. He has been invited to lecture in several countries around the globe: Russia, People’s Republic of China, India, Turkey, andmostEUcountries,amongothers.ProfessorTsokoshasmentoredanddirectedthedoctoralresearchofmorethan65 students who are currently employed at our universities and private and government research institutes. ProfessorTsokosisamemberofseveralacademicandprofessionalsocieties.Heisservingasanhonoraryeditor,chief editor,editor,orassociateeditorofmorethan15internationalacademicresearchjournals.ProfessorTsokosistherecipient of many distinguished awards and honors, including Fellow of the American Statistical Association, USF Distinguished Scholar Award, Sigma Xi Outstanding Research Award, USF Outstanding Undergraduate Teaching Award, USF Pro- fessionalExcellenceAwardoftheUniversityAreaCommunityDevelopmentCorporation,andtheTimeWarnerSpiritof Humanity Award, among others. xv Preface Preface to the Third Edition Inthethirdedition,althoughwehavemadesomesignificantchanges,wehavemuchofthematerialofthesecondedition. Wehavecombinedthegoodness-of-fitchapterwiththatofcategoricaldatatocreateanewchapteroncategoricaldata.We haveaddedseveralnew,real-worldexamplesandexercises.WehaveexpandedthestudyofBayesiananalysistoinclude empirical Bayes. In the empirical Bayes approach, we emphasize bootstrapping and jackknifing resampling methods to estimate the prior probability density function. This new approach is illustrated by several examples and exercises. We haveexpandedthechapteronstatisticalapplicationstoincludesomerealsignificantlyimportantproblemsthatourglobal society isfacing in global warming,brain cancer, prostate cancer, hurricanes, rainfall, andunemployment, among others. In addition, we have made several corrections that were discovered in the previous edition. Throughout the third edition, we have incorporated the R-codes that will assist the student in performing statistical analysis. A solution manual for all exercises in the third edition has been developed and published for the convenience of teachers and students. Preface to the Second Edition In the second edition, while keeping much of the material from the first edition, there are some significant changes and additions.DuetothepopularityofRanditsfreeavailability,wehaveincorporatedR-codesthroughoutthebook.Thiswill make it easier for students to do the data analysis. We have also added a chapter on goodness-of-fit tests and illustrated theirapplicabilitywithseveralexamples.Inaddition,wehaveintroducedmoreprobabilitydistributionfunctionswithreal- worlddata-drivenapplicationsinglobalwarming,brainandprostatecancer,nationalunemployment,andtotalrainfall.In this edition, we have shortened the point estimation chapter and merged it with interval estimation. In addition, many corrections and additions are made to reflect the continuous feedback we have obtained. Wehavecreatedastudentcompanionwebsite,http://booksite.elsevier.com/9780124171138,withsolutionstoselected problemsanddataonglobalwarming,brainandprostatecancer,nationalunemployment,andtotalrainfall.Wehavealso posted solutions to most of the problems in the instructor site, http://textbooks.elsevier.com/web/Manuals.aspx?isbn1/ 49780124171138. Preface to the First Edition This textbook is of an interdisciplinary nature and is designed for a one- or two-semester course in probability and sta- tistics, with basic calculus as a prerequisite. The book is primarily written to give a sound theoretical introduction to statisticswhileemphasizingapplications.Ifteachingstatisticsisthemainpurposeofatwo-semestercourseinprobability andstatistics,thistextbookcoversalltheprobabilityconceptsnecessaryforthetheoreticaldevelopmentofstatisticsintwo chapters,andgoesontocoverallmajoraspectsofstatisticaltheoryintwosemesters,insteadofonlyaportionofstatistical concepts. Whatismore,usingtheoptionalsectiononcomputerexamplesattheendofeachchapter,thestudentcanalso simultaneouslylearntoutilizestatisticalsoftwarepackagesfordataanalysis.Itisouraim,withoutsacrificinganyrigor,to encouragestudentstoapplythetheoreticalconceptstheyhavelearned.Therearemanyexamplesandexercisesconcerning diverse application areas that will show the pertinence of statistical methodology to solving real-world problems. The exampleswithstatisticalsoftwareandprojectsattheendofthechapterswillprovidegoodperspectiveontheusefulnessof statistical methods. Tointroduce thestudentstomodernandincreasinglypopularstatistical methods,wehaveintroduced separate chapters on Bayesian analysis and empirical methods. Oneofthemainaimsofthisbookistoprepareadvancedundergraduatesandbeginninggraduatestudentsinthetheory ofstatisticswithemphasisoninterdisciplinaryapplications.Theaudienceforthiscourseisregularfull-timestudentsfrom mathematics,statistics,engineering,physicalsciences,business,socialsciences,materialsscience,andsoforth.Also,this xvii xviii Preface textbook is suitable for people who work in industry and in education as a reference book on introductory statistics for a good theoretical foundation with clear indication of how to use statistical methods. Traditionally, one of the main pre- requisites for this course is a semester of the introduction to probability theory. A working knowledge of elementary (descriptive)statisticsisalsoamust.Inschoolswherethereisnostatisticsmajor,imposingsuchabackground,inaddition tocalculussequence,isverydifficult.Mostofthepresentbooksavailableonthissubjectcontainfullone-semestermaterial forprobabilityandthen,basedonthoseresults,continueontothetopicsinstatistics.Also,someofthesebooksincludein their subject matter only the theory of statistics, whereas others take the cookbook approach of covering the mechanics. Thus,evenwithtwofullsemestersofwork,manybasicandimportantconceptsinstatisticsarenevercovered.Thisbook hasbeenwrittentoremedythisproblem.Wefusetogetherbothconceptsinorderforthestudenttogainknowledgeofthe theory and at the same time develop the expertise to use their knowledge in real-world situations. Although statistics is a very applied subject, there is no denying that it is also a very abstract subject. The purpose of this book is to present the subject matter in such a way that anyone with exposure to basic calculus can study statistics without spending two semesters of background preparation. To prepare students, we present an optional review of the elementary (descriptive) statistics in Chapter 1. All the probability material required to learn statistics is covered in two chapters.Studentswithaprobabilitybackgroundcaneitherrevieworskipthefirstthreechapters.Itisalsoourbeliefthat any statistics course is not complete without exposure to computational techniques. At the end of each chapter, we give someexamplesofhowtouseMinitab,SPSS,andSAStostatisticallyanalyzedata.Also,attheendofeachchapter,there areprojectsthatwillenhancetheknowledgeandunderstandingofthematerialscoveredinthatchapter.Inthechapteron the empirical methods, we present some of the modern computational and simulation techniques, such as bootstrap, jackknife,andMarkovchainMonteCarlomethods.Thelastchaptersummarizessomeofthestepsnecessarytoapplythe materialcoveredinthebooktoreal-worldproblems.Thefirstsixchaptershavebeenclasstestedasaone-semestercourse formorethan3yearswithfivedifferentprofessorsteaching.Firstelevenchaptershavebeenclasstestedbytwodifferent professors for more than 3 years in two consecutive semesters. The audience was junior- and senior-level undergraduate students from many disciplines who had two semesters of calculus, most of them with no probability or statistics back- ground. The feedback from the students and instructors was very positive. Recommendations from the instructors and students were very useful in improving the style and content of the book. Aim and Objective of the Textbook This textbook provides a calculus-based coverage of statistics and introduces students to methods of theoretical statistics and their applications. It assumes no prior knowledge of statistics or probability theory, but does require calculus. Most booksatthislevelarewrittenwithelaboratecoverageofprobability.Thisrequiresteachingonesemesterofprobabilityand then continuing with one or two semesters of statistics. This creates a particular problem for nonstatistics majors from various disciplines who want to obtain a sound background in mathematical statistics and applications. It is our aim to introducebasicconceptsofstatisticswithsoundtheoreticalexplanations.Becausestatisticsisbasicallyaninterdisciplinary applied subject, we offer many applied examples and relevant exercises from different areas. Knowledge of using com- puters for data analysis is desirable. We present examples of solving statistical problems using Minitab, SPSS, and SAS. Features l During years of teaching, we observed that many students who do well in mathematics courses find it difficult to un- derstand the concept of statistics. To remedy this, we present most of the material covered in the textbook with well- definedstep-by-stepprocedurestosolverealproblems.Thisclearlyhelpsthestudentstoapproachproblemsolvingin statistics more logically. l The usefulness of each statistical method introduced is illustrated by several relevant examples. l At the end of each section, we provide ample exercises that are a good mix of theory and applications. l In each chapter,we give various projects for students towork on. These projects aredesignedin sucha way that stu- dentswillstartthinkingabouthowtoapplytheresultstheylearnedinthechapteraswellasotherissuestheywillneed to know for practical situations. l Attheendofthechapters,we includeanoptional section oncomputermethodswith Minitab,SPSS,andSASexam- pleswithclearandsimplecommandsthatthestudentcanusetoanalyzedata.Thiswillhelpthestudentstolearnhow to utilize the standard methods they have learned in the chapter to study real data. l We introduce many of the modern statistical computational and simulation concepts, such as the jackknife and boot- strapmethods,theEMalgorithms,andtheMarkovchainMonteCarlomethods,suchastheMetropolisalgorithm,the Preface xix MetropoliseHastingsalgorithm,andtheGibbssampler.TheMetropolisalgorithmwasmentionedinComputinginSci- ence&Engineeringasbeingamongthetop10algorithmshavingthe“greatestinfluenceonthedevelopmentandprac- tice of science and engineering in the 20th century.” l We have introduced the increasingly popular concept of Bayesian statistics and decision theory with applications. l A separate chapter on design of experiments, including a discussion on the Taguchi approach, is included. l Thecoverageofthebookspansmostoftheimportantconceptsinstatistics.Learningthematerial alongwithcompu- tational examples will prepare students to understand and utilize software procedures to perform statistical analysis. l Everychaptercontainsdiscussiononhowtoapplytheconceptsandwhattheissuesrelatedtoapplyingthetheoryare. l A student’s solution manual, instructor’s manual, and data disk are provided. l In the last chapter, we discuss some issues in applications to clearly demonstrate in a unified way how to check for manyassumptionsindataanalysisandwhatstepsoneneedstofollowtoavoidpossiblepitfallsinapplyingthemethods explained in the rest of this textbook. Flow chart Inthisflowchart,wesuggestsomeoptionsonhowtousethebookinaone-semesterortwo-semestercourse.Foratwo- semestercourse,werecommendcoverageofthecompletetextbook.However,Chapters1,9,and14areoptionalforboth one-andtwo-semestercoursesandcanbegivenasreadingexercises.Foraone-semestercourse,wesuggestthefollowing options: A, B, C, D. One semester Without probability background With probability background Ch. 2 A B C D Ch. 3 Ch. 5 Ch. 5 Ch. 5 Ch. 5 Ch. 4 Ch. 6 Ch. 6 Ch. 6 Ch. 6 Ch. 5 Ch. 7 Ch. 7 Ch. 7 Ch. 7 Ch. 8 Ch. 8 Ch. 8 Ch. 6 Ch. 8 Ch. 10 Ch. 12 Ch. 11 Ch. 11 Ch. 7 Ch. 12 Ch. 13 Ch. 13 Ch. 12 Optional chapters Ch. 8 Ch. 10 Ch. 11 Ch. 12 xxi