
Modern Statistics: A Computer-Based Approach with Python (Statistics for Industry, Technology, and Engineering) PDF

452 Pages·2022·6.559 MB·English

Preview Modern Statistics: A Computer-Based Approach with Python (Statistics for Industry, Technology, and Engineering)

Statistics for Industry, Technology, and Engineering

Series Editor
David Steinberg, Tel Aviv University, Tel Aviv, Israel

Editorial Board Members
V. Roshan Joseph, Georgia Institute of Technology, Atlanta, GA, USA
Ron S. Kenett, KPA Ltd. Raanana and Samuel Neaman Institute, Technion, Haifa, Israel
Christine Anderson-Cook, Los Alamos National Laboratory, Los Alamos, USA
Bradley Jones, SAS Institute, JMP Division, Cary, USA
Fugee Tsung, Hong Kong University of Science and Technology, Hong Kong, Hong Kong

The Statistics for Industry, Technology, and Engineering series will present up-to-date statistical ideas and methods that are relevant to researchers and accessible to an interdisciplinary audience: carefully organized authoritative presentations, numerous illustrative examples based on current practice, reliable methods, realistic data sets, and discussions of select new emerging methods and their application potential. Publications will appeal to a broad interdisciplinary readership including both researchers and practitioners in applied statistics, data science, industrial statistics, engineering statistics, quality control, manufacturing, applied reliability, and general quality improvement methods.

Principal Topic Areas:
* Quality Monitoring * Engineering Statistics * Data Analytics * Data Science * Time Series with Applications * Systems Analytics and Control * Stochastics and Simulation * Reliability * Risk Analysis * Uncertainty Quantification * Decision Theory * Survival Analysis * Prediction and Tolerance Analysis * Multivariate Statistical Methods * Nondestructive Testing * Accelerated Testing * Signal Processing * Experimental Design * Software Reliability * Neural Networks *

The series will include professional expository monographs, advanced textbooks, handbooks, general references, thematic compilations of applications/case studies, and carefully edited survey books.

Ron S. Kenett • Shelemyahu Zacks • Peter Gedeck

Modern Statistics
A Computer-Based Approach with Python

Ron S. Kenett
KPA Ltd. Raanana and Samuel Neaman Institute, Technion
Haifa, Israel

Shelemyahu Zacks
Mathematical Sciences
Binghamton University
McLean, VA, USA

Peter Gedeck
Data Science
University of Virginia
Falls Church, VA, USA

This work contains media enhancements, which are displayed with a “play” icon. Material in the print book can be viewed on a mobile device by downloading the Springer Nature “More Media” app available in the major app stores. The media enhancements in the online version of the work can be accessed directly by authorized users.

ISSN 2662-5555
ISSN 2662-5563 (electronic)
Statistics for Industry, Technology, and Engineering
ISBN 978-3-031-07565-0
ISBN 978-3-031-07566-7 (eBook)
https://doi.org/10.1007/978-3-031-07566-7

Mathematics Subject Classification: 62E15, 62G30, 62M10, 62P30, 62P10, 97K40, 97K70, 97K80

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2022

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This book is published under the imprint Birkhäuser, www.birkhauser-science.com by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

To my wife Sima, our children and their children: Yonatan, Alma, Tomer, Yadin, Aviv, Gili, Matan, Eden, and Ethan. RSK
To my wife Hanna, our sons Yuval and David, and their families with love. SZ
To Janet with love. PG

Preface

Statistics has developed by combining the needs of science, business, industry, and government. More recent development is connected with methods for generating insights from data, using statistical theory and delivery platforms. This integration is at the core of applied statistics and most of theoretical statistics.

Before the beginning of the twentieth century, statistics meant observed data and descriptive summary figures, such as means, variances, indices, etc., computed from data. With the introduction of the χ²-test for goodness of fit by Karl Pearson (1900) and the t-test by Gosset (Student, 1908) for drawing inference on the mean of a normal population, statistics became a methodology of analyzing sample data to determine the validity of hypotheses about the source of the data (the population). Fisher (1922) laid the foundations for statistics as a discipline. He considered the object of statistical methods to be reducing data into the essential statistics, and he identified three problems that arise in doing so:

1. Specification: choosing the right mathematical model for a population
2. Estimation: methods to calculate, from a sample, estimates of the parameters of the hypothetical population
3. Distribution: properties of statistics derived from samples

Forty years later, Tukey (1962) envisioned a data-centric development of statistics, sketching the pathway to data science. Forty years after that, we entered the age of big data, data science, artificial intelligence, and machine learning. These new developments are built on the methods, applications, and experience of statisticians around the world.

The first two authors started collaborating on a book in the early 1990s. In 1998, we published with Duxbury Wadsworth Modern Industrial Statistics: Design and Control of Quality and Reliability. The book appeared in a Spanish edition (Estadística Industrial Moderna: Diseño y Control de Calidad y la Confiabilidad, Thomson International, 2000). An abbreviated edition was published as Modern Statistics: A Computer based Approach (Thomson Learning, 2001); this was followed by a Chinese edition (China Statistics Press, 2003) and a softcover edition (Brooks/Cole, 2004). The book used QuickBasic, S-Plus, and MINITAB. In 2014 we published, with Wiley, an extended second edition titled Modern Industrial Statistics: With Applications in R, MINITAB and JMP. That book was translated into Vietnamese by the Vietnam Institute for Advanced Studies in Mathematics (VIASM, 2016). A third, expanded edition, was published by Wiley in 2021.

This book is about modern statistics with Python. It reflects many years of experience of the authors in doing research, teaching and applying statistics in science, healthcare, business, defense, and industry domains. The book invokes over 40 case studies and provides comprehensive Python applications. In 2019, there were 8.2 million developers in the world who code using Python, which is considered the fastest-growing programming language. A special Python package, mistat, is available for download at https://gedeck.github.io/mistat-code-solutions/ModernStatistics/. Everything in the book can be reproduced with mistat.
We therefore provide, in this book, an integration of needs, methods, and delivery platform for a large audience and a wide range of applications.

Modern Statistics: A Computer-Based Approach with Python is a companion text to another book published by Springer titled: Industrial Statistics: A Computer Based Approach with Python. Both books include mutual cross references, but both books are stand-alone publications. This book can be used as a textbook in a one-semester or two-semester course on modern statistics. The technical level of the presentation in both books can serve both undergraduate and graduate students. The examples and case studies provide access to hands-on teaching and learning. Every chapter includes exercises, data sets, and Python applications. These can be used in regular classroom setups, flipped classroom setups, and online or hybrid education programs. The companion text is focused on industrial statistics with special chapters on advanced process monitoring methods, cybermanufacturing, computer experiments, and Bayesian reliability. Modern Statistics is a foundational text and can be combined with any program requiring data analysis in its curriculum. Examples include courses in data science, industrial statistics, physics, biology, chemistry, economics, psychology, social sciences, or any engineering discipline.

Modern Statistics: A Computer-Based Approach with Python includes eight chapters. Chapter 1 is on analyzing variability with descriptive statistics. Chapter 2 is on probability models and distribution functions. Chapter 3 introduces statistical inference and bootstrapping. Chapter 4 is on variability in several dimensions and regression models. Chapter 5 covers sampling for estimation of finite population quantities, a common situation when one wants to infer on a population from a sample. Chapter 6 is dedicated to time series analysis and prediction. Chapters 7 and 8 are about modern data analytic methods.
Industrial Statistics: A Computer-Based Approach with Python contains 11 chapters:

Chapter 1: Introduction to Industrial Statistics
Chapter 2: Basic Tools and Principles of Process Control
Chapter 3: Advanced Methods of Statistical Process Control
Chapter 4: Multivariate Statistical Process Control
Chapter 5: Classical Design and Analysis of Experiments
Chapter 6: Quality by Design
Chapter 7: Computer Experiments
Chapter 8: Cybermanufacturing and Digital Twins
Chapter 9: Reliability Analysis
Chapter 10: Bayesian Reliability Estimation and Prediction
Chapter 11: Sampling Plans for Batch and Sequential Inspection

This second book is focused on industrial statistics with applications to monitoring, diagnostics, prognostic, and prescriptive analytics. It can be used as a stand-alone book, or in conjunction with Modern Statistics. Both books include solution manuals to exercises listed at the end of each chapter. This was designed to support self-learning as well as instructor led courses.

We made every possible effort to ensure the calculations are correct and the text is clear. However, should errors have skipped to the printed version, we would appreciate feedback from readers noticing these. In general, any feedback will be much appreciated.

Finally, we would like to thank the team at Springer Birkhäuser, including Dana Knowles and Christopher Tominich. They made everything in the publication process look easy.

Ra’anana, Israel          Ron S. Kenett
McLean, VA, USA           Shelemyahu Zacks
Falls Church, VA, USA     Peter Gedeck
April 2022

Contents

1 Analyzing Variability: Descriptive Statistics
  1.1 Random Phenomena and the Structure of Observations
  1.2 Accuracy and Precision of Measurements
  1.3 The Population and the Sample
  1.4 Descriptive Analysis of Sample Values
    1.4.1 Frequency Distributions of Discrete Random Variables
    1.4.2 Frequency Distributions of Continuous Random Variables
    1.4.3 Statistics of the Ordered Sample
    1.4.4 Statistics of Location and Dispersion
  1.5 Prediction Intervals
  1.6 Additional Techniques of Exploratory Data Analysis
    1.6.1 Density Plots
    1.6.2 Box and Whiskers Plots
    1.6.3 Quantile Plots
    1.6.4 Stem-and-Leaf Diagrams
    1.6.5 Robust Statistics for Location and Dispersion
  1.7 Chapter Highlights
  1.8 Exercises

2 Probability Models and Distribution Functions
  2.1 Basic Probability
    2.1.1 Events and Sample Spaces: Formal Presentation of Random Measurements
    2.1.2 Basic Rules of Operations with Events: Unions and Intersections
    2.1.3 Probabilities of Events
    2.1.4 Probability Functions for Random Sampling
    2.1.5 Conditional Probabilities and Independence of Events
    2.1.6 Bayes’ Theorem and Its Application
  2.2 Random Variables and Their Distributions
    2.2.1 Discrete and Continuous Distributions
      2.2.1.1 Discrete Random Variables
      2.2.1.2 Continuous Random Variables
    2.2.2 Expected Values and Moments of Distributions
    2.2.3 The Standard Deviation, Quantiles, Measures of Skewness, and Kurtosis
    2.2.4 Moment Generating Functions
  2.3 Families of Discrete Distribution
    2.3.1 The Binomial Distribution
    2.3.2 The Hypergeometric Distribution
    2.3.3 The Poisson Distribution
    2.3.4 The Geometric and Negative Binomial Distributions
  2.4 Continuous Distributions
    2.4.1 The Uniform Distribution on the Interval (a, b), a < b
    2.4.2 The Normal and Log-Normal Distributions
      2.4.2.1 The Normal Distribution
      2.4.2.2 The Log-Normal Distribution
    2.4.3 The Exponential Distribution
    2.4.4 The Gamma and Weibull Distributions
    2.4.5 The Beta Distributions
  2.5 Joint, Marginal, and Conditional Distributions
    2.5.1 Joint and Marginal Distributions
    2.5.2 Covariance and Correlation
    2.5.3 Conditional Distributions
  2.6 Some Multivariate Distributions
    2.6.1 The Multinomial Distribution
    2.6.2 The Multi-Hypergeometric Distribution
    2.6.3 The Bivariate Normal Distribution
  2.7 Distribution of Order Statistics
  2.8 Linear Combinations of Random Variables
  2.9 Large Sample Approximations
    2.9.1 The Law of Large Numbers
    2.9.2 The Central Limit Theorem
    2.9.3 Some Normal Approximations
  2.10 Additional Distributions of Statistics of Normal Samples
    2.10.1 Distribution of the Sample Variance
    2.10.2 The “Student” t-Statistic
    2.10.3 Distribution of the Variance Ratio
  2.11 Chapter Highlights
  2.12 Exercises

3 Statistical Inference and Bootstrapping
  3.1 Sampling Characteristics of Estimators
  3.2 Some Methods of Point Estimation
