Statistics and Analysis of Scientific Data PDF

491 Pages·2022·9.207 MB·English

Graduate Texts in Physics

Massimiliano Bonamente

Statistics and Analysis of Scientific Data

Third Edition

Graduate Texts in Physics

Series Editors

Kurt H. Becker, NYU Polytechnic School of Engineering, Brooklyn, NY, USA
Jean-Marc Di Meglio, Matière et Systèmes Complexes, Bâtiment Condorcet, Université Paris Diderot, Paris, France
Sadri Hassani, Department of Physics, Illinois State University, Normal, IL, USA
Morten Hjorth-Jensen, Department of Physics, Blindern, University of Oslo, Oslo, Norway
Bill Munro, NTT Basic Research Laboratories, Atsugi, Japan
Richard Needs, Cavendish Laboratory, University of Cambridge, Cambridge, UK
William T. Rhodes, Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL, USA
Susan Scott, Australian National University, Acton, Australia
H. Eugene Stanley, Center for Polymer Studies, Physics Department, Boston University, Boston, MA, USA
Martin Stutzmann, Walter Schottky Institute, Technical University of Munich, Garching, Germany
Andreas Wipf, Institute of Theoretical Physics, Friedrich-Schiller-University Jena, Jena, Germany

Graduate Texts in Physics publishes core learning/teaching material for graduate- and advanced-level undergraduate courses on topics of current and emerging fields within physics, both pure and applied. These textbooks serve students at the MS- or PhD-level and their instructors as comprehensive sources of principles, definitions, derivations, experiments and applications (as relevant) for their mastery and teaching, respectively. International in scope and relevance, the textbooks correspond to course syllabi sufficiently to serve as required reading. Their didactic style, comprehensiveness and coverage of fundamental material also make them suitable as introductions or references for scientists entering, or requiring timely knowledge of, a research field.
More information about this series at https://link.springer.com/bookseries/8431

Massimiliano Bonamente
University of Alabama in Huntsville
Huntsville, AL, USA

ISSN 1868-4513
ISSN 1868-4521 (electronic)
Graduate Texts in Physics
ISBN 978-981-19-0364-9
ISBN 978-981-19-0365-6 (eBook)
https://doi.org/10.1007/978-981-19-0365-6

1st & 2nd editions: © Springer Science+Business Media New York 2013, 2017
3rd edition: © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Portrait of John Duns Scotus by J. van Ghent, oil on panel, circa 1472–1476, Galleria Nazionale delle Marche

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Numbers don’t lie

Foreword

It is now the era of Big Data. Huge troves of quantitative information reside everywhere and can be accessed from anywhere in the world in a heartbeat. While our tools for the collection and analysis of data have rocketed forward over a single human lifetime, we remain human. As such, we can only hold a few simple numbers and concepts in our poor, analog brains. We refer to those concepts as our understanding.

Statistical analysis of data allows us to make quantitative assessments of our concepts, hopefully keeping our preconceptions and misconceptions at bay. But using statistical methods is never just “turning a crank”. We must apply judgment. We must be careful. Computers can easily lull one into a false sense of security. In his third edition, Professor Bonamente provides the clearest exposition of statistical analysis and the probability theory upon which it is based. He provides a roadmap to the proper use of statistics.

In the opening chapter of this book, the author discusses Bayesian versus Frequentist statistical approaches. The Frequentist approach can be useful if one has no underlying model for the mechanisms at play. But, in the Physical Sciences and Engineering, since the time of Isaac Newton, we have models that rely on parameters. And those parameters can distill the understanding we seek. How to measure those parameters and place error estimates upon the results is the central theme of this book.

Today, there are surprisingly few courses in statistics offered at universities. And there are even fewer textbooks available to support those classes. Many of the books on statistics are written for the biological and medical fields and are suitable for use in research arenas where the parameters and mechanisms are poorly understood. In this text, you will find the approaches suitable for the analysis of data in physical fields as opposed to the life sciences.
This third edition is the textbook we have been awaiting. Every individual who encounters data in a professional capacity should read it. The material ranges from the basics of probability to the serious analysis of data. Any individual who works with data daily should master the materials in this book, keeping a copy for reference.

I frequently teach our department’s course in data analysis and statistics. I have used Bonamente’s second edition to good effect, but the third edition is the textbook I have awaited. I only wish I had had it back when I was a student. I am old enough that I had to use a slide rule as an undergraduate and did not gain access to a hand calculator until graduate school. Computers were run on decks of punch cards. How far our hardware tools have come! Yet the statistical tools employed today have hardly changed.

Even the data have changed. When I was young, astronomers employed film. Today we use solid-state detectors that count individual photons at light levels so low that most of the detector bins have counted no photons at all. The switch from film (analog) to photon-counting (digital) has completely changed the nature of the appropriate statistical analysis.

In Chaps. 15 and 16, the author presents an approach to quantitative statistical analysis of low-signal-count data. It is often the case today that many events are recorded, but the instruments taking the data have even higher resolution, i.e., more data bins than events. In these two chapters, the author discusses the application of Poisson Statistics to this low-count-per-bin data using the Cash Statistic. He not only presents the use of this Cash Statistic but develops it to a new level that should prove useful across a range of modern applications.
Yes, the Cash Statistic is named after me, the writer of this foreword. I was lucky enough to participate in the very early days of x-ray astronomy and, in particular, be involved with the very first x-ray telescopes back in the 1970s. X-ray sources have notoriously weak photon fluxes, but every photon packs a punch, so it was x-ray and gamma-ray astronomers who first had to reckon with the statistics of images and spectra where a significant fraction of the data bins were empty. The data were fundamentally of a Poisson distribution nature and could not be addressed with the usual assumption of a Gaussian distribution. How, for example, does one deal with an image that has 1024 bins with an average of 0.01 x-rays per bin, yet there is one bin with three x-rays in it, right where the target is supposed to be? The answer lies in the Cash Statistic and showing that it was distributed as a Chi-square. Professor Bonamente has now, in this text, broadened the scope of the applications and provided a better framework for the application of the statistic.

In Part III of this book, the author addresses Monte Carlo methods. These are of fundamental importance to much of modern modeling, particularly in highly complex models like ray tracing of optical systems. Again, this book takes the reader (student) into rigorous analysis techniques that are difficult, if not impossible, to find explained elsewhere.

Statistics can be a hard master. The math can be challenging. But in the age of easy access to computing, the challenge to the user is to sift from all the numbers the “understanding” that is sought. The math becomes secondary. Judgment comes to the fore. What statistic should I use? What do the results actually mean? It is easy to overuse the statistics, finding statistically significant patterns that are actually of no significance.

And, as time goes by, and one’s attention moves elsewhere, one is certain to forget how to perform the statistical analyses. But the general knowledge of the range of statistical tools will stay with you. Then, when the need arises, pull this book off the shelf. Professor Bonamente’s book will bring it all back.
June 2022

Webster Cash
Professor of Astrophysical and Planetary Sciences
University of Colorado, Boulder

Preface

With the third edition of Statistics and Analysis of Scientific Data, I have continued to pursue the goal of providing a textbook that is both mathematically sound and easy to use for practical applications. Data analysis straddles the rigorous fields of statistics and mathematics, and the messy world of experiments and data collection, in a way that makes it often unclear as to where to draw the line between what needs to be done to the data and what should be done according to the available theoretical tools. This textbook aims to treat both the theoretical and practical aspects of several key methods of data analysis, including the limitations of certain common practices, such as the use of the chi-squared statistic for non-Gaussian data.

Starting with the second edition, I decided to mark portions of the textbook that are primarily theoretical with a gray sidebar. These parts are not essential for the practice of data analysis, and they can be skipped by the reader who is not interested in its theoretical underpinnings, or by the reader who needs a fast reference for a method that they already know. I find that this is an effective way of presenting certain mathematically intensive topics, and I have continued with this practice in the third edition.

Certain key theoretical results that have a great influence on statistical methods, such as H. Cramér’s theorem on the limiting distribution of the chi-squared statistic or S. S. Wilks’ theorem on the distribution of the likelihood ratio, are now presented in a more systematic way. The mathematical theory behind some of these results is quite complex, and therefore I sometimes found it necessary to describe only the main results, and then provide a reference to the original work, instead of delving into too deep a mathematical treatment. The more theoretically inclined reader is pointed to the relevant references instead. At the same time, it is important that the data analysis practitioner is aware of these mathematical results and the limitations they impose.
A cautionary tale of what can go wrong when there is a mismatch between the method of analysis and the underlying theory is in K. Pearson’s use of the chi-squared statistic for contingency tables, which are now treated in detail in Chap. 10. Pearson was erroneously convinced that 2×2 contingency tables with fixed
