
Understanding Statistics and Experimental Design: How to Not Lie with Statistics

146 Pages·2019·3.364 MB·English


Learning Materials in Biosciences

Michael H. Herzog • Gregory Francis • Aaron Clarke

Understanding Statistics and Experimental Design: How to Not Lie with Statistics

Learning Materials in Biosciences textbooks compactly and concisely discuss a specific biological, biomedical, biochemical, bioengineering or cell biologic topic. The textbooks in this series are based on lectures for upper-level undergraduates, master’s and graduate students, presented and written by authoritative figures in the field at leading universities around the globe.

The titles are organized to guide the reader to a deeper understanding of the concepts covered. Each textbook provides readers with fundamental insights into the subject and prepares them to independently pursue further thinking and research on the topic. Colored figures, step-by-step protocols and take-home messages offer an accessible approach to learning and understanding.

In addition to being designed to benefit students, Learning Materials textbooks represent a valuable tool for lecturers and teachers, helping them to prepare their own respective coursework.

More information about this series at http://www.springer.com/series/15430

Michael H. Herzog, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
Gregory Francis, Dept. Psychological Sciences, Purdue University, West Lafayette, IN, USA
Aaron Clarke, Psychology Department, Bilkent University, Ankara, Turkey

ISSN 2509-6125
ISSN 2509-6133 (electronic)
Learning Materials in Biosciences
ISBN 978-3-030-03498-6
ISBN 978-3-030-03499-3 (eBook)
https://doi.org/10.1007/978-3-030-03499-3

This book is an open access publication.

© The Editor(s) (if applicable) and The Author(s) 2019

Open Access This book is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made.

The images or other third party material in this book are included in the book’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the book’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

This work is subject to copyright. All commercial rights are reserved by the author(s), whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Regarding these commercial rights a non-exclusive license has been granted to the publisher.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

Science, Society, and Statistics

The modern world is inundated with statistics. Statistics determine what we eat, how we exercise, who we befriend, how we educate our children, and what type of medical treatment we use. Obviously, statistics is ubiquitous and, unfortunately, misunderstandings about statistics are too. In Chap. 1, we will report that judges at law courts come to wrong conclusions (conclusions about whether or not to send people to prison) because they lack basic statistical understanding. We will show that patients committed suicide because doctors did not know how to interpret the outcome of medical tests. Scientists are often no better. We know colleagues who blindly trust the output of their statistical computer programs, even when the results make no sense. We know of published scientific papers containing results that are incompatible with the theoretical conclusions of the authors.

There is an old saying (sometimes attributed to Mark Twain, but apparently older) that “There are lies, damn lies, and statistics.” We have to concede that the saying makes a valid point. People do often misuse statistical analyses. Maybe it does not go all the way to lying (making an intentionally false statement), but statistics often seem to confuse rather than clarify issues. The title of this book reflects our efforts to improve the use of statistics so that people who perform analyses will better interpret their data and people who read statistics will better understand them. Understanding the core ideas of statistics helps immediately to reveal that many scientific results are essentially meaningless and explains why many empirical sciences are currently facing a replication crisis.

The confusion about statistics is especially frustrating because the core ideas are actually pretty simple.
Computing statistics can be very complicated (thus the need for complicated algorithms and thick textbooks with deep theorems), but a good understanding of the basic principles of statistics can be mastered by everyone.

In 2013, with these ideas in mind, we started teaching a course at the École Polytechnique Fédérale de Lausanne in Switzerland on understanding statistics and experimental design. Over the years, the course has become rather popular and draws in students from biology, neuroscience, medicine, genetics, psychology, and bioengineering. Typically, these students have already had one or more statistics classes that guided them through the details of statistical analysis. In contrast, our course and this book are designed to flesh out the basic principles of those analyses, to succinctly explain what they do, and to promote a better understanding of their capabilities and limitations.

About This Book

Background and Goal
As mentioned, misunderstandings about statistics have become a major problem in our societies. One problem is that computing statistics has become so simple that good education seems not to be necessary. The opposite is true, however. Easy to use statistical programs allow people to perform analyses without knowing what the programs do and without knowing how to interpret the results. Untenable conclusions are a common result. The reader will likely be surprised how big the problem is and how useless a large number of studies are. In addition, the reader may be surprised that even basic terms, such as the p-value, are likely different from what is commonly believed.

The main goal of this book is to provide a short and to-the-point exposition on the essentials of statistics. Understanding these essentials will prepare the reader to understand and critically evaluate scientific publications in many fields of science. We are not interested in teaching the reader how to compute statistics. This can be left to the computer.

Readership
This book is for all citizens and scientists who want to understand the principles of statistics and interpret statistics without going into the detailed mathematical computations. It is perhaps surprising that this goal can be achieved with a very short book and only a few equations. We think people (not just students or scientists) either with or without previous statistics classes will benefit from the book.

We kept mathematics to the lowest level possible and provided non-mathematical intuition wherever possible. We added equations only at occasions where they improve understanding. Except for extremely basic math, only some basic notions from probability theory are needed for understanding the main ideas. Most of the notions become intuitively clear when reading the text.

What This Book Is Not About
This book is not a course in mathematical statistics (e.g., Borel algebras); it is not a traditional textbook on statistics that covers the many different tests and methods; and it is not a manual of statistical analysis programs, such as SPSS and R. The book is not a compendium explaining as many tests as possible. We tried to provide just enough information to understand the fundamentals of statistics but not much more.

What This Book Is About
In Part I, we outline the philosophy of statistics using a minimum of mathematics to make key concepts readily understandable. We will introduce the most basic t-test and show how confusions about basic probability can be avoided.

Understanding Part I helps the reader avoid the most common pitfalls of statistics and understand what the most common statistical tests actually compute. We will introduce null hypothesis testing without the complicated traditional approach and use, instead, a simpler approach via Signal Detection Theory (SDT). Part II is more traditional and introduces the classic tests ANOVA and correlations. Parts I and II provide the standard statistics as they are commonly used. Part III shows that we have a science crisis because simple concepts of statistics were misunderstood, such as the notion of replication.
For example, the reader may be surprised that too many replications of an experiment can be suspicious rather than a reflection of solid science. Just the basic notions and concepts of Chap. 3 in Part I are needed to understand Part III, which includes ideas that are not presented in other basic textbooks. Even though the main bulk of our book is about statistics, we show how statistics is strongly related to experimental design. Many statistical problems can be avoided by clever, which often means simple, designs.

We believe that the unique mixture of core concepts of statistics (Part I), a short and distinct presentation of the most common statistical tests (Part II), and a new meta-statistical approach (Part III) will not only provide a solid understanding of statistics but also exciting and shocking insights into what determines our daily lives.

Materials
For teachers, PowerPoint presentations covering the content of the book are available on request via e-mail: michael.herzog@epfl.ch.

Acknowledgements
We would like to thank Konrad Neumann and Marc Repnow for proofreading the manuscript and Eddie Christopher, Aline Cretenoud, Max, Gertrud and Heike Herzog, Maya Anna Jastrzebowska, Slim Kammoun, Ilaria Ricchi, Evelina Thunell, Richard Walker, He Xu, and Pierre Devaud for useful comments. We sadly report that Aaron Clarke passed away during the preparation of this book.

Lausanne, Switzerland: Michael H. Herzog
West Lafayette, IN, USA: Gregory Francis
Ankara, Turkey: Aaron Clarke

Contents

Part I The Essentials of Statistics

1 Basic Probability Theory
  1.1 Confusions About Basic Probabilities: Conditional Probabilities
    1.1.1 The Basic Scenario
    1.1.2 A Second Test
    1.1.3 One More Example: Guillain-Barré Syndrome
  1.2 Confusions About Basic Probabilities: The Odds Ratio
    1.2.1 Basics About Odds Ratios (OR)
    1.2.2 Partial Information and the World of Disease
  References

2 Experimental Design and the Basics of Statistics: Signal Detection Theory (SDT)
  2.1 The Classic Scenario of SDT
  2.2 SDT and the Percentage of Correct Responses
  2.3 The Empirical d′

3 The Core Concept of Statistics
  3.1 Another Way to Estimate the Signal-to-Noise Ratio
  3.2 Undersampling
    3.2.1 Sampling Distribution of a Mean
    3.2.2 Comparing Means
    3.2.3 The Type I and II Error
    3.2.4 Type I Error: The p-Value is Related to a Criterion
    3.2.5 Type II Error: Hits, Misses
  3.3 Summary
  3.4 An Example
  3.5 Implications, Comments and Paradoxes
  Reference

4 Variations on the t-Test
  4.1 A Bit of Terminology
  4.2 The Standard Approach: Null Hypothesis Testing
  4.3 Other t-Tests
    4.3.1 One-Sample t-Test
    4.3.2 Dependent Samples t-Test
    4.3.3 One-Tailed and Two-Tailed Tests
  4.4 Assumptions and Violations of the t-Test
    4.4.1 The Data Need to be Independent and Identically Distributed
    4.4.2 Population Distributions are Gaussian Distributed
    4.4.3 Ratio Scale Dependent Variable
    4.4.4 Equal Population Variances
    4.4.5 Fixed Sample Size
  4.5 The Non-parametric Approach
  4.6 The Essentials of Statistical Tests
  4.7 What Comes Next?

Part II The Multiple Testing Problem

5 The Multiple Testing Problem
  5.1 Independent Tests
  5.2 Dependent Tests
  5.3 How Many Scientific Results Are Wrong?

6 ANOVA
  6.1 One-Way Independent Measures ANOVA
  6.2 Logic of the ANOVA
  6.3 What the ANOVA Does and Does Not Tell You: Post-Hoc Tests
  6.4 Assumptions
  6.5 Example Calculations for a One-Way Independent Measures ANOVA
    6.5.1 Computation of the ANOVA
    6.5.2 Post-Hoc Tests
  6.6 Effect Size
  6.7 Two-Way Independent Measures ANOVA
  6.8 Repeated Measures ANOVA

7 Experimental Design: Model Fits, Power, and Complex Designs
  7.1 Model Fits
  7.2 Power and Sample Size
    7.2.1 Optimizing the Design
    7.2.2 Computing Power
  7.3 Power Challenges for Complex Designs
