Ton J. Cleophas · Aeilko H. Zwinderman

SPSS for Starters and 2nd Levelers

Second Edition

Ton J. Cleophas
Department Medicine
Albert Schweitzer Hospital
Dordrecht, The Netherlands

European College Pharmaceutical Medicine
Lyon, France

Aeilko H. Zwinderman
Department Biostatistics
Academic Medical Center
Amsterdam, The Netherlands

European College Pharmaceutical Medicine
Lyon, France

Additional material to this book can be downloaded from http://extras.springer.com

ISBN 978-3-319-20599-1
ISBN 978-3-319-20600-4 (eBook)
DOI 10.1007/978-3-319-20600-4
Library of Congress Control Number: 2015943499

Springer Cham Heidelberg New York Dordrecht London
© Springer International Publishing Switzerland 2009, 2016

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.
Printed on acid-free paper

Springer International Publishing AG Switzerland is part of Springer Science+Business Media (www.springer.com)

Prefaces to the 1st edition

Part I

This small book addresses different kinds of data files, as commonly encountered in clinical research, and their data analysis on SPSS software. Some 15 years ago serious statistical analyses were conducted by specialist statisticians using mainframe computers. Nowadays, there is ready access to statistical computing using personal computers or laptops, and this practice has changed boundaries between basic statistical methods that can be conveniently carried out on a pocket calculator and more advanced statistical methods that can only be executed on a computer. Clinical researchers currently perform basic statistics, including t-tests and chi-square tests, without professional help from a statistician. With the help of user-friendly software, the step from such basic tests to more complex tests has become smaller and easier to take.

It is our experience as masters' and doctorate class teachers of the European College of Pharmaceutical Medicine (EC Socrates Project, Lyon, France) that students are eager to master adequate command of statistical software for that purpose. However, doing so, albeit easy, still takes 20–50 steps from logging in to the final result, and all of these steps have to be learned in order for the procedures to be successful.

The current book has been made intentionally small, avoiding theoretical discussions and highlighting technical details. This means that this book is unable to explain how certain steps were made and why certain conclusions were drawn. For that purpose additional study is required, and we recommend the textbook "Statistics Applied to Clinical Trials," Springer 2009, Dordrecht, Netherlands, by the same authors, because the current text is largely complementary to the text of that textbook.

We have to emphasize that automated data analysis carries a major risk of fallacies.
Computers cannot think and can only execute commands as given. As an example, regression analysis usually applies independent and dependent variables, often interpreted as causal factors and outcome factors. For example, gender or age may determine the type of operation or type of surgeon. The type of surgeon does not determine the age and gender. Yet a software program has no difficulty using nonsense determinants, and the investigator in charge of the analysis has to decide what is caused by what, because a computer cannot do things like that, although they are essential to the analysis. The same is basically true of any statistical test assessing the effects of causal factors on health outcomes.

At the completion of each test as described in this book, a brief clinical interpretation of the main results is given in order to compensate for the abundance of technical information. The actual calculations made by the software are not always required for understanding the test, but some understanding may be helpful and can also be found in the above textbook. We hope that the current book is small enough for those not fond of statistics but fond of statistically proven hard data to make a start with SPSS, a state-of-the-art software program for clinical data analysis. Moreover, it is very satisfying to prove from your own data that your own prior hypothesis was true, and it is even more satisfying if you are able to produce the very proof yourself.

Lyon, France          Ton J. Cleophas
December 2009         Aeilko H. Zwinderman

Part II

The small book "SPSS for Starters" issued in 2010 presented 20 chapters of cookbook-like step-by-step data analyses of clinical research and was written to help clinical investigators and medical students analyze their data without the help of a statistician. The book served its purpose well enough, since 13,000 electronic reprints were ordered within 9 months of publication.

The above book reviewed, e.g., methods for:

1. Continuous data, like t-tests, nonparametric tests, and analysis of variance
2. Binary data, like crosstabs, McNemar's tests, and odds ratio tests
3. Regression data
4. Trend testing
5. Clustered data
6. Diagnostic test validation

The current book is a logical continuation and adds further methods fundamental to clinical data analysis.

It contains, e.g., methods for:

1. Multistage analyses
2. Multivariate analyses
3. Missing data
4. Imperfect and distribution-free data
5. Comparing validities of different diagnostic tests
6. More complex regression models

Although a wealth of computationally intensive statistical methods is currently available, the authors have taken special care to stick to relatively simple methods, because they often provide the best power and fewest type I errors and are adequate to answer most clinical research questions.

It is time for clinicians to stop being nervous about statistics and to stop leaving their data to statisticians running them through SAS or SPSS to see if significances can be found. This is called data dredging. Statistics can do more for you than produce a host of irrelevant p-values. It is a discipline at the interface of biology and mathematics: mathematics is used to answer sound biological hypotheses. We do hope that "SPSS for Starters 1 and 2" will benefit this process.

Two other publications from the same authors entitled Statistical Analysis of Clinical Data on a Pocket Calculator 1 and 2 are rather complementary to the above books and provide a more basic approach and better understanding of the arithmetic.

Lyon, France          Ton J. Cleophas
January 2012          Aeilko H. Zwinderman

Preface to 2nd edition

Over 100,000 copies of various chapters of the first edition of SPSS for Starters (Parts I (2010) and II (2012)) have been sold, and many readers have commented and given their recommendations for improvements.

In this 2nd edition, all the chapters have been corrected for textual and arithmetic errors, and they contain updated versions of the background information, scientific question information, examples, and conclusions sections. In the "notes section", updated references helpful to a better understanding of the brief descriptions in the current text are given.
Instead of the two previously published 20-chapter Springer briefs, one for simple and one for complex data, this 2nd edition is produced as a single 60-chapter textbook.

The previously used, rather arbitrary classification has been replaced with three parts, according to the most basic differences in data file characteristics:

1. Continuous outcome data (36 chapters)
2. Binary outcome data (18 chapters)
3. Survival and longitudinal data (6 chapters)

The latter classification should be helpful to investigators for choosing the appropriate class of methods for their data.

Each chapter now starts with a schematic overview of the statistical model to be reviewed, including types of data (mainly continuous or binary (yes, no)) and types of variables (mainly outcome and predictor variables).

Entire data tables of the examples are available through the Internet and are redundant to the current text. Therefore, only the first 10 rows of each data table are now printed.

However, relevant details about the data have been inserted for improved readability.