ebook img

Behavioral Research Data Analysis with R PDF

246 Pages·2012·1.943 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Behavioral Research Data Analysis with R

Use R! SeriesEditors: RobertGentleman KurtHornik GiovanniParmigiani Forfurthervolumes: http://www.springer.com/series/6991 Use R! Albert:BayesianComputationwithR Bivand/Pebesma/Go´mez-Rubio:AppliedSpatialDataAnalysiswithR Cook/Swayne:InteractiveandDynamicGraphicsforDataAnalysis: WithRandGGobi Hahne/Huber/Gentleman/Falcon:BioconductorCaseStudies Paradis:AnalysisofPhylogeneticsandEvolutionwithR Pfaff:AnalysisofIntegratedandCointegratedTimeSerieswithR Sarkar:Lattice:MultivariateDataVisualizationwithR Spector:DataManipulationwithR Yuelin Li • Jonathan Baron Behavioral Research Data Analysis with R 123 YuelinLi JonathanBaron MemorialSloan-KetteringCancerCenter DepartmentofPsychology DepartmentofPsychiatryandBehavioral UniversityofPennsylvania Sciences 3720WalnutStreet 641LexingtonAve.7thFloor Philadelphia,Pennsylvania19104-6241 NewYork,NewYork10022-4503 USA USA [email protected] [email protected] SeriesEditors: RobertGentleman KurtHornik PrograminComputationalBiology Departmentfu¨rStatistikundMathematik DivisionofPublicHealthSciences Wirtschaftsuniversita¨tWienAugasse2-6 FredHutchinsonCancerResearchCenter A-1090Wien 1100FairviewAve.N,M2-B876 Austria Seattle,Washington98109-1024 USA GiovanniParmigiani TheSidneyKimmelComprehensiveCancer CenteratJohnsHopkinsUniversity 550NorthBroadway Baltimore,MD21205-2011 USA ISBN978-1-4614-1237-3 e-ISBN978-1-4614-1238-0 DOI10.1007/978-1-4614-1238-0 SpringerNewYorkDordrechtHeidelbergLondon LibraryofCongressControlNumber:2011940221 ©SpringerScience+BusinessMedia,LLC2012 Allrightsreserved.Thisworkmaynotbetranslatedorcopiedinwholeorinpartwithoutthewritten permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY10013, USA),except forbrief excerpts inconnection with reviews orscholarly analysis. Usein connectionwithanyformofinformationstorageandretrieval,electronicadaptation,computersoftware, orbysimilarordissimilarmethodologynowknownorhereafterdevelopedisforbidden. Theuseinthispublicationoftradenames,trademarks,servicemarks,andsimilarterms,eveniftheyare notidentifiedassuch,isnottobetakenasanexpressionofopinionastowhetherornottheyaresubject toproprietaryrights. Printedonacid-freepaper SpringerispartofSpringerScience+BusinessMedia(www.springer.com) Preface ThisbookiswrittenforbehavioralscientistswhowanttoconsideraddingRtotheir existingsetofstatisticaltools,orwanttoswitchtoRastheirmaincomputationtool. We aimprimarilytohelppractionersofbehavioralresearchmakethetransitionto R. The focus is to providepractical advice on some of the widely used statistical methods in behavioral research, using a set of notes and annotated examples. We alsoaimtohelpbeginnerslearnmoreaboutstatisticsandbehavioralresearch.These arestatisticaltechniquesusedbypsychologistswhodoresearchonhumansubjects, but of course they are also relevant to researchers in others fields that do similar kindsofresearch. WeassumethatthereaderhasreadtherelevantpartsofRmanualsontheCRAN websiteathttp://www.r-project.org,suchas“AnIntroductiontoR”,“R DataImport/Export”,and“RInstallationandAdministration”.Weassumethatthe reader has gotten to the point of installing R and trying a couple of examples. We also assume that the reader has relevant experiences in using other statistical packagestocarryoutdataanalytictaskscoveredinthisbook.Thesourcecodeand data for some of the examples in the book can be downloaded from the book’s website at: http://idecide.mskcc.org/yl home/rbook/. We do not dwellonthestatisticaltheoriesunlesssomedetailsareessentialintheappropriate use of the statistical methods. When they are called for, theoretical details are accompaniedbyvisualexplanationswheneverfeasible.Mathematicalequationsare usedthroughoutthebookinthehopesthatreaderwillfindthemhelpfulingeneral, and specifically in reaching beyond the scope of this book. For example, matrix notationsareusedinthechapterscoveringlinearregressionandlinearmixed-effects modelingbecausetheyarethestandardnotationsfoundinstatisticsjournals.Abasic appreciationofmathematicalnotationsmayhelpthereadersimplementthesenew techniquesbeforeapackagedsolutionisavailable.Nevertheless,themainemphasis of this book is on the practical data analytic skills so that they can be quickly incorporatedintothereader’sownresearch. Thestatisticaltechniquesinthisbookrepresentmanyofstatisticaltechniquesin ourownresearch.Thepedagogicalplanistopresentstraightforwardsolutionsand add more sophisticated techniques if they help improve clarity and/or efficiency. v vi Preface As can be seen in the first example in Chap. 1, the same analysis can be carried outbyastraightforwardandamoresophisticatedmethod.Chapters1–4coverbasic topicssuchasdataimport/export,statisticalmethodsforcomparingmeansandpro- portions,andgraphics.Thesetopicsmaybepartofanintroductorytextforstudents in behavioral sciences. Data analysis can often be adequately addressed with no more than these straightforward methods. Chapter 4 contains plots in published articles in the journal Judgment and Decision Making (http://journal.sjdm.org/). Chapters5–7 covertopicswith intermediarydifficulty,such as repeated-measures ANOVA,ordinaryleastsquareregression,logisticregression,andstatisticalpower andsamplesizeconsiderations.Thesetopicsaretypicallytaughtatamoreadvanced undergraduatelevelorfirstyeargraduatelevel. Practitioners of behavioral statistics are often asked to estimate the statistical power of a study design. R provides a set of flexible functions for sample size estimation. More complex study designs may involve estimating statistical power bysimulations.WefinditeasiertodosimulationswithRthanwithotherstatistical packagesweknow.ExamplesareprovidedinChaps.7and11. Theremainderofthisbookcovermoreadvancedtopics.Chapter8coversItem ResponseTheory(IRT),astatisticalmethodusedinthedevelopmentandvalidation of psychologicaland educationalassessment tools.We beginChap.8 with simple examplesandendwithsophisticatedapplicationsthatrequireaBayesianapproach. Suchtopicscaneasilytakeupafullvolume.Onlypracticalanalytictasksarecov- eredsothatthereadercanquicklyadaptourexamplesforhisorherownresearch. ThelatentregressionRaschmodelinSect.8.4.2highlightsthepowerandflexibility of R in working with other statistical languages such as WinBUGS/OpenBUGS. Chapter9coversmissingdataimputation.Chapters10–11coverhierarchicallinear models applied in repeated-measured data and clustered data. These topics are written for researchers already familiar with the theories. Again, these chapters emphasizethepracticaldataanalysisskillsandnotthetheories. R evolves continuously. New techniques and user-contributed packages are constantly evolving. We strive to provide the latest techniques. However, readers should consult other sources for a fuller understanding of relevant topics. The R journalpublishesthelatesttechniquesandnewpackages.Anothergoodsourcefor new techniquesis The Journalof Statistical Software (http://www.jstatsoft.org/). The R-help mailing list is another indispensable resource. User contributions make R a truly collaborative statistical computation framework.Many great texts and tutorials for beginners and intermediate users are already widely available. Beginner-level tutorials and how-to guides can be found online at the CRAN “ContributedDocumentation”page. Thisbookoriginatedfromouronlinetutorial“NotesontheuseofRforpsychol- ogy experiments and questionnaires.” Many individuals facilitated the transition. Wewouldliketothankthemformakingthisbookpossible.JohnKimmel,former editorforthisbookatSpringer,firstencouragedustowritethisbookandprovided continuousguidanceandencouragement.SpecialthanksgotoKathrynSchelland Marc Strauss and other editorial staff at Springer on the preparation of the book. Severalannonymousreviewersprovidedsuggestionsonhowtoimprovethebook. Preface vii Weareespeciallyindebtedtotheindividualswhohelpedsupplythedatausedinthe examples,includingtheauthorsoftheRpackagesweuse,andthosewhomakethe rawdatafreelyaccessibleonline. NewYork YuelinLi Philadelphia JonathanBaron Contents 1 Introduction ............................................................... 1 1.1 AnExampleRSession .............................................. 1 1.2 AFewUsefulConceptsandCommands............................ 3 1.2.1 Concepts...................................................... 3 1.2.2 Commands.................................................... 4 1.3 DataObjectsandDataTypes........................................ 9 1.3.1 VectorsofCharacterStrings................................. 10 1.3.2 Matrices,Lists,andDataFrames............................ 12 1.4 FunctionsandDebugging ........................................... 15 2 ReadingandTransformingDataFormat .............................. 19 2.1 ReadingandTransformingData .................................... 19 2.1.1 DataLayout................................................... 19 2.1.2 ASimpleQuestionnaireExample........................... 19 2.1.3 OtherWaystoReadinData ................................. 25 2.1.4 OtherWaystoTransformVariables......................... 26 2.1.5 UsingRtoComputeCourseGrades ........................ 30 2.2 ReshapeandMergeDataFrames ................................... 31 2.3 DataManagementwithaSQLDatabase ........................... 33 2.4 SQLDatabaseConsiderations....................................... 35 3 StatisticsforComparingMeansandProportions..................... 39 3.1 ComparingMeansofContinuousVariables........................ 39 3.2 MoreonManualCheckingofData ................................. 42 3.3 ComparingSampleProportions..................................... 43 3.4 ModeratingEffectinloglin()................................... 45 3.5 AssessingChangeofCorrelatedProportions....................... 49 3.5.1 McNemarTestAcrossTwoSamples........................ 50 4 RGraphicsandTrellisPlots............................................. 55 4.1 DefaultBehaviorofBasicCommands.............................. 55 4.2 OtherGraphics....................................................... 56 ix x Contents 4.3 SavingGraphics...................................................... 56 4.4 MultipleFiguresonOneScreen..................................... 57 4.5 OtherGraphicsTricks ............................................... 57 4.6 ExamplesofSimpleGraphsinPublications........................ 58 4.6.1 http://journal.sjdm.org/8827/ jdm8827.pdf.............................................. 60 4.6.2 http://journal.sjdm.org/8814/ jdm8814.pdf.............................................. 63 4.6.3 http://journal.sjdm.org/8801/ jdm8801.pdf.............................................. 64 4.6.4 http://journal.sjdm.org/8319/ jdm8319.pdf.............................................. 65 4.6.5 http://journal.sjdm.org/8221/ jdm8221.pdf.............................................. 66 4.6.6 http://journal.sjdm.org/8210/ jdm8210.pdf.............................................. 68 4.7 ShadedAreasUnderaCurve........................................ 69 4.7.1 Vectorsinpolygon() ..................................... 71 4.8 LatticeGraphics...................................................... 72 5 AnalysisofVariance:Repeated-Measures ............................. 79 5.1 Example1:TwoWithin-SubjectFactors ........................... 79 5.1.1 UnbalancedDesigns ......................................... 83 5.2 Example2:MaxwellandDelaney .................................. 85 5.3 Example3:MoreThanTwoWithin-SubjectFactors .............. 88 5.4 Example 4: A Simpler Design with Only One Within-SubjectVariable ............................................. 89 5.5 Example5:OneBetween,TwoWithin ............................. 89 5.6 OtherUsefulFunctionsforANOVA................................ 91 5.7 GraphicswithErrorBars............................................ 93 5.8 AnotherWaytodoErrorBarsUsingplotCI()...................... 95 5.8.1 UseError()forRepeated-MeasureANOVA............ 96 5.8.2 Sphericity..................................................... 102 5.9 HowtoEstimatetheGreenhouse–GeisserEpsilon? ............... 103 5.9.1 Huynh–FeldtCorrection..................................... 105 6 LinearandLogisticRegression.......................................... 109 6.1 LinearRegression.................................................... 109 6.2 AnApplicationofLinearRegressiononDiamondPricing........ 110 6.2.1 PlottingDataBeforeModelFitting ......................... 111 6.2.2 CheckingModelDistributionalAssumptions............... 114 6.2.3 AssessingModelFit.......................................... 115 6.3 LogisticRegression.................................................. 118 6.4 Log–LinearModels.................................................. 119 6.5 RegressioninVector–MatrixNotation.............................. 120 6.6 CautiononModelOverfitandClassificationErrors ............... 122

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.