Getting Started with R
An Introduction for Biologists
Second Edition

ANDREW P. BECKERMAN
DYLAN Z. CHILDS
Department of Animal and Plant Sciences, University of Sheffield

OWEN L. PETCHEY
Department of Evolutionary Biology and Environmental Studies, University of Zurich

Great Clarendon Street, Oxford, OX2 6DP, United Kingdom

Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries

© Andrew Beckerman, Dylan Childs, & Owen Petchey 2017
The moral rights of the authors have been asserted
First Edition published in 2012
Second Edition published in 2017
Impression: 1

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by licence or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above

You must not circulate this work in any other form and you must impose this same condition on any acquirer

Published in the United States of America by Oxford University Press
198 Madison Avenue, New York, NY 10016, United States of America

British Library Cataloguing in Publication Data
Data available

Library of Congress Control Number: 2016946804
ISBN 978–0–19–878783–9 (hbk.)
ISBN 978–0–19–878784–6 (pbk.)
DOI 10.1093/acprof:oso/9780198787839.001.0001

Printed and bound by CPI Litho (UK) Ltd, Croydon, CR0 4YY

Links to third party websites are provided by Oxford in good faith and for information only. Oxford disclaims any responsibility for the materials contained in any third party website referenced in this work.

Contents

Preface
  Introduction to the second edition
  What this book is about
  How the book is organized
  Why R?
  Updates
  Acknowledgements

Chapter 1: Getting and Getting Acquainted with R
  1.1 Getting started
  1.2 Getting R
  1.3 Getting RStudio
  1.4 Let’s play
  1.5 Using R as a giant calculator (the size of your computer)
  1.6 Your first script
  1.7 Intermezzo remarks
  1.8 Important functionality: packages
  1.9 Getting help
  1.10 A mini-practical: some in-depth play
  1.11 Some more top tips and hints for a successful first (and more) R experience
  Appendix 1a Mini-tutorial solutions
  Appendix 1b File extensions and operating systems

Chapter 2: Getting Your Data into R
  2.1 Getting data ready for R
  2.2 Getting your data into R
  2.3 Checking that your data are your data
  2.4 Basic troubleshooting while importing data
  2.5 Summing up
  Appendix Advanced activity: dealing with untidy data

Chapter 3: Data Management, Manipulation, and Exploration with dplyr
  3.1 Summary statistics for each variable
  3.2 dplyr verbs
  3.3 Subsetting
  3.4 Transforming
  3.5 Sorting
  3.6 Mini-summary and two top tips
  3.7 Calculating summary statistics about groups of your data
  3.8 What have you learned ... lots
  Appendix 3a Comparing classic methods and dplyr
  Appendix 3b Advanced dplyr

Chapter 4: Visualizing Your Data
  4.1 The first step in every data analysis: making a picture
  4.2 ggplot2: a grammar for graphics
  4.3 Box-and-whisker plots
  4.4 Distributions: making histograms of numeric variables
  4.5 Saving your graphs for presentation, documents, etc.
  4.6 Closing remarks

Chapter 5: Introducing Statistics in R
  5.1 Getting started doing statistics in R
  5.2 χ² contingency table analysis
  5.3 Two-sample t-test
  5.4 Introducing ... linear models
  5.5 Simple linear regression
  5.6 Analysis of variance: the one-way ANOVA
  5.7 Wrapping up
  Appendix Getting packages not on CRAN

Chapter 6: Advancing Your Statistics in R
  6.1 Getting started with more advanced statistics
  6.2 The two-way ANOVA
  6.3 Analysis of covariance (ANCOVA)
  6.4 Overview: an analysis workflow

Chapter 7: Getting Started with Generalized Linear Models
  7.1 Introduction
  7.2 Counts and rates: Poisson GLMs
  7.3 Doing it wrong
  7.4 Doing it right: the Poisson GLM
  7.5 When a Poisson GLM isn’t good for counts
  7.6 Summary, and beyond simple Poisson regression

Chapter 8: Pimping Your Plots: Scales and Themes in ggplot2
  8.1 What you already know about graphs
  8.2 Preparation
  8.3 What you may want to customize
  8.4 Axis labels, axis limits, and annotation
  8.5 Scales
  8.6 The theme
  8.7 Summing up

Chapter 9: Closing Remarks: Final Comments and Encouragement

General Appendices
  Appendix 1 Data Sources
  Appendix 2 Further Reading
  Appendix 3 R Markdown

Index

Preface

Introduction to the second edition

This is a book about how to use R, an open source programming language and environment for statistics. It is not a book about statistics per se, but a book about getting started using R. It is a book that we hope will teach you how using R can make your life (research career) easier.

Several years ago we published the first edition of this book, aiming to help people move from ‘hearing about R’ to ‘using R’.
We had realized that there were lots of books about exploring data and doing statistics with R, but none specifically designed for people that didn’t have a lot of experience or confidence in using much more than a spreadsheet, people that didn’t have a lot of time, and people that appreciated an engaging and sometimes humorous initial journey into R. The first edition was also designed for people who did know statistics and other packages, but wanted a quick ‘getting started’ guide, because, well, it is hard to get started with R in some ways. Overall, we aimed to make the somewhat steep learning curve more of a walk in the park.

Over the past five years much has changed. Most significantly, R has evolved as a platform for doing data analysis, for managing data, and for producing figures. Other things have not changed. People still seem to need and appreciate help in navigating the process of getting started working with R. Thus, this new version of the book does two things. It retains