
The R Book PDF

1072 Pages·2012·9.65 MB·English

Preview The R Book

The R Book
Second Edition

Michael J. Crawley
Imperial College London at Silwood Park, UK
http://www.bio.ic.ac.uk/research/mjcraw/therbook/index.htm

A John Wiley & Sons, Ltd., Publication

This edition first published 2013
© 2013 John Wiley & Sons, Ltd

Registered office
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, United Kingdom

For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com.

The right of the author to be identified as the author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book. This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.

Library of Congress Cataloging-in-Publication Data
Crawley, Michael J.
The R book / Michael J. Crawley. – 2e.
pages cm
Includes bibliographical references and index.
ISBN 978-0-470-97392-9 (hardback)
1. R (Computer program language) 2. Mathematical statistics – Data processing. I. Title.
QA276.45.R3C73 2013
519.50285′5133–dc23
2012027339

A catalogue record for this book is available from the British Library.
ISBN: 978-0-470-97392-9
Set in 10/12pt Times by Aptara Inc., New Delhi, India.

Chapters

Preface
1 Getting Started
2 Essentials of the R Language
3 Data Input
4 Dataframes
5 Graphics
6 Tables
7 Mathematics
8 Classical Tests
9 Statistical Modelling
10 Regression
11 Analysis of Variance
12 Analysis of Covariance
13 Generalized Linear Models
14 Count Data
15 Count Data in Tables
16 Proportion Data
17 Binary Response Variables
18 Generalized Additive Models
19 Mixed-Effects Models
20 Non-Linear Regression
21 Meta-Analysis
22 Bayesian Statistics
23 Tree Models
24 Time Series Analysis
25 Multivariate Statistics
26 Spatial Statistics
27 Survival Analysis
28 Simulation Models
29 Changing the Look of Graphics
References and Further Reading
Index

Detailed Contents

Preface

1 Getting Started
  1.1 How to use this book
    1.1.1 Beginner in both computing and statistics
    1.1.2 Student needing help with project work
    1.1.3 Done some R and some statistics, but keen to learn more of both
    1.1.4 Done regression and ANOVA, but want to learn more advanced statistical modelling
    1.1.5 Experienced in statistics, but a beginner in R
    1.1.6 Experienced in computing, but a beginner in R
    1.1.7 Familiar with statistics and computing, but need a friendly reference manual
  1.2 Installing R
  1.3 Running R
  1.4 The Comprehensive R Archive Network
    1.4.1 Manuals
    1.4.2 Frequently asked questions
    1.4.3 Contributed documentation
  1.5 Getting help in R
    1.5.1 Worked examples of functions
    1.5.2 Demonstrations of R functions
  1.6 Packages in R
    1.6.1 Contents of packages
    1.6.2 Installing packages
  1.7 Command line versus scripts
  1.8 Data editor
  1.9 Changing the look of the R screen
  1.10 Good housekeeping
  1.11 Linking to other computer languages

2 Essentials of the R Language
  2.1 Calculations
    2.1.1 Complex numbers in R
    2.1.2 Rounding
    2.1.3 Arithmetic
    2.1.4 Modulo and integer quotients
    2.1.5 Variable names and assignment
    2.1.6 Operators
    2.1.7 Integers
    2.1.8 Factors
  2.2 Logical operations
    2.2.1 TRUE and T with FALSE and F
    2.2.2 Testing for equality with real numbers
    2.2.3 Equality of floating point numbers using all.equal
    2.2.4 Summarizing differences between objects using all.equal
    2.2.5 Evaluation of combinations of TRUE and FALSE
    2.2.6 Logical arithmetic
  2.3 Generating sequences
    2.3.1 Generating repeats
    2.3.2 Generating factor levels
  2.4 Membership: Testing and coercing in R
  2.5 Missing values, infinity and things that are not numbers
    2.5.1 Missing values: NA
  2.6 Vectors and subscripts
    2.6.1 Extracting elements of a vector using subscripts
    2.6.2 Classes of vector
    2.6.3 Naming elements within vectors
    2.6.4 Working with logical subscripts
  2.7 Vector functions
    2.7.1 Obtaining tables of means using tapply
    2.7.2 The aggregate function for grouped summary statistics
    2.7.3 Parallel minima and maxima: pmin and pmax
    2.7.4 Summary information from vectors by groups
    2.7.5 Addresses within vectors
    2.7.6 Finding closest values
    2.7.7 Sorting, ranking and ordering
    2.7.8 Understanding the difference between unique and duplicated
    2.7.9 Looking for runs of numbers within vectors
    2.7.10 Sets: union, intersect and setdiff
  2.8 Matrices and arrays
    2.8.1 Matrices
    2.8.2 Naming the rows and columns of matrices
    2.8.3 Calculations on rows or columns of the matrix
    2.8.4 Adding rows and columns to the matrix
    2.8.5 The sweep function
    2.8.6 Applying functions with apply, sapply and lapply
    2.8.7 Using the max.col function
    2.8.8 Restructuring a multi-dimensional array using aperm
  2.9 Random numbers, sampling and shuffling
    2.9.1 The sample function
  2.10 Loops and repeats
    2.10.1 Creating the binary representation of a number
    2.10.2 Loop avoidance
    2.10.3 The slowness of loops
    2.10.4 Do not ‘grow’ data sets by concatenation or recursive function calls
    2.10.5 Loops for producing time series
  2.11 Lists
    2.11.1 Lists and lapply
    2.11.2 Manipulating and saving lists
  2.12 Text, character strings and pattern matching
    2.12.1 Pasting character strings together
    2.12.2 Extracting parts of strings
    2.12.3 Counting things within strings
    2.12.4 Upper- and lower-case text
    2.12.5 The match function and relational databases
    2.12.6 Pattern matching
    2.12.7 Dot . as the ‘anything’ character
    2.12.8 Substituting text within character strings
    2.12.9 Locations of a pattern within a vector using regexpr
    2.12.10 Using %in% and which
    2.12.11 More on pattern matching
    2.12.12 Perl regular expressions
    2.12.13 Stripping patterned text out of complex strings
  2.13 Dates and times in R
    2.13.1 Reading time data from files
    2.13.2 The strptime function
    2.13.3 The difftime function
    2.13.4 Calculations with dates and times
    2.13.5 The difftime and as.difftime functions
    2.13.6 Generating sequences of dates
    2.13.7 Calculating time differences between the rows of a dataframe
    2.13.8 Regression using dates and times
    2.13.9 Summary of dates and times in R
  2.14 Environments
    2.14.1 Using with rather than attach
    2.14.2 Using attach in this book
  2.15 Writing R functions
    2.15.1 Arithmetic mean of a single sample
    2.15.2 Median of a single sample
    2.15.3 Geometric mean
    2.15.4 Harmonic mean
    2.15.5 Variance
    2.15.6 Degrees of freedom
    2.15.7 Variance ratio test
    2.15.8 Using variance
    2.15.9 Deparsing: A graphics function for error bars
    2.15.10 The switch function
    2.15.11 The evaluation environment of a function
    2.15.12 Scope
    2.15.13 Optional arguments
    2.15.14 Variable numbers of arguments (...)
    2.15.15 Returning values from a function
    2.15.16 Anonymous functions
    2.15.17 Flexible handling of arguments to functions
    2.15.18 Structure of an object: str
  2.16 Writing from R to file
    2.16.1 Saving your work
    2.16.2 Saving history
    2.16.3 Saving graphics
    2.16.4 Saving data produced within R to disc
    2.16.5 Pasting into an Excel spreadsheet
    2.16.6 Writing an Excel readable file from R
  2.17 Programming tips

3 Data Input
  3.1 Data input from the keyboard
  3.2 Data input from files
    3.2.1 The working directory
    3.2.2 Data input using read.table
    3.2.3 Common errors when using read.table
    3.2.4 Separators and decimal points
    3.2.5 Data input directly from the web
  3.3 Input from files using scan
    3.3.1 Reading a dataframe with scan
    3.3.2 Input from more complex file structures using scan
  3.4 Reading data from a file using readLines
    3.4.1 Input a dataframe using readLines
    3.4.2 Reading non-standard files using readLines
  3.5 Warnings when you attach the dataframe
  3.6 Masking
  3.7 Input and output formats
  3.8 Checking files from the command line
  3.9 Reading dates and times from files
  3.10 Built-in data files
  3.11 File paths
  3.12 Connections
  3.13 Reading data from an external database
    3.13.1 Creating the DSN for your computer
    3.13.2 Setting up R to read from the database

4 Dataframes
  4.1 Subscripts and indices
  4.2 Selecting rows from the dataframe at random
  4.3 Sorting dataframes
  4.4 Using logical conditions to select rows from the dataframe
  4.5 Omitting rows containing missing values, NA
    4.5.1 Replacing NAs with zeros
  4.6 Using order and !duplicated to eliminate pseudoreplication
  4.7 Complex ordering with mixed directions
  4.8 A dataframe with row names instead of row numbers
  4.9 Creating a dataframe from another kind of object
  4.10 Eliminating duplicate rows from a dataframe
  4.11 Dates in dataframes
  4.12 Using the match function in dataframes
  4.13 Merging two dataframes
  4.14 Adding margins to a dataframe
  4.15 Summarizing the contents of dataframes

5 Graphics
  5.1 Plots with two variables
  5.2 Plotting with two continuous explanatory variables: Scatterplots
    5.2.1 Plotting symbols: pch
    5.2.2 Colour for symbols in plots
    5.2.3 Adding text to scatterplots
    5.2.4 Identifying individuals in scatterplots
    5.2.5 Using a third variable to label a scatterplot
    5.2.6 Joining the dots
    5.2.7 Plotting stepped lines
  5.3 Adding other shapes to a plot
    5.3.1 Placing items on a plot with the cursor, using the locator function
    5.3.2 Drawing more complex shapes with polygon
  5.4 Drawing mathematical functions
    5.4.1 Adding smooth parametric curves to a scatterplot
    5.4.2 Fitting non-parametric curves through a scatterplot
  5.5 Shape and size of the graphics window
  5.6 Plotting with a categorical explanatory variable
    5.6.1 Boxplots with notches to indicate significant differences
    5.6.2 Barplots with error bars
    5.6.3 Plots for multiple comparisons
    5.6.4 Using colour palettes with categorical explanatory variables
  5.7 Plots for single samples
    5.7.1 Histograms and bar charts
    5.7.2 Histograms
    5.7.3 Histograms of integers
    5.7.4 Overlaying histograms with smooth density functions
    5.7.5 Density estimation for continuous variables
    5.7.6 Index plots
    5.7.7 Time series plots
    5.7.8 Pie charts
    5.7.9 The stripchart function
    5.7.10 A plot to test for normality
  5.8 Plots with multiple variables
    5.8.1 The pairs function
    5.8.2 The coplot function
    5.8.3 Interaction plots

Description:
Hugely successful and popular text presenting an extensive and comprehensive guide for all R users. The R language is recognized as one of the most powerful and flexible statistical software packages, enabling users to apply many statistical techniques that would be impossible without such software t