
Programming Skills for Data Science: Start Writing Code to Wrangle, Analyze, and Visualize Data with R PDF

385 Pages·1·16.123 MB·English

Preview Programming Skills for Data Science: Start Writing Code to Wrangle, Analyze, and Visualize Data with R

The Pearson Addison-Wesley Data and Analytics Series

Visit informit.com/awdataseries for a complete list of available publications.

The Pearson Addison-Wesley Data and Analytics Series provides readers with practical knowledge for solving problems and answering questions with data. Titles in this series primarily focus on three areas:

1. Infrastructure: how to store, move, and manage data
2. Algorithms: how to mine intelligence or make predictions based on data
3. Visualizations: how to represent data and insights in a meaningful and compelling way

The series aims to tie all three of these areas together to help the reader build end-to-end systems for fighting spam; making recommendations; building personalization; detecting trends, patterns, or problems; and gaining insight from the data exhaust of systems and user interactions.

Make sure to connect with us! informit.com/socialconnect

Programming Skills for Data Science
Start Writing Code to Wrangle, Analyze, and Visualize Data with R

Michael Freeman
Joel Ross

Boston • Columbus • New York • San Francisco • Amsterdam • Cape Town • Dubai • London • Madrid • Milan • Munich • Paris • Montreal • Toronto • Delhi • Mexico City • São Paulo • Sydney • Hong Kong • Seoul • Singapore • Taipei • Tokyo

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals.

The authors and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

For information about buying this title in bulk quantities, or for special sales opportunities (which may include electronic versions; custom cover designs; and content particular to your business, training goals, marketing focus, or branding interests), please contact our corporate sales department at [email protected] or (800) 382-3419.

For government sales inquiries, please contact [email protected].

For questions about sales outside the U.S., please contact [email protected].

Visit us on the Web: informit.com/aw

Library of Congress Control Number: 2018953978

Copyright © 2019 Pearson Education, Inc.

All rights reserved. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. For information regarding permissions, request forms and the appropriate contacts within the Pearson Education Global Rights & Permissions Department, please visit www.pearsoned.com/permissions/.

ISBN-13: 978-0-13-513310-1
ISBN-10: 0-13-513310-6

To our students who challenged us to develop better resources, and our families who supported us in the process.

Contents

Foreword
Preface
Acknowledgments
About the Authors

I: Getting Started

1 Setting Up Your Computer
  1.1 Setting up Command Line Tools
  1.2 Installing git
  1.3 Creating a GitHub Account
  1.4 Selecting a Text Editor
  1.5 Downloading the R Language
  1.6 Downloading RStudio

2 Using the Command Line
  2.1 Accessing the Command Line
  2.2 Navigating the File System
  2.3 Managing Files
  2.4 Dealing with Errors
  2.5 Directing Output
  2.6 Networking Commands

II: Managing Projects

3 Version Control with git and GitHub
  3.1 What Is git?
  3.2 Configuration and Project Setup
  3.3 Tracking Project Changes
  3.4 Storing Projects on GitHub
  3.5 Accessing Project History
  3.6 Ignoring Files from a Project

4 Using Markdown for Documentation
  4.1 Writing Markdown
  4.2 Rendering Markdown

III: Foundational R Skills

5 Introduction to R
  5.1 Programming with R
  5.2 Running R Code
  5.3 Including Comments
  5.4 Defining Variables
  5.5 Getting Help

6 Functions
  6.1 What Is a Function?
  6.2 Built-in R Functions
  6.3 Loading Functions
  6.4 Writing Functions
  6.5 Using Conditional Statements

7 Vectors
  7.1 What Is a Vector?
  7.2 Vectorized Operations
  7.3 Vector Indices
  7.4 Vector Filtering
  7.5 Modifying Vectors

8 Lists
  8.1 What Is a List?
  8.2 Creating Lists
  8.3 Accessing List Elements
  8.4 Modifying Lists
  8.5 Applying Functions to Lists with lapply()

IV: Data Wrangling

9 Understanding Data
  9.1 The Data Generation Process
  9.2 Finding Data
  9.3 Types of Data
  9.4 Interpreting Data
  9.5 Using Data to Answer Questions

10 Data Frames
  10.1 What Is a Data Frame?
  10.2 Working with Data Frames
  10.3 Working with CSV Data

11 Manipulating Data with dplyr
  11.1 A Grammar of Data Manipulation
  11.2 Core dplyr Functions
  11.3 Performing Sequential Operations
  11.4 Analyzing Data Frames by Group
  11.5 Joining Data Frames Together
  11.6 dplyr in Action: Analyzing Flight Data

12 Reshaping Data with tidyr
  12.1 What Is “Tidy” Data?
  12.2 From Columns to Rows: gather()
  12.3 From Rows to Columns: spread()
  12.4 tidyr in Action: Exploring Educational Statistics

13 Accessing Databases
  13.1 An Overview of Relational Databases
  13.2 A Taste of SQL
  13.3 Accessing a Database from R

14 Accessing Web APIs
  14.1 What Is a Web API?
  14.2 RESTful Requests
  14.3 Accessing Web APIs from R
  14.4 Processing JSON Data
  14.5 APIs in Action: Finding Cuban Food in Seattle

V: Data Visualization

15 Designing Data Visualizations
  15.1 The Purpose of Visualization
  15.2 Selecting Visual Layouts
  15.3 Choosing Effective Graphical Encodings
  15.4 Expressive Data Displays
  15.5 Enhancing Aesthetics
