Advanced R: Data Programming and the Cloud

Matt Wiley
Elkhart Group Ltd. & Victoria College
Columbia City, Indiana, USA

Joshua F. Wiley
Elkhart Group Ltd. & Victoria College
Columbia City, Indiana, USA

ISBN-13 (pbk): 978-1-4842-2076-4
ISBN-13 (electronic): 978-1-4842-2077-1
DOI 10.1007/978-1-4842-2077-1
Library of Congress Control Number: 2016959581

Copyright © 2016 by Matt Wiley and Joshua F. Wiley

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

Trademarked names, logos, and images may appear in this book. Rather than use a trademark symbol with every occurrence of a trademarked name, logo, or image, we use the names, logos, and images only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark.

The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Managing Director: Welmoed Spahr
Lead Editor: Steve Anglin
Technical Reviewer: Andrew Moskowitz
Editorial Board: Steve Anglin, Pramila Balan, Laura Berendson, Aaron Black, Louise Corrigan, Jonathan Gennick, Robert Hutchinson, Celestin Suresh John, Nikhil Karkal, James Markham, Susan McDermott, Matthew Moodie, Natalie Pao, Gwenan Spearing
Coordinating Editor: Mark Powers
Copy Editor: Sharon Wilkey
Compositor: SPi Global
Indexer: SPi Global
Artist: SPi Global

Distributed to the book trade worldwide by Springer Science+Business Media New York, 233 Spring Street, 6th Floor, New York, NY 10013. Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail o [email protected], or visit www.springeronline.com. Apress Media, LLC is a California LLC and the sole member (owner) is Springer Science + Business Media Finance Inc (SSBM Finance Inc). SSBM Finance Inc is a Delaware corporation.

For information on translations, please e-mail [email protected], or visit www.apress.com.

Apress and friends of ED books may be purchased in bulk for academic, corporate, or promotional use. eBook versions and licenses are also available for most titles. For more information, reference our Special Bulk Sales–eBook Licensing web page at www.apress.com/bulk-sales.

Any source code or other supplementary materials referenced by the author in this text are available to readers at www.apress.com. For detailed information about how to locate your book’s source code, go to www.apress.com/source-code/. Readers can also access source code at SpringerLink in the Supplementary Material section for each chapter.

Printed on acid-free paper

To Family.

Contents at a Glance

About the Authors
About the Technical Reviewer
Acknowledgments
Introduction
■ Chapter 1: Programming Basics
■ Chapter 2: Programming Utilities
■ Chapter 3: Programming Automation
■ Chapter 4: Writing Functions
■ Chapter 5: Writing Classes and Methods
■ Chapter 6: Writing a Package
■ Chapter 7: Introduction to Data Management Using data.table
■ Chapter 8: Data Munging with data.table
■ Chapter 9: Other Tools for Data Management
■ Chapter 10: Reading Big Data(bases)
■ Chapter 11: Getting a Cloud
■ Chapter 12: Cloud Ubuntu for Windows Users
■ Chapter 13: Every Cloud has a Shiny Lining
■ Chapter 14: Shiny Dashboard Sampler
■ Chapter 15: Dynamic Reports and the Cloud
■ References
Index

Contents

About the Authors
About the Technical Reviewer
Acknowledgments
Introduction

■ Chapter 1: Programming Basics
  Advanced R Software Choices
  Reproducing Results
  Types of Objects
  Base Operators and Functions
  Mathematical Operators and Functions
  References

■ Chapter 2: Programming Utilities
  Help and Documentation
  System and Files
  Input
  Output
  References

■ Chapter 3: Programming Automation
  Loops
  Flow Control
  *apply Family of Functions
  Final Thoughts

■ Chapter 4: Writing Functions
  Components of a Function
  Scoping
  Functions for Functions
  Debugging
  Summary

■ Chapter 5: Writing Classes and Methods
  S3 System
    S3 Classes
    S3 Methods
  S4 System
    S4 Classes
    S4 Class Inheritance
    S4 Methods
  Summary

■ Chapter 6: Writing a Package
  Before You Get Started
    Version Control
  R Package Basics
    Starting a Package by Using DevTools
    Adding R Code
    Tests
  Documentation Using roxygen2
    Functions
    Data
    Classes
    Methods
  Building, Installing, and Distributing an R Package
  Summary

■ Chapter 7: Introduction to Data Management Using data.table
  Introduction to data.table
  Selecting and Subsetting Data
    Using the First Formal
    Using the Second Formal
    Using the Second and Third Formals
  Variable Renaming and Ordering
  Computing on Data and Creating Variables
  Merging and Reshaping Data
    Merging Data
    Reshaping Data
  Summary

■ Chapter 8: Data Munging with data.table
  Data Munging / Cleaning
    Recoding Data
    Recoding Numeric Values
  Creating New Variables
  Fuzzy Matching
  Summary

■ Chapter 9: Other Tools for Data Management
  Sorting
  Selecting and Subsetting
  Variable Renaming and Ordering
  Computing on Data and Creating Variables
  Merging and Reshaping Data
  Summary

■ Chapter 10: Reading Big Data(bases)
  SQLite
    Installing SQLite on Windows
    SQLite and R
  PostgreSQL
    Installing PostgreSQL on Windows
    PostgreSQL and R
  MongoDB
    Installing MongoDB on Windows
    MongoDB and R
  Summary

■ Chapter 11: Getting a Cloud
  Disclaimers
  Starting Amazon Web Services
  Accessing Your Instance’s Command Line
  Uploading Files to Your Instance
  Final Thoughts

■ Chapter 12: Cloud Ubuntu for Windows Users
  Common Commands
  Superuser and Security
  Installing and Using R
  Installing and Using RStudio Server
  Installing Microsoft R
  Installing Java
  Installing Shiny on Your Cloud
  Final Thoughts

■ Chapter 13: Every Cloud has a Shiny Lining
  The Basics of Shiny
  Shiny in Motion
  Uploading a User File into Shiny
  Hosting Shiny in the Cloud
  Final Thoughts

■ Chapter 14: Shiny Dashboard Sampler
  A Dashboard’s Bones
    Dashboard Header
    Dashboard Sidebar
    Dashboard Body
  Dashboard in the Cloud
  Complete Sampler Code
  References

■ Chapter 15: Dynamic Reports and the Cloud
  Needed Software
    Local Machine
    Cloud Instance
  Dynamic Documents
  Dynamic Documents and Shiny
    server.R
    ui.R
    report.Rmd
  Uploading to the Cloud
  Summary

■ References

Index