Advanced R
Data Programming and the Cloud
Matt Wiley
Joshua F. Wiley
Advanced R: Data Programming and the Cloud
Matt Wiley
Elkhart Group Ltd. & Victoria College
Columbia City, Indiana
USA
Joshua F. Wiley
Elkhart Group Ltd. & Victoria College
Columbia City, Indiana
USA
ISBN-13 (pbk): 978-1-4842-2076-4
ISBN-13 (electronic): 978-1-4842-2077-1
DOI 10.1007/978-1-4842-2077-1
Library of Congress Control Number: 2016959581
Copyright © 2016 by Matt Wiley and Joshua F. Wiley
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now
known or hereafter developed.
Trademarked names, logos, and images may appear in this book. Rather than use a trademark symbol with
every occurrence of a trademarked name, logo, or image, we use the names, logos, and images only in an
editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are
not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to
proprietary rights.
While the advice and information in this book are believed to be true and accurate at the date of publication,
neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or
omissions that may be made. The publisher makes no warranty, express or implied, with respect to the
material contained herein.
Managing Director: Welmoed Spahr
Lead Editor: Steve Anglin
Technical Reviewer: Andrew Moskowitz
Editorial Board: Steve Anglin, Pramila Balan, Laura Berendson, Aaron Black, Louise Corrigan,
Jonathan Gennick, Robert Hutchinson, Celestin Suresh John, Nikhil Karkal, James Markham,
Susan McDermott, Matthew Moodie, Natalie Pao, Gwenan Spearing
Coordinating Editor: Mark Powers
Copy Editor: Sharon Wilkey
Compositor: SPi Global
Indexer: SPi Global
Artist: SPi Global
Distributed to the book trade worldwide by Springer Science+Business Media New York, 233 Spring Street,
6th Floor, New York, NY 10013. Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail orders-ny@springer-sbm.com,
or visit www.springeronline.com. Apress Media, LLC is a California LLC and the sole member (owner) is Springer
Science + Business Media Finance Inc (SSBM Finance Inc). SSBM Finance Inc is a Delaware corporation.
For information on translations, please e-mail rights@apress.com, or visit www.apress.com.
Apress and friends of ED books may be purchased in bulk for academic, corporate, or promotional use.
eBook versions and licenses are also available for most titles. For more information, reference our Special
Bulk Sales–eBook Licensing web page at www.apress.com/bulk-sales.
Any source code or other supplementary materials referenced by the author in this text are available to
readers at www.apress.com. For detailed information about how to locate your book’s source code, go to
www.apress.com/source-code/. Readers can also access source code at SpringerLink in the Supplementary
Material section for each chapter.
Printed on acid-free paper
To Family.
Contents at a Glance
About the Authors
About the Technical Reviewer
Acknowledgments
Introduction
■ Chapter 1: Programming Basics
■ Chapter 2: Programming Utilities
■ Chapter 3: Programming Automation
■ Chapter 4: Writing Functions
■ Chapter 5: Writing Classes and Methods
■ Chapter 6: Writing a Package
■ Chapter 7: Introduction to Data Management Using data.table
■ Chapter 8: Data Munging with data.table
■ Chapter 9: Other Tools for Data Management
■ Chapter 10: Reading Big Data(bases)
■ Chapter 11: Getting a Cloud
■ Chapter 12: Cloud Ubuntu for Windows Users
■ Chapter 13: Every Cloud has a Shiny Lining
■ Chapter 14: Shiny Dashboard Sampler
■ Chapter 15: Dynamic Reports and the Cloud
■ References
Index
Contents
About the Authors
About the Technical Reviewer
Acknowledgments
Introduction
■ Chapter 1: Programming Basics
  Advanced R Software Choices
  Reproducing Results
  Types of Objects
  Base Operators and Functions
  Mathematical Operators and Functions
  References
■ Chapter 2: Programming Utilities
  Help and Documentation
  System and Files
  Input
  Output
  References
■ Chapter 3: Programming Automation
  Loops
  Flow Control
  *apply Family of Functions
  Final Thoughts
■ Chapter 4: Writing Functions
  Components of a Function
  Scoping
  Functions for Functions
  Debugging
  Summary
■ Chapter 5: Writing Classes and Methods
  S3 System
    S3 Classes
    S3 Methods
  S4 System
    S4 Classes
    S4 Class Inheritance
    S4 Methods
  Summary
■ Chapter 6: Writing a Package
  Before You Get Started
    Version Control
  R Package Basics
    Starting a Package by Using DevTools
    Adding R Code
    Tests
  Documentation Using roxygen2
    Functions
    Data
    Classes
    Methods
  Building, Installing, and Distributing an R Package
  Summary
■ Chapter 7: Introduction to Data Management Using data.table
  Introduction to data.table
  Selecting and Subsetting Data
    Using the First Formal
    Using the Second Formal
    Using the Second and Third Formals
  Variable Renaming and Ordering
  Computing on Data and Creating Variables
  Merging and Reshaping Data
    Merging Data
    Reshaping Data
  Summary
■ Chapter 8: Data Munging with data.table
  Data Munging / Cleaning
    Recoding Data
    Recoding Numeric Values
  Creating New Variables
  Fuzzy Matching
  Summary
■ Chapter 9: Other Tools for Data Management
  Sorting
  Selecting and Subsetting
  Variable Renaming and Ordering
  Computing on Data and Creating Variables
  Merging and Reshaping Data
  Summary
■ Chapter 10: Reading Big Data(bases)
  SQLite
    Installing SQLite on Windows
    SQLite and R
  PostgreSQL
    Installing PostgreSQL on Windows
    PostgreSQL and R
  MongoDB
    Installing MongoDB on Windows
    MongoDB and R
  Summary
■ Chapter 11: Getting a Cloud
  Disclaimers
  Starting Amazon Web Services
  Accessing Your Instance’s Command Line
  Uploading Files to Your Instance
  Final Thoughts
■ Chapter 12: Cloud Ubuntu for Windows Users
  Common Commands
  Superuser and Security
  Installing and Using R
  Installing and Using RStudio Server
  Installing Microsoft R
  Installing Java
  Installing Shiny on Your Cloud
  Final Thoughts
■ Chapter 13: Every Cloud has a Shiny Lining
  The Basics of Shiny
  Shiny in Motion
  Uploading a User File into Shiny
  Hosting Shiny in the Cloud
  Final Thoughts
■ Chapter 14: Shiny Dashboard Sampler
  A Dashboard’s Bones
    Dashboard Header
    Dashboard Sidebar
    Dashboard Body
  Dashboard in the Cloud
  Complete Sampler Code
  References
■ Chapter 15: Dynamic Reports and the Cloud
  Needed Software
    Local Machine
    Cloud Instance
  Dynamic Documents
  Dynamic Documents and Shiny
    server.R
    ui.R
    report.Rmd
  Uploading to the Cloud
  Summary
■ References
Index