ebook img

R in Action, 2nd Edition: Data analysis and graphics with R PDF

608 Pages·2015·0.88 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview R in Action, 2nd Edition: Data analysis and graphics with R

SECOND EDITION IN ACTION Data analysis and graphics with R Robert I. Kabacoff M A N N I N G www.it-ebooks.info Praise for the First Edition Lucid and engaging—this is without doubt the fun way to learn R! —Amos A. Folarin, University College London Be prepared to quickly raise the bar with the sheer quality that R can produce. —Patrick Breen, Rogers Communications Inc. An excellent introduction and reference on R from the author of the best R website. —Christopher Williams, University of Idaho Thorough and readable. A great R companion for the student or researcher. —Samuel McQuillin, University of South Carolina Finally, a comprehensive introduction to R for programmers. —Philipp K. Janert, Author of Gnuplot in Action Essential reading for anybody moving to R for the first time. —Charles Malpas, University of Melbourne One of the quickest routes to R proficiency. You can buy the book on Friday and have a working program by Monday. —Elizabeth Ostrowski, Baylor College of Medicine One usually buys a book to solve the problems they know they have. This book solves problems you didn't know you had. —Carles Fenollosa, Barcelona Supercomputing Center Clear, precise, and comes with a lot of explanations and examples…the book can be used by beginners and professionals alike, and even for teaching R! —Atef Ouni, Tunisian National Institute of Statistics A great balance of targeted tutorials and in-depth examples. —Landon Cox, 360VL Inc. www.it-ebooks.info ii www.it-ebooks.info R in Action SECOND EDITION Data analysis and graphics with R ROBERT I. KABACOFF MANNING SHELTER ISLAND www.it-ebooks.info iv For online information and ordering of this and other Manning books, please visit www.manning.com. The publisher offers discounts on this book when ordered in quantity. For more information, please contact Special Sales Department Manning Publications Co. 20 Baldwin Road PO Box 761 Shelter Island, NY 11964 Email: [email protected] ©2015 by Manning Publications Co. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps. Recognizing the importance of preserving what has been written, it is Manning’s policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without elemental chlorine. Manning Publications Co. Development editor: Jennifer Stout 20 Baldwin Road Copyeditor: Tiffany Taylor PO Box 761 Proofreader: Toma Mulligan Shelter Island, NY 11964 Typesetter: Marija Tudor Cover designer: Marija Tudor ISBN: 9781617291388 Printed in the United States of America 1 2 3 4 5 6 7 8 9 10 – EBM – 20 19 18 17 16 15 www.it-ebooks.info brief contents PART 1 GETTING STARTED ...................................................... 1 1 ■ Introduction to R 3 2 ■ Creating a dataset 20 3 ■ Getting started with graphs 46 4 ■ Basic data management 71 5 ■ Advanced data management 89 PART 2 BASIC METHODS ...................................................... 115 6 ■ Basic graphs 117 7 ■ Basic statistics 137 PART 3 INTERMEDIATE METHODS ........................................ 165 8 ■ Regression 167 9 ■ Analysis of variance 212 10 ■ Power analysis 239 11 ■ Intermediate graphs 255 12 ■ Resampling statistics and bootstrapping 279 v www.it-ebooks.info vi BRIEF CONTENTS PART 4 ADVANCED METHODS ............................................... 299 13 ■ Generalized linear models 301 14 ■ Principal components and factor analysis 319 15 ■ Time series 340 16 ■ Cluster analysis 369 17 ■ Classification 389 18 ■ Advanced methods for missing data 414 PART 5 EXPANDING YOUR SKILLS ......................................... 435 19 ■ Advanced graphics with ggplot2 437 20 ■ Advanced programming 463 21 ■ Creating a package 491 22 ■ Creating dynamic reports 513 23 ■ Advanced graphics with the lattice package 1 online only www.it-ebooks.info contents preface xvii acknowledgments xix about this book xxi about the cover illustration xxvii PART 1 GETTING STARTED ........................................... 1 1 Introduction to R 3 1.1 Why use R? 5 1.2 Obtaining and installing R 7 1.3 Working with R 7 Getting started 8 ■ Getting help 10 ■ The workspace 11 Input and output 13 1.4 Packages 15 What are packages? 15 ■ Installing a package 15 Loading a package 15 ■ Learning about a package 16 1.5 Batch processing 16 1.6 Using output as input: reusing results 17 1.7 Working with large datasets 17 vii www.it-ebooks.info viii CONTENTS 1.8 Working through an example 18 1.9 Summary 19 2 Creating a dataset 20 2.1 Understanding datasets 21 2.2 Data structures 22 Vectors 22 ■ Matrices 23 ■ Arrays 24 ■ Data frames 25 Factors 28 ■ Lists 30 2.3 Data input 32 Entering data from the keyboard 33 ■ Importing data from a delimited text file 34 ■ Importing data from Excel 37 Importing data from XML 38 ■ Importing data from the web 38 ■ Importing data from SPSS 38 ■ Importing data from SAS 39 ■ Importing data from Stata 40 ■ Importing data from NetCDF 40 ■ Importing data from HDF5 40 Accessing database management systems (DBMSs) 40 Importing data via Stat/Transfer 42 2.4 Annotating datasets 43 Variable labels 43 ■ Value labels 43 2.5 Useful functions for working with data objects 43 2.6 Summary 44 3 Getting started with graphs 46 3.1 Working with graphs 47 3.2 A simple example 49 3.3 Graphical parameters 50 Symbols and lines 51 ■ Colors 52 ■ Text characteristics 53 Graph and margin dimensions 54 3.4 Adding text, customized axes, and legends 56 Titles 56 ■ Axes 57 ■ Reference lines 60 ■ Legend 60 Text annotations 61 ■ Math annotations 63 3.5 Combining graphs 64 Creating a figure arrangement with fine control 68 3.6 Summary 70 4 Basic data management 71 4.1 A working example 71 4.2 Creating new variables 73 www.it-ebooks.info CONTENTS ix 4.3 Recoding variables 75 4.4 Renaming variables 76 4.5 Missing values 77 Recoding values to missing 78 ■ Excluding missing values from analyses 78 4.6 Date values 79 Converting dates to character variables 81 ■ Going further 81 4.7 Type conversions 81 4.8 Sorting data 82 4.9 Merging datasets 83 Adding columns to a data frame 83 ■ Adding rows to a data frame 84 4.10 Subsetting datasets 84 Selecting (keeping) variables 84 ■ Excluding (dropping) variables 84 ■ Selecting observations 85 ■ The subset() function 86 ■ Random samples 87 4.11 Using SQL statements to manipulate data frames 87 4.12 Summary 88 5 Advanced data management 89 5.1 A data-management challenge 90 5.2 Numerical and character functions 91 Mathematical functions 91 ■ Statistical functions 92 Probability functions 94 ■ Character functions 97 Other useful functions 98 ■ Applying functions to matrices and data frames 99 5.3 A solution for the data-management challenge 101 5.4 Control flow 105 Repetition and looping 105 ■ Conditional execution 106 5.5 User-written functions 107 5.6 Aggregation and reshaping 109 Transpose 110 ■ Aggregating data 110 ■ The reshape2 package 111 5.7 Summary 113 www.it-ebooks.info

Description:
R in Action, Second Edition presents both the R language and the examples that make it so useful for business developers. Focusing on practical solutions, the book offers a crash course in statistics and covers elegant methods for dealing with messy and incomplete data that are difficult to analyze
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.