
R Visualizations: Derive Meaning from Data PDF

252 Pages·2020·15.535 MB·English

Preview R Visualizations: Derive Meaning from Data

R Visualizations: Derive Meaning from Data

David W. Gerbing
The School of Business
Portland State University

First edition published 2020
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742
and by CRC Press
2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN

© 2020 Taylor & Francis Group, LLC

CRC Press is an imprint of Taylor & Francis Group, LLC

Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, access www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact mpkbookspermissions@tandf.co.uk

Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data
Names: Gerbing, David W., author.
Title: R visualizations: derive meaning from data / David W. Gerbing, The School of Business, Portland State University.
Description: 1st. | Boca Raton: CRC Press, 2020. | Includes bibliographical references and index.
Identifiers: LCCN 2020004865 | ISBN 9781138599635 (hardback) | ISBN 9780429470837 (ebook)
Subjects: LCSH: Information visualization. | R (Computer program language)
Classification: LCC QA76.9.I52 G47 2020 | DDC 001.4/226--dc23
LC record available at https://lccn.loc.gov/2020004865

ISBN: 978-1-138-59963-5 (hbk)
ISBN: 978-0-429-47083-7 (ebk)

Typeset in LM Roman
by Nova Techset Private Limited, Bengaluru & Chennai, India

Contents

Preface

1 Visualize Data
  1.1 Introduction
    1.1.1 Visualization and Analytics
    1.1.2 Open-Source Software for Data Visualization
  1.2 Data
    1.2.1 R Objects
    1.2.2 Employee Data Example
    1.2.3 Types of Variables
    1.2.4 Read Data
    1.2.5 Variable Labels
    1.2.6 Categorical Variables as Factors
    1.2.7 Save the Data Frame

2 Visualization Quick Start
  2.1 Visualization Systems
    2.1.1 Relative Advantages of ggplot2 and lessR
    2.1.2 Grayscale
  2.2 Distribution of a Categorical Variable
    2.2.1 Bar Chart of a Single Variable
    2.2.2 Bar Charts of Multiple Variables
  2.3 Distribution of a Continuous Variable
    2.3.1 Default Histogram
    2.3.2 Beyond the Histogram
  2.4 Relation between Two Variables
    2.4.1 Basic Scatterplot
    2.4.2 Enhanced Scatterplot
  2.5 Distribution of Values over Time
    2.5.1 Time Series
    2.5.2 Multiple Time Series

3 Visualize a Categorical Variable
  3.1 Bars, Dots, and Bubbles
    3.1.1 Horizontal Bar Chart of Counts
    3.1.2 Cleveland Dot Plot of Counts
    3.1.3 Bubble Plot of Counts
    3.1.4 Display Proportions
  3.2 Multiple Plots on a Single Panel
  3.3 Provide the Numerical Values
    3.3.1 Bar Chart of Individual Data Values
    3.3.2 Vertical Long Value Labels
    3.3.3 Cleveland Dot Plot of Individual Data Values
    3.3.4 Visualize Means across Categories
  3.4 Communicate with Bar Fill Color
    3.4.1 Bar Fill Color Bifurcated by Value of Mean Deviations
    3.4.2 Bar Chart of an Ordinal Variable
    3.4.3 Custom Color for Individual Bars
  3.5 Create a Report from Saved Output
  3.6 Part-Whole Visualizations
    3.6.1 Doughnut and Pie Charts
    3.6.2 The Waffle Plot
    3.6.3 The Treemap

4 Visualize a Continuous Variable
  4.1 Histogram
    4.1.1 Binning Continuous Data
    4.1.2 Histogram Artifacts
    4.1.3 Cumulative Histogram
    4.1.4 Frequency Polygon
  4.2 Density Plot
    4.2.1 Enhanced Density Plot
    4.2.2 Overlapping Density Curves
    4.2.3 Rug Plot
    4.2.4 Violin Plot
  4.3 Box Plot
    4.3.1 Classic Box Plot
    4.3.2 Box Plot Adjusted for Asymmetry
  4.4 One-Variable Scatterplot
  4.5 Integrated Violin/Box/Scatterplot
    4.5.1 VBS Plot
    4.5.2 VBS Plot of Likert Data
    4.5.3 Trellis Plots or Facets
  4.6 Pareto Chart

5 Visualize the Relation of Two Continuous Variables
  5.1 Enhance the Scatterplot
    5.1.1 The Ellipse
    5.1.2 Line of Best Fit
    5.1.3 Annotate
  5.2 Consideration of a Third Variable
    5.2.1 Map Data from a Grouping Variable to Aesthetics
    5.2.2 Trellis (Facet) Scatterplots
    5.2.3 Map a Third Continuous Variable into a Visual Aesthetic
    5.2.4 Plot Multiple Variables on the Same Panel
  5.3 Inter-Relations of a Set of Variables
    5.3.1 Scatterplot Matrix
    5.3.2 Heat Map of a Correlation Matrix
  5.4 Scatterplots for Large Data Sets
    5.4.1 Smoothed Scatterplots
    5.4.2 Contoured and Hex-Binned Scatterplots

6 Visualize Multiple Categorical Variables
  6.1 Two Categorical Variables
    6.1.1 Stacked Two-Variable Bar Chart
    6.1.2 Unstacked Two-Variable Bar Chart
    6.1.3 Trellis Plots or Facets
  6.2 Other Styles for the Two-Variable Bar Chart
    6.2.1 Sorted Two-Variable Bar Chart
    6.2.2 Horizontal Bar Chart
    6.2.3 Bar Chart with Legend on the Top
    6.2.4 100% Stacked Bar Chart
    6.2.5 Bar Chart of Means across Two Categorical Variables
    6.2.6 Two-Variable Cleveland Dot Plot
    6.2.7 Paired t-test Visualization
  6.3 Mosaic Plots and Association Plots
    6.3.1 The Mosaic Plot
    6.3.2 Independence and Pearson Residuals
    6.3.3 The Association Plot

7 Visualize over Time
  7.1 Run Chart and Control Chart
    7.1.1 Run Chart
    7.1.2 Control Chart
  7.2 Time Series
    7.2.1 Filled Area Time Series
    7.2.2 Stacked Multiple Time Series
    7.2.3 Formatted Multi-Panel Time Series
    7.2.4 Data Preparation for Date Variables
  7.3 Forecasts
    7.3.1 Time-Series Object
    7.3.2 Seasonal/Trend Decomposition
    7.3.3 Generate a Forecast

8 Visualize Maps and Networks
  8.1 Maps
    8.1.1 Map the World
    8.1.2 Raster Images
    8.1.3 Online Geocode Databases
    8.1.4 Create a Country Map with Cities
    8.1.5 Choropleth Map
  8.2 Network Visualizations
    8.2.1 Network Data
    8.2.2 Visualizations
    8.2.3 Network Analysis
