
Data Wrangling with JavaScript PDF

432 Pages·2019·14.325 MB·English

Preview Data Wrangling with JavaScript

Data Wrangling with JavaScript
Ashley Davis
Manning, Shelter Island

For online information and ordering of this and other Manning books, please visit www.manning.com. The publisher offers discounts on this book when ordered in quantity. For more information, please contact

Special Sales Department
Manning Publications Co.
20 Baldwin Road
PO Box 761
Shelter Island, NY 11964
Email: [email protected]

©2019 by Manning Publications Co. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.

∞ Recognizing the importance of preserving what has been written, it is Manning’s policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.

Manning Publications Co.
20 Baldwin Road
PO Box 761
Shelter Island, NY 11964

Development editor: Helen Stergius
Technical development editor: Luis Atencio
Review editor: Ivan Martinović
Project manager: Deirdre Hiam
Copy editor: Katie Petito
Proofreader: Charles Hutchinson
Technical proofreader: Kathleen Estrada
Typesetting: Happenstance Type-O-Rama
Cover designer: Marija Tudor

ISBN 9781617294846
Printed in the United States of America
1 2 3 4 5 6 7 8 9 10 – SP – 23 22 21 20 19 18

brief contents

1 ■ Getting started: establishing your data pipeline
2 ■ Getting started with Node.js
3 ■ Acquisition, storage, and retrieval
4 ■ Working with unusual data
5 ■ Exploratory coding
6 ■ Clean and prepare
7 ■ Dealing with huge data files
8 ■ Working with a mountain of data
9 ■ Practical data analysis
10 ■ Browser-based visualization
11 ■ Server-side visualization
12 ■ Live data
13 ■ Advanced visualization with D3
14 ■ Getting to production

contents

preface
acknowledgments
about this book
about the author
about the cover illustration

1 Getting started: establishing your data pipeline
  1.1 Why data wrangling?
  1.2 What’s data wrangling?
  1.3 Why a book on JavaScript data wrangling?
  1.4 What will you get out of this book?
  1.5 Why use JavaScript for data wrangling?
  1.6 Is JavaScript appropriate for data analysis?
  1.7 Navigating the JavaScript ecosystem
  1.8 Assembling your toolkit
  1.9 Establishing your data pipeline
      Setting the stage ■ The data-wrangling process ■ Planning ■ Acquisition, storage, and retrieval ■ Exploratory coding ■ Clean and prepare ■ Analysis ■ Visualization ■ Getting to production

2 Getting started with Node.js
  2.1 Starting your toolkit
  2.2 Building a simple reporting system
  2.3 Getting the code and data
      Viewing the code ■ Downloading the code ■ Installing Node.js ■ Installing dependencies ■ Running Node.js code ■ Running a web application ■ Getting the data ■ Getting the code for chapter 2
  2.4 Installing Node.js
      Checking your Node.js version
  2.5 Working with Node.js
      Creating a Node.js project ■ Creating a command-line application ■ Creating a code library ■ Creating a simple web server
  2.6 Asynchronous coding
      Loading a single file ■ Loading multiple files ■ Error handling ■ Asynchronous coding with promises ■ Wrapping asynchronous operations in promises ■ Async coding with “async” and “await”

3 Acquisition, storage, and retrieval
  3.1 Building out your toolkit
  3.2 Getting the code and data
  3.3 The core data representation
      The earthquakes website ■ Data formats covered ■ Power and flexibility
  3.4 Importing data
      Loading data from text files ■ Loading data from a REST API ■ Parsing JSON text data ■ Parsing CSV text data ■ Importing data from databases ■ Importing data from MongoDB ■ Importing data from MySQL
  3.5 Exporting data
      You need data to export! ■ Exporting data to text files ■ Exporting data to JSON text files ■ Exporting data to CSV text files ■ Exporting data to a database ■ Exporting data to MongoDB ■ Exporting data to MySQL
  3.6 Building complete data conversions
  3.7 Expanding the process

4 Working with unusual data
  4.1 Getting the code and data
  4.2 Importing custom data from text files
  4.3 Importing data by scraping web pages
      Identifying the data to scrape ■ Scraping with Cheerio
  4.4 Working with binary data
      Unpacking a custom binary file ■ Packing a custom binary file ■ Replacing JSON with BSON ■ Converting JSON to BSON ■ Deserializing a BSON file

5 Exploratory coding
  5.1 Expanding your toolkit
  5.2 Analyzing car accidents
  5.3 Getting the code and data
  5.4 Iteration and your feedback loop
  5.5 A first pass at understanding your data
  5.6 Working with a reduced data sample
  5.7 Prototyping with Excel
  5.8 Exploratory coding with Node.js
      Using Nodemon ■ Exploring your data ■ Using Data-Forge ■ Computing the trend column ■ Outputting a new CSV file
  5.9 Exploratory coding in the browser
  5.10 Putting it all together

6 Clean and prepare
  6.1 Expanding our toolkit
  6.2 Preparing the reef data
  6.3 Getting the code and data
  6.4 The need for data cleanup and preparation
  6.5 Where does broken data come from?
