R Data Analysis Projects PDF

361 Pages·2017·9.493 MB·English

Preview R Data Analysis Projects

R Data Analysis Projects

Build end to end analytics systems to get deeper insights from your data

Gopi Subramanian

BIRMINGHAM - MUMBAI

R Data Analysis Projects

Copyright © 2017 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: November 2017
Production reference: 1151117

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.

ISBN 978-1-78862-187-8
www.packtpub.com

Credits

Author: Gopi Subramanian
Reviewer: Mark Hodnett
Commissioning Editor: Amey Varangaonkar
Acquisition Editor: Tushar Gupta
Content Development Editor: Aaryaman Singh
Technical Editor: Dharmendra Yadav
Copy Editors: Safis Editing
Project Coordinator: Manthan Patel
Proofreader: Safis Editing
Indexer: Tejal Daruwale Soni
Graphics: Tania Dutta
Production Coordinator: Arvindkumar Gupta

About the Author

Gopi Subramanian is a scientist and author with over 18 years of experience in the fields of data mining and machine learning. During the past decade, he has worked extensively in data mining and machine learning, solving a variety of business problems.
He has 16 patent applications with the US and Indian patent offices and several publications to his credit. He is the author of Python Data Science Cookbook by Packt Publishing.

I would like to thank my parents Vijaya and Mani and sister Geetha for being a great source of inspiration. My family Anita and Rishi for their support.

About the Reviewer

Mark Hodnett is a data scientist with over 20 years of industry experience. He has worked in a variety of industries, ranging from website development, retail loyalty, and industrial systems to accountancy software. He holds a master's in data science and an MBA. His current role is with AltViz as a senior data scientist. AltViz applies machine learning, optimization, and intelligent evaluation to companies in many sectors, including retail and insurance.

I would like to thank the author, Gopi, for giving me the opportunity to collaborate on this project. I would also like to thank Sharon and Conor for their patience while I worked on this project.

www.PacktPub.com

For support files and downloads related to your book, please visit www.PacktPub.com. Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www.packtpub.com/mapt

Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.

Why subscribe?
Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser

Customer Feedback

Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review on this book's Amazon page at https://www.amazon.com/dp/1788621875.

If you'd like to join our team of regular reviewers, you can email us at [email protected]. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!

Table of Contents

Preface

Chapter 1: Association Rule Mining
    Understanding the recommender systems
    Transactions
    Weighted transactions
    Our web application
    Retailer use case and data
    Association rule mining
    Support and confidence thresholds
    The cross-selling campaign
    Leverage
    Conviction
    Weighted association rule mining
    Hyperlink-induced topic search (HITS)
    Negative association rules
    Rules visualization
    Wrapping up
    Summary

Chapter 2: Fuzzy Logic Induced Content-Based Recommendation
    Introducing content-based recommendation
    News aggregator use case and data
    Designing the content-based recommendation engine
    Building a similarity index
    Bag-of-words
    Term frequency
    Document frequency
    Inverse document frequency (IDF)
    TFIDF
    Why cosine similarity?
    Searching
    Polarity scores
    Jaccard's distance
    Jaccards distance/index
    Ranking search results
    Fuzzy logic
    Fuzzification
    Defining the rules
    Evaluating the rules
    Defuzzification
    Complete R Code
    Summary

Chapter 3: Collaborative Filtering
    Collaborative filtering
    Memory-based approach
    Model-based approach
    Latent factor approach
    Recommenderlab package
    Popular approach
    Use case and data
    Designing and implementing collaborative filtering
    Ratings matrix
    Normalization
    Train test split
    Train model
    User-based models
    Item-based models
    Factor-based models
    Complete R Code
    Summary

Chapter 4: Taming Time Series Data Using Deep Neural Networks
    Time series data
    Non-seasonal time series
    Seasonal time series
    Time series as a regression problem
    Deep neural networks
    Forward cycle
    Backward cycle
    Introduction to the MXNet R package
    Symbolic programming in MXNet
    Softmax activation
    Use case and data
    Deep networks for time series prediction
    Training test split
    Complete R code
    Summary
