
Machine Learning with the Elastic Stack: Gain valuable insights from your data with Elastic Stack's machine learning features PDF

450 Pages·2021·35.052 MB·English

Preview Machine Learning with the Elastic Stack: Gain valuable insights from your data with Elastic Stack's machine learning features

Machine Learning with the Elastic Stack
Second Edition

Gain valuable insights from your data with Elastic Stack's machine learning features

Rich Collier
Camilla Montonen
Bahaaldine Azarmi

BIRMINGHAM - MUMBAI

Copyright © 2021 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Group Product Manager: Kunal Parikh
Publishing Product Manager: Devika Battike
Senior Editor: David Sugarman
Content Development Editor: Joseph Sunil
Technical Editor: Devanshi Ayare
Copy Editor: Safis Editing
Project Coordinator: Aparna Nair
Proofreader: Safis Editing
Indexer: Manju Arasan
Production Designer: Alishon Mendonca

First published: January 2019
Second published: May 2021
Production reference: 1270521

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.

ISBN 978-1-80107-003-4
www.packt.com

Contributors

About the authors

Rich Collier is a solutions architect at Elastic. Joining the Elastic team from the Prelert acquisition, Rich has over 20 years of experience as a solutions architect and pre-sales systems engineer for software, hardware, and service-based solutions. Rich's technical specialties include big data analytics, machine learning, anomaly detection, threat detection, security operations, application performance management, web applications, and contact center technologies. Rich is based in Boston, Massachusetts.

Camilla Montonen is a senior machine learning engineer at Elastic.

Bahaaldine Azarmi, or Baha for short, is a solutions architect at Elastic. Prior to this position, Baha co-founded ReachFive, a marketing data platform focused on user behavior and social analytics. Baha also worked for different software vendors such as Talend and Oracle, where he held solutions architect and architect positions. Before Machine Learning with the Elastic Stack, Baha authored books including Learning Kibana 5.0, Scalable Big Data Architecture, and Talend for Big Data. Baha is based in Paris and has an MSc in computer science from Polytech Paris.

About the reviewers

Apoorva Joshi is currently a security data scientist at Elastic (previously Elasticsearch) where she works on using machine learning for malware detection on endpoints. Prior to Elastic, she was a research scientist at FireEye where she applied machine learning to problems in email security. She has a diverse engineering background with a bachelor's in electrical engineering and a master's in computer engineering (with a machine learning focus).

Lijuan Zhong is an experienced Elastic and cloud engineer. She has a master's degree in information technology and nearly 20 years of working experience in IT and telecom, and is now working with Elastic's major partner in Sweden: Netnordic. She began her journey in Elastic in 2019 and became an Elastic certified engineer. She has also completed the machine learning course by Stanford University.
She leads lots of Elastic and machine learning POC and projects, and customers were extremely satisfied with the outcome. She has been the co-organizer of the Elastic Stockholm meetup since 2020. She took part in the Elastic community conference 2021 and gave a talk about machine learning with the Elastic Stack. She was awarded the Elastic bronze contributor award in 2021.

Table of Contents

Preface

Section 1 – Getting Started with Machine Learning with Elastic Stack

1 Machine Learning for IT
  Overcoming the historical challenges in IT
  Dealing with the plethora of data
  The advent of automated anomaly detection
  Unsupervised versus supervised ML
  Using unsupervised ML for anomaly detection
  Defining unusual
  Learning what's normal
  Probability models
  Learning the models
  De-trending
  Scoring of unusualness
  The element of time
  Applying supervised ML to data frame analytics
  The process of supervised learning
  Summary

2 Enabling and Operationalization
  Technical requirements
  Enabling Elastic ML features
  Enabling ML on a self-managed cluster
  Enabling ML in the cloud – Elasticsearch Service
  Understanding operationalization
  ML nodes
  Jobs
  Bucketing data in a time series analysis
  Feeding data to Elastic ML
  The supporting indices
  Anomaly detection orchestration
  Anomaly detection model snapshots
  Summary

Section 2 – Time Series Analysis – Anomaly Detection and Forecasting

3 Anomaly Detection
  Technical requirements
  Elastic ML job types
  Dissecting the detector
  The function
  The field
  The partition field
  The by field
  The over field
  The "formula"
  Exploring the count functions
  Other counting functions
  Detecting changes in metric values
  Metric functions
  Understanding the advanced detector functions
  rare
  Frequency rare
  Information content
  Geographic
  Time
  Splitting analysis along categorical features
  Setting the split field
  The difference between splitting using partition and by_field
  Understanding temporal versus population analysis
  Categorization analysis of unstructured messages
  Types of messages that are good candidates for categorization
  The process used by categorization
  Analyzing the categories
  Categorization job example
  When to avoid using categorization
  Managing Elastic ML via the API
  Summary

4 Forecasting
  Technical requirements
  Contrasting forecasting with prophesying
  Forecasting use cases
  Forecasting theory of operation
  Single time series forecasting
  Looking at forecast results
  Multiple time series forecasting
  Summary

5 Interpreting Results
  Technical requirements
  Viewing the Elastic ML results index
  Anomaly scores
  Bucket-level scoring
  Normalization
  Influencer-level scoring
  Influencers
  Record-level scoring
  Results index schema details
  Bucket results
  Record results
  Influencer results
  Multi-bucket anomalies
  Multi-bucket anomaly example
  Multi-bucket scoring
  Forecast results
  Querying for forecast results
  Results API
  Results API endpoints
  Getting the overall buckets API
  Getting the categories API
  Custom dashboards and Canvas workpads
  Dashboard "embeddables"
  Anomalies as annotations in TSVB
  Customizing Canvas workpads
  Summary

6 Alerting on ML Analysis
  Technical requirements
  Understanding alerting concepts
  Anomalies are not necessarily alerts
  In real-time alerting, timing matters
  Building alerts from the ML UI
  Defining sample anomaly detection jobs
  Creating alerts against the sample jobs
  Simulating some real-time anomalous behavior
  Receiving and reviewing the alerts
  Creating an alert with a watch
  Understanding the anatomy of the legacy default ML watch
  Custom watches can offer some unique functionality
  Summary

7 AIOps and Root Cause Analysis
  Technical requirements
  Demystifying the term "AIOps"
  Understanding the importance and limitations of KPIs
  Moving beyond KPIs
  Organizing data for better analysis
  Custom queries for anomaly detection datafeeds
  Data enrichment on ingest
  Leveraging the contextual information
  Analysis splits
  Statistical influencers
  Bringing it all together for RCA
  Outage background
  Correlation and shared influencers
  Summary

8 Anomaly Detection in Other Elastic Stack Apps
  Technical requirements
  Anomaly detection in Elastic APM
  Enabling anomaly detection for APM
  Viewing the anomaly detection job results in the APM UI
  Creating ML Jobs via the data recognizer
  Anomaly detection in the Logs app
  Log categories
  Log anomalies
  Anomaly detection in the Metrics app
  Anomaly detection in the Uptime app
  Anomaly detection in the Elastic Security app
  Prebuilt anomaly detection jobs
  Anomaly detection jobs as detection alerts
  Summary

Section 3 – Data Frame Analysis

9 Introducing Data Frame Analytics
  Technical requirements
  Learning how to use transforms
  Why are transforms useful?
  The anatomy of a transform
  Using transforms to analyze e-commerce orders
  Exploring more advanced pivot and aggregation configurations
  Discovering the difference between batch and continuous transforms
  Analyzing social media feeds using continuous transforms
  Using Painless for advanced transform configurations
  Introducing Painless
  Working with Python and Elasticsearch
  A brief tour of the Python Elasticsearch clients
  Summary
  Further reading

10 Outlier Detection
  Technical requirements
  Discovering the four techniques used for outlier detection
  Understanding feature influence
  How does outlier detection differ from anomaly detection?
  Applying outlier detection in practice
  Evaluating outlier detection with the Evaluate API
  Hyperparameter tuning for outlier detection
  Summary

11 Classification Analysis
  Technical requirements
  Classification: from data to a trained model
  Feature engineering
  Evaluating the model
  Taking your first steps with classification
  Classification under the hood: gradient boosted decision trees
  Introduction to decision trees
  Gradient boosted decision trees
  Hyperparameters
  Interpreting results
  Summary
  Further reading
