Machine Learning
with the Elastic
Stack
Second Edition
Gain valuable insights from your data with Elastic
Stack's machine learning features
Rich Collier
Camilla Montonen
Bahaaldine Azarmi
BIRMINGHAM—MUMBAI
Machine Learning with the Elastic Stack
Second Edition
Copyright © 2021 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or
transmitted in any form or by any means, without the prior written permission of the publisher,
except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the
information presented. However, the information contained in this book is sold without warranty,
either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors,
will be held liable for any damages caused or alleged to have been caused directly or indirectly by
this book.
Packt Publishing has endeavored to provide trademark information about all of the companies
and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing
cannot guarantee the accuracy of this information.
Group Product Manager: Kunal Parikh
Publishing Product Manager: Devika Battike
Senior Editor: David Sugarman
Content Development Editor: Joseph Sunil
Technical Editor: Devanshi Ayare
Copy Editor: Safis Editing
Project Coordinator: Aparna Nair
Proofreader: Safis Editing
Indexer: Manju Arasan
Production Designer: Alishon Mendonca
First published: January 2019
Second edition: May 2021
Production reference: 1270521
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-80107-003-4
www.packt.com
Contributors
About the authors
Rich Collier is a solutions architect at Elastic. Joining the Elastic team from the Prelert
acquisition, Rich has over 20 years of experience as a solutions architect and pre-sales
systems engineer for software, hardware, and service-based solutions. Rich's technical
specialties include big data analytics, machine learning, anomaly detection, threat
detection, security operations, application performance management, web applications,
and contact center technologies. Rich is based in Boston, Massachusetts.
Camilla Montonen is a senior machine learning engineer at Elastic.
Bahaaldine Azarmi, or Baha for short, is a solutions architect at Elastic. Prior to this
position, Baha co-founded ReachFive, a marketing data platform focused on user
behavior and social analytics. Baha also worked for different software vendors such as
Talend and Oracle, where he held solutions architect and architect positions. Before
Machine Learning with the Elastic Stack, Baha authored books including Learning Kibana
5.0, Scalable Big Data Architecture, and Talend for Big Data. Baha is based in Paris and has
an MSc in computer science from Polytech Paris.
About the reviewers
Apoorva Joshi is currently a security data scientist at Elastic (previously Elasticsearch)
where she works on using machine learning for malware detection on endpoints. Prior
to Elastic, she was a research scientist at FireEye where she applied machine learning to
problems in email security. She has a diverse engineering background with a bachelor's in
electrical engineering and a master's in computer engineering (with a machine learning
focus).
Lijuan Zhong is an experienced Elastic and cloud engineer. She has a master's degree in
information technology and nearly 20 years of working experience in IT and telecom, and
is now working with Elastic's major partner in Sweden: Netnordic. She began her journey
with Elastic in 2019 and became an Elastic certified engineer. She has also completed the
machine learning course by Stanford University. She has led many Elastic and machine
learning POCs and projects, and customers have been extremely satisfied with the outcomes. She
has been the co-organizer of the Elastic Stockholm meetup since 2020. She took part in
the Elastic community conference 2021 and gave a talk about machine learning with the
Elastic Stack. She was awarded the Elastic bronze contributor award in 2021.
Table of Contents
Preface
Section 1 – Getting Started with Machine
Learning with Elastic Stack
1
Machine Learning for IT
Overcoming the historical challenges in IT
Dealing with the plethora of data
The advent of automated anomaly detection
Unsupervised versus supervised ML
Using unsupervised ML for anomaly detection
Defining unusual
Learning what's normal
Probability models
Learning the models
De-trending
Scoring of unusualness
The element of time
Applying supervised ML to data frame analytics
The process of supervised learning
Summary
2
Enabling and Operationalization
Technical requirements
Enabling Elastic ML features
Enabling ML on a self-managed cluster
Enabling ML in the cloud – Elasticsearch Service
Understanding operationalization
ML nodes
Jobs
Bucketing data in a time series analysis
Feeding data to Elastic ML
The supporting indices
Anomaly detection orchestration
Anomaly detection model snapshots
Summary
Section 2 – Time Series Analysis – Anomaly
Detection and Forecasting
3
Anomaly Detection
Technical requirements
Elastic ML job types
Dissecting the detector
The function
The field
The partition field
The by field
The over field
The "formula"
Exploring the count functions
Other counting functions
Detecting changes in metric values
Metric functions
Understanding the advanced detector functions
rare
Frequency rare
Information content
Geographic
Time
Splitting analysis along categorical features
Setting the split field
The difference between splitting using partition and by_field
Understanding temporal versus population analysis
Categorization analysis of unstructured messages
Types of messages that are good candidates for categorization
The process used by categorization
Analyzing the categories
Categorization job example
When to avoid using categorization
Managing Elastic ML via the API
Summary
4
Forecasting
Technical requirements
Contrasting forecasting with prophesying
Forecasting use cases
Forecasting theory of operation
Single time series forecasting
Looking at forecast results
Multiple time series forecasting
Summary
5
Interpreting Results
Technical requirements
Viewing the Elastic ML results index
Anomaly scores
Bucket-level scoring
Normalization
Influencer-level scoring
Influencers
Record-level scoring
Results index schema details
Bucket results
Record results
Influencer results
Multi-bucket anomalies
Multi-bucket anomaly example
Multi-bucket scoring
Forecast results
Querying for forecast results
Results API
Results API endpoints
Getting the overall buckets API
Getting the categories API
Custom dashboards and Canvas workpads
Dashboard "embeddables"
Anomalies as annotations in TSVB
Customizing Canvas workpads
Summary
6
Alerting on ML Analysis
Technical requirements
Understanding alerting concepts
Anomalies are not necessarily alerts
In real-time alerting, timing matters
Building alerts from the ML UI
Defining sample anomaly detection jobs
Creating alerts against the sample jobs
Simulating some real-time anomalous behavior
Receiving and reviewing the alerts
Creating an alert with a watch
Understanding the anatomy of the legacy default ML watch
Custom watches can offer some unique functionality
Summary
7
AIOps and Root Cause Analysis
Technical requirements
Demystifying the term "AIOps"
Understanding the importance and limitations of KPIs
Moving beyond KPIs
Organizing data for better analysis
Custom queries for anomaly detection datafeeds
Data enrichment on ingest
Leveraging the contextual information
Analysis splits
Statistical influencers
Bringing it all together for RCA
Outage background
Correlation and shared influencers
Summary
8
Anomaly Detection in Other Elastic Stack Apps
Technical requirements
Anomaly detection in Elastic APM
Enabling anomaly detection for APM
Viewing the anomaly detection job results in the APM UI
Creating ML Jobs via the data recognizer
Anomaly detection in the Logs app
Log categories
Log anomalies
Anomaly detection in the Metrics app
Anomaly detection in the Uptime app
Anomaly detection in the Elastic Security app
Prebuilt anomaly detection jobs
Anomaly detection jobs as detection alerts
Summary
Section 3 – Data Frame Analysis
9
Introducing Data Frame Analytics
Technical requirements
Learning how to use transforms
Why are transforms useful?
The anatomy of a transform
Using transforms to analyze e-commerce orders
Exploring more advanced pivot and aggregation configurations
Discovering the difference between batch and continuous transforms
Analyzing social media feeds using continuous transforms
Using Painless for advanced transform configurations
Introducing Painless
Working with Python and Elasticsearch
A brief tour of the Python Elasticsearch clients
Summary
Further reading
10
Outlier Detection
Technical requirements
Discovering the four techniques used for outlier detection
Understanding feature influence
How does outlier detection differ from anomaly detection?
Applying outlier detection in practice
Evaluating outlier detection with the Evaluate API
Hyperparameter tuning for outlier detection
Summary
11
Classification Analysis
Technical requirements
Classification: from data to a trained model
Feature engineering
Evaluating the model
Taking your first steps with classification
Classification under the hood: gradient boosted decision trees
Introduction to decision trees
Gradient boosted decision trees
Hyperparameters
Interpreting results
Summary
Further reading