Mastering PostgreSQL 12
Third Edition

Advanced techniques to build and administer scalable and reliable PostgreSQL database applications

Hans-Jürgen Schönig

BIRMINGHAM - MUMBAI

Copyright © 2019 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Commissioning Editor: Amey Varangaokar
Acquisition Editor: Devika Battike
Content Development Editor: Athikho Sapuni Rishana
Senior Editor: Sofi Rogers
Technical Editors: Utkarsha S. Kadam and Manikandan Kurup
Copy Editor: Safis Editing
Project Coordinator: Aishwarya Mohan
Proofreader: Safis Editing
Indexer: Rekha Nair
Production Designer: Aparna Bhagat

First published: January 2018
Second edition: October 2018
Third edition: November 2019

Production reference: 1281119

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.

ISBN 978-1-83898-882-1

www.packt.com
Contributors

About the author

Hans-Jürgen Schönig has 18 years' experience with PostgreSQL. He is the CEO of Cybertec Schönig and Schönig GmbH, a PostgreSQL consulting and support company that has successfully served countless customers around the globe. Before founding Cybertec Schönig and Schönig GmbH in 2000, he worked as a database developer at a private research company that focused on the Austrian labor market, where he primarily worked on data mining and forecast models. He has also written several books on PostgreSQL.

About the reviewers

Daniel Durante is a consultant and strategist for Fortune 100 companies, and has been a full-stack developer since the age of 12. He is also an author and technical reviewer for Packt Publishing. His code is used in infrastructures such as Hubcash, Stripe, and Walmart. He has worked on text-based browser games that have surpassed 1,000,000 active players.
He has also created bin-packing software for CNC machines, worked on embedded programming with Cortex-M and PIC circuits, produced high-frequency trading applications, and contributed to and helped maintain one of the oldest object-relational mappers (ORMs) for Node.js (SequelizeJS).

Marcelo Diaz is a software engineer with more than 15 years of experience and a special focus on PostgreSQL. He is passionate about open source and has promoted its application in critical and high-demand environments, where he has worked as a software developer and consultant for private and public companies. He currently works very happily at Cybertec and as a technical reviewer for Packt Publishing. He enjoys spending his leisure time with his daughter, Malvina, and his wife, Romina. He also likes playing football.

Table of Contents

Preface

Section 1: Basic Overview

Chapter 1: PostgreSQL 12 Overview
What's new in PostgreSQL 12?
Digging into SQL and developer-related topics
Improving psql and database documentation
Displaying output as CSV
Rebuilding indexes concurrently
Storing the result of a computation
Improving ENUM handling
Making use of JSONPATH
Understanding backup and recovery related features
Making use of performance improvements
Optimizing common table expressions and planner support functions
Speeding up partitions
Creating special indexes more efficiently
Understanding new storage-related features
Summary

Chapter 2: Understanding Transactions and Locking
Working with PostgreSQL transactions
Handling errors inside a transaction
Making use of SAVEPOINT
Transactional DDLs
Understanding basic locking
Avoiding typical mistakes and explicit locking
Considering alternative solutions
Making use of FOR SHARE and FOR UPDATE
Understanding transaction isolation levels
Considering Serializable Snapshot Isolation transactions
Observing deadlocks and similar issues
Utilizing advisory locks
Optimizing storage and managing cleanup
Configuring VACUUM and autovacuum
Digging into transaction wraparound-related issues
A word on VACUUM FULL
Watching VACUUM at work
Limiting transactions by making use of snapshot too old
Making use of more VACUUM features
Summary
Questions

Section 2: Advanced Concepts

Chapter 3: Making Use of Indexes
Understanding simple queries and the cost model
Making use of EXPLAIN
Digging into the PostgreSQL cost model
Deploying simple indexes
Making use of sorted output
Using more than one index at a time
Using bitmap scans effectively
Using indexes in an intelligent way
Improving speed using clustered tables
Clustering tables
Making use of index-only scans
Understanding additional B-tree features
Combined indexes
Adding functional indexes
Reducing space consumption
Adding data while indexing
Introducing operator classes
Creating an operator class for a B-tree
Creating new operators
Creating operator classes
Testing custom operator classes
Understanding PostgreSQL index types
Hash indexes
GiST indexes
Understanding how GiST works
Extending GiST
GIN indexes
Extending GIN
SP-GiST indexes
BRIN indexes
Extending BRIN indexes
Adding additional indexes
Achieving better answers with fuzzy searching
Taking advantage of pg_trgm
Speeding up LIKE queries
Handling regular expressions
Understanding full-text search
Comparing strings
Defining GIN indexes
Debugging your search
Gathering word statistics
Taking advantage of exclusion operators
Summary
Questions

Chapter 4: Handling Advanced SQL
Introducing grouping sets
Loading some sample data
Applying grouping sets
Investigating performance
Combining grouping sets with the FILTER clause
Making use of ordered sets
Understanding hypothetical aggregates
Utilizing windowing functions and analytics
Partitioning data
Ordering data inside a window
Using sliding windows
Understanding the subtle difference between ROWS and RANGE
Removing duplicates using EXCLUDE TIES and EXCLUDE GROUP
Abstracting window clauses
Using on-board windowing functions
The rank and dense_rank functions
The ntile() function
The lead() and lag() functions
The first_value(), nth_value(), and last_value() functions
The row_number() function
Writing your own aggregates
Creating simple aggregates
Adding support for parallel queries
Improving efficiency
Writing hypothetical aggregates
Summary

Chapter 5: Log Files and System Statistics
Gathering runtime statistics
Working with PostgreSQL system views
Checking live traffic
Inspecting databases
Inspecting tables
Making sense of pg_stat_user_tables
Digging into indexes
Tracking the background worker
Tracking, archiving, and streaming
Checking SSL connections
Inspecting transactions in real time
Tracking VACUUM and CREATE INDEX progress
Using pg_stat_statements
Creating log files
Configuring the postgresql.conf file
Defining log destination and rotation
Configuring syslog
Logging slow queries
Defining what and how to log
Summary
Questions

Chapter 6: Optimizing Queries for Good Performance
Learning what the optimizer does
Optimizations by example
Evaluating join options
Nested loops
Hash joins
Merge joins
Applying transformations
Step 1: Inlining the view
Step 2: Flattening subselects
Applying equality constraints
Exhaustive searching
Trying it all out
Making the process fail
Constant folding
Understanding function inlining
Join pruning
Speedup set operations
Understanding execution plans
Approaching plans systematically
Making EXPLAIN more verbose
Spotting problems
Spotting changes in runtime
Inspecting estimates
Inspecting buffer usage
Fixing high buffer usage
Understanding and fixing joins
Getting joins right
Processing outer joins
Understanding the join_collapse_limit variable
Enabling and disabling optimizer settings
Understanding genetic query optimization
Partitioning data
Creating partitions
Applying table constraints
Modifying inherited structures
Moving tables in and out of partitioned structures
Cleaning up data
Understanding PostgreSQL 12.0 partitioning