ebook img

Replacing GEQO - Join ordering via Simulated Annealing PDF

72 Pages·2010·1.57 MB·English
by  
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Replacing GEQO - Join ordering via Simulated Annealing

Replacing GEQO Join ordering via Simulated Annealing Jan Urban´ski [email protected] UniversityofWarsaw/Flumotion May 21, 2010 JanUrban´ski [email protected](UniversityofWarRsaewpla/ciFnlgumGEotQioOn) May21,2010 1/46 For those following at home Getting the slides $ wget http://wulczer.org/saio.pdf Trying it out $ git clone git://wulczer.org/saio.git $ cd saio && make $ psql =# load ’.../saio.so’; Do not try to compile against an assert-enabled build. Do not be scared by lots of ugly code. JanUrban´ski [email protected](UniversityofWarRsaewpla/ciFnlgumGEotQioOn) May21,2010 2/46 1 The problem Determining join order for large queries GEQO, the genetic query optimiser 2 The solution Simulated Annealing overview PostgreSQL specifics Query tree transformations 3 The results Comparison with GEQO 4 The future Development focuses 5 The end JanUrban´ski [email protected](UniversityofWarRsaewpla/ciFnlgumGEotQioOn) May21,2010 3/46 Theproblem Determiningjoinorderforlargequeries Outline 1 The problem Determining join order for large queries GEQO, the genetic query optimiser 2 The solution 3 The results 4 The future 5 The end JanUrban´ski [email protected](UniversityofWarRsaewpla/ciFnlgumGEotQioOn) May21,2010 4/46 Theproblem Determiningjoinorderforlargequeries Getting the optimal join order (cid:73) part of of planning a query is determining the order in which relations are joined (cid:73) it is not unusual to have queries that join lots of relations (cid:73) JOIN and subquery flattening contributes to the number or relations to join (cid:73) automatically generated queries can involve very large joins JanUrban´ski [email protected](UniversityofWarRsaewpla/ciFnlgumGEotQioOn) May21,2010 5/46 Theproblem Determiningjoinorderforlargequeries Problems with join ordering (cid:73) finding the optimal join order is an NP-hard problem (cid:73) considering all possible ways to do a join can exhaust available memory (cid:73) not all join orders are valid, because of: (cid:73) outer joins enforcing a certain join order (cid:73) IN and EXISTS clauses that get converted to joins (cid:73) joins with restriction clauses are preferable to Cartesian joins JanUrban´ski [email protected](UniversityofWarRsaewpla/ciFnlgumGEotQioOn) May21,2010 6/46 Theproblem GEQO,thegeneticqueryoptimiser Outline 1 The problem Determining join order for large queries GEQO, the genetic query optimiser 2 The solution 3 The results 4 The future 5 The end JanUrban´ski [email protected](UniversityofWarRsaewpla/ciFnlgumGEotQioOn) May21,2010 7/46 Theproblem GEQO,thegeneticqueryoptimiser Randomisation helps (cid:73) PostgreSQL switches from exhaustive search to a randomised algorithm after a certain limit (cid:73) GEQO starts by joining the relations in any order (cid:73) and then proceeds to randomly change the join order (cid:73) genetic algorithm techniques are used to choose the cheapest join order JanUrban´ski [email protected](UniversityofWarRsaewpla/ciFnlgumGEotQioOn) May21,2010 8/46 Theproblem GEQO,thegeneticqueryoptimiser Problems with GEQO (cid:73) has lots of dead/experimental code (cid:73) there is a TODO item to remove it (cid:73) nobody really cares about it (cid:73) is an adaptation of an algorithm to solve TSP, not necessarily best suited to join ordering (cid:73) requires some cooperation from the planner, which violates abstractions JanUrban´ski [email protected](UniversityofWarRsaewpla/ciFnlgumGEotQioOn) May21,2010 9/46 Thesolution SimulatedAnnealingoverview Outline 1 The problem 2 The solution Simulated Annealing overview PostgreSQL specifics Query tree transformations 3 The results 4 The future 5 The end JanUrban´ski [email protected](UniversityofWarRsaewpla/ciFnlgumGEotQioOn) May21,2010 10/46

Description:
Replacing GEQO Join ordering via Simulated Annealing Jan Urbanski [email protected] University of Warsaw / Flumotion May 21, 2010 Jan Urbanski j.urbanski@wulczer
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.