ebook img

Amazon Redshift Database Developer Guide PDF

733 Pages·2014·4.05 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Amazon Redshift Database Developer Guide

Amazon Redshift Database Developer Guide API Version 2012-12-01 Amazon Redshift Database Developer Guide Amazon Redshift: Database Developer Guide Copyright © 2014 Amazon Web Services, Inc. and/or its affiliates. All rights reserved. The following are trademarks of Amazon Web Services, Inc.: Amazon, Amazon Web Services Design, AWS, Amazon CloudFront, Cloudfront, Amazon DevPay, DynamoDB, ElastiCache, Amazon EC2, Amazon Elastic Compute Cloud, Amazon Glacier, Kindle, Kindle Fire, AWS Marketplace Design, Mechanical Turk, Amazon Redshift, Amazon Route 53, Amazon S3, Amazon VPC. In addition, Amazon.com graphics, logos, page headers, button icons, scripts, and service names are trademarks, or trade dress of Amazon in the U.S. and/or other countries. Amazon's trademarks and trade dress may not be used in connection with any product or service that is not Amazon's, in any manner that is likely to cause confusion among customers, or in any manner that disparages or discredits Amazon. All other trademarks not owned by Amazon are the property of their respective owners, who may or may not be affiliated with, connected to, or sponsored by Amazon. Amazon Redshift Database Developer Guide Table of Contents Welcome ..................................................................................................................................... 1 Are you a first-time Amazon Redshift user?................................................................................ 1 Are you a database developer?................................................................................................ 2 Prerequisites ........................................................................................................................ 3 Amazon Redshift System Overview.................................................................................................. 4 Data warehouse system architecture ........................................................................................ 4 Performance......................................................................................................................... 6 Columnar storage.................................................................................................................. 8 Internal architecture and system operation................................................................................. 9 Query processing ........................................................................................................ 10 Client requests and execution........................................................................................ 10 Explain plan................................................................................................................ 12 Query execution steps.................................................................................................. 12 Workload management......................................................................................................... 13 Using Amazon Redshift with other services.............................................................................. 14 Moving data between Amazon Redshift and Amazon S3..................................................... 14 Using Amazon Redshift with Amazon DynamoDB............................................................. 14 Importing data from remote hosts over SSH..................................................................... 14 Automating data loads using AWS Data Pipeline............................................................... 14 Getting Started Using Databases ................................................................................................... 15 Step 1: Create a database..................................................................................................... 15 Step 2: Create a database user.............................................................................................. 16 Delete a database user................................................................................................. 16 Step 3: Create a database table............................................................................................. 16 Insert data rows into a table........................................................................................... 17 Select data from a table................................................................................................ 17 Step 4: Load sample data...................................................................................................... 18 Step 5: Query the system tables............................................................................................. 20 View a list of table names.............................................................................................. 21 View database users.................................................................................................... 21 View recent queries ..................................................................................................... 22 Determine the process ID of a running query.................................................................... 22 Step 6: Cancel a query......................................................................................................... 23 Cancel a query from another session.............................................................................. 24 Cancel a query using the Superuser queue...................................................................... 24 Step 7: Clean up your resources............................................................................................. 25 Amazon Redshift Best Practices..................................................................................................... 26 Best practices for designing tables.......................................................................................... 26 Take the Tuning Table Design tutorial............................................................................... 27 Choosing the best sort key............................................................................................ 27 Choosing the best distribution style................................................................................. 27 Use automatic compression........................................................................................... 28 Define constraints........................................................................................................ 28 Use the smallest possible column size............................................................................. 28 Using date/time data types for date columns..................................................................... 28 Best practices for loading data............................................................................................... 29 Take the Loading Data tutorial........................................................................................ 29 Take the Tuning Table Design tutorial............................................................................... 29 Use a COPY command to load data................................................................................ 29 Use a single COPY command........................................................................................ 30 Split your load data into multiple files............................................................................... 30 Compress your data files with gzip or lzop........................................................................ 30 Use a manifest file ....................................................................................................... 30 Verify data files before and after a load............................................................................ 30 Using a multi-row insert ................................................................................................ 31 API Version 2012-12-01 iii Amazon Redshift Database Developer Guide Using a bulk insert....................................................................................................... 31 Loading data in sort key order........................................................................................ 31 Load data in sequential blocks....................................................................................... 31 Use time-series tables.................................................................................................. 32 Use a staging table to perform a merge........................................................................... 32 Schedule around maintenance windows.......................................................................... 32 Best practices for tuning query performance............................................................................. 32 Design for performance ................................................................................................ 33 Take the Tuning Table Design tutorial............................................................................... 33 Vacuum your database ................................................................................................. 33 Performing a deep copy................................................................................................ 33 Increasing the available memory .................................................................................... 33 Maintaining up-to-date table statistics.............................................................................. 34 Specifying redundant predicates on the sort column.......................................................... 34 Tutorial:Tuning Table Design.......................................................................................................... 35 Prerequisites ...................................................................................................................... 35 Steps................................................................................................................................. 35 Step 1: Create a test data set................................................................................................. 36 To create a test data set................................................................................................ 36 Next step.................................................................................................................... 40 Step 2: Establish a baseline................................................................................................... 40 To test system performance to establish a baseline............................................................ 41 Next step.................................................................................................................... 43 Step 3: Select sort keys........................................................................................................ 44 To select sort keys....................................................................................................... 44 Next step.................................................................................................................... 44 Step 4: Select distribution styles............................................................................................. 45 Distribution styles ........................................................................................................ 45 To select distribution styles............................................................................................ 46 Next step.................................................................................................................... 48 Step 5: Review compression encodings................................................................................... 48 To review compression encodings................................................................................... 48 Next step.................................................................................................................... 51 Step 6: Recreate the test data set........................................................................................... 51 To recreate the test data set.......................................................................................... 51 Next step.................................................................................................................... 54 Step 7: Retest system performance after tuning........................................................................ 54 To retest system performance after tuning........................................................................ 54 Next step.................................................................................................................... 58 Step 8: Evaluate the results................................................................................................... 58 Next step.................................................................................................................... 60 Step 9: Clean up your resources............................................................................................. 60 Next .......................................................................................................................... 60 Summary ........................................................................................................................... 60 Next Step ................................................................................................................... 61 Tutorial: Loading Data from Amazon S3........................................................................................... 62 Prerequisites ...................................................................................................................... 62 Overview............................................................................................................................ 63 Steps................................................................................................................................. 63 Step 1: Launch a Cluster....................................................................................................... 63 Next Step ................................................................................................................... 64 Step 2: Download the Data Files............................................................................................. 64 Next Step ................................................................................................................... 65 Step 3: Upload the Files to an Amazon S3 Bucket..................................................................... 65 ................................................................................................................................. 65 Next Step ................................................................................................................... 66 Step 4: Create the Sample Tables........................................................................................... 66 Next Step ................................................................................................................... 69 API Version 2012-12-01 iv Amazon Redshift Database Developer Guide Step 5: Run the COPY Commands......................................................................................... 69 COPY Command Syntax .............................................................................................. 69 Loading the SSB Tables................................................................................................ 70 Step 6:Vacuum and Analyze the Database.............................................................................. 81 Next Step ................................................................................................................... 81 Step 7: Clean Up Your Resources........................................................................................... 81 Next .......................................................................................................................... 82 Summary ........................................................................................................................... 82 Next Step ................................................................................................................... 82 Managing Database Security......................................................................................................... 83 Amazon Redshift security overview......................................................................................... 83 Default database user privileges............................................................................................. 84 Superusers......................................................................................................................... 84 Users ................................................................................................................................ 85 Creating, altering, and deleting users.............................................................................. 85 Groups .............................................................................................................................. 86 Creating, altering, and deleting groups............................................................................ 86 Schemas............................................................................................................................ 86 Creating, altering, and deleting schemas ......................................................................... 87 Search path................................................................................................................ 87 Schema-based privileges.............................................................................................. 87 Example for controlling user and group access......................................................................... 87 Designing Tables ......................................................................................................................... 90 Choosing a column compression type..................................................................................... 90 Compression encodings ............................................................................................... 91 Testing compression encodings...................................................................................... 97 Example: Choosing compression encodings for the CUSTOMER table.................................. 99 Choosing a data distribution style.......................................................................................... 101 Data distribution concepts ........................................................................................... 101 Distribution styles....................................................................................................... 102 Viewing distribution styles............................................................................................ 102 Evaluating query patterns............................................................................................ 103 Designating distribution styles ...................................................................................... 103 Evaluating the query plan............................................................................................ 104 Query plan example ................................................................................................... 106 Distribution examples ................................................................................................. 111 Choosing sort keys............................................................................................................. 113 Defining constraints............................................................................................................ 113 Analyzing table design........................................................................................................ 113 Loading Data ............................................................................................................................ 117 Using COPY to load data .................................................................................................... 117 Preparing your input data ............................................................................................ 118 Loading data from Amazon S3..................................................................................... 119 Loading data from Amazon EMR.................................................................................. 127 Loading data from remote hosts ................................................................................... 134 Loading from Amazon DynamoDB................................................................................ 140 Verifying that the data was loaded correctly.................................................................... 142 Validating input data ................................................................................................... 143 Automatic compression............................................................................................... 143 Optimizing for narrow tables......................................................................................... 145 Default values ........................................................................................................... 145 Troubleshooting ......................................................................................................... 146 Updating with DML............................................................................................................. 150 Updating and inserting........................................................................................................ 150 Merge Method 1: Replacing existing rows....................................................................... 151 Merge Method 2: Specifying a column list....................................................................... 151 Creating a temporary staging table................................................................................ 151 Performing a merge operation by replacing existing rows.................................................. 152 API Version 2012-12-01 v Amazon Redshift Database Developer Guide Performing a merge operation by specifying a column list.................................................. 152 Merge examples ........................................................................................................ 154 Performing a deep copy ...................................................................................................... 156 Analyzing tables ................................................................................................................ 158 ANALYZE command history......................................................................................... 159 Automatic analysis ..................................................................................................... 160 Vacuuming tables............................................................................................................... 161 VACUUM frequency ................................................................................................... 161 Sort stage and merge stage......................................................................................... 161 Vacuum types ........................................................................................................... 162 Managing vacuum times.............................................................................................. 162 Vacuum column limit exceeded error............................................................................. 168 Managing concurrent writes................................................................................................. 169 Serializable isolation................................................................................................... 169 Write and read-write operations.................................................................................... 170 Concurrent write examples .......................................................................................... 171 Unloading Data.......................................................................................................................... 173 Unloading data to Amazon S3.............................................................................................. 173 Unloading encrypted data files ............................................................................................. 176 Unloading data in delimited or fixed-width format..................................................................... 177 Reloading unloaded data..................................................................................................... 178 Tuning Query Performance .......................................................................................................... 180 Troubleshooting Queries ..................................................................................................... 181 Connection Fails ........................................................................................................ 181 Query Hangs............................................................................................................. 181 Query Takes Too Long ................................................................................................ 182 Load Fails................................................................................................................. 183 Load Takes Too Long.................................................................................................. 183 Load Data Is Incorrect................................................................................................. 184 Analyzing the query plan..................................................................................................... 184 Simple EXPLAIN example ........................................................................................... 186 EXPLAIN operators .................................................................................................... 187 Join examples ........................................................................................................... 188 Mapping the query plan to system views........................................................................ 190 Managing how queries use memory...................................................................................... 190 Determining whether a query is writing to disk................................................................. 191 Determining which steps are writing to disk.................................................................... 192 Monitoring disk space......................................................................................................... 194 Compiled code .................................................................................................................. 195 Setting the JDBC fetch size parameter................................................................................... 196 Implementing workload management .................................................................................... 196 Defining query queues................................................................................................ 197 WLM queue assignment rules...................................................................................... 199 Modifying the WLM configuration.................................................................................. 201 Assigning queries to queues........................................................................................ 202 Monitoring workload management ................................................................................ 203 SQL Reference.......................................................................................................................... 205 Amazon Redshift SQL ........................................................................................................ 205 SQL functions supported on the leader node.................................................................. 205 Amazon Redshift and PostgreSQL................................................................................ 206 Using SQL........................................................................................................................ 212 SQL reference conventions.......................................................................................... 212 Basic elements.......................................................................................................... 212 Expressions .............................................................................................................. 235 Conditions ................................................................................................................ 239 SQL Commands ................................................................................................................ 256 ABORT .................................................................................................................... 257 ALTER DATABASE..................................................................................................... 258 API Version 2012-12-01 vi Amazon Redshift Database Developer Guide ALTER GROUP ......................................................................................................... 259 ALTER SCHEMA ....................................................................................................... 260 ALTER TABLE ........................................................................................................... 261 ALTER USER............................................................................................................ 266 ANALYZE ................................................................................................................. 268 ANALYZE COMPRESSION ......................................................................................... 269 BEGIN ..................................................................................................................... 270 CANCEL .................................................................................................................. 272 CLOSE .................................................................................................................... 273 COMMENT ............................................................................................................... 274 COMMIT .................................................................................................................. 275 COPY ...................................................................................................................... 276 CREATE DATABASE .................................................................................................. 308 CREATE GROUP....................................................................................................... 308 CREATE SCHEMA..................................................................................................... 309 CREATE TABLE......................................................................................................... 310 CREATE TABLE AS.................................................................................................... 320 CREATE USER ......................................................................................................... 325 CREATE VIEW .......................................................................................................... 327 DEALLOCATE........................................................................................................... 327 DECLARE ................................................................................................................ 328 DELETE................................................................................................................... 331 DROP DATABASE...................................................................................................... 332 DROP GROUP.......................................................................................................... 333 DROP SCHEMA........................................................................................................ 334 DROP TABLE............................................................................................................ 334 DROP USER............................................................................................................. 336 DROP VIEW ............................................................................................................. 337 END ........................................................................................................................ 339 EXECUTE ................................................................................................................ 339 EXPLAIN.................................................................................................................. 340 FETCH .................................................................................................................... 344 GRANT .................................................................................................................... 346 INSERT.................................................................................................................... 349 LOCK ...................................................................................................................... 353 PREPARE ................................................................................................................ 354 RESET .................................................................................................................... 355 REVOKE .................................................................................................................. 356 ROLLBACK .............................................................................................................. 358 SELECT................................................................................................................... 359 SELECT INTO........................................................................................................... 389 SET......................................................................................................................... 389 SET SESSION AUTHORIZATION................................................................................. 392 SET SESSION CHARACTERISTICS............................................................................. 393 SHOW ..................................................................................................................... 393 START TRANSACTION............................................................................................... 394 TRUNCATE .............................................................................................................. 394 UNLOAD .................................................................................................................. 395 UPDATE................................................................................................................... 407 VACUUM.................................................................................................................. 411 SQL Functions Reference ................................................................................................... 413 Leader-node only functions.......................................................................................... 414 Aggregate functions ................................................................................................... 415 Bit-wise aggregate functions ........................................................................................ 424 Window functions....................................................................................................... 428 Conditional expressions .............................................................................................. 465 Date functions ........................................................................................................... 471 Math functions........................................................................................................... 494 API Version 2012-12-01 vii Amazon Redshift Database Developer Guide String functions ......................................................................................................... 518 JSON Functions ........................................................................................................ 553 Data type formatting functions...................................................................................... 556 System administration functions ................................................................................... 565 System information functions ....................................................................................... 568 Reserved words................................................................................................................. 578 System Tables Reference............................................................................................................ 582 System tables and views..................................................................................................... 582 Types of system tables and views......................................................................................... 583 Visibility of data in system tables and views............................................................................ 583 Filtering system-generated queries ............................................................................... 583 STL tables for logging......................................................................................................... 584 STL_AGGR .............................................................................................................. 585 STL_ALERT_EVENT_LOG ......................................................................................... 587 STL_BCAST ............................................................................................................. 588 STL_COMMIT_STATS ................................................................................................ 590 STL_CONNECTION_LOG........................................................................................... 591 STL_DDLTEXT.......................................................................................................... 591 STL_DIST ................................................................................................................ 593 STL_DELETE ........................................................................................................... 594 STL_ERROR ............................................................................................................ 596 STL_EXPLAIN .......................................................................................................... 597 STL_FILE_SCAN....................................................................................................... 599 STL_HASH ............................................................................................................... 600 STL_HASHJOIN ........................................................................................................ 602 STL_INSERT ............................................................................................................ 603 STL_LIMIT................................................................................................................ 604 STL_LOAD_COMMITS ............................................................................................... 606 STL_LOAD_ERRORS ................................................................................................ 608 STL_LOADERROR_DETAIL ........................................................................................ 610 STL_MERGE ............................................................................................................ 612 STL_MERGEJOIN ..................................................................................................... 613 STL_NESTLOOP....................................................................................................... 614 STL_PARSE ............................................................................................................. 615 STL_PLAN_INFO ...................................................................................................... 616 STL_PROJECT ......................................................................................................... 618 STL_QUERY............................................................................................................. 620 STL_QUERYTEXT..................................................................................................... 622 STL_REPLACEMENTS .............................................................................................. 623 STL_RETURN........................................................................................................... 624 STL_SAVE ............................................................................................................... 625 STL_S3CLIENT......................................................................................................... 626 STL_S3CLIENT_ERROR ............................................................................................ 628 STL_SCAN ............................................................................................................... 629 STL_SESSIONS........................................................................................................ 631 STL_SORT ............................................................................................................... 632 STL_SSHCLIENT_ERROR ......................................................................................... 633 STL_STREAM_SEGS ................................................................................................ 634 STL_TR_CONFLICT .................................................................................................. 635 STL_UNDONE .......................................................................................................... 635 STL_UNIQUE ........................................................................................................... 636 STL_UNLOAD_LOG .................................................................................................. 638 STL_USERLOG ........................................................................................................ 639 STL_UTILITYTEXT .................................................................................................... 640 STL_VACUUM .......................................................................................................... 641 STL_WARNING......................................................................................................... 643 STL_WINDOW .......................................................................................................... 644 STL_WLM_ERROR ................................................................................................... 645 API Version 2012-12-01 viii Amazon Redshift Database Developer Guide STL_WLM_QUERY.................................................................................................... 645 STV tables for snapshot data............................................................................................... 649 STV_ACTIVE_CURSORS ........................................................................................... 649 STV_BLOCKLIST ...................................................................................................... 650 STV_CURSOR_CONFIGURATION............................................................................... 653 STV_EXEC_STATE .................................................................................................... 653 STV_INFLIGHT ......................................................................................................... 654 STV_LOAD_STATE .................................................................................................... 655 STV_LOCKS............................................................................................................. 657 STV_PARTITIONS ..................................................................................................... 657 STV_RECENTS ........................................................................................................ 659 STV_SLICES ............................................................................................................ 660 STV_SESSIONS ....................................................................................................... 661 STV_TBL_PERM ....................................................................................................... 662 STV_TBL_TRANS ..................................................................................................... 664 STV_WLM_CLASSIFICATION_CONFIG........................................................................ 665 STV_WLM_QUERY_QUEUE_STATE ............................................................................ 666 STV_WLM_QUERY_STATE......................................................................................... 667 STV_WLM_QUERY_TASK_STATE ............................................................................... 668 STV_WLM_SERVICE_CLASS_CONFIG ....................................................................... 669 STV_WLM_SERVICE_CLASS_STATE .......................................................................... 670 System views .................................................................................................................... 671 SVL_COMPILE ......................................................................................................... 672 SVV_DISKUSAGE ..................................................................................................... 673 SVL_QERROR.......................................................................................................... 675 SVL_QLOG .............................................................................................................. 675 SVV_QUERY_INFLIGHT ............................................................................................ 676 SVL_QUERY_REPORT .............................................................................................. 677 SVV_QUERY_STATE ................................................................................................. 679 SVL_QUERY_SUMMARY ........................................................................................... 681 SVL_STATEMENTTEXT.............................................................................................. 683 SVV_VACUUM_PROGRESS ....................................................................................... 684 SVV_VACUUM_SUMMARY......................................................................................... 685 SVL_VACUUM_PERCENTAGE .................................................................................... 687 System catalog tables......................................................................................................... 687 PG_TABLE_DEF ....................................................................................................... 687 Querying the catalog tables ......................................................................................... 689 Configuration Reference.............................................................................................................. 693 Modifying the server configuration......................................................................................... 693 datestyle .......................................................................................................................... 694 Values (default in bold)................................................................................................ 694 Description ............................................................................................................... 694 Example................................................................................................................... 694 extra_float_digits................................................................................................................ 694 Values (default in bold)................................................................................................ 694 Description ............................................................................................................... 694 max_cursor_result_set_size................................................................................................. 695 Values (default in bold)................................................................................................ 695 Description ............................................................................................................... 695 query_group ..................................................................................................................... 695 Values (default in bold)................................................................................................ 695 Description ............................................................................................................... 695 search_path ...................................................................................................................... 696 Values (default in bold)................................................................................................ 696 Description ............................................................................................................... 696 Example................................................................................................................... 696 statement_timeout ............................................................................................................. 697 Values (default in bold)................................................................................................ 697 API Version 2012-12-01 ix Amazon Redshift Database Developer Guide Description ............................................................................................................... 697 Example................................................................................................................... 697 wlm_query_slot_count ........................................................................................................ 697 Values (default in bold)................................................................................................ 697 Description ............................................................................................................... 698 Examples ................................................................................................................. 698 Sample Database ...................................................................................................................... 699 CATEGORY table............................................................................................................... 700 DATE table........................................................................................................................ 701 EVENT table ..................................................................................................................... 701 VENUE table..................................................................................................................... 701 USERS table..................................................................................................................... 702 LISTING table ................................................................................................................... 702 SALES table ..................................................................................................................... 703 Time Zone Names and Abbreviations............................................................................................ 704 Time Zone Names ............................................................................................................. 704 Time Zone Abbreviations..................................................................................................... 714 Document History ...................................................................................................................... 718 API Version 2012-12-01 x

Description:
Use a COPY command to load data Amazon Redshift Database Developer Guide. Using a bulk insert SQL Reference
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.