
Architectural Patterns for Big Data on AWS - Amazon S3 PDF

75 Pages·2016·7.46 MB·English

Preview Architectural Patterns for Big Data on AWS - Amazon S3

Architectural Patterns for Big Data on AWS
Max Amordeluso, Sr. Manager, Solutions Architecture, AWS
Milano - April, 2016
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Agenda
Big data challenges
How to simplify big data processing
What technologies should you use?
•  Why?
•  How?
Reference architecture
Design patterns

Ever Increasing Big Data
Variety, Velocity, Volume

Big Data Evolution
Batch, Real-time, Prediction
Report, Alerts, Forecast

Plethora of Tools
EMR, S3, DynamoDB, SQS, Amazon Redshift, Amazon Glacier, RDS, ElastiCache, Amazon Kinesis, Kinesis-enabled app, Data Pipeline, CloudSearch, DynamoDB Streams, Lambda, ML

Is there a reference architecture?
What tools should I use?
How?
Why?

Architectural Principles
Decoupled “data bus”
•  Data → Store → Process → Answers
Use the right tool for the job
•  Data structure, latency, throughput, access patterns
Use Lambda architecture ideas
•  Immutable (append-only) log, batch/speed/serving layer
Leverage AWS managed services
•  No/low admin
Big data ≠ big cost

Simplify Big Data Processing
Stages: ingest / collect, store, process / analyze, consume / visualize
Considerations: Time to Answer (Latency), Throughput, Cost

Ingest / Collect

Types of Data
[Diagram: Collect and Store. Sources: Applications (web apps, mobile apps on iOS and Android), Logging (Logstash), IoT. Stores: Database, Search, File Storage, Stream Storage.]
Transactional Data
•  Database reads & writes (OLTP)
•  Cache
Search Data
•  Logs
•  Streams
File Data
•  Log files (/var/log)
•  Log collectors & frameworks
Stream Data
•  Log records
•  Sensors & IoT data
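The "Use Lambda architecture ideas" principle (immutable append-only log, batch/speed/serving layers) can be illustrated with a small in-memory sketch. This is not taken from the slides and uses no AWS service: the `Event` and `LambdaSketch` names, the per-key counting workload, and the use of a Python list as the log are all assumptions chosen for illustration. In a real deployment the log and views would live in durable stores.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    """One immutable log record (hypothetical schema for this sketch)."""
    seq: int
    key: str


@dataclass
class LambdaSketch:
    """Toy Lambda-style pipeline that counts events per key."""
    log: list = field(default_factory=list)             # append-only master dataset
    batch_view: Counter = field(default_factory=Counter)
    batch_upto: int = 0                                  # log entries covered by batch_view

    def append(self, key: str) -> Event:
        # Events are only ever appended, never updated or deleted.
        event = Event(seq=len(self.log), key=key)
        self.log.append(event)
        return event

    def run_batch(self) -> None:
        # Batch layer: recompute the view from the entire log.
        self.batch_view = Counter(e.key for e in self.log)
        self.batch_upto = len(self.log)

    def speed_view(self) -> Counter:
        # Speed layer: cover only events the last batch run has not seen.
        return Counter(e.key for e in self.log[self.batch_upto:])

    def query(self, key: str) -> int:
        # Serving layer: merge the batch and speed views at read time.
        return self.batch_view[key] + self.speed_view()[key]


pipeline = LambdaSketch()
pipeline.append("page_view")
pipeline.append("click")
pipeline.run_batch()            # batch view now covers the first two events
pipeline.append("page_view")    # arrives after the batch run
print(pipeline.query("page_view"))
```

The point of the design is that the batch view can always be rebuilt from the log, so a bug in the processing code can be fixed by recomputing, while the speed view keeps answers fresh between batch runs.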

Description:
Why is Amazon S3 Good for Big Data?
•  Natively supported by big data frameworks (Spark, Hive, Presto, etc.)
•  No need to run compute clusters for storage