Architectural Patterns for Big Data on AWS
Max Amordeluso, Sr. Manager, Solutions Architecture, AWS
Milano - April, 2016
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Agenda
Big data challenges
How to simplify big data processing
What technologies should you use?
• Why?
• How?
Reference architecture
Design patterns

Ever Increasing Big Data
Variety, Velocity, Volume

Big Data Evolution
Batch → Report
Real-time → Alerts
Prediction → Forecast

Plethora of Tools
EMR, S3, DynamoDB, SQS, Amazon Redshift, Amazon Glacier, RDS, ElastiCache, Amazon Kinesis, Kinesis-enabled app, Data Pipeline, CloudSearch, DynamoDB Streams, Lambda, ML

Is there a reference architecture?
What tools should I use?
How?
Why?

Architectural Principles
Decoupled “data bus”
• Data → Store → Process → Answers
Use the right tool for the job
• Data structure, latency, throughput, access patterns
Use Lambda architecture ideas
• Immutable (append-only) log, batch/speed/serving layer
Leverage AWS managed services
• No/low admin
Big data ≠ big cost

Simplify Big Data Processing
ingest / collect → store → process / analyze → consume / visualize
Time to Answer (Latency), Throughput, Cost

Ingest / Collect

Types of Data
Collect → Store

Transactional Data (collected from applications: web apps, mobile apps on iOS and Android; stored in a database)
• Database reads & writes (OLTP)
• Cache

Search Data (collected with Logstash; stored in search)
• Logs
• Streams

File Data (collected from logging; stored in file storage)
• Log files (/var/log)
• Log collectors & frameworks

Stream Data (collected from IoT; stored in stream storage)
• Log records
• Sensors & IoT data
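
The "Lambda architecture ideas" item on the Architectural Principles slide (immutable append-only log, batch/speed/serving layer) can be illustrated with a minimal in-memory sketch. It is not taken from the deck and uses no AWS service: the names `Event`, `LambdaSketch`, `append`, `run_batch`, and `query` are hypothetical, and page-view counting is an assumed example workload.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    """One immutable record in the append-only log (hypothetical shape)."""
    seq: int   # position of the event in the log
    page: str  # assumed payload: the page that was viewed


@dataclass
class LambdaSketch:
    """Toy batch/speed/serving layers over a single append-only log."""
    log: list = field(default_factory=list)               # master data, never edited
    batch_view: Counter = field(default_factory=Counter)  # recomputed from the full log
    speed_view: Counter = field(default_factory=Counter)  # events since the last batch run

    def append(self, page: str) -> Event:
        # Ingest: append to the log and update the low-latency speed view.
        event = Event(seq=len(self.log), page=page)
        self.log.append(event)
        self.speed_view[page] += 1
        return event

    def run_batch(self) -> None:
        # Batch layer: rebuild the view from the whole log, then reset the
        # speed view because the batch view now covers every logged event.
        self.batch_view = Counter(e.page for e in self.log)
        self.speed_view = Counter()

    def query(self, page: str) -> int:
        # Serving layer: merge the batch view with the recent speed view.
        return self.batch_view[page] + self.speed_view[page]


sketch = LambdaSketch()
for page in ["/home", "/home", "/about"]:
    sketch.append(page)
sketch.run_batch()       # batch view now covers the first three events
sketch.append("/home")   # arrives after the batch run, held in the speed view
total_home = sketch.query("/home")
```

Because the log is never modified, the batch view can always be rebuilt from scratch, while the speed view only has to cover the events that arrived since the last batch run.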