Benchmarking Apache Flink and Apache Spark DataFlow Systems on Large-Scale Distributed ... PDF

81 Pages·2012·5.16 MB·English

Benchmarking Apache Flink and Apache Spark DataFlow Systems on Large-Scale Distributed Machine Learning Algorithms

Candidate: Andrea Spina
Advisor: Prof. Sonia Bergamaschi
Co-Advisors: Dr. Tilmann Rabl, Christoph Boden
Dipartimento di Ingegneria “Enzo Ferrari” - Laurea Magistrale di Ingegneria Informatica - MODENA

Where, When, How and Why
● Berlin, DE
● 5-month traineeship, May to October ‘16
● Database Systems and Information Management Group, Technische Universität
● Team Project - Systems Performance Research Unit

Agenda
● Background
● Experiments Definition
● Benchmarking and Results Analysis
● Insights by Results

Background

Why Distributed Machine Learning?
● Multiple Sources
● MapReduce
● Iterative Algorithms

Because it represents innovation …
SOLUTION
● Multiple Sources
● Iterative Algorithms
● Massive DataFlow Engines

One of the goals - fairness
● release the code as open source
● keep jobs reproducible
● make the benchmark exhaustive
● …
● model the systems as similarly as possible

Another Goal - include more and more systems

My Goal - “OK, may I start with a couple of those?”

Apache Flink vs. Apache Spark
THE GOAL - Benchmark on Performance and Scalability

Description:
One of the goals - fairness. Apache Flink vs. Apache Spark. THE GOAL - Benchmark on Performance and Scalability. Peel: SETUP SUITE, EXPERIMENT SCOPE, SETUP EXP, SETUP RUN. Peel execution flow - turn on systems