Data Mesh in Action

Jacek Majchrzak, Sven Balnojan, and Marian Siwiak, with Mariusz Sieraczkiewicz
Foreword by Jean-Georges Perrin

Manning, Shelter Island

[Figure: Data mesh development elements: data product development cycle details. Data mesh development cycle: choose business case to solve; define and develop data product; improve enabling structures. Data product development cycle: collect and analyze data consumers’ needs; define functional and nonfunctional requirements; design data product architecture; implement data product; measure data product success.]

For online information and ordering of this and other Manning books, please visit www.manning.com. The publisher offers discounts on this book when ordered in quantity. For more information, please contact

Special Sales Department
Manning Publications Co.
20 Baldwin Road
PO Box 761
Shelter Island, NY 11964
Email: [email protected]

©2023 by Manning Publications Co. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.

Recognizing the importance of preserving what has been written, it is Manning’s policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.
The author and publisher have made every effort to ensure that the information in this book was correct at press time. The author and publisher do not assume and hereby disclaim any liability to any party for any loss, damage, or disruption caused by errors or omissions, whether such errors or omissions result from negligence, accident, or any other cause, or from any usage of the information herein.

Manning Publications Co.
20 Baldwin Road
PO Box 761
Shelter Island, NY 11964

Development editor: Ian Hough
Technical development editor: Michael Jensen
Review editor: Adriana Sabo
Production editor: Andy Marinkovich
Copy editor: Sharon Wilkey
Proofreader: Keri Hales
Technical proofreader: Al Krinker
Typesetter: Gordan Salinovic
Cover designer: Marija Tudor

ISBN 9781633439979
Printed in the United States of America

brief contents

PART 1 FOUNDATIONS
1 The what and why of the data mesh
2 Is a data mesh right for you?
3 Kickstart your data mesh MVP in a month

PART 2 THE FOUR PRINCIPLES IN PRACTICE
4 Domain ownership
5 Data as a product
6 Federated computational governance
7 The self-serve data platform

PART 3 INFRASTRUCTURE AND TECHNICAL ARCHITECTURE
8 Comparing self-serve data platforms
9 Solution architecture design

contents

foreword
preface
acknowledgments
about this book
about the authors
about the cover illustration

PART 1 FOUNDATIONS

1 The what and why of the data mesh
1.1 Data mesh 101
1.2 Why the data mesh?
    Alternatives
    Data warehouses and data lakes inside the data mesh
    Data mesh benefits
1.3 Use case: A snow-shoveling business
1.4 Data mesh principles
    Domain-oriented decentralized data ownership and architecture
    Data as a product
    Federated computational governance
    Self-serve data infrastructure as a platform
1.5 Back to snow shoveling
1.6 Socio-technical architecture
    Conway’s law
    Team topologies
    Cognitive load
1.7 Data mesh challenges
    Technological challenges
    Data management challenges
    Organizational challenges

2 Is a data mesh right for you?
2.1 Analyzing data mesh drivers
    Business drivers
    Organizational drivers
    Domain-data drivers
    Minor organizational drivers
    Is a data mesh a good fit for me?
2.2 Data mesh alternatives and complementary solutions
    Enterprise data warehouse
    Data lake
    Data lakehouse
    Data fabric
    Data mesh vs. the rest of the world
2.3 Understanding a data mesh implementation effort
    The data mesh development cycle
    Development cycle in the shoveling example
    Enabling the team
    Development cycle in detail

3 Kickstart your data mesh MVP in a month
3.1 Getting the lay of the land
    Drawing a system landscape diagram
    Performing stakeholder analysis
3.2 Identifying candidates for the MVP implementation team
    Choosing development teams
    Choosing the cooperation model
    Choosing a data governance team
3.3 Setting up MVP governance
    Defining data mesh value statement(s)
    Defining data governance policies
    Federating data governance
3.4 Developing minimal data products
    Identifying domain-oriented datasets
    Choosing data product owners
    Deciding on the minimum viable data product description
    Developing the simplest tools to expose your data
3.5 Setting up the minimal platform
    Ensuring platform-forced governability
    Ensuring platform security

PART 2 THE FOUR PRINCIPLES IN PRACTICE

4 Domain ownership
4.1 Capturing and analyzing domains
    Domain-driven design 101
    Invite the right people
    Choose the correct workshop technique
4.2 Applying ownership using domain decomposition
    Domain, subdomain, and business capability
    Decompose domains using business capability modeling
    How are domains and business capabilities related to data?
    Assign responsibilities to the data-product-owning team
    Choose the right team to own data
4.3 Applying ownership using data use cases
    Data use cases
    Model and bounded context
    Set up boundaries of use-case-driven data products
    Choose the right team to own data
4.4 Applying ownership using design heuristics
    What is a heuristic?
    Using design heuristics
    Designing heuristics and possible boundaries
4.5 Final landscape: The mesh of interconnected data products
    Messflix data mesh
    Data products form a mesh
    Is it already a data mesh?

5 Data as a product
5.1 Applying product thinking
    Product thinking analysis
    Data product canvas
5.2 What is a data product?
    Data product definition
    Product, not project
    What can be a data product?
5.3 Data product ownership
    Data product owner
    Data product owner responsibilities
    An Agile DevOps team as a base for data product dev team
    Data product owner and product owner
5.4 Conceptual architecture of a data product
    External architecture view
    Internal architecture view