
Seeking SRE: Conversations About Running Production Systems at Scale PDF

999 Pages·2018·10.73 MB·English

Preview Seeking SRE: Conversations About Running Production Systems at Scale

Praise for Seeking SRE

“Reading this book is like being a fly on the wall as SREs discuss the challenges and successes they’ve had implementing SRE strategies outside of Google. A must-read for everyone in tech!”
Thomas A. Limoncelli, SRE Manager, Stack Overflow, Inc., Google SRE Alum

“A fantastic collection of SRE insights and principles from engineers at Google, Netflix, Dropbox, SoundCloud, Spotify, Amazon, and more. Seeking SRE shares the secrets to high availability and durability for many of the most popular products we all know and use.”
Tammy Butow, Principal SRE, Gremlin

“Imagine you invited all your favorite SREs to a big dinner party where you just walked around all night quietly eavesdropping. What would you hear? This book is that. These are the conversations that happen between the sessions at conferences or over lunch. These are the (sometimes animated, but always principled) debates we have among ourselves. This book is your seat at the SRE family kitchen table.”
Dave Rensin, Director of Google CRE

“Although Google’s two SRE books have been a force for good in the industry, they primarily frame the SRE narrative in the context of the solutions Google decided upon, and those may or may not work for every organization. Seeking SRE does an excellent job of demonstrating how SRE tenets can be adopted (or adapted) in various contexts across different organizations, while still staying true to the core principles championed by Google. In addition to providing the rationale and technical underpinning behind several of the infrastructural paradigms du jour that are required to build resilient systems, Seeking SRE also underscores the cultural scaffolding needed to ensure their successful implementation. The result is an actionable blueprint that the reader can use to make informed choices about when, why, and how to introduce these changes into existing infrastructures and organizations.”
Cindy Sridharan, Distributed Systems Engineer

Seeking SRE
Conversations About Running Production Systems at Scale
Curated and edited by David N. Blank-Edelman

Copyright © 2018 David N. Blank-Edelman. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com/safari). For more information, contact our corporate/institutional sales department: 800-998-9938 or [email protected].

Editor: Virginia Wilson
Acquisitions Editor: Nikki McDonald
Proofreader: Rachel Monaghan
Copyeditor: Octal Publishing Services, Inc.
Production Editors: Kristen Brown and Melanie Yarbrough
Indexer: WordCo Indexing Services, Inc.
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Rebecca Demarest

September 2018: First Edition

Revision History for the First Edition
2018-08-21: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781491978863 for release details.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Seeking SRE, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

The views expressed in this work are those of the authors, and do not represent the publisher’s views. While the publisher and the authors have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

978-1-49197886-3
[GP]

Introduction
David N. Blank-Edelman, curator/editor

And So It Begins...

Conversations. That’s the most important word in the title of this book, so pardon the lack of subtlety I’m demonstrating by making it the first and last word of this book. Why is it so important? That’s where the “Seeking” part of Seeking SRE comes in. The people I respect in the Site Reliability Engineering (SRE) field all believe that the field itself is still evolving, expanding, changing, and being discovered. We are all in some sense still seeking SRE.

In my experience, fields like ours grow best when the people in that field, the actual practitioners, talk to one another. Bring people together, let them talk, argue, laugh, share their experiences (successes and failures) and their unsolved problems. A smart, kind, diverse, inclusive, and respectful community in conversation can catalyze a field like nothing else.

Origin Story

It was at SREcon16 Europe, one of the gatherings of the SRE community, that this book was born. (Full disclosure: I’m one of the cofounders of SREcon.) Brian Anderson, the original O’Reilly editor for this book, was on the hunt. The splendid book by Google called Site Reliability Engineering had recently met with much-deserved commercial success and the publisher was on the lookout for more SRE content to publish. He and I were talking about the possibilities during a break when I realized what didn’t exist for SRE. There was no volume I knew of that could bring people into some of the more interesting conversations that were happening in the field (like those that were happening at SREcon).
I was seeing people discuss subjects like these:

New implementations of SRE that didn’t have a book yet. SRE has blossomed in new and exciting ways as it has taken root in different (sometimes brownfield) contexts.
Innovative ways to learn how to practice SRE.
What gets in the way of adopting SRE.
The best practices people had discovered as they adopted, adapted, and lived it.
Where the field was going next, including the subjects that are new now in the field but will be commonplace in short order.
Finally, and maybe most important, what about the humans in the picture? What is SRE doing for them? What is SRE doing to them? Are humans really the problem in operations (that need to be automated away) or is that shortsighted? Can SRE improve more than just operations?

And, so, the idea for Seeking SRE was born. Much to my surprise and delight, close to 40 authors from all over the field and all over the world liked the idea and decided to join me on this little project. I can’t thank them enough.

Voices

Besides a wee bit of meta matter like what you are reading now, I’ve tried to keep my voice soft in the book so that we could stand together and hear what its amazing contributors have to say. You’ll likely notice that this book doesn’t have a single consistent textual voice (mine or any other editor’s). There’s no attempt made to put the material into a blender and homogenize the chapters into a beige “technical book register.” I intentionally wanted you to hear the different voices from the different contributors just the way they talk. The only instruction they were given on tone was this:

Pretend you are at lunch at a conference like SREcon. You are sitting with a bunch of smart SREs at lunch you don’t know, and one of them says to you, “So, what are you working on? What’s interesting to you these days?” You begin to answer the question… Now write that down.
Even further in this direction is the crowdsourced chapter on the relationship between DevOps and SRE (Chapter 12). When I realized that there are likely many answers to this question and that no one I knew had the only right answer, I put the question out to my social networks (and people were kind enough to broadcast it even further). I’m very grateful to everyone who answered the call.

And just for fun, there are also little “You Know You’re an SRE” tidbits scattered throughout the book from anonymous contributors (I promised anonymity when I asked the internet for them, but thanks to everyone who provided one or several!).

In addition to a variety of voices, there’s also a variety of opinions and viewpoints. Are you going to agree with everything you read in this volume? Gosh, I hope not. I don’t agree with everything, so I don’t see why you should. That would make for a really boring set of conversations, no? I strongly recommend that you remember that there are one or more humans behind every chapter who were brave enough to put their opinions out there, so we all could have a good confab. The SRE community’s capacity to engage with respect is something I’ve come to appreciate over the years, and I know you’ll follow in that tradition.

As the editor/curator of this book, which I’m very proud of, I do want to mention one regret up front. Our field has a real problem with diversity and inclusion of underrepresented minorities. We can’t have all of the important conversations if not everyone is in the room. Despite my attempts to address this lack of representation, this book doesn’t go far enough to work against this situation. I take full responsibility for that failure.

Description:
Organizations — big and small — have started to realize just how crucial system and application reliability is to their business. At the same time, they’ve also learned just how difficult it is to maintain that reliability while iterating at the speed demanded by the marketplace. Site Reliabil
