Bayesian Methods for Hackers
Probabilistic Programming and Bayesian Inference

Cameron Davidson-Pilon

New York • Boston • Indianapolis • San Francisco
Toronto • Montreal • London • Munich • Paris • Madrid
Capetown • Sydney • Tokyo • Singapore • Mexico City

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals.

The author and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions.
No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

For information about buying this title in bulk quantities, or for special sales opportunities (which may include electronic versions; custom cover designs; and content particular to your business, training goals, marketing focus, or branding interests), please contact our corporate sales department at [email protected] or (800) 382-3419.

For government sales inquiries, please contact [email protected].

For questions about sales outside the United States, please contact [email protected].

Visit us on the Web: informit.com/aw

Library of Congress Cataloging-in-Publication Data

Davidson-Pilon, Cameron.
Bayesian methods for hackers : probabilistic programming and bayesian inference / Cameron Davidson-Pilon.
pages cm
Includes bibliographical references and index.
ISBN 978-0-13-390283-9 (pbk.: alk. paper)
1. Penetration testing (Computer security)–Mathematics. 2. Bayesian statistical decision theory. 3. Soft computing. I. Title.
QA76.9.A25D376 2015
006.3–dc23
2015017249

Copyright © 2016 Cameron Davidson-Pilon

All rights reserved. Printed in the United States of America. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. To obtain permission to use material from this work, please submit a written request to Pearson Education, Inc., Permissions Department, 200 Old Tappan Road, Old Tappan, New Jersey 07675, or you may fax your request to (201) 236-3290.

The code throughout and Chapters 1 through 6 in this book are released under the MIT License.
ISBN-13: 978-0-13-390283-9
ISBN-10: 0-13-390283-8

Text printed in the United States on recycled paper at RR Donnelley in Crawfordsville, Indiana.
First printing, October 2015

This book is dedicated to many important relationships: my parents, my brothers, and my closest friends. Second to them, it is devoted to the open-source community, whose work we consume every day without knowing.

Contents

Foreword
Preface
Acknowledgments
About the Author

1 The Philosophy of Bayesian Inference
  1.1 Introduction
    1.1.1 The Bayesian State of Mind
    1.1.2 Bayesian Inference in Practice
    1.1.3 Are Frequentist Methods Incorrect?
    1.1.4 A Note on “Big Data”
  1.2 Our Bayesian Framework
    1.2.1 Example: Mandatory Coin-Flip
    1.2.2 Example: Librarian or Farmer?
  1.3 Probability Distributions
    1.3.1 Discrete Case
    1.3.2 Continuous Case
    1.3.3 But What Is λ?
  1.4 Using Computers to Perform Bayesian Inference for Us
    1.4.1 Example: Inferring Behavior from Text-Message Data
    1.4.2 Introducing Our First Hammer: PyMC
    1.4.3 Interpretation
    1.4.4 What Good Are Samples from the Posterior, Anyway?
  1.5 Conclusion
  1.6 Appendix
    1.6.1 Determining Statistically if the Two λs Are Indeed Different?
    1.6.2 Extending to Two Switchpoints
  1.7 Exercises
    1.7.1 Answers
  1.8 References

2 A Little More on PyMC
  2.1 Introduction
    2.1.1 Parent and Child Relationships
    2.1.2 PyMC Variables
    2.1.3 Including Observations in the Model
    2.1.4 Finally...
  2.2 Modeling Approaches
    2.2.1 Same Story, Different Ending
    2.2.2 Example: Bayesian A/B Testing
    2.2.3 A Simple Case
    2.2.4 A and B Together
    2.2.5 Example: An Algorithm for Human Deceit
    2.2.6 The Binomial Distribution
    2.2.7 Example: Cheating Among Students
    2.2.8 Alternative PyMC Model
    2.2.9 More PyMC Tricks
    2.2.10 Example: Challenger Space Shuttle Disaster
    2.2.11 The Normal Distribution
    2.2.12 What Happened the Day of the Challenger Disaster?
  2.3 Is Our Model Appropriate?
    2.3.1 Separation Plots
  2.4 Conclusion
  2.5 Appendix
  2.6 Exercises
    2.6.1 Answers
  2.7 References

3 Opening the Black Box of MCMC
  3.1 The Bayesian Landscape
    3.1.1 Exploring the Landscape Using MCMC
    3.1.2 Algorithms to Perform MCMC