Python Automation Cookbook

Explore the world of automation using Python recipes that will enhance your skills

Jaime Buelta

BIRMINGHAM - MUMBAI

Copyright © 2018 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Commissioning Editor: Aaron Lazar
Acquisition Editor: Shriram Shekhar
Content Development Editor: Manjusha Mantri
Technical Editor: Adhithya Haridas
Copy Editor: Safis Editing
Project Coordinator: Prajakta Naik
Proofreader: Safis Editing
Indexer: Mariammal Chettiyar
Graphics: Jisha Chirayil
Production Coordinator: Shantanu Zagade

First published: September 2018
Production reference: 1260918

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.

ISBN 978-1-78913-380-6

www.packtpub.com

In loving memory of Banjo

Contributors

About the author

Jaime Buelta has been a professional programmer and a full-time Python developer and has been exposed to a lot of different technologies over his career. He has developed software for a variety of fields and industries, including aerospace, networking and communications, industrial SCADA systems, video game online services, and finance services. As part of these companies, he worked closely with various areas, such as marketing, management, sales, and game design, helping the companies achieve their goals. He is a strong proponent of automating everything and making computers do most of the heavy lifting so users can focus on the important stuff. He is currently living in Dublin, Ireland, and has been a regular speaker at PyCon Ireland.

This book could not have happened without the support and encouragement of my amazing wife, Dana. I also want to thank the team at Packt, especially Manjusha for her huge help in the process, and Shriram for encouraging me to write the book. Also, great thanks to Mario for reviewing the book and improving it. Finally, I'd like to thank the whole Python community.
I can't overstate what a joy it is to work as a developer in the Python world.

About the reviewer

Mario Corchero is a senior software developer at Bloomberg. He leads the Python infrastructure team in London, enabling the company to work effectively in Python and building company-wide libraries and tools. His professional experience is mainly in C++ and Python, and he has contributed some patches to multiple Python open source projects. He is a PSF fellow, having received the Q3 2018 PSF Community Award, is vice president of Python España (the Python Spain association), and has served as Chair of PyLondinium, PyConES17, and PyCon Charlas at PyCon 2018. Mario is passionate about the Python community, open source, and inner source.

Table of Contents

Preface
Chapter 1: Let Us Begin Our Automation Journey
    Introduction
    Creating a virtual environment
        Getting ready
        How to do it...
        How it works...
        There's more...
        See also
    Installing third-party packages
        Getting ready
        How to do it...
        How it works...
        There's more...
        See also
    Creating strings with formatted values
        Getting ready
        How to do it...
        How it works...
        There's more...
        See also
    Manipulating strings
        Getting ready
        How to do it...
        How it works...
        There's more...
        See also
    Extracting data from structured strings
        Getting ready
        How to do it...
        How it works...
        There's more...
        See also
    Using a third-party tool—parse
        Getting ready
        How to do it...
        How it works...
        There's more...
        See also
    Introducing regular expressions
        Getting ready
        How to do it...
        How it works...
        There's more...
        See also
    Going deeper into regular expressions
        How to do it...
        How it works...
        There's more...
        See also
    Adding command-line arguments
        Getting ready
        How to do it...
        How it works...
        There's more...
        See also
Chapter 2: Automating Tasks Made Easy
    Introduction
    Preparing a task
        Getting ready
        How to do it...
        How it works...
        There's more...
        See also
    Setting up a cron job
        Getting ready
        How to do it...
        How it works...
        There's more...
        See also
    Capturing errors and problems
        Getting ready
        How to do it...
        How it works...
        There's more...
        See also
    Sending email notifications
        Getting ready
        How to do it...
        How it works...
        There's more...
        See also
Chapter 3: Building Your First Web Scraping Application
    Introduction
    Downloading web pages
        Getting ready
        How to do it...
        How it works...
        There's more...
        See also
    Parsing HTML
        Getting ready
        How to do it...
        How it works...
        There's more...
        See also
    Crawling the web
        Getting ready
        How to do it...
        How it works...
        There's more...
        See also
    Subscribing to feeds
        Getting ready
        How to do it...
        How it works...
        There's more...
        See also
    Accessing web APIs
        Getting ready
        How to do it...
        How it works...
        There's more...
        See also
    Interacting with forms
        Getting ready
        How to do it...
        How it works...
        There's more...
        See also
    Using Selenium for advanced interaction
        Getting ready
        How to do it...
        How it works...
        There's more...
        See also
    Accessing password-protected pages
        Getting ready