
Python for Informatics - Exploring Information PDF

248 Pages·1.468 MB·English

Preview Python for Informatics - Exploring Information

Python for Informatics
Exploring Information
Version 0.0.9-d2
Charles Severance

Copyright © 2009- Charles Severance.

Printing history:

October 2013: Major revision to Chapters 13 and 14 to switch to JSON and use OAuth. Added new chapter on Visualization.
September 2013: Published book on Amazon CreateSpace.
January 2010: Published book using the University of Michigan Espresso Book machine.
December 2009: Major revision to chapters 2-10 from Think Python: How to Think Like a Computer Scientist and writing chapters 1 and 11-15 to produce Python for Informatics: Exploring Information.
June 2008: Major revision, changed title to Think Python: How to Think Like a Computer Scientist.
August 2007: Major revision, changed title to How to Think Like a (Python) Programmer.
April 2002: First edition of How to Think Like a Computer Scientist.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. This license is available at creativecommons.org/licenses/by-nc-sa/3.0/. You can see what the author considers commercial and non-commercial uses of this material as well as license exemptions in the Appendix titled Copyright Detail.

The LaTeX source for the Think Python: How to Think Like a Computer Scientist version of this book is available from http://www.thinkpython.com.

Preface

Python for Informatics: Remixing an Open Book

It is quite natural for academics who are continuously told to “publish or perish” to want to always create something from scratch that is their own fresh creation. This book is an experiment in not starting from scratch, but instead “re-mixing” the book titled Think Python: How to Think Like a Computer Scientist written by Allen B. Downey, Jeff Elkner and others.

In December of 2009, I was preparing to teach SI502 - Networked Programming at the University of Michigan for the fifth semester in a row and decided it was time to write a Python textbook that focused on exploring data instead of understanding algorithms and abstractions. My goal in SI502 is to teach people life-long data handling skills using Python.
Few of my students were planning to be professional computer programmers. Instead, they planned to be librarians, managers, lawyers, biologists, economists, etc. who happened to want to skillfully use technology in their chosen field.

I never seemed to find the perfect data-oriented Python book for my course so I set out to write just such a book. Luckily at a faculty meeting three weeks before I was about to start my new book from scratch over the holiday break, Dr. Atul Prakash showed me the Think Python book which he had used to teach his Python course that semester. It is a well-written Computer Science text with a focus on short, direct explanations and ease of learning.

The overall book structure has been changed to get to doing data analysis problems as quickly as possible and have a series of running examples and exercises about data analysis from the very beginning.

Chapters 2-10 are similar to the Think Python book but there have been major changes. Number-oriented examples and exercises have been replaced with data-oriented exercises. Topics are presented in the order needed to build increasingly sophisticated data analysis solutions. Some topics like try and except are pulled forward and presented as part of the chapter on conditionals. Functions are given very light treatment until they are needed to handle program complexity rather than introduced as an early lesson in abstraction. Nearly all user-defined functions have been removed from the example code and exercises outside Chapter 4. The word “recursion”[1] does not appear in the book at all.

In chapters 1 and 11-16, all of the material is brand new, focusing on real-world uses and simple examples of Python for data analysis including regular expressions for searching and parsing, automating tasks on your computer, retrieving data across the network, scraping web pages for data, using web services, parsing XML and JSON data, and creating and using databases using Structured Query Language.
The ultimate goal of all of these changes is a shift from a Computer Science to an Informatics focus: to include in a first technology class only topics that can be useful even if one chooses not to become a professional programmer.

Students who find this book interesting and want to further explore should look at Allen B. Downey’s Think Python book. Because there is a lot of overlap between the two books, students will quickly pick up skills in the additional areas of technical programming and algorithmic thinking that are covered in Think Python. And given that the books have a similar writing style, they should be able to move quickly through Think Python with a minimum of effort.

As the copyright holder of Think Python, Allen has given me permission to change the book’s license on the material from his book that remains in this book from the GNU Free Documentation License to the more recent Creative Commons Attribution-ShareAlike license. This follows a general shift in open documentation licenses moving from the GFDL to the CC-BY-SA (e.g. Wikipedia). Using the CC-BY-SA license maintains the book’s strong copyleft tradition while making it even more straightforward for new authors to reuse this material as they see fit.

I feel that this book serves as an example of why open materials are so important to the future of education, and want to thank Allen B. Downey and Cambridge University Press for their forward-looking decision to make the book available under an open copyright. I hope they are pleased with the results of my efforts and I hope that you the reader are pleased with our collective efforts.

I would like to thank Allen B. Downey and Lauren Cowles for their help, patience, and guidance in dealing with and resolving the copyright issues around this book.

Charles Severance
www.dr-chuck.com
Ann Arbor, MI, USA
September 9, 2013

Charles Severance is a Clinical Associate Professor at the University of Michigan School of Information.

[1] Except of course for this line.

Contents

Preface

1 Why should you learn to write programs?
  1.1 Creativity and motivation
  1.2 Computer hardware architecture
  1.3 Understanding programming
  1.4 Words and sentences
  1.5 Conversing with Python
  1.6 Terminology: interpreter and compiler
  1.7 Writing a program
  1.8 What is a program?
  1.9 The building blocks of programs
  1.10 What could possibly go wrong?
  1.11 The learning journey
  1.12 Glossary
  1.13 Exercises

2 Variables, expressions and statements
  2.1 Values and types
  2.2 Variables
  2.3 Variable names and keywords
  2.4 Statements
  2.5 Operators and operands
  2.6 Expressions
  2.7 Order of operations
  2.8 Modulus operator
  2.9 String operations
  2.10 Asking the user for input
  2.11 Comments
  2.12 Choosing mnemonic variable names
  2.13 Debugging
  2.14 Glossary
  2.15 Exercises

3 Conditional execution
  3.1 Boolean expressions
  3.2 Logical operators
  3.3 Conditional execution
  3.4 Alternative execution
  3.5 Chained conditionals
  3.6 Nested conditionals
  3.7 Catching exceptions using try and except
  3.8 Short circuit evaluation of logical expressions
  3.9 Debugging
  3.10 Glossary
  3.11 Exercises

4 Functions
  4.1 Function calls
  4.2 Built-in functions
  4.3 Type conversion functions
  4.4 Random numbers
  4.5 Math functions
  4.6 Adding new functions
  4.7 Definitions and uses
  4.8 Flow of execution
  4.9 Parameters and arguments
  4.10 Fruitful functions and void functions
  4.11 Why functions?
  4.12 Debugging
  4.13 Glossary
  4.14 Exercises

5 Iteration
  5.1 Updating variables
  5.2 The while statement
  5.3 Infinite loops
  5.4 “Infinite loops” and break
  5.5 Finishing iterations with continue
  5.6 Definite loops using for
  5.7 Loop patterns
  5.8 Debugging
  5.9 Glossary
  5.10 Exercises

6 Strings
  6.1 A string is a sequence
  6.2 Getting the length of a string using len
  6.3 Traversal through a string with a loop
  6.4 String slices
  6.5 Strings are immutable
  6.6 Looping and counting
  6.7 The in operator
  6.8 String comparison
  6.9 string methods
  6.10 Parsing strings
  6.11 Format operator
  6.12 Debugging
  6.13 Glossary
  6.14 Exercises

7 Files
  7.1 Persistence
  7.2 Opening files
  7.3 Text files and lines
  7.4 Reading files
  7.5 Searching through a file
  7.6 Letting the user choose the file name
  7.7 Using try, except, and open
  7.8 Writing files
  7.9 Debugging
  7.10 Glossary
  7.11 Exercises

8 Lists
  8.1 A list is a sequence
  8.2 Lists are mutable
  8.3 Traversing a list
  8.4 List operations
  8.5 List slices
  8.6 List methods
  8.7 Deleting elements
  8.8 Lists and functions
  8.9 Lists and strings
  8.10 Parsing lines
  8.11 Objects and values
  8.12 Aliasing
  8.13 List arguments
  8.14 Debugging
  8.15 Glossary
  8.16 Exercises

9 Dictionaries
  9.1 Dictionary as a set of counters
  9.2 Dictionaries and files
  9.3 Looping and dictionaries
  9.4 Advanced text parsing
  9.5 Debugging
  9.6 Glossary
  9.7 Exercises

10 Tuples
  10.1 Tuples are immutable
  10.2 Comparing tuples
  10.3 Tuple assignment
  10.4 Dictionaries and tuples
  10.5 Multiple assignment with dictionaries
  10.6 The most common words
  10.7 Using tuples as keys in dictionaries
  10.8 Sequences: strings, lists, and tuples–Oh My!
  10.9 Debugging
  10.10 Glossary
  10.11 Exercises

11 Regular expressions
  11.1 Character matching in regular expressions
  11.2 Extracting data using regular expressions
  11.3 Combining searching and extracting
  11.4 Escape character
  11.5 Summary
  11.6 Bonus section for Unix users
  11.7 Debugging
  11.8 Glossary
  11.9 Exercises

12 Networked programs
  12.1 HyperText Transport Protocol - HTTP
  12.2 The World’s Simplest Web Browser
  12.3 Retrieving an image over HTTP
  12.4 Retrieving web pages with urllib
  12.5 Parsing HTML and scraping the web
  12.6 Parsing HTML using Regular Expressions
  12.7 Parsing HTML using BeautifulSoup
  12.8 Reading binary files using urllib
  12.9 Glossary
  12.10 Exercises

13 Using Web Services
  13.1 eXtensible Markup Language - XML
  13.2 Parsing XML
  13.3 Looping through nodes
  13.4 JavaScript Object Notation - JSON
  13.5 Parsing JSON
  13.6 Application Programming Interfaces (API)
