Programming Computer Vision with Python > m o c k. o o b e w o w w. w w < k o o B e w! o W m o d fr a o nl w o D Jan Erik Solem Beijing•Cambridge•Farnham•Ko¨ln•Sebastopol•Tokyo ProgrammingComputerVisionwithPython byJanErikSolem Copyright©2012JanErikSolem.Allrightsreserved. PrintedintheUnitedStatesofAmerica PublishedbyO’ReillyMedia,Inc.,1005GravensteinHighwayNorth,Sebastopol,CA95472. O’Reilly books may be purchased for educational, business, or sales promotional use. Online editionsarealsoavailableformosttitles(http://my.safaribooksonline.com).Formoreinformation, contactourcorporate/institutionalsalesdepartment:(800)[email protected]. Interiordesigner: DavidFutato Projectmanager: PaulC.Anagnostopoulos Coverdesigner: KarenMontgomery Copyeditor: PriscillaStevens Editors: AndyOram,MikeHendrickson Proofreader: RichardCamp Productioneditor: HollyBauer Illustrator: LaurelMuller June2012 Firstedition RevisionHistoryfortheFirstEdition: 2012-06-11 Firstrelease Seehttp://oreilly.com/catalog/errata.csp?isbn=0636920022923forreleasedetails. NutshellHandbook,theNutshellHandbooklogo,andtheO’Reillylogoareregisteredtrademarks ofO’ReillyMedia,Inc.ProgrammingComputerVisionwithPython,theimageofabullheadfish, andrelatedtradedressaretrademarksofO’ReillyMedia,Inc. Manyofthedesignationsusedbymanufacturersandsellerstodistinguishtheirproductsare claimedastrademarks.Wherethosedesignationsappearinthisbook,andO’ReillyMedia,Inc., wasawareofatrademarkclaim,thedesignationshavebeenprintedincapsorinitialcaps. Whileeveryprecautionhasbeentakeninthepreparationofthisbook,thepublisherandauthors assumenoresponsibilityforerrorsoromissions, orfordamagesresultingfromtheuseofthe informationcontainedherein. ISBN:978-1-449-31654-9 [M] Table of Contents Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii 1. BasicImageHandlingandProcessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1 PIL—ThePythonImagingLibrary 1 1.2 Matplotlib 3 1.3 NumPy 7 1.4 SciPy 16 1.5 AdvancedExample:ImageDe-Noising 23 Exercises 26 ConventionsfortheCodeExamples 27 2. LocalImageDescriptors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 2.1 HarrisCornerDetector 29 2.2 SIFT—Scale-InvariantFeatureTransform 36 2.3 MatchingGeotaggedImages 44 Exercises 51 3. ImagetoImageMappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 3.1 Homographies 53 3.2 WarpingImages 57 3.3 CreatingPanoramas 70 Exercises 77 4. CameraModelsandAugmentedReality. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 4.1 ThePin-HoleCameraModel 79 4.2 CameraCalibration 84 4.3 PoseEstimationfromPlanesandMarkers 86 4.4 AugmentedReality 89 Exercises 98 iii 5. MultipleViewGeometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 5.1 EpipolarGeometry 99 5.2 ComputingwithCamerasand3DStructure 107 5.3 MultipleViewReconstruction 113 5.4 StereoImages 120 Exercises 125 6. ClusteringImages. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127 6.1 K-MeansClustering 127 6.2 HierarchicalClustering 133 6.3 SpectralClustering 140 Exercises 145 7. SearchingImages. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147 7.1 Content-BasedImageRetrieval 147 7.2 VisualWords 148 7.3 IndexingImages 151 7.4 SearchingtheDatabaseforImages 155 7.5 RankingResultsUsingGeometry 160 7.6 BuildingDemosandWebApplications 162 Exercises 165 8. ClassifyingImageContent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167 8.1 K-NearestNeighbors 167 8.2 BayesClassifier 175 8.3 SupportVectorMachines 179 8.4 OpticalCharacterRecognition 183 Exercises 189 9. ImageSegmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 9.1 GraphCuts 191 9.2 SegmentationUsingClustering 200 9.3 VariationalMethods 204 Exercises 206 10. OpenCV. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209 10.1 TheOpenCVPythonInterface 209 10.2 OpenCVBasics 210 10.3 ProcessingVideo 213 10.4 Tracking 216 10.5 MoreExamples 223 Exercises 226 iv | TableofContents A. InstallingPackages. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227 A.1 NumPyandSciPy 227 A.2 Matplotlib 228 A.3 PIL 228 A.4 LibSVM 228 A.5 OpenCV 229 A.6 VLFeat 230 A.7 PyGame 230 A.8 PyOpenGL 230 A.9 Pydot 230 A.10 Python-graph 231 A.11 Simplejson 231 A.12 PySQLite 232 A.13 CherryPy 232 B. ImageDatasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233 B.1 Flickr 233 B.2 Panoramio 234 B.3 OxfordVisualGeometryGroup 235 B.4 UniversityofKentuckyRecognitionBenchmarkImages 235 B.5 Other 235 C. ImageCredits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237 C.1 ImagesfromFlickr 237 C.2 OtherImages 238 C.3 Illustrations 238 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239 Index. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243 TableofContents | v > m o c k. o o b e w o w w. w w < k o o B e w! o W m o d fr a o nl w o D Preface Today, images and video are everywhere. Online photo-sharing sites and social net- workshavetheminthebillions.Searchengineswillproduceimagesofjustaboutany conceivable query. Practically all phones and computers come with built-in cameras. Itisnotuncommonforpeopletohavemanygigabytesofphotosandvideosontheir devices. Programmingacomputeranddesigningalgorithmsforunderstandingwhatisinthese imagesisthefieldofcomputervision.Computervisionpowersapplicationslikeimage search,robotnavigation,medicalimageanalysis,photomanagement,andmanymore. The idea behind this book is to give an easily accessible entry point to hands-on computervisionwithenoughunderstandingoftheunderlyingtheoryandalgorithms tobeafoundationforstudents,researchers,andenthusiasts.ThePythonprogramming language,thelanguagechoiceofthisbook,comeswithmanyfreelyavailable,powerful modulesforhandlingimages,mathematicalcomputing,anddatamining. Whenwritingthisbook,Ihaveusedthefollowingprinciplesasaguideline.Thebook should: Bewritteninanexploratorystyleandencouragereaderstofollowtheexampleson . theircomputersastheyarereadingthetext. Promoteandusefreeandopensoftwarewithalowlearningthreshold.Pythonwas . theobviouschoice. Becompleteandself-contained.Thisbookdoesnotcoverallofcomputervision . butratheritshouldbecompleteinthatallcodeispresentedandexplained.The readershouldbeabletoreproducetheexamplesandbuilduponthemdirectly. Bebroadratherthandetailed,inspiringandmotivationalratherthantheoretical. . Inshort,itshouldactasasourceofinspirationforthoseinterestedinprogramming computervisionapplications. vii PrerequisitesandOverview Thisbooklooksattheoryandalgorithmsforawiderangeofapplicationsandproblems. Hereisashortsummaryofwhattoexpect. WhatYouNeedtoKnow Basicprogrammingexperience.Youneedtoknowhowtouseaneditorandrun . scripts,howtostructurecodeaswellasbasicdatatypes.FamiliaritywithPython orotherscriptinglanguageslikeRubyorMatlabwillhelp. Basicmathematics.Tomakefulluseoftheexamples,ithelpsifyouknowabout . matrices,vectors,matrixmultiplication,andstandardmathematicalfunctionsand conceptslikederivativesandgradients.Someofthemoreadvancedmathematical examplescanbeeasilyskipped. WhatYouWillLearn Hands-onprogrammingwithimagesusingPython. . Computervisiontechniquesbehindawidevarietyofreal-worldapplications. . Many of the fundamental algorithms and how to implement and apply them . yourself. The code examples in this book will show you object recognition, content-based imageretrieval,imagesearch,opticalcharacterrecognition,opticalflow,tracking,3D reconstruction,stereoimaging,augmentedreality,poseestimation,panoramacreation, imagesegmentation,de-noising,imagegrouping,andmore. ChapterOverview Chapter1,“BasicImageHandlingandProcessing” IntroducesthebasictoolsforworkingwithimagesandthecentralPythonmodules usedinthebook.Thischapteralsocoversmanyfundamentalexamplesneededfor theremainingchapters. Chapter2,“LocalImageDescriptors” Explainsmethodsfordetectinginterestpointsinimagesandhowtousethemto findcorrespondingpointsandregionsbetweenimages. Chapter3,“ImagetoImageMappings” Describesbasictransformationsbetweenimagesandmethodsforcomputingthem. Examplesrangefromimagewarpingtocreatingpanoramas. Chapter4,“CameraModelsandAugmentedReality” Introduces how to model cameras, generate image projections from 3D space to imagefeatures,andestimatethecameraviewpoint. Chapter5,“MultipleViewGeometry” Explainshowtoworkwithseveralimagesofthesamescene,thefundamentalsof multiple-viewgeometry,andhowtocompute3Dreconstructionsfromimages. viii | Preface
Description: