ebook img

Computer Vision: Algorithms and Applications (Texts in Computer Science) PDF

824 Pages·2010·52.8 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Computer Vision: Algorithms and Applications (Texts in Computer Science)

Texts in Computer Science Editors DavidGries FredB.Schneider Forfurthervolumes: www.springer.com/series/3191 Richard Szeliski Computer Vision Algorithms and Applications 123 Dr. Richard Szeliski Microsoft Research One Microsoft Way 98052-6399 Redmond Washington USA [email protected] SeriesEditors DavidGries FredB.Schneider DepartmentofComputerScience DepartmentofComputerScience UpsonHall UpsonHall CornellUniversity CornellUniversity Ithaca,NY14853-7501,USA Ithaca,NY14853-7501,USA ISSN 1868-0941 e-ISSN 1868-095X ISBN 978-1-84882-934-3 e-ISBN 978-1-84882-935-0 DOI 10.1007/978-1-84882-935-0 Springer London Dordrecht Heidelberg New York BritishLibraryCataloguinginPublicationData AcataloguerecordforthisbookisavailablefromtheBritishLibrary LibraryofCongressControlNumber:2010936817 © Springer-VerlagLondonLimited2011 Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permittedundertheCopyright,DesignsandPatentsAct1988,thispublicationmayonlybereproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers,orinthecaseofreprographicreproductioninaccordancewiththetermsoflicensesissuedby theCopyrightLicensingAgency.Enquiriesconcerningreproductionoutsidethosetermsshouldbesent tothepublishers. Theuseofregisterednames,trademarks,etc.,inthispublicationdoesnotimply,evenintheabsenceofa specificstatement,thatsuchnamesareexemptfromtherelevantlawsandregulationsandthereforefree forgeneraluse. Thepublishermakesnorepresentation,expressorimplied,withregardtotheaccuracyoftheinformation containedinthisbookandcannotacceptanylegalresponsibilityorliabilityforanyerrorsoromissions thatmaybemade. Printedonacid-freepaper SpringerispartofSpringerScience+BusinessMedia(www.springer.com) Thisbookisdedicatedtomyparents, ZdzisławandJadwiga, andmyfamily, Lyn,Anne,andStephen. 1 Introduction 1 Whatiscomputervision? • Abriefhistory • Bookoverview • Samplesyllabus • Notation n^ 2 Imageformation 27 Geometricprimitivesandtransformations • Photometricimageformation • Thedigitalcamera 3 Imageprocessing 87 Pointoperators • Linearfiltering • Moreneighborhoodoperators • Fouriertransforms • Pyramidsandwavelets • Geometrictransformations • Globaloptimization 4 Featuredetectionandmatching 181 Pointsandpatches • Edges • Lines 5 Segmentation 235 Activecontours • Splitandmerge • Meanshiftandmodefinding • Normalizedcuts • Graphcutsandenergy-basedmethods 6 Feature-basedalignment 273 2Dand3Dfeature-basedalignment • Poseestimation • Geometricintrinsiccalibration 7 Structurefrommotion 303 Triangulation • Two-framestructurefrommotion • Factorization • Bundleadjustment • Constrainedstructureandmotion 8 Densemotionestimation 335 Translationalalignment • Parametricmotion • Spline-basedmotion • Opticalflow • Layeredmotion 9 Imagestitching 375 Motionmodels • Globalalignment • Compositing 10 Computationalphotography 409 Photometriccalibration • Highdynamicrangeimaging • Super-resolutionandblurremoval • Imagemattingandcompositing • Textureanalysisandsynthesis 11 Stereocorrespondence 467 Epipolargeometry • Sparsecorrespondence • Densecorrespondence • Localmethods • Globaloptimization • Multi-viewstereo 12 3Dreconstruction 505 ShapefromX • Activerangefinding • Surfacerepresentations • Point-basedrepresentations • Volumetricrepresentations • Model-basedreconstruction • Recoveringtexturemapsandalbedos 13 Image-basedrendering 543 Viewinterpolation • Layereddepthimages • LightfieldsandLumigraphs • Environmentmattes • Video-basedrendering 14 Recognition 575 Objectdetection • Facerecognition • Instancerecognition • Categoryrecognition • Contextandsceneunderstanding • Recognitiondatabasesandtestsets Preface Theseedsforthisbookwerefirstplantedin2001whenSteveSeitzattheUniversityofWash- ingtoninvitedmetoco-teachacoursecalled“ComputerVisionforComputerGraphics”. At thattime,computervisiontechniqueswereincreasinglybeingusedincomputergraphicsto createimage-basedmodelsofreal-worldobjects,tocreatevisualeffects,andtomergereal- world imagery using computational photography techniques. Our decision to focus on the applicationsofcomputervisiontofunproblemssuchasimagestitchingandphoto-based3D modelingfrompersonalphotosseemedtoresonatewellwithourstudents. Sincethattime,asimilarsyllabusandproject-orientedcoursestructurehasbeenusedto teachgeneralcomputervisioncoursesbothattheUniversityofWashingtonandatStanford. (ThelatterwasacourseIco-taughtwithDavidFleetin2003.) Similarcurriculahavebeen adoptedatanumberofotheruniversitiesandalsoincorporatedintomorespecializedcourses oncomputationalphotography.(Forideasonhowtousethisbookinyourowncourse,please seeTable1.1inSection1.4.) Thisbookalsoreflectsmy20years’experiencedoingcomputervisionresearchincorpo- rateresearchlabs,mostlyatDigitalEquipmentCorporation’sCambridgeResearchLaband atMicrosoftResearch. Inpursuingmywork, Ihavemostlyfocusedonproblemsandsolu- tiontechniques(algorithms)thathavepracticalreal-worldapplicationsandthatworkwellin practice. Thus,thisbookhasmoreemphasisonbasictechniquesthatworkunderreal-world conditionsandlessonmoreesotericmathematicsthathasintrinsicelegancebutlesspractical applicability. Thisbookissuitableforteachingasenior-levelundergraduatecourseincomputervision to students in both computer science and electrical engineering. I prefer students to have either an image processing or a computer graphics course as a prerequisite so that they can spendlesstimelearninggeneralbackgroundmathematicsandmoretimestudyingcomputer visiontechniques. Thebookisalsosuitableforteachinggraduate-levelcoursesincomputer vision(bydelvingintothemoredemandingapplicationandalgorithmicareas)andasagen- eralreferencetofundamentaltechniquesandtherecentresearchliterature.Tothisend,Ihave attemptedwhereverpossibletoatleastcitethenewestresearchineachsub-field,evenifthe technicaldetailsaretoocomplextocoverinthebookitself. Inteachingourcourses,wehavefounditusefulforthestudentstoattemptanumberof smallimplementationprojects,whichoftenbuildononeanother,inordertogetthemusedto workingwithreal-worldimagesandthechallengesthatthesepresent. Thestudentsarethen askedtochooseanindividualtopicforeachoftheirsmall-group,finalprojects. (Sometimes these projects even turn into conference papers!) The exercises at the end of each chapter contain numerous suggestions for smaller mid-term projects, as well as more open-ended x problems whose solutions are still active research topics. Wherever possible, I encourage studentstotrytheiralgorithmsontheirownpersonalphotographs,sincethisbettermotivates them, often leads to creative variants on the problems, and better acquaints them with the varietyandcomplexityofreal-worldimagery. Informulatingandsolvingcomputervisionproblems,Ihaveoftenfounditusefultodraw inspirationfromthreehigh-levelapproaches: • Scientific: build detailed models of the image formation process and develop mathe- maticaltechniquestoinverttheseinordertorecoverthequantitiesofinterest(where necessary,makingsimplifyingassumptiontomakethemathematicsmoretractable). • Statistical: useprobabilisticmodelstoquantifythepriorlikelihoodofyourunknowns andthenoisymeasurementprocessesthatproducetheinputimages,theninferthebest possibleestimatesofyourdesiredquantitiesandanalyzetheirresultinguncertainties. Theinferencealgorithmsusedareoftencloselyrelatedtotheoptimizationtechniques usedtoinvertthe(scientific)imageformationprocesses. • Engineering: develop techniques that are simple to describe and implement but that are also known to work well in practice. Test these techniques to understand their limitation and failure modes, as well as their expected computational costs (run-time performance). Thesethreeapproachesbuildoneachotherandareusedthroughoutthebook. Mypersonalresearchanddevelopmentphilosophy(andhencetheexercisesinthebook) haveastrongemphasisontestingalgorithms. It’stooeasyincomputervisiontodevelopan algorithmthatdoessomethingplausibleonafewimagesratherthansomethingcorrect. The bestwaytovalidateyouralgorithmsistouseathree-partstrategy. First,testyouralgorithmoncleansyntheticdata,forwhichtheexactresultsareknown. Second, add noise to the data and evaluate how the performance degrades as a function of noise level. Finally, test the algorithm on real-world data, preferably drawn from a wide varietyofsources, suchasphotosfoundontheWeb. Onlythencanyoutrulyknowifyour algorithm can deal with real-world complexity, i.e., images that do not fit some simplified modelorassumptions. Inordertohelpstudentsinthisprocess,thisbookscomeswithalargeamountofsupple- mentarymaterial,whichcanbefoundonthebook’sWebsitehttp://szeliski.org/Book. This material,whichisdescribedinAppendixC,includes: • pointerstocommonlyuseddatasetsfortheproblems,whichcanbefoundontheWeb • pointerstosoftwarelibraries,whichcanhelpstudentsgetstartedwithbasictaskssuch asreading/writingimagesorcreatingandmanipulatingimages • slidesetscorrespondingtothematerialcoveredinthisbook • aBibTeXbibliographyofthepaperscitedinthisbook. The latter two resources may be of more interest to instructors and researchers publishing new papers in this field, but they will probably come in handy even with regular students. Someofthesoftwarelibrariescontainimplementationsofawidevarietyofcomputervision algorithms, which can enable you to tackle more ambitious projects (with your instructor’s consent). Preface xi Acknowledgements I would like to gratefully acknowledge all of the people whose passion for research and inquiryaswellasencouragementhavehelpedmewritethisbook. Steve Zucker at McGill University first introduced me to computer vision, taught all of his students to question and debate research results and techniques, and encouraged me to pursueagraduatecareerinthisarea. TakeoKanadeandGeoffHinton,myPh.D.thesisadvisorsatCarnegieMellonUniversity, taught me the fundamentals of good research, writing, and presentation. They fired up my interest in visual processing, 3D modeling, and statistical methods, while Larry Matthies introducedmetoKalmanfilteringandstereomatching. DemetriTerzopouloswasmymentoratmyfirstindustrialresearchjobandtaughtmethe ropes of successful publishing. Yvan Leclerc and Pascal Fua, colleagues from my brief in- terludeatSRIInternational,gavemenewperspectivesonalternativeapproachestocomputer vision. DuringmysixyearsofresearchatDigitalEquipmentCorporation’sCambridgeResearch Lab,Iwasfortunatetoworkwithagreatsetofcolleagues,includingIngridCarlbom,Gudrun Klinker,KeithWaters,RichardWeiss,Ste´phaneLavalle´e,andSingBingKang,aswellasto supervisethefirstofalongstringofoutstandingsummerinterns,includingDavidTonnesen, SingBingKang,JamesCoughlan,andHarryShum. ThisisalsowhereIbeganmylong-term collaborationwithDanielScharstein,nowatMiddleburyCollege. AtMicrosoftResearch,I’vehadtheoutstandingfortunetoworkwithsomeoftheworld’s bestresearchersincomputervisionandcomputergraphics,includingMichaelCohen,Hugues Hoppe,StephenGortler,SteveShafer,MatthewTurk,HarryShum,Anandan,PhilTorr,An- tonioCriminisi, GeorgPetschnigg, KentaroToyama, RaminZabih, ShaiAvidan, SingBing Kang, Matt Uyttendaele, Patrice Simard, Larry Zitnick, Richard Hartley, Simon Winder, DrewSteedly,ChrisPal,NebojsaJojic,PatrickBaudisch,DaniLischinski,MatthewBrown, SimonBaker,MichaelGoesele,EricStollnitz,DavidNiste´r,BlaiseAguerayArcas,Sudipta Sinha, Johannes Kopf, Neel Joshi, and Krishnan Ramnath. I was also lucky to have as in- ternssuchgreatstudentsasPolinaGolland,SimonBaker,MeiHan,ArnoScho¨dl,RonDror, AshleyEden,JinxiangChai,RahulSwaminathan,YanghaiTsin,SamHasinoff,AnatLevin, Matthew Brown, Eric Bennett, Vaibhav Vaish, Jan-Michael Frahm, James Diebel, Ce Liu, JosefSivic,GrantSchindler,ColinZheng,NeelJoshi,SudiptaSinha,ZeevFarbman,Rahul Garg,TimCho,YekeunJeong,RichardRoberts,VarshaHedau,andDilipKrishnan. WhileworkingatMicrosoft,I’vealsohadtheopportunitytocollaboratewithwonderful colleaguesattheUniversityofWashington,whereIholdanAffiliateProfessorappointment. I’m indebted to Tony DeRose and David Salesin, who first encouraged me to get involved with the research going on at UW, my long-time collaborators Brian Curless, Steve Seitz, Maneesh Agrawala, Sameer Agarwal, and Yasu Furukawa, as well as the students I have hadtheprivilegetosuperviseandinteractwith,includingFre´dericPighin,Yung-YuChuang, Doug Zongker, Colin Zheng, Aseem Agarwala, Dan Goldman, Noah Snavely, Rahul Garg, and Ryan Kaminsky. As I mentioned at the beginning of this preface, this book owes its inception to the vision course that Steve Seitz invited me to co-teach, as well as to Steve’s encouragement,coursenotes,andeditorialinput. I’m also grateful to the many other computer vision researchers who have given me so manyconstructivesuggestionsaboutthebook,includingSingBingKang,whowasmyinfor-

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.