
Computer vision metrics : survey, taxonomy and analysis of computer vision, visual neuroscience, and deep learning PDF

653 Pages·2016·16.98 MB·English

Preview Computer vision metrics : survey, taxonomy and analysis of computer vision, visual neuroscience, and deep learning

Scott Krig
Computer Vision Metrics
Textbook Edition
Survey, Taxonomy and Analysis of Computer Vision, Visual Neuroscience, and Deep Learning

Scott Krig
Krig Research, USA

ISBN 978-3-319-33761-6
ISBN 978-3-319-33762-3 (eBook)
DOI 10.1007/978-3-319-33762-3
Library of Congress Control Number: 2016938637
© Springer International Publishing Switzerland 2016

This Springer imprint is published by Springer Nature.

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper.
The registered company is Springer International Publishing AG Switzerland.

Foreword to the Second Edition

The goal of this second version is to add new materials on deep learning, neuroscience applied to computer vision, historical developments in neural networks, and feature learning architectures, particularly neural network methods.
In addition, this second edition cleans up some typos and other items from the first version. In total, three new chapters are added to survey the latest feature learning and hierarchical deep learning methods and architectures. Overall, this book provides a wide survey of computer vision methods including local feature descriptors, regional and global features, and feature learning methods, with a taxonomy for organizational purposes. Analysis is distributed through the book to provide intuition behind the various approaches, encouraging the reader to think for themselves about the motivations for each approach, why different methods are created, how each method is designed and architected, and why it works. Nearly 1000 references to the literature and other materials are provided, making computer vision and imaging resources accessible at many levels.

My expectation for the reader is this: if you want to learn about 90% of computer vision, read this book. To learn the other 10%, read the references provided and spend at least 20 years creating real systems. Reading this book will take a matter of hours, and reading the references and creating real systems will take a lifetime to only scratch the surface. We follow the axiom of the eminent Dr. Jack Sparrow, who has no time for extraneous details, and here we endeavor to present computer vision materials in a fashion that makes the fundamentals accessible to many outside the inner circles of academia:

“I like it. Simple, easy to remember.”
Jack Sparrow, Pirates of the Caribbean

This book is suitable for independent study, reference, or coursework at the university level and beyond for experienced engineers and scientists. The chapters are divided in such a way that various courses can be devised to incorporate a subset of chapters to accommodate course requirements.
For example, typical course titles include “Image Sensors and Image Processing,” “Computer Vision And Image Processing,” “Applied Computer Vision And Imaging Optimizations,” “Feature Learning, Deep Learning, and Neural Network Architectures,” “Computer Vision Architectures,” “Computer Vision Survey.” Questions are available for coursework at the end of each chapter. It is recommended that this book be used as a complement to other fine books, open source code, and hands-on materials for study in computer vision and related scientific disciplines, or possibly used by itself for a higher-level survey course.

This book may be used as required reading to provide a survey component to academic coursework for science and engineering disciplines, to complement other texts that contain hands-on and how-to materials. This book DOES NOT PROVIDE extensive how-to coding examples, worked out examples, mathematical proofs, experimental results and comparisons, or detailed performance data, which are already very well covered in the bibliography references. The goal is to provide an analysis across a representative survey of methods, rather than repeating what is already found in the references. This is not a workbook with open source code (only a little source code is provided), since there are many fine open source materials available already, which are referenced for the interested reader.

Instead, this book DOES PROVIDE an extensive survey, taxonomy, and analysis of computer vision methods. The goal is to find the intuition behind the methods surveyed. The book is meant to be read, rather than worked through. This is not a workbook, but is intended to provide sufficient background for the reader to find pathways forward into basic or applied research for a variety of scientific and engineering applications. In some respects, this work is a museum of computer vision, containing concepts, observations, oddments, and relics which fascinate me.
The book is designed to complement existing texts and fill a niche in the literature. The book takes a complete path through computer vision, beginning with image sensors, image processing, global-regional-local feature descriptor methods, feature learning and deep learning, neural networks for computer vision, ground truth data and training, applied engineering optimizations across CPU, GPU, and software optimization methods. I could not find a similar book; otherwise I would not have begun this work.

This book aims at a survey, taxonomy, and analysis of computer vision methods from the perspective of the features used: the feature descriptors themselves, how they are designed and how they are organized. Learning methods and architectures are necessary and supporting factors, and are included here for completeness. However, I am personally fascinated by the feature descriptor methods themselves, and I regard them as an art-form for mathematically arranging pixel patterns, shapes, and spectra to reveal how images are created. I regard each feature descriptor as a work of art, like a painting or mathematical sculpture prepared by an artist, and thus the perspective of this work is to survey feature descriptor and feature learning methods and appreciate each one.

As shown in this book over and over again, researchers are finding that a wide range of feature descriptors are effective, and that one of the keys to best results seems to be the sheer number of features used in feature hierarchies, rather than the choice of SIFT vs. pixel patches vs. CNN features. In the surveys herein we see that many methods for learning and training are used, many architectures are used, and the consensus seems to be that hierarchical feature learning is now the mainstay of computer vision, following on from the pioneering work in convolutional neural networks and deep learning methods applied to computer vision, which has accelerated since the new millennium.
The older computer vision methods are being combined with the newer ones, and now applications are beginning to appear in consumer devices, rather than exotic military and intelligence systems of the past.

Special thanks to Courtney Clarke at Springer for commissioning this second version, and providing support and guidance to make the second version better. Special thanks for all the wonderful feedback on the first version, which helped to shape this second version. Vin Ratford and Jeff Bier of the Embedded Vision Alliance (EVA) arranged to provide copies of the first version to all EVA members, both hardcopy and e-book versions, and maintained a feedback web page for review comments; much appreciated. Thanks to Mike Schmidt and Vadim Pisarevsky for excellent review comments covering the entire book. Juergen Schmidhuber provided links to historical information on neural networks and other useful information, Kunihiko Fukushima provided copies of some of his early neural network research papers, Rahul Suthankar provided updates on key trends in computer vision, Hugo LaRochelle provided information and references on CNN topics, and Patrick Cox on HMAX topics. Interesting information was also provided by Robert Gens and Andrej Karpathy. And I would like to re-thank those who contributed to the first version, including Paul Rosin regarding synthetic interest points, Yann LeCun for providing key references into deep learning and convolutional networks, Shree Nayar for permission to use a few images, Luciano Oviedo for blue sky discussions, and many others who have influenced my thinking including Alexandre Alahi, Steve Seitz, Bryan Russel, Liefeng Bo, Xiaofeng Ren, Gutemberg Guerra-filho, Harsha Viswana, Dale Hitt, Joshua Gleason, Noah Snavely, Daniel Scharstein, Thomas Salmon, Richard Baraniuk, Carl Vodrick, Hervé Jégou, Andrew Richardson, Ofri Weschler, Hong Jiang, Andy Kuzma, Michael Jeronimo, Eli Turiel, and many others whom I have failed to mention.
As usual, thanks to my wife for patience with my research, and also for providing the “governor” switch to pace my work, without which I would likely burn out more completely. And most of all, special thanks to the great inventor who inspires us all, Anno Domini 2016.

Scott Krig

Contents

1 Image Capture and Representation
    Image Sensor Technology
    Sensor Materials
    Sensor Photodiode Cells
    Sensor Configurations: Mosaic, Foveon, BSI
    Dynamic Range, Noise, Super Resolution
    Sensor Processing
    De-Mosaicking
    Dead Pixel Correction
    Color and Lighting Corrections
    Geometric Corrections
    Cameras and Computational Imaging
    Overview of Computational Imaging
    Single-Pixel Computational Cameras
    2D Computational Cameras
    3D Depth Camera Systems
    3D Depth Processing
    Overview of Methods
    Problems in Depth Sensing and Processing
    Monocular Depth Processing
    3D Representations: Voxels, Depth Maps, Meshes, and Point Clouds
    Summary
    Chapter 1: Learning Assignments

2 Image Pre-Processing
    Perspectives on Image Processing
    Problems to Solve During Image Preprocessing
    Vision Pipelines and Image Preprocessing
    Corrections
    Enhancements
    Preparing Images for Feature Extraction
    The Taxonomy of Image Processing Methods
    Point
    Line
    Area
    Algorithmic

Description:
Based on the successful 2014 book published by Apress, this textbook edition is expanded to provide a comprehensive history and state-of-the-art survey for fundamental computer vision methods. With over 800 essential references, as well as chapter-by-chapter learning assignments, both students and r