Josef Bigun

Vision with Direction
A Systematic Introduction to Image Processing and Computer Vision

With 146 Figures, including 130 in Color

Josef Bigun
IDE-Sektionen
Box 823
SE-30118, Halmstad
Sweden
[email protected]
www.hh.se/staff/josef

Library of Congress Control Number: 2005934891
ACM Computing Classification (1998): I.4, I.5, I.3, I.2.10
ISBN-10 3-540-27322-0 Springer Berlin Heidelberg New York
ISBN-13 978-3-540-27322-6 Springer Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable for prosecution under the German Copyright Law.

Springer is a part of Springer Science+Business Media
springer.com
© Springer-Verlag Berlin Heidelberg 2006
Printed in Germany

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typeset by the author using a Springer TeX macro package
Production: LE-TeX Jelonek, Schmidt & Vöckler GbR, Leipzig
Cover design: KünkelLopka Werbeagentur, Heidelberg
Printed on acid-free paper 45/3142/YL-543210

To my parents, H. and S. Bigun

Preface

Image analysis is a computational feat at which humans excel in comparison with computers. Yet the list of applications that rely on automatic processing of images has been growing at a fast pace. Biometric authentication by face, fingerprint, and iris, online character recognition in cell phones, and drug design tools are but a few of its beneficiaries appearing in the headlines.
This is, of course, facilitated by the valuable output of the research community in the past 30 years. The pattern recognition and computer vision communities that study image analysis have large conferences, which regularly draw 1000 participants. In a way this is not surprising, because many human-specific activities critically rely on intelligent use of vision. If routine parts of these activities can be automated, much is to be gained in comfort and sustainable development. The research field could equally be called visual intelligence because it concerns nearly all activities of awake humans. Humans use or rely on pictures or pictorial languages to represent, analyze, and develop abstract metaphors related to nearly every aspect of thinking and behaving, be it science, mathematics, philosophy, religion, music, or emotions.

The present volume is an introductory textbook on signal analysis of visual computation for senior-level undergraduates or for graduate students in science and engineering. My modest goal has been to present the frequently used techniques to analyze images in a common framework – directional image processing. In that, I am certainly influenced by the massive evidence of intricate directional signal processing being accumulated on human vision. My hope is that the contents of the present text will be useful to a broad category of knowledge workers, not only those who are technically oriented. To understand and reveal the secrets of, in my view, the most advanced signal analysis “system” of the known universe, primate vision, is a great challenge. It will predictably require cross-field fertilizations of many sorts in science, not the least among computer vision, neurobiology, and psychology.

The book has five parts, which can be studied fairly independently. These studies are most comfortable if the reader has the equivalent mathematical knowledge acquired during the first years of engineering studies. Otherwise, the lemmas and theorems can be read to acquire a quick overview, even with a weaker theoretical background.
Part I briefly presents a current account of the human vision system with short notes on its parallels in computer vision. Part II treats the theory of linear systems, including the various versions of the Fourier transform, with illustrations from image signals. Part III treats single direction in images, including the tensor theory for direction representation and estimation. Generalized beyond Cartesian coordinates, an abstraction of the direction concept to other coordinates is offered. Here, the reader meets an important tool of computer vision, the Hough transform and its generalized version, in a novel presentation. Part IV presents the concept of group direction, which models increased shape complexities. Finally, Part V presents the grouping tools that can be used in conjunction with directional processing. These include clustering, feature dimension reduction, boundary estimation, and elementary morphological operations. Information on downloadable laboratory exercises (in Matlab) based on this book is available at the homepage of the author (http://www.hh.se/staff/josef).

I am indebted to several people for their wisdom and the help that they gave me while I was writing this book, and before. I came in contact with image analysis by reading the publications of Prof. Gösta H. Granlund as his PhD student and during the beautiful discussions in his research group at Linköping University, not the least with Prof. Hans Knutsson, in the mid-1980s. This heritage is unmistakably recognizable in my text. In the 1990s, during my employment at the Swiss Federal Institute of Technology in Lausanne, I greatly enjoyed working with Prof. Hans du Buf on textures. The traces of this collaboration are distinctly visible in the volume, too.
I have abundantly learned from my former and present PhD students; some of their work and devotion is alive not only in my memory and daily work, but also in the graphics and contents of this volume. I wish to mention, alphabetically, Yaregal Assabie, Serge Ayer, Benoit Duc, Maycel Faraj, Stefan Fischer, Hartwig Fronthaler, Ole Hansen, Klaus Kollreider, Kenneth Nilsson, Martin Persson, Lalith Premaratne, Philippe Schroeter, and Fabrizio Smeraldi. As teachers in two image analysis courses using drafts of this volume, Kenneth, Martin, and Fabrizio provided, additionally, important feedback from students.

I was privileged to have other coworkers and students who have helped me out along the “voyage” that writing a book is. I wish to name those whose contributions have been most apparent, alphabetically, Markus Bäckman, Kwok-wai Choy, Stefan Karlsson, Nadeem Khan, Iivari Kunttu, Robert Lamprecht, Leena Lepistö, Madis Listak, Henrik Olsson, Werner Pomwenger, Bernd Resch, Peter Romirer-Maierhofer, Radakrishnan Poomari, Rene Schirninger, Derk Wesemann, Heike Walter, and Niklas Zeiner.

At the final port of this voyage, I wish to mention not the least my family, who not only put up with me writing a book, often invading the private sphere, but who also filled the breach and encouraged me with appreciated “kicks” that have taken me out of local minima.

I thank you all for having enjoyed the writing of this book and I hope that the reader will enjoy it too.

August 2005
J. Bigun

Contents

Part I Human and Computer Vision

1 Neuronal Pathways of Vision
  1.1 Optics and Visual Fields of the Eye
  1.2 Photoreceptors of the Retina
  1.3 Ganglion Cells of the Retina and Receptive Fields
  1.4 The Optic Chiasm
  1.5 Lateral Geniculate Nucleus (LGN)
  1.6 The Primary Visual Cortex
  1.7 Spatial Direction, Velocity, and Frequency Preference
  1.8 Face Recognition in Humans
  1.9 Further Reading

2 Color
  2.1 Lens and Color
  2.2 Retina and Color
  2.3 Neuronal Operations and Color
  2.4 The 1931 CIE Chromaticity Diagram and Colorimetry
  2.5 RGB: Red, Green, Blue Color Space
  2.6 HSB: Hue, Saturation, Brightness Color Space

Part II Linear Tools of Vision

3 Discrete Images and Hilbert Spaces
  3.1 Vector Spaces
  3.2 Discrete Image Types, Examples
  3.3 Norms of Vectors and Distances Between Points
  3.4 Scalar Products
  3.5 Orthogonal Expansion
  3.6 Tensors as Hilbert Spaces
  3.7 Schwartz Inequality, Angles and Similarity of Images

4 Continuous Functions and Hilbert Spaces
  4.1 Functions as a Vector Space
  4.2 Addition and Scaling in Vector Spaces of Functions
  4.3 A Scalar Product for Vector Spaces of Functions
  4.4 Orthogonality
  4.5 Schwartz Inequality for Functions, Angles

5 Finite Extension or Periodic Functions – Fourier Coefficients
  5.1 The Finite Extension Functions Versus Periodic Functions
  5.2 Fourier Coefficients (FC)
  5.3 (Parseval–Plancherel) Conservation of the Scalar Product
  5.4 Hermitian Symmetry of the Fourier Coefficients

6 Fourier Transform – Infinite Extension Functions
  6.1 The Fourier Transform (FT)
  6.2 Sampled Functions and the Fourier Transform
  6.3 Discrete Fourier Transform (DFT)
  6.4 Circular Topology of DFT

7 Properties of the Fourier Transform
  7.1 The Dirac Distribution
  7.2 Conservation of the Scalar Product
  7.3 Convolution, FT, and the δ
  7.4 Convolution with Separable Filters
  7.5 Poisson Summation Formula, the Comb
  7.6 Hermitian Symmetry of the FT
  7.7 Correspondences Between FC, DFT, and FT

8 Reconstruction and Approximation
  8.1 Characteristic and Interpolation Functions in N Dimensions
  8.2 Sampling Band-Preserving Linear Operators
  8.3 Sampling Band-Enlarging Operators

9 Scales and Frequency Channels
  9.1 Spectral Effects of Down- and Up-Sampling
  9.2 The Gaussian as Interpolator
  9.3 Optimizing the Gaussian Interpolator
  9.4 Extending Gaussians to Higher Dimensions
  9.5 Gaussian and Laplacian Pyramids
  9.6 Discrete Local Spectrum, Gabor Filters
  9.7 Design of Gabor Filters on Nonregular Grids
  9.8 Face Recognition by Gabor Filters, an Application

Part III Vision of Single Direction

10 Direction in 2D
  10.1 Linearly Symmetric Images
  10.2 Real and Complex Moments in 2D
  10.3 The Structure Tensor in 2D
  10.4 The Complex Representation of the Structure Tensor
  10.5 Linear Symmetry Tensor: Directional Dominance
  10.6 Balanced Direction Tensor: Directional Equilibrium
  10.7 Decomposing the Complex Structure Tensor
  10.8 Decomposing the Real-Valued Structure Tensor
  10.9 Conventional Corners and Balanced Directions
  10.10 The Total Least Squares Direction and Tensors
  10.11 Discrete Structure Tensor by Direct Tensor Sampling
  10.12 Application Examples
  10.13 Discrete Structure Tensor by Spectrum Sampling (Gabor)
  10.14 Relationship of the Two Discrete Structure Tensors
  10.15 Hough Transform of Lines
  10.16 The Structure Tensor and the Hough Transform
  10.17 Appendix

11 Direction in Curvilinear Coordinates
  11.1 Curvilinear Coordinates by Harmonic Functions
  11.2 Lie Operators and Coordinate Transformations
  11.3 The Generalized Structure Tensor (GST)
  11.4 Discrete Approximation of GST
  11.5 The Generalized Hough Transform (GHT)
  11.6 Voting in GST and GHT
  11.7 Harmonic Monomials
  11.8 “Steerability” of Harmonic Monomials
  11.9 Symmetry Derivatives and Gaussians
  11.10 Discrete GST for Harmonic Monomials
  11.11 Examples of GST Applications
  11.12 Further Reading
  11.13 Appendix

12 Direction in ND, Motion as Direction
  12.1 The Direction of Hyperplanes and the Inertia Tensor
  12.2 The Direction of Lines and the Structure Tensor
  12.3 The Decomposition of the Structure Tensor
  12.4 Basic Concepts of Image Motion
  12.5 Translating Lines
  12.6 Translating Points
  12.7 Discrete Structure Tensor by Tensor Sampling in ND
  12.8 Affine Motion by the Structure Tensor in 7D
  12.9 Motion Estimation by Differentials in Two Frames
  12.10 Motion Estimation by Spatial Correlation
  12.11 Further Reading
  12.12 Appendix

13 World Geometry by Direction in N Dimensions
  13.1 Camera Coordinates and Intrinsic Parameters
  13.2 World Coordinates
  13.3 Intrinsic and Extrinsic Matrices by Correspondence
  13.4 Reconstructing 3D by Stereo, Triangulation
  13.5 Searching for Corresponding Points in Stereo
  13.6 The Fundamental Matrix by Correspondence
  13.7 Further Reading
  13.8 Appendix

Part IV Vision of Multiple Directions

14 Group Direction and N-Folded Symmetry
  14.1 Group Direction of Repeating Line Patterns
  14.2 Test Images by Logarithmic Spirals
  14.3 Group Direction Tensor by Complex Moments
  14.4 Group Direction and the Power Spectrum
  14.5 Discrete Group Direction Tensor by Tensor Sampling
  14.6 Group Direction Tensors as Texture Features
  14.7 Further Reading

Part V Grouping, Segmentation, and Region Description

15 Reducing the Dimension of Features
  15.1 Principal Component Analysis (PCA)
  15.2 PCA for Rare Observations in Large Dimensions
  15.3 Singular Value Decomposition (SVD)

16 Grouping and Unsupervised Region Segregation
  16.1 The Uncertainty Principle and Segmentation
  16.2 Pyramid Building
  16.3 Clustering Image Features – Perceptual Grouping
  16.4 Fuzzy C-Means Clustering Algorithm
  16.5 Establishing the Spatial Continuity
  16.6 Boundary Refinement by Oriented Butterfly Filters
  16.7 Texture Grouping and Boundary Estimation Integration
  16.8 Further Reading