Covariances in Computer Vision and Machine Learning

Synthesis Lectures on Computer Vision

Editors
Gérard Medioni, University of Southern California
Sven Dickinson, University of Toronto

Synthesis Lectures on Computer Vision is edited by Gérard Medioni of the University of Southern California and Sven Dickinson of the University of Toronto. The series publishes 50–150 page publications on topics pertaining to computer vision and pattern recognition. The scope will largely follow the purview of premier computer science conferences, such as ICCV, CVPR, and ECCV. Potential topics include, but are not limited to:

• Applications and Case Studies for Computer Vision
• Color, Illumination, and Texture
• Computational Photography and Video
• Early and Biologically-inspired Vision
• Face and Gesture Analysis
• Illumination and Reflectance Modeling
• Image-Based Modeling
• Image and Video Retrieval
• Medical Image Analysis
• Motion and Tracking
• Object Detection, Recognition, and Categorization
• Segmentation and Grouping
• Sensors
• Shape-from-X
• Stereo and Structure from Motion
• Shape Representation and Matching
• Statistical Methods and Learning
• Performance Evaluation
• Video Analysis and Event Recognition

Covariances in Computer Vision and Machine Learning
Hà Quang Minh and Vittorio Murino
2017

Elastic Shape Analysis of Three-Dimensional Objects
Ian H. Jermyn, Sebastian Kurtek, Hamid Laga, and Anuj Srivastava
2017

The Maximum Consensus Problem: Recent Algorithmic Advances
Tat-Jun Chin and David Suter
2017

Extreme Value Theory-Based Methods for Visual Recognition
Walter J. Scheirer
2017

Data Association for Multi-Object Visual Tracking
Margrit Betke and Zheng Wu
2016

Ellipse Fitting for Computer Vision: Implementation and Applications
Kenichi Kanatani, Yasuyuki Sugaya, and Yasushi Kanazawa
2016

Computational Methods for Integrating Vision and Language
Kobus Barnard
2016

Background Subtraction: Theory and Practice
Ahmed Elgammal
2014

Vision-Based Interaction
Matthew Turk and Gang Hua
2013

Camera Networks: The Acquisition and Analysis of Videos over Wide Areas
Amit K. Roy-Chowdhury and Bi Song
2012

Deformable Surface 3D Reconstruction from Monocular Images
Mathieu Salzmann and Pascal Fua
2010

Boosting-Based Face Detection and Adaptation
Cha Zhang and Zhengyou Zhang
2010

Image-Based Modeling of Plants and Trees
Sing Bing Kang and Long Quan
2009

Copyright © 2018 by Morgan & Claypool

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopy, recording, or any other, except for brief quotations in printed reviews, without the prior permission of the publisher.

Covariances in Computer Vision and Machine Learning
Hà Quang Minh and Vittorio Murino
www.morganclaypool.com

ISBN: 9781681730134 paperback
ISBN: 9781681730141 ebook
DOI 10.2200/S00801ED1V01Y201709COV011

A Publication in the Morgan & Claypool Publishers series
SYNTHESIS LECTURES ON COMPUTER VISION
Lecture #13
Series Editors: Gérard Medioni, University of Southern California
Sven Dickinson, University of Toronto
Series ISSN
Print 2153-1056 Electronic 2153-1064

Covariances in Computer Vision and Machine Learning

Hà Quang Minh
Istituto Italiano di Tecnologia

Vittorio Murino
Istituto Italiano di Tecnologia and University of Verona

SYNTHESIS LECTURES ON COMPUTER VISION #13

Morgan & Claypool Publishers

ABSTRACT

Covariance matrices play important roles in many areas of mathematics, statistics, and machine learning, as well as their applications. In computer vision and image processing, they give rise to a powerful data representation, namely the covariance descriptor, with numerous practical applications.

In this book, we begin by presenting an overview of the finite-dimensional covariance matrix representation approach of images, along with its statistical interpretation. In particular, we discuss the various distances and divergences that arise from the intrinsic geometrical structures of the set of Symmetric Positive Definite (SPD) matrices, namely Riemannian manifold and convex cone structures. Computationally, we focus on kernel methods on covariance matrices, especially using the Log-Euclidean distance.

We then show some of the latest developments in the generalization of the finite-dimensional covariance matrix representation to the infinite-dimensional covariance operator representation via positive definite kernels. We present the generalization of the affine-invariant Riemannian metric and the Log-Hilbert-Schmidt metric, which generalizes the Log-Euclidean distance. Computationally, we focus on kernel methods on covariance operators, especially using the Log-Hilbert-Schmidt distance. Specifically, we present a two-layer kernel machine, using the Log-Hilbert-Schmidt distance and its finite-dimensional approximation, which reduces the computational complexity of the exact formulation while largely preserving its capability. Theoretical analysis shows that, mathematically, the approximate Log-Hilbert-Schmidt distance should be preferred over the approximate Log-Hilbert-Schmidt inner product and, computationally, it should be preferred over the approximate affine-invariant Riemannian distance.

Numerical experiments on image classification demonstrate significant improvements of the infinite-dimensional formulation over the finite-dimensional counterpart. Given the numerous applications of covariance matrices in many areas of mathematics, statistics, and machine learning, just to name a few, we expect that the infinite-dimensional covariance operator formulation presented here will have many more applications beyond those in computer vision.

KEYWORDS

covariance descriptors in computer vision, positive definite matrices, infinite-dimensional covariance operators, positive definite operators, Hilbert-Schmidt operators, Riemannian manifolds, affine-invariant Riemannian distance, Log-Euclidean distance, Log-Hilbert-Schmidt distance, convex cone, Bregman divergences, kernel methods on Riemannian manifolds, visual object recognition, image classification
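
As a minimal illustration of the finite-dimensional quantities named in the abstract, the sketch below builds a regularized covariance descriptor from a set of feature vectors and compares two such descriptors with the Log-Euclidean distance, ||log(A) - log(B)||_F, and the affine-invariant Riemannian distance, ||log(A^{-1/2} B A^{-1/2})||_F. The function names, the 1/m normalization, the regularization parameter reg, and the random stand-in features are all assumptions made for this example; they are not taken from the book.

```python
import numpy as np


def spd_log(A):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    # V * log(w) scales column j of V by log(w_j), giving V diag(log w) V^T.
    return (V * np.log(w)) @ V.T


def spd_inv_sqrt(A):
    """Inverse square root A^{-1/2} of an SPD matrix."""
    w, V = np.linalg.eigh(A)
    return (V / np.sqrt(w)) @ V.T


def covariance_descriptor(features, reg=1e-6):
    """Covariance of an (m, n) array of feature vectors, one row per pixel.

    reg * I is added so that the result is strictly positive definite
    (an assumption of this sketch, needed for the logarithm to exist).
    """
    X = features - features.mean(axis=0)
    C = X.T @ X / features.shape[0]
    return C + reg * np.eye(features.shape[1])


def log_euclidean_distance(A, B):
    """Frobenius norm of log(A) - log(B)."""
    return np.linalg.norm(spd_log(A) - spd_log(B), ord="fro")


def affine_invariant_distance(A, B):
    """Frobenius norm of log(A^{-1/2} B A^{-1/2})."""
    S = spd_inv_sqrt(A)
    M = S @ B @ S
    M = (M + M.T) / 2.0  # remove round-off asymmetry before eigh
    return np.linalg.norm(spd_log(M), ord="fro")


# Random stand-ins for per-pixel image features (hypothetical data).
rng = np.random.default_rng(0)
F1 = rng.normal(size=(500, 5))
F2 = 2.0 * rng.normal(size=(500, 5))
C1 = covariance_descriptor(F1)
C2 = covariance_descriptor(F2)

print("Log-Euclidean distance:   ", log_euclidean_distance(C1, C2))
print("Affine-invariant distance:", affine_invariant_distance(C1, C2))
```

The two distances coincide when A and B commute and generally differ otherwise. The Log-Euclidean distance needs one eigendecomposition per matrix, which can be computed once and reused across all pairs, whereas the affine-invariant distance requires a new decomposition for every pair; this is one practical reason the former is convenient for kernel methods on large sets of descriptors.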