UNIVERSITY OF TECHNOLOGY SYDNEY

DOCTORAL THESIS

Feature Fusion, Feature Selection and Local N-ary Patterns for Object Recognition and Image Classification

Author: Sheng WANG
Supervisor: Qiang WU, Xiangjian HE

A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy in the School of Computing and Communications, Faculty of Engineering and Information Technology

January 2015

CERTIFICATE OF ORIGINAL AUTHORSHIP

I certify that the work in this thesis has not previously been submitted for a degree nor has it been submitted as part of requirements for a degree except as fully acknowledged within the text. I also certify that the thesis has been written by me. Any help that I have received in my research work and the preparation of the thesis itself has been acknowledged. In addition, I certify that all information sources and literature used are indicated in the thesis.

Signature of Student:
Date:

"Concern for man and his fate must always form the chief interest of all technical endeavors. Never forget this in the midst of your diagrams and equations."
Albert Einstein

Abstract

Object recognition is one of the most fundamental topics in computer vision. In past years, it has attracted the interest of both academics working in computer science and professionals working in the information technology (IT) industry. The popularity of object recognition is evident from the sophisticated theories it has motivated in science and from its widespread applications in industry. Nowadays, with more powerful machine learning tools (both hardware and software) and the huge amount of information (data) readily available, higher expectations are imposed on object recognition. At its early stage in the 1990s, the task of object recognition could be as simple as differentiating between objects of interest and non-objects of interest in a single still image.
Currently, the task of object recognition may also include the segmentation and labeling of different image regions (i.e., assigning each segmented image region a meaningful label based on the objects that appear in it), and then using computer programs to infer the scene of the overall image from those segmented regions. The original two-class classification problem is becoming more complex as it evolves toward a multi-class classification problem. In this thesis, contributions to object recognition are made in two aspects: improvements using feature fusion and improvements using feature selection. Three examples are given in this thesis to illustrate three different feature fusion methods: descriptor concatenation (low-level fusion), confidence value escalation (mid-level fusion) and the coarse-to-fine framework (high-level fusion). Two examples are provided to demonstrate feature selection: optimal descriptor selection and improved classifier selection.

Feature extraction plays a key role in object recognition because it is the first and also the most important step. If we consider the overall object recognition process, machine learning tools serve the purpose of finding distinctive features in the visual data. Given distinctive features, object recognition is readily achievable (e.g., a simple threshold function can be used to classify feature descriptors). The proposed Local N-ary Pattern (LNP) texture feature contributes to both feature extraction and texture classification. The distinctive LNP feature generalizes the texture feature extraction process and improves texture classification. Concretely, the local binary pattern (LBP) is the special case of LNP with n = 2, and the texture spectrum is the special case of LNP with n = 3. The proposed LNP representation has been shown to outperform the popular LBP and one of LBP's most successful extensions, the local ternary pattern (LTP), for texture classification.
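As a minimal sketch of the n = 2 case mentioned above, the basic 3x3 LBP operator can be written as follows. This is the standard textbook LBP, not the LNP coding scheme proposed in the thesis; the function names, the clockwise neighbour ordering and the use of `>=` for the comparison are assumptions made for illustration only.

```python
# Standard 3x3 local binary pattern (LBP); an illustrative sketch,
# not the thesis's LNP implementation.

# Assumed neighbour ordering: clockwise (row, col) offsets from top-left.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """Return the 8-bit LBP code of the interior pixel at (r, c)."""
    centre = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        # Each neighbour contributes one binary digit (n = 2 levels).
        if image[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Histogram (256 bins) of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist
```

The resulting histogram is the kind of feature descriptor that a classifier would consume. Under the same assumptions, replacing the two-level comparison with a three-level one would yield one ternary digit per neighbour, which corresponds to the n = 3 case.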
Acknowledgements

I wish to thank Associate Professor Qiang WU and Professor Xiangjian HE, my principal supervisor and co-supervisor, for their many suggestions and constant encouragement, help and support during my candidature. I gratefully acknowledge the invaluable discussions with them, and I have had the honour of studying and working with them over the past four years and nine months, a period which is stamped indelibly in my life.

I appreciate the support of the International Research Scholarship (IRS) provided by the Faculty of Engineering and Information Technology (FEIT), University of Technology, Sydney (UTS). I appreciate the financial support for my living expenses and for attending international conferences received from the FEIT and the UTS Vice-Chancellor's Conference Fund.

I wish to thank my fellow colleagues and the staff of the faculty for providing various assistance for the completion of this research work. In particular, I thank Wenjing Jia and Min Xu for their invaluable help and support.

Last but not least, I would like to thank my parents for their understanding and support. This thesis could not have been completed without their encouragement and financial assistance.

Contents

Certificate of Original Authorship
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
Abbreviations

1 Introduction
  1.1 Background Information
    1.1.1 General Object Recognition Framework with Supervised Learning Methods
    1.1.2 Typical Classification Methods
      1.1.2.1 Boosting Method and AdaBoost Classifier
      1.1.2.2 Naive Bayes Classifier
      1.1.2.3 SVM Classifier
    1.1.3 Example Datasets in Object Recognition
    1.1.4 Evaluation Methods and Measurements
    1.1.5 Summary
  1.2 List of Publications
    1.2.1 Journal Articles
    1.2.2 Conference Proceedings
  1.3 Thesis Organization

2 Summary of Feature Extraction Methods
  2.1 Over-complete Template Based Feature Description
    2.1.1 Haar-like Feature for Face Detection
    2.1.2 Histogram Distance of Haar Regions for Object Recognition
    2.1.3 Edgelet Features for Pedestrian Detection
  2.2 Sparse Key-point Based Feature Description
    2.2.1 Image Matching Using Scale Invariant Feature Transform
  2.3 Dense Spatial Sub-block Based Feature Description
    2.3.1 Dense SIFT Feature for Image Alignment and Face Recognition
    2.3.2 Histogram of Oriented Gradient Feature for Pedestrian Detection
    2.3.3 Density Variance Feature for License Plate Detection
  2.4 Hybrid Filter Response Based Feature Description
    2.4.1 Object Bank Representation for Scene Classification
  2.5 Kernel Based Feature Description
    2.5.1 Locally Adaptive Regression Kernels for Car Detection
  2.6 Summary

3 Feature Fusion for Object Recognition
  3.1 Concatenation of Histogram Distances Between Intensity Histograms of Different Spatial Sub-blocks
    3.1.1 Exploring Relationships Between Different Spatial Sub-blocks
    3.1.2 Correlated Histogram of Haar Regions
      3.1.2.1 Original HDHR Distance
      3.1.2.2 Minkowski City Block L1 Manhattan Distance
      3.1.2.3 Minkowski Euclidean L2 Euclidean Distance
    3.1.3 Experimental Results
  3.2 Multi-scale Mid-level Fusion Based on Escalating Confidence Values
    3.2.1 Multi-scale Gaussian Smoothing
    3.2.2 Adaptive Combination of Weak Learners Based on Edgelet
    3.2.3 Concatenation of Confidence Values Across Multi-scale Spaces
    3.2.4 Experimental Results
  3.3 Coarse-to-fine Fusion Based on Fast Region of Interest Selection and Discriminative Learning
    3.3.1 Fast Region of Interest Selection in Coarse Stage
    3.3.2 Boosting Distinctive DSIFT Features in Fine Stage
    3.3.3 Experimental Results
  3.4 Summary

4 Feature Selection for Image Classification and Object Recognition
  4.1 A Compact Descriptor Based on Optimal Max Pooling Orientation
    4.1.1 A Simplified Strip-like Spatial Configuration
    4.1.2 Selecting Optimal Strip-like Partition Scheme Based on SVM Classification Results
    4.1.3 Experimental Results
  4.2 An Improved Classifier which Requires Less Training Examples
    4.2.1 An Observation from Using Insufficient Training Examples
    4.2.2 An Improved AdaBoost Algorithm with Spatial Constraints
    4.2.3 Experimental Results
  4.3 Summary

5 Local N-ary Pattern for Texture Image Classification
  5.1 More Distinctive Texture Features in High Dimension Feature Space
    5.1.1 Mapping Local Patterns to Integer Values
    5.1.2 Generalization of Feature Extraction to The Bachet De Meziriac Weight Problem
    5.1.3 Distinctive Local Patterns of The N-ary Coding Scheme
      5.1.3.1 Distinctiveness of LNP Compared to LTP
      5.1.3.2 An Observation of The Statistical Characteristics of The Local N-ary Representation
      5.1.3.3 Rotation Invariant and Uniform Patterns
        Uniform Patterns
        Rotation Invariant Patterns
      5.1.3.4 Texture Classification
        One-against-rest SVM Classifier
        Nearest Neighbor Classifier
    5.1.4 Experimental Results
  5.2 Summary

6 Conclusions and Future Works
  6.1 Conclusions
    6.1.1 Improved Object Recognition Based on Feature Fusion
      6.1.1.1 Concatenating Distances Between Intensity Histograms of Different Spatial Sub-blocks
      6.1.1.2 Multi-scale Mid-level Fusion Based on Escalating Confidence Values
      6.1.1.3 Coarse-to-fine Fusion Based on Fast Region of Interest Selection and Distinctive Features
    6.1.2 Improved Object Recognition Based on Feature Selection
      6.1.2.1 A Compact Descriptor Based on Optimal Max Pooling Orientation
      6.1.2.2 A Classifier which Requires Less Training Examples
    6.1.3 Improved Texture Classification Based on LNP
  6.2 Future Works

A The Generalized Bachet De Meziriac Weight Problem
  A.1 A Generalization to The BMWP
  A.2 Proof of Proposition 1 (Proof by Contradiction)
    A.2.1 Details of The Contradiction
  A.3 Proof of Proposition 2 (Proof by Mathematical Induction)
    A.3.1 When n−1 Is An Even Number
      A.3.1.1 Basis
      A.3.1.2 Inductive Step
    A.3.2 When n−1 Is An Odd Number
      A.3.2.1 Basis
      A.3.2.2 Inductive Step

B Some Implementations in Matlab
  B.1 Recursive SPM Max Pooling
  B.2 Edgelet Feature Extraction

C List of Technique Abbreviation Terms

Bibliography