UC San Diego
UC San Diego Electronic Theses and Dissertations

Title: Leveraging Human Perception and Computer Vision Algorithms for Interactive Fine-Grained Visual Categorization
Permalink: https://escholarship.org/uc/item/5z2523gj
Author: Wah, Catherine Lih-Lian
Publication Date: 2014
Peer reviewed | Thesis/dissertation

eScholarship.org, powered by the California Digital Library, University of California


UNIVERSITY OF CALIFORNIA, SAN DIEGO

Leveraging Human Perception and Computer Vision Algorithms for Interactive Fine-Grained Visual Categorization

A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy

in

Computer Science

by

Catherine Lih-Lian Wah

Committee in charge:
    Professor Serge Belongie, Chair
    Professor David Kriegman
    Professor Gert Lanckriet
    Professor Lawrence Saul
    Professor Nuno Vasconcelos

2014


Copyright
Catherine Lih-Lian Wah, 2014
All rights reserved.


The Dissertation of Catherine Lih-Lian Wah is approved and is acceptable in quality and form for publication on microfilm and electronically:

Chair

University of California, San Diego
2014


DEDICATION

To my parents and twin sister, who continue to support and inspire me every day.


EPIGRAPH

Do not fear going forward slowly; fear only to stand still.
Chinese proverb

Nothing is difficult, only unfamiliar.
Unknown

GOD: I own you like I own the caves.
THE OCEAN: Not a chance. No comparison.
GOD: I made you. I could tame you.
THE OCEAN: At one time, maybe. But not now.
GOD: I will come to you, freeze you, break you.
THE OCEAN: I will spread myself like wings. I am a billion tiny feathers. You have no idea what’s happened to me.
Dave Eggers


TABLE OF CONTENTS

Signature Page
Dedication
Epigraph
Table of Contents
List of Figures
Acknowledgements
Vita
Abstract of the Dissertation

Chapter 1 Introduction

Chapter 2 Background and Related Work
    2.1 Visipedia
    2.2 Related Work
        2.2.1 Fine-Grained Categorization
        2.2.2 Human-In-The-Loop Methods
        2.2.3 Active Classification

Chapter 3 Interactive Categorization with Parts and Attributes
    3.1 Introduction
    3.2 Related Work
    3.3 Visual Recognition with Humans in the Loop
        3.3.1 Algorithms and Framework
        3.3.2 Incorporating Computer Vision
        3.3.3 Modeling User Responses
    3.4 Extension to Part-Based Models
    3.5 Datasets and Implementation Details
    3.6 Experiments
        3.6.1 Measuring Performance
        3.6.2 Using Binary Attribute Questions
        3.6.3 1-vs-all Vs. Attribute-Based Classification
        3.6.4 Using Part and Attribute Questions
    3.7 Conclusion

Chapter 4 Interactive Categorization with Similarity Learning
    4.1 Introduction
    4.2 Related Work
    4.3 Perceptual Similarity Metrics for Interactive Categorization
        4.3.1 Methods and Framework
        4.3.2 Incorporating Computer Vision
    4.4 Extension to Multiple Localized Perceptual Metrics
    4.5 Dataset and Implementation Details
    4.6 Experiments
        4.6.1 Embedding Generation
        4.6.2 Using Nonlocalized Similarity Metrics
        4.6.3 Using Multiple Localized Similarity Metrics
    4.7 Human Perception of Similarity
    4.8 Conclusion

Chapter 5 Conclusion
    5.1 Final Thoughts
    5.2 Future Directions

Appendix A CUB-200-2011 Dataset
    A.1 Introduction
    A.2 Dataset Specification and Collection
    A.3 Applications
    A.4 Benchmarks and Baseline Experiments

Bibliography


LIST OF FIGURES

Figure 2.1. Visipedia.
Figure 3.1. Screen capture of an iPad app for bird species recognition.
Figure 3.2. Examples of classification problems
Figure 3.3. Examples of the visual 20 questions game
Figure 3.4. Visualization of the basic algorithm flow.
Figure 3.5. Examples of user responses
Figure 3.6. Interactive visual recognition with localization.
Figure 3.7. Probabilistic Model.
Figure 3.8. Fully Automated Part Detection Results
Figure 3.9. User interface for part locations input.
Figure 3.10. Comparing part prediction accuracy for humans and computers
Figure 3.11. Pose clusters
Figure 3.12. Different Models of User Responses
Figure 3.13. Performance on Birds-200 when using computer vision
Figure 3.14. Examples where computer vision and user responses work together
Figure 3.15. Images that are misclassified by our system
Figure 3.16. Performance on Animals With Attributes
Figure 3.17. Attribute and Part Questions.
Figure 3.18. Interactive Classification Using Part and Attribute Questions.
Figure 3.19. Examples of the behavior of our system
Figure 4.1. Similarity metrics for interactive categorization.
Figure 4.2. Interface for Collecting Similarity Comparisons.
Figure 4.3. Localized Similarity Comparisons for Interactive Categorization
Figure 4.4. Discovering Discriminative Regions.
Figure 4.5. Comparing Localized and Nonlocalized Comparisons.
Figure 4.6. Discriminative Regions. The 106 discovered discriminative regions. We select 5 to use in our experiments.
Figure 4.7. Embedding Generalization Error.
Figure 4.8. Nonlocalized Similarity Embedding.
Figure 4.9. Localized Similarity Embedding for Region 1
Figure 4.10. Localized Similarity Embedding for Region 13
Figure 4.11. Localized similarity embedding for Region 21.
Figure 4.12. Localized similarity embedding for Region 23.
Figure 4.13. Localized similarity embedding for Region 39.
Figure 4.14. Test-Time Interface. An example of a test-time user interface for our interactive classification system.
Figure 4.15. Observing Deterministic Users.
Figure 4.16. Observing Simulated Noisy Users.
Figure 4.17. Qualitative Results for Nonlocalized Metrics
Figure 4.18. Qualitative Results for Localized Metrics.
Figure 4.19. Interactive Categorization Results.
Figure 4.20. Question Distribution
Figure 4.21. Comparing human perception of nonlocalized vs. localized similarity
Figure 4.22. Observing AMT worker behavior.