Semi-Supervised Training of Models for Appearance-Based Statistical Object Detection Methods

Charles Joseph Rosenberg
CMU-CS-04-150
May 2004

School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy.

Thesis Committee:
Martial Hebert, Co-Chair
Sebastian Thrun, Co-Chair
Henry Schneiderman
Avrim Blum
Tom Minka, Microsoft Research

Copyright © 2004 Charles Rosenberg

This research was supported in part by a fellowship from the Eastman Kodak Company. The views and conclusions in this document are those of the author and should not be interpreted as representing the official policies, either expressed or implied, of Carnegie Mellon University or the Eastman Kodak Company.

Keywords: Object Detection, Semi-Supervised Learning, Computer Vision, Machine Learning, Weakly Labeled Data.

Abstract

Appearance-based object detection systems using statistical models have proven quite successful. They can reliably detect textured, rigid objects in a variety of poses, lighting conditions, and scales. However, the construction of these systems is time-consuming and difficult because a large number of training examples must be collected and manually labeled in order to capture variations in object appearance. Typically, this requires indicating which regions of the image correspond to the object to be detected and which belong to background clutter, as well as marking key landmark locations on the object. The goal of this work is to pursue and evaluate approaches which reduce the number of fully labeled examples needed, by training these models in a semi-supervised manner. To this end, we develop approaches based on Expectation-Maximization and self-training that utilize a small number of fully labeled training examples in combination with a set of "weakly labeled" examples. This is advantageous in that weakly labeled data are inherently less costly to generate, since the label information is specified in an uncertain or incomplete fashion. For example, a weakly labeled image might be labeled as containing the training object, with the object location and scale left unspecified. In this work we analyze the performance of the developed techniques through a comprehensive empirical investigation. We find that supplementing a small fully labeled training set with weakly labeled data in the training process reliably improves detector performance for a variety of detection approaches. The outcome is the identification of successful approaches and key issues that are central to achieving good performance in the semi-supervised training of object detection systems.

Acknowledgments

First and foremost I would like to thank my co-advisor, Martial Hebert, for all of his patience and support over the years, and for really giving me the push forward I needed to complete the work in this thesis and finish my degree. I also would like to thank Sebastian Thrun, my primary advisor during my early years at Carnegie Mellon and my co-advisor for this thesis work, for giving me the freedom to explore my own ideas and research directions.

I would like to thank the members of my committee for their great discussions and insights regarding this work: Henry Schneiderman, Tom Minka, and Avrim Blum. I would especially like to thank Henry Schneiderman for providing his detector code and training data for this work, and for all of the time he spent helping me understand the system so that I could adapt it to my needs.
I would of course like to thank all of my friends through the years at Carnegie Mellon: Kevin Watkins, Dennis Strelow, Aleks Nanevski, Derek Dreyer, Rose Hoberman, Franklin Chen, Laurie Hiyakumoto, Mike Vande Weghe, Illah Nourbakhsh, Marti Louw, Nicolas Vandapel, Sanjiv Kumar, Daniel Huber, Diane Stidle, Nathaniel Daw, Francisco Pereira, Mark Fuhs, Goksel Dedeoglu, Catherine Copetas, and Sharon Burks. I also want to thank Sanjiv Kumar for collecting the images used in my "color model" and "filter model" experiments, Saneh Nasserbakht for her friendship and support during the early days of my thesis, and Monica Brucker for providing comments on the final version.

I want to thank my parents, Freda and Stanley, for the wonderfully supportive and intellectually stimulating environment I grew up in, and my sister Harriet for being a great friend and foil.

I would also like to thank Larry Ray from Kodak and the Eastman Kodak Company for providing me with funding during my final years at Carnegie Mellon.

Contents

1 Introduction
    1.1 Overview
    1.2 Definitions
        1.2.1 Statistical Appearance-Based Object Detection System
        1.2.2 Training Data Label Information
        1.2.3 Fully Labeled Training Data
        1.2.4 Weakly Labeled Training Data
        1.2.5 Unlabeled Training Data
    1.3 Semi-Supervised Training and the Object Detection Problem
    1.4 Document Organization

2 Approach
    2.1 Overall Approach
    2.2 Semi-Supervised Training Approach Overview
        2.2.1 Introduction
        2.2.2 Expectation Maximization
        2.2.3 Self-Training
        2.2.4 Self-Training and the Selection Metric
        2.2.5 Comparison of Approaches
    2.3 Semi-Supervised Training Approach Details
        2.3.1 Framework Introduction
        2.3.2 Training the Model with Fully Labeled Data
        2.3.3 Batch Training with Weakly Labeled or Unlabeled Data
        2.3.4 Incremental Training Approach with Weakly Labeled Data
        2.3.5 A Specific Example: Weakly Labeled Data for a Gaussian Mixture Model with EM
    2.4 Prior Work
        2.4.1 General Object Detection
        2.4.2 Unlabeled / Weakly Labeled / Multiple Instance Data
        2.4.3 Weakly Labeled Data and Images
        2.4.4 Graph-Based Semi-Supervised Approaches
        2.4.5 Information Regularization
    2.5 Simulations
        2.5.1 Introduction
        2.5.2 Simulation Protocol
        2.5.3 Simulation Results
    2.6 Related Considerations
        2.6.1 Asymptotic Analysis
        2.6.2 Are there "correct" labels?
        2.6.3 Classifier and Unlabeled Data Labeling
    2.7 Key Issues

3 Color Based Detector Experiments
    3.1 Overview
    3.2 Color Based Object Detection Model and Learning from Weakly Labeled Data
    3.3 Performance Evaluation
    3.4 Experimental Results
    3.5 Conclusions

4 Filter Based Detection Experiments
    4.1 Introduction
    4.2 Detector Details
    4.3 Model of the Spatial Distribution of Filter Responses
    4.4 Detector Implementation Efficiency Issues
    4.5 Training the Model with Fully Labeled Data
    4.6 Batch Training with Weakly Labeled Data
    4.7 Incremental Training with Weakly Labeled Data
        4.7.1 Approach
        4.7.2 Selection Metric
        4.7.3 Incremental Training Procedure
    4.8 Overview
    4.9 Data Description
        4.9.1 Introduction
        4.9.2 Single Image Group
        4.9.3 Two Image Close Pair Group
        4.9.4 Two Image Near Pair Group
        4.9.5 Two Image Far Pair Group
    4.10 Experiment Details and Evaluation Metrics
    4.11 Experimental Results
        4.11.1 Overview
        4.11.2 Detailed Results
        4.11.3 Establishing Upper and Lower Performance Bounds
        4.11.4 Evaluating standard EM: all weakly labeled data at once
        4.11.5 Evaluating weakly labeled data weight schedule weighting with standard EM
        4.11.6 Evaluating incremental data addition based on the detector odds ratio
        4.11.7 Evaluating incremental data addition based on reverse odds ratio (1-NN)
        4.11.8 Evaluating incremental data addition based on reverse odds ratio (m-NN)
        4.11.9 Evaluating incremental data addition based on reverse odds ratio and a varying number of Gaussian components for the metric (1-NN)
        4.11.10 Evaluating incremental data addition based on reverse odds ratio and forty Gaussian model (4-NN)
    4.12 Conclusions

5 Schneiderman Detector Experiments
    5.1 Overview
    5.2 Detector Details
        5.2.1 Detection Process
        5.2.2 Training with Fully Labeled Data
        5.2.3 Semi-Supervised Training
    5.3 Detector Stages and Training Process
    5.4 Performance Evaluation Metrics
    5.5 Experiment Specifics
    5.6 Experimental Variations
    5.7 Experimental Protocol
    5.8 Analysis of Experimental Results
        5.8.1 List of Experiments
        5.8.2 Sensitivity to Fully Labeled Data Set Size and Number of Features
        5.8.3 Weakly Labeled Data Performance
        5.8.4 MSE Scoring Metric
        5.8.5 Rotation Estimation and Synthetic Rotation Variation
        5.8.6 Varying Feature Count
        5.8.7 Feature and Classifier Training
        5.8.8 Adaboost Cross Validation
    5.9 Conclusions

6 Discussion and Conclusions
    6.1 Summary and Conclusions
    6.2 Open Questions / Future Work
        6.2.1 Methods for Selecting the Final Self-Training Iteration
        6.2.2 Detector Retraining
        6.2.3 Relation to Co-Training
        6.2.4 Initial Training Set Selection
        6.2.5 Training with Different Types of Information
        6.2.6 Utilizing Image Context
        6.2.7 Learning from Data Mined from the Web
        6.2.8 Fully labeled data can hurt performance

7 References
    7.1 Machine Learning and Statistics
        7.1.1 General Machine Learning
        7.1.2 Learning with Multiple Instance Data
        7.1.3 General Learning with Unlabeled Data / Semi-Supervised Training / Active Learning and Non-Vision Applications
    7.2 Computer Vision
        7.2.1 General Object Detection / Recognition / Content Based Image Retrieval
        7.2.2 Semi-Supervised Training for Object Detection / Recognition / Content Based Image Retrieval
        7.2.3 Related Vision Topics