© Copyright 2016 Yuyin Sun

Toward Never-Ending Object Learning for Robots

Yuyin Sun

A dissertation submitted in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

University of Washington

2016

Reading Committee:
Dieter Fox, Chair
Ali Farhadi
Yejin Choi

Program Authorized to Offer Degree: Computer Science and Engineering

University of Washington

Abstract

Toward Never-Ending Object Learning for Robots

Yuyin Sun

Chair of the Supervisory Committee:
Professor Dieter Fox
Computer Science and Engineering

A household robot usually works in a complex environment, where it will continuously see new objects and encounter new concepts throughout its lifetime. Therefore, being able to learn new objects is crucial for the robot to remain useful over its lifespan. Moving beyond previous object learning research, which mostly focuses on learning with given training objects and concepts, this research addresses the problem of enabling a robot to learn new objects and concepts continuously. Specifically, our contributions are as follows:

First, we study how to accurately identify target objects in scenes based on human users' language descriptions. We propose a novel identification system that uses an object's visual attributes and names to recognize objects. We also propose a method that enables the system to recognize objects by new names without seeing any training instances of those names. The attribute-based identification system improves both usability and accuracy over previous ID-based object identification methods.

Next, we consider the problem of organizing a large number of concepts into a semantic hierarchy. We propose a principled approach for creating semantic hierarchies of concepts via crowdsourcing. The approach can build hierarchies for various tasks and capture the uncertainty that naturally exists in these hierarchies. Experiments demonstrate that our method is more efficient, scalable, and accurate than previous methods. We also design a crowdsourcing evaluation to compare the hierarchies built by our method to expertly built ones.
Results of the evaluation demonstrate that our approach outputs task-dependent hierarchies that can significantly improve users' performance on desired tasks.

Finally, we build the first never-ending object learning framework, NEOL, which lets robots learn objects continuously. NEOL automatically learns to organize object names into a semantic hierarchy using the crowdsourcing method we propose. It then uses the hierarchy to improve the consistency and efficiency of annotating objects. Further, it adapts information from additional image datasets to learn object classifiers from a very small number of training examples. Experiments show that NEOL significantly improves robots' accuracy and efficiency in learning objects over previous methods.

TABLE OF CONTENTS

Page

List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii

Chapter 1: Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Never-Ending Object Learning for Robots . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Background and Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Thesis Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 Thesis Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

Chapter 2: Attribute-Based Object Identification . . . . . . . . . . . . . . . . . . . . . 12
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2 Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.4 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

Chapter 3: Learning to Identify New Names . . . . . . . . . . . . . . . . . . . . . . . . 32
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.3 Identification System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.4 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.5 Interaction with Humans . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.6 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

Chapter 4: Building Semantic Hierarchies via Crowdsourcing . . . . . . . . . . . . . . 53
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.3 Modeling Distribution over Hierarchies . . . . . . . . . . . . . . . . . . . . . . . 57
4.4 Approach: Updating Distributions over Hierarchies . . . . . . . . . . . . . . . . . 60
4.5 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.6 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
4.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

Chapter 5: Evaluating Task-Dependent Taxonomies for Navigation . . . . . . . . . . . 81
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.2 Building Taxonomies via Crowdsourcing . . . . . . . . . . . . . . . . . . . . . . 83
5.3 Application Domain and Navigation Tasks . . . . . . . . . . . . . . . . . . . . . . 86
5.4 Taxonomies Used for Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
5.5 Structural Comparison of Taxonomies . . . . . . . . . . . . . . . . . . . . . . . . 91
5.6 User Study for Quantitative Evaluation . . . . . . . . . . . . . . . . . . . . . . . . 92
5.7 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
5.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99

Chapter 6: Never-Ending Object Learning . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.3 Never-Ending Object Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.4 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
6.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

Chapter 7: Conclusion and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . 129
7.1 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
7.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130

Appendix A: Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
A.1 Derivation of the Objective Function . . . . . . . . . . . . . . . . . . . . . . . . . 157
A.2 Minimization of (A.8) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
A.3 Proof of Theorem 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
A.4 Proof of Theorem 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
A.5 Proof of Proposition 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163

Appendix B: Learning User-Specific Hierarchies . . . . . . . . . . . . . . . . . . . . . . 166
B.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
B.2 Models, Parameter Estimation and Inference . . . . . . . . . . . . . . . . . . . . . 168
B.3 Active Learning in Bayesian Setting . . . . . . . . . . . . . . . . . . . . . . . . . 172
B.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176

LIST OF FIGURES

Figure Number                                                                          Page

1.1 The predominant object recognition pipeline is to learn classifiers based on labeled training data and to test the classifiers on held-out data. . . . . . . . . . . . . . . . . . . . . 3
1.2 This thesis argues that a robot can learn new objects efficiently and accurately by integrating multiple knowledge sources. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.1 Object identification: identify and visually recognize language attributes that refer to the desired object (marked by a red rectangle). . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 110 objects in the RGB-D Object Attribute Dataset. Each image shown here belongs to a different object. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3 Sparsified codebooks for color (left) and shape (right) attributes. Our approach learned that solid-color codewords are most relevant for classifying colors, while depth codewords (grayscale) are most often selected for shape classification. . . . . . . . . . . . . . . . 20
2.4 Scenes and sentences collected from Amazon Mechanical Turk. Our system correctly identifies all objects except the ones in (f), (h), and (j). The attributes "stem", "chunky", and "tall" are relative or parts-based attributes not contained in the training data. . . . . . 21
2.5 Frequency of different subsets of attribute types used by people to identify objects. N stands for name, C for color, S for shape, and M for material. "Other" means other attribute types, such as relative or parts-based. Only the most frequently used attributes or attribute combinations are shown. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.6 Object identification results using four types of attributes and their combination. . . . . . 24
2.7 Frequency of different subsets of attribute types used when reproducing human behavior in identifying objects. N is short for name, C for color, S for shape, and M for material; "other" means other attribute types such as relative or parts-based. Only the most frequently used attributes or attribute combinations are shown. . . . . . . . . . . . . . . . . . . . 29
2.8 Error rates for learning new attribute values from different numbers of training examples (# of training samples in log scale). . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.1 Part of an object name hierarchy. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33