
Paying Attention to What Matters: Observation Abstraction in Partially Observable Environments

167 Pages·2014·1.38 MB·English

Preview Paying Attention to What Matters: Observation Abstraction in Partially Observable Environments

University of Massachusetts Amherst
ScholarWorks@UMass Amherst

Open Access Dissertations

2-2010

Paying Attention to What Matters: Observation Abstraction in Partially Observable Environments

Alicia Peregrin Wolfe
University of Massachusetts Amherst

Follow this and additional works at: https://scholarworks.umass.edu/open_access_dissertations
Part of the Computer Sciences Commons

Recommended Citation
Wolfe, Alicia Peregrin, "Paying Attention to What Matters: Observation Abstraction in Partially Observable Environments" (2010). Open Access Dissertations. 188.
https://doi.org/10.7275/1266281 https://scholarworks.umass.edu/open_access_dissertations/188

This Open Access Dissertation is brought to you for free and open access by ScholarWorks@UMass Amherst. It has been accepted for inclusion in Open Access Dissertations by an authorized administrator of ScholarWorks@UMass Amherst. For more information, please contact [email protected].

PAYING ATTENTION TO WHAT MATTERS: OBSERVATION ABSTRACTION IN PARTIALLY OBSERVABLE ENVIRONMENTS

A Dissertation Presented by
ALICIA PEREGRIN WOLFE

Submitted to the Graduate School of the University of Massachusetts Amherst in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

February 2010

Computer Science

© Copyright by Alicia Peregrin Wolfe 2010
All Rights Reserved

Approved as to style and content by:

Andrew G. Barto, Chair
Sridhar Mahadevan, Member
Shlomo Zilberstein, Member
Leslie Kaelbling, Member
Bruce Turkington, Member

Andrew G. Barto, Department Chair
Computer Science

To my mother, Mary Anne Schweitzer, for her time and patience.

ACKNOWLEDGMENTS

Thanks firstly to my committee, in particular for bearing with me through several schedule changes.
Also to the members of the Autonomous Learning Laboratory for many interesting discussions, including but not limited to Ozgur Simsek, Amy McGovern, Balaraman Ravindran, Sarah Osentoski and Ashvin Shah. Other members of the UMass Computer Science community I’ve enjoyed many long discussions with include Victoria Manfredi, Jen Neville, Lisa Friedland, Emily Horrell and TJ Brunette. Prof. David Jensen, while not on the committee for my dissertation, was a helpful mentor and collaborator on earlier projects.

Supportive friends and family include: Martin Walkow, providing the linguist’s perspective; my sister Rachel Wolfe, who can always make me see the humor in any situation; my father John Wolfe, who taught me to always question, question, question; and my mother Mary Anne Schweitzer, who, in addition to probably hundreds of long phone calls, pitched in at the last minute to transport my shoes into town from Connecticut.

Also thanks to the many helpful staff members in the department, including but not limited to Leeanne Leclerc, Barb Sutherland and Gwyn Mitchell.

ABSTRACT

PAYING ATTENTION TO WHAT MATTERS: OBSERVATION ABSTRACTION IN PARTIALLY OBSERVABLE ENVIRONMENTS

FEBRUARY 2010

ALICIA PEREGRIN WOLFE
Combined B.A./B.Sc., BROWN UNIVERSITY
M.Sc., UNIVERSITY OF MASSACHUSETTS, AMHERST
Ph.D., UNIVERSITY OF MASSACHUSETTS AMHERST

Directed by: Professor Andrew G. Barto

Autonomous agents may not have access to complete information about the state of the environment. For example, a robot soccer player may only be able to estimate the locations of other players not in the scope of its sensors. However, even though all the information needed for ideal decision making cannot be sensed, all that is sensed is usually not needed. The noise and motion of spectators, for example, can be ignored in order to focus on the game field. Standard formulations do not consider this situation, assuming that all that can be sensed must be included in any useful abstraction.
This dissertation extends the Markov Decision Process Homomorphism framework (Ravindran, 2004) to partially observable domains, focusing specifically on reducing Partially Observable Markov Decision Processes (POMDPs) when the model is known. This involves ignoring aspects of the observation function which are irrelevant to a particular task. Abstraction is particularly important in partially observable domains, as it enables the formation of a smaller domain model and thus more efficient use of the observed features.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES

CHAPTER

1. INTRODUCTION
   1.1 Background: Model Minimization
       1.1.1 Controlled Markov Process Homomorphisms
       1.1.2 Model Minimization in Partially Observable Domains

2. POMDP HOMOMORPHISMS: POMDP TO POMDP ABSTRACTION
   2.1 Introduction
   2.2 Partial Observability
   2.3 POMDP Homomorphisms
   2.4 Evaluating an Observation Map
       2.4.1 Abstract Model
       2.4.2 Abstract and Shadow Models: Two Examples
       2.4.3 Shadow Model
       2.4.4 Abstract Shadow Model
       2.4.5 Independence of Shadow and Abstract Models
       2.4.6 Time Analysis
       2.4.7 Shortcomings of the Shadow Model
   2.5 Compatible Shadow States
       2.5.1 Composite Model
       2.5.2 Compatibility Algorithm
       2.5.3 Time Analysis
   2.6 Comparison of Shadow Model and Compatibility Tests
   2.7 Improving the Observation Map
       2.7.1 Merging Distributions
       2.7.2 Observation Splits
   2.8 Time Complexity
   2.9 Conclusion

3. THE KRYLOV BASIS: POMDP TO PSR ABSTRACTION
   3.1 Overview
   3.2 Background: Predictive State
       3.2.1 POMDP to PSR Compression
   3.3 PSR Homomorphisms
   3.4 Outline
   3.5 Shadow Model Test
   3.6 Compatibility Test
   3.7 Compatibility Algorithm
       3.7.1 Time Analysis
   3.8 Comparison of PSR and POMDP Methods
   3.9 Observation and Value-directed Models
   3.10 PSR vs. POMDP: One Step and Two Step Update Models
   3.11 Observation Splitting
       3.11.1 Graph Based Matching algorithm
   3.12 Time Experiments: Comparison to Existing Work
   3.13 Conclusion

4. CONCLUSION

BIBLIOGRAPHY

Description:
involves ignoring aspects of the observation function which are irrelevant to a . models) and the size of the abstract models they typically create.