Table Of Content

SPRINGER BRIEFS IN INTELLIGENT SYSTEMS ARTIFICIAL INTELLIGENCE, MULTIAGENT SYSTEMS, AND COGNITIVE ROBOTICS Frans A. Oliehoek Christopher Amato A Concise Introduction to Decentralized POMDPs 123 SpringerBriefs in Intelligent Systems fi Arti cial Intelligence, Multiagent Systems, and Cognitive Robotics Series editors Gerhard Weiss, Maastricht University, Maastricht, The Netherlands Karl Tuyls, University of Liverpool, Liverpool, UK Editorial Board Felix Brandt, Technische Universität München, Munich, Germany Wolfram Burgard, Albert-Ludwigs-Universität Freiburg, Freiburg, Germany Marco Dorigo, Université libre de Bruxelles, Brussels, Belgium Peter Flach, University of Bristol, Bristol, UK Brian Gerkey, Open Source Robotics Foundation, Bristol, UK Nicholas R. Jennings, Southampton University, Southampton, UK Michael Luck, King’s College London, London, UK Simon Parsons, City University of New York, New York, US Henri Prade, IRIT, Toulouse, France Jeffrey S. Rosenschein, Hebrew University of Jerusalem, Jerusalem, Israel Francesca Rossi, University of Padova, Padua, Italy Carles Sierra, IIIA-CSIC Cerdanyola, Barcelona, Spain Milind Tambe, USC, Los Angeles, US Makoto Yokoo, Kyushu University, Fukuoka, Japan This series covers the entire research and application spectrum of intelligent systems, including artificial intelligence, multiagent systems, and cognitive robotics. Typical texts for publication in the series include, but are not limited to, state-of-the-art reviews, tutorials, summaries, introductions, surveys, and in-depth case and application studies of established or emerging fields and topics in the realm of computational intelligent systems. Essays exploring philosophical and societal issues raised by intelligent systems are also very welcome. More information about this series at http://www.springer.com/series/11845 Frans A. Oliehoek Christopher Amato (cid:129) A Concise Introduction to Decentralized POMDPs 123 FransA.Oliehoek Christopher Amato SchoolofElectricalEngineering,Electronics ComputerScienceandArtificialIntelligence andComputer Science Laboratory University of Liverpool MIT Liverpool Cambridge, MA UK USA ISSN 2196-548X ISSN 2196-5498 (electronic) SpringerBriefs inIntelligent Systems ISBN978-3-319-28927-4 ISBN978-3-319-28929-8 (eBook) DOI 10.1007/978-3-319-28929-8 LibraryofCongressControlNumber:2016941071 ©TheAuthor(s)2016 Thisworkissubjecttocopyright.AllrightsarereservedbythePublisher,whetherthewholeorpart of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission orinformationstorageandretrieval,electronicadaptation,computersoftware,orbysimilarordissimilar methodologynowknownorhereafterdeveloped. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publicationdoesnotimply,evenintheabsenceofaspecificstatement,thatsuchnamesareexemptfrom therelevantprotectivelawsandregulationsandthereforefreeforgeneraluse. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authorsortheeditorsgiveawarranty,expressorimplied,withrespecttothematerialcontainedhereinor foranyerrorsoromissionsthatmayhavebeenmade. Printedonacid-freepaper ThisSpringerimprintispublishedbySpringerNature TheregisteredcompanyisSpringerInternationalPublishingAGSwitzerland Dedicatedtofuturegenerationsofintelligent decisionmakers. Preface Thisbookpresentsanoverviewofformaldecisionmakingmethodsfordecentral- ized cooperative systems. It is aimed at graduate students and researchers in the fieldsofartificialintelligenceandrelatedfieldsthatdealwithdecisionmaking,such asoperationsresearchandcontroltheory.Whilewehavetriedtomakethebookrel- ativelyself-contained,wedoassumesomeamountofbackgroundknowledge. Inparticular,weassumethatthereaderisfamiliarwiththeconceptofanagentas wellassearchtechniques(likedepth-firstsearch,A*,etc.),bothofwhicharestan- dardinthefieldofartificialintelligence[RussellandNorvig,2009].Additionally, we assume that the reader has a basic background in probability theory. Although wegiveaveryconcisebackgroundinrelevantsingle-agentmodels(i.e.,the‘MDP’ and ‘POMDP’ frameworks), a more thorough understanding of those frameworks would benefit the reader. A good first introduction to these concepts can be found inthetextbookbyRussellandNorvig,withadditionaldetailsintextsbySuttonand Barto[1998],Kaelblingetal.[1998],Spaan[2012]andKochenderferetal.[2015]. We alsoassume thatthe readerhas abasic background ingame theoryand game- theoreticnotationslikeNashequilibriumandParetoefficiency.Eventhoughthese conceptsarenotcentraltoourexposition,wedoplacetheDec-POMDPmodelin themoregeneralcontexttheyoffer.Foranexplanationoftheseconcepts,thereader could refer to any introduction on game theory, such as those by Binmore [1992], OsborneandRubinstein[1994]andLeyton-BrownandShoham[2008]. This book heavily builds upon earlier texts by the authors. In particular, many partswerebasedontheauthors’previoustheses,bookchaptersandsurveyarticles [Oliehoek, 2010, 2012, Amato, 2010, 2015, Amato et al., 2013]. This also means that,eventhoughwehavetriedtogivearelativelycompleteoverviewofthework inthefield,thetextinsomecasesisbiasedtowardsexamplesandmethodsthathave been considered by the authors. For the description of further topics in Chapter 8, we have selected those that we consider important and promising for future work. Clearly, there is a necessarily large overlap between these topics and the authors’ recentworkinthefield. vii Acknowledgments Writing a book is not a standalone activity; it builds upon all the insights devel- opedintheinteractionswithpeers,reviewersandcoathors.Assuch,wearegrateful fortheinteractionwehavehadwiththeentireresearchfield.Wespecificallywant to thank the attendees and organizers of the workshops on multiagent sequential decisionmaking(MSDM)whichhaveprovidedauniqueplatformforexchangeof thoughtsondecisionmakingunderuncertainty. Furthermore, we would like to thank João Messias, Matthijs Spaan, Shimon Whiteson,andStefanWitwickifortheirfeedbackonsectionsofthemanuscript. Finally,wearegratefultoourformersupervisors,inparticularNikosVlassisand ShlomoZilberstein,whoenabledandstimulatedustogodownthepathofresearch ondecentralizeddecisionmaking. ix Contents 1 MultiagentSystemsUnderUncertainty .......................... 1 1.1 MotivatingExamples ....................................... 2 1.2 MultiagentSystems......................................... 4 1.3 Uncertainty................................................ 6 1.4 Applications .............................................. 7 2 TheDecentralizedPOMDPFramework .......................... 11 2.1 Single-AgentDecisionFrameworks ........................... 11 2.1.1 MDPs.............................................. 12 2.1.2 POMDPs ........................................... 13 2.2 MultiagentDecisionMaking:DecentralizedPOMDPs............ 14 2.3 ExampleDomains.......................................... 17 2.3.1 Dec-Tiger .......................................... 17 2.3.2 MultirobotCoordination:RecyclingandBox-Pushing ..... 19 2.3.3 NetworkProtocolOptimization ........................ 20 2.3.4 EfficientSensorNetworks............................. 20 2.4 SpecialCases,GeneralizationsandRelatedModels ............. 21 2.4.1 ObservabilityandDec-MDPs .......................... 21 2.4.2 FactoredModels..................................... 22 2.4.3 CentralizedModels:MMDPsandMPOMDPs............ 24 2.4.4 MultiagentDecisionProblems ......................... 25 2.4.5 PartiallyObservableStochasticGames .................. 30 2.4.6 InteractivePOMDPs ................................. 30 3 Finite-HorizonDec-POMDPs ................................... 33 3.1 OptimalityCriteria ......................................... 33 3.2 PolicyRepresentations:HistoriesandPolicies .................. 34 3.2.1 Histories ........................................... 34 3.2.2 Policies ............................................ 35 3.3 MultiagentBeliefs.......................................... 37 3.4 ValueFunctionsforJointPolicies ............................. 37 xi xii Contents 3.5 Complexity................................................ 39 4 ExactFinite-HorizonPlanningMethods .......................... 41 4.1 BackwardsApproach:DynamicProgramming .................. 41 4.1.1 GrowingPoliciesfromSubtreePolicies ................. 41 4.1.2 DynamicProgrammingforDec-POMDPs ............... 44 4.2 ForwardApproach:HeuristicSearch .......................... 46 4.2.1 TemporalStructureinPolicies:DecisionRules ........... 46 4.2.2 MultiagentA* ...................................... 47 4.3 ConvertingtoaNon-observableMDP ......................... 48 4.3.1 ThePlan-TimeMDPandOptimalValueFunction......... 49 4.3.2 Plan-TimeSufficientStatistics ......................... 49 4.3.3 AnNOMDPFormulation ............................. 51 4.4 OtherFinite-HorizonMethods................................ 52 4.4.1 Point-BasedDP ..................................... 52 4.4.2 Optimization ....................................... 52 5 ApproximateandHeuristicFinite-HorizonPlanningMethods ...... 55 5.1 ApproximationMethods..................................... 56 5.1.1 BoundedDynamicProgramming ....................... 56 5.1.2 EarlyStoppingofHeuristicSearch ..................... 57 5.1.3 ApplicationofPOMDPApproximationAlgorithms ....... 57 5.2 HeuristicMethods.......................................... 58 5.2.1 AlternatingMaximization ............................. 58 5.2.2 Memory-BoundedDynamicProgramming .............. 59 5.2.3 ApproximateHeuristic-SearchMethods ................. 61 5.2.4 EvolutionaryMethodsandCross-EntropyOptimization .... 64 6 Infinite-HorizonDec-POMDPs .................................. 69 6.1 OptimalityCriteria ......................................... 69 6.1.1 DiscountedCumulativeReward ........................ 69 6.1.2 AverageReward ..................................... 70 6.2 PolicyRepresentation....................................... 70 6.2.1 Finite-StateControllers:MooreandMealy............... 71 6.2.2 AnExampleSolutionforDEC-TIGER .................. 73 6.2.3 Randomization ...................................... 74 6.2.4 CorrelationDevices .................................. 74 6.3 ValueFunctionsforJointPolicies ............................. 75 6.4 Undecidability,AlternativeGoalsandTheirComplexity ......... 76 7 Infinite-HorizonPlanningMethods:DiscountedCumulativeReward 79 7.1 PolicyIteration ............................................ 79 7.2 OptimizingFixed-SizeControllers ............................ 81 7.2.1 Best-FirstSearch .................................... 82 7.2.2 BoundedPolicyIteration ............................. 83 7.2.3 NonlinearProgramming .............................. 85

A Concise Introduction to Decentralized POMDPs PDF

146 Pages·2016·2.316 MB·English

by Frans A. Oliehoek, Christopher Amato (auth.)

Checking for file health...

Download

Upgrade Premium

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Download A Concise Introduction to Decentralized POMDPs PDF Free - Full Version

by Frans A. Oliehoek, Christopher Amato (auth.)| 2016| 146 pages| 2.316| English

Download A Concise Introduction to Decentralized POMDPs by Frans A. Oliehoek, Christopher Amato (auth.) in PDF format completely FREE. No registration required, no payment needed. Get instant access to this valuable resource on PDFdrive.to!

Free Download PDF

About A Concise Introduction to Decentralized POMDPs

No description available for this book.

Detailed Information

Author:	Frans A. Oliehoek, Christopher Amato (auth.)
Publication Year:	2016
Pages:	146
Language:	English
File Size:	2.316
Format:	PDF
Price:	FREE

Download Free PDF

Safe & Secure Download - No registration required

Why Choose PDFdrive for Your Free A Concise Introduction to Decentralized POMDPs Download?

100% Free: No hidden fees or subscriptions required for one book every day.
No Registration: Immediate access is available without creating accounts for one book every day.
Safe and Secure: Clean downloads without malware or viruses
Multiple Formats: PDF, MOBI, Mpub,... optimized for all devices
Educational Resource: Supporting knowledge sharing and learning

Frequently Asked Questions

Is it really free to download A Concise Introduction to Decentralized POMDPs PDF?

Yes, on https://PDFdrive.to you can download A Concise Introduction to Decentralized POMDPs by Frans A. Oliehoek, Christopher Amato (auth.) completely free. We don't require any payment, subscription, or registration to access this PDF file. For 3 books every day.

How can I read A Concise Introduction to Decentralized POMDPs on my mobile device?

After downloading A Concise Introduction to Decentralized POMDPs PDF, you can open it with any PDF reader app on your phone or tablet. We recommend using Adobe Acrobat Reader, Apple Books, or Google Play Books for the best reading experience.

Is this the full version of A Concise Introduction to Decentralized POMDPs?

Yes, this is the complete PDF version of A Concise Introduction to Decentralized POMDPs by Frans A. Oliehoek, Christopher Amato (auth.). You will be able to read the entire content as in the printed version without missing any pages.

Is it legal to download A Concise Introduction to Decentralized POMDPs PDF for free?

https://PDFdrive.to provides links to free educational resources available online. We do not store any files on our servers. Please be aware of copyright laws in your country before downloading.

The materials shared are intended for research, educational, and personal use in accordance with fair use principles.