MULTIAGENT SYSTEMS
Algorithmic, Game-Theoretic, and Logical Foundations

Yoav Shoham
Stanford University

Kevin Leyton-Brown
University of British Columbia

Revision 1.1

Multiagent Systems is copyright © Shoham and Leyton-Brown, 2009, 2010. This version is formatted differently than the book (in particular, it has different page numbering) and has not been fully copyedited. Please treat the printed book as the definitive version.

You are invited to use this electronic copy without restriction for on-screen viewing, but are requested to print it only under one of the following circumstances:
• You live in a place that does not offer you access to the physical book;
• The cost of the book is prohibitive for you;
• You need only one or two chapters.

Finally, we ask you not to link directly to the PDF or to distribute it electronically. Instead, we invite you to link to http://www.masfoundations.org. This will allow us to gauge the level of interest in the book and to update the PDF to keep it consistent with reprintings of the book.

To my wife Noa and my daughters Maia, Talia and Ella
-- YS

To Jude
-- KLB

Contents

Credits and Acknowledgments
Introduction
1 Distributed Constraint Satisfaction
    1.1 Defining distributed constraint satisfaction problems
    1.2 Domain-pruning algorithms
    1.3 Heuristic search algorithms
        1.3.1 The asynchronous backtracking algorithm
        1.3.2 A simple example
        1.3.3 An extended example: the four queens problem
        1.3.4 Beyond the ABT algorithm
    1.4 History and references
2 Distributed Optimization
    2.1 Distributed dynamic programming for path planning
        2.1.1 Asynchronous dynamic programming
        2.1.2 Learning real-time A*
    2.2 Action selection in multiagent MDPs
    2.3 Negotiation, auctions and optimization
        2.3.1 From contract nets to auction-like optimization
        2.3.2 The assignment problem and linear programming
        2.3.3 The scheduling problem and integer programming
    2.4 Social laws and conventions
    2.5 History and references
3 Introduction to Noncooperative Game Theory: Games in Normal Form
    3.1 Self-interested agents
        3.1.1 Example: friends and enemies
        3.1.2 Preferences and utility
    3.2 Games in normal form
        3.2.1 Example: the TCP user's game
        3.2.2 Definition of games in normal form
        3.2.3 More examples of normal-form games
        3.2.4 Strategies in normal-form games
    3.3 Analyzing games: from optimality to equilibrium
        3.3.1 Pareto optimality
        3.3.2 Defining best response and Nash equilibrium
        3.3.3 Finding Nash equilibria
        3.3.4 Nash's theorem: proving the existence of Nash equilibria
    3.4 Further solution concepts for normal-form games
        3.4.1 Maxmin and minmax strategies
        3.4.2 Minimax regret
        3.4.3 Removal of dominated strategies
        3.4.4 Rationalizability
        3.4.5 Correlated equilibrium
        3.4.6 Trembling-hand perfect equilibrium
        3.4.7 ε-Nash equilibrium
    3.5 History and references
4 Computing Solution Concepts of Normal-Form Games
    4.1 Computing Nash equilibria of two-player, zero-sum games
    4.2 Computing Nash equilibria of two-player, general-sum games
        4.2.1 Complexity of computing a sample Nash equilibrium
        4.2.2 An LCP formulation and the Lemke–Howson algorithm
        4.2.3 Searching the space of supports
        4.2.4 Beyond sample equilibrium computation
    4.3 Computing Nash equilibria of n-player, general-sum games
    4.4 Computing maxmin and minmax strategies for two-player, general-sum games
    4.5 Identifying dominated strategies
        4.5.1 Domination by a pure strategy
        4.5.2 Domination by a mixed strategy
        4.5.3 Iterated dominance
    4.6 Computing correlated equilibria
    4.7 History and references
5 Games with Sequential Actions: Reasoning and Computing with the Extensive Form
    5.1 Perfect-information extensive-form games
        5.1.1 Definition
        5.1.2 Strategies and equilibria
        5.1.3 Subgame-perfect equilibrium
        5.1.4 Computing equilibria: backward induction
    5.2 Imperfect-information extensive-form games
        5.2.1 Definition

Uncorrected manuscript of Multiagent Systems, published by Cambridge University Press. Revision 1.1 © Shoham & Leyton-Brown, 2009, 2010.
        5.2.2 Strategies and equilibria
        5.2.3 Computing equilibria: the sequence form
        5.2.4 Sequential equilibrium
    5.3 History and references
6 Richer Representations: Beyond the Normal and Extensive Forms
    6.1 Repeated games
        6.1.1 Finitely repeated games
        6.1.2 Infinitely repeated games
        6.1.3 “Bounded rationality”: repeated games played by automata
    6.2 Stochastic games
        6.2.1 Definition
        6.2.2 Strategies and equilibria
        6.2.3 Computing equilibria
    6.3 Bayesian games
        6.3.1 Definition
        6.3.2 Strategies and equilibria
        6.3.3 Computing equilibria
        6.3.4 Ex post equilibrium
    6.4 Congestion games
        6.4.1 Definition
        6.4.2 Computing equilibria
        6.4.3 Potential games
        6.4.4 Nonatomic congestion games
        6.4.5 Selfish routing and the price of anarchy
    6.5 Computationally motivated compact representations
        6.5.1 The expected utility problem
        6.5.2 Graphical games
        6.5.3 Action-graph games
        6.5.4 Multiagent influence diagrams
        6.5.5 GALA
    6.6 History and references
7 Learning and Teaching
    7.1 Why the subject of “learning” is complex
        7.1.1 The interaction between learning and teaching
        7.1.2 What constitutes learning?
        7.1.3 If learning is the answer, what is the question?
    7.2 Fictitious play
    7.3 Rational learning
    7.4 Reinforcement learning
        7.4.1 Learning in unknown MDPs
        7.4.2 Reinforcement learning in zero-sum stochastic games
        7.4.3 Beyond zero-sum stochastic games

Free for on-screen use; please do not distribute. You can get another free copy of this PDF or order the book at http://www.masfoundations.org.
        7.4.4 Belief-based reinforcement learning
    7.5 No-regret learning and universal consistency
    7.6 Targeted learning
    7.7 Evolutionary learning and other large-population models
        7.7.1 The replicator dynamic
        7.7.2 Evolutionarily stable strategies
        7.7.3 Agent-based simulation and emergent conventions
    7.8 History and references
8 Communication
    8.1 “Doing by talking” I: cheap talk
    8.2 “Talking by doing”: signaling games
    8.3 “Doing by talking” II: speech-act theory
        8.3.1 Speech acts
        8.3.2 Rules of conversation
        8.3.3 A game-theoretic view of speech acts
        8.3.4 Applications
    8.4 History and references
9 Aggregating Preferences: Social Choice
    9.1 Introduction
        9.1.1 Example: plurality voting
    9.2 A formal model
    9.3 Voting
        9.3.1 Voting methods
        9.3.2 Voting paradoxes
    9.4 Existence of social functions
        9.4.1 Social welfare functions
        9.4.2 Social choice functions
    9.5 Ranking systems
    9.6 History and references
10 Protocols for Strategic Agents: Mechanism Design
    10.1 Introduction
        10.1.1 Example: strategic voting
        10.1.2 Example: buying a shortest path
    10.2 Mechanism design with unrestricted preferences
        10.2.1 Implementation
        10.2.2 The revelation principle
        10.2.3 Impossibility of general, dominant-strategy implementation
    10.3 Quasilinear preferences
        10.3.1 Risk attitudes
        10.3.2 Mechanism design in the quasilinear setting
    10.4 Efficient mechanisms
        10.4.1 Groves mechanisms
        10.4.2 The VCG mechanism
        10.4.3 VCG and individual rationality
        10.4.4 VCG and weak budget balance
        10.4.5 Drawbacks of VCG
        10.4.6 Budget balance and efficiency
        10.4.7 The AGV mechanism
    10.5 Beyond efficiency
        10.5.1 What else can be implemented in dominant strategies?
        10.5.2 Tractable Groves mechanisms
    10.6 Computational applications of mechanism design
        10.6.1 Task scheduling
        10.6.2 Bandwidth allocation in computer networks
        10.6.3 Multicast cost sharing
        10.6.4 Two-sided matching
    10.7 Constrained mechanism design
        10.7.1 Contracts
        10.7.2 Bribes
        10.7.3 Mediators
    10.8 History and references
11 Protocols for Multiagent Resource Allocation: Auctions
    11.1 Single-good auctions
        11.1.1 Canonical auction families
        11.1.2 Auctions as Bayesian mechanisms
        11.1.3 Second-price, Japanese, and English auctions
        11.1.4 First-price and Dutch auctions
        11.1.5 Revenue equivalence
        11.1.6 Risk attitudes
        11.1.7 Auction variations
        11.1.8 “Optimal” (revenue-maximizing) auctions
        11.1.9 Collusion
        11.1.10 Interdependent values
    11.2 Multiunit auctions
        11.2.1 Canonical auction families
        11.2.2 Single-unit demand
        11.2.3 Beyond single-unit demand
        11.2.4 Unlimited supply: random sampling auctions
        11.2.5 Position auctions
    11.3 Combinatorial auctions
        11.3.1 Simple combinatorial auction mechanisms
        11.3.2 The winner determination problem
        11.3.3 Expressing a bid: bidding languages
        11.3.4 Iterative mechanisms
        11.3.5 A tractable mechanism
    11.4 Exchanges
        11.4.1 Two-sided auctions
        11.4.2 Prediction markets
    11.5 History and references
12 Teams of Selfish Agents: An Introduction to Coalitional Game Theory
    12.1 Coalitional games with transferable utility
        12.1.1 Definition
        12.1.2 Examples
        12.1.3 Classes of coalitional games
    12.2 Analyzing coalitional games
        12.2.1 The Shapley value
        12.2.2 The core
        12.2.3 Refining the core: ε-core, least core, and nucleolus
    12.3 Compact representations of coalitional games
        12.3.1 Weighted majority games and weighted voting games
        12.3.2 Weighted graph games
        12.3.3 Capturing synergies: a representation for superadditive games
        12.3.4 A decomposition approach: multi-issue representation
        12.3.5 A logical approach: marginal contribution nets
    12.4 Further directions
        12.4.1 Alternative coalitional game models
        12.4.2 Advanced solution concepts
    12.5 History and references
13 Logics of Knowledge and Belief
    13.1 The partition model of knowledge
        13.1.1 Muddy children and warring generals
        13.1.2 Formalizing intuitions about the partition model
    13.2 A detour to modal logic
        13.2.1 Syntax
        13.2.2 Semantics
        13.2.3 Axiomatics
        13.2.4 Modal logics with multiple modal operators
        13.2.5 Remarks about first-order modal logic
    13.3 S5: An axiomatic theory of the partition model
    13.4 Common knowledge, and an application to distributed systems
    13.5 Doing time, and an application to robotics
        13.5.1 Termination conditions for motion planning
        13.5.2 Coordinating robots
    13.6 From knowledge to belief
    13.7 Combining knowledge and belief (and revisiting knowledge)
    13.8 History and references