Game-theoretic learning and distributed optimization in memoryless multi-agent systems PDF

176 Pages·2017·3.72 MB·English
Preview Game-theoretic learning and distributed optimization in memoryless multi-agent systems

Tatiana Tatarenko
Game-Theoretic Learning and Distributed Optimization in Memoryless Multi-Agent Systems

Tatiana Tatarenko
TU Darmstadt
Darmstadt, Germany

ISBN 978-3-319-65478-2
ISBN 978-3-319-65479-9 (eBook)
DOI 10.1007/978-3-319-65479-9
Library of Congress Control Number: 2017952051

© Springer International Publishing AG 2017
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper
This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Abstract

Learning in potential games and consensus-based distributed optimization represent the main focus of the work.
The analysis of potential games is motivated by the game-theoretic design, which renders an optimization problem in a multi-agent system a problem of potential function maximization in a modeled potential game. The interest in distributed consensus-based optimization is supported by the growing popularity of dealing with networked systems in different engineering applications.

This book investigates the algorithms that enable agents in a system to converge to some optimal state. These algorithms can be classified according to the information structures of systems. A common feature of the procedures under consideration is that they do not require agents to have memory to follow the prescribed rules. A general learning dynamics applicable to memoryless systems with discrete states and oracle-based information is presented. Some settings guaranteeing an efficient behavior of this algorithm are provided. A special type of such efficient general learning procedure, called logit dynamics, is considered. Further, the asynchronous and synchronous logit dynamics are extended to the case of games with continuous actions. Convergence guarantees are discussed for this continuous state dynamics as well. Moreover, the communication- and payoff-based algorithms are developed. They are proven to learn local optima in systems modeled by continuous action potential games. The stochastic approximation technique used to investigate the convergence properties of the latter procedures is also applied to distributed consensus-based optimization in networked systems. In this case, the stochastic nature of the proposed push-sum algorithm allows a system to escape suboptimal critical points and converge almost surely to a local minimum of the objective function, which is not assumed to be convex.

Contents

1 Introduction
  1.1 Motivation of Research
  1.2 List of Notations
2 Game Theory and Multi-Agent Optimization
  2.1 Game Theory
    2.1.1 Introduction to Game Theory
    2.1.2 Nash Equilibrium
    2.1.3 Potential Games
  2.2 Potential Game Design in Multi-Agent Optimization
    2.2.1 Multi-Agent Systems Modeled by Means of Potential Games
    2.2.2 Learning Optimal States in Potential Games
  2.3 Distributed Optimization in Multi-Agent Systems
  References
3 Logit Dynamics in Potential Games with Memoryless Players
  3.1 Introduction
  3.2 Memoryless Learning in Discrete Action Games as a Regular Perturbed Markov Chain
    3.2.1 Preliminaries: Regular Perturbed Markov Chains
    3.2.2 Convergence in Total Variation of General Memoryless Learning Algorithms
  3.3 Asynchronous Learning
    3.3.1 Log-Linear Learning in Discrete Action Games
    3.3.2 Convergence to Potential Function Maximizers
  3.4 Synchronization in Memoryless Learning
    3.4.1 Additional Information is Needed
    3.4.2 Independent Log-Linear Learning in Discrete Action Games
    3.4.3 Convergence to Potential Function Maximizers
  3.5 Convergence Rate Estimation and Finite Time Behavior
    3.5.1 Convergence Rate of Time-Inhomogeneous Log-Linear Learning
    3.5.2 Convergence Rate of Time-Inhomogeneous Independent Log-Linear Learning
    3.5.3 Simulation Results: Example of a Sensor Coverage Problem
  3.6 Learning in Continuous Action Games
    3.6.1 Log-Linear Learning in Continuous Action Games
    3.6.2 Independent Log-Linear Learning in Continuous Action Games
    3.6.3 Choice of the Parameter
    3.6.4 Simulation Results: Example of a Routing Problem
  3.7 Conclusion
  References
4 Stochastic Methods in Distributed Optimization and Game-Theoretic Learning
  4.1 Introduction
  4.2 Preliminaries: Iterative Markov Process
  4.3 Push-Sum Algorithm in Non-convex Distributed Optimization
    4.3.1 Problem Formulation: Push-Sum Algorithm and Assumptions
    4.3.2 Convergence to Critical Points
    4.3.3 Perturbed Procedure: Convergence to Local Minima
    4.3.4 Convergence Rate of the Perturbed Process
    4.3.5 Simulation Results: Illustrative Example and Congestion Routing Problem
  4.4 Communication-Based Memoryless Learning in Potential Games
    4.4.1 Simulation Results: Code Division Multiple Access Problem
  4.5 Payoff-Based Learning in Potential Games
    4.5.1 Convergence to a Local Maximum of the Potential Function
    4.5.2 Additional Calculations
    4.5.3 Simulation Results: Power Transition Problem
  4.6 Concave Potential Games with Uncoupled Action Sets
    4.6.1 Problem Formulation and Assumptions
    4.6.2 Applications to Electricity Markets
    4.6.3 Payoff-Based Algorithm in Concave Games
    4.6.4 Simulation Results: Electricity Market Application
  4.7 Conclusion
  References
5 Conclusion
Appendix A
  A.1 Convergence of Probability Measures
  A.2 Markov Chains with Finite States
    A.2.1 Time-Homogeneous Markov Chains with Finite States
    A.2.2 Time-Inhomogeneous Markov Chains with Finite States
  A.3 Markov Chains with General States
    A.3.1 Time-Homogeneous Markov Chains with General States
    A.3.2 Convergence of Transition Probability Kernels
    A.3.3 Time-Inhomogeneous Markov Chains with General States
  References

Chapter 1
Introduction

1.1 Motivation of Research

Due to the emergence of distributed networked systems, problems of cooperative control in multi-agent systems have gained a lot of attention over recent years. Some examples of networked multi-agent systems are smart grids, social networks, autonomous vehicle teams, processors in machine learning scenarios, etc. Usually in such a system there is a global objective to be achieved by appropriate local actions of agents. Such optimization problems can be solved centrally.
However, for a centralized solution a central controller (central computing unit) is required to collect the whole information about the system and to solve the optimization problem under consideration. This approach has limitations. Firstly, systems with such settings are sensitive to the failure of the central unit. Secondly, the information exchange is costly, since agents need to transmit their local information to the central unit and to receive the instructions from it. Moreover, due to a large network’s dimension, the optimization problem can become computationally infeasible for the central controller. Finally, there may be no resources to incorporate a central computing unit into the system. Thus, in networked multi-agent systems agents are motivated to optimize a global objective without any centralized computation by taking only the local information into account. Various approaches have been proposed so far to deal with this problem, including reinforcement learning, game-theoretic modeling, and distributed optimization algorithms.

© Springer International Publishing AG 2017. T. Tatarenko, Game-Theoretic Learning and Distributed Optimization in Memoryless Multi-Agent Systems, DOI 10.1007/978-3-319-65479-9_1

This work focuses mainly on the game-theoretic approach to optimization in multi-agent systems. One of the reasons to use this approach in different applications is that one can design games where a subset of Nash equilibrium states corresponds to the system optimal states that are desirable from a global perspective. In this context game-theoretic learning is a topic of specific interest. It concerns the analysis of the distributed adaptation rules that should be based on the local information available to agents and provide convergence of the collective behavior to an optimal solution of the global problem. On the other hand, there are multi-agent optimization problems falling naturally under this game-theoretic framework.
One can think of a problem in a networked system (for example, cooperative routing), where the cost of each agent depends not only on her own action (distribution of demand to be transmitted over the network), but also on the joint actions (decisions) of other agents (congestion on chosen routes). This coupling evokes a game-theoretic formulation, since the local interests depend on the global behavior. We assume that the objective in the system is to minimize the overall cost of the agents (the social cost of routing), whereas each agent is only aware of her own cost. Such an optimization problem can be considered a special case of a distributed optimization problem, where agents need to come to a consensus over their estimations of the optimal state. Moreover, this consensus should meet the global optimizer of the system. Thus, an average dynamics needs to be found that allows agents to utilize the local information efficiently enough to guarantee that the dynamics minimizes the overall cost.

As we can see, the local information plays an important role in both game-theoretic learning and distributed optimization algorithms. This work provides an appropriate dynamics for searching for an optimum in multi-agent systems given different information settings in these systems. Moreover, to restrict the system resources required to perform an algorithm, the present work focuses on memoryless procedures, in which agents base their decisions (their action choices) only on the current information and do not need to store their inputs and outputs from the previous steps.

The book begins with the introduction of some important concepts in game theory, a comparison of possible information structures in systems, and an overview of the existing relevant literature. All these issues are covered in Chap. 2. In this chapter a general paradigm and motivation of the game-theoretic approach and distributed optimization in multi-agent systems are highlighted.
Motivated by the discussion in Chap. 2, the work proceeds to consider memoryless multi-agent systems designed by means of potential games. Chapter 3 studies systems with the so-called oracle information, where each agent can calculate her current output given any action from her action set. Firstly, the chapter deals with discrete action potential games. A general memoryless stochastic procedure with specific properties is proposed to guarantee convergence in total variation of joint actions to a potential function maximizer, which under an appropriate game design coincides with a global optimizer in the centralized problem. The convergence rate is estimated for this procedure as well. Moreover, to accelerate this rate, this work provides some important insights into the finite time behavior of the procedure. The analysis is based on the theory of Markov chains with finite states. Background material on this topic is discussed in Appendix A.2. As special cases of such a general memoryless oracle-based algorithm, the logit dynamics and its synchronous version are considered. Some optimal settings for the parameters of these dynamics are established, under which the learning processes demonstrate an efficient performance. Moreover, the difference between the asynchronous and synchronous oracle-based learning dynamics is explained in terms of required

Description:
This book presents new efficient methods for optimization in realistic large-scale, multi-agent systems. These methods do not require the agents to have the full information about the system, but instead allow them to make their local decisions based only on the local information, possibly obtained