The Price of Stochastic Anarchy

Christine Chung¹, Katrina Ligett²⋆, Kirk Pruhs¹⋆⋆, and Aaron Roth²

¹ Department of Computer Science, University of Pittsburgh
{chung,kirk}@cs.pitt.edu
² Department of Computer Science, Carnegie Mellon University
{katrina,alroth}@cs.cmu.edu

Abstract. We consider the solution concept of stochastic stability, and propose the price of stochastic anarchy as an alternative to the price of (Nash) anarchy for quantifying the cost of selfishness and lack of coordination in games. As a solution concept, the Nash equilibrium has disadvantages that the set of stochastically stable states of a game avoid: unlike Nash equilibria, stochastically stable states are the result of natural dynamics of computationally bounded and decentralized agents, and are resilient to small perturbations from ideal play. The price of stochastic anarchy can be viewed as a smoothed analysis of the price of anarchy, distinguishing equilibria that are resilient to noise from those that are not. To illustrate the utility of stochastic stability, we study the load balancing game on unrelated machines. This game has an unboundedly large price of Nash anarchy even when restricted to two players and two machines. We show that in the two player case, the price of stochastic anarchy is 2, and that even in the general case, the price of stochastic anarchy is bounded. We conjecture that the price of stochastic anarchy is O(m), matching the price of strong Nash anarchy without requiring player coordination. We expect that stochastic stability will be useful in understanding the relative stability of Nash equilibria in other games where the worst equilibria seem to be inherently brittle.

⋆ Partially supported by an AT&T Labs Graduate Fellowship and an NSF Graduate Research Fellowship.
⋆⋆ Supported in part by NSF grants CNS-0325353, CCF-0448196, CCF-0514058, and IIS-0534531.

1 Introduction

Quantifying the price of (Nash) anarchy is one of the major lines of research in algorithmic game theory. Indeed, one fourth of the authoritative algorithmic game theory text edited by Nisan et al. [20] is wholly dedicated to this topic. But the Nash equilibrium solution concept has been widely criticized [15,4,9,10]. First, it is a solution characterization without a roadmap for how players might arrive at such a solution. Second, at Nash equilibria, players are unrealistically assumed to be perfectly rational, fully informed, and infallible. Third, computing Nash equilibria is PPAD-hard even for 2-player, n-action games [6], and it is therefore considered very unlikely that there exists a polynomial time algorithm to compute a Nash equilibrium even in a centralized manner. Thus, it is unrealistic to assume that selfish agents in general games will converge precisely to the Nash equilibria of the game, or that they will necessarily converge to anything at all. In addition, the price of Nash anarchy metric comes with its own weaknesses; it blindly uses the worst case over all Nash equilibria, despite the fact that some equilibria are more resilient than others to perturbations in play.

Considering these drawbacks, computer scientists have paid relatively little attention to if or how Nash equilibria will in fact be reached, and even less to the question of which Nash equilibria are more likely to be played in the event players do converge to Nash equilibria. To address these issues, we employ the stochastic stability framework from evolutionary game theory to study simple dynamics of computationally efficient, imperfect agents. Rather than defining a priori states such as Nash equilibria, which might not be reachable by natural dynamics, the stochastic stability framework allows us to define a natural dynamic, and from it derive the stable states. We define the price of stochastic anarchy to be the ratio of the worst stochastically stable solution to the optimal solution.
The stochastically stable states of a game may, but do not necessarily, contain all Nash equilibria of the game, and so the price of stochastic anarchy may be strictly better than the price of Nash anarchy. In games for which the stochastically stable states are a subset of the Nash equilibria, studying the ratio of the worst stochastically stable state to the optimal state can be viewed as a smoothed analysis of the price of anarchy, distinguishing Nash equilibria that are brittle to small perturbations in perfect play from those that are resilient to noise.

The evolutionary game theory literature on stochastic stability studies n-player games that are played repeatedly. In each round, each player observes her action and its outcome, and then uses simple rules to select her action for the next round based only on her size-restricted memory of the past rounds. In any round, players have a small probability of deviating from their prescribed decision rules. The state of the game is the contents of the memories of all the players. The stochastically stable states in such a game are the states with non-zero probability in the limit of this random process, as the probability of error approaches zero. The play dynamics we employ in this paper are the imitation dynamics studied by Josephson and Matros [16]. Under these dynamics, each player imitates the strategy that was most successful for her in recent memory.

To illustrate the utility of stochastic stability, we study the price of stochastic anarchy of the unrelated load balancing game [2,1,11]. To our knowledge, we are the first to quantify the loss of efficiency in any system when the players are in stochastically stable equilibria. In the load balancing game on unrelated machines, even with only two players and two machines, there are Nash equilibria with arbitrarily high cost, and so the price of Nash anarchy is unbounded. We show that these equilibria are inherently brittle, and that for two players and two machines, the price of stochastic anarchy is 2. This result matches the strong price of anarchy [1] without requiring coordination (at strong Nash equilibria, players have the ability to coordinate by forming coalitions). We further show that in the general n-player, m-machine game, the price of stochastic anarchy is bounded. More precisely, the price of stochastic anarchy is upper bounded by the $nm$th $n$-step Fibonacci number. We also show that the price of stochastic anarchy is at least $m+1$.

Our work provides new insight into the equilibria of the load balancing game. Unlike some previous work on dynamics for games, our work does not seek to propose practical dynamics with fast convergence; rather, we use simple dynamics as a tool for understanding the inherent relative stability of equilibria. Instead of relying on player coordination to avoid the Nash equilibria with unbounded cost (as is done in the study of strong equilibria), we show that these bad equilibria are inherently unstable in the face of occasional uncoordinated mistakes. We conjecture that the price of stochastic anarchy is closer to the linear lower bound, paralleling the price of strong anarchy.

In light of our results, we believe the techniques in this paper will be useful for understanding the relative stability of Nash equilibria in other games for which the worst equilibria are brittle. Indeed, for a variety of games in the price of anarchy literature, the worst Nash equilibria of the lower bound instances are not stochastically stable.

1.1 Related Work

We give a brief survey of related work in three areas: alternatives to Nash equilibria as a solution concept, stochastic stability, and the unrelated load balancing game.
Recently, several papers have noted that the Nash equilibrium is not always a suitable solution concept for computationally bounded agents playing in a repeated game, and have proposed alternatives. Goemans et al. [15] study players who sequentially play myopic best responses, and quantify the price of sinking that results from such play. Fabrikant and Papadimitriou [9] propose a model in which agents play restricted finite automata. Blum et al. [4,3] assume only that players' action histories satisfy a property called no regret, and show that for many games, the resulting social costs are no worse than those guaranteed by price of anarchy results.

Although we believe this to be the first work studying stochastic stability in the computer science literature, computer scientists have recently employed other tools from evolutionary game theory. Fischer and Vöcking [13] show that under replicator dynamics in the routing game studied by Roughgarden and Tardos [22], players converge to Nash. Fischer et al. [12] went on to show that using a simultaneous adaptive sampling method, play converges quickly to a Nash equilibrium. For a thorough survey of algorithmic results that have employed or studied other evolutionary game theory techniques and concepts, see Suri [23].

Stochastic stability and its adaptive learning model as studied in this paper were first defined by Foster and Young [14], and differ from the standard game theory solution concept of evolutionarily stable strategies (ESS). ESS are a refinement of Nash equilibria, and so do not always exist, and are not necessarily associated with a natural play dynamic. In contrast, a game always has stochastically stable states that result (by construction) from natural dynamics. In addition, ESS are resilient only to single shocks, whereas stochastically stable states are resilient to persistent noise.

Stochastic stability has been widely studied in the economics literature (see, for example, [24,17,19,5,7,21,16]). We discuss in Sect. 2 concepts from this body of literature that are relevant to our results. We recommend Young [25] for an informative and readable introduction to stochastic stability, its adaptive learning model, and some related results. Our work differs from prior work in stochastic stability in that it is the first to quantify the social utility of stochastically stable states, the price of stochastic anarchy.

We also note a connection between the stochastically stable states of the game and the sinks of a game, recently introduced by Goemans et al. as another way of studying the dynamics of computationally bounded agents. In particular, the stochastically stable states of a game under the play dynamics we consider correspond to a subset of the sink equilibria, and so provide a framework for identifying the stable sink equilibria. In potential games, the stochastically stable states of the play dynamics we consider correspond to a subset of the Nash equilibria, thus providing a method for identifying which of these equilibria are stable.

In this paper, we study the price of stochastic anarchy in load balancing. Even-Dar et al. [1] show that when playing the load balancing game on unrelated machines, any turn-taking improvement dynamics converge to Nash. Andelman et al. [1] observe that the price of Nash anarchy in this game is unbounded, and they show that the strong price of anarchy is linear in the number of machines. Fiat et al. [11] tighten their upper bound to match their lower bound at a strong price of anarchy of exactly m.

2 Model and Background

We now formalize (from Young [24]) the adaptive play model and the definition of stochastic stability. We then formalize the play dynamics that we consider.
We also provide in this section the results from the stochastic stability literature that we will later use for our results.

2.1 Adaptive Play and Stochastic Stability

Let $G = (X, \pi)$ be a game with $n$ players, where $X = \prod_{i=1}^{n} X_i$ represents the strategy sets $X_i$ for each player $i$, and $\pi = \prod_{i=1}^{n} \pi_i$ represents the payoff functions $\pi_i : X \to \mathbb{R}$ for each player. $G$ is played repeatedly for successive time periods $t = 1, 2, \ldots$, and at each time step $t$, player $i$ plays some action $s_i^t \in X_i$. The collection of all players' actions at time $t$ defines a play profile $S^t = (S_1^t, S_2^t, \ldots, S_n^t)$. We wish to model computationally efficient agents, and so we imagine that each agent has some finite memory of size $z$, and that after time step $t$, all players remember a history consisting of a sequence of play profiles $h^t = (S^{t-z+1}, S^{t-z+2}, \ldots, S^t) \in (X)^z$.

We assume that each player $i$ has some efficiently computable function $p_i : (X)^z \times X_i \to \mathbb{R}$ that, given a particular history, induces a sampleable probability distribution over actions (for all players $i$ and histories $h$, $\sum_{a \in X_i} p_i(h, a) = 1$). We write $p$ for $\prod_i p_i$. We wish to model imperfect agents who make mistakes, and so we imagine that at time $t$ each player $i$ plays according to $p_i$ with probability $1 - \epsilon$, and with probability $\epsilon$ plays some action in $X_i$ uniformly at random.³ That is, for all players $i$ and all actions $a \in X_i$,
$$\Pr[s_i^t = a] = (1 - \epsilon)\, p_i(h^t, a) + \frac{\epsilon}{|X_i|}.$$
The dynamics we have described define a Markov process $P^{G,p,\epsilon}$ with finite state space $H = (X)^z$ corresponding to the finite histories. For notational simplicity, we will write the Markov process as $P^\epsilon$ when there is no ambiguity.

³ The mistake probabilities need not be uniformly random; all that we require is that the distribution has support on all actions in $X_i$.

The potential successors of a history can be obtained by observing a new play profile and "forgetting" the least recent play profile in the current history.

Definition 2.1. For any $S' \in X$, a history $h' = (S^{t-z+2}, S^{t-z+3}, \ldots, S^t, S')$ is a successor of history $h^t = (S^{t-z+1}, S^{t-z+2}, \ldots, S^t)$.

The Markov process $P^\epsilon$ has transition probability $p^\epsilon_{h,h'}$ of moving from state $h = (S^1, \ldots, S^z)$ to state $h' = (T^1, \ldots, T^z)$:
$$p^\epsilon_{h,h'} = \begin{cases} \prod_{i=1}^{n} \left( (1-\epsilon)\, p_i(h, T_i^z) + \dfrac{\epsilon}{|X_i|} \right) & \text{if } h' \text{ is a successor of } h, \\ 0 & \text{otherwise.} \end{cases}$$

We will refer to $P^0$ as the unperturbed Markov process. Note that for $\epsilon > 0$, $p^\epsilon_{h,h'} > 0$ for every history $h$ and successor $h'$, and that for any two histories $h$ and $\hat{h}$, with $\hat{h}$ not necessarily a successor of $h$, there is a series of $z$ histories $h_1, \ldots, h_z$ such that $h_1 = h$, $h_z = \hat{h}$, and for all $1 < i \le z$, $h_i$ is a successor of $h_{i-1}$. Thus there is positive probability of moving between any $h$ and any $\hat{h}$ in $z$ steps, and so $P^\epsilon$ is irreducible. Similarly, there is a positive probability of moving between any $h$ and any $\hat{h}$ in $z+1$ steps, and so $P^\epsilon$ is aperiodic. Therefore, $P^\epsilon$ has a unique stationary distribution $\mu^\epsilon$.

The stochastically stable states of a particular game and player dynamics are the states with nonzero probability in the limit of the stationary distribution.

Definition 2.2 (Foster and Young [14]). A state $h$ is stochastically stable relative to $P^\epsilon$ if $\lim_{\epsilon \to 0} \mu^\epsilon(h) > 0$.

Intuitively, we should expect a process $P^\epsilon$ to spend almost all of its time at its stochastically stable states when $\epsilon$ is small.

When a player $i$ plays at random rather than according to $p_i$, we call this a mistake.

Definition 2.3 (Young [24]). Suppose $h' = (S^{t-z+1}, \ldots, S^t)$ is a successor of $h$. A mistake in the transition between $h$ and $h'$ is any element $S_i^t$ such that $p_i(h, S_i^t) = 0$. Note that mistakes occur with probability at most $\epsilon$.
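The perturbed process is straightforward to simulate. The sketch below is our own illustration (the function names and the representation of each $p_i$ as a sampling routine are assumptions, not notation from the paper): with probability $1-\epsilon$ a player draws her intended action from $p_i$, and with probability $\epsilon$ she makes a mistake and plays uniformly at random, after which the history advances to a successor.

```python
import random

def adaptive_play_step(history, choice_rules, action_sets, eps, rng=random):
    """Sample one play profile of the perturbed process P^eps.

    history      -- list of the last z play profiles (tuples of actions)
    choice_rules -- one function per player; choice_rules[i](history) returns
                    a sample from player i's intended distribution p_i
    action_sets  -- list of the action sets X_i
    eps          -- mistake probability epsilon
    """
    profile = []
    for i, rule in enumerate(choice_rules):
        if rng.random() < eps:
            # mistake: play a uniformly random action from X_i
            profile.append(rng.choice(list(action_sets[i])))
        else:
            # intended play, distributed according to p_i
            profile.append(rule(history))
    return tuple(profile)

def advance_history(history, new_profile):
    """Successor history: append the new profile and forget the oldest one."""
    return history[1:] + [new_profile]
```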
We can characterize the number of mistakes required to get from one history to another.

Definition 2.4 (Young [24]). For any two states $h, h'$, the resistance $r(h, h')$ is the minimum total number of mistakes involved in the transition $h \to h'$ if $h'$ is a successor of $h$. If $h'$ is not a successor of $h$, then $r(h, h') = \infty$.

Note that the transitions of zero resistance are exactly those that occur with positive probability in the unperturbed Markov process $P^0$.

Definition 2.5. We refer to the sinks of $P^0$ as recurrent classes. In other words, a recurrent class of $P^0$ is a set of states $C \subseteq H$ such that any state in $C$ is reachable from any other state in $C$ and no state outside $C$ is accessible from any state inside $C$.

We may view the state space $H$ as the vertex set of a directed graph, with an edge from $h$ to $h'$ if $h'$ is a successor of $h$, with edge weight $r(h, h')$.

Observation 2.6. We observe that the recurrent classes $H_1, H_2, \ldots$, where each $H_i \subseteq H$, have the following properties:
1. From every vertex $h \in H$, there is a path of cost 0 to one of the recurrent classes.
2. For each $H_i$ and for every pair of vertices $h, h' \in H_i$, there is a path of cost 0 between $h$ and $h'$.
3. For each $H_i$, every edge $(h, h')$ with $h \in H_i$ and $h' \notin H_i$ has positive cost.

Let $r_{i,j}$ denote the cost of the shortest path between $H_i$ and $H_j$ in the graph described above. We now consider the complete directed graph $\mathcal{G}$ with vertex set $\{H_1, H_2, \ldots\}$ in which the edge $(H_i, H_j)$ has weight $r_{i,j}$. Let $T_i$ be a directed minimum-weight spanning in-tree of $\mathcal{G}$ rooted at vertex $H_i$. (An in-tree is a directed tree where each edge is oriented toward the root.) The stochastic potential of $H_i$ is defined to be the sum of the edge weights in $T_i$.

Young proves the following theorem characterizing stochastically stable states:

Theorem 2.7 (Young [24]). In any n-player game $G$ with finite strategy sets and any set of action distributions $p$, the stochastically stable states of $P^{G,p,\epsilon}$ are the recurrent classes of minimum stochastic potential.

2.2 Imitation Dynamics

In this paper, we study agents who behave according to a slight modification of the imitation dynamics introduced by Josephson and Matros [16]. (We note that this modification is of no consequence to the results of Josephson and Matros [16] that we present below.) Player $i$ using imitation dynamics parameterized by $\sigma \in \mathbb{N}$ chooses his action at time $t+1$ according to the following mechanism:
1. Player $i$ selects a set $Y$ of $\sigma$ play profiles uniformly at random from the $z$ profiles in history $h^t$.
2. For each play profile $S \in Y$, player $i$ recalls the payoff $\pi_i(S)$ he obtained from playing action $S_i$.
3. Player $i$ plays the action among these that corresponds to his highest payoff; that is, he plays the $i$th component of $\arg\max_{S \in Y} \pi_i(S)$. In the case of ties, he plays a highest-payoff action at random.

The value $\sigma$ is a parameter of the dynamics that is taken to be $n \le \sigma \le z/2$. These dynamics can be interpreted as modeling a situation in which at each time step, players are chosen at random from a pool of identical players, who each played in a subset of the last $z$ rounds. The players are computationally simple, and so do not counterspeculate the actions of their opponents, instead playing the action that has worked the best for them in recent memory.

We will say that a history $h$ is monomorphic if the same action profile $S$ has been repeated for the last $z$ rounds: $h = (S, S, \ldots, S)$. Josephson and Matros [16] prove the following useful fact:

Proposition 2.8. A set of states is a recurrent class of the imitation dynamics if and only if it is a singleton set consisting of a monomorphic state.
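The three-step mechanism above translates directly into code. The following sketch is our own paraphrase of one player's choice rule (the names `imitation_choice` and `payoff_i` are illustrative and not taken from [16]); starting from a monomorphic history, every player keeps replaying the same action, which is the intuition behind Proposition 2.8.

```python
import random

def imitation_choice(i, history, payoff_i, sigma, rng=random):
    """Player i's imitation-dynamics choice: sample sigma profiles from
    memory and replay the own action that gave the highest remembered
    payoff (ties broken at random among the tied profiles)."""
    sample = rng.sample(history, sigma)            # step 1: draw sigma profiles
    best = max(payoff_i(S) for S in sample)        # step 2: recall own payoffs
    best_profiles = [S for S in sample if payoff_i(S) == best]
    return rng.choice(best_profiles)[i]            # step 3: imitate own best action
```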
Since the stochastically stable states are a subset of the recurrent classes, we can associate with each stochastically stable state $h = (S, \ldots, S)$ the unique action profile $S$ it contains. This allows us to now define the price of stochastic anarchy with respect to imitation dynamics. For brevity, we will refer to this throughout the paper as simply the price of stochastic anarchy.

Definition 2.9. Given a game $G = (X, \pi)$ with a social cost function $\gamma : X \to \mathbb{R}$, the price of stochastic anarchy of $G$ is equal to $\max \frac{\gamma(S)}{\gamma(\mathrm{OPT})}$, where OPT is the play profile that minimizes $\gamma$ and the max is taken over all play profiles $S$ such that $h = (S, \ldots, S)$ is stochastically stable.

Given a game $G$, we define the better response graph of $G$: the set of vertices corresponds to the set of action profiles of $G$, and there is an edge between two action profiles $S$ and $S'$ if and only if there exists a player $i$ such that $S'$ differs from $S$ only in player $i$'s action, and player $i$ does not decrease his utility by unilaterally deviating from $S_i$ to $S'_i$. Josephson and Matros [16] prove the following relationship between this better response graph and the stochastically stable states of a game:

Theorem 2.10. If $V$ is the set of stochastically stable states under imitation dynamics, then the set of action profiles $\{S : (S, \ldots, S) \in V\}$ is either a strongly connected component of the better response graph of $G$, or a union of strongly connected components.

Goemans et al. [15] introduce the notion of sink equilibria and a corresponding notion of the "price of sinking", which is the ratio of the social welfare of the worst sink equilibrium to that of the social optimum. We note that the strongly connected components of the better response graph of $G$ correspond to the sink equilibria (under sequential better-response play) of $G$, and so Theorem 2.10 implies that the stochastically stable states under imitation dynamics correspond to a subset of the sinks of the better response graph of $G$. We get the following corollary:

Corollary 2.11. The price of stochastic anarchy of a game $G$ under imitation dynamics is at most the price of sinking of $G$.

3 Load Balancing: Game Definition and Price of Nash Anarchy

The load balancing game on unrelated machines models a set of agents who wish to schedule computing jobs on a set of machines. The machines have different strengths and weaknesses (for example, they may have different types of processors or differing amounts of memory), and so each job will take a different amount of time to run on each machine. Jobs on a single machine are executed in parallel such that all jobs on any given machine finish at the same time. Thus, each agent who schedules his job on machine $M_i$ endures the load on machine $M_i$, where the load is defined to be the sum of the running times of all jobs scheduled on $M_i$. Agents wish to minimize the completion time for their jobs, and social cost is defined to be the makespan: the maximum load on any machine.

Formally, an instance of the load balancing game on unrelated machines is defined by a set of $n$ players and $m$ machines $M = \{M_1, \ldots, M_m\}$. The action space for each player is $X_i = M$. Each player $i$ has some cost $c_{i,j}$ on machine $j$. Denote the cost of machine $M_j$ for action profile $S$ by $C_j(S) = \sum_{i : S_i = M_j} c_{i,j}$. Each player $i$ has utility function $\pi_i(S) = -C_{S_i}(S)$. The social cost of an action profile $S$ is $\gamma(S) = \max_{j \in M} C_j(S)$. We define OPT to be the action profile that minimizes social cost: $\mathrm{OPT} = \arg\min_{S \in X} \gamma(S)$. Without loss of generality, we will always normalize so that $\gamma(\mathrm{OPT}) = 1$.
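The quantities just defined (machine loads, makespan, and OPT) are easy to compute directly for small instances. The sketch below is our own illustration; the function names and the brute-force search for OPT are assumptions made for exposition, and the sample costs encode an instance of the $1/\delta$ form discussed in the next paragraph, with $\delta = 0.125$ chosen arbitrarily.

```python
from itertools import product

def machine_loads(profile, costs, m):
    """C_j(S): total running time of the jobs assigned to each machine j."""
    loads = [0.0] * m
    for i, j in enumerate(profile):       # player i assigned to machine j
        loads[j] += costs[i][j]
    return loads

def social_cost(profile, costs, m):
    """gamma(S): the makespan, i.e. the maximum machine load."""
    return max(machine_loads(profile, costs, m))

def opt_cost(costs, m):
    """Brute-force gamma(OPT); exponential in n, fine for toy instances."""
    n = len(costs)
    return min(social_cost(p, costs, m) for p in product(range(m), repeat=n))

# Illustrative 2x2 instance: c_{1,1} = c_{2,2} = 1, c_{1,2} = c_{2,1} = 1/delta.
delta = 0.125
costs = [[1.0, 1.0 / delta],
         [1.0 / delta, 1.0]]
assert opt_cost(costs, 2) == 1.0                      # OPT = (M1, M2)
assert social_cost((1, 0), costs, 2) == 1.0 / delta   # the high-cost Nash profile
```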
The coordination ratio of a game (also known as the price of anarchy) was introduced by Koutsoupias and Papadimitriou [18], and is intended to quantify the loss of efficiency due to selfishness and the lack of coordination among rational agents. Given a game $G$ and a social cost function $\gamma$, it is simple to quantify the OPT game state $S$: $\mathrm{OPT} = \arg\min \gamma(S)$. It is less clear how to model rational selfish agents. In most prior work it has been assumed that selfish agents play according to a Nash equilibrium, and the price of anarchy has been defined as the ratio of the cost of the worst (pure strategy) Nash state to OPT. In this paper, we refer to this measure as the price of Nash anarchy, to distinguish it from the price of stochastic anarchy, which we defined in Sect. 2.2.

Definition 3.1. For a game $G$ with a set of Nash equilibrium states $E$, the price of (Nash) anarchy is $\max_{S \in E} \frac{\gamma(S)}{\gamma(\mathrm{OPT})}$.

We show here that even with only two players and two machines, the load balancing game on unrelated machines has a price of Nash anarchy that is unbounded by any function of $m$ and $n$. Consider the two-player, two-machine game with $c_{1,1} = c_{2,2} = 1$ and $c_{1,2} = c_{2,1} = 1/\delta$, for some $0 < \delta < 1$. Then the play profile $\mathrm{OPT} = (M_1, M_2)$ is a Nash equilibrium with cost 1. However, observe that the profile $S^* = (M_2, M_1)$ is also a Nash equilibrium, with cost $1/\delta$ (since by deviating, players can only increase their cost from $1/\delta$ to $1/\delta + 1$). The price of anarchy of the load balancing game is therefore $1/\delta$, which can be unboundedly large, although $m = n = 2$.

4 Upper Bound on Price of Stochastic Anarchy

The load balancing game is an ordinal potential game [8], and so the sinks of the better-response graph correspond to the pure strategy Nash equilibria. We therefore have by Corollary 2.11 that the stochastically stable states are a subset of the pure strategy Nash equilibria of the game, and the price of stochastic anarchy is at most the price of anarchy. We have noted that even in the two-person, two-machine load balancing game, the price of anarchy is unbounded (even for pure strategy equilibria). Therefore, as a warmup, we bound the price of stochastic anarchy of the two-player, two-machine case.

4.1 Two Players, Two Machines

Theorem 4.1. In the two-player, two-machine load balancing game on unrelated machines, the price of stochastic anarchy is 2.

Note that the two-player, two-machine load balancing game can have at most two strict pure strategy Nash equilibria. (For brevity we consider the case of strict equilibria; the argument for weak equilibria is similar.) Note also that either there is a unique Nash equilibrium at $(M_1, M_1)$ or $(M_2, M_2)$, or there are two at $N_1 = (M_1, M_2)$ and $N_2 = (M_2, M_1)$.

An action profile $N$ Pareto dominates $N'$ if for each player $i$, $C_{N_i}(N) \le C_{N'_i}(N')$.

Lemma 4.2. If there are two Nash equilibria, and $N_1$ Pareto dominates $N_2$, then only $N_1$ is stochastically stable (and vice versa).

Proof. Note that if $N_1$ Pareto dominates $N_2$, then it also Pareto dominates $(M_1, M_1)$ and $(M_2, M_2)$, since each is a unilateral deviation from a Nash equilibrium for both players. Consider the monomorphic state $(N_2, \ldots, N_2)$. If both players make simultaneous mistakes at time $t$ to $N_1$, then by assumption, $N_1$ will be the action profile in $h^{t+1} = (N_2, \ldots, N_2, N_1)$ with lowest cost for both players. Therefore, with positive probability, both players will draw samples of their histories containing the action profile $N_1$, and therefore play it, until $h^{t+z} = (N_1, \ldots, N_1)$. Therefore, there is an edge in $\mathcal{G}$ from $h = (N_2, \ldots, N_2)$ to $h' = (N_1, \ldots, N_1)$ of resistance 2. However, there is no edge from $h'$ to any other state in $\mathcal{G}$ with resistance less than $\sigma$. Recall our initial observation that $N_1$ in fact Pareto dominates all other action profiles. Therefore, no set of mistakes will yield an action profile with higher payoff than $N_1$ for either player, and so to leave state $h'$ will require at least $\sigma$ mistakes (so that some player may draw a sample from their history that contains no instance of action profile $N_1$). Therefore, given any minimum spanning in-tree of $\mathcal{G}$ rooted at $h$, we may add an edge $(h, h')$ of weight 2 and remove the outgoing edge from $h'$, which we have shown must have cost at least $\sigma$. This yields a spanning in-tree rooted at $h'$ with strictly lower cost. We have therefore shown that $h'$ has strictly lower stochastic potential than $h$, and so by Theorem 2.7, $h$ is not stochastically stable. Since at least one Nash equilibrium must be stochastically stable, $h' = (N_1, \ldots, N_1)$ is the unique stochastically stable state. ⊓⊔
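As an illustration (not a proof) of the argument in Lemma 4.2, here is a rough, self-contained simulation of imitation dynamics with mistakes on the $1/\delta$ instance from Sect. 3. The function name, memory size $z$, sample size $\sigma$, mistake rate $\epsilon$, round count, and seed are all arbitrary choices of ours; with such settings one typically sees play escape the Pareto-dominated equilibrium after a pair of simultaneous mistakes and then spend almost all of its time at OPT.

```python
import random
from collections import Counter

def imitation_sim(costs, z=20, sigma=4, eps=0.1, rounds=50000, seed=1):
    """Toy simulation of imitation dynamics with mistake rate eps on a
    2-player, 2-machine load balancing instance (0-based machine indices).
    All parameter values are arbitrary illustrative choices."""
    rng = random.Random(seed)
    n, m = 2, 2

    def payoff(i, S):
        # pi_i(S) = -C_{S_i}(S): negated load of the machine player i uses
        return -sum(costs[k][S[k]] for k in range(n) if S[k] == S[i])

    history = [(1, 0)] * z      # start at the high-cost equilibrium N2 = (M2, M1)
    visits = Counter()
    for _ in range(rounds):
        profile = []
        for i in range(n):
            if rng.random() < eps:                   # mistake: random machine
                profile.append(rng.randrange(m))
            else:                                    # imitate own best sampled action
                sample = rng.sample(history, sigma)
                best = max(sample, key=lambda S: payoff(i, S))
                profile.append(best[i])
        profile = tuple(profile)
        history = history[1:] + [profile]
        visits[profile] += 1
    return visits

delta = 0.125
costs = [[1.0, 1.0 / delta], [1.0 / delta, 1.0]]
# With these (arbitrary) settings, the bulk of the visits is typically at
# OPT = (0, 1), i.e. (M1, M2), even though play starts at the Nash state (1, 0).
print(imitation_sim(costs).most_common(3))
```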
Proof (of Theorem 4.1). If there is only one Nash equilibrium, $(M_1, M_1)$ or $(M_2, M_2)$, then it must be the only stochastically stable state (since in potential games these are a nonempty subset of the pure strategy Nash equilibria), and must also be OPT. In this case, the price of anarchy is equal to the price of stochastic anarchy, and is 1. Therefore, we may assume that there are two Nash equilibria, $N_1$ and $N_2$. If $N_1$ Pareto dominates $N_2$, then $N_1$ must be OPT (since load balancing is a potential game), and by Lemma 4.2, $N_1$ is the only stochastically stable state. In this case, the price of stochastic anarchy is 1 (strictly less than the possibly unbounded price of anarchy). A similar argument holds if $N_2$ Pareto dominates $N_1$. Therefore, we may assume that neither $N_1$ nor $N_2$ Pareto dominates the other.

Without loss of generality, assume that $N_1$ is OPT, and that in $N_1 = (M_1, M_2)$, $M_2$ is the maximally loaded machine. Suppose that $M_2$ is also the maximally loaded machine in $N_2$. (The other case is similar.) Together with the fact that $N_1$ does not Pareto dominate $N_2$, this gives us the following:
$$c_{1,1} \le c_{2,2}, \qquad c_{2,1} \le c_{2,2}, \qquad c_{1,2} \ge c_{2,2}.$$
From the fact that both $N_1$ and $N_2$ are Nash equilibria, we get:
$$c_{1,1} + c_{2,1} \ge c_{2,2}, \qquad c_{1,1} + c_{2,1} \ge c_{1,2}.$$
In this case, the price of anarchy among pure strategy Nash equilibria is:
$$\frac{c_{1,2}}{c_{2,2}} \le \frac{c_{1,1} + c_{2,1}}{c_{2,2}} \le \frac{c_{1,1} + c_{2,1}}{c_{1,1}} = 1 + \frac{c_{2,1}}{c_{1,1}}.$$
Similarly, we have:
$$\frac{c_{1,2}}{c_{2,2}} \le \frac{c_{1,1} + c_{2,1}}{c_{2,2}} \le \frac{c_{1,1} + c_{2,1}}{c_{2,1}} = 1 + \frac{c_{1,1}}{c_{2,1}}.$$
Combining these two inequalities, we get that the price of Nash anarchy is at most $1 + \min(c_{1,1}/c_{2,1},\, c_{2,1}/c_{1,1}) \le 2$. Since the price of stochastic anarchy is at most the price of anarchy over pure strategies, this completes the proof. ⊓⊔

4.2 General Case: n Players, m Machines

Theorem 4.3. The general load balancing game on unrelated machines has price of stochastic anarchy bounded by a function $\Psi$ depending only on $n$ and $m$, and
$$\Psi(n, m) \le m \cdot F_{(n)}(nm + 1),$$
where $F_{(n)}(i)$ denotes the $i$th $n$-step Fibonacci number.⁴

To prove this upper bound, we show that any solution worse than our upper bound cannot be stochastically stable. To show this impossibility, we take any arbitrary solution worse than our upper bound and show that there must always be a minimum cost in-tree in $\mathcal{G}$ rooted at a different solution that has strictly less cost than the minimum cost in-tree rooted at that solution. We then apply Proposition 2.8 and Theorem 2.7. The proof proceeds by a series of lemmas.

Definition 4.4. For any monomorphic Nash state $h = (S, \ldots, S)$, let the Nash graph of $h$ be a directed graph with vertex set $M$ and a directed edge $(M_i, M_j)$ whenever there is some player $k$ with $S_k = M_i$ and $\mathrm{OPT}_k = M_j$. Let the closure $\bar{M}_i$ of machine $M_i$ be the set of machines reachable from $M_i$ by following 0 or more edges of the Nash graph.

Lemma 4.5. In any monomorphic Nash state $h = (S, \ldots, S)$, if there is a machine $M_i$ such that $C_i(S) > m$, then every machine $M_j \in \bar{M}_i$ has cost $C_j(S) > 1$.

Proof. Suppose this were not the case, and there exists an $M_j \in \bar{M}_i$ with $C_j(S) \le 1$. Since $M_j \in \bar{M}_i$, there exists a simple path $(M_i = M_1, M_2, \ldots, M_k = M_j)$ with $k \le m$. Since $S$ is a Nash equilibrium, it must be the case that $C_{k-1}(S) \le 2$: by the definition of the Nash graph, the directed edge from $M_{k-1}$ to $M_k$ implies that there is some player $i$ with $S_i = M_{k-1}$ but $\mathrm{OPT}_i = M_k$. Since $1 = \gamma(\mathrm{OPT}) \ge C_k(\mathrm{OPT}) \ge c_{i,k}$, if player $i$ deviated from his action in Nash profile $S$ to $S'_i = M_k$, he would experience cost $C_k(S) + c_{i,k} \le 1 + 1 = 2$. Since he cannot benefit from deviating (by definition of Nash), it must be that his cost in $S$, $C_{k-1}(S)$, is at most 2. By the same argument, it must be that $C_{k-2}(S) \le 3$, and by induction, $C_1(S) \le k \le m$. But $M_1 = M_i$, contradicting the assumption that $C_i(S) > m$. ⊓⊔

Lemma 4.6. For any monomorphic Nash state $h = (S, \ldots, S) \in \mathcal{G}$ with $\gamma(S) > m$, there is an edge in $\mathcal{G}$ from $h$ to some $g = (T, \ldots, T)$ with $\gamma(T) \le m$, and the edge has cost at most $n$.
⁴ $F_{(n)}(i) = \begin{cases} 1 & \text{if } i \le n, \\ \sum_{j=i-n}^{i-1} F_{(n)}(j) & \text{otherwise;} \end{cases}$ note that $F_{(n)}(i) \in o(2^i)$ for any fixed $n$.
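For a sense of how the bound of Theorem 4.3 behaves, the short sketch below (our own code, assuming the $n$-step Fibonacci recurrence of footnote 4) computes $m \cdot F_{(n)}(nm+1)$ for small instances.

```python
from functools import lru_cache

def n_step_fibonacci(n, i):
    """F_(n)(i): 1 for i <= n, otherwise the sum of the previous n values."""
    @lru_cache(maxsize=None)
    def F(k):
        if k <= n:
            return 1
        return sum(F(j) for j in range(k - n, k))
    return F(i)

def psi_upper_bound(n, m):
    """The upper bound m * F_(n)(nm + 1) from Theorem 4.3."""
    return m * n_step_fibonacci(n, n * m + 1)

# For example, with n = m = 2 the bound is 2 * F_(2)(5) = 2 * 5 = 10,
# well above the tight value 2 established in Theorem 4.1 for that case.
print(psi_upper_bound(2, 2))
```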