Tilburg University

Computing normal form perfect equilibria for extensive two-person games
von Stengel, B.; van den Elzen, A.H.; Talman, A.J.J.

Published in: Econometrica
Publication date: 2002
Document version: Peer reviewed version

Citation for published version (APA): von Stengel, B., van den Elzen, A. H., & Talman, A. J. J. (2002). Computing normal form perfect equilibria for extensive two-person games. Econometrica, 70(2), 693-715.

COMPUTING NORMAL FORM PERFECT EQUILIBRIA FOR EXTENSIVE TWO-PERSON GAMES

By Bernhard von Stengel, Antoon van den Elzen, and Dolf Talman¹

July 31, 2000

This paper presents an algorithm for computing an equilibrium of an extensive two-person game with perfect recall. The method is computationally efficient by using the sequence form, whose size is proportional to the size of the game tree. The equilibrium is traced on a piecewise linear path in the sequence form strategy space from an arbitrary starting vector. If the starting vector represents a pair of completely mixed strategies, then the equilibrium is normal form perfect.
Computational experiments compare the sequence form and the reduced normal form, and show that only the sequence form is tractable for larger games.

Keywords: Extensive game, linear complementarity, Nash equilibrium, normal form perfect equilibrium, sequence form.

¹ The authors thank Eric van Damme, Marciano Siniscalchi, the editor, and the referees for helpful comments, and David Avis, Richard McKelvey, and Ted Turocy for help with programming. The first author was supported by a Heisenberg grant from the Deutsche Forschungsgemeinschaft.

1. Introduction

In this paper we present an algorithm for computing a Nash equilibrium of a two-person game in extensive form with perfect recall. The computed equilibrium is normal form perfect. If the game has several equilibria, they can potentially be found by varying the starting point of the algorithm. The method is fast since it uses the compact “sequence form” of the extensive game (see the references below) instead of its reduced normal form. It is simple because it is a version of Lemke’s algorithm for linear complementarity problems. We have implemented it in exact arithmetic, which guarantees numerical stability.

Computational experiments show that the number of pivoting steps of our algorithm to find an equilibrium is of the same order as that of the simplex algorithm for linear programming applied to a comparable zero-sum game. “Typical” games with several hundred nodes are solved in less than a minute where it would be hopeless to use the reduced normal form. Our method therefore puts much more complex games in computational reach, even more so as computers get faster.

The algorithm is a synthesis of previous, partly independent work by the authors and Daphne Koller and Nimrod Megiddo. For two-person games in normal form, van den Elzen and Talman (1991, 1999) (see also van den Elzen, 1993) described a complementary pivoting algorithm that traces a piecewise linear path from a given starting vector to an equilibrium.
If the starting vector is a completely mixed strategy pair, then the computed path leads to a perfect equilibrium. The free choice of the starting vector makes it possible to compute several equilibria if they exist.

This pivoting algorithm can be applied to an extensive game by converting it to its normal form. Then, the variables are probabilities for pure strategies, each of which is a combination of choices, one choice for each information set. The number of strategies therefore increases exponentially with the number of information sets. The number of information sets is typically proportional to the size of the game tree (the number of tree nodes), so then the size of the normal form is exponential. Even the reduced normal form (where strategies that differ only in choices at unreachable information sets are identified) shows an exponential-type growth in that case. For example, the games with $N$ tree nodes studied in our computational experiments have on the order of $2^{\sqrt{N}}$ reduced strategies rather than on the order of $2^{N}$ unreduced strategies, which nevertheless leads to an “explosion” in size. Each pivoting step updates the entire linear system derived from the payoff matrices, which is very slow for matrices of exponential size.

The sequence form of the extensive game (Romanovskii, 1962; von Stengel, 1996) is a strategic description where pure strategies are replaced by sequences of choices that lead to a node of the game tree, so there are at most as many sequences as there are nodes. The dimensions of the resulting matrix are proportional to the game tree size. Each pivoting step applied to this system is therefore computationally efficient. An algorithm is called computationally efficient if its asymptotic running time is bounded by a polynomial in the input size. For the overall number of pivoting steps, this is only an empirical observation. Our practical experiments show that the number of pivoting steps to find an equilibrium is about the same as the matrix dimension.
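The size gap between the normal form and the sequence form described above can be made concrete with a small count. The following is a minimal sketch, not part of the paper: it assumes a player with perfect recall whose information sets are described only by their numbers of choices, and the function names are hypothetical. It counts unreduced pure strategies (one choice per information set) and sequences (the empty sequence plus one sequence per choice).

```python
from math import prod

def normal_form_size(choices_per_info_set):
    """Number of unreduced pure strategies: one choice at every information set."""
    return prod(choices_per_info_set)

def sequence_form_size(choices_per_info_set):
    """Number of sequences: the empty sequence plus one sequence per choice."""
    return 1 + sum(choices_per_info_set)

# Hypothetical player with k information sets of two choices each.
for k in (2, 10, 20, 40):
    binary = [2] * k
    print(k, normal_form_size(binary), sequence_form_size(binary))
```

For ten binary information sets this gives 1024 pure strategies against 21 sequences. The count of pure strategies here is for the unreduced normal form, so it overstates the size of the reduced normal form.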
The pivoting method, like the simplex algorithm for linear programming, is not polynomial in theory (certain specifically constructed worst cases take exponential time), but works well in practice.

Koller, Megiddo, and von Stengel (1996) applied the complementary pivoting algorithm by Lemke (1965) to the sequence form. As before, each pivoting step takes polynomial time, and the number of pivoting steps is empirically a polynomial in the tree size. However, this algorithm finds only one equilibrium and it is not certain whether this equilibrium is normal form perfect.

Here we show how to combine the (empirical) computational efficiency of the algorithm of Koller, Megiddo, and von Stengel (1996) and the flexibility of the algorithm of van den Elzen and Talman (1991). Our method is a variation of Lemke’s algorithm and operates on the sequence form. It can be started anywhere to search for more than one equilibrium. If the starting strategy vector is completely mixed, the equilibrium found is normal form perfect. Equivalently, it is a Nash equilibrium in undominated strategies since the game has two players (van Damme, 1987).

The key to our result is the new observation that the algorithm of van den Elzen and Talman is equivalent to Lemke’s algorithm for a specific auxiliary vector. This is readily applied to the sequence form, as described in Section 3 below. We then study the nature of the computed path. The path and the equilibrium found have all properties of the normal form in a compact representation.

The implementation of our algorithm also resolves a number of technical difficulties of degeneracy and numerical accuracy. Degeneracy is intrinsic for extensive games, even with generic payoffs and when using the sequence form, since the probabilities for the players’ behavior off the equilibrium path are underdetermined.
In order to avoid a well-known numerical instability of Lemke’s algorithm (Tomlin, 1978), we employ arbitrary precision arithmetic, and yet achieve good running times due to the use of “integer pivoting”.

We also give a concise exposition of the sequence form in Section 2, and show, more explicitly than in earlier publications, how it relates to the normal form via equation (2.2). The sequence form defines an equilibrium problem where each player’s strategy space is a polytope. Charnes (1953) described the solution of zero-sum games that are constrained in this way. For a game in extensive form, Romanovskii (1962) derived such a constrained matrix game which is equivalent to the sequence form. Until recently, this publication was overlooked in the English-speaking community. Eaves (1973) applied Lemke’s algorithm to games which include polyhedrally constrained bimatrix games, but with different parameters than we do. Dai and Talman (1993) described an algorithm that corresponds to ours but requires simple polyhedra as strategy spaces, which is not the case for the sequence form. Selten (1988, pp. 226, 237ff) defined sequence form strategy spaces to exploit their linearity, but not for computational purposes. Recent surveys on algorithms for computing Nash equilibria are McKelvey and McLennan (1996) and von Stengel (2000).

The setup of the paper is as follows. Section 2 recalls the notion of the sequence form, its derivation from the extensive game, and how its equilibria are the solutions to a corresponding linear complementarity problem. The algorithm is presented in Section 3 and illustrated in Section 4 with an example. In Section 5 we prove that the equilibrium found is normal form perfect if the starting strategy vector is completely mixed, and note that the algorithm mimics the linear tracing procedure. Section 6 discusses the handling of degeneracy. In Section 7 we show that our method is an instance of a homotopy, and mention how to find equilibria of negative index.
Section 8 compares the method with other algorithms. In Section 9, we present results of computational experiments.

2. The sequence form linear complementarity problem

We consider extensive two-person games, with conventions similar to von Stengel (1996) and Koller, Megiddo, and von Stengel (1996). An extensive game is given by a tree with a finite number of nodes, chance moves with positive probabilities, payoffs to both players at the leaves (the terminal nodes), and information sets partitioning the set of remaining decision nodes. The choices of a player at an information set are denoted by labels of tree edges. For simplicity, labels corresponding to different choices anywhere in the tree are distinct. On the unique path from the root to a node of the tree, the labels denoting the choices of a particular player define a sequence of choices for that player.

We assume that both players have perfect recall. By definition, this means that all nodes in an information set $h$ of a player define the same sequence $\sigma_h$ of choices for that player. Under that assumption, each choice $c$ at $h$ is the last choice of a unique sequence $\sigma_h c$. This defines all possible sequences of a player except for the empty sequence $\emptyset$. The set of choices at an information set $h$ is denoted $C_h$. The set of information sets of player $i$ is $H_i$, and the set of his sequences is $S_i$, so

$$S_i = \{\emptyset\} \cup \{\sigma_h c \mid h \in H_i,\ c \in C_h\}.$$

The size of the extensive game is the amount of data needed to specify it. It is proportional to the total number of nodes of the game tree. The number $|S_i|$ of sequences of player $i$ is $1 + \sum_{h \in H_i} |C_h|$, which is at most linear in the size of the extensive game.

A behavior strategy $\beta$ of player $i$ is given by probabilities $\beta(c)$ for his choices $c$ which fulfill $\beta(c) \ge 0$ and $\sum_{c \in C_h} \beta(c) = 1$ for all $h$ in $H_i$. This definition of $\beta$ can be extended to the sequences $\sigma$ in $S_i$ by writing

$$\beta[\sigma] = \prod_{c \text{ in } \sigma} \beta(c). \tag{2.1}$$

A pure strategy $\pi$ of a player is a behavior strategy with $\pi(c) \in \{0,1\}$ for all choices $c$. The set of pure strategies of player $i$ is denoted $P_i$. Thus, $\pi[\sigma] \in \{0,1\}$ for all sequences $\sigma$ in $S_i$. The pure strategies $\pi$ with $\pi[\sigma] = 1$ are those “agreeing” with $\sigma$ by prescribing all the choices in $\sigma$, and arbitrary choices at the information sets not touched by $\sigma$.

In the normal form of the extensive game, one considers pure strategies and their probability mixtures. A mixed strategy $\mu$ of player $i$ assigns a probability $\mu(\pi)$ to every $\pi$ in $P_i$. In the sequence form of the extensive game, one considers the sequences of a player instead of his pure strategies. A randomized strategy of player $i$ is described by the realization probabilities of playing the sequences $\sigma$ in $S_i$. For a behavior strategy $\beta$, these are obviously $\beta[\sigma]$ as in (2.1). For a mixed strategy $\mu$ of player $i$, they are given by

$$\mu[\sigma] = \sum_{\pi \in P_i} \pi[\sigma]\,\mu(\pi). \tag{2.2}$$

For player 1, this defines a map $x$ from $S_1$ to $\mathbb{R}$ by $x(\sigma) = \mu[\sigma]$ for $\sigma$ in $S_1$, which we call the realization plan of $\mu$ or a realization plan for player 1. A realization plan for player 2, similarly defined on $S_2$, is denoted $y$. The important properties of realization plans are stated in the following two lemmas (Koller and Megiddo, 1992; von Stengel, 1996).

Lemma 2.1: For player 1, $x$ is the realization plan of a mixed strategy if and only if $x(\sigma) \ge 0$ for all $\sigma \in S_1$ and

$$x(\emptyset) = 1, \qquad \sum_{c \in C_h} x(\sigma_h c) = x(\sigma_h), \quad h \in H_1. \tag{2.3}$$

A realization plan $y$ of player 2 is characterized analogously.

Proof: Equations (2.3) hold for the realization probabilities $x(\sigma) = \beta[\sigma]$ for a behavior strategy $\beta$ and thus for every pure strategy $\pi$, and therefore for their convex combinations in (2.2) with the probabilities $\mu(\pi)$.

To simplify notation, we write realization plans as vectors $x = (x_\sigma)_{\sigma \in S_1}$ and $y = (y_\sigma)_{\sigma \in S_2}$ with sequences as subscripts. According to Lemma 2.1, these vectors are characterized by

$$x \ge 0, \quad Ex = e, \qquad y \ge 0, \quad Fy = f \tag{2.4}$$

for suitable matrices $E$ and $F$, and vectors $e$ and $f$ that are equal to $(1, 0, \ldots, 0)^\top$, where $E$ and $e$ have $1 + |H_1|$ rows and $F$ and $f$ have $1 + |H_2|$ rows; an example for $E$, $e$, $F$, and $f$ is given in (2.6) below. Inequalities like (2.4) hold componentwise and $0$ denotes a vector of zeroes. The number of information sets and therefore the number of rows of $E$ and $F$ is at most linear in the size of the game tree.

Mixed strategies of a player are called realization equivalent (Kuhn, 1953) if they define the same realization probabilities for all nodes of the tree given any strategy of the other player.

Lemma 2.2: Two mixed strategies $\mu$ and $\mu'$ of player $i$ are realization equivalent if and only if they have the same realization plan, that is, $\mu[\sigma] = \mu'[\sigma]$ for all $\sigma \in S_i$.

Proof: Consider (2.2) as defining a linear map from $\mathbb{R}^{|P_i|}$ to $\mathbb{R}^{|S_i|}$ that maps the vector $(\mu(\pi))_{\pi \in P_i}$ to $(\mu[\sigma])_{\sigma \in S_i}$ with the fixed coefficients $\pi[\sigma]$, $\pi \in P_i$. Then mixed strategies with the same image under this map are clearly realization equivalent.

The linear map in the preceding proof maps the simplex of mixed strategies of a player to the polytope of realization plans. These polytopes are characterized by (2.4) as asserted by Lemma 2.1. They define the players’ strategy spaces in the sequence form and are denoted by

$$X = \{x \mid x \ge 0,\ Ex = e\}, \qquad Y = \{y \mid y \ge 0,\ Fy = f\}. \tag{2.5}$$

The vertices of $X$ and $Y$ are the players’ pure strategies up to realization equivalence, which is the identification of pure strategies used in the reduced normal form of the game (for generic payoffs).

Figure 2.1 shows an extensive game where the choices of player 1 and player 2 are denoted by the upper and lower case letters $L, R, S, T$ and $a, b, c, d$, respectively.
[Figure 2.1: game tree. Player 1 chooses $L$ or $R$, and $S$ or $T$ after $R$; a chance move has probabilities 1/2 and 1/2; player 2 chooses $a$ or $b$ at one information set and $c$ or $d$ at the other.]

Figure 2.1. A two-person extensive game.

The sets of sequences are $S_1 = \{\emptyset, L, R, RS, RT\}$ and $S_2 = \{\emptyset, a, b, c, d\}$. In the constraints (2.4) we have

$$E = \begin{bmatrix} 1 & & & & \\ -1 & 1 & 1 & & \\ & & -1 & 1 & 1 \end{bmatrix}, \qquad F = \begin{bmatrix} 1 & & & & \\ -1 & 1 & 1 & & \\ -1 & & & 1 & 1 \end{bmatrix}, \qquad e = f = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}. \tag{2.6}$$

Sequence form payoffs are defined for pairs of sequences whenever these lead to a leaf, multiplied by the probabilities of chance moves on the path to the leaf. This defines two sparse matrices $A$ and $B$ of dimension $|S_1| \times |S_2|$ for player 1 and player 2, respectively. For the game in Figure 2.1, $A$ and $B$ are shown in Figure 2.2. When the players use the realization plans $x$ and $y$, the expected payoffs are $x^\top A y$ for player 1 and $x^\top B y$ for player 2. These terms represent the sum over all leaves of the payoffs at leaves multiplied by their realization probabilities.

$$A = \begin{array}{c|ccccc} & \emptyset & a & b & c & d \\ \hline \emptyset & & & & & \\ L & & 11 & 3 & & \\ R & & & & & \\ RS & & 0 & 0 & 0 & 12 \\ RT & & & & 6 & 0 \end{array} \qquad B = \begin{array}{c|ccccc} & \emptyset & a & b & c & d \\ \hline \emptyset & & & & & \\ L & & 3 & 0 & & \\ R & & & & & \\ RS & & 0 & 5 & 2 & 0 \\ RT & & & & 0 & 1 \end{array}$$

Figure 2.2. Sequence form payoff matrices $A$ and $B$ for the game in Figure 2.1. Rows and columns correspond to the sequences of the players which are marked at the side. Any sequence pair not leading to a leaf has matrix entry zero, which is left blank.

Using linear programming duality, von Stengel (1996) showed that any Nash equilibrium of the game is a pair $(x, y)$ of realization plans so that there exist vectors $u, v, r, s$ that fulfill the linear constraints

$$\begin{aligned} x,\ y &\ge 0 \\ Ex &= e \\ Fy &= f \\ r = E^\top u - Ay &\ge 0 \\ s = F^\top v - B^\top x &\ge 0 \end{aligned} \tag{2.7}$$

and the complementarity condition

$$x^\top r = 0, \qquad y^\top s = 0. \tag{2.8}$$

The vectors $u$ and $v$ have dimension $1 + |H_1|$ and $1 + |H_2|$, respectively, and are unconstrained in sign. The nonnegative slack vectors $r$ and $s$ have dimension $|S_1|$ and $|S_2|$, respectively.

Conditions (2.7) and (2.8) define a linear complementarity problem or LCP. A standard LCP is specified by an $n \times n$ matrix $M$ and an $n$-vector $b$. The problem is to find $n$-vectors $z$ and $w$ so that

$$z \ge 0, \qquad w = b + Mz \ge 0, \qquad z^\top w = 0. \tag{2.9}$$

The condition $z^\top w = 0$ states that the nonnegative vectors $z = (z_1, \ldots, z_n)^\top$ and $w = (w_1, \ldots, w_n)^\top$ are complementary, that is, at least one variable of each pair $(z_i, w_i)$ for $1 \le i \le n$ is zero.
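Conditions (2.7) and (2.8) can be checked mechanically for the example game. The following is a minimal sketch in exact rational arithmetic, not the authors' implementation: the matrices are transcribed from (2.6) and Figure 2.2, the helper names (`matvec`, `slacks`, `solves_lcp`) are hypothetical, and the candidate solution (player 1 plays $L$, player 2 plays $a$ and $c$) together with the vectors $u$ and $v$ was worked out by hand for this illustration rather than taken from the paper.

```python
from fractions import Fraction as Fr

# Sequences of Figure 2.1 in the order used below ("" is the empty sequence).
S1 = ["", "L", "R", "RS", "RT"]
S2 = ["", "a", "b", "c", "d"]

# Constraint data (2.6); blank entries are written as zeros.
E = [[1, 0, 0, 0, 0], [-1, 1, 1, 0, 0], [0, 0, -1, 1, 1]]
F = [[1, 0, 0, 0, 0], [-1, 1, 1, 0, 0], [-1, 0, 0, 1, 1]]
e = f = [1, 0, 0]

# Sequence form payoffs from Figure 2.2 (rows: S1, columns: S2).
A = [[0, 0, 0, 0, 0], [0, 11, 3, 0, 0], [0, 0, 0, 0, 0],
     [0, 0, 0, 0, 12], [0, 0, 0, 6, 0]]
B = [[0, 0, 0, 0, 0], [0, 3, 0, 0, 0], [0, 0, 0, 0, 0],
     [0, 0, 5, 2, 0], [0, 0, 0, 0, 1]]

def matvec(M, v):
    """Exact matrix-vector product."""
    return [sum(Fr(m) * Fr(c) for m, c in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def dot(a, b):
    return sum(Fr(p) * Fr(q) for p, q in zip(a, b))

def slacks(x, y, u, v):
    """Slack vectors r = E^T u - A y and s = F^T v - B^T x from (2.7)."""
    r = [p - q for p, q in zip(matvec(transpose(E), u), matvec(A, y))]
    s = [p - q for p, q in zip(matvec(transpose(F), v), matvec(transpose(B), x))]
    return r, s

def solves_lcp(x, y, u, v):
    """True if (x, y, u, v) fulfills (2.7) and the complementarity condition (2.8)."""
    r, s = slacks(x, y, u, v)
    feasible = (all(c >= 0 for c in list(x) + list(y) + r + s)
                and matvec(E, x) == e and matvec(F, y) == f)
    return feasible and dot(x, r) == 0 and dot(y, s) == 0

# Hand-computed candidate: player 1 plays L, player 2 plays a and c.
x = [1, 1, 0, 0, 0]
y = [1, 1, 0, 1, 0]
u = [11, 11, 6]   # assumed dual values for player 1
v = [3, 3, 0]     # assumed dual values for player 2

print(solves_lcp(x, y, u, v))                            # True
print(dot(x, matvec(A, y)), dot(x, matvec(B, y)))        # 11 3
```

The check only verifies a given solution; it does not find one, which is the task of the pivoting algorithm. Exact fractions are used so that the complementarity test is an equality test rather than a tolerance comparison.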