ebook img

Quite Inefficient Algorithms for Solving Systems of Polynomial Equations [expository notes] PDF

17 Pages·0.485 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Quite Inefficient Algorithms for Solving Systems of Polynomial Equations [expository notes]

Quite Inefficient Algorithms for Solving Systems of Polynomial Equations Without Gröbner Bases! Resultants, Primary Decomposition and Galois Groups Lars Wallenborn Jesko Hüttenhain 11/25/11 — 12/02/11 Contents 1 Resultants 2 1.1 Solving without Gröbner Bases . . . . . . . . . . . . . . . . . . . . . . . . 2 1.1.1 Multiplication Matrices . . . . . . . . . . . . . . . . . . . . . . . . 5 1.1.2 Solving via Multivariate Factorization . . . . . . . . . . . . . . . . 6 1.1.3 Ideal Membership . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.2 Multivariate Resultants . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 2 Primary Ideal Decomposition 8 2.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 2.2 Rational Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 2.3 General Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 3 Galois Groups 14 3.1 The Universal Property . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 3.2 The Dimension of A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 3.3 The Emergence of Splitting Fields . . . . . . . . . . . . . . . . . . . . . . 15 1 1 Resultants We discuss some methods for solving zero-dimensional systems of polynomial equations withouttheaidofGröbnerbases. Itwillturnoutthatthesemethodsalsoyieldtechniques to compute resultants. In this section, let always f ,...,f ∈ k[x ,...,x ] be polynomials over an alge- 0 n 1 n braically closed field k = k of degrees d := deg(f ). Let I = (f ,...,f ). We assume i i 1 n thatf ishomogeneousofdegreed = 1. Wesetµ := d ···d , d := d +···+d −(n−1) 0 0 1 n 1 n and d (cid:18) (cid:19) (cid:18) (cid:19) (cid:88) n+d n+d+1 ν := −µ = −µ. d d k=0 A point p ∈ kn is called a solution if p ∈ Z(I). Finally, denote by · : k[x ,...,x ] −→ k[x ,...,x ]/I =: A 1 n 1 n the canonical projection. We denote by L : A → A the k-automorphism of A which is f given by multiplication with f for some f ∈ k[x ,...,x ]. 1 n 1.1 Solving without Gröbner Bases Theorem 1.1 (Bézout’s Theorem). (a). If there are only finitely many solutions, then their number, counted with multiplic- ity, is at most µ. (b). For a generic choice of f ,...,f , there are precisely µ solutions, each with multi- 1 n plicity one. (cid:3) Remark 1.2. Here, “generic” means that for “almost every” choice of polynomials, the above holds. To properly define this “almost”, one needs more algebraic geometry than we are willing to present here. Definition 1.3. In the following, γ = (γ ,...,γ ) ∈ Nn denotes a multiindex. We 1 n partition the set S = {xγ : |γ| ≤ d} as S = S ∪· ··· ∪· S , where 0 n (cid:40) (cid:12) (cid:41) (cid:12) ∀j < i : d > γ S := xγ ∈ S (cid:12) j j (1.1) i (cid:12) and d ≤ γ (cid:12) i i for i > 0 and S := {xγ ∈ S | ∀j : d > γ }. A monomial xγ ∈ S is said to be reduced 0 j j i if d > γ for all j > i. j i We set S := S ∪···∪S . S is the set of monomials of degree at most d which are + 1 n 0 not divisible by any of the xdi. Since S will play a special role, we use xα to denote its i 0 elements and xβ for elements in S . + 2 Fact 1.4. Behold: (a). If xα ∈ S , then degxα ≤ d−1. Furthermore, |S | = µ and |S | = ν. 0 0 + (cid:16) (cid:46) (cid:17) (b). If xβ ∈ S for i > 0, then deg xβ xdi ≤ d−d i i i Proof. Part(a)followsfromS = {xγ1···xγn | ∀i : 0 ≤ γ ≤ d −1}andtheobservation 0 1 n i i that d−1 = (cid:80)n (d −1). For part (b), use the explicit description (1.1). i=1 i Definition 1.5. For xγ ∈ S , define i (cid:40) (cid:46) xγf xdi ; i (cid:54)= 0 f := i i γ xγf ; i = 0 0 Fact 1.6. We can write f as a k-linear combination of the xγ ∈ S. γ Proof. For γ ∈ S , this follows from the assertions d = 1 and Fact 1.4.(a). For γ ∈ S , 0 0 i this follows because deg(f ) ≤ |γ| ≤ d. γ Definition 1.7. Write S0 = {xα1,...,xαµ} and S+ = (cid:8)xβ1,...,xβν (cid:9). The Sylvester- type matrix associated to our given data is the matrix M such that     xα1 fα1  ..   ..   .   .      M·xαµ = fαµ. (1.2) xβ1 fβ1  .   .   ..   ..      xβν f βν Such a matrix exists by Fact 1.6. We write (cid:32) (cid:33) M M 00 01 M = . (1.3) M M 10 11 where M is a µ×µ square matrix and M is a ν ×ν square matrix. 00 11 Remark 1.8. For a generic choice of f ,...,f , the matrix M is invertible. Let us 1 n 11 understand why. If we let M = (λ ) and consider the equality 11 ij i,j xβ1 f  β1 . . M11· ..  =  .. , xβν f βν 3 this means nothing more than ν (cid:88) f = λ ·xβk. βi ik k=1 A generic choice of the f means a generic choice of coefficients λ , and for almost every i ik such choice, the vectors (λ ,...,λ ) for 1 ≤ i ≤ ν are linearly independent. i1 iν Hence, the following Assumption 1.9 is justified: Assumption 1.9. We henceforth assume M to be invertible. Note that this implies 11 that A is a zero-dimensional ring, i.e. a finite k-vectorspace. Definition 1.10. We define M(cid:102):= M(cid:102)(f0) := M00−M01M−111M10. (1.4) Furthermore, we define two maps φ : kn → kµ and ψ : kn → kν where pα1 pβ1 . . φ(p) :=  ..  ψ(p) :=  ..  pαµ pβν In other words, the maps are induced by the monomials in S and S , respectively. 0 + Theorem 1.11. For all solutions p, the vector φ(p) is an eigenvector of M(cid:102) with eigen- value f (p). Furthermore, for a generic choice of f , the set φ(Z(I)) is linearly indepen- 0 0 dent. Proof. Let p ∈ Z(I). Then,     fα1(p) (xα1f0)(p)  ..   ..   .   .  (cid:32) (cid:33) (cid:32) (cid:33) (cid:32) (cid:33)     (cid:32) (cid:33) M00 M01 · φ(p) = M· φ(p) = fαµ(p) = (xαµf0)(p) = f0(p)·φ(p) M M ψ(p) ψ(p) f (p)  0  0 10 11  β1     .   .   ..   ..      f (p) 0 βν by (1.2). This gives us the two identities M ·φ(p)+M ·ψ(p) = f (p)·φ(p) (1.5) 00 01 0 M ·φ(p)+M ·ψ(p) = 0 (1.6) 10 11 4 which we can use to conclude M(cid:102)·φ(p) = M00·φ(p)−M01M−111M10·φ(p) by (1.4) = M ·φ(p)+M M−1M ·ψ(p) by (1.6) 00 01 11 11 = M ·φ(p)+M ·ψ(p) 00 01 = f (p)·φ(p) by (1.5) 0 Finally, we note that for a generic choice, f takes distinct values at all the p ∈ Z(I), 0 therefore the eigenvalues are distinct and the corresponding eigenvectors linearly inde- pendent. Theorem 1.12. For generic f0, the set S0 is a k-basis of A. Furthermore, M(cid:102) is the matrix corresponding to L with respect to the basis S . f0 0 Proof. By Bézout (Theorem 1.1), A has dimension µ over k. Since this is also the cardinality of S , the first part of the theorem will follow once we show that the xα are 0 linearly independent. Assume that c ·xα1 +···+c ·xαµ = 0. 1 µ Evaluating this equation at a solution p gives c ·pα1 +···+c ·pαµ = 0. 1 µ Let Z(I) = {p ,...,p } and define the square matrix P := (pαi) . Since 1 µ j ij (c ,...,c )·P = 0 1 µ and P is invertible by Theorem 1.11, we conclude c = 0 for all i, which proves that S i 0 is linearly independent. Let M be the coordinate matrix of L in the basis S . Clearly, Mφ(p) = f (p)φ(p) f0 0 0 for every solution p. But from Theorem 1.11 we know that f0(p)φ(p) = M(cid:102)φ(p) and therefore, Mφ(p) = M(cid:102)φ(p) for every solution p. We know that the µ different φ(p) are linearly independent, so they form a basis. 1.1.1 Multiplication Matrices By setting f = x in Theorem 1.12, we get that the matrix of multiplication by x is 0 i i M(cid:102)(xi). However, it is possible to compute all of these maps simultaneously by using f = u x +···+u x , where u ,...,u are varibles. Thus, by means of 0 1 1 n n 1 n M(cid:102)(f0) = u1M(cid:102)(x1)+···+unM(cid:102)(xn), we can compute all multiplication matrices at once. 5 1.1.2 Solving via Multivariate Factorization Asabove,supposethatf = u x +···+u x whereu ,...,u arevariables. Inthiscase, 0 1 1 n n 1 n det(M(cid:102)(f0))becomesapolynomialink[u1,...,un]. Theresultsofthissectionimplythat the eigenvalues of M(cid:102)(f0) are f0(Z(I)). Since all of the eigenspaces have dimension 1, (cid:89) (cid:89) det(M(cid:102)) = f0(p) = (u1p1+···+unpn). (1.7) p∈Z(I) (p1,...,pn)∈Z(I) By factoring det(M(cid:102)(f0)) into irreducibles in k[u1,...,un], we get all solutions. 1.1.3 Ideal Membership For a given f ∈ k[x ,...,x ] we want to decide whether f ∈ I or not. Using the above 1 n M(cid:102)(xi), we can solve this problem by the following Fact 1.13. f ∈ I ⇔ f(M(cid:102)(x1),...,M(cid:102)(xn)) = 0 ∈ kµ×µ. Proof. We know that f ∈ I if and only if f = 0, i.e. f becomes zero in A. This is equivalent to saying that L is the zero map. If we write L := L and f = (cid:80) a xλ f i xi λ λ then (cid:16) (cid:88) (cid:17) L = y (cid:55)→ a xλ·y f λ λ (cid:88) (cid:89)n = a (y (cid:55)→ x y)◦λi λ i λ i=1 (cid:88) = a L◦λ λ λ = f(L ,...,L ), 1 n Thus, M(cid:102)(f) = f(M(cid:102)(x1),...,M(cid:102)(xn)) is the zero matrix if and only if f ∈ I. 1.2 Multivariate Resultants We now set up notation for this section. Let f be the homogeneous component of f ij i in degree j. Let F ,...,F ∈ k[x ,...,x ] arise from f by homogenization in a new 0 n 0 n i (cid:16) (cid:17) variable x . This means F := xdif x1,..., xn , F = f and F = (cid:80)di f xdi−j. We 0 i 0 i x0 x0 0 0 i j=0 ij 0 say that p = [p : ... : p ] ∈ Pn is a solution at infinity if p = 0 and F (p) = 0 for all 0 n 0 i i. We set G := f for all i. i i,di Definition 1.14. Let F ,...,F ∈ k[x ,...,x ] be homogeneous polynomials. The re- 0 n 0 n sultant res(F ,...,F ) is a polynomial in their coefficients which is the zero polynomial 0 n if and only if the F have a common projective root. We denote the same polynomial by i res(f ,...,f ) if F is the homogenization of f by x . 0 n i i 0 Fact 1.15. There exist solutions at infinity if and only if res(G ,...,G ) = 0. 1 n 6 Proof. Note that F (0,x ,...,x ) = G . Thus, there exists a solution at infinity if and i 1 n i only if the homogeneous polynomials G share a common root. i Theorem 1.16. If there are no solutions at infinity and M is a matrix corresponding to L , then f0 res(f ,...,f ) = res(G ,...,G )·det(M). 0 n 1 n Furthermore, if we denote by χ (T) the characteristic polynomial of M in an indeter- M minante T, res(T −f ,f ,...,f ) = res(G ,...,G )·χ (T). 0 1 n 1 n M Metaproof. A proof for these equalities can be traced back to [Jou91]. Definition 1.17. Recall Definition 1.7. We define the matrix M(cid:48) to be the matrix that arises from M by deleting all rows and columns corresponding to reduced monomials. Proposition 1.18. It is res(f ,...,f )·det(M(cid:48)) = det(M). 0 n Metaproof. This is [CLO98, Theorem 4.9]. Corollary 1.19. It is res(G ,...,G )·det(M(cid:48)) = det(M ). 1 n 11 Proof. If det(M ) (cid:54)= 0, we can write 11 (cid:32) (cid:33) (cid:32) (cid:33) (cid:32) (cid:33) det(M(cid:102))·det(M11) = det M(cid:102) 0 = det I −M01M−111 ·det M00 M01 M M 0 I M M 10 11 10 11 = det(M). Hence by Proposition 1.18, Theorem 1.16 and Theorem 1.12 (in this order), det(M(cid:102))·det(M11) = det(M) = res(f ,...,f )·det(M(cid:48)) 0 n = res(G ,...,G )·det(L )·det(M(cid:48)) 1 n f0 = res(G1,...,Gn)·det(M(cid:102))·det(M(cid:48)). whenf1,...,fn aresufficientlygeneric. Cancellingdet(M(cid:102)), whichisgenericallynonzero, we conclude that det(M ) = res(G ,...,G )·det(M(cid:48)) 11 1 n holds for almost all choices of coefficients of the f . i Since both sides of this equation are polynomial in the coefficients of the f , it means i that the equality holds in general. 7 2 Primary Ideal Decomposition In this section, I the ideal generated by the polynomials f ,...,f ∈ k[x ,...,x ] over 1 s 1 n any field k. We denote by K := k the algebraic closure of k. We then write · : k[x ,...,x ] (cid:16) k[x ,...,x ]/I =: A 1 n 1 n for the canonical projection. Assume that A is of dimension zero, i.e. V := Z(I) ⊆ Kn is finite. We denote by n (p) (resp. m (p)) the multiplicity of (T −f(p)) in the characteristic f f (resp. minimal) polynomial of L , where L : A → A is the multiplication by f for some f f polynomial f. When there is no risk of confusion, we write n(p) instead of n (p) and f equivalently, m(p) instead of m (p). f 2.1 Preliminaries Definition 2.1. An ideal J ⊆ R of a ring is said to be primary if fg ∈ I implies that √ f ∈ I or gk ∈ I for some k ∈ N. If J is primary, then J is a prime ideal. Definition 2.2. A primary decomposition of an ideal J is a decomposition of the form J = J ∩ ··· ∩ J with J primary. By [Eis94, Theorem 3.10], every ideal of a 1 r i Noetherian ring has a primary decomposition. We say that the decomposition is minimal √ if r is minimal. In this case, the ideals P := J are prime ideals which are minimal i i over J. √ √ Fact 2.3. Assume that I and J are primary ideals of a domain R. If I = J, then I ∩J is primary. Proof. Let fg ∈ I ∩J ⊆ I. We can assume that f ∈/ I ∩J. Hence, let us assume that √ √ f ∈/ I. This means gr ∈ I for some r. But this means g ∈ I = J, so gs ∈ J for some s. We set k := max(r,s) and obtain gk ∈ I ∩J. Fact 2.4. Let P be a maximal ideal in a domain R. Then, for any h ∈/ P, there exists an element g ∈ R such that 1+gh ∈ P. Proof. Since h ∈/ P, it becomes a unit in K := R/P. Choosing any element g ∈ R which is mapped to −h−1 under R (cid:16) R/P yields 1+gh = 0 in K, i.e. 1+gh ∈ P. Lemma 2.5. The ideal I has a minimal primary decomposition I = I ∩···∩I . For any √ 1 r (cid:83) suchdecomposition, theP := I aredistinctmaximalideals. Furthermore, I (cid:54)⊂ P i i i j(cid:54)=i j (cid:83) and for any g ∈ I \ P , it is I = I +(g). i j(cid:54)=i j i 8 Proof. We already established that such a primary decomposition always exists. Note that P is a minimal prime ideal over I, so dim(I ) = dim(P ) = dim(I) = 0. Since P i i i i is zero-dimensional and prime, it is maximal. If P = P for some i (cid:54)= j, then I ∩I i j i j (cid:83) is primary by Fact 2.3, contradicting minimality of the decompostion. If I ⊆ P , i j(cid:54)=i j then I ⊆ P for some j (cid:54)= i by Prime Avoidance (see [Eis94, Lemma 3.3]). This means i j P ⊆ P and hence, P = P by maximality, which is absurd. i j i j (cid:83) Finally, let g ∈ I \ P . Certainly, I +(g) ⊆ I . Now for every j (cid:54)= i, note that i j(cid:54)=i j i g ∈/ P and so we can choose a h ∈ R with the property that 1+gh ∈ P , by Fact 2.4. j j j j We choose m ∈ N such that, for all j, (1+gh )m ∈ I . Expanding the product j j (cid:89) (cid:89) (cid:92) (1+gh )m ∈ I ⊆ I , j j j j(cid:54)=i j(cid:54)=i j(cid:54)=i (cid:84) we conclude 1+gh ∈ I for a particular h ∈ R. Given any a ∈ I , j(cid:54)=i j i (cid:92) a(1+gh) ∈ I ∩ I = I i j j(cid:54)=i and therefore a = (1+gh)a+g(−ha) ∈ I +(g) as desired. (cid:83) In the following, we describe a method to compute certain g ∈ I \ P which yield i i j(cid:54)=i j the ideals I by Lemma 2.5. We can do this, again, without using Gröbner bases. i 2.2 Rational Case Recall that a solution p ∈ V is k-rational if p ∈ kn. For this subsection, we assume that all solutions are k-rational. (cid:84) Proposition/Definition 2.6. I = I is a minimal primary decomposition with p∈V p I := {f ∈ k[x ,...,x ] | ∃g ∈ k[x ,...,x ] : gf ∈ I and g(p) (cid:54)= 0}. p 1 n 1 n (cid:112) In this case, I is the maximal ideal m := (x −p ,...,x −p ) where the p are the p p 1 1 n n i coordinates of p. Remark. In fact, in the zero-dimensional case, the minimal primary decomposition is unique, but we will not give a proof for this. Proof. Theassumptionthatallsolutionsarek-rationalmeansthatwecanassumek = K is algebraically closed. (cid:84) We show I ⊆ I. Pick any f such that f ∈ I for all p. Choose polynomials g p∈V p p p such that g (p) (cid:54)= 0 and g f ∈ I. We have already seen that there exist idempotents p p (cid:80) e ∈ A, i.e. e (q) = δ . The function g := e g satisfies g(p) = g (p) (cid:54)= 0 for all p p pq p∈V p p p p ∈ V. Let h := (cid:80) ep , then (1−gh) vanishes on all of V and thus, (1−gh)k ∈ I for p∈V g(p) 9 some k, by the Hilbert Nullstellensatz. Multiplying this out gives 1−gh(cid:48) ∈ I for some polynomial h(cid:48). Since gf ∈ I, we know gfh(cid:48) ∈ I. Since f −fgh(cid:48) = f(1−gh(cid:48)) ∈ I, we conclude f ∈ I. (cid:112) To show that I = m , it will suffice to show that there exists a natural number p p k ∈ N such that (x −p )k ∈ I because it implies m ⊆ (cid:112)I and the statement then i i p p p follows by maximality of m . Set p (cid:89) g := (x −q ) i q∈V i i qi(cid:54)=pi Then, g ·(x −p ) vanihses on all of V. Hence, gk(x −p )k ∈ I for some k ∈ N. Since i i i i i i gk(p) (cid:54)= 0, we know (x −p )k ∈ I by definition. This also proves that I is primary. i i i p p We are left to show minimality of the decomposition. Assume that I = (cid:84)r I is √ i=1 i minimal. Let P := I . It is a maximal ideal with corresponding point p ∈ Kn. Then, i i i (cid:16)√ (cid:17) (cid:16)(cid:92) (cid:17) (cid:16)(cid:89) (cid:17) (cid:91)r V = Z(I) = Z I = Z P ⊆ Z P = Z(P ) = {p ,...,p } i i i 1 r i i i=1 implies |V| ≤ r and we are done. We used well-known facts about algebraic sets, see [Har06, Propositions 1.1 and 1.2], for instance. Proposition 2.7. If f ∈ k[x ,...,x ] takes distinct values at all solutions, then 1 n (cid:16) (cid:17) ∀p ∈ V : I = I + (f −f(p))m(p) . p Proof. Pick p ∈ V and set g := (f −f(p))m(p). By Lemma 2.5 and Proposition 2.6, it suffices to show that g ∈ I and g ∈/ m for all q (cid:54)= p. The latter condition is equivalent p q to g(q) (cid:54)= 0, which follows because f(q) (cid:54)= f(p) by assumption. To prove that g ∈ I , let p (cid:89) h := (f −f(q))m(q). q(cid:54)=p Denote by µ the minimal polynomial of the multiplication map L . Then, by definition f gh = µ(f). However, the Cayley-Hamilton Theorem [Eis94, Theorem 4.3] says that µ(L ) is the zero map on A. Applied to 1, we obtain f 0 = µ(L )(1) = µ(f) = µ(f). f Hence, gh ∈ I ⊆ I . Since h(p) = (cid:81) (f(p)−f(q))m(q) (cid:54)= 0 by our assumption on f, p q(cid:54)=p we know h ∈/ m . Thus, no power of h can be contained in I . Since I is primary, this p p p means g ∈ I . p Example 2.8. Consider the case k = Q and f := x2+2y(y−1) 1 f := xy(y−1) 2 f := y(y2−2y+1) 3 10

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.