A Gentle Introduction to a Beautiful Theorem of Molien

Holger Schellwat
[email protected]
Örebro universitet, Sweden
Universidade Eduardo Mondlane, Moçambique

12 January, 2017

Abstract

The purpose of this note is to give an accessible proof of Molien's Theorem in Invariant Theory, in the language of today's Linear Algebra and Group Theory, in order to prevent this beautiful theorem from being forgotten.

Contents

1 Preliminaries
2 The Magic Square
3 Averaging over the Group
4 Eigenvectors and eigenvalues
5 Molien's Theorem
6 Symbol table
7 Lost and found
References
Index

Introduction

We present some memories of a visit to the ring zoo in 2004. This time we met an animal looking like a unicorn, known by the name of invariant theory. It is rare, old, and very beautiful. The purpose of this note is to give an almost self contained introduction to and clarify the proof of the amazing theorem of Molien, as presented in [Slo77]. An introduction into this area, and much more, is contained in [Stu93]. There are many very short proofs of this theorem, for instance in [Sta79], [Hu90], and [Tam91].

Informally, Molien's Theorem is a power series generating function formula for counting the dimensions of subrings of homogeneous polynomials of certain degree which are invariant under the action of a finite group acting on the variables. As an appetizer, we display this stunning formula:

\[
\Phi_G(\lambda) := \frac{1}{|G|} \sum_{g \in G} \frac{1}{\det(\mathrm{id} - \lambda T_g)}
\]

We can immediately see elements of linear algebra, representation theory, and enumerative combinatorics in it, all linked together. The paper [Slo77] nicely shows how this method can be applied in Coding Theory. For Coding Theory in general, see [Bie04].

Before we can formulate the Theorem, we need to set the stage by looking at some Linear Algebra (see [Rom08]), Group Theory (see [Hu96]), and Representation Theory (see [Sag91] and [Tam91]).
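As a quick sanity check (ours, not from [Slo77]), the formula can be evaluated for a toy group. The sketch below, written under the assumption that the representing matrices are 2×2, expands each 1/det(id − λT_g) as a formal power series with exact rational arithmetic and averages over the two-element group {id, −id} acting on C²; all function names are ours.

```python
from fractions import Fraction

def series_inverse(p, terms):
    """First `terms` coefficients of 1/p(t), for a polynomial p with p[0] != 0."""
    q = [Fraction(1) / p[0]]
    for n in range(1, terms):
        s = sum(p[k] * q[n - k] for k in range(1, min(n, len(p) - 1) + 1))
        q.append(-s / p[0])
    return q

def molien_coefficients(group, terms):
    """Average the power series 1/det(I - t*A) over a list of 2x2 matrices A."""
    total = [Fraction(0)] * terms
    for ((a, b), (c, d)) in group:
        # det(I - t*A) = 1 - (a + d) t + (a d - b c) t^2 for a 2x2 matrix A
        char_poly = [Fraction(1), Fraction(-(a + d)), Fraction(a * d - b * c)]
        inv = series_inverse(char_poly, terms)
        total = [x + y for x, y in zip(total, inv)]
    return [x / len(group) for x in total]

# Toy example: G = {I, -I} acting on C^2.  The invariant polynomials are those
# of even total degree, so the coefficient at degree d should be d+1 for even d.
group = [((1, 0), (0, 1)), ((-1, 0), (0, -1))]
coeffs = molien_coefficients(group, 8)
print([int(c) for c in coeffs])  # -> [1, 0, 3, 0, 5, 0, 7, 0]
```

The coefficients 1, 0, 3, 0, 5, ... are exactly the numbers of monomials of even total degree in two variables, i.e. the dimensions of the invariant homogeneous components of this action, which is what the formula is supposed to count.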
1 Preliminaries

Let V ≅ Cⁿ be a finite dimensional complex inner product space with orthonormal basis B = (e_1, ..., e_n) and let x = (x_1, ..., x_n) be the orthonormal basis of the algebraic dual space V* satisfying x_i(e_j) = δ_ij for all 1 ≤ i, j ≤ n. Let G be a finite group acting unitarily and linearly on V from the left, that is, for every g ∈ G the mapping V → V, v ↦ g.v is a unitary bijective linear transformation. Using coordinates, this can be expressed as [g.v]_B = [g]_{B,B}[v]_B, where [g]_{B,B} is unitary. Thus, the action is a unitary representation of G, or in other words, a G–module. Note that we are using left composition and column vectors, i.e. v = (v_1, ..., v_n) is by convention the column [v_1 v_2 ... v_n]^⊤, cf. [Ant73].

The elements of V* are linear forms (linear functionals), and the elements x_1, ..., x_n, looking like variables, are also linear forms; this will be important later.

Thinking of x_1, ..., x_n as variables, we may view (see [Tam91]) S(V*), the symmetric algebra on V*, as the algebra R := C[x] := C[x_1, ..., x_n] of polynomial functions V → C, or polynomials in these variables (linear forms). It is naturally graded by degree as R = ⊕_{d∈N} R_d, where R_d is the vector space spanned by the polynomials of (total) degree d; in particular, R_0 = C and R_1 = V*.

The action of G on V can be lifted to an action on R.

1.1 Proposition. Let V, G, R be as above. Then the mapping . : G × R → R, (g, f) ↦ g.f defined by (g.f)(v) := f(g⁻¹.v) for v ∈ V is a left action.

Proof. For v ∈ V, g, h ∈ G, and f ∈ R we check

1. (1.f)(v) = f(1⁻¹.v) = f(1.v) = f(v)
2. ((hg).f)(v) = f((hg)⁻¹.v) = f((g⁻¹h⁻¹).v) = f(g⁻¹.(h⁻¹.v)) = (g.f)(h⁻¹.v) = (h.(g.f))(v)

In fact, we know more.

1.2 Proposition. Let V, G, R be as above. For every g ∈ G, the mapping T_g : R → R, f ↦ g.f is an algebra automorphism preserving the grading, i.e. g.R_d ⊂ R_d (here we do not bother about surjectivity).

Proof. For v ∈ V, g ∈ G, c ∈ C, and f, f′ ∈ R we check

1. (g.(f + f′))(v) = (f + f′)(g⁻¹.v) = f(g⁻¹.v) + f′(g⁻¹.v) = (g.f)(v) + (g.f′)(v) = (g.f + g.f′)(v), thus g.(f + f′) = g.f + g.f′
2. (g.(f · f′))(v) = (f · f′)(g⁻¹.v) = f(g⁻¹.v) · f′(g⁻¹.v) = (g.f)(v) · (g.f′)(v) = (g.f · g.f′)(v), thus g.(f · f′) = g.f · g.f′
3. (g.(cf))(v) = (cf)(g⁻¹.v) = c(f(g⁻¹.v)) = c((g.f)(v)) = (c(g.f))(v)
4. By part 2 it is clear that the grading is preserved.
5. To show that f ↦ g.f is bijective it is enough to show that this mapping is injective on the finite dimensional homogeneous components R_d. Let us introduce a name for this mapping, say T_g^d : R_d → R_d, f ↦ g.f. Now f ∈ ker(T_g^d) implies that g.f = 0 ∈ R_d, i.e. g.f is a polynomial mapping from V to C of degree d vanishing identically: (g.f)(v) = 0 for all v ∈ V. By definition of the extended action we have f(g⁻¹.v) = 0 for all v ∈ V. Since G acts on V this implies that f(v) = 0 for all v ∈ V, so f is the zero mapping. Since our ground field has characteristic 0, this implies that f is the zero polynomial, which we may view as an element of every R_d. See for instance [Cox91], Proposition 5 in Section 1.1.
6. Note that every T_g^d is also surjective, since all group elements have their inverse in G.

Both propositions together give us a homomorphism from G into Aut(R). They also clarify the rôle of the induced matrices, which are classical in this area, as mentioned in [Slo77]. Since the monomials x_1, ..., x_n of degree one form a basis for R_1, it follows from the proposition that their products x_2 := (x_1², x_1x_2, x_1x_3, ..., x_1x_n, x_2², x_2x_3, ...) form a basis for R_2, and, in general, the monomials of degree d in the linear forms (!) x_1, ..., x_n form a basis x_d of R_d. Clearly, they certainly span R_d, and by the last observation in the last proof they are linearly independent.

1.3 Definition. In the context from above, that is g ∈ G, f ∈ R_d, and v ∈ V, we define

T_g^d : R_d → R_d, f ↦ g.f, where g.f : V → C, v ↦ f(g⁻¹.v) = f(T_{g⁻¹}(v)).

1.4 Remark. In particular, we have (T_g^1(f))(v) = f(T_{g⁻¹}(v)), see Proposition 1.6 below. Keep in mind that a function f ∈ R_d maps to T_g^d(f) = g.f.
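Both propositions lend themselves to a numerical spot check. The following sketch is our illustration, not part of the paper: plane rotations stand in for the unitary operators, and the left-action law of Proposition 1.1 together with the multiplicativity from Proposition 1.2 is verified at sample points; all names are ours.

```python
import math

def mat_vec(A, v):
    return (A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1])

def rot(t):
    # rotation of R^2 by the angle t: a unitary operator standing in for T_g
    return ((math.cos(t), -math.sin(t)), (math.sin(t), math.cos(t)))

def act(t, f):
    # (g.f)(v) := f(g^{-1}.v) for g = rot(t), so g^{-1} = rot(-t)
    return lambda v: f(mat_vec(rot(-t), v))

f  = lambda v: v[0] ** 2 + 3 * v[0] * v[1]   # homogeneous of degree 2
f2 = lambda v: v[0] - v[1]                   # a linear form
v  = (0.7, -1.2)
s, t = 0.4, 1.1                              # g = rot(s), h = rot(t)

# left-action law ((hg).f)(v) = (h.(g.f))(v); note rot(t) rot(s) = rot(t + s)
assert abs(act(t + s, f)(v) - act(t, act(s, f))(v)) < 1e-12
# multiplicativity g.(f * f') = (g.f) * (g.f') from Proposition 1.2
lhs = act(s, lambda u: f(u) * f2(u))(v)
rhs = act(s, f)(v) * act(s, f2)(v)
print(abs(lhs - rhs) < 1e-12)  # -> True
```

Note how the inverse in the definition of the lifted action is exactly what makes the action law come out as a left action rather than a right one.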
Setting A_g := [T_g^1]_{x,x}, then A_g^{[d]} := [T_g^d]_{x_d,x_d} is the d-th induced matrix in [Slo77], because T_g^1(f · f′) = T_g^1(f) · T_g^1(f′). Also, if f, f′ are eigenvectors of T_g^1 corresponding to the eigenvalues λ, λ′, then f · f′ is an eigenvector of T_g^2 with eigenvalue λ · λ′, because T_g^2(f · f′) = T_g^1(f) · T_g^1(f′) = (λf) · (λ′f′) = (λλ′)(f · f′). All this generalizes to d > 2; we will get back to that later.

We end this section by verifying two little facts needed in the next section.

1.5 Proposition. The first induced operator of the inverse of a group element g ∈ G is given by T_{g⁻¹}^1 = (T_g^1)⁻¹.

Proof. Since dim(V*) < ∞, it is sufficient to prove that T_{g⁻¹}^1 ∘ T_g^1 = id_{V*}. Keep in mind that (T_g^1(f))(v) = f(T_{g⁻¹}(v)). For arbitrary f ∈ V* we see that

(T_{g⁻¹}^1 ∘ T_g^1)(f) = T_{g⁻¹}^1(T_g^1(f)) = T_{g⁻¹}^1(g.f) = g⁻¹.(g.f) = (g⁻¹g).f = f.

We will be mixing group action notation and composition freely, depending on the context. The following observation is a translation device.

1.6 Proposition. For g ∈ G and f ∈ V* the following holds: T_g^1(f) = g.f = f ∘ T_{g⁻¹}.

Proof. For v ∈ V we see (T_g^1(f))(v) = (g.f)(v) =(def) f(g⁻¹.v) = f(T_{g⁻¹}(v)).

2 The Magic Square

Remember that we require a unitary representation of G, that is, the operators T_g : V → V need to be unitary, i.e. (T_g)⁻¹ = (T_g)* for all g ∈ G. The first goal of this section is to show that this implies that the induced operators T_g^d : R_d → R_d, f ↦ g.f are also unitary. We saw that R_1 = V*, the algebraic dual of V. In order to understand the operator duals of V and V* we need to look at their inner products first. We may assume that the operators T_g are unitary with respect to the standard inner product ⟨u,v⟩ = [u]_B • [v]_B, where • denotes the dot product.

Before we can speak of unitarity of the induced operators T_g^d we have to make clear which inner product applies on R_1 = V*. Quite naively, for f, g ∈ V* we are tempted to define ⟨f,g⟩ = [f]_x • [g]_x.

We will motivate this in a while, but first we take a look at the diagram in [Rom08], Chapter 10, with our objects:

\[
\begin{array}{ccc}
R_1 = V^* & \underset{T_g^{\times}}{\overset{T_g^1}{\rightleftarrows}} & V^* = R_1 \\
{\scriptstyle P}\big\downarrow & & \big\downarrow{\scriptstyle P} \\
V & \underset{T_g^*}{\overset{T_g}{\rightleftarrows}} & V
\end{array}
\]

Here P (capital "Rho") denotes the Riesz map, see [Rom08], Theorem 9.18, where it is called R, but R already denotes our big ring. We started by looking at the operator T_g, which is unitary, so its inverse is the Hilbert space adjoint T_g*. Omitting the names of the bases we have [T_g*] = [T_g]*. We also see the operator adjoint T_g^× with matrix [T_g^×] = [T_g]^⊤, the transpose. However, the arrow for T_g^1 is not in the original diagram, but soon we will see it there, too.

Fortunately, the Riesz map P turns a linear form into a vector, and its inverse τ : V → V* maps a vector to a linear form; both are conjugate isomorphisms. This is mostly all we need in order to show that T_g^1 is unitary. In the following three propositions we use that V has the orthonormal basis B and that V* has the orthonormal basis x.

2.1 Proposition. For every f ∈ V* the coordinates of its Riesz vector are given by [P(f)]_e = (f(e_1), ..., f(e_n)).

Proof. Writing τ for the inverse of P, we need to show that

\[
P(f) = \sum_{i=1}^{n} f(e_i)\, e_i,
\]

which is equivalent to

\[
f = \tau\Big(\sum_{i=1}^{n} f(e_i)\, e_i\Big).
\]

It is sufficient to show the latter for values of f on the basis vectors e_j, 1 ≤ j ≤ n. We obtain

\[
\tau\Big(\sum_{i=1}^{n} f(e_i)\, e_i\Big)(e_j)
= \Big\langle e_j, \sum_{i=1}^{n} f(e_i)\, e_i \Big\rangle
= \sum_{i=1}^{n} f(e_i)\,\langle e_j, e_i\rangle
= f(e_j) \cdot 1.
\]

In particular, this implies that P(x_i) = e_i.

2.2 Proposition. Our makeshift inner product on V* satisfies ⟨f,g⟩ = ⟨P(f), P(g)⟩, where f, g ∈ V*.

Proof. By our vague definition we have ⟨f,g⟩ = [f]_x • [g]_x. It is enough to show that ⟨x_i, x_j⟩ = ⟨P(x_i), P(x_j)⟩. From the comment after the proof of Proposition 2.1 we obtain

⟨P(x_i), P(x_j)⟩ = ⟨e_i, e_j⟩ = δ_ij = e_i • e_j = [x_i]_x • [x_j]_x.

Hence, our guess for the inner product on V* was correct. We will now relate the Riesz vector of f ∈ V* to the Riesz vector of f ∘ T_g⁻¹.
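Before proving this relation, here is a small numerical sketch (our illustration, over R² so that conjugation plays no rôle): it reads off the Riesz vector of a linear form via Proposition 2.1 and checks the announced relation between f ∘ T⁻¹ and T(w) at a sample point; all names are ours.

```python
import math

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def mat_vec(A, v):
    return (A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1])

def rot(t):
    # rotation by t: a unitary (orthogonal) operator on R^2
    return ((math.cos(t), -math.sin(t)), (math.sin(t), math.cos(t)))

# a linear form f(v) = 2 v_1 - 3 v_2; by Proposition 2.1 its Riesz vector
# has coordinates (f(e_1), f(e_2)) = (2, -3)
f = lambda v: 2 * v[0] - 3 * v[1]
w = (f((1, 0)), f((0, 1)))

v = (0.3, 1.7)
assert abs(f(v) - dot(v, w)) < 1e-12          # f(v) = <v, P(f)>

# T(w) is the Riesz vector of f o T^{-1}:  (f o T^{-1})(v) = <v, T(w)>
T, T_inv = rot(0.8), rot(-0.8)
lhs = f(mat_vec(T_inv, v))                    # (f o T^{-1})(v)
rhs = dot(v, mat_vec(T, w))                   # <v, T(w)>
print(abs(lhs - rhs) < 1e-12)  # -> True
```

The second check works precisely because T is unitary: ⟨T⁻¹(v), w⟩ = ⟨v, T(w)⟩, which is the computation carried out in the proof below.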
Recall that the Riesz vector of f ∈ V* is the unique vector w = P(f) such that f(v) = ⟨v,w⟩ for all v ∈ V. If f ≠ 0 it can be found by scaling any nonzero vector in the cokernel of f, which is one–dimensional, see [Rom08], in particular Theorem 9.18.

2.3 Proposition. Let T_g : V → V be unitary, f ∈ V*, and w = P(f) the Riesz vector of f ∈ V*. Then T_g(w) is the Riesz vector of f ∘ T_g⁻¹, i.e. the Riesz vector of T_g^1(f).

Proof. We may assume that f ≠ 0. Using the notation ⟨w⟩ for the one–dimensional subspace spanned by w, we start with a little diagram:

\[
\langle w\rangle \odot \ker(f) \;\xrightarrow{\;T_g\;}\; \langle T_g(w)\rangle \odot \ker(f \circ T_g^{-1}),
\]

where ⊙ denotes the orthogonal direct sum. We need to show that f ∘ T_g⁻¹ = ⟨·, T_g(w)⟩, i.e. that (f ∘ T_g⁻¹)(v) = ⟨v, T_g(w)⟩ for all v ∈ V. Since w = P(f) is the Riesz vector of f, we have f(v) = ⟨v,w⟩ for all v ∈ V. We obtain

(f ∘ T_g⁻¹)(v) = ⟨T_g⁻¹(v), w⟩ =(T_g unitary) ⟨v, T_g(w)⟩.

From Remark 1.4 we conclude that f ∘ T_g⁻¹ = T_g^1(f).

Observe that Proposition 2.3 implies the commutativity of the following two diagrams.

\[
\begin{array}{ccc}
V^* & \xrightarrow{\;T_g^1\;} & V^* \\
{\scriptstyle P}\big\downarrow & & \big\downarrow{\scriptstyle P} \\
V & \xrightarrow{\;T_g\;} & V
\end{array}
\qquad\qquad
\begin{array}{ccc}
V^* & \xrightarrow{\;(T_g^1)^{-1}\;} & V^* \\
{\scriptstyle P}\big\downarrow & & \big\downarrow{\scriptstyle P} \\
V & \xrightarrow{\;(T_g)^{-1}\;} & V
\end{array}
\]

Indeed, 2.3 implies

P ∘ T_g^1 = T_g ∘ P    (1)
P ∘ (T_g^1)⁻¹ = (T_g)⁻¹ ∘ P    (2)

2.4 Proposition. The first induced operator T_g^1 is unitary.

Proof. We may use that T_g is unitary, that is,

⟨T_g(v), w⟩ = ⟨v, (T_g)⁻¹(w)⟩ = ⟨v, T_{g⁻¹}(w)⟩.    (∗)

Let f, h ∈ V* be arbitrary, w := P(f), and u := P(h). We need to check that ⟨T_g^1(f), h⟩ = ⟨f, (T_g^1)⁻¹(h)⟩. We see that

⟨T_g^1(f), h⟩ =(Proposition 2.2) ⟨(P ∘ T_g^1)(f), P(h)⟩ =(1) ⟨(T_g ∘ P)(f), P(h)⟩
= ⟨T_g(w), u⟩ =(∗) ⟨w, T_{g⁻¹}(u)⟩ = ⟨P(f), ((T_g)⁻¹ ∘ P)(h)⟩
=(2) ⟨P(f), (P ∘ (T_g^1)⁻¹)(h)⟩ = ⟨P(f), P((T_g^1)⁻¹(h))⟩
=(Proposition 2.2) ⟨f, (T_g^1)⁻¹(h)⟩.

After having looked at eigenvalues we will see that this generalizes to higher degree, that T_g^d is diagonalizable for all d ∈ Z⁺. But first let us look at the matrix version of Proposition 2.4.

2.5 Proposition. [T_g^1]_{x,x} = [T_g]_{e,e}

Proof. Let A := [T_g]_{B,B} = [A_1|···|A_i|···|A_n] = [a_{i,j}] and B := [T_g^1]_{x,x} = [B_1|···|B_i|···|B_n] = [b_{i,j}]. We will use the commutativity of the diagram, i.e. P⁻¹ ∘ T_g ∘ P = T_g^1, which we will mark as (□). No, the proof is not finished here. We get T_g(e_i) = A_i = Σ_{k=1}^n a_{k,i} e_k and

T_g^1(x_i) =(□) (P⁻¹ ∘ T_g ∘ P)(x_i) = P⁻¹(T_g(P(x_i)))
=(2.1) P⁻¹(T_g(e_i)) = P⁻¹(Σ_{k=1}^n a_{k,i} e_k) =(conj.) Σ_{k=1}^n a_{k,i} P⁻¹(e_k)
=(2.1) Σ_{k=1}^n a_{k,i} x_k.

On the other hand, [T_g^1(x_i)]_x = [T_g^1]_{x,x} e_i = B_i implies T_g^1(x_i) = Σ_{k=1}^n b_{k,i} x_k. Together we obtain b_{k,i} = a_{k,i}, and the proposition follows.

3 Averaging over the Group

Now we apply averaging to obtain self-adjoint operators.

3.1 Definition. We define the following operators:

1. T̂ : V → V, v ↦ T̂(v) := (1/|G|) Σ_{g∈G} T_g(v)
2. T̂¹ : V* → V*, f ↦ T̂¹(f) := (1/|G|) Σ_{g∈G} T_g^1(f)

These are sometimes called the Reynolds operators of G.

3.2 Proposition. The operators T̂ and T̂¹ are self-adjoint (Hermitian).

Proof. The idea of the averaging trick is that if g ∈ G runs through all group elements and g′ ∈ G is fixed, then the products g′g also run through all group elements. We will make use of the fact that every T_g and every T_g^1 is unitary.

1. We need to show that ⟨T̂(v), w⟩ = ⟨v, T̂(w)⟩ for arbitrary v, w ∈ V. We obtain

⟨T̂(v), w⟩ = ⟨(1/|G|) Σ_{g∈G} T_g(v), w⟩ = (1/|G|) Σ_{g∈G} ⟨T_g(v), w⟩
=(unit.) (1/|G|) Σ_{g∈G} ⟨v, (T_g)⁻¹(w)⟩ = (1/|G|) Σ_{g∈G} ⟨v, T_{g⁻¹}(w)⟩
= (1/|G|) Σ_{g′∈G} ⟨v, T_{g′}(w)⟩ = ⟨v, T̂(w)⟩

2. The same proof, mutatis mutandis, replacing T̂ ↔ T̂¹, T_g ↔ T_g^1, v ↔ f, and w ↔ h shows that ⟨T̂¹(f), h⟩ = ⟨f, T̂¹(h)⟩.

Consequently, T̂ and T̂¹ are unitarily diagonalizable with real spectrum.

3.3 Proposition. The operators T̂ and T̂¹ are idempotent, i.e.

1. T̂ ∘ T̂ = T̂
2. T̂¹ ∘ T̂¹ = T̂¹.

In particular, the eigenvalues of both operators are either 0 or 1.

Proof.
Again, we show only one part; the other part is analogous. To begin with, let s ∈ G be fixed. Then

T_s ∘ T̂ = T_s ∘ (1/|G|) Σ_{g∈G} T_g = (1/|G|) Σ_{g∈G} T_s ∘ T_g
= (1/|G|) Σ_{g∈G} T_{sg} = (1/|G|) Σ_{g′∈G} T_{g′} = T̂.

From this it follows that

T̂ ∘ T̂ = (1/|G|) Σ_{g∈G} T_g ∘ T̂ =(above) (1/|G|) Σ_{g∈G} T̂ = (1/|G|) · |G| · T̂ = T̂.

From T̂ ∘ T̂ = T̂ we conclude that T̂ ∘ (T̂ − id) = 0. Thus the minimal polynomial of T̂ divides the polynomial λ(λ−1), so all eigenvalues are contained in {0,1}.

We will now look at the eigenvalues of T_g and T_g^1 and their interrelation. Since both operators are unitary, their eigenvalues have absolute value 1.

3.4 Proposition.

1. If v ∈ V is an eigenvector of T_g for the eigenvalue λ, then v is an eigenvector of T_{g⁻¹} for the eigenvalue λ̄ = 1/λ.
2. If f ∈ V* is an eigenvector of T_g^1 for the eigenvalue λ, then f is an eigenvector of T_{g⁻¹}^1 for the eigenvalue 1/λ.
3. If f ∈ V* is an eigenvector of T_g^1 for the eigenvalue λ, then P(f) ∈ V is an eigenvector of T_g for the eigenvalue λ̄ = 1/λ.
4. If v ∈ V is an eigenvector of T_g for the eigenvalue λ, then P⁻¹(v) ∈ V* is an eigenvector of T_g^1 for the eigenvalue λ̄ = 1/λ.

Proof. We will make use of the commutativity of Proposition 2.3. Observe that g.v = T_g(v) and g.f = f ∘ T_{g⁻¹}.

1. T_g(v) = g.v = λv ⟹ g⁻¹.g.v = g⁻¹.λv ⟹ g⁻¹.g.v = λ g⁻¹.v
⟹ v = λ g⁻¹.v ⟹ T_{g⁻¹}(v) = g⁻¹.v = (1/λ)v
2. T_g^1(f) = g.f = λf ⟹ g⁻¹.g.f = g⁻¹.λf ⟹ g⁻¹.g.f = λ g⁻¹.f
⟹ f = λ g⁻¹.f ⟹ T_{g⁻¹}^1(f) = g⁻¹.f = (1/λ)f
3. T_g^1(f) = λf ⟹(apply P) P(T_g^1(f)) = P(λf) ⟹(1) T_g(P(f)) = P(λf)
⟹ T_g(P(f)) = λ̄ P(f) = (1/λ) P(f), since P is conjugate linear
4. T_g(v) = λv ⟹(apply P⁻¹) P⁻¹(T_g(v)) = P⁻¹(λv) ⟹(□) (T_g^1 ∘ P⁻¹)(v) = λ̄ P⁻¹(v)
⟹ T_g^1(P⁻¹(v)) = (1/λ) P⁻¹(v)

This implies that if we consider the union of the spectra over all g ∈ G, then we obtain the same (multi)set, no matter whether we take T_g or T_g^1.
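The averaging construction is easy to run exactly on a toy example. The sketch below (our illustration, not from the paper) forms the Reynolds operator for the two-element group that swaps the coordinates of C² and confirms the idempotence of Proposition 3.3 with exact rational arithmetic; all names are ours.

```python
from fractions import Fraction as F

# the two-element group acting on C^2 by swapping coordinates (unitary, order 2)
I = ((F(1), F(0)), (F(0), F(1)))
S = ((F(0), F(1)), (F(1), F(0)))
group = [I, S]

def mat_mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

# Reynolds operator T^ = (1/|G|) sum_g T_g, computed entrywise and exactly
T_hat = tuple(tuple(sum(A[i][j] for A in group) / F(len(group)) for j in range(2))
              for i in range(2))

print(T_hat == ((F(1, 2), F(1, 2)), (F(1, 2), F(1, 2))))  # -> True
print(mat_mul(T_hat, T_hat) == T_hat)  # -> True: idempotent, spectrum in {0, 1}
```

Here T̂ is the orthogonal projection onto the fixed subspace spanned by (1, 1), so its eigenvalues are 1 (on the invariants) and 0 (on the complement), exactly as Proposition 3.3 predicts.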