The Fourier Transform and Equations over Finite Abelian Groups

An introduction to the method of trigonometric sums

LECTURE NOTES BY László Babai
Department of Computer Science
University of Chicago
December 1989
Updated June 2002
VERSION 1.3

The aim of these notes is to present an easily accessible introduction to a powerful method of number theory.

The punchline will be the following finite counterpart of Fermat's Last Theorem:

Theorem 0.1 If $k$ is an integer, $q$ a prime power, and $q \ge k^4 + 4$, then the Fermat equation
$$x^k + y^k = z^k \tag{1}$$
has a nontrivial solution in the finite field $\mathbb{F}_q$ of order $q$.

This result seems to belong to algebraic geometry over finite fields: we have an algebraic variety and we assert that it has points over $\mathbb{F}_q$ other than certain "trivial" ones. In fact, we can asymptotically estimate the number of solutions if $q/k^4$ is large.

As we shall see, algebraic equations have little to do with the method. Indeed, a much more general result will follow easily from the basic theory. Let $\mathbb{F}_q^\times = \mathbb{F}_q \setminus \{0\}$.

Theorem 0.2 Let $k$ be an integer, $A_1, A_2 \subseteq \mathbb{F}_q$, $l_i = (q-1)/|A_i|$ (not necessarily integers), and assume that
$$q \ge k^2 l_1 l_2 + 4. \tag{2}$$
Then the equation
$$x + y = z^k \quad (x \in A_1,\ y \in A_2,\ z \in \mathbb{F}_q^\times) \tag{3}$$
has at least one solution.

Theorem 0.1 follows from this result if we set $A_1 = A_2 = \{a^k : a \in \mathbb{F}_q^\times\}$. Clearly, $|A_i| = \frac{q-1}{\mathrm{g.c.d.}(k,\,q-1)} \ge (q-1)/k$ and therefore $l_i \le k$ in this case.

Note that in Theorem 0.2, the sets $A_1$ and $A_2$ are arbitrary (as long as they are not too small compared to $q$). This result has a flavor of combinatorics where the task often is to create order out of nothing (i.e., without prior structural assumptions). Results like this one have wide applicability in combinatorial terrain such as combinatorial number theory (to which they belong) and even in the theory of computing.
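Theorem 0.1 can be checked by exhaustive search for small parameters. The following is a minimal sketch, not part of the notes: it handles only prime $q$ (so that $\mathbb{F}_q$ is simply the integers mod $q$), and it reads "nontrivial" as $xyz \ne 0$, which is the reading suggested by the reduction to Theorem 0.2. The function names are hypothetical.

```python
import math


def kth_powers(k: int, p: int) -> set:
    """Nonzero k-th powers in the prime field F_p = Z/pZ."""
    return {pow(a, k, p) for a in range(1, p)}


def has_nontrivial_solution(k: int, p: int) -> bool:
    """True if x^k + y^k = z^k has a solution mod p with x, y, z all nonzero.

    Assumption: p is prime, so arithmetic mod p is arithmetic in F_p.
    """
    powers = kth_powers(k, p)
    return any((u + v) % p in powers for u in powers for v in powers)


def is_prime(m: int) -> bool:
    """Trial division; adequate for the small moduli used here."""
    return m >= 2 and all(m % d for d in range(2, math.isqrt(m) + 1))


if __name__ == "__main__":
    k = 3
    bound = k ** 4 + 4  # threshold from Theorem 0.1
    # Below the threshold a nontrivial solution may fail to exist.
    for p in (7, 13):
        print(p, has_nontrivial_solution(k, p))
    # At or above the threshold the theorem guarantees one.
    primes = [p for p in range(bound, 300) if is_prime(p)]
    print(all(has_nontrivial_solution(k, p) for p in primes))
```

The search runs over the set of $k$th powers instead of over all triples $(x, y, z)$, which mirrors the choice $A_1 = A_2 = \{a^k\}$ made above.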
Notation

$\mathbb{C}$: field of complex numbers
$\mathbb{C}^\times = \mathbb{C} \setminus \{0\}$: multiplicative group of complex numbers
$\mathbb{Z}$: ring of integers
$\mathbb{Z}_n = \mathbb{Z}/n\mathbb{Z}$: ring of mod $n$ residue classes
$\mathbb{F}_q$: field of $q$ elements where $q$ is a prime power
$(\mathbb{F}_q, +)$: the additive group of $\mathbb{F}_q$
$\mathbb{F}_q^\times = \mathbb{F}_q \setminus \{0\}$: the multiplicative group of $\mathbb{F}_q$.

1 Characters

Let $G$ be a finite abelian group of order $n$, written additively. A character of $G$ is a homomorphism $\chi : G \to \mathbb{C}^\times$ of $G$ to the multiplicative group of (nonzero) complex numbers:
$$\chi(a+b) = \chi(a)\chi(b) \quad (a, b \in G). \tag{4}$$
Clearly,
$$\chi(a)^n = \chi(na) = \chi(0) = 1 \quad (a \in G), \tag{5}$$
so the values of $\chi$ are $n$th roots of unity. In particular,
$$\chi(-a) = \chi(a)^{-1} = \overline{\chi(a)} \tag{6}$$
where the bar indicates complex conjugation. The principal character is defined by
$$\chi_0(a) = 1 \quad (a \in G). \tag{7}$$

Proposition 1.1 For any nonprincipal character $\chi$ of $G$,
$$\sum_{a \in G} \chi(a) = 0. \tag{8}$$

Proof: Let $b \in G$ be such that $\chi(b) \ne 1$, and let $S$ denote the sum on the left hand side of equation (8). Then
$$\chi(b) \cdot S = \sum_{a \in G} \chi(b)\chi(a) = \sum_{a \in G} \chi(b+a) = S,$$
hence $S(\chi(b) - 1) = 0$, proving the claim. $\Box$

Corollary 1.2 (First orthogonality relation for characters) Let $\chi$ and $\psi$ be two characters of $G$. Then
$$\sum_{a \in G} \chi(a)\overline{\psi(a)} = \begin{cases} n & \text{if } \chi = \psi \\ 0 & \text{otherwise.} \end{cases} \tag{9}$$

Proof: The case $\chi = \psi$ follows from equation (6). If $\chi \ne \psi$, then $\chi\overline{\psi}$ is a nonprincipal character, hence Proposition 1.1 applies. $\Box$

As observed in the last proof, the pointwise product of the characters $\chi$ and $\psi$ is a character again:
$$(\chi\psi)(a) := \chi(a)\psi(a) \tag{10}$$
Let $\widehat{G}$ denote the set of characters. It is easy to see that this set forms an abelian group under operation (10). $\widehat{G}$ is called the dual group of $G$.

Proposition 1.3 Let $\omega$ be a primitive $n$th root of unity. Then the map $\chi_j : \mathbb{Z}_n \to \mathbb{C}^\times$ defined by
$$\chi_j(a) := \omega^{ja} \tag{11}$$
is a character of $\mathbb{Z}_n$ for every $j \in \mathbb{Z}$. Moreover,
(a) $\chi_j = \chi_k$ if and only if $j \equiv k \bmod n$;
(b) $\chi_j = \chi_1^j$;
(c) $\widehat{\mathbb{Z}_n} = \{\chi_0, \dots, \chi_{n-1}\}$.
(d) Consequently, $\widehat{\mathbb{Z}_n} \cong \mathbb{Z}_n$.

Proof: (a) and (b) are straightforward.
Let now $\chi$ be an arbitrary character; then $\chi(1) = \omega^j$ for some $j$, $0 \le j \le n-1$, by eqn. (5). It follows that $\chi = \chi_j$. Now, (d) is immediate. $\Box$

Proposition 1.4 If $G$ is a direct sum: $G = H_1 \oplus H_2$, and $\varphi_i : H_i \to \mathbb{C}^\times$ is a character of $H_i$ ($i = 1, 2$), then $\chi = \varphi_1 \oplus \varphi_2$, defined by
$$\chi(h_1, h_2) := \varphi_1(h_1) \cdot \varphi_2(h_2), \tag{12}$$
is a character of $G$. Moreover, all characters of $G$ are of this form. Consequently,
$$\widehat{G} \cong \widehat{H_1} \oplus \widehat{H_2} \tag{13}$$

Proof: The first statement is clear, and it is easy to verify that the map $\widehat{H_1} \oplus \widehat{H_2} \to \widehat{G}$ defined by (12) is injective. Let now $\chi \in \widehat{G}$. The restriction $\varphi_i = \chi|_{H_i}$ is clearly a character of $H_i$, and it is easy to verify that $\chi = \varphi_1 \oplus \varphi_2$. $\Box$

Corollary 1.5 $G \cong \widehat{G}$.

Proof: $G \cong \mathbb{Z}_{n_1} \oplus \cdots \oplus \mathbb{Z}_{n_k}$, hence $\widehat{G} \cong \widehat{\mathbb{Z}_{n_1}} \oplus \cdots \oplus \widehat{\mathbb{Z}_{n_k}} \cong G$ using the previous two propositions. $\Box$

We remark that there is no natural isomorphism between $G$ and $\widehat{G}$; even for cyclic groups, the isomorphism selected depends on the arbitrary choice of $\omega$. The consequent isomorphism $G \cong \widehat{\widehat{G}}$ is, however, natural:

Corollary 1.6 $G$ can be identified with $\widehat{\widehat{G}}$ in the following natural way: for $a \in G$, define $\tilde{a} \in \widehat{\widehat{G}}$ by
$$\tilde{a}(\chi) = \chi(a) \quad (\chi \in \widehat{G}). \tag{14}$$
The map $a \mapsto \tilde{a}$ is an isomorphism of $G$ and $\widehat{\widehat{G}}$.

Proof: Left to the reader. $\Box$

Let $\mathbb{C}^G$ denote the space of functions $f : G \to \mathbb{C}$. This is an $n$-dimensional linear space over $\mathbb{C}$. We introduce an inner product over this space:
$$(f, g) = \frac{1}{n} \sum_{a \in G} \overline{f(a)}\, g(a) \quad (f, g \in \mathbb{C}^G). \tag{15}$$

Theorem 1.7 $\widehat{G}$ forms an orthonormal basis in $\mathbb{C}^G$.

Proof: Orthonormality follows from Cor. 1.2. Completeness follows from Cor. 1.5 which implies that $|\widehat{G}| = n = \dim(\mathbb{C}^G)$. $\Box$

Let $\chi_0, \dots, \chi_{n-1}$ be the characters of $G = \{a_0, \dots, a_{n-1}\}$. The $n \times n$ matrix
$$C = (\chi_i(a_j)) \tag{16}$$
is the character table of $G$.

Corollary 1.8 The matrix $A = \frac{1}{\sqrt{n}} C$ is unitary, i.e., $AA^* = A^*A = I$. ($A^*$ is the conjugate transpose of $A$; $I$ is the $n \times n$ identity matrix.)
Proof: $A^*A = I$ follows immediately from Theorem 1.7 in view of the formula (15). $\Box$

Corollary 1.9 (Second orthogonality relation for characters) Let $a, b \in G$. Then
$$\sum_{\chi \in \widehat{G}} \chi(a)\overline{\chi(b)} = \begin{cases} n & \text{if } a = b \\ 0 & \text{otherwise.} \end{cases} \tag{17}$$

First proof: This is a restatement of the fact that $AA^* = I$ in Corollary 1.8. $\Box$

Second proof: In view of the identification of $G$ and $\widehat{\widehat{G}}$ (Cor. 1.6), Cor. 1.9 is a restatement of Cor. 1.2 for the abelian group $\widehat{G}$ in place of $G$. $\Box$

We state a special case separately. The following is the dual of Proposition 1.1.

Corollary 1.10 For any non-zero element $a \in G$,
$$\sum_{\chi \in \widehat{G}} \chi(a) = 0. \qquad \Box$$

2 Fourier Transform

Corollary 2.1 Any function $f \in \mathbb{C}^G$ can be written as a linear combination of characters:
$$f = \sum_{\chi \in \widehat{G}} c_\chi \chi. \tag{18}$$
Such a linear combination is also called a trigonometric sum since $f(a)$ is expressed as a combination of $n$th roots of unity. The coefficients $c_\chi$ are called the Fourier coefficients and are given by the formula
$$c_\chi = (\chi, f). \tag{19}$$

Proof: Expansion (18) exists by Theorem 1.7. The inner product $(\chi, f)$ is equal to $c_\chi$ by orthonormality. $\Box$

The function $\hat{f} : \widehat{G} \to \mathbb{C}$, defined by
$$\hat{f}(\chi) = n c_{\overline{\chi}} = \sum_{a \in G} \chi(a) f(a) \quad (\chi \in \widehat{G}), \tag{20}$$
is called the Fourier Transform of $f$. This transformation is easily inverted: using equations (18) and (20), we see that
$$f = \sum_{\chi \in \widehat{G}} c_\chi \chi = \frac{1}{n} \sum_{\chi \in \widehat{G}} \hat{f}(\overline{\chi}) \chi,$$
hence the formula for the Inverse Fourier Transform is
$$f(a) = \frac{1}{n} \sum_{\chi \in \widehat{G}} \hat{f}(\chi) \chi(-a) \quad (a \in G). \tag{21}$$

We derive a simple consequence. Let $\delta \in \mathbb{C}^G$ be defined by
$$\delta(a) = \begin{cases} 1 & \text{if } a = 0 \\ 0 & \text{if } a \ne 0 \end{cases} \quad (a \in G).$$

Corollary 2.2
(a)
$$\hat{\delta}(\chi) = 1 \quad (\chi \in \widehat{G}). \tag{22}$$
(b)
$$\delta = \frac{1}{n} \sum_{\chi \in \widehat{G}} \chi. \tag{23}$$

Proof: (a) follows from eqn. (20). (b) follows from eqn. (21). (Note that (b) also follows from the second orthogonality relation (17) with $a = 0$.)
$\Box$

Applying formula (15) to $\widehat{G}$ we obtain the inner product
$$(f, g) = \frac{1}{n} \sum_{\chi \in \widehat{G}} \overline{f(\chi)}\, g(\chi) \quad (f, g \in \mathbb{C}^{\widehat{G}}) \tag{24}$$
over the space $\mathbb{C}^{\widehat{G}}$. Corollary 1.8 tells us that Fourier transformation is $\sqrt{n}$ times a unitary transformation between $\mathbb{C}^G$ and $\mathbb{C}^{\widehat{G}}$:

Theorem 2.3 (Plancherel formula) For any $f, g \in \mathbb{C}^G$,
$$(\hat{f}, \hat{g}) = n(f, g). \tag{25}$$

First proof: Using the notation introduced before Cor. 1.8, let
$$\mathbf{f} = (f(a_0), \dots, f(a_{n-1})), \quad \mathbf{g} = (g(a_0), \dots, g(a_{n-1})),$$
$$\hat{\mathbf{f}} = (\hat{f}(\chi_0), \dots, \hat{f}(\chi_{n-1})), \quad \hat{\mathbf{g}} = (\hat{g}(\chi_0), \dots, \hat{g}(\chi_{n-1})).$$
As in (16), let $C = (\chi_i(a_j))$ be the character table of $G$. Then $\hat{\mathbf{f}} = \mathbf{f}C$, $\hat{\mathbf{g}} = \mathbf{g}C$, and
$$(\hat{f}, \hat{g}) = \frac{1}{n} \cdot \hat{\mathbf{g}} \cdot \hat{\mathbf{f}}^* = \frac{1}{n} \mathbf{g} C C^* \mathbf{f}^* = \mathbf{g} \cdot \mathbf{f}^* = n \cdot (f, g). \tag{26}$$
(As before, $*$ denotes conjugate transpose.) We made use of the fact that $C = \sqrt{n} A$, hence $CC^* = nAA^* = nI$ (Cor. 1.8). $\Box$

Second proof: The map $f \mapsto \hat{f}$ is clearly linear. Therefore it suffices to prove (25) for elements $f, g$ of a basis of $\mathbb{C}^G$. The functions $\delta_a$ defined by
$$\delta_a(b) = \delta(b - a) \quad (b \in G) \tag{27}$$
(the characteristic vectors of the singletons) form a basis of $\mathbb{C}^G$. Clearly,
$$\hat{\delta}_a(\chi) = \chi(a), \tag{28}$$
hence by the second orthogonality relation,
$$(\hat{\delta}_a, \hat{\delta}_b) = \frac{1}{n} \sum_{\chi \in \widehat{G}} \overline{\chi(a)}\, \chi(b) = \begin{cases} 1 & \text{if } a = b \\ 0 & \text{otherwise.} \end{cases}$$
On the other hand, obviously,
$$(\delta_a, \delta_b) = \begin{cases} 1/n & \text{if } a = b \\ 0 & \text{otherwise} \end{cases} = \frac{1}{n} (\hat{\delta}_a, \hat{\delta}_b). \qquad \Box$$

Let $\|f\| = \sqrt{(f, f)}$.

Corollary 2.4 $\|\hat{f}\| = \sqrt{n}\, \|f\|$. $\Box$

The characteristic function of a set $A \subseteq G$ is the function $f_A \in \mathbb{C}^G$ defined by
$$f_A(a) = \begin{cases} 1 & \text{if } a \in A \\ 0 & \text{otherwise.} \end{cases} \tag{29}$$

Proposition 2.5 For $A, B \subseteq G$,
$$(f_A, f_B) = \frac{1}{n} |A \cap B|. \tag{30}$$
In particular,
$$\sqrt{n}\, \|f_A\| = \sqrt{|A|}. \tag{31}$$

Proof: Evident. $\Box$

We note that
$$\hat{f}_A(\chi_0) = |A|. \tag{32}$$
This is $n$ times the principal Fourier coefficient of $f_A$.
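The identities of this section are easy to test numerically on a cyclic group. The sketch below works in $G = \mathbb{Z}_n$ with the characters $\chi_j(a) = \omega^{ja}$ of Proposition 1.3. It assumes the convention $\hat{f}(\chi) = \sum_a \chi(a) f(a)$ and an inner product that conjugates its first argument; the choice $\omega = e^{2\pi i/n}$ and all function names are illustrative assumptions, not taken from the notes.

```python
import cmath


def fourier_transform(f):
    """Return [hat f(chi_0), ..., hat f(chi_{n-1})] for f given as a list over Z_n.

    Assumed convention: hat f(chi_j) = sum_a chi_j(a) f(a), chi_j(a) = omega^(j*a).
    """
    n = len(f)
    omega = cmath.exp(2j * cmath.pi / n)  # one choice of primitive n-th root of unity
    return [sum(omega ** (j * a % n) * f[a] for a in range(n)) for j in range(n)]


def inner(f, g):
    """(f, g) = (1/n) * sum of conj(f(a)) * g(a); conjugation side is an assumption."""
    n = len(f)
    return sum(complex(x).conjugate() * y for x, y in zip(f, g)) / n


def indicator(A, n):
    """Characteristic function f_A of a subset A of Z_n, as a list of length n."""
    return [1.0 if a in A else 0.0 for a in range(n)]


if __name__ == "__main__":
    n = 12
    A, B = {0, 1, 4, 9}, {1, 2, 3}
    fA, fB = indicator(A, n), indicator(B, n)
    hA, hB = fourier_transform(fA), fourier_transform(fB)
    print(abs(hA[0] - len(A)))                       # eqn. (32): should be ~0
    print(abs(inner(hA, hB) - n * inner(fA, fB)))    # Plancherel (25): should be ~0
    print(abs(inner(fA, fB) - len(A & B) / n))       # eqn. (30): should be ~0
```

A direct $O(n^2)$ summation is used on purpose: it follows the definition term by term, which matters more here than speed.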
The remaining Fourier coefficients give important "randomness" information on the set $A$. Let
$$\Phi(A) = \max\{ |\hat{f}_A(\chi)| : \chi \in \widehat{G},\ \chi \ne \chi_0 \}.$$
The smaller $\Phi(A)$, the "smoother," more "random looking" the set $A$ is. We shall estimate $\Phi(A)$ for specific "smooth" sets in Sections 5 and 6. Here we give a lower bound which holds for every set $A \subseteq G$.

Proposition 2.6 For every $A \subseteq G$, if $|A| \le n/2$, then
$$\Phi(A) \ge \sqrt{|A|/2}. \tag{33}$$

Proof: By Cor. 2.4 and eqn. (31) we have:
$$\|\hat{f}_A\|^2 = n \|f_A\|^2 = |A|.$$
On the other hand,
$$n \|\hat{f}_A\|^2 = \sum_{\chi \in \widehat{G}} |\hat{f}_A(\chi)|^2 \le (\hat{f}_A(\chi_0))^2 + (n-1)\Phi(A)^2 = |A|^2 + (n-1)\Phi(A)^2.$$
Consequently,
$$|A|^2 + (n-1)\Phi(A)^2 \ge n|A|; \qquad \Phi(A)^2 \ge \frac{(n - |A|)|A|}{n-1} \ge \frac{|A|}{2}. \qquad \Box$$

The "smooth" sets will be those which come close to this bound. The assumption $|A| \le n/2$ is justified by the following exercise.

Exercise 2.7 Prove: $\Phi(A) = \Phi(G \setminus A)$ for every $A \subseteq G$.

For $A \subseteq G$ and $k \in \mathbb{Z}$, let $kA = \{ka : a \in A\}$.

Exercise 2.8 Prove: If $\mathrm{g.c.d.}(k, n) = 1$ then $\Phi(kA) = \Phi(A)$ for every $A \subseteq G$. In particular $\Phi(-A) = \Phi(A)$.

More generally, $\Phi$ is invariant under automorphisms of $G$. Let $\mathrm{Aut}\, G$ denote the automorphism group of $G$.

Exercise 2.9 Prove: if $\alpha \in \mathrm{Aut}\, G$ then $\Phi(A) = \Phi(\alpha A)$ for every $A \subseteq G$.

Exercise 2.10 Prove: $\Phi(A + a) = \Phi(A)$ for every $a \in G$. (Here $A + a = \{u + a : u \in A\}$.)

3 Equations over finite abelian groups

We shall consider the following general problem: Let $A_1, \dots, A_k \subseteq G$ and let $a$ be a fixed element of $G$. Estimate the number of solutions of the equation
$$x_1 + \cdots + x_k = a \quad (x_i \in A_i,\ i = 1, \dots, k). \tag{34}$$
In particular, decide whether or not a solution exists.

Let $|A_i| = m_i$. Assume for a moment that while the sets $A_i$ are fixed, the element $a \in G$ is selected at random. This makes the expected number of solutions equal to
$$\frac{m_1 \cdots m_k}{n}, \tag{35}$$
the numerator being the number of $k$-tuples from $A_1 \times \cdots \times A_k$, and $\frac{1}{n}$ being the chance that a random element $a \in G$ happens to be equal to $\sum_{i=1}^{k} x_i$ for fixed $x_i \in G$.
It is remarkable that under fairly general circumstances, the quantity $m_1 \cdots m_k / n$ will be close to the actual number of solutions for every $a \in G$.

We shall give a sufficient condition for this to happen. First of all we observe that the number of solutions will not change if we replace $A_k$ by $A_k - a = \{u - a : u \in A_k\}$ and set the right hand side in eqn. (34) to zero. So it suffices to consider the homogeneous equation
$$x_1 + \cdots + x_k = 0 \quad (x_i \in A_i,\ i = 1, \dots, k). \tag{36}$$
Let $N$ denote the number of solutions of eqn. (36). We first describe an explicit formula for $N$.

Theorem 3.1 The number of solutions of eqn. (36) is
$$N = \frac{m_1 \cdots m_k}{n} + R, \tag{37}$$
where $m_i = |A_i|$ and
$$R = \frac{1}{n} \sum_{\substack{\chi \in \widehat{G} \\ \chi \ne \chi_0}} \prod_{i=1}^{k} \hat{f}_{A_i}(\chi). \tag{38}$$

Proof: The number of solutions is clearly
$$N = \sum_{\substack{(x_1, \dots, x_k) \\ x_i \in A_i}} \delta(x_1 + \cdots + x_k) = \frac{1}{n} \sum_{\chi \in \widehat{G}} \sum_{\substack{(x_1, \dots, x_k) \\ x_i \in A_i}} \chi(x_1 + \cdots + x_k)$$
(we have used eqn. (23)). Since $\chi(x_1 + \cdots + x_k) = \chi(x_1) \cdots \chi(x_k)$, the rightmost sum factors as $\prod_{i=1}^{k} \bigl( \sum_{x_i \in A_i} \chi(x_i) \bigr)$. We recognize the term in the parentheses as $\hat{f}_{A_i}(\chi)$. In summary,
$$N = \frac{1}{n} \sum_{\chi \in \widehat{G}} \prod_{i=1}^{k} \hat{f}_{A_i}(\chi).$$
We separate out the term corresponding to $\chi_0$:
$$N = \frac{1}{n} \prod_{i=1}^{k} \hat{f}_{A_i}(\chi_0) + R.$$
By eqn. (32), $\hat{f}_{A_i}(\chi_0) = m_i$. This observation concludes the proof. $\Box$

The value of this formula depends on our ability to estimate the "error-term" $R$. In order to be able to conclude that equation (36) has a solution at all, we need to prove that $|R| < (m_1 \cdots m_k)/n$.

The art of estimating $|R|$ and variations of it constitute the method of trigonometric sums.

4 The Cauchy-Schwarz trick

In the case $k = 3$, one can give a strong explicit upper bound on $R$ under surprisingly general circumstances. It will follow from the estimate that if at least one of the sets $A_i$ is smooth (all non-principal Fourier coefficients of $A_i$ are small) and the sets are not too small, then equation (36) has approximately $(m_1 \cdots m_k)/n$ solutions.
Along the way, we shall experience a little Cauchy-Schwarz magic.
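Before estimating $R$, it may help to see the decomposition $N = m_1 \cdots m_k / n + R$ of Theorem 3.1 on a concrete instance. The following sketch works in $G = \mathbb{Z}_n$ with characters $\chi_j(a) = \omega^{ja}$, $\omega = e^{2\pi i/n}$; it counts solutions of the homogeneous equation by brute force and compares the count with the character-sum formula. The example sets and all function names are hypothetical, chosen only for illustration.

```python
import cmath
from itertools import product


def indicator_transform(A, n):
    """Values sum_{a in A} chi_j(a) for j = 0..n-1, with chi_j(a) = omega^(j*a)."""
    omega = cmath.exp(2j * cmath.pi / n)
    return [sum(omega ** (j * a % n) for a in A) for j in range(n)]


def count_brute_force(sets, n):
    """Number of tuples (x_1, ..., x_k), x_i in A_i, with x_1 + ... + x_k = 0 in Z_n."""
    return sum(1 for xs in product(*sets) if sum(xs) % n == 0)


def main_and_error_term(sets, n):
    """Return (m_1 * ... * m_k / n, R), where R is the sum over nonprincipal characters."""
    hats = [indicator_transform(A, n) for A in sets]
    main = 1.0
    for A in sets:
        main *= len(A)
    main /= n
    R = 0j
    for j in range(1, n):  # j = 0 is the principal character and is excluded
        term = 1 + 0j
        for h in hats:
            term *= h[j]
        R += term
    return main, R / n


if __name__ == "__main__":
    n = 11
    sets = [{1, 2, 5}, {0, 3, 4, 7}, {2, 6, 9, 10}]  # arbitrary illustrative subsets of Z_11
    N = count_brute_force(sets, n)
    main, R = main_and_error_term(sets, n)
    print(N, main, R)  # main + R should agree with N up to rounding
```

Since $N$ and the main term are real, the imaginary part of the computed $R$ is pure rounding error; its size gives a quick sanity check on the arithmetic.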