Iteration of Quadratic Polynomials Over Finite Fields


D.R. Heath-Brown
Mathematical Institute, Oxford

arXiv:1701.02707v2 [math.NT] 17 Jan 2017

1 Introduction

Let $f(X) \in \mathbb{F}_q[X]$ and define the iterates $f^j(X)$ by setting $f^0(X) = X$ and $f^{j+1}(X) = f(f^j(X))$. Let $m \in \mathbb{F}_q$, and consider the sequence of values $f^0(m), f^1(m), f^2(m), \ldots$. Since the field $\mathbb{F}_q$ is finite, the sequence eventually recurs, and one enters a closed cycle. We are interested in the questions: How long is it before one enters the cycle? How long is the cycle? In general we can construct a directed graph $\Gamma_f = \Gamma_f(\mathbb{F}_q)$, whose vertices are the elements $m$ of $\mathbb{F}_q$, and with edges $(m, f(m))$. The trajectory $f^0(m), f^1(m), f^2(m), \ldots$ then consists of a pre-cyclic "tail", followed by a cycle.

Linear polynomials are easily handled. When $f(X) = X + b$ one has $f^j(X) = X + bj$, so that if $b = 0$ the cycles are singleton sets, and if $b \neq 0$ then $\Gamma_f$ is a union of cycles of length $p$, the characteristic of the field. For linear polynomials $f(X) = aX + b$ with $a \neq 0, 1$ one finds that

$$f^j(X) = a^j\{X + b(a-1)^{-1}\} - b(a-1)^{-1}.$$

Thus $\Gamma_f$ consists of cycles of length $\mathrm{ord}(a)$ together with a cycle $\{-b(a-1)^{-1}\}$ of length 1.

The situation is much more interesting for higher degree polynomials, and forms the basis for Pollard's famous "Rho Algorithm" for integer factorization [4]. If one wishes to factor $N$ the algorithm calculates successive iterates $f^j(m)$ and $f^{2j}(m)$ modulo $N$, until one reaches a value for which $\gcd(f^j(m) - f^{2j}(m), N) > 1$. If this highest common factor is different from $N$ then one has obtained a non-trivial factor of $N$. When $p$ is a prime divisor of $N$, the sequence of iterates modulo $p$ will have an initial segment of length $t$, say (the "tail" of the letter rho), followed by a cycle of length $c$, say. Thus $p \mid f^j(m) - f^{2j}(m)$ when $j$ is the smallest multiple of $c$ for which $j > t$. In particular the first such $j$ is at most $t + c$.
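The quantities $t$ and $c$, and the gcd test just described, can be explored with a short script. The following is only a minimal sketch, not the paper's own code: the function names, the prime 10007, the starting value 3 and the composite 8051 are illustrative assumptions, and the map $x \mapsto x^2 + 1$ is used simply because it is the archetypal example.

```python
from math import gcd


def tail_and_cycle(f, m):
    """Return (t, c): tail length and cycle length of the orbit of m under f."""
    seen = {}                 # value -> index of its first visit
    x, j = m, 0
    while x not in seen:
        seen[x] = j
        x = f(x)
        j += 1
    t = seen[x]               # index at which the cycle is entered
    return t, j - t           # j - t is the cycle length


def floyd_first_collision(f, m):
    """Smallest j >= 1 with f^j(m) == f^{2j}(m) (the comparison used by rho)."""
    slow, fast, j = f(m), f(f(m)), 1
    while slow != fast:
        slow, fast, j = f(slow), f(f(fast)), j + 1
    return j


def pollard_rho(N, m=2, c=1):
    """Try to split N using x -> x^2 + c; returns a proper factor or None."""
    f = lambda x: (x * x + c) % N
    slow, fast = f(m), f(f(m))
    while True:
        g = gcd(slow - fast, N)
        if g > 1:
            return g if g < N else None   # g == N means this attempt failed
        slow, fast = f(slow), f(f(fast))


p = 10007                                  # illustrative prime (assumption)
f = lambda x: (x * x + 1) % p
t, c = tail_and_cycle(f, 3)
j = floyd_first_collision(f, 3)
print("tail", t, "cycle", c, "first j with f^j = f^2j:", j)
print("factor of 8051:", pollard_rho(8051))
```

In this sketch the collision index `j` is always a multiple of `c` and never exceeds `t + c`, which is the bound stated above.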
If $p'$ is some other prime divisor of $N$ there will be a corresponding value $j'$ for which $p' \mid f^{j'}(m) - f^{2j'}(m)$. Unless the two values $j$ and $j'$ are the same, the method will produce a non-trivial divisor $\gcd(f^j(m) - f^{2j}(m), N)$ of $N$. The efficiency of the algorithm depends on $t$ and $c$ being small.

A crude probabilistic argument predicts that, over the field $\mathbb{F}_q$, the sequence $f^0(m), f^1(m), f^2(m), \ldots$ is likely to complete a cycle after roughly $O(q^{1/2})$ steps. This is a version of the "Birthday Paradox". Specifically, if one imagines the sequence as taking values in $\mathbb{F}_q$ independently and uniformly at random, then the chance of having a repetition within $N$ steps, say, is

$$1 - \prod_{j=0}^{N-1}\left(1 - \frac{j}{q}\right)$$

and when $N$ is of order $\sqrt{q}$ this is roughly $1 - \exp\{-N^2/2q\}$. Thus there is a positive probability of a repetition as soon as $N \gg \sqrt{q}$.

Unfortunately there are examples in which this heuristic clearly fails. Thus if $f(X) = X^2$ one has $f^j(m) = m^{2^j}$, and if $m$ has odd order $r$ one gets a pure cycle of length $l$, where $l$ is the order of 2 modulo $r$. Thus if $q$ is a prime of the shape $2r + 1$, with $r$ a prime for which 2 is a primitive root, then the cycle length will be $r - 1 = (q-3)/2$ whenever $m$ has order $r$ modulo $q$. While it is not known that infinitely many such primes $q$ exist, it is certainly conjectured to be so. Thus we will expect to get cycles of length $\gg q$ for a positive proportion of initial values $m$.

A second example is provided by the polynomial $f(X) = X^2 - 2$. If $m = a + a^{-1}$ for some $a \in \mathbb{F}_q$, then $f^j(m) = a^{2^j} + a^{-2^j}$, and we have a situation similar to that described above. If $q = 2r + 1$ with $r$ a prime for which 2 is a primitive root, then again we will have cycles of length $\gg q$ for a positive proportion of initial values $m$.

Thirdly one can consider polynomials of the shape $f(X) = X^3 + c$, in the case in which $q$ is a prime with $q \equiv 2 \pmod 3$. Here one sees that $f$ induces a permutation of $\mathbb{F}_q$, since $X^3 = a$ has a unique solution in $\mathbb{F}_q$, for every $a \in \mathbb{F}_q$.
If $m \in \mathbb{F}_q$ is given, the trajectory $f^0(m), f^1(m), \ldots$ is therefore completely cyclic, and our question merely concerns the length of the cycle. However the proportion of permutations in the symmetric group $S_q$ for which $m$ belongs to a cycle of given length $k$ is exactly $q^{-1}$. Thus one might expect all cycle lengths to occur equally often, and that one should get a cycle of length at least $q/2$, say, with probability around $1/2$. The numerical evidence seems to support this. For a given prime $p \equiv 2 \pmod 3$ and every $c = 1, \ldots, p-1$ we compute the length, $l(c,p)$ say, of the cycle which starts at $m = 0$. We then see for how many values of $c$ the scaled cycle length $p^{-1} l(c,p)$ falls into each of the intervals $((k-1)/10, k/10]$, for $k = 1, \ldots, 10$. If the permutations induced by the various polynomials $X^3 + c$ were genuinely random we would expect roughly the same number of scaled cycle lengths in each such interval. The data for the first two primes $p \equiv 2 \pmod 3$ beyond $10^5$ are presented in Table 1. The figures appear to support the random permutation model well.

    Interval         p = 100019   p = 100043
    (0, 1/10]           10030         9936
    (1/10, 2/10]         9944         9730
    (2/10, 3/10]         9992         9976
    (3/10, 4/10]        10122        10232
    (4/10, 5/10]        10212        10034
    (5/10, 6/10]         9830        10000
    (6/10, 7/10]         9902        10086
    (7/10, 8/10]         9904        10012
    (8/10, 9/10]        10070         9946
    (9/10, 1]           10012        10090

    Table 1: Distribution of scaled cycle lengths

The main goal of the present paper is to describe a quite different theory for the iterates of quadratic polynomials in odd characteristic, in which it is clear why the anomalous cases above must be excluded. In contrast to the situation with $f(X) = X^3 + 1$, when $f(X) = aX^2 + bX + c$, the equation $f(x) = s$ typically has either 2 solutions or none at all, the latter case holding for roughly half the possible values of $s$ (those for which $b^2 - 4a(c-s)$ is a non-square).
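The two-or-none dichotomy for $f(x) = s$ can be confirmed by brute force over a small prime field. The sketch below is an illustration under stated assumptions: the prime 101 and the coefficients 3, 5, 7 are arbitrary choices, and the helper names are hypothetical. It compares the actual number of solutions with $1 + \chi(b^2 - 4a(c-s))$, where $\chi$ is the quadratic character computed via Euler's criterion.

```python
from collections import Counter


def preimage_counts(a, b, c, p):
    """counts[s] = number of x in F_p with a*x^2 + b*x + c = s."""
    hits = Counter((a * x * x + b * x + c) % p for x in range(p))
    return [hits.get(s, 0) for s in range(p)]


def quadratic_character(n, p):
    """Legendre symbol of n modulo an odd prime p, via Euler's criterion."""
    n %= p
    if n == 0:
        return 0
    return 1 if pow(n, (p - 1) // 2, p) == 1 else -1


p, a, b, c = 101, 3, 5, 7          # illustrative values (assumptions)
counts = preimage_counts(a, b, c, p)

# The number of solutions of f(x) = s is 1 + chi(b^2 - 4a(c - s)).
matches = all(
    counts[s] == 1 + quadratic_character(b * b - 4 * a * (c - s), p)
    for s in range(p)
)
histogram = Counter(counts)        # how many s have 0, 1 or 2 preimages
print(matches, dict(histogram))
```

For an odd prime $p$ the histogram has $(p-1)/2$ values of $s$ with no preimage, $(p-1)/2$ with two, and a single value with exactly one (the image of the critical point).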
When $f(x) = s$ has two solutions $x = t_1$ and $x = t_2$, the equations $f(x) = t_1$ and $f(x) = t_2$ will again typically have either 2 solutions or none. In this way, considering solutions of $f^r(x) = m$, one sees that $\Gamma_f$ is potentially much more complicated than a series of cycles. Our main result demonstrates this distinction clearly.

Theorem 1 Let $\mathbb{F}_q$ be a finite field of characteristic $p \neq 2$, and let $f(X) = aX^2 + c \in \mathbb{F}_q[X]$ with $a \neq 0$. Suppose that $f^i(0) \neq f^j(0)$ for $0 \le i < j \le r$. Then

$$\# f^r(\mathbb{F}_q) = \mu_r q + O(2^{4r}\sqrt{q}) \tag{1}$$

uniformly in $a$ and $c$, where the constant $\mu_r$ is defined recursively by taking $\mu_0 = 1$ and

$$\mu_{r+1} = \mu_r - \tfrac{1}{2}\mu_r^2. \tag{2}$$

Moreover we have $\mu_r \sim 2/r$ as $r \to \infty$.

At this point we should mention some closely related work. Shao [5, Theorem 1.6] handles the case $f(X) = X^2 + 1$ by a method which generalizes readily to other quadratics. The condition in his theorem is stronger than ours (that $f^i(0) \neq f^j(0)$ for $0 \le i < j \le r$) but an examination of the proof shows that he only needs something like our condition. His result does not include an explicit dependence on $r$. Juul, Kurlberg, Madhu and Tucker [3] handle general rational functions rather than restricting to quadratic polynomials. Their emphasis is on the reductions of a given rational function $\varphi(X) \in \mathbb{Q}(X)$ modulo different primes, but they show under quite general conditions that the sum of all cycle lengths is $o(p)$ as $p \to \infty$. (See Corollary 2 below.)

Before discussing the implications of the theorem, let us examine the condition that $f^i(0) \neq f^j(0)$ for $0 \le i < j \le r$. The critical points of a polynomial $f(X)$ are the roots $\xi$ of $f'(X)$, and $f$ is said to be "post-critically finite" if the iterates $f^j(\xi)$ eventually enter a cycle, for every critical point $\xi$. In dynamics in general post-critically finite maps are a very important subclass. Of course, over a finite field every polynomial is post-critically finite.
However our condition can be viewed as saying that, in an approximate sense, $f$ fails to be post-critically finite. (When $f(X) = aX^2 + c$ the only critical point is $\xi = 0$.) Certainly the condition that $f^i(0) \neq f^j(0)$ for $0 \le i < j \le r$ fails for the polynomials $f(X) = X^2$ and $f(X) = X^2 - 2$, with $i = 0, j = 1$ and $i = 2, j = 3$ respectively. Suppose next that $f$ is the reduction of a polynomial $F(X) = AX^2 + C \in \mathbb{Z}[X]$, with $A, C > 0$. Then the sequence $F^0(0), F^1(0), F^2(0), \ldots$ is strictly monotonic, with $F^j(0) \le (A+C)^{2^j - 1}$. Thus if $p \ge (A+C)^{2^r}$ we cannot have $p \mid F^j(0) - F^i(0)$ with $0 \le i < j \le r$. The condition of the theorem will therefore hold when

$$r \le \frac{\log\log p}{\log 2} - \frac{\log\log(A+C)}{\log 2}. \tag{3}$$

In following this paper the reader may wish to bear in mind the archetypal example $f(X) = X^2 + 1$, for which $r \le \tfrac{1}{2}\log\log p$ suffices.

Our main theorem above has the following immediate consequences.

Corollary 1 Let $\mathbb{F}_q$ be a finite field of characteristic $p \neq 2$, and let $f(X) = aX^2 + c \in \mathbb{F}_q[X]$ with $a \neq 0$. Then $f^i(0) = f^j(0)$ for some $i, j$ with

$$i < j \ll \frac{q}{\log\log q}.$$

Corollary 2 Let $\mathbb{F}_p$ be a finite field with $p > 2$ prime, and let $f(X) = aX^2 + c \in \mathbb{F}_p[X]$ be the reduction of $AX^2 + C \in \mathbb{Z}[X]$, where $A, C > 0$. Then the sum of all the cycle lengths in $\Gamma_f$ will be $O_{A,C}(p(\log\log p)^{-1})$. Similarly the length of any pre-cyclic path in $\Gamma_f$ will be $O_{A,C}(p(\log\log p)^{-1})$.

The first corollary gives an unconditional bound $o(q)$ for the first recurrence in the sequence $f^0(0), f^1(0), f^2(0), \ldots$. The second corollary proves a similar result for arbitrary initial values for the reductions of fixed positive definite quadratic polynomials $AX^2 + C$. Moreover it highlights the difference in behaviour between such polynomials and the cubic case $f(X) = X^3 + 1$, where the cycle lengths can sum to $q$.

To prove Corollary 1 we choose $r = [(\log\log q)/(\log 4)] - 1$, so that $2^{4r}\sqrt{q} \ll q/r$. Then, according to Theorem 1, we have either $f^i(0) = f^j(0)$ for some $i < j \le r$, or $\# f^r(\mathbb{F}_q) \ll q/r$.
Writing the latter bound as $\# f^r(\mathbb{F}_q) \le Cq/r$ for an appropriate constant $C$ we deduce in the latter case that if $k = [Cq/r]$ then the values $f^r(0), f^{r+1}(0), \ldots, f^{r+k}(0)$ cannot be distinct, since they all lie in $f^r(\mathbb{F}_q)$ and $k + 1 > Cq/r$. In either case there must therefore be acceptable values $i < j \le r + k$. The claim then follows.

For Corollary 2 we observe as above that the condition of the theorem holds under the assumption (3). The choice $r = [(\log\log p)/(\log 4)] - 1$ will satisfy (3) when $p \gg_{A,C} 1$, and the theorem then yields $\# f^r(\mathbb{F}_p) \ll p/r$. All cycles lie inside $f^r(\mathbb{F}_p)$, giving the first assertion of the corollary. Moreover if $f^0(m), \ldots, f^t(m)$ is a pre-cyclic path then $f^r(m), \ldots, f^t(m)$ are distinct elements in $f^r(\mathbb{F}_p)$, so that $t - r \ll p/r$. We then see that $t \ll r + p/r$, from which the second assertion follows.

We should explain the restriction to polynomials $aX^2 + c$. For an arbitrary polynomial $f$, if we define $g(X) := f(X + d) - d$, then we will have $g^j(X) = f^j(X + d) - d$. Thus $\Gamma_g$ may be obtained from $\Gamma_f$ by relabelling each vertex $m$ as $m - d$. Since the two graphs are isomorphic in this sense, it suffices to study $f(X + d) - d$ for a suitably chosen $d$. In the case in which $f(X) = aX^2 + bX + c$ (and $\mathbb{F}_q$ has odd characteristic) we can choose $d = -b/(2a)$ to produce a polynomial $g(X)$ of the shape $aX^2 + c'$. Thus we may translate our results into statements about general quadratic polynomials as follows.

Corollary 3 Let $\mathbb{F}_q$ be a finite field of characteristic $p \neq 2$, and let $f(X) = aX^2 + bX + c \in \mathbb{F}_q[X]$ with $a \neq 0$. Suppose that $f^i(-b/(2a)) \neq f^j(-b/(2a))$ for $0 \le i < j \le r$. Then

$$\# f^r(\mathbb{F}_q) = \mu_r q + O(2^{4r}\sqrt{q})$$

uniformly in $a$, $b$ and $c$, with the same $\mu_r$ as before.
In particular $f^i(-b/(2a)) = f^j(-b/(2a))$ for some $i, j$ with

$$i < j \ll \frac{q}{\log\log q}.$$

If $q$ is prime, and $f$ is the reduction of a positive definite quadratic polynomial $AX^2 + BX + C \in \mathbb{Z}[X]$, then the sum of all the cycle lengths in $\Gamma_f$ will be $O_{A,B,C}(q(\log\log q)^{-1})$. Similarly the length of any pre-cyclic path in $\Gamma_f$ will be $O_{A,B,C}(q(\log\log q)^{-1})$.
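The main asymptotic $\# f^r(\mathbb{F}_q) \approx \mu_r q$ for a general quadratic can be sanity-checked by brute force over a prime field. What follows is a rough sketch under stated assumptions rather than the method of the paper: the prime 10007, the coefficients 3, 5, 7 and the function names are all illustrative, and `pow(x, -1, p)` for the modular inverse assumes Python 3.8 or later.

```python
def mu(r):
    """mu_r from the recursion mu_0 = 1, mu_{r+1} = mu_r - mu_r**2 / 2."""
    value = 1.0
    for _ in range(r):
        value = value - 0.5 * value * value
    return value


def image_size(f, p, r):
    """Size of f^r(F_p), computed by iterating the image set r times."""
    image = set(range(p))
    for _ in range(r):
        image = {f(x) for x in image}
    return len(image)


def critical_orbit_distinct(f, xi, r):
    """True if xi, f(xi), ..., f^r(xi) are pairwise distinct."""
    orbit = [xi]
    for _ in range(r):
        orbit.append(f(orbit[-1]))
    return len(set(orbit)) == r + 1


p, a, b, c = 10007, 3, 5, 7                 # illustrative values (assumptions)
f = lambda x: (a * x * x + b * x + c) % p
xi = (-b * pow(2 * a, -1, p)) % p           # critical point -b/(2a) in F_p

for r in range(6):
    ok = critical_orbit_distinct(f, xi, r)  # hypothesis on the critical orbit
    print(r, ok, image_size(f, p, r), round(mu(r) * p))
```

For small $r$ the two printed counts should be close, with the discrepancy growing as $r$ increases, in line with the error term $O(2^{4r}\sqrt{q})$.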
In much the same way one can show that it would suffice to prove our theorem for polynomials $f(X) = X^2 + d$. One could then deduce the corresponding result for $aX^2 + d/a$ by considering iterates of $g(X) := a^{-1}f(aX)$.

Theorem 1 gives us an asymptotic formula $\# f^r(\mathbb{F}_q) \sim \mu_r q$. We proceed to give a probabilistic argument showing why one might expect this, and how the recurrence relation (2) arises. When $r = 0$ we have $\# f^0(\mathbb{F}_q) = q$, so that $\mu_0 = 1$. Suppose now that we have a relation $\# f^r(\mathbb{F}_q) \sim \mu_r q$. We will use an inductive argument to produce the corresponding result for $f^{r+1}$.

To have $m \in f^{r+1}(\mathbb{F}_q)$ it is necessary and sufficient that $m \in f(\mathbb{F}_q)$ and that $n \in f^r(\mathbb{F}_q)$ for at least one solution $n$ of $f(x) = m$. Since $\mathbb{F}_q$ contains $(q+1)/2$ squares one has $m \in f(\mathbb{F}_q)$ in exactly $(q+1)/2$ cases, and except for the value $m = f(0)$ there will then be precisely two possible values of $n$. Let these be $n_1$ and $n_2$. If the probability of these lying in $f^r(\mathbb{F}_q)$ were $\mu_r$ each, independently, one might expect that the probability of at least one being in $f^r(\mathbb{F}_q)$ should be $2\mu_r - \mu_r^2$, by the inclusion-exclusion principle. It would then follow that $m$ belongs to $f^{r+1}(\mathbb{F}_q)$ with probability around $\tfrac{1}{2}(2\mu_r - \mu_r^2)$. One would therefore produce an asymptotic expression $\# f^{r+1}(\mathbb{F}_q) \sim \mu_{r+1} q$ with $\mu_{r+1}$ as in (2).

We next explain why $\mu_r \sim 2r^{-1}$, as claimed in Theorem 1. Writing $\nu_r = 2/\mu_r$ we see that $\nu_0 = 2$ and

$$\nu_{r+1} = \nu_r + 1 + \frac{1}{\nu_r - 1}.$$

An easy induction then shows that $\nu_r \ge r + 2$ for all $r \ge 0$, whence $\nu_{r+1} \le \nu_r + 1 + 1/(r+1)$. Another induction shows that

$$\nu_r \le r + 2 + \sum_{j=1}^{r} j^{-1}, \quad (r \ge 1),$$

so that $\nu_r \le r + 3 + \log r$ for $r \ge 1$. Together with the lower bound $\nu_r \ge r + 2$ this shows that $\nu_r \sim r$ and hence $\mu_r \sim 2/r$.

Acknowledgments The author would particularly like to extend his thanks to Giacomo Micheli, for a number of interesting conversations introducing the author to the subject of polynomial iteration. Joe Silverman also provided a number of helpful comments.
Thanks are also due to Tim Browning, for elucidating a technical point in Section 3, to Maksym Radziwiłł for some preliminary computational results, and to Ben Green, Rafe Jones, Tom Tucker and Michael Zieve for some useful references.

2 A Second Moment Calculation

Fundamental to our treatment of Theorem 1 will be moments of the functions

$$\rho_r(m) = \#\{x \in \mathbb{F}_q : f^r(x) = m\}.$$

Our first task is to estimate the moments

$$N(r;k) := \sum_{m \in \mathbb{F}_q} \rho_r(m)^k$$

for $r = 0, 1, 2, \ldots$ and $k = 1, 2, \ldots$. Trivially we have $\rho_0(m) = 1$ for all $m$ so that $N(0;k) = q$ for every $k$. Moreover it is also clear that $N(r;1) = q$ for every $r$.

Before moving to the general situation it may be helpful to think first about the case $k = 2$, for which

$$N(r;2) = \#\{(x,y) \in \mathbb{F}_q^2 : f^r(x) = f^r(y)\}. \tag{4}$$

The equation $f^r(X) - f^r(Y) = 0$ defines a curve in $\mathbb{A}^2$. An absolutely irreducible curve $C$ over $\mathbb{F}_q$ will have $q + O_C(\sqrt{q})$ points, by Weil's "Riemann Hypothesis". However our curve is far from being irreducible. Indeed, since $f(U) - f(V) = a(U+V)(U-V)$,

$$f^r(X) - f^r(Y) = a\,\big(f^{r-1}(X) + f^{r-1}(Y)\big)\big(f^{r-1}(X) - f^{r-1}(Y)\big),$$

whence a trivial induction produces

$$f^r(X) - f^r(Y) = a^r (X - Y) \prod_{j=0}^{r-1} \big(f^j(X) + f^j(Y)\big). \tag{5}$$

Thus, apart from the nonzero constant $a^r$, we obtain $r + 1$ factors. However it is not immediately clear when polynomials of the form $f^j(X) + f^j(Y)$ are absolutely irreducible over $\mathbb{F}_q$.

In general, suppose that $\varphi(X,Y)$ is a polynomial of degree $D$, over a field $K$, and let $\Phi(U,V,W) = W^D \varphi(U/W, V/W)$ be the corresponding form. If $\Phi$ factors as $\Phi_1\Phi_2$ over the algebraic closure $\overline{K}$ then there will necessarily be a triple $(u,v,w) \neq (0,0,0)$ in $\overline{K}^3$ such that $\Phi_1(u,v,w) = \Phi_2(u,v,w) = 0$. For any such triple we then have $\nabla\Phi = \Phi_1\nabla\Phi_2 + \Phi_2\nabla\Phi_1 = 0$. This gives us a simple criterion for absolute irreducibility, which is sufficient, though not necessary: If $\nabla\Phi$ vanishes only at the origin in $\overline{K}^3$, then $\Phi$ must be absolutely irreducible.

We apply this criterion to $f^j(X) + f^j(Y)$.
Writing $D = 2^j$ for convenience, and

$$F^j(U,W) = W^D f^j(U/W), \tag{6}$$

we have

$$\nabla\big(F^j(U,W) + F^j(V,W)\big) = \left( W^{D-1}(f^j)'(U/W),\; W^{D-1}(f^j)'(V/W),\; \frac{\partial}{\partial W}\big(F^j(U,W) + F^j(V,W)\big) \right).$$

If $f(X) = aX^2 + c$ then $(f^j)'(X) = 2a f^{j-1}(X)(f^{j-1})'(X)$. It then follows by induction that

$$W^{D-1}(f^j)'(U/W) = (2a)^j \prod_{s=0}^{j-1} F^s(U,W).$$

In particular, if $\nabla(F^j(u,w) + F^j(v,w))$ vanishes, then there are indices $s, t \le j - 1$ for which $F^s(u,w) = F^t(v,w) = 0$. Since

$$F^s(u,0) = a^{2^s - 1}u^{2^s} \quad\text{and}\quad F^t(v,0) = a^{2^t - 1}v^{2^t}$$

we see that $w = 0$ would imply $u = v = w = 0$, which is excluded. We then see that we would have $f^s(x) = f^t(y) = 0$ for some $x, y \in \overline{K}$ such that $f^j(x) + f^j(y) = 0$. However $f^j(x) = f^{j-s}(f^s(x)) = f^{j-s}(0)$, and similarly for $f^j(y)$. It follows that if $f^j(X) + f^j(Y)$ fails to be absolutely irreducible, then $f^{j-s}(0) + f^{j-t}(0) = 0$ for some pair of non-negative integers $s, t \le j - 1$. If $s = t$ then since $\mathbb{F}_q$ has odd characteristic we have $f^{j-s}(0) = 0 = f^0(0)$ with $1 \le j - s \le j$. Otherwise $f^{j-s+1}(0) = f^{j-t+1}(0)$ with distinct positive integers $j-s+1, j-t+1 \le j+1$. Since Theorem 1 assumes that the values $f^0(0), f^1(0), \ldots, f^r(0)$ are distinct we therefore conclude that the polynomial $f^j(X) + f^j(Y)$ is irreducible over the algebraic closure $\overline{\mathbb{F}}_q$, for every $j < r$.

We are now ready to estimate $N(r;2)$. In view of (4) and (5) we have

$$N(r;2) \le q + \sum_{j=0}^{r-1} \#\{(x,y) \in \mathbb{F}_q^2 : f^j(x) + f^j(y) = 0\},$$

there being $q$ solutions to $x - y = 0$. To get a corresponding lower bound we may use the inclusion-exclusion principle to show that

$$N(r;2) \ge q + \sum_{j=0}^{r-1} \#\{(x,y) \in \mathbb{F}_q^2 : f^j(x) + f^j(y) = 0\} - \sum_{0 \le j \le r-1} A_j - \sum_{0 \le i < j \le r-1} B_{ij},$$

where $A_j$ is the number of common solutions to

$$X - Y = 0 \quad\text{and}\quad f^j(X) + f^j(Y) = 0,$$

and $B_{ij}$ is the number of common solutions to

$$f^i(X) + f^i(Y) = 0 \quad\text{and}\quad f^j(X) + f^j(Y) = 0.$$

However if $f^j(x) + f^j(y) = 0$ with $x = y$ then $f^j(x) = 0$, which has at most $2^j$ solutions. Thus $A_j \le 2^j$.
Similarly, if $(x,y)$ were to lie on two distinct curves $f^j(X) + f^j(Y) = 0$ and $f^i(X) + f^i(Y) = 0$ with $0 \le i < j \le r - 1$, then

$$f^j(y) = f^{j-i}(f^i(y)) = f^{j-i}(-f^i(x)) = f^j(x),$$

since $f^{j-i}$ is an even polynomial. We would then have $2f^j(x) = 0$ so that $x$, and similarly $y$, would be a root of $f^j$. There are therefore at most $2^j$ choices for $x$, and since $y$ then satisfies $f^i(y) = -f^i(x)$ there are at most $2^i$ choices of $y$ for each possible $x$. Thus $B_{ij} \le 2^{j+i}$. It follows that

$$\sum_{0 \le j \le r-1} A_j \le 2^r \quad\text{and}\quad \sum_{0 \le i < j \le r-1} B_{ij} \le 2^{2r}.$$

We therefore conclude that

$$N(r;2) = q + \sum_{j=0}^{r-1} \#\{(x,y) \in \mathbb{F}_q^2 : f^j(x) + f^j(y) = 0\} + O(4^r).$$

It remains to count points on the curves $f^j(X) + f^j(Y) = 0$. We have already shown that these are absolutely irreducible, and indeed nonsingular, under the assumptions of Theorem 1. If we write $N_r$ for the number of projective points on the curve, and $D = 2^j$ for its degree, then Weil's "Riemann Hypothesis" tells us that

$$|N_r - (q+1)| \le (D-1)(D-2)\sqrt{q}.$$

There are at most $D$ points at infinity, so that

$$\big|\#\{(x,y) \in \mathbb{F}_q^2 : f^j(x) + f^j(y) = 0\} - q\big| \le D^2\sqrt{q}.$$

Finally, summing for $0 \le j \le r - 1$ we find that

$$\sum_{j=0}^{r-1} \#\{(x,y) \in \mathbb{F}_q^2 : f^j(x) + f^j(y) = 0\} = rq + O(4^r\sqrt{q}).$$

We may therefore summarize the conclusions of this section as follows.

Lemma 1 Under the assumptions of Theorem 1 we have

$$N(r;2) := \sum_{m \in \mathbb{F}_q} \rho_r(m)^2 = (r+1)q + O(4^r\sqrt{q}).$$

3 Higher Moments: Irreducible Curves

We now develop the ideas of the previous section to estimate $N(r;k)$ for $k \ge 3$. Here $N(r;k)$ is the number of solutions of

$$f^r(x_1) = \ldots = f^r(x_k) \tag{7}$$

in $\mathbb{F}_q$. These equations define a curve, but, as in the previous section, it is far from being an irreducible curve. Our task in this section is to identify the absolutely irreducible components, and to show that they are all defined over $\mathbb{F}_q$.
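For small primes the moments $N(r;k)$ can be computed directly from the fibre sizes $\rho_r(m)$, which gives a quick numerical check of the second-moment estimate $(r+1)q$ and a first look at the higher moments studied next. This is a minimal sketch under assumptions: the prime 10007, the map $x \mapsto x^2 + 1$ and the helper names are illustrative choices, not taken from the paper.

```python
from collections import Counter


def rho(f, p, r):
    """Counter sending m to rho_r(m) = #{x in F_p : f^r(x) = m}."""
    values = list(range(p))            # f^0(x) = x for every x
    for _ in range(r):
        values = [f(x) for x in values]
    return Counter(values)             # only m with rho_r(m) > 0 appear


def moment(f, p, r, k):
    """N(r; k) = sum over m of rho_r(m)**k."""
    return sum(n ** k for n in rho(f, p, r).values())


p = 10007                              # illustrative prime (assumption)
f = lambda x: (x * x + 1) % p

for r in range(4):
    # Columns: r, N(r;1), N(r;2), the main term (r+1)p, and N(r;3).
    print(r, moment(f, p, r, 1), moment(f, p, r, 2), (r + 1) * p,
          moment(f, p, r, 3))
```

The first moment always equals $p$, and for $r = 1$ the second moment is exactly $2p - 1$, since $x^2 = y^2$ has the solutions $y = \pm x$.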
In view of (5), for any solution of (7) and any pair of distinct indices $1 \le i, j \le k$, there is a corresponding

$$d = d(i,j) = d(j,i) \in \{-1, 0, 1, \ldots, r-1\}$$

such that $\varphi(x_i, x_j; d) = 0$, where

$$\varphi(X,Y;d) = \begin{cases} f^d(X) + f^d(Y), & d \ge 0, \\ X - Y, & d = -1. \end{cases}$$

If there is more than one choice for $d(i,j)$ we choose the smallest. We now make the following definition.

Definition 1 A "$(D,k)$-graph" is a weighted graph on $k$ vertices, for which any edge $ij$ has integral weight in the range $[-1, D]$. If some edge has weight equal to $D$ we say that we have a "strict $(D,k)$-graph". If there is an edge between every pair of vertices we say we have a "complete $(D,k)$-graph".

Thus each solution of (7) produces a complete $(D,k)$-weighted graph. We now introduce the following further definition.

Definition 2 Let $G$ be a complete $(D,k)$-graph. Then we say $G$ is "proper" if, whenever $a, b, c$ are distinct vertices, with $d(a,b) \le d(a,c) \le d(b,c)$, then either $d(a,b) = d(a,c) = d(b,c) = -1$ or $d(a,b) < d(a,c) = d(b,c)$.

We then have the following lemma.

Lemma 2 The graph associated to a solution of (7) is proper.

To prove the claim, observe firstly that if $d(a,b) = d(a,c) = -1$, then $x_a = x_b$ and $x_a = x_c$, whence $x_b = x_c$, so that $d(b,c) = -1$. Next we show that one cannot have $d(a,b) = d(a,c) \ge 0$. Writing $d = d(a,b) = d(a,c)$ this would imply that $f^d(x_a) = -f^d(x_b)$ and $f^d(x_a) = -f^d(x_c)$, whence

$$f^d(x_b) - f^d(x_c) = 0.$$
