Diffusion approximations via Stein’s method and time changes

Mikołaj J. Kasprzak
Department of Statistics, University of Oxford, 24-29 St Giles’, Oxford, OX1 3LB, United Kingdom
e-mail: [email protected]

arXiv:1701.07633v1 [math.PR] 26 Jan 2017

Abstract: We extend the ideas of [Bar90] and use Stein’s method to obtain a bound on the distance between a scaled time-changed random walk and a time-changed Brownian Motion. We then apply this result to bound the distance between a time-changed compensated scaled Poisson process and a time-changed Brownian Motion. This allows us to bound the distance between the Moran model with mutation and Wright-Fisher diffusion with mutation upon noting that the former may be expressed as a difference of two time-changed Poisson processes and the diffusion part of the latter may be expressed as a time-changed Brownian Motion. The method is applicable to a much wider class of examples satisfying the Stroock-Varadhan theory of diffusion approximation ([SV79]).

MSC 2010 subject classifications: Primary 60B10, 60F17; secondary 60J70, 60J65, 60E05, 60E15.
Keywords and phrases: Stein’s method, functional convergence, time-changed Brownian Motion, Moran model, Wright-Fisher diffusion.

1. Introduction

In his seminal paper [Ste72], Charles Stein introduced a method for proving normal approximations and obtained a bound on the speed of convergence to the standard normal distribution. He observed that a random variable $Z$ has standard normal law if and only if $\mathbb{E} Z f(Z) = \mathbb{E} f'(Z)$ for all smooth functions $f$. Therefore, if, for a random variable $W$ with mean 0 and variance 1, $\mathbb{E} f'(W) - \mathbb{E} W f(W)$ is close to zero for a large class of functions $f$, then the law of $W$ should be approximately Gaussian. He then proposed that, instead of evaluating $|\mathbb{E} h(W) - \mathbb{E} h(Z)|$ directly for a given function $h$, one can first find an $f = f_h$ solving the following Stein equation:

$$f'(w) - w f(w) = h(w) - \mathbb{E} h(Z)$$

and then find a bound on $|\mathbb{E} f'(W) - \mathbb{E} W f(W)|$. This approach often turns out to be much easier, due to some bounds on the solutions $f_h$, which can be derived in terms of the derivatives of $h$. Since then, Stein’s method has been significantly developed and extended to approximations by distributions other than normal.

The aim of Stein’s method is to find a bound of the quantity $|\mathbb{E}_{\nu_n} h - \mathbb{E}_{\mu} h|$, where $\mu$ is the target (known) distribution, $\nu_n$ is the approximating law and $h$ is chosen from a suitable class of real-valued test functions $\mathcal{H}$. The idea is to find an operator $\mathcal{A}$ acting on a class of real-valued functions such that

$$\left( \forall f \in \mathrm{Domain}(\mathcal{A}) \quad \mathbb{E}_{\nu} \mathcal{A} f = 0 \right) \iff \nu = \mu,$$

where $\mu$ is our target distribution. In the next step, for a given function $h \in \mathcal{H}$, a solution $f = f_h$ to the following Stein equation:

$$\mathcal{A} f = h - \mathbb{E}_{\mu} h$$

is sought and its properties studied. Finally, using various mathematical tools (among which the most popular are Taylor’s expansions in the continuous case, Malliavin calculus and coupling methods), a bound is sought for the quantity $|\mathbb{E}_{\nu_n} \mathcal{A} f_h|$.

An accessible account of the method can be found, for example, in the surveys [LRS16] and [Ros11] as well as the books [BHJ92] and [CGS11], which treat the cases of Poisson and normal approximation, respectively, in detail. [Swa16] is a database of information and publications connected to Stein’s method.

Approximations by infinite-dimensional laws have not been covered in the Stein’s method literature very widely, with the notable exceptions of [Bar90], [BJ09] and recently [CD13]. We will focus on the ideas taken from [Bar90], which provides bounds on the Brownian Motion approximation of a one-dimensional scaled random walk and some other one-dimensional processes including scaled sums of locally dependent random variables and examples from combinatorics. In the sequel, we show that the approach presented in [Bar90] can be extended to time-changes of Brownian Motion, including diffusions in the natural scale.
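
Stein’s characterization of the standard normal law recalled in the Introduction can be checked numerically. The sketch below is only an illustration under stated assumptions: the test function $f = \sin$ and the helper name `gaussian_expectation` are hypothetical choices, not taken from the paper, and Gaussian expectations are approximated by a plain midpoint rule. It evaluates $\mathbb{E} f'(Z) - \mathbb{E} Z f(Z)$ for a standard normal $Z$ and the same quantity for a Rademacher variable $W$, which has mean 0 and variance 1 but is not Gaussian.

```python
import math

def gaussian_expectation(h, half_width=8.0, steps=4000):
    """Approximate E h(Z) for Z ~ N(0,1) with the midpoint rule on [-8, 8]."""
    dz = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps):
        z = -half_width + (j + 0.5) * dz
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += h(z) * density * dz
    return total

# Hypothetical smooth test function and its derivative (an assumption for illustration).
f, f_prime = math.sin, math.cos

# Standard normal Z: E f'(Z) - E Z f(Z) should vanish up to quadrature error.
stein_gap_normal = gaussian_expectation(f_prime) - gaussian_expectation(lambda z: z * f(z))

# Rademacher W takes the values +1 and -1 with probability 1/2 each.
stein_gap_rademacher = (0.5 * (f_prime(1.0) + f_prime(-1.0))
                        - 0.5 * (1.0 * f(1.0) + (-1.0) * f(-1.0)))

print(stein_gap_normal, stein_gap_rademacher)
```

For the Rademacher variable the gap equals $\cos 1 - \sin 1$, which is bounded away from zero, in line with the idea that a small gap over a large class of $f$ indicates closeness to the Gaussian law.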
The most important example we apply our theory to is the approximation of the Moran model with mutation by the Wright-Fisher diffusion with mutation. The former, first introduced in [Mor58] as an alternative for the Wright-Fisher model (first formally described in [Wri31]), is one of the simplest and most important models of genetic drift, i.e. the change in the frequencies of alleles in a population. It assumes that the population is divided into two allelic types ($A$ and $a$) and the frequency of each of the alleles is governed by a birth-death process and an independent mutation process. Specifically, in a population of size $n$, at exponential rate $\binom{n}{2}$, a pair of genes is sampled uniformly at random. Then one of them is selected at random to die and the other one gives birth to another gene of the same allelic type. In addition, every gene of type $A$ changes its type independently at rate $\nu_2$ and every gene of type $a$ changes its type independently at rate $\nu_1$, in agreement with the setup of Theorem 3.4. The model then looks at the proportion of $a$-genes in the population.

On the other hand, the Wright-Fisher model is a discrete Markov chain and does not allow for overlapping generations. Specifically, each step represents a generation. In generation $k$ each of the $n$ individuals chooses its parent independently, uniformly at random from the individuals present in generation $k-1$ and inherits its genetic type. This model also then looks at the proportion of $a$-genes in the population.

The Moran model turns out to be easier to study mathematically. It may be proved, for instance using the Stroock-Varadhan theory of diffusion approximation (see [SV79]), that it converges weakly to the Wright-Fisher diffusion, which is also a scaling limit of the Wright-Fisher model. We show how to put a bound on the speed of this convergence.
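
The Moran dynamics described above can be sketched as a Gillespie-type simulation. This is a minimal illustration, not the construction used in the proofs. It assumes, as in the setup of Theorem 3.4, that each type-$A$ gene mutates to type $a$ at rate $\nu_2$ and each type-$a$ gene mutates to type $A$ at rate $\nu_1$. With $k$ genes of type $a$ among $n$, a uniformly sampled pair is mixed with probability $k(n-k)/\binom{n}{2}$, so resampling moves $k$ up or down at rate $k(n-k)/2$ each. The function name `simulate_moran` and its parameters are hypothetical.

```python
import random

def simulate_moran(n, k0, nu1, nu2, t_end, seed=0):
    """Simulate the proportion of type-a genes; returns a list of (time, proportion)."""
    rng = random.Random(seed)
    t, k = 0.0, k0
    path = [(0.0, k / n)]
    while True:
        # Up-jumps: resampling in favour of a, plus mutation A -> a (assumed rate nu2).
        up = k * (n - k) / 2.0 + nu2 * (n - k)
        # Down-jumps: resampling in favour of A, plus mutation a -> A (assumed rate nu1).
        down = k * (n - k) / 2.0 + nu1 * k
        total = up + down
        if total == 0.0:
            break  # absorbed: no resampling or mutation event can occur
        t += rng.expovariate(total)
        if t > t_end:
            break
        k += 1 if rng.random() < up / total else -1
        path.append((t, k / n))
    return path

path = simulate_moran(n=50, k0=25, nu1=0.5, nu2=1.0, t_end=1.0, seed=1)
print(len(path), path[-1])
```

Counting the up-jumps and the down-jumps separately, as the two rates above do, mirrors the representation of the Moran model as a difference of two counting processes mentioned in the abstract.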
The Wright-Fisher diffusion is often used in practice in genetics for inference concerning large populations. It is given by the equation:

$$dM(t) = \gamma(M(t))\,dt + \sqrt{M(t)(1 - M(t))}\,dB_t,$$

where $\gamma : [0,1] \to \mathbb{R}$ encompasses mutation. For a discussion of probabilistic models in genetics see [Eth12].

In Section 2 we introduce the space of test functions we will find the bounds for. In Section 3 we present our main results. Theorem 3.1 shows how the approach in [Bar90] can be extended to the approximation of a scaled, time-changed random walk by a time-changed Brownian Motion. In Theorem 3.2 we apply Theorem 3.1 to look at the distance between a time-changed Poisson process and a time-changed Brownian Motion. Theorem 3.4 shows how this can be extended to find the speed of convergence of the Moran model with mutation to the Wright-Fisher diffusion with mutation. The bound obtained therein makes it possible to analyse the impact the mutation rates and the number of individuals have on the quality of the approximation and the interplay between those parameters. Section 5.2 provides proofs of the results presented in Section 2 and comments on how the proof of Theorem 3.4 may be adapted to find the speed of convergence in other examples satisfying the Stroock-Varadhan theory of diffusion approximation (see [SV79]).

In what follows, $\|\cdot\|$ will always denote the sup norm and $D = D[0,1] = D([0,1], \mathbb{R})$ will be the Skorokhod space of càdlàg real-valued functions on $[0,1]$.

2. Space M

Let us define:

$$\|f\|_L := \sup_{w \in D[0,1]} \frac{|f(w)|}{1 + \|w\|^3},$$

and let $L$ be the Banach space of the continuous functions $f : D[0,1] \to \mathbb{R}$ such that $\|f\|_L < \infty$. We now let $M \subset L$ consist of the twice Fréchet differentiable functions $f$, such that:

$$\|D^2 f(w+h) - D^2 f(w)\| \leq k_f \|h\|, \tag{2.1}$$

for some constant $k_f$, uniformly in $w, h \in D[0,1]$. By $D^k f$ we mean the $k$-th Fréchet derivative of $f$ and the norm of a $k$-linear form $B$ on $L$ is defined to be $\|B\| = \sup_{\{h : \|h\| = 1\}} |B[h, \ldots, h]|$. Note the following:

Lemma 2.1.
For every $f \in M$, $\|f\|_M < \infty$, where:

$$\|f\|_M := \sup_{w \in D[0,1]} \frac{|f(w)|}{1+\|w\|^3} + \sup_{w \in D[0,1]} \frac{\|Df(w)\|}{1+\|w\|^2} + \sup_{w \in D[0,1]} \frac{\|D^2 f(w)\|}{1+\|w\|} + \sup_{w,h \in D[0,1]} \frac{\|D^2 f(w+h) - D^2 f(w)\|}{\|h\|}.$$

Proof. Note that for $f \in M$ it is possible to find a constant $K_f$ satisfying:

A)
$$\begin{aligned}
\|Df(w)\| &\leq \|Df(w) - Df(0)\| + \|Df(0)\| \\
&\overset{\text{MVT}}{\leq} \|w\| \sup_{\theta \in [0,1]} \|D^2 f(\theta w)\| + \|Df(0)\| \\
&\leq \|w\| \left[ \sup_{\theta \in [0,1]} \left( \|D^2 f(\theta w) - D^2 f(0)\| + \|D^2 f(0)\| \right) \right] + \|Df(0)\| \\
&\overset{(2.1)}{\leq} \|w\| \left[ k_f \sup_{\theta \in [0,1]} \|\theta w\| + \|D^2 f(0)\| \right] + \|Df(0)\| \\
&\leq k_f \|w\|^2 + \|D^2 f(0)\| \left(1 \vee \|w\|^2\right) + \|Df(0)\| < K_f \left(1 + \|w\|^2\right);
\end{aligned}$$

B)
$$\|D^2 f(w)\| \leq \|D^2 f(w) - D^2 f(0)\| + \|D^2 f(0)\| \overset{(2.1)}{\leq} k_f \|w\| + \|D^2 f(0)\| < K_f (1 + \|w\|);$$

C)
$$\left| f(w+h) - f(w) - Df(w)[h] - \frac{1}{2} D^2 f(w)[h,h] \right| \leq K_f \|h\|^3, \tag{2.2}$$

uniformly in $w, h \in D$, where the last inequality follows by Taylor’s theorem and (2.1). Therefore:

$$\|f\|_M = \sup_{w \in D} \frac{|f(w)|}{1+\|w\|^3} + \sup_{w \in D} \frac{\|Df(w)\|}{1+\|w\|^2} + \sup_{w \in D} \frac{\|D^2 f(w)\|}{1+\|w\|} + \sup_{w,h \in D} \frac{\|D^2 f(w+h) - D^2 f(w)\|}{\|h\|} < \infty.$$

We now let $M^0 \subset M$ be the class of functionals $g \in M$ such that:

$$\|g\|_{M^0} := \|g\|_M + \sup_{w \in D} |g(w)| + \sup_{w \in D} \|Dg(w)\| + \sup_{w \in D} \|D^2 g(w)\| < \infty.$$

The following is Proposition 3.1 of [BJ09]:

Proposition 2.2. Suppose that, for each $n \geq 1$, the random element $Y_n$ of $D[0,1]$ is piecewise constant with intervals of constancy of length at least $r_n$. Let $(Z_n)_{n \geq 1}$ be random elements of $D[0,1]$ converging weakly in $D[0,1]$, with respect to the Skorokhod topology, to a random element $Z \in C([0,1], \mathbb{R})$. If:

$$|\mathbb{E} g(Y_n) - \mathbb{E} g(Z_n)| \leq C \tau_n \|g\|_{M^0} \tag{2.3}$$

for each $g \in M^0$ and if $\tau_n \log^2(1/r_n) \xrightarrow{n \to \infty} 0$, then $Y_n \Rightarrow Z$ in $D[0,1]$ (weakly in the Skorokhod topology).

A similar result holds when $Y_n$ is a continuous-time Markov chain:

Proposition 2.3.
Suppose that, for each $n \geq 1$, the random element $Y_n$ of $D[0,1]$ is a continuous-time Markov chain with mean holding time $\frac{1}{\lambda_n} \to 0$, identically distributed for each state. Let $(Z_n)_{n \geq 1}$ be random elements of $D[0,1]$ converging weakly in $D[0,1]$, with respect to the Skorokhod topology, to a random element $Z \in C([0,1], \mathbb{R})$. Suppose further that:

$$|\mathbb{E} g(Y_n) - \mathbb{E} g(Z_n)| \leq C \tau_n \|g\|_{M^0} \tag{2.4}$$

for each $g \in M^0$ and that $\tau_n \log^2\left((\lambda_n)^3\right) \xrightarrow{n \to \infty} 0$. Then $Y_n \Rightarrow Z$ in $D[0,1]$ (weakly in the Skorokhod topology).

We provide a proof of Proposition 2.3 in the Appendix.

3. Main results

Theorem 3.1 below is an extension of Theorem 1 in [Bar90] to the case of a time-changed scaled random walk:

Theorem 3.1. Let $X_1, X_2, \ldots$ be i.i.d. with mean 0, variance 1 and finite third moment. Let $s : [0,1] \to [0,\infty)$ be a strictly increasing, continuous function with $s(0) = 0$. Define:

$$Y_n(t) = n^{-1/2} \sum_{i=1}^{\lfloor n s(t) \rfloor} X_i, \quad t \in [0,1]$$

and let $(Z(t), t \in [0,1]) = (B(s(t)), t \in [0,1])$, where $B$ is a standard Brownian Motion. Suppose that $g \in M$. Then:

$$\begin{aligned}
|\mathbb{E} g(Y_n) - \mathbb{E} g(Z)| \leq{}& \|g\|_M \frac{30 + 54 \cdot 5^{1/3} s(1)}{\sqrt{\pi \log 2}}\, n^{-1/2} \sqrt{\log(2 s(1) n)} \\
&+ \|g\|_M\, s(1) \left( 1 + \left(\frac{3}{2}\right)^3 \sqrt{\frac{2}{\pi}}\, s(1)^{3/2} \right) \mathbb{E}|X_1|^3\, n^{-1/2} \\
&+ \|g\|_M \frac{2160}{\sqrt{\pi} (\log 2)^{3/2}}\, n^{-3/2} \left(\log(2 s(1) n)\right)^{3/2}.
\end{aligned}$$

In Theorem 3.1 we do not claim that our bounds are sharp. Our bound in Theorem 3.1 is of the same order as the one obtained in the original case in [Bar90]. This result can also be extended in a straightforward way to instances in which the time change is random and independent of the step sizes of the random walk. We can obtain this by conditioning on the time change.

Theorem 3.2 below treats a time-changed Poisson process and can also be extended to random time changes, independent of the Poisson process of interest, by conditioning.

Theorem 3.2. Suppose that $P$ is a Poisson process with rate 1 and $S^{(n)} : [0,1] \to [0,\infty)$ is a sequence of increasing continuous functions, such that $S^{(n)}(0) = 0$. Let $S : [0,1] \to [0,\infty)$ be also increasing and continuous. Let $Z(t) = B(S(t))$, $t \in [0,1]$, where $B$ is a standard Brownian Motion and

$$\tilde{Y}_n(t) = \frac{P\left(n S^{(n)}(t)\right) - n S^{(n)}(t)}{\sqrt{n}}, \quad t \in [0,1].$$

Then, for all $g \in M$:

$$\begin{aligned}
|\mathbb{E} g(\tilde{Y}_n) - \mathbb{E} g(Z)| \leq \|g\|_M \Bigg\{ & \left( 2 + \frac{27\sqrt{2}}{2\sqrt{\pi}} S(1) \right) \sqrt{\|S - S^{(n)}\|} + \frac{27\sqrt{2}}{2\sqrt{\pi}} \|S - S^{(n)}\|^{3/2} \\
&+ n^{-1/2} \Bigg[ \frac{30 + 54 \cdot 5^{1/3} S(1)}{\sqrt{\pi \log 2}} \sqrt{\log(2 S(1) n)} \\
&\qquad + \left( 1 + \left(\frac{3}{2}\right)^3 \sqrt{\frac{2}{\pi}}\, S^{(n)}(1)^{3/2} \right) S^{(n)}(1)\left(1 + 2e^{-1}\right) + 1 + \frac{\log(1 + 2e^{-1}) + 2\log n}{\log\log(n+2)} \Bigg] \\
&+ n^{-1}\, \frac{9\sqrt{S^{(n)}(1)}}{2} \left( 1 + 3 n S^{(n)}(1) \right)^{1/2} \left[ 4 + \frac{16701 + 128 (\log n)^3}{(\log\log(n+3))^3} \right]^{1/3} \\
&+ \frac{2160}{\sqrt{\pi} (\log 2)^{3/2}}\, n^{-3/2} \left[ \left(\log(2 S(1) n)\right)^{3/2} + 8 + \frac{33402 + 256 (\log n)^3}{(\log\log(n+3))^3} \right] \Bigg\}.
\end{aligned}$$

Remark 3.3. The bound in Theorem 3.2 goes to 0 as long as the time changes $S^{(n)} \to S$ uniformly.

Theorem 3.4 below gives a bound on the speed of convergence of the Moran model with mutation to the Wright-Fisher diffusion with mutation. In the former, in a population of size $n$, each individual carries a particular gene of one of the two forms: $A$ and $a$. Each individual has exactly one parent and offspring inherit the genetic type of their parent. Now, at exponential rate $\binom{n}{2}$ a pair of genes is sampled uniformly at random from the population. One of the pair is selected at random to die and the other one splits in two. In addition, every individual of type $A$ changes its type independently at rate $\nu_2$ and every individual of type $a$ changes its type independently at rate $\nu_1$.

Theorem 3.4. Let $M_n(t)$ be the proportion of type $a$ genes in the population at time $t \in [0,1]$ under the Moran model with mutation rates $\nu_1, \nu_2$, as described above. Let $(M(t), t \in [0,1])$ denote the Wright-Fisher diffusion given by:

$$dM(t) = \left(\nu_2 - (\nu_1 + \nu_2) M(t)\right) dt + \sqrt{M(t)(1 - M(t))}\, dB_t.$$

Then, for any $g \in M$:

$$\begin{aligned}
|\mathbb{E} g(M_n) - \mathbb{E} g(M)| \leq{}& \|g\|_M \Big\{ \left( 18 + \nu_1^{1/2} + 47\nu_1^{3/4} + 31\nu_1^{3/2} + \nu_2 + 3\nu_2^2 + 9\nu_2^3 \right) \left( 1.02 \cdot 10^6 + 425\nu_2^{1/2} + 623\nu_2 + 39\nu_2^{3/2} + 7\nu_2^{5/2} \right) \\
&\quad + \left( 12 + 3\nu_2 + 3\nu_2^2 + 9\nu_2^3 \right) \left( 1.02 \cdot 10^6 + 425\nu_1^{1/2} + 623\nu_1 + 39\nu_1^{3/2} + 7\nu_1^{5/2} \right) \\
&\quad + 7 \left( \tfrac{1}{2} (1 + 2\nu_2)(\nu_1 + \nu_2) + 31 (\nu_1 + \nu_2)^3 \right) \Big\}\, n^{-1/4} \\
&+ \|g\|_M\, 2112 \Big[ \left( 18 + \nu_1^{1/2} + 47\nu_1^{3/4} + 31\nu_1^{3/2} + \nu_2 + 3\nu_2^2 + 9\nu_2^3 \right) \left( \log\left( n^2/4 + \nu_2 n \right) \right)^{3/2} \\
&\quad + \left( 12 + 3\nu_2 + 3\nu_2^2 + 9\nu_2^3 \right) \left( \log\left( n^2/4 + \nu_1 n \right) \right)^{3/2} \Big]\, n^{-3}.
\end{aligned}$$

Remark 3.5. If $\nu_1 \geq 1$ and $\nu_2 \geq 1$ then we can write:

$$\begin{aligned}
|\mathbb{E} g(M_n) - \mathbb{E} g(M)| \leq{}& \|g\|_M \Big[ \left(18 + 79\nu_1^{3/2} + 13\nu_2^3\right)\left(1.02 \cdot 10^6 + 1094\nu_2^{5/2}\right) \\
&\quad + \left(12 + 15\nu_2^3\right)\left(1.02 \cdot 10^6 + 1094\nu_1^{5/2}\right) \\
&\quad + 7 \left( 31.5\nu_1^3 + 32.5\nu_2^3 + \nu_1\nu_2 (1 + 93\nu_1 + 93\nu_2) \right) \Big]\, n^{-1/4} \\
&+ \|g\|_M\, 2112 \left(18 + 79\nu_1^{3/2} + 13\nu_2^3\right) n^{-3} \left( \log\left( n^2/4 + \nu_2 n \right) \right)^{3/2} \\
&+ \|g\|_M\, 2112 \left(12 + 15\nu_2^3\right) n^{-3} \left( \log\left( n^2/4 + \nu_1 n \right) \right)^{3/2}.
\end{aligned}$$

The approximation gets worse as the mutation rates increase and the number of individuals decreases. Should we want to make the mutation rates depend on $n$ and be of the same order, we will require them to be $o\left(n^{1/22}\right)$ in order for the bound to converge to 0 as $n \to \infty$.

In the proof of Theorem 3.4 we will use the fact that the diffusion part of the Wright-Fisher diffusion with mutation, just like any one-dimensional continuous local martingale, can be expressed as a time-changed Brownian Motion. We will also use the idea of [Kur12] and write the Moran model as a difference of two time-changed Poisson processes, counting the up-jumps and the down-jumps of the chain. This will let us appeal to Theorem 3.2 to obtain the bounds. The time changes we apply in this case are random and therefore we will first condition on them.

Another key idea used in the proof will be the Donnelly-Kurtz look-down construction coming from [DK96] and described in Chapter 2.10 of [Eth12]. The $n$-particle look-down process is denoted by a vector $(\psi_1(t), \ldots, \psi_n(t))$ with each index representing a "level" and each of the $\psi_i$’s representing the type of the individual at level $i$ at time $t$.
The individual at level $k$ is equipped with an exponential clock with rate $k-1$, independent of other individuals, and at the times the clock rings it "looks down" at a level chosen uniformly at random from $\{1, \ldots, k-1\}$ and adopts the type of the individual at that level. In addition, the type of each individual evolves according to the mutation process. A comparison of the generators of the Moran model and the look-down process shows that, as long as the two are started from the same initial exchangeable condition, they produce the same distribution of types in the population. In addition, it may be shown that the Wright-Fisher diffusion may be represented as the proportion of type $a$ individuals in the population, in which the types are distributed according to the infinite-particle look-down process. The corresponding Moran model is then the proportion of type $a$ individuals among the ones located on the first $n$ levels. Due to exchangeability, we may then describe the Moran model $M_n$ at a fixed time, as depending on the Wright-Fisher diffusion $M$ in the following way:

$$n M_n(t) \sim \mathrm{Binomial}(n, M(t)).$$

Remark 3.6. Our bound in Theorem 3.4 is sufficient to conclude that the Moran model converges weakly in the uniform topology to the Wright-Fisher diffusion on compact intervals. This follows from Proposition 2.3. Using the notation therein, in this case, $\tau_n = n^{-1/4}$ and $\lambda_n = \frac{n(n-1)}{2}$.

4. Setting up Stein’s method

Let us first define:

$$A_n(t) = n^{-1/2} \sum_{i=1}^{\lfloor n s(1) \rfloor} Z_i \mathbb{1}_{[i/n,\, s(1)]}(s(t)) = n^{-1/2} \sum_{i=1}^{\lfloor n s(1) \rfloor} Z_i \mathbb{1}_{[s^{-1}(i/n),\, 1]}(t), \tag{4.1}$$

for $Z_i \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0,1)$. In preparation for the proof of Theorem 3.1, we will apply Stein’s method to find the distance between $A_n$ and $Y_n$.

4.1. The Stein equation

We first note that if $U_1, U_2, \ldots$ are i.i.d. Ornstein-Uhlenbeck processes with stationary law $\mathcal{N}(0,1)$, then, defining:

$$W_n(t,u) = n^{-1/2} \sum_{i=1}^{\lfloor n s(t) \rfloor} U_i(u), \quad u \geq 0,\ t \in [0,1],$$

we obtain that the law of $A_n$ is stationary for $(W_n(\cdot,u))_{u \geq 0}$. Denote its generator by $\mathcal{A}_n$. By properties of stationary distributions, $\mathbb{E}_\mu \mathcal{A}_n f = 0$ for all $f \in \mathrm{Domain}(\mathcal{A}_n)$ if and only if $\mu = \mathcal{L}(A_n)$. Therefore, we can treat

$$\mathcal{A}_n f = g - \mathbb{E} g(A_n) \tag{4.2}$$

as our Stein equation. In the next subsection, for any $g$ from a suitable class of functions, we will find an $f$ satisfying equation (4.2). Then, in the sequel, we will find a bound on $|\mathbb{E} \mathcal{A}_n f(Y_n)|$, which will readily give us a bound on $|\mathbb{E} g(Y_n) - \mathbb{E} g(A_n)|$.

Proposition 4.1. The generator $\mathcal{A}_n$ of the process $(W_n(\cdot,u))_{u \geq 0}$ acts on any $f \in M$ in the following way:

$$(\mathcal{A}_n f)(w) := -Df(w)[w] + \mathbb{E} D^2 f(w)\left[A_n^{(2)}\right].$$

Before we prove this result, we need a lemma:

Lemma 4.2. Letting $\mathcal{F}_{n,u} = \sigma(W_n(\cdot,v), v \leq u)$, we have:

$$W_n(\cdot, u+v) - e^{-v} W_n(\cdot, u) \overset{D}{=} \sigma(v) A_n(\cdot).$$

Proof. We first note that for each $i \geq 1$ we can construct independent standard Brownian Motions $B_i$ such that $(U_i(u), u \geq 0) = (e^{-u} B_i(e^{2u}), u \geq 0)$. Then:

$$\begin{aligned}
W_n(\cdot, u+v) - e^{-v} W_n(\cdot, u) &= n^{-1/2} \sum_{k=1}^{\lfloor n s(\cdot) \rfloor} U_k(u+v) - e^{-v} n^{-1/2} \sum_{k=1}^{\lfloor n s(\cdot) \rfloor} U_k(u) \\
&\overset{D}{=} n^{-1/2} e^{-(u+v)} \sum_{k=1}^{\lfloor n s(\cdot) \rfloor} \left[ B_k\left(e^{2(u+v)}\right) - B_k\left(e^{2u}\right) \right] \\
&\overset{D}{=} n^{-1/2} \sigma(v) \sum_{k=1}^{\lfloor n s(\cdot) \rfloor} Z_k = \sigma(v) A_n(\cdot).
\end{aligned}$$

The last equality in distribution holds because each increment $e^{-(u+v)}\left[B_k\left(e^{2(u+v)}\right) - B_k\left(e^{2u}\right)\right]$ is centred Gaussian with variance $1 - e^{-2v}$, so that $\sigma^2(v) = 1 - e^{-2v}$.

Proof of Proposition 4.1. Note that the semigroup of $(W_n(\cdot,u))_{u \geq 0}$, acting on $L$, is defined by:

$$(T_{n,u} f)(w) := \mathbb{E}\left[ f(W_n(\cdot,u)) \mid W_n(\cdot,0) = w \right]$$

and by Lemma 4.2 we readily obtain that:

$$(T_{n,u} f)(w) = \mathbb{E}\left[ f\left(w e^{-u} + \sigma(u) A_n(\cdot)\right) \right] \tag{4.3}$$

and we can define the generator by $\mathcal{A}_n = \lim_{u \searrow 0} \frac{T_{n,u} - I}{u}$. Also, we have that for $f \in M$:

$$\begin{aligned}
&\left| (T_{n,u} f)(w) - f(w) - \mathbb{E} Df(w)\left[\sigma(u) A_n - w(1 - e^{-u})\right] - \frac{1}{2} \mathbb{E} D^2 f(w)\left[ \left\{\sigma(u) A_n - w(1 - e^{-u})\right\}^{(2)} \right] \right| \\
&\quad \overset{(2.2)}{\leq} K_f\, \mathbb{E} \left\| \sigma(u) A_n - w(1 - e^{-u}) \right\|^3 \\
&\quad \leq 4 K_f \left[ \sigma^3(u)\, \mathbb{E}\|A_n\|^3 + (1 - e^{-u})^3 \|w\|^3 \right] \leq K_3 \left(1 + \|w\|^3\right) u^{3/2}
\end{aligned}$$

for a constant $K_3$ depending only on $f$, where the last inequality follows from the fact that for $u \geq 0$, $\sigma^3(u) \leq 3u^{3/2}$ and $(1 - e^{-u})^3 \leq u^{3/2}$. So:

$$\begin{aligned}
&\left| (T_{n,u} f - f)(w) + u\, Df(w)[w] - u\, \mathbb{E} D^2 f(w)\left[A_n^{(2)}\right] \right| \\
&\quad \leq \left| (T_{n,u} f)(w) - f(w) - \mathbb{E} Df(w)\left[\sigma(u) A_n - w(1 - e^{-u})\right] - \frac{1}{2} \mathbb{E} D^2 f(w)\left[ \left\{\sigma(u) A_n - w(1 - e^{-u})\right\}^{(2)} \right] \right| \\
&\qquad + \left| \sigma(u)\, \mathbb{E} Df(w)[A_n] \right| + \left| (u - 1 + e^{-u})\, Df(w)[w] \right| + \left| \left( \frac{\sigma^2(u)}{2} - u \right) \mathbb{E} D^2 f(w)\left[A_n^{(2)}\right] \right| \\
&\qquad + \left| \frac{(1 - e^{-u})^2}{2} D^2 f(w)\left[w^{(2)}\right] \right| + \left| \sigma(u)(1 - e^{-u})\, \mathbb{E} D^2 f(w)[A_n, w] \right| \\
&\quad \leq K_3 \left(1 + \|w\|^3\right) u^{3/2} + \left| \sigma(u)\, \mathbb{E} Df(w)[A_n] \right| + \left| u - 1 + e^{-u} \right| \|Df(w)\| \|w\| \\
&\qquad + \left| \frac{\sigma^2(u)}{2} - u \right| \|D^2 f(w)\|\, \mathbb{E}\|A_n\|^2 + \frac{(1 - e^{-u})^2}{2} \|D^2 f(w)\| \|w\|^2 + \sigma(u)(1 - e^{-u}) \|D^2 f(w)\| \|w\|\, \mathbb{E}\|A_n\| \\
&\quad \overset{(2.2)}{\leq} K_3 \left(1 + \|w\|^3\right) u^{3/2} + \left| \sigma(u)\, \mathbb{E} Df(w)[A_n] \right| + \left| u - 1 + e^{-u} \right| K_f \left(1 + \|w\|^2\right) \|w\| \\
&\qquad + \left| \frac{\sigma^2(u)}{2} - u \right| K_f (1 + \|w\|)\, \mathbb{E}\|A_n\|^2 + \frac{(1 - e^{-u})^2}{2} K_f (1 + \|w\|) \|w\|^2 \\
&\qquad + \sigma(u)(1 - e^{-u}) K_f (1 + \|w\|) \|w\|\, \mathbb{E}\|A_n\| \\
&\quad \leq 3u^{3/2} \Big( K_3 \left(1 + \|w\|^3\right) + K_f \left(1 + \|w\|^2\right) \|w\| + K_f (1 + \|w\|)\, \mathbb{E}\|A_n\|^2 \\
&\qquad + K_f (1 + \|w\|) \|w\|^2 + K_f (1 + \|w\|) \|w\|\, \mathbb{E}\|A_n\| \Big) + \left| \sigma(u)\, \mathbb{E} Df(w)[A_n] \right| \\
&\quad \leq K_4 \left(1 + \|w\|^3\right) u^{3/2},
\end{aligned} \tag{4.4}$$

for some constant $K_4$ depending only on $f$. The last inequality follows from the fact that:

$$\mathbb{E} Df(w)[A_n] = \mathbb{E} Df(w)\left[ n^{-1/2} \sum_{k=1}^{\lfloor n s(1) \rfloor} Z_k \mathbb{1}_{[s^{-1}(k/n),\, 1]} \right] = n^{-1/2} \sum_{k=1}^{\lfloor n s(1) \rfloor} Df(w)\left[ \mathbb{1}_{[s^{-1}(k/n),\, 1]} \right] \mathbb{E}[Z_k] = 0.$$

It follows that for any $f \in M$:

$$\mathcal{A}_n f(w) = \lim_{u \searrow 0} \frac{T_{n,u} f(w) - f(w)}{u} = -Df(w)[w] + \mathbb{E} D^2 f(w)\left[A_n^{(2)}\right].$$
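
The generator formula can be sanity-checked in a one-dimensional analogue, where $w$ is a real number and $A_n$ is replaced by a single standard normal variable. The sketch below is an illustration under assumptions rather than part of the proof: it takes $\sigma(u) = \sqrt{1 - e^{-2u}}$ (the value suggested by the representation $U(u) = e^{-u} B(e^{2u})$ in the proof of Lemma 4.2), uses the hypothetical test function $f = \sin$, and approximates Gaussian expectations with a midpoint rule. The helper names are illustrative only. It compares the difference quotient $(T_u f(w) - f(w))/u$ for a small $u$ with $-w f'(w) + f''(w)$.

```python
import math

def ou_semigroup(f, w, u, half_width=8.0, steps=4000):
    """T_u f(w) = E f(w e^{-u} + sigma(u) Z), Z ~ N(0,1), by the midpoint rule."""
    sigma = math.sqrt(1.0 - math.exp(-2.0 * u))  # assumed form of sigma(u)
    dz = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps):
        z = -half_width + (j + 0.5) * dz
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += f(w * math.exp(-u) + sigma * z) * density * dz
    return total

def generator_1d(f_prime, f_second, w):
    """One-dimensional analogue of -Df(w)[w] + E D^2 f(w)[A^(2)] with E A^2 = 1."""
    return -w * f_prime(w) + f_second(w)

w, u = 0.7, 1e-4  # arbitrary evaluation point and small time step
difference_quotient = (ou_semigroup(math.sin, w, u) - math.sin(w)) / u
limit = generator_1d(math.cos, lambda x: -math.sin(x), w)
print(difference_quotient, limit)
```

The two printed values should agree up to an error of order $u$, which is the behaviour the $u^{3/2}$ bound in (4.4) predicts after dividing by $u$.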