
ON STEIN'S METHOD AND MOD-* CONVERGENCE

YACINE BARHOUMI-ANDRÉANI

Abstract. Stein's method allows one to prove distributional convergence of a sequence of random variables and to quantify it with respect to a given metric such as Kolmogorov's (a Berry-Esséen type theorem). Mod-* convergence quantifies the convergence of a sequence of random variables to a given distribution in a sense unusual in probability theory, a priori unrelated to a metric on probability measures. This article gives a connection between these two notions. It shows that mod-* convergence can be understood as a higher order approximation in distribution when the limiting function is integrable and proves a refined Berry-Esséen type theorem for sequences converging in the mod-Gaussian sense.

Contents

1. Introduction
Notations
2. Mod-* convergence
2.1. Reminder of the main notions
2.2. Mod-Gaussian convergence in the Laplace setting
2.3. Mod-* convergence with infinitely divisible distributions
3. Stein's method
3.1. Reminder of Stein's method with zero-bias
3.2. A Stein's operator for the penalised Gaussian distribution
3.3. Perturbation of the Gaussian operator and Edgeworth expansion
3.4. Approximation by signed measures
4. The sum of i.i.d. symmetric random variables
4.1. A mod-Gaussian approximation theorem
4.2. A Kolmogorov approximation
4.3. Beyond the classical Berry-Esséen speed of convergence
4.4. Beyond the classical Kolmogorov approximation
4.5. Last remarks
5. Appendix: Stein's estimates
5.1. Overview and main definitions
5.2. Basic estimates
5.3. Integral representations
5.4. Operator norms estimates
Acknowledgements
References

Date: January 12, 2017.
2000 Mathematics Subject Classification. 60E10, 60E05, 60F05, 60G50, 60B10.
1. Introduction

Let $(X_n)_n$ be a sequence of random variables converging in law to $Z \sim \mathcal{N}(0,1)$; for instance, take $X_n$ to be the sum of $n$ i.i.d. random variables of expectation $0$ and variance $1/n$. The Central Limit Theorem asserts that
$$ d_{\mathrm{Kol}}(X_n, Z) := \sup_{x \in \mathbb{R}} \left| P(X_n \le x) - P(Z \le x) \right| \xrightarrow[n \to +\infty]{} 0 $$
The Berry-Esséen theorem [4, 8] is a direct continuation of the Central Limit Theorem: it gives the rate of convergence of this limit, in the form
$$ d_{\mathrm{Kol}}(X_n, Z) \le \frac{C}{\sqrt{n}} $$
with a constant $C$ depending on the sequence $(X_n)_n$.

To prove their bound, Berry and Esséen used Fourier inversion. Such a method applies perfectly in the framework of a sum of independent variables but becomes less efficient in a context of marked dependence.

Charles Stein introduced his eponymous method in [19] as an alternative to the Fourier formalism to achieve a Berry-Esséen bound. The key point consisted in replacing the characteristic function by a characteristic operator that is easier to handle in situations of dependency. Many paradigm shifts were then observed in the theory; initially designed for the Gaussian distribution, the method was extended to the Poisson setting in [5], and the characterisation of the distribution via the operator was replaced by a fixed point equation in law using a probabilistic transformation such as the 0-bias or the size-bias transform (see [9, 10]).

Mod-Gaussian convergence was introduced in [13]. A sequence of random variables $(X_n)_n$ is said to converge in the mod-Gaussian sense if there exists a sequence $(\gamma_n)_n$ of strictly positive reals and a function $\Phi : \mathbb{R} \to \mathbb{C}$ such that, locally uniformly in $u \in \mathbb{R}$,
$$ \frac{E\big( e^{iu(X_n - E(X_n))} \big)}{e^{-u^2 \gamma_n^2 / 2}} \xrightarrow[n \to +\infty]{} \Phi(u) $$
Due to the type of convergence and the properties of the converging sequence, $\Phi$ is a continuous function satisfying $\Phi(0) = 1$ and $\Phi(-u) = \overline{\Phi(u)}$. Moreover, $\Phi$ is not necessarily the Fourier transform of a probability distribution (see [13]).
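As a purely numerical illustration of the Central Limit Theorem and the Berry-Esséen rate recalled above (the code and the choice of Rademacher signs are ours, not the article's), the Kolmogorov distance between a standardised sum of $n$ i.i.d. $\pm 1$ signs and the Gaussian can be computed exactly from the binomial distribution:

```python
import math

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def kolmogorov_distance_rademacher(n):
    """Exact Kolmogorov distance between S_n/sqrt(n), for S_n a sum of n i.i.d.
    +/-1 signs, and Z ~ N(0,1): the sup is attained at the atoms of the step
    CDF, so we check both one-sided limits there."""
    d, acc = 0.0, 0.0
    for k in range(n + 1):
        x = (2 * k - n) / math.sqrt(n)
        phi = normal_cdf(x)
        d = max(d, abs(acc - phi))            # left limit of the step CDF
        acc += math.comb(n, k) / 2.0 ** n     # jump of size P(S_n = 2k - n)
        d = max(d, abs(acc - phi))            # value at the atom
    return d

for n in (16, 64, 256):
    print(n, kolmogorov_distance_rademacher(n))
```

Here $\sqrt{n}\,d_{\mathrm{Kol}}$ stabilises around $0.4$ as $n$ grows, consistent with a bound of the form $C/\sqrt{n}$.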
The fact that $\Phi$ is not necessarily such a Fourier transform impedes the naive probabilistic interpretation that would view $X_n - E(X_n)$ as the sum of a random variable and a Gaussian noise.

The same notion can be defined in the Poisson framework (see [15]): a sequence $(Z_n)_n$ is said to converge in the mod-Poisson sense at speed $(\gamma_n)_n$ if, locally uniformly in $x \in \mathbb{U}$,
$$ \frac{E\big(x^{Z_n}\big)}{e^{\gamma_n (x - 1)}} \xrightarrow[n \to +\infty]{} \Phi(x) $$
for a continuous function $\Phi : \mathbb{U} \to \mathbb{R}$ satisfying $\Phi(1) = 1$. Here, $\mathbb{U}$ designates the unit circle.

These two notions define mod-* convergence with $* \in \{\text{Gaussian}, \text{Poisson}\}$, but the set of admissible distributions can also extend to the infinitely divisible case (see [6]) or to any distribution that arises as a limit in law whose Fourier transform does not vanish.

As we consider unnormalised (hence diverging) random variables, directly renormalising their Fourier transform gives a non-trivial limiting function. Note that a change of renormalisation (setting $u = v/\gamma_n$) implies the convergence in law of $(X_n - E(X_n))/\gamma_n$ to the Gaussian distribution:
$$ \frac{E\big( e^{iv(X_n - E(X_n))/\gamma_n} \big)}{e^{-v^2/2}} \xrightarrow[n \to +\infty]{} 1 $$
This type of convergence is thus more precise than the usual convergence in law. It is unusual in probability theory and has been investigated only in a few instances, for example in [12]. But it is well exploited in other branches of mathematics such as number theory since, for example, Keating and Snaith's celebrated moments conjecture writes (see [14])
$$ \frac{E\big( e^{\lambda \log |\zeta(\frac{1}{2} + iTU)|} \big)}{e^{(\frac{1}{2}\log\log T)\,\lambda^2/2}} \xrightarrow[T \to +\infty]{} \Phi_\zeta(\lambda) $$
Here, $\zeta$ denotes the Riemann Zeta function, $\Phi_\zeta$ is a particular function described in [14] and $U$ denotes a random variable uniformly distributed in $[0,1]$. The convergence holds locally uniformly in $\lambda \in \{\mathrm{Re} > -1\}$, hence in particular for $\lambda \in i\mathbb{R}$.
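As a concrete illustration of the mod-Poisson definition, one can take the number of cycles $C_n$ of a uniform random permutation of size $n$, a classical example from the mod-* literature (it is not taken from this article): $E(x^{C_n}) = \Gamma(x+n)/(\Gamma(x)\,n!)$, the speed is the harmonic number $H_n$, and the limiting function is $e^{\gamma_E(1-x)}/\Gamma(x)$, with $\gamma_E$ the Euler-Mascheroni constant. The sketch below checks this at the real point $x = 2$ (for simplicity, rather than on the unit circle, where the same limit holds):

```python
import math

EULER_GAMMA = 0.5772156649015329

def mod_poisson_ratio(n, x):
    """E(x^{C_n}) / exp(H_n (x-1)) for C_n the number of cycles of a uniform
    permutation of size n: E(x^{C_n}) = Gamma(x+n)/(Gamma(x) n!), computed in
    log scale via lgamma, with speed H_n = 1 + 1/2 + ... + 1/n."""
    H_n = sum(1.0 / j for j in range(1, n + 1))
    log_pgf = math.lgamma(x + n) - math.lgamma(x) - math.lgamma(n + 1.0)
    return math.exp(log_pgf - H_n * (x - 1.0))

x = 2.0
limit = math.exp(EULER_GAMMA * (1.0 - x)) / math.gamma(x)
for n in (10, 100, 10000):
    print(n, mod_poisson_ratio(n, x), limit)
```

The error decays like $1/n$, so the ratio is already within $10^{-4}$ of the limit for $n = 10^4$.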
The fact that $\Phi_\zeta$ does not write as the Fourier-Laplace transform of a probability distribution raises the question of the existence of a random variable that would naturally converge in the mod-Gaussian sense to the same function, so that it can be compared with the original sequence of random variables. The first result of this paper answers the following

Question 1.1. Given a function $\Phi : \mathbb{R} \to \mathbb{C}$ satisfying some admissibility assumptions (described in detail in theorem 2.4) and $\gamma \to \infty$, can we construct a family of random variables $(\mathcal{H}_\gamma(\Phi))_\gamma$ that converges in the mod-Gaussian sense to $\Phi$?

The answer to this question is given in theorem 2.4. A slight shift of point of view, using the Laplace transform in place of the Fourier transform in the preceding definition, already allows to give a flavour of the result:

Theorem 1.2. Let $\Phi : \mathbb{R} \to \mathbb{R}$ be a continuous positive integrable function on $\mathbb{R}$ such that $\Phi(0) = 1$. Let $\gamma > 0$ and $X_\gamma \sim \mathcal{N}(0, \gamma^2)$. Define the random variable $\mathcal{H}_\gamma(\Phi)$ by the change of probability
$$ E\big(f(\mathcal{H}_\gamma(\Phi))\big) := \frac{E\big(f(X_\gamma)\,\Phi(X_\gamma/\gamma^2)\big)}{E\big(\Phi(X_\gamma/\gamma^2)\big)} \qquad (1) $$
for all bounded measurable $f : \mathbb{R} \to \mathbb{R}$. Then,
$$ \frac{E\big(e^{u\mathcal{H}_\gamma(\Phi)}\big)}{E\big(e^{uX_\gamma}\big)} = \frac{E\big(\Phi(X_\gamma/\gamma^2 + u)\big)}{E\big(\Phi(X_\gamma/\gamma^2)\big)} \qquad (2) $$
and in particular, locally uniformly in $u \in \mathbb{R}$,
$$ \frac{E\big(e^{u\mathcal{H}_\gamma(\Phi)}\big)}{E\big(e^{uX_\gamma}\big)} \xrightarrow[\gamma \to +\infty]{} \Phi(u) $$

The duality relation (2) will appear to be a direct avatar of the Gaussian change of probability. More generally, it will hold in the context of any infinitely divisible distribution (see theorem 2.14).

Since there now exists a "canonical" random variable associated to $\Phi$ (with some additional restrictive hypotheses, though), it is tempting to metrise mod-Gaussian convergence in this restricted setting by performing a probabilistic approximation with this new distribution. This raises the following

Question 1.3.
Given a sequence of random variables $(X_n)_n$ that converges in the mod-Gaussian sense at speed $(\gamma_n)_n$ to a given function $\Phi$ satisfying the hypotheses of theorem 1.2, can we find a bound for
$$ d_{\mathrm{Kol}}(X_n, \mathcal{H}_{\gamma_n}(\Phi)) := \sup_{x \in \mathbb{R}} \left| P(X_n \le x) - P(\mathcal{H}_{\gamma_n}(\Phi) \le x) \right| $$
and compare it to the classical Berry-Esséen bound obtained for the convergence in law of $(X_n/\gamma_n)_n$ to the Gaussian distribution?

The answer to question 1.3 will use Stein's method, by first describing a characteristic operator of the law of $\mathcal{H}_{\gamma_n}(\Phi)$, and then following Stein's steps in [20]. We will address the problem with complex analytic methods in a subsequent publication, using the information on the values of $\Phi$ and its derivatives at $0$ in the same vein as Berry and Esséen.

The most famous sequence of random variables converging in law to the Gaussian distribution is the sum of i.i.d. random variables. Such a sequence also converges in the mod-Gaussian sense to an explicit function satisfying the hypotheses of theorem 1.2, namely $\Phi_C : x \mapsto e^{-Cx^4/4}$ for a certain constant $C$ (see example 2.5). We will treat this example in section 4. The distribution of $\mathcal{H}_\gamma(\Phi_C)$ is the Gaussian with quartic interaction potential, a real-valued version of the celebrated $\Phi^4$ model, given by
$$ P\big(\mathcal{H}_\gamma(\Phi_C) \le x\big) = \frac{1}{Z_{a,b}} \int_{-\infty}^{x} \exp\left( -\frac{a t^2}{2} - \frac{b t^4}{4} \right) dt $$
where $a$, $b$ and $Z_{a,b}$ are explicitly described in section 4. The Stein estimates developed in this context in section 5 can thus be of independent use when one deals with such a distribution.
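The distribution function of this quartic-potential law can be sketched by direct quadrature; in the minimal illustration below, $a = b = 1$ are placeholder values (the actual $a$, $b$ and $Z_{a,b}$ are described in section 4 of the paper):

```python
import math

def quartic_cdf(x, a=1.0, b=1.0, lim=10.0, n=100000):
    """P(H <= x) for the quartic-potential density exp(-a t^2/2 - b t^4/4)/Z_{a,b},
    with Z_{a,b} and the mass both computed by midpoint quadrature; a = b = 1
    are placeholder values for illustration."""
    dt = 2.0 * lim / n
    ts = [-lim + (k + 0.5) * dt for k in range(n)]
    w = [math.exp(-a * t * t / 2.0 - b * t ** 4 / 4.0) for t in ts]
    Z = sum(w) * dt                                       # normalising constant
    mass = sum(wk for tk, wk in zip(ts, w) if tk <= x) * dt
    return mass / Z

for x in (-3.0, 0.0, 0.5, 3.0):
    print(x, quartic_cdf(x))
```

By the symmetry of the potential, the distribution is centred: the CDF equals $1/2$ at $0$ and its tails vanish much faster than Gaussian tails.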
This article is structured in the following way: section 2 defines mod-* convergence and constructs a canonical family of distributions associated to it; section 3 explains the links between this canonical family and Edgeworth or signed expansions and develops the fundamentals of Stein's method for a sequence of random variables converging in the mod-Gaussian sense to a general function $\Phi$; last, section 4 treats the example of the sum of i.i.d. symmetric random variables and proves a Berry-Esséen theorem using Stein's method and the estimates of section 5.

Notations

We gather here some notations used throughout the paper.

If $\gamma$ is a positive real number, $\mathcal{N}(0, \gamma^2)$ designates the Gaussian distribution of expectation $0$ and variance $\gamma^2$, and $\mathcal{U}(I)$ the uniform distribution on the set $I$. If $n$ is an integer, $[\![1, n]\!]$ designates the set $\{1, 2, \dots, n\}$; $z \mapsto \overline{z}$ denotes the complex conjugation.

All random variables will be considered on a probability space $(\Omega, \mathcal{F}, P)$. The distribution of the random variable $X : \Omega \to \mathbb{R}$ will be denoted by $P_X$: if $A \in \mathcal{F}$ is a measurable set, $P_X(A) := P(X \in A)$. If $X$ and $Y$ are two random variables having the same distribution, that is $P_X = P_Y$, we will write $X \stackrel{\mathcal{L}}{=} Y$. The convergence in law/in distribution will be denoted by $\xrightarrow{\mathcal{L}}$.

For $f \in L^1(P_X)$, $f \ge 0$, the penalisation or bias of $P_X$ by $f$ is the probability measure $P_Y$ denoted by
$$ P_Y := \frac{f(X)}{E(f(X))} \bullet P_X $$
This definition is equivalent to the following: for all $g \in L^\infty(P_X)$,
$$ E(g(Y)) = \frac{E\big(f(X)\,g(X)\big)}{E(f(X))} $$

2. Mod-* convergence

2.1. Reminder of the main notions.

Definition 2.1 (Mod-Gaussian convergence). Let $(X_n)_n$ be a sequence of random variables of expectation $0$ and $(\gamma_n)_n$ be a sequence of strictly positive real numbers. Let $G \sim \mathcal{N}(0,1)$. We say that $(X_n)_n$ converges in the mod-Gaussian sense if
$$ \frac{E\big(e^{iuX_n}\big)}{E\big(e^{iu\gamma_n G}\big)} \xrightarrow[n \to +\infty]{} \Phi(u) $$
the convergence being locally uniform in $u \in \mathbb{R}$, and $\Phi : \mathbb{R} \to \mathbb{C}$ hence being a continuous function satisfying $\Phi(0) = 1$ and $\Phi(-u) = \overline{\Phi(u)}$.
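The penalisation $\bullet$ introduced in the Notations amounts to a self-normalised reweighting, which can be sketched by Monte Carlo; the weight $f(x) = e^{-x^4/4}$ below is an illustrative choice (the paper's guiding example $\Phi_C$ with $C = 1$), and the sampling code is ours:

```python
import numpy as np

rng = np.random.default_rng(0)

def penalised_expectation(g, f, samples):
    """E(g(Y)) for P_Y = (f(X)/E f(X)) . P_X, via self-normalised weights."""
    w = f(samples)
    return float(np.sum(w * g(samples)) / np.sum(w))

x = rng.standard_normal(200_000)                    # X ~ N(0, 1)
f = lambda t: np.exp(-t ** 4 / 4.0)                 # illustrative weight
m1 = penalised_expectation(lambda t: t, f, x)       # ~0: the weight is even
m2 = penalised_expectation(lambda t: t ** 2, f, x)  # < 1: large |X| is penalised
print(m1, m2)
```

The even weight leaves the mean at $0$ while shrinking the second moment below $1$, since large values of $|X|$ carry small weight.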
When such a convergence holds, we write it as
$$ (X_n, \gamma_n) \xrightarrow[n \to +\infty]{\text{mod-}G} \Phi $$

Remark 2.2. One can always reduce to the case of a sequence of random variables with zero expectation. Otherwise, we include an additional renormalisation in the Fourier transform of the Gaussian random variable, which corresponds to the original definition of [13].

A trivial example, but a useful insight for the intuition, allows to illustrate the concept:

Example 2.3. Consider $X_n := Y_n + \gamma_n G$ where $(Y_n)_n$ is independent of $G \sim \mathcal{N}(0,1)$, where $Y_n \xrightarrow{\mathcal{L}} Y$ and $\gamma_n \to +\infty$. Then,
$$ \frac{E\big(e^{iuX_n}\big)}{E\big(e^{iu\gamma_n G}\big)} = E\big(e^{iuY_n}\big) \xrightarrow[n \to +\infty]{} \Phi(u) := E\big(e^{iuY}\big) $$
Thus, in the case of an additive independent Gaussian noise, such a renormalisation gives at the limit the Fourier transform of a probability measure.

An interesting question related to question 1.1 concerns the probabilistic meaning of this particular type of convergence. Since the limiting function is not always the Fourier transform of a probability measure (see e.g. example 2.5), the intuitive idea of an additive correlated noise that disappears under this particular renormalisation (a deconvolution) is not satisfactory once we leave the domain of probability theory at the limit.

As pointed out in the introduction, one solution to this problem is to change the probability using $\Phi$ as a weight:

Theorem 2.4 (A probabilistic interpretation of mod-Gaussian convergence). Let $(\gamma_n)_n$ be a sequence of strictly positive real numbers such that $\gamma_n \to +\infty$ when $n \to +\infty$ and let $\Phi$ be an admissible function for the mod-Gaussian convergence, i.e. a continuous complex function satisfying $\Phi(0) = 1$ and $\Phi(-u) = \overline{\Phi(u)}$.
Suppose moreover that:

(1) $\Phi$ can be analytically extended to the whole complex plane and satisfies, for all $\beta \in \mathbb{R}$,
$$ \sup_{z \in a + i[0,\beta]} |\Phi(z)| < \infty \quad \forall a \in \mathbb{R}, \qquad \sup_{z \in a + i[0,\beta]} |\Phi(z)| \xrightarrow[a \to \pm\infty]{} 0 \qquad (3) $$
(2) $\Phi(ix) = \Phi(x)$ for all $x \in \mathbb{R}$,
(3) $\Phi(x) > 0$ for all $x \in \mathbb{R}$.

Define the distribution $P_{\mathcal{H}_{\gamma_n}(\Phi)}$ of a random variable $\mathcal{H}_{\gamma_n}(\Phi)$ by the following penalisation:
$$ P_{\mathcal{H}_{\gamma_n}(\Phi)} := \frac{\Phi(G/\gamma_n)}{E\big(\Phi(G/\gamma_n)\big)} \bullet P_{\gamma_n G} \qquad (4) $$
Then,
$$ \mathcal{H}_{\gamma_n}(\Phi)/\gamma_n \xrightarrow[n \to +\infty]{\mathcal{L}} \mathcal{N}(0,1), \qquad (\mathcal{H}_{\gamma_n}(\Phi), \gamma_n) \xrightarrow[n \to +\infty]{\text{mod-}G} \Phi $$

Note that (4) coincides with (1), but in the Fourier framework, which imposes $\Phi$ to be positive and real on $\mathbb{R}$. We will compare these two settings in section 2.2.

Before proving the theorem, we give an example of such a function $\Phi$ that will be our guiding example.

Example 2.5. For $C > 0$, set
$$ \Phi_C(x) = e^{-Cx^4/4} \qquad (5) $$
This function is the mod-Gaussian limit of
$$ Z_n := \frac{1}{n^{1/4}} \sum_{k=1}^{n} X_k $$
where $(X_k)_k$ is a sequence of i.i.d. symmetric random variables (that is, $X \stackrel{\mathcal{L}}{=} -X$) satisfying $E(X^2) = 1$ and $\kappa := E(X^4) < 3$.

More precisely, for $C = (3 - E(X^4))/6$, we have mod-Gaussian convergence of $(Z_n)_n$ to $\Phi_C$ at speed $n^{1/4}$:
$$ E\big(e^{ixZ_n}\big) = E\big(e^{ix\sum_{k=1}^{n} X_k/n^{1/4}}\big) = \Big(E\big(e^{ixX/n^{1/4}}\big)\Big)^n = e^{n \log(E(\exp(ixX/n^{1/4})))} $$
$$ = e^{n \log\left(1 - \frac{x^2}{2\sqrt{n}} + \frac{\kappa x^4}{24 n} + \frac{x^4}{n}\varepsilon_1\left(\frac{x}{n^{1/4}}\right)\right)} = e^{n\left(-\frac{x^2}{2\sqrt{n}} + \frac{\kappa x^4}{24 n} - \frac{1}{2}\left(\frac{x^2}{2\sqrt{n}}\right)^2 + \frac{x^4}{n}\varepsilon_2\left(\frac{x}{n^{1/4}}\right)\right)} $$
$$ = e^{-\sqrt{n}\,\frac{x^2}{2} + (\kappa - 3)\frac{x^4}{24} + x^4 \varepsilon_2\left(\frac{x}{n^{1/4}}\right)} $$
Here, $\varepsilon_1$ and $\varepsilon_2$ are functions that tend to $0$ at $0$ and are bounded on a compact neighborhood of $0$.
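The chain of equalities above can be checked numerically in the special case of Rademacher signs, where $E(X^2) = 1$, $\kappa = E(X^4) = 1 < 3$ and hence $C = 1/3$; this numerical check is ours, not the article's. Since $E(e^{ixZ_n}) = \cos(x/n^{1/4})^n$ exactly, and both this quantity and the Gaussian denominator underflow for large $n$, the ratio is computed in logarithmic scale:

```python
import math

def mod_gaussian_ratio(n, x):
    """E(exp(ix Z_n)) / E(exp(ix n^{1/4} G)) when the X_k are Rademacher signs:
    E(exp(ix Z_n)) = cos(x/n^{1/4})^n exactly and the denominator equals
    exp(-x^2 sqrt(n)/2), so the ratio is evaluated via logarithms."""
    t = x / n ** 0.25
    log_ratio = n * math.log(math.cos(t)) + x * x * math.sqrt(n) / 2.0
    return math.exp(log_ratio)

# For signs, kappa = 1, so C = (3 - kappa)/6 = 1/3 and the predicted
# limit at x = 1 is exp(-1/12).
for n in (10 ** 2, 10 ** 4, 10 ** 6):
    print(n, mod_gaussian_ratio(n, 1.0), math.exp(-1.0 / 12.0))
```

The ratio approaches $e^{-x^4/12}$ at the rate $O(x^6/\sqrt{n})$ dictated by the next term of the expansion.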
We thus have the following convergence, which holds locally uniformly in $x$ in a certain interval around $0$:
$$ \frac{E\big(e^{ixZ_n}\big)}{E\big(e^{ix n^{1/4} G}\big)} \xrightarrow[n \to +\infty]{} e^{-\frac{(3 - \kappa)}{24} x^4} $$
We can check moreover that the required assumptions of analyticity and boundedness in a horizontal strip are fulfilled:
$$ \sup_{z \in a + i[0,\beta]} \big| e^{-Cz^4} \big| = \sup_{y \in [0,\beta]} \big| e^{-C(a + iy)^4} \big| = \sup_{y \in [0,\beta]} e^{-C(a^4 + y^4 - 6a^2y^2)} \le C'_\beta\, e^{-Ca^4/2} \xrightarrow[a \to \pm\infty]{} 0 $$
In accordance with theorem 2.4, the random variable $\mathcal{H}_{n^{1/4}}(\Phi_C)$ with distribution given by (4) with $\Phi_C(x) = e^{-Cx^4/4}$ satisfies the same type of convergence, and one can write
$$ \frac{E\big(e^{iuZ_n}\big)}{E\big(e^{iu\mathcal{H}_{n^{1/4}}(\Phi_C)}\big)} \xrightarrow[n \to +\infty]{} 1 $$
in the same vein as convergence in law writes
$$ \frac{E\big(e^{iuZ_n/n^{1/4}}\big)}{E\big(e^{iuG}\big)} \xrightarrow[n \to +\infty]{} 1 $$

We now prove theorem 2.4.

Proof. For $\theta \in \mathbb{R}$, write
$$ E\big(e^{i\theta\mathcal{H}_{\gamma_n}(\Phi)}\big) = \frac{E\big(\Phi\big(\frac{G}{\gamma_n}\big)\, e^{i\theta\gamma_n G}\big)}{E\big(\Phi\big(\frac{G}{\gamma_n}\big)\big)} = \frac{E\big(e^{i\theta\gamma_n G}\big)}{E\big(\Phi\big(\frac{G}{\gamma_n}\big)\big)}\; E\left( \frac{e^{i\theta\gamma_n G}}{E(e^{i\theta\gamma_n G})}\, \Phi\left(\frac{G}{\gamma_n}\right) \right) =: \frac{E\big(e^{i\theta\gamma_n G}\big)}{E\big(\Phi\big(\frac{G}{\gamma_n}\big)\big)} \int_{\mathbb{R}} \Phi(x)\, \mu_n^{(\theta)}(dx) $$
where
$$ \int_{\mathbb{R}} \Phi(x)\, \mu_n^{(\theta)}(dx) := E\left( \frac{e^{i\theta\gamma_n G}}{E(e^{i\theta\gamma_n G})}\, \Phi\left(\frac{G}{\gamma_n}\right) \right) = e^{\theta^2\gamma_n^2/2} \int_{\mathbb{R}} e^{i\theta\gamma_n x}\, \Phi\left(\frac{x}{\gamma_n}\right) e^{-x^2/2}\, \frac{dx}{\sqrt{2\pi}} $$
$$ = \int_{\mathbb{R}} \Phi\left(\frac{x}{\gamma_n}\right) e^{-\frac{1}{2}(x - i\theta\gamma_n)^2}\, \frac{dx}{\sqrt{2\pi}} = \int_{\mathbb{R} - i\theta\gamma_n} \Phi\left(\frac{y}{\gamma_n} + i\theta\right) e^{-\frac{1}{2}y^2}\, \frac{dy}{\sqrt{2\pi}} $$
Set
$$ g(z) := \Phi(z/\gamma_n + i\theta)\, e^{-z^2/2} $$
If $g$ is analytic on the whole complex plane, the Cauchy formula gives
$$ \int_{[-a,a]} g + \int_{a + i[0,\beta]} g - \int_{[-a,a] + i\beta} g - \int_{-a + i[0,\beta]} g = 0 $$
If moreover $g$ satisfies the hypothesis (3), we can write
$$ \left| \int_{a + i[0,\beta]} g(x)\, dx \right| \le \beta \sup_{z \in a + i[0,\beta]} |g(z)| \xrightarrow[a \to \pm\infty]{} 0 $$
Hence,
$$ \int_{[-a,a] + i\beta} g = \int_{[-a,a]} g + \left( \int_{a + i[0,\beta]} g - \int_{-a + i[0,\beta]} g \right) =: \int_{[-a,a]} g + R(a) $$
with
$$ |R(a)| \le 2\beta \sup_{z \in a + i[0,\beta]} |g(z)| \xrightarrow[a \to \pm\infty]{} 0 $$
Passing to the limit $a \to +\infty$, we get
$$ \int_{\mathbb{R} - i\beta} g = \int_{\mathbb{R}} g $$
Now,
$$ \sup_{z \in a + i[0,\beta]} \big| e^{-z^2/2} \big| = \sup_{u \in [0,\beta]} \big| e^{-(a + iu)^2/2} \big| = \sup_{u \in [0,\beta]} e^{-a^2/2 + u^2/2} = e^{\beta^2/2} e^{-a^2/2} \xrightarrow[a \to \pm\infty]{} 0 $$
$$ \sup_{z \in a + i[0,\beta]} |\Phi(z)| = \sup_{u \in [0,\beta]} |\Phi(a + iu)| \xrightarrow[a \to \pm\infty]{} 0 \quad \text{by the hypothesis (3)} $$
so that
$$ \sup_{z \in a + i[0,\beta]} \big| e^{-z^2/2}\, \Phi(z) \big| \xrightarrow[a \to \pm\infty]{} 0 $$
We can thus write
$$ \int_{\mathbb{R}} \Phi(x)\, \mu_n^{(\theta)}(dx) = \int_{\mathbb{R}} \Phi\left(\frac{y}{\gamma_n} + i\theta\right) e^{-\frac{1}{2}y^2}\, \frac{dy}{\sqrt{2\pi}} = E\left( \Phi\left(\frac{G}{\gamma_n} + i\theta\right) \right) $$
The condition (3) ensures that $\Phi$ is bounded on a horizontal strip; hence, by the dominated convergence theorem, the continuity of $\Phi$ on the complex plane and the hypothesis $\Phi(i\theta) = \Phi(\theta)$ for all $\theta \in \mathbb{R}$, we get
$$ \lim_{n \to +\infty} \int_{\mathbb{R}} \Phi(x)\, \mu_n^{(\theta)}(dx) = E\left( \lim_{n \to +\infty} \Phi\left(\frac{G}{\gamma_n} + i\theta\right) \right) = \Phi(i\theta) = \Phi(\theta) $$
Finally, dominated convergence implies
$$ \lim_{n \to +\infty} E\left( \Phi\left(\frac{G}{\gamma_n}\right) \right) = \Phi(0) = 1 $$
which proves the theorem. □

Remark 2.6. The fact that the signed (complex) measures $\mu_n^{(\theta)}$ satisfy
$$ \lim_{n \to +\infty} \int_{\mathbb{R}} \Phi(x)\, \mu_n^{(\theta)}(dx) = \Phi(\theta) = \int_{\mathbb{R}} \Phi(x)\, \delta_\theta(dx) $$
for all $\Phi$ satisfying the assumptions of theorem 2.4 can be rephrased as a weak convergence of the sequence $(\mu_n^{(\theta)})_n$ to the measure $\delta_\theta$. Note that the space of functions on which this convergence holds is restrictive and is a strict subset of the space of continuous bounded functions. On this last space, the weak convergence does not hold, as one can check by considering the limit of the Fourier transform $\int_{\mathbb{R}} e^{i\alpha x}\, \mu_n^{(\theta)}(dx)$.

The last theorem motivates the following

Definition 2.7. Let $G \sim \mathcal{N}(0,1)$, $\gamma > 0$ and $\Phi$ be a function satisfying the hypotheses of theorem 2.4. We define the distribution $H(\Phi, \gamma)$ by
$$ H_\gamma \sim H(\Phi, \gamma) \iff P_{H_\gamma} := \frac{\Phi(G/\gamma)}{E\big(\Phi(G/\gamma)\big)} \bullet P_{\gamma G} \qquad (6) $$

Remark 2.8. Another way of writing (6) is to say that $H_\gamma$ has a Lebesgue density given by
$$ f_\gamma(x) = \frac{1}{c_\gamma}\, \Phi\left(\frac{x}{\gamma^2}\right) e^{-\frac{1}{2}\left(\frac{x}{\gamma}\right)^2}, \qquad c_\gamma := \gamma\sqrt{2\pi}\; E\big(\Phi(G/\gamma)\big) \qquad (7) $$

2.2. Mod-Gaussian convergence in the Laplace setting. As noticed in remark 2.6, the key point in theorem 2.4 is to show that $(\mu_n^{(\theta)})_n$ converges weakly to $\delta_\theta$ for a certain notion of weak convergence of measures. But the fact that $\lim_{n \to +\infty} \int_{\mathbb{R}} \Phi(x)\, \mu_n^{(\theta)}(dx) = \Phi(i\theta)$ forces the function $\Phi$ to have an additional symmetry and gives the hint that it is the variable $i\theta$ that should be the relevant parameter. It thus becomes natural to consider the Laplace transform in place of the Fourier transform.

Definition 2.9. Let $(X_n)_n$ be a sequence of random variables of expectation $0$ and $(\gamma_n)_n$ a sequence of strictly positive real numbers.
Suppose moreover that $E\big(e^{uX_n}\big) < \infty$ for all $u \in A \subset \mathbb{R}$, where $A$ is an open set containing $0$, or $A = \mathbb{R}_+$.

$(X_n)_n$ is said to converge in the mod-Gaussian-Laplace sense at speed $(\gamma_n)_n$ if
$$ \frac{E\big(e^{uX_n}\big)}{E\big(e^{u\gamma_n G}\big)} \xrightarrow[n \to +\infty]{} \Phi(u) $$
where $\Phi : A \to \mathbb{R}_+$ is a continuous function satisfying $\Phi(0) = 1$, the last convergence being locally uniform in $u \in A$.

Remark 2.10. Note that the function $\Phi$ defined here must always be positive, as a limit of a sequence of positive functions. The advantage of choosing the Fourier transform over the Laplace transform is clear: the former always exists, whereas the latter requires specifying the range of $u \in \mathbb{R}$ where it is defined. But for the purpose that we have set, a real function is more suited.

We now prove theorem 1.2.

Proof. Remember the change of probability of the Gaussian measure: for all $u \in \mathbb{R}$ and for $X_\gamma \sim \mathcal{N}(0, \gamma^2)$,
$$ \frac{e^{uX_\gamma}}{E\big(e^{uX_\gamma}\big)} \bullet P_{X_\gamma} = P_{X_\gamma + u\gamma^2} $$
Hence, for all $u \in \mathbb{R}$,
$$ \frac{E\big(e^{u\mathcal{H}_\gamma(\Phi)}\big)}{E\big(e^{uX_\gamma}\big)} = \frac{E\big(e^{uX_\gamma}\, \Phi(X_\gamma/\gamma^2)\big)}{E\big(e^{uX_\gamma}\big)\, E\big(\Phi(X_\gamma/\gamma^2)\big)} = \frac{E\left( \frac{e^{uX_\gamma}}{E(e^{uX_\gamma})}\, \Phi(X_\gamma/\gamma^2) \right)}{E\big(\Phi(X_\gamma/\gamma^2)\big)} = \frac{E\big(\Phi(X_\gamma/\gamma^2 + u)\big)}{E\big(\Phi(X_\gamma/\gamma^2)\big)} $$
Now, since $\Phi$ is integrable, dominated convergence allows to exchange $\lim_{\gamma \to +\infty}$ and expectation. One has moreover $X_\gamma/\gamma^2 \stackrel{\mathcal{L}}{=} X_1/\gamma$; this last quantity tends to $0$ in distribution. Hence, using the continuity of $\Phi$, we have, locally uniformly in $u \in \mathbb{R}$,
$$ \frac{E\big(e^{u\mathcal{H}_\gamma(\Phi)}\big)}{E\big(e^{uX_\gamma}\big)} \xrightarrow[\gamma \to +\infty]{} \frac{\Phi(u)}{\Phi(0)} = \Phi(u) $$
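The duality relation (2) and the limit it produces can be checked numerically for the guiding example $\Phi_C$ with $C = 1$; the quadrature sketch below (ours, not the article's) verifies that both sides of (2) agree and that the ratio approaches $\Phi(u)$ as $\gamma$ grows, using $X_\gamma \stackrel{\mathcal{L}}{=} \gamma G$:

```python
import math

Phi = lambda t: math.exp(-t ** 4 / 4.0)   # the guiding example Phi_C with C = 1

def gauss_expectation(h, lim=10.0, n=100000):
    """E(h(G)) for G ~ N(0,1), by midpoint quadrature on [-lim, lim]."""
    dt = 2.0 * lim / n
    total = 0.0
    for k in range(n):
        g = -lim + (k + 0.5) * dt
        total += h(g) * math.exp(-0.5 * g * g)
    return total * dt / math.sqrt(2.0 * math.pi)

def duality_lhs(u, gamma):
    """E(exp(u H_gamma(Phi))) / E(exp(u X_gamma)), with X_gamma = gamma*G and
    E(exp(u X_gamma)) = exp(u^2 gamma^2 / 2)."""
    num = gauss_expectation(lambda g: math.exp(u * gamma * g) * Phi(g / gamma))
    den = gauss_expectation(lambda g: Phi(g / gamma)) * math.exp(u * u * gamma * gamma / 2.0)
    return num / den

def duality_rhs(u, gamma):
    """E(Phi(X_gamma/gamma^2 + u)) / E(Phi(X_gamma/gamma^2)), using that
    X_gamma/gamma^2 has the law of G/gamma."""
    num = gauss_expectation(lambda g: Phi(g / gamma + u))
    den = gauss_expectation(lambda g: Phi(g / gamma))
    return num / den

u = 1.0
print(duality_lhs(u, 2.0), duality_rhs(u, 2.0))   # the two sides of (2) agree
print(duality_rhs(u, 32.0), Phi(u))               # and approach Phi(u) as gamma grows
```

For $\gamma = 32$ the ratio differs from $\Phi(1) \approx 0.7788$ by less than $10^{-2}$, while for $\gamma = 2$ the gap is still of order $10^{-1}$, consistent with the $\gamma \to +\infty$ statement of theorem 1.2.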
