AN ALMOST SURE CENTRAL LIMIT THEOREM FOR THE HOPFIELD MODEL#

Anton Bovier^1 and Véronique Gayrard^{2,3,4}

Weierstraß-Institut für Angewandte Analysis und Stochastik, Mohrenstrasse 39, D-10117 Berlin, Germany

Abstract: We prove a central limit theorem for the finite dimensional marginals of the Gibbs distribution of the macroscopic `overlap'-parameters in the Hopfield model in the case where the number of random `patterns', $M$, as a function of the system size $N$ satisfies $\lim_{N\uparrow\infty}M(N)/N=0$, without any assumptions on the speed of convergence. The covariance matrix of the limiting gaussian distribution is diagonal and independent of the disorder for almost all realizations of the patterns.

Keywords: Hopfield model, neural networks, central limit theorem, Brascamp-Lieb inequalities

AMS Subject Classification: 60F05, 60K35

# Work partially supported by the Commission of the European Communities under contract CHRX-CT93-0411
1 e-mail: [email protected]
2 e-mail: [email protected], [email protected]
3 Permanent address: CPT-CNRS, Luminy, Case 907, F-13288 Marseille Cedex, France
4 Supported by Graduiertenkolleg "Stochastische Prozesse und Probabilistische Analysis", Berlin

I. Introduction

Let us recall the definitions of the Hopfield model [Ho] and the main quantities of interest. For a more detailed exposition of the model we refer to [BG3]. Let $N$ be an integer and $M:\mathbb{N}\to\mathbb{N}$ be a strictly increasing function. We set $\alpha(N)\equiv\frac{M(N)}{N}$. In the present work we will consider only the case where $\lim_{N\uparrow\infty}\alpha(N)=0$. We denote by $\mathcal{S}_N\equiv\{-1,1\}^N$ and $\mathcal{S}\equiv\{-1,1\}^{\mathbb{N}}$ the sets of spin configurations, $\sigma$, in finite, resp. infinite, volume. We denote by $\sigma_i$ the value of $\sigma$ at $i$. Let $(\Omega,\mathcal{F},\mathbb{P})$ be an abstract probability space and let $\{\xi_i^\mu[\omega];\,i,\mu\in\mathbb{N}\}$ denote a family of independent identically distributed random variables on this space. For the purposes of this paper we will assume that $\mathbb{P}[\xi_i^\mu=\pm1]=\frac12$, but more general distributions can be considered.

We define random maps $m_N[\omega]:\mathcal{S}_N\to[-1,1]^{M(N)}$ whose components are given by
$$
m_N^\mu[\omega](\sigma)\equiv\frac1N\sum_{i=1}^N\xi_i^\mu[\omega]\sigma_i\,,\qquad \mu=1,\dots,M(N). \eqno(1.1)
$$
The Hamiltonian of the Hopfield model is now defined as
$$
H_N[\omega](\sigma)\equiv-\frac N2\sum_{\mu=1}^{M(N)}\bigl(m_N^\mu[\omega](\sigma)\bigr)^2
=-\frac N2\,\bigl\|m_N[\omega](\sigma)\bigr\|_2^2 \eqno(1.2)
$$
where $\|\cdot\|_2$ denotes the $\ell_2$-norm in $\mathbb{R}^M$. With this Hamiltonian we define in a natural way finite volume Gibbs measures on $(\mathcal{S}_N,\mathcal{B}(\mathcal{S}_N))$ via
$$
\mu_{N,\beta}[\omega](\sigma)\equiv\frac{2^{-N}}{Z_{N,\beta}[\omega]}\,e^{-\beta H_N[\omega](\sigma)} \eqno(1.3)
$$
where the parameter $\beta>0$ denotes the inverse temperature and where the normalizing factor $Z_{N,\beta}[\omega]$ is given by
$$
Z_{N,\beta}[\omega]\equiv2^{-N}\sum_{\sigma\in\mathcal{S}_N}e^{-\beta H_N[\omega](\sigma)}\equiv\mathbb{E}_\sigma\,e^{-\beta H_N[\omega](\sigma)} \eqno(1.4)
$$
We furthermore introduce the measures on $(\mathbb{R}^{M(N)},\mathcal{B}(\mathbb{R}^{M(N)}))$ induced by the Gibbs measures and the maps $m_N[\omega]$:
$$
Q_{N,\beta}[\omega]\equiv\mu_{N,\beta}[\omega]\circ m_N[\omega]^{-1} \eqno(1.5)
$$
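For readers who want a concrete handle on these objects, the following minimal Python sketch (ours, not part of the original paper; all variable names and parameter values are our own choices) evaluates the overlap map (1.1) and the Hamiltonian (1.2) for i.i.d. $\pm1$ patterns.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 200, 5                          # system size N and number of patterns M; alpha = M/N is small
xi = rng.choice([-1, 1], size=(M, N))  # i.i.d. patterns xi_i^mu with P[xi = +-1] = 1/2

def overlaps(sigma):
    """Overlap parameters m_N^mu(sigma) = (1/N) sum_i xi_i^mu sigma_i, cf. (1.1)."""
    return xi @ sigma / N

def hamiltonian(sigma):
    """Hopfield Hamiltonian H_N(sigma) = -(N/2) ||m_N(sigma)||_2^2, cf. (1.2)."""
    m = overlaps(sigma)
    return -0.5 * N * np.dot(m, m)

sigma = xi[0].copy()        # a configuration copied from the first pattern
print(overlaps(sigma))      # close to the unit vector e^1, up to O(1/sqrt(N)) cross-overlaps
print(hamiltonian(sigma))   # close to -N/2
```

A configuration copied from the first pattern has overlap vector close to the unit vector $e^1$, which is exactly the kind of point around which the conditional measures introduced below are localized.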
Over the last few years a very satisfactory and complete description of the measures $Q_{N,\beta}[\omega]$ has been obtained in the case $\lim_{N\uparrow\infty}M(N)/N=0$. In particular, in [BGP1], a law of large numbers type result was proven for the random vectors $m_N[\omega]$, and in [BG1] the associated full large deviation principle was obtained, without any condition on the speed of convergence of $\frac{M(N)}{N}$ to zero. In such a situation it is natural to also expect a central limit theorem to hold. Such results were in fact proven in several papers by B. Gentz [Ge1], [Ge2] and [Ge3]. However, they required strong conditions on the speed at which $\frac{M(N)}{N}$ tends to zero, the weakest being $\lim_{N\uparrow\infty}\frac{M^2(N)}{N}=0$ in [Ge3]. In this note we show that the central limit theorem holds under the sole hypothesis that $\lim_{N\uparrow\infty}\frac{M(N)}{N}=0$. Thus, in this regime all the classical theorems of probability theory are now established. We note that the proof of the CLT requires a far more detailed analysis of the local properties of the measures $Q_{N,\beta}$ than all previous results in the same regime. The crucial ingredient is a local convexity estimate that was given in [BG2], and the crucial new analytic tools are the Brascamp-Lieb inequalities [BL,HS,N,NS].

In order to state the results we need some more notation and definitions. Let $m^*(\beta)$ be the largest solution of the mean field equation $m=\tanh(\beta m)$. Note that $m^*(\beta)$ is strictly positive for all $\beta>1$, $\lim_{\beta\uparrow\infty}m^*(\beta)=1$, $\lim_{\beta\downarrow1}\frac{(m^*(\beta))^2}{3(\beta-1)}=1$, and $m^*(\beta)=0$ if $\beta\le1$. Denoting by $e^\mu$ the $\mu$-th unit vector of the canonical basis of $\mathbb{R}^M$ we set, for all $(\mu,s)$ with $\mu\in\{1,\dots,M(N)\}$ and $s\in\{-1,1\}$,
$$
m^{(\mu,s)}\equiv s\,m^*(\beta)\,e^\mu\,, \eqno(1.6)
$$
and for any $\rho>0$ we define the balls
$$
B_\rho^{(\mu,s)}\equiv\bigl\{x\in\mathbb{R}^M\,\big|\,\|x-m^{(\mu,s)}\|_2\le\rho\bigr\} \eqno(1.7)
$$
For any pair of indices $(\mu,s)$ and any $\rho>0$ we define the conditional measures^1
$$
Q_{N,\beta,\rho}^{(\mu,s)}[\omega](A)\equiv Q_{N,\beta}[\omega]\bigl(A\,\big|\,B_\rho^{(\mu,s)}\bigr)\,,\qquad A\in\mathcal{B}(\mathbb{R}^{M(N)}) \eqno(1.8)
$$
Let $X_N$ be a random vector distributed according to $Q_{N,\beta,\rho}^{(\mu,s)}[\omega]$ and denote by $\bar X_{N,\beta,\rho}^{(\mu,s)}[\omega]$ its expectation. We want to characterize the distribution of the normalized centered variable
$$
\widetilde X_N\equiv\sqrt N\bigl(X_N-\bar X_{N,\beta,\rho}^{(\mu,s)}[\omega]\bigr) \eqno(1.9)
$$
To do so we consider its Laplace transform (recall that $\widetilde X_N$ is $M(N)$-dimensional):
$$
L_{N,\beta,\rho}^{(\mu,s)}[\omega](t)\equiv\int e^{\sqrt N\,(t,\,x-\bar X_{N,\beta,\rho}^{(\mu,s)}[\omega])}\,dQ_{N,\beta,\rho}^{(\mu,s)}[\omega](x)\,,\qquad t\in\mathbb{R}^{M(N)} \eqno(1.10)
$$
where $(\cdot,\cdot)$ stands for the scalar product in $\mathbb{R}^{M(N)}$.

^1 All the results of this paper could also be formulated in terms of "tilted Gibbs measures", i.e. with a symmetry breaking magnetic field added instead of the conditioning (see [BG3] for precise definitions).

We prove the following theorem:

Theorem 1.1: Assume $\lim_{N\uparrow\infty}\alpha(N)=0$.
Assume that $\beta\in\mathbb{R}^+\setminus\{1\}$ and set
$$
C(\beta)\equiv
\begin{cases}
\dfrac{1-(m^*(\beta))^2}{1-\beta\bigl(1-(m^*(\beta))^2\bigr)} & \text{if }\beta>1\\[2ex]
\dfrac1{1-\beta} & \text{if }\beta<1
\end{cases} \eqno(1.11)
$$
There exists a constant $c(\beta)>0$ such that, with probability one, for all but a finite number of indices $N$, if $\rho$ satisfies
$$
\tfrac12 m^*>\rho>c(\beta)\Bigl\{\tfrac1{N^{1/4}}\wedge\sqrt{\alpha(N)}\Bigr\}+c_0\,\frac{\sqrt{\alpha(N)}}{m^*} \eqno(1.12)
$$
for some constant $c_0>0$, then for all $t$ with $\|t\|_2<\infty$ we have
$$
\lim_{N\uparrow\infty}\log L_{N,\beta,\rho}^{(\mu,s)}[\omega](t)=\tfrac12\,C(\beta)\,\|t\|_2^2 \eqno(1.13)
$$

Corollary 1.2: Under the assumptions of Theorem 1.1, for all $k\in\mathbb{N}$, the finite dimensional marginals of order $k$ of the law of $\widetilde X_N$ under $Q_{N,\beta,\rho}^{(\mu,s)}[\omega]$ converge weakly, as $N$ diverges, to the gaussian measure on $(\mathbb{R}^k,\mathcal{B}(\mathbb{R}^k))$ with mean zero and covariance matrix $C(\beta)\,\mathbb{1}$, where $\mathbb{1}$ is the identity matrix.

Remark: The same result was obtained in [Ge3] under the stronger assumption $\lim_{N\uparrow\infty}\frac{M^2(N)}{N}=0$.

We will see in the sequel that, due to the sharp concentration properties of the measure $Q_{N,\beta,\rho}^{(\mu,s)}[\omega]$, the centering $\bar X_{N,\beta,\rho}^{(\mu,s)}[\omega]$ obeys the following bound:

Lemma 1.3: Under the assumptions of Theorem 1.1, with probability one, for all but a finite number of indices $N$,
$$
\bigl\|\bar X_{N,\beta,\rho}^{(\mu,s)}[\omega]-m^{(\mu,s)}\bigr\|_2\le\tilde\rho \eqno(1.14)
$$
where
$$
\tilde\rho\equiv\tilde c_0\,\frac{\sqrt{\alpha(N)}}{m^*} \eqno(1.15)
$$
for some constant $\tilde c_0>0$.

The remainder of this paper is organized as follows. We only present the proof of Theorem 1.1 in the case where $\beta>1$, the case $\beta<1$ being trivial^2. Moreover, in order to avoid having to distinguish several cases, and since we are mainly interested in the regime of parameters not covered in [Ge3], we will assume that $M(N)>(\log N)^2$. It is, however, not difficult at all to treat the case $M(N)\le(\log N)^2$. In fact, wherever estimates of the form $e^{-cM}$ appear, they can be replaced by $e^{-c\sqrt N}$ if so desired by trivial modifications. The basic structure of the proof is as follows:

(i) Using the Hubbard-Stratonovich transformation, show that for $\rho$ chosen as in (1.12), the Laplace transform (1.10), $L_{N,\beta,\rho}^{(\mu,s)}$, can be expressed in terms of the Laplace transform $\widetilde L_{N,\beta,\rho}^{(\mu,s)}$ of a smoothed version $\widetilde Q_{N,\beta,\rho}^{(\mu,s)}$ of the measure $Q_{N,\beta,\rho}^{(\mu,s)}$.

(ii) Show that the measures $\widetilde Q_{N,\beta,\rho}^{(\mu,s)}$ for all $\rho$ satisfying (1.12) are equivalent.

(iii) Choose $\rho$ as the lower bound in (1.12) and, using the results of [BG2], show that the corresponding measures have densities of the form $e^{-NV(x)}$ with $V$ strictly convex; moreover, the Hessian of $V$ is uniformly close to a multiple of the identity.

(iv) The Brascamp-Lieb inequalities, together with a simple reverse inequality [DGI], now yield asymptotically coinciding upper and lower bounds on the Laplace transform which imply Theorem 1.1.

Assuming (ii), we present (i), (iii) and (iv) in Section 2. This represents the essential and original part of the proof. While the results of (ii) use by now quite standard techniques and are not very original, they require rather lengthy computations. We give them in Section 3; readers not interested in these technicalities are advised not to read that section.

^2 The situation at $\beta=1$, as well as the limits $\beta\to1$ taken in various ways, is up to now completely uninvestigated and promises a rather rich and complex structure.
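Before turning to the proof, here is an informal numerical illustration of Theorem 1.1 (ours, not the authors'; the sampler, the parameter values and all names are assumptions made for the sketch). It solves $m=\tanh(\beta m)$ by fixed-point iteration, evaluates $C(\beta)$ from (1.11), and compares it with $N\cdot\mathrm{Var}(m_N^1)$ estimated by a crude Metropolis simulation of the Gibbs measure (1.3) started near the pattern-1 state; staying in that valley plays the role of the conditioning on $B_\rho^{(1,+1)}$, and for small $\alpha$ and moderate $N$ one only expects rough agreement.

```python
import numpy as np

def m_star(beta, n_iter=10_000):
    """Largest solution of m = tanh(beta*m); equal to 0 for beta <= 1."""
    if beta <= 1.0:
        return 0.0
    m = 1.0
    for _ in range(n_iter):
        m = np.tanh(beta * m)
    return m

def C(beta):
    """Limiting covariance C(beta) of Theorem 1.1, cf. (1.11)."""
    if beta < 1.0:
        return 1.0 / (1.0 - beta)
    m = m_star(beta)
    return (1.0 - m**2) / (1.0 - beta * (1.0 - m**2))

rng = np.random.default_rng(1)
N, M, beta = 300, 3, 1.5
xi = rng.choice([-1, 1], size=(M, N))
sigma = xi[0].copy()                 # start in the valley around m^(1,+1)
m = xi @ sigma / N
samples = []
for sweep in range(2000):
    for i in rng.permutation(N):
        dm = -2 * sigma[i] * xi[:, i] / N      # change of the overlap vector if spin i flips
        dH = -0.5 * N * (np.dot(m + dm, m + dm) - np.dot(m, m))
        if rng.random() < np.exp(-beta * dH):  # Metropolis acceptance for (1.3)
            sigma[i] = -sigma[i]
            m = m + dm
    if sweep >= 500:                           # discard burn-in sweeps
        samples.append(m[0])
samples = np.asarray(samples)

print("C(beta)              :", C(beta))
print("N * Var(m_N^1), MCMC :", N * samples.var())
```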
Notation and conventions: Before giving the proofs, let us fix some general conventions on notation. From now on the parameter $\rho$ in $\bar X_{N,\beta,\rho}^{(\mu,s)}[\omega]$ is fixed and chosen as in (1.12). We will then simply write
$$
\bar X^{(\mu,s)}[\omega]\equiv\bar X_{N,\beta,\rho}^{(\mu,s)}[\omega] \eqno(1.16)
$$
and no confusion should arise from this. In general, in order not to overburden the notation, we will suppress part or all of the subscripts $\beta,N,\rho$ when we feel that this cannot be confusing. We will also often suppress the explicit dependence of several quantities on $\beta$ and $N$: mostly we will write $m^*\equiv m^*(\beta)$, $M\equiv M(N)$, $\alpha\equiv\alpha(N)$. Finally, let us insist that, to simplify the notation, the dependence of various random quantities on $\omega$ will be made explicit only when we want to stress the random nature of these quantities.

Acknowledgements: We thank J.-D. Deuschel, G. Giacomin and D. Ioffe for informing us of their results in [DGI] concerning the reverse Brascamp-Lieb inequalities prior to publication.

2. Proof of Theorem 1.1

In this section we give the main part of the proof of Theorem 1.1. We recall first the Hubbard-Stratonovich transformation [H,S]. Let $\mathcal{N}_{\beta N}^M$ be the gaussian measure on $(\mathbb{R}^M,\mathcal{B}(\mathbb{R}^M))$ with density $\bigl(\frac{\beta N}{2\pi}\bigr)^{M/2}\exp\bigl\{-\frac{\beta N}2\|z\|_2^2\bigr\}$ with respect to Lebesgue measure in $\mathbb{R}^M$. The Hubbard-Stratonovich approach consists in considering the convolution
$$
\widetilde Q_{N,\beta}\equiv Q_{N,\beta}\star\mathcal{N}_{\beta N}^M \eqno(2.1)
$$
instead of the measure $Q_{N,\beta}$ itself. The resulting measure $\widetilde Q_{N,\beta}$ is absolutely continuous and has density
$$
\frac1{Z_{N,\beta}}\exp\bigl\{-\beta N\,\Phi_{N,\beta}(z)\bigr\} \eqno(2.2)
$$
with respect to Lebesgue measure in $\mathbb{R}^M$. The function $\Phi_{N,\beta}(z)$ can be computed explicitly and is given by
$$
\Phi_{N,\beta}(z)=\frac12\|z\|_2^2-\frac1{\beta N}\sum_{i=1}^N\ln\cosh\bigl(\beta(\xi_i,z)\bigr)\,,\qquad z\in\mathbb{R}^M \eqno(2.3)
$$
Note that under our assumptions on $\alpha$, the measures $\widetilde Q_{N,\beta}$ and $Q_{N,\beta}$ have the same convergence properties since, for large enough $N$, the gaussian $\mathcal{N}_{\beta N}^M$ is concentrated sharply on a sphere of radius $\sqrt{\alpha/\beta}$.
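The function $\Phi_{N,\beta}$ of (2.3) is easy to experiment with. The sketch below (ours; the parameter values and names are illustrative assumptions) evaluates $\Phi_{N,\beta}$ and its gradient $\nabla\Phi_{N,\beta}(z)=z-\frac1N\sum_i\xi_i\tanh(\beta(\xi_i,z))$ for random patterns, and checks that the gradient nearly vanishes at the candidate minimizer $m^{(1,+1)}=m^*(\beta)e^1$, as the mean field equation suggests.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, beta = 1000, 10, 1.5
xi = rng.choice([-1, 1], size=(N, M)).astype(float)   # row i holds the pattern entries xi_i

def m_star(beta, n_iter=10_000):
    m = 1.0
    for _ in range(n_iter):
        m = np.tanh(beta * m)
    return m

def Phi(z):
    """Phi_{N,beta}(z) = ||z||^2/2 - (1/(beta*N)) * sum_i ln cosh(beta*(xi_i, z)), cf. (2.3)."""
    return 0.5 * np.dot(z, z) - np.sum(np.log(np.cosh(beta * (xi @ z)))) / (beta * N)

def grad_Phi(z):
    """Gradient of (2.3): z - (1/N) * sum_i xi_i * tanh(beta*(xi_i, z))."""
    return z - xi.T @ np.tanh(beta * (xi @ z)) / N

z_star = m_star(beta) * np.eye(M)[0]      # the point m^(1,+1) = m*(beta) e^1
print("Phi at m^(1,+1)         :", Phi(z_star))
print("||grad Phi|| at m^(1,+1):", np.linalg.norm(grad_Phi(z_star)))  # small, of order sqrt(alpha)
```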
In complete analogy with (1.8) to (1.10), we introduce the conditional measures
$$
\widetilde Q_{N,\beta,\rho}^{(\mu,s)}(A)\equiv\widetilde Q_{N,\beta}\bigl(A\,\big|\,B_\rho^{(\mu,s)}\bigr)\,,\qquad A\in\mathcal{B}(\mathbb{R}^M) \eqno(2.4)
$$
and, for $Z_N$ distributed according to $\widetilde Q_{N,\beta,\rho}^{(\mu,s)}$, we consider the Laplace transform
$$
\widetilde L_{N,\beta,\rho}^{(\mu,s)}(t)\equiv\int e^{\sqrt N\,(t,\,z-\bar Z^{(\mu,s)})}\,d\widetilde Q_{N,\beta,\rho}^{(\mu,s)}(z)\,,\qquad t\in\mathbb{R}^M \eqno(2.5)
$$
of the normalized centered variable $\widetilde Z_N\equiv\sqrt N(Z_N-\bar Z^{(\mu,s)})$, where $\bar Z^{(\mu,s)}$ is the expectation of $Z_N$. For later convenience, we also introduce the quantities
$$
\widehat L_{N,\beta,\rho}^{(\mu,s)}(t)\equiv e^{\sqrt N\,(t,\,\bar Z^{(\mu,s)}-\bar X^{(\mu,s)})}\,\widetilde L_{N,\beta,\rho}^{(\mu,s)}(t) \eqno(2.6)
$$
We will prove in the remainder of this section the analog of Theorem 1.1 for the function $\widetilde L_{\beta,N,\bar\rho}^{(\mu,s)}(t)$ with $\bar\rho=\bar\rho(N)$ that tends to zero as $N$ tends to infinity. The following proposition, whose proof will be given in Section 3, assures that this implies that $L_{\beta,N,\rho}^{(\mu,s)}(t)$ converges to the same limit.

Proposition 2.1: Assume that $\beta>1$. There exist finite positive constants $c\equiv c(\beta)$, $\tilde c\equiv\tilde c(\beta)$, $\bar c\equiv\bar c(\beta)$ such that, with probability one, for all but a finite number of indices $N$, if $\rho$ satisfies
$$
\tfrac12 m^*>\rho>c(\beta)\Bigl\{\tfrac1{N^{1/4}}\wedge\sqrt\alpha\Bigr\} \eqno(2.7)
$$
then, for all $t$ with $\|t\|_2<\infty$,

i)
$$
L_{\beta,N,\rho}^{(\mu,s)}(t)\bigl(1-e^{-\tilde cM}\bigr)\le e^{-\frac1{2\beta}\|t\|_2^2}\,\widehat L_{\beta,N,\rho}^{(\mu,s)}(t)\le e^{-\tilde cM}+L_{\beta,N,\rho}^{(\mu,s)}(t)\bigl(1+e^{-\tilde cM}\bigr) \eqno(2.8)
$$

ii) for any $\bar\rho$ satisfying (2.7),
$$
\widetilde L_{\beta,N,\bar\rho}^{(\mu,s)}(t)\bigl(1-e^{-\bar cM}\bigr)\le\widetilde L_{\beta,N,\rho}^{(\mu,s)}(t)\le e^{-\bar cM}+\widetilde L_{\beta,N,\bar\rho}^{(\mu,s)}(t)\bigl(1+e^{-\bar cM}\bigr) \eqno(2.9)
$$

iii) for any $\bar\rho$ satisfying (2.7),
$$
\bigl|\bigl(\bar X^{(\mu,s)}-\bar Z_{\bar\rho}^{(\mu,s)},\,t\bigr)\bigr|\le\|t\|_2\,e^{-\bar cM} \eqno(2.10)
$$

Remark: Note that (iii) implies that
$$
\bigl|\ln\widetilde L_{\beta,N,\rho}^{(\mu,s)}(t)-\ln\widehat L_{\beta,N,\rho}^{(\mu,s)}(t)\bigr|\le\sqrt N\,\|t\|_2\,e^{-\bar cM} \eqno(2.11)
$$
which under our assumption $M(N)\ge(\ln N)^2$ tends to zero.

We now want to compute the Laplace transform $\widetilde L_{\beta,N,\bar\rho}^{(\mu,s)}(t)$ for $\bar\rho\equiv\bar\rho(N)$ that tends to zero as $N$ tends to infinity.

Proposition 2.2: Assume that $\beta\in\mathbb{R}^+\setminus\{1\}$ and set
$$
\lambda(\beta)\equiv1-\beta\bigl(1-\tanh^2(\beta m^*(\beta))\bigr)\,. \eqno(2.12)
$$
Let $\alpha(N)$ and $\bar\rho(N)$ be decreasing functions of $N$ that go to zero as $N$ goes to infinity and satisfy
$$
\bar\rho(N)\ge2\sqrt{\frac{\alpha(N)}{\lambda(\beta)}}\,. \eqno(2.13)
$$
Then with probability one, for all but a finite number of indices $N$,
$$
\frac{\|t\|_2^2\,(1-3\bar\rho\,e^{-M})}{2\beta\bigl(\lambda(\beta)+\gamma(N)\bigr)}\le\ln\widetilde L_{\beta,N,\bar\rho}^{(\mu,s)}(t)\le\frac{\|t\|_2^2\,(1+2\bar\rho\,e^{-M})}{2\beta\bigl(\lambda(\beta)-\gamma(N)\bigr)} \eqno(2.14)
$$
where
$$
\gamma(N)=\beta\Bigl[3\bigl(\sqrt{\bar\rho}+\sqrt\alpha\bigr)+c\sqrt{\tfrac{\ln N}N}+c'\,e^{-\frac{(m^*)^2}{8\bar\rho}}\Bigr] \eqno(2.15)
$$
for some strictly positive constants $c$ and $c'$.

We recall a few notations and definitions. Let $S$ and $T$ be two $M\times M$ real symmetric matrices. The matrix norm is defined by
$$
\|T\|\equiv\sup_{x:\|x\|_2=1}|(x,Tx)| \eqno(2.16)
$$
We say that $T$ is nonnegative, and we write $T\ge0$, if $(x,Tx)\ge0$ for any $x\in\mathbb{R}^M$. More generally, we say that $T\ge S$, or $S\le T$, if $T-S\ge0$. For any function $V:\mathbb{R}^M\to\mathbb{R}$, we will denote by $\nabla^2V(x)$ its Hessian matrix at $x$.

Lemma 2.3: Let $\alpha(N)$ and $\rho(N)$ be decreasing functions of $N$ that go to zero as $N$ goes to infinity. Assume that $\beta\ne1$. Then with probability one, for all but a finite number of indices $N$, for all $v$ in the set $\{v\in\mathbb{R}^M:\|v\|_2\le\rho(N)\}$, we have
$$
0<\bigl(\lambda(\beta)-\gamma(N)\bigr)\mathbb{1}\le\nabla^2\Phi_{\beta,N}\bigl(m^{(1,1)}+v\bigr)\le\bigl(\lambda(\beta)+\gamma(N)\bigr)\mathbb{1} \eqno(2.17)
$$
where $\gamma(N)$ is defined in (2.15).
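Lemma 2.3 can be probed numerically. The following sketch (ours; the point $m^{(1,1)}+v$, the sample sizes and all names are assumptions for illustration) assembles the Hessian of $\Phi_{\beta,N}$ at a point $m^{(1,1)}+v$ with small $\|v\|_2$, compares its extreme eigenvalues with $\lambda(\beta)$, and also reports $\|A(N)-\mathbb{1}\|$, the quantity whose smallness drives the proof below.

```python
import numpy as np

rng = np.random.default_rng(3)
N, M, beta = 2000, 10, 1.5              # alpha = M/N = 0.005
xi = rng.choice([-1, 1], size=(N, M)).astype(float)

def m_star(beta, n_iter=10_000):
    m = 1.0
    for _ in range(n_iter):
        m = np.tanh(beta * m)
    return m

ms = m_star(beta)
lam = 1.0 - beta * (1.0 - ms**2)        # lambda(beta) of (2.12), using tanh(beta*m*) = m*

v = 0.01 * rng.standard_normal(M)       # a small perturbation with ||v||_2 <= rho
z = ms * np.eye(M)[0] + v               # the point m^(1,1) + v
t2 = np.tanh(beta * (xi @ z))**2        # tanh^2(beta*(xi_i, z)) for each site i
hess = np.eye(M) - (beta / N) * (xi.T * (1.0 - t2)) @ xi   # Hessian of Phi_{beta,N} at z

A = xi.T @ xi / N                       # the random matrix A(N)
eig = np.linalg.eigvalsh(hess)
print("lambda(beta)          :", lam)
print("Hessian spectrum range:", eig.min(), eig.max())
print("||A(N) - 1I||         :", np.linalg.norm(A - np.eye(M), 2))
```

For small $\alpha$ the whole spectrum of the Hessian sits in a narrow window around $\lambda(\beta)$, which is the content of (2.17).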
Proof: We will only give the proof of the lower bound. The proof of the upper bound is very similar and can already be found in [BG2], [BG3]. A straightforward computation gives
$$
\nabla^2\Phi_{\beta,N}\bigl(m^{(1,1)}+v\bigr)=\mathbb{1}-\beta A+\frac\beta N\sum_{i=1}^N\xi_i\xi_i^t\tanh^2\bigl(\beta(m^*\xi_i^1+(\xi_i,v))\bigr) \eqno(2.18)
$$
Our strategy will be to show that $\nabla^2\Phi$ can be rewritten as its dominant contribution, $\lambda(\beta)\mathbb{1}$, plus terms that will either have small norm or be nonnegative. We will then make use of the two following facts: for any real symmetric matrices $T$ and $S$,

i) if $S\ge0$ then $T+S\ge T$;

ii) if $T=t\mathbb{1}+S$ with $\|S\|\le s$ for some constants $t$ and $s$, then $T\ge(t-s)\mathbb{1}$.

Introducing a parameter $0<\tau\ll1$ that will be appropriately chosen later, we decompose $\nabla^2\Phi$ as
$$
\nabla^2\Phi_{\beta,N}\bigl(m^{(1,1)}+v\bigr)=\lambda(\beta)\mathbb{1}+T_1+T_2+T_3+T_4+T_5 \eqno(2.19)
$$
where
$$
\begin{aligned}
T_1&\equiv\beta\bigl[\tanh^2(\beta m^*(1-\tau))-\tanh^2(\beta m^*)\bigr]\mathbb{1}\\
T_2&\equiv\beta\bigl(1-\tanh^2(\beta m^*(1-\tau))\bigr)(\mathbb{1}-A)\\
T_3&\equiv-\tanh^2(\beta m^*(1-\tau))\,\frac\beta N\sum_{i=1}^N\xi_i\xi_i^t\,\mathbb{1}_{\{|(\xi_i,v)|\ge\tau m^*\}}\\
T_4&\equiv\frac\beta N\sum_{i=1}^N\xi_i\xi_i^t\,\mathbb{1}_{\{|(\xi_i,v)|<\tau m^*\}}\bigl[\tanh^2\bigl(\beta(m^*\xi_i^1+(\xi_i,v))\bigr)-\tanh^2(\beta m^*(1-\tau))\bigr]\\
T_5&\equiv\frac\beta N\sum_{i=1}^N\xi_i\xi_i^t\,\mathbb{1}_{\{|(\xi_i,v)|\ge\tau m^*\}}\tanh^2\bigl(\beta(m^*\xi_i^1+(\xi_i,v))\bigr)
\end{aligned} \eqno(2.20)
$$
It is easy to verify that $T_4\ge0$ and $T_5\ge0$. Thus, by i),
$$
\nabla^2\Phi_{\beta,N}\bigl(m^{(1,1)}+v\bigr)\ge\lambda(\beta)\mathbb{1}+T_1+T_2+T_3 \eqno(2.21)
$$
and we are left to show that $T_1$, $T_2$ and $T_3$ have small norms.

Let us treat $T_1$ first. Trivially,
$$
\|T_1\|\equiv\beta\bigl|\tanh^2(\beta m^*(1-\tau))-\tanh^2(\beta m^*)\bigr|\le2\beta\bigl|\tanh(\beta m^*(1-\tau))-\tanh(\beta m^*)\bigr|\le\frac{\beta\tau}{1-\tau} \eqno(2.22)
$$
Let $A\equiv A(N)$ denote the $M\times M$ random matrix with elements $A_{\mu\nu}\equiv\frac1N\sum_{i=1}^N\xi_i^\mu\xi_i^\nu$. The smallness of $\|T_2\|$ comes from the well known fact (see e.g. [G]) that, for small $\alpha$, the matrix $A(N)$ is very close to the identity. In particular, it follows from Theorem 4.1 of [BG3] that, for large enough $N$, there exists a numerical constant $K$ such that, for all $\epsilon\ge0$,
$$
\mathbb{P}\Bigl[\|A(N)-\mathbb{1}\|\ge2\sqrt\alpha+\alpha+\epsilon\Bigr]\le K\exp\Biggl(-N\,\frac{(1+\sqrt\alpha)^2}K\biggl(\sqrt{\frac\epsilon{(1+\sqrt\alpha)^2}+1}-1\biggr)^2\Biggr) \eqno(2.23)
$$
In particular, choosing $\epsilon\equiv c\sqrt{\ln N/N}$ for some constant $c>0$ sufficiently large, (2.23) reduces to
$$
\mathbb{P}\Bigl[\|A(N)-\mathbb{1}\|\ge2\sqrt\alpha+\alpha+c\sqrt{\ln N/N}\Bigr]\le\frac1{N^2}
$$
Finally, we are left to estimate $\|T_3\|$. But this was already done in [BG2] (see equations (4.77)-(4.79) together with Proposition 4.8). We rephrase this result hereafter in the particular (simpler but weaker) form we need: for all $\rho\ge0$,
$$
\mathbb{P}\biggl[\sup_{v\in B_\rho}\Bigl\|\frac1N\sum_{i=1}^N\xi_i\xi_i^t\,\mathbb{1}_{\{|(\xi_i,v)|\ge\tau m^*\}}\Bigr\|\ge2\,\Gamma(\alpha,\tau m^*/\rho)\biggr]\le\frac4{N^2} \eqno(2.24)
$$
where
$$
\Gamma(\alpha,a/\rho)\le C\Bigl[e^{-(1-2\sqrt\alpha)^2\frac{a^2}{4\rho^2}}+\alpha(|\ln\alpha|+2)+2\,\tfrac{\ln N}N\Bigr] \eqno(2.25)
$$
for some constant $C<\infty$ ($C\approx25$).
Therefore, collecting (2.22), (2.23), (2.24) and the definitions of $T_2$ and $T_3$, we have, with probability larger than $1-\frac5{N^2}$,
$$
\|T_1\|+\|T_2\|+\|T_3\|\le\frac{\beta\tau}{1-\tau}+\beta\bigl(2\sqrt\alpha+\alpha+c\sqrt{\ln N/N}\bigr)+2\beta\,\Gamma(\alpha,\tau m^*/\rho) \eqno(2.26)
$$
where we made use of the trivial bounds $0\le\tanh^2x\le1$. It only remains to choose the parameter $\tau$. Setting $\tau\equiv\sqrt\rho$, we get that, for large enough $N$,
$$
\|T_1\|+\|T_2\|+\|T_3\|\le\beta\Bigl[2\sqrt\rho+\Bigl(2\sqrt\alpha+\alpha+c\sqrt{\tfrac{\ln N}N}\Bigr)+25\Bigl(e^{-\frac{(m^*)^2}{8\rho}}+\alpha(|\ln\alpha|+2)+2\,\tfrac{\ln N}N\Bigr)\Bigr] \eqno(2.27)
$$
where the r.h.s. is easily seen to be bounded by $\gamma(N)$ if $\alpha$ and $\rho$ are small. The lower bound in (2.17) then follows from (2.27) and (2.21) by ii) and an application of the Borel-Cantelli Lemma. This concludes the proof of Lemma 2.3. $\diamondsuit$

The following slight generalization of the Brascamp-Lieb inequalities will be our crucial tool to exploit Lemma 2.3.

Lemma 2.4: Assume that the positive numbers $\ell$, $\delta$ and $\rho$ satisfy the relations
$$
\rho>K\sqrt{\frac{\alpha(N)}{\ell-\delta}} \eqno(2.28)
$$
where
$$
K\,\frac{\ell-\delta}{\ell+\delta}\ge2+\ln K \eqno(2.29)
$$
Let $V:\mathbb{R}^M\to\mathbb{R}$ be a non-negative function such that for all $x\in B_\rho$
$$
0<(\ell-\delta)\mathbb{1}\le\nabla^2V(x)\le(\ell+\delta)\mathbb{1} \eqno(2.30)
$$
Denote by $\mathbb{E}_V$ the expectation with respect to the probability measure on $(B_\rho,\mathcal{B}(B_\rho))$
$$
\frac{e^{-NV(x)}\,\mathbb{1}_{\{x\in B_\rho\}}\,d^Mx}{\int_{B_\rho}e^{-NV(x)}\,d^Mx} \eqno(2.31)
$$
Then, for any $t\in\mathbb{R}^M$,
$$
\frac{\|t\|_2^2}{\ell+\delta}-\epsilon_r\le\mathbb{E}_V\bigl(\sqrt N\,(t,x-\mathbb{E}_V(x))\bigr)^2\le\frac{\|t\|_2^2}{\ell-\delta}\,(1+\epsilon_r) \eqno(2.32)
$$
and
$$
\frac{\|t\|_2^2}{2(\ell+\delta)}\,(1-\epsilon_r)\le\ln\mathbb{E}_V\,e^{\sqrt N\,(t,x-\mathbb{E}_V(x))}\le\frac{\|t\|_2^2}{2(\ell-\delta)}+\epsilon_r \eqno(2.33)
$$
where
$$
\epsilon_r\equiv\frac{2\rho\,e^{-M}}{\ell-\delta}\,\|t\|_2^2 \eqno(2.34)
$$

Proof: Let us first consider (2.32). Note that the upper bound would simply obtain from an application of the Brascamp-Lieb inequalities [BL] if it were not for the fact that the measure (2.31) has support in a ball of finite radius. Because of this we will have to be a little more careful and take into account "boundary effects" which, as we shall see, do nothing but create asymptotically negligible small terms. Similarly, the lower bound will essentially result from a "reverse" Brascamp-Lieb inequality recently obtained by [DGI]. Our proof is based on a representation which was originally introduced by Helffer and Sjöstrand [HS]. It was recently used by Naddaf [N] and Naddaf and Spencer [NS], who noticed in particular that this representation provides a very simple way of proving the Brascamp-Lieb inequalities.
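As an informal one-dimensional illustration of what (2.32)-(2.34) assert (our own toy example, not part of the paper's argument), take $V(x)=\ell x^2/2$ on the ball $B_\rho=[-\rho,\rho]$; then $\nabla^2V=\ell$ exactly ($\delta=0$), the truncation to the ball is negligible when $N\ell\rho^2\gg1$, and both the quadratic moment and the log-Laplace transform of $\sqrt N\,(x-\mathbb{E}_Vx)$ should come out close to $t^2/\ell$ and $t^2/(2\ell)$ respectively, inside the windows allowed by (2.32) and (2.33).

```python
import numpy as np

rng = np.random.default_rng(4)
N, ell, rho, t = 1000, 0.6, 0.5, 0.7      # toy parameters: V(x) = ell*x^2/2 on [-rho, rho]

# Sampling from the measure (2.31): for this quadratic V it is a centered Gaussian of
# variance 1/(N*ell) truncated to [-rho, rho]; with N*ell*rho^2 >> 1 the truncation is negligible.
x = rng.normal(0.0, 1.0 / np.sqrt(N * ell), size=2_000_000)
x = x[np.abs(x) <= rho]

y = np.sqrt(N) * (x - x.mean())           # the centered, rescaled variable sqrt(N)*(x - E_V x)
print("E[(t*y)^2]     :", np.mean((t * y)**2), " vs  t^2/ell     =", t**2 / ell)
print("ln E[exp(t*y)] :", np.log(np.mean(np.exp(t * y))), " vs  t^2/(2*ell) =", t**2 / (2 * ell))
```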
