MODERATE DEVIATIONS IN A RANDOM GRAPH AND FOR THE SPECTRUM OF BERNOULLI RANDOM MATRICES

Hanna Döring¹, Peter Eichelsbacher²

¹ Ruhr-Universität Bochum, Fakultät für Mathematik, NA 3/67, D-44780 Bochum, Germany, hanna.doering@ruhr-uni-bochum.de
² Ruhr-Universität Bochum, Fakultät für Mathematik, NA 3/68, D-44780 Bochum, Germany, peter.eichelsbacher@ruhr-uni-bochum.de
Abstract: We prove a moderate deviation principle for subgraph count statistics of Erdős-Rényi random graphs. This is equivalent to showing a moderate deviation principle for the trace of a power of a Bernoulli random matrix. It is done via an estimation of the log-Laplace transform and the Gärtner-Ellis theorem. We obtain upper bounds on the upper tail probabilities of the number of occurrences of small subgraphs. The method of proof is used to show supplemental moderate deviation principles for a class of symmetric statistics, including non-degenerate U-statistics with independent or Markovian entries.

1. Introduction

1.1. Subgraph-count statistics. Consider an Erdős-Rényi random graph with $n$ vertices, where for each of the $\binom{n}{2}$ different pairs of vertices the existence of an edge is decided by an independent Bernoulli experiment with probability $p$. For each $i \in \{1, \ldots, \binom{n}{2}\}$, let $X_i$ be the random variable determining if the edge $e_i$ is present, i.e. $P(X_i = 1) = 1 - P(X_i = 0) = p(n) =: p$.
The following statistic counts the number of subgraphs isomorphic to a fixed graph $G$ with $k$ edges and $l$ vertices:
$$W = \sum_{1 \le \kappa_1 < \cdots < \kappa_k \le \binom{n}{2}} 1_{\{(e_{\kappa_1}, \ldots, e_{\kappa_k}) \sim G\}} \prod_{i=1}^{k} X_{\kappa_i}.$$
Here $(e_{\kappa_1}, \ldots, e_{\kappa_k})$ denotes the graph with edges $e_{\kappa_1}, \ldots, e_{\kappa_k}$ present, and $A \sim G$ denotes the fact that the subgraph $A$ of the complete graph is isomorphic to $G$. We assume $G$ to be a graph without isolated vertices, consisting of $l \ge 3$ vertices and $k \ge 2$ edges. Let the constant $a := \operatorname{aut}(G)$ denote the order of the automorphism group of $G$. The number of copies of $G$ in $K_n$, the complete graph with $n$ vertices and $\binom{n}{2}$ edges, is given by $\binom{n}{l}\, l!/a$, and the expectation of $W$ is equal to
$$E[W] = \binom{n}{l} \frac{l!}{a}\, p^k = O(n^l p^k).$$
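As a concrete illustration (our example, not part of the original text): for $G$ the path on $l = 3$ vertices with $k = 2$ edges one has $a = 2$, hence $E[W] = \binom{n}{3} \cdot 3!/2 \cdot p^2 = 3\binom{n}{3}p^2$, and $W$ can be evaluated by counting, at every vertex, the pairs of incident edges. A minimal Python sketch:

    import numpy as np
    from math import comb

    rng = np.random.default_rng(0)
    n, p = 300, 0.05

    # sample an Erdos-Renyi adjacency matrix
    U = rng.random((n, n)) < p
    A = np.triu(U, 1).astype(int)
    A += A.T

    # each copy of the 2-edge path has a unique middle vertex,
    # so W = sum over vertices v of C(deg(v), 2)
    deg = A.sum(axis=0)
    W = int((deg * (deg - 1) // 2).sum())
    EW = 3 * comb(n, 3) * p**2   # = binom(n,l) * l!/a * p^k for l=3, k=2, a=2
    print(W, EW)                 # the count concentrates around its mean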
It is easy to see that $P(W > 0) = o(1)$ if $p \ll n^{-l/k}$. Moreover, for the graph property that $G$ is a subgraph, the probability that a random graph possesses it jumps from 0 to 1 at the threshold probability $n^{-1/m(G)}$, where
$$m(G) = \max\left\{ \frac{e_H}{v_H} : H \subseteq G,\ v_H > 0 \right\},$$
and $e_H$, $v_H$ denote the number of edges and vertices of $H \subseteq G$, respectively; see [JŁR00].
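For example (our computation): for the complete graph $G = K_4$ every subgraph $H \subseteq K_4$ satisfies $e_H/v_H \le 6/4$, so
$$m(K_4) = \frac{6}{4} = \frac{3}{2},$$
and the threshold probability for containing a copy of $K_4$ is $n^{-2/3}$; for the triangle, $m(K_3) = 1$ and the threshold is $n^{-1}$.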
Limiting Poisson and normal distributions for subgraph counts were studied for probability functions $p = p(n)$. For an arbitrary graph $G$, Ruciński proved in [Ruc88] that $W$ is Poisson convergent if and only if
$$np^{d(G)} \xrightarrow{n\to\infty} 0 \quad \text{or} \quad n^{\beta(G)}\, p \xrightarrow{n\to\infty} 0.$$
Here $d(G)$ denotes the density of the graph $G$ and
$$\beta(G) := \max\left\{ \frac{v_G - v_H}{e_G - e_H} : H \subset G \right\}.$$
Consider
$$c_{n,p} := \binom{n-2}{l-2} \frac{2k}{a} (l-2)!\, \sqrt{\binom{n}{2}\, p(1-p)}\; p^{k-1} \qquad (1.1)$$
and
$$Z := \frac{W - E(W)}{c_{n,p}} = \frac{\sum_{1 \le \kappa_1 < \cdots < \kappa_k \le \binom{n}{2}} 1_{\{(e_{\kappa_1}, \ldots, e_{\kappa_k}) \sim G\}} \left( \prod_{i=1}^{k} X_{\kappa_i} - p^k \right)}{\binom{n-2}{l-2} \frac{2k}{a} (l-2)!\, \sqrt{\binom{n}{2}\, p(1-p)}\; p^{k-1}}. \qquad (1.2)$$
$Z$ has an asymptotically standard normal distribution if $np^k q^{-1} \xrightarrow{n\to\infty} \infty$ (where $q := 1-p$) and $n^2(1-p) \xrightarrow{n\to\infty} \infty$; see Nowicki, Wierman [NW88]. For an arbitrary graph $G$ with at least one edge, Ruciński proved in [Ruc88] that $\frac{W - E(W)}{\sqrt{V(W)}}$ converges in distribution to a standard normal distribution if and only if
$$np^{m(G)} \xrightarrow{n\to\infty} \infty \quad \text{and} \quad n^2(1-p) \xrightarrow{n\to\infty} \infty. \qquad (1.3)$$
Here and in the following, $V$ denotes the variance of the corresponding random variable. Ruciński closed the book on asymptotic normality by applying the method of moments. One may wonder about the normalization (1.1) used in [NW88]. The subgraph count $W$ is a sum of dependent random variables, for which the exact calculation of the variance is tedious. In [NW88], the authors approximated $W$ by a projection of $W$, which is a sum of independent random variables. For this sum the variance calculation is elementary, which yields the denominator (1.1) in the definition of $Z$. The asymptotic behaviour of the variance of $W$ for any $p = p(n)$ is summarized in Section 2 of [Ruc88]. The method of martingale differences used by Catoni in [Cat03] makes it possible, under the conditions $np^{3(k - \frac{1}{2})} \xrightarrow{n\to\infty} \infty$ and $n^2(1-p) \xrightarrow{n\to\infty} \infty$, to give an alternative proof of the central limit theorem; see Remark 4.2.
A common theme is to prove large and moderate deviations, namely the asymptotic computation of small probabilities on an exponential scale. The interest in the moderate scale lies in the transition from a convergence-in-distribution result, such as the central limit theorem scaling, to the large deviations scaling. Interestingly enough, proving that the subgraph count random variable $W$ satisfies a large or a moderate deviation principle has remained an unsolved problem up to now. The main goal of this paper is to prove a moderate deviation principle for the rescaled $Z$, filling a substantial gap in the literature on asymptotic subgraph count distributions; see Theorem 1.1. Before we recall the definition of a moderate deviation principle and state our result, let us remark that exponentially small probabilities have been studied extensively in the literature. A famous upper bound for lower tails was proven by Janson [Jan90], applying the FKG inequality. This inequality leads to good upper bounds for the probability of nonexistence, $W = 0$. Upper bounds for upper tails were derived by Vu [Vu01], Kim and Vu [KV04], and recently by Janson, Oleszkiewicz and Ruciński [JOR04] and by Janson and Ruciński in [JR04]. A comparison of seven techniques for proving bounds for the infamous upper tail can be found in [JR02]. In Theorem 1.3 we also obtain upper bounds on the upper tail probabilities of $W$.
Let us recall the definition of a large deviation principle (LDP). A sequence of probability measures $(\mu_n)_{n \in \mathbb{N}}$ on a topological space $\mathcal{X}$ equipped with a $\sigma$-field $\mathcal{B}$ is said to satisfy the LDP with speed $s_n \nearrow \infty$ and good rate function $I(\cdot)$ if the level sets $\{x : I(x) \le \alpha\}$ are compact for all $\alpha \in [0, \infty)$ and if for all $\Gamma \in \mathcal{B}$ the lower bound
$$\liminf_{n\to\infty} \frac{1}{s_n} \log \mu_n(\Gamma) \ge - \inf_{x \in \operatorname{int}(\Gamma)} I(x)$$
and the upper bound
$$\limsup_{n\to\infty} \frac{1}{s_n} \log \mu_n(\Gamma) \le - \inf_{x \in \operatorname{cl}(\Gamma)} I(x)$$
hold. Here $\operatorname{int}(\Gamma)$ and $\operatorname{cl}(\Gamma)$ denote the interior and closure of $\Gamma$, respectively. We say a sequence of random variables satisfies the LDP when the sequence of measures induced by these variables satisfies the LDP. Formally, a moderate deviation principle is nothing else but an LDP. However, we will speak of a moderate deviation principle (MDP) for a sequence of random variables whenever the scaling of the corresponding random variables is between that of an ordinary law of large numbers and that of a central limit theorem.

In the following we state one of our main results, a moderate deviation principle for the rescaled subgraph count statistic $W$ when $p$ is fixed, and when the sequence $p(n)$ converges to 0 or 1 sufficiently slowly.
Theorem 1.1. Let $G$ be a fixed graph without isolated vertices, consisting of $k \ge 2$ edges and $l \ge 3$ vertices. The sequence $(\beta_n)_n$ is assumed to be increasing with
$$n^{l-1} p^{k-1} \sqrt{p(1-p)} \;\ll\; \beta_n \;\ll\; n^l \left( p^{k-1} \sqrt{p(1-p)} \right)^4. \qquad (1.4)$$
Then the sequence $(S_n)_n$ of subgraph count statistics
$$S := S_n := \frac{1}{\beta_n} \sum_{1 \le \kappa_1 < \cdots < \kappa_k \le \binom{n}{2}} 1_{\{(e_{\kappa_1}, \ldots, e_{\kappa_k}) \sim G\}} \left( \prod_{i=1}^{k} X_{\kappa_i} - p^k \right)$$
satisfies a moderate deviation principle with speed
$$s_n = \left( \frac{2k(l-2)!}{a} \right)^2 \frac{\beta_n^2}{c_{n,p}^2} = \frac{1}{\binom{n-2}{l-2}^2 \binom{n}{2}} \cdot \frac{\beta_n^2}{p^{2k-1}(1-p)} \qquad (1.5)$$
and rate function $I$ defined by
$$I(x) = \frac{x^2}{2 \left( \frac{2k(l-2)!}{a} \right)^2}. \qquad (1.6)$$
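For orientation (our specialization, obtained by substituting $l = k = 3$ and $a = 6$ into (1.4)-(1.6)): for triangles one has $\frac{2k(l-2)!}{a} = 1$, so Theorem 1.1 asserts the MDP with
$$s_n = \frac{\beta_n^2}{(n-2)^2 \binom{n}{2}\, p^5 (1-p)}, \qquad I(x) = \frac{x^2}{2}, \qquad \text{for } n^2 p^2 \sqrt{p(1-p)} \ll \beta_n \ll n^3 \left( p^2 \sqrt{p(1-p)} \right)^4.$$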
Remarks 1.2. (1) Using $\binom{n-2}{l-2}^2 \binom{n}{2} \le n^{2(l-1)}$, we obtain $s_n \ge \left( \frac{\beta_n}{n^{l-1} p^{k-1} \sqrt{p(1-p)}} \right)^2$; therefore the condition
$$n^{l-1} p^{k-1} \sqrt{p(1-p)} \ll \beta_n$$
implies that $s_n$ grows to infinity as $n \to \infty$ and hence is a speed.
(2) If we choose $\beta_n$ such that $\beta_n \ll n^l \left( p^{k-1} \sqrt{p(1-p)} \right)^4$, then using the fact that $s_n$ is a speed implies that
$$n^2 p^{6k-3} (1-p)^3 \xrightarrow{n\to\infty} \infty. \qquad (1.7)$$
Indeed, the two requirements on $\beta_n$ in (1.4) are compatible only if $n^{l-1} p^{k-1} \sqrt{p(1-p)} \ll n^l \left( p^{k-1} \sqrt{p(1-p)} \right)^4$, which is exactly (1.7). This is a necessary but not a sufficient condition for (1.4).
The approach used to prove Theorem 1.1 additionally yields a central limit theorem for $Z = \frac{W - EW}{c_{n,p}}$, see Remark 4.2, and a concentration inequality for $W - EW$:
Theorem 1.3. Let $G$ be a fixed graph without isolated vertices, consisting of $k \ge 2$ edges and $l \ge 3$ vertices, and let $W$ be the number of copies of $G$. Then for every $\varepsilon > 0$,
$$P(W - EW \ge \varepsilon\, EW) \le \exp\left( - \frac{\mathrm{const.}\; \varepsilon^2 n^{2l} p^{2k}}{n^{2l-2} p^{2k-1}(1-p) + \mathrm{const.}\; \varepsilon\, n^{2l-2} p^{1-k} (1-p)^{-1}} \right),$$
where the constants denoted by $\mathrm{const.}$ depend only on $l$ and $k$.

We will give the proofs of Theorem 1.1 and Theorem 1.3 at the end of Section 4.
Remark 1.4. Let us consider the example of counting triangles: $l = k = 3$, $a = 6$. The necessary condition (1.7) for the moderate deviation principle turns into
$$n^2 p^{15} \longrightarrow \infty \quad \text{and} \quad n^2 (1-p)^3 \longrightarrow \infty \quad \text{as } n \to \infty.$$
This can be compared with the expectedly weaker necessary and sufficient condition for the central limit theorem for $Z$ in [Ruc88]:
$$np \longrightarrow \infty \quad \text{and} \quad n^2 (1-p) \longrightarrow \infty \quad \text{as } n \to \infty.$$
The concentration inequality in Theorem 1.3 for triangles turns into
$$P(W - EW \ge \varepsilon\, EW) \le \exp\left( - \frac{\mathrm{const.}\; \varepsilon^2 n^6 p^6}{n^4 p^5 (1-p) + \mathrm{const.}\; \varepsilon\, n^4 p^{-2} (1-p)^{-1}} \right) \quad \forall\, \varepsilon > 0.$$
Kim and Vu showed in [KV04], for all $0 < \varepsilon \le 0.1$ and for $p \ge \frac{1}{n} \log n$, that
$$P\left( \frac{W - EW}{\varepsilon\, p^3 n^3} \ge 1 \right) \le e^{-\Theta(p^2 n^2)}.$$
As we will see in the proof of Theorem 1.3, the bound for $d(n)$ in (1.12) leads to an additional term of order $n^2 p^8$. Hence in general our bounds are not optimal; optimal bounds have been obtained only for some subgraphs. Our concentration inequality can be compared with the bounds in [JR02], a comparison we leave to the reader.
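To see the objects of Theorem 1.3 in action (our illustration; the unspecified constants prevent evaluating the bound itself, so only the exceedance frequency is simulated), one can estimate the upper tail probability for triangle counts by Monte Carlo:

    import numpy as np
    from math import comb

    def triangle_count(A):
        """Number of triangles: Tr(A^3)/6 for a simple-graph adjacency matrix."""
        return int(round(np.trace(np.linalg.matrix_power(A, 3)) / 6))

    def upper_tail_freq(n, p, eps, reps, seed=0):
        rng = np.random.default_rng(seed)
        ew = comb(n, 3) * p**3                  # E[W] for triangles
        hits = 0
        for _ in range(reps):
            U = rng.random((n, n)) < p
            A = np.triu(U, 1).astype(float)
            A += A.T
            hits += triangle_count(A) - ew >= eps * ew
        return hits / reps

    print(upper_tail_freq(n=60, p=0.1, eps=0.2, reps=500))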
1.2. Bernoulli random matrices. Theorem 1.1 can be reformulated as a moderate deviation principle for traces of a power of a Bernoulli random matrix.
Theorem 1.5. Let $X = (X_{ij})_{i,j}$ be a symmetric $n \times n$ matrix of independent real-valued random variables, Bernoulli-distributed with probability
$$P(X_{ij} = 1) = 1 - P(X_{ij} = 0) = p(n), \quad i < j,$$
and $P(X_{ii} = 0) = 1$, $i = 1, \ldots, n$. Consider for any fixed $k \ge 3$ the trace of the matrix to the power $k$,
$$\operatorname{Tr}(X^k) = \sum_{i_1, \ldots, i_k = 1}^{n} X_{i_1 i_2} X_{i_2 i_3} \cdots X_{i_k i_1}. \qquad (1.8)$$
Note that $\operatorname{Tr}(X^k) = 2W$ for $W$ counting cycles of length $k$ in a random graph. We obtain that the sequence $(T_n)_n$ with
$$T_n := \frac{\operatorname{Tr}(X^k) - E[\operatorname{Tr}(X^k)]}{2\beta_n} \qquad (1.9)$$
satisfies a moderate deviation principle for any $\beta_n$ satisfying (1.4) with $l = k$, and with rate function (1.6) with $l = k$ and $a = 2k$:
$$I(x) = \frac{x^2}{2\left((k-2)!\right)^2}. \qquad (1.10)$$
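To make (1.8) concrete (our sketch, not part of the original text), the following Python lines sample a symmetric Bernoulli matrix with zero diagonal and evaluate $\operatorname{Tr}(X^k)$ both directly and through the eigenvalues, anticipating the spectral representation used in Remark 1.6:

    import numpy as np

    def bernoulli_matrix(n, p, rng):
        """Symmetric n x n Bernoulli(p) matrix with zero diagonal."""
        upper = rng.random((n, n)) < p          # i.i.d. Bernoulli(p) trials
        X = np.triu(upper, 1).astype(float)     # keep the strict upper triangle
        return X + X.T                          # symmetrize; the diagonal stays 0

    rng = np.random.default_rng(0)
    n, p, k = 200, 0.3, 3
    X = bernoulli_matrix(n, p, rng)

    trace_direct = np.trace(np.linalg.matrix_power(X, k))
    trace_spectral = np.sum(np.linalg.eigvalsh(X) ** k)  # Tr(X^k) = sum_i lambda_i^k
    print(trace_direct, trace_spectral)                  # agree up to rounding error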
Remark 1.6. The following is a famous open problem in random matrix theory: Consider $X$ to be a symmetric $n \times n$ matrix with entries $X_{ij}$ ($i \le j$) being i.i.d., satisfying some exponential integrability. The question is to prove for any fixed $k \ge 3$ an LDP for
$$\frac{1}{n^k} \operatorname{Tr}(X^k)$$
and the MDP for
$$\frac{1}{\beta_n(k)} \left( \operatorname{Tr}(X^k) - E[\operatorname{Tr}(X^k)] \right) \qquad (1.11)$$
for a properly chosen sequence $\beta_n(k)$. For $k = 1$ the LDP in question immediately follows from Cramér's theorem (see [DZ98, Theorem 2.2.3]), since
$$\frac{1}{n} \operatorname{Tr}(X) = \frac{1}{n} \sum_{i=1}^{n} X_{ii}.$$
For $k = 2$, notice that
$$\frac{1}{n^2} \operatorname{Tr}(X^2) = \frac{2}{n^2} \sum_{i<j} X_{ij}^2 + \frac{1}{n^2} \sum_{i=1}^{n} X_{ii}^2 =: A_n + B_n.$$
By Cramér's theorem we know that $(\tilde{A}_n)_n$ with $\tilde{A}_n := \frac{1}{\binom{n}{2}} \sum_{i<j} X_{ij}^2$ satisfies the LDP, and by Chebychev's inequality we obtain for any $\varepsilon > 0$
$$\limsup_{n\to\infty} \frac{1}{n} \log P(|B_n| \ge \varepsilon) = -\infty.$$
Hence $(A_n)_n$ and $(\frac{1}{n^2} \operatorname{Tr}(X^2))_n$ are exponentially equivalent (see [DZ98, Definition 4.2.10]). Moreover, $(A_n)_n$ and $(\tilde{A}_n)_n$ are exponentially equivalent, since $\frac{2}{n(n-1)} - \frac{2}{n^2} = \frac{2}{n^2(n-1)}$, so that Chebychev's inequality leads to
$$\limsup_{n\to\infty} \frac{1}{n} \log P(|A_n - \tilde{A}_n| > \varepsilon) = \limsup_{n\to\infty} \frac{1}{n} \log P\left( \Big| \sum_{i<j} X_{ij}^2 \Big| \ge \frac{\varepsilon\, n^2(n-1)}{2} \right) = -\infty.$$
Applying Theorem 4.2.13 in [DZ98], we obtain the LDP for $(\frac{1}{n^2} \operatorname{Tr}(X^2))_n$ under exponential integrability. For $k \ge 3$, proving the LDP for $(\frac{1}{n^k} \operatorname{Tr}(X^k))_n$ is open, even in the Bernoulli case.
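In the Bernoulli case the LDP for $\tilde{A}_n$ is explicit (our side remark): since $X_{ij}^2 = X_{ij}$, the statistic $\tilde{A}_n$ is the empirical mean of $\binom{n}{2}$ i.i.d. Bernoulli($p$) variables, and for fixed $p$ Cramér's theorem gives the LDP with speed $\binom{n}{2}$ and rate function
$$I_p(x) = x \log \frac{x}{p} + (1-x) \log \frac{1-x}{1-p}, \qquad x \in [0,1].$$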
For Gaussian entries $X_{ij}$ with mean 0 and variance $1/n$, the LDP for the sequence of empirical measures of the corresponding eigenvalues $\lambda_1, \ldots, \lambda_n$, i.e.
$$\frac{1}{n} \sum_{i=1}^{n} \delta_{\lambda_i},$$
has been established by Ben Arous and Guionnet in [BAG97]. Although one has the representation
$$\frac{1}{n^k} \operatorname{Tr}(X^k) = \frac{1}{n^{k/2}} \operatorname{Tr}\left( \left( \frac{X}{\sqrt{n}} \right)^k \right) = \frac{1}{n^{k/2}} \sum_{i=1}^{n} \lambda_i^k,$$
the LDP cannot be deduced from the LDP of the empirical measure by the contraction principle [DZ98, Theorem 4.2.1], because $x \mapsto x^k$ is not bounded in this case.
Remark 1.7. Theorem 1.5 tells us that in the case of Bernoulli random variables $X_{ij}$, the MDP for (1.11) holds for any $k \ge 3$. For $k = 1$ and $k = 2$, the MDP for (1.11) holds for arbitrary i.i.d. entries $X_{ij}$ satisfying some exponential integrability: For $k = 1$ we choose $\beta_n(1) := a_n$ with $(a_n)_n$ any sequence such that $\lim_{n\to\infty} \frac{\sqrt{n}}{a_n} = 0$ and $\lim_{n\to\infty} \frac{n}{a_n} = \infty$. For
$$\frac{1}{a_n} \sum_{i=1}^{n} \left( X_{ii} - E(X_{ii}) \right)$$
the MDP holds with rate function $x^2 / (2 V(X_{11}))$ and speed $a_n^2 / n$; see Theorem 3.7.1 in [DZ98]. In the case of Bernoulli random variables, we choose $\beta_n(1) = a_n$ with $(a_n)_n$ any sequence such that
$$\lim_{n\to\infty} \frac{\sqrt{n p(1-p)}}{a_n} = 0 \quad \text{and} \quad \lim_{n\to\infty} \frac{n \sqrt{p(1-p)}}{a_n} = \infty,$$
where $p = p(n)$. Then $\left( \frac{1}{a_n} \sum_{i=1}^{n} (X_{ii} - E(X_{ii})) \right)_n$ satisfies the MDP with rate function $x^2/2$ and speed
$$\frac{a_n^2}{n\, p(n) (1 - p(n))}.$$
Hence, in this case $p(n)$ has to fulfill the condition $n^2 p(n)(1 - p(n)) \to \infty$.
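For instance (our example): for fixed $p \in (0,1)$ the choice $a_n = n^{3/4}$ satisfies both conditions above, and the speed becomes
$$\frac{a_n^2}{n p(1-p)} = \frac{\sqrt{n}}{p(1-p)} \longrightarrow \infty.$$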
For $k = 2$, we choose $\beta_n(2) = a_n$ with $(a_n)_n$ being any sequence such that $\lim_{n\to\infty} \frac{n}{a_n} = 0$ and $\lim_{n\to\infty} \frac{n^2}{a_n} = \infty$. Applying Chebychev's inequality and exponential equivalence arguments similar to those in Remark 1.6, we obtain the MDP for
$$\frac{1}{a_n} \sum_{i,j=1}^{n} \left( X_{ij}^2 - E(X_{ij}^2) \right)$$
with rate function $x^2 / (2 V(X_{11}))$ and speed $a_n^2 / n^2$. The case of Bernoulli random variables can be treated in a similar way.
Remark 1.8. For $k \ge 3$ we obtain the MDP with $\beta_n = \beta_n(k)$ such that
$$n^{k-1} p(n)^{k-1} \sqrt{p(n)(1 - p(n))} \;\ll\; \beta_n \;\ll\; n^k \left( p(n)^{k-1} \sqrt{p(n)(1 - p(n))} \right)^4.$$
Considering a fixed $p$, the range of $\beta_n$ is what we should expect: $n^{k-1} \ll \beta_n \ll n^k$. But we also obtain the MDP for functions $p(n)$. In random matrix theory, Wigner analysed Bernoulli random matrices in nuclear physics as early as 1959. Interestingly enough, a moderate deviation principle for the empirical mean of the eigenvalues of a random matrix is known only for symmetric matrices with Gaussian entries and with non-centered Gaussian entries, respectively; see [DGZ03]. The proofs depend on the existence of an explicit formula for the joint distribution of the eigenvalues or on corresponding matrix-valued stochastic processes.
1.3. Symmetric Statistics. On the way to proving Theorem 1.1, we will apply a nice result of Catoni [Cat03, Theorem 1.1]. Doing so, we recognized that Catoni's approach leads to a general method for proving a moderate deviation principle for a rich class of statistics, which (without loss of generality) can be assumed to be symmetric. Let us make this more precise. In [Cat03], non-asymptotic bounds of the log-Laplace transform of a function $f$ of $k(n)$ random variables $X := (X_1, \ldots, X_{k(n)})$ lead to concentration inequalities. These inequalities can be obtained for independent random variables or for Markov chains. It is assumed in [Cat03] that the partial finite differences of order one and two of $f$ are suitably bounded. The line of proof is a combination of a martingale difference approach and a Gibbs measure philosophy.
Let $(\Omega, \mathcal{A})$ be the product of measurable spaces $\bigotimes_{i=1}^{k(n)} (\mathcal{X}_i, \mathcal{B}_i)$ and let $P = \bigotimes_{i=1}^{k(n)} \mu_i$ be a product probability measure on $(\Omega, \mathcal{A})$. Let $X_1, \ldots, X_{k(n)}$ take values in $(\Omega, \mathcal{A})$ and assume that $(X_1, \ldots, X_{k(n)})$ is the canonical process. Let $(Y_1, \ldots, Y_{k(n)})$ be an independent copy of $X := (X_1, \ldots, X_{k(n)})$ such that $Y_i$ is distributed according to $\mu_i$, $i = 1, \ldots, k(n)$. The function $f : \Omega \to \mathbb{R}$ is assumed to be bounded and measurable.
Let $\Delta_i f(x_1^{k(n)}; y_i)$ denote the partial difference of order one of $f$, defined by
$$\Delta_i f(x_1^{k(n)}; y_i) := f(x_1, \ldots, x_{k(n)}) - f(x_1, \ldots, x_{i-1}, y_i, x_{i+1}, \ldots, x_{k(n)}),$$
where $x_1^{k(n)} := (x_1, \ldots, x_{k(n)}) \in \Omega$ and $y_i \in \mathcal{X}_i$. Analogously, we define for $j < i$ and $y_j \in \mathcal{X}_j$ the partial difference of order two
$$\Delta_i \Delta_j f(x_1^{k(n)}; y_j, y_i) := \Delta_i f(x_1^{k(n)}; y_i) - f(x_1, \ldots, x_{j-1}, y_j, x_{j+1}, \ldots, x_{k(n)}) + f(x_1, \ldots, x_{j-1}, y_j, x_{j+1}, \ldots, x_{i-1}, y_i, x_{i+1}, \ldots, x_{k(n)}).$$
Now we can state our main theorem. If the random variables are independent and if the partial finite differences of the first and second order of $f$ are suitably bounded, then $f$, properly rescaled, satisfies the MDP:

Theorem 1.9. In the above setting, assume that the random variables in $X$ are independent. Define $d(n)$ by
$$d(n) := \sum_{i=1}^{k(n)} |\Delta_i f(X_1^{k(n)}; Y_i)|^2 \left( \frac{1}{3} |\Delta_i f(X_1^{k(n)}; Y_i)| + \frac{1}{4} \sum_{j=1}^{i-1} |\Delta_i \Delta_j f(X_1^{k(n)}; Y_j, Y_i)| \right). \qquad (1.12)$$
Moreover, let there exist two sequences $(s_n)_n$ and $(t_n)_n$ such that
(1) $\frac{s_n^2}{t_n^3}\, d(n) \xrightarrow{n\to\infty} 0$ for all $\omega \in \Omega$ and
(2) $\frac{s_n}{t_n^2}\, V f(X) \xrightarrow{n\to\infty} C > 0$ for the variance of $f$.
Then the sequence of random variables
$$\left( \frac{f(X) - E[f(X)]}{t_n} \right)_n$$
satisfies a moderate deviation principle with speed $s_n$ and rate function $\frac{x^2}{2C}$.
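To illustrate the quantities appearing in Theorem 1.9 (our sketch; the helper names are ours), the following Python lines evaluate the partial differences and $d(n)$ for the sample mean $f(x) = \frac{1}{k(n)} \sum_i x_i$, for which $\Delta_i f = (x_i - y_i)/k(n)$ and all second-order differences vanish:

    import numpy as np

    def delta_i(f, x, y, i):
        """First-order difference: f(x) - f(x with coordinate i replaced by y_i)."""
        x_mod = x.copy(); x_mod[i] = y[i]
        return f(x) - f(x_mod)

    def delta_ij(f, x, y, i, j):
        """Second-order difference Delta_i Delta_j f(x; y_j, y_i)."""
        x_j = x.copy(); x_j[j] = y[j]        # replace coordinate j
        x_ji = x_j.copy(); x_ji[i] = y[i]    # replace coordinates j and i
        return delta_i(f, x, y, i) - f(x_j) + f(x_ji)

    def d_n(f, x, y):
        """The quantity d(n) of (1.12) for one realization of (X, Y)."""
        total = 0.0
        for i in range(len(x)):
            di = abs(delta_i(f, x, y, i))
            second = sum(abs(delta_ij(f, x, y, i, j)) for j in range(i))
            total += di**2 * (di / 3 + second / 4)
        return total

    rng = np.random.default_rng(1)
    kn = 50
    x = rng.binomial(1, 0.3, kn).astype(float)
    y = rng.binomial(1, 0.3, kn).astype(float)
    sample_mean = lambda v: v.sum() / len(v)
    print(d_n(sample_mean, x, y))   # of order kn**-2; all second-order terms are 0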
In Section 2 we are going to prove Theorem 1.9 via the Gärtner-Ellis theorem. In [Cat03] an inequality has been proved which relates the logarithm of a Laplace transform to the expectation and the variance of the observed random variable. Catoni proves a similar result for the logarithm of a Laplace transform of random variables with Markovian dependence. One can find a different $d(n)$ in [Cat03, Theorem 3.1]. To simplify notation we did not generalize Theorem 1.9, but the proof can be adapted immediately. In Section 3 we obtain moderate deviations for several symmetric statistics, including the sample mean and U-statistics with independent and Markovian entries. In Section 4 we prove Theorems 1.1 and 1.3.
2. Moderate Deviations via Laplace Transforms

Theorem 1.9 is an application of the following theorem:

Theorem 2.1. (Catoni, 2003)
In the setting of Theorem 1.9, assuming that the random variables in $X$ are independent, one obtains for all $s \in \mathbb{R}_+$,
$$\left| \log E \exp\left( s f(X) \right) - s E[f(X)] - \frac{s^2}{2} V f(X) \right| \le s^3 d(n) = \sum_{i=1}^{k(n)} \frac{s^3}{3} |\Delta_i f(X_1^{k(n)}; Y_i)|^3 + \sum_{i=1}^{k(n)} \sum_{j=1}^{i-1} \frac{s^3}{4} |\Delta_i f(X_1^{k(n)}; Y_i)|^2\, |\Delta_i \Delta_j f(X_1^{k(n)}; Y_j, Y_i)|. \qquad (2.13)$$
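Let us record how (2.13) feeds into the Gärtner-Ellis theorem (a sketch of the argument carried out in this section): setting $s = \lambda s_n / t_n$ for $\lambda > 0$ and dividing by $s_n$,
$$\frac{1}{s_n} \log E \exp\left( \lambda s_n\, \frac{f(X) - E[f(X)]}{t_n} \right) = \frac{\lambda^2}{2} \frac{s_n}{t_n^2} V f(X) + O\left( \lambda^3 \frac{s_n^2}{t_n^3}\, d(n) \right) \xrightarrow{n\to\infty} \frac{\lambda^2 C}{2}$$
by conditions (1) and (2) of Theorem 1.9 (negative $\lambda$ is handled by applying (2.13) to $-f$), and the Legendre transform $\sup_{\lambda} \left( \lambda x - \frac{\lambda^2 C}{2} \right) = \frac{x^2}{2C}$ produces the rate function.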
Proof of Theorem 2.1. We decompose $f(X)$ into martingale differences
$$F_i(f(X)) = E\left[ f(X) \mid X_1, \ldots, X_i \right] - E\left[ f(X) \mid X_1, \ldots, X_{i-1} \right], \quad \text{for all } i \in \{1, \ldots, k(n)\}.$$
The variance can be represented by $V f(X) = \sum_{i=1}^{k(n)} E\left[ \left( F_i(f(X)) \right)^2 \right]$.
Catoni uses the triangle inequality and compares the two terms $\log E e^{s f(X)} - s E[f(X)]$ and $\frac{s^2}{2} V f(X)$ to the above representation of the variance with respect to the Gibbs measure with density
$$dP_W := \frac{e^W}{E[e^W]}\, dP,$$
where $W$ is a bounded measurable function of $(X_1, \ldots, X_{k(n)})$. We denote expectation with respect to this Gibbs measure by $E_W$, e.g.
$$E_W[X] := \frac{E[X \exp(W)]}{E[\exp(W)]}.$$
On the one hand, Catoni bounds the difference
$$\left| \log E e^{s f(X)} - s E[f(X)] - \frac{s^2}{2} \sum_{i=1}^{k(n)} E_{s E[f(X) - E[f(X)] \mid X_1, \ldots, X_{i-1}]} \left[ \left( F_i(f(X)) \right)^2 \right] \right|$$
via partial integration:
$$\left| \log E e^{s (f(X) - E[f(X)])} - \frac{s^2}{2} \sum_{i=1}^{k(n)} E_{s E[f(X) - E[f(X)] \mid X_1, \ldots, X_{i-1}]} \left[ F_i^2(f(X)) \right] \right| = \left| \sum_{i=1}^{k(n)} \int_0^s \frac{(s - \alpha)^2}{2}\, M^3_{s E[f(X) - E[f(X)] \mid X_1, \ldots, X_{i-1}] + \alpha F_i(f(X))} \left[ F_i(f(X)) \right] d\alpha \right|,$$
where $M^3_U[X] := E_U\left[ \left( X - E_U[X] \right)^3 \right]$ for a bounded measurable function $U$ of $(X_1, \ldots, X_{k(n)})$.
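The mechanism behind this identity is the standard cumulant computation (our summary): for fixed bounded $U$ and $F$, the function $\psi(\alpha) := \log E\left[ e^{U + \alpha F} \right]$ satisfies $\psi'(\alpha) = E_{U + \alpha F}[F]$, $\psi''(\alpha) = E_{U + \alpha F}\left[ (F - E_{U + \alpha F}[F])^2 \right]$ and $\psi'''(\alpha) = M^3_{U + \alpha F}[F]$, so that Taylor's formula with integral remainder gives
$$\psi(s) = \psi(0) + s\, \psi'(0) + \frac{s^2}{2}\, \psi''(0) + \int_0^s \frac{(s - \alpha)^2}{2}\, \psi'''(\alpha)\, d\alpha.$$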
Moreover,
$$\left| \sum_{i=1}^{k(n)} \int_0^s \frac{(s - \alpha)^2}{2}\, M^3_{s E[f(X) - E[f(X)] \mid X_1, \ldots, X_{i-1}] + \alpha F_i(f(X))} \left[ F_i(f(X)) \right] d\alpha \right| \le \left| \sum_{i=1}^{k(n)} \| F_i(f(X)) \|_\infty^3 \int_0^s (s - \alpha)^2\, d\alpha \right| \le \sum_{i=1}^{k(n)} \frac{s^3}{3}\, |\Delta_i f(X_1^{k(n)}; Y_i)|^3.$$