Necessary and Sufficient Conditions for the Strong Law of Large Numbers for U-statistics


Rafał Latała and Joel Zinn
Warsaw University and Texas A&M University

arXiv:math/9901068v1 [math.PR] 17 Jan 1999

AMS 1991 Subject Classification: Primary 60F15, Secondary 60E15
Key words and phrases: U-statistics, Strong Law of Large Numbers, random series

Abstract

Under some mild regularity on the normalizing sequence, we obtain necessary and sufficient conditions for the Strong Law of Large Numbers for (symmetrized) U-statistics. We also obtain nasc's for the a.s. convergence of series of an analogous form.

1 Introduction.

The general question addressed in this paper is that of necessary and sufficient conditions for

$$\frac{1}{\gamma_n} \sum_{i \in I_n} \varepsilon_i h(X_i) \to 0 \quad \text{a.s.},$$

where $I_n = \{i = (i_1, i_2, \ldots, i_d) : 1 \le i_1 < i_2 < \ldots < i_d \le n\}$, $\{X_j\}_{j=1}^{\infty}$ is a sequence of iid r.v.'s, and $X_i = (X_{i_1}, \ldots, X_{i_d})$. With no loss of generality we may assume that $h$ is symmetric in its arguments. Further, as in [CGZ] and in [Zh1], it is also important to consider the question of the almost sure convergence to zero of

$$\frac{1}{\gamma_n} \max_{i \in I_n} |h(X_i)|.$$

In fact, it is through the study of this problem that one is able to complete the characterization for the original question.

Without the symmetrization by Rademachers, Hoeffding ([H]) in 1961 proved that for general $d$ and $\gamma_n = \binom{n}{d}$, mean zero is sufficient for the normalized sum above to go to zero almost surely. And, under a $p$th moment one has the a.s. convergence to zero with $\gamma_n = n^{d/p}$ ([S] when $0 < p < 1$, in the product case with mean zero [T] for $1 \le p < 2$ and in the case of general degenerate $h$ [GZ] for $1 < p < 2$).

It is somewhat surprising that it took until the 90's to see that Hoeffding's sufficient condition was not necessary ([GZ]). In the particular case in which $d = 2$, $h(x,y) = xy$ and the variables are symmetric, necessary and sufficient conditions were given in ([CGZ]) in 1995.
This was later extended to $d \ge 3$ by Zhang ([Zh1]). Very recently Zhang [Zh2] obtained "computable" necessary and sufficient conditions in the case $d = 2$ and, in general, found equivalent conditions in terms of a law of large numbers for modified maxima. Other related work is that of [M], in which the different indices go to infinity at their own pace, and [G], in which the variables in different coordinates can be based on different distributions.

In this paper we obtain nasc's for strong laws for 'maxima' for general $d$. This likely would have enabled one to complete Zhang's program. However, we also found a more classical way of handling the reduction of the case of sums to the case of max's.

The organization of the paper is as follows. In Section 2 we introduce the necessary notation and give the basic Lemmas. Now the form of our main Theorem is inductive. The reason we present the result in this form is that the conditions in the case $d > 2$ are quite involved. Because of the format of our Theorem we first present, in Section 3, the case that the function $h$ is the product of the coordinates. As mentioned earlier, this case received quite a bit of attention, culminating in Zhang's paper ([Zh1]). In the first part of Section 3 we show how the methods developed in this paper allow one to give a relatively simple, and perhaps transparent, proof of Zhang's result. We then prove the main result, namely, the nasc's for the Strong Law for symmetric U-statistics. Again, because of our inductive format, in order to clearly bring out the main ideas of our proof, we also give a simple proof of Zhang's result for the case $d = 2$.

Finally, in Section 4 we consider the question of convergence of multidimensional random series $\sum_{i \in \mathbb{Z}_+^d} h_i(\tilde{X}_i)$. We obtain necessary and sufficient conditions for a.s. convergence in the case of nonnegative or symmetrized kernels. This generalizes the results of [KW1] (case $d = 2$ and $h_{i,j}(x,y) = a_{i,j}xy$).
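
To make the central object of the Introduction concrete, the following is a minimal Python sketch of the normalized symmetrized sum $\frac{1}{\gamma_n}\sum_{i \in I_n} \varepsilon_i h(X_i)$ in the case $d = 2$ with the product kernel $h(x,y) = xy$. It is an illustration under stated assumptions and is not taken from the paper: the function names are hypothetical, the $X_j$ are assumed to be standard normal, and the normalization is assumed to be Hoeffding's $\gamma_n = \binom{n}{2}$. For the product kernel the sum over pairs is computed with the elementary identity $\sum_{i<j} a_i a_j = \frac{1}{2}\big((\sum_i a_i)^2 - \sum_i a_i^2\big)$.

```python
import itertools
import math
import random


def symmetrized_sum_bruteforce(xs, eps, h):
    """Sum of eps_i * eps_j * h(x_i, x_j) over all index pairs i < j (d = 2)."""
    return sum(
        eps[i] * eps[j] * h(xs[i], xs[j])
        for i, j in itertools.combinations(range(len(xs)), 2)
    )


def symmetrized_sum_product_kernel(xs, eps):
    """Same sum for h(x, y) = x * y, via sum_{i<j} a_i a_j = ((sum a)^2 - sum a^2) / 2."""
    a = [e * x for e, x in zip(eps, xs)]
    s1 = sum(a)
    s2 = sum(v * v for v in a)
    return (s1 * s1 - s2) / 2.0


def normalized_sums(xs, eps):
    """Return S_n / C(n, 2) for n = 2, ..., len(xs), with h(x, y) = x * y."""
    out = []
    s1 = s2 = 0.0
    for n, (x, e) in enumerate(zip(xs, eps), start=1):
        s1 += e * x
        s2 += x * x  # (e * x)^2 = x^2 because e is +1 or -1
        if n >= 2:
            out.append((s1 * s1 - s2) / 2.0 / math.comb(n, 2))
    return out


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, only for reproducibility
    n_max = 2000
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_max)]   # assumed law of X_j
    eps = [rng.choice((-1, 1)) for _ in range(n_max)]  # Rademacher signs
    path = normalized_sums(xs, eps)
    for n in (10, 100, 1000, 2000):
        print(n, path[n - 2])
```

Running the script prints the normalized sum along a single simulated path. A single path only illustrates the object whose almost sure convergence is being characterized; it does not verify any of the conditions discussed in the paper.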

2 Preliminaries and Basic Lemmas.

Let us first introduce the multiindex notation we will use in the paper:

• $i = (i_1, i_2, \ldots, i_d)$ - multiindex of size $d$
• $X_i = (X_{i_1}, X_{i_2}, \ldots, X_{i_d})$, where $X_j$ is a sequence of i.i.d. random variables with values in some space $E$ and the common law $\mu$
• $\tilde{X}_i = (X^{(1)}_{i_1}, X^{(2)}_{i_2}, \ldots, X^{(d)}_{i_d})$, where $(X^{(k)}_j)$, $k = 1, \ldots, d$, are independent copies of $(X_j)$
• $\varepsilon_i = \varepsilon_{i_1} \varepsilon_{i_2} \cdots \varepsilon_{i_d}$, where $(\varepsilon_i)$ is a Rademacher sequence (i.e. a sequence of i.i.d. symmetric random variables taking on values $\pm 1$) independent of other random variables
• $\tilde{\varepsilon}_i = \varepsilon^{(1)}_{i_1} \varepsilon^{(2)}_{i_2} \cdots \varepsilon^{(d)}_{i_d}$, where $(\varepsilon^{(j)}_i)$ is a doubly indexed Rademacher sequence independent of other random variables
• $\mu_k = \otimes_{i=1}^{k} \mu$ - product measure on $E^k$
• for $I \subset \{1, 2, \ldots, d\}$, by $E_I$ and $E'_I$ we will denote expectation with respect to $(X^{(k)}_i)_{k \in I}$ and $(X^{(k)}_i)_{k \notin I}$ respectively
• $i_I = (i_k)_{k \in I}$ and $I' = \{1, 2, \ldots, d\} \setminus I$ for $I \subset \{1, 2, \ldots, d\}$
• $I_n = \{i = (i_1, i_2, \ldots, i_d) : 1 \le i_1 < i_2 < \ldots < i_d \le n\}$
• $C_n = \{i = (i_1, i_2, \ldots, i_d) : 1 \le i_1, i_2, \ldots, i_d \le n\}$
• $A_{I,x} = A_{x_I} = \{z \in E^{I'} : \exists_{a \in A}\ a_I = x_I,\ a_{I'} = z\}$ for $A \subset E^d$, $I \subset \{1, \ldots, d\}$.

The results in this section were motivated by the difficulty in computing quantities such as

$$P\Big(\max_{i,j \le n} h(X_i, Y_j) > t\Big),$$

where $\{X_i\}$ are independent random variables, $\{Y_i\}$ is an independent copy, and $h$ is, say, symmetric in its arguments.

In the one-dimensional case, namely $P(\max_{i \le n} \xi_i > t)$, where $\{\xi_i\}$ are independent r.v.'s, we have the simple inequality

$$\frac{1}{2} \min\Big(\sum_i P(|\xi_i| > t), 1\Big) \le P\Big(\max_i |\xi_i| > t\Big) \le \min\Big(\sum_i P(|\xi_i| > t), 1\Big). \tag{1}$$

If this type of inequality held for any dimension, the proofs and results would look much the same as in dimension 1. Here we give an example to see the difference between the cases $d = 1$ and $d > 1$. Consider the set in the unit square given by

$$A = \{(x,y) \in [0,1]^2 : x < a,\ y < b \ \text{ or } \ x < b,\ y < a\}$$

and assume that the $X_i, Y_j$ are iid uniformly distributed on $[0,1]$.
By (1) it easily follows that

$$P\Big(\max_{1 \le i,j \le n} I_A(X_i, Y_j) > 0\Big) \sim \min(na, 1)\min(nb, 1),$$

which is equivalent to $\sum_{i,j=1}^{n} P(I_A(X_i, Y_j) > 0) \sim n^2 ab$ if and only if both $a$ and $b$ are of order $O(\frac{1}{n})$. □

Lemma 1 Suppose that the nonnegative functions $f_i(x_i)$ satisfy the following conditions

$$f_i(\tilde{X}_i) \le 1 \ \text{ a.s. for all } i \tag{2}$$

$$E_I \sum_{i_I} f_i(\tilde{X}_i) \le 1 \ \text{ a.s. for any } I \subset \{1, 2, \ldots, d\},\ 0 < \mathrm{Card}(I) < d. \tag{3}$$

Let $\tilde{m}_1 = E \sum_i f_i(\tilde{X}_i)$, then

$$E\Big(\sum_i f_i(\tilde{X}_i)\Big)^2 \le \tilde{m}_1^2 + (2^d - 1)\tilde{m}_1 \tag{4}$$

and

$$P\Big(\sum_i f_i(\tilde{X}_i) \ge \frac{1}{2}\tilde{m}_1\Big) \ge 2^{-d-2} \min(\tilde{m}_1, 1). \tag{5}$$

Proof. Let $S(d)$ denote the family of nonempty subsets of $\{1, \ldots, d\}$ and for a fixed $I \in S(d)$ and $i$ let

$$\tilde{J}(i, I) = \{j : j_I = i_I \text{ and } j_k \ne i_k \text{ for all } k \notin I\}.$$

Then we have by (2) and (3)

$$E\Big(\sum_i f_i(\tilde{X}_i)\Big)^2 \le \Big(E \sum_i f_i(\tilde{X}_i)\Big)^2 + \sum_{I \in S(d)} \sum_i E_I\Big[E'_I f_i(\tilde{X}_i)\, E'_I \sum_{j \in \tilde{J}(i,I)} f_j(\tilde{X}_j)\Big]$$

$$\le \tilde{m}_1^2 + \sum_{I \in S(d)} \sum_i E_I E'_I f_i(\tilde{X}_i) = \tilde{m}_1^2 + (2^d - 1)\tilde{m}_1.$$

The inequality (5) follows by (4) and the Paley-Zygmund inequality. □

The next Lemma is an undecoupled version of Lemma 1; its proof is similar to that of Lemma 1 and is omitted.

Lemma 2 Suppose that the nonnegative functions $f_i(x_i)$ satisfy the following conditions

$$f_i(X_i) \le 1 \ \text{ a.s. for all } i$$

and

$$E'_I \sum_{j \in J(i,I)} f_j(X_j) \le 1 \ \text{ a.s. for all } i \text{ and } I \subset \{1, 2, \ldots, d\},\ 0 < \mathrm{Card}(I) < d,$$

where $J(i, I) = \{j : \{k : \exists_l\ i_k = j_l\} = I\}$. Let $m_1 = E \sum_i f_i(X_i)$, then

$$E\Big(\sum_i f_i(X_i)\Big)^2 \le m_1^2 + (2^d - 1)m_1 \tag{6}$$

and

$$P\Big(\sum_i f_i(X_i) \ge \frac{1}{2} m_1\Big) \ge 2^{-d-2} \min(m_1, 1). \tag{7}$$

In the rest of this paper we will refer to the next Corollary as the "Section Lemma".

Corollary 1 If the set $A \subset E^d$ satisfies the condition

$$n^{d-l} \mu_{d-l}(A_{I,X_I}) \le 1 \ \text{ a.s. for all } I \subset \{1, \ldots, d\} \text{ with } 0 < \mathrm{Card}(I) = l < d$$

then

$$P(\exists_{i \in C_n}\ \tilde{X}_i \in A) \ge 2^{-d-2} \min(n^d \mu_d(A), 1)$$

and for $n \ge d$

$$P(\exists_{i \in I_n}\ X_i \in A) \ge 2^{-d-2} d^{-d} \min(n^d \mu_d(A), 1).$$

Proof. The first inequality follows immediately by Lemma 1 applied to $f_i = I_A$.
To prove the second inequality we use Lemma 2 and notice that

$$\min\Big(\binom{n}{d} \mu_d(A), 1\Big) \ge d^{-d} \min(n^d \mu_d(A), 1). \qquad \Box$$

3 Strong Laws of Large Numbers

We will assume in this section that the sequence $\gamma_n$ satisfies the following regularity conditions

$$\gamma_n \text{ is nondecreasing} \tag{8}$$

$$\gamma_{2n} \le C\gamma_n \ \text{ for any } n \tag{9}$$

$$\sum_{k \ge l} \frac{2^{dk}}{\gamma_{2^k}^2} \le C \frac{2^{dl}}{\gamma_{2^l}^2} \ \text{ for any } l = 1, 2, \ldots \tag{10}$$

As mentioned in the Introduction, we first give a proof of Zhang's result [Zh1] for the product case, i.e. $h(x) = \prod_{i=1}^{d} x_i$ for $x \in \mathbb{R}^d$. To state the SLLN in this case we need to define numbers $c_n$ by the formula

$$c_n = \min\Big\{c > 0 : nE\Big(\frac{X^2}{c^2} \wedge 1\Big) \le 1\Big\}.$$

Theorem 1 Assume that $h(x) = \prod_{i=1}^{d} x_i$, and that the r.v.'s $X_i$ are symmetric. Then, under the regularity assumptions (8)-(10), the following are equivalent:

$$\frac{1}{\gamma_n} \sum_{i \in I_n} h(X_i) = \frac{1}{\gamma_n} \sum_{i \in I_n} \prod_{r=1}^{d} X_{i_r} \to 0 \ \text{ a.s.} \tag{11}$$

$$\sum_{k=1}^{\infty} 2^{kl} P\Big(\prod_{r=1}^{l} X_r^2 > \frac{\gamma_{2^k}^2}{c_{2^k}^{2(d-l)}},\ \min_{r \le l} X_r^2 > c_{2^k}^2\Big) < \infty \ \text{ for all } 1 \le l \le d. \tag{12}$$

Proof. We give only the proof of the necessity of the conditions (12). The sufficiency can be proved as in Theorem 2. Let

$$T_n^{(r)} = \sum_{i_r=1}^{n} \big(X_{i_r}^{(r)}\big)^2$$

and

$$T_n^{(r)}(c) = \sum_{i_r=1}^{n} \big(X_{i_r}^{(r)}\big)^2 \wedge c^2.$$

Step 1. We first reduce to the sum of squares, i.e. we will show that condition (11) implies

$$\gamma_n^{-2} \sum_{i \in I_n} \prod_{r=1}^{d} X_{i_r}^2 \to 0 \ \text{ a.s.} \tag{13}$$

By the symmetry of $X$ we have that $\gamma_n^{-1} \sum_{i \in I_n} \prod_{r=1}^{d} \varepsilon_{i_r} X_{i_r} \to 0$ a.s. Thus for a.a. sequences $(X_i)$, the Walsh sums (i.e. the linear combinations of products of $d$ Rademachers) converge to 0 a.s. Hence, they converge in probability. This implies (by a result of Bonami about hypercontractivity of Walshes [B]) that for a.a. sequences $(X_i)$, $\gamma_n^{-2} \sum_{i \in I_n} \prod_{r=1}^{d} X_{i_r}^2 \to 0$, and (13) is proved.

Step 2. We now go to a dyadic subsequence and then decouple. By the Borel-Cantelli Lemma, the condition (13) implies that

$$\forall_{\varepsilon > 0}\ \sum_{k=1}^{\infty} P\Big(\sum_{i \in I_{2^{k-1}}} \prod_{r=1}^{d} X_{i_r}^2 \ge \varepsilon \gamma_{2^k}^2\Big) < \infty.$$

Now let us notice that $I_{2^k} \supseteq \{i \in I_{2^k} : (r-1)2^{k-l} < i_r \le r2^{k-l}\}$ if $l$ is such that $2^l \ge d$.
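Moreover the random variables in these blocks are independent of the other blocks, thus we obtain

$$\forall_{\varepsilon > 0}\ \sum_{k=l+1}^{\infty} P\Big(\sum_{i \in C_{2^{k-l-1}}} \prod_{r=1}^{d} \big(X_{i_r}^{(r)}\big)^2 \ge \varepsilon \gamma_{2^k}^2\Big) < \infty.$$

Hence, using the regularity assumption (9),

$$\forall_{\varepsilon > 0}\ \sum_{k=1}^{\infty} P\Big(\prod_{r=1}^{d} T_{2^k}^{(r)} \ge \varepsilon \gamma_{2^k}^2\Big) < \infty. \tag{14}$$

Step 3. At this point we use the 1-dimensional case of Lemma 1. We apply it to

$$c_n^{-2} T_n^{(r)}(c_n) = \sum_{j=1}^{n} \frac{\big(X_j^{(r)}\big)^2}{c_n^2} \wedge 1$$

and notice that $E c_n^{-2} T_n^{(r)}(c_n) = 1$ by the definition of $c_n$. We get that

$$P\Big(T_n^{(r)}(c_n) \ge \frac{1}{2} c_n^2\Big) = P\Big(c_n^{-2} T_n^{(r)}(c_n) \ge \frac{1}{2} E c_n^{-2} T_n^{(r)}(c_n)\Big) \ge \frac{1}{8}.$$

Hence

$$P\Big(\prod_{r=l+1}^{d} T_n^{(r)} \ge \frac{c_n^{2(d-l)}}{2^{d-l}}\Big) \ge P\Big(\prod_{r=l+1}^{d} T_n^{(r)}(c_n) \ge \frac{c_n^{2(d-l)}}{2^{d-l}}\Big) \ge \Big(\frac{1}{8}\Big)^{d-l}$$

and, taking $n = 2^k$,

$$P\Big(\prod_{r=1}^{d} T_n^{(r)} \ge 2^{l-d} \gamma_{2^k}^2\Big) \ge \Big(\frac{1}{8}\Big)^{d-l} P\Big(\prod_{r=1}^{l} T_n^{(r)} \ge \frac{\gamma_{2^k}^2}{c_{2^k}^{2(d-l)}}\Big).$$

Thus condition (14) yields

$$\sum_{k=1}^{\infty} P\Big(\max_{i_1, \ldots, i_l \le 2^k} \prod_{r=1}^{l} \big(X_{i_r}^{(r)}\big)^2 > \frac{\gamma_{2^k}^2}{c_{2^k}^{2(d-l)}}\Big) < \infty. \tag{15}$$

Now, here is the main point.

Step 4. At this point we need to replace the max inside the probability with $2^{kl}$ outside the probability. To do this we use the Section Lemma (Corollary 1). To get small sections there are a variety of choices. To obtain Zhang's result, we reduce the probabilities even further by intersecting the sets in the following manner:

$$\sum_{k=1}^{\infty} P\Big(\max_{i_1, \ldots, i_l \le 2^k} \prod_{r=1}^{l} \big(X_{i_r}^{(r)}\big)^2 I_{\{(X_{i_r}^{(r)})^2 > c_{2^k}^2\}} > \frac{\gamma_{2^k}^2}{c_{2^k}^{2(d-l)}}\Big) < \infty.$$

To see why we have small sections, just note that

$$P(X^2 > c_{2^k}^2) \le \frac{E(X^2 \wedge c_{2^k}^2)}{c_{2^k}^2} = \frac{1}{2^k}.$$

Now we just use the Section Lemma to get

$$\sum_{k=1}^{\infty} 2^{kl} P\Big(\prod_{r=1}^{l} X_r^2 I_{\{X_r^2 > c_{2^k}^2\}} > \frac{\gamma_{2^k}^2}{c_{2^k}^{2(d-l)}}\Big) < \infty.$$

Or, equivalently,

$$\sum_{k=1}^{\infty} 2^{kl} P\Big(\prod_{r=1}^{l} X_r^2 > \frac{\gamma_{2^k}^2}{c_{2^k}^{2(d-l)}},\ \min_{1 \le r \le l} X_r^2 > c_{2^k}^2\Big) < \infty,$$

which yields (12). □

In Theorem 2 we reduce the SLLN for symmetric or nonnegative kernels to a SLLN for "modified maxima". To see what this means consider the case $d = 2$. Then,

$$A_{k,2} = \{(x,y) \in E^2 : h^2(x,y) \le \gamma_{2^k}^2,\ 2^k E_Y h^2 I_{h^2 \le \gamma_{2^k}^2}(x, Y) \le \gamma_{2^k}^2,\ 2^k E_X h^2 I_{h^2 \le \gamma_{2^k}^2}(X, y) \le \gamma_{2^k}^2\}.$$

So that

$$\{\exists_{i \in C_{2^k}}\ \tilde{X}_i \notin A_{k,2}\} = \Big\{\max_{i \in C_{2^k}} \varphi(\tilde{X}_i) > \gamma_{2^k}^2\Big\},$$

where $\varphi(x,y) = h^2(x,y) \vee 2^k E_Y h^2 I_{h^2 \le \gamma_{2^k}^2}(x, Y) \vee 2^k E_X h^2 I_{h^2 \le \gamma_{2^k}^2}(X, y)$.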
In [Zh2] Zhang, using different methods, also reduced the problem to "modified maxima". We continue in Theorem 3 to find nasc's for the SLLN for the maximum, which, hence, could also be used to complete Zhang's program.

For a measurable function $h$ on $E^d$ which is symmetric with respect to permutations of the variables, we define for $k = 1, 2, \ldots$

$$A_{k,1} = \{x \in E^d : h^2(x) \le \gamma_{2^k}^2\}$$

and for $l = 1, \ldots, d-1$

$$A_{k,l+1} = \{x \in A_{k,l} : 2^{kl} E_I h^2 I_{A_{k,l}}(x) \le \gamma_{2^k}^2 \ \text{ for all } I \subset \{1, 2, \ldots, d\},\ \mathrm{Card}(I) = l\}.$$

Theorem 2 Suppose that assumptions (8)-(10) are satisfied and the sets $A_{k,l}$ are defined as above. Then the following conditions are equivalent:

$$\frac{1}{\gamma_n} \sum_{i \in I_n} \varepsilon_i h(X_i) \to 0 \ \text{ a.s.} \tag{16}$$

$$\frac{1}{\gamma_n} \sum_{i \in C_n} \tilde{\varepsilon}_i h(\tilde{X}_i) \to 0 \ \text{ a.s.} \tag{17}$$

$$\frac{1}{\gamma_n^2} \sum_{i \in I_n} h^2(X_i) \to 0 \ \text{ a.s.} \tag{18}$$

$$\frac{1}{\gamma_n^2} \sum_{i \in C_n} h^2(\tilde{X}_i) \to 0 \ \text{ a.s.} \tag{19}$$

$$\sum_{k=1}^{\infty} P(\exists_{i \in I_{2^k}}\ X_i \notin A_{k,d}) < \infty \tag{20}$$

$$\sum_{k=1}^{\infty} P(\exists_{i \in C_{2^k}}\ \tilde{X}_i \notin A_{k,d}) < \infty \tag{21}$$
