arXiv:0901.2343v1 [math.PR] 15 Jan 2009

On weak approximation of U-statistics

Masoud M. Nasari∗
School of Mathematics and Statistics, Carleton University, Canada
e-mail: [email protected]

Abstract

This paper investigates weak convergence of U-statistics via approximation in probability. The classical condition that the second moment of the kernel of the underlying U-statistic exists is relaxed to having 4/3 moments only (modulo a logarithmic term). Furthermore, the conditional expectation of the kernel is only assumed to be in the domain of attraction of the normal law (instead of the classical two-moment condition).

1 Introduction

Employing truncation arguments and the concept of weak convergence of self-normalized and studentized partial sums, which were inspired by the works of Csörgő, Szyszkowicz and Wang in [5], [4], [2] and [3], we derive weak convergence results via approximations in probability for pseudo-self-normalized U-statistics and U-statistic type processes. Our results require only that (i) the expected value of the product of the kernel of the underlying U-statistic to the exponent 4/3 and its logarithm exists (instead of having 2 moments of the kernel), and that (ii) the conditional expected value of the kernel on each observation is in the domain of attraction of the normal law (instead of having 2 moments). Similarly relaxed moment conditions were first used by Csörgő, Szyszkowicz and Wang [5] for U-statistics type processes for change-point problems in terms of kernels of order 2 (cf. Remark 5). Our results in this exposition extend their work to approximating U-statistics with higher order kernels. The weak convergence results for U-statistics thus obtained in turn extend those obtained by R.G. Miller Jr. and P.K. Sen in [9] in 1972 (cf. Remark 3).
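To fix ideas before the formal setting, the projection approximation underlying these results can be sketched numerically. The following is a minimal illustration, not taken from the paper: the kernel $h(x,y) = xy$, the normal data, the sample size, and the helper names `u_stat` and `h1` are all illustrative assumptions.

```python
from itertools import combinations
import random

def u_stat(xs, h, m):
    """U-statistic: average of the kernel h over all m-subsets of the sample."""
    vals = [h(*c) for c in combinations(xs, m)]
    return sum(vals) / len(vals)

# Illustrative kernel of order m = 2: h(x, y) = x*y, so for X ~ N(1, 1)
# theta = (EX)^2 = 1 and the projection kernel is h1(x) = E[h(x, X2)] - theta = x - 1.
random.seed(1)
xs = [random.gauss(1.0, 1.0) for _ in range(200)]
n, m, theta = len(xs), 2, 1.0
h1 = lambda x: x - 1.0

U = u_stat(xs, lambda x, y: x * y, m)
projection = theta + (m / n) * sum(h1(x) for x in xs)
# The difference is the degenerate remainder of the Hoeffding decomposition,
# of order O_P(1/n); controlling such remainders uniformly in t is the
# technical heart of the approximation-in-probability approach.
print(abs(U - projection))
```

The printed difference is small relative to the fluctuations of the linear term, which is why the limit behaviour of the U-statistic is driven by the i.i.d. sum of projections.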
The latter results of Miller and Sen are based on the classical condition of the existence of the second moment of the kernel of the underlying U-statistic, which in turn implies the existence of the second moment of the conditional expected value of the kernel on each of the observations.

∗Research supported by a Carleton University Faculty of Graduate Studies and Research scholarship, and NSERC Canada Discovery Grants of M. Csörgő and M. Mojirsheibani at Carleton University.

2 Main results and background

Let $X_1, X_2, \ldots$ be a sequence of non-degenerate real-valued i.i.d. random variables with distribution $F$. Let $h(X_1,\ldots,X_m)$, symmetric in its arguments, be a Borel-measurable real-valued kernel of order $m \ge 1$, and consider the parameter
$$\theta = \int_{\mathbb{R}^m} \cdots \int h(x_1,\ldots,x_m)\, dF(x_1)\cdots dF(x_m) < \infty.$$
The corresponding U-statistic (cf. Serfling [10] or Hoeffding [8]) is
$$U_n = \binom{n}{m}^{-1} \sum_{C(n,m)} h(X_{i_1},\ldots,X_{i_m}),$$
where $m \le n$ and $\sum_{C(n,m)}$ denotes the sum over $C(n,m) = \{1 \le i_1 < \cdots < i_m \le n\}$.

In order to state our results, we first need the following definition.

Definition. A sequence $X, X_1, X_2, \ldots$ of i.i.d. random variables is said to be in the domain of attraction of the normal law ($X \in \mathrm{DAN}$) if there exist sequences of constants $A_n$ and $B_n > 0$ such that, as $n \to \infty$,
$$\frac{\sum_{i=1}^n X_i - A_n}{B_n} \xrightarrow{d} N(0,1).$$

Remark 1. Further to this definition of DAN, it is known that $A_n$ can be taken as $nE(X)$ and $B_n = n^{1/2}\ell_X(n)$, where $\ell_X(n)$ is a slowly varying function at infinity (i.e., $\lim_{n\to\infty} \ell_X(nk)/\ell_X(n) = 1$ for any $k > 0$), defined by the distribution of $X$. Moreover, $\ell_X(n) = (\mathrm{Var}(X))^{1/2} > 0$ if $\mathrm{Var}(X) < \infty$, and $\ell_X(n) \to \infty$, as $n \to \infty$, if $\mathrm{Var}(X) = \infty$. Also, $X$ has all moments less than 2, and the variance of $X$ is positive, but need not be finite.

Also define the pseudo-self-normalized U-process as follows:
$$U^*_{[nt]} = \begin{cases} 0, & 0 \le t < \dfrac{m}{n},\\[6pt] \dfrac{U_{[nt]} - \theta}{V_n}, & \dfrac{m}{n} \le t \le 1, \end{cases}$$
where $[\,\cdot\,]$
denotes the greatest integer function, $V_n^2 := \sum_{i=1}^n \tilde h_1^2(X_i)$ and $\tilde h_1(x) = E\big(h(X_1,\ldots,X_m) - \theta \mid X_1 = x\big)$.

Theorem 1. If

(a) $E\big(|h(X_1,\ldots,X_m)|^{4/3} \log|h(X_1,\ldots,X_m)|\big) < \infty$ and $\tilde h_1(X_1) \in \mathrm{DAN}$,

then, as $n \to \infty$, we have

(b) $\dfrac{[nt_0]}{m}\, U^*_{[nt_0]} \xrightarrow{d} N(0,t_0)$, for $t_0 \in (0,1]$;

(c) $\dfrac{[nt]}{m}\, U^*_{[nt]} \xrightarrow{d} W(t)$ on $(D[0,1],\rho)$, where $\rho$ is the sup-norm for functions in $D[0,1]$ and $\{W(t),\, 0 \le t \le 1\}$ is a standard Wiener process;

(d) On an appropriate probability space for $X_1, X_2, \ldots$, we can construct a standard Wiener process $\{W(t),\, 0 \le t < \infty\}$ such that
$$\sup_{0\le t\le 1} \left| \frac{[nt]}{m}\, U^*_{[nt]} - \frac{W(nt)}{n^{1/2}} \right| = o_P(1).$$

Remark 2. The statement (c), whose notion will be used throughout, stands for the following functional central limit theorem (cf. Remark 2.1 in Csörgő, Szyszkowicz and Wang [3]). On account of (d), as $n \to \infty$, we have
$$g\big(S_{[n\,\cdot\,]}/V_n\big) \xrightarrow{d} g\big(W(\cdot)\big)$$
for all $g : D = D[0,1] \to \mathbb{R}$ that are $(D,\mathcal{D})$-measurable and $\rho$-continuous, or $\rho$-continuous except at points forming a set of Wiener measure zero on $(D,\mathcal{D})$, where $\mathcal{D}$ denotes the $\sigma$-field of subsets of $D$ generated by the finite-dimensional subsets of $D$.

Theorem 1 is fashioned after the work on weak convergence of self-normalized partial sums processes of Csörgő, Szyszkowicz and Wang in [2], [3] and [4], which constitute extensions of the contribution of Giné, Götze and Mason in [6]. As to $\tilde h_1(X_1) \in \mathrm{DAN}$, since $E\tilde h_1(X_1) = 0$ and $\tilde h_1(X_1), \tilde h_1(X_2), \ldots$ are i.i.d. random variables, Theorem 1 of [2] (cf. also Theorem 2.3 of [3]) in this context reads as follows.

Lemma 1.
As $n \to \infty$, the following statements are equivalent:

(a) $\tilde h_1(X_1) \in \mathrm{DAN}$;

(b) $\displaystyle\sum_{i=1}^{[nt_0]} \tilde h_1(X_i)\big/ V_n \xrightarrow{d} N(0,t_0)$ for $t_0 \in (0,1]$;

(c) $\displaystyle\sum_{i=1}^{[nt]} \tilde h_1(X_i)\big/ V_n \xrightarrow{d} W(t)$ on $(D[0,1],\rho)$, where $\rho$ is the sup-norm metric for functions in $D[0,1]$ and $\{W(t),\, 0\le t\le 1\}$ is a standard Wiener process;

(d) On an appropriate probability space for $X_1,X_2,\ldots$, we can construct a standard Wiener process $\{W(t),\, 0 \le t < \infty\}$ such that
$$\sup_{0\le t\le 1}\left| \sum_{i=1}^{[nt]} \tilde h_1(X_i)\big/ V_n - \frac{W(nt)}{n^{1/2}} \right| = o_P(1).$$

Also, in the same vein, Proposition 2.1 of [3] for $\tilde h_1(X_1) \in \mathrm{DAN}$ reads as follows.

Lemma 2. As $n \to \infty$, the following statements are equivalent:

(a) $\tilde h_1(X_1) \in \mathrm{DAN}$;

there is a sequence of constants $B_n \nearrow \infty$ such that

(b) $\displaystyle\sum_{i=1}^{[nt_0]} \tilde h_1(X_i)\big/ B_n \xrightarrow{d} N(0,t_0)$ for $t_0 \in (0,1]$;

(c) $\displaystyle\sum_{i=1}^{[nt]} \tilde h_1(X_i)\big/ B_n \xrightarrow{d} W(t)$ on $(D[0,1],\rho)$, where $\rho$ is the sup-norm metric for functions in $D[0,1]$ and $\{W(t),\, 0\le t\le 1\}$ is a standard Wiener process;

(d) On an appropriate probability space for $X_1,X_2,\ldots$, we can construct a standard Wiener process $\{W(t),\, 0 \le t < \infty\}$ such that
$$\sup_{0\le t\le 1}\left| \sum_{i=1}^{[nt]} \tilde h_1(X_i)\big/ B_n - \frac{W(nt)}{n^{1/2}} \right| = o_P(1).$$

In view of Lemma 2, a scalar-normalized companion of Theorem 1 reads as follows.

Theorem 2. If

(a) $E\big(|h(X_1,\ldots,X_m)|^{4/3} \log|h(X_1,\ldots,X_m)|\big) < \infty$ and $\tilde h_1(X_1) \in \mathrm{DAN}$,

then, as $n \to \infty$, we have

(b) $\dfrac{[nt_0]}{m}\,\dfrac{U_{[nt_0]} - \theta}{B_n} \xrightarrow{d} N(0,t_0)$, where $t_0 \in (0,1]$;

(c) $\dfrac{[nt]}{m}\,\dfrac{U_{[nt]} - \theta}{B_n} \xrightarrow{d} W(t)$ on $(D[0,1],\rho)$, where $\rho$ is the sup-norm for functions in $D[0,1]$ and $\{W(t),\, 0 \le t \le 1\}$ is a standard Wiener process;

(d) On an appropriate probability space for $X_1,X_2,\ldots$, we can construct a standard Wiener process $\{W(t),\, 0 \le t < \infty\}$ such that
$$\sup_{0\le t\le 1}\left| \frac{[nt]}{m}\,\frac{U_{[nt]} - \theta}{B_n} - \frac{W(nt)}{n^{1/2}} \right| = o_P(1).$$

By defining
$$Y^*_n(t) = 0 \quad \text{for } 0 \le t \le \frac{m-1}{n},$$
$$Y^*_n\Big(\frac{k}{n}\Big) = \frac{k(U_k - \theta)}{m\sqrt{n\,\mathrm{Var}(\tilde h_1(X_1))}} \quad \text{for } k = m,\ldots,n,$$
and, for $t \in \big[\frac{k-1}{n}, \frac{k}{n}\big]$, $k = m,\ldots,n$,
$$Y^*_n(t) = Y^*_n\Big(\frac{k-1}{n}\Big) + n\Big(t - \frac{k-1}{n}\Big)\Big(Y^*_n\Big(\frac{k}{n}\Big) - Y^*_n\Big(\frac{k-1}{n}\Big)\Big),$$
we can state the already mentioned 1972 weak convergence result of Miller and Sen as follows.

Theorem A. If

(I) $0 < E\big[(h(X_1,X_2,\ldots,X_m) - \theta)(h(X_1,X_{m+1},\ldots,X_{2m-1}) - \theta)\big] = \mathrm{Var}(\tilde h_1(X_1)) < \infty$

and

(II) $Eh^2(X_1,\ldots,X_m) < \infty$,

then, as $n \to \infty$,
$$Y^*_n(t) \xrightarrow{d} W(t) \quad \text{on } (C[0,1],\rho),$$
where $\rho$ is the sup-norm for functions in $C[0,1]$ and $\{W(t),\, 0 \le t \le 1\}$ is a standard Wiener process.

Remark 3. When $Eh^2(X_1,\ldots,X_m) < \infty$, first note that the existence of the second moment of the kernel $h(X_1,\ldots,X_m)$ implies the existence of the second moment of $\tilde h_1(X_1)$. Therefore, according to Remark 1, $B_n = \sqrt{n\,E\tilde h_1^2(X_1)}$. This means that under the conditions of Theorem A, Theorem 2 holds true and, via (c) of the latter, it yields a version of Theorem A on $D[0,1]$. We note in passing that our method of proof differs from that of the cited paper of Miller and Sen. We use a method of truncation à la [5] to relax the condition $Eh^2(X_1,\ldots,X_m) < \infty$ to the less stringent moment condition $E\big(|h(X_1,\ldots,X_m)|^{4/3}\log|h(X_1,\ldots,X_m)|\big) < \infty$ that, in turn, enables us to have $\tilde h_1(X_1) \in \mathrm{DAN}$ in general, with the possibility of infinite variance.

Remark 4. Theorem 1 of [2] (Theorem 2.3 in [3]), as well as Proposition 2.1 of [3], continue to hold true in terms of Donskerized partial sums that are elements of $C[0,1]$. Consequently, the same is true for the above stated Lemmas 1 and 2, concerning $\tilde h_1(X_1) \in \mathrm{DAN}$. This in turn, mutatis mutandis, renders appropriate versions of Theorems 1 and 2 to hold true in $(C[0,1],\rho)$.

Proof of Theorems 1 and 2. In view of Lemmas 1 and 2, in order to prove Theorems 1 and 2, we only have to prove the following theorem.

Theorem 3.
If $E\big(|h(X_1,\ldots,X_m)|^{4/3}\log|h(X_1,\ldots,X_m)|\big) < \infty$ and $\tilde h_1(X_1) \in \mathrm{DAN}$, then, as $n \to \infty$, we have
$$\sup_{0\le t\le 1}\left| \frac{[nt]}{m}\, U^*_{[nt]} - \sum_{i=1}^{[nt]} \tilde h_1(X_i)\big/ V_n \right| = o_P(1), \tag{1}$$
and
$$\sup_{0\le t\le 1}\left| \frac{[nt]}{m}\,\frac{U_{[nt]} - \theta}{B_n} - \sum_{i=1}^{[nt]} \tilde h_1(X_i)\big/ B_n \right| = o_P(1). \tag{2}$$

Proof of Theorem 3. In view of (b) of Lemma 2 with $t_0 = 1$, Corollary 2.1 of [3] yields $V_n^2/B_n^2 \xrightarrow{P} 1$. This in turn implies the equivalence of (1) and (2). Therefore, it suffices to prove (2) only. It can easily be seen that
$$\sup_{0\le t\le 1}\left| \frac{[nt]}{m}\,\frac{U_{[nt]} - \theta}{B_n} - \sum_{i=1}^{[nt]} \frac{\tilde h_1(X_i)}{B_n} \right| \le \sup_{0\le t< \frac{m}{n}}\left| \sum_{i=1}^{[nt]} \frac{\tilde h_1(X_i)}{B_n} \right| + \sup_{\frac{m}{n}\le t\le 1}\left| \frac{[nt]}{m}\,\frac{U_{[nt]} - \theta}{B_n} - \sum_{i=1}^{[nt]} \frac{\tilde h_1(X_i)}{B_n} \right|.$$
Since, as $n \to \infty$, we have $\frac{m}{n} \to 0$ and, consequently, in view of (d) of Lemma 2,
$$\sup_{0\le t< \frac{m}{n}}\left| \sum_{i=1}^{[nt]} \frac{\tilde h_1(X_i)}{B_n} \right| = o_P(1),$$
in order to prove (2) it will be enough to show that
$$\sup_{\frac{m}{n}\le t\le 1}\left| \frac{[nt]}{m}\,\frac{U_{[nt]} - \theta}{B_n} - \sum_{i=1}^{[nt]} \frac{\tilde h_1(X_i)}{B_n} \right| = o_P(1),$$
or, equivalently, to show that
$$\max_{m\le k\le n}\left| \frac{k}{mB_n}\binom{k}{m}^{-1} \sum_{C(k,m)} \big(h(X_{i_1},\ldots,X_{i_m}) - \theta\big) - \frac{1}{B_n}\sum_{i=1}^{k} \tilde h_1(X_i) \right|$$
$$= \max_{m\le k\le n}\left| \frac{k}{mB_n}\binom{k}{m}^{-1} \sum_{C(k,m)} \big(h(X_{i_1},\ldots,X_{i_m}) - \theta - \tilde h_1(X_{i_1}) - \cdots - \tilde h_1(X_{i_m})\big) \right|$$
$$= o_P(1). \tag{3}$$
The first equation of (3) follows from the fact that
$$\sum_{C(k,m)} \big(\tilde h_1(X_{i_1}) + \cdots + \tilde h_1(X_{i_m})\big) = \frac{m}{k}\binom{k}{m} \sum_{i=1}^{k} \tilde h_1(X_i),$$
where $\sum_{C(k,m)}$ denotes the sum over $C(k,m) = \{1 \le i_1 < \cdots < i_m \le k\}$. To establish (3), without loss of generality we can, and shall, assume that $\theta = 0$. Considering that $\frac{1}{B_n} \le \frac{1}{\sqrt{n}}$ for large $n$ (cf. Remark 1), to conclude (3) it will be enough to show that, as $n \to \infty$, the following holds:
$$n^{-1/2} \max_{m\le k\le n}\left| k\binom{k}{m}^{-1} \sum_{C(k,m)} \big(h(X_{i_1},\ldots,X_{i_m}) - \tilde h_1(X_{i_1}) - \cdots - \tilde h_1(X_{i_m})\big) \right| = o_P(1). \tag{4}$$
To establish (4), for ease of notation, let
$$h^{(1)}(X_{i_1},\ldots,X_{i_m}) := h(X_{i_1},\ldots,X_{i_m})\, I_{(|h|\le n^{3/2})} - E\big(h(X_{i_1},\ldots,X_{i_m})\, I_{(|h|\le n^{3/2})}\big),$$
$$\tilde h^{(1)}(X_{i_j}) := E\big(h^{(1)}(X_{i_1},\ldots,X_{i_m}) \mid X_{i_j}\big), \quad j = 1,\ldots,m,$$
$$\psi^{(1)}(X_{i_1},\ldots,X_{i_m}) := h^{(1)}(X_{i_1},\ldots,X_{i_m}) - \tilde h^{(1)}(X_{i_1}) - \cdots - \tilde h^{(1)}(X_{i_m}),$$
$$h^{(2)}(X_{i_1},\ldots,X_{i_m}) := h(X_{i_1},\ldots,X_{i_m})\, I_{(|h|> n^{3/2})} - E\big(h(X_{i_1},\ldots,X_{i_m})\, I_{(|h|> n^{3/2})}\big),$$
$$\tilde h^{(2)}(X_{i_j}) := E\big(h^{(2)}(X_{i_1},\ldots,X_{i_m}) \mid X_{i_j}\big), \quad j = 1,\ldots,m,$$
where $I_A$ is the indicator function of the set $A$. Now observe that
$$n^{-1/2} \max_{m\le k\le n}\left| k\binom{k}{m}^{-1} \sum_{C(k,m)} \big(h(X_{i_1},\ldots,X_{i_m}) - \tilde h_1(X_{i_1}) - \cdots - \tilde h_1(X_{i_m})\big) \right|$$
$$\le n^{-1/2} \max_{m\le k\le n}\left| k\binom{k}{m}^{-1} \sum_{C(k,m)} \big(h(X_{i_1},\ldots,X_{i_m}) - h^{(1)}(X_{i_1},\ldots,X_{i_m})\big) \right|$$
$$+\; n^{-1/2} \max_{m\le k\le n}\left| k\binom{k}{m}^{-1} \sum_{C(k,m)} \big(\tilde h_1(X_{i_1}) + \cdots + \tilde h_1(X_{i_m}) - \tilde h^{(1)}(X_{i_1}) - \cdots - \tilde h^{(1)}(X_{i_m})\big) \right|$$
$$+\; n^{-1/2} \max_{m\le k\le n}\left| k\binom{k}{m}^{-1} \sum_{C(k,m)} \psi^{(1)}(X_{i_1},\ldots,X_{i_m}) \right|$$
$$:= J_1(n) + J_2(n) + J_3(n).$$
We will show that $J_s(n) = o_P(1)$, $s = 1,2,3$.

To deal with the term $J_1(n)$, first note that
$$h(X_{i_1},\ldots,X_{i_m}) - h^{(1)}(X_{i_1},\ldots,X_{i_m}) = h^{(2)}(X_{i_1},\ldots,X_{i_m}).$$
Therefore, in view of Theorem 2.3.3 of [1], page 43, for $\epsilon > 0$ we can write
$$P\left( n^{-1/2} \max_{m\le k\le n}\left| k\binom{k}{m}^{-1} \sum_{C(k,m)} h^{(2)}(X_{i_1},\ldots,X_{i_m}) \right| > \epsilon \right)$$
$$\le \epsilon^{-1} n^{-1/2}\big( m\, E|h^{(2)}(X_1,\ldots,X_m)| + n\, E|h^{(2)}(X_1,\ldots,X_m)| \big)$$
$$\le \epsilon^{-1} n^{-1/2}\, 2m\, E|h(X_1,\ldots,X_m)| + \epsilon^{-1} n^{1/2}\, 2m\, E\big(|h(X_1,\ldots,X_m)|\, I_{(|h|>n^{3/2})}\big)$$
$$\le \epsilon^{-1} n^{-1/2}\, 2m\, E|h(X_1,\ldots,X_m)| + \epsilon^{-1}\, 2m\, E\big(|h(X_1,\ldots,X_m)|^{4/3}\, I_{(|h|>n^{3/2})}\big)$$
$$\longrightarrow 0, \quad \text{as } n \to \infty.$$
Here we have used the fact that $E|h(X_1,\ldots,X_m)|^{4/3} < \infty$. The last line above implies that $J_1(n) = o_P(1)$.

Next, to deal with $J_2(n)$, first observe that
$$\tilde h_1(X_{i_1}) + \cdots + \tilde h_1(X_{i_m}) - \tilde h^{(1)}(X_{i_1}) - \cdots - \tilde h^{(1)}(X_{i_m}) = \sum_{j=1}^m \tilde h^{(2)}(X_{i_j}).$$
It can easily be seen that $\sum_{j=1}^m \tilde h^{(2)}(X_{i_j})$ is symmetric in $X_{i_1},\ldots,X_{i_m}$. Thus, in view of Theorem 2.3.3 of [1], page 43, for $\epsilon > 0$ we have
$$P\left( n^{-1/2} \max_{m\le k\le n} k\left| \binom{k}{m}^{-1} \sum_{C(k,m)} \sum_{j=1}^m \tilde h^{(2)}(X_{i_j}) \right| > \epsilon \right)$$
$$\le \epsilon^{-1} n^{-1/2}\, 2m\, E|h(X_1,\ldots,X_m)| + \epsilon^{-1} n^{1/2}\, 2m\, E\big(|h(X_1,\ldots,X_m)|\, I_{(|h|>n^{3/2})}\big) \longrightarrow 0, \quad \text{as } n \to \infty,$$
i.e., $J_2(n) = o_P(1)$.

Note. Alternatively, one can use Etemadi's maximal inequality for partial sums of i.i.d. random variables, followed by the Markov inequality, to show $J_2(n) = o_P(1)$.

As for the term $J_3(n)$, first note that $\binom{k}{m}^{-1} \sum_{C(k,m)} \psi^{(1)}(X_{i_1},\ldots,X_{i_m})$ is a U-statistic. Consequently, one more application of Theorem 2.3.3, page 43, of [1] yields
$$P\left( n^{-1/2} \max_{m\le k\le n} k\left| \binom{k}{m}^{-1} \sum_{C(k,m)} \psi^{(1)}(X_{i_1},\ldots,X_{i_m}) \right| > \epsilon \right)$$
$$\le n^{-1}\epsilon^{-2} m^2\, E\big(\psi^{(1)}(X_1,\ldots,X_m)\big)^2 + n^{-1}\epsilon^{-2} \sum_{k=m+1}^{n} (2k+1)\, E\left( \binom{k}{m}^{-1} \sum_{C(k,m)} \psi^{(1)}(X_{i_1},\ldots,X_{i_m}) \right)^{2}. \tag{5}$$
Observing that
$$E\big(\psi^{(1)}(X_1,\ldots,X_m)\big)^2 \le C(m)\, E\big(h^2(X_1,\ldots,X_m)\, I_{(|h|\le n^{3/2})}\big),$$
where $C(m)$ is a positive constant that does not depend on $n$,
$$E\psi^{(1)}(X_{i_1},\ldots,X_{i_m}) = E\big(\psi^{(1)}(X_{i_1},\ldots,X_{i_m}) \mid X_{i_j}\big) = 0, \quad j = 1,\ldots,m,$$
and in view of Lemma B, page 184, of [10], it follows that, for some positive constants $C_1(m)$ and $C_2(m)$ which do not depend on $n$, the right-hand side of (5) is bounded above by
$$\epsilon^{-2} n^{-1}\, E\big(h^2(X_1,\ldots,X_m)\, I_{(|h|\le n^{3/2})}\big)\big(C_1(m) + C_2(m)\log(n)\big)$$
$$\le \epsilon^{-2} C_1(m)\, n^{-1/3}\, E|h(X_1,\ldots,X_m)|^{4/3} + \epsilon^{-2} C_1(m)\, E\big(|h(X_1,\ldots,X_m)|^{4/3}\, I_{(n<|h|\le n^{3/2})}\big)$$
$$+\; \epsilon^{-2} C_2(m)\, n^{-1/3}\log(n)\, E|h(X_1,\ldots,X_m)|^{4/3} + \epsilon^{-2} C_2(m)\, E\big(|h(X_1,\ldots,X_m)|^{4/3} \log|h(X_1,\ldots,X_m)|\, I_{(n<|h|\le n^{3/2})}\big)$$
$$\le \epsilon^{-2} C_1(m)\, n^{-1/3}\, E|h(X_1,\ldots,X_m)|^{4/3} + \epsilon^{-2} C_1(m)\, E\big(|h(X_1,\ldots,X_m)|^{4/3}\, I_{(|h|>n)}\big)$$
$$+\; \epsilon^{-2} C_2(m)\, n^{-1/3}\log(n)\, E|h(X_1,\ldots,X_m)|^{4/3} + \epsilon^{-2} C_2(m)\, E\big(|h(X_1,\ldots,X_m)|^{4/3} \log|h(X_1,\ldots,X_m)|\, I_{(|h|>n)}\big)$$
$$\longrightarrow 0, \quad \text{as } n \to \infty.$$
Thus $J_3(n) = o_P(1)$. This also completes the proof of (4), and hence also that of Theorem 3. Now, as already noted above, the proofs of Theorems 1 and 2 follow from Theorem 3 and Lemmas 1 and 2.

Remark 5. Studying a U-statistics type process that can be written as a sum of three U-statistics of order $m = 2$, Csörgő, Szyszkowicz and Wang in [5] proved that under the slightly more relaxed condition that $E|h(X_1,\ldots,X_m)|^{4/3} < \infty$, as $n \to \infty$, we have
$$n^{-3/2} \max_{1\le k\le n} \sum_{1\le i<j\le k} \big(h(X_i,X_j) - \tilde h_1(X_i) - \tilde h_1(X_j)\big) = o_P(1).$$
In the proof of the latter, the well-known Doob maximal inequality for martingales was used, which gives a sharper bound. The just mentioned inequality is not applicable for the processes in Theorems 1 and 2, even for U-statistics of order 2. The reason for this is that the inside parts of the absolute values of $J_s(n)$, $s = 1,2,3$, are not martingales. Also, since, for $m > 2$, the sums
$\sum_{C(k,m)} \big(h(X_{i_1},\ldots,X_{i_m}) - \tilde h_1(X_{i_1}) - \cdots - \tilde h_1(X_{i_m})\big)$ no longer form a martingale, it seems that the Doob maximal inequality is not applicable for the process
$$n^{-m+\frac{1}{2}} \max_{1\le k\le n} \sum_{C(k,m)} \big(h(X_{i_1},\ldots,X_{i_m}) - \tilde h_1(X_{i_1}) - \cdots - \tilde h_1(X_{i_m})\big),$$
which is an extension of the U-statistics part of the process used by Csörgő, Szyszkowicz and Wang in [5] for $m = 2$.

Due to the nonexistence of the second moment of the kernel of the underlying U-statistic in the following example, the weak convergence result of Theorem A fails to apply. However, using Theorem 1, for example, one can still derive weak convergence results for the underlying U-statistic.

Example. Let $X_1, X_2, \ldots$ be a sequence of i.i.d. random variables with the density function
$$f(x) = \begin{cases} |x-a|^{-3}, & |x-a| \ge 1,\ a \ne 0,\\[4pt] 0, & \text{elsewhere.} \end{cases}$$
Consider the parameter $\theta = E^m(X_1) = a^m$, where $m \ge 1$ is an integer, and the kernel $h(X_1,\ldots,X_m) = \prod_{i=1}^m X_i$. Then, with $m, n$ satisfying $n \ge m$, the corresponding U-statistic is
$$U_n = \binom{n}{m}^{-1} \sum_{C(n,m)} \prod_{j=1}^m X_{i_j}.$$
A simple calculation shows that $\tilde h_1(X_1) = X_1 a^{m-1} - a^m$. It is easy to check that $E\big(|h(X_1,\ldots,X_m)|^{4/3} \log|h(X_1,\ldots,X_m)|\big) < \infty$ and that $\tilde h_1(X_1) \in \mathrm{DAN}$ (cf. Gut [7], page 439). In order to apply Theorem 1 for this U-statistic, define
$$U^*_{[nt]} = \begin{cases} 0, & 0 \le t < \dfrac{m}{n},\\[8pt] \dfrac{\binom{[nt]}{m}^{-1} \sum_{C([nt],m)} \prod_{j=1}^m X_{i_j} - a^m}{\Big(\sum_{i=1}^n \big(X_i a^{m-1} - a^m\big)^2\Big)^{1/2}}, & \dfrac{m}{n} \le t \le 1. \end{cases}$$
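The example above lends itself to simulation. The sketch below is an illustration, not part of the paper: the values of $a$, $m$, $n$ and the seed are arbitrary choices, and the sampling scheme is derived from the stated density by inverse transform, since $P(|X-a| > u) = u^{-2}$ for $u \ge 1$. It draws from $f$, computes $U_n$ for the product kernel, and forms the pseudo-self-normalized statistic at $t = 1$, which Theorem 1 (b) asserts is asymptotically $N(0,1)$ even though $\mathrm{Var}(X_1) = \infty$.

```python
import math
import random
from itertools import combinations

random.seed(7)
a, m, n = 2.0, 2, 400  # illustrative choices; any a != 0 and n >= m would do

def draw(size):
    """Sample from f(x) = |x-a|^{-3} on |x-a| >= 1.
    P(|X-a| > u) = u^{-2} for u >= 1, so |X-a| = (1-U)^{-1/2} for U uniform
    on [0,1), with an independent symmetric sign."""
    return [a + random.choice((-1.0, 1.0)) * (1.0 - random.random()) ** -0.5
            for _ in range(size)]

xs = draw(n)

# U-statistic with the product kernel h(x_1,...,x_m) = x_1 * ... * x_m; theta = a^m.
subsets = list(combinations(xs, m))
U_n = sum(math.prod(c) for c in subsets) / len(subsets)

# Pseudo-self-normalization with h1_tilde(x) = x * a^(m-1) - a^m.
V_n = math.sqrt(sum((x * a ** (m - 1) - a ** m) ** 2 for x in xs))
stat = (n / m) * (U_n - a ** m) / V_n  # approximately N(0,1) for large n
print(stat)
```

Repeating this over many independent samples and comparing the empirical distribution of `stat` with the standard normal gives a quick numerical check of the theorem; the classical scalar normalization of Theorem A is unavailable here because $\mathrm{Var}(\tilde h_1(X_1)) = \infty$.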