On weak approximation of U-statistics

Masoud M. Nasari∗

School of Mathematics and Statistics, Carleton University, Canada
e-mail: mmnasari@connect.carleton.ca

arXiv:0901.2343v1 [math.PR] 15 Jan 2009

Abstract

This paper investigates weak convergence of U-statistics via approximation in probability. The classical condition that the second moment of the kernel of the underlying U-statistic exists is relaxed to having 4/3 moments only (modulo a logarithmic term). Furthermore, the conditional expectation of the kernel is only assumed to be in the domain of attraction of the normal law (instead of the classical two-moment condition).
1 Introduction

Employing truncation arguments and the concept of weak convergence of self-normalized and studentized partial sums, which were inspired by the works of Csörgő, Szyszkowicz and Wang in [5], [4], [2] and [3], we derive weak convergence results via approximations in probability for pseudo-self-normalized U-statistics and U-statistic type processes. Our results require only that (i) the expected value of the product of the kernel of the underlying U-statistic to the exponent 4/3 and its logarithm exists (instead of having 2 moments of the kernel), and that (ii) the conditional expected value of the kernel on each observation is in the domain of attraction of the normal law (instead of having 2 moments). Similarly relaxed moment conditions were first used by Csörgő, Szyszkowicz and Wang [5] for U-statistic type processes for changepoint problems in terms of kernels of order 2 (cf. Remark 5). Our results in this exposition extend their work to approximating U-statistics with higher order kernels. The weak convergence results thus obtained for U-statistics in turn extend those obtained by R.G. Miller Jr. and P.K. Sen in [9] in 1972 (cf. Remark 3). The latter results of Miller and Sen are based on the classical condition of the existence of the second moment of the kernel of the underlying U-statistic, which in turn implies the existence of the second moment of the conditional expected value of the kernel on each of the observations.
∗ Research supported by a Carleton University Faculty of Graduate Studies and Research scholarship, and NSERC Canada Discovery Grants of M. Csörgő and M. Mojirsheibani at Carleton University.
2 Main results and Background

Let X_1, X_2, ... be a sequence of non-degenerate real-valued i.i.d. random variables with distribution F. Let h(X_1, ..., X_m), symmetric in its arguments, be a Borel-measurable real-valued kernel of order m ≥ 1, and consider the parameter
\[
\theta = \int \cdots \int_{\mathbb{R}^m} h(x_1,\ldots,x_m)\, dF(x_1)\cdots dF(x_m) < \infty.
\]
The corresponding U-statistic (cf. Serfling [10] or Hoeffding [8]) is
\[
U_n = \binom{n}{m}^{-1} \sum_{C(n,m)} h(X_{i_1},\ldots,X_{i_m}),
\]
where m ≤ n and \sum_{C(n,m)} denotes the sum over the set C(n,m) = { 1 ≤ i_1 < ... < i_m ≤ n }.
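To make the definition concrete, here is a small numerical sketch (ours, not part of the paper) that evaluates U_n by brute force over all m-subsets. The kernel h(x, y) = (x − y)²/2 is a standard choice whose U-statistic is the unbiased sample variance.

```python
from itertools import combinations
from math import comb

def u_statistic(xs, h, m):
    """U_n = C(n, m)^{-1} times the sum of h over all m-subsets i_1 < ... < i_m."""
    n = len(xs)
    total = sum(h(*(xs[i] for i in idx)) for idx in combinations(range(n), m))
    return total / comb(n, m)

# Kernel (x - y)^2 / 2: the corresponding U_n is the unbiased sample variance.
xs = [1.0, 2.0, 4.0, 7.0]
u = u_statistic(xs, lambda x, y: (x - y) ** 2 / 2, 2)
print(u)  # 7.0, matching the unbiased sample variance of xs
```

The brute-force sum has \binom{n}{m} terms, so this is only feasible for small n; it is meant to illustrate the definition, not to be efficient.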
In order to state our results, we first need the following definition.

Definition. A sequence X, X_1, X_2, ... of i.i.d. random variables is said to be in the domain of attraction of the normal law (X ∈ DAN) if there exist sequences of constants A_n and B_n > 0 such that, as n → ∞,
\[
\frac{\sum_{i=1}^{n} X_i - A_n}{B_n} \xrightarrow{d} N(0,1).
\]
Remark 1. Further to this definition of DAN, it is known that A_n can be taken as nE(X) and B_n = n^{1/2} ℓ_X(n), where ℓ_X(n) is a slowly varying function at infinity (i.e., lim_{n→∞} ℓ_X(nk)/ℓ_X(n) = 1 for any k > 0), defined by the distribution of X. Moreover, ℓ_X^2(n) = Var(X) > 0 if Var(X) < ∞, and ℓ_X(n) → ∞, as n → ∞, if Var(X) = ∞. Also, X has all moments less than 2, and the variance of X is positive, but need not be finite.
Also define the pseudo-self-normalized U-process as follows:
\[
U^*_{[nt]} =
\begin{cases}
0, & 0 \le t < \frac{m}{n},\\[4pt]
\dfrac{U_{[nt]} - \theta}{V_n}, & \frac{m}{n} \le t \le 1,
\end{cases}
\]
where [·] denotes the greatest integer function, V_n^2 := \sum_{i=1}^{n} \tilde h_1^2(X_i) and \tilde h_1(x) = E( h(X_1,\ldots,X_m) - \theta \mid X_1 = x ).
Theorem 1. If

(a) E( |h(X_1,\ldots,X_m)|^{4/3} \log|h(X_1,\ldots,X_m)| ) < ∞ and \tilde h_1(X_1) ∈ DAN,

then, as n → ∞, we have

(b) \frac{[nt_0]}{m} U^*_{[nt_0]} \xrightarrow{d} N(0, t_0), for t_0 ∈ (0,1];

(c) \frac{[nt]}{m} U^*_{[nt]} \xrightarrow{d} W(t) on (D[0,1], ρ), where ρ is the sup-norm for functions in D[0,1] and {W(t), 0 ≤ t ≤ 1} is a standard Wiener process;

(d) On an appropriate probability space for X_1, X_2, ..., we can construct a standard Wiener process {W(t), 0 ≤ t < ∞} such that
\[
\sup_{0\le t\le 1}\Big| \frac{[nt]}{m}\, U^*_{[nt]} - \frac{W(nt)}{n^{1/2}} \Big| = o_P(1).
\]
Remark 2. The statement (c), whose notion will be used throughout, stands for the following functional central limit theorem (cf. Remark 2.1 in Csörgő, Szyszkowicz and Wang [3]). On account of (d), as n → ∞, we have
\[
g\big( S_{[n\cdot]}/V_n \big) \xrightarrow{d} g\big( W(\cdot) \big)
\]
for all g : D = D[0,1] → R that are (D, \mathcal{D}) measurable and ρ-continuous, or ρ-continuous except at points forming a set of Wiener measure zero on (D, \mathcal{D}), where \mathcal{D} denotes the σ-field of subsets of D generated by the finite-dimensional subsets of D.

Theorem 1 is fashioned after the work on weak convergence of self-normalized partial sums processes of Csörgő, Szyszkowicz and Wang in [2], [3] and [4], which constitute extensions of the contribution of Giné, Götze and Mason in [6].
As to \tilde h_1(X_1) ∈ DAN, since E\tilde h_1(X_1) = 0 and \tilde h_1(X_1), \tilde h_1(X_2), ... are i.i.d. random variables, Theorem 1 of [2] (cf. also Theorem 2.3 of [3]) in this context reads as follows.

Lemma 1. As n → ∞, the following statements are equivalent:

(a) \tilde h_1(X_1) ∈ DAN;

(b) \frac{\sum_{i=1}^{[nt_0]} \tilde h_1(X_i)}{V_n} \xrightarrow{d} N(0, t_0) for t_0 ∈ (0,1];

(c) \frac{\sum_{i=1}^{[nt]} \tilde h_1(X_i)}{V_n} \xrightarrow{d} W(t) on (D[0,1], ρ), where ρ is the sup-norm metric for functions in D[0,1] and {W(t), 0 ≤ t ≤ 1} is a standard Wiener process;

(d) On an appropriate probability space for X_1, X_2, ..., we can construct a standard Wiener process {W(t), 0 ≤ t < ∞} such that
\[
\sup_{0\le t\le 1}\Big| \frac{\sum_{i=1}^{[nt]} \tilde h_1(X_i)}{V_n} - \frac{W(nt)}{n^{1/2}} \Big| = o_P(1).
\]
Also, in the same vein, Proposition 2.1 of [3] for \tilde h_1(X_1) ∈ DAN reads as follows.

Lemma 2. As n → ∞, the following statements are equivalent:

(a) \tilde h_1(X_1) ∈ DAN;

There is a sequence of constants B_n ↗ ∞ such that

(b) \frac{\sum_{i=1}^{[nt_0]} \tilde h_1(X_i)}{B_n} \xrightarrow{d} N(0, t_0) for t_0 ∈ (0,1];

(c) \frac{\sum_{i=1}^{[nt]} \tilde h_1(X_i)}{B_n} \xrightarrow{d} W(t) on (D[0,1], ρ), where ρ is the sup-norm metric for functions in D[0,1] and {W(t), 0 ≤ t ≤ 1} is a standard Wiener process;

(d) On an appropriate probability space for X_1, X_2, ..., we can construct a standard Wiener process {W(t), 0 ≤ t < ∞} such that
\[
\sup_{0\le t\le 1}\Big| \frac{\sum_{i=1}^{[nt]} \tilde h_1(X_i)}{B_n} - \frac{W(nt)}{n^{1/2}} \Big| = o_P(1).
\]
In view of Lemma 2, a scalar normalized companion of Theorem 1 reads as follows.

Theorem 2. If

(a) E( |h(X_1,\ldots,X_m)|^{4/3} \log|h(X_1,\ldots,X_m)| ) < ∞ and \tilde h_1(X_1) ∈ DAN,

then, as n → ∞, we have

(b) \frac{[nt_0]}{m} \cdot \frac{U_{[nt_0]} - \theta}{B_n} \xrightarrow{d} N(0, t_0), where t_0 ∈ (0,1];

(c) \frac{[nt]}{m} \cdot \frac{U_{[nt]} - \theta}{B_n} \xrightarrow{d} W(t) on (D[0,1], ρ), where ρ is the sup-norm for functions in D[0,1] and {W(t), 0 ≤ t ≤ 1} is a standard Wiener process;

(d) On an appropriate probability space for X_1, X_2, ..., we can construct a standard Wiener process {W(t), 0 ≤ t < ∞} such that
\[
\sup_{0\le t\le 1}\Big| \frac{[nt]}{m} \cdot \frac{U_{[nt]} - \theta}{B_n} - \frac{W(nt)}{n^{1/2}} \Big| = o_P(1).
\]
By defining
\[
Y_n^*(t) = 0 \quad \text{for } 0 \le t \le \frac{m-1}{n},
\]
\[
Y_n^*\Big(\frac{k}{n}\Big) = \frac{k\,(U_k - \theta)}{m\sqrt{n\,\mathrm{Var}(\tilde h_1(X_1))}} \quad \text{for } k = m,\ldots,n,
\]
and, for t ∈ [ (k−1)/n, k/n ], k = m, ..., n,
\[
Y_n^*(t) = Y_n^*\Big(\frac{k-1}{n}\Big) + n\Big(t - \frac{k-1}{n}\Big)\Big( Y_n^*\Big(\frac{k}{n}\Big) - Y_n^*\Big(\frac{k-1}{n}\Big) \Big),
\]
we can state the already mentioned 1972 weak convergence result of Miller and Sen as follows.

Theorem A. If

(I) 0 < E[ (h(X_1, X_2, \ldots, X_m) - \theta)(h(X_1, X_{m+1}, \ldots, X_{2m-1}) - \theta) ] = \mathrm{Var}(\tilde h_1(X_1)) < ∞

and

(II) E h^2(X_1, \ldots, X_m) < ∞,

then, as n → ∞,
\[
Y_n^*(t) \xrightarrow{d} W(t) \quad \text{on } (C[0,1], \rho),
\]
where ρ is the sup-norm for functions in C[0,1] and {W(t), 0 ≤ t ≤ 1} is a standard Wiener process.
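The polygonal construction of Y_n^* above is ordinary linear interpolation between the knots k/n. A small sketch (ours, not in the paper) may help fix ideas; the dictionary `knots` is a hypothetical stand-in for the precomputed values k(U_k − θ)/(m√(n Var h̃_1)), which in practice would be obtained from data.

```python
import math

def y_star(t, n, m, knots):
    """Piecewise-linear Y*_n: zero on [0, (m-1)/n], equal to knots[k] at
    t = k/n for k = m, ..., n, and linear on each interval [(k-1)/n, k/n]."""
    if t <= (m - 1) / n:
        return 0.0
    k = min(max(math.ceil(n * t), m), n)        # t lies in [(k-1)/n, k/n]
    left = knots[k - 1] if k - 1 >= m else 0.0  # Y*_n((k-1)/n); zero at k-1 = m-1
    return left + n * (t - (k - 1) / n) * (knots[k] - left)

# hypothetical knot values for n = 4, m = 2
knots = {2: 1.0, 3: 3.0, 4: 5.0}
print(y_star(0.5, 4, 2, knots))    # 1.0  (the knot at k = 2)
print(y_star(0.625, 4, 2, knots))  # 2.0  (halfway between knots 2 and 3)
```

The resulting path is continuous, which is what places Y_n^* in C[0,1] rather than D[0,1].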
Remark 3. When Eh^2(X_1,\ldots,X_m) < ∞, first note that the existence of the second moment of the kernel h(X_1,\ldots,X_m) implies the existence of the second moment of \tilde h_1(X_1). Therefore, according to Remark 1, B_n = \sqrt{n\, E\tilde h_1^2(X_1)}. This means that under the conditions of Theorem A, Theorem 2 holds true and, via (c) of the latter, it yields a version of Theorem A on D[0,1]. We note in passing that our method of proofs differs from that of the cited paper of Miller and Sen. We use a method of truncation à la [5] to relax the condition Eh^2(X_1,\ldots,X_m) < ∞ to the less stringent moment condition E( |h(X_1,\ldots,X_m)|^{4/3} \log|h(X_1,\ldots,X_m)| ) < ∞ that, in turn, enables us to have \tilde h_1(X_1) ∈ DAN in general, with the possibility of infinite variance.
Remark 4. Theorem 1 of [2] (Theorem 2.3 in [3]), as well as Proposition 2.1 of [3], continue to hold true in terms of Donskerized partial sums that are elements of C[0,1]. Consequently, the same is true for the above stated Lemmas 1 and 2 concerning \tilde h_1(X_1) ∈ DAN. This in turn, mutatis mutandis, renders appropriate versions of Theorems 1 and 2 to hold true in (C[0,1], ρ).
Proof of Theorems 1 and 2.

In view of Lemmas 1 and 2, in order to prove Theorems 1 and 2, we only have to prove the following theorem.

Theorem 3. If E( |h(X_1,\ldots,X_m)|^{4/3} \log|h(X_1,\ldots,X_m)| ) < ∞ and \tilde h_1(X_1) ∈ DAN then, as n → ∞, we have
\[
\sup_{0\le t\le 1}\Big| \frac{[nt]}{m}\, U^*_{[nt]} - \frac{\sum_{i=1}^{[nt]} \tilde h_1(X_i)}{V_n} \Big| = o_P(1), \tag{1}
\]
and
\[
\sup_{0\le t\le 1}\Big| \frac{[nt]}{m} \cdot \frac{U_{[nt]} - \theta}{B_n} - \frac{\sum_{i=1}^{[nt]} \tilde h_1(X_i)}{B_n} \Big| = o_P(1). \tag{2}
\]

Proof of Theorem 3. In view of (b) of Lemma 2 with t_0 = 1, Corollary 2.1 of [3] yields V_n^2 / B_n^2 \to_P 1. This in turn implies the equivalence of (1) and (2). Therefore, it suffices to prove (2) only.
It can be easily seen that
\[
\sup_{0\le t\le 1}\Big| \frac{[nt]}{m} \cdot \frac{U_{[nt]} - \theta}{B_n} - \frac{\sum_{i=1}^{[nt]} \tilde h_1(X_i)}{B_n} \Big|
\le \sup_{0\le t< \frac{m}{n}}\Big| \frac{\sum_{i=1}^{[nt]} \tilde h_1(X_i)}{B_n} \Big|
+ \sup_{\frac{m}{n}\le t\le 1}\Big| \frac{[nt]}{m} \cdot \frac{U_{[nt]} - \theta}{B_n} - \frac{\sum_{i=1}^{[nt]} \tilde h_1(X_i)}{B_n} \Big|.
\]
Since, as n → ∞, we have m/n → 0 and, consequently, in view of (d) of Lemma 2,
\[
\sup_{0\le t< \frac{m}{n}}\Big| \frac{\sum_{i=1}^{[nt]} \tilde h_1(X_i)}{B_n} \Big| = o_P(1),
\]
in order to prove (2), it will be enough to show that
\[
\sup_{\frac{m}{n}\le t\le 1}\Big| \frac{[nt]}{m} \cdot \frac{U_{[nt]} - \theta}{B_n} - \frac{\sum_{i=1}^{[nt]} \tilde h_1(X_i)}{B_n} \Big| = o_P(1),
\]
or equivalently to show that
\[
\max_{m\le k\le n}\Big| \frac{k}{m B_n}\binom{k}{m}^{-1} \sum_{C(k,m)} \big( h(X_{i_1},\ldots,X_{i_m}) - \theta \big) - \frac{1}{B_n}\sum_{i=1}^{k} \tilde h_1(X_i) \Big|
\]
\[
= \max_{m\le k\le n}\Big| \frac{k}{m B_n}\binom{k}{m}^{-1} \sum_{C(k,m)} \big( h(X_{i_1},\ldots,X_{i_m}) - \theta - \tilde h_1(X_{i_1}) - \ldots - \tilde h_1(X_{i_m}) \big) \Big| = o_P(1). \tag{3}
\]
The first equation of (3) follows from the fact that
\[
\sum_{C(k,m)} \big( \tilde h_1(X_{i_1}) + \ldots + \tilde h_1(X_{i_m}) \big) = \frac{m}{k}\binom{k}{m} \sum_{i=1}^{k} \tilde h_1(X_i),
\]
where \sum_{C(k,m)} denotes the sum over C(k,m) = { 1 ≤ i_1 < ... < i_m ≤ k }. To establish (3), without loss of generality we can, and shall, assume that θ = 0.
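The counting identity just used holds because each fixed index i appears in exactly \binom{k-1}{m-1} = \frac{m}{k}\binom{k}{m} of the m-subsets of {1, ..., k}. It can be checked numerically; a quick sketch (ours), with arbitrary values standing in for \tilde h_1(X_1), ..., \tilde h_1(X_k):

```python
from itertools import combinations
from math import comb

vals = [0.5, -1.0, 2.0, 3.5, 1.25]   # stand-ins for h~_1(X_1), ..., h~_1(X_k)
k, m = len(vals), 3

# left side: sum over all m-subsets of the sum of the selected values
lhs = sum(sum(subset) for subset in combinations(vals, m))
# right side: (m/k) * C(k, m) * (sum over all k values)
rhs = (m / k) * comb(k, m) * sum(vals)
print(abs(lhs - rhs) < 1e-12)  # True
```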
Considering that 1/B_n ≤ 1/\sqrt{n} for large n (cf. Remark 1), to conclude (3) it will be enough to show that, as n → ∞, the following holds:
\[
n^{-\frac12} \max_{m\le k\le n}\Big| k\binom{k}{m}^{-1} \sum_{C(k,m)} \big( h(X_{i_1},\ldots,X_{i_m}) - \tilde h_1(X_{i_1}) - \ldots - \tilde h_1(X_{i_m}) \big) \Big| = o_P(1). \tag{4}
\]
To establish (4), for ease of notation, let
\[
h^{(1)}(X_{i_1},\ldots,X_{i_m}) := h(X_{i_1},\ldots,X_{i_m})\, I_{(|h|\le n^{3/2})} - E\big( h(X_{i_1},\ldots,X_{i_m})\, I_{(|h|\le n^{3/2})} \big),
\]
\[
\tilde h^{(1)}(X_{i_j}) := E\big( h^{(1)}(X_{i_1},\ldots,X_{i_m}) \mid X_{i_j} \big), \quad j = 1,\ldots,m,
\]
\[
\psi^{(1)}(X_{i_1},\ldots,X_{i_m}) := h^{(1)}(X_{i_1},\ldots,X_{i_m}) - \tilde h^{(1)}(X_{i_1}) - \ldots - \tilde h^{(1)}(X_{i_m}),
\]
\[
h^{(2)}(X_{i_1},\ldots,X_{i_m}) := h(X_{i_1},\ldots,X_{i_m})\, I_{(|h|> n^{3/2})} - E\big( h(X_{i_1},\ldots,X_{i_m})\, I_{(|h|> n^{3/2})} \big),
\]
\[
\tilde h^{(2)}(X_{i_j}) := E\big( h^{(2)}(X_{i_1},\ldots,X_{i_m}) \mid X_{i_j} \big), \quad j = 1,\ldots,m,
\]
where I_A is the indicator function of the set A. Now observe that
\[
n^{-\frac12} \max_{m\le k\le n}\Big| k\binom{k}{m}^{-1} \sum_{C(k,m)} \big( h(X_{i_1},\ldots,X_{i_m}) - \tilde h_1(X_{i_1}) - \ldots - \tilde h_1(X_{i_m}) \big) \Big|
\]
\[
\le n^{-\frac12} \max_{m\le k\le n}\Big| k\binom{k}{m}^{-1} \sum_{C(k,m)} \big( h(X_{i_1},\ldots,X_{i_m}) - h^{(1)}(X_{i_1},\ldots,X_{i_m}) \big) \Big|
\]
\[
+ n^{-\frac12} \max_{m\le k\le n}\Big| k\binom{k}{m}^{-1} \sum_{C(k,m)} \big( \tilde h_1(X_{i_1}) + \ldots + \tilde h_1(X_{i_m}) - \tilde h^{(1)}(X_{i_1}) - \ldots - \tilde h^{(1)}(X_{i_m}) \big) \Big|
\]
\[
+ n^{-\frac12} \max_{m\le k\le n}\Big| k\binom{k}{m}^{-1} \sum_{C(k,m)} \psi^{(1)}(X_{i_1},\ldots,X_{i_m}) \Big|
\]
\[
:= J_1(n) + J_2(n) + J_3(n).
\]
We will show that J_s(n) = o_P(1), s = 1, 2, 3.
To deal with the term J_1(n), first note that
\[
h(X_{i_1},\ldots,X_{i_m}) - h^{(1)}(X_{i_1},\ldots,X_{i_m}) = h^{(2)}(X_{i_1},\ldots,X_{i_m}).
\]
Therefore, in view of Theorem 2.3.3 of [1], page 43, for ε > 0, we can write
\[
P\Big( n^{-\frac12} \max_{m\le k\le n}\Big| k\binom{k}{m}^{-1} \sum_{C(k,m)} h^{(2)}(X_{i_1},\ldots,X_{i_m}) \Big| > \epsilon \Big)
\]
\[
\le \epsilon^{-1} n^{-\frac12}\, m\, \Big( E\big|h^{(2)}(X_1,\ldots,X_m)\big| + n\, E\big|h^{(2)}(X_1,\ldots,X_m)\big| \Big)
\]
\[
\le \epsilon^{-1} n^{-\frac12}\, 2m\, E|h(X_1,\ldots,X_m)| + \epsilon^{-1} n^{\frac12}\, 2m\, E\big( |h(X_1,\ldots,X_m)|\, I_{(|h|>n^{3/2})} \big)
\]
\[
\le \epsilon^{-1} n^{-\frac12}\, 2m\, E|h(X_1,\ldots,X_m)| + \epsilon^{-1}\, 2m\, E\big( |h(X_1,\ldots,X_m)|^{4/3} I_{(|h|>n^{3/2})} \big)
\]
\[
\longrightarrow 0, \quad \text{as } n \to \infty.
\]
Here we have used the fact that E|h(X_1,\ldots,X_m)|^{4/3} < ∞. The last line above implies that J_1(n) = o_P(1).
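The passage from the second to the third bound in the last display rests on an elementary estimate on the truncation event; a short verification (ours, not spelled out in the original):

```latex
% On the event {|h| > n^{3/2}} we have |h|^{1/3} > n^{1/2}, hence
n^{1/2}\, E\big(|h|\, I_{(|h|>n^{3/2})}\big)
  = n^{1/2}\, E\big(|h|^{4/3}\,|h|^{-1/3}\, I_{(|h|>n^{3/2})}\big)
  \le n^{1/2}\, n^{-1/2}\, E\big(|h|^{4/3}\, I_{(|h|>n^{3/2})}\big)
  = E\big(|h|^{4/3}\, I_{(|h|>n^{3/2})}\big),
% and the last expectation tends to 0 as n -> infinity by dominated
% convergence, since E|h|^{4/3} < \infty.
```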
Next, to deal with J_2(n), first observe that
\[
\tilde h_1(X_{i_1}) + \ldots + \tilde h_1(X_{i_m}) - \tilde h^{(1)}(X_{i_1}) - \ldots - \tilde h^{(1)}(X_{i_m}) = \sum_{j=1}^{m} \tilde h^{(2)}(X_{i_j}).
\]
It can be easily seen that \sum_{j=1}^{m} \tilde h^{(2)}(X_{i_j}) is symmetric in X_{i_1},\ldots,X_{i_m}. Thus, in view of Theorem 2.3.3 of [1], page 43, for ε > 0, we have
\[
P\Big( n^{-\frac12} \max_{m\le k\le n} k\,\Big| \binom{k}{m}^{-1} \sum_{C(k,m)} \sum_{j=1}^{m} \tilde h^{(2)}(X_{i_j}) \Big| > \epsilon \Big)
\]
\[
\le \epsilon^{-1} n^{-\frac12}\, 2m\, E|h(X_1,\ldots,X_m)| + \epsilon^{-1} n^{\frac12}\, 2m\, E\big( |h(X_1,\ldots,X_m)|\, I_{(|h|>n^{3/2})} \big)
\]
\[
\longrightarrow 0, \quad \text{as } n \to \infty,
\]
i.e., J_2(n) = o_P(1).

Note. Alternatively, one can use Etemadi's maximal inequality for partial sums of i.i.d. random variables, followed by the Markov inequality, to show J_2(n) = o_P(1).
As for the term J_3(n), first note that \binom{k}{m}^{-1} \sum_{C(k,m)} \psi^{(1)}(X_{i_1},\ldots,X_{i_m}) is a U-statistic. Consequently, one more application of Theorem 2.3.3, page 43, of [1] yields
\[
P\Big( n^{-\frac12} \max_{m\le k\le n} k\,\Big| \binom{k}{m}^{-1} \sum_{C(k,m)} \psi^{(1)}(X_{i_1},\ldots,X_{i_m}) \Big| > \epsilon \Big)
\le n^{-1}\epsilon^{-2}\, m^2\, E\big(\psi^{(1)}(X_1,\ldots,X_m)\big)^2
+ n^{-1}\epsilon^{-2} \sum_{k=m+1}^{n} (2k+1)\, E\Big( \binom{k}{m}^{-1} \sum_{C(k,m)} \psi^{(1)}(X_{i_1},\ldots,X_{i_m}) \Big)^2. \tag{5}
\]
Observing that E(\psi^{(1)}(X_1,\ldots,X_m))^2 \le C(m)\, E\big( h^2(X_1,\ldots,X_m)\, I_{(|h|\le n^{3/2})} \big), where C(m) is a positive constant that does not depend on n, that
\[
E\psi^{(1)}(X_{i_1},\ldots,X_{i_m}) = E\big( \psi^{(1)}(X_{i_1},\ldots,X_{i_m}) \mid X_{i_j} \big) = 0, \quad j = 1,\ldots,m,
\]
and in view of Lemma B, page 184, of [10], it follows that for some positive constants C_1(m) and C_2(m) which do not depend on n, the R.H.S. of (5) is bounded above by
\[
\epsilon^{-2}\, n^{-1}\, E\big( h^2(X_1,\ldots,X_m)\, I_{(|h|\le n^{3/2})} \big)\big( C_1(m) + C_2(m)\log(n) \big)
\]
\[
\le \epsilon^{-2} C_1(m)\, n^{-\frac13}\, E|h(X_1,\ldots,X_m)|^{4/3}
+ \epsilon^{-2} C_1(m)\, E\big( |h(X_1,\ldots,X_m)|^{4/3}\, I_{(n<|h|\le n^{3/2})} \big)
\]
\[
+ \epsilon^{-2} C_2(m)\, n^{-\frac13}\log(n)\, E|h(X_1,\ldots,X_m)|^{4/3}
+ \epsilon^{-2} C_2(m)\, E\big( |h(X_1,\ldots,X_m)|^{4/3} \log|h(X_1,\ldots,X_m)|\, I_{(n<|h|\le n^{3/2})} \big)
\]
\[
\le \epsilon^{-2} C_1(m)\, n^{-\frac13}\, E|h(X_1,\ldots,X_m)|^{4/3}
+ \epsilon^{-2} C_1(m)\, E\big( |h(X_1,\ldots,X_m)|^{4/3}\, I_{(|h|>n)} \big)
\]
\[
+ \epsilon^{-2} C_2(m)\, n^{-\frac13}\log(n)\, E|h(X_1,\ldots,X_m)|^{4/3}
+ \epsilon^{-2} C_2(m)\, E\big( |h(X_1,\ldots,X_m)|^{4/3} \log|h(X_1,\ldots,X_m)|\, I_{(|h|>n)} \big)
\]
\[
\longrightarrow 0, \quad \text{as } n \to \infty.
\]
Thus J_3(n) = o_P(1). This also completes the proof of (4), and hence also that of Theorem 3. Now, as already noted above, the proofs of Theorems 1 and 2 follow from Theorem 3 and Lemmas 1 and 2.
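The first inequality in the chain above splits the truncated second moment at level n; a short verification of the two elementary bounds involved (ours, not spelled out in the original):

```latex
% Split at level n: on {|h| <= n} we have h^2 = |h|^{4/3}|h|^{2/3} <= n^{2/3}|h|^{4/3},
% while on {n < |h| <= n^{3/2}} we have |h|^{2/3} <= n, so n^{-1} h^2 <= |h|^{4/3}. Hence
n^{-1}\, E\big( h^2\, I_{(|h|\le n^{3/2})} \big)
  \le n^{-\frac13}\, E|h|^{4/3} + E\big( |h|^{4/3}\, I_{(n<|h|\le n^{3/2})} \big),
% and similarly, since \log(n) \le \log|h| on {|h| > n},
n^{-1}\log(n)\, E\big( h^2\, I_{(|h|\le n^{3/2})} \big)
  \le n^{-\frac13}\log(n)\, E|h|^{4/3}
    + E\big( |h|^{4/3}\log|h|\, I_{(n<|h|\le n^{3/2})} \big).
```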
Remark 5. Studying a U-statistic type process that can be written as a sum of three U-statistics of order m = 2, Csörgő, Szyszkowicz and Wang in [5] proved that under the slightly more relaxed condition that E|h(X_1,\ldots,X_m)|^{4/3} < ∞, as n → ∞, we have
\[
n^{-\frac32} \max_{1\le k\le n} \Big| \sum_{1\le i<j\le k} \big( h(X_i, X_j) - \tilde h_1(X_i) - \tilde h_1(X_j) \big) \Big| = o_P(1).
\]
In the proof of the latter, the well known Doob maximal inequality for martingales was used, which gives a sharper bound. The just mentioned inequality is not applicable to the processes in Theorems 1 and 2, even for U-statistics of order 2. The reason for this is that the parts inside the absolute values of J_s(n), s = 1, 2, 3, are not martingales. Also, since \sum_{C(k,m)} ( h(X_{i_1},\ldots,X_{i_m}) - \tilde h_1(X_{i_1}) - \ldots - \tilde h_1(X_{i_m}) ), for m > 2, no longer forms a martingale, it seems that the Doob maximal inequality is not applicable to the process
\[
n^{-m+\frac12} \max_{1\le k\le n} \Big| \sum_{C(k,m)} \big( h(X_{i_1},\ldots,X_{i_m}) - \tilde h_1(X_{i_1}) - \ldots - \tilde h_1(X_{i_m}) \big) \Big|,
\]
which is an extension of the U-statistic parts of the process used by Csörgő, Szyszkowicz and Wang in [5] for m = 2.
Due to the nonexistence of the second moment of the kernel of the underlying U-statistic in the following example, the weak convergence result of Theorem A fails to apply. However, using Theorem 1 for example, one can still derive weak convergence results for the underlying U-statistic.

Example. Let X_1, X_2, ... be a sequence of i.i.d. random variables with the density function
\[
f(x) = \begin{cases} |x-a|^{-3}, & |x-a| \ge 1,\ a \ne 0,\\ 0, & \text{elsewhere.} \end{cases}
\]
Consider the parameter θ = E^m(X_1) = a^m, where m ≥ 1 is a positive integer, and the kernel h(X_1,\ldots,X_m) = \prod_{i=1}^{m} X_i. Then, with m, n satisfying n ≥ m, the corresponding U-statistic is
\[
U_n = \binom{n}{m}^{-1} \sum_{C(n,m)} \prod_{j=1}^{m} X_{i_j}.
\]
A simple calculation shows that \tilde h_1(X_1) = X_1 a^{m-1} - a^m.

It is easy to check that E( |h(X_1,\ldots,X_m)|^{4/3} \log|h(X_1,\ldots,X_m)| ) < ∞ and that \tilde h_1(X_1) ∈ DAN (cf. Gut [7], page 439). In order to apply Theorem 1 to this U-statistic, define
\[
U^*_{[nt]} =
\begin{cases}
0, & 0 \le t < \frac{m}{n},\\[6pt]
\dfrac{\binom{[nt]}{m}^{-1} \sum_{C([nt],m)} \prod_{j=1}^{m} X_{i_j} - a^m}{\Big( \sum_{i=1}^{n} \big( X_i a^{m-1} - a^m \big)^2 \Big)^{\frac12}}, & \frac{m}{n} \le t \le 1.
\end{cases}
\]
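For a finite-sample illustration of this U*-process, here is a small sketch (ours, not part of the paper) with m = 2 and a = 1, so that θ = a² = 1 and h̃_1(x) = x − 1. The sample below is arbitrary (not drawn from the density f) and serves only to exercise the formula.

```python
from itertools import combinations
from math import comb, prod

def u_star(t, xs, a, m):
    """U*_[nt] for the product kernel h(x_1,...,x_m) = x_1 * ... * x_m,
    with theta = a**m and h~_1(x) = x * a**(m-1) - a**m."""
    n = len(xs)
    if t < m / n:
        return 0.0
    k = int(n * t)  # [nt]
    u_k = sum(prod(xs[i] for i in idx)
              for idx in combinations(range(k), m)) / comb(k, m)
    v_n = sum((x * a ** (m - 1) - a ** m) ** 2 for x in xs) ** 0.5
    return (u_k - a ** m) / v_n

xs = [2.0, 3.0, 5.0, 1.0]       # arbitrary illustrative sample
print(u_star(0.3, xs, 1.0, 2))  # 0.0, since 0.3 < m/n = 0.5
print(u_star(1.0, xs, 1.0, 2))  # (41/6 - 1)/sqrt(21) ≈ 1.2729
```

Note that the normalizer uses the full-sample V_n even when evaluating the process at t < 1, exactly as in the display above.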