KERNEL ESTIMATION OF DENSITY LEVEL SETS

Benoît CADRE¹

Laboratoire de Mathématiques, Université Montpellier II,
CC 051, Place E. Bataillon, 34095 Montpellier cedex 5, FRANCE

arXiv:math/0501221v1 [math.ST] 14 Jan 2005

Abstract. Let $f$ be a multivariate density and $f_n$ be a kernel estimate of $f$ drawn from the $n$-sample $X_1,\cdots,X_n$ of i.i.d. random variables with density $f$. We compute the asymptotic rate of convergence towards 0 of the volume of the symmetric difference between the $t$-level set $\{f \geq t\}$ and its plug-in estimator $\{f_n \geq t\}$. As a corollary, we obtain the exact rate of convergence of a plug-in type estimate of the density level set corresponding to a fixed probability for the law induced by $f$.

Key-words : Kernel estimate, Density level sets, Hausdorff measure.

2000 Mathematics Subject Classification : 62H12, 62H30.

1. Introduction. Recent years have witnessed an increasing interest in estimation of density level sets and in related multivariate mappings problems. The main reason is the recent advent of powerful mathematical tools and computational machinery that render these problems much more tractable. One of the most powerful applications of density level sets estimation is in unsupervised cluster analysis (see Hartigan [1]), where one tries to break a complex data set into a series of piecewise similar groups or structures, each of which may then be regarded as a separate class of data, thus reducing overall data complexity. But there are many other fields where the knowledge of density level sets is of great interest. For example, Devroye and Wise [2], Grenander [3], Cuevas [4] and Cuevas and Fraiman [5] used density support estimation for pattern recognition and for detection of the abnormal behavior of a system.

In this paper, we consider the problem of estimating the $t$-level set $\mathcal{L}(t)$ of a multivariate probability density $f$ with support in $\mathbb{R}^k$ from independent random variables $X_1,\cdots,X_n$ with density $f$. Recall that for $t \geq 0$, the $t$-level set of the density $f$ is defined as follows :
\[ \mathcal{L}(t) = \{x \in \mathbb{R}^k : f(x) \geq t\}. \]
¹ cadre@math.univ-montp2.fr

The question now is how to define the estimates of $\mathcal{L}(t)$ from the $n$-sample $X_1,\cdots,X_n$. Even in a nonparametric framework, there are many possible answers to this question, depending on the restrictions one can impose on the level set and the density under study. Mainly, there are two families of such estimators : the plug-in estimators and the estimators constructed by an excess mass approach. Assume that an estimator $f_n$ of the density $f$ is available. Then a straightforward estimator of the level set $\mathcal{L}(t)$ is $\{f_n \geq t\}$, the plug-in estimator. Molchanov [6, 7] and Cuevas and Fraiman [5] proved consistency of these estimators and obtained some rates of convergence. The excess mass approach suggests first considering the empirical mapping $M_n$ defined for every Borel set $L \subset \mathbb{R}^k$ by
\[ M_n(L) = \frac{1}{n} \sum_{i=1}^{n} 1_{\{X_i \in L\}} - t\lambda(L), \]
where $\lambda$ denotes the Lebesgue measure on $\mathbb{R}^k$. A natural estimator of $\mathcal{L}(t)$ is a maximizer of $M_n(L)$ over a given class of Borel sets $L$. For different classes of
onvex level sets), estimators based on the
ex
ess mass approa
h were studied by Hartigan [8℄, Müller [9℄, Müller and
Sawitzki [10℄, Nolan [11℄andPolonik [12℄,whoproved
onsisten
yandfound
ertainratesof
onvergen
e. Whenthelevelsetisstar-shaped,Tsybakov[13℄
re
ently proved that the ex
ess mass approa
h gives estimators with opti-
mal rates of
onvergen
e in an asymptoti
ally minimax sense, whithin the
studied
lasses of densities. Though this result has a great theoreti
al in-
terest, assuming the level set to be
onvex or star-shaped appears to be
somewhat unsatisfa
tory for the statisti
al appli
ations. Indeed, su
h an as-
sumption does not permit to
onsider the important
ase where the density
under study is multimodal with a (cid:28)nite number of modes, and hen
e the
results
an not be applied to
luster analysis in parti
ular. In
omparison,
the plug-in estimators do not
are about the spe
i(cid:28)
shape of the level set.
Moreover, another advantage of the plug-in approa
h is that it leads to eas-
ily
omputable estimators. We emphasize that, if the ex
ess mass approa
h
often gives estimators with optimal rates of
onvergen
e, the
omplexity of
the
omputational algorithm of su
h an estimator is high, due to the pres-
en
e of the maximizing step (see the
omputational algorithm proposed by
Hartigan, [8℄).
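
As an illustration of the excess mass functional $M_n$ discussed above, the following minimal Python sketch evaluates $M_n(L)$ for intervals $L = [a,b]$ in dimension $k = 1$ and maximizes it by brute force over a finite grid of endpoints. The names `excess_mass` and `best_interval`, the restriction to intervals, the simulated Gaussian sample and the candidate grid are assumptions made for this example only; this is not the algorithm of Hartigan [8].

```python
import numpy as np

def excess_mass(sample, a, b, t):
    """M_n(L) for L = [a, b]: empirical mass of L minus t * lambda(L)."""
    inside = (sample >= a) & (sample <= b)
    return inside.mean() - t * (b - a)

def best_interval(sample, t, candidates):
    """Brute-force maximizer of M_n over intervals with endpoints in `candidates`."""
    best, best_value = None, -np.inf
    for i, a in enumerate(candidates):
        for b in candidates[i + 1:]:
            value = excess_mass(sample, a, b, t)
            if value > best_value:
                best, best_value = (a, b), value
    return best, best_value

# Hypothetical data: 400 draws from a standard Gaussian density.
rng = np.random.default_rng(0)
sample = rng.normal(size=400)
candidates = np.linspace(-3.0, 3.0, 61)
interval, value = best_interval(sample, t=0.1, candidates=candidates)
```

Even in this one-dimensional toy setting the search is quadratic in the number of candidate endpoints, which gives a rough idea of why the maximizing step becomes costly for richer classes of sets.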

In this paper, we study a plug-in type estimator of the density level set $\mathcal{L}(t)$, using a kernel density estimate of $f$ (Rosenblatt, [14]). Given a kernel $K$ on $\mathbb{R}^k$ (i.e., a probability density on $\mathbb{R}^k$) and a bandwidth $h = h(n) > 0$ such that $h \to 0$ as $n$ grows to infinity, the kernel estimate of $f$ is given by
\[ f_n(x) = \frac{1}{nh^k} \sum_{i=1}^{n} K\Big(\frac{x - X_i}{h}\Big), \quad x \in \mathbb{R}^k. \]
We let the plug-in estimate $\mathcal{L}_n(t)$ of $\mathcal{L}(t)$ be defined as
\[ \mathcal{L}_n(t) = \{x \in \mathbb{R}^k : f_n(x) \geq t\}. \]

In the whole paper, the distance between two Borel sets in $\mathbb{R}^k$ is a measure (in particular the volume, or Lebesgue measure $\lambda$ on $\mathbb{R}^k$) of the symmetric difference denoted $\Delta$ (i.e., $A \Delta B = (A \cap B^c) \cup (A^c \cap B)$ for all sets $A, B$). Our main result (Theorem 2.1) deals with the limit law of
\[ \sqrt{nh^k}\, \lambda\big(\mathcal{L}_n(t) \Delta \mathcal{L}(t)\big), \]
which is proved to be degenerate.
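
The objects introduced so far (the kernel estimate $f_n$, the plug-in set $\mathcal{L}_n(t)$ and the volume of the symmetric difference) can be sketched numerically. The Python code below is a minimal illustration for $k = 1$ under several assumptions that are not part of the paper: the biweight kernel is used as one example of a compactly supported, continuously differentiable, radially nonincreasing kernel, the data are simulated from a standard Gaussian density so that the true level set is known, and Lebesgue measure is approximated by a Riemann sum on a regular grid. The function names are hypothetical.

```python
import numpy as np

def biweight(u):
    """Biweight kernel: compactly supported, C^1, nonincreasing in |u|."""
    return np.where(np.abs(u) <= 1.0, 15.0 / 16.0 * (1.0 - u**2) ** 2, 0.0)

def kernel_estimate(x, sample, h):
    """f_n(x) = (1 / (n h^k)) * sum_i K((x - X_i) / h), written for k = 1."""
    u = (np.asarray(x)[:, None] - sample[None, :]) / h
    return biweight(u).sum(axis=1) / (sample.size * h)

def symmetric_difference_volume(grid, fn_values, f_values, t):
    """Riemann-sum approximation of lambda({f_n >= t} Delta {f >= t})."""
    dx = grid[1] - grid[0]
    return np.sum((fn_values >= t) != (f_values >= t)) * dx

# Simulated example: the true density is known only because we generate the data.
rng = np.random.default_rng(0)
n, h, t = 2000, 0.3, 0.1
sample = rng.normal(size=n)
grid = np.linspace(-5.0, 5.0, 2001)
f_values = np.exp(-grid**2 / 2.0) / np.sqrt(2.0 * np.pi)
fn_values = kernel_estimate(grid, sample, h)

volume = symmetric_difference_volume(grid, fn_values, f_values, t)
scaled = np.sqrt(n * h) * volume  # sqrt(n h^k) * lambda(L_n(t) Delta L(t)), k = 1
```

Repeating the simulation with independent samples and increasing $n$ would give an empirical view of how the scaled quantity concentrates, although a single finite-sample run proves nothing about the limit.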

Consider now the following statistical problem. In cluster analysis for instance, it is of interest to estimate the density level set corresponding to a fixed probability $p \in [0,1]$ for the law induced by $f$. The data contained in this level set can then be regarded as the most important data if $p$ is far enough from 0. Since $f$ is unknown, the level $t$ of this density level set is unknown as well. The natural estimate of the target density level set $\mathcal{L}(t)$ becomes $\mathcal{L}_n(t_n)$, where $t_n$ is such that
\[ \int_{\mathcal{L}_n(t_n)} f_n \, d\lambda = p. \]
As a consequence of our main result, we obtain in Corollary 2.1 the exact asymptotic rate of convergence of $\mathcal{L}_n(t_n)$ to $\mathcal{L}(t)$. More precisely, we prove that for some $\beta_n$ which only depends on the data, one has :
\[ \beta_n \sqrt{nh^k}\, \lambda\big(\mathcal{L}_n(t_n) \Delta \mathcal{L}(t)\big) \to \sqrt{\frac{2}{\pi} \int K^2 \, d\lambda} \]
in probability.

The precise formulations of Theorem 2.1 and Corollary 2.1 are given in Section 2. Section 3 is devoted to the proof of Theorem 2.1 while the proof of Corollary 2.1 is given in Section 4. The appendix is dedicated to a change of variables formula involving the $(k-1)$-dimensional Hausdorff measure (Proposition A).

2. The main results.

2.1 Estimation of $t$-level sets. In the following, $\Theta \subset (0,\infty)$ denotes an open interval and $\|\cdot\|$ stands for the Euclidean norm over any finite dimensional space. Let us introduce the hypotheses on the density $f$ :

H1. $f$ is twice continuously differentiable and $f(x) \to 0$ as $\|x\| \to \infty$;

H2. For all $t \in \Theta$,
\[ \inf_{f^{-1}(\{t\})} \|\nabla f\| > 0, \]
where, here and in the following, $\nabla \psi(x)$ denotes the gradient at $x \in \mathbb{R}^k$ of the differentiable function $\psi : \mathbb{R}^k \to \mathbb{R}$.

Next, we introduce the assumptions on the kernel $K$ :

H3. $K$ is a continuously differentiable and compactly supported function. Moreover, there exists a monotone nonincreasing function $\mu : \mathbb{R}_+ \to \mathbb{R}$ such that $K(x) = \mu(\|x\|)$ for all $x \in \mathbb{R}^k$.

The assumption on the support of $K$ is only provided for simplicity of the proofs. As a matter of fact, one could consider a more general class of kernels, including the Gaussian kernel for instance. Moreover, as we will use Pollard's results [15], $K$ is assumed to be of the form $\mu(\|\cdot\|)$.

Throughout the paper, $\mathcal{H}$ denotes the $(k-1)$-dimensional Hausdorff measure on $\mathbb{R}^k$ (cf. Evans and Gariepy, [16]). Recall that $\mathcal{H}$ agrees with ordinary "$(k-1)$-dimensional surface area" on nice sets. Moreover, $\partial A$ is the boundary of the set $A \subset \mathbb{R}^k$,
\[ \alpha(k) = \begin{cases} 3 & \text{if } k = 1; \\ k+4 & \text{if } k \geq 2, \end{cases} \]
and for any bounded Borel function $g : \mathbb{R}^k \to \mathbb{R}_+$, $\lambda_g$ stands for the measure defined for each Borel set $A \subset \mathbb{R}^k$ by
\[ \lambda_g(A) = \int_A g \, d\lambda. \]
Finally, the notation $\xrightarrow{P}$ denotes the convergence in probability.

It can be proved that if H1, H3 hold and if $\lambda(\partial \mathcal{L}(t)) = 0$, one has :
\[ \lambda\big(\mathcal{L}_n(t) \Delta \mathcal{L}(t)\big) \xrightarrow{P} 0. \]
The aim of Theorem 2.1 below is to obtain the exact rate of convergence.
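
Theorem 2.1. Let $g : \mathbb{R}^k \to \mathbb{R}_+$ be a bounded Borel function and assume that H1-H3 hold. If $nh^k/(\log n)^{16} \to \infty$ and $nh^{\alpha(k)}(\log n)^2 \to 0$, then for almost every (a.e.) $t \in \Theta$ :
\[ \sqrt{nh^k}\, \lambda_g\big(\mathcal{L}_n(t) \Delta \mathcal{L}(t)\big) \xrightarrow{P} \sqrt{\frac{2t}{\pi} \int K^2 \, d\lambda} \int_{\partial \mathcal{L}(t)} \frac{g}{\|\nabla f\|} \, d\mathcal{H}. \]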

Remarks 2.1.
• Notice that the rightmost integral is defined because $g$ is bounded and $\mathcal{L}(t)$ is a compact set for all $t > 0$ according to H1.
• In practice, this result is mainly interesting when $g \equiv 1$, since we then have the asymptotic behavior of the volume of the symmetric difference between the two level sets. The general case is provided for the proof of Corollary 2.1 below.
• If we only assume $f$ to be Lipschitz instead of H1, then $f$ is an almost everywhere continuously differentiable function by Rademacher's theorem and Theorem 2.1 holds under the additional assumption on the bandwidth : $nh^{k+2}(\log n)^2 \to 0$.
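
To make the limit in Theorem 2.1 concrete, the sketch below evaluates its right-hand side in a hypothetical one-dimensional example that is not taken from the paper: $f$ is the standard Gaussian density, $g \equiv 1$, and $K$ is the biweight kernel $K(u) = \frac{15}{16}(1-u^2)^2 1_{[-1,1]}(u)$, for which $\int K^2 \, d\lambda = 5/7$. For $k = 1$ the measure $\mathcal{H}$ is the counting measure, $\partial \mathcal{L}(t) = \{-a, a\}$ with $a = \sqrt{-2\log(t\sqrt{2\pi})}$, and $|f'(\pm a)| = at$. The function name is an assumption of this example.

```python
import math

def theorem_limit_1d_gaussian(t, kernel_l2_norm_sq):
    """Right-hand side of Theorem 2.1 for k = 1, g = 1, f standard Gaussian.

    Requires 0 < t < 1 / sqrt(2 * pi) so that the level set is a proper interval.
    """
    a = math.sqrt(-2.0 * math.log(t * math.sqrt(2.0 * math.pi)))  # boundary is {-a, a}
    grad_norm = a * t                     # |f'(a)| = a * f(a) = a * t
    boundary_integral = 2.0 / grad_norm   # counting measure on two boundary points
    return math.sqrt(2.0 * t * kernel_l2_norm_sq / math.pi) * boundary_integral

K_TILDE_BIWEIGHT = 5.0 / 7.0  # integral of K^2 for the biweight kernel
limit = theorem_limit_1d_gaussian(0.1, K_TILDE_BIWEIGHT)
```

The constant grows as $t$ approaches the mode, where the gradient on the boundary becomes small; this is consistent with the role of H2 in the theorem.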

2.2 Estimation of level sets with fixed probability. In order to derive the corollary, we need an additional condition on $f$.

H4. For all $t \in (0, \sup_{\mathbb{R}^k} f]$, $\lambda(f^{-1}[t-\varepsilon, t+\varepsilon]) \to 0$ as $\varepsilon \to 0$. Moreover, $\lambda(f^{-1}(0,\varepsilon]) \to 0$ as $\varepsilon \to 0$.

Roughly speaking, H4 means that the sets where $f$ is constant do not charge the Lebesgue measure on $\mathbb{R}^k$. Many densities with a finite number of local extrema satisfy H4. However, notice that if $f$ is a continuous density such that $\lambda(f^{-1}(0,\varepsilon]) \to 0$ as $\varepsilon \to 0$, then it is compactly supported.

Let us now denote by $\mathcal{P}$ the mapping
\[ \mathcal{P} : [0, \sup_{\mathbb{R}^k} f] \to [0,1], \quad t \mapsto \lambda_f(\mathcal{L}(t)). \]
Observe that $\mathcal{P}$ is one-to-one if $f$ satisfies H1, H4. Then, for all $p \in [0,1]$, let $t^{(p)} \in [0, \sup_{\mathbb{R}^k} f]$ be the unique real number such that $\lambda_f(\mathcal{L}(t^{(p)})) = p$. Moreover, let $t_n^{(p)} \in [0, \sup_{\mathbb{R}^k} f_n]$ be such that $\lambda_{f_n}(\mathcal{L}_n(t_n^{(p)})) = p$. Notice that $t_n^{(p)}$ does exist since $f_n$ is a density on $\mathbb{R}^k$.

The aim of Corollary 2.1 below is to obtain the exact rate of convergence of $\mathcal{L}_n(t_n^{(p)})$ to $\mathcal{L}(t^{(p)})$. We also introduce an estimator of the unknown integral in Theorem 2.1.
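
Corollary 2.1. Let $k \geq 2$, $(\alpha_n)_n$ be a sequence of positive real numbers such that $\alpha_n \to 0$ and assume that H1-H4 hold. If $nh^{k+2}/\log n \to \infty$, $nh^{k+4}(\log n)^2 \to 0$ and $\alpha_n^2 nh^k/(\log n)^2 \to \infty$ then, for a.e. $p \in \mathcal{P}(\Theta)$ :
\[ \sqrt{nh^k}\, \frac{\beta_n}{\sqrt{t_n^{(p)}}}\, \lambda\big(\mathcal{L}_n(t_n^{(p)}) \Delta \mathcal{L}(t^{(p)})\big) \xrightarrow{P} \sqrt{\frac{2}{\pi} \int K^2 \, d\lambda}, \]
where
\[ \beta_n = \alpha_n / \lambda\big(\mathcal{L}_n(t_n^{(p)}) - \mathcal{L}_n(t_n^{(p)} + \alpha_n)\big). \]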

Remarks 2.2.
• It is of statistical interest to mention the fact that under the assumptions of the corollary, we have for all $p \in [0,1]$ : $t_n^{(p)} \to t^{(p)}$ with probability 1 (see Lemma 4.3).
• When $k = 1$, the conditions of Theorem 2.1 on the bandwidth $h$ do not allow one to derive Corollary 2.1. In practice, estimations of density level sets and their applications to cluster analysis for instance are mainly interesting in high-dimensional problems.

3. Proof of Theorem 2.1.
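
3.1. Auxiliary results and proof of Theorem 2.1. For all $t > 0$, let
\[ \mathcal{V}_n^t = f^{-1}\Big[t - \frac{(\log n)^\beta}{\sqrt{nh^k}},\, t\Big] \quad \text{and} \quad \overline{\mathcal{V}}_n^t = f^{-1}\Big[t,\, t + \frac{(\log n)^\beta}{\sqrt{nh^k}}\Big], \]
where $\beta > 1/2$ is fixed. Moreover, $\tilde{K}$ stands for the real number :
\[ \tilde{K} = \int K^2 \, d\lambda. \]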
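
Proposition 3.1. Let $g : \mathbb{R}^k \to \mathbb{R}_+$ be a bounded Borel function and assume that H1-H3 hold. If $nh^k/(\log n)^{31\beta} \to \infty$ and $nh^{\alpha(k)}(\log n)^{2\beta} \to 0$, then for a.e. $t \in \Theta$ :
\[ \lim_n \sqrt{nh^k} \int_{\mathcal{V}_n^t} P(f_n(x) \geq t) \, d\lambda_g(x) = \lim_n \sqrt{nh^k} \int_{\overline{\mathcal{V}}_n^t} P(f_n(x) < t) \, d\lambda_g(x) = \sqrt{\frac{t\tilde{K}}{2\pi}} \int_{\partial \mathcal{L}(t)} \frac{g}{\|\nabla f\|} \, d\mathcal{H}. \]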
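
Proposition 3.2. Let $g : \mathbb{R}^k \to \mathbb{R}_+$ be a bounded Borel function and assume that H1-H3 hold. If $nh^k/(\log n)^{5\beta} \to \infty$ and $nh^{\alpha(k)}(\log n)^{2\beta} \to 0$, then for a.e. $t \in \Theta$ :
\[ \lim_n nh^k\, \mathrm{var}\Big[\lambda_g\big(\mathcal{V}_n^t \cap \mathcal{L}_n(t)\big)\Big] = 0 = \lim_n nh^k\, \mathrm{var}\Big[\lambda_g\big(\overline{\mathcal{V}}_n^t \cap \mathcal{L}_n(t)^c\big)\Big]. \]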
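
Proof of Theorem 2.1. Let $t \in \Theta$ be such that both conclusions of Propositions 3.1 and 3.2 hold. According to H3 and Pollard ([15], Theorem 37 and Problem 28, Chapter II), we have almost surely (a.s.) :
\[ \sup_{\mathbb{R}^k} |f_n - Ef_n| \to 0. \]
Moreover, since both $\sup_n Ef_n(x)$ and $f(x)$ vanish as $\|x\| \to \infty$ by H1, H3, we have :
\[ \sup_{\mathbb{R}^k} |Ef_n - f| \to 0. \]
Thus, a.s. and for $n$ large enough :
\[ \sup_{\mathbb{R}^k} |f_n - f| \leq \frac{t}{2}. \]
Consequently, $\mathcal{L}_n(t) \subset \mathcal{L}(t/2)$ and since $\mathcal{L}(t) \subset \mathcal{L}(t/2)$, we get :
\[ \lambda_g\big(\mathcal{L}_n(t) \Delta \mathcal{L}(t)\big) = \int_{\mathcal{L}(t/2)} 1_{\{f_n < t,\, f \geq t\}} \, d\lambda_g + \int_{\mathcal{L}(t/2)} 1_{\{f_n \geq t,\, f < t\}} \, d\lambda_g. \qquad (3.1) \]
Let
\[ A_n = \Big\{ \sqrt{nh^k} \sup_{\mathcal{L}(t/2)} |f_n - f| \leq (\log n)^\beta \Big\}. \]
Since $\mathcal{L}(t/2)$ is a compact set by H1, it is a classical exercise to prove that $P(A_n) \to 1$ under the assumptions of the theorem. Hence, one only needs to prove that the result of Theorem 2.1 holds on the event $A_n$. But on $A_n$, one has according to (3.1) : $\lambda_g(\mathcal{L}_n(t) \Delta \mathcal{L}(t)) = J_n^1 + J_n^2$, where :
\[ J_n^1 = \lambda_g\big(\overline{\mathcal{V}}_n^t \cap \mathcal{L}_n(t)^c\big) \quad \text{and} \quad J_n^2 = \lambda_g\big(\mathcal{V}_n^t \cap \mathcal{L}_n(t)\big). \]
By Propositions 3.1 and 3.2, if $j = 1$ or $j = 2$ :
\[ \sqrt{nh^k}\, J_n^j \xrightarrow{P} \sqrt{\frac{t\tilde{K}}{2\pi}} \int_{\partial \mathcal{L}(t)} \frac{g}{\|\nabla f\|} \, d\mathcal{H}, \qquad (3.2) \]
if the bandwidth $h$ satisfies $nh^{\alpha(k)}(\log n)^{2\beta} \to 0$ and $nh^k/(\log n)^{31\beta} \to \infty$. Letting $\beta = 16/31$, the theorem is proved •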

3.2. Proof of Proposition 3.1. Let $X$ be a random variable with density $f$,
\[ V_n(x) = \mathrm{var}\, K\Big(\frac{x - X}{h}\Big) \quad \text{and} \quad Z_n(x) = \frac{h^k \sqrt{n}}{\sqrt{V_n(x)}} \big(f_n(x) - Ef_n(x)\big), \]
for all $x \in \mathbb{R}^k$ such that $V_n(x) \neq 0$. Moreover, $\Phi$ denotes the distribution function of the $\mathcal{N}(0,1)$ law.

In the proofs, $c$ denotes a positive constant whose value may vary from line to line.

Lemma 3.1. Assume that H1, H3 hold and let $\mathcal{C} \subset \mathbb{R}^k$ be a compact set such that $\inf_{\mathcal{C}} f > 0$. Then, there exists $c > 0$ such that for all $n \geq 1$, $x \in \mathcal{C}$ and $u \in \mathbb{R}$ :
\[ |P(Z_n(x) \leq u) - \Phi(u)| \leq \frac{c}{\sqrt{nh^k}}. \]

Proof. By the Berry-Esseen inequality (cf. Feller, [17]), one has for all $n \geq 1$, $u \in \mathbb{R}$ and $x \in \mathbb{R}^k$ such that $V_n(x) \neq 0$ :
\[ |P(Z_n(x) \leq u) - \Phi(u)| \leq \frac{3}{\sqrt{nV_n(x)^3}}\, E\Big| K\Big(\frac{x - X}{h}\Big) - EK\Big(\frac{x - X}{h}\Big) \Big|^3. \]
It is a classical exercise to deduce from H1, H3 that
\[ \sup_{x \in \mathcal{C}} E\Big| K\Big(\frac{x - X}{h}\Big) - EK\Big(\frac{x - X}{h}\Big) \Big|^3 \leq ch^k \quad \text{and} \quad \inf_{x \in \mathcal{C}} V_n(x) \geq ch^k, \]
hence the lemma •
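
For every bounded Borel function $g : \mathbb{R}^k \to \mathbb{R}_+$, we let $\Theta_0(g)$ be the set of $t \in \Theta$ such that :
\[ \lim_{\varepsilon \searrow 0} \frac{1}{\varepsilon} \lambda_g\big(f^{-1}[t-\varepsilon, t]\big) = \lim_{\varepsilon \searrow 0} \frac{1}{\varepsilon} \lambda_g\big(f^{-1}[t, t+\varepsilon]\big) = \int_{\partial \mathcal{L}(t)} \frac{g}{\|\nabla f\|} \, d\mathcal{H}. \]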
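
Lemma 3.2. Let $g : \mathbb{R}^k \to \mathbb{R}_+$ be a bounded Borel function and assume that H1, H2 hold. Then we have : $\Theta_0(g) = \Theta$ a.e.

Proof. According to H1, H2, for all $t \in \Theta$, there exists $\eta > 0$ such that :
\[ \inf_{f^{-1}[t-\eta, t+\eta]} \|\nabla f\| > 0. \]
We deduce from Proposition A that for all $t \in \Theta$ and $\varepsilon > 0$ small enough :
\[ \frac{1}{\varepsilon} \lambda_g\big(f^{-1}[t-\varepsilon, t]\big) = \frac{1}{\varepsilon} \int_{t-\varepsilon}^{t} \int_{\partial \mathcal{L}(s)} \frac{g}{\|\nabla f\|} \, d\mathcal{H} \, ds. \]
Using the Lebesgue-Besicovitch theorem (cf. Evans and Gariepy, [16], Theorem 1, Chapter I), we then have for a.e. $t \in \Theta$ :
\[ \lim_{\varepsilon \searrow 0} \frac{1}{\varepsilon} \lambda_g\big(f^{-1}[t-\varepsilon, t]\big) = \int_{\partial \mathcal{L}(t)} \frac{g}{\|\nabla f\|} \, d\mathcal{H}, \]
and the same result holds for $\lambda_g(f^{-1}[t, t+\varepsilon])$ instead of $\lambda_g(f^{-1}[t-\varepsilon, t])$, hence the lemma •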

It is a straightforward consequence of Lemma 3.2 above that $\lambda(\partial \mathcal{L}(t)) = 0$ for a.e. $t \in \Theta$. For simplicity, we shall assume throughout that this is true for all $t \in \Theta$. Since $\Theta$ is an open interval, we have in particular
\[ \lambda\big(f^{-1}[t-\varepsilon, t+\varepsilon]\big) = \lambda\big(f^{-1}(t-\varepsilon, t+\varepsilon)\big), \]
for all $t \in \Theta$ and $\varepsilon > 0$ small enough.
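
We now let for $t \in \Theta$ and $x \in \mathbb{R}^k$ such that $f(x)V_n(x) \neq 0$ :
\[ t_n(x) = \sqrt{\frac{nh^k}{\tilde{K} f(x)}}\, (t - f(x)) \quad \text{and} \quad \overline{t}_n(x) = \frac{h^k \sqrt{n}}{\sqrt{V_n(x)}}\, (t - Ef_n(x)), \]
and finally, $\overline{\Phi}(u) = 1 - \Phi(u)$ for all $u \in \mathbb{R}$.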
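
Lemma 3.3. Let $g : \mathbb{R}^k \to \mathbb{R}_+$ be a bounded Borel function and assume that H1, H2 hold. If $nh^k/(\log n)^{2\beta} \to \infty$ and $nh^{k+4}(\log n)^{2\beta} \to 0$, then for all $t \in \Theta_0(g)$ :
\[ \lim_n \sqrt{nh^k} \Big[ \int_{\mathcal{V}_n^t} P(f_n(x) \geq t) \, d\lambda_g(x) - \int_{\mathcal{V}_n^t} \overline{\Phi}(t_n(x)) \, d\lambda_g(x) \Big] = 0 \]
and
\[ \lim_n \sqrt{nh^k} \Big[ \int_{\overline{\mathcal{V}}_n^t} P(f_n(x) < t) \, d\lambda_g(x) - \int_{\overline{\mathcal{V}}_n^t} \Phi(t_n(x)) \, d\lambda_g(x) \Big] = 0. \]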
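
Proof. We only prove the first equality. Let $t \in \Theta_0(g)$. First note that for all $x \in \mathbb{R}^k$ such that $V_n(x) \neq 0$ :
\[ P(f_n(x) \geq t) = P(Z_n(x) \geq \overline{t}_n(x)). \]
There exists a compact set $\mathcal{C} \subset \mathbb{R}^k$ such that $\inf_{\mathcal{C}} f > 0$ and $\mathcal{V}_n^t \subset \mathcal{C}$ for all $n$. Observe that by Lemma 3.1 and the above remarks,
\[ \sqrt{nh^k} \Big[ \int_{\mathcal{V}_n^t} P(f_n(x) \geq t) \, d\lambda_g(x) - \int_{\mathcal{V}_n^t} \overline{\Phi}(\overline{t}_n(x)) \, d\lambda_g(x) \Big] \leq c\, \lambda_g(\mathcal{V}_n^t). \]
Since $\lambda_g(\mathcal{V}_n^t) \to 0$ by Lemma 3.2, one only needs now to prove that :
\[ E_n := \sqrt{nh^k} \int_{\mathcal{V}_n^t} \big| \overline{\Phi}(\overline{t}_n(x)) - \overline{\Phi}(t_n(x)) \big| \, d\lambda_g(x) \to 0. \]
One deduces from the Lipschitz property of $\Phi$ that
\[ E_n \leq c \sqrt{nh^k}\, \lambda_g(\mathcal{V}_n^t) \sup_{x \in \mathcal{V}_n^t} |\overline{t}_n(x) - t_n(x)|. \qquad (3.3) \]
But, by definitions of $t_n(x)$ and $\overline{t}_n(x)$, we have for all $x \in \mathcal{V}_n^t$ :
\[ \frac{1}{\sqrt{nh^k}} |\overline{t}_n(x) - t_n(x)| \leq |t - f(x)| \left| \frac{1}{\sqrt{\tilde{K} f(x)}} - \frac{1}{\sqrt{V_n(x) h^{-k}}} \right| + \sqrt{\frac{h^k}{V_n(x)}}\, |Ef_n(x) - f(x)| \]
\[ \leq \frac{(\log n)^\beta}{\sqrt{nh^k}} \sqrt{\frac{|\tilde{K} f(x) - V_n(x) h^{-k}|}{\tilde{K} f(x) V_n(x) h^{-k}}} + \sqrt{\frac{h^k}{V_n(x)}}\, |Ef_n(x) - f(x)|. \qquad (3.4) \]
It is a classical exercise to deduce from H1, H3 that, since $\mathcal{V}_n^t$ is contained in $\mathcal{C}$,
\[ \sup_{x \in \mathcal{V}_n^t} |Ef_n(x) - f(x)| \leq ch^2, \]
and similarly, that
\[ \sup_{x \in \mathcal{V}_n^t} |\tilde{K} f(x) - V_n(x) h^{-k}| \leq ch. \]
One deduces from (3.4) and above that
\[ \sup_{x \in \mathcal{V}_n^t} |\overline{t}_n(x) - t_n(x)| \leq c\big(\sqrt{h}\, (\log n)^\beta + \sqrt{nh^{k+4}}\big). \]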