Random Marked Sets

Felix Ballani, Zakhar Kabluchko, Martin Schlather

Institute for Mathematical Stochastics, Georgia Augusta University, Goldschmidtstr. 7, D-37077 Göttingen, Germany,
ballani@math.uni-goettingen.de, kabluch@math.uni-goettingen.de, schlather@math.uni-goettingen.de

March 13, 2009

arXiv:0903.2388v1 [math.PR] 13 Mar 2009

Abstract. We introduce a new class of stochastic processes which are defined on a random set in $\mathbb{R}^d$. These processes can be seen as a link between random fields and marked point processes. Unlike for random fields, the mark covariance function need in general not be positive definite. This implies that in many situations the use of simple geostatistical methods appears to be questionable. Surprisingly, for a special class of processes based on Gaussian random fields, we do have positive definiteness for the corresponding mark covariance function and mark correlation function.

Classification. Primary: 60G60, 60G55; secondary: 60G15, 60D05

Keywords. random field, random set, marked point process, mark correlation function, mark covariance function

1 Introduction

Quantities measured in $\mathbb{R}^d$ are mostly modelled as so-called regionalized variables, i.e., one usually assumes that these quantities can, in principle, be measured everywhere in $\mathbb{R}^d$ and that the choice of sampling points does not depend on the values of these quantities. Based on this assumption, several geostatistical methods like variogram analysis or kriging can be applied, see [6]. However, there are two types of situations where this assumption does not hold [23] and hence, uncritical use of geostatistical methods might cause incorrect or meaningless results.

The first type of problems is caused by the investigators themselves by some kind of preferential sampling [9]. For instance, this happens when data are sampled only at places where high values of the variable of interest are expected.
The second type of problems is intrinsic to the investigated object itself. An obvious situation is the investigation of individuals, e.g. trees in a forest, where interactions among individuals are present. In this particular situation the theory of marked point processes provides a formal framework for data analysis [10, 12]. Here, we would like to draw attention to some further, deceptive situations where implicit conditioning has been mostly ignored in the literature [13, 16, 29]. For instance, the investigation of pesticides in soil is restricted to cropland and the height of forest litter is restricted to silvicultural areas. In both cases, a preselection cannot be excluded since environmental conditions directly influence the kind of land use. A further, simple example has motivated this work and appears when the altitude is predicted by geostatistical methods based on measurements that are taken above sealevel only. Such kind of conditioning might be considered as minor, but can cause major effects, nonetheless. We advise caution because of the following facts:

1. Any characteristic, such as the covariance function or the variogram, has to be understood as a conditional quantity given that measurements can be taken at certain locations.
2. In general, neither the covariance function is positive definite nor the variogram is conditionally negative definite.

Since Gaussian random fields are rather popular, a large part of this paper deals with the following model: the sealevel is at 0 and the altitude is given by some (smooth) stationary Gaussian random field $Z$ with mean $-t$ and variance $1$. Then we face the following oddities when inference is based on measurements above sealevel only:

1. The theoretical variogram is not conditionally negative definite, in general.
2. A naive definition of the covariance function $C(x,y)$ by
\[ C(x,y) = \mathbb{E}[Z(x)Z(y) \mid Z(x) \ge 0,\, Z(y) \ge 0] - m^2 \]
leads in general to a function which is not positive definite, for any $m \in \mathbb{R}$.
3. A more suitable definition of the covariance function for the altitude above sealevel as the conditional covariance given that $Z(x) \ge 0$ and $Z(y) \ge 0$ leads to a function which is never differentiable.
4. If $t = 0$, the conditional covariance function is positive definite. However, no random field exists that is independent of the sampling locations and that can model the altitude above sealevel.

Before discussing the above set up in detail, we will introduce a theoretical framework so that both a meaningful definition of second-order characteristics is possible and usual random fields as well as marked point processes are included as particular cases. For this reason we extend the notion of a random upper semi-continuous (u.s.c.) function (taking values in $\overline{\mathbb{R}} = [-\infty, \infty]$) on $\mathbb{R}^d$ such that the domain is a random subset of $\mathbb{R}^d$. To this end we make use of Matheron's [17] idea and consider the hypograph
\[ A_f = \{ (x,t) \in X \times \overline{\mathbb{R}} : t \le f(x) \}, \qquad X \subseteq \mathbb{R}^d, \]
of a function $f : X \to \overline{\mathbb{R}}$. In fact, $A_f$ is closed if and only if $f$ is u.s.c. on closed $X$, and the mapping $f \mapsto A_f$ is a bijection.

The paper is organized as follows. In Section 2 we formally introduce the notion of a random marked closed set and discuss some examples. In Section 3 we generalise the definition of several characteristics for random fields to random marked sets. We show that, in general, they do not share the same definiteness properties as their random field analogues. In Section 4 we study Gaussian random fields $Z(x)$ given that $Z(x)$ exceeds a certain threshold $t \in \mathbb{R}$. In Section 5 some results on the differentiability of the mark covariance function of random marked sets are given. In Section 6, we collect the proofs of the statements of the preceding sections.

2 Random marked closed sets

Denote by $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, +\infty\}$ the extended real line. Let
\[ \Phi_{\mathrm{usc}} = \{ (X,f) : X \subseteq \mathbb{R}^d \text{ is closed},\ f : X \to \overline{\mathbb{R}} \text{ is u.s.c.} \}. \]
$\Phi_{\mathrm{usc}}$ is isomorphic to the system $\mathcal{U}_{\mathrm{cl}}$ of all closed sets $A \subseteq \mathbb{R}^d \times \overline{\mathbb{R}}$ which satisfy
\[ \forall x \in \mathbb{R}^d\ \forall t \in \overline{\mathbb{R}} : \quad (x,t) \in A \ \Rightarrow\ \{x\} \times [-\infty, t] \subseteq A \tag{1} \]
by the bijection
\[ \tau : \Phi_{\mathrm{usc}} \to \mathcal{U}_{\mathrm{cl}}, \qquad (X,f) \mapsto \{ (x,t) \in X \times \overline{\mathbb{R}} : t \le f(x) \}, \quad (X,f) \in \Phi_{\mathrm{usc}}. \]
The subsequent proposition follows immediately from the fact that the space $\mathcal{F}(\mathbb{R}^d \times \overline{\mathbb{R}})$ of closed subsets of $\mathbb{R}^d \times \overline{\mathbb{R}}$ is compact [17, 19] and that $\mathcal{U}_{\mathrm{cl}}$ is closed in $\mathcal{F}(\mathbb{R}^d \times \overline{\mathbb{R}})$.

Proposition 1. $\Phi_{\mathrm{usc}}$ is compact in the topology induced by $\mathcal{U}_{\mathrm{cl}}$.

Definition 1. Let $(\Omega, \mathcal{A}, \mathbb{P})$ be a complete probability space and let $(\Xi, Z) : \Omega \to \Phi_{\mathrm{usc}}$ be a mapping with
\[ \{ \omega \in \Omega : \tau(\Xi, Z) \cap B \ne \emptyset \} \in \mathcal{A} \]
for every compact set $B$ in $\mathbb{R}^d \times \overline{\mathbb{R}}$. Then $(\Xi, Z)$ is called a random marked closed set.

The distribution law of a random closed set is characterized by the probabilities of hitting compact sets [18, 19] whereas, by [18, Prop. 2.3.1], it suffices to restrict to a suitable base. When choosing the same base of all finite unions of half cylinders $B_i \times [t_i, \infty]$ as in [24, Thm. XII-6] for random u.s.c. functions on $\mathbb{R}^d$, we obtain the following characterization of random marked closed sets.

Theorem 1. The distribution of a random marked closed set $(\Xi, Z)$ (as a probability measure on $\Phi_{\mathrm{usc}}$) is completely determined by the joint probabilities
\[ \mathbb{P}\Big( \sup_{x \in B_i \cap \Xi} Z(x) < t_i,\ B_i \cap \Xi \ne \emptyset,\ i \in I;\ \ B_j \cap \Xi = \emptyset,\ j \in \{1,\dots,n\} \setminus I \Big), \]
where $B_1, \dots, B_n$ are compact subsets of $\mathbb{R}^d$, $t_1, \dots, t_n \in \mathbb{R}$, and $I$ is a subset of $\{1, \dots, n\}$, $n \in \mathbb{N}$.

Definition 2. A random marked closed set $(\Xi, Z)$ is called stationary if
\[ \mathbb{P}(\tau(\Xi, Z) + (x, 0) \in \cdot\,) = \mathbb{P}(\tau(\Xi, Z) \in \cdot\,) \]
for all $x \in \mathbb{R}^d$, and it is called isotropic if
\[ \mathbb{P}(\theta\,\tau(\Xi, Z) \in \cdot\,) = \mathbb{P}(\tau(\Xi, Z) \in \cdot\,) \]
for all rotations $\theta \in SO_{d+1}$ with $\theta(\mathbb{R}^d \times \{0\}) = \mathbb{R}^d \times \{0\}$.

Example 1. A particular model of a random marked closed set that describes an unbiased sampling of a random field [27] is given when $Z$ is a random u.s.c. function on $\mathbb{R}^d$ that is independent of the random closed set $\Xi$. We call $(\Xi, Z)$ a random-field model.
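The difference between a random-field model and a preferentially sampled one can be made concrete at a single location. The following is a minimal sketch, not taken from the paper: it assumes a standard Gaussian mark $Z(o)$, and the function name `mean_mark_given_observed`, the inclusion probability 1/2 and the excursion rule $Z(o) \ge 0$ are illustrative assumptions.

```python
import random


def mean_mark_given_observed(n, independent, seed=0):
    """Monte Carlo estimate of E[Z(o) | o in Xi] at a single location o.

    Z(o) is standard Gaussian (an illustrative assumption). If `independent`
    is True, o belongs to Xi with probability 1/2 independently of Z(o), as
    in a random-field model; otherwise o belongs to Xi exactly when
    Z(o) >= 0, which mimics sampling above sea level only.
    """
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        if independent:
            observed = rng.random() < 0.5
        else:
            observed = z >= 0.0
        if observed:
            total += z
            count += 1
    return total / count


if __name__ == "__main__":
    print(mean_mark_given_observed(100_000, independent=True))
    print(mean_mark_given_observed(100_000, independent=False))
```

Under independence the estimate stays close to the unconditional mean 0, whereas conditioning on $Z(o) \ge 0$ moves it towards the half-normal mean $\sqrt{2/\pi} \approx 0.80$.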
If the data are consistent with a random-field model, any analysis simplifies considerably since the domain and the marks can be investigated separately (see also Remark 4) by using standard techniques for random sets [26] and for geostatistical data [6, 10]. For the particular case of marked point processes, several tests for the random-field model hypothesis have been developed [11, 23].

Example 2. Let $\Xi$ be a random closed set and $Z(x) = d(\partial\Xi, x)$ the Euclidean distance of $x \in \mathbb{R}^d$ to the boundary of $\Xi$. Then $Z$ is even continuous on $\Xi$. Since local maxima of $Z$ are only attained at locations in the interior of $\Xi$, the random marked set $(\Xi, Z)$ is a random-field model if and only if $\Xi = \partial\Xi$ almost surely, in which case $Z$ is trivial.

Example 3. Cressie et al. [7] consider the spatial prediction on a river network. Here, $\Xi$ is the flow of the river (as a one-dimensional line or a two-dimensional stripe) and $Z$ models the dissolved oxygen.

Example 4. Let $\Xi$ be a random closed set represented as a locally finite union of closed $C^2$-smooth hypersurfaces in $\mathbb{R}^d$ such that any two hypersurfaces intersect at most in a set of measure zero with respect to the $(d-1)$-dimensional Hausdorff measure. For any $x \in \Xi$, the mark $Z(x)$ is the maximum of the mean curvatures of the hypersurfaces at $x$. The mean curvature is important, for example, in the analysis of foams [15].

3 Characteristics for random marked closed sets

For the description of random fields a set of second-order characteristics like the variogram, the covariance function and the correlation function are used [6]. In analogy to these summary functions, several second-order characteristics for marked point processes have been introduced as conditional quantities given the existence of points of the respective unmarked point process [22, 26].
Since point processes can be described as random (counting) measures, these quantities have been derived as Radon-Nikodym derivatives of certain second-order moment measures [3, Section 2.7]. Nevertheless, random measures are not always appropriate for the definition of second-order characteristics as the following example illustrates.

Example 5. Let the stationary random closed set $\Xi$ in $\mathbb{R}^1$ be given by
\[ \Xi = \xi + \bigcup_{z \in \mathbb{Z}} \big( [2z - p,\, 2z + p] \cup \{2z + 1\} \big), \]
where $p \in (0, \tfrac{1}{3})$ and $\xi$ is uniformly distributed in $[0,1]$. Obviously, interpoint distances $r \in (0, 2p]$ are only possible if both points belong to the same segment $\xi + [2z - p, 2z + p]$, and interpoint distances $r \in (1 - p, 1 + p]$ are only possible if one point belongs to a segment $\xi + [2z - p, 2z + p]$ and the other is from one of the singletons $\{\xi + 2z - 1\}$ or $\{\xi + 2z + 1\}$. Since $\mathbb{P}(o, r \in \Xi) = 0$ for all $r \in (1 - p, 1 + p]$, the approach of defining second-order characteristics using a random measure, which is here based on the Lebesgue measure on $\mathbb{R}^1$, cannot account for segment-singleton point pairs, and hence, these characteristics are undefined for $r \in (1 - p, 1 + p]$. Nonetheless, it does make sense also to consider the correlation of two marks given that the corresponding points are a distance $r$, $r \in (1 - p, 1 + p]$, apart.

In what follows, $B_\varepsilon(x)$ denotes the Euclidean ball in $\mathbb{R}^d$ with centre $x \in \mathbb{R}^d$ and radius $\varepsilon \ge 0$, $\oplus$ denotes Minkowski addition, and we write $\Xi_{\oplus\varepsilon}$ shortly for $\Xi \oplus B_\varepsilon(o)$.

Let $(\Xi, Z)$ be a stationary random marked closed set in $\mathbb{R}^d$ with marks in $\mathbb{R}$. For any $\varepsilon \ge 0$ define the (stationary) random field $\tilde{Z}_\varepsilon$ by
\[ \tilde{Z}_\varepsilon(x) = \begin{cases} \max\limits_{y \in \Xi \cap B_\varepsilon(x)} Z(y), & x \in \Xi_{\oplus\varepsilon}, \\ 0, & \text{otherwise}. \end{cases} \]
Let $f : \mathbb{R}^2 \to \mathbb{R}$ be a right-continuous function. For all $h \in \mathbb{R}^d$ define
\[ \kappa_f(h) = \lim_{\varepsilon \to 0+} \mathbb{E}\big[ f\big(\tilde{Z}_\varepsilon(o), \tilde{Z}_\varepsilon(h)\big) \,\big|\, o, h \in \Xi_{\oplus\varepsilon} \big] \tag{2} \]
whenever $\kappa_{|f|}(h) < \infty$ and $\mathbb{P}(o, h \in \Xi_{\oplus\varepsilon}) > 0$ for all $\varepsilon > 0$, otherwise $\kappa_f(h)$ is undefined.
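The construction of the dilated mark field $\tilde{Z}_\varepsilon$ can be illustrated on the real line. The sketch below is only an illustration under stated assumptions: $\Xi$ is represented by a finite sample of points, and the helper name `dilated_mark` and the toy data are hypothetical, not part of the paper.

```python
def dilated_mark(points, marks, x, eps):
    """Return the value of the dilated mark field at x.

    `points` is a finite sample of Xi on the real line and `marks` holds the
    corresponding values Z(y). The result is the maximum mark over all
    sample points within distance eps of x, or 0.0 when no sample point is
    that close, i.e. when x lies outside the eps-dilation of the sample.
    """
    nearby = [m for y, m in zip(points, marks) if abs(y - x) <= eps]
    return max(nearby) if nearby else 0.0


if __name__ == "__main__":
    points = [0.0, 0.1, 1.0]   # toy sample of Xi (hypothetical data)
    marks = [2.0, 5.0, -1.0]   # marks Z(y) at these points
    for eps in (0.5, 0.05, 0.0):
        print(eps, dilated_mark(points, marks, 0.0, eps))
```

At a point of the sample, shrinking `eps` can only lower the value, and for `eps = 0` it returns the mark at that point itself. This is the monotone behaviour that the limit $\varepsilon \to 0+$ in (2) relies on.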
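In particular, for the following choices of $f$,
\[ e(m_1, m_2) = m_1, \qquad c(m_1, m_2) = m_1 m_2, \qquad v(m_1, m_2) = m_1^2, \tag{3} \]
define
\[ E(h) = \kappa_e(h) \tag{4} \]
\[ \gamma(h) = \tfrac{1}{2}\big( \kappa_v(h) + \kappa_v(-h) \big) - \kappa_c(h) \tag{5} \]
\[ \mathrm{cov}(h) = \kappa_c(h) - \kappa_e(h)\,\kappa_e(-h) \tag{6} \]
\[ \mathrm{cor}(h) = \frac{\kappa_c(h) - \kappa_e(h)\,\kappa_e(-h)}{\big(\kappa_v(h) - \kappa_e(h)^2\big)^{1/2} \big(\kappa_v(-h) - \kappa_e(-h)^2\big)^{1/2}} \tag{7} \]
\[ k_{mm}(h) = (\bar{m})^{-2}\, \kappa_c(h), \qquad (\bar{m} \ne 0), \tag{8} \]
where
\[ \bar{m} = \mathbb{E}[Z(o) \mid o \in \Xi] \]
is the mean mark.

We call $\gamma$ the mark variogram, $\mathrm{cov}$ the mark covariance function, $\mathrm{cor}$ the mark correlation function and $k_{mm}$ Stoyan's $k_{mm}$-function of $(\Xi, Z)$ [22]. Note that, if $\Xi \equiv \mathbb{R}^d$, these definitions are compatible with the classical definitions for random fields (see Remark 4).

Whenever $(\Xi, Z)$ is assumed to be both stationary and isotropic the characteristics given by (4)–(8) are rotation invariant. By slight abuse of notation we will write $E(r)$, $r \in [0, \infty)$, instead of $E(h)$, $h \in \mathbb{R}^d$. The same applies for the functions defined in Eq. (5)–(8).

Remark 1. Let $\Psi_\varepsilon = \nu_d(\,\cdot \cap \Xi_{\oplus\varepsilon})$ be the random volume measure associated with the random closed set $\Xi_{\oplus\varepsilon}$. Here, $\nu_d$ is the $d$-dimensional Lebesgue measure. If $\mu_\varepsilon^{(2)}$ denotes the second-order moment measure of $\Psi_\varepsilon$ then, for $B_1, B_2 \in \mathcal{B}(\mathbb{R}^d)$, we have
\[
\begin{aligned}
\int_{B_2} \int_{B_1} & \mathbb{E}\big[ f\big(\tilde{Z}_\varepsilon(x), \tilde{Z}_\varepsilon(y)\big)\, \mathbf{1}_{\Xi_{\oplus\varepsilon}}(x)\, \mathbf{1}_{\Xi_{\oplus\varepsilon}}(y) \big] \, dx \, dy \\
&= \mathbb{E}\Big[ \int_{B_2} \int_{B_1} f\big(\tilde{Z}_\varepsilon(x), \tilde{Z}_\varepsilon(y)\big)\, \Psi_\varepsilon(dx)\, \Psi_\varepsilon(dy) \Big] \\
&= \int_{B_1 \times B_2} \int_{\mathbb{R}^2} f(m_1, m_2)\, Q_{\varepsilon;x,y}(d(m_1, m_2))\, \mu_\varepsilon^{(2)}(d(x,y)) \\
&= \int_{B_2} \int_{B_1} \int_{\mathbb{R}^2} f(m_1, m_2)\, Q_{\varepsilon;x,y}(d(m_1, m_2))\, \mathbb{P}(x, y \in \Xi_{\oplus\varepsilon}) \, dx \, dy,
\end{aligned}
\]
where $Q_{\varepsilon;x,y}$ is the two-point mark distribution of the weighted random measure $(\Psi_\varepsilon, \tilde{Z}_\varepsilon)$ [3]. Hence, for almost all $(x,y)$ with $\mathbb{P}(x, y \in \Xi_{\oplus\varepsilon}) > 0$, we have
\[ \mathbb{E}\big[ f\big(\tilde{Z}_\varepsilon(x), \tilde{Z}_\varepsilon(y)\big) \,\big|\, x, y \in \Xi_{\oplus\varepsilon} \big] = \int_{\mathbb{R}^2} f(m_1, m_2)\, Q_{\varepsilon;x,y}(d(m_1, m_2)). \]

Remark 2. In case $\mathbb{P}(o, h \in \Xi) > 0$, $h \in \mathbb{R}^d$, the above definition takes the simpler form
\[ \kappa_f(h) = \mathbb{E}[ f(Z(o), Z(h)) \mid o, h \in \Xi ] \]
if we impose the integrability conditions
\[ \mathbb{E}\big[ |Z(o)|\, \mathbf{1}_\Xi(o)\, \mathbf{1}_\Xi(h) \big] < \infty, \qquad \kappa_{|e|}(h) < \infty, \]
for $f = e$, and
\[ \mathbb{E}\big[ |Z(o)|^2\, \mathbf{1}_\Xi(o)\, \mathbf{1}_\Xi(h) \big] < \infty, \qquad \kappa_{|v|}(h) < \infty, \]
for $f = c$ or $f = v$. This can be seen as follows. Denoting by $a_+$ and $a_-$ the positive and the negative part of $a \in \mathbb{R}$, respectively, we always have
\[ \tilde{Z}_\varepsilon(o)_+\, \mathbf{1}_{\Xi_{\oplus\varepsilon}}(o)\, \mathbf{1}_{\Xi_{\oplus\varepsilon}}(h) \le \tilde{Z}_{\bar\varepsilon}(o)_+\, \mathbf{1}_{\Xi_{\oplus\bar\varepsilon}}(o)\, \mathbf{1}_{\Xi_{\oplus\bar\varepsilon}}(h) \le |\tilde{Z}_{\bar\varepsilon}(o)|\, \mathbf{1}_{\Xi_{\oplus\bar\varepsilon}}(o)\, \mathbf{1}_{\Xi_{\oplus\bar\varepsilon}}(h) \]
for all $\varepsilon \le \bar\varepsilon$, where the right-hand side is integrable due to $\kappa_{|e|}(h) < \infty$. Similarly,
\[ \tilde{Z}_\varepsilon(o)_-\, \mathbf{1}_{\Xi_{\oplus\varepsilon}}(o)\, \mathbf{1}_{\Xi_{\oplus\varepsilon}}(h) \le Z(o)_-\, \mathbf{1}_\Xi(o)\, \mathbf{1}_\Xi(h) \le |Z(o)|\, \mathbf{1}_\Xi(o)\, \mathbf{1}_\Xi(h). \]
In the same way we obtain
\[ |\tilde{Z}_{\bar\varepsilon}(o)|^2\, \mathbf{1}_{\Xi_{\oplus\bar\varepsilon}}(o)\, \mathbf{1}_{\Xi_{\oplus\bar\varepsilon}}(h) + |Z(o)|^2\, \mathbf{1}_\Xi(o)\, \mathbf{1}_\Xi(h) \]
as an integrable upper bound of $|\tilde{Z}_\varepsilon(o)|^2\, \mathbf{1}_{\Xi_{\oplus\varepsilon}}(o)\, \mathbf{1}_{\Xi_{\oplus\varepsilon}}(h)$ and
\[ \big( |Z(o)|^2 + |Z(h)|^2 \big)\, \mathbf{1}_\Xi(o)\, \mathbf{1}_\Xi(h) + \big( |\tilde{Z}_{\bar\varepsilon}(o)|^2 + |\tilde{Z}_{\bar\varepsilon}(h)|^2 \big)\, \mathbf{1}_{\Xi_{\oplus\bar\varepsilon}}(o)\, \mathbf{1}_{\Xi_{\oplus\bar\varepsilon}}(h) \]
as an integrable upper bound of $|\tilde{Z}_\varepsilon(o)\, \tilde{Z}_\varepsilon(h)|\, \mathbf{1}_{\Xi_{\oplus\varepsilon}}(o)\, \mathbf{1}_{\Xi_{\oplus\varepsilon}}(h)$. Since $Z$ is u.s.c. on $\Xi$, a value $\varepsilon > 0$ exists for every $x \in \Xi$ and for every $\delta > 0$ such that $Z(y) \le Z(x) + \delta$ for all $y \in B_\varepsilon(x) \cap \Xi$. Hence, we have $\tilde{Z}_\varepsilon(x) \to Z(x)$ from above as $\varepsilon \to 0+$. Further, $x \notin \Xi$ implies $x \notin \Xi_{\oplus\varepsilon}$ for all sufficiently small $\varepsilon$. We then have
\[ f\big(\tilde{Z}_\varepsilon(o), \tilde{Z}_\varepsilon(h)\big)\, \mathbf{1}_{\Xi_{\oplus\varepsilon}}(o)\, \mathbf{1}_{\Xi_{\oplus\varepsilon}}(h) \to f(Z(o), Z(h))\, \mathbf{1}_\Xi(o)\, \mathbf{1}_\Xi(h) \quad \text{a.s.} \]
as $\varepsilon \to 0+$. Hence, by the dominated convergence theorem, we have
\[ \kappa_f(h) = \lim_{\varepsilon \to 0+} \frac{\mathbb{E}\big[ f\big(\tilde{Z}_\varepsilon(o), \tilde{Z}_\varepsilon(h)\big)\, \mathbf{1}_{\Xi_{\oplus\varepsilon}}(o)\, \mathbf{1}_{\Xi_{\oplus\varepsilon}}(h) \big]}{\mathbb{P}(o, h \in \Xi_{\oplus\varepsilon})} = \frac{\mathbb{E}\big[ f(Z(o), Z(h))\, \mathbf{1}_\Xi(o)\, \mathbf{1}_\Xi(h) \big]}{\mathbb{P}(o, h \in \Xi)}. \]

Remark 3. There exists an alternative concept of random marked sets which is inspired by the notion of random fields and where second-order characteristics in the sense of the preceding remark can be defined. Let $\overline{\mathbb{R}}_\emptyset = \overline{\mathbb{R}} \cup \{\zeta_\emptyset\}$ be the extension of $\overline{\mathbb{R}}$ by some $\zeta_\emptyset$.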
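We denote by $\mathcal{B}(\overline{\mathbb{R}}_\emptyset)$ the respective Borel $\sigma$-field, which is generated by all sets $B_1 \cup B_2$ for $B_1 \in \mathcal{B}(\mathbb{R})$ and $B_2 \subset \{-\infty, \infty, \zeta_\emptyset\}$. A family of random variables $Z(\cdot, x) : \Omega \to \overline{\mathbb{R}}_\emptyset$, $x \in \mathbb{R}^d$, on the probability space $(\Omega, \mathcal{A}, \mathbb{P})$ is called a random field with random domain $\Xi$, if
\[ \Xi = \{ x \in \mathbb{R}^d : Z(\cdot, x) \ne \zeta_\emptyset \}. \]
Clearly, when $Z$ takes only values different from $\zeta_\emptyset$ or $\{-\infty, \infty, \zeta_\emptyset\}$ this notion of a random marked set includes usual $\overline{\mathbb{R}}$- or $\mathbb{R}$-valued random fields on $\mathbb{R}^d$. Note that $\Xi$ is a random set in a very general sense [18], entirely determined by its indicator $\mathbf{1}_\Xi(x) = \mathbf{1}_{\overline{\mathbb{R}}}(Z(x))$. If $Z$ is jointly measurable, i.e., $Z$ is $(\mathcal{A} \otimes \mathcal{B}(\mathbb{R}^d), \mathcal{B}(\overline{\mathbb{R}}_\emptyset))$-measurable, then the realizations of $\Xi$ are almost surely Borel measurable. If we have even almost surely closed (open) realizations of $\Xi$ then $Z$ is called a random field with random closed (open) domain, see also [19].

If $\mathbb{P}(o \in \Xi) > 0$ holds for a stationary random field $Z$ with random domain $\Xi$ we can define second-order characteristics without any further assumption on path regularity. Let $\tilde{Z}$ be the (stationary) random field given by $\tilde{Z}(x) = Z(x)$ for $x \in \Xi$, and $\tilde{Z}(x) = 0$ otherwise. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be a measurable function. For all $h \in \mathbb{R}^d$ define
\[ \kappa_f(h) = \mathbb{E}\big[ f\big(\tilde{Z}(o), \tilde{Z}(h)\big) \,\big|\, o, h \in \Xi \big] \]
whenever $\mathbb{P}(o, h \in \Xi) > 0$ and $\mathbb{E}\big[ |f(\tilde{Z}(o), \tilde{Z}(h))|\, \mathbf{1}_\Xi(o)\, \mathbf{1}_\Xi(h) \big] < \infty$.

Remark 4. Let $(\Xi, Z)$ be a stationary real-valued random-field model and
\[ Z_\varepsilon(x) = \begin{cases} \max\limits_{y \in \Xi \cap B_\varepsilon(x)} Z(y), & x \in \Xi_{\oplus\varepsilon}, \\ Z(x), & \text{otherwise}. \end{cases} \]
Since $Z$ is u.s.c. on $\Xi$ we have $Z_\varepsilon(x) \to Z(x)$ from above for $x \in \Xi$, and hence, by the definition of $Z_\varepsilon$, for all $x \in \mathbb{R}^d$ as $\varepsilon \to 0+$. Then, using the independence of $Z$ and $\Xi$, we obtain
\[
\begin{aligned}
\kappa_f(h) &= \lim_{\varepsilon \to 0+} \mathbb{E}[ f(Z_\varepsilon(o), Z_\varepsilon(h)) \mid o, h \in \Xi_{\oplus\varepsilon} ] \\
&= \lim_{\varepsilon \to 0+} \mathbb{E}[ f(Z_\varepsilon(o), Z_\varepsilon(h)) ] \\
&= \mathbb{E}[ f(Z(o), Z(h)) ]
\end{aligned}
\]
for all $h \in \mathbb{R}^d$ which satisfy $\mathbb{P}(o, h \in \Xi_{\oplus\varepsilon}) > 0$ for all $\varepsilon > 0$ and, depending on the choice of $f$ according to (3), one of the integrability conditions in Remark 2 with $\Xi$ replaced by $\mathbb{R}^d$.

Remark 5. The definition of $\kappa_f$ according to (2) is, in important situations, consistent with the classical definition of the second-order characteristics of stationary marked point processes [22]: Let $\Phi$ be a stationary simple marked point process on $\mathbb{R}^d \times \mathbb{R}$. Then, $\Xi$ is the support of the unmarked point process $\tilde\Phi = \Phi(\,\cdot \times \mathbb{R})$. We assume that the second-order moment measure $\mu^{(2)}$ of $\tilde\Phi$ is locally finite. For $\|h\| > 0$ we have
\[
\begin{aligned}
\mathbb{E}\big[ f\big(\tilde{Z}_\varepsilon(o), \tilde{Z}_\varepsilon(h)\big)\, & \mathbf{1}_{\Xi \oplus B_\varepsilon(o)}(o)\, \mathbf{1}_{\Xi \oplus B_\varepsilon(o)}(h) \big] \\
&= \mathbb{E}\big[ f\big(\tilde{Z}_\varepsilon(o), \tilde{Z}_\varepsilon(h)\big)\, \mathbf{1}_{\{\tilde\Phi(B_\varepsilon(o)) = 1\}}\, \mathbf{1}_{\{\tilde\Phi(B_\varepsilon(h)) = 1\}} \big] \\
&\quad + \mathbb{E}\big[ f\big(\tilde{Z}_\varepsilon(o), \tilde{Z}_\varepsilon(h)\big)\, \mathbf{1}_{\{\tilde\Phi(B_\varepsilon(o)) > 1\}}\, \mathbf{1}_{\{\tilde\Phi(B_\varepsilon(h)) \ge 1\}} \big] \\
&\quad + \mathbb{E}\big[ f\big(\tilde{Z}_\varepsilon(o), \tilde{Z}_\varepsilon(h)\big)\, \mathbf{1}_{\{\tilde\Phi(B_\varepsilon(o)) = 1\}}\, \mathbf{1}_{\{\tilde\Phi(B_\varepsilon(h)) > 1\}} \big].
\end{aligned}
\]
For any $0 < \varepsilon < \|h\|/2$ the first summand equals
\[ \mathbb{E} \sum_{(x_1, m_1), (x_2, m_2) \in \Phi} f(m_1, m_2)\, \mathbf{1}_{B_\varepsilon(o)}(x_1)\, \mathbf{1}_{B_\varepsilon(h)}(x_2) =: \mu_f^{(2)}(B_\varepsilon(o) \times B_\varepsilon(h)). \]
We can extend the argumentation in [8, Prop. 9.3.XV] in order to conclude that
\[ \frac{\mathbb{P}(o, h \in \Xi_{\oplus\varepsilon})}{\mu^{(2)}(B_\varepsilon(o) \times B_\varepsilon(h))} = \frac{\mathbb{P}(\tilde\Phi(B_\varepsilon(o)) \ge 1,\ \tilde\Phi(B_\varepsilon(h)) \ge 1)}{\mu^{(2)}(B_\varepsilon(o) \times B_\varepsilon(h))} \to 1 \]
as $\varepsilon \to 0+$. If we additionally impose the condition that for some $\bar\varepsilon > 0$,
\[ \sup_{\varepsilon \in (0, \bar\varepsilon)} \frac{\mathbb{E}\big[ \big| f\big(\tilde{Z}_\varepsilon(o), \tilde{Z}_\varepsilon(h)\big) \big|\, \mathbf{1}_{\{\tilde\Phi(B_\varepsilon(o)) > 1\}}\, \mathbf{1}_{\{\tilde\Phi(B_\varepsilon(h)) \ge 1\}}\, \mathbf{1}_{\{ |f(\tilde{Z}_\varepsilon(o), \tilde{Z}_\varepsilon(h))| > M \}} \big]}{\mathbb{P}(\tilde\Phi(B_\varepsilon(o)) \ge 1,\ \tilde\Phi(B_\varepsilon(h)) \ge 1)} \to 0 \]
as $M \to \infty$, we obtain
\[ \kappa_f(h) = \lim_{\varepsilon \to 0+} \frac{\mu_f^{(2)}(B_\varepsilon(o) \times B_\varepsilon(h))}{\mu^{(2)}(B_\varepsilon(o) \times B_\varepsilon(h))} \]
which equals $\mu^{(2)}$-a.e. the Radon-Nikodym derivative
\[ \frac{d\mu_f^{(2)}(x, x+h)}{d\mu^{(2)}(x, x+h)}. \]
For instance, the above condition is satisfied if $\mathbb{E}\big| f\big(\tilde{Z}_\varepsilon(o), \tilde{Z}_\varepsilon(h)\big) \big|^\alpha$ is uniformly bounded on $(0, \bar\varepsilon)$ for some $\alpha > 1$.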
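The summary characteristics (4)–(8) can be explored numerically in the simple situation of Remark 2, where $\mathbb{P}(o, h \in \Xi) > 0$ and $\kappa_f(h)$ is an ordinary conditional expectation. The following is a minimal Monte Carlo sketch under stated assumptions, not the authors' method: the pair $(Z(o), Z(h))$ is taken to be bivariate Gaussian with mean $-t$, unit variance and correlation `rho`, the set $\Xi$ is the excursion set $\{x : Z(x) \ge 0\}$ from the sealevel example of the Introduction, and the function name `mark_characteristics` is hypothetical.

```python
import math
import random


def mark_characteristics(rho, t, n, seed=0):
    """Monte Carlo estimates of the summary characteristics (4)-(8).

    (Z(o), Z(h)) is bivariate Gaussian with mean -t, unit variance and
    correlation rho (an illustrative stand-in for a stationary Gaussian
    field), and Xi = {x : Z(x) >= 0}.
    """
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho * rho)
    pairs = []   # (Z(o), Z(h)) on the event that o and h both lie in Xi
    single = []  # Z(o) on the event that o lies in Xi
    for _ in range(n):
        g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z_o = -t + g1
        z_h = -t + rho * g1 + s * g2
        if z_o >= 0.0:
            single.append(z_o)
            if z_h >= 0.0:
                pairs.append((z_o, z_h))
    k = len(pairs)
    ke_h = sum(a for a, _ in pairs) / k        # kappa_e(h)
    ke_mh = sum(b for _, b in pairs) / k       # kappa_e(-h), by stationarity
    kv_h = sum(a * a for a, _ in pairs) / k    # kappa_v(h)
    kv_mh = sum(b * b for _, b in pairs) / k   # kappa_v(-h)
    kc = sum(a * b for a, b in pairs) / k      # kappa_c(h)
    mean_mark = sum(single) / len(single)      # estimate of the mean mark
    cov = kc - ke_h * ke_mh
    return {
        "E": ke_h,
        "gamma": 0.5 * (kv_h + kv_mh) - kc,
        "cov": cov,
        "cor": cov / math.sqrt((kv_h - ke_h ** 2) * (kv_mh - ke_mh ** 2)),
        "kmm": kc / mean_mark ** 2,
    }


if __name__ == "__main__":
    print(mark_characteristics(rho=0.8, t=0.0, n=20_000, seed=1))
```

Here $\kappa_e(-h)$ is estimated by the conditional mean of $Z(h)$, which uses stationarity to shift the pair $(o, -h)$ to $(h, o)$. Evaluating the function for several values of `rho` gives a rough picture of how the conditional quantities differ from the unconditional covariance `rho`.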