Electronic Journal of Statistics
ISSN: 1935-7524
arXiv:0901.0655v1 [math.ST] 6 Jan 2009

Exponential bounds for minimum contrast estimators

Yuri Golubev
Université de Provence
39, rue F. Joliot-Curie, 13453 Marseille, France
e-mail: [email protected]

Vladimir Spokoiny
Weierstrass-Institute and Humboldt University Berlin
Mohrenstr. 39, 10117 Berlin, Germany
e-mail: [email protected]

Abstract: The paper focuses on general properties of parametric minimum contrast estimators. The quality of estimation is measured in terms of the rate function related to the contrast, thus allowing to derive exponential risk bounds invariant with respect to the detailed probabilistic structure of the model. This approach works well for small or moderate samples and covers the case of a misspecified parametric model. Another important feature of the presented bounds is that they may be used in the case when the parametric set is unbounded and non-compact. These bounds do not rely on the entropy or covering numbers and can be easily computed. The most important statistical fact resulting from the exponential bounds is a concentration inequality which claims that minimum contrast estimators concentrate with a large probability on a level set of the rate function. In typical situations, every such set is a root-n neighborhood of the parameter of interest. We also show that the obtained bounds can help for bounding the estimation risk and constructing confidence sets for the underlying parameters. Our general results are illustrated for the case of an i.i.d. sample. We also consider several popular examples including least absolute deviation estimation and the problem of estimating the location of a change point. What we obtain in these examples slightly differs from the usual asymptotic results presented in the statistical literature. This difference is due to the unboundedness of the parameter set and a possible model misspecification.

AMS 2000 subject classifications: Primary 62F10; secondary 62J12, 62F25.
Keywords and phrases: exponential risk bounds, rate function, quasi maximum likelihood, smooth contrast.

1. Introduction

One of the most fundamental ideas in statistics is to describe an unknown distribution IP of the observed data Y ∈ IR^n with the help of a simple parametric family (IP_θ, θ ∈ Θ), where Θ is a subset of a finite dimensional space, say, of IR^p. In this situation, the statistical model is characterized by the value of the parameter θ ∈ Θ and the statistical inference about IP is reduced to recovering θ. The standard likelihood approach suggests to estimate θ by maximizing the corresponding likelihood function. The maximum likelihood estimator can be generalized in several ways resulting in the so-called minimum contrast and M-estimators; see Huber (1967) and Huber (1981). The main idea behind this generalization is to estimate the underlying parameter θ by minimizing over Θ a contrast function −L(Y,θ):

    θ̃ = argmin_{θ∈Θ} {−L(Y,θ)} = argmax_{θ∈Θ} L(Y,θ).    (1.1)

The negative sign in this notation comes from the main example which we have in mind, when L(Y,θ) is the log-likelihood or quasi log-likelihood. A natural condition on the contrast function is that its expectation under the true measure IP_{θ0} is minimized at the true parameter θ0, i.e.

    θ0 = argmax_{θ∈Θ} IE_{θ0} L(Y,θ).    (1.2)

If L(Y,θ) is the log-likelihood ratio, that is,

    L(Y,θ) = log (dIP_θ/dIP_{θ0})(Y),

then the value −IE_{θ0} L(Y,θ) coincides with the Kullback-Leibler divergence K(IP_{θ0}, IP_θ) between IP_{θ0} and IP_θ. It is well known that K(IP_{θ0}, IP_θ) is always non-negative and K(IP_{θ0}, IP_θ) = 0 if and only if IP_{θ0} = IP_θ.

If the distribution IP does not belong to the parametric family (IP_θ, θ ∈ Θ), then the target of estimation can be naturally defined as the point of minimum of −IE L(Y,θ).

imsart-ejs ver. 2008/08/29 file: ejs_2009_352.tex date: January 6, 2009
We will see that this point θ0 indeed minimizes a special distance between the underlying measure IP and the measures IP_θ from the given parametric family.

The classical parametric statistical theory focuses mostly on asymptotic properties of the difference between θ̃ and the true value θ0 as the sample size n tends to infinity. There is a vast literature on this issue. We only mention the book Ibragimov and Khas'minskij (1981), which provides a comprehensive study of asymptotic properties of maximum likelihood and Bayesian estimators. Typical results claim that the maximum likelihood and Bayes estimators are asymptotically optimal under certain regularity conditions. Large deviation results about minimum contrast estimators can be found in Jensen and Wood (1998) and Sieders and Dzhaparidze (1987), while subtle small sample size properties of these estimators are presented in Field (1982) and Field and Ronchetti (1990).

Another stream of the literature considers minimum contrast estimators in a general i.i.d. situation, when the parameter set Θ is a subset of some functional space. We mention the papers Van de Geer (1993), Birgé and Massart (1993), Birgé and Massart (1998), Birgé (2006) and references therein. These studies mostly focused on the concentration properties of the maximum max_θ L(Y,θ) rather than on the properties of the estimator θ̃, which is the point of maximum of L(Y,θ). The established results are based on deep probabilistic facts from empirical process theory; see e.g. van der Vaart and Wellner (1996).

In this paper we also focus on the properties of the maximum of L(Y,θ) over θ ∈ Θ. However, we do not assume any particular structure of the contrast.
Our basic result claims that if for every θ ∈ Θ the difference L(Y,θ) − L(Y,θ0) has exponential moments, then under rather general and mild conditions the maximum max_θ {L(Y,θ) − L(Y,θ0)} has similar exponential moments.

In what follows, to keep notation shorter, we omit the argument Y in the contrast function L(Y,θ), writing L(θ) instead of L(Y,θ). However, one has to keep in mind that L(θ) is a random field that depends on the observed data Y. We also denote L(θ,θ0) = L(θ) − L(θ0).

To explain the main idea of this paper, introduce the function

    M(µ,θ,θ0) := −log IE exp{µ L(θ,θ0)}.

Let µ* be a maximizer of this function w.r.t. µ, i.e.

    µ*(θ) := argmax_µ M(µ,θ,θ0).    (1.3)

The rate function is defined via the Legendre transform of L(θ,θ0):

    M*(θ,θ0) := max_µ M(µ,θ,θ0) = −log IE exp{µ*(θ) L(θ,θ0)}.    (1.4)

Similar notions have already appeared in Chernoff (1952) and Bahadur (1960) for studying models with i.i.d. observations.

Obviously M*(θ,θ0) ≥ M(0,θ,θ0) = 0. The following identity follows immediately from the above definition:

    IE exp{µ*(θ) L(θ,θ0) + M*(θ,θ0)} = 1,  θ ∈ Θ.

We aim to extend this pointwise identity to the supremum over θ ∈ Θ, which particularly enables us to replace θ with the estimator θ̃. Unfortunately, in some situations IE exp sup_θ {µ*(θ) L(θ,θ0) + M*(θ,θ0)} = ∞. We illustrate this fact by some examples for a simple Gaussian linear model.

1.1. Examples for a linear Gaussian model

To illustrate how the quantities µ*(θ) and M*(θ,θ0) can be computed, let us consider the simplest case where L(θ,θ0) is a Gaussian field.

Example 1.1. [Gaussian contrast] Let, for each pair θ,θ′ ∈ Θ, the difference L(θ,θ′) = L(θ) − L(θ′) be a Gaussian random variable. In this case we call L(θ)
a Gaussian contrast. With M(θ,θ′) = −IE L(θ,θ′) and D²(θ,θ′) = Var L(θ,θ′), the random variable L(θ,θ′) is normal N(−M(θ,θ′), D²(θ,θ′)). Moreover,

    M(µ,θ,θ0) = −log IE exp{µ L(θ,θ0)} = µ M(θ,θ0) − µ² D²(θ,θ0)/2,

and the values µ*(θ), M*(θ,θ0) defined in (1.3)–(1.4) can be easily computed:

    µ*(θ) = argmax_{µ≥0} {µ M(θ,θ0) − µ² D²(θ,θ0)/2} = M(θ,θ0)/D²(θ,θ0),

    M*(θ,θ0) = sup_{µ≥0} M(µ,θ,θ0) = M²(θ,θ0)/(2 D²(θ,θ0)).

The formula can be further simplified if L(θ) is a Gaussian log-likelihood.

Example 1.2. [Gaussian model] Let

    L(θ,θ0) = log (dIP_θ/dIP_{θ0})(Y)

be a Gaussian random variable for any θ ∈ Θ, and in addition IP = IP_{θ0} for some θ0 ∈ Θ. As in the previous example, let M(θ,θ0) and D²(θ,θ0) denote the mean and variance of L(θ,θ0). The likelihood property IE_{θ0} exp{L(θ,θ0)} = 1 yields M(θ,θ0) = D²(θ,θ0)/2 and hence µ*(θ) ≡ 1/2 and M*(θ,θ0) = M(θ,θ0)/4.

Finally we consider a classical linear Gaussian regression.

Example 1.3. [Linear Gaussian model] Consider the linear model Y = Xθ0 + σε, where Y ∈ IR^n, θ0 ∈ IR^p, X is a known n×p matrix, and ε is a white Gaussian noise in IR^n, i.e. the ε_i are i.i.d. standard normal. Then

    L(θ) = −‖Y − Xθ‖²_n/(2σ²),

where ‖·‖_n denotes the standard Euclidean norm in IR^n. Obviously

    M(θ,θ0) = ‖X(θ − θ0)‖²_n/(2σ²),  D²(θ,θ0) = ‖X(θ − θ0)‖²_n/σ²,

and thus (see Example 1.2)

    M*(θ,θ0) = ‖X(θ − θ0)‖²_n/(8σ²).

The log-likelihood ratio can be written as

    L(θ,θ0) = ⟨X(θ − θ0), ε⟩_n/σ − ‖X(θ − θ0)‖²_n/(2σ²).

Let k denote the rank of the matrix X⊤X. Obviously k ≤ p and the vectors X(θ − θ0) span a linear subspace X in IR^n of dimension k. Denote by Π
the projector in IR^n on X. Then

    sup_{θ∈IR^p} {µ*(θ) L(θ,θ0) + M*(θ,θ0)}
        = sup_{θ∈IR^p} { ⟨X(θ − θ0), ε⟩_n/(2σ) − ‖X(θ − θ0)‖²_n/(8σ²) }
        = sup_{u∈IR^n} { ⟨Πu, ε⟩_n/(2σ) − ‖Πu‖²_n/(8σ²) }
        = sup_{u∈IR^n} { ⟨Πu, Πε⟩_n/(2σ) − ‖Πu‖²_n/(8σ²) } = ‖Πε‖²_n/2,

where the maximum is attained at any u ∈ IR^n such that Πu = 2σΠε. It is well known that ‖Πε‖²_n follows the χ² distribution with k degrees of freedom, and

    IE_{θ0} exp sup_θ {µ*(θ) L(θ,θ0) + M*(θ,θ0)} = IE exp{‖Πε‖²_n/2} = ∞.

However, for any positive s < 1, it holds by the same argument that

    sup_θ {µ*(θ) L(θ,θ0) + s M*(θ,θ0)}
        = sup_{u∈IR^n} { ⟨Πu, ε⟩_n/(2σ) − (2 − s)‖Πu‖²_n/(8σ²) } = ‖Πε‖²_n/(4 − 2s),

and thus

    IE_{θ0} exp sup_θ {µ*(θ) L(θ,θ0) + s M*(θ,θ0)} = IE exp{ ‖Πε‖²_n/(4 − 2s) } = ( (2 − s)/(1 − s) )^{k/2}.

An important feature of this inequality is that it only involves the effective dimension k of the parameter space and does not depend on the design X, noise level σ², sample size n, etc. Later we show that such a behaviour of the log-likelihood is not restricted to Gaussian linear models and can be proved for a quite general statistical set-up.

1.2. Main result

The examples from Section 1.1 suggest to consider in the general situation the maximum of the random field µ*(θ) L(θ,θ0) + s M*(θ,θ0) for s < 1. The main result of the paper shows that under some technical conditions this maximum is indeed stochastically bounded in a rather strong sense. Namely, for some ρ ∈ (0,1),

    IE sup_{θ∈Θ} exp{ ρ[µ*(θ) L(θ,θ0) + s M*(θ,θ0)] } ≤ C(ρ,s),    (1.5)

where C(ρ,s) is a constant that can be easily controlled in typical examples. This result particularly yields that µ*(θ̃) L(θ̃,θ0) and M*(θ̃,θ0) have bounded
Another corollary of this fact is that θ concentrates on thesets A(z,θ )= θ:M∗(θ,θ ) z forsufficientlylarge z inthesensethat 0 0 { ≤ } theprobability IP θ A(z,θ ) isexponentiallysmallin z.Uesuallyeverysuch 0 concentrationsetisa6∈root-nvicinityofthepoint θ .SeeSection2.3forprecise 0 (cid:0) (cid:1) formulations. Ibrageimov and Khas’minskij (1981) stated a version of (1.5) for the i.i.d. case and used it to prove consistency of θ. Webrieflycommentonsomeusefulfeaturesofthebasicinequality(1.5).First of all this bound is non-asymptotic and may be useed even if the sample size is small or moderate. It is also applicable in the situation when the parametric modeling assumption is misspecified. Our results may be used in such cases as wellwiththe“true”parameter θ definedasthemaximumpointofthecontrast 0 expected value: θ0 =argmaxθIEL(θ). Another interesting question is about the accuracy of estimation when the parameterset Θ is notcompact.The typicalresultsin the classicalparametric theory has been established for compact parametric sets since this assumption simplifies considerably the conditions and the technical tools. There exist very few results for the case of non-compact sets. See Ibragimov and Khas’minskij (1981) for an example. Our conditions are quite mild and particularly, the pa- rameter set can be non-compact and unbounded. Moreover, we present some examples in Section 4 illustrating that the quality of the minimum contrast estimation can heavily depend on topological properties of Θ and on the be- haviorof the rate function M∗(θ,θ ) for large θ. The correspondingaccuracy 0 of estimation can be different from the classical root-n behavior. The paper is organizedas follows. The main result is presented in Section 2. Section 2.3 presents some useful corollaries of (1.5) describing concentration properties of θ, some risk bounds, confidence sets for the targetparameter θ 0 based on the L(θ,θ). 
Section 2.4 specifies the approach to the important case of a smooth contrast. In this situation the main conditions ensuring (1.5) are substantially simplified. Section 3 illustrates how our approach applies to the classical i.i.d. case, while Section 4 presents some applications of the general exponential bound to three particular problems: estimation of the median, of the scale parameter of an exponential model, and of the change point location. Although these examples have already been studied, the proposed approach reveals some new features of the classical least squares and least absolute deviation estimators in the cases when the parametric assumption is misspecified or the parameter set is not compact. In the case of median estimation the result applies even if the observations do not have a first moment. The last example in this section considers the prominent change point problem. We particularly show that in the case when the size of the jump is completely unknown, the accuracy of estimating its location differs from the well known parametric rate 1/n: it depends on the distance of the change point to the edge of the observation interval and involves an extra iterated-log factor.

2. Risk bound for the minimum contrast

This section presents a general exponential bound on the minimum contrast value in a rather general set-up. Let −L(θ), θ ∈ Θ, be a random contrast function of a finite dimensional parameter θ ∈ Θ ⊂ IR^p given on some probability space (Ω, F, IP). We also assume that L(θ) is a separable random field and IE L(θ) exists for all θ ∈ Θ. The minimum contrast estimator is defined as a minimizer of −L(θ), and the target of estimation is the value θ0 which minimizes the expectation −IE L(θ). It is clear that for any θ° ∈ Θ,

    θ̃ = argmax_{θ∈Θ} L(θ,θ°)  and  θ0 = argmax_{θ∈Θ} IE L(θ,θ°).
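As a concrete illustration of the estimator θ̃ (our sketch, not taken from the paper), the snippet below maximizes the least absolute deviation contrast L(θ) = −Σ_i |Y_i − θ| over a grid of candidate values; its maximizer approximates the sample median, which is one of the examples treated later in Section 4. The Cauchy data, the grid, and the seed are our own illustrative choices.

```python
import numpy as np

# Minimum contrast sketch: -L(theta) = sum_i |Y_i - theta| (least absolute
# deviation), maximized over a grid. The maximizer approximates the sample
# median; note the heavy-tailed Cauchy data have no first moment.
rng = np.random.default_rng(2)
Y = rng.standard_cauchy(501) + 1.0          # "true" location theta0 = 1
grid = np.linspace(-5, 5, 10001)            # candidate parameter set Theta
L = -np.abs(Y[:, None] - grid[None, :]).sum(axis=0)   # L(theta) on the grid
theta_hat = grid[np.argmax(L)]              # the minimum contrast estimator
print(theta_hat, np.median(Y))              # the two nearly coincide
```

For odd sample size the exact LAD minimizer is the sample median, so the grid maximizer agrees with it up to the grid spacing.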
Our study focuses on the value of the maximum in θ of the random field L(θ,θ0):

    L(θ̃,θ0) = sup_{θ∈Θ} L(θ,θ0) = sup_{θ∈Θ} {L(θ) − L(θ0)}.

By definition, L(θ̃,θ0) is a non-negative random variable.

2.1. Preliminaries. The case of a discrete parameter set

The main goal of this paper is to obtain exponential bounds for the supremum in θ of the random field L(θ,θ0) without specifying a particular structure of the model or contrast function L(θ). Instead we impose some conditions of finite exponential moments for the increments L(θ,θ′) = L(θ) − L(θ′). With M(µ,θ,θ0) = −log IE exp{µ L(θ,θ0)}, the global exponential moment condition reads as follows:

(EG) For any θ ∈ Θ the set Υ(θ) = {µ ∈ (0,∞) : M(µ,θ,θ0) < ∞} is non-empty.

Note that Υ(θ) is an interval, because M(µ,θ,θ0) < ∞ implies M(µ′,θ,θ0) < ∞ for all µ′ < µ. Moreover, in the basic example of the log-likelihood contrast, it holds M(1,θ,θ0) = −log IE_{θ0} (dIP_θ/dIP_{θ0}) < ∞ and (0,1] ⊂ Υ(θ) for all θ, so the condition (EG) is fulfilled automatically.

Under the condition (EG) the functions µ*(θ) and M*(θ,θ0) from (1.3)–(1.4) are non-trivial and correctly defined. Usually these functions can be easily evaluated in a small neighborhood of the target parameter θ0. However, it might be difficult to compute them for all θ ∈ Θ. Therefore, in the sequel we proceed with another function µ(θ), which can be viewed as a rough approximation of µ*(θ). Section 4 provides some examples. So, let µ(θ) be a given function taking values in Υ(θ). Define

    M(θ,θ0) := M(µ(θ),θ,θ0) = −log IE exp{µ(θ) L(θ,θ0)}.

The most important requirement on µ(θ) is that M(θ,θ0) is positive and increases as θ moves away from θ0. By definition, for any θ ∈ Θ,

    IE exp{µ(θ) L(θ,θ0) + M(θ,θ0)} = 1.    (2.1)
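The identity (2.1) is easy to check numerically in the Gaussian case of Example 1.1. The following Monte Carlo sketch (ours, not from the paper) draws L(θ,θ0) ~ N(−M_mean, D2), takes µ(θ) = µ*(θ) = M_mean/D2 and M(θ,θ0) = M*(θ,θ0) = M_mean²/(2 D2), and confirms that the expectation equals one; the numbers M_mean, D2 and the seed are arbitrary illustrative choices.

```python
import numpy as np

# Monte Carlo check of (2.1) for a Gaussian contrast (Example 1.1):
# L(theta,theta0) ~ N(-M_mean, D2), mu* = M_mean/D2, M* = M_mean^2/(2*D2),
# and IE exp{ mu* L + M* } should equal 1.
rng = np.random.default_rng(1)
M_mean, D2 = 2.0, 3.0
mu_star = M_mean / D2                 # maximizer of mu*M_mean - mu^2*D2/2
M_star = M_mean**2 / (2 * D2)         # the rate function value
L = rng.normal(-M_mean, np.sqrt(D2), size=1_000_000)
identity = np.exp(mu_star * L + M_star).mean()
print(identity)                       # close to 1
```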
This means that the random function µ(θ) L(θ,θ0) + M(θ,θ0) has bounded exponential moments for every θ. We aim to derive a similar fact for the supremum of this function in θ ∈ Θ. More precisely, we are interested in bounding the following value:

    Q(ρ,s) := IE sup_{θ∈Θ} exp{ ρ[µ(θ) L(θ,θ0) + s M(θ,θ0)] },    (2.2)

where ρ, s ∈ [0,1].

We begin with a rough upper bound for the special case of a discrete parameter set.

Theorem 2.1. Assume (EG) and let Θ be a discrete set. Then for any s < 1

    Q(1,s) = IE sup_{θ∈Θ} exp{ µ(θ) L(θ,θ0) + s M(θ,θ0) } ≤ Σ_{θ∈Θ} exp{ −(1 − s) M(θ,θ0) }.    (2.3)

Proof. Since IE exp{µ(θ) L(θ,θ0) + s M(θ,θ0)} = exp{−(1 − s) M(θ,θ0)}, we obviously have

    Q(1,s) ≤ Σ_{θ∈Θ} IE exp{ µ(θ) L(θ,θ0) + s M(θ,θ0) } = Σ_{θ∈Θ} exp{ −(1 − s) M(θ,θ0) }.

Usually the function M(θ,θ0) grows rapidly as θ moves away from θ0. This property is often sufficient to bound the sum on the right hand side of (2.3) by a fixed constant. Although Theorem 2.1 is a rather simple corollary of (2.1), the bound (2.3) yields a number of useful statistical corollaries. Some of them are presented in Section 2.3. However, even in the discrete case this bound may be too rough (see the example in Section 4.3). It is also clear that (2.3) is useless in the continuous case. The next section demonstrates how the bound (2.3) can be extended to the case of an arbitrary parameter set.

2.2. The general exponential bound

Here we aim to extend the exponential bound (2.3) from the discrete case to the case of an arbitrary finite dimensional parameter set. We apply the standard approach which evaluates the supremum over the whole parameter set Θ via a weighted sum of local maxima. Define for any θ,θ′ ∈ Θ

    ζ(θ) := µ(θ){ L(θ,θ0) − IE L(θ,θ0) },  ζ(θ,θ′) := ζ(θ) − ζ(θ′).

Note that the dependence of ζ(θ,θ′) on θ0 disappears if µ(θ) = µ(θ′).
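Before moving on, the discrete bound (2.3) of Theorem 2.1 can be illustrated numerically (our sketch, not from the paper). For the one-dimensional Gaussian mean model Y_i = θ0 + σε_i on a grid of candidate values, Example 1.3 with X the n×1 column of ones gives µ(θ) = µ*(θ) = 1/2 and M(θ,θ0) = n(θ − θ0)²/(8σ²); all concrete numbers below (n, σ, s, grid, seed) are arbitrary illustrative choices.

```python
import numpy as np

# Theorem 2.1 illustration: Gaussian mean model on a discrete grid, with
# mu(theta) = 1/2 and M(theta, theta0) = n (theta - theta0)^2 / (8 sigma^2).
rng = np.random.default_rng(3)
n, sigma, s, theta0 = 100, 1.0, 0.5, 0.0
grid = np.arange(-5.0, 5.0, 0.1)
M = n * (grid - theta0) ** 2 / (8 * sigma**2)

# Right-hand side of (2.3): a fixed constant thanks to the growth of M.
bound = np.exp(-(1 - s) * M).sum()

# Monte Carlo estimate of Q(1,s) = IE sup_theta exp{ L(theta,theta0)/2 + s*M },
# using L(theta,theta0) = n (theta-theta0)(Ybar-theta0)/sigma^2
#                         - n (theta-theta0)^2 / (2 sigma^2).
ybar = rng.normal(theta0, sigma / np.sqrt(n), size=20_000)
L = n * ((grid - theta0) * (ybar[:, None] - theta0) / sigma**2
         - (grid - theta0) ** 2 / (2 * sigma**2))
Q_mc = np.exp(L / 2 + s * M).max(axis=1).mean()
print(Q_mc, bound)  # the estimate of Q(1,s) stays below the union bound
```

The gap between the two numbers shows how rough the union bound can be, in line with the remark after the proof.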
Usually the local properties of the centered contrast difference ζ(θ,θ′) are controlled by the variance D²(θ,θ′) = Var ζ(θ,θ′), which defines a semi-metric on Θ; see, e.g., van der Vaart and Wellner (1996). However, in some cases it is more convenient to deal with a slightly different metric, which we denote by S(θ,θ′). This metric usually bounds the standard deviation D(θ,θ′) from above. Sections 2.4 and 3 present some typical examples of constructing such a metric. Below in this section we assume that the metric S(·,·) is given. Define for any point θ° ∈ Θ and a radius ǫ > 0 the ball

    B(ǫ,θ°) = {θ : S(θ,θ°) ≤ ǫ}.

To control the local behavior of the process L(θ) within any such ball B(ǫ,θ°), we impose the following local exponential condition:

(EL) There exist ǫ > 0, ν0 > 0, and λ̄ > 0 such that for any θ° ∈ Θ and λ ≤ λ̄,

    sup_{θ,θ′∈B(ǫ,θ°)} log IE exp{ 2λ ξ(θ,θ′) } ≤ 2ν0² λ²,

where

    ξ(θ,θ′) := ζ(θ,θ′)/S(θ,θ′).

In fact, this condition only requires that every random increment ξ(θ,θ′) has a bounded exponential moment for some λ > 0. Then Lemma 5.8 from the Appendix implies the prescribed quadratic behavior in λ for λ ≤ λ̄.

For a fixed θ° ∈ Θ and ǫ′ ≤ ǫ, by N(ǫ′,ǫ,θ°) we denote the local covering number defined as the minimal number of balls B(ǫ′,·) required to cover the ball B(ǫ,θ°). With this covering number we associate the local entropy

    Q(ǫ,θ°) := Σ_{k=1}^∞ 2^{−k} log N(2^{−k}ǫ, ǫ, θ°).

We begin with a local result which bounds the maximum of the process L(θ) over a local ball B(ǫ,θ°).

Theorem 2.2. Assume (EG) and (EL) with some ǫ > 0, ν0 ≥ 0, and λ̄ > 0. Let also ρ < 1 be such that ρǫ/(1 − ρ) ≤ λ̄. Then for any θ° ∈ Θ

    log IE sup_{θ∈B(ǫ,θ°)} exp{ ρ[µ(θ) L(θ,θ0) + M(θ,θ0)] } ≤ 2ν0²ǫ²ρ²/(1 − ρ) + (1 − ρ) Q(ǫ,θ°).

The next theorem is the global bound which generalizes the upper bound from Theorem 2.1.

Theorem 2.3.
Assume (EG) and (EL) for some λ̄, ν0, ǫ, and let π(·) be a σ-finite measure on Θ such that

    sup_{θ∈B(ǫ,θ°)} π(B(ǫ,θ))/π(B(ǫ,θ°)) ≤ ν1    (2.4)

for some ν1 ∈ [1,∞). Let, for some ρ, s < 1, it hold that ρǫ/(1 − ρ) ≤ λ̄ and that the function M_ǫ(θ°,θ0) = inf_{θ∈B(ǫ,θ°)} M(θ,θ0) fulfills

    H_ǫ(ρ,s) := log( ∫_Θ exp{ −ρ(1 − s) M_ǫ(θ,θ0) } π(dθ)/π(B(ǫ,θ)) ) < ∞.    (2.5)

Let finally Q(ǫ,θ°) ≤ Q(ǫ) for all θ° ∈ Θ. Then the value Q(ρ,s) from (2.2) satisfies

    log Q(ρ,s) ≤ 2ν0²ǫ²ρ²/(1 − ρ) + (1 − ρ) Q(ǫ) + log(ν1) + H_ǫ(ρ,s).    (2.6)

As in Theorem 2.1, proper growth conditions on the function M(θ,θ0) ensure that the integral H_ǫ(ρ,s) in (2.6) is bounded by a fixed constant.

2.3. Some corollaries

This section demonstrates how Theorems 2.1–2.3 can be used in the statistical analysis of the minimum contrast estimator θ̃ = argmax_{θ∈Θ} L(θ). We show that probabilistic properties of this estimator may be easily derived from the following inequality: for prescribed ρ, s < 1,

    IE exp{ ρ[µ(θ̃) L(θ̃,θ0) + s M(θ̃,θ0)] } ≤ Q(ρ,s),    (2.7)

which obviously follows from Theorem 2.3 and the definition (2.2) of Q(ρ,s).

2.3.1. A risk bound for the "natural" loss

A first corollary of Theorem 2.1 presents exponential bounds separately for the minimum contrast L(θ̃,θ0) and for the "natural" loss M(θ̃,θ0).

Corollary 2.4. For any ρ, s < 1,

    IE exp{ ρ µ(θ̃) L(θ̃,θ0) } ≤ Q(ρ,0),    (2.8)

    IE exp{ ρ s M(θ̃,θ0) } ≤ Q(ρ,s).    (2.9)

Substituting s = 0 in (2.7) yields the first bound. To prove the second one, notice that L(θ̃,θ0) ≥ 0. Therefore the elementary inequality 1{x ≥ 0} ≤ exp(µx) for any µ > 0 yields (see also (2.7))

    IE exp{ ρ s M(θ̃,θ0) } = IE exp{ ρ s M(θ̃,θ0) } 1{L(θ̃,θ0) ≥ 0}
        ≤ IE exp{ ρ s M(θ̃,θ0) + ρ µ(θ̃) L(θ̃,θ0) } ≤ Q(ρ,s).
Notice that the exponential bound (2.9) implies a similar risk bound for a polynomial loss |M(θ̃,θ0)|^r; see Lemma 5.7 for a precise result.
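A small helper (our illustration, not from the paper) makes the concentration consequence of (2.9) explicit: by Markov's inequality, IP(M(θ̃,θ0) > z) ≤ Q(ρ,s) exp(−ρsz), so the level z needed to cover the estimator with probability 1 − α is z = (log Q + log(1/α))/(ρs). The numeric values of Q, ρ, s below are placeholders.

```python
import math

# Markov-inequality tail bound derived from (2.9):
#   IP( M(theta_hat, theta0) > z ) <= Q(rho, s) * exp(-rho * s * z).
def tail_bound(z, Q, rho, s):
    """Upper bound on IP(M(theta_hat, theta0) > z)."""
    return min(1.0, Q * math.exp(-rho * s * z))

def confidence_level(alpha, Q, rho, s):
    """Smallest z with tail_bound(z, Q, rho, s) <= alpha."""
    return (math.log(Q) + math.log(1.0 / alpha)) / (rho * s)

Q, rho, s = 4.0, 0.5, 0.5          # placeholder constants
z = confidence_level(0.05, Q, rho, s)
print(z, tail_bound(z, Q, rho, s))  # tail equals alpha = 0.05 by construction
```

In the root-n regimes discussed above, the set {θ : M(θ,θ0) ≤ z} with this z is the resulting confidence set for θ0.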
