Online estimation of the geometric median in Hilbert spaces: non asymptotic confidence balls

Hervé CARDOT, Peggy CÉNAC, Antoine GODICHON
Institut de Mathématiques de Bourgogne, Université de Bourgogne, 9 Rue Alain Savary, 21078 Dijon, France
email: {herve.cardot, peggy.cenac, antoine.godichon}@u-bourgogne.fr

January 29, 2015 (arXiv:1501.06930v1 [math.ST], 27 Jan 2015)

Abstract

Estimation procedures based on recursive algorithms are interesting and powerful techniques that are able to deal rapidly with (very) large samples of high dimensional data. The collected data may be contaminated by noise, so that robust location indicators, such as the geometric median, may be preferred to the mean. In this context, an estimator of the geometric median based on a fast and efficient averaged non linear stochastic gradient algorithm has been developed by Cardot et al. (2013). This work studies the non asymptotic behavior of this algorithm more precisely by giving non asymptotic confidence balls. This new result is based on the derivation of improved $L^2$ rates of convergence as well as an exponential inequality for the martingale terms of the recursive non linear Robbins-Monro algorithm.

Keywords: Functional Data Analysis, Martingales in Hilbert spaces, Recursive Estimation, Robust Statistics, Spatial Median, Stochastic Gradient Algorithms.

1 Introduction

Dealing with large samples of observations taking values in high dimensional spaces, such as functional spaces, is not unusual nowadays. In this context, simple estimators of location such as the arithmetic mean can be greatly influenced by a small number of outlying values, so robust indicators of location may be preferred to the mean. We focus in this work on the estimation of the geometric median, also called the $L^1$-median or spatial median. It is a generalization of the real median introduced by Haldane (1948) that can now be computed rapidly, even for large samples in high dimension spaces, thanks to recursive algorithms (see Cardot et al. (2013)).

Let $H$ be a separable Hilbert space; we denote by $\langle \cdot, \cdot \rangle$ its inner product and by $\|\cdot\|$ the associated norm. Let $X$ be a random variable taking values in $H$. The geometric median $m$ of $X$ is defined by
$$m := \arg\min_{h \in H} \mathbb{E}\left[ \|X - h\| - \|X\| \right]. \qquad (1)$$

Many properties of this median in separable Banach spaces, such as existence and uniqueness as well as robustness, are given by Kemperman (1987) (see also the review by Small (1990)). Recently, this median has received much attention in the literature. For example, Minsker (2014) suggests considering, in various statistical contexts, the geometric median of independent estimators to obtain much tighter concentration bounds. In functional data analysis, Kraus and Panaretos (2012) consider resistant estimators of the covariance operators based on the geometric median in order to derive a robust test of equality of the second-order structure for two samples. The geometric median is also chosen as the central location indicator in various types of robust functional principal components analyses (see Locantore et al. (1999), Gervini (2008) and Bali et al. (2011)). Finally, a general definition of the geometric median on manifolds is given in Arnaudon et al. (2012) with signal processing issues in mind.

Consider a sequence of i.i.d. copies $X_1, X_2, \ldots, X_n, \ldots$ of $X$. A natural estimator $\hat{m}_n$ of $m$, based on $X_1, \ldots, X_n$, is obtained by minimizing the empirical risk:
$$\hat{m}_n := \arg\min_{h \in H} \sum_{i=1}^{n} \left[ \|X_i - h\| - \|X_i\| \right]. \qquad (2)$$
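As a concrete illustration (not taken from the paper) of why the geometric median is preferred to the mean under contamination, here is a minimal Python sketch: it computes the minimizer $\hat{m}_n$ in (2) with the Weiszfeld fixed-point iteration discussed in the next paragraph, on an arbitrarily chosen Gaussian sample with a few gross outliers; the sample sizes and tolerances are illustrative choices.

```python
import numpy as np

def weiszfeld_median(X, n_iter=100, eps=1e-8):
    """Empirical geometric median of the rows of X, i.e. the minimizer
    in (2), computed with the classical Weiszfeld fixed-point iteration."""
    h = X.mean(axis=0)                          # start from the mean
    for _ in range(n_iter):
        d = np.maximum(np.linalg.norm(X - h, axis=1), eps)  # guard 0-division
        w = 1.0 / d                             # Weiszfeld weights
        h_new = (w[:, None] * X).sum(axis=0) / w.sum()
        if np.linalg.norm(h_new - h) < eps:
            break
        h = h_new
    return h

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                  # clean data, centered at 0
X[:20] += 50.0                                  # 2% gross outliers
print("mean:  ", np.linalg.norm(X.mean(axis=0)))        # pulled away from 0
print("median:", np.linalg.norm(weiszfeld_median(X)))   # stays close to 0
```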
Convergence properties of the empirical estimator $\hat{m}_n$ are reviewed in Möttönen et al. (2010) when the dimension of $H$ is finite, whereas the recent work of Chakraborty and Chaudhuri (2014) proposes a deep asymptotic study for random variables taking values in separable Banach spaces.

Given a sample $X_1, \ldots, X_n$, the computation of $\hat{m}_n$ generally relies on a variant of Weiszfeld's algorithm (see e.g. Kuhn (1973)) introduced by Vardi and Zhang (2000). This iterative algorithm is relatively fast (see Beck and Sabach (2014) for an improved version), but it is not adapted to very large datasets of high-dimensional data since it requires storing all the data. However, huge datasets are no longer unusual with the development of automatic sensors and smart meters. In this context, Cardot et al. (2013) have developed a much faster recursive algorithm which does not require storing all the data and can be updated automatically when the data arrive online. The estimation procedure is based on the following simple recursive scheme:
$$Z_{n+1} = Z_n + \gamma_n \, \frac{X_{n+1} - Z_n}{\|X_{n+1} - Z_n\|}, \qquad (3)$$
where the sequence of steps $(\gamma_n)$ controls the convergence of the algorithm and satisfies the usual conditions for the convergence of Robbins-Monro algorithms (see Section 3). The averaged version of the algorithm is given by
$$\bar{Z}_{n+1} = \bar{Z}_n + \frac{1}{n+1} \left( Z_{n+1} - \bar{Z}_n \right), \qquad (4)$$
with $\bar{Z}_0 = 0$, so that $\bar{Z}_n = \frac{1}{n} \sum_{i=1}^{n} Z_i$. The averaging step described in (4), first studied in Polyak and Juditsky (1992), allows a considerable improvement of the convergence of the initial Robbins-Monro algorithm. It is shown in Cardot et al. (2013) that the recursive averaged estimator $\bar{Z}_n$ and the empirical estimator $\hat{m}_n$ have the same Gaussian limiting distribution. In infinite dimensional spaces, this nice result heavily relies on the (locally) strong convexity properties of the objective function to be minimized. Note that Bach (2014) adopts an analogous recursive point of view for logistic regression under slightly different conditions, called self-concordance, which involve uniform conditions on the third order derivatives of the objective function.
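The recursion (3) together with the averaging step (4) translates directly into a one-pass, constant-memory procedure. The sketch below is a finite-dimensional Python transcription; the step constant $c_\gamma = 1$, the exponent $\alpha = 2/3$, the zero initialization (the paper truncates $X_1$ instead, see Section 3) and the heavy-tailed test distribution are illustrative choices, not prescriptions of the paper.

```python
import numpy as np

def averaged_median(stream, dim, c_gamma=1.0, alpha=2/3):
    """Averaged stochastic gradient estimate of the geometric median,
    following recursion (3) and averaging step (4)."""
    Z = np.zeros(dim)       # Robbins-Monro iterate Z_n
    Zbar = np.zeros(dim)    # averaged iterate
    for n, x in enumerate(stream, start=1):
        gamma = c_gamma * n ** (-alpha)     # gamma_n = c_gamma * n^(-alpha)
        diff = x - Z
        norm = np.linalg.norm(diff)
        if norm > 0:                        # X = Z_n occurs with probability 0
            Z = Z + gamma * diff / norm     # recursion (3)
        Zbar = Zbar + (Z - Zbar) / n        # running average of Z_1, ..., Z_n
    return Zbar

rng = np.random.default_rng(1)
data = rng.standard_t(df=3, size=(100_000, 10))   # heavy-tailed observations
print(np.linalg.norm(averaged_median(iter(data), dim=10)))   # close to 0
```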
The aim of this work is to give new arguments in favor of the averaged stochastic gradient algorithm by providing a sharp control of its deviations around the true median for finite samples. To get such non asymptotic confidence balls, new results about the behavior of the stochastic algorithm are proved: improved convergence rates in quadratic mean compared to those obtained in Cardot et al. (2013), as well as new exponential inequalities for "near" martingale sequences in Hilbert spaces, similar to the seminal result of Pinelis (1994) for martingales. Note that, as far as we know, there are only very few results in the literature on exponential bounds for non linear recursive algorithms (see however Balsubramani et al. (2013) for recursive PCA).

The paper is organized as follows. Section 2 recalls some convexity properties of the geometric median as well as the basic assumptions ensuring its uniqueness. In Section 3, the rates of convergence of the stochastic gradient algorithm are derived in quadratic mean as well as in $L^4$. In Section 4, an exponential inequality is derived, borrowing ideas from Tarrès and Yao (2014); it enables us to build non asymptotic confidence balls for the Robbins-Monro algorithm as well as for its averaged version. All the proofs are gathered in Section 5.

2 Assumptions on the median and convexity properties

Let us first state basic assumptions on the median.

(A1) The random variable $X$ is not concentrated on a straight line: for all $h \in H$, there exists $h' \in H$ such that $\langle h, h' \rangle = 0$ and $\mathrm{Var}(\langle h', X \rangle) > 0$.

(A2) $X$ is not concentrated around single points: there is a constant $C > 0$ such that for all $h \in H$:
$$\mathbb{E}\left[ \|X - h\|^{-1} \right] \le C.$$

Assumption (A1) ensures that the median $m$ is uniquely defined (Kemperman, 1987). Assumption (A2) is closely related to small ball probabilities and to the dimension of $H$. It was proved in Chaudhuri (1992) that when $H = \mathbb{R}^d$, assumption (A2) is satisfied when $d \ge 2$ under classical assumptions on the density of $X$. A detailed discussion of assumption (A2) and its connection with small ball probabilities can be found in Cardot et al. (2013).

We now recall some results about convexity and robustness of the geometric median. We denote by $G : H \to \mathbb{R}$ the convex function we would like to minimize, defined for all $h \in H$ by
$$G(h) := \mathbb{E}\left[ \|X - h\| - \|X\| \right]. \qquad (5)$$
This function is Fréchet differentiable on $H$; we denote by $\Phi$ its Fréchet derivative, given for all $h \in H$ by
$$\Phi(h) := \nabla G_h = -\mathbb{E}\left[ \frac{X - h}{\|X - h\|} \right].$$
Under the previous assumptions, $m$ is the unique zero of $\Phi$. Let us define $U_{n+1} := -\frac{X_{n+1} - Z_n}{\|X_{n+1} - Z_n\|}$ and introduce the sequence of $\sigma$-algebras $\mathcal{F}_n := \sigma(Z_1, \ldots, Z_n) = \sigma(X_1, \ldots, X_n)$. For all integers $n \ge 1$,
$$\mathbb{E}\left[ U_{n+1} \mid \mathcal{F}_n \right] = \Phi(Z_n). \qquad (6)$$
The sequence $(\xi_n)$ defined by $\xi_{n+1} := \Phi(Z_n) - U_{n+1}$ is a martingale difference sequence with respect to the filtration $(\mathcal{F}_n)$. Moreover, we have $\|\xi_{n+1}\| \le 2$ for all $n$, and
$$\mathbb{E}\left[ \|\xi_{n+1}\|^2 \mid \mathcal{F}_n \right] \le 1 - \|\Phi(Z_n)\|^2 \le 1. \qquad (7)$$
Algorithm (3) can thus be written as a Robbins-Monro or stochastic gradient algorithm:
$$Z_{n+1} - m = Z_n - m - \gamma_n \Phi(Z_n) + \gamma_n \xi_{n+1}. \qquad (8)$$

We now consider the Hessian of $G$, denoted by $\Gamma_h : H \to H$. It satisfies (see Gervini (2008))
$$\Gamma_h = \mathbb{E}\left[ \frac{1}{\|X - h\|} \left( I_H - \frac{(X - h) \otimes (X - h)}{\|X - h\|^2} \right) \right],$$
where $I_H$ is the identity operator in $H$ and $u \otimes v(h) = \langle u, h \rangle v$ for all $u, v, h \in H$. The following (local) strong convexity properties will be useful (see Cardot et al. (2013) for the proofs).

Proposition 2.1 (Cardot et al. (2013)). Under assumptions (A1) and (A2), for any real number $A > 0$ there is a positive constant $c_A$ such that for all $h \in H$ with $\|h\| \le A$ and for all $h' \in H$:
$$c_A \|h'\|^2 \le \langle h', \Gamma_h h' \rangle \le C \|h'\|^2.$$
As a particular case, there is a positive constant $c_m$ such that for all $h' \in H$:
$$c_m \|h'\|^2 \le \langle h', \Gamma_m h' \rangle \le C \|h'\|^2. \qquad (9)$$

The following corollary recalls some properties of the spectrum of the Hessian of $G$, in particular of the spectrum of $\Gamma_m$.

Corollary 2.1. Under assumptions (A1) and (A2), for all $h \in H$ there are an increasing sequence of non-negative eigenvalues $(\lambda_{j,h})$ and an orthonormal basis $(v_{j,h})$ of eigenvectors of $\Gamma_h$ such that
$$\Gamma_h v_{j,h} = \lambda_{j,h} v_{j,h}, \qquad \sigma(\Gamma_h) = \left\{ \lambda_{j,h}, \; j \in \mathbb{N} \right\}, \qquad \lambda_{j,h} \le C.$$
Moreover, if $\|h\| \le A$, for all $j \in \mathbb{N}$ we have $c_A \le \lambda_{j,h} \le C$. As a particular case, the eigenvalues $(\lambda_{j,m})$ of $\Gamma_m$ satisfy $c_m \le \lambda_{j,m} \le C$ for all $j \in \mathbb{N}$.

These bounds are an immediate consequence of Proposition 2.1. Remark that with these convexity properties of the geometric median, we are close to the framework of Bach (2014). The difference comes from the fact that $G$ does not satisfy the generalized self-concordance assumption which is central in the latter work.
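For intuition in the finite-dimensional case $H = \mathbb{R}^d$, the spectral bounds of Corollary 2.1 can be checked numerically by replacing the expectation in the formula for $\Gamma_h$ above with an empirical average. A minimal sketch (the standard Gaussian sample and the evaluation point $h = 0$ are arbitrary choices):

```python
import numpy as np

def hessian_estimate(X, h):
    """Monte Carlo estimate of the Hessian Gamma_h of G in R^d:
    Gamma_h = E[ (I - u u^T) / ||X - h|| ] with u = (X - h)/||X - h||."""
    d = X.shape[1]
    G = np.zeros((d, d))
    for x in X:
        diff = x - h
        r = np.linalg.norm(diff)
        u = diff / r
        G += (np.eye(d) - np.outer(u, u)) / r
    return G / len(X)

rng = np.random.default_rng(2)
X = rng.normal(size=(20_000, 3))
eigvals = np.linalg.eigvalsh(hessian_estimate(X, h=np.zeros(3)))
print(eigvals)   # all eigenvalues strictly positive and bounded above,
                 # as stated in Corollary 2.1
```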
3 Rates of convergence of the Robbins-Monro algorithms

If the sequence $(\gamma_n)_n$ of step sizes fulfills the classical assumptions
$$\sum_{n \ge 1} \gamma_n^2 < \infty \quad \text{and} \quad \sum_{n \ge 1} \gamma_n = \infty,$$
the recursive estimator $Z_n$ is strongly consistent (see Cardot et al. (2013)). The first condition on the step sizes ensures that the recursive algorithm converges towards some value in $H$, whereas the second condition forces the algorithm to converge to $m$, the unique minimizer of $G$.

From now on, $Z_1$ is chosen so that it is bounded (consider for example $Z_1 = X_1 \mathbf{1}_{\{\|X_1\| \le M'\}}$ for some nonnegative constant $M'$). Consequently, there is a positive constant $M$ such that for all $n \ge 1$:
$$\mathbb{E}\left[ \|Z_n - m\|^2 \right] \le M.$$
Let us now consider sequences $(\gamma_n)$ of the form $\gamma_n = c_\gamma n^{-\alpha}$, where $c_\gamma$ is a positive constant and $\alpha \in (1/2, 1)$. In order to get confidence balls for the median, the following additional assumption is supposed to hold.

(A3) There is a positive constant $C$ such that for all $h \in H$:
$$\mathbb{E}\left[ \|X - h\|^{-2} \right] \le C.$$

This assumption ensures that the remainder term in the Taylor approximation of the gradient is bounded. Note that this assumption is also required to get the asymptotic normality in Cardot et al. (2013), and it is also assumed in Chakraborty and Chaudhuri (2014) for deriving the asymptotic normality of the empirical median estimator. Remark that, for the sake of simplicity, we have considered the same constant $C$ in (A2) and (A3). As in (A2), assumption (A3) is closely related to small ball probabilities, and when $H = \mathbb{R}^d$ it is satisfied when $d \ge 3$ under weak conditions.

We now state the first new and important result on the rates of convergence in quadratic mean of the Robbins-Monro algorithm. A comparison with Proposition 3.2 in Cardot et al. (2013) reveals that the logarithmic term has disappeared, as well as the constant $C_N$ that was related to a sequence $(\Omega_N)_N$ of events whose probability was tending to one.

Theorem 3.1. Assuming (A1)-(A3) hold, the algorithm $(Z_n)$ defined by (3), with $\gamma_n = c_\gamma n^{-\alpha}$, converges in quadratic mean, for all $\alpha \in (1/2, 1)$ and for all $\beta$ with $\alpha < \beta < 3\alpha - 1$, with the following rates:
$$\mathbb{E}\left[ \|Z_n - m\|^2 \right] = O\left( \frac{1}{n^\alpha} \right), \qquad (10)$$
$$\mathbb{E}\left[ \|Z_n - m\|^4 \right] = O\left( \frac{1}{n^\beta} \right). \qquad (11)$$

Upper bounds for the rates of convergence at order four are also given because they will be useful in several proofs. Remark that obtaining better rates of convergence at order four would also be possible at the expense of longer proofs, but it is not necessary here. The proof of this theorem relies on two technical lemmas. The following one gives an upper bound for the quadratic mean error.

Lemma 3.1. Assuming (A1)-(A3) hold, there are positive constants $C_1, C_2, C_3, C_4$ such that for all $n \ge 1$:
$$\mathbb{E}\left[ \|Z_n - m\|^2 \right] \le C_1 e^{-C_4 n^{1-\alpha}} + \frac{C_2}{n^\alpha} + C_3 \sup_{n/2 - 1 \le k \le n} \mathbb{E}\left[ \|Z_k - m\|^4 \right]. \qquad (12)$$
The proof of Lemma 3.1 is given in Section 5.

Lemma 3.2. Assuming the three assumptions (A1) to (A3), for all $\alpha \in (1/2, 1)$ there are a rank $n_\alpha$ and positive constants $C_1', C_2'$ such that for all $n \ge n_\alpha$:
$$\mathbb{E}\left[ \|Z_{n+1} - m\|^4 \right] \le \left( 1 - \frac{2}{n} \right) \mathbb{E}\left[ \|Z_n - m\|^4 \right] + \frac{C_1'}{n^{3\alpha}} + \frac{C_2'}{n^{2\alpha}} \, \mathbb{E}\left[ \|Z_n - m\|^2 \right]. \qquad (13)$$
The proof of Lemma 3.2 is given in Section 5. The next result gives the exact rate of convergence in quadratic mean and states that it is not possible to reach the parametric rate of convergence with the Robbins-Monro algorithm when $\alpha \in (1/2, 1)$.

Proposition 3.1. Assume (A1)-(A3) hold. For all $\alpha \in (1/2, 1)$, there is a positive constant $C'$ such that for all $n \ge 1$,
$$\mathbb{E}\left[ \|Z_n - m\|^2 \right] \ge \frac{C'}{n^\alpha}.$$
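Together, Theorem 3.1 and Proposition 3.1 say that the quadratic mean error of the Robbins-Monro iterate is of exact order $n^{-\alpha}$. This can be probed empirically in $\mathbb{R}^d$: the rescaled error $n^\alpha \, \mathbb{E}\|Z_n - m\|^2$ should stabilize as $n$ grows. A rough Monte Carlo sketch (the sample sizes, the replication count and the standard Gaussian test distribution, for which $m = 0$, are arbitrary choices):

```python
import numpy as np

def rm_iterate(X, c_gamma=1.0, alpha=2/3):
    """Run the (non-averaged) Robbins-Monro recursion (3) on sample X."""
    Z = np.zeros(X.shape[1])
    for n, x in enumerate(X, start=1):
        diff = x - Z
        norm = np.linalg.norm(diff)
        if norm > 0:
            Z += c_gamma * n ** (-alpha) * diff / norm
    return Z

rng = np.random.default_rng(3)
alpha, reps, d = 2/3, 100, 5
for n in (1_000, 4_000):
    mse = np.mean([np.linalg.norm(rm_iterate(rng.normal(size=(n, d)),
                                             alpha=alpha)) ** 2
                   for _ in range(reps)])
    print(n, mse, mse * n ** alpha)   # mse * n^alpha should stay roughly flat
```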
4 Non asymptotic confidence balls

4.1 Non asymptotic confidence balls for the Robbins-Monro algorithm

The aim is now to derive an upper bound for $\mathbb{P}[\|Z_n - m\| \ge t]$, for $t > 0$. A simple first result can be obtained by applying Markov's inequality and Theorem 3.1. We give below a sharper bound that relies on exponential inequalities close to the ones given in Theorem 3.1 in Pinelis (1994). As explained in Remark 4.2 below, it was not possible to apply Theorem 3.1 of Pinelis (1994) directly, and the following proposition gives an analogous exponential inequality in the case where we do not have exactly a sequence of martingale differences.

Proposition 4.1. Let $(\beta_{n,k})_{(k,n) \in \mathbb{N} \times \mathbb{N}}$ be a sequence of linear operators on $H$ and $(\xi_n)$ be a sequence of $H$-valued martingale differences adapted to a filtration $(\mathcal{F}_n)$. Moreover, let $(\gamma_n)$ be a sequence of positive real numbers. Then, for all $r > 0$ and for all $n \ge 1$,
$$\mathbb{P}\left[ \left\| \sum_{k=1}^{n-1} \gamma_k \beta_{n-1,k} \xi_{k+1} \right\| \ge r \right] \le 2 e^{-r} \prod_{j=2}^{n} \left( 1 + \mathbb{E}\left[ e^{\|\beta_{n-1,j-1} \gamma_{j-1} \xi_j\|} - 1 - \left\| \beta_{n-1,j-1} \gamma_{j-1} \xi_j \right\| \;\middle|\; \mathcal{F}_{j-1} \right] \right)$$
$$\le 2 \exp\left( -r + \sum_{j=2}^{n} \mathbb{E}\left[ e^{\|\beta_{n-1,j-1} \gamma_{j-1} \xi_j\|} - 1 - \left\| \beta_{n-1,j-1} \gamma_{j-1} \xi_j \right\| \;\middle|\; \mathcal{F}_{j-1} \right] \right).$$

The proof of Proposition 4.1 is postponed to Section 5. As in Tarrès and Yao (2014), it enables us to give a sharp upper bound for $\mathbb{P}\left[ \left\| \sum_{k=1}^{n-1} \beta_{n-1,k} \gamma_k \xi_{k+1} \right\| \ge t \right]$.

Corollary 4.1. Let $(\beta_{n,k})$ be a sequence of linear operators on $H$, $(\xi_n)$ be a sequence of $H$-valued martingale differences adapted to a filtration $(\mathcal{F}_n)$, and $(\gamma_n)$ be a sequence of positive real numbers. Let $(N_n)$ and $(\sigma_n^2)$ be two deterministic sequences such that
$$N_n \ge \sup_{k \le n-1} \left\| \beta_{n-1,k} \gamma_k \xi_{k+1} \right\| \;\; \text{a.s.} \quad \text{and} \quad \sigma_n^2 \ge \sum_{k=1}^{n-1} \mathbb{E}\left[ \left\| \beta_{n-1,k} \gamma_k \xi_{k+1} \right\|^2 \;\middle|\; \mathcal{F}_k \right].$$
For all $t > 0$ and all $n \ge 1$,
$$\mathbb{P}\left[ \left\| \sum_{k=1}^{n-1} \beta_{n-1,k} \gamma_k \xi_{k+1} \right\| \ge t \right] \le 2 \exp\left( - \frac{t^2}{2\left( \sigma_n^2 + t N_n / 3 \right)} \right).$$

In order to apply these results, let us linearize the gradient around $m$ in decomposition (8):
$$Z_{n+1} - m = Z_n - m - \gamma_n \Gamma_m (Z_n - m) + \gamma_n \xi_{n+1} - \gamma_n \delta_n, \qquad (14)$$
where $\delta_n := \Phi(Z_n) - \Gamma_m(Z_n - m)$, and introduce, for all $n \ge 1$, the following operators:
$$\alpha_n := I_H - \gamma_n \Gamma_m, \qquad \beta_n := \prod_{k=1}^{n} \alpha_k = \prod_{k=1}^{n} \left( I_H - \gamma_k \Gamma_m \right), \qquad \beta_0 := I_H.$$
By induction, (14) yields
$$Z_n - m = \beta_{n-1}(Z_1 - m) + \beta_{n-1} M_n - \beta_{n-1} R_n, \qquad (15)$$
with
$$R_n := \sum_{k=1}^{n-1} \gamma_k \beta_k^{-1} \delta_k, \qquad M_n := \sum_{k=1}^{n-1} \gamma_k \beta_k^{-1} \xi_{k+1}.$$

Remark 4.1. Note that we make an abuse of notation here because $\beta_k^{-1}$ does not necessarily exist. However, if $c_\gamma C < 1$, the linear operator $\beta_k^{-1}$ is bounded. Moreover, we can make this abuse because, even if $\beta_k$ has no continuous inverse, we only need to consider $\beta_{n-1} \beta_k^{-1} := \prod_{j=k+1}^{n-1} \left( I_H - \gamma_j \Gamma_m \right)$, which are continuous operators for $k \le n-1$.

Note that, if $\beta_k$ is invertible for all $k \ge 1$, $(M_n)$ is a martingale sequence adapted to the filtration $(\mathcal{F}_n)$. Moreover,
$$\mathbb{P}\left[ \|Z_n - m\| \ge t \right] \le \mathbb{P}\left[ \|\beta_{n-1} M_n\| \ge \frac{t}{2} \right] + \mathbb{P}\left[ \|\beta_{n-1} R_n\| \ge \frac{t}{4} \right] + \mathbb{P}\left[ \|\beta_{n-1}(Z_1 - m)\| \ge \frac{t}{4} \right]$$
$$\le \mathbb{P}\left[ \|\beta_{n-1} M_n\| \ge \frac{t}{2} \right] + 4 \, \frac{\mathbb{E}\left[ \|\beta_{n-1} R_n\| \right]}{t} + 16 \, \frac{\mathbb{E}\left[ \|\beta_{n-1}(Z_1 - m)\|^2 \right]}{t^2}. \qquad (16)$$

In this context, Corollary 4.1 can be written as follows:

Corollary 4.2. Let $(N_n)_{n \ge 1}$ and $(\sigma_n^2)_{n \ge 1}$ be two deterministic sequences such that
$$N_n \ge \sup_{k \le n-1} \left\| \beta_{n-1} \beta_k^{-1} \gamma_k \xi_{k+1} \right\| \;\; \text{a.s.} \quad \text{and} \quad \sigma_n^2 \ge \sum_{k=1}^{n-1} \mathbb{E}\left[ \left\| \beta_{n-1} \beta_k^{-1} \gamma_k \xi_{k+1} \right\|^2 \;\middle|\; \mathcal{F}_k \right].$$
Then, for all $t > 0$ and for all $n \ge 1$,
$$\mathbb{P}\left[ \left\| \sum_{k=1}^{n-1} \beta_{n-1} \beta_k^{-1} \gamma_k \xi_{k+1} \right\| \ge t \right] \le 2 \exp\left( - \frac{t^2}{2\left( \sigma_n^2 + t N_n / 3 \right)} \right).$$

We can now derive non asymptotic confidence balls for the Robbins-Monro algorithm.
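Before stating the resulting theorem, note how a Bernstein-type bound such as the one in Corollaries 4.1 and 4.2 is used in practice: given $\sigma_n^2$, $N_n$ and a level $\delta$, one solves $2\exp\left( -t^2 / (2(\sigma_n^2 + t N_n/3)) \right) = \delta$ for the radius $t$, a quadratic equation in $t$. A small helper; the numerical inputs in the example are placeholders loosely motivated by $\|\xi_{n+1}\| \le 2$ and (7), ignoring the $\beta$ and $\gamma$ factors for simplicity:

```python
import math

def bernstein_radius(sigma2, N, delta):
    """Smallest t with 2 * exp(-t^2 / (2 (sigma2 + t N / 3))) = delta,
    i.e. the positive root of t^2 - (2 N L / 3) t - 2 sigma2 L = 0
    with L = ln(2 / delta)."""
    L = math.log(2.0 / delta)
    b = 2.0 * N * L / 3.0
    c = 2.0 * sigma2 * L
    return (b + math.sqrt(b * b + 4.0 * c)) / 2.0

# Example: n increments bounded by N = 2 whose conditional second
# moments sum to at most sigma_n^2 = n, as suggested by (7).
n = 10_000
print(bernstein_radius(sigma2=n, N=2.0, delta=0.05))  # high-probability bound
```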
Theorem 4.1. Assume that (A1)-(A3) hold. There is a positive constant $C$ such that for all $\delta \in (0,1)$, there is a rank $n_\delta$ such that for all $n \ge n_\delta$,
$$\mathbb{P}\left[ \|Z_n - m\| \le \frac{C}{n^{\alpha/2}} \ln\left( \frac{4}{\delta} \right) \right] \ge 1 - \delta.$$

Remark 4.2. Note that we could not apply Theorem 3.1 in Pinelis (1994) to the martingale term $M_n = \sum_{k=1}^{n-1} \beta_k^{-1} \gamma_k \xi_{k+1}$. In fact, two problems are encountered. First, as written in Remark 4.1, $\beta_k^{-1}$ does not necessarily exist. Second, although there is a positive constant $M$ such that $\|\beta_{n-1}\| \, \|M_n\| \le M$ for all $n \ge 1$, the sequence $\|\beta_{n-1}\| \, \|M_n\|$ may not be convergent ($\|\beta_{n-1}\|$ denotes the usual spectral norm of the operator $\beta_{n-1}$).

4.2 Non asymptotic confidence balls for the averaged algorithm

As in Cardot et al. (2013) and Pelletier (2000), we make use of decomposition (14). Summing and applying Abel's transform, we get
$$\Gamma_m \bar{T}_n = \frac{1}{n} \left( \frac{T_1}{\gamma_1} - \frac{T_{n+1}}{\gamma_n} + \sum_{k=2}^{n} T_k \left[ \frac{1}{\gamma_k} - \frac{1}{\gamma_{k-1}} \right] - \sum_{k=1}^{n} \delta_k + M_{n+1} \right), \qquad (17)$$
with
$$T_n := Z_n - m, \qquad \bar{T}_n := \bar{Z}_n - m, \qquad M_{n+1} := \sum_{k=1}^{n} \xi_{k+1}.$$
The last term is the martingale term. Applying Pinelis-Bernstein's lemma (see Tarrès and Yao (2014), Appendix A) to this term and showing that the other ones are negligible, we get the following non asymptotic confidence balls.

Theorem 4.2. Assume that (A1)-(A3) hold. For all $\delta \in (0,1)$, there is a rank $n_\delta$ such that for all $n \ge n_\delta$,
$$\mathbb{P}\left[ \left\| \Gamma_m \left( \bar{Z}_n - m \right) \right\| \le 4 \left( \frac{2}{3n} + \frac{1}{\sqrt{n}} \right) \ln\left( \frac{4}{\delta} \right) \right] \ge 1 - \delta.$$
Since the smallest eigenvalue $\lambda_{\min}$ of $\Gamma_m$ is strictly positive,
$$\mathbb{P}\left[ \left\| \bar{Z}_n - m \right\| \le \frac{4}{\lambda_{\min}} \left( \frac{2}{3n} + \frac{1}{\sqrt{n}} \right) \ln\left( \frac{4}{\delta} \right) \right] \ge 1 - \delta.$$

Remark 4.3. We can also give a more precise form of the rank $n_\delta$ (see the proof of Theorem 4.2):
$$n_\delta := \max\left\{ \left( \frac{6 C_1'}{\delta \ln(4/\delta)} \right)^{\frac{1}{1/2 - \alpha/2}}, \; \left( \frac{6 C_2'}{\delta \ln(4/\delta)} \right)^{\frac{1}{\alpha - 1/2}}, \; \left( \frac{6 C_3'}{\delta \ln(4/\delta)} \right)^{\frac{1}{2}} \right\}, \qquad (18)$$
where $C_1'$, $C_2'$ and $C_3'$ are constants. Remark that the first two terms are the leading ones and that, if the rate $\alpha$ is chosen equal to $2/3$, they are of the same order, namely $n_\delta = O\left( \left( \delta \ln(1/\delta) \right)^{-6} \right)$.

Remark 4.4. We can make an informal comparison of the previous result with the central limit theorem stated in Cardot et al. (2013), Theorem 3.4, even if the latter result is of an asymptotic nature. Under assumptions (A1)-(A3), they have shown that
$$\sqrt{n} \left( \bar{Z}_n - m \right) \xrightarrow[n \to \infty]{\mathcal{L}} \mathcal{N}\left( 0, \, \Gamma_m^{-1} \Sigma \Gamma_m^{-1} \right),$$
with
$$\Sigma = \mathbb{E}\left[ \frac{X - m}{\|X - m\|} \otimes \frac{X - m}{\|X - m\|} \right].$$
With the continuity of the norm in $H$, this implies that for all $t > 0$,
$$\lim_{n \to \infty} \mathbb{P}\left[ \sqrt{n} \left\| \bar{Z}_n - m \right\| \ge t \right] = \mathbb{P}\left[ \|V\| \ge t \right],$$
where $V$ is a centered $H$-valued Gaussian random vector with covariance operator $\Delta_V = \Gamma_m^{-1} \Sigma \Gamma_m^{-1}$. The operator $\Delta_V$ is self-adjoint and non negative, so that it admits a spectral decomposition $\Delta_V = \sum_{j \ge 1} \eta_j \, v_j \otimes v_j$, where $\eta_1 \ge \eta_2 \ge \cdots \ge 0$ is the sequence of ordered eigenvalues associated with the orthonormal eigenvectors $v_1, v_2, \ldots$ Using the Karhunen-Loève expansion of $V$, we directly get that
$$\|V\|^2 = \sum_{j \ge 1} \eta_j V_j^2,$$
where $V_1, V_2, \ldots$ are i.i.d. centered Gaussian variables with unit variance. Thus $\|V\|^2$ is a weighted sum of independent chi-square random variables with one degree of freedom. Computing the quantiles of $\|V\|$ to build confidence balls would require knowing, or estimating, all the (leading) eigenvalues of the rather complicated operator $\Delta_V$, and this is not such an easy task. On the other hand, the use of the confidence balls given in Theorem 4.2 only requires the knowledge of $\lambda_{\min}$.
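Once $\lambda_{\min}$ (or a lower bound for it) is available, the ball of Theorem 4.2 is straightforward to evaluate. A tiny sketch; the value of $\lambda_{\min}$ used below is a placeholder, and $n$ is assumed large enough that $n \ge n_\delta$:

```python
import math

def confidence_radius(n, delta, lambda_min):
    """Radius of the non asymptotic confidence ball of Theorem 4.2:
    (4 / lambda_min) * (2/(3n) + 1/sqrt(n)) * ln(4/delta)."""
    return (4.0 / lambda_min) * (2.0 / (3.0 * n) + 1.0 / math.sqrt(n)) \
        * math.log(4.0 / delta)

# The radius shrinks like ln(4/delta)/sqrt(n):
for n in (10**4, 10**6):
    print(n, confidence_radius(n, delta=0.05, lambda_min=0.5))
```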
This eigenvalue is not difficult to estimate since it can also be written as
$$\lambda_{\min} = \mathbb{E}\left[ \frac{1}{\|X - m\|} \right] - \lambda_{\max}\left( \mathbb{E}\left[ \frac{1}{\|X - m\|^3} \, (X - m) \otimes (X - m) \right] \right),$$
where $\lambda_{\max}(A)$ denotes the largest eigenvalue of the operator $A$.

Remark 4.5. Under the previous assumptions and the additional condition $\alpha > 2/3$, it can be shown with decomposition (17) that there is a positive constant $C'$ such that
$$\mathbb{E}\left[ \left\| \bar{Z}_n - m \right\|^2 \right] \le \frac{C'}{n}.$$
The averaged algorithm thus converges at the parametric rate of convergence in quadratic mean.
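In $\mathbb{R}^d$, a plug-in Monte Carlo estimate of $\lambda_{\min}$ based on the identity above is immediate, assuming an estimate $\hat{m}$ of the median is available; the sketch below uses the true center of an arbitrary Gaussian sample for simplicity:

```python
import numpy as np

def lambda_min_estimate(X, m_hat):
    """Plug-in estimate of lambda_min via the identity above:
    E[1/||X-m||] minus the largest eigenvalue of
    E[(X-m)(X-m)^T / ||X-m||^3]."""
    diff = X - m_hat
    r = np.linalg.norm(diff, axis=1)
    term1 = np.mean(1.0 / r)
    W = diff / r[:, None] ** 1.5            # rows (X-m)/||X-m||^{3/2}
    A = W.T @ W / len(X)                    # estimates E[(X-m)(X-m)^T/||X-m||^3]
    return term1 - np.linalg.eigvalsh(A)[-1]

rng = np.random.default_rng(4)
X = rng.normal(size=(50_000, 3))
print(lambda_min_estimate(X, m_hat=np.zeros(3)))  # strictly positive
```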
