STOCHASTIC CONVERGENCE OF A NONCONFORMING FINITE ELEMENT METHOD FOR THE THIN PLATE SPLINE SMOOTHER FOR OBSERVATIONAL DATA ZHIMING CHEN∗, RUI TUO†, AND WENLONG ZHANG ‡ Abstract. Thethinplatesplinesmootherisaclassicalmodelforfindingasmoothfunctionfrom theknowledgeofitsobservationatscatteredlocationswhichmayhaverandomnoises. Weconsider anonconformingMorleyfiniteelementmethodtoapproximatethemodel. Weprovethestochastic 7 convergence of the finite element method which characterizes the tail property of the probability 1 distributionfunctionofthefiniteelementerror. Wealsoproposeaself-consistentiterativealgorithm 0 to determine the smoothing parameter based on our theoretical analysis. Numerical examples are 2 included to confirm the theoretical analysis and to show the competitive performance of the self- consistentalgorithmforfindingthesmoothingparameter. n a Key words. Thin plate spline, Morley element, stochastic convergence, optimal J parameter choice. 0 3 1. Introduction. The thin plate spline smoother is a classical mathematical modelforfindingasmoothfunctionfromtheknowledgeofitsobservationatscattered ] A locationswhichmaysubjecttorandomnoises. LetΩbeaboundedLipschitzdomain N in Rd (d ≤ 3) and u0 ∈ H2(Ω) be the unknown smooth function. Let {xi}ni=1 ⊂ Ω be the scattered locations in the domain where the observation is taken. We want . h to approximate u from the noisy data y = u (x )+e , 1 ≤ i ≤ n, where {e }n t 0 i 0 i i i i=1 a are independent and identically distributed random variables on some probability m space (X,F,P) satisfying E[e ] = 0 and E[e2] ≤ σ2. Here and in the following E[X] i i [ denotes the expectation of the random variable X. The thin plate spline smoother, 1 i.e.,D2-splinesmoothertoapproximateu0,isdefinedtobetheuniquesolutionofthe following variational problem v 6 n 2 1 (cid:88) min (u(x )−y )2+λ |u|2 , (1.1) 6 u∈H2(Ω)n i i n H2(Ω) 8 i=1 0 . where λn >0 is the smoothing parameter. 1 Thesplinemodelforscattereddatahasbeenextensivelystudiedintheliterature. 0 For Ω = Rd, [4] proved that (1.1) has a unique solution u ∈ H2(Rd) when the set 7 n 1 T = {xi : i = 1,2,··· ,n} is not collinear (i.e. the points in T are not on the : same plane). Explicit formula of the solution is constructed in [4] based on radial v basis functions. [8] derived the convergence rate for the expectation of the error i X |u −u |2 , j = 0,1,2. Under the assumption that e , i = 1,2,··· ,n, are also n 0 Hj(Ω) i r sub-Gaussion random variables, [10] proved the stochastic convergence of the error a in terms of the empirical norm (cid:107)u −u (cid:107) :=(n−1(cid:80)n |u (x )−u (x )|2)1/2 when n 0 n i=1 n i 0 i d = 1. The stochastic convergence which provides additional tail information of the probability distribution function for the random error is very desirable for the ∗LSEC,InstituteofComputationalMathematics,AcademyofMathematicsandSystemSciences andSchoolofMathematicalScience,UniversityofChineseAcademyofSciences,ChineseAcademy ofSciences,Beijing100190,China. ThisauthorwassupportedinpartbytheChinaNSFunderthe grant113211061. ([email protected]). †InstituteofSystemsScience,AcademyofMathematicsandSystemSciences,ChineseAcademy ofSciences,Beijing100190,China. ([email protected]). ‡SchoolofMathematicalScience, UniversityofChineseAcademyofSciences, ChineseAcademy ofSciences,Beijing100190,China. ([email protected]). 1 approximation of random variables. We refer to [12] for further information of the thin plate spline smoothers. It is well-known that the numerical method based on radial basis functions to solve the thin plate spline smoother requires to solve a symmetric indefinite dense linear system of equations of the size O(n), which is challenging for applications with very large data sets [6]. Conforming finite element methods for the solution of thin platemodelarestudiedin[1]andthereferencestherein. In[6]amixedfiniteelement method for solving ∇u is proposed and the expectation of the finite element error is n proved. The advantage of the mixed finite element method in [6] lies in that one can use simple H1(Ω)-conforming finite element spaces. The H1 smoother in [6] that the mixed finite element method aims to approximate is not equivalent to the thin plate spline model (1.1). In this paper we consider the nonconforming finite element approximation to the problem (1.1). We use the Morley element [5, 7, 9] which is of particular interest for solvingfourthorderPDEssinceithastheleastnumberofdegreesoffreedomoneach element. The difficulty of the finite element analysis for the thin plate smoother is the low stochastic regularity of the solution u . One can only prove the boundedness n of E[|u |2 ] (see Theorem 2.2 below). This difficulty is overcome by a smoothing n H2(Ω) operator based on the C1-element for any Morley finite element functions. We also provetheprobabilitydistributionfunctionoftheempiricalnormofthefiniteelement error has an exponentially decaying tail. For that purpose we also prove the conver- genceoftheerror(cid:107)u −u (cid:107) intermsoftheOrliczψ norm(seeTheorem4.7below) n 0 n 2 which improves the result in [10]. One of the central issues in the application of the thin plate model is the choice of the smoothing parameter λ . In the literature it is usually made by the method of n crossvalidation[12]. Theanalysisinthispapersuggeststheoptimalchoiceshouldbe λ1/2+d/8 =O(σn−1/2(|u | +σn−1/2)−1). (1.2) n 0 H2(Ω) Sinceonedoesnotknowu andtheupperboundofthevarianceσinpracticalapplica- 0 tions, we propose a self-consistent algorithm to determine λ from the natural initial n guess λn = n−4+4d. Our numerical experiments show this self-consistent algorithm performs rather well. The layout of the paper is as follows. In section 2 we recall some preliminary propertiesofthethinplatemodel. Insection3weintroducethenonconformingfinite element method and show the convergence of the finite element solution in terms of the expectation of Sobolev norms. In section 4 we study the tail property of the probability distribution function for the finite element error based on the theory of empiricalprocessforsub-Gaussionnoises. Insection5weintroduceourself-consistent algorithm for finding the smooth parameter λ and show several numerical examples n to support the analysis in this paper. 2. The thin plate model. In this section we collect some preliminary results about the thin plate smoother (1.1). In this paper, we will always assume that Ω is a boundedLipschitzdomainsatisfyingtheuniformconecondition. Wewillalsoassume that T are uniformly distributed in the sense that [8] there exists a constant B > 0 such that hmax ≤B, where hmin h = sup inf |x−x |, h = inf |x −x |. max i min i j x∈Ω1≤i≤n 1≤i(cid:54)=j≤n 2 It is easy to see that there exist constants B ,B such that B n−1/d ≤ h ≤ 1 2 1 max Bh ≤B n−1/d. min 2 We write the empirical inner product between the data and any function v ∈ C(Ω¯) as (y,v) = 1 (cid:80)n y v(x ). We also write (u,v) = 1 (cid:80)n u(x )v(x ) for any n n i=1 i i n n i=1 i i u,v ∈ C(Ω¯) and the empirical norm (cid:107)u(cid:107) = (1 (cid:80)n u2(x ))1/2 for any u ∈ C(Ω¯). n n i=1 i By [8, Theorems 3.3-3.4], there exists a constant C >0 depending only on Ω,B such that for any u∈H2(Ω) and sufficiently small h , max (cid:107)u(cid:107) ≤C((cid:107)u(cid:107) +h2 |u| ), (cid:107)u(cid:107) ≤C((cid:107)u(cid:107) +h2 |u| ). (2.1) L2(Ω) n max H2(Ω) n L2(Ω) max H2(Ω) It follows from (2.1) and Lax-Milgram lemma that the minimization problem (1.1) has a unique solution u ∈H2(Ω). The following convergence result is proved in [8]. n Lemma 2.1. Let U ∈H2(Rd) be the solution of following variational problem: n min (cid:107)u−y(cid:107)2 +λ |u|2 , (2.2) n n H2(Rd) u∈D−2L2(Rd) where D−2L2(Rd)={u|Dαu∈L2(Rd), |α|=2}. Then there exist constants λ >0 0 and C >0 such that for any λ ≤λ and nλd/4 ≥1, n 0 n E(cid:2)(cid:107)U −u (cid:107)2(cid:3)≤Cλ |u |2 + Cσ2 , (2.3) n 0 n n 0 H2(Ω) nλd/4 n E(cid:2)|U |2 (cid:3)≤C|u |2 + Cσ2 . (2.4) n H2(Ω) 0 H2(Ω) nλ1+d/4 n Define the bilinear form a:H2(Ω)×H2(Ω)→R as (cid:88) (cid:90) ∂2u ∂2v a (u,v)= dx, ∀u,v ∈H2(Ω). (2.5) Ω ∂x ∂x ∂x ∂x 1≤i,j≤d Ω i j i j It is obvious that |u|2 =a(u,u) for any u∈H2(Ω). H2(Ω) Theorem 2.2. Let u ∈H2(Ω) be the unique solution of (1.1). Then there exist n constants λ >0 and C >0 such that for any λ ≤λ and nλd/4 ≥1, 0 n 0 n E(cid:2)(cid:107)u −u (cid:107)2(cid:3)≤Cλ |u |2 + Cσ2 , (2.6) n 0 n n 0 H2(Ω) nλd/4 n E(cid:2)|u |2 (cid:3)≤C|u |2 + Cσ2 . (2.7) n H2(Ω) 0 H2(Ω) nλ1+d/4 n Proof. It is clear that u ∈ H2(Ω) and U ∈ H2(Rd) satisfy the following n n variational forms: λ a (u ,v)+(u ,v) =(y,v) , ∀v ∈H2(Ω), (2.8) n Ω n n n n λ a (U ,w)+(U ,w) =(y,w) , ∀w ∈H2(Rd). (2.9) n Rd n n n n Let F :H2(Ω)→D−2L2(Rd) be the extension operator defined by Fu= argmin |v| . H2(Ω) v∈D−2L2(Rd),v|Ω=u 3 It is known [4, 8] that Fu = u in Ω and |Fu| ≤ C|u| for some constant H2(Rd) H2(Ω) C >0. Wewriteu˜=FuinRd inthefollowing. Thus,itfollowsfrom(2.8)-(2.9)that λnaΩ(un−Un,v)+(un−Un,v)n =λnaRd\Ω¯(Un,v˜), ∀v ∈H2(Ω), which implies by taking v =u −U | ∈H2(Ω) that n n Ω λ |u −U |2 +(cid:107)u −U (cid:107)2 ≤λ |U | |u˜ −U˜ | n n n H2(Ω) n n n n n H2(Rd) n n H2(Rd) ≤Cλ |U | |u −U | , n n H2(Rd) n n H2(Ω) where U˜ =F(U | ). Therefore n n Ω |u −U |2 ≤C|U |2 , (cid:107)u −U (cid:107)2 ≤λ |U |2 . (2.10) n n H2(Ω) n H2(Rd) n n n n n H2(Rd) SinceU isthesolutionof (2.2)andU˜ =U inΩ,wehave|U | ≤|U˜ | ≤ n n n n H2(Rd) n H2(Rd) C|U | . Therefore, E[|u |2 ] ≤ CE[|U |2 ], which implies (2.7) by using n H2(Ω) n H2(Ω) n H2(Ω) (2.4). Similarly one obtains (2.6) from the second estimate in (2.10) and (2.3)-(2.4). This completes the proof. Theorems 2.1 and 2.2 suggest that an optimal choice of the parameter λ is such n that λ1+d/4 =O((σ2n−1)|u |−2 ). n 0 H2(Ω) 3. Nonconforming finite element method. In this section we consider the nonconforming finite element approximation to the thin plate model (1.1) whose so- lution u ∈H2(Ω) satisfies the following weak formulation n λ a (u ,v)+(u ,v) =(y,v) , ∀v ∈H2(Ω). (3.1) n Ω n n n n We assume Ω is a polygonal or polyhedral domain in Rd (d = 2,3) in the reminder of this paper. Let M be a family of shape regular and quasi-uniform finite element h meshes over the domain Ω. We will use the Morley element [5] for 2D, [9] for 3D to define our nonconforming finite element method. The Morley element is a triple (K,P ,Σ ), where K ∈ M is a simplex in Rd, P = P (K) is the set of second K K h K 2 order polynomials in K, and Σ is the set of the degrees of freedom. In 2D, for the K element K with vertices a ,1 ≤ i ≤ 3, and mid-points b of the edge opposite to the i i vertex a , 1 ≤ i ≤ 3, Σ = {p(a ),∂ p(b ),1 ≤ i ≤ 3,∀p ∈ C1(K)}. In 3D, for the i K i ν i element K with edges S which connects the vertices a ,a , 1≤i<j ≤4, and faces ij i j F opposite to a , 1 ≤ j ≤ 4, Σ = { 1 (cid:82) p,1 ≤ i < j ≤ 4, 1 (cid:82) ∂ p,1 ≤ j ≤ j j K |Sij| Sij |Fj| Fj ν 4,∀p∈C1(K)}. Here∂ pisthenormalderivativeofpoftheedges(2D)orfaces(3D) ν of the element. We refer to Figure 3.1 for the illustration of the degrees of freedom of the Morley element. Let V be the Morley finite element space h V ={v :v | ∈P (K),∀K ∈M ,f(v | )=f(v | ),∀f ∈Σ ∩Σ }. h h h K 2 h h K1 h K2 K1 K2 The functions in V may not be continuous in Ω. Given a set G⊂R2, let M (G)= h h {K ∈ M : G∩K (cid:54)= ∅} and N(G) the number of elements in M (G). For any h h v ∈V , we define h h 1 (cid:88) vˆ (x )= (v | )(x ), i=1,2,··· ,n. (3.2) h i N(x ) h K(cid:48) i i K(cid:48)∈Mh(xi) 4 Fig. 3.1. The degrees of freedom of 2D Morley (left) and 3D Morley (right) element. Noticethatifx islocatedinsidesomeelementK, thenM (x )={K}andvˆ (x )= i h i h i v (x ), i = 1,2,··· ,n. With this definition we know that (vˆ ,wˆ ) and (e,wˆ ) are h i h h n h n well-defined for any v ,w ∈V . h h h Let a (u ,v )= (cid:88) (cid:88) (cid:90) ∂2uh ∂2vh dx, ∀u ,v ∈V . h h h ∂x ∂x ∂x ∂x h h h K∈Mh1≤i,j≤d K i j i j The finite element approximation of the problem (3.1) is to find u ∈V such that h h λ a (u ,v )+(uˆ ,vˆ ) =(y,vˆ ) , ∀v ∈V . (3.3) n h h h h h n h n h h Since the sampling point set T is not collinear, by Lax-Milgram lemma, the problem (3.3) has a unique solution. LetI :H2(K)→P (K)bethecanonicallocalnodalvalueinterpolantofMorley K 2 element [7, 9] and I : L2(Ω) → V be the global nodal value interpolant such that h h (I u)| = I u for any K ∈ M and piecewise H2(K) functions u ∈ L2(Ω). We h K K h introduce the mesh dependent semi-norm |·| , m≥0, m,h (cid:32) (cid:33)1/2 (cid:88) |v| = |v|2 , m,h Hm(K) K∈Mh for any v ∈L2(Ω) such that v| ∈Hm(K),∀K ∈M . K h Lemma 3.1. We have |u−I u| ≤Ch2−m|u| , ∀u∈Hm(K),0≤m≤2, (3.4) K Hm(K) K H2(K) (cid:107)u−I(cid:100)hu(cid:107)n ≤Ch2|u|H2(Ω), ∀u∈H2(Ω), (3.5) where h is the diameter of the element K and h=max h . K K∈Mh K Proof. Since I p = p for any p ∈ P (K) [9], the estimate (3.4) follows from the K 2 standard interpolation theory for finite element method [3]. Moreover, we have, by local inverse estimates and the standard interpolation estimates (cid:104) (cid:105) (cid:107)u−I u(cid:107) ≤ inf (cid:107)u−p(cid:107) +|K|−1/2(cid:107)I (u−p)(cid:107) K L∞(K) L∞(K) K L2(K) p∈P2(K) ≤Ch2−d/2|u| . K H2(K) 5 LetT ={x ∈T:x ∈K,1≤i≤n}. BytheassumptionTisuniformlydistributed K i i and the mesh is quasi-uniform, we know that the cardinal #T ≤Cnhd. Thus K 1 (cid:88) (cid:107)u−I(cid:100)hu(cid:107)2n ≤ n #TK(cid:107)u−IKu(cid:107)2L∞(K) ≤Ch4|u|2H2(Ω). K∈Mh This proves (3.5). The following property of Morley element will be used below. Lemma 3.2. Let K,K(cid:48) ∈ M and F = K ∩K(cid:48). There exists a constant C h independent of h such that for any v ∈V , |α|≤2, h h (cid:107)∂α(v | −v | )(cid:107) ≤Ch2−|α|−d/2(|v | +|v | ). h K h K(cid:48) L∞(F) h H2(K) h H2(K(cid:48)) Proof. By [9, Lemma 5] we know that (cid:107)v | −v | (cid:107) ≤Ch3/2(|v | +|v | ). h K h K(cid:48) L2(F) h H2(K) h H2(K(cid:48)) By using the inverse estimate we then obtain (cid:107)∂α(v | −v | )(cid:107) ≤Ch−|α|(cid:107)v | −v | (cid:107) h K h K(cid:48) L∞(F) h K h K(cid:48) L∞(F) ≤Ch−|α|−(d−1)/2(cid:107)v | −v | (cid:107) h K h K(cid:48) L2(F) ≤Ch2−|α|−d/2(|v | +|v | ). h H2(K) h H2(K(cid:48)) This proves the lemma. Lemma 3.3. There exists a linear operator Π : V → H2(Ω) such that for any h h v ∈V , h h |v −Π v | ≤Ch2−m|v | , m=0,1,2, (3.6) h h h m,h h 2,h (cid:107)vˆ −Π v (cid:107) ≤Ch2|v | , (3.7) h h h n h 2,h where the constant C is independent of h. Proof. Wewillonlyprovethelemmaforthecased=2. Thecaseofd=3willbe briefly discussed in the appendix of this paper. We will construct Π v by using the h h Agyris element. We recall [3, P.71] that for any K ∈ M , Agyris element is a triple h (K,P ,Λ ),whereP =P (K)andthesetofdegreesoffreedom,withthenotation K K K 5 in Figure 3.2, Λ = {p(a ),Dp(a )(a −a ),D2p(a )(a −a ,a −a ),∂ p(b ),1 ≤ K i i j i i j i k i ν i i,j,k ≤3,j (cid:54)=i,k (cid:54)=i,∀p∈C2(K)}. Let X be the Agyris finite element space h X ={v :v | ∈P (K),∀K ∈M ,f(v | )=f(v | ),∀f ∈Λ ∩Λ .} h h h K 5 h h K1 h K2 K1 K2 It is known that X ⊂H2(Ω). h We define the operator Π as follows. For any v ∈ V , w := Π v ∈ X such h h h h h h h that for any K ∈M , w | ∈P (K) and h h K 5 1 (cid:88) ∂α(w | )(a )= ∂α(v | )(a ), 1≤i≤3, |α|≤2, (3.8) h K i N(a ) h K(cid:48) i i K(cid:48)∈Mh(ai) ∂ (w | )(b )=∂ (v | )(b ), 1≤i≤3. (3.9) ν h K i ν h K i HereM (a )andN(a )aredefinedabove(3.2). Toshowtheestimate(3.6)wefollow h i i an idea in [3, Theorem 6.1.1] and use the element Hermite triangle of type (5) [3, 6 Fig. 3.2. The degrees of freedom of Agyris element (left) and Hermite triangle of type (5) (right). P.102], which is a triple (K,P ,Θ ), where P = P (K) and the set of degrees of K K K 5 freedom Θ = {p(a ),Dp(a )(a −a ),D2p(a )(a −a ,a −a ),Dp(b )(a −b ),1 ≤ K i i j i i j i k i i i i i,j,k ≤ 3,j (cid:54)= i,k (cid:54)= i,∀p ∈ C2(K)}. The finite element space of Hermite triangle of type (5) is H1 conforming and a regular family of Hermite triangle of type (5) is affine-equivalent. For any K ∈ M , denote by p ,p ,p ,q the basis functions h i ij ijk i associated with the degrees of freedom p(a ),Dp(a )(a −a ),D2p(a )(a −a ,a − i i j i i j i k a ),Dp(b )(a −b ), 1≤i,j,k ≤3,j (cid:54)=i,k (cid:54)=i. i i i i For any v ∈ V , we also define a linear operator q := Λ v as follows: for any h h h h h K ∈M , q | ∈P (K) and h h K 5 1 (cid:88) ∂α(q | )(a )= ∂α(v | )(a ), 1≤i≤3, |α|≤2, (3.10) h K i N(a ) h K(cid:48) i i K(cid:48)∈Mh(ai) D(q | )(b )(a −b )=D(v | )(b )(a −b ), 1≤i≤3. (3.11) h K i i i h K i i i ThenfromthedefinitionofMorleyelementandHermitetriangleoftype(5),weknow that φ | :=(v −q )| ∈P (K) satisfies h K h h K 5 (cid:88) φ (x)= D(φ | )(a )(a −a )p (x) h h K i j i ij i,j=1,2,3,j(cid:54)=i (cid:88) + D2(φ | )(a )(a −a ,a −a )p (x). h K i j i k i ijk i,j,k=1,2,3,j(cid:54)=i,k(cid:54)=i Since a regular family of Hermite triangle of type (5) is affine-equivalent, by stan- dard scaling argument [3, Theorem 3.1.2], we obtain easily |q | +|p | + i Hm(K) i Hm(K) |p | +|p | ≤Ch1−m, m=0,1,2. Thus, for m=0,1,2, ij Hm(K) ijk Hm(K) K 1/2 3 (cid:88) (cid:88) |φh|Hm(K) ≤Ch1K−m h|α||∂α(vh|K)(ai)−∂α(qh|K)(ai)|2 . (3.12) i=11≤|α|≤2 ByLemma3.2andthefactthat∂α(q | )(a )isthelocalaverageof∂αv overelements h K i h around a in (3.10) i 1/2 (cid:88) |∂α(vh|K)(ai)−∂α(qh|K)(ai)|≤Ch1−|α| |vh|2H2(K(cid:48)) , ∀1≤|α|≤2. K(cid:48)∈Mh(ai) 7 Inserting above estimate into (3.12), we get 1/2 (cid:88) |vh−qh|Hm(K) ≤Ch2−m |vh|2H2(K(cid:48)) , m=0,1,2. (3.13) K(cid:48)∈Mh(K) By (3.8)-(3.11) we know that q −w ∈P (K) and satisfies h h 5 3 (cid:88) q (x)−w (x)= D(q | −w | )(b )(a −b )q (x). h h h K h K i i i i i=1 On the other hand, for 1≤i≤3, D(q | −w | )(b )(a −b )=∂ (q | −v | )(b )[(a −b )·ν], h K h K i i i ν h K h K i i i since ∂ (w | )(b ) = ∂ (v | )(b ) by (3.9) and the tangential derivative of (q | − ν h K i ν h K i h K w | ) vanishes as a consequence of (3.8) and (3.10). Since |q | ≤ Ch1−m for h K i Hm(K) K m=0,1,2, we obtain then (cid:32) 3 (cid:33)1/2 (cid:88) |q −w | ≤Ch2−m |∂ (q | −v | )(b )|2 h h Hm(K) ν h K h K i i=1 1/2 (cid:88) ≤Ch2−m |vh|2H2(K(cid:48)) , m=0,1,2, (3.14) K(cid:48)∈Mh(K) whereinthesecondinequalitywehaveusedthefactthatbytheinverseestimateand (3.13), |∂ (q | −v | )(b )|≤|q −v | ≤Ch−1|q −v | ν h K h K i h h W1,∞(K) K h h H1(K) 1/2 (cid:88) ≤C |vh|2H2(K(cid:48)) . K(cid:48)∈Mh(K) Combining (3.13) and (3.14) shows (3.6). Toshow(3.7),weusethenotationintheproofofLemma3.1,theinverseestimate and (3.6) to get C (cid:88) (cid:107)vˆ −w (cid:107)2 ≤ #T (cid:107)v −w (cid:107)2 ≤C(cid:107)v −w (cid:107)2 ≤Ch4|v |2 . h h n n K h h L∞(K) h h L2(Ω) h 2,h K∈Mh This completes the proof. For any function v which is piecewise in C2(K) for any K ∈ M , we use the h convenient energy norm |(cid:107)v|(cid:107) =(cid:0)λ |v|2 +(cid:107)vˆ(cid:107)2(cid:1)1/2. h n 2,h n Here vˆ(x ), i=1,2,··· ,n, is defined as in (3.2), that is, vˆ(x ) is the local average of i i all v| (x ), where K(cid:48) ∈M such that x ∈K(cid:48). K(cid:48) i h i Theorem 3.4. Let u ∈ H2(Ω) be the unique solution of (3.1) and u ∈ V be n h h the solution of (3.3). Then there exist constants λ >0 and C >0 such that for any 0 8 λ ≤λ and nλd/4 ≥1, n 0 n E(cid:2)(cid:107)u −uˆ (cid:107)2(cid:3)≤C(λ +h4)|u |2 +C(cid:34)1+ h4 +(cid:18)h4(cid:19)1−d/4(cid:35) σ2 . (3.15) 0 h n n 0 H2(Ω) λn λn nλdn/4 In particular, if h4 ≤Cλ , we have n E(cid:2)(cid:107)u −uˆ (cid:107)2(cid:3)≤Cλ |u |2 + Cσ2 . (3.16) 0 h n n 0 H2(Ω) nλd/4 n Proof. We start by using the Strang lemma [3] |λ a (u ,v )+(u −y,vˆ ) | |(cid:107)u −uˆ |(cid:107) ≤C inf |(cid:107)u −vˆ |(cid:107) +C sup n h n h n h n . (3.17) n h h vh∈Vh n h h 0(cid:54)=vh∈Vh |(cid:107)vh|(cid:107)h By Lemma 3.1 we have inf |(cid:107)u −vˆ |(cid:107) ≤C(λ1/2+h2)|u | . (3.18) n h h n n H2(Ω) vh∈Vh Since for any v ∈ V , Π v ∈ H2(Ω), by (3.1) and the fact that y = u (x )+e , h h h h i 0 i i i=1,2,···n, to obtain λ a (u ,v )+(u −y,vˆ ) n h n h n h n =λ a (u ,v −Π v )+(u −y,vˆ −Π v ) n h n h h h n h h h n ≤λ |u | |v −Π v | +(cid:107)u −u (cid:107) (cid:107)vˆ −Π v (cid:107) +(e,vˆ −Π v ) . n n H2(Ω) h h h 2,h n 0 n h h h n h h h n Now by using Lemma 3.3 we have |λ a (u ,v )+(u −y,vˆ ) | sup n h n h n h n |(cid:107)v |(cid:107) 0(cid:54)=vh∈Vh h h h2 |(e,vˆ −Π v ) | ≤Cλ1/2|u | +C (cid:107)u −u (cid:107) + sup h h h n . (3.19) n n H2(Ω) λ1n/2 n 0 n 0(cid:54)=vh∈Vh |(cid:107)vh|(cid:107)h Sincee ,i=1,2,··· ,n,areindependentandidenticallydistributedrandomvariables, i we have E(cid:2)|(e,vˆ −Π v ) |2(cid:3)=σ2n−1(cid:107)vˆ −Π v (cid:107)2 ≤Cσ2n−1h4|v |2 , h h h n h h h n h 2,h where we have used Lemma 3.3 in the last inequality. Let N be the dimension of V which satisfies N ≤ Ch−d since the mesh is h h h quasi-uniform. Recall that if {X }Nh are random variables, E[sup |X |] ≤ i i=1 1≤i≤Nh i (cid:80)Nh E[|X |]. We have then i=1 i (cid:34) |(e,v −Π v ) |2(cid:35) (cid:20)|(e,v −Π v ) |2(cid:21) σ2h4−d E sup h h h n ≤N · sup E h h h n ≤C . |(cid:107)v |(cid:107)2 h |(cid:107)v |(cid:107)2 nλ 0(cid:54)=vh∈Vh h h 0(cid:54)=vh∈Vh h h n (3.20) Combining (3.17)-(3.20) we obtain E(cid:2)|(cid:107)u −uˆ |(cid:107)2(cid:3)≤Cλ E(cid:2)|u |2 (cid:3)+Ch4E(cid:2)(cid:107)u −u (cid:107)2(cid:3)+Cσ2h4−d. n h h n n H2(Ω) λ n 0 n nλ n n This completes the proof by using Theorem 2.2. 9 4. Stochasticconvergence. Inthissectionwestudythestochasticconvergence of the error (cid:107)u −uˆ (cid:107) which characterizes the tail property of P((cid:107)u −uˆ (cid:107) ≥ z) 0 h n 0 h n for z > 0. We assume the noises e , i = 1,2,··· ,n, are independent and identically i distributedsub-Gaussianrandomvariableswithparameterσ >0. Arandomvariable X is sub-Gaussion with parameter σ if it satisfies (cid:104) (cid:105) E eλ(X−E[X]) ≤e21σ2λ2, ∀λ∈R. (4.1) The probability distribution function of a sub-Gaussion random variable has a expo- nentially decaying tail, that is, if X is a sub-Gaussion random variable, then P(|X−E[X]|≥z)≤2e−12z2/σ2, ∀z >0. (4.2) In fact, by Markov inequality, for any λ>0, P(X−E[X]≥z)=P(λ(X−E[X])≥λz)≤e−λzE[eλ(X−E[X])]≤e−λz−21σ2λ2. By taking λ = z/σ2 yields P(X −E[X] ≥ z) ≤ e−21z2/σ2. Similarly, one can prove P(X−E[X]≤−z)≤e−12z2/σ2. This shows (4.2). 4.1. Stochastic convergence of the thin plate splines. We will use several tools from the theory of empirical processes [11, 10] for our analysis. We start by re- callingthedefinitionofOrlicznorm. Letψ beamonotoneincreasingconvexfunction satisfying ψ(0) = 0. Then the Orilicz norm (cid:107)X(cid:107) of a random variable X is defined ψ as (cid:26) (cid:20) (cid:18) (cid:19)(cid:21) (cid:27) |X| (cid:107)X(cid:107) =inf C >0:E ψ ≤1 . (4.3) ψ C By using Jensen inequality, it is easy to check (cid:107)X(cid:107) is a norm. In the following we ψ will use the (cid:107)X(cid:107) norm with ψ (t) = et2 −1 for any t > 0. By definition we know ψ2 2 that P(|X|≥z)≤2e−z2/(cid:107)X(cid:107)2ψ2, ∀z >0. (4.4) The following lemma is from [11, Lemma 2.2.1] which shows the inverse of this prop- erty. Lemma 4.1. If there exist positive constants C,K such that P(|X| > z) ≤ Ke−Cz2, ∀z >0, then (cid:107)X(cid:107) ≤(cid:112)(1+K)/C. ψ2 LetT beasemi-metricspacewiththesemi-metricdand{X :t∈T}bearandom t process indexed by T. Then the random process {X :t∈T} is called sub-Gaussian t if P(|Xs−Xt|>z)≤2e−12z2/d(s,t)2, ∀s,t∈T, z >0. (4.5) For a semi-metric space (T,d), an important quantity to characterize the complexity of the set T is the entropy which we now introduce. The covering number N(ε,T,d) is the minimum number of ε-balls that cover T. A set is called ε-separated if the distance of any two points in the set is strictly greater than ε. The packing number D(ε,T,d) is the maximum number of ε-separated points in T. logN(ε,T,d) is called the covering entropy and logD(ε,T,d) is called the packing entropy. It is easy to check that [11, P.98] ε N(ε,T,d)≤D(ε,T,d)≤N( ,T,d). (4.6) 2 10