Normal Approximation for White Noise Functionals by Stein's Method and Hida Calculus

January 3, 2017. Lecture Note Series, IMS, NUS. Review Vol. 9in x 6in. HidaStein7

arXiv:1701.00360v1 [math.PR] 2 Jan 2017

Louis H. Y. Chen
Department of Mathematics
National University of Singapore
10 Lower Kent Ridge Road, Singapore 119076
[email protected]

Yuh-Jia Lee
Department of Applied Mathematics
National University of Kaohsiung
Kaohsiung, TAIWAN 811
[email protected]

Hsin-Hung Shih
Department of Applied Mathematics
National University of Kaohsiung
Kaohsiung, TAIWAN 811
[email protected]

In this paper we establish a framework for normal approximation for white noise functionals by Stein’s method and Hida calculus. Our work is inspired by that of Nourdin and Peccati (Probab. Theory Relat. Fields 145, 75-118, 2009), who combined Stein’s method and Malliavin calculus for normal approximation for functionals of Gaussian processes.

1. Introduction

Stein’s method, introduced by C. Stein [39] in his 1972 paper, is a powerful way of determining the accuracy of normal approximation to the distribution of a sum of dependent random variables. It has been extended to approximations by a broad class of other probability distributions such as the Poisson, compound Poisson, and Gamma distributions, and to approximations on finite as well as infinite dimensional spaces. The results of these approximations have been extensively applied in a wide range of other fields such as the theory of random graphs, computational molecular biology, etc. For further details, see [1,2,3,6,35,36,40] and the references cited therein.

Analysis on infinite-dimensional Gaussian spaces has been formulated in terms of Malliavin calculus and Hida calculus.
The former, introduced by Malliavin ([29]), studies the calculus of Brownian functionals and their applications on the classical Wiener space. The connection between Stein’s method and Malliavin calculus has been explored by Nourdin and Peccati (see [31]). They developed a theory of normal approximation on infinite-dimensional Gaussian spaces. In their connection, the Malliavin derivative $D$ plays an important role. See also [7,9].

Hida calculus, also known as white noise analysis, is the mathematical theory of white noise initiated by T. Hida in his 1975 Carleton Mathematical Lecture Notes [14]. Let $\{B(t);\ t \in \mathbb{R}\}$ be a standard Brownian motion and let the white noise $\dot{B}(t) \equiv dB(t)/dt$, $t \in \mathbb{R}$, be represented by generalized functions. By regarding the collection $\{\dot{B}(t);\ t \in \mathbb{R}\}$ as a coordinate system, Hida defined and studied generalized white noise functionals $\varphi(\dot{B}(t),\ t \in \mathbb{R})$ through their $U$-functionals. We refer the interested reader to [15,16,19,38].

The objective of this paper is to develop a connection between Stein’s method and Hida calculus for normal approximation for white noise functionals (see Section 5). Our approach is analogous to that for the connection between Stein’s method and Malliavin calculus as established by Nourdin and Peccati [31]. The connection between Stein’s method and Hida calculus will be built on the expression of the number operator (or the Ornstein-Uhlenbeck operator) in terms of the Hida derivatives through integration by parts techniques (see Section 4). The difficulty that we have encountered so far is that the Hida derivative $\partial_t$, that is, the $\dot{B}(t)$-differentiation, cannot be defined on all square-integrable white noise functionals in $(L^2)$. Extending the domain of $\partial_t$ to a larger subclass of $(L^2)$ and studying the regularity of $\partial_t$ will be a key contribution in our paper.

At the time of completing this paper we came to know about the PhD thesis of Chu [11], in which he developed normal approximation (in Wasserstein distance) for Lévy functionals by applying Stein’s method and Hida calculus.
He achieved this by using the white noise approach of Lee and Shih [28].

We list some notation which will be often used in this paper.

Notation 1.1:

(1) For a real locally convex space $V$, $\mathbb{C}V$ denotes its complexification. If $(V, |\cdot|_V)$ is a real Hilbert space, then $\mathbb{C}V$ is a complex Hilbert space with the $|\cdot|_{\mathbb{C}V}$-norm given by $|\phi|_{\mathbb{C}V}^2 = |\phi_1|_V^2 + |\phi_2|_V^2$ for any $\phi = \phi_1 + i\phi_2$, $\phi_1, \phi_2 \in V$. In particular, for $V = \mathcal{S}_p$ with the $|\cdot|_p$-norm (see Section 2 for the definition), we will still use $|\cdot|_p$ to denote $|\cdot|_{\mathbb{C}\mathcal{S}_p}$.
(2) The symbol $(\cdot,\cdot)$ denotes the $\mathcal{S}'$-$\mathcal{S}$, $\mathbb{C}\mathcal{S}'$-$\mathbb{C}\mathcal{S}$, $\mathcal{S}_{-p}$-$\mathcal{S}_p$, or $\mathbb{C}\mathcal{S}_{-p}$-$\mathbb{C}\mathcal{S}_p$ pairing.
(3) For an $n$-linear operator $T$ on $X \times \cdots \times X$, $Tx^n$ means $T(x,\ldots,x)$, and likewise $Tx^{n-1}y = T(x,\ldots,x,y)$, $x, y \in X$, where $X$ is a real or complex locally convex space.
(4) The constant $\omega_r$ with $r > 0$ is given by the square root of $\sup\{n/2^{2nr};\ n \in \mathbb{N}_0\}$, where $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$.

2. Stein’s method

In this section we give a brief exposition of the basics of Stein’s method for normal approximation.

2.1. From characterization to approximation

In his 1986 monograph [39], Stein proved the following characterization of the normal distribution.

Proposition 2.1: (Stein’s lemma) The following are equivalent.

(i) $W \sim N(0,1)$;
(ii) $E[f'(W) - Wf(W)] = 0$ for all $f \in \mathcal{C}_B^1$.

Proof: By integration by parts, (i) implies (ii). If (ii) holds, solve
\[
f'(w) - wf(w) = h(w) - Eh(Z)
\]
where $h \in \mathcal{C}_B$ and $Z$ has the standard normal distribution (denoted by $Z \sim N(0,1)$). Its unique bounded solution $f_h$ is given by
\[
\begin{aligned}
f_h(w) &= -e^{\frac{1}{2}w^2} \int_w^{\infty} e^{-\frac{1}{2}t^2}[h(t) - Eh(Z)]\,dt \\
&= e^{\frac{1}{2}w^2} \int_{-\infty}^{w} e^{-\frac{1}{2}t^2}[h(t) - Eh(Z)]\,dt.
\end{aligned}
\tag{2.1}
\]
Using $\int_w^{\infty} e^{-\frac{1}{2}t^2}\,dt \le w^{-1}e^{-\frac{1}{2}w^2}$ for $w > 0$, we can show that $f_h \in \mathcal{C}_B^1$ with $\|f_h\|_\infty \le \sqrt{2\pi e}\,\|h\|_\infty$ and $\|f_h'\|_\infty \le 4\|h\|_\infty$. Substituting $f_h$ for $f$ in (ii) leads to
\[
Eh(W) = Eh(Z) \quad \text{for } h \in \mathcal{C}_B.
\]
This proves (i).
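The formula (2.1) can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper names (`simpson`, `normal_mean`, `stein_solution`, `stein_residual`), the test function $h = \cos$, the truncation of the integrals at distance 10 and the finite-difference step are all choices made here, not part of the paper. It evaluates $f_h$ by quadrature and measures how far $f_h'(w) - wf_h(w)$ is from $h(w) - Eh(Z)$.

```python
import math


def simpson(g, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    step = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * step)
    return total * step / 3


def normal_mean(h, cutoff=10.0):
    """Approximate E h(Z) for Z ~ N(0,1); cutoff is an assumed truncation point."""
    pdf = lambda t: math.exp(-0.5 * t * t) / math.sqrt(2 * math.pi)
    return simpson(lambda t: h(t) * pdf(t), -cutoff, cutoff)


def stein_solution(h, w, cutoff=10.0):
    """Evaluate f_h(w) from (2.1): first form for w >= 0, second form for w < 0."""
    mean_h = normal_mean(h, cutoff)
    g = lambda t: math.exp(-0.5 * t * t) * (h(t) - mean_h)
    if w >= 0:
        return -math.exp(0.5 * w * w) * simpson(g, w, w + cutoff)
    return math.exp(0.5 * w * w) * simpson(g, w - cutoff, w)


def stein_residual(h, w, eps=1e-4):
    """f_h'(w) - w f_h(w) - (h(w) - E h(Z)), with f_h' from a central difference."""
    deriv = (stein_solution(h, w + eps) - stein_solution(h, w - eps)) / (2 * eps)
    return deriv - w * stein_solution(h, w) - (h(w) - normal_mean(h))


if __name__ == "__main__":
    for w in (-2.0, -0.5, 1.0, 2.5):
        print(w, stein_residual(math.cos, w))
```

Each form of (2.1) is used on the side where the tail integral is small, which avoids multiplying a large factor $e^{w^2/2}$ by a difference of nearly equal integrals. The residuals should be close to zero up to quadrature and finite-difference error.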
The equation
\[
EWf(W) = Ef'(W) \tag{2.2}
\]
for all $f \in \mathcal{C}_B^1$, which characterizes the standard normal distribution, is called the Stein identity for the normal distribution. In fact, if $W \sim N(0,1)$, (2.2) holds for absolutely continuous $f$ such that $E|f'(W)| < \infty$.

Let $W$ be a random variable with $EW = 0$ and $\mathrm{Var}(W) = 1$. Proposition 2.1 suggests that the distribution of $W$ is “close” to $N(0,1)$ if and only if
\[
E[f'(W) - Wf(W)] \simeq 0
\]
for $f \in \mathcal{C}_B^1$. How “close” the distribution of $W$ is to the standard normal distribution may be quantified by determining how close $E[f'(W) - Wf(W)]$ is to 0. To this end we define a distance between the distribution of $W$ and the standard normal distribution as follows:
\[
d_{\mathcal{G}}(W,Z) := \sup_{h \in \mathcal{G}} |Eh(W) - Eh(Z)| \tag{2.3}
\]
where $\mathcal{G}$ is a separating class and the distance $d_{\mathcal{G}}$ is said to be induced by $\mathcal{G}$. By a separating class $\mathcal{G}$, we mean a class of Borel measurable real-valued functions defined on $\mathbb{R}$ such that two random variables, $X$ and $Y$, have the same distribution if $Eh(X) = Eh(Y)$ for $h \in \mathcal{G}$. Such a separating class contains functions $h$ for which both $Eh(X)$ and $Eh(Y)$ exist.

Let $f_h$ be the solution, given by (2.1), of the Stein equation
\[
f'(w) - wf(w) = h(w) - Eh(Z) \tag{2.4}
\]
where $h \in \mathcal{G}$. Then we have
\[
d_{\mathcal{G}}(W,Z) = \sup_{h \in \mathcal{G}} |E[f_h'(W) - Wf_h(W)]|. \tag{2.5}
\]
So bounding the distance $d_{\mathcal{G}}(W,Z)$ is equivalent to bounding $\sup_{h \in \mathcal{G}} |E[f_h'(W) - Wf_h(W)]|$, for which we need to study the boundedness properties of $f_h$ and the probabilistic structure of $W$.

The following three separating classes of Borel measurable real-valued functions defined on $\mathbb{R}$ are of interest in normal approximation.
\[
\begin{aligned}
\mathcal{G}_W &:= \{h;\ |h(u) - h(v)| \le |u - v|\}, \\
\mathcal{G}_K &:= \{h;\ h(w) = 1 \text{ for } w \le x \text{ and } = 0 \text{ for } w > x,\ x \in \mathbb{R}\}, \\
\mathcal{G}_{TV} &:= \{h;\ h(w) = I(w \in A),\ A \text{ is a Borel subset of } \mathbb{R}\}.
\end{aligned}
\]
The distances induced by these three separating classes are respectively called the Wasserstein distance, the Kolmogorov distance, and the total variation distance.
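To make the Kolmogorov and Wasserstein distances concrete, the following sketch computes both for $W = n^{-1/2}(X_1 + \cdots + X_n)$ with independent Rademacher summands. This example and all function names are assumptions chosen for illustration and are not taken from the paper. The Wasserstein distance is evaluated through the standard one-dimensional identity $d_W(W,Z) = \int |F_W(x) - \Phi(x)|\,dx$, which is also not stated in the text, using the antiderivative $x\Phi(x) + \varphi(x)$ of $\Phi$.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # the N(0,1) law


def rademacher_sum_law(n):
    """Atoms and probabilities of W = (X_1 + ... + X_n)/sqrt(n), X_i = +-1 w.p. 1/2."""
    atoms = [(2 * k - n) / math.sqrt(n) for k in range(n + 1)]
    probs = [math.comb(n, k) / 2 ** n for k in range(n + 1)]
    return atoms, probs


def kolmogorov_distance(atoms, probs):
    """sup_x |F_W(x) - Phi(x)|; for a step function F_W it is reached at the atoms."""
    best, cdf = 0.0, 0.0
    for a, p in zip(atoms, probs):
        phi = STD_NORMAL.cdf(a)
        best = max(best, abs(cdf - phi))  # left limit of F_W at the atom
        cdf += p
        best = max(best, abs(cdf - phi))  # value of F_W at the atom
    return best


def _cdf_antiderivative(x):
    """G(x) = x * Phi(x) + phi(x), an antiderivative of Phi with G(-inf) = 0."""
    return x * STD_NORMAL.cdf(x) + STD_NORMAL.pdf(x)


def _abs_gap_integral(level, a, b):
    """Integral over [a, b] of |level - Phi(x)| for a constant level."""
    def signed(lo, hi):  # integral of Phi(x) - level over [lo, hi]
        return _cdf_antiderivative(hi) - _cdf_antiderivative(lo) - level * (hi - lo)
    if STD_NORMAL.cdf(a) >= level:
        return signed(a, b)
    if STD_NORMAL.cdf(b) <= level:
        return -signed(a, b)
    x_star = STD_NORMAL.inv_cdf(level)  # Phi crosses the level inside (a, b)
    return -signed(a, x_star) + signed(x_star, b)


def wasserstein_distance(atoms, probs):
    """Integral of |F_W - Phi| over the real line, split at the atoms of W."""
    total = _cdf_antiderivative(atoms[0])      # left tail: integral of Phi
    total += _cdf_antiderivative(-atoms[-1])   # right tail: integral of 1 - Phi
    cdf = 0.0
    for k in range(len(atoms) - 1):
        cdf += probs[k]
        total += _abs_gap_integral(cdf, atoms[k], atoms[k + 1])
    return total


if __name__ == "__main__":
    for n in (1, 10, 100):
        atoms, probs = rademacher_sum_law(n)
        print(n, kolmogorov_distance(atoms, probs), wasserstein_distance(atoms, probs))
```

For this $W$ the total variation distance to $N(0,1)$ equals 1 for every $n$, since $W$ is supported on a finite set which has probability zero under the normal law; the other two distances shrink as $n$ grows.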
It is customary to denote $d_{\mathcal{G}_W}$, $d_{\mathcal{G}_K}$ and $d_{\mathcal{G}_{TV}}$ respectively by $d_W$, $d_K$ and $d_{TV}$.

Since for each $h$ such that $|h(x) - h(y)| \le |x - y|$, there exists a sequence of $h_n \in \mathcal{C}^1$ with $\|h_n'\|_\infty \le 1$ such that $\|h_n - h\|_\infty \to 0$ as $n \to \infty$, we have
\[
d_W(W,Z) = \sup_{h \in \mathcal{C}^1,\ \|h'\|_\infty \le 1} |Eh(W) - Eh(Z)|. \tag{2.6}
\]
By an application of Lusin’s theorem, we also have
\[
d_{TV}(W,Z) = \sup_{h \in \mathcal{C},\ 0 \le h \le 1} |Eh(W) - Eh(Z)|. \tag{2.7}
\]
It is also known that
\[
\begin{aligned}
d_{TV}(W,Z) &= \frac{1}{2} \sup_{\|h\|_\infty \le 1} |Eh(W) - Eh(Z)| \\
&= \frac{1}{2} \sup_{h \in \mathcal{C},\ \|h\|_\infty \le 1} |Eh(W) - Eh(Z)|.
\end{aligned}
\]

It is generally much harder to obtain an optimal bound on the Kolmogorov distance than on the Wasserstein distance. A discussion of this, together with examples of bounding the Wasserstein distance and the Kolmogorov distance, is given in Chen [7].

We now state a proposition that concerns the boundedness properties of the solution $f_h$, given by (2.1), of the Stein equation (2.4) for $h$ either bounded or absolutely continuous with bounded $h'$. The use of these boundedness properties is crucial for bounding the Wasserstein, Kolmogorov and total variation distances.

Proposition 2.2: Let $f_h$ be the unique solution, given by (2.1), of the Stein equation (2.4), where $h$ is either bounded or absolutely continuous.

1. If $h$ is bounded, then
\[
\|f_h\|_\infty \le \sqrt{\pi/2}\,\|h - Eh(Z)\|_\infty, \qquad \|f_h'\|_\infty \le 2\|h - Eh(Z)\|_\infty. \tag{2.8}
\]
2. If $h$ is absolutely continuous with bounded $h'$, then
\[
\|f_h\|_\infty \le 2\|h'\|_\infty, \qquad \|f_h'\|_\infty \le \sqrt{2/\pi}\,\|h'\|_\infty, \qquad \|f_h''\|_\infty \le 2\|h'\|_\infty. \tag{2.9}
\]
3. If $h = I_{(-\infty,x]}$ where $x \in \mathbb{R}$, then, writing $f_h$ as $f_x$,
\[
0 < f_x(w) \le \sqrt{2\pi}/4, \qquad |wf_x(w)| \le 1, \qquad |f_x'(w)| \le 1, \tag{2.10}
\]
and for all $w, u, v \in \mathbb{R}$,
\[
|f_x'(w) - f_x'(v)| \le 1, \tag{2.11}
\]
\[
|(w+u)f_x(w+u) - (w+v)f_x(w+v)| \le (|w| + \sqrt{2\pi}/4)(|u| + |v|). \tag{2.12}
\]
4. If $h = h_{x,\epsilon}$ where $x \in \mathbb{R}$, $\epsilon > 0$, and
\[
h_{x,\epsilon}(w) =
\begin{cases}
1, & w \le x, \\
0, & w \ge x + \epsilon, \\
1 + \epsilon^{-1}(x - w), & x < w < x + \epsilon,
\end{cases}
\]
then, writing $f_h$ as $f_{x,\epsilon}$, we have for all $w, v, t \in \mathbb{R}$,
\[
0 \le f_{x,\epsilon}(w) \le 1, \qquad |f_{x,\epsilon}'(w)| \le 1, \qquad |f_{x,\epsilon}'(w) - f_{x,\epsilon}'(v)| \le 1 \tag{2.13}
\]
and
\[
\begin{aligned}
|f_{x,\epsilon}'(w+t) - f_{x,\epsilon}'(w)|
&\le (|w| + 1)|t| + \frac{1}{\epsilon} \int_{t \wedge 0}^{t \vee 0} I(x \le w + u \le x + \epsilon)\,du \\
&\le (|w| + 1)|t| + I(x - 0 \vee t \le w \le x - 0 \wedge t + \epsilon).
\end{aligned}
\tag{2.14}
\]

Except for (2.14), which can be found on page 2010 in Chen and Shao [10], the bounds in Proposition 2.2 and their proofs can be found in Lemmas 2.3, 2.4 and 2.5 in Chen, Goldstein and Shao [8].

2.2. Stein identities and error terms

Let $W$ be a random variable with $EW = 0$ and $\mathrm{Var}(W) = 1$. In addition to using the boundedness properties of the solution $f_h$ of the Stein equation (2.4), we also need to exploit the probabilistic structure of $W$ in order to bound the error term in (2.5). This is done through the construction of a Stein identity for $W$ for normal approximation. This is perhaps best understood by looking at a specific example.

Let $X_1, \ldots, X_n$ be independent random variables with $EX_i = 0$, $\mathrm{Var}(X_i) = \sigma_i^2$ and $E|X_i|^3 < \infty$. Let $W = \sum_{i=1}^n X_i$ and $W^{(i)} = W - X_i$ for $i = 1, \ldots, n$. Assume that $\mathrm{Var}(W) = 1$, which implies $\sum_{i=1}^n \sigma_i^2 = 1$. Let $f \in \mathcal{C}_B^2$. Using the independence among the $X_i$ and the property that $EX_i = 0$, we have
\[
\begin{aligned}
EWf(W) &= \sum_{i=1}^n EX_i f(W) = \sum_{i=1}^n EX_i\,[f(W^{(i)} + X_i) - f(W^{(i)})] \\
&= \sum_{i=1}^n E X_i \int_0^{X_i} f'(W^{(i)} + t)\,dt \\
&= \sum_{i=1}^n E \int_{-\infty}^{\infty} f'(W^{(i)} + t)\,\hat{K}_i(t)\,dt \\
&= \sum_{i=1}^n E \int_{-\infty}^{\infty} f'(W^{(i)} + t)\,K_i(t)\,dt
\end{aligned}
\tag{2.15}
\]
where
\[
\begin{aligned}
\hat{K}_i(t) &= X_i\,[I(X_i > t > 0) - I(X_i < t < 0)], \\
K_i(t) &= E\hat{K}_i(t).
\end{aligned}
\]
The last equality in (2.15) uses the independence of $W^{(i)}$ and $X_i$. It can be shown that for each $i$, $\sigma_i^{-2}K_i$ is a probability density function. So (2.15) can be rewritten as
\[
EWf(W) = \sum_{i=1}^n \sigma_i^2\,Ef'(W^{(i)} + T_i) \tag{2.16}
\]
where $T_1, \ldots, T_n, X_1, \ldots, X_n$ are independent and $T_i$ has the density $\sigma_i^{-2}K_i$.
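The identity (2.15) and the normalization of $K_i$ can be verified by exact enumeration in a small case. The sketch below is an illustration under assumptions made here, not taken from the paper: the summands are i.i.d. with the two-point law $P(X = 2c) = 1/3$, $P(X = -c) = 2/3$ and $c = 1/\sqrt{2n}$ (so that $EX = 0$ and $n\,\mathrm{Var}(X) = 1$), the test function is $f = \sin$, and the function names are hypothetical.

```python
import math


def k_function(values, probs, t):
    """K(t) = E X [I(X > t > 0) - I(X < t < 0)] for a discrete X."""
    total = 0.0
    for x, p in zip(values, probs):
        if x > t > 0:
            total += p * x
        elif x < t < 0:
            total -= p * x
    return total


def integrate_against_k(g, values, probs, m=400):
    """Midpoint-rule approximation of the integral of g(t) K(t) dt.

    K is piecewise constant with jumps only at 0 and at the atoms of X,
    so the rule is applied separately between consecutive breakpoints.
    """
    breaks = sorted(set(values) | {0.0})
    total = 0.0
    for a, b in zip(breaks, breaks[1:]):
        step = (b - a) / m
        for k in range(m):
            t = a + (k + 0.5) * step
            total += g(t) * k_function(values, probs, t) * step
    return total


def stein_identity_sides(n, f, f_prime):
    """Return (E W f(W), right-hand side of (2.15)) for the assumed two-point law."""
    c = 1.0 / math.sqrt(2 * n)
    values, probs = [2 * c, -c], [1 / 3, 2 / 3]

    def law(m):
        # Atoms and probabilities of a sum of m i.i.d. copies of X.
        return [(2 * c * k - c * (m - k),
                 math.comb(m, k) * (1 / 3) ** k * (2 / 3) ** (m - k))
                for k in range(m + 1)]

    lhs = sum(p * w * f(w) for w, p in law(n))
    # All summands share one law, so the sum over i is n times a single term.
    rhs = n * sum(p * integrate_against_k(lambda t: f_prime(w + t), values, probs)
                  for w, p in law(n - 1))
    return lhs, rhs


if __name__ == "__main__":
    print(stein_identity_sides(5, math.sin, math.cos))
```

In this example $K_i$ is constant, equal to $2c/3$, on $(-c, 2c)$, so $T_i$ in (2.16) is uniformly distributed on that interval and $\int K_i(t)\,dt = 2c^2 = \sigma_i^2$.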
Both the equations (2.15) and (2.16) are Stein identities for $W$ for normal approximation. From (2.16) we obtain
\[
\begin{aligned}
E[f'(W) - Wf(W)] &= \sum_{i=1}^n \sigma_i^2\,E[f'(W) - f'(W^{(i)})] \\
&\quad - \sum_{i=1}^n \sigma_i^2\,E[f'(W^{(i)} + T_i) - f'(W^{(i)})]
\end{aligned}
\tag{2.17}
\]
where the error terms on the right hand side provide an expression for the deviation of $E[f'(W) - Wf(W)]$ from 0. Now let $f$ be $f_h$ where $h \in \mathcal{C}^1$ such that $\|h'\|_\infty \le 1$. Applying Taylor expansion to (2.17), and using (2.6) and (2.9), we obtain
\[
\begin{aligned}
d_W(W,Z) &\le \sum_{i=1}^n \sup_{h \in \mathcal{C}^1,\ \|h'\|_\infty \le 1} \sigma_i^2\,E|X_i|\,\|f_h''\|_\infty
+ \sum_{i=1}^n \sup_{h \in \mathcal{C}^1,\ \|h'\|_\infty \le 1} \sigma_i^2\,E|T_i|\,\|f_h''\|_\infty \\
&\le 2\sum_{i=1}^n \sigma_i^2\,E|X_i| + 2\sum_{i=1}^n \sigma_i^2\,E|T_i|.
\end{aligned}
\]
Since $\sigma_i^2 E|X_i| \le (E|X_i|^3)^{2/3}(E|X_i|^3)^{1/3} = E|X_i|^3$ and $\sigma_i^2 E|T_i| = \sigma_i^2\,(1/2\sigma_i^2)\,E|X_i|^3 = (1/2)E|X_i|^3$, we have
\[
d_W(W,Z) \le 3\sum_{i=1}^n E|X_i|^3. \tag{2.18}
\]
The bound in (2.18) is of optimal order. However, it is much harder to obtain a bound of optimal order for the Kolmogorov distance. Proofs of such a bound on $d_K(W,Z)$ can be found in Chen [7] and Chen, Goldstein and Shao [8]. Stein identities for locally dependent random variables and other dependent random variables can be found in Chen and Shao [10] and Chen, Goldstein and Shao [8].

2.3. Integration by parts

Let $W$ be a random variable with $EW = 0$ and $\mathrm{Var}(W) = 1$. In many situations of normal approximation, the Stein identity for $W$ takes the form
\[
EWf(W) = ETf'(W) \tag{2.19}
\]
where $T$ is a random variable defined on the same probability space as $W$ such that $E|T| < \infty$, and $f$ is an absolutely continuous function for which the expectations exist. Typical examples of such situations are cases when $W$ is a functional of Gaussian random variables or of a Gaussian process. In such situations, the Stein identity (2.19) is often constructed using integration by parts as in the case of the Stein identity for $N(0,1)$. By letting $f(w) = w$, we obtain $ET = EW^2 = 1$. Let $\mathcal{F}$ be a $\sigma$-algebra with respect to which $W$ is measurable.
From (2.19),
\[
\begin{aligned}
E[f'(W) - Wf(W)] &= E[f'(W)(1 - T)] \\
&= E[f'(W)\,E(1 - T \mid \mathcal{F})].
\end{aligned}
\]
Now let $f = f_h$ where $h \in \mathcal{G}$, a separating class of functions. Assume that $\|f_h'\|_\infty < \infty$. Then by (2.3),
\[
\begin{aligned}
d_{\mathcal{G}}(W,Z) &= \sup_{h \in \mathcal{G}} |E[f_h'(W) - Wf_h(W)]| \\
&\le \Big(\sup_{h \in \mathcal{G}} \|f_h'\|_\infty\Big)\,E|E(1 - T \mid \mathcal{F})| \\
&\le \Big(\sup_{h \in \mathcal{G}} \|f_h'\|_\infty\Big)\,\sqrt{\mathrm{Var}(E(T \mid \mathcal{F}))},
\end{aligned}
\]
where for the last inequality it is assumed that $E(T \mid \mathcal{F})$ is square integrable. By Proposition 2.2,
\[
\sup_{h \in \mathcal{G}} \|f_h'\|_\infty \le
\begin{cases}
\sqrt{2/\pi}, & \mathcal{G} = \mathcal{G}_W, \\
1, & \mathcal{G} = \mathcal{G}_K, \\
2, & \mathcal{G} = \mathcal{G}_{TV}.
\end{cases}
\tag{2.20}
\]
This implies
\[
d(W,Z) \le \theta\,E|E(1 - T \mid \mathcal{F})| \le \theta\,\sqrt{\mathrm{Var}(E(T \mid \mathcal{F}))} \tag{2.21}
\]
where
\[
\theta =
\begin{cases}
\sqrt{2/\pi}, & d(W,Z) = d_W(W,Z), \\
1, & d(W,Z) = d_K(W,Z), \\
2, & d(W,Z) = d_{TV}(W,Z).
\end{cases}
\tag{2.22}
\]

For the rest of this section, we will present two approaches to the construction of the Stein identity (2.19).

Let $Z \sim N(0,1)$ and let $\psi$ be an absolutely continuous function such that $E\psi(Z) = 0$, $\mathrm{Var}(\psi(Z)) = 1$ and $E\psi'(Z)^2 < \infty$. Define $W = \psi(Z)$. Following Chatterjee [5], we use Gaussian interpolation to construct a Stein identity for $\psi(Z)$. Let $f$ be absolutely continuous with bounded derivative, which implies $|f(w)| \le C(1 + |w|)$ for some $C > 0$. Let $Z'$ be an independent copy of $Z$ and let $W_t = \psi(\sqrt{t}Z + \sqrt{1-t}Z')$ for $0 \le t \le 1$. Then we have
\[
\begin{aligned}
EWf(W) &= E(W_1 - W_0)f(W) = E f(W) \int_0^1 \frac{\partial W_t}{\partial t}\,dt \\
&= E f(W) \int_0^1 \Big(\frac{Z}{2\sqrt{t}} - \frac{Z'}{2\sqrt{1-t}}\Big)\,\psi'(\sqrt{t}Z + \sqrt{1-t}Z')\,dt.
\end{aligned}
\tag{2.23}
\]
Let
\[
\begin{aligned}
U_t &= \sqrt{t}Z + \sqrt{1-t}Z', \\
V_t &= \sqrt{1-t}Z - \sqrt{t}Z'.
\end{aligned}
\]
Then $U_t \sim N(0,1)$, $V_t \sim N(0,1)$, and $U_t$ and $V_t$ are independent. This together with $|f(w)| \le C(1 + |w|)$ implies the integrability of the right hand side of (2.23). Solving for $Z$, we obtain $Z = \sqrt{t}U_t + \sqrt{1-t}V_t$.
The equation (2.23) can be rewritten as
\[
\begin{aligned}
EWf(W) &= \frac{1}{2} \int_0^1 \frac{1}{\sqrt{t(1-t)}}\,E f(\psi(\sqrt{t}U_t + \sqrt{1-t}V_t))\,V_t\,\psi'(U_t)\,dt \\
&= \frac{1}{2} \int_0^1 \frac{1}{\sqrt{t}}\,E f'(W)\,\psi'(Z)\,\psi'(U_t)\,dt \\
&= E\Big[f'(W)\,E\Big(\int_0^1 \frac{1}{2\sqrt{t}}\,\psi'(Z)\,\psi'(\sqrt{t}Z + \sqrt{1-t}Z')\,dt \,\Big|\, Z\Big)\Big]
\end{aligned}
\]
where for the second equality we used the independence of $U_t$ and $V_t$, and applied the characterization equation for $N(0,1)$ to $V_t$. Note that the characterization equation is obtained by integration by parts. Hence we have for absolutely continuous $f$ with bounded derivative,
\[
EWf(W) = ET(Z)f'(W) \tag{2.24}
\]
where
\[
T(x) = \int_0^1 \frac{1}{2\sqrt{t}}\,E\big[\psi'(x)\,\psi'(\sqrt{t}x + \sqrt{1-t}Z')\big]\,dt. \tag{2.25}
\]
In [5], Chatterjee obtained a multivariate version of (2.24) where $\psi : \mathbb{R}^d \to \mathbb{R}$ for $d \ge 1$.

Here is a simple application of (2.21) and (2.24). Let $X_1, \ldots, X_n$ be independent and identically distributed as $N(0,1)$. Let
\[
W = \frac{\sum_{i=1}^n (X_i^2 - 1)}{\sqrt{2n}}.
\]
The random variable $W$ has the standardized $\chi^2$ distribution with $n$ degrees of freedom. Let $\psi(X_i) = \dfrac{X_i^2 - 1}{\sqrt{2n}}$. Then $W = \sum_{i=1}^n \psi(X_i)$. Let $W^{(i)} = W - \psi(X_i)$ and let $X_1', \ldots, X_n'$ be an independent copy of $X_1, \ldots, X_n$. By the independence of $X_1, \ldots, X_n$, and by (2.24) and (2.25),
\[
EWf(W) = \sum_{i=1}^n E\psi(X_i)f(W^{(i)} + \psi(X_i)) = \sum_{i=1}^n ET(X_i)f'(W) = ETf'(W)
\]

