The Annals of Applied Probability
2016, Vol. 26, No. 1, 45–72
DOI: 10.1214/14-AAP1085
© Institute of Mathematical Statistics, 2016
arXiv:1501.05467v2 [math.ST] 1 Feb 2016

A UNIFORM LAW FOR CONVERGENCE TO THE LOCAL TIMES OF LINEAR FRACTIONAL STABLE MOTIONS

By James A. Duffy
Institute for New Economic Thinking, Oxford Martin School and University of Oxford

We provide a uniform law for the weak convergence of additive functionals of partial sum processes to the local times of linear fractional stable motions, in a setting sufficiently general for statistical applications. Our results are fundamental to the analysis of the global properties of nonparametric estimators of nonlinear statistical models that involve such processes as covariates.

Received May 2014; revised October 2014.
AMS 2000 subject classifications. Primary 60F17, 60G18, 60J55; secondary 62G08, 62M10.
Key words and phrases. Fractional stable motion, fractional Brownian motion, local time, weak convergence to local time, integral functionals of stochastic processes, nonlinear cointegration, nonparametric regression.
This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Applied Probability, 2016, Vol. 26, No. 1, 45–72. This reprint differs from the original in pagination and typographic detail.

1. Introduction. Let $x_t = \sum_{s=1}^{t} v_s$ be the partial sum of a scalar linear process $\{v_t\}$, for which the finite-dimensional distributions of $d_n^{-1} x_{\lfloor nr \rfloor}$ converge to those of $X(r)$. Under certain regularity conditions, we then have the finite-dimensional convergence
\[
\mathcal{L}_n^f(a,h_n) := \frac{d_n}{n h_n} \sum_{t=1}^{n} f\left(\frac{x_t - d_n a}{h_n}\right) \rightsquigarrow_{f.d.d.} \mathcal{L}(a) \int_{\mathbb{R}} f, \tag{1.1}
\]
where $a \in \mathbb{R}$, $f$ is Lebesgue integrable, $h_n = o(d_n)$ is a deterministic sequence, and $\mathcal{L}$ denotes the occupation density (or local time; see Remark 2.5 below) associated to $X$. Convergence results of this kind are particularly well documented in the case where $\{x_t\}$ is a random walk [see the monograph by Borodin and Ibragimov (1995)], and have more recently been extended to cover generating mechanisms that allow the increments of $\{x_t\}$ to exhibit significant temporal dependence [Jeganathan (2004), Wang and Phillips (2009a)].

These more general theorems concerning (1.1) have, in turn, played a fundamental role in the study of nonparametric estimation and testing in the setting of nonlinear cointegrating models. The simplest of these models takes the form
\[
y_t = m_0(x_t) + u_t, \tag{1.2}
\]
where $\{x_t\}$ is as above, $\{u_t\}$ is a weakly dependent error process, and $m_0$ is an unknown function, assumed to possess a certain degree of smoothness (or be otherwise approximable). In a series of recent papers, (1.1) has facilitated the development of a pointwise asymptotic distribution theory for kernel regression estimators of $m_0$ under very general conditions: see especially Wang and Phillips (2009a, 2009b, 2011, 2015), Kasparis and Phillips (2012) and Kasparis, Andreou and Phillips (2012).¹

However, there are definite limits to the range of problems that can be successfully addressed with the aid of (1.1). In particular, since it concerns only the finite-dimensional convergence of $\mathcal{L}_n^f(a,h_n)$, (1.1) is suited only to studying the local behavior of a nonparametric estimator: that is, its behavior in the vicinity of a fixed spatial point. For the purpose of obtaining uniform rates of convergence for kernel regression estimators on "wide" domains (that is, on domains having a width of the same order as the range of $\{x_t\}_{t=1}^{n}$) it is manifestly inadequate. [See Duffy (2015) for a detailed account.]
The situation is even worse with regard to sieve nonparametric estimation in this setting (which initially motivated the author's research on this problem), since in this case the development of even a pointwise asymptotic distribution theory requires a prior result on the uniform consistency of the estimator, over the entire domain on which estimation is to be performed.

¹ If $\{x_t\}$ is Markov, then this distribution theory may be developed by quite different arguments, without the use of (1.1); see Karlsen and Tjøstheim (2001) and Karlsen, Myklebust and Tjøstheim (2007), which have spawned a large literature. While we consider this approach to the problem to be equally important, our results touch upon it only a little, since we work with a class of regressor processes that are typically (excepting the random walk case) non-Markov.

The main purpose of this paper is thus to provide conditions under which the finite-dimensional convergence in (1.1) can be strengthened to the weak convergence
\[
\mathcal{L}_n^f(a,h_n) \rightsquigarrow \mathcal{L}(a) \int_{\mathbb{R}} f, \tag{1.3}
\]
where $\mathcal{L}_n^f(a,h_n)$ is regarded as a process indexed by $(f,a) \in \mathcal{F} \times \mathbb{R}$, and $\{h_n\}$ may be random. Results of this kind are available in the existing literature, but only in the random walk case, which requires that the increments of $\{x_t\}$ be independent, and that $X$ be an $\alpha$-stable Lévy motion [see Borodin (1981, 1982); Perkins (1982); and Borodin and Ibragimov (1995), Chapter V]. In contrast, we allow the increments of $\{x_t\}$ to be serially correlated, such that the associated limiting process $X$ may be a linear fractional stable motion, which subsumes the $\alpha$-stable Lévy motion and fractional Brownian motion as special cases. Further, we permit the bandwidth sequence $\{h_n\}$ to be a random process, subject only to certain weak asymptotic growth conditions: this is of considerable utility in statistical applications, where the assumption that $\{h_n\}$ is a "given" deterministic sequence seems quite unrealistic.
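As an illustration of the object appearing in (1.1) and (1.3), the following sketch evaluates $\mathcal{L}_n^f(a,h)$ on a grid of spatial points for a simulated Gaussian random walk. It is not taken from the paper: the function names, the scaling $d_n = n^{1/2}$ (which matches the paper's normalisation only up to a constant), the triangular choice of $f$ and the fixed bandwidth are all assumptions made for the example. The one exact property it checks is that $\int_{\mathbb{R}} \mathcal{L}_n^f(a,h)\,da = \int_{\mathbb{R}} f$ for every sample path, which follows from the change of variables $u = (x_t - d_n a)/h$.

```python
import math
import random


def triangular(u):
    """Triangular kernel (1 - |u|) 1{|u| <= 1}; supported on [-1, 1], integral 1."""
    return max(0.0, 1.0 - abs(u))


def simulate_random_walk(n, seed=0):
    """Partial sums x_1, ..., x_n of i.i.d. N(0, 1) increments (illustrative choice)."""
    rng = random.Random(seed)
    x, path = 0.0, []
    for _ in range(n):
        x += rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def local_time_functional(path, a, h, d_n, f=triangular):
    """(d_n / (n h)) * sum_t f((x_t - d_n a) / h), following display (1.1)."""
    n = len(path)
    return d_n / (n * h) * sum(f((x - d_n * a) / h) for x in path)


def total_mass(path, h, d_n, step=0.005):
    """Riemann sum over a of the functional, for the default triangular f."""
    half_width = h / d_n  # half-width of each kernel bump, measured in units of a
    lo = min(path) / d_n - half_width - step
    hi = max(path) / d_n + half_width + step
    m = int(math.ceil((hi - lo) / step))
    return step * sum(
        local_time_functional(path, lo + i * step, h, d_n) for i in range(m + 1)
    )


n = 200
path = simulate_random_walk(n)
d_n = math.sqrt(n)  # assumed normalisation for a Gaussian random walk
mass = total_mass(path, h=2.0, d_n=d_n)
```

Here `mass` should be close to 1, the integral of the triangular kernel, whatever the seed. The identity holds path by path, so it says nothing about convergence; it only confirms that the normalisation in (1.1) makes $\mathcal{L}_n^f(\cdot,h)$ a density-like object in $a$.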
Crucial to the proof of (1.3) is a novel order estimate for $\mathcal{L}_n^f(a,1)$ when $\int_{\mathbb{R}} f = 0$, which is of interest in its own right.

The remainder of this paper is organized as follows. Our assumptions on the data generating mechanism are described in Section 2. The main result (Theorem 3.1) is discussed in Section 3. An outline of the proof follows in Section 4, together with the statement of two key auxiliary results (Propositions 4.1 and 4.2). A preliminary application of our results to the kernel nonparametric estimation of $m_0$ in (1.2) is given in Section 5. The proof of Theorem 3.1 appears in Section 6, followed in Section 7 by proofs of Propositions 4.1 and 4.2. A proof related to the application appears in Section 8. The final two Sections 9 and 10 are of a more technical nature, detailing the proofs of two lemmas required in Section 7, and so may be skipped on a first reading.

1.1. Notation. For a complete listing of the notation used in this paper, see Section H of the Supplement.² The stochastic order notations $o_p(\cdot)$ and $O_p(\cdot)$ have the usual definitions, as given, for example, in van der Vaart (1998), Section 2.2. For deterministic sequences $\{a_n\}$ and $\{b_n\}$, we write $a_n \sim b_n$ if $\lim_{n\to\infty} a_n/b_n = 1$, and $a_n \asymp b_n$ if $\lim_{n\to\infty} a_n/b_n \in (-\infty,\infty) \setminus \{0\}$; for random sequences, $a_n \lesssim_p b_n$ denotes $a_n = O_p(b_n)$. $X_n \rightsquigarrow X$ denotes weak convergence in the sense of van der Vaart and Wellner (1996), and $X_n \rightsquigarrow_{f.d.d.} X$ the convergence of finite-dimensional distributions. For a metric space $(Q,d)$, $\ell_\infty(Q)$ [resp., $\ell_{ucc}(Q)$] denotes the space of uniformly bounded functions on $Q$, equipped with the topology of uniform convergence (resp., uniform convergence on compacta). For $p \ge 1$, $X$ a random variable, and $f : \mathbb{R} \to \mathbb{R}$, $\|X\|_p := (E|X|^p)^{1/p}$ and $\|f\|_p := (\int_{\mathbb{R}} |f|^p)^{1/p}$. $BI$ denotes the space of bounded and Lebesgue integrable functions on $\mathbb{R}$. $\lfloor\cdot\rfloor$ and $\lceil\cdot\rceil$, respectively, denote the floor and ceiling functions.
$C$ denotes a generic constant that may take different values even at different places in the same proof; $a \lesssim b$ denotes $a \le Cb$.

² The Supplement is available as an addendum to arXiv:1501.05467.

2. Model and assumptions. Our assumptions on the generating mechanism are similar to those of Jeganathan (2004), who proves a finite-dimensional counterpart to our main theorem, and are comparable to those made on the regressor process in previous work on the estimation of nonlinear cointegrating regressions [see, e.g., Park and Phillips (2001), Wang and Phillips (2009b, 2012, 2015); and Kasparis and Phillips (2012)].

Assumption 1.
(i) $\{\varepsilon_t\}$ is a scalar i.i.d. sequence. $\varepsilon_0$ lies in the domain of attraction of a strictly stable distribution with index $\alpha \in (0,2]$, and has characteristic function $\psi(\lambda) := E e^{i\lambda\varepsilon_0}$ satisfying $\psi \in L^{p_0}$ for some $p_0 \ge 1$.
(ii) $\{x_t\}$ is generated according to
\[
x_t := \sum_{s=1}^{t} v_s, \qquad v_t := \sum_{k=0}^{\infty} \phi_k \varepsilon_{t-k}, \tag{2.1}
\]
and either:
(a) $\alpha \in (1,2]$, $\sum_{k=0}^{\infty} |\phi_k| < \infty$ and $\phi := \sum_{k=0}^{\infty} \phi_k \ne 0$; or
$\phi_k \sim k^{H-1-1/\alpha} \pi_k$ for some $\{\pi_k\}_{k \ge 0}$ strictly positive and slowly varying at infinity, with
(b) $H > 1/\alpha$; or
(c) $H < 1/\alpha$ and $\sum_{k=0}^{\infty} \phi_k = 0$.
In both cases (b) and (c), $H \in (0,1)$.

Remark 2.1. Part (i) implies that there exists a slowly varying sequence $\{\varrho_k\}$ such that
\[
\frac{1}{n^{1/\alpha} \varrho_n} \sum_{t=1}^{\lfloor nr \rfloor} \varepsilon_t \rightsquigarrow_{f.d.d.} Z_\alpha(r), \tag{2.2}
\]
where $Z_\alpha$ denotes an $\alpha$-stable Lévy motion on $\mathbb{R}$, with $Z_\alpha(0) = 0$. That is, the increments of $Z_\alpha$ are stationary, and for any $r_1 < r_2$ the characteristic function of $Z_\alpha(r_2) - Z_\alpha(r_1)$ has the logarithm
\[
-(r_2 - r_1)\, c\, |\lambda|^\alpha \left[ 1 + i\beta\, \mathrm{sgn}(\lambda) \tan\left(\frac{\pi\alpha}{2}\right) \right],
\]
where $\beta \in [-1,1]$ and $c > 0$; following Jeganathan (2004), page 1773, we impose the further restriction that $\beta = 0$ when $\alpha = 1$. We shall also require that $\{\varrho_k\}$ be chosen such that $c = 1$ here, which provides a convenient normalization for the scale of $Z_\alpha$.

Remark 2.2.
To permit the alternative forms of (ii) to be more concisely referenced, we shall refer to (a) as corresponding to the case where $H = 1/\alpha$; this designation may be justified by the manner in which the finite-dimensional limit of $d_n^{-1} x_{\lfloor nr \rfloor}$ depends on $(H,\alpha)$, as displayed in (2.6) below. The statement that $H < 1/\alpha$ will also be used as a shorthand for (c), that is, it will always be understood that $\sum_{k=0}^{\infty} \phi_k = 0$ in this case.

We shall treat the parameters (including $H$ and $\alpha$) describing the data generating mechanism as "fixed" throughout, ignoring the dependence of any constants on these. Let $\{c_k\}$ denote a sequence with $c_0 = 1$ and
\[
c_k = \begin{cases} \phi, & \text{if } H = 1/\alpha, \\ |H - 1/\alpha|^{-1} k^{H-1/\alpha} \pi_k, & \text{otherwise.} \end{cases} \tag{2.3}
\]
By Karamata's theorem [Bingham, Goldie and Teugels (1987), Theorem 1.5.11], $\sum_{l=0}^{k} \phi_l \sim c_k$ as $k \to \infty$. Set
\[
d_k := k^{1/\alpha} c_k \varrho_k, \qquad e_k := k d_k^{-1}, \tag{2.4}
\]
and note that the sequences $\{c_k\}$, $\{d_k\}$ and $\{e_k\}$ are regularly varying with indices $H - 1/\alpha$, $H$ and $1 - H$, respectively. Theorems 5.1–5.3 in Kasahara and Maejima (1988) yield

Proposition 2.1. Under Assumption 1,
\[
X_n(r) := \frac{1}{d_n} x_{\lfloor nr \rfloor} \rightsquigarrow_{f.d.d.} X(r), \qquad r \in [0,1], \tag{2.5}
\]
where $X$ is the linear fractional stable motion (LFSM)
\[
X(r) := \int_0^r (r-s)^{H-1/\alpha}\, dZ_\alpha(s) + \int_{-\infty}^{0} \left[ (r-s)^{H-1/\alpha} - (-s)^{H-1/\alpha} \right] dZ_\alpha(s) \tag{2.6}
\]
with the convention that $X = Z_\alpha$ when $H = 1/\alpha$; $Z_\alpha$ is an $\alpha$-stable Lévy motion on $\mathbb{R}$, with $Z_\alpha(0) = 0$.

Remark 2.3. For a detailed discussion of the LFSM, see Samorodnitsky and Taqqu (1994). When $\alpha = 2$, $Z_\alpha$ is a Brownian motion with variance 2; if additionally $H \ne 1/\alpha$, $X$ is thus a fractional Brownian motion.

Remark 2.4. Excepting such cases as the following:
(i) $\alpha \in (1,2]$, $H > 1/\alpha$ [Astrauskas (1983), Theorem 2];
(ii) $\alpha = 2$, $H = 1/\alpha$ and $E\varepsilon_0^2 < \infty$ [Hannan (1979)]; and
(iii) $\alpha = 2$, $H < 1/\alpha$ and $E|\varepsilon_0|^q < \infty$ for some $q > 2$ [Davidson and de Jong (2000), Theorem 3.1];
it may not be possible to strengthen the convergence in (2.5) to weak convergence on $\ell_\infty[0,1]$.
Weak convergence may hold, however, with respect to a weaker topology, and we shall be principally concerned with whether this topology is sufficiently strong that
\[
\inf_{r \in [0,1]} X_n(r) \rightsquigarrow \inf_{r \in [0,1]} X(r), \qquad \sup_{r \in [0,1]} X_n(r) \rightsquigarrow \sup_{r \in [0,1]} X(r), \tag{2.7}
\]
such as would follow from weak convergence in the Skorokhod $M_1$ topology [see Skorohod (1956), Section 2.2.10]. When $H = 1/\alpha$, sufficient conditions for this kind of convergence, which entail restrictions on $\{\phi_k\}$ beyond those imposed here, are given in Avram and Taqqu (1992), Theorem 2 and Tyran-Kamińska (2010), Theorem 1 and Corollary 1. However, when $H < 1/\alpha$ and $\alpha \in (0,2)$, the sample paths of $X$ are unbounded, and thus (2.7) cannot possibly hold [see Samorodnitsky and Taqqu (1994), Example 10.2.5]. In any case, (2.7) is not necessary for the main results of this paper; it merely permits Theorem 3.1 below to take a slightly strengthened form.

Remark 2.5. In consequence of Theorem 3(i) in Jeganathan (2004), the convergence in (2.5) occurs jointly with
\[
\mathcal{L}_n^f(a) := \frac{1}{e_n} \sum_{t=1}^{n} f(x_t - d_n a) \rightsquigarrow_{f.d.d.} \mathcal{L}(a) \int_{\mathbb{R}} f, \qquad a \in \mathbb{R},
\]
for every $f \in BI$. Here, $\{\mathcal{L}(a)\}_{a \in \mathbb{R}}$ denotes the occupation density (local time) of $X$, a process which, almost surely, has continuous paths and satisfies
\[
\int_{\mathbb{R}} f(x) \mathcal{L}(x)\, dx = \int_0^1 f(X(r))\, dr \tag{2.8}
\]
for all Borel measurable and locally integrable $f$. (For the existence of $\mathcal{L}$, see Theorem 0 in Jeganathan (2004); the path continuity may be deduced from Theorem 3.1 below.)

3. A uniform law for the convergence to local time. Our main result concerns the convergence
\[
\mathcal{L}_n^f(a,h_n) := \frac{1}{e_n h_n} \sum_{t=1}^{n} f\left(\frac{x_t - d_n a}{h_n}\right) \rightsquigarrow \mathcal{L}(a) \int_{\mathbb{R}} f, \tag{3.1}
\]
where $\mathcal{L}_n^f(a,h_n)$ is regarded as a process indexed by $(f,a) \in \mathcal{F} \times \mathbb{R}$. ($\mathcal{F} \times \mathbb{R}$ is endowed with the product topology, $\mathcal{F}$ having the $L^1$ topology, and $\mathbb{R}$ the usual Euclidean topology.) $\{h_n\}$ is a measurable bandwidth sequence that may be functionally dependent on $\{x_t\}$, or indeed upon any other elements of the probability space; it is required only to satisfy:

Assumption 2.
$h_n \in \mathcal{H}_n := [\underline{h}_n, \overline{h}_n]$ with probability approaching 1 (w.p.a.1), where $\overline{h}_n = o(d_n)$ and $\underline{h}_n^{-1} = o(e_n \log^{-2} n)$.

Define
\[
BI_\beta := \left\{ f \in BI \;\Big|\; \int_{\mathbb{R}} |f(x)||x|^\beta\, dx < \infty \right\} \tag{3.2}
\]
and let $BIL_\beta$ denote the subset of Lipschitz continuous functions in $BI_\beta$. In order to state conditions on $\mathcal{F} \subset BI$ that are sufficient for (3.1) to hold, we first recall some definitions familiar from the theory of empirical processes. A function $F : \mathbb{R} \to \mathbb{R}_+$ is termed an envelope for $\mathcal{F}$, if $\sup_{f \in \mathcal{F}} |f(x)| \le F(x)$ for every $x \in \mathbb{R}$. Given a pair of functions $l, u \in L^1$, define the bracket
\[
[l,u] := \{ f \in L^1 \mid l(x) \le f(x) \le u(x),\ \forall x \in \mathbb{R} \};
\]
we say that $[l,u]$ is an $\varepsilon$-bracket if $\|u - l\|_1 < \varepsilon$, and a continuous bracket if $l$ and $u$ are continuous. Let $N^*_{[\,]}(\varepsilon,\mathcal{F})$ denote the minimum number of continuous $\varepsilon$-brackets required to cover $\mathcal{F}$.

Assumption 3.
(i) $\mathcal{F} \subset BI$ has envelope $F \in BIL_\beta$, for some $\beta > 0$; and
(ii) for each $\varepsilon > 0$, $N^*_{[\,]}(\varepsilon,\mathcal{F}) < \infty$.

We may now state our main result, the proof of which appears in Section 6.

Theorem 3.1. Suppose Assumptions 1–3 hold. Then:
(i) (3.1) holds in $\ell_{ucc}(\mathcal{F} \times \mathbb{R})$;
and if additionally (2.7) holds, then
(ii) (3.1) holds in $\ell_\infty(\mathcal{F} \times \mathbb{R})$.

Remark 3.1. The case where $h_n = 1$, $\mathcal{F} = \{f\}$ and $\{x_t\}$ is a random walk (which here entails $H = 1/\alpha$ and $\phi_i = 0$ for all $i \ge 1$) has been studied extensively: see in particular Borodin (1981, 1982), Perkins (1982) and Borodin and Ibragimov (1995), Chapter V. In those works, it is proved (under these more restrictive assumptions on $\{x_t\}$) that
\[
\frac{1}{e_n} \sum_{t=1}^{\lfloor nr \rfloor} f(x_t - d_n a) \rightsquigarrow \mathcal{L}(a;r) \int_{\mathbb{R}} f
\]
on $\ell_\infty(\mathbb{R} \times [0,1])$, where $\mathcal{L}(a;r)$ denotes the local time of $X$ restricted to $[0,r]$. Theorem 3.1 could be very easily extended in this direction; we have refrained from doing so only to keep the paper to a reasonable length. The principal contribution of Theorem 3.1 is thus to extend this convergence in a direction more suitable for statistical applications, by allowing $\{v_t\}$ to be serially correlated and the bandwidth sequence $\{h_n\}$ to be data-dependent.

Remark 3.2.
After the manuscript of this paper had been completed, we obtained a copy of an unpublished manuscript by Liu, Chan and Wang (2014) in which, under rather different assumptions from those given here, a result similar to Theorem 3.1 is proved (for a fixed $f$ and a deterministic sequence $\{h_n\}$). Regarding the differences between our main result and their Theorem 2.1, we may note particularly their requirement that there exist a sequence of processes $\{X_n^*\}$ with $X_n^* \overset{d}{=} X$, and a $\delta > 0$ such that
\[
\sup_{r \in [0,1]} |X_n(r) - X_n^*(r)| = o_{a.s.}(n^{-\delta}), \tag{3.3}
\]
a condition which excludes a large portion of the processes considered in this paper, in view of Remark 2.4 above. [The availability of (3.3) permits these authors to prove their result by an argument radically different from that developed here.] On the other hand, our results do not subsume theirs, since these authors do not require $v_t$ to be a linear process.

Although Assumption 3 requires that $\mathcal{F}$ have a smooth envelope and smooth brackets, it is perfectly consistent with $\mathcal{F}$ containing discontinuous functions. Indeed, Assumption 3 is consistent with such cases as the following, as verified in Section A of the Supplement. [We expect that boundedness and $\int_{\mathbb{R}} |f(x)||x|^\beta\, dx < \infty$ could also be relaxed through the use of a suitable truncation argument, such as is employed in the proof of Theorem V.4.1 in Borodin and Ibragimov (1995).]

Example 3.1 (Single function). $\mathcal{F} = \{f\}$ where $f \in BI_\beta$, and is majorised by another function $F \in BIL_\beta$, in the sense that $|f(x)| \le F(x)$ for all $x \in \mathbb{R}$. This obtains trivially if $f$ is itself in $BIL_\beta$ (simply take $F(x) := |f(x)|$), but is also consistent with $f \in BI_\beta$ having finitely many discontinuities (at the points $\{a_k\}_{k=1}^{K}$, where $a_k < a_{k+1}$), and being Lipschitz continuous on $(-\infty,a_1) \cup [a_K,\infty)$; all that is really necessary here is for $f$ to have one-sided Lipschitz approximants. Importantly, this includes the case where $f(x) = 1\{x \in I\}$ for any bounded interval $I$.

Example 3.2 (Parametric family).
$\mathcal{F} = \{ g(x,\theta) \mid \theta \in \Theta \} \subset BIL_\beta$, where $\Theta$ is compact, and there exist a $\tau \in (0,1]$ and a $\dot{g} \in BIL_\beta$ such that
\[
|g(x,\theta) - g(x,\theta')| \le \dot{g}(x) \|\theta - \theta'\|^\tau
\]
for all $\theta, \theta' \in \Theta$.

Example 3.3 (Smooth functions). $\mathcal{F} = \{ f \in C_L^\tau(\mathbb{R}) \mid |f| \le F \}$, where $F \in BIL_\beta$ and
\[
C_L^\tau(\mathbb{R}) := \{ f \in BI \mid \exists\, C_f < L \text{ s.t. } |f(x) - f(x')| \le C_f |x - x'|^\tau\ \forall x, x' \in \mathbb{R} \}
\]
for some $\tau \in (0,1]$ and $L < \infty$.

4. Outline of proof and auxiliary results.

4.1. Outline of proof. The principal relationships between the results in this paper are summarized in Figure 1. The proof of Theorem 3.1, depicted in the top half of the figure, proceeds as follows. To reduce the difficulties arising from the randomness of $h = h_n$, we decompose
\[
\mathcal{L}_n^f(a,h) = \mathcal{L}_n^\varphi(a) \int_{\mathbb{R}} f + \left[ \mathcal{L}_n^f(a,h) - \mathcal{L}_n^\varphi(a) \int_{\mathbb{R}} f \right], \tag{4.1}
\]
where
\[
\varphi(x) := (1 - |x|)\, 1\{|x| \le 1\} \tag{4.2}
\]
denotes the triangular kernel function, and $\mathcal{L}_n^\varphi(a) := \mathcal{L}_n^\varphi(a,1)$. (This choice of $\varphi$ is made purely for convenience; any compactly supported Lipschitz function would serve our purposes equally well here.) It thus suffices to show that $\mathcal{L}_n^\varphi \rightsquigarrow \mathcal{L}$ in $\ell_\infty(\mathbb{R})$, and that the bracketed term on the right-hand side of (4.1) is uniformly negligible over $(f,h) \in \mathcal{F} \times \mathcal{H}_n$.

Fig. 1. Outline of proofs.

In view of Remark 2.5 above, the finite-dimensional distributions of $\mathcal{L}_n^\varphi$ converge to those of $\mathcal{L}$. The asymptotic tightness of $\mathcal{L}_n^\varphi$ will follow from the bound on the spatial increments
\[
\mathcal{L}_n^\varphi(a_1) - \mathcal{L}_n^\varphi(a_2) = \frac{1}{e_n} \sum_{t=1}^{n} [\varphi(x_t - d_n a_1) - \varphi(x_t - d_n a_2)] =: \frac{1}{e_n} \sum_{t=1}^{n} g_1(x_t),
\]
given in Proposition 4.1 below. The bracketed term on the right-hand side of (4.1) may be written as
\[
\frac{1}{e_n} \sum_{t=1}^{n} \left[ \frac{1}{h} f\left(\frac{x_t - d_n a}{h}\right) - \varphi(x_t - d_n a) \int_{\mathbb{R}} f \right] =: \frac{1}{e_n} \sum_{t=1}^{n} g_2(x_t). \tag{4.3}
\]
Control of (4.3) over progressively denser subsets of $\mathcal{F} \times \mathcal{H}_n$ is provided by Proposition 4.2 below; the conjunction of a bracketing argument and the continuity of the brackets suffices to extend this to the entirety of $\mathcal{F} \times \mathcal{H}_n$. By construction, both $\int_{\mathbb{R}} g_1 = 0$ and $\int_{\mathbb{R}} g_2 = 0$.
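The claim that $g_1$ and $g_2$ integrate to zero can be checked numerically. The sketch below is an illustration rather than part of the argument: the midpoint-rule integrator, the particular $f$ (an unnormalised Epanechnikov-type bump with $\int f = 4/3$), and the values standing in for $d_n a_1$, $d_n a_2$, $d_n a$ and $h$ are all assumptions made for the example.

```python
def phi(x):
    """Triangular kernel from (4.2); supported on [-1, 1], integral 1."""
    return max(0.0, 1.0 - abs(x))


def f(u):
    """Illustrative integrand (1 - u^2) on [-1, 1]; its integral is 4/3."""
    return max(0.0, 1.0 - u * u)


INT_F = 4.0 / 3.0  # exact integral of the illustrative f above


def integrate(g, lo, hi, steps=20000):
    """Midpoint rule for the integral of g over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(g(lo + (i + 0.5) * width) for i in range(steps))


def make_g1(shift1, shift2):
    """g1(x) = phi(x - shift1) - phi(x - shift2); shifts stand in for d_n a_1, d_n a_2."""
    return lambda x: phi(x - shift1) - phi(x - shift2)


def make_g2(shift, h):
    """g2(x) = f((x - shift) / h) / h - phi(x - shift) * INT_F, as in (4.3)."""
    return lambda x: f((x - shift) / h) / h - phi(x - shift) * INT_F


g1 = make_g1(shift1=0.3, shift2=1.7)
g2 = make_g2(shift=3.0, h=0.5)

# Integration ranges are chosen to contain the supports of both terms.
energy_g1 = integrate(g1, -2.0, 4.0)
energy_g2 = integrate(g2, 1.0, 5.0)
energy_phi = integrate(phi, -2.0, 2.0)  # nonzero "energy", for contrast
```

Both `energy_g1` and `energy_g2` should vanish up to quadrature error, whereas `energy_phi` should be close to 1. The cancellation in `g2` does not depend on the bandwidth, since the factor $h^{-1}$ exactly offsets the rescaling of the argument of $f$.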
The proofs of Propositions 4.1 and 4.2 may therefore be approached in a unified way, through the analysis of sums of the form
\[
S_n g := \sum_{t=1}^{n} g(x_t), \tag{4.4}
\]
where $g$ ranges over a class $\mathcal{G}$, all members of which have the property that $\int_{\mathbb{R}} g = 0$. Such functions are termed zero energy functions [Wang and Phillips (2011)]; we shall correspondingly term $\{S_n g\}_{g \in \mathcal{G}}$ a zero energy process. Such processes are "centered" in the sense that $e_n^{-1/2} S_n g$ converges weakly to a mixed Gaussian variate [Jeganathan (2008), Theorem 5]; whereas $e_n^{-1} S_n g \rightsquigarrow \mathcal{L}(0) \int_{\mathbb{R}} g$ if $\int_{\mathbb{R}} g \ne 0$.

Equation (4.4) will be handled by decomposing $S_n g$ as
\[
S_n g = \sum_{k=0}^{n-1} M_{nk} g + N_n g,
\]
where each $M_{nk} g$ is a martingale; see (7.4) below. We provide order estimates for the sums of squares and conditional variances of the $M_{nk} g$'s (Lemma 7.3); by an application of either Burkholder's inequality, or a tail bound due to Bercu and Touati (2008), these translate into estimates for the $M_{nk} g$'s themselves. Propositions 4.1 and 4.2 then follow by standard arguments.

4.2. Key auxiliary results. To state these, we introduce the quantity
\[
\|f\|_{[\beta]} := \inf\{ c \in \mathbb{R}_+ \mid |\hat{f}(\lambda)| \le c|\lambda|^\beta,\ \forall \lambda \in \mathbb{R} \} \tag{4.5}
\]
for $f \in BI$, $\beta \in (0,1]$, and $\hat{f}(\lambda) := \int_{\mathbb{R}} e^{i\lambda x} f(x)\, dx$. It is easily verified that $\|f\|_{[\beta]}$ is indeed a norm on the space $BI_{[\beta]} := \{ f \in BI \mid \|f\|_{[\beta]} < \infty \}$ (modulo
