Large Deviations for Stochastic Processes

Jin Feng
Thomas G. Kurtz

Department of Mathematics and Statistics, University of Massachusetts, Amherst, MA 01003-4515
E-mail address: [email protected]

Departments of Mathematics and Statistics, University of Wisconsin at Madison, 480 Lincoln Drive, Madison, WI 53706-1388
E-mail address: [email protected]

2000 Mathematics Subject Classification. Primary 60F10, 47H20; Secondary 60J05, 60J25, 60J35, 49L25

Key words and phrases. large deviations, exponential tightness, approximation of nonlinear semigroups, Skorohod topology, Markov processes, random evolutions, occupation measures, weakly interacting particles, viscosity solutions, comparison principle, mass transport techniques

Work supported in part by NSF Grants DMS 98-04816 and DMS 99-71571.

Abstract. General functional large deviation results for cadlag processes under the Skorohod topology are developed. For Markov processes, criteria for large deviation results are given in terms of convergence of infinitesimal generators. Following an introduction and overview, the material is presented in three parts.

• Part 1 gives necessary and sufficient conditions for exponential tightness that are analogous to conditions for tightness in the theory of weak convergence. Analogs of the Prohorov theorem given by Puhalskii, O’Brien and Vervaat, and de Acosta, then imply the large deviation principle, at least along subsequences. Representations of the rate function in terms of rate functions for the finite dimensional distributions are extended to include situations in which the rate function may be finite for sample paths with discontinuities.
• Part 2 focuses on Markov processes in metric spaces. For a sequence of such processes, convergence of Fleming’s logarithmically transformed nonlinear semigroups is shown to imply the large deviation principle in a manner analogous to the use of convergence of linear semigroups in weak convergence. In particular cases, this convergence can be verified using the theory of nonlinear contraction semigroups.
The theory of viscosity solutions of nonlinear equations is used to generalize earlier results on semigroup convergence, enabling the approach to cover a wide variety of situations. The key requirement is that a comparison principle holds. Control methods are used to give representations of the rate functions.
• Part 3 discusses methods for verifying the comparison principle and applies the general theory to obtain a variety of new and known results on large deviations for Markov processes. Applications include Freidlin-Wentzell theory for nearly deterministic processes, random walks and Markov chains, Donsker-Varadhan theory for occupation measures, random evolutions and averaging problems for stochastic systems, rescaled time-space lattice equations, and weakly interacting particle systems. The latter results include new comparison principles for a class of Hamilton-Jacobi equations in Hilbert spaces and spaces of probability measures.

July 29, 2005

Contents

Preface
Notation
Introduction
Chapter 1. Introduction
1.1. Basic methodology
1.2. The basic setting for Markov processes
1.3. Related approaches
1.4. Examples
1.5. An outline of the major obstacles
Chapter 2. An overview
2.1. Basic setup
2.2. Compact state spaces
2.3. General state spaces
Part 1. The general theory of large deviations
Chapter 3. Large deviations and exponential tightness
3.1. Basic definitions and results
3.2. Identifying a rate function
3.3. Rate functions in product spaces
Chapter 4. Large deviations for stochastic processes
4.1. Exponential tightness for processes
4.2. Large deviations under changes of time-scale
4.3. Compactification
4.4. Large deviations in the compact uniform topology
4.5. Exponential tightness for solutions of martingale problems
4.6. Verifying compact containment
4.7. Finite dimensional determination of the process rate function
Part 2. Large deviations for Markov processes and semigroup convergence
Chapter 5.
Large deviations for Markov processes and nonlinear semigroup convergence
5.1. Convergence of sequences of operator semigroups
5.2. Applications to large deviations
Chapter 6. Large deviations and nonlinear semigroup convergence using viscosity solutions
6.1. Viscosity solutions, definition and convergence
6.2. Large deviations using viscosity semigroup convergence
Chapter 7. Extensions of viscosity solution methods
7.1. Viscosity solutions, definition and convergence
7.2. Large deviation applications
7.3. Convergence using projected operators
Chapter 8. The Nisio semigroup and a control representation of the rate function
8.1. Formulation of the control problem
8.2. The Nisio semigroup
8.3. Control representation of the rate function
8.4. Properties of the control semigroup V
8.5. Verification of semigroup representation
8.6. Verifying the assumptions
Part 3. Examples of large deviations and the comparison principle
Chapter 9. The comparison principle
9.1. General estimates
9.2. General conditions in $R^d$
9.3. Bounded smooth domains in $R^d$ with (possibly oblique) reflection
9.4. Conditions for infinite dimensional state space
Chapter 10. Nearly deterministic processes in $R^d$
10.1. Processes with independent increments
10.2. Random walks
10.3. Markov processes
10.4. Nearly deterministic Markov chains
10.5. Diffusion processes with reflecting boundaries
Chapter 11. Random evolutions
11.1. Discrete time, law of large numbers scaling
11.2. Continuous time, law of large numbers scaling
11.3. Continuous time, central limit scaling
11.4. Discrete time, central limit scaling
11.5. Diffusions with periodic coefficients
11.6. Systems with small diffusion and averaging
Chapter 12. Occupation measures
12.1. Occupation measures of a Markov process - Discrete time
12.2.
Occupation measures of a Markov process - Continuous time
Chapter 13. Stochastic equations in infinite dimensions
13.1. Stochastic reaction-diffusion equations on a rescaled lattice
13.2. Stochastic Cahn-Hilliard equations on rescaled lattice
13.3. Weakly interacting stochastic particles
Appendix
Appendix A. Operators and convergence in function spaces
A.1. Semicontinuity
A.2. General notions of convergence
A.3. Dissipativity of operators
Appendix B. Variational constants, rate of growth and spectral theory for the semigroup of positive linear operators
B.1. Relationship to the spectral theory of positive linear operators
B.2. Relationship to some variational constants
Appendix C. Spectral properties for discrete and continuous Laplacians
C.1. The case of $d = 1$
C.2. The case of $d > 1$
C.3. $E = L^2(O) \cap \{\rho : \int \rho\,dx = 0\}$
C.4. Other useful approximations
Appendix D. Results from mass transport theory
D.1. Distributional derivatives
D.2. Convex functions
D.3. The p-Wasserstein metric space
D.4. The Monge-Kantorovich problem
D.5. Weighted Sobolev spaces $H^1_\mu(R^d)$ and $H^{-1}_\mu(R^d)$
D.6. Fisher information and its properties
D.7. Mass transport inequalities
D.8. Miscellaneous
Bibliography

Preface

This work began as a research paper intended to show how the convergence of nonlinear semigroups associated with a sequence of Markov processes implied the large deviation principle for the sequence. We expected the result to be of little utility for specific applications, since classical convergence results for nonlinear semigroups involve hypotheses that are very difficult to verify, at least using classical methods. We should have recognized at the beginning that the modern theory of viscosity solutions provides the tools needed to overcome the classical difficulties.
Once we did recognize that convergence of the nonlinear semigroups could be verified, the method evolved into a unified treatment of large deviation results for Markov processes, and the research “paper” steadily grew into the current volume.

There are many approaches to large deviations for Markov processes, but this book focuses on just one. Our general title reflects both the presentation in Part 1 of the theory of large deviations based on the large deviation analogue of the compactness theory for weak convergence, material that is the foundation of several of the approaches, and the generality of the semigroup methods for Markov processes.

The goal of Part 2 is to develop an approach for proving large deviations, in the context of metric-space-valued Markov processes, using convergence of generators in much the same spirit as for weak convergence (e.g. Ethier and Kurtz [36]). This approach complements the usual method that relies on asymptotic estimates obtained through Girsanov transformations. The usefulness of the method is best illustrated through examples, and Part 3 contains a range of concrete examples.

We would like to thank Alex de Acosta, Paul Dupuis, Richard Ellis, Wendell Fleming, Jorge Garcia, Markos Katsoulakis, Jim Kuelbs, Peter Ney, Anatolii Puhalskii and Takis Souganidis for a number of helpful conversations, and Peter Ney and Jim Kuelbs for organizing a long-term seminar at the University of Wisconsin - Madison on large deviations that provided much information and insight. In particular, the authors’ first introduction to the close relationship between the theory of large deviations and that of weak convergence came through a series of lectures that Alex de Acosta presented in that seminar.

Notation

(1) $(E,r)$. A complete, separable metric space.
(2) $\mathcal{B}(E)$. The σ-algebra of all Borel subsets of $E$.
(3) $A \subset M(E) \times M(E)$. An operator identified with its graph as a subset in $M(E) \times M(E)$.
(4) $B(E)$. The space of bounded, Borel measurable functions.
Endowed with the norm $\|f\| = \sup_{x \in E} |f(x)|$, $(B(E), \|\cdot\|)$ is a Banach space.
(5) $B_{loc}(E)$. The space of locally bounded, Borel measurable functions, that is, functions in $M(E)$ that are bounded on each compact.
(6) $B_\epsilon(x) = \{y \in E : r(x,y) < \epsilon\}$. The ball of radius $\epsilon > 0$ and center $x \in E$.
(7) buc-convergence, buc-approximable, buc-closure, closed and dense, and buc-lim. See Definition A.6.
(8) $C(E)$. The space of continuous functions on $E$.
(9) $C_b(E) = C(E) \cap B(E)$.
(10) $C(E,\overline{R})$. The collection of functions that are continuous as mappings from $E$ into $\overline{R}$ with the natural topology on $\overline{R}$.
(11) $C_c(E)$. For $E$ locally compact, the functions that are continuous and have compact support.
(12) $C^k(O)$, for $O \subset R^d$ open and $k = 1,2,\ldots,\infty$. The space of functions whose derivatives up to $k$th order are continuous in $O$.
(13) $C^k_c(O) = C^k(O) \cap C_c(O)$.
(14) $C^{k,\alpha}(O)$, for $O \subset R^d$ open, $k = 1,2,\ldots$, and $\alpha \in (0,1]$. The space of functions $f \in C^k(O)$ satisfying
\[
\|f\|_{k,\alpha} = \sup_{0 \le \beta \le k} \sup_{O} \left| \frac{\partial^\beta f}{\partial x^\beta} \right| + \sup_{|\beta| = k} \left\langle \frac{\partial^\beta f}{\partial x^\beta} \right\rangle_\alpha < \infty,
\]
where
\[
\langle f \rangle_\alpha = \sup_{x \ne y} \frac{|f(x) - f(y)|}{|x - y|^\alpha}, \quad \alpha \in (0,1].
\]
(15) $C^{k,\alpha}_{loc}(O)$. The space of functions $f \in C(O)$ such that $f|_D \in C^{k,\alpha}(D)$ for every bounded open subset $D \subset O$.
(16) $C_E[0,\infty)$. The space of $E$-valued, continuous functions on $[0,\infty)$.
(17) $\hat{C}(U)$, for $U$ locally compact. The space of continuous functions vanishing at infinity.
(18) $D(A) = \{f : \exists (f,g) \in A\}$. The domain of an operator $A$.
(19) $D^+(A) = \{f \in D(A), f > 0\}$.
(20) $D^{++}(A) = \{f \in D(A), \inf_{y \in E} f(y) > 0\}$.
(21) $D_E[0,\infty)$. The space of $E$-valued, cadlag (right continuous with left limit) functions on $[0,\infty)$ with the Skorohod topology, unless another topology is specified. (See Ethier and Kurtz [36], Chapter 3.)
(22) $D(O)$, for $O \subset R^d$ open. The space $C^\infty_c(O)$ with the topology giving the space of Schwartz test functions. (See D.1.)
(23) $D'(O)$. The space of continuous linear functionals on $D(O)$, that is, the space of Schwartz distributions.
(24) $g^*$ (respectively $g_*$). The upper semicontinuous (resp. lower semicontinuous) regularization of a function $g$ on a metric space $(E,r)$. The definition is given by (6.2) (resp. (6.3)).
(25) $\limsup_{n\to\infty} G_n$ and $\liminf_{n\to\infty} G_n$ for a sequence of sets $G_n$. Definition 2.4 in Section 2.3.
(26) $H^k_\rho(R^d)$, for $\rho \in P(R^d)$. A weighted Sobolev space. See Appendix D.5.
(27) $K(E) \subset C_b(E)$. The collection of nonnegative, bounded, continuous functions.
(28) $K_0(E) \subset K(E)$. The collection of strictly positive, bounded, continuous functions.
(29) $K_1(E) \subset K_0(E)$. The collection of bounded continuous functions satisfying $\inf_{x \in E} f(x) > 0$.
(30) $M(E)$. The $R$-valued, Borel measurable functions on $E$.
(31) $M^u(E)$. The space of $f \in M(E)$ that are bounded above.
(32) $M^l(E)$. The space of $f \in M(E)$ that are bounded below.
(33) $M(E,\overline{R})$. The space of Borel measurable functions with values in $\overline{R}$ and $f(x) \in R$ for at least one $x \in E$.
(34) $M_E[0,\infty)$. The space of $E$-valued measurable functions on $[0,\infty)$.
(35) $M^{d \times d}$. The space of $d \times d$ matrices.
(36) $M^u(E,\overline{R}) \subset M(E,\overline{R})$ (respectively, $C^u(E,\overline{R}) \subset C(E,\overline{R})$). The collection of Borel measurable (respectively continuous) functions that are bounded above (that is, $f \in M^u(E,\overline{R})$ implies $\sup_{x \in E} f(x) < \infty$).
(37) $M^l(E,\overline{R}) \subset M(E,\overline{R})$ (respectively, $C^l(E,\overline{R}) \subset C(E,\overline{R})$). The collection of Borel measurable (respectively continuous) functions that are bounded below.
(38) $\mathcal{M}(E)$. The space of (positive) Borel measures on $E$.
(39) $\mathcal{M}_f(E)$. The space of finite (positive) Borel measures on $E$.
(40) $\mathcal{M}_m(U)$, $U$ a metric space. The collection of $\mu \in \mathcal{M}(U \times [0,\infty))$ satisfying $\mu(U \times [0,t]) = t$ for all $t \ge 0$.
(41) $\mathcal{M}^T_m(U)$ ($T > 0$). The collection of $\mu \in \mathcal{M}(U \times [0,T])$ satisfying $\mu(U \times [0,t]) = t$ for all $0 \le t \le T$.
(42) $P(E) \subset \mathcal{M}_f(E)$. The space of probability measures on $E$.
(43) $\overline{R} = [-\infty,\infty]$.
(44) $R(A) = \{g : \exists (f,g) \in A\}$. The range of an operator $A$.
(45) $T\#\rho = \gamma$. $\gamma \in P(E)$ is the push-forward (Definition D.1) of $\rho \in P(E)$ by the map $T$.