RISK-SENSITIVE CONTROL OF REFLECTED DIFFUSION PROCESSES ON ORTHRANT

SUNIL KUMAR GAUTTAM, K. SURESH KUMAR AND CHANDAN PAL

arXiv:1701.01213v1 [math.OC] 5 Jan 2017

Abstract. In this article, we prove the existence of optimal risk-sensitive control with state constraints. We use a near monotone assumption on the running cost to prove the existence of optimal risk-sensitive control.

Keywords: Risk-sensitive control, discounted risk-sensitive control, diffusion in the orthant.

2000 Mathematics Subject Classification. Primary 93E20; secondary 60J70.

1. Introduction and Problem Description

In this paper we study the risk-sensitive control problem when the state dynamics is governed by a controlled reflecting stochastic differential equation in the $d$-dimensional orthant. We prove that the risk-sensitive value is an eigenvalue of the nonlinear eigenvalue problem with oblique boundary conditions (see equation (3.2)), which is the so-called Hamilton-Jacobi-Bellman (HJB) equation of the risk-sensitive control problem with state constraints. We also show that any minimizing selector in (3.2) corresponding to the eigenfunction of the risk-sensitive value is a risk-sensitive optimal control. We use a near monotone structural condition on the running cost and a blanket recurrence condition for the state dynamics to prove this result.

The paper is organized as follows. The remaining part of Section 1 contains the detailed description of the problem and some results on controlled reflected stochastic differential equations which are used in subsequent sections. In Section 2, we discuss an auxiliary risk-sensitive control problem with discounted cost structure. We prove the existence of optimal value and control without the near monotonicity structural condition on the running cost. In the final section, we prove our main theorem, i.e., Theorem 3.2. The proof is based on the so-called vanishing discount method.
Let $U$ be a compact metric space and let $D$ denote the positive orthant of $\mathbb{R}^d$, i.e.,
$$D := \{x \in \mathbb{R}^d : x_i > 0,\ \forall\, i = 1,2,\dots,d\}.$$
Let $\bar{A}$, $\partial A$ denote respectively the closure and boundary of the set $A$, for any subset $A$ of $\mathbb{R}^d$.

For the given functions $b : D\times U \to \mathbb{R}^d$, $\sigma : D \to \mathbb{R}^{d\times d}$ and $\gamma : \mathbb{R}^d \to \mathbb{R}^d$, consider the controlled reflected diffusion in $D$, given by the solution of the reflected stochastic differential equation (in short RSDE)
$$\begin{aligned} dX_t &= b(X_t,v_t)\,dt+\sigma(X_t)\,dW_t-\gamma(X_t)\,d\xi_t,\\ d\xi_t &= I_{\{X_t\in\partial D\}}\,d\xi_t,\\ \xi_0 &= 0,\quad X_0 = x\in D, \end{aligned} \tag{1.1}$$
where $W = (W_1,\dots,W_d)$ is an $\mathbb{R}^d$-valued standard Wiener process and $v(\cdot)$ is a $U$-valued measurable process, non-anticipative with respect to $W(\cdot)$, called an admissible control. In fact the pair $(v(\cdot),W(\cdot))$ defined on a filtered probability space $(\Omega,\mathcal{F},\{\mathcal{F}_t\},P)$ satisfying the usual hypothesis is an admissible control if and only if $v(\cdot)$ is measurable and $\{\mathcal{F}_t\}$-adapted, see Remark 2.1, p. 31 of [1]. Henceforth, all filtered probability spaces are assumed to satisfy the usual hypothesis. The set of all admissible controls is denoted by $\mathcal{A}$.

By a solution to (1.1) we mean a pair of continuous time processes $(X(\cdot),\xi(\cdot))$ satisfying (1.1) such that the process $X(\cdot)$ is $D$-valued and $\xi(\cdot)$ is a non-decreasing process which increases only when $X(\cdot)$ hits the boundary $\partial D$. The above is a special case of the more general definition of solutions of SDEs with reflection, see [8]. In fact we consider the case when the direction of reflection is single valued.

We use the relaxed control framework given as follows. The compact metric space $U = \mathcal{P}(S)$ for some compact metric space $S$, where $\mathcal{P}(S)$ denotes the space of probability measures on $S$ endowed with the Prohorov topology, i.e., the topology induced by weak convergence. The drift coefficient $b$ takes the form
$$b(x,v) = \int_S \bar{b}(x,s)\,v(ds),\quad v\in U,\ x\in D.$$

For $l = 1,2,\dots$, set $D'_l = D\cap B(0,l)$, $B(0,l) = \{x\in\mathbb{R}^d \mid \|x\| < l\}$. From the proof of Theorem A2 (ii) and the remark on p.
28 of [9], there exist open domains $D_{lm} \subseteq \mathbb{R}^d$ with $C^\infty$ boundary such that
• the distance between $\partial D'_l$ and $D_{lm}$ satisfies $d(D_{lm},\partial D'_l) < \frac{1}{m}$, $l \ge 1$,
• $D_{ln} \subseteq D_{lm}$, $n \ge m$, $l \ge 1$.
Set
$$D_m = \bigcup_{l=1}^{\infty} D_{lm},\quad m \ge 1.$$
Then we have
(i) For each $m \ge 1$, $D_m$ has $C^\infty$ smooth boundary and $D_m \downarrow \bar{D}$.
(ii) For any compact set $C \subset \bar{D}$, we have $C \subset D_{lm}$ for $m \ge 1$ and $l$ sufficiently large.

We make the following assumptions, which are sufficient to ensure the existence of a unique solution to the equation (1.1).

(A1) (i) The function $\bar{b}$ is bounded continuous, Lipschitz continuous in its first argument uniformly with respect to the second argument.
(ii) The functions $\sigma_{ij} \in C^2(\bar{D})$, $i,j = 1,\dots,d$, and are bounded.
(iii) The function $a \overset{\text{def}}{=} \sigma\sigma^{\perp}$ is uniformly elliptic with ellipticity constant $\delta$, i.e.,
$$x\,a(x)\,x^{\perp} \ge \delta|x|^2,\quad x\in D,$$
where $x^{\perp}$ denotes the transpose of the vector $x$.

(A2) (i) The function $\gamma = (\gamma_1,\dots,\gamma_d)$ is such that $\gamma_i \in C_b^2(\mathbb{R}^d)$, and there exists $\eta > 0$ such that
$$\gamma(x)\cdot n_m(x) \ge \eta\quad\text{for all } x\in\partial D_m,$$
where $n_m(\cdot)$ denotes the outward normal to $\partial D_m$.
(ii) There exists a symmetric matrix valued map $M : \mathbb{R}^d \to \mathbb{R}^d\otimes\mathbb{R}^d$, where $\mathbb{R}^d\otimes\mathbb{R}^d$ is the set of all $d\times d$ real valued matrices with the usual metric, such that $M = (m_{ij})$, $m_{ij} \in C_b(\mathbb{R}^d)\cap W^{2,\infty}(\mathbb{R}^d)$ for $i,j = 1,2,\dots,d$, and $M$ satisfies the following:
(a) there exists $\delta_1$ such that
$$x^{\perp}Mx \ge \delta_1\|x\|^2,\quad x\in\mathbb{R}^d;$$
(b) there exists $C_0 > 0$ such that
$$C_0\|x-y\|^2 + \sum_{i,j}^{d} m_{ij}(x)(x_i-y_i)\gamma_j(x) \ge 0,\quad\text{for all } x\in\partial D,\ y\in D;$$
(c) let $z\in D$; if for some $C_0 > 0$
$$C_0\|x-y\|^2 + \sum_{i,j}^{d} m_{ij}(x)(x_i-y_i)z_j(x) \ge 0,\quad\text{for all } x\in\partial D,\ y\in D,$$
then $z = \theta\gamma(x)$ for some $\theta > 0$.

The existence of a unique weak solution of (1.1) for an admissible control has been proved in [10], [14] using the following programme. First establish the existence of a unique strong solution with zero drift as follows.
• Establish the existence of a solution to (1.1) in the smooth domain $D_m$, $m \ge 1$,
• use convergence arguments to obtain a solution of (1.1) in $D$,
• establish pathwise uniqueness, see Lemma 3.3 of [2].

Now with nonzero drift, use the Girsanov transformation method to establish the existence of a unique weak solution under admissible controls, see [1], pp. 42-44. For a Markov control, one can prove the existence of a unique strong solution by adapting the approach of Zvonkin and Veretennikov, see [1], pp. 45-46 for the analogous proof for unconstrained diffusions. See Theorem 3.2 of [2] for details.

The running cost function $r : D\times U \to [0,\infty)$ is given in the relaxed framework as
$$r(x,v) = \int_S \bar{r}(x,s)\,v(ds),\quad x\in D,\ v\in U.$$
Throughout this paper we assume that the cost function $\bar{r}$ is continuous in $(x,s)$ and Lipschitz continuous in the first argument uniformly with respect to the second. We consider two risk-sensitive cost criteria, the discounted cost and the ergodic cost criteria, which are described below.

1.1. Discounted cost criterion. Let $\theta\in(0,\Theta)$ be the risk-aversion parameter. In the $\alpha$-discounted cost criterion, the controller chooses a control $v(\cdot)$ from the set of all admissible controls $\mathcal{A}$ to minimize the $\alpha$-discounted risk-sensitive cost given by
$$J_\alpha^v(\theta,x) := \frac{1}{\theta}\ln E_x^v\Big[e^{\theta\int_0^\infty e^{-\alpha t}r(X_t,v_t)\,dt}\Big],\quad x\in D, \tag{1.2}$$
where $\alpha > 0$ is the discount parameter, $X(\cdot)$ is the solution of the s.d.e. (1.1) corresponding to $v(\cdot)\in\mathcal{A}$ and $E_x^v$ denotes the expectation with respect to the law of the process (1.1) corresponding to the admissible control $v$ with the initial condition $X_0 = x$. An admissible control $v^*(\cdot)\in\mathcal{A}$ is called an optimal control if
$$J_\alpha^{v^*}(\theta,x) \le J_\alpha^{v}(\theta,x),\quad\text{for all } v(\cdot)\in\mathcal{A} \text{ and } x\in D.$$

1.2. Ergodic cost criterion. In this criterion the controller chooses a control $v(\cdot)\in\mathcal{A}$ so as to minimize the risk-sensitive accumulated cost given by
$$\rho^v(\theta,x) = \limsup_{T\to\infty}\frac{1}{\theta T}\ln E_x^v\Big[e^{\theta\int_0^T r(X_t,v_t)\,dt}\Big],\quad x\in D. \tag{1.3}$$
The definition of optimal control is analogous.
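To make the discounted criterion (1.2) concrete, the following is a minimal numerical sketch, not part of the paper's argument. It assumes a fixed stationary Markov control already folded into the hypothetical callables `drift` and `cost`, a scalar diffusion coefficient `sigma`, normal reflection at the faces of the orthant (approximated by a projected Euler step), and a finite truncation `horizon` of the infinite time integral. All function and parameter names are illustrative.

```python
import math
import random


def reflected_euler_path(x0, drift, sigma, dt, n_steps, rng):
    """Projected Euler path in the closed positive orthant.

    Assumption: reflection is normal, so an Euler step that leaves the
    orthant is pushed back by clipping negative coordinates to zero.
    """
    x = list(x0)
    path = [tuple(x)]
    sqrt_dt = math.sqrt(dt)
    for _ in range(n_steps):
        b = drift(x)
        # Unconstrained Euler step, then projection onto the orthant.
        x = [max(0.0, xi + bi * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0))
             for xi, bi in zip(x, b)]
        path.append(tuple(x))
    return path


def discounted_risk_sensitive_cost(x0, drift, cost, theta, alpha,
                                   sigma=1.0, horizon=5.0, dt=0.01,
                                   n_paths=200, seed=0):
    """Monte Carlo estimate of (1/theta) ln E[exp(theta * discounted cost)]."""
    rng = random.Random(seed)
    n_steps = int(round(horizon / dt))
    exponents = []
    for _ in range(n_paths):
        path = reflected_euler_path(x0, drift, sigma, dt, n_steps, rng)
        # Left-endpoint Riemann sum of the discounted running cost.
        integral = sum(math.exp(-alpha * k * dt) * cost(path[k]) * dt
                       for k in range(n_steps))
        exponents.append(theta * integral)
    # log-mean-exp keeps the exponential moment numerically stable.
    m = max(exponents)
    log_mean = m + math.log(sum(math.exp(e - m) for e in exponents) / n_paths)
    return log_mean / theta


if __name__ == "__main__":
    # Hypothetical data: constant inward drift and a bounded running cost.
    drift = lambda x: [-1.0 for _ in x]
    cost = lambda x: min(sum(x), 10.0)
    print(discounted_risk_sensitive_cost([1.0, 1.0], drift, cost,
                                         theta=0.5, alpha=1.0))
```

For a constant running cost the estimate reduces to the discounted Riemann sum itself, whatever the dynamics, which gives a quick sanity check of the implementation. The projection step is only a stand-in for the general oblique reflection direction $\gamma$ of (1.1).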
From now onwards, we take $\Theta = 1$ without any loss of generality.

1.3. Various subclasses of controls. An admissible control $v(\cdot)$ is said to be a Markov control if there exists a measurable map $\bar{v} : [0,\infty)\times D \to U$ such that $v(t) = \bar{v}(t,X(t))$. By an abuse of notation, the measurable map $\bar{v} : [0,\infty)\times D \to U$ itself is called a Markov control. If $\bar{v}$ has no explicit time dependence then it is said to be a stationary Markov control. We denote the sets of all Markov controls and stationary Markov controls by $\mathcal{M}$ and $\mathcal{S}$ respectively.

An admissible control $v(\cdot)$ is said to be a feedback control if it is progressively measurable with respect to $\{\mathcal{F}_t^{X,\xi}\}$, where $(X(\cdot),\xi(\cdot))$ denotes the solution of (1.1) and $\mathcal{F}_t^{X,\xi}$ denotes the sigma field generated by $\{X_s,\xi_s \mid s\le t\}$, $t\ge 0$. This is equivalent to saying that there exists a progressively measurable map $\bar{v} : [0,\infty)\times C([0,\infty);\bar{D})\times C([0,\infty);\bar{D}) \to U$ such that $v(t) = \bar{v}(t,X[0,t],\xi[0,t])$, $t\ge 0$, where $X[0,t]$, $\xi[0,t]$ denote respectively $\{X_s, 0\le s\le t\}$, $\{\xi_s, 0\le s\le t\}$. Hence by an abuse of notation, we identify the set of feedback controls with the set of all such progressively measurable maps. The following lemma tells us that we can restrict ourselves to feedback controls. Its proof is a straightforward adaptation of Theorem 2.3.4 (a), p. 52 of [1].

Lemma 1.1. Let $(v(\cdot),W(\cdot))$ be an admissible control and $(X(\cdot),\xi(\cdot))$ be a solution pair to (1.1) on a filtered probability space $(\Omega,\mathcal{F},\{\mathcal{F}_t\},P)$. Then on an augmentation $(\tilde\Omega,\tilde{\mathcal{F}},\{\tilde{\mathcal{F}}_t\},\tilde P)$ there exist an $\{\tilde{\mathcal{F}}_t\}$-Wiener process $\tilde W(\cdot)$ and a feedback control $\tilde v(\cdot)$ such that $(X(\cdot),\xi(\cdot))$ solves (1.1) for the pair $(\tilde v(\cdot),\tilde W(\cdot))$ on $(\tilde\Omega,\tilde{\mathcal{F}},\{\tilde{\mathcal{F}}_t\},\tilde P)$.

1.4. Properties of Controlled RSDEs. We prove some results about the controlled RSDE (1.1) which are used in the subsequent sections. To the best of our knowledge these results are not available for the controlled RSDEs we are considering. The first result is about the equivalence of weak solutions and the martingale problem for reflected diffusions.
For a feedback control $v(\cdot)$, we say that the RSDE (1.1) admits a weak solution if there exist a filtered probability space $(\Omega,\mathcal{F},\{\mathcal{F}_t\},P)$, an $\{\mathcal{F}_t\}$-Wiener process $W(\cdot)$ and a pair of $\{\mathcal{F}_t\}$-adapted processes $(X(\cdot),\xi(\cdot))$ with a.s. continuous paths such that $X(\cdot)$ is $D$-valued, $\xi(\cdot)$ is non-decreasing and they satisfy
$$\begin{aligned} dX(t) &= b\big(X(t),v(t,X[0,t],\xi[0,t])\big)\,dt+\sigma(X(t))\,dW(t)-\gamma(X(t))\,d\xi(t),\\ d\xi(t) &= I_{\{X(t)\in\partial D\}}\,d\xi(t),\quad X(0)=x,\ \xi(0)=0,\quad P\text{ a.s.} \end{aligned}$$
Set
$$\mathcal{H} = \{f\in C_0^2(D) \mid \nabla f\cdot\gamma \ge 0 \text{ on } \partial D\} \tag{1.4}$$
and
$$Lf(x,v) = b(x,v)\cdot\nabla f(x)+\frac{1}{2}\,\mathrm{trace}\big(a(x)\nabla^2 f(x)\big),\quad f\in\mathcal{D}(L), \tag{1.5}$$
where the domain $\mathcal{D}(L)$ of the oblique elliptic operator $L$ contains $C^2_{b,\gamma}(D)$, the set of all bounded twice continuously differentiable functions satisfying $\nabla f\cdot\gamma \ge 0$ on $\partial D$.

Constrained controlled martingale problem: A pair of $\{\mathcal{F}_t\}$-adapted processes $(X(\cdot),\xi(\cdot))$ defined on a filtered probability space $(\Omega,\mathcal{F},\{\mathcal{F}_t\},P)$ is said to solve the constrained controlled martingale problem for the RSDE (1.1) corresponding to the admissible control $v(\cdot)$ and initial condition $x\in D$ if the following hold.
(i) $X(\cdot)$ is $D$-valued, $\xi(\cdot)$ is non-decreasing and $X(0)=x$, $\xi(0)=0$ a.s.
(ii) $\int_0^t I_{\{X(s)\in\partial D\}}\,d\xi(s) = \xi(t)$, $P$ a.s. for all $t\ge 0$.
(iii) For all $f\in\mathcal{H}$,
$$M_f(t) = f(X(t))-\int_0^t Lf(X(s),v(s))\,ds+\int_0^t \nabla f\cdot\gamma(X(s))\,d\xi(s),\quad t\ge 0,$$
is an $\{\mathcal{F}_t\}$-martingale in $(\Omega,\mathcal{F},P)$.

Theorem 1.1. For a feedback control $v(\cdot)$, the pair of processes $(X(\cdot),\xi(\cdot))$ defined on a filtered probability space $(\Omega,\mathcal{F},\{\mathcal{F}_t\},P)$ solves the constrained controlled martingale problem iff there exist a filtered probability space $(\tilde\Omega,\tilde{\mathcal{F}},\{\tilde{\mathcal{F}}_t\},\tilde P)$ and a pair of processes $(\tilde X(\cdot),\tilde\xi(\cdot))$ which is a weak solution to (1.1) such that $(X(\cdot),\xi(\cdot))$ and $(\tilde X(\cdot),\tilde\xi(\cdot))$ agree in law.

Proof. Suppose $(X(\cdot),\xi(\cdot))$ solves the constrained controlled martingale problem. Hence the law of $X(\cdot)$ solves the corresponding submartingale problem.
Now using Theorem 1 of [12], there exist a filtered probability space $(\tilde\Omega,\tilde{\mathcal{F}},\{\tilde{\mathcal{F}}_t\},\tilde P)$, $\{\tilde{\mathcal{F}}_t\}$-adapted processes with continuous paths $(\tilde X(\cdot),\tilde\xi(\cdot))$ and a Wiener process $\tilde W(\cdot)$ such that $(\tilde X(\cdot),\tilde\xi(\cdot))$ is a weak solution to (1.1) and the law of $X(\cdot)$ is the same as the law of $\tilde X(\cdot)$. Now since (1.1) has a unique weak solution, the law of $(X(\cdot),\xi(\cdot))$ equals the law of $(\tilde X(\cdot),\tilde\xi(\cdot))$. The converse follows from Itô's formula. □

Remark 1.1. Under a suitable $C^2$ smoothness assumption on the domain and a bounded continuity assumption on the direction of reflection $\gamma$, the equivalence is shown in [18]. The case of domains with piecewise smooth boundaries and with constant directions of reflection is treated in [7].

For an admissible control $v(\cdot)$, if $(X(\cdot),\xi(\cdot))$ denotes a unique weak solution pair to the RSDE (1.1) on $(\Omega,\mathcal{F},\{\mathcal{F}_t\},P)$ and $\tau$ is an $\{\mathcal{F}_t\}$-stopping time, then $\mathcal{F}_\tau$ is finitely generated and hence, using Theorem 1.3.4, p. 34 of [18], it follows that a regular conditional probability distribution (rcpd) $P_\omega$ of $P$ given $\mathcal{F}_\tau$ exists. Now we prove a result analogous to Lemma 2.3.7 of [1].

Lemma 1.2. Let $(X(\cdot),\xi(\cdot))$ denote a weak solution pair corresponding to an admissible feedback control $v(\cdot)$ and defined on $(\Omega,\mathcal{F},\{\mathcal{F}_t\},P)$, and let $\tau$ be a finite $\{\mathcal{F}_t\}$-stopping time. Then the conditional law $\mu_\omega$ of the process $X(\tau+\cdot)$ given $\mathcal{F}_\tau$ is a.s. the law of the process $X_\omega(\cdot)$, where $X_\omega(\cdot)$ is a unique weak solution to the RSDE (1.1) on a probability space $(\Omega_\omega,\mathcal{F}_\omega,\{\mathcal{F}_{\omega,t}\},P_\omega)$ for an admissible control given by
$$v_\omega(t) = v\big(t+\tau(\omega),X[0,\tau(\omega)+t],\xi[0,\tau(\omega)+t]\big),\quad t\ge 0.$$

Proof. For $f\in\mathcal{H}$, since
$$M_t = f(X_t)-f(X_0)-\int_0^t Lf(X_s,v_s)\,ds+\int_0^t\nabla f\cdot\gamma(X_s)\,d\xi_s,\quad t\ge 0,$$
where $L$ is given by (1.5), is an $\{\mathcal{F}_t\}$-martingale on $(\Omega,\mathcal{F},P)$, it follows from Theorem 1.2.10, p. 28 of [18] that there exists a $P$-null set $N$ such that for $\omega\notin N$, $M_f^{\tau(\omega)}(t) = M_t-M_{t\wedge\tau(\omega)}$, $t\ge 0$, is a martingale with respect to $\{\mathcal{F}_t\}$ on $(\Omega,\mathcal{F},P_\omega)$.
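Hence under $P_\omega$,
$$M_f^{\tau(\omega)}(t) = f(X_t)-f(X_{\tau(\omega)})-\int_{\tau(\omega)}^t Lf(X_s,v_s)\,ds+\int_{\tau(\omega)}^t\nabla f\cdot\gamma(X_s)\,d\xi_s,\quad t\ge\tau(\omega),$$
is a martingale under $P_\omega$, $\omega\notin N$, i.e.,
$$M_f^{\tau(\omega)}(t+\tau(\omega)) = f(X_{t+\tau(\omega)})-f(X_{\tau(\omega)})-\int_0^t Lf\big(X(\tau(\omega)+s),v_{s+\tau(\omega)}\big)\,ds+\int_0^t\nabla f\cdot\gamma(X_{s+\tau(\omega)})\,d\xi_{s+\tau(\omega)},\quad t\ge 0,$$
is a martingale under $P_\omega$, $\omega\notin N$, i.e., $(X_\omega(\cdot),\xi_\omega(\cdot)) := \big(X(\cdot+\tau(\omega)),\ \xi(\cdot+\tau(\omega))-\xi(\tau(\omega))\big)$ solves the constrained controlled martingale problem for the admissible control $v_\omega$ and initial distribution $X(\tau(\omega))$. This completes the proof. □

Now we give a characterization for recurrence of the RSDE (1.1) corresponding to a stationary Markov control in the following lemma.

Lemma 1.3. Let $v(\cdot)\in\mathcal{S}$, let $X(\cdot)$ be a solution to the RSDE (1.1) corresponding to $v(\cdot)$ and let $B$ be a ball in $D$. Then $X(\cdot)$ is recurrent iff the PDE
$$\begin{aligned} L\varphi(x,v(x)) &= 0 \quad\text{in } \bar{B}^c,\\ \varphi &\equiv 1 \quad\text{on } \partial B,\\ \nabla\varphi\cdot\gamma &\equiv 0 \quad\text{on } \partial D \end{aligned} \tag{1.6}$$
has a unique non-negative bounded solution in $W^{2,d+1}_{loc}(\bar{B}^c)\cap C(B^c)$.

Proof. Note that $\varphi\equiv 1$ is always a positive bounded solution of (1.6) in $W^{2,d+1}_{loc}(\bar{B}^c)\cap C(B^c)$. Also an application of the Itô-Dynkin formula and Fatou's lemma implies that any bounded non-negative solution $\varphi\in W^{2,d+1}_{loc}(\bar{B}^c)\cap C(B^c)$ satisfies
$$\varphi(x) \ge P_x\big(\tau(\bar{B}^c)<\infty\big),\quad x\in D.$$
Hence the result follows, since non-degeneracy of the RSDE implies that $X(\cdot)$ is recurrent iff it is $B$-recurrent for some ball $B$ in $D$. □

1.5. Notations. In this subsection, we introduce various frequently used notations in this paper. We denote $\sup_{v,x}|r(x,v)|$ by $\|r\|_\infty$. For $\varphi\in C_b(D)$, the space of all real-valued bounded continuous functions, we denote for each $B$, a Borel subset of $D$,
$$\|\varphi\|_{\infty,B} = \sup_{x\in B}|\varphi(x)|,\qquad \|\varphi\|_\infty = \sup_{x\in D}|\varphi(x)|.$$
For a Banach space $\mathcal{X}$ with norm $\|\cdot\|_{\mathcal{X}}$ and $1\le p<\infty$, define for $\kappa\ge 0$
$$L^p(\kappa,T;\mathcal{X}) = \Big\{\varphi:(\kappa,T)\to\mathcal{X} \,\Big|\, \varphi \text{ is Borel measurable and } \int_\kappa^T\|\varphi(t)\|_{\mathcal{X}}^p\,dt<\infty\Big\}$$
with the norm
$$\|\varphi\|_{p;\mathcal{X}} = \Big[\int_\kappa^T\|\varphi(t)\|_{\mathcal{X}}^p\,dt\Big]^{1/p}.$$
The norm of the Banach space $L^\infty((\kappa,1)\times D)$ is denoted by $\|\cdot\|_{\infty;(\kappa,1)\times D}$. $C_c^\infty((\kappa,1)\times D)$ denotes the space of all functions in $C^\infty((\kappa,1)\times D)$ which are compactly supported. The spaces $C_c^\infty((\kappa,1]\times D)$, $C_c^\infty([\kappa,1]\times D)$ are similarly defined.

For $\kappa<T<\infty$ and an open bounded set $B$ in $\mathbb{R}^d$, $H^{\beta/2,\beta}([\kappa,T]\times B)$, $\kappa\ge 0$, denotes the set of all continuous functions $\varphi(t,x)$ in $[\kappa,T]\times B$ which, together with all the derivatives of the form $D_t^rD_x^s\varphi(t,x)$ for $2r+s<\beta$, have a finite norm
$$\|\varphi\|_{H^{(\beta)};[\kappa,T]\times B} = \|\varphi\|_{\infty;[\kappa,T]\times B}+H^{\beta}_{[\kappa,T]\times B}(\varphi)+\sum_{j=1}^{[\beta]}H^{j}_{[\kappa,T]\times B}(\varphi),$$
where
$$\begin{aligned} H^{j}_{[\kappa,T]\times B}(\varphi) &= \sum_{2r+s=j}\|D_t^rD_x^s\varphi\|_{\infty;[\kappa,T]\times B},\\ H^{\beta}_{[\kappa,T]\times B}(\varphi) &= H^{\beta}_{x,[\kappa,T]\times B}(\varphi)+H^{\beta/2}_{t,[\kappa,T]\times B}(\varphi),\\ H^{\beta}_{x,[\kappa,T]\times B}(\varphi) &= \sum_{2r+s=[\beta]}H^{(\beta-[\beta])}_{x,[\kappa,T]\times B}(D_t^rD_x^s\varphi),\\ H^{\beta/2}_{t,[\kappa,T]\times B}(\varphi) &= \sum_{0<\beta-2r-s<2}H^{\left(\frac{\beta-2r-s}{2}\right)}_{t,[\kappa,T]\times B}(D_t^rD_x^s\varphi),\\ H^{(\alpha)}_{x,[\kappa,T]\times B}(\varphi) &= \sup_{(t,x),(t,\bar x)\in[\kappa,T]\times B}\frac{|\varphi(t,x)-\varphi(t,\bar x)|}{\|x-\bar x\|^{\alpha}},\quad 0<\alpha<1,\\ H^{(\alpha)}_{t,[\kappa,T]\times B}(\varphi) &= \sup_{(t,x),(\bar t,x)\in[\kappa,T]\times B}\frac{|\varphi(t,x)-\varphi(\bar t,x)|}{|t-\bar t|^{\alpha}},\quad 0<\alpha<1. \end{aligned}$$
We denote
$$C^{\beta/2,\beta}([\kappa,T]\times B) = \{\varphi\in C([\kappa,T]\times B) \mid \varphi\in C^{\beta/2,\beta}([\kappa,T]\times K) \text{ for some compact subset } K \text{ of } B\}.$$
The space $W^{1,2,p}((\kappa,T)\times D)$, $\kappa\ge 0$, denotes the set of all $\varphi\in L^p(\kappa,T;W^{2,p}(D))$ such that $\frac{\partial\varphi}{\partial t}\in L^p(\kappa,T;L^p(D))$, with the norm given by
$$\|\varphi\|^p_{1,2,p;W^{2,p}(D)} = \|\varphi\|^p_{p;W^{2,p}(D)}+\Big\|\frac{\partial\varphi}{\partial t}\Big\|^p_{p;L^p(D)},\quad 1\le p<\infty.$$
Also the local Sobolev spaces $W^{1,2,p}_{loc}((\kappa,T)\times D)$ are defined by
$$W^{1,2,p}_{loc}((\kappa,T)\times D) = \big\{\varphi:(\kappa,T)\times D\to\mathbb{R} \mid \varphi \text{ is measurable and } \varphi\in W^{1,2,p}((\kappa,T)\times K) \text{ for some compact subset } K \text{ of } D\big\}.$$
For any domain $B$ in $D$, define
$$W^{1,2,p}((\kappa,T)\times B) = \big\{\varphi:(\kappa,T)\times B\to\mathbb{R} \,\big|\, \|\varphi\|_{1,2,p;(\kappa,T)\times B}<\infty\big\},$$
where the norm $\|\cdot\|_{1,2,p;(\kappa,T)\times B}$ is defined as
$$\|\varphi\|^p_{1,2,p;(\kappa,T)\times B} = \int_\kappa^T\!\!\int_B|\varphi(t,x)|^p\,dx\,dt+\int_\kappa^T\!\!\int_B\Big|\frac{\partial\varphi(t,x)}{\partial t}\Big|^p dx\,dt+\sum_i\int_\kappa^T\!\!\int_B\Big|\frac{\partial\varphi(t,x)}{\partial x_i}\Big|^p dx\,dt+\sum_{i,j}\int_\kappa^T\!\!\int_B\Big|\frac{\partial^2\varphi(t,x)}{\partial x_i\,\partial x_j}\Big|^p dx\,dt.$$

2.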
Analysis of the Discounted Cost Criterion

In this section, we study the discounted risk-sensitive control problem with the state dynamics (1.1) and cost criterion
$$J_\alpha^v(\theta,x) = \frac{1}{\theta}\ln E_x^v\Big[e^{\theta\int_0^\infty e^{-\alpha t}r(X_t,v_t)\,dt}\Big].$$
The $\alpha$-discounted risk-sensitive control problem is to minimize (1.2) over all admissible controls. We define the so-called 'value function' for the cost (1.2) as
$$\phi_\alpha(\theta,x) = \inf_{v\in\mathcal{A}}J_\alpha^v(\theta,x). \tag{2.1}$$
Set
$$\bar{J}_\alpha^v(\theta,x) = E_x^v\Big[e^{\theta\int_0^\infty e^{-\alpha t}r(X_t,v_t)\,dt}\Big]. \tag{2.2}$$
Since the logarithm is an increasing function, for fixed $\theta>0$ a minimizer of $\bar{J}_\alpha^v(\theta,x)$, if it exists, will be a minimizer of $J_\alpha^v(\theta,x)$. Corresponding to the cost (2.2), the value function is defined as
$$u_\alpha(\theta,x) = \inf_{v\in\mathcal{A}}\bar{J}_\alpha^v(\theta,x). \tag{2.3}$$
Note that
$$\phi_\alpha(\theta,x) = \frac{1}{\theta}\ln u_\alpha(\theta,x). \tag{2.4}$$
Since we are dealing with an exponential cost, we need a multiplicative version of the dynamic programming principle (DPP) in place of the additive DPP, see [6], pp. 53-59. We mimic the arguments in [15] to prove the DPP for the value function $u_\alpha(\theta,x)$.

Theorem 2.1 (DPP). Let $\tau$ be any bounded stopping time with respect to the natural filtration of the process $X(\cdot)$, i.e., $\{\mathcal{F}_t^X\}$. Then
$$u_\alpha(\theta,x) = \inf_{v(\cdot)}E_x^v\Big[e^{\theta\int_0^\tau e^{-\alpha t}r(X_t,v_t)\,dt}\,u_\alpha\big(\theta e^{-\alpha\tau},X(\tau)\big)\Big], \tag{2.5}$$
where the infimum is taken over all feedback controls.

Proof. Note that, given two feedback controls $v_1(t)$ and $v_2(t)$, $t\ge 0$, and $\tau$ as above, $v(\cdot)$ defined as
$$v(t) = v_1(t)I_{\{t<\tau\}}+v_2(t-\tau)I_{\{t\ge\tau\}},\quad t\ge 0, \tag{2.6}$$
is also a feedback control. Indeed, we are given processes $(X_1(\cdot),\xi_1(\cdot),v_1(\cdot))$ and $(X_2(\cdot),\xi_2(\cdot),v_2(\cdot))$ satisfying (1.1) on some, possibly distinct, probability spaces $(\Omega_1,\mathcal{F}_1,P_1)$, $(\Omega_2,\mathcal{F}_2,P_2)$ respectively, with $v_1(\cdot)$, $v_2(\cdot)$ in feedback form. Also, $X_1(0)=x$ and the law of $X_2(0)$ equals the law of $X_1(\tau)$, where $\tau$ is a prescribed stopping time with respect to the natural filtration of the process $X_1(\cdot)$.
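By augmenting $(\Omega_1,\mathcal{F}_1,P_1)$ suitably, one can construct processes $(X(\cdot),\xi(\cdot))$ and $v(\cdot)$ satisfying (1.1) such that they coincide with $(X_1(\cdot),\xi_1(\cdot))$ and $v_1(\cdot)$ on $[0,\tau]$, and $(X(\tau+\cdot),\xi(\tau+\cdot))$ and $v(\tau+\cdot)$ agree in law with $(X_2(\cdot),\xi_2(\cdot))$ and $v_2(\cdot)$. Also the conditional law of $X(\tau+\cdot)$ given $\mathcal{F}_\tau$ is the same as its conditional law given $X(\tau)$ and agrees with the conditional law of $X(\tau+\cdot)$ given $X_2(0)$ a.s. with respect to the common law of $X_2(0)$, $X(\tau)$. The above construction uses Lemma 1.2.

Let $\epsilon>0$. Let $X(\cdot)$ be a process (1.1) controlled by $v(\cdot)$ as above, with $v_1(\cdot)$ an arbitrary feedback control and $v_2(\cdot)$ an $\epsilon$-optimal feedback control for initial data $X(\tau)$. By (2.3) we have
$$\begin{aligned} u_\alpha(\theta,x) &\le E_x^v\Big[e^{\theta\int_0^\tau e^{-\alpha t}r(X_t,v_t)\,dt+\theta\int_\tau^\infty e^{-\alpha t}r(X_t,v_t)\,dt}\Big]\\ &= E_x^v\Big[e^{\theta\int_0^\tau e^{-\alpha t}r(X_t,v_t)\,dt}\times e^{\theta e^{-\alpha\tau}\int_0^\infty e^{-\alpha t}r(X_{t+\tau},v_{t+\tau})\,dt}\Big]\\ &= E_x^v\Big[e^{\theta\int_0^\tau e^{-\alpha t}r(X_t,v_t)\,dt}\,E\Big[e^{\theta e^{-\alpha\tau}\int_0^\infty e^{-\alpha t}r(X_{t+\tau},v_{t+\tau})\,dt}\,\Big|\,X(\tau)\Big]\Big]\\ &\le E_x^v\Big[e^{\theta\int_0^\tau e^{-\alpha t}r(X_t,v_t)\,dt}\Big(u_\alpha\big(\theta e^{-\alpha\tau},X(\tau)\big)+\epsilon\Big)\Big]\\ &= E_x^v\Big[e^{\theta\int_0^\tau e^{-\alpha t}r(X_t,v_t)\,dt}\,u_\alpha\big(\theta e^{-\alpha\tau},X(\tau)\big)\Big]+\epsilon\,E_x^v\Big[e^{\theta\int_0^\tau e^{-\alpha t}r(X_t,v_t)\,dt}\Big]. \end{aligned}$$
Since $\tau$, $r$ are bounded and $\epsilon$ is arbitrary we get
$$u_\alpha(\theta,x) \le \inf_{v(\cdot)}E_x^v\Big[e^{\theta\int_0^\tau e^{-\alpha t}r(X_t,v_t)\,dt}\,u_\alpha\big(\theta e^{-\alpha\tau},X(\tau)\big)\Big].$$
Conversely, let $\epsilon>0$ and let $v(\cdot)$ be an $\epsilon$-optimal feedback control for initial data $X(0)=x$. Then
$$\begin{aligned} u_\alpha(\theta,x)+\epsilon &\ge E_x^v\Big[e^{\theta\int_0^\tau e^{-\alpha t}r(X_t,v_t)\,dt+\theta\int_\tau^\infty e^{-\alpha t}r(X_t,v_t)\,dt}\Big]\\ &= E_x^v\Big[e^{\theta\int_0^\tau e^{-\alpha t}r(X_t,v_t)\,dt}\,E\Big[e^{\theta e^{-\alpha\tau}\int_0^\infty e^{-\alpha t}r(X_{t+\tau},v_{t+\tau})\,dt}\,\Big|\,X(\tau)\Big]\Big]\\ &\ge E_x^v\Big[e^{\theta\int_0^\tau e^{-\alpha t}r(X_t,v_t)\,dt}\,\inf_{v(\cdot)}E\Big[e^{\theta e^{-\alpha\tau}\int_0^\infty e^{-\alpha t}r(X_t,v_t)\,dt}\,\Big|\,X(\tau)\Big]\Big]\\ &= E_x^v\Big[e^{\theta\int_0^\tau e^{-\alpha t}r(X_t,v_t)\,dt}\,u_\alpha\big(\theta e^{-\alpha\tau},X(\tau)\big)\Big]. \end{aligned}$$
Thus
$$u_\alpha(\theta,x)+\epsilon \ge \inf_{v(\cdot)}E_x^v\Big[e^{\theta\int_0^\tau e^{-\alpha t}r(X_t,v_t)\,dt}\,u_\alpha\big(\theta e^{-\alpha\tau},X(\tau)\big)\Big].$$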
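The rescaled risk parameter $\theta e^{-\alpha\tau}$ appearing in (2.5) comes from a change of variable in the tail of the discounted integral. The following worked step is a sketch of that calculation, which the proof uses without writing out; the substitution $s = t-\tau$ is the only ingredient added here.

```latex
\begin{aligned}
\theta\int_\tau^\infty e^{-\alpha t}\, r(X_t,v_t)\,dt
  &= \theta\int_0^\infty e^{-\alpha(s+\tau)}\, r(X_{s+\tau},v_{s+\tau})\,ds
     \qquad (s = t-\tau)\\
  &= \big(\theta e^{-\alpha\tau}\big)\int_0^\infty e^{-\alpha s}\, r(X_{s+\tau},v_{s+\tau})\,ds .
\end{aligned}
```

The tail of the cost is therefore again a discounted cost of the form (2.2), for the process restarted at $X(\tau)$ and with risk parameter $\theta e^{-\alpha\tau}$ in place of $\theta$. This is why the value function has to be tracked as a function of $\theta$ as well as $x$.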
