SOLVING TWO-POINT BOUNDARY VALUE PROBLEMS FOR A WAVE EQUATION VIA THE PRINCIPLE OF STATIONARY ACTION AND OPTIMAL CONTROL ∗ PETER M. DOWER† AND WILLIAM M. MCENEANEY‡ Abstract. A new approach to solving two-point boundary value problems for a wave equation is developed. This new approach exploits the principle of stationary action to reformulate and solve such problems in the framework of optimal control. In particular, an infinite dimensional optimal control problem is posed so that the wave equation dynamicsandtemporalboundarydataofinterestarecapturedviathecharacteristicsoftheassociated Hamiltonianand choice of terminal payoff respectively. In order to solve this optimal control problem for any such terminal payoff, and hence solve any two-point boundary value problem corresponding to the boundary data encapsulated by that terminal 5 payoff, a fundamental solution to the optimal control problem is constructed. Specifically, the optimal control problem 1 corresponding to any given terminal payoff can be solved via a max-plus convolution of this fundamental solution with 0 the specified terminal payoff. Crucially, the fundamental solution is shown to be a quadratic functional that is defined 2 with respect to the unique solution of a set of operator differential equations, and computable using spectral methods. An example is presented in which this fundamental solution is computed and applied to solve a two-point boundary n valueproblem for thewave equation of interest. a J Key words. Two-pointboundaryvalueproblems,waveequation, stationaryaction,optimalcontrol,fundamental solution. 8 AMS subject classifications. 49LXX,93C20,35L53, 47F05,47D06. ] C 1. Introduction. The principle of stationary action, or action principle, states that any trajectory gen- O erated by a conservative system must render the corresponding action functional stationary in the calculus of . variations sense, see for example [10, 11, 12]. As this action functional is defined as the time integral of the h t associated Lagrangian, it may be regarded as the payoff due to a unique trajectory generated by some gen- a eralized system dynamics, and corresponding to a specified initial system state. By regarding the velocity of m these generalized dynamics as an input, the action principle may be expressed as an optimal control problem. [ Recent work by the authors has exploited this correspondence with optimal control to develop a fundamental 1 solution to the classicalgravitationalN-body problem, see [15, 16]. This fundamental solution is a special case v of a more general notion of a fundamental solution semigroup developed for optimal control problems, see for 6 example [14, 7, 20, 8]. In the specific case of the gravitationalN-body problem,by constructing a fundamental 0 solution to the optimal control problem corresponding to stationary action, a fundamental solution to a class 0 2 of N-body two-point boundary value problems (TPBVPs) may also be constructed. In this paper, the corre- 0 sponding fundamentalsolutionconstructionfora classofTPBVPsis extendedviainfinite dimensionalsystems . theory to a wave equation [19, 3, 4, 5, 6, 7]. The specific wave equation considered is expressed via the partial 1 0 differential equation (PDE) and boundary data 5 ∂2u κ ∂2u 1 = . u(,0)=0=u(,L), L R . (1.1) : ∂s2 m ∂λ2 · · ∈ >0 v (cid:16) (cid:17) In a mechanical setting (for example), u(s,λ) may be interpreted as the displacement of a vibrating string at i X time s [0,¯t], t¯ R , and location λ Λ, Λ =. (0,L). Here, constants κ,m R model the distributed >0 >0 r elastic s∈pring con∈stant and mass respecti∈vely (with SI units of N and kgm 1). A∈n example pair of initial and a − terminal conditions defining a TPBVP of interest is u(0, )=x(), u(t, )=z(), (1.2) · · · · in which x and z denote the initial and terminal displacements. The problem to solve is then Find the initial velocity ∂u(0, ) . ∂s · TPBVP(t,x,z)= (if it exists) such that (1.1) and (1.2) (1.3) hold with functions x and z given. ∗ThisresearchwaspartiallysupportedbyAFOSR,NSF,andtheAustralianResearchCouncil. Preliminaryresultscontributing tothispaperappearin[6]. †Department of Electrical & Electronic Engineering, University of Melbourne, Parkville, Victoria 3010, Australia. Email: [email protected] ‡Department of Mechanical & Aerospace Engineering, University of California at San Diego, 9500 Gilman Dr., La Jolla, CA 92093-04111, USA.Email: [email protected] 1 (Another example of a TPBVP of interest is to determine the initial velocity such that a desired terminal velocity is attained.) In order to formulate the action principle for system (1.1), note that the potential and kinetic energiesassociatedwith asolutionu(s, )of(1.1)attime s [0,t]arerespectivelydenotedbyV(u(s, )) and T(∂u(s, )), where V and T are the functio·nals defined by ∈ · ∂s · 2 2 ∂u ∂u ∂u V(u(s, ))= κ (s,λ) dλ, T (s, ) = m (s,λ) dλ. (1.4) · 2 ∂λ ∂s · 2 ∂s ZΛ(cid:12) (cid:12) (cid:18) (cid:19) ZΛ(cid:12) (cid:12) (cid:12) (cid:12) (cid:12) (cid:12) The action principle states that any(cid:12)solution(cid:12)u of (1.1) must render the action(cid:12)function(cid:12)al (cid:12) (cid:12) (cid:12) (cid:12) t ∂u V(u(s, )) T (s, ) ds (1.5) · − ∂s · Z0 (cid:18) (cid:19) stationary in the sense of the calculus of variations [12], where V and T denote the energy functionals as per (1.4),andtheintegrandistheLagrangianoritsadditiveinverse(asselectedhere). Thisincludesanysolutionof theTPBVP(1.3). Hence,byformulatinganappropriateoptimalcontrolproblemencapsulatingthisvariational problem, solutions of the TPBVP (1.3) may be investigated. In particular, a fundamental solution to the TPBVP (1.3) can be constructed within the framework of infinite dimensional optimal control, using a more general notion of fundamental solution semigroup for optimal control [14, 7, 20, 8]. The attendant optimal controlproblemandsubsequentTPBVPfundamentalsolutionisformulatedanddevelopedinSections2and3. Useful auxiliary optimal control problems, and their interrelationships,are employed in this development. The application of this fundamental solution is then considered in the context of an illustrative example in Section 5. Selected technical details of relevance to the development are included in the appendices. In terms of the notation, R, R , and R denotes the sets of reals, non-negative reals, and positive reals 0 >0 respectively. Given an open subse≥t D of a Euclidean space and a Banach space Z, the respective spaces of continuous,continuously differentiable, andLebesgue squareintegrable functions mapping D to Z are denoted by C(D; Z), C1(D;Z), and L (D;Z). Symbols ∂ and ∂2 denote first and second order differentation for 2 functions defined on Λ. An operator :X Y between Banach spaces X and Y is Fr´echet differentiable at x X if there exists a bounded lineaOr opera→tor d (x) (X;Y) such that the limit lim (x+h) (∈x) d (x)h Y/ h X exists and is zero. O ∈L khkX→0kO − O − O k k k 2. Approximating stationary action via optimal control. Where the action functional is concave or convex, the action principle can be formulated as an optimal control problem, see for example [15, 16]. However,thisconvexityorconcavity,correspondingtothatofthe payofforcostfunctional,islimitedtoafinite time horizon that is determined by parameters associated with the kinetic and potential energies. In the finite dimensionalcase,this limited time horizonisstrictly positive,sothatthe conservativedynamics definedbythe actionprinciplecanbepropagatedviasolutionoftheoptimalcontrolproblemuptothattimehorizon. However, intheinfinitedimensionalcaseconsideredhere,thislimitedtimehorizontendstozero,seeTheorem2.1and[6], therebycomplicatingthedirectapplicationoftheapproachof[15,16]. Inordertoovercomethiscomplication,a perturbedoptimalcontrolproblemisformulatedthatapproximatesthe stationaryactionprincipleonastrictly positive time horizon, thereby allowing the solution of the TPBVP (1.3) to be approximated on that time horizon. By concatenating such horizons via the dynamic programming principle, solutions on longer horizons canalsobeapproximated. Suchapproximationsareshown(usingwell-knownsemigroupapproximationresults) to be exact in the limit of vanishing perturbations, see Section 4. 2.1. Preliminaries. Define an L and Sobolev space by 2 x, ∂x absolutely continuous, . . X =L (Λ;R), X = x X x(0)=0=x(L), , (2.1) 2 0 ∈ (cid:12)(cid:12) ∂2x X (cid:12) ∈ (cid:12) and let , and denote the standard L2 inner produ(cid:12)(cid:12)ct and norm on X. A specific unbounded operator of interehstiin conks·idkering the wave equation (1.1) is densely defined on X by A . . x=( x)()= ∂2x(), dom( )=X X, X =X . (2.2) 0 0 A A · − · A ⊂ 2 Operator is closed, positive, self-adjoint, and boundedly invertible, and has a unique, positive, self-adjoint, boundedlyAinvertible square root, denoted by 12. The inverse of this square root is denoted by =. ( 12)−1 (X). See Appendix A (Lemma A.1), [3] (EAxample 2.2.5, Lemma A.3.73, and Examples A.4.J3 andAA.4.26∈), L and also [2]. These properties admit the definition of Hilbert spaces X1 =. dom( 12), x, ξ 1 =. 12 x, 12 ξ , x,ξ X1 , (2.3) 2 A h i2 hA A i ∀ ∈ 2 . . Y1 =X1 X , (x,p),(ξ,π) =m x, ξ 1 + 1 p, π , x,ξ X1, p,π X . (2.4) 2 2 ⊕ h i⊕ h i2 κh i ∀ ∈ 2 ∈ The corresponding norms are denoted by 1 and . Similarly, it is also convenient to define the set k·k2 k·k⊕ . Y0 =X0 X1 Y1 . (2.5) ⊕ 2 ⊂ 2 1 Operators and 2 are Riesz-spectral operators, see Appendix B and [3]. Define orthonormal Riesz bases A A . . B ={ϕn}∞n=1, ϕn(·)= L2 sin(nLπ ·), (2.6) B˜=. {ϕ˜n}∞n=1 ⊂X21 , ϕ˜n(·)=. q√n2πL sin(nLπ ·), for X and X1 respectively (see Lemma A.3). The input space for the optimal control problem of interest is 2 . W[r,t]=L2([r,t];X1) (2.7) 2 for all t R , r [0,t]. The corresponding norm is defined by w 2 =. t w(s) 2 ds. ∈ ≥0 ∈ k kW[r,t] r k k21 2.2. Approximating optimal control problem. In order to formuRlate the action principle for the conservative infinite dimensional dynamics of (1.1), define the abstract Cauchy problem [19, 3] by ξ˙(s)=w(s), ξ(0)=x X1 , (2.8) ∈ 2 in which ξ(s) denotes the infinite dimensionalstate at time s [0,t]that has evolvedfrominitial state x X1 ∈ ∈ 2 inthepresenceofinputw W[0,s]. Thederivativein(2.8)isofFr´echettype,definedwithrespecttothenorm ∈ 1. The mild solution [19, 3] of (2.8) is defined as k·k2 s ξ(s)=x+ w(σ)dσ (2.9) Z0 for all x X1, w W[0,s], s [0,t]. In view of these dynamics, define the payoff (action) functional ∈ 2 ∈ ∈ Jmµ,ψ :[0,t¯)×X12 ×W[0,t¯)→R for some t¯∈R>0 by t Jµ (t,x,w)=. κ ξ(s) 2 m µw(s) 2ds+ψ(ξ(t)), (2.10) m,ψ Z0 2 k k12 − 2 kJ k12 in which κ, m R are physical constants as per (1.1), µ R is a real-valued perturbation parameter, >0 >0 µ :X1 X0 ∈X1 is a bounded linear operator given by ∈ J 2→ × 2 . µw= J w, (2.11) J µ (cid:20) I (cid:21) is the identity operator on X1, and ψ : X1 R is any concave terminal payoff. In the integrand in (2.10), I 2 2→ notethat ξ(s) 1 ∂ξ(s) iswell-definedasξ(s) X1 foreachs [0,t]. Notealsothat w(s) 1 w(s) k k2 ≡k k ∈ 2 ∈ kJ k2 ≡k k is well-defined as ran( )=X1. Consequently, (2.10) approximates the action functional (1.5) as J 2 t ∂u V(u(s, )) Tµ (s, ) ds (2.12) · − ∂s · Z0 (cid:18) (cid:19) 3 where Tµ :X1 R is the approximate kinetic energy functional defined analogously to (1.4) by 2→ 2 2 ∂u . ∂u ∂u Tµ (s, ) = m (s, ) +µ2 (s, ) = m µw(s) 2 . (2.13) (cid:18)∂s · (cid:19) 2 (cid:13)∂s · (cid:13) (cid:13)∂s · (cid:13)12! 2 kJ k21 (cid:13) (cid:13) (cid:13) (cid:13) (cid:13) (cid:13) (cid:13) (cid:13) Note in particular that the 2 term int(cid:13)roduces a(cid:13)penalt(cid:13)y on spat(cid:13)ial ripples in ∂u(s, ) for µ = 0. This term k·k21 ∂s · 6 vanishes for µ=0, so that T =T0 by inspection of (1.4) and (2.13). In order to ensure that the optimal control problem defined via the payoff functional Jµ of (2.10) has a m,ψ finite value, itis criticalto establishthe existence of at¯µ R in (2.10) suchthat Jµ (t,x, ) is either convex or concave for all t [0,¯tµ) and x X1. To this end, i∈t m>ay0 be shown (see Theorme,mψ 2.1 a·t the end of this section)thattheseco∈nddifference∆∈Jµ 2(t,x,w,δw˜)=. Jµ (t,x,w+δw˜) 2Jµ (t,x,w )+Jµ (t,x,w δw˜) m,ψ ∗ m,ψ ∗ − m,ψ ∗ m,ψ ∗− of Jmµ,ψ(t,x,·) for an input w∗ ∈W[0,t] in direction w˜ ∈W[0,t], δ ∈R (with δkw˜kW[0,t] 6=0) satisfies ∆Jµ (t,x,w,δw˜) δ2 mµ2 κ(t2) w˜ 2 <0 (2.14) m,ψ ∗ ≤− − 2 k kW[0,t] h i for all t [0,¯tµ), provided the terminal payoff ψ is concave, where ∈ t¯µ =. µ(2m)12 . (2.15) κ That is, the payoff functional Jµ (t,x, ) of (2.10) is strictly concave under these conditions. Consequently, the approximate action principlem(,mψodifie·d to include a concaveterminal payoffψ, and perturbed by µ R ) >0 may be expressed via the value function Wµ :R 0 X1 R, ∈ ≥ × 2→ Wµ(t,x)=. sup Jµ (t,x,w). (2.16) m,ψ w W[0,t] ∈ By interpreting (2.16) as an optimal control problem, it is shown that the state feedback characterization of the optimal (velocity) input for the approximate action principle is defined via w (s) = k(s,ξ (s)), where ∗ ∗ k(s,isx)a=.selm1f-aAd12joIinµtAb12o∇unxdWedµ(litn−easr,xop).erHaetorer,tξh∗a(t·)adpepnrootxeismtahteestrtahjeecitdoernyti(t2y.8f)orcosrmreaslploµndinRg to(intopubtewd∗e,fiannedd µ >0 I ∈ later). Consequently,byselectingaterminalpayoffthatforcestheterminaldisplacementξ(t)toz (fixedapriori as per (1.3)), the corresponding initial velocity required to achieve this terminal displacement is shown to be w(0)= m1 A21 IµA12 ∇xWµ(t,ξ∗(0))= m1 A21 IµA12 ∇xWµ(t,x). The characteristic equations corresponding to the Hamiltonian associated with (2.16) imply that this initial velocity determines the corresponding initial momentum costate. Here, it is convenient to define a scaled . 1 . 1 costate π(s) = m µ−2 w(s), so that the initialization π(0) = p = m µ−2 w(0) ultimately yields the terminal I I displacement z after evolution of the state and costate dynamics to time t (0,t¯µ). This evolution is governed ∈ by the abstract Cauchy problem 1 (cid:18) πξ˙˙((ss)) (cid:19)=A⊕µ (cid:18) πξ((ss)) (cid:19) , A⊕µ =. κ 210 µ21 12 m10Iµ2 ! , dom(A⊕µ)=. Y21 , (2.17) − A I A in which ξ(s) and π(s) denote the state and costate at time s [0,t], evolvedfrom ξ(0)=x and π(0)=p. The uniformly continuous semigroup of bounded linear operators T∈µ⊕(s)∈ L(Y21) generated by A⊕µ ∈L(Y12) yields solutions of (2.17) of the form ξ(s) ξ = (s) , (2.18) π(s) Tµ⊕ π (cid:18) (cid:19) (cid:18) (cid:19) for all s [0,t], with ξ solving an approximationof the wave equation (1.1) given by ∈ ξ¨(s)=−(mκ)A12 IµA12 ξ(s) (2.19) 4 for s R . Furthermore, is shown to converge strongly (as µ 0) to an unbounded, closed, and densely define∈dop≥e0rator ⊕ onY0 =A. ⊕µdom( ⊕)=. X0 X1 . Thisoperatord→efinestherelatedabstractCauchyproblem A A × 2 x˙(s) x(s) . 0 1 . . p˙(s) =A⊕ p(s) , A⊕ = κ m0I , dom(A⊕)=Y0 =X0⊕X12 , (2.20) (cid:18) (cid:19) (cid:18) (cid:19) (cid:18) − A (cid:19) and is the generator of the C0-semigroup of bounded linear operators ⊕(t) (Y1), t R 0, yielding all T ∈ L 2 ∈ ≥ solutions of (2.20) of the form x(s) x = (s) , (2.21) p(s) T⊕ p (cid:18) (cid:19) (cid:18) (cid:19) in which x(s) and p(s) denote the state and costate analogouslyto (2.18). Crucially, the state x is the solution is the wave equation (1.1) itself. As the first Trotter-Kato theorem (e.g. [9]) implies that the semigroup (t) converges strongly to (t) for t R on bounded intervals, solutions (2.18) of (2.19) converge to soluTtµ⊕ions ⊕ 0 T ∈ ≥ (2.21) of (1.1) as µ 0. In this sense, solutions of the TPBVP (1.3) defined with respect to the waveequation → (1.1) may be approximated via the optimal control problem defined by (2.16). Where the terminal velocity is specified (rather than the terminal position as in (1.3)), this same approach may be applied by employing the terminal payoff . ψ(x)=ψv(x)=m v, x 1 . (2.22) hJ J i2 In that case, the terminal momentum costate is given by π(t)= 12 µ21 12 xWµ(0,ξ∗(t))= 21 µ12 21 xψv(ξ∗(t))=m µ12 v. A I A ∇ A I A ∇ I Hence, by solving the optimal control problem (2.16) defined with respect to terminal payoff ψ of (2.22), the v infinite dimensional dynamics of (2.19) can be propagatedforwardfrom a knowninitial position x X0 X1 ∈ ⊂ 2 to a known terminal velocity ξ˙(t)= v X . As in the terminal position case, this approximationconverges µ 0 I ∈ to the actual wave equation dynamics satisfying ξ˙(t) = v X1 as µ 0. The rigorous development yielding ∈ 2 → this conclusion commences with a theorem concerning the concavity of the payoff functional Jµ of (2.10). m,ψ Theorem 2.1. Given t R>0, x X1, and concave terminal payoff ψ : X1 R, the payoff functional Jµ (t,x, )of(2.10)isstrictly∈concave.∈Inpa2rticular, theseconddifference∆Jµ (t2,→x,w,δw˜)=. Jµ (t,x,w+ m,ψ · m,ψ ∗ m,ψ ∗ δw˜) 2Jµ (t,x,w )+Jµ (t,x,w δw˜) of the payoff functional Jµ (t,x, ) at any w W[0,t] is strictly nega−tive ams,ψper (2.14∗) forman,ψy directi∗o−n δw˜ W[0,t] in the input spacme,ψdefined· by δ R,∗δ∈w˜ W[0,t] =0. Proof. Fix t R>0, x X1, w∗ W∈[0,t], δw˜ W[0,t] and δ R, with δ∈w˜ W[0k,t]k= 0, a6s per the ∈ ∈ 2 ∈ ∈ ∈ . k k 6 theorem statement. Define the trajectories corresponding to inputs w and wˆ =w +δw˜ via (2.8) as ∗ ∗ r r r ξ∗(r)=. x+ w∗(s)ds, ξˆ(r)=. x+ w∗(s)+δw˜(s)ds=ξ∗(r)+δξ˜(r), ξ˜(r)=. w˜(s)ds, (2.23) Z0 Z0 Z0 where r [0,t]. The integrated action functional in the payoff (2.10) is of the form tV(ξ(s)) Tµ(w(s))ds, ∈ 0 − where V and Tµ are quadratic functionals given by R . . V(ξ(s))= κ ξ(s) 2 , Tµ(w(s))= m µw(s) 2 , (2.24) 2 k k12 2 kJ k12 with operator µ as per (2.11). Applying (2.23) in (2.24), J r r 2 V(ξ (r)+δξ˜(r))=V(ξ (r))+δκ ξ (r), w˜(s)ds +δ2(κ) w˜(s)ds , (2.25) ∗ ∗ ∗ 2 (cid:28) Z0 (cid:29)21 (cid:13)Z0 (cid:13)12 (cid:13) (cid:13) Tµ(w∗(r)+δw˜(r))=T(w∗(r))+δm w∗(r), w˜(r) 1 +µ2 w∗((cid:13)(cid:13)r), w˜(r) 1 (cid:13)(cid:13) hJ J i2 h i2 h i +δ2(m) w˜(r) 2 +µ2 w˜(r) 2 . (2.26) 2 kJ k21 k k21 h5 i Hence, combining (2.10), (2.25), and (2.26), Jµ (t,x,w +δw˜) Jµ (t,x,w ) m,ψ ∗ − m,ψ ∗ t = V(ξ (r)+δξ˜(r)) V(ξ (r)) [Tµ(w (r)+δw˜(r)) Tµ(w (r))] dr+ψ(ξ (t)+δξ˜(t)) ψ(ξ (t)) ∗ ∗ ∗ ∗ ∗ ∗ − − − − Z0 t r r 2 = δκ ξ(r), w˜(s)ds +δ2(κ) w˜(s)ds 2 Z0 (cid:28) Z0 (cid:29)12 (cid:13)Z0 (cid:13)21 (cid:13) (cid:13) −δm hJ w∗(r), J w˜(r)i12 +µ(cid:13)(cid:13)2hw∗(r), w˜((cid:13)(cid:13)r)i12 −δ2(m2) kJ w˜(r)k212 +µ2kw˜(r)k221 dr +ψ(ξ (t)+hδξ˜(t)) ψ(ξ (t)) i h i (2.27) ∗ ∗ − A corresponding expression for Jµ (t,x,w δw˜) Jµ (t,x,w ) follows by replacing δ with δ in (2.27). m,ψ ∗ − − m,ψ ∗ − Adding this expression to (2.27) yields the second difference ∆Jµ (t,x,w ,δw˜) of Jµ (t,x,w ) at w in m,ψ ∗ m,ψ ∗ ∗ direction δw˜, with t r 2 ∆Jmµ,ψ(t,x,w∗,δw˜)=Z0 δ2κ(cid:13)(cid:13)Z0 w˜(s)ds(cid:13)(cid:13)12 −δ2mhkJ w˜(r)k212 +µ2kw˜(r)k221i dr+∆ψ(ξ∗(t),δξ˜(t)), (2.28) (cid:13) (cid:13) where ∆ψ(ξ (t),δξ˜(t))=. ψ(ξ ((cid:13)t)+δξ˜(t)) (cid:13)2ψ(ξ (t))+ψ(ξ (t) δξ˜(t)) is the second difference of ψ at ξ (t) in direction∗δξ˜(t). With a vie∗w to dealing−with fir∗st term on∗the−right-hand side of (2.28), define q =. ξ˜(r∗) = 0rw˜(σ)dσ ∈X12 and Π:X12→R by Πω =. hω, qw˜i12. Note that Π is a closed linear operator (a funw˜ctional) in (X1;R). Also note that for w˜ W[0,r], r [0,t], Ho¨lder’s inequality and Cauchy-Schwartzimplies that RL 2 ∈ ∈ r r r w˜(s) 1 ds √r w˜ W[0,r], Πw˜(s) ds= qw˜, w˜(s) 1 ds √r qw˜ 1 w˜ W[0,r] < . (2.29) k k2 ≤ k k | | |h i2| ≤ k k2 k k ∞ Z0 Z0 Z0 That is, w˜ L1([0,r];X1) and Πw˜ L1([0,r];R). Hence, as X1 and R are separable Hilbert spaces (separa- ∈ 2 ∈ 2 bility ofthe formerfollowsby existence ofa countablebasis,see Lemma A.3), itfollowsby [3,TheoremA.5.23, p.628] (for example) that r r Π w˜(s)ds= Πw˜(s)ds. (2.30) Z0 Z0 Recalling the definition of Π, r r r r r 2 Π w˜(s)ds= w˜(s)ds, q = w˜(s)ds, w˜(σ)dσ = w˜(s)ds , (2.31) w˜ Z0 (cid:28)Z0 (cid:29)12 (cid:28)Z0 Z0 (cid:29)21 (cid:13)Z0 (cid:13)21 (cid:13) (cid:13) (cid:13) (cid:13) while (cid:13) (cid:13) r r r r r r Πw˜(s)ds= w˜(s), qw˜ 1 ds= w˜(s), w˜(σ)dσ ds= w˜(s), w˜(σ) 1 dσds Z0 Z0 h i2 Z0 (cid:28) Z0 (cid:29)21 Z0 Z0 h i2 r r r 2 ≤ kw˜(s)k21 kw˜(σ)k12 dσds= kw˜(s)k21 ds ≤rkw˜k2W[0,r] ≤rkw˜k2W[0,t], (2.32) Z0 Z0 (cid:18)Z0 (cid:19) in which [3, Theorem A.5.23, p.628] is applied a second time to obtain the third equality, and the left-hand inequalityin (2.29) isappliedto obtainedthe upper bound. Hence, combining(2.31)and(2.32)in(2.30)yields r 2 w˜(s)ds r w˜ 2 , (cid:13)Z0 (cid:13)12 ≤ k kW[0,t] (cid:13) (cid:13) (cid:13) (cid:13) (cid:13) (cid:13) 6 which in turn implies that the first term on the right-hand side of (2.28) is t r 2 t δ2κ w˜(s)ds dr δ2κ rdr w˜ 2 =δ2κ(t2) w˜ 2 . (2.33) Z0 (cid:13)Z0 (cid:13)21 ≤ Z0 k kW[0,t] 2 k kW[0,t] (cid:13) (cid:13) (cid:13) (cid:13) Substituting (2.33) in (2.28(cid:13)) thus yields(cid:13)the second difference bound ∆Jµ (t,x,w,δw˜) δ2 mµ2 κ(t2) w˜ 2 δ2m w˜ 2 +∆ψ(ξ (t),δξ˜(t)). (2.34) m,ψ ∗ ≤− − 2 k kW[0,t]− kJ kW[0,t] ∗ h i As the second difference ∆ψ(ξ (t),δξ˜(t)) is non-positive by concavity of ψ, (2.34) is strictly negative if mµ2 ∗ κ(t2)>0. Thatis,ift [0,¯tµ),wheret¯µ R isasper(2.15). Undertheseconditions,itfollowsimmediatel−y tha2t the payoff function∈al Jµ (t,x, ) of (∈2.10>)0is strictly concave. m,ψ · 3. Fundamental solutionto the approximating optimal control problem. Afundamentalsolution in this optimal controlcontext is an object from which the value function Wµ of (2.16) can be computed given any concave terminal payoff ψ. This fundamental solution is constructed via four auxiliary control problems. 3.1. Auxiliary control problems. The auxiliary control problems of interest employ the same running payoff as used in (2.10) to define the approximating optimal control problem (2.16). A specific terminal payoff is used in each auxiliary problem. Two of these terminal payoffs depend on an additional function z X1 describing the terminal displacement. These terminal payoffs are denoted by ψµ,c : X1 X1 R, ψ∈µ,∞2: 2 × 2→ X1 X1 R , and ψ0 :X1 R, where µ, c R 0 denote real-valued parameters. Specifically, 2 × 2→ ∪{−∞} 2→ ∈ ≥ ψ0(x)=. 0, ψµ,c(x,z)=. −2c kKµ(x−z)k212 , ψµ,∞(x,z)=. 0,, kKµµ((xx−zz))k211 =>00,, (3.1) (cid:26) −∞ kK − k2 where µ (X1) is a boundedly invertible operator to be defined later. Using these terminal payoffs and K ∈ L 2 a fixed real-valued parameter m (0,m), the four auxiliary control problems of interest are defined via their ∈ respective (auxiliary) value functions Wµ :R 0 X1 R, Wµ,c, Wµ,c :R 0 X1 X1 R, Wµ,∞ :R 0 X1 X1 R , ≥ × 2→ ≥ × 2 × 2→ ≥ × 2 × 2→ ∪{−∞} where Wµ(t,x)=. sup Jµ (t,x,w), Wµ,c(t,x,z)=. sup Jµ (t,x,w), (3.2) m,ψ0 m,ψµ,c(,z) w W[0,t] w W[0,t] · Wµ,c(t,x,z)=. ∈sup Jµ (t,x,w), ∈ (3.3) m,ψµ,c(,z) w W[0,t] · Wµ, (t,x,z)=. ∈sup Jµ (t,x,w). (3.4) ∞ m,ψµ,∞(,z) w W[0,t] · ∈ The majority of the subsequent analysis will concern the value function Wµ,c of (3.3) and its convergence to Wµ, of (3.4) as c . Ultimately, Wµ, plays the role of the fundamental solution for the optimal control ∞ ∞ →∞ problem (2.16), in the sense that Wµ(t,x)= sup Wµ, (t,x,z)+ψ(z) (3.5) ∞ z X1 { } ∈ 2 forallt R 0,x X1. TheremainingvaluefunctionsWµ andWµ,c areusefulinensuringthattheseauxiliary ∈ ≥ ∈ 2 problems are well-defined. In particular, note by inspection of the value functions (3.2)–(3.4) that Wµ(t,x) Wµ,c(t,x,z) Wµ,c(t,x,z) Wµ. (t,x,z)> (3.6) ∞ ≥ ≥ ≥ −∞ forallt R>0,x,z X1. Here,thelastinequalityfollowsbynotingthatthespecificconstantinputwˆ W[0,t] ∈ . ∈ 2 ∈ defined by wˆ(s)= 1(z x) for all s [0,t] is suboptimal in the definition (3.4) of Wµ, (t,x,z). In particular, Wµ, (t,x,z) Jµt − (t,x,wˆ) ∈ m z x 2 > . The following is assumed t∞hroughout. A∞ssumpt≥ionm1,.ψµW,∞µ(·(,zt,)x)< ≥for−a2ltl kµ −(0,k121], t −∞(0,¯tµ), x X1. ∞ ∈ ∈ ∈ 2 7 3.2. Fundamental solution. Inordertoconstructafundamentalsolutiontotheoptimalcontrolproblem (2.16), it is useful to define the value functional Wµ :R 0 X1 R by ≥ × 2→ . Wµ(t,x)= sucp Wµ, (t,x,z)+ψ(z) . (3.7) ∞ z X1 { } ∈ 2 c Theorem 3.1. The value functionals Wµ and Wµ of (2.16) and (3.7) are equivalent, with Wµ(t,x) = Wµ(t,x) for all t [0,t¯µ) and x X1. ∈ ∈ 2 Proof. Fix t [0,t¯µ), x X1. Substituting (3.4)cin (3.7), and recalling (3.1), ∈ ∈ 2 c (2.8) holds with Wµ(t,x)= sup sup Jµ (t,x,w)+ψµ, (ξ(t),z)+ψ(z) m,ψ0 ∞ ξ(0)=x z∈X12 w∈W[0,t](cid:26) (cid:12)(cid:12) (cid:27) c (cid:12) (2.8) holds with = sup sup Jµ (t,x,w)+ψµ, (ξ(t),z)+ψ(z)(cid:12) m,ψ0 ∞ ξ(0)=x w∈W[0,t]z∈X12 (cid:26) (cid:12)(cid:12) (cid:27) (cid:12) (cid:12) (2.8) holds with = sup Jµ (t,x,w)+ sup ψµ, (ξ(t),z)+ψ(z) . (3.8) w∈W[0,t] m,ψ0 z∈X12 { ∞ }(cid:12)(cid:12) ξ(0)=x (cid:12) By inspection of (3.1), the inner supremum must be achieved at z = z =. ξ(t(cid:12)), with sup ψµ, (ξ(t),z)+ ∗ z∈X21{ ∞ ψ(z) =ψµ, (ξ(t),z )+ψ(z )=0+ψ(ξ(t))=ψ(ξ(t)). Substituting in (3.8) and recalling (2.16) yields that ∞ ∗ ∗ } (2.8) holds with Wµ(t,x)= sup Jµ (t,x,w)+ψ(ξ(t)) = sup Jµ (t,x,w)=Wµ(t,x). m,ψ0 ξ(0)=x m,ψ w∈W[0,t](cid:26) (cid:12)(cid:12) (cid:27) w∈W[0,t] c (cid:12) (cid:12) Theorem 3.1 provides an explicit decomposition of the approximating optimal control problem associated with the principle of stationary action. In particular, it provides a means of evaluating of the value functional Wµ of (2.16) for any concave terminal payoff ψ, including that of (2.22). In this regard, inspection of (3.7) via Theorem 3.1 reveals that Wµ, of (3.4) can be regarded as an approximation (for µ = 0) of the fundamental ∞ 6 solutiontotheTPBVP(1.3)viatheprincipleofstationaryaction. Consequently,characterizationofanexplicit representation of Wµ, is important for its application in the computation of Wµ. ∞ 3.3. Explicit representation of the fundamental solution. In order to characterizethe fundamental solution of the approximating optimal control problem (2.16) via Theorem 3.1, an explicit form for the value function Wµ, of (3.4) may be constructed via three steps: ∞ ❶ Show that the value function Wµ, of (3.4) may be obtained as the limit of Wµ,c of (3.3) as c ; ∞ ❷ Develop a verificationtheoremthat provides a means for proposing and validated an explicit rep→re∞sen- tation for the value function Wµ,c of (3.3); ❸ Findanexplicitrepresentationsatisfyingthe conditionsofthe verificationtheoremof❷ ,andapplythe limit argument of ❶ to obtain the corresponding representation for Wµ, of (3.4). ∞ 3.3.1. Limit argument – ❶ . This first step is formalized via the following theorem. Theorem 3.2. The auxiliary value functions Wµ,c, Wµ, of (3.3), (3.4) satisfy the limit relationship ∞ lim Wµ,c(t,x,z)=Wµ,∞(t,x,z) (3.9) c →∞ for all t [0,¯tµ), x,z X1. ∈ ∈ 2 In orderto demonstratethe limit propertysummarizedby Theorem3.2, it is useful to firstnote that a ball of any fixed radius centered on z X1 can be reached by a sufficiently near-optimal trajectory defined with ∈ 2 respect to (3.3). 8 Lemma 3.3. Fix t ∈ [0,t¯µ) and x,z ∈ X12. For each ǫ ∈ R>0, there exists a c¯=. c¯ǫt,x,z ∈ R>0, δ¯∈ (0,1], such that ξc,δ(t) z ǫ for all c (c¯, ) and δ (0,δ¯), where ξc,δ() denotes the trajectory of system (2.8) 1 − 2 ≤ ∈ ∞ ∈ · correspond(cid:13)ing to any(cid:13)δ-optimal input wc,δ W[0,t] in the definition (3.3) of Wµ,c(t,x,z). Proof.(cid:13)Fix t [0,(cid:13)t¯µ) and x,z X1. R∈ecalling the assumed bounded invertibility of µ on X1, see (3.1), ∈ ∈ 2 K 2 set κ =. 1 2 R . Suppose the statement of the lemma is false. That is, there exists an ǫ R such µ Kµ− 21 ∈ >0 ∈ >0 that for a(cid:13)ll c¯ (cid:13)R>0 and δ¯ (0,1], there exists a c (c¯, ) and δ (0,δ¯) such that ξc,δ(t) z 1 > ǫ. So, (cid:13) ∈(cid:13) ∈ ∈ ∞ ∈ − 2 given this ǫ R>0, choose a specific c¯ R>0 and δ¯ (0,1] such that (cid:13) (cid:13) ∈ ∈ ∈ (cid:13) (cid:13) ( c¯ )ǫ2 δ¯ Wµ(t,x) Wµ, (t,x,z). (3.10) 2κµ − ≥ − ∞ (NotethatthisisalwayspossiblebyAssumption1.) Letc (c¯, )andδ (0,δ¯)besuchthat ξc,δ(t) z >ǫ 1 ∈ ∞ ∈ − 2 as per the hypothesis above. Note by bounded invertibility of Kµ on X12, (cid:13)(cid:13) (cid:13)(cid:13) ǫ2 < 1 (ξc,δ(t) z) 2 κ (ξc,δ(t) z) 2 . (3.11) Kµ− Kµ − 21 ≤ µ Kµ − 12 (cid:13) (cid:13) (cid:13) (cid:13) Hence, by definition of any δ-opt(cid:13)imal input wc,δ in W(cid:13)µ,c(t,x,z(cid:13)) of (3.3), (cid:13) Wµ,c(t,x,z) δ <Jµ t,x,wc,δ =Jµ (t,x,wc,δ)+ψµ,c ξc,δ(t), z − m,ψc(,z) m,ψ0 · Wµ(t,x) (cid:0) c (ξ(cid:1)c,δ(t) z) 2 Wµ(t,x) (cid:0)( c )ǫ2,(cid:1) ≤ − 2 Kµ − 12 ≤ − 2κµ (cid:13) (cid:13) where the equality follows by (3.1), while the inequ(cid:13)alities follow by(cid:13)suboptimality of wc,δ in the definition (3.3) of Wµ,c, (3.6), and (3.11). Consequently, Wµ(t,x) Wµ, (t,x,z) Wµ(t,x) Wµ,c(t,x,z) ( c )ǫ2 δ > − ∞ ≥ − ≥ 2κµ − ( c¯ )ǫ2 δ¯, which contradicts (3.10). Hence, the assertion in the lemma statement is true. 2κµ − An upper norm bound on near-optimal inputs is also useful. Lemma 3.4. With c R>0, δ (0,1], t [0,¯tµ), and x,z X1 fixed, any input wc,δ W[0,t] that is ∈ ∈ ∈ ∈ 2 ∈ δ-optimal in the definition (3.3) of Wµ,c(t,x,z) satisfies the bound µwc,δ W[0,t] Mµ,c,δ(t,x,z) Mµ(t,x,z) (3.12) kJ k ≤ ≤ where 1 1 Mµ,c,δ(t,x,z)=. Wµ,c(t,x,z)−Wµ,c(t,x,z)+δ 2 , Mµ(t,x,z)=. Wµ(t,x)−Wµ,∞(t,x,z)+1 2 . 1(m m) ! 1(m m) ! 2 − 2 − Proof. Input wc,δ W[0,t] is respectively sub-optimal and δ-optimal in the definitions (3.2) and (3.3) of Wµ,c(t,x,z) and Wµ,c(∈t,x,z), so that t Wµ,c(t,x,z) Jµ (t,x,wc,δ)= κ ξc,δ(s) 2 m µwc,δ(s) 2 ds+ψµ,c ξc,δ(t), z , ≥ m,ψc(·,z) Z0 2 k k12 − 2 kJ k21 t (cid:0) (cid:1) Wµ,c(t,x,z) δ <Jµ (t,x,wc,δ)= κ ξc,δ(s) 2 m µwc,δ(s) 2 ds+ψµ,c ξc,δ(t), z , − m,ψc(·,z) Z0 2 k k12 − 2 kJ k21 (cid:0) (cid:1) in which ξc,δ() denotes the trajectory corresponding to wc,δ(). The left-hand inequality of (3.12) follows by · · subtracting the second inequality above from the first, while the right-hand inequality of (3.12) follows by application of (3.6) in the definition of Mµ,c,δ(t,x,z) to yield the upper bound Mµ(t,x,z). Lemmas 3.3 and 3.4 facilitate the proof of the required limit property of Theorem 3.2. Proof. [Theorem 3.2] Fix t [0,t¯µ) and x,z X1. Observe that for all c R 0, c1 R c, ∈ ∈ 2 ∈ ≥ ∈ ≥ ψµ,c(x,z) ψµ,c1(x,z) ψµ,∞(x,z), Wµ,c(t,x,z) Wµ,c1(t,x,z) Wµ,∞(t,x,z), ≥ ≥ ≥ ≥ 9 where the first set of inequalities follows immediately by inspection of (3.1), which in turn implies the second set of inequalities by inspection of (3.3) and (3.4). That is, Wµ,c is non-increasing in c and satisfies lim Wµ,c(t,x,z) Wµ, (t,x,z). (3.13) ∞ c ≥ →∞ In order to prove the opposite inequality required to demonstrate (3.9), a sub-optimal input for Wµ, is ∞ constructedfrom a near-optimalinput for Wµ,c. To this end, fix an arbitraryǫ R . With ξc,δ() and wc,δ() >0 as per the statement of Lemma 3.3, there exists a c¯ R and δ¯ (0,1] such th∈at · · >0 ∈ ∈ ξc,δ(t) z ǫ (3.14) 1 − 2 ≤ (cid:13) (cid:13) for all c (c¯, ) and δ (0,δ¯). Define a new i(cid:13)nput wˆc,δ (cid:13)W[0,t] by ∈ ∞ ∈ ∈ . wˆc,δ(s)=wc,δ(s)+ 1 z ξc,δ(t) (3.15) t − for all s [0,t]. By inspection of (2.8) and (3.15), the corres(cid:0)ponding st(cid:1)ate trajectory ξˆc,δ() satisfies ∈ · s s ξˆc,δ(s)=x+ wˆc,δ(σ)dσ =x+ wc,δ(σ)dσ+ s z ξc,δ(t) =ξc,δ(s)+ s z ξc,δ(t) , (3.16) t − t − Z0 Z0 (cid:0) (cid:1) (cid:0) (cid:1) so that ξˆc,δ(0) = x and ξˆc,δ(t) = z. However, as ξc,δ(t) need not equal z, (3.1) implies that ψµ,c ξc,δ(t), z = c (ξc,δ(t) z) 2 0=ψµ, (ξˆc,δ(t), z). So, for all c (c¯, ), δ (0,δ¯), −2 Kµ − 21 ≤ ∞ ∈ ∞ ∈ (cid:0) (cid:1) (cid:13) (cid:13) (cid:13) (cid:13) t Wµ,c(t,x,z) δ <Jµ (t,x,wc,δ)= κ ξc,δ(s) 2 m µwc,δ(s) 2 ds+ψµ,c(ξc,δ(t), z) − m,ψc(·,z) Z0 2 k k12 − 2 kJ k21 t κ ξˆc,δ(s) 2 m µwˆc,δ(s) 2 ds+ψµ, (ξˆc,δ(t), z)+∆µ,ǫ(t,x,z)=Jµ t,x,wˆc,δ +∆µ,ǫ(t,x,z) ≤Z0 2 k k21− 2 kJ k21 ∞ m,ψ∞(·,z) Wµ, (t,x,z)+∆µ,ǫ(t,x,z), (cid:0) (cid:1) (3.17) ∞ ≤ where sub-optimality of wˆc,δ in the definition (3.4) of Wµ, (t,x,z) has been applied, and ∞ t ∆µ,ǫ(t,x,z)=. κ ξc,δ(s) 2 ξˆc,δ(s) 2 m µwc,δ(s) 2 µwˆc,δ(s) 2 ds Z0 2 (cid:16)k k12 −k k21(cid:17)− 2 (cid:16)kJ k12 −kJ k21(cid:17) t κ ξˆc,δ(s) 1 + ξc,δ(s) 1 ξˆc,δ(s) ξc,δ(s) 1 + m µwˆc,δ(s) 1 + µwc,δ(s) 1 ≤ 2 k k2 k k2 k − k2 2 kJ k2 kJ k2 × Z0 (cid:16) (cid:17) (cid:16) (cid:17) µ(wˆc,δ(s) wc,δ(s)) 1 ds. (3.18) kJ − k2 (Here, the upper bound follows by the triangle inequality.) Note that ∆µ,ǫ(t,x,z) is parameterized by ǫ R >0 via c¯and δ¯(see Lemma 3.3). In order to bound the right-hand side of (3.18), Ho¨lder’s inequality implie∈s that . for any Hilbert space Z (with norm denoted by Z) and any z,zˆ Z[0,t]=L2([0,t];Z), k·k ∈ 1 1 t t 2 t 2 ( zˆ(s) Z + z(s) Z) zˆ(s) z(s) Z ds ( zˆ(s) Z + z(s) Z)2 ds zˆ(s) z(s) 2Z ds k k k k k − k ≤ k k k k k − k Z0 (cid:18)Z0 (cid:19) (cid:18)Z0 (cid:19) 1 ≤√2 kzˆk2Z[0,t]+kzk2Z[0,t] 2 kzˆ−zkZ[0,t] ≤√2 kzˆkZ[0,t]+kzkZ[0,t] kzˆ−zkZ[0,t] , (3.19) in which z 2(cid:16) =. t z(s) 2 ds(cid:17). Meanwhile, the trian(cid:0)gle inequality states t(cid:1)hat k kZ[0,t] 0 k kZ R zˆ Z[0,t] zˆ z Z[0,t]+ z Z[0,t]. (3.20) k k ≤k − k k k With a view to applying (3.19) and (3.20) to the right-hand side of (3.18), note that by (2.11) and (3.15), µ(wˆc,δ(s) wc,δ(s)) 2 = (wˆc,δ(s) wc,δ(s)) 2 +µ2 wˆc,δ(s) wc,δ(s) 2 1 1 1 kJ − k2 kJ − k2 k − k2 10