A probabilistic framework for the control of systems with discrete states and stochastic excitation Gianluca Meneghello∗ and Thomas Bewley Flow Control Lab, UC San Diego, La Jolla, CA 92093-0411, USA. Paolo Luchini DIMEC, Universita` di Salerno, Via Ponte don Melillo, 84084 Fisciano (SA), Italy (Dated: January 10, 2017) Aprobabilisticframeworkisproposedfortheoptimizationofefficientswitchedcontrolstrategies for physical systems dominated by stochastic excitation. In this framework, the equation for the state trajectory is replaced with an equivalent equation for its probability distribution function in the constrained optimization setting. This allows for a large class of control rules to be considered, includinghysteresisandamixofcontinuousanddiscreterandomvariables. Theproblemofsteering 7 atmosphericballoonswithinastratifiedflowfieldisamotivatingapplication;thesameapproachcan 1 be extended to a variety of mixed-variable stochastic systems and to new classes of control rules. 0 2 INTRODUCTION n a J Optimalcontroltheoryisconcernedwithminimizingtheenergyrequiredtomaintainafeasiblephase-spacetrajec- 6 torywithinafixedtime-averagemeasurefromatargettrajectory[2]. Thismaybeachievedbysolvingtheconstrained optimization problem [3] ] Y S min u (1a) . u | |Q s c (cid:40)x x¯ =constant R [ with | − | (1b) x˙ =f(x,u)+ξ, 1 v where u(x) is a given feedback control rule, x = x(t) and x¯ = x¯(t) are the actual and target trajectories in phase 7 7 space, and the additive noise term ξ models the unknown or uncertain components of the dynamical system. The 7 norms and must be chosen to reflect the actual control energy and the specific measure of interest of the Q R |·| |·| 1 systemstate,butareoftenlimitedtoL orL normstomaketheoptimizationproblemtractable. Thepresentwork 2 ∞ 0 is motivated by the general inability of the formulation (1) to treat problems with mixed continuous and discrete . 1 random variables, hysteretic behavior, and/or norms others than L or L . 2 ∞ 0 Asamotivatingapplication,consideraballooninastablystratifiedturbulentflowfieldwhosetime-averagedvelocity 7 is a function of height only, as depicted by thin arrows in Figure 1a. This is a good approximation for the radial flow 1 within a hurricane, as depicted in Figure 1b. The balloon’s density can be changed to control its vertical velocity : v (and, hence, its altitude), and the balloon’s motion can be well approximated as the motion of a massless particle i X carried by the flowfield r a X˙ =αZ+ξ, (2a) Z˙ =u(X,Z), (2b) where X and Z are random variables denoting the horizontal and vertical positions, α is the vertical gradient of the time-averaged horizontal velocity (i.e., αz is the time-averaged horizontal velocity at height z), and the turbulent fluctuations of the horizontal velocities are characterized by a white Gaussian noise ξ with zero mean and spectral density c2. Neglecting the vertical velocity fluctuations, the balloon moves in the horizontal direction according to a Brownian motion with a probability distribution function (PDF) p (x,z) with horizontal mean µ (t) = αzt and X,Z X variance σ2 (t) = c2t. In the uncontrolled case, the variance of the balloon’s horizontal position grows linearly with X time. The vertical velocity u can then be used, leveraging the background flow stratification α, to return the balloon to its original position. We are thus interested in designing a control strategy u(x,z) to limit the variance of the horizontal position of the balloon to a target value σ¯2 , while minimizing the control cost u(x,z) . More specifically, we consider a three-level X | |Q control (TLC) feedback rule, depicted by thick lines in Figure 1a, consisting of step-changes of altitude h in the ± 2 (a) (b) z 12 1 2 10 h 3 d x m) 8 4 d (k 6 5 − h z 4 3− − 10 α 2 α= α = 10−2 0 20 15 10 5 0 5 10 − − − − Radial velocity (m/s) FIG.1: Left: flowfieldmodel(thinarrows)andthethree-levelcontrol(TLC)rule(thicklines). Right: radialvelocityprofiles, composite from dropsonde measurements between 1996 and 2012 within 200km of the hurricane center [5], binned into 50m altitudeintervalsandsortedaccordingtohurricanecategory(1to5). Thejaggednessoftheprofilesabove2kmaltitudeisdue to the reduced number of available measurements with respect to the lower region. Dashed lines estimate the mean velocity gradient α. verticalposition,whichareappliedwhentheballoonreachesadistanceof dfromthetargettrajectoryx=0. Insuch a setting, the vertical coordinate Z¯ h,0,h is discrete, and the contr∓ol rule exhibits hysteresis in the horizontal ∈{− } coordinate; we additionally note that the L norm, measuring the step size h, is a more appropriate measure of the 1 energy u required by the balloon to change altitude than the classical L norm. Despite the apparent simplicity Q 2 | | of this control rule, it cannot be optimized as formulated in (1). Ratherthanconstrainingtheproblembythestate-spacerepresentationofthesystem(2),asisdonein(1),wethus instead use an equivalent condition on the PDF p (x,z), and restate the optimization problem (1) as X,Z minE[ u ] (3a) u | | E(cid:104)(cid:0)X−X¯(cid:1)2(cid:105)=constant with (3b) c2 ∂ p +f(X,u) p + 2p =0, t X X X ·∇ 2 ∇ where we have replaced the state equation in (1b) with an equivalent Fokker-Plank equation for the PDF p (x) [4], X and the norms are interpreted as expected values. The solution of the optimization problem as stated in (3) is the principal contribution of this work. The remainder of this paper is concerned with the solution of the optimization problem (3) for the TLC rule of Figure 1a, and with comparison to the classical linear control rule u = k x+k z, whose optimal solution is given 1 2 by the Linear Quadratic Regulator (LQR) [2]. To facilitate comparison, we first derive the functional form of the solution by dimensional analysis. DIMENSIONAL ANALYSIS Thecontrolproblemisgovernedbythreeparameters: thevelocitygradientα,thespectraldensityc2,andthetarget (cid:112) horizontal variance σ¯2 . Take the length, time, and velocity scales as L = c2/α, T = α−1, and U = L/T = √c2α. X A single dimensionless parameter can be defined as R=σ¯2 α/c2, (4) X and the dimensionless control cost can be written as w/U = E[ u ]/U = (R) where (R) is an unknown | | F F dimensionless function. Similar expressions can be written for d/L and h/L. The system (2) is additionally invariant with respect to a rescaling of the vertical coordinate by the time scale α−1, thus reducing the parameters governing 3 the problem to the standard deviation σ¯ and the spectral density c2. Dimensional analysis [1] can then be used to X write the control cost as w 1 c4 U =γwU ασ3 =γwR−23. (5) X Expressions for the control parameters can also be obtained as d σ Z 1 c2 =γd X =γdR12, =γh =γhR−21. (6) L L L Lασ X The solution is then summarized by the optimal value of γ for each control parameter. Similarly, for the linear (·) feedback control rule u = k x+k z, we can write Tk = γ R−2 and Tk = γ R−1. Note that the solution (5) is 1 2 1 k1 2 k2 independent of the specific choice of the control rule u(x,z); a comparison between different rules can be obtained by comparing the respective values of γ . w THREE-LEVEL CONTROL (TLC) RULE x 0 x d ≤ ≤− z= h z=0 z=h − x d x 0 ≥ ≥ FIG. 2: Implementation of the three-level control rule. We now proceed in seeking the optimal values for the parameters d and h [equivalently, γ and γ in (6)] in the d h TLCruleindicatedbythicklinesinFigure1a, correspondingtostepchangesinaltitudehatx=0, d. Inthislimit, ± the governing equations (2) can be restated as X˙ = αZ¯+ξ, (7) − where X is the same as in the original problem, but Z¯ h,0,h is now a discrete rather than a continuous ∈ R ∈ {− } randomvariable,andthecorrespondingequationisreplacedbytheautomatainFigure2. Thisisavalidapproximation when the control velocity u is larger than the horizontal velocity scale √c2α. Note that the dynamics of (7) is quite simple, and is dominated by the effect of the stochastic excitation by ξ. (cid:80) Let pX,Z¯(x,z¯) be the PDF of the balloon position, and pz¯(x)=pX|Z¯(x|z¯) pZ¯(z¯), so that pX(x)= z¯pz¯(x) is the marginal probability. The governing equations for the PDFs can be written as c2 ∂ p (x)+∂ αz¯p (x) ∂ p (x)=0 t z¯ x z¯ − xx 2 z¯ (8a) for z¯ h,0,h , ∈{− } ∂ p (x) =∂ p (x) for x d,0,d , (8b) x X |x− x X |x+ ∈{− } where (8a) are three Fokker-Plank equations for each discrete altitude z¯ obtained by considering the transition probabilities implied by (7), and (8b) represent the transition probabilities in the z direction, and imposes the conservation of the probability fluxes represented by arrows in Figure 2. Equations (8) now take the role of the optimization problem constraint (3b). The statistically steady-state solution of (8a) can actually be computed analytically, as shown in Figure 3, and is given by c x+c 0<x<d, 1 2 p (x)= (9a) 0 0 d<x< , ∞ q1λ2 e−2λx +q2 0<x<d, p (x)= (9b) h r1λ2 e−2λx +r2 d<x<∞, 4 z h x h − d d − FIG. 3: Steady state probability distribution functions for c2 = h = α = 1. Filled: p (x) and p (x). Solid: marginal 0 h probability p (x)=p (x)+p (x)+p (x). Dashed: normal distribution with the same variance. ± X 0 +h h − where λ= c2 is twice the e-folding scale of the PDF (the symmetry of the problem at x=0 can be used to compute gh the solution for x < 0, Z¯ = h). The integration constants c ,c ,q ,q ,r ,r can be obtained imposing (8b) together 1 2 1 2 1 2 with the boundary conditions p (x = d) = 0, p (x = 0) = 0, lim p (x) = 0, and the normalization condition 0 h x→∞ X (cid:82)∞ p (x)dx=1, resulting in −∞ X 1 1 e2λd 1 c = , q = , r = − , 1 1 1 −d(d+λ) −d(d+λ) d(d+λ) 1 λ c = , q = , r =0. (10) 2 2 2 d+λ 2d(d+λ) Upon substitution of the coefficients (10) into the expressions for p and p in (9), the variance can be written 0 h (cid:90) ∞ d3+2λd2+3λ2d+3λ3 σ2 = x2p (x)dx= . (11) X X 6(d+λ) −∞ ThecontrolcostcanbecomputedasthetotaltransitionprobabilitybetweenthestatesZ¯ =0andZ¯ =h,multiplied by the cost of the single control activation h: 2ghc2 w =2αhc2 ∂ p = . (12) x 0|x=d d(d+λ) The control parameters h and d, and the corresponding control cost w, can then be computed by minimization of the objective functional (3a). The corresponding dimensionless constants in (5) and (6) are γ =0.5432, γ =1.6288, γ =1.1166, (13a) w d h f =0.4864c2/σ2 =0.4864T−1R−1, (13b) x where f is the frequency at which the control has to be activated, and can be computed by considering the times t =d2/c2 and t =d/(gZ) to reach the location x=d from x=0 and back, respectively. out in Itisalsoofinteresttocomputetheminimumvarianceattainableforgivenvaluesofdandh,andfromthereobtain the limiting values of d and h for a specified σ¯ : X d2 lim σ2 = d<√6σ¯ , (14a) h→∞ X 6 ⇒ X c4 1 c2 limσ2 = h> . (14b) d→0 X 2α2h2 ⇒ √2σ¯Xα In both limits the control cost tends to infinity, in the first case because h , and in the second because the → ∞ frequency of the steps increases without bound. Reasonable (finite) values of h and d, away from these limiting values, are thus important in application. Results are summarized in Figure 4. 5 (a) (b) Linear rule (c) Three-step rule 102 6 6 101 = = 101 =6 101 R dR/L R ∝R−3/2 101−010 Tk2 h/L 100 wU 100 lineathrree-step 10−210−1 100 Tk1101 10−1 100 101 R=σ¯2g/c2 R=σ¯2g/c2 X X 10−1 (d) (e) 2 2 ) m k 0 0 ( 10−2 z −2 −2 10−1 100 101 −8−6−4−2 0 2 4 6 8 −4.89 0 4.89 R=σ¯2g/c2 x(km) x(km) X FIG. 4: Comparison of control rules: (a) control cost w=E[|u|] as a function of the dimensionless parameter R for the two different control rules; the two control costs are almost indistinguishable, differing by less than 5%. (b,c) Control parameters and (d,e) corresponding probability density for the hurricane case (R=6) for the linear and TLC control rules; gray patterns in (c) define the limit of each curve for which a given value of R is attainable, given by (14). APPLICATION TO THE CONTROL OF BALLOONS WITHIN A HURRICANE Finally,wecomputeoptimalvaluesofthecontrolparametersforatmosphericballoonswithinanidealizedhurricane flowfield,andcomparethemwithresultsusingthelinearcontrolruleu=k x+k z. Avelocitygradientofα=10−3s−1 1 2 (see Figure 1b) and a spectral density of c2 =1500m2s−1 [6] are assumed. We additionally impose a target standard deviation of σ¯ =3km, resulting in R=6 [see (4)]. Control parameters X and control cost can be obtained using (6) together with the values in (13). For the TLC rule, d=4886.4m, h=558.3m, (15a) w =4.54 10−2m/s, f =8.11 10−5s−1, (15b) × × where f corresponds to a period of about 3.5 hours. The optimal solution for the linear control rule can be readily obtained by solving (1), resulting in k =3.125 10−5s−1, k =2.5 10−4s−1, (16a) 1 2 × × w =4.32 10−2m/s. (16b) × Simulations for both control rules are shown in Figure 4d and 4e. CONCLUSIONS This paper introduces a probabilistic framework for the optimization of physical systems dominated by stochastic excitation in the presence of mixed continuous and discrete random variables, non-linearities, and hysteresis. Its application has been demonstrated by addressing the problem of controlling atmospheric balloons within a stratified flowfield; thesameframeworkcanbeextendedtoaclassofproblemsthat,totheauthors’knowledge,werepreviously intractable from an optimization point of view. From an application perspective, the three-level control (TLC) rule of Figure 2 has many advantages: holding rather than continuously adjusting a position is often an easier solution to implement, possiblyrequiringlessenergyinreallifeapplications. Foranobservationalplatformlikeasensorballoon, it has the additional advantage of performing measurements which are not disturbed by continuous control actions. 6 ∗ Correspondingauthor: [email protected];currentaddress: DepartmentofEarth,AtmosphericandPlanetary Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts, 02139, USA. [1] Barenblatt, G. I. (1996). Scaling, self-similarity, and intermediate asymptotics: dimensional analysis and intermediate asymptotics, volume 14. Cambridge University Press. [2] Lewis, F. L. and Syrmos, V. L. (1995). Optimal control. John Wiley & Sons. [3] Nocedal, J. and Wright, S. (2006). Numerical optimization. Springer Science & Business Media. [4] Risken, H. (1984). Fokker-planck equation. In The Fokker-Planck Equation, pages 63–95. Springer. [5] Wang, J., Young, K., Hock, T., Lauritsen, D., Behringer, D., Black, M., Black, P. G., Franklin, J., Halverson, J., Molinari, J.,etal.(2015). Along-term,high-quality,high-vertical-resolutionGPSdropsondedatasetforhurricaneandotherstudies. Bulletin of the American Meteorological Society, 96(6):961–973. [6] Zhang, J. A. and Montgomery, M. T. (2012). Observational estimates of the horizontal eddy diffusivity and mixing length in the low-level region of intense hurricanes. Journal of the Atmospheric Sciences, 69(4):1306–1316.