A Contraction Theory Approach to Stochastic Incremental Stability Quang-Cuong Pham Nicolas Tabareau ∗ 8 LPPA, Coll`ege de France LPPA, Coll`ege de France 0 0 Paris, France Paris, France 2 [email protected] [email protected] n Jean-Jacques Slotine a J Nonlinear Systems Laboratory, MIT 5 Cambridge, MA 02139, USA 2 [email protected] ] C February 1, 2008 O . h t a Abstract m WeinvestigatetheincrementalstabilitypropertiesofItoˆstochastic [ dynamical systems. Specifically, we derive a stochastic versionof non- 2 linear contraction theory that provides a bound on the mean square v distance between any two trajectories of a stochastically contracting 6 system. This bound can be expressed as a function of the noise inten- 2 9 sity and the contraction rate of the noise-free system. We illustrate 0 these results in the contexts of stochastic nonlinear observers design . and stochastic synchronization. 4 0 7 1 Introduction 0 : v i Nonlinear stability properties are often considered with respect to an equi- X librium point or to a nominal system trajectory (see e.g. [31]). By contrast, r a incremental stability is concerned with the behaviour of system trajectories with respect to each other. From the triangle inequality, global exponential incremental stability (any two trajectories tend to each other exponentially) is a stronger property than global exponential convergence to a single tra- jectory. Historically, work on deterministic incremental stability can be traced back to the 1950’s [23, 7, 16] (see e.g. [26, 20] for a more extensive list and historical discussion of related references). More recently, and largely inde- pendently of these earlier studies, a number of works have put incremental ∗To whom correspondance should be addressed. 1 stability on a broader theoretical basis and made relations with more tradi- tional stability approaches [14, 32, 24, 2, 6]. Furthermore, it was shown that incremental stability is especially relevant in the study of such problems as state detection [2], observer design or synchronization analysis. While the above references are mostly concerned with deterministic sta- bility notions, stability theory has also been extended to stochastic dynam- ical systems, see for instance [22, 17]. This includes important recent de- velopments in Lyapunov-like approaches [12, 27], as well as applications to standard problems in systems and control [13, 34, 8]. However, stochastic versions of incremental stability have not yet been systematically investi- gated. The goal of this paper is to extend some concepts and results in in- cremental stability to stochastic dynamical systems. More specifically, we derive a stochastic version of contraction analysis in the specialized context of state-independent metrics. We prove in section 2 that the mean square distance between any two trajectories of a stochastically contracting system is upper-bounded by a constant after exponential transients. In contrast with previous works on incremental stochastic stability [5], we consider the case when the two tra- jectories are subject to distinct and independent noises, as detailed in sec- tion 2.2.1. This specificity enables our theory to have a number of new and practically important applications. However, the fact that the noise does not vanish as two trajectories get very close to each other will prevent us from obtaining asymptotic almost-sure stability results (see section 2.3.2). In section 3, we show that results on combinations of deterministic con- tracting systems have simple analogues in the stochastic case. These combi- nation properties allow one to build by recursion stochastically contracting systems of arbitrary size. Finally, as illustrations of our results, we study in section 4 several ex- amples, including contracting observers with noisy measurements, stochas- ticcomposite variables andsynchronization phenomenain networks of noisy dynamical systems. 2 Main results 2.1 Background 2.1.1 Nonlinear contraction theory Contraction theory [24] provides a set of tools to analyze the incremental exponential stability of nonlinear systems, and has been applied notably to observer design [24, 25, 1, 21, 36], synchronization analysis [35, 28] and systems neuroscience modelling [15]. Nonlinear contracting systems enjoy desirableaggregationproperties,inthatcontractionispreservedundermany 2 types of system combinations given suitable simple conditions [24]. Whileweshallderiveglobalpropertiesofnonlinearsystems,manyofour results can be expressed in terms of eigenvalues of symmetric matrices [19]. Given a square matrix A, the symmetric part of A is denoted by A . The s smallestandlargesteigenvalues of A aredenoted byλ (A)andλ (A). s min max Given these notations, the matrix A is positive definite (denoted A > 0) if λ (A) > 0, and it is uniformly positive definite if min β > 0 x,t λ (A(x,t)) β min ∃ ∀ ≥ The basic theorem of contraction analysis, derived in [24], can be stated as follows Theorem 1 (Contraction) Consider, in Rn, the deterministic system x˙ = f(x,t) (2.1) where f is a smooth nonlinear function. Denote the Jacobian matrix of f with respect to its first variable by ∂f. If there exists a square matrix Θ(x,t) ∂x such that M(x,t) = Θ(x,t)TΘ(x,t) is uniformly positive definite and the matrix d ∂f F(x,t) = Θ(x,t)+Θ(x,t) Θ 1(x,t) − dt ∂x (cid:18) (cid:19) is uniformly negative definite, then all system trajectories converge exponen- tially to a single trajectory, with convergence rate supx,tλmax(F) = λ> 0. | | The system is said to be contracting, F is called its generalized Jacobian, M(x,t) its contraction metric and λ its contraction rate. 2.1.2 Standard stochastic stability Inthissection,wepresentveryinformallythebasicideasofstandardstochas- tic stability (for a rigourous treatment, the reader is referred to e.g. [22]). This will set the context to understand the forthcoming difficulties and dif- ferences associated with incremental stochastic stability. Forsimplicity, weconsiderthespecialcaseofglobalexponentialstability. Let x(t) be a Markov stochastic process and assumethat there exists a non- negative function V (V(x) may represent e.g. the squared distance of x from the origin) such that x Rn AV(x) λV(x) (2.2) ∀ ∈ ≤ − where λ is a positive real number and A is the infinitesimal operator of the e process x(t). The operator A is the stochastic analogue of the deterministic differentiation operator. In the case theat x(t) is an Itˆo process, A corre- sponds to the widely-used [2e7, 34, 8] differential generator L (for a proof of this fact, see [22], p. 15 or [3], p. 42). e 3 For x0 ∈ Rn, let Ex0(·) = E(·|x(0) = x0). Then by Dynkin’s formula ([22], p. 10), one has ∀t ≥ 0 Ex0V(x(t))−V(x0) = Ex0 0tAV(x(s))ds λEx tV(x(s))ds = λ tEx V(x(s))ds ≤ − R 0 0 − 0 0 e ApplyingtheGronwall’s lemmatothedReterministicreal-vaRlued function t Ex V(x(t)) yields → 0 ∀t ≥ 0 Ex0V(x(t)) ≤ V(x0)e−λt If we assume furthermore that Ex V(x(t)) < for all t, then the above 0 ∞ implies that V(x(t)) is a supermartingale (see lemma 3 in the Appendix for details), which yields, by the supermartingale inequality Px sup V(x(t)) A Ex0V(x(T)) V(x0)e−λT (2.3) 0 ≥ ≤ A ≤ A T t< ! ≤ ∞ Thus, one obtains an almost-sure stability result, in the sense that A> 0 lim Px sup V(x(t)) A = 0 (2.4) ∀ T 0 T t< ≥ ! →∞ ≤ ∞ 2.2 The stochastic contraction theorem 2.2.1 Settings Consider a noisy system described by an Itˆo stochastic differential equation da = f(a,t)dt+σ(a,t)dWd (2.5) where f is a Rn R+ Rn function, σ is a Rn R+ Rnd matrix-valued × → × → function and Wd is a standard d-dimensional Wiener process. To ensure existence and uniqueness of solutions to equation (2.5), we assume, here and in the remainder of the paper, the following standard conditions on f and σ Lipschitz condition: There exists a constant K > 0 such that 1 t 0, a,b Rn f(a,t) f(b,t) + σ(a,t) σ(b,t) K a b 1 ∀ ≥ ∈ k − | k − k ≤ k − k Restriction on growth: There exists a constant K > 0 2 t 0, a Rn f(a,t) 2 + σ(a,t) 2 K (1+ a 2) 2 ∀ ≥ ∈ k k k k ≤ k k Under these conditions, one can show ([3], p. 105) that equation (2.5) has on [0, [ a unique Rn-valued solution a(t), continuous with probability ∞ one, and satisfying the initial condition a(0) = a , with a Rn. 0 0 ∈ 4 Inordertoinvestigatetheincrementalstabilitypropertiesofsystem(2.5), consider now two system trajectories a(t) and b(t). Our goal will consist of studying the trajectories a(t) and b(t) with respect to each other. For this, we consider the augmented system x(t) = (a(t),b(t))T, which follows the equation f(a,t) σ(a,t) 0 dWd dx = dt+ 1 f(b,t) 0 σ(b,t) dWd (cid:18) (cid:19) (cid:18) (cid:19)(cid:18) 2 (cid:19) = ⌢f(x,t)dt+⌢σ(x,t)dW2d (2.6) Important remark As stated in the introduction, the systems a and b are driven by distinct and independent Wiener processes Wd and Wd. 1 2 This makes our approach considerably different from [5], where the authors studied two trajectories driven by the same Wiener process. Ourapproachenablesustostudythestabilityofthesystemwithrespect to variations in initial conditions and to random perturbations: indeed, two trajectories of any real-life system are typically affected by distinct “real- izations” of the noise. In addition, it leads very naturally to nice results on the comparison of noisy and noise-free trajectories (cf. section 2.4), which are particularly useful in applications (cf. section 4). However, because of the very fact that the two trajectories are driven by distinct Wiener processes, we cannot expect the influence of the noise to vanish when the two trajectories get very close to each other. This con- strasts with [5], and more generally, with the standard stochastic stability case, where the noise vanishes near the origin (cf. section 2.1.2). The con- sequences of this will be discussed in detail in section 2.3.2. 2.2.2 The basic stochastic contraction theorem We introduce two hypotheses (H1) f(a,t) is contracting in the identity metric, with contraction rate λ, (i.e. a,t λ ∂f λ) ∀ max ∂a ≤ − (H2) tr σ(a,t)Tσ(a,t(cid:0)) i(cid:1)s uniformly upper-bounded by a constant C (i.e. a,t tr σ(a,t)Tσ(a,t) C) ∀ (cid:0) (cid:1) ≤ Inotherwo(cid:0)rds,(H1)says(cid:1)thatthenoise-freesystemiscontracting, while (H2) says that the variance of the noise is upper-bounded by a constant. Definition 1 A system that verifies (H1) and (H2) is said to be stochas- tically contracting in the identity metric, with rate λ and bound C. ConsidernowtheLyapunov-likefunctionV(x) = a b 2 = (a b)T(a k − k − − b). Using (H1) and (H2), we derive below an inequality on AV(x), similar to equation (2.2) in section 2.1.2. e 5 Lemma 1 Under (H1) and (H2), one has the inequality AV(x) 2λV(x)+2C (2.7) ≤ − Proof Since x(t) is an Ietˆo process, A is given by the differential operator L of the process [22, 3]. Thus, by the Itˆo formula e AV(x) = LV(x) = ∂V(x)⌢f(x,t)+ 1tr ⌢σ(x,t)T∂2V(x)⌢σ(x,t) ∂x 2 ∂x2 (cid:18) (cid:19) e ∂V⌢ 1 ⌢ ∂2V ⌢ = f(x,t) + σ(x,t) σ(x,t) i ij kj ∂x 2 ∂x ∂x i i k 1 i 2n 1 i,j,k 2n ≤X≤ ≤X≤ ∂V ∂V = f(a,t) + f(b,t) i i ∂a ∂b i i 1 i n 1 i n ≤X≤ ≤X≤ 1 ∂2V + σ(a,t) σ(a,t) ij kj 2 ∂a ∂a i k 1 i,j,k n ≤X≤ 1 ∂2V + σ(b,t) σ(b,t) ij kj 2 ∂b ∂b i k 1 i,j,k n ≤X≤ = 2(a b)T(f(a,t) f(b,t)) − − +tr(σ(a,t)Tσ(a,t))+tr(σ(b,t)Tσ(b,t)) Fix t 0 and, as in [10], consider the real-valued function ≥ r(µ)= (a b)T(f(µa+(1 µ)b,t) f(b,t)) − − − Sincef is C1, r is C1 over [0,1]. By themean valuetheorem, thereexists µ ]0,1[ such that 0 ∈ r (µ ) = r(1) r(0) = (a b)T(f(a) f(b)) ′ 0 − − − On the other hand, one obtains by differentiating r ∂f r(µ ) = (a b)T (µ a+(1 µ )b,t) (a b) ′ 0 0 0 − ∂a − − (cid:18) (cid:19) Thus, one has ∂f (a b)T(f(a) f(b)) = (a b)T (µ a+(1 µ )b,t) (a b) 0 0 − − − ∂a − − (cid:18) (cid:19) λ(a b)T(a b) = 2λV(x) (2.8) ≤ − − − − where the inequality is obtained by using (H1). 6 Finally, AV(x) = 2(a b)T(f(a) f(b))+tr(σ(a,t)Tσ(a,t))+tr(σ(b,t)Tσ(b,t)) − − 2λV(x)+2C e ≤ − where the inequality is obtained by using (H2). (cid:3) We are now in a position to prove our main theorem on stochastic incre- mental stability. Theorem 2 (Stochastic contraction) Assume that system (2.5) verifies (H1) and (H2). Let a(t) and b(t) be two trajectories whose initial condi- tions are given by a probability distribution p(x(0)) = p(a(0),b(0)). Then + C C t 0 E a(t) b(t) 2 +e 2λt a b 2 dp(a ,b ) − 0 0 0 0 ∀ ≥ k − k ≤ λ k − k − λ Z (cid:20) (cid:21) (cid:0) (cid:1) (2.9) where []+ = max(0, ). This implies in particular · · C t 0 E a(t) b(t) 2 +E a(0) b(0) 2 e 2λt (2.10) − ∀ ≥ k − k ≤ λ k − k (cid:0) (cid:1) (cid:0) (cid:1) Proof Let x = (a ,b ) R2n. By Dynkin’s formula ([22], p. 10) 0 0 0 ∈ t Ex0V(x(t))−V(x0)= Ex0 AV(x(s))ds Z0 Thus one has u,t 0 u t < e ∀ ≤ ≤ ∞ t Ex V(x(t)) Ex V(x(u)) = Ex AV(x(s))ds 0 − 0 0 u Z t Ex (e2λV(x(s))+2C)ds (2.11) ≤ 0 − Zu t = ( 2λEx V(x(s))+2C)ds − 0 Zu where inequality (2.11) is obtained by using lemma 1. Denote by g(t) the deterministic quantity Ex V(x(t)). Clearly, g(t) is a 0 continuous function of t since x(t) is a continuous process. The function g then satisfies the conditions of the Gronwall-type lemma 4 in the Appendix, and as a consequence + C C ∀t≥ 0 Ex0V(x(t)) ≤ λ + V(x0)− λ e−2λT (cid:20) (cid:21) Integrating the above inequality with respect to x yields the desired 0 result (2.9). Next, inequality (2.10) follows from (2.9) by remarking that + C a b 2 dp(a ,b ) a b 2dp(a ,b ) 0 0 0 0 0 0 0 0 k − k − λ ≤ k − k Z (cid:20) (cid:21) Z = E a(0) b(0) 2 (2.12) k − k (cid:0) (cid:1) 7 (cid:3) Remark Let ǫ > 0 and Tǫ = 21λ log E(ka0−ǫb0k2) . Then inequal- (cid:18)q (cid:19) ity (2.10) and Jensen’s inequality [30] imply t T E( a(t) b(t) ) C/λ+ǫ (2.13) ǫ ∀ ≥ k − k ≤ p Since a(t) b(t) isnon-negative, (2.13)together with Markov inequal- k − k ity [11]allow one toobtain thefollowing probabilistic boundon thedistance between a(t) and b(t) C/λ+ǫ A> 0 t T P( a(t) b(t) A) ǫ ∀ ∀ ≥ k − k ≥ ≤ A p Note however that this bound is much weaker than the asymptotic almost-sure bound (2.4). 2.2.3 Generalization to time-varying metrics Theorem 2 can be vastly generalized by considering general time-dependent metrics (the case of state-dependent metrics is not considered in this article and will be the subject of a future work). Specifically, let us replace (H1) and (H2) by the following hypotheses (H1’) There exists a uniformly positive definite metric M(t) = Θ(t)TΘ(t), with the lower-bound β > 0 (i.e. x,t xTM(t)x β x 2) and f(a,t) ∀ ≥ k k is contracting in that metric, with contraction rate λ, i.e. d ∂f λ Θ(t)+Θ(t) Θ 1(t) λ uniformly max − dt ∂a ≤ − (cid:18)(cid:18) (cid:19) (cid:19) or equivalently T ∂f ∂f d M(t) + M(t)+ M(t) 2λM(t) uniformly ∂a ∂a dt ≤ − (cid:18) (cid:19) (H2’) tr σ(a,t)TM(t)σ(a,t) is uniformly upper-boundedby a constant C (cid:0) (cid:1) Definition 2 A system that verifies (H1’) and (H2’) is said to be stochas- tically contracting in the metric M(t), with rate λ and bound C. Consider now the generalized Lyapunov-like function V (x,t) = (a 1 − b)TM(t)(a b). Lemma 1 can then be generalized as follows. − Lemma 2 Under (H1’) and (H2’), one has the inequality AV (x,t) 2λV (x,t)+2C (2.14) 1 1 ≤ − e 8 Proof Let us compute first AV 1 AV (x,t) = ∂V1 + ∂V1⌢f(x,et)+ 1tr ⌢σ(x,t)T ∂2V1⌢σ(x,t) 1 ∂t ∂x 2 ∂x2 (cid:18) (cid:19) d e = (a b)T M(t) (a b)+2(a b)TM(t)(f(a,t) f(b,t)) − dt − − − (cid:18) (cid:19) +tr(σ(a,t)TM(t)σ(a,t))+tr(σ(b,t)TM(t)σ(b,t)) Fix t > 0 and consider the real-valued function r(µ)= (a b)TM(t)(f(µa+(1 µ)b,t) f(b,t)) − − − Sincef is C1, r is C1 over [0,1]. By themean valuetheorem, thereexists µ ]0,1[ such that 0 ∈ r (µ ) = r(1) r(0)= (a b)TM(t)(f(a) f(b)) ′ 0 − − − On the other hand, one obtains by differentiating r ∂f r (µ )= (a b)TM(t) (µ a+(1 µ )b,t) (a b) ′ 0 0 0 − ∂a − − (cid:18) (cid:19) Thus, letting c = µ a+(1 µ )b, one has 0 0 − (a b)T dM(t) (a b)+2(a b)TM(t)(f(a) f(b)) − dt − − − = (a b)T dM(t) (a b)+2(a b)TM(t) ∂f(c,t) (a b) − (cid:0)dt (cid:1) − − ∂a − =(a b)T (cid:0)dM(t)+(cid:1) M(t) ∂f(c,t) + ∂f(c,t(cid:0)) T M(t)(cid:1) (a b) − dt ∂a ∂a − (cid:16) 2λ(a b)TM(cid:0) (t)(a (cid:1)b)(cid:0)= 2λV(cid:1)(x) (cid:17) (2.15) 1 ≤ − − − − where the inequality is obtained by using (H1’). Finally, combining equation (2.15) with (H2’) allows to obtain the de- sired result. (cid:3) We can now state the generalized stochastic contraction theorem Theorem 3 (Generalized stochastic contraction) Assumethatsystem (2.5) verifies (H1’) and (H2’). Let a(t) and b(t) be two trajectories whose initialconditionsaregivenbyaprobabilitydistributionp(x(0)) = p(a(0),b(0)). Then t 0 E (a(t) b(t))TM(t)(a(t) b(t)) ∀ ≥ − − ≤ + C +e 2λt (a b(cid:0))TM(0)(a b ) C dp(a ,(cid:1)b ) (2.16) − 0 0 0 0 0 0 λ − − − λ Z (cid:20) (cid:21) In particular, 1 C t 0 E a(t) b(t) 2 +E a(0) b(0) 2 e 2λt (2.17) − ∀ ≥ k − k ≤ β λ k − k (cid:18) (cid:19) (cid:0) (cid:1) (cid:0) (cid:1) 9 Proof Following the same reasoning as in the proof of theorem 2, one obtains + C C ∀t ≥ 0 Ex0V1(x(t)) ≤ λ + V1(x0)− λ e−2λt (cid:20) (cid:21) whichleadsto(2.16)byintegratingwithrespectto(a ,b ). Next,observing 0 0 that 1 1 a(t) b(t) 2 (a(t) b(t))TM(t)(a(t) b(t)) = EV (x(t)) 1 k − k ≤ β − − β and using the same bounding as in (2.12) lead to (2.17). (cid:3) 2.3 Strength of the stochastic contraction theorem 2.3.1 “Optimality” of the mean square bound Consider the following linear dynamical system, known as the Ornstein- Uhlenbeck (colored noise) process da = λadt+σdW (2.18) − Clearly, the noise-free system is contracting with rate λ and the trace of the noise matrix is upper-bounded by σ2. Let a(t) and b(t) be two system trajectories starting respectively at a and b (deterministic initial condi- 0 0 tions). Then by theorem 2, we have σ2 σ2 + t 0 E (a(t) b(t))2 + (a b )2 e 2λt (2.19) 0 0 − ∀ ≥ − ≤ λ − − λ (cid:20) (cid:21) (cid:0) (cid:1) Let us verify this result by solving directly equation (2.18). Thesolution of equation (2.18) is ([3], p. 134) t a(t) = a e λt+σ eλ(s t)dW(s) (2.20) 0 − − Z0 Next, let us compute the mean square distance between the two trajec- tories a(t) and b(t) E((a(t) b(t))2) = (a b )2e 2λt+ 0 0 − − − t 2 t 2 σ2 E eλ(s t)dW (s) +E eλ(u t)dW (u) − 1 − 2 (cid:18)Z0 (cid:19) ! (cid:18)Z0 (cid:19) !! σ2 = (a b )2e 2λt+ (1 e 2λt) 0 0 − − − λ − σ2 σ2 + + (a b )2 e 2λt 0 0 − ≤ λ − − λ (cid:20) (cid:21) The last inequality is in fact an equality when (a b )2 σ2. Thus, 0 − 0 ≥ λ this calculation shows that the upper-bound (2.19) given by theorem 2 is optimal, in the sense that it can be attained. 10