Thermodynamics of efficient-robust transport over networks

Yongxin Chen,1 Tryphon T. Georgiou,2 Michele Pavon,3 and Allen Tannenbaum4
1Y. Chen is with MSKCC, New York, NY 10065, USA∗
2Department of Mechanical and Aerospace Engineering, University of California, Irvine, CA 92697†
3Dipartimento di Matematica “Tullio Levi Civita”, Università di Padova, 35121 Padova, Italy‡
4Department of Computer Science and Applied Mathematics & Statistics, Stony Brook University, Stony Brook, NY 11794§

arXiv:1701.07625v1 [math-ph] 26 Jan 2017

Schrödinger’s bridges offer a flexible and powerful tool to handle transport problems away from equilibrium over unweighted and weighted graphs. The resulting scheduling consists in selecting transition probabilities for a Markovian evolution which has prescribed initial and final marginals, as specified by the transportation task. From a computational standpoint, viewing scheduling as a Schrödinger bridge problem is especially attractive in light of an algorithm with linear convergence (in the Hilbert metric) that we proposed recently. The present paper builds on this line of research by introducing an index of efficiency of a transportation plan and points, accordingly, to efficient-robust transport policies. In developing the theory, we establish two remarkable invariance properties of Schrödinger’s bridges on graphs: an iterated bridge invariance property and the invariance of the most probable paths. These two invariance properties, which were tangentially mentioned in [17] in some special cases, are fully developed here. We also show that the distribution on paths of the optimal transport policy, which depends on a “temperature” parameter in a manner completely analogous to that of the Boltzmann distribution, tends to the solution of the “most economical” but possibly less robust optimal mass transport problem as the temperature goes to zero.
The relevance of all of these properties for transport over networks is illustrated in an example.

Keywords: Transport over networks, efficient transport, robust transport, Schrödinger bridge, most probable path, temperature parameter.

∗Electronic address: [email protected]; URL: www.tc.umn.edu/~chen2468/
†Electronic address: [email protected]; URL: engineering.uci.edu/users/tryphon-georgiou
‡Electronic address: [email protected]; URL: www.math.unipd.it/~pavon
§Electronic address: [email protected]; URL: www.iacs.stonybrook.edu/people/affiliates/allen-tannenbaum

I. INTRODUCTION

We consider a graph to specify the network of routes over which resources are to be redistributed. Thus, throughout, the terms graph and network will be used interchangeably. Network robustness is usually understood as the ability of the network to maintain connectivity or be insensitive (as measured by observables) in the face of link and node losses or disturbances. Various notions of robustness have been put forth and studied. In particular, robustness with respect to link and node losses has been considered as an inverse percolation process [1, 5]. Robust network design has been used to suggest the ability to meet demands in a given uncertainty set [7]. On the other hand, in [8, 13], robustness has been defined through a fluctuation-dissipation relation involving the entropy rate (Kolmogorov-Sinai entropy). This latter notion captures relaxation of a process back to equilibrium after a perturbation and has been used to study both financial and biological networks [39, 40].

A different issue is whether, given a network, a certain specific transport strategy over a given finite time horizon is in some sense robust. Here equilibrium or even steady state behaviour plays no role. A concept of resilience of a routing policy in the presence of cascading failures has been introduced and discussed in [41].
In a recent paper [17], we studied the following robust transport problem: Given times $t = 0,1,\dots,N$ and a directed, strongly connected graph, find a transportation plan from a source node to a sink node such that most of the mass arrives within the prescribed time even in the presence of failures. Unlike several other contributions, our approach is “global” in that measures on paths play a central role. In [17], robustness of the transportation plan was identified with the mass spreading as much as the topology of the graph permitted before reconvening at the target node. This was achieved by solving a maximum entropy problem on path space, namely a Schrödinger Bridge Problem (SBP), with the so-called Ruelle-Bowen random walk as “prior” distribution. This approach was then extended to weighted graphs, resulting in a transportation plan that attains a satisfactory compromise between robustness and cost. Indeed, the optimal scheduling, while spreading the mass over all feasible paths, assigns maximum probability to all minimum cost paths. This approach also appears computationally attractive given the iterative algorithm proposed in [23].

In the present paper, we continue this line of research by introducing an explicit index of efficiency of a transportation plan and seeking efficient-robust transport plans. Moreover, we establish two remarkable invariance properties of Schrödinger’s bridges. These consist in an iterated bridge invariance property and in the invariance of the most probable paths. These two invariance properties, which were only briefly mentioned in [17] in some special cases, are here fully investigated. Their relevance for transport over networks is then illustrated. Further, we study the dependence of the optimal transport on the temperature parameter. The possibility of employing the solution for near-zero temperature as an approximation of the solution to Optimal Mass Transport (OMT) is also discussed and illustrated through examples.

The outline of the remainder of the paper is as follows. After recalling in Section II the notion of generalized Schrödinger bridges, we introduce in Section III a notion of efficiency of the corresponding transport policy. In Section IV, we introduce robust transport with fixed average path length. The next two sections deal with properties of the Schrödinger bridge: in Section V we establish the iterated bridge property, and in Section VI the invariance of the most probable paths. Section VII deals with efficient-robust transportation. In Section VIII, the dependence of the optimal transport on the temperature parameter is thoroughly investigated. The results are then illustrated through academic examples in Section IX.

II. GENERALIZED SCHRÖDINGER’S BRIDGES

We consider the generalization of the Schrödinger Bridge Problem (SBP) introduced in [17]. To this end we are given a directed, strongly connected (i.e., with a path in each direction between each pair of vertices), aperiodic graph $G = (\mathcal{X},\mathcal{E})$ with vertex set $\mathcal{X} = \{1,2,\dots,n\}$ and edge set $\mathcal{E} \subseteq \mathcal{X}\times\mathcal{X}$. We let time vary in $\mathcal{T} = \{0,1,\dots,N\}$, and let $\mathcal{FP}_0^N \subseteq \mathcal{X}^{N+1}$ denote the family of length $N$, feasible paths $x = (x_0,\dots,x_N)$, namely paths such that $x_i x_{i+1} \in \mathcal{E}$ for $i = 0,1,\dots,N-1$. We seek a probability distribution $P$ on $\mathcal{FP}_0^N$ with prescribed initial and final marginal probability distributions $\nu_0(\cdot)$ and $\nu_N(\cdot)$, respectively, and such that the resulting random evolution is closest to a “prior” measure $M$ on $\mathcal{FP}_0^N$ in a suitable sense.

The prior law for our problem is induced by the Markovian evolution
\[ \mu_{t+1}(x_{t+1}) = \sum_{x_t\in\mathcal{X}} \mu_t(x_t)\, m_{x_t x_{t+1}}(t) \tag{1} \]
with nonnegative distributions $\mu_t(\cdot)$ over $\mathcal{X}$, $t \in \mathcal{T}$, and weights $m_{ij}(t) \ge 0$ for all indices $i,j \in \mathcal{X}$ and all times. Moreover, to respect the topology of the graph, $m_{ij}(t) = 0$ for all $t$ whenever $ij \notin \mathcal{E}$. Often, but not always, the matrix
\[ M(t) = [m_{ij}(t)]_{i,j=1}^{n} \tag{2} \]
does not depend on $t$.
The rows of the transition matrix $M(t)$ do not necessarily sum up to one, so that the “total transported mass” is not necessarily preserved. This occurs, for instance, when $M$ simply encodes the topological structure of the network with $m_{ij}$ being zero or one, depending on whether a certain link exists.

The evolution (1), together with the measure $\mu_0(\cdot)$, which we assume positive on $\mathcal{X}$, i.e.,
\[ \mu_0(x) > 0 \ \text{ for all } x \in \mathcal{X}, \tag{3} \]
induces a measure $M$ on $\mathcal{FP}_0^N$ as follows. It assigns to a path $x = (x_0,x_1,\dots,x_N) \in \mathcal{FP}_0^N$ the value
\[ M(x_0,x_1,\dots,x_N) = \mu_0(x_0)\, m_{x_0x_1}\cdots m_{x_{N-1}x_N}, \tag{4} \]
and gives rise to a flow of one-time marginals
\[ \mu_t(x_t) = \sum_{x_\ell,\ \ell\neq t} M(x_0,x_1,\dots,x_N), \quad t \in \mathcal{T}. \]

Definition 1 We denote by $\mathcal{P}(\nu_0,\nu_N)$ the family of probability distributions on $\mathcal{FP}_0^N$ having the prescribed marginals $\nu_0(\cdot)$ and $\nu_N(\cdot)$.

We seek a distribution in this set which is closest to the prior $M$ in relative entropy where, for $P$ and $Q$ measures on $\mathcal{X}^{N+1}$, the relative entropy (divergence, Kullback-Leibler index) $D(P\|Q)$ is
\[ D(P\|Q) := \begin{cases} \sum_{x} P(x)\log\dfrac{P(x)}{Q(x)}, & \mathrm{Supp}(P) \subseteq \mathrm{Supp}(Q),\\[4pt] +\infty, & \mathrm{Supp}(P) \not\subseteq \mathrm{Supp}(Q). \end{cases} \]
Here, by definition, $0\cdot\log 0 = 0$. Naturally, while the value of $D(P\|Q)$ may turn out negative due to mismatch of scaling (in case $Q = M$ is not a probability measure), the relative entropy is always jointly convex. We consider the Schrödinger Bridge Problem (SBP):

Problem 1 Determine
\[ M^*[\nu_0,\nu_N] := \mathrm{argmin}\{D(P\|M) \mid P \in \mathcal{P}(\nu_0,\nu_N)\}. \tag{5} \]

The following result is a slight generalization (to time inhomogeneous prior) of [17, Theorem 2.3].

Theorem 1 Assume that the product $M(N-1)M(N-2)\cdots M(1)M(0)$ has all entries positive.
Then there exist nonnegative functions $\varphi(\cdot)$ and $\hat\varphi(\cdot)$ on $[0,N]\times\mathcal{X}$ satisfying
\[ \varphi(t,i) = \sum_{j} m_{ij}(t)\,\varphi(t+1,j), \tag{6a} \]
\[ \hat\varphi(t+1,j) = \sum_{i} m_{ij}(t)\,\hat\varphi(t,i), \tag{6b} \]
for $t \in [0,N-1]$, along with the (nonlinear) boundary conditions
\[ \varphi(0,x_0)\,\hat\varphi(0,x_0) = \nu_0(x_0), \tag{6c} \]
\[ \varphi(N,x_N)\,\hat\varphi(N,x_N) = \nu_N(x_N), \tag{6d} \]
for $x_0,x_N \in \mathcal{X}$. Moreover, the solution $M^*[\nu_0,\nu_N]$ to Problem 1 is unique and obtained by
\[ M^*(x_0,\dots,x_N) = \nu_0(x_0)\,\pi_{x_0x_1}(0)\cdots\pi_{x_{N-1}x_N}(N-1), \]
where the one-step transition probabilities
\[ \pi_{ij}(t) := m_{ij}(t)\,\frac{\varphi(t+1,j)}{\varphi(t,i)} \tag{7} \]
are well defined. The factors $\varphi$ and $\hat\varphi$ are unique up to multiplication of $\varphi$ by a positive constant and division of $\hat\varphi$ by the same constant.

Let $\varphi(t)$ and $\hat\varphi(t)$ denote the column vectors with components $\varphi(t,i)$ and $\hat\varphi(t,i)$, respectively, with $i \in \mathcal{X}$. In matrix form, (6a), (6b) and (7) read
\[ \varphi(t) = M(t)\,\varphi(t+1), \qquad \hat\varphi(t+1) = M(t)^T\hat\varphi(t), \tag{8} \]
and
\[ \Pi(t) := [\pi_{ij}(t)] = \mathrm{diag}(\varphi(t))^{-1} M(t)\,\mathrm{diag}(\varphi(t+1)). \tag{9} \]

Historically, the SBP was posed in 1931/32 by Erwin Schrödinger for Brownian particles, motivated by a large-deviations problem for the empirical distribution [42, 43]. Other important contributions have been provided by Fortet, Beurling, Jamison and Föllmer [9, 21, 22, 26, 27]; see [47] for a survey. The problem was considered in the context of Markov chains and studied in [23, 35], and some generalizations have been discussed in [17]. Important connections between SBP and OMT [3, 45, 46] have been discovered and developed in [14–16, 29, 30, 32, 33].

III. EFFICIENCY OF A TRANSPORT PLAN

Inspired by the celebrated paper [49], we introduce below a measure of efficiency of a transportation plan over a certain finite time horizon and a given network.

For the case of undirected and connected graphs, small-world networks [49] were identified as networks being highly clustered but with small characteristic path length $L$, where
\[ L := \frac{1}{n(n-1)}\sum_{i\neq j} d_{ij} \]
and $d_{ij}$ is the shortest path length between vertices $i$ and $j$.
The inverse of the characteristic path length $L^{-1}$ is an index of efficiency of $G$. There are other such indexes, most notably the global efficiency $E_{\rm glob}$ introduced in [28]. This is defined as $E_{\rm glob} = E(G)/E(G_{\rm id})$ where
\[ E(G) = \frac{1}{n(n-1)}\sum_{i\neq j}\frac{1}{d_{ij}} \]
and $G_{\rm id}$ is the complete network with all possible edges in place. Thus, $0 \le E_{\rm glob} \le 1$. However, as argued in [28, p. 198701-2], it is $1/L$ which “measures the efficiency of a sequential system (i.e., only one packet of information goes along the network)”. $E_{\rm glob}$, instead, measures the efficiency of a parallel system, namely one in which all nodes concurrently exchange packets of information. Since we are interested in the efficiency of a specific transportation plan, we define below efficiency by a suitable adaptation of the index $L$.

We now consider a strongly connected, aperiodic, directed graph $G = (\mathcal{X},\mathcal{E})$ as in Section II. To each edge $ij$ is now associated a length $l_{ij} \ge 0$. If $ij \notin \mathcal{E}$, we set $l_{ij} = +\infty$. Let $\mathcal{T} = \{0,1,\dots,N\}$ be the time-indexing set. For a path $x = (x_0,\dots,x_N) \in \mathcal{X}^{N+1}$, we define the length of $x$ to be
\[ l(x) = \sum_{t=0}^{N-1} l_{x_t x_{t+1}}. \]
We consider the situation where initially at time $t = 0$ the mass is distributed on $\mathcal{X}$ according to $\nu_0(x)$ and needs to be distributed according to $\nu_N(x)$ at the final time $t = N$. These masses are normalized to sum to one, so that they are probability distributions. A transportation plan $P$ is a probability measure on the (feasible) paths of the network having the prescribed marginals $\nu_0$ and $\nu_N$ at the initial and final time, respectively. A natural adaptation of the characteristic path length is to consider the average path length of the transportation plan $P$, which we define as
\[ L(P) = \sum_{x\in\mathcal{X}^{N+1}} l(x)\,P(x) \tag{10} \]
with the usual convention $+\infty\times 0 = 0$. This is entirely analogous to a thermodynamic quantity, the internal energy, which is defined as the expected value of the Hamiltonian observable in state $P$.
Clearly, $L(P)$ is finite if and only if the transport takes place on actual, existing links of $G$. Moreover, only the paths which are in the support of $P$ enter in the computation of $L(P)$. One of the goals of a transportation plan is of course to have small average path length since, for instance, cost might simply be proportional to length. Determining the probability measure that minimizes (10) can be seen to be an OMT problem.

IV. ROBUST TRANSPORT WITH FIXED AVERAGE PATH LENGTH

Besides efficiency, another desirable property of a transport strategy is to ensure robustness with respect to link/node failures, the latter being due possibly to malicious attacks. We therefore seek a transport plan in which the mass spreads, as much as it is allowed by the network topology, before reconvening at time $t = N$ in the sink node. We achieve this by selecting a transportation plan $P$ that has a suitably high entropy $S(P)$, where
\[ S(P) = -\sum_{x\in\mathcal{X}^{N+1}} P(x)\ln P(x). \tag{11} \]
Thus, in order to attain a level of robustness while guaranteeing a relatively low average path length (cost), we formulate below a constrained optimization problem that weighs in both $S(P)$ as well as $L(P)$.

We begin by letting $\bar L$ designate a suitable bound on the average path length (cost) that we are willing to accept. We assume that $\bar L$ satisfies
\[ l_m := \min_{x\in\mathcal{X}^{N+1}} l(x) \;\le\; \bar L \;\le\; \frac{1}{|\mathcal{FP}_0^N|}\sum_{x\in\mathcal{FP}_0^N} l(x). \tag{12} \]
The inequality on the left is clear, whereas the rationale behind the other, requiring an upper bound as stated, will be explained in Proposition 1 below. Let $\mathcal{P}$ denote the family of probability measures on $\mathcal{X}^{N+1}$. We consider the constrained optimization (maximum entropy) problem:

Problem 2
\[ \text{maximize } \{S(P) \mid P \in \mathcal{P}\} \tag{13a} \]
\[ \text{subject to } L(P) = \bar L. \tag{13b} \]

Consider the corresponding Lagrangian
\[ \mathcal{L}(P,\lambda) := S(P) + \lambda(\bar L - L(P)). \tag{14} \]
As is well known [38], the maximization of $\mathcal{L}$ over $\mathcal{P}$ is attained by the Boltzmann distribution
\[ P_T^*(x) = Z(T)^{-1}\exp\left[-\frac{l(x)}{T}\right], \qquad Z(T) = \sum_{x}\exp\left[-\frac{l(x)}{T}\right], \qquad T = \lambda^{-1}. \tag{15} \]
Clearly, the Boltzmann distribution has support on the feasible paths $\mathcal{FP}_0^N$. Hence, we get a version of Gibbs’ variational principle that the Boltzmann distribution $P_T^*$ minimizes the free energy functional
\[ F(P,T) := L(P) - T S(P) \]
over $\mathcal{P}$. The simplest way to establish the minimizing property of the Boltzmann distribution is to observe that
\[ F(P,T) = T\,D(P\|P_T^*) - T\log Z, \]
and therefore, minimizing the free energy over $\mathcal{P}$ is equivalent to:

Problem 3
\[ \text{minimize } \{D(P\|P_T^*) \mid P \in \mathcal{P}\}, \]
which has the unique minimum $P = P_T^*$. The following properties of $P_T^*$ are also well known, see e.g. [34, Chapter 2].

Proposition 1 The following hold:
i) For $T \nearrow +\infty$, $P_T^*$ tends to $P_u$, the uniform distribution on all feasible paths.
ii) For $T \searrow 0$, $P_T^*$ tends to concentrate on the set of feasible, minimum length paths.
iii) Assuming that $l(\cdot)$ is not constant over $\mathcal{FP}_0^N$ then, for each value $\bar L$ satisfying the bounds (12), there exists a unique nonnegative value of $T = \lambda^{-1} \in [0,\infty]$ such that $P_T^*$ satisfies (13b) and therefore solves Problem 2.

Consider now $\nu_0$ and $\nu_N$ distributions on $\mathcal{X}$. These are the “starting” and “ending” concentrations of resources for which we seek a transportation plan. We denote by $\mathcal{P}(\nu_0,\nu_N)$ the family of probability distributions on paths $x \in \mathcal{X}^{N+1}$ having $\nu_0$ and $\nu_N$ as initial and final marginals, respectively, and we consider the new Maximum Entropy problem:

Problem 4
\[ \text{maximize } \{S(P) \mid P \in \mathcal{P}(\nu_0,\nu_N)\} \tag{16} \]
\[ \text{subject to } L(P) = \bar L. \tag{17} \]

The solution to Problem 4 depends on $\bar L$ as well as the two marginals $\nu_0,\nu_N$ and, when $\bar L$ is too close to $l_m$, the problem may not be feasible. As before, maximizing the Lagrangian (14) over $\mathcal{P}(\nu_0,\nu_N)$, that is, minimizing the free energy $F(P,T)$ over this set, is equivalent to the following Schrödinger Bridge problem:

Problem 5
\[ \text{minimize } \{D(P\|P_T^*) \mid P \in \mathcal{P}(\nu_0,\nu_N)\}. \]
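The Boltzmann law (15), the free-energy identity used to pass to Problem 3, and the temperature limits of Proposition 1 can be checked by brute force on a small graph. The following Python sketch is only an illustration: the 4-node graph, its edge lengths `LENGTHS`, the horizon `N = 3`, the temperature values and all function names are assumptions introduced here, not taken from the paper.

```python
import math
from itertools import product

# Hypothetical example data (assumptions, not from the paper): edge lengths
# l_ij on a 4-node directed graph; absent edges have length +inf.
LENGTHS = {(0, 1): 1.0, (0, 2): 2.0, (1, 2): 1.0, (1, 3): 2.0,
           (2, 3): 1.0, (3, 0): 1.0, (3, 3): 1.0}
n, N = 4, 3  # number of nodes, number of steps


def path_length(path):
    """l(x): sum of the edge lengths along the path."""
    return sum(LENGTHS.get(edge, math.inf) for edge in zip(path, path[1:]))


# Feasible paths: every step follows an existing edge.
FEASIBLE = [p for p in product(range(n), repeat=N + 1)
            if path_length(p) < math.inf]


def boltzmann(T):
    """Return (P_T^*, Z(T)) as in (15), restricted to the feasible paths."""
    weights = {p: math.exp(-path_length(p) / T) for p in FEASIBLE}
    Z = sum(weights.values())
    return {p: w / Z for p, w in weights.items()}, Z


def avg_length(P):  # L(P), eq. (10)
    return sum(path_length(p) * q for p, q in P.items())


def entropy(P):  # S(P), eq. (11), with 0 ln 0 = 0
    return -sum(q * math.log(q) for q in P.values() if q > 0)


def free_energy(P, T):  # F(P, T) = L(P) - T S(P)
    return avg_length(P) - T * entropy(P)


def rel_entropy(P, Q):  # D(P || Q) for P, Q supported on FEASIBLE
    return sum(q * math.log(q / Q[p]) for p, q in P.items() if q > 0)


def temperature_for(L_bar, lo=0.02, hi=1e3, steps=200):
    """Bisection for T with L(P_T^*) = L_bar; L(P_T^*) is nondecreasing in T."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if avg_length(boltzmann(mid)[0]) < L_bar:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


T = 0.7
P_star, Z = boltzmann(T)
P_unif = {p: 1.0 / len(FEASIBLE) for p in FEASIBLE}

# Identity F(P, T) = T D(P || P_T^*) - T log Z, checked for the uniform plan.
gap = free_energy(P_unif, T) - (T * rel_entropy(P_unif, P_star) - T * math.log(Z))

# Proposition 1 iii): pick L_bar between l_m and the uniform average length.
l_min = min(path_length(p) for p in FEASIBLE)
L_bar = 0.5 * (l_min + avg_length(P_unif))
T_bar = temperature_for(L_bar)
```

On this toy example, `boltzmann` evaluated at a large temperature is nearly uniform over the feasible paths, while at a small temperature it puts almost all mass on the minimum-length paths. The helper `temperature_for` recovers by bisection a temperature matching a target average length within the bounds (12); it relies on the fact that the average length under the Boltzmann law is nondecreasing in $T$.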
Thus, employing path space entropy as a measure of robustness, the solution $M_T^*(\nu_0,\nu_N)$ to Problem 5 (constructed in accordance with Theorem 1) minimizes a suitable free energy functional, with the temperature parameter specifying the tradeoff between efficiency and robustness.

Before we study Problem 5 in detail, we observe the Markovian nature of the measure $P_T^*$. Indeed, recall that a positive measure $M$ on $\mathcal{X}^{N+1}$ is Markovian if it can be expressed as in (4). Since
\[ P_T^*(x_0,x_1,\dots,x_N) = Z(T)^{-1}\prod_{t=0}^{N-1}\exp\left[-\frac{l_{x_t x_{t+1}}}{T}\right], \tag{18} \]
which is exactly in the form (4), we conclude that $P_T^*$ is (time-homogeneous) Markovian with uniform initial measure $\mu(x_0) \equiv Z(T)^{-1}$ and time-invariant transition matrix given by
\[ M_T = \left[\exp\left(-\frac{l_{ij}}{T}\right)\right]. \tag{19} \]
Thus, Problem 5 can be viewed as an SBP as in Section II where the “prior” measure $P_T^*$ is Markovian. Observe however that, in general, $M_T$ is not stochastic (rows do not sum to one).

V. ITERATED BRIDGES

Consider now two different initial-final marginals $\pi_0$ and $\pi_N$, and consider the SBP with these marginals, taking as prior first $M$, as in Section II, and then $M^*[\nu_0,\nu_N]$. We claim that we get the same solution in both cases.

Indeed, take $M^*[\nu_0,\nu_N]$ as prior and consider the corresponding new Schrödinger system (in matrix form)
\[ \psi(t) = \Pi(t)\,\psi(t+1), \qquad \hat\psi(t+1) = \Pi(t)^T\hat\psi(t), \qquad \Pi(t) = \mathrm{diag}(\varphi(t))^{-1}M(t)\,\mathrm{diag}(\varphi(t+1)), \]
which can be written as
\[ \mathrm{diag}(\varphi(t))\,\psi(t) = M(t)\,\mathrm{diag}(\varphi(t+1))\,\psi(t+1), \tag{20a} \]
\[ \mathrm{diag}(\varphi(t+1))^{-1}\hat\psi(t+1) = M(t)^T\,\mathrm{diag}(\varphi(t))^{-1}\hat\psi(t). \tag{20b} \]
Observe that when $M(N-1)\cdot M(N-2)\cdots M(1)\cdot M(0)$ has all positive elements (this is the key assumption in Theorem 1) so does $\Pi(N-1)\cdot\Pi(N-2)\cdots\Pi(1)\cdot\Pi(0)$. The new transition matrix $Q^*$ is given by
\[ Q^*(t) = [q^*_{ij}(t)] = \mathrm{diag}(\psi(t))^{-1}\,\Pi(t)\,\mathrm{diag}(\psi(t+1)) \tag{21} \]
\[ \phantom{Q^*(t)} = \mathrm{diag}(\psi(t))^{-1}\mathrm{diag}(\varphi(t))^{-1}M(t)\,\mathrm{diag}(\varphi(t+1))\,\mathrm{diag}(\psi(t+1)), \tag{22} \]
while
\[ \psi(0,x_0)\,\hat\psi(0,x_0) = \pi_0(x_0), \tag{23} \]
\[ \psi(N,x_N)\,\hat\psi(N,x_N) = \pi_N(x_N). \tag{24} \]
Let $\psi_1(t) = \mathrm{diag}(\varphi(t))\,\psi(t)$ and $\hat\psi_1(t) = \mathrm{diag}(\varphi(t))^{-1}\hat\psi(t)$ so that
\[ Q^*(t) = \mathrm{diag}(\psi_1(t))^{-1}M(t)\,\mathrm{diag}(\psi_1(t+1)). \]
By (20), $\psi_1$ and $\hat\psi_1$ are vectors with positive components satisfying
\[ \psi_1(t) = M(t)\,\psi_1(t+1), \qquad \hat\psi_1(t+1) = M(t)^T\hat\psi_1(t). \]
Moreover, they satisfy the boundary conditions
\[ \psi_1(0,x_0)\,\hat\psi_1(0,x_0) = \pi_0(x_0), \tag{25} \]
\[ \psi_1(N,x_N)\,\hat\psi_1(N,x_N) = \pi_N(x_N). \tag{26} \]
Thus, $(\psi_1,\hat\psi_1)$ provide the solution when $M$ is taken as prior.

Alternatively, observe that the transition matrix $Q^*(t)$ resulting from the two problems is the same and so is the initial marginal. Hence, the solutions of the SBP with marginals $\pi_0$ and $\pi_N$ and prior transitions $\Pi(t)$ and $M(t)$ are identical. Thus, “the bridge over a bridge over a prior” is the same as the “bridge over the prior,” i.e., iterated bridges produce the same result.

It should be observed that this result for probability distributions is not surprising since the solution is in the same reciprocal class as the prior (namely, it has the same three times transition probability), cf. [26, 31, 50]. It could then be described as the fact that only the reciprocal class of the prior matters; this can be seen from Schrödinger’s original construction [42, 43], and also [23, Section III B] for the case of Markov chains. This result, however, is more general since the prior is not necessarily a probability measure.

In information theoretic terms, the bridge (i.e., probability law on path spaces) corresponding to $Q^*$ is the I-projection in the sense of Csiszár [6] of the prior onto the set of measures that are consistent with the initial-final marginals. The above result, however, is not simply an “iterated information-projection” property, since $M^*[\nu_0,\nu_N]$ is the I-projection of $M$ onto $\mathcal{P}(\nu_0,\nu_N)$, which does not contain $\mathcal{P}(\pi_0,\pi_N)$, being in fact disjoint from it.

VI. INVARIANCE OF MOST PROBABLE PATHS

In an influential paper, building on the work of Jamison and the logarithmic transformation of Fleming, Holland, Mitter and others, Dai Pra established in 1991 the connection between SBP and stochastic control [18]. At about the same time, Blaquière and others [10, 19, 20, 36] studied the control of the Fokker-Planck equation, and more recently Brockett studied control of the Liouville equation [11]. In [18, Section 5], Dai Pra established an interesting path-space property of the Schrödinger bridge for diffusion processes, namely that the “most probable path” [12, 44] of the prior and the solution are the same. Loosely speaking, a most probable path is similar to a mode for the path space measure $P$. More precisely, if both drift $b(\cdot,\cdot)$ and diffusion coefficient $\sigma(\cdot,\cdot)$ of the Markov diffusion process
\[ dX_t = b(X_t,t)\,dt + \sigma(X_t,t)\,dW_t \]
are smooth and bounded, with $\sigma(x,t)\sigma(x,t)^T > \eta I$, $\eta > 0$, and $\{x(t) \mid 0 \le t \le T\}$ is a path of class $C^2$, then there exists an asymptotic estimate of the probability $P$ of a small tube around $x(t)$ of radius $\epsilon$. It follows from this estimate that the most probable path is the minimizer in a deterministic calculus of variations problem where the Lagrangian is an Onsager-Machlup functional, see [25, p. 532] for the full story.

The concept of most probable path is, of course, much less delicate in our discrete setting. We define it for general positive measures on paths. Given a positive measure $M$ as in Section II on the feasible paths of our graph $G$, we say that $x = (x_0,\dots,x_N) \in \mathcal{FP}_0^N$ is of maximal mass if for all other feasible paths $y \in \mathcal{FP}_0^N$ we have $M(y) \le M(x)$. Likewise we consider paths of maximal mass connecting particular nodes. It is apparent that paths of maximal mass always exist but are, in general, not unique. If $M$ is a probability measure, then the maximal mass paths, i.e., the most probable paths, are simply the modes of the distribution.
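In the discrete setting, a path of maximal mass between two given nodes can be found without enumerating all feasible paths, by a max-product dynamic program (a Viterbi-type recursion). This technique is not discussed in the text above; it is offered here only as a sketch, and the 4-node weight matrix `M`, the horizon `N` and the function name `max_mass_path` are hypothetical. The initial weight $\mu_0(x_0)$ is a common factor for all paths starting at $x_0$, so it is omitted.

```python
# Hypothetical time-invariant prior weights m_ij (an assumption for
# illustration); zero entries mark absent edges.
M = [[0.0, 0.5, 0.2, 0.0],
     [0.0, 0.0, 0.5, 0.2],
     [0.0, 0.0, 0.0, 0.5],
     [0.5, 0.0, 0.0, 0.5]]
N = 4  # number of steps


def max_mass_path(M, x0, xN, N):
    """Return one maximal mass N-step path from x0 to xN and its weight.

    best[t][j] is the largest product of weights over t-step paths from
    x0 to j (the factor mu_0(x0) is left out); arg[t][j] stores the
    predecessor achieving it.
    """
    n = len(M)
    best = [[0.0] * n for _ in range(N + 1)]
    arg = [[None] * n for _ in range(N + 1)]
    best[0][x0] = 1.0
    for t in range(1, N + 1):
        for j in range(n):
            for i in range(n):
                cand = best[t - 1][i] * M[i][j]
                if cand > best[t][j]:  # strict: ties keep the lowest i
                    best[t][j], arg[t][j] = cand, i
    if best[N][xN] == 0.0:
        return None, 0.0  # no feasible N-step path from x0 to xN
    # Backtrack from the end node through the stored predecessors.
    path = [xN]
    for t in range(N, 0, -1):
        path.append(arg[t][path[-1]])
    return tuple(reversed(path)), best[N][xN]


path, mass = max_mass_path(M, 0, 3, N)
```

The recursion returns a single maximizer; since paths of maximal mass need not be unique, ties are broken here in favor of the lowest-numbered predecessor.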
We establish below that the maximal mass paths joining two given nodes under the solution of a Schrödinger Bridge problem as in Section II are the same as for the prior measure.

Proposition 2 Consider marginals $\nu_0$ and $\nu_N$ in Problem 1. Assume that $\nu_0(x) > 0$ on all nodes $x \in \mathcal{X}$ and that the product $M(N-1)\cdot M(N-2)\cdots M(1)\cdot M(0)$ of transition probability matrices of the prior has all positive elements (cf. with $M$’s as in (2)). Let $x_0$ and $x_N$ be any two nodes. Then, under the solution $M^*[\nu_0,\nu_N]$ of the SBP, the family of maximal mass paths joining $x_0$ and $x_N$ in $N$ steps is the same as under the prior measure $M$.

Proof. Suppose path $y = (y_0 = x_0, y_1,\dots,y_{N-1}, y_N = x_N)$ has maximal mass under the prior $M$. In view of (4) and (7) and assumption (3), we have
\[ M^*[\nu_0,\nu_N](y) = \nu_0(y_0)\,\pi_{y_0y_1}(0)\cdots\pi_{y_{N-1}y_N}(N-1) = \frac{\nu_0(x_0)}{\mu_0(x_0)}\frac{\varphi(N,y_N)}{\varphi(0,y_0)}\,M(y_0,y_1,\dots,y_N) = \frac{\nu_0(x_0)}{\mu_0(x_0)}\frac{\varphi(N,x_N)}{\varphi(0,x_0)}\,M(y_0,y_1,\dots,y_N). \]
Since the quantity
\[ \frac{\nu_0(x_0)}{\mu_0(x_0)}\frac{\varphi(N,x_N)}{\varphi(0,x_0)} \]
is positive and does not depend on the particular path joining $x_0$ and $x_N$, the conclusion follows. □

The calculation in the above proof actually establishes the following stronger result.

Proposition 3 Let $x_0$ and $x_N$ be any two nodes. Let $M_{x_0}^{x_N}$ and $M^*[\nu_0,\nu_N]_{x_0}^{x_N}$ be the measures $M$ and $M^*[\nu_0,\nu_N]$ conditioned to be in $x_0$ at time $0$ and in $x_N$ at time $N$. Then, under the assumptions of Proposition 2, these conditional measures have the same level sets.

VII. EFFICIENT-ROBUST TRANSPORT

We revisit Problem 5, namely, to identify a probability distribution $P$ on $\mathcal{FP}_0^N$ that minimizes $D(\cdot\|P_T^*)$ over $\mathcal{P}(\nu_0,\nu_N)$ where $P_T^*$ is the Boltzmann distribution (18), the minimizing law being denoted by $M_T^*(\nu_0,\nu_N)$ as before. We show below that the two invariance properties discussed in the previous two sections can be used to determine a best efficient-robust transport policy.
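Theorem 1 and Proposition 2 can be explored numerically on a toy graph. The sketch below solves the Schrödinger system (6) by a plain fixed-point iteration that alternates the recursions (6a)-(6b) with the boundary conditions (6c)-(6d), builds the transition probabilities (7), and then compares by enumeration the maximal mass paths of the prior and of the bridge for every pair of end nodes. It is not the authors' implementation: the 4-node graph, its edge lengths, the marginals `nu0` and `nuN`, the iteration cap and the tolerances are all assumptions chosen for illustration, with prior weights of the form (19) and $\mu_0 \equiv 1$.

```python
import math
from itertools import product

# Hypothetical data (assumptions, not from the paper): edge lengths on a
# strongly connected, aperiodic 4-node graph; prior weights as in (19).
LENGTHS = {(0, 1): 1.0, (0, 2): 2.0, (1, 2): 1.0, (1, 3): 2.0,
           (2, 3): 1.0, (3, 0): 1.0, (3, 3): 1.0}
n, N, T = 4, 4, 1.0
M = [[math.exp(-LENGTHS[i, j] / T) if (i, j) in LENGTHS else 0.0
      for j in range(n)] for i in range(n)]
nu0 = [0.4, 0.3, 0.2, 0.1]  # prescribed initial marginal (positive)
nuN = [0.1, 0.2, 0.3, 0.4]  # prescribed final marginal (positive)


def solve_system(M, nu0, nuN, N, max_iter=5000, tol=1e-13):
    """Fixed-point iteration on the Schroedinger system (6a)-(6d)."""
    n = len(M)
    phi = [[1.0] * n for _ in range(N + 1)]
    phi_hat = [[1.0] * n for _ in range(N + 1)]
    for _ in range(max_iter):
        # (6d): impose the final boundary condition on phi(N, .).
        phi[N] = [nuN[j] / phi_hat[N][j] for j in range(n)]
        # (6a): propagate phi backward in time.
        for t in range(N - 1, -1, -1):
            phi[t] = [sum(M[i][j] * phi[t + 1][j] for j in range(n))
                      for i in range(n)]
        # (6c): impose the initial boundary condition on phi_hat(0, .).
        phi_hat[0] = [nu0[i] / phi[0][i] for i in range(n)]
        # (6b): propagate phi_hat forward in time.
        for t in range(N):
            phi_hat[t + 1] = [sum(M[i][j] * phi_hat[t][i] for i in range(n))
                              for j in range(n)]
        # Stop once (6d) holds to within tol.
        if max(abs(phi[N][j] * phi_hat[N][j] - nuN[j]) for j in range(n)) < tol:
            break
    return phi, phi_hat


phi, phi_hat = solve_system(M, nu0, nuN, N)


def prior_mass(path):
    """M(x) as in (4), with mu_0 identically 1."""
    return math.prod(M[a][b] for a, b in zip(path, path[1:]))


def bridge_prob(path):
    """M*(x) = nu_0(x_0) pi(0) ... pi(N-1), with pi as in (7)."""
    p = nu0[path[0]]
    for t, (a, b) in enumerate(zip(path, path[1:])):
        p *= M[a][b] * phi[t + 1][b] / phi[t][a]
    return p


def maximal_paths(mass, x0, xN, rel=1e-9):
    """All N-step paths from x0 to xN of maximal mass (up to rounding)."""
    paths = [(x0,) + mid + (xN,) for mid in product(range(n), repeat=N - 1)]
    best = max(mass(p) for p in paths)
    return {p for p in paths if mass(p) >= best * (1.0 - rel)}


# Proposition 2: same maximal mass paths for every pair of end nodes.
same_modes = all(
    maximal_paths(prior_mass, a, b) == maximal_paths(bridge_prob, a, b)
    for a in range(n) for b in range(n))

# Final marginal of the bridge, obtained by summing over all paths.
final_marginal = [
    sum(bridge_prob(p) for p in product(range(n), repeat=N + 1) if p[-1] == j)
    for j in range(n)]
```

In this sketch the comparison stored in `same_modes` does not depend on how far the iteration has converged, because the ratio of bridge to prior mass telescopes along any path with given end nodes, exactly as in the proof of Proposition 2; convergence matters only for matching the final marginal `nuN`.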
We also show that $M_T^*(\nu_0,\nu_N)$ inherits from the Boltzmann distribution $P_T^*$ properties as dictated by Proposition 1.

Initially, for simplicity, we consider the situation where at time $t = 0$ the whole mass is concentrated on node $1$ (source) and at time $t = N$ it is concentrated on node $n$ (sink), i.e., $\nu_0(x) = \delta_1(x)$ and $\nu_N(x) = \delta_n(x)$. We want to allow (part of) the mass to reach the end-point “sink” node, if this is possible, in less than $N$ steps and then remain there until $t = N$. In order to ensure that this is possible, we assume that there exists a self-loop at node $n$, i.e., $(M_T)_{nn} > 0$. Clearly, $M_T^*(\delta_1,\delta_n)(\cdot) = P_T^*[\,\cdot \mid Y_0 = 1, Y_N = n]$. The Schrödinger bridge theory provides transition probabilities so that, for a path $y = (y_0,y_1,\dots,y_N)$,
\[ M_T^*(\delta_1,\delta_n)(y) = \delta_1(y_0)\prod_{t=0}^{N-1}\exp\left(-\frac{l_{y_t y_{t+1}}}{T}\right)\frac{\varphi_T(t+1,y_{t+1})}{\varphi_T(t,y_t)} = \delta_1(y_0)\,\frac{\varphi_T(N,y_N)}{\varphi_T(0,y_0)}\exp\left[-\frac{l(y)}{T}\right], \tag{27} \]
cf. (4) and (7). Here $l(y) = \sum_{t=0}^{N-1} l_{y_t y_{t+1}}$ is the length of path $y$, and $\varphi_T$ satisfies together with $\hat\varphi_T$ the Schrödinger system (6) with $m_{ij}(t) = \exp\left(-\frac{l_{ij}}{T}\right)$ and $\nu_0(x) = \delta_1(x)$, $\nu_N(x) = \delta_n(x)$.

In [17, Section VI], Problem 5 was first studied with a prior measure $M_l$ having certain special properties. To introduce this particular measure, we first recall (part of) a fundamental result from linear algebra [24].