Table Of Content

1 On the Matrix Monge-Kantorovich Problem Yongxin Chen, Wilfrid Gangbo, Tryphon T. Georgiou, and Allen Tannenbaum Abstract The classical Monge-Kantorovich (MK) problem as originally posed is concerned with how best to move a 7 1 pileofsoilorrubbletoanexcavationorfillwiththeleastamountofworkrelativetosomecostfunction.Whenthe 0 cost is given by the square of the Euclidean distance, one can define a metric on densities called the Wasserstein 2 distance. In this note, we formulate a natural matrix counterpart of the MK problem for positive definite density n matrices. We prove a number of results about this metric including showing that it can be formulated as a convex a optimization problem, strong duality, an analogue of the Poincare´-Wirtinger inequality, and a Lax-Hopf-Oleinik J type result. 1 1 ] A I. INTRODUCTION F . h The mass transport problem was first formulated by Monge in 1781 and concerned finding the optimal t a way, in the sense of minimal transportation cost, of moving a pile of soil from one site to another. This m problem of optimal mass transport (OMT) was given a modern formulation in the work of Kantorovich, [ and so is now known as the Monge–Kantorovich (MK) problem; see [14], [16] and the many references 1 v therein. As originally formulated, the problem is static. Namely, given two probability densities, one can 6 define a metric, now known as the Wasserstein distance, that quantifies the cost of transport and enjoys a 2 8 number of remarkable properties as described in [14], [16]. Optimal mass transport is a very active area 2 of research with applications to numerous disciplines including probability, econometrics, fluid dynamics, 0 . automatic control, transportation, statistical physics, shape optimization, expert systems, and meteorology. 1 0 7 A major development in optimal mass transport theory was realized in the seminal dynamic approach 1 to optimal mass transport by Benamou and Brenier [4]. These authors base their approach to OMT on : v ideas from fluid mechanics via the minimization of a kinetic energy functional subject to a continuity i X constraint. In a recent paper [3] by Chen et al., a non-commutative counterpart of optimal transport was r a developed where density matrices ρ (i.e., Hermitian matrices that are positive-definite and have unit trace) replace probabilitydistributions,and where “transport” corresponds to a flow on the space ofsuch matrices that minimizes a corresponding action integral, thereby extending the fluid dynamical approach of [4]. (A similar approach to [3] was done independently by Carlen and Maas at about the same time [2].) Here again, based on the continuity equation that imposes a “mass preservation” constraint on a quadratic “kinetic energy,” we study a convex optimization problem that leads to a certain Riemannian structure on densities matrices and generalizes the work of [13] to the current non-commutative setting. Y. Chen is with the Department of Medical Physics, Memorial Sloan Kettering Cancer Center, NY; email: [email protected] W. Gangbo is with the Department of Mathematics, UCLA, Los Angeles, CA; email: [email protected] T. T. Georgiou is with the Department of Mechanical and Aerospace Engineering, University of California, Irvine, CA; email: [email protected] A.Tannenbaum iswiththeDepartmentsofComputerScienceandAppliedMathematics&Statistics,StonyBrookUniversity,NY;email: [email protected] 2 The present paper is a continuation of [3] in which a number of the results are given rigorous mathematical proofs. We show that indeed in line with [4], one has a convex optimization problem and strong duality, an analogue of the Poincare´-Wirtinger inequality, and a Lax-Hopf-Oleinik type result, all in our non-commutative Wasserstein framework. II. CONTINUITY EQUATION & WASSERSTEIN DISTANCE In thissection, weset up thecontinuityequation thatis thebasisfor ourformulationof theWasserstein distancefordensitymatrices.Wefollowcloselytherecentpaper[3].Inthatwork,anapproachisdeveloped based on the Lindblad equation which describes the evolution of open quantum systems. Open quantum systems are thought of as being coupled to a larger system (heat bath), and thus cannot in general be described by a wave function and a unitary evolution. The proper description is in terms of a density operator ρ [8] which in turn obeys the Lindblad equation where we assume ~ = 1: N 1 1 ∗ ∗ ∗ ρ˙ = −i[H,ρ]+ (L ρL − ρL L − L L ρ). (1) k k 2 k k 2 k k Xk=1 Here, * as superscript denotes conjugate transpose and [H,ρ] := Hρ−ρH denotes the commutator. The first term on the right-hand side describes the evolution of the state under the effect of the Hamiltonian H, and it is unitary (energy preserving), while the other the terms on the right-hand side model diffusion and, thereby, capture the dissipation of energy; these dissipative terms together represent the quantum analogue of Laplaces operator ∆ (as it will become clear shortly). The Lindblad equation defines a non-unitary evolution of the density and the calculus we develop next actually underscore parallels with classical diffusion and the Fokker-Planck equation. Denote by H and S the set of n×n Hermitian and skew-Hermitian matrices, respectively. We will assume that all of our matrices are fixed to be n×n. Next, we denote the space of block-column vectors consisting of N elements in S and H as SN, respectively HN. We let H and H denote the cones of + ++ nonnegative and positive-definite matrices, respectively, and D := {ρ ∈ H | tr(ρ) = 1}, (2) + D := {ρ ∈ H | tr(ρ) = 1}. (3) + ++ We note that the tangent space of D , at any ρ ∈ D+ is given by + T = {δ ∈ H | tr(δ) = 0}, (4) ρ and we use the standard notion of inner product, namely ∗ hX;Yi = tr(X Y), for both H and S. For X,Y ∈ HN (SN), N ∗ hX;Yi = tr(X Y ). k k Xk=1 Given X = [X∗,··· ,X∗]∗ ∈ HN (SN), Y ∈ H (S), set 1 N X X Y 1 1 . . XY =  .. Y :=  .. ,  X   X Y  N N     3 and X YX 1 1 . . YX = Y  ..  :=  .. .  X   YX  N N     If we assume that L = L∗, i.e., L ∈ H for all k ∈ 1...,N, then we can define k k k L X −XL 1 1 . ∇L : H → SN, X 7→  ..  (5)  L X −XL  N N   as the gradient operator. Note that ∇ acts just like the standard gradient operator, and in particular, L satisfies ∇ (XY +YX) = (∇ X)Y +X(∇ Y)+(∇ Y)X +Y(∇ X), ∀X,Y ∈ H. (6) L L L L L The dual of ∇ , which is an analogue of the (negative) divergence operator, is defined by L Y 1 N ∇∗L : SN → H, Y =  ...  7→ LkYk −YkLk. (7)  Y  Xk N   The duality ∗ h∇ X;Yi = hX;∇ Yi L L follows by definition. With these definitions, we define the (matricial) Laplacian as N ∗ ∗ ∗ ∗ ∆ X := −∇ ∇ X = (2L XL −XL L −L L X), X ∈ H, L L L k k k k k k Xk=1 whichisexactly(afterscalingby1/2)thediffusiontermintheLindbladequation(1).ThereforeLindblad’s equation (under the assumption that L = L∗) can be rewritten as k k 1 ρ˙ = −i[H,ρ]+ ∆ ρ, L 2 i.e., as a continuity equation expressing flow under the influence of a suitable vector field. In our case, we will consider a continuity equation of the form ∗ ρ˙ = ∇ m L for m ∈ SN a suitable “momentum field.” In particular, we are interested in the following family of continuity equations: ∗ ρ˙ = ∇ M (v), (8) L ρ where the momentum field is expressed as a non-commutative product M (v) ∈ SN between a “velocity ρ field” v ∈ SN and the density matrix ρ. Several such “non-commutative products” have been considered (see [3]), however, in the present work, we consider the following case: 1 M (v) := (vρ+ρv), (9a) ρ 2 4 which gives 1 ∗ ρ˙ = ∇ (vρ+ρv) (9b) 2 L and v = [v∗,...,v∗ ]∗ ∈ SN. Clearly vρ+ρv ∈ SN, which is consistent with the definition of ∇∗. In [3], 1 N L we call this the anti-commutator case, since vρ+ρv =: {v,ρ} is the anti-commutator when applied to elements of an associative algebra. In [3], another possibility is considered for the multiplication operator M (v). ρ Given two density matrices ρ ,ρ ∈ D we formulate the optimization problem (following [3]) 0 1 + 1 W (ρ ,ρ )2 := inf tr(ρv∗v)dt, (10a) 2 0 1 ρ∈D+,v∈SN Z0 1 ∗ ρ˙ = ∇ (vρ+ρv), (10b) 2 L ρ(0) = ρ , ρ(1) = ρ , (10c) 0 1 and define the Wasserstein distance W (ρ ,ρ ) between ρ and ρ to be the square root of the infimum 2 0 1 0 1 of the cost (10a). Other choices for M (v) in (9a) give alternative Wasserstein metrics, as noted in [3]. In ρ order for the metric W to be well-defined for all ρ ,ρ ∈ D , we need to assume that ker(∇ ) is spanned 2 0 1 + L by the identity matrix. The results in the present paper, however, carry through without this assumption. III. QUADRATIC FORMS AND POINCARE´-WIRTINGER INEQUALITY In this section, we prove some initial convexity results as well as a Poincare´-Wirtinger type inequality that we will need in the sequel. We begin with some notation. If m ,··· ,m ∈ Cn×n, we define the 1 N column vector m ∈ CnN×n with matrix entries the m ’s by i ∗ ∗ ∗ m = (m ,··· ,m ) , 1 N the column vector m ∈ CnN×n with entries m∗’s, by ∗ i ∗ m = (m ,··· ,m ) . ∗ 1 N For m,b ∈ CnN×n, i.e., with b = (b∗,··· ,b∗ )∗ for b ,··· ,b ∈ Cn×n, we define the inner products 1 N 1 N N ∗ ∗ hm ;b i = tr(m b ), hm;bi = tr(m b) = hm ;b i i i i i i i Xi=1 and introduce 1 m·b = hm;bi+hb;mi ∈ R. 2 (cid:0) (cid:1) Then, for v ∈ CnN×n and ρ ∈ H , we define the quadratic form + ∗ Q (v) := tr(ρv v) = hvρ;vi. (11) ρ The following lemma is an easy consequence of the definition and can be readily verified. Lemma 1: Let ρ ∈ H and v,w ∈ CnN×n. The following hold: + 5 (i) Q (v) ≥ 0 and Q (v) = 0 if v = 0; when ρ ∈ H , it becomes if and only if, ρ ρ ++ (ii) Q (v +w) = Q (v)+Q (w)+hvρ;wi+hwρ;vi, ρ ρ ρ (iii) Q (1−t)v +tw = (1−t)Q (v)+tQ (w)−t(1−t)Q (v −w), ρ ρ ρ ρ (iv) If w(cid:0)e further assu(cid:1)me that v,w ∈ SN, then hwρ;vi = hρv;wi. Since ||·|| (the standard norm on H) is uniformly convex and ker(∇ ) is a finite dimensional space, L there exists a unique proj(X) ∈ ker(∇ ), the orthogonal projection of X onto ker(∇ ), such that L L min ||X −Z|| = ||X −proj(X)||. Z∈ker(∇L) If we denote by ker(∇ )⊥ the orthogonal complement of ker(∇ ) in H, then L L ⊥ H = ker(∇ )⊕ker(∇ ) . L L Lemma 2: For any ρ ∈ H , the map X → Q (∇ X) is convex on H. If in addition ρ > 0, then the + ρ L map is strictly convex on ker(∇ )⊥. L Proof: From Lemma 1, we obtain that, for t ∈ (0,1) and X,Y ∈ H, Q (1−t)∇ X +t∇ Y = (1−t)Q (∇ X)+tQ (∇ Y)−t(1−t)Q (∇ X −∇ Y). ρ L L ρ L ρ L ρ L L (cid:0) (cid:1) The convexity follows from the fact that Q (·) ≥ 0. Furthermore, if ρ > 0, then Q (∇ X −∇ Y) > 0 ρ ρ L L unless ∇ X −∇ Y = 0. Hence, X → Q (∇ X) strictly convex on ker(∇ )⊥. L L ρ L L Theorem 3 (Poincare´–Wirtinger inequality): Let K ⊂ D be a compact set. Then there exists a + constant cK > 0 such that for all X ∈ H and ρ ∈ K, 2 Qρ ∇L(X −proj(X)) ≥ cK X −proj(X) . (cid:0) (cid:1) (cid:13) (cid:13) (cid:13) (cid:13) Proof: Define cK := inf tr ρ(∇LX)∗∇LX | ρ ∈ K,X ∈ ker(∇L)⊥,||X|| = 1 (12) ρ,Xn o (cid:0) (cid:1) and let (ρ ,X ) be a minimizing sequence in (12). This infimum of a continuous function over a compact k k k setisaminimum,attainedatacertain(ρ,X).SinceX ∈ ker(∇ )⊥,wecannothaveX ∈ ker(∇ )because, L L we would otherwise have X = 0 which will contradict the fact that ||X|| = 1. Since ρ > 0 we conclude that ∗ tr ρ(∇ X) ∇ X > 0 (13) L L (cid:16) (cid:17) In conclusion ∗ tr ρ(∇LY) ∇LY ≥ cK > 0 (cid:0) (cid:1) for any ρ ∈ K and any Y ∈ ker(∇ )⊥ such that ||Y|| = 1. By homogeneity, this completes the proof of L the theorem. Lemma 4: Given ρ ∈ H and v ∈ SN, there is a unique element ∇ X,X ∈ H, such that ++ L Q (v −∇ X) ≤ Q (v −∇ Y) ρ L ρ L for all Y ∈ H. Furthermore, the minimizer is characterized by the Euler–Lagrange equations ∗ (v −∇ X)ρ+ρ(v −∇ X) ∈ ker(∇ ). (14) L L L 6 Proof: Let (X ) ⊂ H be a sequence such that ℓ ℓ lim Q (v −∇ X ) = inf Q (v −∇ Y). ρ L ℓ ρ L ℓ→∞ Y∈H Note that Q (∇ X ) is bounded by definition and the convexity of Q (·). Replacing X by X − ρ L ℓ ℓ ρ ℓ ℓ proj(X ) i(cid:0)f necessary,(cid:1)we use the Poincare´–Wirtinger inequality to conclude that (X ) is a bounded ℓ ℓ ℓ sequence. Passing to a subsequence if necessary, we may assume that (X ) converges to some X which ℓ ℓ minimizes Q (v−∇ X) over H. The uniqueness follows from the strict convexity of Q (·), and condition ρ L ρ (14) expresses stationarity. IV. FLOW RATES IN THE SPACE OF DENSITIES We now return to the continuity equation ρ˙ = f with the flow rate f being the divergence of a momentum field p, i.e., ∗ f = ∇ p, (15) L with p ∈ SN, so that f ∈ H as well as tr(f) = 0. In particular, we are interested in the case where the momentum is a linear function of ρ of the form p = M (v) (see (9a)); then p = 1(m − m ) with ρ 2 ∗ m = vρ ∈ CNn×n, ρ ∈ H , and v ∈ SN, and ++ 1 ∗ f = ∇ (vρ+ρv). 2 L Since the range of ∇∗ coincides with ker(∇ )⊥, any f belongs to ker(∇ )⊥. The next theorem states L L L that in this case not only the converse holds, namely, that given ρ ∈ H , any f ∈ ker(∇ )⊥ can be ++ L written as above with m = vρ, but that v ∈ SN can be selected in the range of ∇ and that this choice L is unique. Theorem 5: For any ρ ∈ H and f ∈ ker(∇ )⊥, there exists a unique X ∈ ker(∇ )⊥ such that ++ L L 1 ∗ f = ∇ ∇ Xρ+ρ∇ X . (16) 2 L L L (cid:0) (cid:1) Furthermore, if K is a compact subset of D+ and ρ/tr(ρ) ∈ K then there exists cK > 0 such that ||f|| ≥ cK tr(ρ)||X||. (17) Proof: Define the functional 1 I(Y) = Q (∇ Y)−hf;Yi, ∀ Y ∈ H. ρ L 2 To avoid trivialities, we assume that f 6= 0. Observe that I(Y) ≡ 0 on ker(∇ ) and for 0 < ǫ << 1 we L have that ǫ2 I(ǫf) = Q (∇ f)−ǫ||f||2 < 0 ρ L 2 Thus, λ , the infimum of I over H is negative. Let (Y ) be a minimizing sequence. Since λ < 0, for ℓ 0 ℓ ℓ 0 large enough, I(Y ) < 0 and so, Y ∈ H\ker(∇ ). Replacing Y by Y −proj(Y ) if necessary, we may ℓ ℓ L ℓ ℓ ℓ assume that Y ∈ ker(∇ )⊥. By the Poincare´–Wirtinger inequality ℓ L 0 > I(Yℓ) ≥ cKtr(ρ)||Yℓ||2 −||f||||Yℓ||. 7 Consequently, (Y ) is a bounded sequence and so, passing to a subsequence if necessary, we may assume ℓ ℓ that (Yℓ)ℓ converges to some X ∈ ker(∇L)⊥. We have 0 ≥ cKtr(ρ)||X||2−||f||||X|| and so, (17) holds. If Y ∈ H is arbitrary, then for any real number ǫ, we use Lemma 1 to conclude that 1 ∗ I(X +ǫY) = I(X)+ǫ ∇ ∇ Xρ+ρ∇ X −f;Y +o(ǫ). 2 L L L (cid:10) (cid:0) (cid:1) (cid:11) By Lemma 2, I is convex on H and so, X is a critical point of I if and only if X minimizes I. Thus (16) holds if and only if X minimizes I. Since, the same lemma gives that I is strictly convex on ker(∇ )⊥, L I admits a unique minimizer on ker(∇ )⊥ which means that there exists a unique X ∈ ker(∇ )⊥ such L L that (16) holds. Remark 6: The uniqueness and existence of the representation may also be proven as follows. We first note that provided ρ ∈ H , the non-commutative multiplication in (9a), defines a positive definite ++ Hermitian operator 1 M : SN → SN : v 7→ (vρ+ρv). ρ 2 It follows that ∇∗M ∇ , when restricted to ker(∇ )⊥ = range(∇∗), is positive and therefore invertible. L ρ L L L Thus, for all f ∈ ker(∇ )⊥, (16) has a unique solution X ∈ ker(∇ )⊥. One also gets a lower bound on L L the norm of f as follows. If λ > 0 denotes the smallest eigenvalue of min ∗ ⊥ ⊥ ∇ M ∇ | : ker(∇ ) → ker(∇ ) , L ρ L ker(∇L)⊥ L L then kfk ≥ λ kXk. min V. FLOWS IN THE SPACE OF DENSITIES We begin with establishing a canonical representation of flow rates that minimize a certain analogue of kinetic energy of our matrix-valued flows. Proposition 1: Suppose ρ ∈ H , f ∈ ker(∇ )⊥, X ∈ H satisfy (16), and that v ∈ SN is such that + L 1 ∗ f = ∇ vρ+ρv . 2 L (cid:0) (cid:1) The following hold: (i) For all Y ∈ H, 1 1 ∗ tr(ρv v) ≥ hf;Yi− Q (∇ Y) ρ L 2 2 and equality holds if and only if v = ∇ X. L (ii) Further assume that ρ > 0 (which by Theorem 5 is a sufficient condition for (16) to hold). Then 1 1 1 min hm;mρ−1i | f = ∇∗(m−m ) = max hf;Yi− Q (∇ Y) . m∈CnN×nn2 2 L ∗ o Y∈H (cid:26) 2 ρ L (cid:27) Besides, the maximum is uniquely attained by the X which satisfies (16) and the minimum is uniquely attained by m = ∇ Xρ for the same X. L 8 Proof: (i) We have 1 1 1 1 1 ∗ 1 1 ∗ 1 1 2 tr(ρv v) = 2hvρ2;vρ2i+hf − 2∇L vρ+ρv ;Yi = 2hvρ2;vρ2i+hf;Yi− 2hvρ+ρv;∇LYi. (cid:0) (cid:1) Since both v as well as ∇ Y belong to SN (cf. Lemma 1 (iv)), L 1 1 1 1 hvρ+ρv;∇LYi = hvρ2;∇LYρ2i+h∇LYρ2;vρ2i. We conclude that 1 1 1 1 tr(ρv∗v) = kvρ12 −∇LYρ21k2 +hf;Yi− k∇LYρ21k2 ≥ hf;Yi− Qρ(∇LY). 2 2 2 2 (ii) Computations similar to the ones in (i) reveal that 1 1 1 hm;mρ−1i = kmρ−12 −∇LYρ12k2 +hf;Yi− k∇LYρ21k2 (18) 2 2 2 and so, for Y ∈ H, 1 1 hm;mρ−1i ≥ hf;Yi− Q (∇ Y). ρ L 2 2 We proceed to consider paths ρ(t) ∈ H for t ∈ [0,1] along with corresponding flow rates and action + integrals. A corollary of the above proposition ascertains the measurability of the canonical representation of the velocity field v. Corollary 7: Let L ⊂ H and denote by A : ker(∇ )⊥×L → H the map which to (f,ρ) associates ++ L X ∈ ker(∇ )⊥ such that (16) holds. L (i) If L is a compact subset of H , then A is continuous. ++ (ii) If ρ : [0,1] → H and f : [0,1] → ker(∇ )⊥ are continuous at t ∈ [0,1] then A(f,ρ) is ++ L 0 continuous at t . 0 (iii) If ρ ∈ L1(0,1;H ) and f ∈ L1(0,1;ker(∇ )⊥) are measurable, then A(f,ρ) is measurable. ++ L Proof: (i) Let K be the set of ρ/tr(ρ) such that ρ ∈ L. Let (f ,ρ ) be a sequence in ker(∇ )⊥×L ℓ ℓ ℓ L converging to (f,ρ). By Theorem 5, (X ) := A(f ,ρ ) is a bounded sequence in ker(∇ )⊥ and so, ℓ ℓ ℓ ℓ l L has all its points of accumulation in ker(∇ )⊥.(cid:0)If X is an(cid:1)y such point of accumulation, then clearly L 1 ∗ f = ∇ (∇ Xρ+ρ∇ X). 2 L L L Since ρ is invertible, X is unique and so, A(f,ρ) = X. This establishes (i). (ii) Condition (ii) is a direct consequence of (i). (iii) Approximate ρ in the L1–norm by a sequence (ρ ) ⊂ C([0,1];H ) which converges pointwise ℓ ℓ ++ almosteverywheretoρ.Similarly,approximatef intheL1–normbyasequence(f ) ⊂ C([0,1];ker(∇ )⊥) ℓ ℓ L which converges pointwise almost everywhere to f. By (i), A(f ,ρ ) converges pointwise almost ℓ ℓ ℓ everywhere to A(f,ρ) and so, A(f,ρ) is measurable. This estab(cid:0)lishes (iii(cid:1)). Lemma 8: If ρ ,ρ ∈ D , then the following hold: 0 1 + 9 (i) If ρ − ρ ∈ ker(∇ )⊥, then there exists a Borel map t → X(t) ∈ ker(∇ )⊥ and a Borel map 1 0 L L t → ρ(t) starting at ρ(0) = ρ and ending at ρ(1) = ρ such that 0 1 1 ∗ ρ˙(t) = ∇ ∇ X(t)ρ(t)+ρ(t)∇ X(t) (19) 2 L L L (cid:0) (cid:1) in the sense of distributions, and 1 Q (∇ X(t))dt < ∞. (20) ρ(t) L Z 0 (ii) Conversely, assume that there exist a Borel map t → v(t) ∈ SN and a Borel map t → ρ(t) starting at ρ and ending at ρ such that 0 1 1 ∗ ρ˙(t) = ∇ v(t)ρ(t)+ρ(t)v(t) (21) 2 L (cid:0) (cid:1) in the sense of distributions, and 1 ∗ tr(ρ(t)v(t) v(t))dt < ∞. (22) Z 0 Then ρ −ρ ∈ ker(∇ )⊥. 1 0 L Proof: (i) Assume ρ −ρ ∈ ker(∇ )⊥, Set ρ(t) = (1−t)ρ +tρ . Then K := {ρ(t) | t ∈ [0,1]} is 1 0 L 0 1 a compact subset of D . For each t ∈ [0,1], we use Theorem 5 to find a unique X(t) ∈ ker(∇ )⊥ such + L that 1 ∗ ρ −ρ = ∇ ∇ X(t)ρ(t)+ρ(t)∇ X(t) 1 0 2 L L L (cid:0) (cid:1) and ||X(t)k ≤ cK. By Corollary 7, t → X(t) ∈ ker(∇ )⊥ is continuous. Hence, (19) and (20) hold. L (ii) Conversely, assume (21) and (22) hold. Let Y ∈ ker(∇ ), then L 1 1 1 1 ∗ hρ −ρ ;Yi = h∇ v(t)ρ(t)+ρ(t)v(t) ;Yidt = hv(t)ρ(t)+ρ(t)v(t);∇ Yidt = 0. 1 0 2 Z L 2 Z L 0 (cid:0) (cid:1) 0 Since Y ∈ ker(∇ ) is arbitrary, we conclude that ρ −ρ ∈ ker(∇ )⊥. L 1 0 L Remark 9: ObservethatinLemma8 (ii), ifwerelax theassumptionsonρ and ρ by merelyimposing 0 1 that ρ ,ρ ∈ H , then (21) and (22) still imply ρ −ρ ∈ ker(∇ )⊥. 0 1 + 1 0 L For ρ ∈ H and m ∈ CnN×n we set ++ 1 F(ρ,m) := hm,mρ−1i. 2 Given ρ ,ρ ∈ D , denote by C(ρ ,ρ ) the set of paths (ρ,v) such that ρ ∈ C1([0,1],D ) start at ρ and 0 1 + 0 1 + 0 end at ρ , v : (0,1) → SN is Borel, Q (v) ∈ L1(0,1) and 1 ρ 1 ∗ ρ˙ = ∇ (vρ+ρv) 2 L in the sense of distributions on (0,1). Similarly, we define C˜(ρ ,ρ ) to be the set of paths (ρ,m) such 0 1 that ρ ∈ C1([0,1],D ) start at ρ and end at ρ , m : (0,1) → CnN×n is Borel, F(ρ,m) ∈ L1(0,1) and + 0 1 1 ∗ ρ˙ = ∇ (m−m ) 2 L ∗ 10 in the sense of distributions on (0,1). Observe that if v ∈ SN and we set m = vρ then 1 ∗ F(ρ,m) = tr(ρv v). 2 and so, the embedding (ρ,v) → (ρ,vρ) of C(ρ ,ρ ) into C˜(ρ ,ρ ), extends 1 tr(ρv∗v) to F(ρ,m). 0 1 0 1 2 Consequently, 1 1 1 inf tr(ρv∗v)dt | (ρ,v) ∈ C(ρ ,ρ ) ≥ inf F(ρ,m)dt | (ρ,m) ∈ C˜(ρ ,ρ ) . 0 1 0 1 (ρ,v)nZ0 2 o (ρ,m)nZ0 o We next show that the inequality can be turned into an equality. Lemma 10: If ρ ,ρ ∈ D then 0 1 + 1 1 1 inf tr(ρv∗v)dt | (ρ,v) ∈ C(ρ ,ρ ) = inf F(ρ,m)dt | (ρ,m) ∈ C˜(ρ ,ρ ) 0 1 0 1 (ρ,v)nZ0 2 o (ρ,m)nZ0 o Proof: It suffices to show that for any (ρ,m) ∈ C˜(ρ ,ρ ), there exists (ρ,v) ∈ C(ρ ,ρ ) such that 0 1 0 1 1 1 1 ∗ tr(ρv v)dt ≤ F(ρ,m)dt. (23) Z 2 Z 0 0 Observe that for almost every t ∈ (0,1) we have 1 ∗ ρ˙(t) = ∇ (m(t)−m (t)). (24) 2 L ∗ Since both t → ρ(t) and t → ρ˙(t) are continuous, by Corollary 7, there exists a continuous map X : [0,1] → H such that 1 ∗ ρ˙(t) = ∇ ∇ X(t)ρ(t)+ρ(t)∇ X(t) . (25) 2 L L L (cid:16) (cid:17) Thus, for almost every t ∈ (0,1), both (24) and (25) hold and so, by Proposition 1 1 ∗ tr ρ(t)v (t)v(t) ≤ F ρ(t),m(t) 2 (cid:0) (cid:1) (cid:0) (cid:1) for almost every t ∈ (0,1) with v(t) = ∇ X(t). Thus, (23) holds, which concludes the proof. L VI. RELAXATION OF VELOCITY-MOMENTUM FIELDS Given ρ ,ρ ∈ D we are interested in characterizing the paths (ρ,v) in H ×SN that minimize the 0 1 + + “action integral,” i.e., paths that possibly attain 1 1 ∗ ∗ inf tr(ρv v)dt ρ˙ = ∇ (vρ+ρv), ρ(0) = ρ , ρ(1) = ρ . (26) ρ∈H+,v∈SNnZ0 (cid:12) 2 L 0 1o (cid:12) (cid:12) When ρ > 0, Lemma 10 replaced tr(ρv∗v) by a new expression F(ρ,vρ), introducing a new problem which, under appropriate conditions, is a relaxation of (26). It then becomes neccesary to extend F to the whole set H × CnN×n and study the convexity properties of the extended functional. We start by introducing the open sets O := H ×CnN×n, O := {ρ ∈ H\H }×CnN×n. 0 ++ ∞ +