
Characterisations of Matrix and Operator-Valued Φ-Entropies, and Operator Efron-Stein Inequalities

Hao-Chung Cheng (1,2) and Min-Hsiu Hsieh (2)

(1) Graduate Institute Communication Engineering, National Taiwan University, Taiwan (R.O.C.)
(2) Centre for Quantum Computation and Intelligent Systems, Faculty of Engineering and Information Technology, University of Technology Sydney, Australia

arXiv:1602.00233v1 [math-ph] 31 Jan 2016

Abstract. We derive new characterisations of the matrix Φ-entropy functionals introduced in [Electron. J. Probab., 19(20): 1–30, 2014]. Notably, all known equivalent characterisations of the classical Φ-entropies have their matrix correspondences. Next, we propose an operator-valued generalisation of the matrix Φ-entropy functionals, and prove their subadditivity under Löwner partial ordering. Our results demonstrate that the subadditivity of operator-valued Φ-entropies is equivalent to the convexity of various related functions. This result can be used to demonstrate an interesting result in quantum information theory: the matrix Φ-entropy of a quantum ensemble is monotone under unital quantum channels. Finally, we derive the operator Efron-Stein inequality to bound the operator-valued variance of a random matrix.

1. Introduction

The introduction of Φ-entropy functionals can be traced back to the early days of information theory [1, 2] and convex analysis [3–6], where the notion of φ-divergence is defined. Formally, given a non-negative real random variable $Z$ and a smooth convex function $\Phi$, the Φ-entropy functional refers to

$$H_\Phi(Z) = \mathbb{E}\Phi(Z) - \Phi(\mathbb{E}Z).$$

By Jensen’s inequality, it is not hard to see that the quantity $H_\Phi(Z)$ is non-negative. Hence, the Φ-entropy functional can be used as an entropic measure to characterise the uncertainty of the random variable $Z$.

The investigation of general properties of classical Φ-entropies has enjoyed great success in physics, probability theory, information theory and computer science.
Of these, the subadditivity (or the tensorisation) property [7–9] has led to the derivations of the logarithmic Sobolev [10], Φ-Sobolev [11] and Poincaré inequalities [12], which in turn are crucial steps toward the powerful entropy method in concentration inequalities [13–15] and the analysis of Markov semigroups [16]. Let $Z = f(X_1, \ldots, X_n)$ be a random variable defined on $n$ independent random variables $(X_1, \ldots, X_n)$. We say $H_\Phi(Z)$ is subadditive if

$$H_\Phi(Z) \le \sum_{i=1}^{n} \mathbb{E}\big[\mathbb{E}_i \Phi(Z) - \Phi(\mathbb{E}_i Z)\big],$$

where $\mathbb{E}_i$ denotes the conditional expectation with respect to $X_i$. L. Gross first observed that the ordinary entropy functional $H_{u \log u}(Z)$ is subadditive in his seminal paper [10]. Later on, equivalent characterisations of the subadditive entropy class (see Theorem 2.1) were established [11, 17, 18], which prove to be useful in other contexts such as stochastic processes [17, 18].

E-mail address: [email protected], [email protected].

Parallel to the classical Φ-entropies, Chen and Tropp [19] introduced the notion of matrix Φ-entropy functionals. Namely, for a positive semi-definite random matrix $Z$, the matrix Φ-entropy functional is defined as

$$H_\Phi(Z) \triangleq \operatorname{tr}\big[\mathbb{E}\Phi(Z) - \Phi(\mathbb{E}Z)\big],$$

where $\operatorname{tr}$ is the normalised trace. The class of subadditive matrix Φ-entropy functionals is characterised in terms of the second derivative of their representing functions. Unlike its classical counterpart, only a few connections between the matrix Φ-entropy functionals and other convex forms of the same functions had been established [20, 21] prior to this work.

In this paper, we establish equivalent characterisations of the matrix Φ-entropy functionals defined in [19]. Our results show that matrix Φ-entropy functionals satisfy all known equivalent statements that classical Φ-entropy functionals satisfy [15, 17, 18]. Our results provide additional justification for the original definition of the matrix Φ-entropy functionals (see Table 1).
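The classical subadditivity inequality stated above can be checked by exact enumeration on a small example. The following Python sketch is only an illustration under stated assumptions: it takes $\Phi(x) = x \log x$, two independent variables on the hypothetical support {0, 1, 2} with made-up marginals `p1` and `p2`, and an arbitrary non-negative function `f`; none of these choices come from the paper.

```python
import itertools
import math

def phi(x):
    # Phi(x) = x log x, with the convention 0 log 0 = 0
    return x * math.log(x) if x > 0 else 0.0

def phi_entropy(values, probs):
    # H_Phi(Z) = E Phi(Z) - Phi(E Z) for a finitely supported Z
    mean = sum(p * z for z, p in zip(values, probs))
    return sum(p * phi(z) for z, p in zip(values, probs)) - phi(mean)

# Hypothetical setup: X1 and X2 are independent on {0, 1, 2}.
support = [0, 1, 2]
p1 = [0.2, 0.5, 0.3]
p2 = [0.6, 0.1, 0.3]

def f(x1, x2):
    # An arbitrary non-negative function, chosen only for illustration
    return 1.0 + x1 * x2 + (x1 - x2) ** 2

# Left-hand side: H_Phi(Z) under the product distribution.
pairs = list(itertools.product(range(3), range(3)))
z_vals = [f(support[a], support[b]) for a, b in pairs]
z_probs = [p1[a] * p2[b] for a, b in pairs]
lhs = phi_entropy(z_vals, z_probs)

# Right-hand side: sum over i of E[E_i Phi(Z) - Phi(E_i Z)],
# where E_i integrates out X_i with the other variable held fixed.
term1 = sum(
    p2[b] * phi_entropy([f(support[a], support[b]) for a in range(3)], p1)
    for b in range(3)
)
term2 = sum(
    p1[a] * phi_entropy([f(support[a], support[b]) for b in range(3)], p2)
    for a in range(3)
)
rhs = term1 + term2

print(lhs, rhs, lhs <= rhs)
```

Both sides are computed exactly (up to floating-point rounding), so the comparison probes the inequality itself rather than a Monte Carlo estimate.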
The equivalences between matrix Φ-entropy functionals and other convex forms of the function Φ advance our understanding of the class of entropy functions. Moreover, they allow us to unify the study of matrix concentration inequalities and matrix Φ-Sobolev inequalities [22, 23].

Furthermore, we consider the following operator-valued generalisation of matrix Φ-entropy functionals:

$$H_\Phi(Z) \triangleq \mathbb{E}\Phi(Z) - \Phi(\mathbb{E}Z).$$

A special case of this operator-valued Φ-entropy functional is the operator-valued variance $\operatorname{Var}(Z)$ defined in [24] and [25], where Φ is the square function. The equivalent conditions for the subadditivity under Löwner partial ordering are derived (Theorem 4.2). In particular, we show that subadditivity of the operator-valued Φ-entropies is equivalent to convexity:

$$\text{Subadditivity of } H_\Phi(Z) \;\Leftrightarrow\; H_\Phi(Z) \text{ is convex in } Z.$$

Our result directly yields the operator Efron-Stein inequality, which recovers the well-known Efron-Stein inequality [26, 27] when random matrices reduce to real random variables.

| | Classical Φ-Entropy Functional Class (C1) | Matrix Φ-Entropy Functional Class (C2) |
|---|---|---|
| (a) | Φ is affine, or $\Phi'' > 0$ and $1/\Phi''$ is concave | Φ is affine, or $D\Phi'$ is invertible and $(D\Phi')^{-1}$ is concave |
| (b) | convexity of $(u,v) \mapsto \Phi(u+v) - \Phi(u) - \Phi'(u)v$ | convexity of $(u,v) \mapsto \operatorname{Tr}[\Phi(u+v) - \Phi(u) - D\Phi[u](v)]$ |
| (c) | convexity of $(u,v) \mapsto (\Phi'(u+v) - \Phi'(u))v$ | convexity of $(u,v) \mapsto \operatorname{Tr}[D\Phi[u+v](v) - D\Phi[u](v)]$ |
| (d) | convexity of $(u,v) \mapsto \Phi''(u)v^2$ | convexity of $(u,v) \mapsto \operatorname{Tr}[D^2\Phi[u](v,v)]$ |
| (e) | Φ is affine, or $\Phi'' > 0$ and $\Phi''''\Phi'' \ge 2\Phi'''^2$ | Equation (3.2) |
| (f) | convexity of $(u,v) \mapsto t\Phi(u) + (1-t)\Phi(v) - \Phi(tu + (1-t)v)$ for any $0 \le t \le 1$ | convexity of $(u,v) \mapsto \operatorname{Tr}[t\Phi(u) + (1-t)\Phi(v) - \Phi(tu + (1-t)v)]$ for any $0 \le t \le 1$ |
| (g) | $\mathbb{E}_1 H_\Phi(Z \mid X_1) \ge H_\Phi(\mathbb{E}_1 Z)$ | $\mathbb{E}_1 H_\Phi(Z \mid X_1) \ge H_\Phi(\mathbb{E}_1 Z)$ |
| (h) | $H_\Phi(Z)$ is a convex function of $Z$ | $H_\Phi(Z)$ is a convex function of $Z$ |
| (i) | $H_\Phi(Z) = \sup_{T > 0} \{\mathbb{E}[\Upsilon_1(T) \cdot Z + \Upsilon_2(T)]\}$ | $H_\Phi(Z) = \sup_{T \succ 0} \{\operatorname{tr}\mathbb{E}[\Upsilon_1(T) \cdot Z + \Upsilon_2(T)]\}$ |
| (j) | $H_\Phi(Z) \le \sum_{i=1}^{n} \mathbb{E} H_\Phi^{(i)}(Z)$ | $H_\Phi(Z) \le \sum_{i=1}^{n} \mathbb{E} H_\Phi^{(i)}(Z)$ |

Table 1. Comparison between the equivalent characterisations of the Φ-entropy functional class (C1) (Definition 2.1) and the matrix Φ-entropy functional class (C2) (Definition 3.1).

1.1. Our Results. We summarise our results here. First, we derive equivalent characterisations for the matrix Φ-entropy functionals in Table 1 (see Theorem 3.2). Notably, all known equivalent characterisations for the classical Φ-entropies can be generalised to their matrix correspondences.

We emphasise that additional characterisations of the Φ-entropies prove to be useful in many instances. The characterisations (b)-(d) in (C1) are explored by Chafaï [18] to derive several entropic inequalities for M/M/∞ queueing processes that are not diffusions. With the characterisations (b)-(d), the difficulty of lacking the diffusion property can be circumvented and replaced by convexity. Moreover, as shown in Corollary 4.1, item (f) in Table 2 can be used to demonstrate an interesting result in quantum information theory: the matrix Φ-entropy functional of a quantum ensemble (i.e. a set of quantum states with some prior distribution) is monotone under any unital quantum channel. This property motivates us to study the dynamical evolution of a quantum ensemble and its mixing time, a fundamentally important problem in quantum computation (see our follow-up work [23] for further details).

| | Operator-Valued Φ-Entropy Class (C3) |
|---|---|
| (a) | The second-order Fréchet derivative $D^2\Phi[u](v,v)$ is jointly convex in $(u,v)$ |
| (b) | $\Phi(u+v) - \Phi(u) - D\Phi[u](v)$ is jointly convex in $(u,v)$ |
| (c) | $D\Phi[u+v](v) - D\Phi[u](v)$ is jointly convex in $(u,v)$ |
| (d) | $D^2\Phi[u](v,v)$ is jointly convex in $(u,v)$ |
| (e) | $t\Phi(u) + (1-t)\Phi(v) - \Phi(tu + (1-t)v)$ is jointly convex in $(u,v)$ for any $0 \le t \le 1$ |
| (f) | $\mathbb{E}_1 H_\Phi(Z \mid X_1) \succeq H_\Phi(\mathbb{E}_1 Z)$ |
| (g) | $H_\Phi(Z)$ is a convex function of $Z$ |
| (h) | $H_\Phi(Z) = \sup_{T \succ 0} \{\mathbb{E}[\Upsilon_1(T) \cdot Z + \Upsilon_2(T)]\}$ |
| (i) | $H_\Phi(Z) \preceq \sum_{i=1}^{n} \mathbb{E} H_\Phi^{(i)}(Z)$ |

Table 2. Equivalent statements of the operator-valued Φ-entropy class (C3) (Definition 4.1).
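The monotonicity of the matrix Φ-entropy of a quantum ensemble under a unital channel can be probed numerically. The sketch below is an illustration, not the paper's construction: it assumes $\Phi(x) = x \log x$, a randomly generated ensemble of three 3×3 density matrices with made-up prior `probs`, and a random mixture-of-unitaries channel (which is unital and trace-preserving). All helper names (`matfun`, `matrix_phi_entropy`, `channel`, and so on) are hypothetical.

```python
import numpy as np

def matfun(A, f):
    # Standard matrix function of a Hermitian matrix via its eigendecomposition.
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

def xlogx(w):
    # Phi(x) = x log x applied entrywise, with 0 log 0 = 0
    w = np.clip(w, 0.0, None)
    out = np.zeros_like(w)
    pos = w > 1e-15
    out[pos] = w[pos] * np.log(w[pos])
    return out

def matrix_phi_entropy(states, probs, f):
    # H_Phi(Z) = tr[E Phi(Z) - Phi(E Z)], with tr the normalised trace
    d = states[0].shape[0]
    mean = sum(p * Z for p, Z in zip(probs, states))
    e_phi = sum(p * matfun(Z, f) for p, Z in zip(probs, states))
    return np.trace(e_phi - matfun(mean, f)).real / d

def random_density(d, rng):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def random_unitary(d, rng):
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
    return Q

rng = np.random.default_rng(0)
d = 3
states = [random_density(d, rng) for _ in range(3)]
probs = [0.5, 0.3, 0.2]

# Assumed channel: N(A) = sum_k q_k U_k A U_k^dagger, which maps I to I.
unitaries = [random_unitary(d, rng) for _ in range(3)]
q = [0.5, 0.3, 0.2]

def channel(A):
    return sum(qk * U @ A @ U.conj().T for qk, U in zip(q, unitaries))

h_before = matrix_phi_entropy(states, probs, xlogx)
h_after = matrix_phi_entropy([channel(r) for r in states], probs, xlogx)
print(h_before, h_after, h_after <= h_before)
```

With this choice of Φ the quantity is the Holevo quantity of the ensemble divided by the dimension, so the check reproduces a known special case rather than the general statement.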
Second, we define and derive equivalent characterisations for operator-valued Φ-entropies in Table 2 (see Theorem 4.2). Note that the only known statement in Table 1 that is missing in Table 2 is condition (e). In other words, we are not able to generalise (e) in Table 1 to the non-commutative case. Finally, we employ the subadditivity of operator-valued Φ-entropies to show the operator Efron-Stein inequality in Theorem 5.1.

1.2. Prior Work. For the history of the equivalent characterisations in the class (C1), we refer to an excellent textbook [15] and the papers [17, 18].

The original definition of the matrix Φ-entropy class, namely (a) in (C2), was proposed by Chen and Tropp in 2014 [19]. In the same paper, they also established the subadditivity property (j) through (i) and (g): (a) ⇒ (i) ⇒ (g) ⇒ (j) in Table 1. Shortly after, the equivalence between (a) and the joint convexity of the matrix Brégman divergence (b) was proved in [21]. The equivalence between (a) and (d) is almost immediately implied by the result in [20] (see the detailed discussion in the proof of Theorem 3.2). The convexity of $H_\Phi(Z)$, item (h), is noted in [20]. Here, we provide transparent evidence for it: the joint convexity in (f).

We organise the paper in the following way. We collect the necessary background on matrix algebra in Section 2. The equivalent characterisations of matrix Φ-entropy functionals are provided in Section 3. We define the operator-valued Φ-entropies and derive their equivalent statements in Section 4. Section 5 shows an application of the subadditivity: the operator Efron-Stein inequality. The proofs of the main results are collected in Sections 6 and 7, respectively. Finally, we conclude the paper.

2. Preliminaries

We first introduce basic notation. The set $\mathbb{M}^{sa}$ refers to the subspace of self-adjoint operators on some separable Hilbert space. We denote by $\mathbb{M}^{+}$ (resp. $\mathbb{M}^{++}$) the set of positive semi-definite (resp. positive-definite) operators in $\mathbb{M}^{sa}$.
If the dimension $d$ of a Hilbert space needs special attention, then we highlight it in subscripts, e.g. $\mathbb{M}_d$ denotes the Banach space of $d \times d$ complex matrices. The trace function $\operatorname{Tr} : \mathbb{C}^{d \times d} \to \mathbb{C}$ is defined as the summation of eigenvalues. The normalised trace function $\operatorname{tr}$ for every $d \times d$ matrix $M$ is defined by $\operatorname{tr}[M] \triangleq \frac{1}{d}\operatorname{Tr}[M]$. For $p \in [1, \infty)$, the Schatten $p$-norm of an operator $M$ is denoted as $\|M\|_p \triangleq \big(\sum_i |\lambda_i(M)|^p\big)^{1/p}$, where $\{\lambda_i(M)\}$ are the singular values of $M$. The Hilbert-Schmidt inner product is defined as $\langle A, B \rangle \triangleq \operatorname{Tr} A^\dagger B$. For $A, B \in \mathbb{M}^{sa}$, $A \succeq B$ means that $A - B$ is positive semi-definite. Similarly, $A \succ B$ means $A - B$ is positive-definite. Throughout this paper, italic capital letters (e.g. $X$) are used to denote operators.

Denote a probability space by $(\Omega, \Sigma, \mathbb{P})$. A random matrix $Z$ defined on the probability space $(\Omega, \Sigma, \mathbb{P})$ is a matrix-valued random variable defined on $\Omega$. We denote the expectation of $Z$ with respect to $\mathbb{P}$ by

$$\mathbb{E}_{\mathbb{P}}[Z] \triangleq \int_{\Omega} Z \, d\mathbb{P} = \int_{x \in \Omega} Z(x)\, \mathbb{P}(dx),$$

where the integral is the Bochner integral [28, 29]. We note that the results derived in this paper are universal for all probability spaces. Hence we will omit the subscript $\mathbb{P}$ of the expectation. If we consider a sample space $\Omega_1 \times \Omega_2$ with joint distribution $\mathbb{P}$, then we denote the conditional expectation of $Z$ with respect to the first space $\Omega_1$ by $\mathbb{E}_1[Z] \triangleq \int_{x_1 \in \Omega_1} Z(x_1, X_2) \cdot \mathbb{P}_1(dx_1)$, where $\mathbb{P}_1(x_1) = \int_{x_2 \in \Omega_2} \mathbb{P}(x_1, dx_2)$ is the marginal distribution on $\Omega_1$.

Let $\mathcal{U}, \mathcal{W}$ be real Banach spaces. The Fréchet derivative of a function $\mathcal{L} : \mathcal{U} \to \mathcal{W}$ at a point $X \in \mathcal{U}$, if it exists (see footnote 1), is a unique linear mapping $D\mathcal{L}[X] : \mathcal{U} \to \mathcal{W}$ such that

$$\|\mathcal{L}(X + E) - \mathcal{L}(X) - D\mathcal{L}[X](E)\|_{\mathcal{W}} = o(\|E\|_{\mathcal{U}}),$$

where $\|\cdot\|_{\mathcal{U}(\mathcal{W})}$ is a norm in $\mathcal{U}$ (resp. $\mathcal{W}$). The notation $D\mathcal{L}[X](E)$ is then interpreted as “the Fréchet derivative of $\mathcal{L}$ at $X$ in the direction $E$”. The partial Fréchet derivative of multivariate functions can be defined as follows. Let $\mathcal{U}$, $\mathcal{V}$ and $\mathcal{W}$ be real Banach spaces, and $\mathcal{L} : \mathcal{U} \times \mathcal{V} \to \mathcal{W}$. For a fixed $v_0 \in \mathcal{V}$, $\mathcal{L}(u, v_0)$ is a function of $u$ whose derivative at $u_0$, if it exists, is called the partial Fréchet derivative of $\mathcal{L}$ with respect to $u$, and is denoted by $D_u \mathcal{L}[u_0, v_0]$. The partial Fréchet derivative $D_v \mathcal{L}[u_0, v_0]$ is defined similarly. Likewise, the $m$-th Fréchet derivative $D^m \mathcal{L}[X]$ is a unique multi-linear map from $\mathcal{U}^m \triangleq \mathcal{U} \times \cdots \times \mathcal{U}$ ($m$ times) to $\mathcal{W}$ that satisfies

$$\big\| D^{m-1}\mathcal{L}[X + E_m](E_1, \ldots, E_{m-1}) - D^{m-1}\mathcal{L}[X](E_1, \ldots, E_{m-1}) - D^m \mathcal{L}[X](E_1, \ldots, E_m) \big\|_{\mathcal{W}} = o(\|E_m\|_{\mathcal{U}})$$

for each $E_i \in \mathcal{U}$, $i = 1, \ldots, m$. The Fréchet derivative enjoys several properties of standard derivatives. We provide those facts in Appendix A.

Footnote 1: We assume the functions considered in the paper are Fréchet differentiable. The readers can refer to, e.g., [30, 31] for conditions under which a function is Fréchet differentiable.

A function $f : I \to \mathbb{R}$ is called operator convex if for each $A, B \in \mathbb{M}^{sa}(I)$ and $0 \le t \le 1$,

$$f(tA + (1-t)B) \preceq t f(A) + (1-t) f(B).$$

Similarly, a function $f : I \to \mathbb{R}$ is called operator monotone if for each $A, B \in \mathbb{M}^{sa}(I)$,

$$A \preceq B \;\Rightarrow\; f(A) \preceq f(B).$$

2.1. Classical Φ-Entropy Functionals. Let (C1) denote the class of functions $\Phi : [0, \infty) \to \mathbb{R}$ that are continuous, convex on $[0, \infty)$, twice differentiable on $(0, \infty)$, and either Φ is affine or $\Phi''$ is strictly positive and $1/\Phi''$ is concave.

Definition 2.1 (Classical Φ-Entropies). Let $\Phi : [0, \infty) \to \mathbb{R}$ be a convex function. For every non-negative integrable random variable $Z$ such that $\mathbb{E}|Z| < \infty$ and $\mathbb{E}|\Phi(Z)| < \infty$, the classical Φ-entropy $H_\Phi(Z)$ is defined as

$$H_\Phi(Z) = \mathbb{E}\Phi(Z) - \Phi(\mathbb{E}Z).$$

In particular, we are interested in $Z = f(X_1, \ldots, X_n)$, where $X_1, \ldots, X_n$ are independent random variables, and $f \ge 0$ is a measurable function.

We say $H_\Phi(Z)$ is subadditive [9] if

$$H_\Phi(Z) \le \sum_{i=1}^{n} \mathbb{E}\big[H_\Phi^{(i)}(Z)\big],$$

where $H_\Phi^{(i)}(Z) = \mathbb{E}_i \Phi(Z) - \Phi(\mathbb{E}_i Z)$ is the conditional Φ-entropy, and $\mathbb{E}_i$ denotes conditional expectation conditioned on the $n - 1$ random variables $X_{-i} \triangleq (X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n)$.
Sometimes we also denote $H_\Phi^{(i)}(Z)$ by $H_\Phi(Z \mid X_{-i})$.

It is a well-known result that, for any function $\Phi \in$ (C1), $H_\Phi(Z)$ is subadditive [11, Corollary 3] (see also [13, Section 3]). The following theorem establishes equivalent characterisations of classical Φ-entropies.

Theorem 2.1 ([18, Theorem 4.4]). The following statements are equivalent.
(a) $\Phi \in$ (C1): Φ is affine, or $\Phi'' > 0$ and $1/\Phi''$ is concave;
(b) Brègman divergence $(u,v) \mapsto \Phi(u+v) - \Phi(u) - \Phi'(u)v$ is convex;
(c) $(u,v) \mapsto (\Phi'(u+v) - \Phi'(u))v$ is convex;
(d) $(u,v) \mapsto \Phi''(u)v^2$ is convex;
(e) Φ is affine, or $\Phi'' > 0$ and $\Phi''''\Phi'' \ge 2\Phi'''^2$;
(f) $(u,v) \mapsto t\Phi(u) + (1-t)\Phi(v) - \Phi(tu + (1-t)v)$ is convex for any $0 \le t \le 1$;
(g) $\mathbb{E}_1 H_\Phi(Z \mid X_1) \ge H_\Phi(\mathbb{E}_1 Z)$;
(h) $\{H_\Phi(Z)\}_{\Phi \in \text{(C1)}}$ forms a convex set;
(i) $H_\Phi(Z) = \sup_{T > 0} \big(\mathbb{E}[(\Phi'(T) - \Phi'(\mathbb{E}T))(Z - T)] + H_\Phi(T)\big)$;
(j) $H_\Phi(Z) \le \sum_{i=1}^{n} \mathbb{E} H_\Phi^{(i)}(Z)$.

3. Equivalent Characterizations of Matrix Φ-Entropy Functionals

In this section, we first introduce matrix Φ-entropy functionals, and present the main result (Theorem 3.2) of this section; namely, new characterisations of the matrix Φ-entropy functionals.

Chen and Tropp introduced the class of matrix Φ-entropies, and proved its subadditivity in 2014 [19]. In this section, we will show that all equivalent characterisations of classical Φ-entropies in Theorem 2.1 have a one-to-one correspondence for the class of matrix Φ-entropies.

Let $d$ be a natural number. The class $\Phi_d$ contains each function $\Phi : (0, \infty) \to \mathbb{R}$ that is either affine or satisfies the following three conditions:
(1) Φ is convex and continuous at zero.
(2) Φ is twice continuously differentiable.
(3) Define $\Psi(t) = \Phi'(t)$ for $t > 0$. The Fréchet derivative $D\Psi$ of the standard matrix function $\Psi : \mathbb{M}_d^{++} \to \mathbb{M}_d^{sa}$ is an invertible linear map on $\mathbb{M}_d^{++}$, and the map $A \mapsto (D\Psi[A])^{-1}$ is concave with respect to the Löwner partial ordering on positive definite matrices.

Define (C2) $\triangleq \Phi_\infty \equiv \bigcap_{d=1}^{\infty} \Phi_d$.

Definition 3.1 (Matrix Φ-Entropy Functional [19]). Let $\Phi : [0, \infty) \to \mathbb{R}$ be a convex function. Consider a random matrix $Z \in \mathbb{M}_d^{+}$ with $\mathbb{E}\|Z\|_\infty < \infty$ and $\mathbb{E}\|\Phi(Z)\|_\infty < \infty$. The matrix Φ-entropy $H_\Phi(Z)$ is defined as

$$H_\Phi(Z) \triangleq \operatorname{tr}\big[\mathbb{E}\Phi(Z) - \Phi(\mathbb{E}Z)\big].$$

The corresponding conditional matrix Φ-entropy can be defined under the σ-algebra.

Theorem 3.1 (Subadditivity of Matrix Φ-Entropy Functional [19, Theorem 2.5]). Let $\Phi \in$ (C2), and assume $Z$ is a measurable function of $(X_1, \ldots, X_n)$. Then

$$H_\Phi(Z) \le \sum_{i=1}^{n} \mathbb{E}\big[H_\Phi^{(i)}(Z)\big], \tag{3.1}$$

where $H_\Phi^{(i)}(Z) = \operatorname{tr}[\mathbb{E}_i \Phi(Z) - \Phi(\mathbb{E}_i Z)]$ is the conditional entropy, and $\mathbb{E}_i$ denotes conditional expectation conditioned on the $n - 1$ random matrices $X_{-i} \triangleq (X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n)$.

The following theorem is the main result of this section. We show that all the equivalent conditions in Theorem 2.1 also hold for the class of matrix Φ-entropy functionals. Hence, we have a more comprehensive understanding of the class of matrix Φ-entropy functionals.

Theorem 3.2. The following statements are equivalent.
(a) $\Phi \in$ (C2): Φ is affine, or $D\Psi$ is invertible and $A \mapsto (D\Psi[A])^{-1}$ is operator concave;
(b) Matrix Brègman divergence: $(A,B) \mapsto \operatorname{Tr}[\Phi(A+B) - \Phi(A) - D\Phi[A](B)]$ is convex;
(c) $(A,B) \mapsto \operatorname{Tr}[D\Phi[A+B](B) - D\Phi[A](B)]$ is convex;
(d) $(A,B) \mapsto \operatorname{Tr}[D^2\Phi[A](B,B)]$ is convex;
(e) Φ is affine, or $\Phi'' > 0$ and
$$\operatorname{Tr}\Big[h \cdot (D\Psi[A])^{-1} \circ D^3\Psi[A]\big(k, k, (D\Psi[A])^{-1}(h)\big)\Big] \ge 2 \operatorname{Tr}\Big[h \cdot (D\Psi[A])^{-1} \circ D^2\Psi[A]\Big(k, (D\Psi[A])^{-1} \circ D^2\Psi[A]\big(k, (D\Psi[A])^{-1}(h)\big)\Big)\Big] \tag{3.2}$$
for each $A \succeq 0$ and $h, k \in \mathbb{M}_d^{sa}$;
(f) $(A,B) \mapsto \operatorname{Tr}[t\Phi(A) + (1-t)\Phi(B) - \Phi(tA + (1-t)B)]$ is convex for any $0 \le t \le 1$;
(g) $\mathbb{E}_1 H_\Phi(Z \mid X_1) \ge H_\Phi(\mathbb{E}_1 Z)$;
(h) $\{H_\Phi(Z)\}_{\Phi \in \text{(C2)}}$ forms a convex set of convex functions;
(i) $H_\Phi(Z) = \sup_{T \succ 0} \big\{\operatorname{tr}\mathbb{E}[(\Phi'(T) - \Phi'(\mathbb{E}T))(Z - T)] + H_\Phi(T)\big\}$;
(j) $H_\Phi(Z) \le \sum_{i=1}^{n} \mathbb{E} H_\Phi^{(i)}(Z)$.

We note that the statements (a) ⇒ (i) ⇒ (g) ⇒ (j) were proved by Chen and Tropp in [19]. The equivalence (a) ⇔ (b) was shown in [21, Theorem 2].
Hansen and Zhang [20] established an equivalence of item (a) and the convexity of the following map:

$$(A, X) \mapsto \langle X, D\Phi'[A](X) \rangle. \tag{3.3}$$

From Lemma A.1, it is not hard to observe that Eq. (3.3) is equivalent to item (d), i.e.

$$\operatorname{Tr}\big(D^2\Phi[A](X,X)\big) = \langle X, D\Phi'[A](X) \rangle.$$

We provide the detailed proof of the remaining equivalence statements in Section 6.

4. Operator-Valued Φ-Entropies

In this section, we extend the notion of matrix Φ-entropy functionals (i.e. real-valued) to operator-valued Φ-entropies.

Definition 4.1 (Operator-Valued Entropy Class). Let $d$ be a natural number. The class $\Phi_d$ contains each function $\Phi : [0, \infty) \to \mathbb{R}$ such that its second-order Fréchet derivative exists and the following map satisfies the joint convexity (under the Löwner partial ordering) condition:

$$(A, X) \mapsto D^2\Phi[A](X,X), \quad \forall A \in \mathbb{M}_d^{sa},\ X \in \mathbb{M}_d. \tag{4.1}$$

We denote the class of operator-valued Φ-entropies $\Phi_\infty \triangleq \bigcap_{d=1}^{\infty} \Phi_d$ by (C3).

Definition 4.2 (Operator-Valued Φ-Entropies). Let $\Phi : [0, \infty) \to \mathbb{R}$ be a convex function. Consider a random matrix $Z$ taking values in $\mathbb{M}^{+}$, with $\mathbb{E}\|Z\|_\infty < \infty$ and $\mathbb{E}\|\Phi(Z)\|_\infty < \infty$. That is, the random matrices $Z$ and $\Phi(Z)$ are Bochner integrable [28, 29] (hence $\mathbb{E}Z$ and $\mathbb{E}\Phi(Z)$ exist and are well-defined). The operator-valued Φ-entropy $H_\Phi$ is defined as

$$H_\Phi(Z) \triangleq \mathbb{E}\Phi(Z) - \Phi(\mathbb{E}Z). \tag{4.2}$$

The corresponding conditional terms can be defined under the σ-algebra.

It is worth mentioning that the matrix Φ-entropy functional [19] in Section 3 is non-negative for every convex function Φ due to the fact that the trace function $\operatorname{tr}\Phi$ is also convex [32] (or see e.g. [33, Sec. 2.2]). However, according to the operator Jensen’s inequality [34, Theorem 3.2], only an operator convex function Φ ensures that the operator-valued Φ-entropy is non-negative.

In the following, we show that the entropy class (C3) is not an empty set.

Proposition 4.1. The square function $\Phi(u) = u^2$ belongs to (C3).

Proof. It suffices to verify the joint convexity of the map

$$D^2\Phi[A](X,X) = 2X^2 \quad \text{for all } A, X \in \mathbb{M}^{+},$$

where we use the identity of the second-order Fréchet derivative (see e.g. [35, Example X.4.6]) of the square function. Since the square function is operator convex, $\Phi(u) = u^2$ belongs to the operator-valued Φ-entropy class (C3). □

4.1. Subadditivity of operator-valued Φ-entropies. Denote by $X \triangleq (X_1, \ldots, X_n)$ a series of independent random variables taking values in a Polish space, and let

$$X_{-i} \triangleq (X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_n).$$

Let $Z$ be a positive semi-definite random matrix that depends on the series of random variables $X$:

$$Z \triangleq Z(X_1, \ldots, X_n) \in \mathbb{M}^{+}.$$

Throughout this paper, we assume the random matrix $Z$ satisfies the integrability conditions: $|Z| \triangleq \sqrt{Z^2}$ and $|\Phi(Z)|$ are Bochner integrable for $\Phi \in$ (C3).

Theorem 4.1 (Subadditivity of Operator-Valued Φ-Entropy). Fix a function $\Phi \in$ (C3). Under the prevailing assumptions,

$$H_\Phi(Z) \preceq \sum_{i=1}^{n} \mathbb{E}\big[H_\Phi^{(i)}(Z)\big], \tag{4.3}$$

where $H_\Phi^{(i)}(Z) = H_\Phi(Z \mid X_{-i}) \triangleq \mathbb{E}_i \Phi(Z) - \Phi(\mathbb{E}_i Z)$.

The proof is given in Section 7.

4.2. Equivalent characterisations of operator-valued Φ-entropies. In this section, we derive alternative characterisations of the class (C3) in Theorem 4.2. As an application of the entropy class, we show that if the function Φ belongs to (C3), then the operator-valued Φ-entropy is monotone under any unital completely positive map.

Theorem 4.2. The following statements are equivalent.
(a) $\Phi \in$ (C3): convexity of $(A,B) \mapsto D^2\Phi[A](B,B)$;
(b) Operator-valued Brègman divergence: $(A,B) \mapsto \Phi(A+B) - \Phi(A) - D\Phi[A](B)$ is convex;
(c) $(A,B) \mapsto D\Phi[A+B](B) - D\Phi[A](B)$ is convex;
(d) $(A,B) \mapsto D^2\Phi[A](B,B)$ is convex;
(e) Convexity of $(A,B) \mapsto t\Phi(A) + (1-t)\Phi(B) - \Phi(tA + (1-t)B)$ for any $0 \le t \le 1$;
(f) $\mathbb{E}_1 H_\Phi(Z \mid X_1) \succeq H_\Phi(\mathbb{E}_1 Z)$;
(g) $\{H_\Phi(Z)\}_{\Phi \in \text{(C3)}}$ forms a convex set of convex functions;
(h) $H_\Phi(Z) = \sup_{T \succ 0} \big\{\mathbb{E}[D\Phi[T](Z - T) - D\Phi[\mathbb{E}T](Z - T)] + H_\Phi(T)\big\}$;
(i) $H_\Phi(Z) \preceq \sum_{i=1}^{n} \mathbb{E} H_\Phi^{(i)}(Z)$.

The proof is omitted since it directly follows from that of Theorem 3.2 without taking traces.

Remark 4.1. In item (h) of Theorem 4.2, we introduce a supremum representation for the operator-valued Φ-entropies. The supremum is defined as the least upper bound (under Löwner partial ordering) among the set of operators. In general, the supremum might not exist due to matrix partial ordering; however, the supremum in (h) exists and is attained when $T \equiv Z$. ♦

In the following, we demonstrate a monotone property of operator-valued Φ-entropies when $\Phi \in$ (C3).

Proposition 4.2 (Monotonicity of Operator-Valued Φ-Entropies). Fix a convex function $\Phi \in$ (C3). Then the operator-valued Φ-entropy $H_\Phi(Z)$ is monotone under any unital completely positive map $\mathsf{N}$, i.e.

$$H_\Phi(\mathsf{N}(Z)) \preceq H_\Phi(Z)$$

for any random matrix $Z$ taking values in $\mathbb{M}^{+}$.

Proof. If $\Phi \in$ (C3), by item (e) in Theorem 4.2, we have the joint convexity of the map

$$F_t(A, B) \triangleq t\Phi(A) + (1-t)\Phi(B) - \Phi(tA + (1-t)B)$$

for any $0 \le t \le 1$. Let $X = (A, B)$ denote the pair of matrices.

Any completely positive unital map $\mathsf{N}$ can be expressed in the following form (see e.g. [36]):

$$\mathsf{N}(A) = \sum_i K_i A K_i^\dagger,$$

where $\sum_i K_i K_i^\dagger = I$ (the identity matrix in $\mathbb{M}^{sa}$), and $\dagger$ denotes the conjugate transpose. Hence Jensen’s operator inequality (Proposition A.5) yields

$$F_t(\mathsf{N}(X)) \preceq F_t(X), \quad \forall\, 0 \le t \le 1,$$

for any completely positive unital map $\mathsf{N}$, which implies the monotonicity of $H_\Phi(Z)$. □

Following the same argument, the matrix Φ-entropy functional satisfies the monotone property if $\Phi \in$ (C2).

Corollary 4.1 (Monotonicity of Matrix Φ-Entropy Functionals). Fix a convex function $\Phi \in$ (C2). Then the matrix Φ-entropy functional $H_\Phi(Z)$ is monotone under any unital completely positive map $\mathsf{N}$: $H_\Phi(\mathsf{N}(Z)) \le H_\Phi(Z)$ for any random matrix $Z$ taking values in $\mathbb{M}^{+}$.

We remark that the monotonicity of a quantum ensemble was previously known only for $\Phi(x) = x \log x$.
This is the famous result in quantum information theory, namely, the monotone property of the Holevo quantity [37]. Our Corollary 4.1 extends the monotonicity of a quantum ensemble to any function $\Phi \in$ (C2).

5. Applications: Operator Efron-Stein Inequality

In this section, we employ the operator subadditivity of $H_{u \mapsto u^2}(Z)$ to prove the operator Efron-Stein inequality. For $1 \le i \le n$, let $X_1', \ldots, X_n'$ be independent copies of $X_1, \ldots, X_n$, and denote $\widetilde{X}^{(i)} \triangleq (X_1, \ldots, X_{i-1}, X_i', X_{i+1}, \ldots, X_n)$, i.e. replacing the $i$-th component of $X$ by the independent copy $X_i'$.

Define the quantity (see footnote 2)

$$\mathcal{E}(\mathcal{L}) \triangleq \frac{1}{2}\, \mathbb{E}\left[\sum_{i=1}^{n} \Big(\mathcal{L}(X) - \mathcal{L}\big(\widetilde{X}^{(i)}\big)\Big)^2\right],$$

and denote the operator-valued variance of a random matrix $A$ (taking values in $\mathbb{M}^{sa}$) by

$$\operatorname{Var}(A) \triangleq \mathbb{E}(A - \mathbb{E}A)^2 = \mathbb{E}A^2 - (\mathbb{E}A)^2.$$

Theorem 5.1 (Operator Efron-Stein Inequality). With the prevailing assumptions, we have

$$\operatorname{Var}(Z) \preceq \mathcal{E}(Z).$$

Proof. This theorem is a direct consequence of the subadditivity of operator-valued Φ-entropies; namely, Theorem 4.1 with $\Phi(u) = u^2$. For two independent and identically distributed random matrices $A$, $B$, direct calculation yields

$$\frac{1}{2}\mathbb{E}\big[(A - B)^2\big] = \frac{1}{2}\mathbb{E}\big[A^2 - AB - BA + B^2\big] = \operatorname{Var}(A).$$

Write $Z_i' \triangleq Z(\widetilde{X}^{(i)})$ and observe that $Z_i'$ is an independent copy of $Z$ conditioned on $X_{-i}$. Denote $\operatorname{Var}^{(i)}(Z) \triangleq \mathbb{E}_i (Z - \mathbb{E}_i Z)^2$ for all $i = 1, \ldots, n$. Then

$$\operatorname{Var}^{(i)}(Z) = \frac{1}{2}\mathbb{E}_i\big[(Z - Z_i')^2\big].$$

Finally, Theorem 4.1 and Proposition 4.1 lead to

$$\operatorname{Var}(Z) = H_{u \mapsto u^2}(Z) \preceq \sum_{i=1}^{n} \mathbb{E} H^{(i)}_{u \mapsto u^2}(Z) = \sum_{i=1}^{n} \mathbb{E}\operatorname{Var}^{(i)}(Z) = \mathcal{E}(Z).$$ □

Note that the established operator Efron-Stein inequality directly leads to a matrix polynomial Efron-Stein inequality.

Corollary 5.1 (Matrix Polynomial Efron-Stein). With the prevailing assumptions, for each natural number $p \ge 1$, we have

$$\big\|\mathbb{E}(Z - \mathbb{E}Z)^2\big\|_p^p \le \left\|\frac{1}{2}\sum_{i=1}^{n} \mathbb{E}\big[(Z - Z_i')^2\big]\right\|_p^p.$$

Corollary 5.1 is a variant of the matrix polynomial Efron-Stein inequality derived in [25, Theorem 4.2].
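The operator Efron-Stein inequality $\operatorname{Var}(Z) \preceq \mathcal{E}(Z)$ can be verified by exact enumeration on a toy model. The following Python sketch is an illustration under assumptions that are not in the paper: three independent Bernoulli variables with made-up parameters `p`, and a hypothetical random matrix $Z(x) = (B_0 + \sum_i x_i B_i)^2$ built from fixed random Hermitian matrices `B`. It checks that $\mathcal{E}(Z) - \operatorname{Var}(Z)$ has no negative eigenvalue.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, d = 3, 4
p = np.array([0.3, 0.5, 0.8])  # P(X_i = 1); hypothetical values

def herm(d):
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return (G + G.conj().T) / 2

B = [herm(d) for _ in range(n + 1)]  # fixed Hermitian building blocks

def Z(x):
    # Z(x) = (B_0 + sum_i x_i B_i)^2, positive semi-definite by construction
    S = B[0] + sum(xi * Bi for xi, Bi in zip(x, B[1:]))
    return S @ S

def prob(x):
    return float(np.prod([p[i] if x[i] else 1 - p[i] for i in range(n)]))

points = list(itertools.product([0, 1], repeat=n))

# Operator-valued variance Var(Z) = E Z^2 - (E Z)^2
EZ = sum(prob(x) * Z(x) for x in points)
EZ2 = sum(prob(x) * Z(x) @ Z(x) for x in points)
var = EZ2 - EZ @ EZ

# E(Z) = (1/2) sum_i E[(Z(X) - Z(X~(i)))^2], where X~(i) resamples coordinate i
es = np.zeros((d, d), dtype=complex)
for i in range(n):
    for x in points:
        for xi_new in (0, 1):
            w = prob(x) * (p[i] if xi_new else 1 - p[i])
            y = x[:i] + (xi_new,) + x[i + 1:]
            D = Z(x) - Z(y)
            es += 0.5 * w * D @ D

gap = np.linalg.eigvalsh(es - var).min()
print(gap)
```

A non-negative `gap` (up to rounding) is what the theorem predicts; the check is in the Löwner order, so it is stronger than comparing traces or norms.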
Footnote 2: Note that we will use the notation $\mathcal{E}(\mathcal{L})$ and $\mathcal{E}(Z)$ interchangeably.

6. Proof of Theorem 3.2

Proof.

(a) ⇒ (i) ⇒ (g) ⇒ (j): This statement is proved by Chen and Tropp in [19].

(a) ⇔ (b): This equivalent statement is proved in [21, Theorem 2].

(a) ⇔ (d): Theorem 2.1 in [20] proved the equivalence of (a) and the following convexity lemma.

Lemma 6.1 (Convexity Lemma [19, Lemma 4.2]). Fix a function $\Phi \in$ (C2), and let $\Psi = \Phi'$. Suppose that $A$ is a random matrix taking values in $\mathbb{M}_d^{++}$, and let $X$ be a random matrix taking values in $\mathbb{M}_d^{sa}$. Assume that $\|A\|$, $\|X\|$ are integrable. Then

$$\mathbb{E}\,\langle X, D\Psi[A](X) \rangle \ge \langle \mathbb{E}[X], D\Psi[\mathbb{E}A](\mathbb{E}X) \rangle.$$

What remains is to establish the equivalence between the convexity lemma and condition (d). This follows easily from Lemma A.1:

$$\operatorname{Tr}\big(D^2\Phi[A](X,X)\big) = \langle X, D\Phi'[A](X) \rangle.$$

Remark 6.1. In Ref. [19, Lemma 4.2], it is shown that the concavity of the map

$$A \mapsto \big\langle X, (D\Psi[A])^{-1}(X) \big\rangle, \quad \forall X \in \mathbb{M}_d^{sa},$$

implies the joint convexity of the map (i.e. Lemma 6.1)

$$(X, A) \mapsto \langle X, (D\Psi[A])(X) \rangle. \tag{6.1}$$

(b) ⇔ (c) ⇔ (d): Define $A_\Phi, B_\Phi, C_\Phi : \mathbb{M}_d^{+} \times \mathbb{M}_d^{+} \to \mathbb{R}$ as

$$A_\Phi(u,v) \triangleq \operatorname{Tr}[\Phi(u+v) - \Phi(u) - D\Phi[u](v)],$$
$$B_\Phi(u,v) \triangleq \operatorname{Tr}[D\Phi[u+v](v) - D\Phi[u](v)],$$
$$C_\Phi(u,v) \triangleq \operatorname{Tr}[D^2\Phi[u](v,v)].$$

Following [18], we can establish the following relations: for any $(u,v) \in \mathbb{M}_d^{+} \times \mathbb{M}_d^{+}$,

$$A_\Phi(u,v) = \int_0^1 (1-s)\, C_\Phi(u + sv, v)\, ds, \tag{6.2}$$
$$B_\Phi(u,v) = \int_0^1 C_\Phi(u + sv, v)\, ds, \tag{6.3}$$

and for small enough $\epsilon > 0$,

$$A_\Phi(u, \epsilon v) = \frac{1}{2} C_\Phi(u,v)\,\epsilon^2 + o(\epsilon^2); \tag{6.4}$$
$$B_\Phi(u, \epsilon v) = C_\Phi(u,v)\,\epsilon^2 + o(\epsilon^2). \tag{6.5}$$

Eq. (6.2) is exactly the integral representation for the matrix Brégman divergence proved in [21]. Similarly, Eq. (6.3) follows from

$$B_\Phi(u,v) = \frac{d}{ds}\operatorname{Tr}[\Phi(u+sv)]\Big|_{s=1} - \frac{d}{ds}\operatorname{Tr}[\Phi(u+sv)]\Big|_{s=0} = \int_0^1 \frac{d}{ds}\left(\frac{d}{ds}\operatorname{Tr}[\Phi(u+sv)]\right) ds = \int_0^1 C_\Phi(u + sv, v)\, ds.$$
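The integral representations (6.2) and (6.3) can be sanity-checked numerically. The sketch below is only an illustration: it assumes the test function $\Phi(u) = u^3$, chosen because its Fréchet derivatives have closed forms ($D\Phi[u](v) = u^2 v + uvu + vu^2$ and $\operatorname{Tr} D^2\Phi[u](v,v) = 6\operatorname{Tr}(uv^2)$), not because it is claimed to belong to (C2). The identities are Taylor-type formulas, so they do not depend on class membership. Helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 3

def psd(d):
    # Random real positive semi-definite matrix
    G = rng.normal(size=(d, d))
    return G @ G.T

u, v = psd(d), psd(d)

def cube(a):
    return a @ a @ a

def D1(u, v):
    # D Phi[u](v) for Phi(u) = u^3
    return u @ u @ v + u @ v @ u + v @ u @ u

def C(u, v):
    # C_Phi(u, v) = Tr D^2 Phi[u](v, v) = 6 Tr(u v^2) for Phi(u) = u^3
    return 6.0 * np.trace(u @ v @ v)

# Left-hand sides of (6.2) and (6.3)
A_val = np.trace(cube(u + v) - cube(u) - D1(u, v))
B_val = np.trace(D1(u + v, v) - D1(u, v))

# Right-hand sides via Gauss-Legendre quadrature mapped from [-1, 1] to [0, 1];
# the integrands are polynomials in s of degree at most 2, so 4 nodes are exact.
nodes, weights = np.polynomial.legendre.leggauss(4)
s = 0.5 * (nodes + 1.0)
w = 0.5 * weights
int_A = sum(wi * (1 - si) * C(u + si * v, v) for si, wi in zip(s, w))
int_B = sum(wi * C(u + si * v, v) for si, wi in zip(s, w))

print(A_val, int_A)
print(B_val, int_B)
```

For this Φ both sides reduce to $3\operatorname{Tr}(uv^2) + \operatorname{Tr}(v^3)$ in (6.2) and $6\operatorname{Tr}(uv^2) + 3\operatorname{Tr}(v^3)$ in (6.3), which gives an independent closed-form cross-check.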
