Extended Gray–Wyner System with Complementary Causal Side Information

Cheuk Ting Li and Abbas El Gamal
Department of Electrical Engineering, Stanford University
Email: [email protected], [email protected]

Abstract: We establish the rate region of an extended Gray–Wyner system for a 2-DMS $(X,Y)$ with two additional decoders having complementary causal side information. This extension is interesting because, in addition to the operationally significant extreme points of the Gray–Wyner rate region, which include Wyner's common information, the Gács–Körner common information and the information bottleneck, the rate region for the extended system also includes the Körner graph entropy, the privacy funnel and excess functional information, as well as three new quantities of potential interest, as extreme points. To simplify the investigation of the 5-dimensional rate region of the extended Gray–Wyner system, we establish an equivalence of this region to a 3-dimensional mutual information region that consists of the set of all triples of the form $(I(X;U),\ I(Y;U),\ I(X,Y;U))$ for some $p_{U|X,Y}$. We further show that projections of this mutual information region yield the rate regions for many settings involving a 2-DMS, including lossless source coding with causal side information, distributed channel synthesis, and lossless source coding with a helper.

Index Terms: Gray–Wyner system, side information, complementary delivery, Körner graph entropy, privacy funnel.

I. INTRODUCTION

The lossless Gray–Wyner system [1] is a multi-terminal source coding setting for a two-component discrete memoryless source (2-DMS) $(X,Y)$ with one encoder and two decoders. This setup draws some of its significance from providing operational interpretations for several information theoretic quantities of interest, namely Wyner's common information [2], the Gács–Körner common information [3], the necessary conditional entropy [4], and the information bottleneck [5].

In this paper, we consider an extension of the Gray–Wyner system (henceforth called the EGW system), which includes two new individual descriptions and two decoders with causal side information as depicted in Figure 1. The encoder maps sequences from a 2-DMS $(X,Y)$ into five indices $M_i \in [1:2^{nR_i}]$, $i = 0,\ldots,4$. Decoders 1 and 2 correspond to those of the Gray–Wyner system, that is, decoder 1 recovers $X^n$ from $(M_0,M_1)$ and decoder 2 recovers $Y^n$ from $(M_0,M_2)$. At time $i \in [1:n]$, decoder 3 recovers $X_i$ causally from $(M_0,M_3,Y^i)$ and decoder 4 similarly recovers $Y_i$ causally from $(M_0,M_4,X^i)$. Note that decoders 3 and 4 correspond to those of the complementary delivery setup studied in [6], [7] with causal (instead of noncausal) side information and with two additional private indices $M_3$ and $M_4$. This extended Gray–Wyner system setup is lossless, that is, the decoders recover their respective source sequences with probability of error that vanishes as $n$ approaches infinity. The rate region $\mathcal{R}$ of the EGW system is defined in the usual way as the closure of the set of achievable rate tuples $(R_0,R_1,R_2,R_3,R_4)$.

The first contribution of this paper is to establish the rate region of the EGW system. Moreover, to simplify the study of this rate region and its extreme points, we show that it is equivalent to the 3-dimensional mutual information region for $(X,Y)$ defined as
$$\mathcal{I}_{XY} = \bigcup_{p_{U|XY}} \{(I(X;U),\ I(Y;U),\ I(X,Y;U))\} \subseteq \mathbb{R}^3 \qquad (1)$$
in the sense that we can express $\mathcal{R}$ using $\mathcal{I}_{XY}$ and vice versa.
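For concreteness, the triple in (1) can be evaluated numerically for any finite-alphabet source and test channel. The following is a minimal sketch (not from the paper; it assumes only numpy and uses bits as the unit) that computes $(I(X;U), I(Y;U), I(X,Y;U))$ from a joint pmf $p_{XY}$ and a conditional pmf $p_{U|XY}$; later sketches reuse these helpers.

```python
import numpy as np

def entropy(p):
    """Entropy in bits; zero-probability entries contribute 0."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def mi_triple(p_xy, p_u_given_xy):
    """Return (I(X;U), I(Y;U), I(X,Y;U)) for a joint pmf p_xy of shape (|X|,|Y|)
    and a test channel p_u_given_xy of shape (|X|,|Y|,|U|)."""
    p_xyu = p_xy[:, :, None] * p_u_given_xy             # p(x, y, u)
    p_xu, p_yu = p_xyu.sum(axis=1), p_xyu.sum(axis=0)   # p(x, u), p(y, u)
    h_u = entropy(p_xu.sum(axis=0))                     # H(U)
    i_xu = entropy(p_xu.sum(axis=1)) + h_u - entropy(p_xu)
    i_yu = entropy(p_yu.sum(axis=1)) + h_u - entropy(p_yu)
    i_xyu = entropy(p_xy) + h_u - entropy(p_xyu)
    return i_xu, i_yu, i_xyu

# Example: doubly symmetric binary source with crossover 0.1, and U = X.
p_xy = np.array([[0.45, 0.05],
                 [0.05, 0.45]])
u_eq_x = np.zeros((2, 2, 2))
u_eq_x[0, :, 0] = 1.0   # p(U=0 | X=0, y) = 1
u_eq_x[1, :, 1] = 1.0   # p(U=1 | X=1, y) = 1
print(mi_triple(p_xy, u_eq_x))   # approx (1.0, 0.531, 1.0) = (H(X), I(X;Y), H(X))
```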
As a consequence, and of particular interest, the extreme points of the rate region $\mathcal{R}$ (and its equivalent mutual information region $\mathcal{I}_{XY}$) for the EGW system include, in addition to the aforementioned extreme points of the Gray–Wyner system, the Körner graph entropy [8], the privacy funnel [9] and excess functional information [10], as well as three new quantities with interesting operational meanings, which we refer to as the maximal interaction information, the asymmetric private interaction information, and the symmetric private interaction information. These extreme points can be cast as maximizations of the interaction information [11] $I(X;Y|U)-I(X;Y)$ under various constraints. They can be considered as distances from extreme dependency, as they are equal to zero only under certain conditions of extreme dependency. In addition to providing operational interpretations for these information theoretic quantities, projections of the mutual information region yield the rate regions for many settings involving a 2-DMS, including lossless source coding with causal side information [12], distributed channel synthesis [13], [14], and lossless source coding with a helper [15], [16], [17].

A related extension of the lossy Gray–Wyner system with two decoders with causal side information was studied by Timo and Vellambi [18]. If we only consider decoders 3 and 4 in the EGW system, then it can be viewed as a special case of their setting (where the side information does not need to be complementary). Other source coding setups related to the EGW can be found in [19], [12], [20], [21], [22]. A related 3-dimensional region, called the region of tension, was investigated by Prabhakaran and Prabhakaran [23], [24]. We show that this region can be obtained from the mutual information region, but the other direction does not hold in general.

[Figure 1. Extended Gray–Wyner system: the encoder maps $(X^n,Y^n)$ into indices $M_0,\ldots,M_4$; decoder 1 recovers $\hat{X}^n$ from $(M_0,M_1)$, decoder 2 recovers $\hat{Y}^n$ from $(M_0,M_2)$, decoder 3 recovers $\hat{X}_{3i}$ from $(M_0,M_3,Y^i)$, and decoder 4 recovers $\hat{Y}_{4i}$ from $(M_0,M_4,X^i)$.]

In the following section, we establish the rate region of the EGW system, relate it to the mutual information region, and show that the region of the original Gray–Wyner system and the region of tension can be obtained from the mutual information region. In Section III, we study the extreme points of the mutual information region. In Section IV we establish the rate region for the same setup as the EGW system but with noncausal instead of causal side information at decoders 3 and 4. We show that the rate region of the noncausal EGW can be expressed in terms of the Gray–Wyner region, hence it does not contain as many interesting extreme points as the causal EGW. Moreover, we show that this region is equivalent to the closure of the limit of the mutual information region for $(X^n,Y^n)$ as $n$ approaches infinity.

A. Notation

Throughout this paper, we assume that $\log$ is base 2 and the entropy $H$ is in bits. We use the notation $X_a^b = (X_a,\ldots,X_b)$, $X^n = X_1^n$ and $[a:b] = [a,b] \cap \mathbb{Z}$. For discrete $X$, we write the probability mass function as $p_X$. For $A \subseteq \mathbb{R}^n$, we write the closure of $A$ as $\mathrm{cl}(A)$ and the convex hull as $\mathrm{conv}(A)$. We write the support function as
$$\psi_A(b) = \sup\{a^T b : a \in A\}.$$
We write the one-sided directional derivative of the support function as
$$\psi'_A(b;c) = \lim_{t \to 0^+} \frac{1}{t}\bigl(\psi_A(b+tc) - \psi_A(b)\bigr).$$
Note that if $A$ is compact and convex, then
$$\psi'_A(b;c) = \max\Bigl\{d^T c : d \in \operatorname*{arg\,max}_{a \in A}\, a^T b\Bigr\}.$$
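Since the extreme points in Section III are expressed through $\psi$ and $\psi'$, a small numerical sketch may help fix ideas. The snippet below is an illustration only (it assumes the convex set is a polytope given by its vertices; it is not code from the paper) and evaluates the two definitions above.

```python
import numpy as np

def support_fn(vertices, b):
    """psi_A(b) = max over a in A of <a, b>, for A = conv(vertices)."""
    return float((vertices @ b).max())

def support_fn_dderiv(vertices, b, c, tol=1e-9):
    """One-sided directional derivative psi'_A(b; c): maximize <d, c>
    over the face of A on which <a, b> is maximized."""
    vals = vertices @ b
    face = vertices[vals >= vals.max() - tol]   # vertices attaining the max
    return float((face @ c).max())

# Toy check on the unit square [0,1]^2.
square = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
print(support_fn(square, np.array([1., 1.])))                             # 2.0
print(support_fn_dderiv(square, np.array([1., 0.]), np.array([0., 1.])))  # 1.0
```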
II. RATE REGION OF EGW AND THE MUTUAL INFORMATION REGION

The rate region of the EGW system is given in the following.

Theorem 1. The rate region $\mathcal{R}$ of the EGW system is the set of rate tuples $(R_0,R_1,R_2,R_3,R_4)$ such that
$$R_0 \ge I(X,Y;U),\quad R_1 \ge H(X|U),\quad R_2 \ge H(Y|U),\quad R_3 \ge H(X|Y,U),\quad R_4 \ge H(Y|X,U)$$
for some $p_{U|XY}$, where $|\mathcal{U}| \le |\mathcal{X}|\cdot|\mathcal{Y}|+2$.

Note that if we ignore decoders 3 and 4, i.e., let $R_3, R_4$ be sufficiently large, then this region reduces to the Gray–Wyner region.

Proof: The converse proof is quite straightforward and is given in Appendix A for completeness. We now prove achievability.

Codebook generation. Fix $p_{U|XY}$ and randomly and independently generate $2^{nR_0}$ sequences $u^n(m_0)$, $m_0 \in [1:2^{nR_0}]$, each according to $\prod_{i=1}^n p_U(u_i)$. Given $u^n(m_0)$, assign indices $m_1 \in [1:2^{nR_1}]$, $m_2 \in [1:2^{nR_2}]$ to the sequences in the conditional typical sets $\mathcal{T}_\epsilon^{(n)}(X\,|\,u^n(m_0))$ and $\mathcal{T}_\epsilon^{(n)}(Y\,|\,u^n(m_0))$, respectively. For each $y \in \mathcal{Y}$, $u \in \mathcal{U}$, assign indices $m_{3,y,u} \in [1:2^{nR_{3,y,u}p_{YU}(y,u)}]$ to the sequences in $\mathcal{T}_\epsilon^{n(1+\epsilon)p_{YU}(y,u)}(X\,|\,y,u)$, where $\sum_{y,u} R_{3,y,u}\,p_{YU}(y,u) \le R_3$. Define $m_{4,x,u}$ similarly.

Encoding. To encode the sequence $(x^n,y^n)$, find $m_0$ such that $(u^n(m_0),x^n,y^n) \in \mathcal{T}_\epsilon^{(n)}$ is jointly typical, and find the indices $m_1, m_2$ of $x^n, y^n$ in $\mathcal{T}_\epsilon^{(n)}(X\,|\,u^n(m_0))$ and $\mathcal{T}_\epsilon^{(n)}(Y\,|\,u^n(m_0))$ given $u^n(m_0)$. For each $y,u$, let $x^n_{y,u}$ be the subsequence of $x^n$ where $x_i$ is included if and only if $y_i = y$ and $u_i(m_0) = u$. Note that since $(u^n(m_0),y^n) \in \mathcal{T}_\epsilon^{(n)}$, the length of $x^n_{y,u}$ is not greater than $n(1+\epsilon)p_{YU}(y,u)$. We then find an index $m_{3,y,u}$ of $\hat{x}^{n(1+\epsilon)p_{YU}(y,u)} \in \mathcal{T}_\epsilon^{n(1+\epsilon)p_{YU}(y,u)}(X\,|\,y,u)$ such that $x^n_{y,u}$ is a prefix of $\hat{x}^{n(1+\epsilon)p_{YU}(y,u)}$, and output $m_3$ as the concatenation of $m_{3,y,u}$ for all $y,u$. Similarly for $m_4$.

Decoding. Decoder 1 outputs the sequence corresponding to the index $m_1$ in $\mathcal{T}_\epsilon^{(n)}(X\,|\,u^n(m_0))$. Decoder 2 performs similarly using $(m_0,m_2)$. Decoder 3, upon observing $y_i$, finds the sequence $\hat{x}^{n(1+\epsilon)p_{YU}(y_i,u_i(m_0))}_{y_i,u_i(m_0)}$ at the index $m_{3,y_i,u_i(m_0)}$ in $\mathcal{T}_\epsilon^{n(1+\epsilon)p_{YU}(y_i,u_i(m_0))}(X\,|\,y_i,u_i(m_0))$, and outputs the next symbol in the sequence that has not previously been used. Decoder 4 performs similarly using $(m_0,m_4)$.

Analysis of the probability of error. By the covering lemma, the probability that there does not exist $m_0$ such that $(u^n(m_0),x^n,y^n) \in \mathcal{T}_\epsilon^{(n)}$ tends to 0 if $R_0 > I(X,Y;U)$. Also $|\mathcal{T}_\epsilon^{(n)}(X\,|\,u^n(m_0))| \le 2^{nR_1}$ for large $n$ if $R_1 > H(X|U) + \delta(\epsilon)$ (similarly for $R_2 > H(Y|U) + \delta(\epsilon)$). Note that $(u^n(m_0),x^n,y^n) \in \mathcal{T}_\epsilon^{(n)}$ implies
$$\frac{|\{i : x_i = x,\ y_i = y,\ u_i(m_0) = u\}|}{n(1+\epsilon)p_{YU}(y,u)} \le \frac{(1+\epsilon)p_{XYU}(x,y,u)}{(1+\epsilon)p_{YU}(y,u)} \le p_{X|YU}(x|y,u)$$
for all $(y,u)$. Hence there exists $\hat{x}^{n(1+\epsilon)p_{YU}(y,u)} \in \mathcal{T}_\epsilon^{n(1+\epsilon)p_{YU}(y,u)}(X\,|\,y,u)$ such that $x^n_{y,u}$ is a prefix of $\hat{x}^{n(1+\epsilon)p_{YU}(y,u)}$. Moreover, $|\mathcal{T}_\epsilon^{n(1+\epsilon)p_{YU}(y,u)}(X\,|\,y,u)| \le 2^{nR_{3,y,u}p_{YU}(y,u)}$ for large $n$ if $R_{3,y,u} > (1+\epsilon)H(X|Y=y,U=u) + \delta(\epsilon)$. Hence we can assign suitable $R_{3,y,u}$ for each $y,u$ if $R_3 > (1+\epsilon)H(X|Y,U) + \delta(\epsilon)$.
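As a quick numerical check of Theorem 1, the corner point $(I(X,Y;U), H(X|U), H(Y|U), H(X|Y,U), H(Y|X,U))$ achieved by a particular test channel can be computed directly. The sketch below uses hypothetical helper names and reuses `entropy` and `mi_triple` from the earlier sketch; it is an illustration, not part of the paper.

```python
import numpy as np

def egw_corner_rates(p_xy, p_u_given_xy):
    """Corner point of Theorem 1 for a chosen p_{U|XY}:
    (I(X,Y;U), H(X|U), H(Y|U), H(X|Y,U), H(Y|X,U))."""
    v_x, v_y, v_xy = mi_triple(p_xy, p_u_given_xy)
    h_x = entropy(p_xy.sum(axis=1))
    h_y = entropy(p_xy.sum(axis=0))
    h_xy = entropy(p_xy)
    return (v_xy,                          # R0 >= I(X,Y;U)
            h_x - v_x,                     # R1 >= H(X|U)
            h_y - v_y,                     # R2 >= H(Y|U)
            (h_xy - h_y) - v_xy + v_y,     # R3 >= H(X|Y) - I(X;U|Y) = H(X|Y,U)
            (h_xy - h_x) - v_xy + v_x)     # R4 >= H(Y|X) - I(Y;U|X) = H(Y|X,U)

# With U = X for the example source above: R3 = H(X|Y,U) = 0 while R4 = H(Y|X).
print(egw_corner_rates(p_xy, u_eq_x))
```

The same map from $v = (v_X, v_Y, v_{XY})$ to rate tuples reappears in Proposition 1 below.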
Although $\mathcal{R}$ is 5-dimensional, the bounds on the rates can be expressed in terms of three quantities, $I(X;U)$, $I(Y;U)$ and $I(X,Y;U)$, together with other constant quantities that involve only the given $(X,Y)$. This leads to the following equivalence of $\mathcal{R}$ to the mutual information region $\mathcal{I}_{XY}$ defined in (1). We denote the components of a vector $v \in \mathcal{I}_{XY}$ by $v = (v_X, v_Y, v_{XY})$.

Proposition 1. The rate region for the EGW system can be expressed as
$$\mathcal{R} = \bigcup_{v \in \mathcal{I}_{XY}} \bigl\{\bigl(v_{XY},\; H(X)-v_X,\; H(Y)-v_Y,\; H(X|Y)-v_{XY}+v_Y,\; H(Y|X)-v_{XY}+v_X\bigr)\bigr\} + [0,\infty)^5, \qquad (2)$$
where the last "+" denotes the Minkowski sum. Moreover, the mutual information region for $(X,Y)$ can be expressed as
$$\mathcal{I}_{XY} = \bigl\{v \in \mathbb{R}^3 : \bigl(v_{XY},\; H(X)-v_X,\; H(Y)-v_Y,\; H(X|Y)-v_{XY}+v_Y,\; H(Y|X)-v_{XY}+v_X\bigr) \in \mathcal{R}\bigr\}. \qquad (3)$$

Proof: Note that (2) follows from the definitions of $\mathcal{R}$ and $\mathcal{I}_{XY}$. We now prove (3). The $\subseteq$ direction follows from (2). For the $\supseteq$ direction, let $v \in \mathbb{R}^3$ satisfy
$$\bigl(v_{XY},\; H(X)-v_X,\; H(Y)-v_Y,\; H(X|Y)-v_{XY}+v_Y,\; H(Y|X)-v_{XY}+v_X\bigr) \in \mathcal{R}.$$
Then by Theorem 1, there exists $U$ such that
$$v_{XY} \ge I(X,Y;U), \qquad (4)$$
$$H(X)-v_X \ge H(X|U), \qquad (5)$$
$$H(Y)-v_Y \ge H(Y|U), \qquad (6)$$
$$H(X|Y)-v_{XY}+v_Y \ge H(X|Y,U), \qquad (7)$$
$$H(Y|X)-v_{XY}+v_X \ge H(Y|X,U). \qquad (8)$$
Adding (4) and (8), we have $v_X \ge I(X;U)$. Combining this with (5), we have $v_X = I(X;U)$. Similarly $v_Y = I(Y;U)$. Substituting this into (7), we have $v_{XY} \le I(X,Y;U)$. Combining this with (4), we have $v_{XY} = I(X,Y;U)$. Hence $v \in \mathcal{I}_{XY}$.

In the following we list several properties of $\mathcal{I}_{XY}$.

Proposition 2. The mutual information region $\mathcal{I}_{XY}$ satisfies:

1) Compactness and convexity. $\mathcal{I}_{XY}$ is compact and convex.

2) Outer bound. $\mathcal{I}_{XY} \subseteq \mathcal{I}^o_{XY}$, where $\mathcal{I}^o_{XY}$ is the set of $v$ such that
$$v_X, v_Y \ge 0,\quad v_X+v_Y-v_{XY} \le I(X;Y),\quad 0 \le v_{XY}-v_Y \le H(X|Y),\quad 0 \le v_{XY}-v_X \le H(Y|X).$$

3) Inner bound. $\mathcal{I}_{XY} \supseteq \mathcal{I}^i_{XY}$, where $\mathcal{I}^i_{XY}$ is the convex hull of the points $(0,0,0)$, $(H(X),\, I(X;Y),\, H(X))$, $(I(X;Y),\, H(Y),\, H(Y))$, $(H(X),\, H(Y),\, H(X,Y))$, $(H(X|Y),\, H(Y|X),\, H(X|Y)+H(Y|X))$. Moreover, there exist $0 \le \epsilon_1, \epsilon_2 \le \log I(X;Y)+4$ such that
$$(0,\; H(Y|X)-\epsilon_1,\; H(Y|X)),\ \ (H(X|Y)-\epsilon_2,\; 0,\; H(X|Y)) \in \mathcal{I}_{XY}.$$

4) Superadditivity. If $(X_1,Y_1)$ is independent of $(X_2,Y_2)$, then
$$\mathcal{I}_{X_1,Y_1} + \mathcal{I}_{X_2,Y_2} \subseteq \mathcal{I}_{(X_1,X_2),(Y_1,Y_2)},$$
where $+$ denotes the Minkowski sum. As a result, if $(X_i,Y_i) \sim p_{XY}$ i.i.d. for $i = 1,\ldots,n$, then $\mathcal{I}_{XY} \subseteq (1/n)\,\mathcal{I}_{X^n,Y^n}$.

5) Data processing. If $X_2 - X_1 - Y_1 - Y_2$ forms a Markov chain, then for any $v \in \mathcal{I}_{X_1,Y_1}$, there exists $w \in \mathcal{I}_{X_2,Y_2}$ such that
$$w_X \le v_X,\quad w_Y \le v_Y,\quad w_{XY} \le v_{XY},\quad I(X_2;Y_2)-w_X-w_Y+w_{XY} \le I(X_1;Y_1)-v_X-v_Y+v_{XY}.$$

6) Cardinality bound.
$$\mathcal{I}_{XY} = \bigcup_{p_{U|XY}:\ |\mathcal{U}| \le |\mathcal{X}|\cdot|\mathcal{Y}|+2} \{(I(X;U),\ I(Y;U),\ I(X,Y;U))\}.$$

7) Relation to Gray–Wyner region and region of tension. The Gray–Wyner region can be obtained from $\mathcal{I}_{XY}$ as
$$\mathcal{R}_{GW} = \bigcup_{p_{U|XY}} \{(I(X,Y;U),\ H(X|U),\ H(Y|U))\} + [0,\infty)^3 = \bigcup_{v \in \mathcal{I}_{XY}} \{(v_{XY},\ H(X)-v_X,\ H(Y)-v_Y)\} + [0,\infty)^3.$$
The region of tension can be obtained from $\mathcal{I}_{XY}$ as
$$\mathcal{T} = \bigcup_{p_{U|XY}} \{(I(Y;U|X),\ I(X;U|Y),\ I(X;Y|U))\} + [0,\infty)^3 = \bigcup_{v \in \mathcal{I}_{XY}} \{(v_{XY}-v_X,\ v_{XY}-v_Y,\ I(X;Y)-v_X-v_Y+v_{XY})\} + [0,\infty)^3.$$

The proof of this proposition is given in Appendix B.
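The cardinality bound in property 6 makes a crude numerical exploration of $\mathcal{I}_{XY}$ possible: sampling random test channels with $|\mathcal{U}| = |\mathcal{X}||\mathcal{Y}|+2$ yields an inner scatter of the region. The sketch below is an illustration only (it assumes the `mi_triple` helper from the earlier sketch and numpy's Dirichlet sampler), not a method from the paper.

```python
import numpy as np

def sample_mi_region(p_xy, num_samples=5000, seed=0):
    """Monte Carlo inner scatter of the mutual information region I_XY."""
    rng = np.random.default_rng(seed)
    nx, ny = p_xy.shape
    nu = nx * ny + 2                      # cardinality bound of Proposition 2.6
    points = []
    for _ in range(num_samples):
        # each (x, y) row of p(u | x, y) is a Dirichlet-distributed pmf over U
        p_u_given_xy = rng.dirichlet(np.ones(nu), size=(nx, ny))
        points.append(mi_triple(p_xy, p_u_given_xy))
    return np.array(points)               # rows: (I(X;U), I(Y;U), I(X,Y;U))
```

Plotting such a scatter against the outer bound $\mathcal{I}^o_{XY}$ of property 2 gives a rough picture of how tight the bounds of Proposition 2 are for a particular source.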
XY Wyner’s common information [2] J(X;Y)= min I(X,Y;U) X−U−Y 5 γ H(Y †X) I(X;Y) J(X;Y)−I(X;Y) bottlenecktradeoff H(X†Y) β Information HK(GYX,Y)−H(Y|Xα) K(X;Y) HK(GXY,X)−H(X|Y) GR∗(X→Y) H(Y|X) Ψ(X→Y) GR∗(Y →X) Privacy funneltradeoff −GPPI(X;Y) α H(X|Y) α Ψ(Y →X) −GPNI(X→Y) −GPNI(Y →X) −GNNI(X;Y) −H(Y|X) Figure 2. Illustration of IXY (yellow), IXiY (green) and IXoY (grey) defined in Proposition 2. The axes are α = I(X;U|Y) = vXY −vY, β=I(Y;U|X)=vXY −vX andγ=vX+vY −vXY,i.e.,themutualinformationI(X;Y;U).Withoutlossofgenerality,weassumeH(X)≥H(Y). Notethattheoriginal Gray–Wynerregionandtheregionoftensioncorrespondtotheupper-left corner. can be expressed as J(X;Y)=min{v : v ∈I , v +v −v =I(X;Y)} XY XY X Y XY =min R : R4 ∈R, R +R +R =H(X,Y) 0 0 0 1 2 =−ψ′(cid:8) (1,1,−1; 0,0,−1). (cid:9) I XY Gács-Körner common information [3], [25] K(X;Y)= max H(U)= max I(X,Y;U) U:H(U|X)=H(U|Y)=0 U:X−Y−U,U−X−Y can be expressed as K(X;Y)=max{v : v ∈I , v =v =v } XY XY X Y XY =max R : R4 ∈R, R +R =H(X), R +R =H(Y) 0 0 0 1 0 2 =ψ′ (cid:8)(1,1,−2; 0,0,1). (cid:9) I XY Körner graph entropy [8], [26]. Let G be a graph with a set of vertices X and edges between confusable symbols upon XY observing Y, i.e., there is an edge (x ,x ) if p(x ,y),p(x ,y)>0 for some y. The Körner graph entropy 1 2 1 2 H (G ,X)= min I(X;U) K XY U:U−X−Y,H(X|Y,U)=0 6 β 1. Plane γ =I(X;Y) γ H(Y|X) 1 α β H(Y†X) 2 3 J(X;Y)−I(X;Y) αα 4 α H(X†Y) H(X|Y) 2. Plane β =0 3. Plane α=H(X|Y) γ γ H(X†Y) I(X;Y) I(X;Y) bottlenecktradeoff mation HK(GXY,X)−H(X|Y) Infor K(X;Y) Privacyfunneltradeoff H(Y|X) GR∗(Y→X) H(X|Y) α β γ 4. Plane β+γ =0 α H(X|Y) GR∗(Y→X) Ψ(Y→X) −GPPI(X;Y) −GPNI(Y→X) −GNNI(X;Y) −H(Y|X) −H(Y|X) Figure 3. Illustration of IXY (yellow), IXiY (green) and IXoY (grey) restricted to different planes. The axes are α = I(X;U|Y) = vXY −vY, β=I(Y;U|X)=vXY −vX andγ=vX+vY −vXY.WeassumeH(X)≥H(Y). can be expressed as H (G ,X)=min{v : v ∈I , v =v , v −v =H(X|Y)} K XY X XY X XY XY Y =min R : R4 ∈R, R +R =H(X), R =0 0 0 0 1 3 =−ψ′(cid:8) (1,−1,0; −1,0,0). (cid:9) I XY In the Gray–Wyner system with causal complementary side information, H (G ,X) corresponds to the setting with only K XY decoders1,3andM =∅,andwerestrictthesumrateR +R =H(X).Thisisinlinewiththelosslesssourcecodingsetting 3 0 1 with causal side information [12], where the optimal rate is also given by H (G ,X). An intuitive reason of this equality K XY is that R +R = H(X) and the recovery requirement of decoder 1 forces M and M to contain negligible information 0 1 0 1 outside Xn, hence the setting is similar to the case in which the encoder has access only to Xn. This corresponds to lossless source coding with causal side information setting. Necessary conditionalentropy[4](also see H(Y ցX|X)in [27],G(Y →X)in [28], privateinformationin [29] and[30]) H(Y †X)= min H(U|X)= min I(Y;U)−I(X;Y) U:H(U|Y)=0,X−U−Y U:X−Y−U,X−U−Y 7 can be expressed as H(Y †X)=min{v : v ∈I , v =v , v =I(X;Y)}−I(X;Y) XY XY Y XY X =min R : R4 ∈R, R +R =H(Y), R =H(X|Y) 0 0 0 2 1 =−ψ′(cid:8) (1,2,−2; 1,0,−1). (cid:9) I XY Information bottleneck [5] G (t, X →Y)= min I(Y;U) IB U:X−Y−U,I(X;U)≥t can be expressed as G (t, X →Y)=min{v : v ∈I , v =v , v ≥t} IB Y XY Y XY X =min R : R4 ∈R, R +R =H(Y), R ≤H(X)−t . 0 0 0 2 1 (cid:8) (cid:9) Note that the same tradeoff also appears in common randomness extraction on a 2-DMS with one-way communication [31], lossless source coding with a helper [15], [16], [17], and a quantity studied by Witsenhausen and Wyner [32]. 
It is shown in [33] that the slope of the information bottleneck tradeoff is given by the chordal slope of the hypercontractivity of the Markov operator [34]:
$$s^*(Y \to X) = \sup_{U:\ X-Y-U} \frac{I(X;U)}{I(Y;U)} = \sup\{v_X/v_Y : v \in \mathcal{I}_{XY},\ v_Y = v_{XY}\}.$$

Privacy funnel [9] (also see the rate-privacy function defined in [29])
$$G_{PF}(t,\ X \to Y) = \min_{U:\ X-Y-U,\ I(Y;U) \ge t} I(X;U)$$
can be expressed as
$$G_{PF}(t,\ X \to Y) = \min\{v_X : v \in \mathcal{I}_{XY},\ v_Y = v_{XY},\ v_Y \ge t\} = \min\{R_0+R_4-H(Y|X) : R_0^4 \in \mathcal{R},\ R_0+R_2 = H(Y),\ R_0 \ge t\}.$$
In particular, the maximum $R_0$ for perfect privacy (written as $g_0(X;Y)$ in [29], also see [35]) is
$$G_{R*}(X \to Y) = \max\{t \ge 0 : G_{PF}(t,\ X \to Y) = 0\} = \max\{v_Y : v \in \mathcal{I}_{XY},\ v_Y = v_{XY},\ v_X = 0\} = \max\{R_0 : R_0^4 \in \mathcal{R},\ R_0+R_2 = H(Y),\ R_0+R_4 = H(Y|X)\} = \psi'_{\mathcal{I}_{XY}}(-1,1,-1;\ 0,1,0).$$
The optimal privacy-utility coefficient [35] is
$$v^*(X \to Y) = \inf_{U:\ X-Y-U} \frac{I(X;U)}{I(Y;U)} = \inf\{v_X/v_Y : v \in \mathcal{I}_{XY},\ v_Y = v_{XY}\}.$$

Excess functional information [10]
$$\Psi(X \to Y) = \min_{U:\ U \perp\!\!\!\perp X} H(Y|U) - I(X;Y)$$
is closely related to one-shot channel simulation [36] and lossy source coding, and can be expressed as
$$\Psi(X \to Y) = H(Y|X) - \max\{v_Y : v \in \mathcal{I}_{XY},\ v_X = 0\} = \min\{R_2 : R_0^4 \in \mathcal{R},\ R_0+R_4 = H(Y|X)\} - I(X;Y) = \min\{R_2 : R_0^4 \in \mathcal{R},\ R_4 = 0,\ R_0 = H(Y|X)\} - I(X;Y) = -\psi'_{\mathcal{I}_{XY}}(-2,0,1;\ 0,1,-1).$$
In the EGW system, $\Psi(X \to Y)$ corresponds to the setting with only decoders 2, 4 and $M_4 = \emptyset$ (since it is better to allocate the rate to $R_0$ instead of $R_4$), and we restrict $R_0 = H(Y|X)$. The value of $\Psi(X \to Y)+I(X;Y)$ is the rate of the additional information $M_2$ that decoder 2 needs in order to compensate for the lack of side information compared to decoder 4.

Minimum communication rate for distributed channel synthesis with common randomness rate $t$ [13], [14]
$$C(t,\ X \to Y) = \min_{U:\ X-U-Y} \max\{I(X;U),\ I(X,Y;U)-t\}$$
can be expressed as
$$C(t,\ X \to Y) = \min\{\max\{v_X,\ v_{XY}-t\} : v \in \mathcal{I}_{XY},\ v_X+v_Y-v_{XY} = I(X;Y)\} = \min\{\max\{H(X)-R_1,\ R_0-t\} : R_0^4 \in \mathcal{R},\ R_0+R_1+R_2 = H(X,Y)\}.$$

A. New information theoretic quantities

We now present three new quantities which arise as extreme points of $\mathcal{I}_{XY}$. These extreme points concern the case in which decoders 3 and 4 are active in the EGW system. Note that they are all maximizations of the interaction information $I(X;Y|U)-I(X;Y)$ under various constraints. They can be considered as distances from extreme dependency, in the sense that they are equal to zero only under certain conditions of extreme dependency.

Maximal interaction information is defined as
$$G_{NNI}(X;Y) = \max_{p_{U|XY}} I(X;Y|U) - I(X;Y).$$
It can be shown that
$$G_{NNI}(X;Y) = H(X|Y)+H(Y|X) - \min_{U:\ H(Y|X,U)=H(X|Y,U)=0} I(X,Y;U) = \max\{v_{XY}-v_X-v_Y : v \in \mathcal{I}_{XY}\} = H(X|Y)+H(Y|X) - \min\{R_0+R_3+R_4 : R_0^4 \in \mathcal{R}\} = H(X|Y)+H(Y|X) - \min\{R_0 : R_0^4 \in \mathcal{R},\ R_3 = R_4 = 0\} = \psi_{\mathcal{I}_{XY}}(-1,-1,1).$$
The maximal interaction information concerns the sum-rate of the EGW system with only decoders 3 and 4. Note that it is always better to allocate the rates $R_3, R_4$ to $R_0$ instead, hence we can assume $R_3 = R_4 = 0$ (which corresponds to $H(Y|X,U) = H(X|Y,U) = 0$). The quantity $H(X|Y)+H(Y|X)-G_{NNI}(X;Y)$ is the optimal rate in the lossless causal version of the complementary delivery setup [7].

Asymmetric private interaction information is defined as
$$G_{PNI}(X \to Y) = \max_{U:\ U \perp\!\!\!\perp X} I(X;Y|U) - I(X;Y).$$
It can be shown that
$$G_{PNI}(X \to Y) = H(Y|X) - \min_{U:\ U\perp\!\!\!\perp X,\ H(Y|X,U)=0} I(Y;U) = H(Y|X) - \min\{v_Y : v \in \mathcal{I}_{XY},\ v_X = 0,\ v_{XY} = H(Y|X)\} = H(X|Y) - \min\{R_3 : R_0^4 \in \mathcal{R},\ R_0+R_4 = H(Y|X)\} = H(X|Y) - \min\{R_3 : R_0^4 \in \mathcal{R},\ R_4 = 0,\ R_0 = H(Y|X)\} = \psi'_{\mathcal{I}_{XY}}(-1,0,0;\ 0,-1,1).$$
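Since $G_{NNI}(X;Y) = \max\{v_{XY}-v_X-v_Y : v \in \mathcal{I}_{XY}\}$, a simple (if crude) lower bound can be obtained by random search over test channels. The sketch below is an illustration only, not the paper's method; it reuses `mi_triple` from the earlier sketch and the cardinality bound of Proposition 2.

```python
import numpy as np

def interaction_info(p_xy, p_u_given_xy):
    """I(X;Y|U) - I(X;Y), which equals v_XY - v_X - v_Y."""
    v_x, v_y, v_xy = mi_triple(p_xy, p_u_given_xy)
    return v_xy - v_x - v_y

def g_nni_lower_bound(p_xy, trials=20000, seed=1):
    """Random-search lower bound on G_NNI(X;Y) = max over p_{U|XY} of I(X;Y|U) - I(X;Y)."""
    rng = np.random.default_rng(seed)
    nx, ny = p_xy.shape
    nu = nx * ny + 2                      # cardinality bound of Proposition 2
    best = 0.0                            # a constant U achieves 0, so 0 is always valid
    for _ in range(trials):
        p_u_given_xy = rng.dirichlet(np.ones(nu), size=(nx, ny))
        best = max(best, interaction_info(p_xy, p_u_given_xy))
    return best
```

Analogous searches with the independence constraints added would bound $G_{PNI}$ and $G_{PPI}$ below, though enforcing $U \perp\!\!\!\perp X$ requires constrained sampling rather than the unconstrained Dirichlet draw used here.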
The asymmetric private interaction information is the opposite of the excess functional information defined in [10], in which $I(Y;U)$ is maximized instead. Another operational meaning of $G_{PNI}$ is the generation of random variables under a privacy constraint. Suppose Alice observes $X$ and wants to generate $Y \sim p_{Y|X}(\cdot|X)$. However, she does not have any private randomness and can only access public randomness $W$, which is also available to Eve. Her goal is to generate $Y$ as a function of $X$ and $W$, while minimizing Eve's knowledge about $Y$ as measured by $I(Y;W)$. The minimum $I(Y;W)$ is $H(Y|X)-G_{PNI}(X \to Y)$.

Symmetric private interaction information is defined as
$$G_{PPI}(X;Y) = \max_{U:\ U\perp\!\!\!\perp X,\ U\perp\!\!\!\perp Y} I(X;Y|U) - I(X;Y).$$
It can be shown that
$$G_{PPI}(X;Y) = \max_{U:\ U\perp\!\!\!\perp X,\ U\perp\!\!\!\perp Y} I(X,Y;U) = \max\{v_{XY} : v \in \mathcal{I}_{XY},\ v_X = v_Y = 0\} = \max\{R_0 : R_0^4 \in \mathcal{R},\ R_0+R_3 = H(X|Y),\ R_0+R_4 = H(Y|X)\} = \psi'_{\mathcal{I}_{XY}}(-1,-1,0;\ 0,0,1).$$
Intuitively, $G_{PPI}$ captures the maximum amount of information one can disclose about $(X,Y)$ such that an eavesdropper who only has one of $X$ or $Y$ would know nothing about the disclosed information. Another operational meaning of $G_{PPI}$ is the generation of random variables under a privacy constraint (similar to that for $G_{PNI}$). Suppose Alice observes $X$ and wants to generate $Y \sim p_{Y|X}(\cdot|X)$. She has access to public randomness $W$, which is also available to Eve. She also has access to private randomness. Her goal is to generate $Y$ using $X$, $W$ and her private randomness such that Eve has no knowledge about $Y$ (i.e., $I(Y;W) = 0$), while minimizing the amount of private randomness used, measured by $H(Y|X,W)$ (note that if Alice can flip fair coins for the private randomness, then by the Knuth–Yao algorithm [37] the expected number of flips is bounded by $H(Y|X,W)+2$). The minimum $H(Y|X,W)$ is $H(Y|X)-G_{PPI}(X;Y)$.

Table I. Extreme points of $\mathcal{I}_{XY}$, the corresponding extreme points in the EGW (objective and constraints), and their support function representations ($\psi = \psi_{\mathcal{I}_{XY}}$), grouped by the active decoders in the EGW.

Active decoders 1, 2:
- Wyner's CI [2]: $\min R_0$ s.t. $R_0+R_1+R_2 = H(X,Y)$; $-\psi'(1,1,-1;\,0,0,-1)$
- Gács–Körner CI [3], [25]: $\max R_0$ s.t. $R_0+R_1 = H(X)$, $R_0+R_2 = H(Y)$; $\psi'(1,1,-2;\,0,0,1)$
- Necessary conditional entropy [4], [27]: $\min R_0$ s.t. $R_0+R_2 = H(Y)$, $R_1 = H(X|Y)$; $-\psi'(1,2,-2;\,1,0,-1)$
- Info. bottleneck [5]: $\min R_0$ s.t. $R_0+R_2 = H(Y)$, $R_1 \le H(X)-t$; none
- Comm. rate for channel synthesis [13], [14]: $\min \max\{H(X)-R_1,\ R_0-t\}$ s.t. $R_0+R_1+R_2 = H(X,Y)$; none

Active decoders 1, 3 or 2, 4:
- Körner graph entropy [8]: $\min R_0$ s.t. $R_0+R_1 = H(X)$, $R_3 = 0$; $-\psi'(1,-1,0;\,-1,0,0)$
- Excess functional info. [10]: $\min R_2-I(X;Y)$ s.t. $R_4 = 0$, $R_0 = H(Y|X)$; $-\psi'(-2,0,1;\,0,1,-1)$
- Max. rate for perfect privacy [9], [29]: $\max R_0$ s.t. $R_0+R_2 = H(Y)$, $R_0+R_4 = H(Y|X)$; $\psi'(-1,1,-1;\,0,1,0)$
- Privacy funnel [9]: $\min R_0+R_4-H(Y|X)$ s.t. $R_0+R_2 = H(Y)$, $R_0 \ge t$; none

Active decoders 3, 4:
- Maximal interaction info.: $\max H(X|Y)+H(Y|X)-R_0$ s.t. $R_3 = R_4 = 0$; $\psi(-1,-1,1)$
- Asymm. private interaction info.: $\max H(X|Y)-R_3$ s.t. $R_4 = 0$, $R_0 = H(Y|X)$; $\psi'(-1,0,0;\,0,-1,1)$
- Symm. private interaction info.: $\max R_0$ s.t. $R_0+R_3 = H(X|Y)$, $R_0+R_4 = H(Y|X)$; $\psi'(-1,-1,0;\,0,0,1)$

We now list several properties of $G_{NNI}$, $G_{PNI}$ and $G_{PPI}$.

Proposition 3. $G_{NNI}$, $G_{PNI}$ and $G_{PPI}$ satisfy:

1) Bounds. $0 \le G_{PPI}(X;Y) \le G_{PNI}(X \to Y) \le G_{NNI}(X;Y) \le \min\{H(X|Y),\ H(Y|X)\}$.

2) Conditions for zero.
- $G_{NNI}(X;Y) = 0$ if and only if the characteristic bipartite graph of $X,Y$ (i.e., vertices $\mathcal{X} \cup \mathcal{Y}$ with an edge $(x,y)$ if $p(x,y) > 0$) does not contain paths of length 3, or equivalently, $p(x|y) = 1$ or $p(y|x) = 1$ for all $x,y$ such that $p(x,y) > 0$.
- $G_{PNI}(X \to Y) = 0$ if and only if $G_{NNI}(X;Y) = 0$.
- $G_{PPI}(X;Y) = 0$ if and only if the characteristic bipartite graph of $X,Y$ does not contain cycles.

3) Condition for maximum. If $H(X) = H(Y)$, then the following statements are equivalent:
- $G_{NNI}(X;Y) = H(Y|X)$.
- $G_{PNI}(X \to Y) = H(Y|X)$.
- $G_{PPI}(X;Y) = H(Y|X)$.
- $p(x) = p(y)$ for all $x,y$ such that $p(x,y) > 0$.

4) Lower bound for independent $X, Y$. If $X \perp\!\!\!\perp Y$, then $G_{PPI}(X;Y) \ge \mathbb{E}[-\log\max\{p(X),\ p(Y)\}] - 1$.

5) Superadditivity. If $(X_1,Y_1)$ is independent of $(X_2,Y_2)$, then
$$G_{NNI}(X_1,X_2;\ Y_1,Y_2) \ge G_{NNI}(X_1;Y_1) + G_{NNI}(X_2;Y_2),$$
and similarly for $G_{PNI}$ and $G_{PPI}$.

The proof of this proposition is given in Appendix C.

IV. EXTENDED GRAY–WYNER SYSTEM WITH NONCAUSAL COMPLEMENTARY SIDE INFORMATION

In this section we establish the rate region $\mathcal{R}'$ for the EGW system with complementary noncausal side information at decoders 3 and 4 (noncausal EGW), that is, decoder 3 recovers $X^n$ from $(M_0,M_3,Y^n)$ and decoder 4 similarly recovers $Y^n$ from $(M_0,M_4,X^n)$. We show that $\mathcal{R}'$ can be expressed in terms of the Gray–Wyner region $\mathcal{R}_{GW}$, hence it contains fewer interesting extreme points compared to $\mathcal{R}$. This is the reason we emphasize the causal side information in this paper. We further show that $\mathcal{R}'$ is related to the asymptotic mutual information region defined as
$$\mathcal{I}^\infty_{XY} = \bigcup_{n=1}^{\infty} \frac{1}{n}\,\mathcal{I}_{X^n,Y^n},$$
where $(X^n,Y^n)$ is i.i.d. with $(X_1,Y_1) \sim p_{XY}$. Note that $\mathcal{I}^\infty_{XY}$ may not be closed (unlike $\mathcal{I}_{XY}$, which is always closed).

The following gives the rate region for the noncausal EGW.

Theorem 2. The optimal rate region $\mathcal{R}'$ for the extended Gray–Wyner system with noncausal complementary side information is the set of rate tuples $(R_0,R_1,R_2,R_3,R_4)$ such that
$$R_0 \ge I(X,Y;U),\quad R_1 \ge H(X|U),\quad R_2 \ge H(Y|U),\quad R_3 \ge H(X|U)-H(Y),\quad R_4 \ge H(Y|U)-H(X),$$
$$R_0+R_3 \ge H(X|Y),\quad R_0+R_4 \ge H(Y|X),\quad R_2+R_3 \ge H(X|U),\quad R_1+R_4 \ge H(Y|U),$$
$$R_0+R_2+R_3 \ge H(X,Y),\quad R_0+R_1+R_4 \ge H(X,Y)$$
for some $p_{U|XY}$, where $|\mathcal{U}| \le |\mathcal{X}|\cdot|\mathcal{Y}|+2$.

The proof is given in Appendix D. We then characterize the closure of $\mathcal{I}^\infty_{XY}$. We show that $\mathrm{cl}(\mathcal{I}^\infty_{XY})$, $\mathcal{R}'$ and the Gray–Wyner region $\mathcal{R}_{GW}$ can be expressed in terms of each other.

Proposition 4. The closure of $\mathcal{I}^\infty_{XY}$, the rate region $\mathcal{R}'$ for the noncausal EGW and the Gray–Wyner region $\mathcal{R}_{GW}$ satisfy:

1) Characterization of $\mathrm{cl}(\mathcal{I}^\infty_{XY})$.
$$\mathrm{cl}(\mathcal{I}^\infty_{XY}) = \bigl(\mathcal{I}_{XY} + (-\infty,0]\times(-\infty,0]\times[0,\infty)\bigr) \cap \mathcal{I}^o_{XY} = \bigl(\mathcal{I}_{XY} + \{(t,t,t) : t \le 0\}\bigr) \cap \bigl([0,\infty)\times[0,\infty)\times\mathbb{R}\bigr).$$

2) Equivalence between $\mathrm{cl}(\mathcal{I}^\infty_{XY})$ and $\mathcal{R}'$.
$$\mathcal{R}' = \bigcup_{v \in \mathrm{cl}(\mathcal{I}^\infty_{XY})} \bigl\{\bigl(v_{XY},\ H(X)-v_X,\ H(Y)-v_Y,\ H(X|Y)-v_{XY}+v_Y,\ H(Y|X)-v_{XY}+v_X\bigr)\bigr\} + [0,\infty)^5,$$
and
$$\mathrm{cl}(\mathcal{I}^\infty_{XY}) = \bigl\{v \in \mathbb{R}^3 : \bigl(v_{XY},\ H(X)-v_X,\ H(Y)-v_Y,\ H(X|Y)-v_{XY}+v_Y,\ H(Y|X)-v_{XY}+v_X\bigr) \in \mathcal{R}'\bigr\}.$$

3) Equivalence between $\mathrm{cl}(\mathcal{I}^\infty_{XY})$ and $\mathcal{R}_{GW}$.
$$\mathcal{R}_{GW} = \bigcup_{v \in \mathrm{cl}(\mathcal{I}^\infty_{XY})} \bigl\{\bigl(v_{XY},\ H(X)-v_X,\ H(Y)-v_Y\bigr)\bigr\} + [0,\infty)^3,$$
and
$$\mathrm{cl}(\mathcal{I}^\infty_{XY}) = \bigl\{v \in \mathcal{I}^o_{XY} : \bigl(v_{XY},\ H(X)-v_X,\ H(Y)-v_Y\bigr) \in \mathcal{R}_{GW}\bigr\}.$$

The proof is given in Appendix E. Note that Proposition 4 does not characterize $\mathcal{I}^\infty_{XY}$ completely since it does not specify which boundary points are in $\mathcal{I}^\infty_{XY}$.
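As with Theorem 1, the constraints of Theorem 2 can be evaluated for any chosen test channel. The following sketch uses hypothetical helper names and builds on `entropy` and `mi_triple` from the earlier sketches; it simply returns the right-hand sides of the eleven inequalities, i.e., one achievable corner of the noncausal EGW region.

```python
import numpy as np

def noncausal_bounds(p_xy, p_u_given_xy):
    """Right-hand sides of the Theorem 2 constraints for a chosen p_{U|XY}."""
    v_x, v_y, v_xy = mi_triple(p_xy, p_u_given_xy)
    h_x, h_y, h_xy = entropy(p_xy.sum(axis=1)), entropy(p_xy.sum(axis=0)), entropy(p_xy)
    hx_u, hy_u = h_x - v_x, h_y - v_y          # H(X|U), H(Y|U)
    return {
        "R0 >= I(X,Y;U)":       v_xy,
        "R1 >= H(X|U)":         hx_u,
        "R2 >= H(Y|U)":         hy_u,
        "R3 >= H(X|U) - H(Y)":  hx_u - h_y,
        "R4 >= H(Y|U) - H(X)":  hy_u - h_x,
        "R0+R3 >= H(X|Y)":      h_xy - h_y,
        "R0+R4 >= H(Y|X)":      h_xy - h_x,
        "R2+R3 >= H(X|U)":      hx_u,
        "R1+R4 >= H(Y|U)":      hy_u,
        "R0+R2+R3 >= H(X,Y)":   h_xy,
        "R0+R1+R4 >= H(X,Y)":   h_xy,
    }
```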
