Hamming Code for Multiple Sources

Rick Ma and Samuel Cheng, Member, IEEE

R. Ma was with the Department of Mathematics at the Hong Kong University of Science and Technology, Hong Kong. S. Cheng is with the School of Electrical and Computer Engineering, University of Oklahoma, Tulsa, OK, 74135 USA, email: [email protected]. A part of this work was presented in DCC 2010 and was submitted to ISIT 2010. Manuscript received January 19, 2010.

Abstract—We consider Slepian-Wolf (SW) coding of multiple sources and extend the packing bound and the notion of a perfect code from conventional channel coding to SW coding with more than two sources. We then introduce Hamming Codes for Multiple Sources (HCMSs) as a potential solution of perfect SW coding for an arbitrary number of terminals. Moreover, we study the case with three sources in detail. We present the necessary conditions of a perfect SW code and show that there exists an infinite number of HCMSs. Moreover, we show that for a perfect SW code with sufficiently long code length, the compression rates of different sources can be traded off flexibly. Finally, we relax the construction procedure of HCMS and call the resulting code generalized HCMS. We prove that every perfect SW code for Hamming sources is equivalent to a generalized HCMS.

I. INTRODUCTION

SW coding refers to lossless distributed compression of correlated sources. Consider s correlated sources X_1, X_2, ..., X_s. Encoding can only be performed separately: each of the s encoders sees only one of the s sources, but the compressed sources are transmitted to a base station and decompressed jointly. To the surprise of many researchers of their time, Slepian and Wolf showed that it is possible to have no loss in sum rate under this constrained situation [1]. That is, at least in theory, it is possible to recover the sources losslessly at the base station even though the sum rate is barely above the joint entropy H(X_1, X_2, ..., X_s).

Wyner was the first to realize that, by taking computed syndromes as the compressed sources, channel codes can be used to implement SW coding [2]. The approach was rediscovered and popularized by Pradhan et al. more than two decades later [3].
Practical syndrome-based schemes for SW coding using channel codes have then been studied in [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15]. However, most work is restricted to the discussion of two sources [3], [16], [17], [18], [19], with few exceptions [20], [4], [21]. In this paper, we describe a generalized syndrome-based SW code and extend the notions of a packing bound and a perfect code from regular channel coding to SW coding with an arbitrary number of sources. Moreover, we introduce the Hamming Code for Multiple Sources (HCMS) as a perfect code solution of SW coding for Hamming sources (c.f. Definition 4) and show that there exists an infinite number of HCMSs for three sources. We then extend HCMS to a more inclusive form dubbed generalized HCMS and show the universality of generalized HCMS. Namely, any perfect SW code of Hamming sources can be reduced to an HCMS code through equivalent operations (c.f. Theorem 4).

The paper is organized as follows. In the next section, we will describe the problem setup and introduce definitions used in the rest of the paper. In Section III, we will present a major lemma useful to the rest of the paper. We will introduce HCMS in Section IV. In Section V, HCMS for three sources will be discussed in detail. Necessary conditions for a perfect code will be given and the existence of HCMS for three sources will be shown. In Section VI, we extend HCMS to generalized HCMS using the notion of a row basis matrix as defined in Section II. In Section VII, we will show the universality of generalized HCMS.

II. PROBLEM SETUP

We will start with a general definition of syndrome based SW codes with multiple sources [21].

Definition 1 (Syndrome based SW code). A rate (r_1, r_2, ..., r_s) syndrome based SW code for s correlated length-n sources contains s coding matrices H_1, H_2, ..., H_s of sizes m_1 x n, m_2 x n, ..., m_s x n, where r_i = m_i/n for i = 1, 2, ..., s.
• Encoding: The i-th encoder compresses the length-n input x_i into y_i = H_i x_i and transmits the compressed m_i bits (with compression rate r_i = m_i/n) to the base station.
• Decoding: Upon receiving all y_i, the base station decodes all sources by outputting a most probable x̂_1, x̂_2, ..., x̂_s that satisfies H_i x̂_i = y_i, i = 1, 2, ..., s.

For the ease of exposition, we will occasionally refer to a compression scheme by its coding matrices (H_1, ..., H_s) directly. Moreover, let us introduce the following definitions.

Definition 2 ((s,n,M)-compression). We refer to an (s,n,M)-compression as a SW code of s length-n source tuples with M total compressed bits (M = m_1 + m_2 + ... + m_s).

Definition 3 (Compressible). We say the set of s-terminal source tuples S is compressible by a SW code if any source tuple in S can be reconstructed losslessly. Alternatively, we say the SW code can compress S.

Apparently, a SW code can compress S if and only if its encoding map restricted to S is injective (or 1-1).

At one time instance, we call the correlation among different sources a type-0 correlation when all source bits from different terminals are the same. In general, we call the correlation a type-t correlation if all source bits except t of them are the same. For highly correlated sources, we expect that the most probable sources are those with type-0 correlations for all n time instances, and the next most probable sources are those with n-1 type-0 correlations and one type-1 correlation. We call these sources s-terminal Hamming sources of length n. Let us summarize the above in the following.

Definition 4 (Hamming sources). An s-terminal Hamming source of length n is an s-tuple of length-n sources that contains either 1) entirely type-0 correlations for n time instances; or 2) type-0 correlations for n-1 time instances and one type-1 correlation.

Let S be the set containing all s-terminal Hamming sources of length n. By simple counting, the set S has size (s'n + 1)2^n, where s' = s when the number of terminals s > 2 and s' = 1 when s = 2. Thus, if S is compressible by a SW code, with C denoting the set of all compressed outputs, we have a packing bound given by |C| ≥ (s'n+1)2^n. We call the code perfect if the equality in the packing bound is satisfied (i.e., |C| = (s'n+1)2^n). The notion of perfectness can be generalized to any set S of interest:

Definition 5 (Perfect SW codes). A SW code is perfect if and only if |C| = |S|.

In other words, the encoding map restricted to S is surjective (and also injective since S is compressible) if the compression is perfect. For the rest of the paper, let us restrict S to denote the set containing all s-terminal Hamming sources of length n. Note that if H_1, ..., H_s can compress S, the intersection of the null spaces of H_1, ..., H_s should only contain the all-zero vector. Otherwise, let x belong to the intersection; then the s-tuple (x, ..., x) ∈ S will have the same syndrome (the all-zero syndrome) as (0, ..., 0) ∈ S. This contradicts the assumption that H_1, ..., H_s compresses S.

The following definitions are used to simplify the subsequent discussion.

Definition 6 (Hamming Matrix). An m-bit Hamming matrix (of size m x (2^m - 1)) consists of all nonzero column vectors of length m. Note that the parity check matrix of a Hamming code is a Hamming matrix.

Definition 7 (Surjective Matrix). A surjective matrix is a full rank fat or square matrix.

Definition 8 (Row Basis Matrix). Given a matrix A, we say a surjective matrix B is a row basis matrix of A if row(B) = row(A), where row(A) denotes the row space of A, i.e., all linear combinations of rows of A.

Example 1 (Row Basis Matrix).
  [1 0 0]
  [0 1 1]
is a row basis matrix of both
  [1 0 0]        [1 0 0]
  [1 0 0]  and   [1 1 1].
  [1 1 1]        [1 1 1]
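As an illustration (not part of the original development), Definition 8 and Example 1 can be checked mechanically by comparing reduced row echelon forms over GF(2). A minimal sketch in Python:

```python
# Row-space comparison over GF(2), illustrating Definition 8 / Example 1.
def rref_gf2(rows, ncols):
    """Return the nonzero rows of the reduced row echelon form over GF(2)."""
    rows, r = [list(row) for row in rows], 0
    for c in range(ncols):
        piv = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                rows[i] = [(a + b) % 2 for a, b in zip(rows[i], rows[r])]
        r += 1
    return sorted(tuple(row) for row in rows if any(row))

def same_row_space(A, B, ncols):
    return rref_gf2(A, ncols) == rref_gf2(B, ncols)

B  = [(1, 0, 0), (0, 1, 1)]                 # candidate row basis matrix
A1 = [(1, 0, 0), (1, 0, 0), (1, 1, 1)]      # first matrix in Example 1
A2 = [(1, 0, 0), (1, 1, 1), (1, 1, 1)]      # second matrix in Example 1
assert same_row_space(B, A1, 3) and same_row_space(B, A2, 3)
print("B is a row basis matrix of both matrices in Example 1")
```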
Remark 1 (Row Basis Matrix "Transform"). There is a unique matrix C s.t. A = CB (because every row of A can be decomposed as a unique linear combination of the rows of B, since B is full rank). And there exists a matrix D s.t. B = DA, but D is not necessarily unique. Thus, given a vector v, if we know Av, we can compute Bv (as DAv). Similarly, we can obtain Av given Bv (as CBv).

III. NULL SPACE SHIFTING

We will now introduce an important lemma that provides a powerful tool for our subsequent discussion. In a nutshell, the lemma tells us that it is possible to trade off the compression rates of different source terminals by shifting a part of the null space of one coding matrix to another.

Our proof is based on an analysis of the null spaces of the coding matrices. More precisely, we isolate the common space shared by all except one of the null spaces and decompose the null spaces as the direct sum (⊕) of the common space and a residual space.

Lemma 1. Suppose H_1, ..., H_s can compress S and null(H_i) = K ⊕ N_i for all i but one index r, for which null(H_r) = N_r. Then matrices H'_1, ..., H'_s with null(H'_i) = K ⊕ N_i for all i but one index d, for which null(H'_d) = N_d, can also compress S. Furthermore, if all H'_j are onto and (H_1, ..., H_s) is a perfect compression, i.e., (H_1, ..., H_s) restricted to S is bijective, then (H'_1, ..., H'_s) is also a perfect compression.

Before the proof, we first notice that N_r ∩ K = {0}. Otherwise H_1, ..., H_s would have a common nonzero null vector, which contradicts S being compressible by the code. So the notation K ⊕ N_r is justified.

Proof: We have nothing to prove if r = d. For r ≠ d, we can simply put r = 2 and d = 1 without losing generality.

Define S_+ as S + S = {s_1 + s_2 | s_1, s_2 ∈ S}. Note that
1) (H'_1, ..., H'_s) restricted to S is 1-1 ⇔ if (m_1, ..., m_s) ∈ S_+ and (H'_1 m_1, ..., H'_s m_s) = (0, ..., 0), then (m_1, ..., m_s) = (0, ..., 0).

So let (m_1, ..., m_s) ∈ S_+ and (H'_1 m_1, ..., H'_s m_s) = (0, ..., 0). By checking back the null spaces of the H'_j, we have m_1 = n_1, m_2 = k_2 + n_2, and m_i = k_i + n_i for i ≥ 3, where n_j ∈ N_j and k_2, k_i ∈ K. So m_1 + k_2 = n_1 + k_2, m_2 + k_2 = n_2, and m_i + k_2 = k_2 + k_i + n_i. By checking the null spaces of the H_i, we find (m_1, ..., m_s) + (k_2, ..., k_2) ∈ (null H_1, ..., null H_s). As (k_2, ..., k_2) has all type-0 correlations, (m_1, ..., m_s) + (k_2, ..., k_2) is also in S_+. Since (H_1, ..., H_s) restricted to S is injective, by 1), we have m_j = k_2 for all j. In particular, m_1 = k_2 = n_1, which implies k_2 ∈ N_1 ∩ K = {0}. That is, k_2 = 0 and thus m_j = 0 for all j. By 1) again, (H'_1, ..., H'_s) restricted to S is injective.

For the second part, (H_1, ..., H_s) restricted to S is now surjective as well and hence (H_1, ..., H_s) per se is also surjective. Therefore all H_j are full rank matrices. This implies all H'_j have to be full rank as well. Furthermore, if all H'_j are full rank, the dimension of the range of (H'_1, ..., H'_s) is ns - Σ_{j=1}^{s} dim null(H'_j) = ns - Σ_{j=1}^{s} dim null(H_j), which equals the dimension of the range of (H_1, ..., H_s). Since (H_1, ..., H_s) restricted to S is bijective, the size of the range of (H_1, ..., H_s) (and hence that of (H'_1, ..., H'_s)) is |S|. Since (H'_1, ..., H'_s) restricted to S has already been proven to be injective, it must be bijective as well.
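Definition 3, the packing bound and Lemma 1 all revolve around injectivity of the encoding map on S, which can be verified exhaustively for small parameters. The following sketch (our illustration, using the trivial one-bit identity encoders as a toy configuration) enumerates S and checks both compressibility and perfectness:

```python
from itertools import product

def hamming_sources(s, n):
    """All s-terminal Hamming sources of length n (Definition 4), as tuples of bit-tuples."""
    sources = set()
    for b in product((0, 1), repeat=n):
        sources.add((b,) * s)                          # all type-0 correlations
        for t in range(n):                             # exactly one type-1 correlation
            for i in range(s):
                flipped = list(b); flipped[t] ^= 1
                tup = [b] * s; tup[i] = tuple(flipped)
                sources.add(tuple(tup))
    return sources

def mat_vec(H, x):
    return tuple(sum(h * xi for h, xi in zip(row, x)) % 2 for row in H)

def compresses(Hs, s, n):
    S = hamming_sources(s, n)
    syndromes = {tuple(mat_vec(H, x) for H, x in zip(Hs, src)) for src in S}
    return len(syndromes) == len(S), len(S)

# Toy check: three length-1 sources, each encoder the 1x1 identity matrix.
Hs = ([(1,)], [(1,)], [(1,)])
injective, size = compresses(Hs, s=3, n=1)
M = sum(len(H) for H in Hs)
print(injective, size == 2 ** M)   # True True: the code compresses S and is perfect
```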
IV. HAMMING CODE FOR MULTIPLE SOURCES

Recall that S denotes the set containing all s-terminal Hamming sources of length n. Let M = m_1 + m_2 + ... + m_s be the total number of compressed bits. Then we have |C| = 2^M and thus, when s > 2, the equation for perfect compression becomes

  2^n (sn + 1) = 2^M.   (1)

Since sn + 1 = 2^(M-n), s obviously cannot be even. On the other hand, by Fermat's Little Theorem, we have 2^(s-1) ≡ 1 (mod s) for every odd prime s. This gives an infinite number of solutions to (1).

Now, we will present the main theorem that leads to HCMS.

Theorem 1 (Hamming Code for Multiple Sources). For positive integers s, n, M satisfying (1) and s > 2, let P be a Hamming matrix of size (M-n) x (2^(M-n) - 1) = (M-n) x (sn). If P can be partitioned into

  P = [Q_1, Q_2, ..., Q_s]   (2)

such that each Q_i is an (M-n) x n matrix and

  Q_1 + Q_2 + ... + Q_s = 0,   (3)

and

  R = [Q_1; Q_2; ...; Q_{s-1}; T]   (4)

is invertible for some arbitrary T (vertical stacking), then the set of parity check matrices

  [G_1; Q_1], [G_2; Q_2], ..., [G_s; Q_s]   (5)

forms a perfect compression, where G_1, G_2, ..., G_s can be any row partition of T. That is,

  T = [G_1; G_2; ...; G_s],   (6)

and some G_i can be chosen as void matrices.

Proof: For any b, v_i ∈ Z_2^n s.t. |v_1| + |v_2| + ... + |v_s| ≤ 1, the input of correlated sources [b + v_1, b + v_2, ..., b + v_s] results in the syndrome

  [ [G_1(b+v_1); Q_1(b+v_1)], [G_2(b+v_2); Q_2(b+v_2)], ..., [G_s(b+v_s); Q_s(b+v_s)] ]

being received at the decoder. The decoder can then retrieve (v_1, ..., v_s) since

  Q_1(b+v_1) + Q_2(b+v_2) + ... + Q_s(b+v_s)
  = Q_1(v_1) + ... + Q_s(v_s)        (by (3))
  = P [v_1; v_2; ...; v_s]           (by (2))

and P is bijective over the set of all length-sn vectors with weight at most 1.

After knowing (v_1, ..., v_s), we can compute G_1(b), ..., G_s(b) and Q_1(b), ..., Q_s(b). Thus, we have

  [Q_1; Q_2; ...; Q_{s-1}; G_1; ...; G_s](b) = [Q_1; Q_2; ...; Q_{s-1}; T](b) = R(b).

As R is invertible, we can recover b and hence the correlated sources [b + v_1, b + v_2, ..., b + v_s].
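The Fermat argument guarantees infinitely many parameter triples: for an odd prime s, any exponent t = M - n with s dividing 2^t - 1 yields the integer solution n = (2^t - 1)/s. A short enumeration sketch (our illustration):

```python
# Enumerate small solutions of 2^n (s*n + 1) = 2^M, i.e. s*n + 1 = 2^(M - n).
for s in (3, 5, 7):                       # odd primes; s cannot be even
    solutions = []
    for t in range(1, 13):                # t = M - n
        if (2 ** t - 1) % s == 0:         # Fermat's Little Theorem guarantees this for t = k*(s-1)
            n = (2 ** t - 1) // s
            solutions.append((n, n + t))  # (n, M)
    print(s, solutions[:4])
# s = 3 gives (n, M) = (1, 3), (5, 9), (21, 27), (85, 93), the values reappearing in Table II below.
```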
Remark 2 (SW coding of three sources of length 1). Apparently, an HCMS only exists if (s-1)(M-n) ≤ n; otherwise the required height of T would be negative. For example, let s = 3, n = 1, M = 3. Even though the parameters satisfy (1), we will not have an HCMS because n - (s-1)(M-n) = -3. However, a perfect (trivial) SW code actually exists in this case: the parity check matrices for all three terminals are simply the scalar matrix (1).

From Remark 2, we see that HCMS cannot model all perfect codes that can compress s-terminal Hamming sources. It turns out that we can modify HCMS slightly and the extension will cover all perfect SW codes for Hamming sources. We will delay this discussion to Section VII. In the next section, we will first discuss HCMS for three sources in detail.

V. HCMS FOR THREE SOURCES

A. Necessary Conditions

Now, let us confine ourselves to the case with only three encoders, i.e., s = 3. Let (H_1, H_2, H_3) be a perfect compression for S. We have

  null H_1 ∩ null H_2 ∩ null H_3 = {0}.   (7)

Hence we can decompose the null spaces into U ⊕ K_2 ⊕ K_3, V ⊕ K_1 ⊕ K_3, and W ⊕ K_1 ⊕ K_2, where K_i = null H_j ∩ null H_k, (i,j,k) ∈ {(1,2,3), (2,3,1), (3,1,2)}.

Notice that (U ⊕ K_2 ⊕ K_3) ∩ K_1 = {0} by (7). We also have (U ⊕ K_2 ⊕ K_3 ⊕ K_1) ∩ W = {0} (because for all u_1 ∈ U, k_1 ∈ K_1, k_2 ∈ K_2, k_3 ∈ K_3, and w ∈ W, u_1 + k_1 + k_2 + k_3 = w ⇒ u_1 + k_2 + k_3 = k_1 + w ∈ null H_1 ∩ null H_3 = K_2 ⇒ k_1 = 0 and w = 0). By the symmetry among U, V, W, we also have (C ⊕ A) ∩ B = {0} for any A, B ∈ {U, V, W} with A ≠ B, where C = K_1 ⊕ K_2 ⊕ K_3. By Lemma 1, the perfect compression is equivalent to full rank matrices with null spaces C ⊕ U, C ⊕ V, and W.

Let us denote the dimensions of U, V, W, and C as u, v, w, and c, respectively. Then we have the following lemma.

Lemma 2. With u, v, w, and c described above, if the code can compress S and is perfect, then 3n - 2M ≤ c ≤ 3n - 2M + 3, where M = m_1 + m_2 + m_3. Moreover, M - n - 2 ≤ u, v, w ≤ M - n.

Proof: Even if the rest of the sources are known exactly, the correlation specified by S implies that there can be n + 1 possibilities for the remaining source. Therefore, 2^(m_i) ≥ n + 1 for i = 1, 2, 3. Denote by D_i the dimension of the null space of H_i. Then D_i = n - m_i ≤ n - log_2(n+1) < n - log_2(n + 1/3) = n - log_2(3n+1) + log_2 3. From the packing bound, if the code is perfect, we have 2^M = (3n+1) 2^n. Therefore, D_i ≤ n - (M - n) + log_2 3 and thus D_i ≤ 2n - M + 1, where the second inequality holds because D_i, n, and M are all integers.

Given the three null spaces C ⊕ U, C ⊕ V, and W, we then have c + v ≤ 2n - M + 1, c + u ≤ 2n - M + 1, and c + w ≤ 2n - M + 1. Note that the last inequality holds since Lemma 1 tells us that full rank matrices with null spaces C ⊕ U, V, and C ⊕ W also compress S perfectly. Moreover, since (C ⊕ A) ∩ B = {0} for any A, B ∈ {U, V, W} with A ≠ B, we have c + u + v ≤ n, c + u + w ≤ n, and c + v + w ≤ n. Also, the total dimension of the three null spaces is 3n - M = (c + u) + (c + v) + w. Thus 3n - M - (c + u + v) = c + w and 2n - M ≤ c + w. Similarly, we have 2n - M ≤ c + u and 2n - M ≤ c + v. In summary, we have

  2n - M ≤ c + a ≤ 2n - M + 1,   (8)

for a ∈ {u, v, w}. Since w = 3n - M - (c + u) - (c + v), by (8), 3n - M - 2(2n - M + 1) ≤ w ≤ 3n - M - 2(2n - M) and thus M - n - 2 ≤ w ≤ M - n. Similarly, we have M - n - 2 ≤ u, v ≤ M - n.

Now, substituting M - n - 2 ≤ w ≤ M - n back into (8), we have 3n - 2M ≤ c ≤ 3n - 2M + 3 as desired.

To summarize from Lemma 2, given n and M, there are only four cases for the values of c, u, v, and w, assuming u ≥ v ≥ w, as shown in Table I.

TABLE I
FEASIBLE VALUES OF c, u, v, AND w

  c          | u        | v        | w
  3n-2M+3    | M-n-2    | M-n-2    | M-n-2
  3n-2M+2    | M-n-1    | M-n-1    | M-n-2
  3n-2M+1    | M-n      | M-n-1    | M-n-1
  3n-2M      | M-n      | M-n      | M-n
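The four rows of Table I are exactly the integer points allowed by (8) together with the dimension count 2c + u + v + w = 3n - M. A brute-force re-derivation for given n and M (our illustration):

```python
def feasible_cuvw(n, M):
    """Integer (c, u, v, w) with u >= v >= w, 2c+u+v+w = 3n-M and the bounds (8)."""
    out, lo, hi = [], 2 * n - M, 2 * n - M + 1      # bounds from (8) on c + a
    for c in range(0, n + 1):
        for u in range(0, n + 1):
            for v in range(0, u + 1):
                w = 3 * n - M - 2 * c - u - v
                if not 0 <= w <= v:
                    continue
                if all(lo <= c + a <= hi for a in (u, v, w)):
                    out.append((c, u, v, w))
    return out

print(feasible_cuvw(21, 27))
# -> [(9, 6, 6, 6), (10, 6, 5, 5), (11, 5, 5, 4), (12, 4, 4, 4)],
#    i.e. the four rows of Table I evaluated at n = 21, M = 27.
```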
Moreover, the perfectness condition 2^M = (3n+1)2^n turns out to be restrictive enough to confine M and n to limited possibilities, as described in the following lemma.

Lemma 3. All positive integers n and M that satisfy 2^M = 2^n(3n+1) have the forms n = (2^(2a) - 1)/3 and M = 2a + (2^(2a) - 1)/3, respectively, for some positive integer a.

Proof: It is easy to verify that 2^(2a) ≡ 1 (mod 3) and 2^(2a+1) ≡ 2 (mod 3) for any positive integer a. Moreover, since both M and n are positive integers, 3n + 1 = 2^(M-n) is a positive integer as well. This implies 2^(M-n) ≡ 1 (mod 3) and thus M - n has to be even. Let M - n = 2a for some positive integer a; then we can rewrite n and M as functions of a: n(a) = (2^(2a) - 1)/3 and M(a) = 2a + n(a).

It is interesting to point out that M has to be divisible by 3. This can be proved using simple induction and we will skip the proof here.

Lemma 4. M(a) = (2^(2a) - 1)/3 + 2a, a ∈ Z+, defined in Lemma 3 is divisible by 3.

We list in Table II the first five possible values of M and n. M - n and 3n - 2M are also included for convenience.

TABLE II
VALUES OF n, M, M-n, 3n-2M FOR a = 1, 2, 3, 4, 5

  a | n   | M   | M-n | 3n-2M
  1 | 1   | 3   | 2   | -3
  2 | 5   | 9   | 4   | -3
  3 | 21  | 27  | 6   | 9
  4 | 85  | 93  | 8   | 69
  5 | 341 | 351 | 10  | 321

Note that for both n = 1 and n = 5, 3n - 2M = -3. Thus, only the first case described in Table I is possible (i.e., c = 0).

By Lemma 1, the null space contributed by C can be reallocated to different terminals arbitrarily and yet the resulting code will still compress S and be perfect. For example, for n = 21 and M = 27, possible values of c, u, v, w are 12, 4, 4, and 4, respectively. This results in an asymmetric code with m_1 = 21 - 12 - 4 = 5, m_2 = 5, and m_3 = 17. If this code compresses S, we can reallocate 4 dimensions of null space each from H_1 and H_2 to H_3 and obtain a symmetric code that can compress S as well.

From the above discussion, we see that c ~ 3n - 2M increases exponentially with a whereas u, v, w ~ M - n only increase linearly. Therefore, for sufficiently large n, c will always be large enough that we can rearrange any perfect code into another asymmetric perfect code of desired rates by allocating C among the different coding matrices.

B. Existence of HCMS

We have not yet shown that any HCMS exists. Actually, the authors are not aware of any prior work that reported perfect SW codes with more than two sources in the literature. From Remark 2, we see that HCMS cannot model the trivial case for a = 1 (n = 1 and M = 3), even though by definition the trivial code (coding matrices equal to the scalar identity for all three sources) is perfect. For a = 2 (n = 5 and M = 9), an HCMS also does not exist since n - (s-1)(M-n) = 5 - 2(4) = -3 < 0 (c.f. Remark 2). However, in this case no perfect code exists at all, as concluded in the following proposition.

Proposition 1 (No perfect code for a = 2). There does not exist a perfect code for SW coding of three length-5 sources.

Proof: See Appendix.

Even though HCMS does not exist for a = 1 and a = 2, it is possible to show that HCMS exists for a ≥ 3. In the following, we will first show that an HCMS for three sources exists for a = 3 using Theorem 1.

Proposition 2 (HCMS exists for a = 3). There exists a perfect compression for s = 3 and a = 3 (n = 21, M = 27).

Proof: Let

Q_1 =
  100000100001110110000
  010000110000100000111
  001000011000011101011
  000100001100010011110
  000010000110101101111
  000001000011001001101

Q_2 =
  000101101111010001111
  100010110111101111000
  010001111011100000101
  101000111101011100111
  010100111110001011011
  001010011111110101110

Q_3 =
  100101001110100111111
  110010000111001111111
  011001100011111101110
  101100110001001111001
  010110111000100110100
  001011011100111100011

It is a little bit laborious but straightforward to see that P = [Q_1 Q_2 Q_3] is a 6-bit Hamming matrix and Q_1 + Q_2 + Q_3 = 0. So we have (2) and (3) already.
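The "laborious but straightforward" verification can be delegated to a few lines of code. The sketch below (ours; the bit strings are transcribed from Proposition 2) checks (2), (3) and the rank property needed next for (4):

```python
Q1 = ["100000100001110110000", "010000110000100000111", "001000011000011101011",
      "000100001100010011110", "000010000110101101111", "000001000011001001101"]
Q2 = ["000101101111010001111", "100010110111101111000", "010001111011100000101",
      "101000111101011100111", "010100111110001011011", "001010011111110101110"]
Q3 = ["100101001110100111111", "110010000111001111111", "011001100011111101110",
      "101100110001001111001", "010110111000100110100", "001011011100111100011"]
bits = lambda rows: [[int(c) for c in r] for r in rows]
Q1, Q2, Q3 = bits(Q1), bits(Q2), bits(Q3)

# (3): Q1 + Q2 + Q3 = 0 over GF(2)
assert all((a + b + c) % 2 == 0 for ra, rb, rc in zip(Q1, Q2, Q3)
           for a, b, c in zip(ra, rb, rc))

# (2): the columns of P = [Q1 Q2 Q3] are the 63 distinct nonzero 6-bit vectors
cols = {tuple(row[j] for row in Q) for Q in (Q1, Q2, Q3) for j in range(21)}
assert len(cols) == 63 and all(any(c) for c in cols)

# The first 12 columns of [Q1; Q2] have rank 12 (pivots in the first 12 columns).
def rank_gf2(rows):
    rows, r = [row[:] for row in rows], 0
    for c in range(len(rows[0])):
        piv = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                rows[i] = [(x + y) % 2 for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

assert rank_gf2([row[:12] for row in Q1 + Q2]) == 12
print("Proposition 2's matrices satisfy (2), (3) and the rank condition used for (4)")
```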
We define 0 01 0 0 00 1 10 0 0 U = 0 A A A A 0 00 1 0 00 0 11 0 0 j j j j A = ··· ··· 0 00 0 1 00 0 01 1 0 + (cid:18)u 0 u v w (cid:19) 0 00 0 0 10 0 00 1 1 0 B B B B j j j j B = ··· ··· + (cid:18)v 0 v w u (cid:19) and 0 0 0 10 1 10 1 1 11 0 C C C C j j j j C = ··· ··· 1 0 0 01 0 11 0 1 11 + (cid:18)w 0 w u v (cid:19) 0 1 0 00 1 11 1 0 11 V = . where j runs from 1 to n= (22k −1)/3. 1 0 1 00 0 11 1 1 01 ItiseasytoverifythatP = [A B C ]isa2(k+1)- 0 1 0 10 0 11 1 1 10 + + + + bit Hamming matrix consisting of 3 + (4(22k − 1)) = 0 0 1 01 0 01 1 1 11 22(k+1)−1 different non-zero column vectors of length We have V = [0|I6]+KU, where 2(k +1). And obviously A +B +C = 0 as A+ + + + B+C = 0 and u+v+w = 0. 0 00 1 01 Lastly we permutated the columns of A , B and 1 00 0 10 + + C simutanteously (keeping their sum zero) such that 0 10 0 01 + K = . the first 4(k+1) columns are 1 01 0 00 0 10 1 00 A A ··· A A A A A ··· A = 1 2 2k 1 2 3 4 0 01 0 10 + (cid:18) 0 0 ··· 0 v w w u ···(cid:19) Hence Q1 is a full rank matrix with pivots on the B = B1 B2 ··· B2k B1 B2 B3 B4 ··· (cid:18)Q2(cid:19) + (cid:18) 0 0 ··· 0 w u v w ···(cid:19) Q1 C = C1 C2 ··· C2k C1 C2 C3 C4 ··· first 12 columns. Therefore, R = Q2 is a 21 ×21 + (cid:18) 0 0 ··· 0 u v u v ···(cid:19) T (11) invertible matrix where T = [0|I ]. By Theorem 1, 9 G G G A A A coding matrices 1 , 2 , 3 form a perfect Notice that 1 , 2 ,··· , 2k are linear in- (cid:18)Q1(cid:19) (cid:18)Q2(cid:19) (cid:18)Q3(cid:19) (cid:18)B1(cid:19) (cid:18)B2(cid:19) (cid:18)B2k(cid:19) compression with G defined in (6). dependent by induction assumption and i Theorem 2 (HCMS exists for all a ≥ 3). For s = 3, 0 1 1 1 there is a perfect compression for n and M that satisfy v w w u 1 1 1 0 = (cid:18)w u v w(cid:19) 1 1 0 1 3n(a)+1 = 22a, (9) 1 0 1 1 M(a)−n(a) = 2a, (10) are also linear independent. Therefore the first 4(k+1) where a is any integer larger than or equal to 3. columnvectorsof A+ arelinearlyindependent.Hence B h +i Proof: From Proposition 2, we have shown that A+ is a full rank matrix with pivots on the first 4(k+ B there exists HCMS for three sources when a = 3. h +i 1) columns. By Theorem 1, perfect compression can be Now we will show by induction that there is perfect built by choosing T = [0,I ]. n(k+1)−4(k+1) compression for all a ≥ 3. Suppose we have a partition ofaHammingmatrixP ofsize2a×(22a−1)(formedby all non-zero length-2a column vectors) as P = [ABC] VI. GENERALIZED HCMS A such that A+B +C = 0, and forms a full rank (cid:18)B(cid:19) Now, we will extend HCMS so that it will cover matrix with pivots on the first 4a columns whenever the trivial case described in Remark 2. The main idea 3 ≤ a ≤ k. By Theorem 1, perfect compression can of generalized HCMS is to “loosen” the condition in be built by choosing T = [0,I ]. (4) using the notion of row basis matrices defined in n(a)−4a 7 C 1 Section II. Let Y, a d × n matrix, be a row basis C = [1]. 1 = ⇒ Y = [1] and T is void Q1 3 (cid:18)C2(cid:19) (cid:18)1(cid:19) and hence Gi are void. So d = d = d = d = 1 matrix of Q··2· , where Q1,··· ,Qs is a partition and d1 +d2 +d3 +(n−d) =1M. So2 from3generalized HCMS,wegetperfectcompressionwithcodingmatrices Q s−1 of a Hamming matrix satisfying (2) and (3). Since Y 1 , 1 , and 1 just as in Remark 2. is a surjective matrix, there apparently exists T s.t. (cid:0) (cid:1) (cid:0) (cid:1) (cid:0) (cid:1) NotethatevenGeneralizedHCMSdoesnotguaranties Y R = is an n×n invertible matrix. 
VI. GENERALIZED HCMS

Now, we will extend HCMS so that it covers the trivial case described in Remark 2. The main idea of generalized HCMS is to "loosen" the condition in (4) using the notion of row basis matrices defined in Section II. Let Y, a d x n matrix, be a row basis matrix of [Q_1; Q_2; ...; Q_{s-1}], where Q_1, ..., Q_s is a partition of a Hamming matrix satisfying (2) and (3). Since Y is a surjective matrix, there apparently exists T s.t. R = [Y; T] is an n x n invertible matrix.

Theorem 3 (Generalized HCMS). Let G_i, i = 1, ..., s, be any row partition of T as in (6) and let C_i be a row basis matrix of Q_i for i = 1, ..., s. Then the set of parity check matrices [G_1; C_1], [G_2; C_2], ..., [G_s; C_s] forms a compression for the set of s-terminal Hamming sources of length n. Moreover, the compression will be perfect if d_1 + d_2 + ... + d_s + (n - d) = M, where d_i is the number of rows of C_i.

Proof: Let |·| be the function that maps an element of Z_2^n to its norm in Z by counting the number of nonzero components, e.g., |(1,1,0,1)| = 3. For any b, v_i ∈ Z_2^n s.t. |v_1| + |v_2| + ... + |v_s| ≤ 1, the input of correlated sources [b + v_1, b + v_2, ..., b + v_s] results in the syndrome

  [ [G_1(b+v_1); C_1(b+v_1)], [G_2(b+v_2); C_2(b+v_2)], ..., [G_s(b+v_s); C_s(b+v_s)] ]

being received at the decoder. Given C_i(b+v_i) at the decoder, we can recover Q_i(b+v_i) by Remark 1. The decoder can then retrieve (v_1, ..., v_s) since

  Q_1(b+v_1) + Q_2(b+v_2) + ... + Q_s(b+v_s)
  = Q_1(v_1) + ... + Q_s(v_s)        (by (3))
  = P [v_1; v_2; ...; v_s]           (by (2))

and P is bijective over the set of all length-sn vectors with weight at most 1.

After knowing (v_1, ..., v_s), we can compute G_1(b), ..., G_s(b) and C_1(b), ..., C_s(b). This in turn gives us T(b) and Y(b), respectively (the latter again by Remark 1). So we have Rb. Since R is invertible, we can get back b and thus all sources.

The second claim is apparent by simple counting.

Example 2 (Generalized HCMS of three sources of length 1). Let us revisit Remark 2. For the case s = 3, n = 1, and M = 3, consider the Hamming matrix

  P = [1 0 1; 0 1 1] = [Q_1 Q_2 Q_3].

We must get C_1 = C_2 = C_3 = [1]. Then [C_1; C_2] = [1; 1] ⇒ Y = [1], T is void, and hence the G_i are void. So d_1 = d_2 = d_3 = d = 1 and d_1 + d_2 + d_3 + (n - d) = M. So from generalized HCMS, we get a perfect compression with coding matrices (1), (1), and (1), just as in Remark 2.

Note that even generalized HCMS does not guarantee the existence of a perfect compression, as a perfect compression may simply not exist. For example, there is no perfect compression for s = 3, n = 5, M = 9, as shown in Proposition 1.

VII. UNIVERSALITY OF GENERALIZED HCMS

We are to prove that every perfect compression for Hamming sources S is equivalent to a generalized HCMS. We say two perfect compressions are equivalent (denoted by ~) to each other if and only if their null spaces can be converted to each other through steps of the null space shifting described in Lemma 1. Since each step of null space shifting is invertible, the term "equivalent" is mathematically justified. The set of perfect compressions does form equivalence classes. The objective of this section is to show the following theorem.

Theorem 4. Every perfect compression is equivalent to a generalized HCMS.

To prove Theorem 4, we will introduce and show several lemmas to achieve our goal.

Lemma 5. Every 2-source perfect compression is equivalent to a Hamming code.

Proof: Let s = 2. If (H_1, H_2) is a perfect compression, then we can let N_1 = {0} and K = null(H_1) and form (H'_1, H'_2) under Lemma 1. Having {0} as its null space, H'_1 can be any invertible n x n matrix and we can set H'_1 to the identity matrix without loss of generality. Meanwhile, H'_2 is a full rank m x n matrix with 2^m = n + 1. The columns of H'_2 must be nonzero and different from each other (because the i-th column equals the j-th column ⇔ H'_2 e_i = H'_2 e_j, where e_i = [0, ..., 0, 1, 0, ..., 0]^T with the 1 in the i-th position ⇔ (H'_1, H'_2) fails to compress S, because e_i and e_j input to encoder 2 are no longer distinguishable from the encoder's output). Hence H'_2 is unique up to a permutation of columns. Therefore, H'_2 is a parity check matrix of the Hamming (n, n-m) code. Conversely, we can construct (H_1, H_2) (up to their null spaces) from (H'_1, H'_2) by Lemma 1. That means any perfect compression is equivalent to a Hamming code under Lemma 1.
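The forward direction of Lemma 5 can be checked directly for a concrete length: with H_1 = I_7 and H_2 the parity check matrix of the (7,4) Hamming code, the pair is a perfect (2, 7, 10)-compression. A brute-force confirmation (our illustration):

```python
from itertools import product

n = 7
H1 = [[1 if i == j else 0 for j in range(n)] for i in range(n)]     # 7x7 identity
H2 = [[(j + 1) >> k & 1 for j in range(n)] for k in range(3)]       # 3x7 Hamming matrix (columns 1..7 in binary)

def encode(H, x):
    return tuple(sum(h * xi for h, xi in zip(row, x)) % 2 for row in H)

syndromes = set()
for x in product((0, 1), repeat=n):          # 2-terminal Hamming sources (x, x + v), |v| <= 1
    pairs = [x] + [tuple(b ^ (i == t) for i, b in enumerate(x)) for t in range(n)]
    for y in pairs:
        syndromes.add((encode(H1, x), encode(H2, y)))

M = len(H1) + len(H2)                        # 7 + 3 = 10 compressed bits
print(len(syndromes) == (n + 1) * 2 ** n == 2 ** M)   # True: injective on S and the packing bound is tight
```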
Lemma 6. Given a perfect compression (H'_1, H'_2, ..., H'_s), there exists (H_1, ..., H_s) ~ (H'_1, ..., H'_s) s.t. null H_1 ∩ ... ∩ null H_{i-1} ∩ null H_{i+1} ∩ ... ∩ null H_s = 0 for 1 ≤ i < s.

Before we proceed with the proof of Lemma 6, we introduce a fact necessary for the proof.

Fact 1. For vector spaces U, V, and W, it is easy to show that

  (V + U) ∩ W ⊂ (V ∩ W) + U   if U ⊂ W.   (12)

Proof: Let v ∈ V, u ∈ U be such that v + u ∈ W. Then u ∈ U ⊂ W ⇒ v ∈ W ⇒ v ∈ V ∩ W. As a result v + u ∈ (V ∩ W) + U.

Proof of Lemma 6: Let R_i = ∩_{1≤j≤s, j≠i} null H'_j for 1 ≤ i < s. We have

  R_i ⊂ null H'_j, for 1 ≤ j ≤ s and i ≠ j,   (13)

and

  R_i ∩ R_k = ∩_{1≤j≤s} null H'_j (a)= 0 for i ≠ k,   (14)

where (a) holds because the perfect compression (H'_1, H'_2, ..., H'_s) must be injective and hence the intersection of all of their null spaces must be 0.

By (13) and (14), there exists a space N_s such that we can decompose

  null H'_s = N_s ⊕ R_1 ⊕ ... ⊕ R_{s-1}.   (15)

Again by (13) and (14), together with Lemma 1, we can form an equivalent perfect compression by moving the whole R_i from null H'_s to null H'_i, for i running from 1 to s-1, and get

  N_i = null H'_i ⊕ R_i, 1 ≤ i < s.   (16)

Then if we let H_j be a surjective matrix with

  null H_j = N_j for 1 ≤ j ≤ s,   (17)

we have (H_1, ..., H_s) ~ (H'_1, ..., H'_s).

Now let L_i = ∩_{1≤j≤s, j≠i} N_j for 1 ≤ i < s; we still need to show L_i = 0. By symmetry, all we need to show is L_1 = N_2 ∩ N_3 ∩ ... ∩ N_{s-1} ∩ N_s = 0. By (16), we have N_2 ⊂ null H'_2 ⊕ R_2. Suppose N_2 ∩ ... ∩ N_k ⊂ (null H'_2 ∩ ... ∩ null H'_k) + (R_2 + ... + R_k) for some k < s-1. Then N_2 ∩ ... ∩ N_{k+1} is a subset of

  ((null H'_2 ∩ ... ∩ null H'_k) + (R_2 + ... + R_k)) ∩ (null H'_{k+1} + R_{k+1}).   (18)

By (13), R_2 + ... + R_k ⊂ null H'_{k+1} ⊂ null H'_{k+1} + R_{k+1}. So we can apply Fact 1 to (18) and thus obtain N_2 ∩ ... ∩ N_{k+1} as a subset of ((null H'_2 ∩ ... ∩ null H'_k) ∩ (null H'_{k+1} + R_{k+1})) + (R_2 + ... + R_k). Applying Fact 1 once more with V = null H'_{k+1}, U = R_{k+1}, W = null H'_2 ∩ ... ∩ null H'_k, we get (c.f. (13) for U ⊂ W) that N_2 ∩ ... ∩ N_{k+1} is a subset of (null H'_2 ∩ ... ∩ null H'_{k+1}) + R_2 + ... + R_{k+1}. By induction we get N_2 ∩ ... ∩ N_{s-1} ⊂ (null H'_2 ∩ ... ∩ null H'_{s-1}) + R_2 + ... + R_{s-1}.

Lastly,

  N_2 ∩ ... ∩ N_s
  (a)⊂ ((null H'_2 ∩ ... ∩ null H'_{s-1}) + R_2 + ... + R_{s-1}) ∩ null H'_s
  (b)⊂ (null H'_2 ∩ ... ∩ null H'_s) + R_2 + ... + R_{s-1}
  (c)= R_1 + R_2 + ... + R_{s-1},

where (a) is due to N_s ⊂ null H'_s (c.f. (15)), (b) is due to Fact 1, and (c) is from the definition of R_1. Thus, N_2 ∩ ... ∩ N_s ⊂ (R_1 + ... + R_{s-1}) ∩ N_s = {0}, where the last equality is from the construction of N_s (c.f. (15)).

Lemma 7. Given the coding matrices (H_1, ..., H_s) of a perfect (s, n, M)-compression, we can form a perfect (2, sn, M + (s-1)n)-compression with coding matrices (X, J), where

  X =
  [I I 0 0 ... 0 0 0]
  [0 I I 0 ... 0 0 0]
  [       ...       ]
  [0 0 0 0 ... 0 I I]   (19)

is an (s-1)n x sn matrix,

  J =
  [H_1 0   ... 0  ]
  [0   H_2 ... 0  ]
  [       ...     ]
  [0   0   ... H_s]   (20)

is an M x sn matrix, and I denotes the n x n identity matrix.

Proof: Since 2^n(sn + 1) = 2^M implies 2^(sn)(sn + 1) = 2^(M+(s-1)n), we only need to show how to retrieve the input vectors. Let us decompose the pair of input vectors for X and J, respectively, into [b_1; b_2; ...; b_s] and [b_1 + v_1; ...; b_s + v_s], where the b_i are n-entry vectors and the v_i are also n-entry vectors but restricted to the condition |v_1| + ... + |v_s| ≤ 1, where |·| maps an element of Z_2^n to its norm in Z by counting the number of nonzero components.

From the output of X, we get b_1 + b_2, b_2 + b_3, ..., b_{s-1} + b_s. Thus we can obtain b_1 + b_2, b_1 + b_3, ..., b_1 + b_s and H_2(b_1 + b_2), H_3(b_1 + b_3), ..., H_s(b_1 + b_s). From the output of J, we obtain H_2(b_2 + v_2), H_3(b_3 + v_3), ..., H_s(b_s + v_s) and H_1(b_1 + v_1). Combining the results, we get H_1(b_1 + v_1), H_2(b_1 + v_2), ..., H_s(b_1 + v_s). Since (H_1, ..., H_s) is a perfect compression, we can compute b_1, v_1, v_2, ..., v_s. Together with the output of X, we can retrieve all b_1, b_2, ..., b_s and v_1, v_2, ..., v_s.
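Lemma 7's reduction can also be exercised on tiny parameters. Starting from the trivial perfect (3, 1, 3)-compression of Remark 2, the sketch below (ours) builds X and J as in (19) and (20) and confirms that (X, J) is a perfect (2, 3, 5)-compression:

```python
from itertools import product

s, n = 3, 1
Hs = [[[1]], [[1]], [[1]]]                      # the trivial perfect (3, 1, 3)-compression

# X from (19): (s-1)n x sn; J from (20): block-diagonal M x sn
X = [[1 if j // n in (i, i + 1) else 0 for j in range(s * n)] for i in range(s - 1)]
J = []
for i, H in enumerate(Hs):
    for row in H:
        J.append([0] * (i * n) + list(row) + [0] * ((s - 1 - i) * n))

def encode(H, x):
    return tuple(sum(h * xi for h, xi in zip(row, x)) % 2 for row in H)

N = s * n                                        # length of the two new sources
syndromes = set()
for x in product((0, 1), repeat=N):              # 2-terminal Hamming sources of length sn
    for y in [x] + [tuple(b ^ (i == t) for i, b in enumerate(x)) for t in range(N)]:
        syndromes.add((encode(X, x), encode(J, y)))

M2 = len(X) + len(J)                             # (s-1)n + M = 5
print(len(syndromes) == (N + 1) * 2 ** N == 2 ** M2)   # True: a perfect (2, 3, 5)-compression
```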
Before we finally proceed to the proof of Theorem 4, we need to present one more fact.

Fact 2. For vector spaces V, U, and W, (V ⊕ U) ∩ W = (V ∩ W) ⊕ U if U ⊂ W.

Proof: From (12), we know that (V + U) ∩ W ⊂ (V ∩ W) + U. Now let v ∈ V ∩ W, u ∈ U ⊂ W. Then v + u ∈ W and v + u ∈ V + U. Therefore (V ∩ W) + U ⊂ (V + U) ∩ W. Lastly, we notice that V ∩ U = (V ∩ W) ∩ U as U ⊂ W. Hence V ∩ U = {0} iff (V ∩ W) ∩ U = {0}, which justifies the direct sum signs.

Proof of Theorem 4: By Lemma 6, we can restrict our attention to perfect compressions whose coding matrices (H_1, ..., H_s) satisfy

  null H_1 ∩ null H_2 ∩ ... ∩ null H_{i-1} ∩ null H_{i+1} ∩ ... ∩ null H_s = 0   (21)

for i ≠ s, without loss of generality.

We can also generate X and J according to (19) and (20). Then Lemma 7 shows that (X, J) is a perfect (2, sn, M + (s-1)n)-compression. Therefore null X ∩ null J = 0. Then Lemma 1 tells us that two surjective matrices with null spaces {0} and null X ⊕ null J, respectively, also form a perfect compression. By the proof of Lemma 5, the first matrix is an invertible matrix and the second one is an (M + (s-1)n - sn =) M-n-bit Hamming matrix P.

We have

  null P = null X ⊕ null J   (22)

and

  null X = {(c, ..., c)^t | c ∈ Z_2^n};  null J = {(n_1, ..., n_s)^t | n_i ∈ null H_i}.   (23)

Partition P into

  P = [Q_1 Q_2 ... Q_s],   (24)

such that each Q_i is an (M-n) x n matrix. We have

  Q_1 + Q_2 + ... + Q_s = 0   (25)

because null X ⊂ null P.

Secondly, null J ⊂ null P implies

  null H_j ⊂ null Q_j   (26)

for 1 ≤ j ≤ s. Moreover, letting

  b_j ∈ null Q_j,   (27)

we have (0, ..., 0, b_j, 0, ..., 0)^t ∈ null P, and we can decompose it into (c, ..., c) + (c, ..., c, c + b_j, c, ..., c) by (22), with

  (c, ..., c, c + b_j, c, ..., c) ∈ null J.   (28)

So c ∈ null H_1 ∩ null H_2 ∩ ... ∩ null H_{j-1} ∩ null H_{j+1} ∩ ... ∩ null H_s and c + b_j ∈ null H_j. By (21), we have c = 0 if j ≠ s, and then b_j ∈ null H_j. Hence null Q_j ⊂ null H_j. Together with (26), we get

  null H_j = null Q_j for j ≠ s.   (29)

Recall that Y is a row basis matrix of

  [Q_1; Q_2; ...; Q_{s-1}]   (30)

and hence

  null Y = null Q_1 ∩ ... ∩ null Q_{s-1} = null H_1 ∩ ... ∩ null H_{s-1}   (31)

by (29). Now we will show that null Q_s = null H_s ⊕ null Y.

null Q_s ⊂ null H_s ⊕ null Y: Equations (27) and (28) are true for all j. In particular, taking j = s and letting b_s ∈ null Q_s, we have b_s = (c + b_s) + c, where c ∈ null H_1 ∩ ... ∩ null H_{s-1} = null Y and c + b_s ∈ null H_s by (28). Therefore, b_s ∈ null H_s + null Y. Since null H_s ∩ null Y = null H_1 ∩ ... ∩ null H_s = 0, we have null Q_s ⊂ null H_s ⊕ null Y.

null H_s ⊕ null Y ⊂ null Q_s: Given b_s ∈ null H_s and c ∈ null Y, (0, 0, ..., 0, b_s + c)^t = (c, ..., c)^t + (c, ..., c, b_s)^t ∈ null P (c.f. (22), (23)) because (c, ..., c) ∈ null X and (c, ..., c, b_s) ∈ null J. Thus 0 = P(0, ..., 0, b_s + c)^t = Q_s(b_s + c). Hence b_s + c ∈ null Q_s and thus we have

  null Q_s = null H_s ⊕ null Y.   (32)

Let A be a subspace of Z_2^n such that

  null H_s ⊕ null Y ⊕ A = Z_2^n.   (33)

Let T be a surjective matrix with

  null T = null H_s ⊕ A.   (34)
We have

  null T ⊕ null Y = Z_2^n.   (35)

Being a row basis matrix (c.f. (30)), Y is also surjective. Hence

  R = [Y; T]   (36)

is an n x n invertible matrix.

Also, null Q_s ∩ null T = (null H_s ⊕ null Y) ∩ null T. Since null H_s ⊂ null T (c.f. (34)), we can apply Fact 2 to obtain

  null Q_s ∩ null T = (null T ∩ null Y) ⊕ null H_s   (37)
                    = 0 ⊕ null H_s   (c.f. (35))
                    = null H_s.

Denote by C_j a row basis matrix of Q_j for all j; then we have

  null C_j = null Q_j   (38)

for all j.

Since Q_s = Q_1 + Q_2 + ... + Q_{s-1} (c.f. (25)) and C_s is a row basis matrix of Q_s (c.f. (38)), we have

  row C_s = row Q_s ⊂ row Y   (39)

(c.f. (30)). Then [T; C_s] is a surjective matrix because its row vectors are linearly independent, thanks to (36) and (39).

Therefore, (C_1, C_2, ..., C_{s-1}, [T; C_s]) is a perfect compression, as its encoding matrices are all surjective and have the same null spaces as those of the perfect compression (H_1, ..., H_s) (c.f. Lemma 1). Moreover, (C_1, C_2, ..., C_{s-1}, [T; C_s]) is a generalized HCMS, which is equivalent to (H_1, ..., H_s).

APPENDIX

Proof of Proposition 1: Denote by S the set containing all 3-terminal Hamming sources of length 5. Let H_1, H_2, H_3 be the three parity check matrices. Our strategy is to limit their null spaces by the fact that all elements in S need to have distinct syndromes. The limitation will eventually kill the possibility of the existence of H_1, H_2, H_3.

Denote the null set of a matrix A as null(A) = {u | Au = 0}, where 0 is the all-zero vector. Further, denote by e_i the length-5 binary column vector that has its i-th component equal to 1 and the rest of its components equal to zero.

We may assume the number of rows of H_1 is smaller than or equal to the others'. In other words, H_1 has at most 3 rows. Hence null(H_1) has at least two degrees of freedom. Regardless of the values of H_2 and H_3, null(H_1) cannot contain any e_i. Otherwise, both (e_i, 0, 0) and (0, 0, 0), which are in S, would get the same outputs (y_1, y_2, y_3) = (0, 0, 0). Similarly, null(H_1) cannot contain e_i + e_j either; otherwise both (e_i, 0, 0) and (e_j, 0, 0) (in S) would get the same outputs because H_1 e_i = H_1 e_j. Thus null(H_1) can only be span(e_i + e_j + e_k, e_i + e_m + e_n), where the letters i, j, k, m, n are all different, i.e., null(H_1) = {0, e_i + e_j + e_k, e_i + e_m + e_n, e_j + e_k + e_m + e_n} for some i, j, k, m, n. Other structures, such as those of higher dimension, would contain forbidden elements. As the dimensions of the null spaces of 1 x 5 and 2 x 5 matrices are all greater than 2, H_1 has to have at least 3 rows. Thus both H_2 and H_3 also have three rows. This excludes the possibility of perfect asymmetric SW codes (at rates [2/5, 3/5, 4/5], for example). So we can focus only on the symmetric case from now on.
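The structural claim about null(H_1), namely that a 2-dimensional null space avoiding all weight-1 and weight-2 vectors must consist of 0, two weight-3 vectors and one weight-4 vector, can be confirmed by brute force over Z_2^5 (our illustration):

```python
from itertools import combinations, product

vectors = [v for v in product((0, 1), repeat=5) if any(v)]
xor = lambda a, b: tuple(x ^ y for x, y in zip(a, b))
wt = lambda v: sum(v)

admissible = set()
for a, b in combinations(vectors, 2):
    span = frozenset({a, b, xor(a, b)})                     # nonzero elements of span{a, b}
    if len(span) == 3 and all(wt(x) >= 3 for x in span):    # no weight-1 or weight-2 vectors allowed
        admissible.add(span)

assert all(sorted(wt(x) for x in s) == [3, 3, 4] for s in admissible)
print(len(admissible), "admissible 2-dimensional null spaces, each of the form {0, 3e, 3e, 4e}")
# prints: 15 admissible 2-dimensional null spaces, ...
```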
From the above discussion, null(H_1) has to contain 0, two "3e" vectors and one "4e" vector, and no more. Similarly, null(H_2) and null(H_3) have the same structure. Without loss of generality, we can write null(H_1) = {0, e_1 + e_2 + e_3, e_1 + e_4 + e_5, e_2 + e_3 + e_4 + e_5}. Suppose null(H_2) contains e_1 + e_2 + e_3; then of course null(H_3) cannot contain e_1 + e_2 + e_3. Otherwise, (0, 0, 0) ∈ S and (e_1 + e_2 + e_3, e_1 + e_2 + e_3, e_1 + e_2 + e_3) ∈ S would get the same output (0, 0, 0). But null(H_3) also cannot contain e_i + e_j + e_k for i, j ∈ {1, 2, 3}, k ∈ {4, 5} (i ≠ j); otherwise (e_1 + e_2 + e_3, e_1 + e_2 + e_3, e_i + e_j) ∈ S and (0, 0, e_k) ∈ S would share the same output as well. So the "3e" vectors of null(H_3) can only be two of e_1 + e_4 + e_5, e_2 + e_4 + e_5, and e_3 + e_4 + e_5. Unfortunately, any pair of them sums to a "2e" vector instead of a "4e" vector. Therefore, there cannot be a common "3e" vector shared between any pair of the null spaces of H_1, H_2, and H_3.

Now, suppose null(H_2) contains the same "4e" vector e_2 + e_3 + e_4 + e_5 as null(H_1) does. Then null(H_3) cannot contain any "4e" vector. Indeed, let the "4e" vector of null(H_3) be e_1 + e_2 + e_3 + e_4 + e_5 - e_j, j ∈ {2, 3, 4, 5}. Then (e_2 + e_3 + e_4 + e_5, e_2 + e_3 + e_4 + e_5, [1,1,1,1,1]^T) and (0, 0, e_j) share the same syndrome. Thus, any pair of the null spaces of H_1, H_2, and H_3 cannot share a common "4e" vector either.

Hence, without loss of generality, we can write null(H_2) = {0, e_2 + e_1 + e_4, e_2 + e_3 + e_5, e_1 + e_3 + e_4 + e_5}. Then, there are only three different possibilities for the "4e" vector of null(H_3):
1) e_1 + e_2 + e_4 + e_5;
2) e_1 + e_2 + e_3 + e_5;