Communicating the sum of sources over a network

Aditya Ramamoorthy, Member, IEEE, and Michael Langberg

arXiv:1001.5319v1 [cs.IT] 29 Jan 2010

A. Ramamoorthy is with the Department of Electrical and Computer Engineering, Iowa State University, Ames IA 50011, USA (email: adit[email protected]).
M. Langberg is with the Computer Science Division, Open University of Israel, Raanana 43107, Israel (email: [email protected]).
The material in this work was presented in part at the 2008 IEEE International Symposium on Information Theory in Toronto, Canada, at the 2009 IEEE International Symposium on Information Theory in Seoul, South Korea and is currently under submission to the 2010 IEEE International Symposium on Information Theory.

Abstract: We consider the network communication scenario, over directed acyclic networks with unit capacity edges in which a number of sources $s_i$ each holding independent unit-entropy information $X_i$ wish to communicate the sum $\sum X_i$ to a set of terminals $t_j$. We show that in the case in which there are only two sources or only two terminals, communication is possible if and only if each source terminal pair $s_i/t_j$ is connected by at least a single path. For the more general communication problem in which there are three sources and three terminals, we prove that a single path connecting the source terminal pairs does not suffice to communicate $\sum X_i$. We then present an efficient encoding scheme which enables the communication of $\sum X_i$ for the three sources, three terminals case, given that each source terminal pair is connected by two edge disjoint paths. Our encoding scheme includes a structural decomposition of the network at hand which may be found useful for other network coding problems as well.

Index Terms: network coding, function computation, multicast, distributed source coding.

I. INTRODUCTION

The problem of function computation over networks is perhaps the most general information transmission problem one can formulate. Many problems studied in information theory can be cast as instances of it by defining the function appropriately. In the most general case, one could consider arbitrarily correlated sources over a noisy network with multiple terminals requesting arbitrary functions of the sources with respect to specified distortion constraints. It is evident that the problem in its full generality is quite challenging.

In this work we consider a problem setting in which the sources are independent and the network links are error-free, but capacity constrained. However, the topology of the network can be quite complicated, such as an arbitrary directed acyclic graph. This serves as an abstraction of current-day computer networks at the higher layers. We investigate the problem of characterizing the network resources required to communicate the sum (over a finite field) of a certain number of sources over a network to multiple terminals. By network resources, we mean the number of edge disjoint paths between various source terminal pairs in the network. Our work can be considered as using network coding to compute and multicast sums (or more generally functions) of the messages, as against multicasting the messages themselves.

The problem of multicast has been studied intensively under the paradigm of network coding. The seminal work of Ahlswede et al. [1] showed that under network coding the multicast capacity is the minimum of the maximum flows from the source to each individual terminal node. The work of Li et al. [2] showed that linear network codes are sufficient to achieve the multicast capacity. The algebraic approach to network coding proposed by Koetter and Médard [3] provided simpler proofs of these results.

The problem of multicasting sums of sources is an important component in enabling the multicast of correlated sources over a network (using network coding). Network coding for correlated sources was first examined by Ho et al. [4]. The work of Ramamoorthy et al. [5] showed that in general separating distributed source coding and network coding is suboptimal except in the case of two sources and two terminals. The work of Wu et al. [6] presented a practical approach to multicasting correlated sources over a network. Reference [6] also stated the problem of communicating sums over networks using network coding, and called it the Network Arithmetic problem. We elaborate on related work in the upcoming Section II.

In this work, we present (sometimes tight) upper and lower bounds on the network resources required for communicating the sum of sources over a network under certain special cases.

A. Main Contributions

We consider networks that can be modeled as directed acyclic graphs, with unit capacity edges. Let $G = (V,E)$ represent such a graph. There is a set of source nodes $S \subset V$ that observe independent unit-entropy sources, $X_i, i \in S$, and a set of terminal nodes $T \subset V$, that seeks to obtain $\sum_{i \in S} X_i$, where the sum is over a finite field. Our work makes the following contributions.

i) Characterization of necessary and sufficient conditions when either $|S| = 2$ or $|T| = 2$.
Suppose that $G$ is such that there are either two sources ($|S| = 2$) and an arbitrary number of terminals or an arbitrary number of sources and two terminals ($|T| = 2$). The following conditions are necessary and sufficient for recovery of $\sum_{i \in S} X_i$ at all terminals in $T$.

\[ \text{max-flow}(s_i - t_j) \geq 1 \text{ for all } s_i \in S \text{ and } t_j \in T. \]

Our proofs are constructive, i.e., we provide efficient algorithms for the network code assignment.

ii) Unit connectivity does not suffice when $|S|$ and $|T|$ are both greater than 2.
We present a network $G$ such that $|S| = |T| = 3$ in which the maximum flow between each source terminal pair is at least 1 and (as opposed to that stated above) communicating the sum of sources is not possible.

iii) Sufficient conditions when $|S| = |T| = 3$.
well motivated since it is a good abstraction of current-day SupposethatGissuchthat|S|=|T|=3.Thefollowing computer networks (at the higher layers). We investigate the conditionissufficientforrecoveryof X atallt ∈ problem of characterizing the network resources required to i∈S i j T. communicate the sum of a certain number of sources over P a network to multiple terminals. Network resources can be max-flow(s −t )≥2 for all s ∈S and t ∈T. i j i j measured in various ways. For example, one may specify Efficient algorithms for network code assignment are the maximum flow between the subsets of the source nodes presented in this case as well. and subsets of the terminal nodes in the network. In the iv) Developmentofa techniqueforstructuraldecomposition current work, all of our characterizations are in terms of the of networks. maximum flow between various si −tj pairs, where si (tj) Weproposealabelingschemefornodesandedgesofthe denotes a source (terminal) node. Previous work in this area, graph, that allows us to arrive at a class decomposition includes the work of Ahlswede et al. [1], who introduced the of the relevant networks. This significantly improvesour concept of network coding and showed the capacity region ability to reason about the problem at hand. We believe for multicast. In multicast, the terminals are interested in that this technique may be of independent interest. reconstructing the actual sources. Numerous follow-up works have extended and improved the results of [1], in different Finally, we emphasize that while we work with the sum ways. For example, [2], [3] considered multicast with linear of sources function throughout our discussions, most of our codes. Ho et al. [14] proposed random network coding and results will carry over for functions that are invariant under examined the multicast of correlated sources over a network permutations of the sources. 
and showed a tightcapacity region for it that can be achieved This paper is organizedas follows. We discuss background by using random network codes. Follow-up works [5], [6] and related work in Section II and our network coding investigated practical approaches for achieving this goal. As model in Section III. The characterization for the case of shown in [6], the problem of communicating (multicasting) |S|=2,|T|=nand|S|=n,|T|=2isdiscussedinSections the sum (over a finite field) of sources over a network is a IV and V respectively. Our counter-example demonstrating subproblemthatcanhelpfacilitatepracticalapproachestothis that unit-connectivity does not suffice for three sources and problem. three terminals can be found in Section VI. Sections VII In this work we consider function computation under net- and VIII discuss the sufficient characterization in the case of work coding. Specifically, we present network code assign- three sourcesand three terminals,and Section IX presentsthe ment algorithms for the problem of multicasting the sum of conclusions and possibilities for future work. sources over a network. As one would expect, one needs fewer resources in order to support this. To the best of our II. BACKGROUND AND RELATEDWORK knowledge,the first work to examinefunctioncomputationin Prior work of an information theoretic flavor in the area this setting is the work of Ramamoorthy[15], that considered of function computation has mainly considered the case of the problem of multicasting sums of sources, when there are two correlated sources X and Y, with direct links between either two sources or two terminals in the network. Subse- the sources and the terminal, where the terminal is interested quently,theworkofLangbergandRamamoorthy[16]showed in reconstructing a function f(X,Y). 
In these works, the that the characterization of [15] does not hold in the case of topologyof the network is very simple, howeverthe structure threesourcesandthreeterminals.Reference[16],proposedan of the correlation between X and Y may be arbitrary. In alternate characterization in this case. The current paper is a this setting, Korner & Marton [7] determine the rate region revisedandextendedversionof[15]and[16]thatcontainsall for encoding the modulo-2 sum of X and Y when they are the proofs and additional observations. uniform, correlated binary sources. The work of Orlitsky & Rai and Dey [17] independently found the same counter- Roche [8] determines the required rate for sending X to a example found in our work [16]; however, their proof only decoder with side information Y that must reliably compute shows that linear codes do not suffice for multicasting sums f(X,Y).Theresultof[8]wasextendedtothecasewhenboth under the characterization of [15]. Their work also contains X andY needtobeencoded(undercertainconditions)in[9]. an alternate proof of the result in the case of n sources Yamamoto[10](generalizingtheWyner-Zivresult[11])found and two terminals (see Section V). The work of Appuswamy the rate-distortion function for sending X to a decoder with et al. [18], [19] also considers function computation in the side information Y, that wants to compute f(X,Y) within a setting of error-free directed acyclic networks. In [18], [19], certain distortion level (see also [12] for an extension). Nazer the emphasis is on considering the rate of the computation, et al. [13] consider the problem of reliably reconstructing a where the rate refers to the maximum number of times a function over a multiple-access channel (MAC) and finding function can be computed per network usage. However, all the capacity of finite-field multiple access networks. 
their results are in the context of only one terminal, and for In contrast with previous studies, in this work we consider the most part do not provide constructive solutions. a problem setting in which the sources are independent and the network links are error-free, but capacity constrained. III. NETWORK CODING MODEL However, the topology of the network can be quite compli- In our model,we representthe network as a directed graph cated, such as an arbitrary directed acyclic graph. This is G = (V,E). The network contains a set of source nodes 2 S ⊂ V that are observing independent, discrete unit-entropy whereβ ∈{0,1},for alli=1,2. We say thatthe encoding e,i sources and a set of terminals T ⊂ V. We assume that each on edge e′ is greedy, if for i=1,2 we have edge in the network has unit capacity and can transmit one symbol from a finite field of size 2m per unit time (we are βe′,i = 0 if βe,i =0,∀e entering u (1) free to choose m large enough). If a given edge has a higher (1 otherwise. capacity, it can be treated as multiple unit capacity edges. A coding vector assignment for G, is said to be greedy if the A directed edge e between nodes v and v is represented i j encoding on each edge in G is greedy. as (v → v ). Thus head(e) = v and tail(e) = v . A i j j i A useful property of greedy encoding is given below. path between two nodes v and v is a sequence of edges i j Lemma 1: Suppose that we perform greedy encoding on a {e ,e ,...,e } such that tail(e ) = v ,head(e ) = v and 1 2 k 1 i k j graph G = (V,E) with two source nodes s and s holding 1 2 head(e )=tail(e ),i=1,...,k−1. i i+1 informationX and X respectively.Consider a vertex u that 1 2 Our counter-examplein Section VI considers arbitrary net- is downstreamof a subset ofsource nodes,indexedby the set work codes. However,our constructivealgorithmsin Sections B ⊆ {1,2}, i.e., u is downstream of all s such that i ∈ B. i IV, V and VII shall use linear network codes. 
In linear Then for any edge e leaving u it holds that β =1,∀i∈B. e,i network coding, the signal on an edge (v → v ), is a linear i j Moreover,if u is a terminal node then u can recover the sum combinationofthesignalsontheincomingedgesonv andthe i X . sourcesignalatv (ifv ∈S).Inthispaperweassumethatthe i∈B i i i Proof. Follows directly from Definition 1. source (terminal) nodes do not have any incoming (outgoing) P Remark 1: Greedy encoding can be performed in the case edges from (to) other nodes. If this is not the case one can of two sources, since if a node only receives either X or 1 always introduce an artificial source (terminal) connected to X , it just forwards them. Alternatively, if it receives both of 2 the original source (terminal) node by an edge of sufficiently them or X +X , then it just forwards X +X . However, 1 2 1 2 largecapacitythathasnoincoming(outgoing)edges.Weshall this form of greedy encodingdoes not seem to have a natural only be concerned with networks that are directed acyclic in generalization that is useful in our problem setting when the which internal nodes have sufficient memory. Such networks number of sources is higher. For example, if there are three can be treated as delay-free networks. Let Y (such that ei sources, and a node receives the combinations X1+X2 and tail(e ) = v and head(e ) = v ) denote the signal on the i k i l X +X , it cannot compute X +X +X . ith edge in E and let X denote the jth source. Then, we 2 3 1 2 3 j The main result of this section is the following. have Theorem 1: Consider a directed acylic graph G = (V,E) with unit capacity edges, two source nodes s and s and n Y = f Y if v ∈V\S, and 1 2 ei j,i ej k terminal nodes t ,...,t such that 1 n {ej|heaXd(ej)=vk} Yei = aj,iXj if vk ∈S, max-flow(si−tj)≥1 for all i=1,2 and j =1,...,n. {j|Xj obXservedatvk} Assumethatateachsourcenodesi,thereisaunit-ratesource where the coefficients a and f are from GF(2m). Note Xi, and that the Xi’s are independent. 
Then, there exists an j,i j,i assignmentofcodingvectorstoalledgessuchthateacht ,j = thatsincethegraphisdirectedacyclic,itispossibletoexpress j 1,...,n can recover X +X . Y for an edge e in terms of the sources X ’s. Suppose that 1 2 theeire are n sourceisX ,...,X . If Y = nj β X then The basic idea of the proof in the case of two sources and 1 n ei k=1 ei,k k n terminals is greedy encoding. we say that the global coding vector of edge e is β = P i ei Proof of Theorem 1. Consider any terminal node t . As we [β ··· β ]. For brevity we shall mostly use the term j ei,1 ei,n assume that max-flow(s −t ) ≥ 1 for all i = 1,2, it holds coding vector instead of global coding vector in this paper. i j that t is downstream of both s and s . Thus by Lemma 1, We say that a node v (or edge e ) is downstream of another j 1 2 i i t can recover X +X . node v (or edge e ) if there exists a path from v (or e ) to j 1 2 j j j j Note that if any of the conditions in the statement of v (or e ). i i Theorem 1 are violated then some terminal will be unable to compute X +X . For example, if max-flow(s −t )<1 1 2 1 j IV. CASE OF TWOSOURCES ANDnTERMINALS then any decoded signal Y at t will have H(Y|X )<1 (as j 2 Y is solely a function of X and X ). We conclude that Y In this section we state and prove the rate region for the 1 2 cannot be X +X . network arithmetic problem when there are two sources and 1 2 n terminals.Beforeembarkingonthisproof,we overviewthe concept of greedy encoding that will be used throughout the V. CASE OFnSOURCES ANDTWO TERMINALS paper.Inwhatfollows,weassumethatournetworkGhasunit We nowpresentthe rateregionforthe situationwhenthere capacity edges, and our source nodes si generate information arensourcesandtwoterminalssuchthateachterminalwants X of unit entropy. to recover the sum of the sources. i Definition 1: Greedy encoding. 
Consider a graph G = To show the main result we first demonstrate that the (V,E), with two source nodes s and s and an edge e′ = original network can be transformed into another network 1 2 (u → v) ∈ E. Suppose that the coding vector on each edge where there exists exactly one path from each source to each e entering u, has only 0 or 1 entries, i.e., β = [β β ], terminal. By a simple argument it then follows that coding e e,1 e,2 3 vectors can be assigned so that the terminals recover the sum paths from sources s ,...,s to t . By construction, there is 1 n j of the sources. exactlyone path from s to t . Thus, t receives n X . i j j i=1 i Theorem 2: Consider a directed acylic graph G = (V,E) As in the previous section it is clear that if any of the P with unit capacity edges, n source nodes s ,s ,...,s and conditions in the statement of Theorem 2 are violated then 1 2 n two terminal nodes t and t such that either terminal t or t will be unable to find n X . 1 2 1 2 i=1 i max-flow(s −t )≥1 for all i=1,...,n and j =1,2. P i j VI. EXAMPLEOF THREESOURCES AND THREE Assume that the source nodes observe independent unit- TERMINALS WITHINDEPENDENTUNIT-ENTROPY SOURCES entropy sources Xi,i = 1,...,n. Then, there exists an We nowpresentourcounterexamplewhich showsthatone assignment of coding vectors such that each terminal can cannotgeneralizethecharacterizationpresentedinTheorems1 recover the sum of the sources ni=1Xi. and 2 to the case of more sources or terminals. Namely, we Tosimplifyourproof,wemodifythegraphGofTheorem2 presentanetworkwiththreesourcesandthreeterminals,with P by introducing virtual source nodes s′i,i = 1,...n, virtual at least one path connecting each source terminal pair, in terminals t′j,j = 1,2 and virtual unit-capacity edges s′i → which the sum of sources cannot (under any network code) si,i=1,...,n and tj →t′j,j =1,2. These additions do not be transmitted (with zero error) to all three terminals. 
changetheconnectivityconstraintsspecifiedinTheorem2and Consider the network shown in Figure 1, with three source canbedonew.l.o.g.LetthenewsetofsourcesbedenotedS = nodes and three terminal nodes such that the source nodes {s′1,...,s′n}, and the modified graph G′. Notice that in G′ it observe unit entropy sources X1,X2 and X3 that are also holdsthatmax-flow(s′i−t′j)=1 for all i=1,...,n and j = independent.Alledgesareunitcapacity.AsshowedinFigure 1,2. We also need the following definitions. 1 the incoming edges into terminal t contain the values 3 Definition 2: Exactly one path condition. Consider two f(X ,X )andf′(X ,X )wheref andf′ aresomefunctions 1 2 2 3 nodes v1 and v2 such there is a path P between v1 and v2. of the sources. We saythatthereexistsexactlyonepathbetweenv1 andv2 if Suppose that X = 0. This implies that t should be 3 1 there does not exist another path P′ between v1 and v2 such able to recover X + X (that has entropy 1) from just 1 2 that P′ 6=P. f(X ,X ). Moreover note that each edge is unit capacity. 1 2 Definition 3: Minimality. A graph G = (V,E) with n Therefore, the entropy of f(X ,X ) also has to be 1, i.e., 1 2 source nodes s1,s2,...,sn and two terminal nodes t1 and there exists a one-to-one mapping between the set of values t2 is said to be minimal with respect to the connectivity that f(X1,X2) takes and the values of X1 + X2. In a requirements of Theorem 2 if the removal of any edge from similar manner we can conclude that there exists a one-to- E violates one of the requirements (i.e. for some i and j it one mapping between the set of values that f′(X ,X ) takes 2 3 holds that max-flow(s′i−t′j)<1). and the values of X2 + X3. At terminal t3, there needs to To show that Theorem 2 holds we first need an auxiliary exist some function h(f(X ,X ),f′(X ,X )) = 3 X . 1 2 2 3 i=1 i lemma that we state below. The proof can be found in the By the previous observations, this also implies the existence Appendix. 
of a function h′(X +X ,X +X ) that equals P3 X . Lemma 2: ConsiderthegraphG′ asconstructedabovewith We now demonstra1te tha2t thi2s is a3contradiction. Coi=n1sideir sources s′1,...,s′n and terminals t′1 and t′2. There exists a the following sets of inputs: X1 = a,X2 = 0,X3P= c and subgraph G∗ of G′ such that G∗ is minimal and there exists X′ = a−b,X′ = b,X′ =c−b. In both cases the inputs to 1 2 3 ienxaGct∗l.y one path from s′i to t′j for i = 1,...,n and j = 1,2 twhheilfeuncti3on hX′(′·,=·) aare−thbe+sacm,et.hHatowareeveinr ge3in=e1raXlid=iffaer+enct,. i=1 i P Therefore such a function h′(·,·) cannot exist. P A. Proof of Theorem 2. Note that we have presented the proof in the context of scalar nonlinearnetworkcodes. However,evenif we consider From Lemma 2, we know that it is possible to find a vectorsourcesalongwithvectornetworkcodes,thesameidea subgraphG∗ ofGsuchthatthereexistsexactlyonepathfrom of the proof can be used. s′ tot′ foralli=1,...,nandj =1,2.Supposethatwefind i j G∗. We will show that each terminal can recover n X i=1 i by assigning appropriate local encoding responsibilities for VII. CASE OF THREESOURCES AND THREETERMINALS every node. Consider a node v ∈G∗ and let Γo(v) aPnd Γi(v) It is evidentfrom the counter-examplediscussed in Section representthesetofoutgoingedgesfromvandincomingedges VI, that the characterization of the rate region in the case of intov respectively.LetYe representthesymboltransmittedon three sources and three terminals is different from the cases edge e. Each node operates in the following manner. discussedinSectionIVandV.Inthissection,weshowthatas longaseachsourceisconnectedbytwoedgedisjointpathsto Ye = Ye′ for e∈Γo(v), (2) each terminal, the terminals can recover the sum. We present e′∈XΓi(v) efficientlinearencodingschemesthatallowcommunicationin i.e., each node forwards the sum of the inputs on all output this case. The main result of this section can be summarized edges. 
In this case we observe that a terminal t can recover as follows. j n X .Toseethis,notethatthereceivedvalueatt canbe Theorem 3: Let G=(V,E) be a directed acyclic network i=1 i j expressed as the sum of the received values over all possible with three sources s ,s ,s and three terminals t ,t ,t . Let 1 2 3 1 2 3 P 4 the existence of a feasible network code in Gˆ. Finally, this network code can be converted (by property (d) above) into a feasible network code for G as desired. We specify the mapping between G and Gˆ and give proof of properties (a)- (d) in Section VII-A. For notational reasons, from this point on in the discussion we will assume thatour inputgraphG is structured — which is now clear to be w.l.o.g. In the second step of our proof,we give edges and vertices in the graph G certain labels depending on the combinatorial structureofG. Thisstepcanbeviewedasadecompositionof the graphG (boththe vertexset and the edgeset) into certain class sets whichmay be of interest beyondthe contextof this work.Theseclasseswilllaterplayamajorroleinouranalysis. The decomposition of G is given in detail in Section VII-B. Finally, in the third and final step of our proof, using the labeling above we perform a case analysis for the proof of Theorem 3. Namely, based on the terminology set in Sec- Fig.1. Exampleofanetwork withthree sources andthree terminals, such tionVII-B,weidentifyseveralscenarios,andproveTheorem3 that there exists at least one path between each source and each terminal. Howeveralltheterminals cannotcomputeP3i=1Xi. assuming they hold. As the different scenarios we consider will cover all possible ones, we will conclude our proof. Our detailed case analysis is given in Section VII-C and X be the information present at source s . If there exist two Section VIII. 
i i edge disjoint paths between each source/terminal pair, then All in all, as will be evidentfrom the sections yet to come, there exists a networkcoding scheme in which the sum X + our proof is constructive, and each of its steps can be done 1 X +X is obtained at each terminal t . Moreover, such a efficiently. This will result in the efficient construction of the 2 3 j network code can be found efficiently. desirednetworkcodeforG. We nowproceedtoformalizethe It turns out that the proof of this result requires several new steps of our proof. techniques that may be of independent interest. Remark 2: Our example in Section VI, shows that a single A. The reduction path between each s −t pair does not suffice. At the other i j extreme, if there are three edge-disjoint paths between each Let G=(V,E) be our input network, and let si and ti be s −t pair, then one can actually multicast X ,X and X the given sources and terminals. We now efficiently construct i j 1 2 3 to each terminal [3]. Our results show that two edge disjoint a structured graph Gˆ = (Vˆ,Eˆ) in which each internal node paths between each source terminal pair are sufficient for v ∈ Vˆ is of total degree at most three with the additional multicasting sums. following properties: (a) Gˆ is acyclic. (b) For every source We start by giving an overview of our proof. Roughly (terminal) in G there is a corresponding source (terminal) in speaking, our approach for determining the desired network Gˆ.(c)ForanytwoedgedisjointpathsP1 andP2 connectinga code has three steps. In the first step, we turn our graph sourceterminalpairinG, thereexisttwo vertex disjointpaths G into a graph Gˆ = (Vˆ,Eˆ) in which each internal node in Gˆ connecting the corresponding source terminal pair. (d) v ∈Vˆ isoftotaldegree(in-degree+out-degree)atmostthree. Any feasible network coding solution in Gˆ can be efficiently We refer to such graphs as structured graphs. Our efficient turned into a feasible network coding solution in G. 
Our reductionfollowsthatappearingin[20],andhasthefollowing reduction follows that appearing in [20] and is given here for properties:(a) Gˆ is acyclic. (b) For every source (terminal) in completeness. G there is a corresponding source (terminal) in Gˆ. (c) For Thereductionis doneiterativelyaccordingtothe following any two edge disjoint paths P and P connecting a source procedure in which we reduce the total degree of internal 1 2 terminal pair in G, there exist two vertex disjoint paths in Gˆ vertices to be at most 3. First we note that any source connecting the corresponding source terminal pair. Here and (terminal) in G is also one in Gˆ. throughout we say two paths between a source terminal pair 1) Reducingdegrees: LetGˆbethegraphformedfromGby are vertex disjoint even though they share their first and last iterativelyreplacingeachnodev ∈G,whichisnotasourceor vertices(i.e.,thesourceandterminalathand).(d)Anyfeasible aterminalnodewhosedegreeismorethanthreebyasubgraph network coding solution in Gˆ can be efficiently turned into a Γv, constructed as follows. Let {(xi,v) | i = 1,...,din(v)} feasible network coding solution in G. and {(v,yi) | i = 1,...,dout(v)} be the incoming and ItisnothardtoverifythatprovingTheorem3onstructured outgoinglinksofv,respectively,wheredin(v)anddout(v)are graphs implies a proof for general graphs G as well. Indeed, thein-andout-degreesofv.Foreachincominglink(xi,v)of given a network G satisfying the requirements of Theorem 3 v,weaddtoΓv anodexˆi andabinarytreewithrootatxˆi and construct the corresponding network Gˆ. By the properties dout(v) leaves xˆ1i,...,xˆidout(v). Similarly, for each outgoing above,Gˆ alsosatisfiestherequirementsofTheorem3.Assum- link(v,yi)ofv,weaddtoΓv anodeyˆi andaninvertedbinary ingTheorem3isprovenforstructuredgraphsGˆ,weconclude tree with root at yˆ and d (v) leaves yˆ1,...,yˆdin(v). Next, i in i i 5 for each1≤i≤d (v) and 1≤j ≤d (v) we addan edge B. 
The decomposition in out (xˆj,yˆi)toΓ .Finally,weconnectΓ totherestofthenetwork i j v v In this section we present our structural decomposition of by adding edges (x ,xˆ ) for 1 ≤ i ≤ d (v) and (yˆ,y ) for i i in i i G = (V,E). We assume throughout that G is directed and 1≤i≤d (v).Figures2and3demonstratetheconstruction out acyclic, that it has three sources s ,s ,s , three terminals 1 2 3 of the subgraph Γ for a node v with d (v) = d (v) = 3. v in out t ,t ,t and thatanyinternalvertexin V (namely,anyvertex 1 2 3 Note that for any two links (x ,v) and (v,y ) there is a path i j whichisneithera sourceorasink)hastotaldegreeatmost3. in Γ that connects x and y . v i j Moreover,weassumeGsatisfiestheconnectivityrequirements specified in Theorem 3. We start by labeling the vertices of G. A vertex v ∈ V is labeled by a pair (c ,c ) specifying how many sources s t (terminals) it is connected to. Specifically, c (v) equals the s numberof sourcess forwhich thereexists a path connecting i s andv inG.Similarly,c (v) equalsthenumberofterminals i t t for which there exists a path connecting v and t in G. j j For example,any source is labeled by the pair (1,3), and any terminal by the pair (3,1). An internal vertex v labeled (·,1) is connected to a single terminal only. This implies that any Fig.2. Anodev∈G. information leaving v will reach at most a single terminal. Such vertices v play an important role in the definitions to come. This concludes the labeling of V. An edge e = (u,v) for which v is labeled (·,1) will be referred to as a terminal edge. Namely, any information flowing on e can reach at most a single terminal. If this terminal is t then we will say that e is a t -edge. Clearly, j j the set of t -edges is disjoint from the set of t -edges (and 1 2 similarly for any pair of terminals). An edge which is not a terminal edge will be referred to as a remaining edge or an r-edge for short. 
We now prove some structural properties of the edge sets we havedefined.Firstof all, there existsan orderingof edges in E in which any r-edge comes before any terminal edge, and in addition there is no path from a terminal edge to an r-edge. This is obtained by an appropriate topological order inG.Moreover,foranyterminalt ,thesetoft -edgesforma j j connectedsubgraphofGrootedatt .Tosee thisnotethatby j definition each t -edge e is connected to t and all the edges j j on a path between e and t are t -edges. Finally, the head j j of an r-edge is either of type (·,2) or (·,3) (as otherwise it Fig.3. ThegadgetΓv forv inFigure2. would be a terminal edge). Foreachterminalt wenowdefineasetofverticesreferred WeproceedtoanalyzethepropertiesofGˆ,namelyweshow j to as the leaf set L of t . This definition shall play an that Gˆ is structured. The proof of properties (a), (b) and (c) j j important role in our discussions. followdirectlybyourconstruction.Forproperty(d)considera Definition 4: Leaf set of a terminal. Let P = (s = feasible network code for the network Gˆ. A feasible network i v ,v ,...,v = t ) be a path from s to t . Consider the 1 2 ℓ j i j code for G is constructed as follows. Let e = (u,v) be an intersection of P with the set of t -edges, This intersection j edge in G. Let e′ be the correspondingedge between Γu and consists of a subpath P′, (v ,...,v = t ) of P for which Γ in Gˆ. Here we assume both u and v were replaced by P ℓ j v the label of v is either (·,2) or (·,3), and the label of any P correspondinggadgets.Othercasescanbeprovenanalogously. other vertex in P′ is (·,1). We refer to v as the leaf of t P j The encoding function f for e=(u,v) is determined by the e correspondingto path P, and the set of all leaves of t as the j encodingfunctionsf oflinkseˆthatbelongtoΓ .Specifically, eˆ u leaf set L . 
let $A = \{(x_1, \hat{x}_1), \ldots, (x_{d_{in}(u)}, \hat{x}_{d_{in}(u)})\}$ be the incoming links of $\Gamma_u$ where $d_{in}(u)$ is the in-degree of $u$ in $G$. The construction of $\hat{G}$ implies that the information transmitted on the link $e'$ is a function $f_{e'}$ of the packets transmitted on links $A$. We use this exact function as the desired encoding function $f_e$. The fact that the incoming links of $u$ in $G$ correspond to the links in $A$ implies the feasibility of the resulting code for $G$.

We remark that (a) the leaf set of $t_j$ is the set of nodes of in-degree 0 in the subgraph consisting of $t_j$-edges and (b) a source node can be a leaf node for a given terminal.

C. Case analysis

We now present a classification of networks based on the node labeling procedure presented above. For each class of networks we shall argue that each terminal can compute the sum of the sources $(X_1 + X_2 + X_3)$. Our proof shall be constructive, i.e., it can be interpreted as an algorithm for finding the network code that allows each terminal to recover $(X_1 + X_2 + X_3)$.

1) Case 0: There exists a node of type $(3,3)$ in $G$. Suppose node $v$ is of type $(3,3)$. This implies that there exist $\mathrm{path}(s_i - v)$, for $i = 1, \ldots, 3$ and $\mathrm{path}(v - t_j)$, for $j = 1, \ldots, 3$. Consider the subgraph induced by these paths and color each edge on $\cup_{i=1}^{3} \mathrm{path}(s_i - v)$ red and each edge on $\cup_{j=1}^{3} \mathrm{path}(v - t_j)$ blue. We claim that as $G$ is acyclic, at the end of this procedure each edge gets only one color. To see

has already been done, i.e., the graph $\cup_{i=1}^{3} \mathrm{path}(s_i - v)$ is a tree directed into $v$.

The basic idea of the proof is to show that the paths from the sources to terminal $t_3$, i.e., $\cup_{i=1}^{3} \mathrm{path}(s_i - t_3)$ are such that their overlap with the other paths is very limited. Thus, the entire graph can be decomposed into two parts, one over which the sum is transmitted to $t_1$ and $t_2$ and another over which the sum is transmitted to $t_3$. Towards this end, we have the following two claims.

Claim 1: The path $\mathrm{path}(s_1 - t_3)$ cannot have an intersection with either $\mathrm{path}(s_2 - v)$ or $\mathrm{path}(s_3 - v)$.
Proof: Suppose that such an intersection occurred at a node $v'$. Then, it is easy to see that $v'$ is connected to at least two sources and to all three terminals and therefore is a node of type $(2,3)$, which contradicts our assumption.

In an analogous manner we can see that (a) $\mathrm{path}(s_2 - t_3)$ cannot have an intersection with either $\mathrm{path}(s_1 - v)$ or $\mathrm{path}(s_3 - v)$, and (b) $\mathrm{path}(s_3 - t_3)$ cannot have an intersection with either $\mathrm{path}(s_1 - v)$ or $\mathrm{path}(s_2 - v)$.

Claim 2: The paths $\mathrm{path}(s_1 - t_3)$, $\mathrm{path}(s_2 - t_3)$ and $\mathrm{path}(s_3 - t_3)$ cannot have an intersection with either $\mathrm{path}(v - t_1)$ or $\mathrm{path}(v - t_2)$.

Proof: To see this we note that if such an intersection happened, then $v$ would also be connected to $t_3$ which would imply that $v$ is a $(3,3)$ node. This is a contradiction.

Let $v_i$ be the node closest to $v$ that belongs to both $\mathrm{path}(s_i - v)$ and $\mathrm{path}(s_i - t_3)$ (notice that $v_i$ may equal $s_i$ but it cannot equal $v$). Consider the following coding solution on $G'$. On

this suppose that a red edge is also colored blue. This implies that it lies on a path from a source to $v$ and a path from $v$ to a terminal, i.e. its existence implies a directed cycle in the graph.

Now, we can find an inverted tree that is a subset of the red edges directed into $v$ and similarly a tree rooted at $v$ with $t_1$, $t_2$ and $t_3$ as leaves using the blue edges. Finally, we can compute $(X_1 + X_2 + X_3)$ at $v$ over the red tree and multicast it to $t_1$, $t_2$ and $t_3$ over the blue subgraph. More specifically, one may use an encoding scheme in which internal nodes of the red tree receiving $Y_1$ and $Y_2$ send on their outgoing edge the sum $Y_1 + Y_2$.

2) Case 1: There exists a node of type $(2,3)$ in $G$. Note that it is sufficient to consider the case when there does not exist a node of type $(3,3)$ in $G$. We shall show that this case is equivalent to a two sources, three terminals problem.

W.l.o.g. we suppose that there exists a $(2,3)$ node $v$ that is connected to $s_2$ and $s_3$. We color the edges on $\mathrm{path}(s_2 - v)$ and $\mathrm{path}(s_3 - v)$ blue.
Next, consider the set of paths $\cup_{i=1}^{3} \mathrm{path}(s_1 - t_i)$. We claim that these paths do not have any intersection with the blue subgraph. This is because the existence of such an intersection would imply that there exists a path between $s_1$ and $v$ which in turn implies that $v$ would be a $(3,3)$ node. We can now compute $(X_2 + X_3)$ at $v$ by finding a tree consisting of blue edges that are directed into $v$. Suppose that the blue edges are removed from $G$ to obtain a graph $G'$. Since $G$ is directed acyclic, we have that there still exists a path from $v$ to each terminal after the removal. Now, note that (a) $G'$ is a graph such that there exists at least one path from $s_1$ to each terminal and at least one path from $v$ to each terminal, and (b) $v$ can be considered as a source that contains $(X_2 + X_3)$. Now, $G'$ satisfies the condition given in Theorem 1 (which addresses the two sources version of the problem at hand), therefore we are done.

3) Case 2: There exists a node of type $(3,2)$ in $G$. As

the paths $\mathrm{path}(s_i - v_i)$ send $X_i$. On the paths $\mathrm{path}(v_i - v)$ send information that will allow $v$ to obtain $X_1 + X_2 + X_3$. This can be easily done, as these (latter) paths form a tree into $v$. Namely, one may use an encoding scheme in which internal nodes receiving $Y_1$ and $Y_2$ send on their outgoing edge the sum $Y_1 + Y_2$. By the claims above (and the fact that $G'$ is acyclic) it holds that the information flowing on edges $e$ in $\mathrm{path}(v_i - t_3)$, $i = 1, \ldots, 3$ has not been specified by the encoding defined above. Thus, one may send information on the paths $\mathrm{path}(v_i - t_3)$ that will allow $t_3$ to obtain $X_1 + X_2 + X_3$. Here we assume the paths $\mathrm{path}(v_i - t_3)$ form a tree into $t_3$; if this is not the case we may find a subset of edges in these paths with this property. Once more, by the claims above (and the fact that $G'$ is acyclic) it holds that the information flowing on edges $e$ in the paths $\mathrm{path}(v - t_1)$ and $\mathrm{path}(v - t_2)$ has not been specified (by the encodings above). On these edges we may transmit the sum $X_1 + X_2 + X_3$ present at $v$.
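The summation scheme described above, in which every internal node of a tree directed into $v$ forwards the sum of what it receives, can be illustrated with a short Python sketch. The sketch makes assumptions that are not in the paper: arithmetic modulo a prime `p` stands in for the finite field, the tree is given as a child-to-parent map, and the names `sum_at_root`, `PARENT` and `VALUES` are hypothetical.

```python
from collections import defaultdict

def sum_at_root(parent, source_values, root, p):
    """Value available at `root` when every node forwards the sum of its inputs.

    parent        -- maps each non-root node to the head of its single outgoing edge
    source_values -- maps each source node to the symbol it observes
    p             -- a prime; arithmetic is done modulo p (stand-in for the field)
    """
    children = defaultdict(list)
    for child, par in parent.items():
        children[par].append(child)

    def outgoing(u):
        # A node adds its own symbol (if it is a source) to everything it receives.
        total = source_values.get(u, 0)
        for c in children[u]:
            total += outgoing(c)
        return total % p

    return outgoing(root)

# Hypothetical tree directed into v: s1 and s2 merge at a, then a and s3 merge at v.
PARENT = {"s1": "a", "s2": "a", "a": "v", "s3": "v"}
VALUES = {"s1": 3, "s2": 5, "s3": 6}
result = sum_at_root(PARENT, VALUES, "v", p=11)
```

Here the node `a` forwards $Y_1 + Y_2 = 3 + 5$, and the root ends up holding $(3 + 5 + 6) \bmod 11$.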
before it suffices to consider the case when there do not exist any $(3,3)$ or $(2,3)$ nodes in the graph. Suppose that there exists a $(3,2)$ node $v$ and w.l.o.g. assume that it is connected to $t_1$ and $t_2$. We consider the subgraph $G'$ induced by the union of the following sets of paths
1) $\cup_{i=1}^{3} \mathrm{path}(s_i - v)$,
2) $\cup_{i=1}^{2} \mathrm{path}(v - t_i)$, and
3) $\cup_{i=1}^{3} \mathrm{path}(s_i - t_3)$.
Note that as argued previously, a subset of edges of $\cup_{i=1}^{3} \mathrm{path}(s_i - v)$ can be found so that they form a tree directed into $v$. For the purposes of this proof, we will assume that this

4) Case 3: There do not exist $(3,3)$, $(2,3)$ and $(3,2)$ nodes in $G$. Note that thus far we have not utilized the fact that there exist two edge-disjoint paths from each source to each terminal in $G$. In previous cases, the problem structure that has emerged due to the node labeling allowed us to communicate $(X_1 + X_2 + X_3)$ by using just one path between each $s_i - t_j$ pair. However, for the case at hand we will indeed need to use the fact that there exist two paths between each $s_i - t_j$ pair. As we will see, this significantly complicates the analysis. We present the analysis of Case 3 in the upcoming section.

VIII. ANALYSIS OF CASE 3

Note that the node labeling procedure presented above assigns a label $(c_s(v), c_t(v))$ to a node $v$ where $c_s(v)$ ($c_t(v)$) is the number of sources (terminals) that $v$ is connected to. This labeling ignores the actual identity of the sources and terminals that have connections to $v$. It turns out that we need to use an additional, somewhat finer notion of node connectivity when we want to analyze case 3.

The basic idea of our proof is the following. We divide the set of graphs under case 3 into various classes, depending on the number of colors that exist in the graph. It turns out that as long as the number of colors in the graph is not 2, i.e., either 0, 1 or 3 and higher, then there is a simple argument which shows that each terminal can be satisfied. The argument in the case of two colors is a bit more involved and is developed
separately. It can be shown that our counter-example in Section VI is a case where there are two colors. Note however, that in our counter-example there are certain $s_i - t_j$ pairs that have only one path between them. We now proceed to develop these arguments formally.

Claim 4: Consider the subgraph induced by a certain color, w.l.o.g. $(s_1,s_2,t_1,t_2)$ in $G$, denoted by $G_{(s_1,s_2,t_1,t_2)}$. There exists an assignment of encoding vectors over $G_{(s_1,s_2,t_1,t_2)}$, such that any (unit entropy) function of the sources $X_1$ and $X_2$ can be multicasted to all nodes in $G_{(s_1,s_2,t_1,t_2)}$. Moreover, such encoding vector assignments can be done independently over subgraphs of different colors.

Proof: Note that we are working with directed acyclic graphs. Thus, there is a node $v^*$ in $G_{(s_1,s_2,t_1,t_2)}$, such that it has no incoming edges in $G_{(s_1,s_2,t_1,t_2)}$. Next, note that the path from $s_1$ to $v^*$ has no intersection with a path from $s_2$ or $s_3$. To see this, suppose that there was such an intersection at node $v'$. If there is a path from $s_3$ to $v'$, then $v^*$ is a $(3,2)$ node (which

We emphasize that throughout this section, we still operate under the assumption that the reduction in Section VII-A has been performed and that each node has a total degree at most three.

Towards this end, for case 3 (i.e., in a graph $G$ without $(3,3)$, $(2,3)$ and $(3,2)$ nodes) we introduce the notion of the color of a node. For each $(2,2)$ node in $G$, the color of the node is defined as the 4-tuple of sources and terminals it is connected to, e.g., if $v$ is connected to sources $s_1$ and $s_2$ and terminals $t_1$ and $t_2$, then its color, denoted $\mathrm{col}(v)$, is $(s_1,s_2,t_1,t_2)$. We shall also say that the source color of $v$ is $(s_1,s_2)$ and the terminal color of $v$ is $(t_1,t_2)$. The source and terminal colors are sometimes referred to as source and terminal labels.

The following claim is immediate.

Claim 3: If there is a $(2,2)$ node $v$ in $G$ of color $\mathrm{col}(v)$, then each terminal in the terminal color of $v$ has at least one leaf with color $\mathrm{col}(v)$. For example, if $\mathrm{col}(v) = (s_1,s_2,t_1,t_2)$, then both $t_1$ and $t_2$ have leaves with color $(s_1,s_2,t_1,t_2)$.
Proof: W.l.o.g., let $\mathrm{col}(v) = (s_1,s_2,t_1,t_2)$. This implies that there exists a path $P$ between $v$ and $t_1$. Let $\ell$ be a leaf of $t_1$ on $P$. Recall that $\ell$ is defined as the last node on $P$ with terminal label at least 2, namely $c_t(\ell) \ge 2$. Moreover, $c_t(\ell)$ is exactly 2 and no larger as otherwise $c_t(v)$ would also be greater than 2, contradicting our assumptions in the claim. This implies that the terminal color of $\ell$ is exactly $(t_1,t_2)$. As $\ell$ is downstream of $v$ it holds that $c_s(\ell) \ge c_s(v) = 2$. Here also, it holds that $c_s(\ell)$ is exactly 2, otherwise $\ell$ would be a $(3,2)$ node (contradicting our assumption for case 3). This implies that the source color of $\ell$ is $(s_1,s_2)$. Therefore, $t_1$ has a leaf of color $(s_1,s_2,t_1,t_2)$. A similar argument holds for $t_2$.

The notion of a color is useful for the set of graphs under case 3, since we can show that there can never be an edge between nodes of different colors. We exploit this property extensively below.

Lemma 3: Consider a graph $G$, with sources, $s_i$, $i =$

contradicts the assumption that $v^*$ is a $(2,2)$ node). If there is a path from $s_2$ to $v'$, then $v'$ and the remaining vertices connecting $v'$ to $v^*$ on the path from $s_1$ to $v^*$ have color $(s_1,s_2,t_1,t_2)$. Contradicting the fact that $v^*$ has no incoming edges in $G_{(s_1,s_2,t_1,t_2)}$. Likewise, we see that the path from $s_2$ to $v^*$ has no intersection with a path from $s_1$ or $s_3$.

Therefore, the path from $s_1$ to $v^*$ carries $X_1$ in the clear, and likewise for the path from $s_2$ to $v^*$. Thus, $v^*$ can obtain both $X_1$ and $X_2$ and can compute any (unit entropy) function of them. Moreover, $v^*$ can transmit this function to all nodes of $G_{(s_1,s_2,t_1,t_2)}$ downstream of $v^*$. As the argument above can be repeated for any node $v^*$ of in-degree 0 in $G_{(s_1,s_2,t_1,t_2)}$ it follows that all nodes of $G_{(s_1,s_2,t_1,t_2)}$ can obtain the desired function of $X_1$ and $X_2$.

Finally, we note that the assignments over subgraphs of different colors can be done independently, since there does not exist any edge between nodes of different colors (from Lemma 3).
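The color of a node, and the property that no edge joins $(2,2)$ nodes of different colors, can be checked mechanically on a small example. The following Python sketch is illustrative only: the helper names (`reachable`, `node_colors`) and the toy graph are hypothetical, and the toy graph is chosen to contain no $(3,3)$, $(2,3)$ or $(3,2)$ nodes but does not satisfy the connectivity requirements assumed in the paper.

```python
from collections import defaultdict

def reachable(adj, start):
    """Vertices reachable from `start` along directed edges (start included)."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def node_colors(edges, sources, terminals):
    """Map every (2,2) node to its color: (two sources) + (two terminals)."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
    vertices = {x for e in edges for x in e}
    reach = {v: reachable(adj, v) for v in vertices}
    col = {}
    for v in vertices:
        srcs = tuple(s for s in sources if v in reach[s])
        terms = tuple(t for t in terminals if t in reach[v])
        if len(srcs) == 2 and len(terms) == 2:
            col[v] = srcs + terms
    return col

# Hypothetical toy graph with two colors: nodes a, a2 share one, b has the other.
EDGES = [("s1", "a"), ("s2", "a"), ("a", "a2"), ("a2", "t1"), ("a2", "t2"),
         ("s2", "b"), ("s3", "b"), ("b", "t2"), ("b", "t3")]
col = node_colors(EDGES, ["s1", "s2", "s3"], ["t1", "t2", "t3"])

# Edges whose two endpoints are (2,2) nodes of different colors (expected: none).
cross_edges = [(u, v) for u, v in EDGES
               if u in col and v in col and col[u] != col[v]]
```

Because such cross-color edges are absent, coding coefficients chosen on the subgraph of one color cannot influence the subgraph of another color.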
$1, \ldots, 3$, and terminals $t_j$, $j = 1, \ldots, 3$, such that it does not have any $(3,3)$, $(2,3)$ or $(3,2)$ nodes. There does not exist an edge between $(2,2)$ nodes of different color in $G$.

Proof: Assume otherwise and consider two $(2,2)$ nodes $v_1$ and $v_2$ such that $\mathrm{col}(v_1) \ne \mathrm{col}(v_2)$, for which there is an edge $(v_1, v_2)$ in $G$. Note that if the source colors of $\mathrm{col}(v_1)$ and $\mathrm{col}(v_2)$ are different, then $v_2$ has to be a $(3,2)$ node, which is a contradiction. Likewise, if the terminal colors of $\mathrm{col}(v_1)$ and $\mathrm{col}(v_2)$ are different, then $v_1$ has to be a $(2,3)$ node, which is also a contradiction.

Lemma 3 implies that we are free to assign any coding coefficients on a subgraph induced by nodes of one color, without having to worry about the effect of this on another subgraph induced by nodes of a different color (simply because there is no such effect).

Lemma 4: Consider a graph $G$, with sources, $s_i$, $i = 1, \ldots, 3$, and terminals $t_j$, $j = 1, \ldots, 3$, such that (a) it does not have any $(3,3)$, $(2,3)$ or $(3,2)$ nodes, and (b) there exists at least one $s_i - t_j$ path for all $i$ and $j$. Consider the set of all $(2,2)$ nodes in $G$ and their corresponding colors. If there exist no colors, exactly one color or at least three distinct colors in $G$, then there exists a set of coding vectors such that each terminal can recover $\sum_{i=1}^{3} X_i$.

Proof: Note that all leaves in $G$ are of type $(1,2)$, $(1,3)$ or $(2,2)$. This implies that any terminal $t_j$ that does not have a $(2,2)$ leaf with source color including $s_i$, must have a leaf at which $X_i$ is received in the clear. We refer to such leaves as singleton $X_i$ leaves. The above follows directly by the connectivity assumption (b) stated in the Lemma. In cases 2 and 3 in the analysis below, we need the field characteristic

Fig. 4. A possible instance of $G_{aux}$ when the degree sequence of the terminals is $(2,2,2)$.

TABLE I
ENCODING ON SUBGRAPHS OF DIFFERENT SOURCE COLORS. RECOVERY OF $\sum_{i=1}^{3} X_i$ IS POSSIBLE FROM ANY TWO OF THE RECEIVED VALUES, USING ADDITIONS OR SUBTRACTIONS.

Source color      Encoding
$(s_1, s_2)$      $2X_1 + X_2$
$(s_2, s_3)$      $X_2 + 2X_3$
$(s_1, s_3)$      $X_1 - X_3$
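The recovery property stated in the caption of Table I can be checked by brute force over a small field. The Python sketch below is an illustration under assumptions that are not in the paper: the field is taken to be the integers modulo an odd prime `P` (chosen here as 5), and the function names `encodings` and `recover` are hypothetical. Note that combining the first two rows yields $2\sum_i X_i$, so a division by 2 is used, which is where an odd characteristic is needed.

```python
from itertools import product

P = 5                      # assumed odd prime; the field is the integers mod P
INV2 = pow(2, P - 2, P)    # inverse of 2 mod P (Fermat's little theorem)

def encodings(x1, x2, x3):
    """Values carried on the three source-color subgraphs of Table I."""
    a = (2 * x1 + x2) % P   # source color (s1, s2)
    b = (x2 + 2 * x3) % P   # source color (s2, s3)
    c = (x1 - x3) % P       # source color (s1, s3)
    return a, b, c

def recover(a=None, b=None, c=None):
    """Recover X1 + X2 + X3 (mod P) from any two of the three values."""
    if a is not None and c is not None:
        return (a - c) % P             # (2X1 + X2) - (X1 - X3)
    if b is not None and c is not None:
        return (b + c) % P             # (X2 + 2X3) + (X1 - X3)
    return (a + b) * INV2 % P          # (2X1 + 2X2 + 2X3) / 2

# Exhaustive check over all source values in the field.
all_ok = True
for x1, x2, x3 in product(range(P), repeat=3):
    a, b, c = encodings(x1, x2, x3)
    s = (x1 + x2 + x3) % P
    if not (recover(a=a, b=b) == s and recover(a=a, c=c) == s
            and recover(b=b, c=c) == s):
        all_ok = False
```

If `all_ok` remains true after the loop, every pair of received values determines the sum for this choice of `P`.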
The encoding specified in the legend denotes the encoding to be used on the appropriate subgraphs.

Fig. 5. A possible instance of $G_{aux}$ when the degree sequence of the terminals is $(3,2,1)$. The encoding specified in the legend denotes the encoding to be used on the appropriate subgraphs.

case analysis depending upon the degree sequence of nodes $t'_j$, $j = 1, \ldots, 3$ in $G_{aux}$. The degree sequence is specified by a 3-tuple, where we note that the sum of the entries has to be 6.

a) The degree sequence is a permutation of $(0,3,3)$. This only happens if the terminal label of all colors, $c'_i$, $i = 1, \ldots, 3$ is the same and in turn implies that the source label of each color is distinct, i.e., the source colors include $(s_1,s_2)$, $(s_2,s_3)$ and $(s_1,s_3)$. In this case, greedy encoding (cf. Definition 1) works for the two terminals in the color support. This is because each terminal will obtain $X_1 + X_2$, $X_2 + X_3$ and $X_1 + X_3$ at its leaves (using Claims 3 and 4) from which the terminal can compute $2\sum_{i=1}^{3} X_i$. The remaining terminal is not connected to any $(2,2)$ leaf, which implies that all its leaves contain singleton values, from which it can compute $\sum_{i=1}^{3} X_i$.

b) The degree sequence is $(2,2,2)$. This only happens if all the terminal labels of the colors are distinct, i.e., the terminal labels are $(t_1,t_2)$, $(t_2,t_3)$ and $(t_1,t_3)$. Now consider the possibilities for the source labels.

If there is only one source label, then greedy encoding ensures that the sum of exactly two of the sources reaches each terminal. The connectivity condition guarantees that the remaining source is available as a singleton at a leaf of each terminal. Therefore we are done.

If there are exactly two distinct source colors, then we argue as follows. On the subgraphs induced by the colors with the same source label, perform greedy encoding. On the remaining subgraph, propagate the remaining useful source. We illustrate this with an example that is w.l.o.g. Suppose that the colors are $(s_1,s_2,t_1,t_2)$, $(s_1,s_2,t_2,t_3)$ and $(s_2,s_3,t_1,t_3)$. We perform greedy encoding on the subgraphs of the first two colors, and only propagate $X_3$ on the subgraph of the third color. As shown in Figure 4, this means that terminals $t_1$ and $t_3$ are satisfied. Note that the connectivity condition dictates that $t_2$ has to have a leaf that has a singleton $X_3$, therefore it is satisfied as well.

to be $> 2$.

(0) Case 0. There are no colors in $G$. This implies that there are no $(2,2)$ nodes in $G$ and thus all terminals $t_j$ have distinct leaves holding $X_1$, $X_2$, and $X_3$ respectively. It suffices to design a simple code on the paths from those leaves to $t_j$ which enables $t_j$ to recover the sum $X_1 + X_2 + X_3$.

(i) Case 1. There is only one color in $G$. In this case perform greedy encoding (cf. Definition 1) on the r-edges. We show that each terminal can recover $\sum_{i=1}^{3} X_i$ from the content of its leaves. W.l.o.g., suppose that the color is $(s_1,s_2,t_1,t_2)$. Using Claim 3, this means that both $t_1$ and $t_2$ have leaves of this color. The greedy encoding implies that $t_1$ and $t_2$ can obtain $X_1 + X_2$ from the corresponding leaves. Moreover, both $t_1$ and $t_2$ have a leaf containing a singleton $X_3$, because of the connectivity requirements. Therefore, they can compute $\sum_{i=1}^{3} X_i$. The terminal $t_3$ has only singleton leaves, such that there exists at least one $X_1$, $X_2$ and $X_3$ leaf. Thus it can compute their sum.

(ii) Case 2. There exist exactly three distinct colors in $G$. It is useful to introduce an auxiliary bipartite graph that denotes the existence of the colors at the leaves of the different terminals. This bipartite graph, denoted $G_{aux}$, is constructed as follows. There are three nodes $t'_i$, $i = 1, \ldots, 3$ that denote the terminals on one side and three nodes $c'_i$, $i = 1, \ldots, 3$ that denote the colors on the other side. If the color $c'_i$ has $t_j$ in its support, then there is an edge between $c'_i$ and $t'_j$, i.e., $t_j$ has a leaf of color $c'_i$. The following properties of $G_{aux}$ are immediate.
– Each $c'_i$ has degree 2.
– Each $t'_i$ has degree at most 3.
– Multiple edges between nodes are disallowed.
Note that there are exactly three possible source colors ($(s_1,s_2)$, $(s_2,s_3)$ and $(s_3,s_1)$) and three possible terminal colors ($(t_1,t_2)$, $(t_2,t_3)$ and $(t_3,t_1)$). We now perform a

Finally, suppose that there are three distinct source colors. In this case we use the encoding specified in Table I on the subgraphs of each source color. It is clear on inspection that $\sum_{i=1}^{3} X_i$ can be recovered from any two of the received values (as from any two of the linear combinations stated, one can deduce the sum $X_1 + X_2 + X_3$).

Fig. 6. An instance of $G_{aux}$ when there exist exactly two distinct colors under case 3, such that the terminal labels of the colors are the same.

Fig. 7. An instance of $G_{aux}$ when there exist exactly two distinct colors under case 3, such that both the source labels and the terminal labels of the colors are different.

c) The degree sequence is a permutation of $(1,2,3)$. In this case, the degree sequence dictates that there have to be two terminals that share two colors. This implies that the source label of those colors has to be different. For the subgraphs induced by these colors, we use the encoding proposed in Table I. For the subgraph induced by the remaining color, we perform greedy encoding. For example, suppose that the colors are $(s_1,s_2,t_1,t_2)$, $(s_2,s_3,t_1,t_2)$ and $(s_2,s_3,t_1,t_3)$. As shown in Figure 5, $t_1$ and $t_2$ are clearly satisfied (even without using the information from color $(s_2,s_3,t_1,t_3)$). Terminal $t_3$ has to have a singleton leaf containing $X_1$ by the connectivity condition and is therefore satisfied.

Together, these arguments establish that in the case when there are three colors, all terminals can be satisfied.

(iii) Case 3. There exist more than three distinct colors in $G$. Note that if there are at least four colors in $G$, then (a) there are two colors with the same terminal label, since
there are exactly three possible terminal labels, and (b) for the colors with the same terminal labels, the source labels necessarily have to be different. Our strategy is as follows. For the terminals that share two colors, use the encoding proposed in Table I. If the remaining terminal has access to only one source color, then use greedy encoding and note that this terminal has to have a singleton leaf. If it has access to at least two source colors, simply use the encoding in Table I for it as well.

It remains to develop the argument in the case when there are exactly two distinct colors in $G$. For this we need to explicitly use the fact that there are two edge-disjoint paths between each $s_i - t_j$ pair.

Lemma 5: Consider a graph $G$, with sources, $s_i$, $i = 1, \ldots, 3$, and terminals $t_j$, $j = 1, \ldots, 3$, such that (a) it does not have any $(3,3)$, $(2,3)$ or $(3,2)$ nodes, and (b) there exist at least two $s_i - t_j$ paths for all $i$ and $j$. Consider the set of all $(2,2)$ nodes in $G$ and their corresponding colors. If there

argument can be made for $s_3$) and $t_2$. Each of these paths has a leaf for $t_2$. If one of the leaves contains a singleton $X_1$ (i.e., receives $X_1$ in the clear), then performing greedy encoding on the two colors works since $t_2$ obtains $X_1 + X_2$, $X_1$ and $X_2 + X_3$ and the other terminals will obtain singleton leaves that satisfy their demand. Likewise, if there is a singleton leaf containing $X_3$ on the vertex-disjoint paths from $s_3$ to $t_2$, then greedy encoding works.

Thus, the corresponding leaves of $t_2$ must be of type $(2,2)$. This implies that there are at least four distinct leaves of $t_2$ of type $(2,2)$, two of color $(s_1,s_2,t_1,t_2)$ and two of color $(s_2,s_3,t_2,t_3)$. We now conclude our proof by the following claims.

Consider the subgraph induced by nodes colored by one of the colors above, w.l.o.g. $(s_1,s_2,t_1,t_2)$, in $G$ together with the $(1,\cdot)$ nodes connected to either $s_1$ or $s_2$ in $G$. Denote this subgraph by $G'$.
Consider a random linear network code on the nodes of $G'$ (namely, each node outputs a random linear combination of its incoming information over the underlying finite field of size $2^m$). We show, with high probability (given $m$ is large enough), that such a code allows both $t_1$ and $t_2$ to receive two linearly independent combinations of $X_1$ and $X_2$ at their leaves. An analogous argument also holds for $t_2$ and $t_3$ when considering the color $(s_2,s_3,t_2,t_3)$ and the information $X_2$ and $X_3$. This suffices to conclude our assertion.

Claim 5: Let $u$ be any leaf in $G'$. Let $U = \alpha X_1 + \beta X_2$ be the incoming information of $u$. With probability $(1 - 2^{-m+1})^{|V|}$ both $\alpha$ and $\beta$ are not zero.

Proof: Denote by $C = \{c_i\}$ the collection of coefficients used in the random linear network code on $G'$. Namely, each $c_i$ is uniformly distributed in $GF(2^m)$, and the information on each edge $e$ is a linear combination of its incoming information using coefficients from $C$ (each coefficient in $C$ is used only once).

exist exactly two distinct colors in $G$, then there exists a set of coding vectors such that each terminal can recover $\sum_{i=1}^{3} X_i$.

Proof: As in the proof of Lemma 4, we argue based on the content of the leaves of the terminals. Suppose that the auxiliary bipartite graph $G_{aux}$ is formed. If both the colors have the same terminal label (see Figure 6 for an example), then it is clear that the encoding in Table I on the subgraphs induced by the colors suffices for the corresponding terminals. The third terminal has singleton leaves corresponding to each source and can compute $\sum_{i=1}^{3} X_i$.

Another possibility is that the terminal labels of the colors are different, but the source labels are the same. It should be clear that this case can be handled by greedy encoding on the colors.

The situation is more complicated when the terminal and source labels of the colors are different, see for example Figure 7. In the case depicted, greedy encoding does not work since it satisfies $t_1$ and $t_3$ but not $t_2$.
W.l.o.g., we assume that the colors are $(s_1,s_2,t_1,t_2)$ and $(s_2,s_3,t_2,t_3)$. Now, we know that there exist two vertex-disjoint paths between $s_1$ (a similar

It is not hard to verify that $\alpha$ is a multivariate polynomial in the variables in $C$ of total degree $\ell$, where $\ell$ is the length of the longest path between $s_i$ and $u$ (here $i = 1, 2$). Namely,