Skeleton and fractal scaling in complex networks K.-I. Goh, G. Salvi, B. Kahng, and D. Kim School of Physics and Center for Theoretical Physics, Seoul National University, Seoul 151-747, Korea (Dated: February 2, 2008) 6 Wefindthatthefractalscalinginaclassofscale-freenetworksoriginatesfromtheunderlyingtree 0 structure called skeleton, a special typeof spanning tree based on the edge betweenness centrality. 0 Thefractalskeletonhasthepropertyofthecriticalbranchingtree. Theoriginalfractalnetworksare 2 viewed as a fractal skeleton dressed with local shortcuts. An in-silico model with both the fractal n scaling and the scale-invariance properties is also constructed. The framework of fractal networks a is useful in understandingtheutility and theredundancyin networked systems. J 3 PACSnumbers: 89.75.Hc,05.45.Df,64.60.Ak 1 ] The emerging unifying concepts such as the small- betweenness centralities [11, 12] or loads [13]. The re- h worldproperty[1],thescale-freebehavior[2],andthehi- maining edges in the system are called shortcuts, which c e erarchicalmodularity[3]nowconstituteourbasicunder- contribute to forming loops. The skeleton of a scale- m standing of the organization of complex networked sys- free network is also scale-free but with different γ. Since tems, which appear in as diverse examples as the world- the betweenness centrality is related to the amount of - t wide web, the social networks, and the biochemical re- information flow along a given edge, the skeleton can be a t action networks inside cells. The small-world property consideredasthe“communicationkernel”ofthenetwork s . referstotheonethattheaverageseparationhDibetween [10]. If a network is organized in a modular way, as it is t a pairs of vertices in the network scales at most logarith- believedtobe soforthe world-widewebandthe biologi- m mically in the total number of vertices N in the system, calsystems, the inter-modular connections offer commu- - hDi ∼ lnN. The scale-free behavior means the lack of nication channels across the modules, thus gaining high d characteristic scales in the number of links k a vertex betweennesscentralities. Byconstruction,theskeletonis n has, called the degree, manifesting itself in the form of composed preferentially of such high-betweenness inter- o c a power-law degree distribution pd(k)∼k−γ for large k. modular connections, which will preserve the modular [ Recent discovery of fractal scaling and topological self- structure while greatly simplifying the complexity. Fur- similarity in the world-wide web and the metabolic net- thermore, if the modular structure is distinct enough, 2 v works[4], however,raiseda new perspective on our view i.e., there is a rather clear-cut separation between mod- 2 ofsuchnetworkedsystems. Thefractalscalingstandsfor ules, we can expect that even a random spanning tree 3 the power-law relation between the minimum number of can capture the modular structure. Thus by looking at 3 boxesN neededtocovertheentirenetworkandthesize thepropertiesofitsspanningtrees,wecanvisualizemore 8 B of the boxes ℓ , easily the topological organizationof the network. 0 B 5 N (ℓ )∼ℓ−dB, (1) With the underlying skeleton and random spanning 0 B B B tree,hereweperformthefractalscalinganalysisbymea- / t with a finite fractaldimensiond [5]. The self-similarity suring N (ℓ ) for several real-world networks and net- a B B B m here refers to the scale-invariance of the degree distri- work models [14]. Comparison of the fractal scalings bution under the coarse-grainingwith different box sizes in each original network with the corresponding span- - d (lengthscales)aswellasundertheiterativeapplicationof ningtreesrevealsdistinctpatternsaccordingtothepres- n the coarse-graining(the network renormalization) [4, 6]. ence or the absence of fractality in the network. For the o It has been observed, however,that not all networks are fractal networks, such as the world-wide web [Fig. 1(a)] c : fractal and the most random network models proposed [15],themetabolicnetworkofEscherichia coli[Fig.1(b)] v yet are not fractal, either. This poses a fundamental [16], and the protein interaction network of Homo sapi- i X question on the origin of the fractal scaling observed in ens[Fig.1(c)][17],thenumbersofboxesneededtocover r the real-world networks [4, 7, 8, 9]. In this Letter, we theoriginalnetworkanditsskeletonarealmostthesame. a showthat the fractalpropertyofthe networkcanbe un- Moreover, the random spanning tree, while possessing a derstood from its underlying tree structure. different statistics of N , shows nevertheless the same B Whilehighlyentangledasanetworklooks,amoresim- fractal dimension d . This is surprising, because in the B plestructureisembeddedunderneathit,thatisthespan- world-wideweb,forexample,morethan2N edges(short- ning tree. A spanning tree is a tree composed of N −1 cuts) are added onto the skeleton (the average degree of edges in a way that they connect all the N vertices in the world-wide web is 6.7), which is a tremendous num- the network. Of particular significance is the so-called ber in the graph-theoretical sense, and by no means a skeleton [10] of a network. The skeleton is a particular minuteperturbation. Sucharobustnessoffractalscaling spanningtree,whichisformedby the edgeswith highest in the world-wide web shows that even though the net- 2 100 100 100 100 100 10-1 (a) (b) (c) 10-1 (d) 10-1 (e) 10-1 N 10-2 N N 10-1 N N 10-2 Nl()/BB1100--43 Nl()/BB10-2 Nl()/BB10-2 Nl()/BB1100--32 Internet Nl()/BB1100--43 s γt=a2ti.c3 10-5 WWW 10-3 E m. ectoalbiolic H p.r osatepiinens 10-4 10-5 1 2 4 8 16 32 64 128 1 2 4 8 16 32 1 2 4 8 16 32 0 5 10 15 20 25 30 35 0 5 10 15 20 25 30 35 40 45 l l l l l B B B B B 6 25 14 4 12 hing ratio 45 (a′) hing ratio 1250 (b′) hing ratio 11802 (c′) hing ratio 23..553 (d′) hing ratio 180 (e′) nc 3 nc nc nc 2 nc 6 Mean bra 12 Mean bra 150 Mean bra 246 Mean bra 01..551 Mean bra 24 0 0 0 0 0 0 10 20 30 40 50 60 70 0 5 10 15 20 25 30 0 5 10 15 20 25 30 35 0 5 10 15 20 25 0 5 10 15 20 25 30 35 40 Distance from the root Distance from the root Distance from the root Distance from the root Distance from the root FIG. 1: (Color) (a–e) Box counting analysis of original networks (◦, red)) and their skeleton (▽, blue) and random spanning tree (△, orange). Shown are cases for the world-wide web (a), the metabolic network of Escherichia coli (b), and the protein interaction network of Homo sapiens (c), the Internet at the autonomous systems level as of the year 2004 (d), and the static model network with γ = 2.3 and hki = 4 (e). In (a–c), we also show two guidelines for the fractal scaling of the original network (black) and its random spanning tree (gray). Slope of each guideline is (a) −4.1, (b) −3.5, and (c) −2.3. In (d–e), ′ ′ the black line is a fit to the exponential function and the gray one is to a power law. (a–e) Branching analysis. The mean branchingnumberasafunctionofdistancefrom theroot fortheskeleton(▽,blue)andtherandomspanningtree(△,orange) of the networks in (a–e). For (a′–c′), both the skeleton and the random spanning tree fulfill the criticality condition, hmi=1 (horizontal line), as the distance from the root increases, while for (d′–e′), themean branching numberof the skeleton decays to zero with no plateau at hmi=1. workisfarfrombeingatree,theshortcutsaredistributed skeletonisrequiredforanetworktobeafractal: Thetree in a way that they preserve the fractality and modular- structuressuchastheskeletonandtherandomspanning ity. In other words, shortcuts are mainly present inside tree may be seen as generated through a multiplicative modules and the connections between different modules branching process starting from a root vertex [19]. At are largely made through the skeleton. This topologi- each branching step, each vertex born in the previous cal structure can be measured by the fraction of intra- step generates m offsprings with probability b . The m branch shortcuts among the total number of shortcuts, criticality condition means the average branching num- a branch being the subtree connected to the most con- ber, nected vertex. We find that the ratio is 0.78, 0.33, and ∞ 0.45 for Figs. 1(a), 1(b), and 1(c), respectively. On the hmi≡ X mbm =1. (2) other hand, other networks exhibit different features in m=0 the fractal scaling analysis. For example, the Internet Thus the branching tree grows perpetually with off- autonomous systems [Fig. 1(d)] or the static model with springs neither flourishing nor dying out. In this case, γ = 2.3 and hki = 4 [Fig. 1(e)], the box counting num- when b ∼ m−γ, the number of vertices s in the tree ber of the original network and the skeleton decays with m scales with its linear size t in a power-lawform as s∼tz ℓ much faster than that of the random spanning tree, B with z = (γ −1)/(γ −2) for 2 < γ < 3 and z = 2 for andthe fractionofintra-branchshortcutsissmall: 0.087 γ >3 [19, 20], andthe tree structure is fractalwith frac- for Fig. 1(d) and 0.015 for Fig. 1(e). Also for the social tal dimension d = z. Such a critical branching tree is networks, such as the actor network and the collabora- B similar in the topological characteristics to the homoge- tion network, the N (ℓ ) curves are appreciably differ- B B neous scale-free tree network proposed in Ref. [21]. To ent from those of the skeletons, implying that the global check the validity of our suggestion, we examine if the topology of the social network is highly interwoven on a criticality condition is fulfilled for the skeleton and the large scale to form a more compact structure [18]. random spanning trees of the four real-world networks ′ ′ The scaling behavior of the box counting relation and the static model in Figs. 1(a)–1(e). Indeed, for the ′ ′ Eq. (1) in Figs. 1(a)–1(c) for the original network and fractalnetworks[Figs.1(a)–1(c)],boththeskeletonand its skeletonsuggeststhat the fractalpropertyof the net- the randomspanning tree fulfill the criticality condition, work originates from that of the skeleton. In addition, even though in reality, the dynamic origins of their for- we argue here that the criticality in the topology of the mations may well be more complicated than the pure 3 branchingdynamics. Inouranalysis,therootistakenas themostconnectedvertexinthetree. Ontheotherhand, (a) (b) fornon-fractalnetworks,themeannumberofbranchesof the skeleton decays to zero rapidly as the distance from ′ ′ the root increases [see Figs. 1(d) and 1(e)]. A sim- ilar behavior is observed for the actor network as well [18]. Thus the actor network is not a fractal. However, the random spanning tree satisfies the criticality condi- tion in all cases, suggesting the generic fractal structure of this kind of trees as shown in Ref. [22]. In short, a fractalnetworkcontainsafractalskeletonunderneathit, whichisperturbedbylocalshortcuts,thuspreservingits fractal property. Incorporatingallthefindingssofar,wesetupafractal (c) (d) network model. The model is based on the multiplica- tive branching tree [19]. We first introduce an exponent γ for the branching probability b ∼ m−γ (m ≥ 1), m designed to produce the desired power-lawdegree distri- butionp (k)∼k−γ withγ >2. Thebranchingprobabil- d ity is properly normalized to be critical, i.e., it satisfies Eq.(2). Next,aparameterpisintroducedtocontrolthe number of shortcuts added, and hence the mean degree of the network. One final parameter q accounts for the relative frequency of the local and the global shortcuts. The construction of the model network proceeds as fol- lows: i) A tree is grown by the multiplicative branching (f) rulewithbranchingprobabilityb . ii)Afterthebranch- m ing process, every vertex increases its degree by a factor p and attempts to make shortcuts to its local neighbors. iii) For each successful shortcut in ii), with probability q, we replace it by reconnecting it to a randomly chosen vertex,not restrictedto its localneighbors. In the latter case, we choose the vertex with a weight in proportion to its degree in the branching tree, so as to maintain the same power-law scaling of the degree distribution. This FIG.2: (a)Uncorrelatedscale-freetreewiththedegreeexpo- rule is schematized in Fig. 2(e) [18]. Fig. 2(a) is an il- nentγ =2.3andthenumberofverticesN =164. Itisgrown lustrative example of a branching tree of size N = 164 by the multiplicative branching rule [19], with the branching and γ = 2.3. The vertices of the tree are colored ac- probabilitybm ∼m−γ. (b)Fractalmodelnetworkcreatedby adding local shortcuts (green) to the tree in (a). (c) A non- cording to which box they belong to in a particular box- fractal model network created by adding random shortcuts covering with ℓ = 2. For such a tree, even a simple B (blue)allowing global connection. (d)Thegeneric configura- graph drawing algorithm, such as Pajek [23], can cap- tionofthenetworkin(c)generatedbyPajek,andtheabsence ture its inherent hierarchical structure. Fig. 2(b) shows ofinherentmodularstructure. In(b–d),thecolorofeachver- the fractal network structure with the addition of local tex is that of the corresponding vertex in (a). (e) Schematic shortcuts, where we use p = 0.5 and q = 0. The hier- illustration of the growth rule of the fractal network model. Theshadedregionindicatestherestofthenetworkgenerated archicalmodular structure presentedin the tree network sofar. (f)Thefractalscalinganalysisofnetworkswithlarger [Fig.2(a)]persists. Ontheotherhand,thenetworkwith size N = 311,043, constructed in the same manner as those q = 1 shown in Fig. 2(c), in which the same number of in (a)-(c). Squares (blue) correspond to the fractal network shortcutsasinFig.2(b)areattached,doesnotretainthe model in (b) with p=0.5, circles (black) to its skeleton, and modularity. Suchanabsenceofmodularitycanbereadily the diamonds (red) to the non-fractal network model in (c) seen in Fig. 2(d), a different layout of the same network with q=1. Solid line is the guideline with slope ≈−2.8. asFig.2(c)generatedby Pajekinanunsupervisedman- ner. Consequently, the fractal property is preserved in the caseofq =0,whereasitisnotforq =1,asis clearly trees such as the Baraba´si-Albert tree [2] and the geo- revealed by the box counting analysis in Fig. 2(f). metrically growing scale-free tree [24]. Such models are It is noteworthy that there exists an important dis- notfractal,becausetheydonotfulfillthecriticalitycon- tinction between the present model and other scale-free dition. Indeed, their mean branching rate decreases to 4 zeromonotonically as the branching proceeds,without a furtherstudiesontherenormalizationandthe universal- plateauathmi≈1 [18]. Suchatype oftreenetworkwas ity in complex networks. classified as “causal” trees in Ref. [25]. Note also that This work is supported by the KRF Grant No. R14- the fractal trees are not small-worlds. However, with a 2002-059-01000-0 in the ABRL program funded by the small number of global shortcuts, e.g., q = 0.01 in our KoreangovernmentMOEHRD. G.S. gratefully acknowl- model with N ∼ 3×105, the network turns into a small edges financial support from the Swiss National Science world. This is seen in the mean box mass versus ℓ plot B Foundation under the fellowship PBEL2-106182. in the cluster growing method analysis [18]. The fractal network model is self-similar. To check it, weperformthecoarse-grainingthroughtheboxcounting method by replacing each box with a single super-node, and connecting them if any of their member vertices is [1] D. J. Watts, S. H. Strogatz, Nature (London) 393, 440 connected [4]. We find that the degree distribution of (1998). the fractal network model is invariant under the coarse- [2] A.-L. Barab´asi, R.Albert, Science 286, 509 (1999). graining by the boxes with different sizes. In addition, [3] E. Ravasz, A. L. Somera, D. A. Mongru, Z. N. Oltvai, A.-L. Barab´asi, Science 297, 1551 (2002). the degree distribution is invariant under successive ap- [4] C. Song, S. Havlin, H. A.Makse, Nature (London) 433, plications of the coarse-graining transformation [18]. It 392 (2005). isinterestingtonote thatsomenetworksareself-similar, [5] J. Feder, Fractals (Plenum, NewYork,1988). that is, exhibit the scale-invariant degree distribution, [6] B. J. Kim, Phys. Rev.Lett. 93, 168701 (2004). yet are not fractal. Typical example of such networks is [7] S. H. Strogatz, Nature (London),433, 365 (2005). the Internet[18]. So the fractality andthe self-similarity [8] S.-H.Yook,F.Radicchi,H.Meyer-Ortmanns,Phys.Rev. do not always imply each other in complex networks. E 72, 045105(R) (2005). [9] C. Song, S.Havlin, H. A.Makse, cond-mat/0507216. The framework of fractal network is helpful to under- [10] D.-H.Kim,J.D.Noh,H.Jeong,Phys.Rev.E70,046126 stand, for example, the utility and the redundancy in (2004). themetabolicnetworksfromapurelytopologicalaspect. [11] L. C. Freeman, Sociometry, 40, 35 (1977). The high flux backbone in the metabolic network of E. [12] M. Girvan, M. E. J. Newman, Proc. Natl. Acad. Sci. coliobtainedthroughthefluxbalanceanalysiswasshown U.S.A. 99, 7821 (2002). to be composed of many branches with few inter-branch [13] K.-I. Goh, B. Kahng, D. Kim, Phys. Rev. Lett. 87, 278701 (2001). connections and to merge into the biomass reaction[26]. [14] We use a slightly different method of box counting, the Obviously, its topological shape resembles the branch- burning method, that gives the same result as in [4] but ing tree skeleton rooted from a vertex with the largest is easier to implement. numberofconnectionsifthe directioninedgeis ignored. [15] R. Albert, H. Jeong, A.-L. Barab´asi, Nature (London) On the other hand, recent in silico flux analysis [27] has 401, 130 (1999). shownthatthemetabolicnetworkofE.colicontainshigh [16] H. Jeong, B. Tombor, R. Albert, Z. N. Oltvai, A.-L. density of backup reactions (redundancies) for a given Barab´asi, Nature (London) 407, 651 (2000). condition. Such a reaction is barely used in the normal [17] Database of Interacting Proteins, http://dip.doe-mbi.ucla.edu/dip/. condition, but takes up a high flux when a certain reac- [18] K.-I. Goh, G. Salvi, B. Kahng, and D. Kim (unpub- tion on the backbone is blocked. When the simultane- lished). ous blockade of such a reaction pair blocks the biomass [19] T. E. Harris, Theory of Branching Processes (Springer- production, they are called synthetic lethal [28]. Most Verlag, Berlin, 1963). synthetic lethal reactions are located very close to each [20] K.-I.Goh,D.-S.Lee,B.Kahng,D.Kim,Phys.Rev.Lett. other, being apart in three reaction steps or less along 91, 148701 (2003). themetabolicnetwork. Alsotheyaremostlyinthesame [21] Z.Burda, J.D.Correia, A.Krzywicki, Phys.Rev.E 64, 046118 (2001). functional category. Thus the reactions with high (low) [22] L. A. Braunstein, S. V. Buldyrev, R. Cohen, S. Havlin, flux in wild type can be regarded as the edges on the H. E. Stanley,Phys. Rev.Lett. 91, 168701 (2003). skeleton (shortcuts) in the framework of the fractal net- [23] Pajek graph drawing software, work. http://vlado.fmf.uni-lj.si/pub/networks/pajek/. Thecriticalbranchingtreethatcanbefoundinvarious [24] S. Jung, S. Kim, B. Kahng, Phys Rev. E 65, 056101 phenomenasuchasearthquakeprocesses,populationand (2002). biologicaldynamics,epidemics,socialcascades,etc.,also [25] P. Bialas, Z. Burda, J. Jurkiewicz, A. Krzywicki, Phys. Rev. E67, 066106 (2003). appears in the skeleton of fractal networks, as we found [26] E. Almaas, B. Kov´acs, T. Vicsek, Z. N. Oltvai, A.-L. here. While the evolving process of the fractal networks Barab´asi, Nature (London) 427, 839 (2004). would be complex and diverse depending on the specific [27] C.-M. Ghim, K.-I. Goh, B. Kahng, J. Theor. Biol. 237, systems, the underlying structure, the skeleton, has the 401 (2005). topology of the critical branching tree. Identifying such [28] D. Segr´e, A. DeLuna, G. M. Church, R. Kishony, Nat. a simple structure underneath is a step forward towards Genet. 37, 77 (2005).