On the Diversity of Non-Linear Transient Dynamics in Several Types of Complex Networks

Luciano da Fontoura Costa
Institute of Physics at São Carlos, University of São Paulo, PO Box 369, São Carlos, São Paulo, 13560-970 Brazil
(Dated: 15th Dec 2007)

Dynamic systems characterized by diversified evolutions are not only more flexible, but also more resilient to attacks, failures and changing conditions. This article addresses the quantification of the diversity of non-linear transient dynamics obtained in undirected and unweighted complex networks as a consequence of self-avoiding random walks. The diversity of walks starting at a specific node i is quantified in terms of a signature composed by the entropies of the node visit probabilities along each of the initial steps. Six theoretical models of complex networks are considered: Erdős-Rényi, Barabási-Albert, Watts-Strogatz, a geographical model, as well as two recently introduced knitted networks formed by paths. The random walk diversity is explored at the level of network categories and of individual nodes. Because the diversity at successive steps of the walks tends to be correlated, a principal component analysis is systematically applied in order to identify the more relevant linear combinations of the diversity entropies and to obtain optimal dimensionality reduction. Several interesting results are reported, including the facts that the transient diversity tends to increase with the average degree for all considered network models and that the Watts-Strogatz and geographical models tend to yield diversity entropies which increase more gradually with the number of steps, contrasting sharply with the steep increases verified for the other four considered models. The principal linear combination of the diversities identified by the principal component analysis method is shown to allow an interesting characterization of individual nodes as well as the partitioning of networks into subgraphs of similar diversity.

PACS numbers: 89.75.Fb, 02.10.Ox, 89.75.Da
arXiv:0801.0380v2 [physics.soc-ph] 7 Jan 2008

'There is a city where you arrive for the first time; and there is another city which you leave never to return.' (Invisible Cities, I. Calvino)

I. INTRODUCTION

The diversity of dynamics plays a key role in most aspects of nature, which has ultimately resulted in a wealth of species along evolution as well as a myriad of human cultural manifestations. Because of their capacity to represent discrete structures and scaffold dynamics, complex networks (e.g. [1, 2, 3, 4, 5]) have become the key paradigm in theoretical and applied studies in complex dynamic systems, finding applications in an impressive range of problems (e.g. [5]). A great deal of the current attention in this area concentrates not only on characterizing the topological properties of networks (e.g. [5]), but also on investigating how the latter constrain or even define the dynamics unfolding in the networks (e.g. [3, 4]).

With a tradition extending back over several decades, the study of the dynamics of random walks represents one of the main paradigms in statistical physics and dynamical systems. Traditional random walks are usually performed by one or more agents choosing with uniform probability between the outgoing edges at each node. Therefore, random walks represent one of the least intelligent ways to move in a network, involving no additional criterion other than uniform chance. Still, such a dynamics is directly related to the important linear dynamics of diffusion (e.g. [6, 7]), which plays an important role in a large number of natural dynamical processes (e.g. reaction-diffusion and the Schrödinger equation). The dynamics of traditional, linear, random walks on complex networks has been investigated in several articles (e.g. [8, 9, 10, 11, 12, 13]). Several other types of random walks have also been considered in the literature (e.g. [14, 15, 16]). For instance, the category of self-avoiding random walks represents a particularly interesting situation in which the moving agent is not allowed to return to nodes and/or edges. As such, self-avoiding walks can be directly associated with the paths existing in the networks. By path, it is henceforth meant a sequence of adjacent [24] edges without repetition of node or edge. Paths are important because they provide the most effective way to connect the involved nodes (i.e. given M nodes, a path through them involves M − 1 edges). In addition, unlike random walks, random self-avoiding walks (path-walks for short) are non-linear and necessarily finite in finite networks, because the moving agent sooner or later has no way to proceed. The possibility of using self-avoiding walks to sample networks has been investigated in [15]. Paths and self-avoiding walks have recently been explored as dual motifs of star connectivity [17], as building blocks of networks [18], and for the characterization of networks (especially through the longest path) [19]. The transient dynamics of self-avoiding walks in uniformly random, small-world and scale-free networks has been studied in [20, 21], with special attention placed on the average number of such walks.

Random walks typically start from a node and proceed [25] until some stopping condition is met (e.g. a fixed number of steps or, in the case of self-avoiding walks, impossibility to proceed further). In this work all walks are performed for a pre-specified number of 10 steps, as we are interested in the transient dynamics. Consider now that the starting node has been fixed and several self-avoiding random walks are performed from that node. One interesting question regards how such walks are composed and distributed. For instance, one may be interested in the length of these walks (e.g. [19, 20]). A question of special relevance which has received little attention in the literature regards the diversity of the obtained walks and path-walks. By diversity, it is meant how much the walks differ from one another by incorporating distinct nodes and/or edges. In a previous approach to this problem, Herrero investigated the average number of self-avoiding walks defined in uniformly random and scale-free networks. In the present article, the diversity of self-avoiding walks starting at a specific node i is quantified in terms of the entropies of the node visit probabilities after the first S steps along the walks starting from each node i, giving rise to the diversity entropy (see Figure 1). Because of the non-linear nature of this type of walk and our interest in obtaining information about each individual walk in several types of structurally diverse networks, the visit probabilities are estimated by performing several self-avoiding walks.

FIG. 1: Given a node i in a network (a), the entropies E(i,h) of the probabilities of visited nodes after the initial h steps (c) can be calculated by simulating several self-avoiding random walks starting from i. Therefore, a signature $\vec{f}$ (b) can be assigned to each node i which expresses the diversity of paths obtained at each step h.

Provided such a diversity of dynamics can be quantified, ideally in terms of a single measurement, a series of interesting analyses can be performed at several levels (e.g. from individual node to network category levels). Indeed, the diversity of walks is immediately related to a large number of important theoretical and practical aspects of complex network structure and dynamics. To begin with, the cases in which the path-walks are found to be mostly similar (i.e. little diversity) imply that the agent had little choice during its motion, and therefore little path redundancy is present in the network, starting from that node. At the same time, such situations will also be characterized as being highly efficient as far as node coverage is concerned (relatively few edges are required while visiting several nodes). Indeed, by recalling that all nodes in a path-walk must be distinct, a path-walk involving S+1 nodes will necessarily have S edges, which is the minimum number of connections required to connect those nodes. On the contrary, in case the path-walks are found to be strongly diverse, we can conclude that the progression of the agent is characterized by great freedom of choice and variety of transient dynamics. Consequently, because the path-walks cannot repeat nodes, we also have that the self-avoiding walks in this case will involve several diverse nodes. Therefore, nodes with high diversity constitute natural candidates as distributing sources (e.g. for information or mass). It is also important to observe that the dynamics diversity of a node provides information which is complementary to other measurements of network nodes. For instance, though diversity tends to be correlated with the node degree at the initial steps of the self-avoiding walks, such a correlation can be quickly lost as a consequence of the structural diversity at the progressive surroundings of the initial node. The walk diversity is also distinct from the betweenness centrality (e.g. [3, 5]) in the sense that chained nodes with high betweenness centrality will lead to low walk diversity. Therefore, diversity can be best thought of as a novel measurement which can complement previous approaches in the characterization of the structural properties of complex networks.

Many are the interesting applications of such diversity studies to real-world problems. For instance, in case the random walks are used to model the acquisition of knowledge or cultural values by the agent (in this case each node represents a knowledge or cultural fact, e.g. [16]), the diversity measurements can provide a sound basis for discussing how diverse the development of agents starting from similar backgrounds but subsequently exposed to different information will be. Another particularly interesting application concerns the objective quantification of the diversity of life and species along phylogenetics, as well as geographical exploration. In addition to its potential for the objective characterization of dynamics performed in complex networks, the quantification of the diversity of random walks and path-walks can also provide valuable indications about the structure of the respective networks. For instance, in case all path-walks are identical, we have a chain of nodes extending from the starting node. Contrariwise, a high diversity implies the presence of redundancies in the network.

Because the diversity entropies at subsequent steps tend to be correlated, the statistical method known as principal component analysis (PCA) [22] is systematically applied in this work in order to decorrelate those measurements. The PCA method provides an optimal stochastic linear transformation in the sense of concentrating the variation of the data along the first new random variables. In other words, the PCA transforms the original measurements into new features which are completely uncorrelated with one another. Because of the linear nature of PCA, the newly obtained measurements correspond to linear combinations of the original features, weighted so as to optimize the concentration of variance along the first new variables.

The manuscript starts by presenting the basic concepts, adopted network models, as well as the definition of the diversity entropy and some of its properties. The results are presented with respect to the analysis of whole network categories, individual networks, and individual nodes.

II. BASIC CONCEPTS

This section describes the basic concepts and methods used in this article, including network representation and characterization, the 6 adopted complex network models, the definition and estimation of the diversity entropy signature, as well as the stochastic projection methods applied in order to decorrelate the signatures and achieve dimensionality reduction.

A. Complex Networks Representation and Characterization

An unweighted and undirected complex network, formed by N nodes and E edges, can be fully represented in terms of its adjacency matrix K, which is symmetric and has dimension N × N. Each existing edge (i,j) implies K(i,j) = K(j,i) = 1, with K(i,j) = K(j,i) = 0 indicating absence of that edge. Two edges are said to be adjacent whenever they share one of their extremities. A random walk corresponds to any sequence of adjacent edges $(i_1,i_2); (i_2,i_3); \ldots; (i_{p-1},i_p)$. A walk which does not repeat any edge or node, henceforth called self-avoiding random walk, defines a path in the network. The length of a walk or path is equal to the number of its constituent edges. The shortest path between two nodes is defined as one of the paths between those nodes which has the smallest length.

The immediate neighbors of a node i are those nodes which are connected to i through shortest paths of length 1. The degree of a node is equal to the number of edges emanating from that node. The node degree averaged within a network is called its average degree. Extremity nodes are henceforth understood as those with unit degree. As such, extremity nodes tend to determine the termination of many self-avoiding walks (there is no way back for the moving agent from that type of node). The clustering coefficient of a node i is the ratio between the number of undirected edges between the immediate neighbors of i and the maximum possible number of undirected edges among those nodes.

B. Complex Networks Models

Six theoretical models of complex networks are considered in the present work, including four traditional models, namely Erdős-Rényi (ER), Barabási-Albert (BA), Watts-Strogatz (WS) and a geographical model (GG), as well as two recently introduced knitted types of complex networks [18]: the path-transformed BA model (PA) and path-regular networks (PN). The ER, BA and WS networks are grown in the traditional way (e.g. [1, 2, 3, 4, 5]). The GG networks in this work are obtained by distributing N nodes within a square with uniform probability and connecting all nodes which are closer than a minimal distance d. The PA and PN networks are obtained as explained in [18]: the PA networks (path-transformed BA networks) are obtained by star-path transforming all nodes in an original BA network, and the PN (path-regular networks) model is easily obtained by defining paths involving all network nodes in random order and without repetition.

All networks considered in this article have N ≈ 100 and m ≈ 3 or m ≈ 5 (m is the number of spokes in the added nodes in the BA model), with average degree ⟨k⟩ ≈ 2m. The approximations are a consequence of the statistical variability of the models. For the same reason, the number of nodes N can vary slightly for the GG networks. Because the average degrees considered in this work are relatively large (well above the percolation critical value for ER), most of the nodes in each network belong to the largest connected component, which has been considered for all the analyses reported in this article. The total number of rewirings used in the WS case was equal to 0.1E.

C. Diversity Entropy and its Estimation

The diversity entropy is the measurement used in this article in order to quantify the diversity of the self-avoiding random walks obtained for each node i at each step h. Let p(i,j,h) be the probability that a node j is visited after h time steps while moving from the starting node i. Once a self-avoiding walk is terminated (i.e. the moving agent can proceed no further), the moving agent is understood to remain at the final node and contribute to the probabilities and diversities for all remaining steps. The diversity entropy of node i can now be defined as:

$$E(i,h) = -\sum_{j=1}^{N} p(i,j,h)\,\log\big(p(i,j,h)\big) \tag{1}$$

Given the starting node i, E(i,h) can be understood as the diversity entropy signature for that node (see Figure 1). Each such signature can be transformed into a single value, e.g. by taking the arithmetic or geometric average of its values considering all the steps h.

Figure 2 illustrates several particularly relevant situations regarding diversity entropy signatures. Because of the total absence of branches, the chain network in (a) yields a completely null diversity signature. This means total determinism in the sense that all self-avoiding random walks starting from i will be identical. The presence of a branch at step 3 in the structure in (b) implies the increase of the diversity entropy at this specific step. In the network in (c), the branch occurs at the first step, implying diversity entropy E(i,1) = log(3) ≈ 1.1, which remains for the two following steps (i.e. h = 2 and 3). Observe that the additional all-to-all connections between the nodes in the second and third steps have no effect in changing the respective diversity entropy, as they do not affect the respective probabilities p(i,j,3). The situation depicted in (d) involves self-avoiding random walks with different lengths, namely 1, 2 and 3. Because the moving agent is assumed to remain at its termination node, the diversity entropies do not change along the 3 initial steps. Though this assumption implies eventual degeneracy, such as obtaining the same diversity entropy signatures for the structures in (c) and (d), the distinction between such cases can be easily accomplished by considering additional measurements such as the length of the walks. Finally, the situation shown in (e) involves converging connections at steps 1 and 2, which contribute to reducing the diversity of the random walks.

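The definition in Eq. (1), together with the rule that a terminated agent stays at its final node, can be illustrated with a minimal Python sketch. It is not the author's code: the function name `diversity_signature`, the adjacency-dictionary representation `adj` and the default arguments are assumptions made for illustration only. The visit probabilities are estimated by sampling self-avoiding walks, in the spirit of the estimation described in this section.

```python
import math
import random
from collections import Counter


def diversity_signature(adj, start, steps=10, walks=200, rng=None):
    """Estimate E(start, h), h = 1..steps, by sampling self-avoiding walks.

    adj   : dict mapping each node to a list of its neighbours
            (undirected, unweighted network; hypothetical representation).
    walks : number of sampled walks used to estimate p(i, j, h).
    """
    rng = rng or random.Random(0)
    # counts[h - 1][j] = number of walks found at node j after h steps
    counts = [Counter() for _ in range(steps)]
    for _ in range(walks):
        current, visited = start, {start}
        for h in range(steps):
            options = [n for n in adj[current] if n not in visited]
            if options:  # otherwise the agent remains at its termination node
                current = rng.choice(options)
                visited.add(current)
            counts[h][current] += 1
    signature = []
    for c in counts:
        probs = [v / walks for v in c.values()]
        signature.append(sum(-p * math.log(p) for p in probs))  # Eq. (1)
    return signature


if __name__ == "__main__":
    # A chain: every walk from node 0 is identical, so all entropies vanish.
    chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    print(diversity_signature(chain, 0, steps=3))

    # Three branches leaving node 0 at the first step: E(0, 1) is close to
    # log(3) and stays at that value for the following steps.
    star = {0: [1, 2, 3], 1: [0, 4], 2: [0, 5], 3: [0, 6],
            4: [1], 5: [2], 6: [3]}
    print(diversity_signature(star, 0, steps=3, walks=6000), math.log(3))
```

In the second example each branch is a simple chain, so the entropy obtained at the first step is carried unchanged to the later steps, including the step at which the agent has already reached the end of its branch.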
Observe that the alternative assumption of removing the moving agent after it has reached a termination node would imply identical diversity entropy signatures for both structures in (d) and (e).

FIG. 2: Illustrations of diversity entropy signatures (the diversity entropies are shown in bold): (a) as the paths from node i are all equal for this case, the entropies are null for all values of h; (b) the divergence of edges at h = 3 implies the increase of the diversity entropy to log(3) ≈ 1.1 at that level; (c) the presence of all-to-all connections between the nodes at steps 2 and 3 has no effect in increasing the entropies at level h = 3; (d) because the moving agent remains at each terminal node after reaching it, the entropy does not change at the successive steps for this case; (e) converging edges (at the second and third steps in this particular example) can lead to a decrease of the diversity entropy along h.

Given a network with N nodes, the maximum diversity entropy obtained at any step is given when p(i,j,h) = 1/N, implying

$$W = -\sum_{j=1}^{N} \frac{1}{N}\,\log\!\left(\frac{1}{N}\right) = \log(N) \tag{2}$$

Figure 3 shows the maximum diversity entropies for several values of N. Therefore, as all networks considered in this article involve N ≈ 100, the diversity entropy is bounded from above by W = log(100) ≈ 4.61.

FIG. 3: The maximum diversity entropy which can be obtained for networks with N nodes.

A particularly interesting situation occurs when each node at each level h leads exclusively to a constant number ⟨k⟩ of new nodes in the subsequent level h+1 (see Figure 4). In this case, $p(i,j,h) = 1/\langle k \rangle^{h}$, so that

$$E(i,h) = -\sum_{j=1}^{\langle k \rangle^{h}} \frac{1}{\langle k \rangle^{h}}\,\log\!\left(\frac{1}{\langle k \rangle^{h}}\right) = h\,\log(\langle k \rangle) \tag{3}$$

Therefore, the diversity entropy will tend to increase (or remain null for ⟨k⟩ = 1) with h at constant rate log(⟨k⟩). This situation involves an infinite and completely regular network (i.e. each node has the same degree ⟨k⟩ + 1). As complex networks are often analyzed with respect to regular or nearly regular counterparts (e.g. ER model), it is useful to consider the above configuration as a reference. For instance, the situation in which the diversity entropy tends to increase almost linearly with h along an interval can be understood as an indication that the network is mostly regular along that interval. However, it should be borne in mind that linear increase of the diversity entropy can also be caused by other structural organizations in complex networks (i.e. constant increase of entropy does not necessarily imply network degree regularity, but the latter necessarily implies linear entropy increase).

FIG. 4: An example of a situation where the diversity entropy increases linearly with the steps h.

As the diversity entropy has been defined for each node i at each step h, it provides an individual signature associated to each node (see Figure 1), which can be valuable while investigating dynamics emanating from that node. However, it is often interesting to get an overall idea of the diversity entropy dynamics considering all the nodes in the networks. This can be immediately obtained in terms of the average and standard deviation of the diversity entropy at each step h, i.e.:

$$\langle E(h) \rangle = \frac{1}{N}\sum_{i=1}^{N} E(i,h) \tag{4}$$

$$\mathrm{Var}\{E(h)\} = \frac{1}{N}\sum_{i=1}^{N} \big(E(i,h) - \langle E(h) \rangle\big)^{2} \tag{5}$$

$$\sigma_{E(h)} = +\sqrt{\mathrm{Var}\{E(h)\}} \tag{6}$$

The algorithm adopted for picking a random path is simple and, given each starting node i, involves performing M self-avoiding random walks along the initial S steps (in this work, S = 10). Therefore, for each node i = 1,2,...,N, an accumulator array V of dimension S × N is kept in order to store the number of visits to each of the network nodes after starting from i. At every step h = 1,2,...,S along each of these M self-avoiding walks, the moving agent is found at a node j and the element V(h,j) of the accumulator array is incremented by one. After completing the M self-avoiding walks starting from node i, the probability of visits to nodes can be estimated as p(i,j,h) = V(h,j)/M. Recall that once the moving agent reaches a termination node, it remains there for all remaining steps. For N = 100, the situation considered for all networks in this article, the probabilities have been experimentally found to be stable to within 5% for M = 200, which is henceforth adopted.

D. Optimal Dimensionality Reduction by Stochastic Transformations

In several situations, especially for nearly uniform networks (i.e. with most nodes having similar properties, such as degree), the diversity entropies at neighboring steps of the signatures obtained for each node i will tend to be strongly correlated. Indeed, recall that each node i in the network will be mapped into a 10-dimensional feature vector (the diversity entropy signature), which lies in a relatively high dimensional space, impossible to visualize. It is possible, and useful, to reduce the dimensionality of such measurement spaces by using optimal stochastic linear transformations such as principal component analysis and canonical projections (e.g. [5, 22]).

The former of these approaches allows the measurement space to be optimally projected into m ≤ S dimensions, m ≥ 1, while maximizing the variation of the observations along the first transformed, new variables (e.g. [5, 22]). The transformed variables, which are linear combinations of the original measurements, are guaranteed to be completely uncorrelated. The latter transformation (i.e. canonical projections) allows the measurement space to be optimally projected into a smaller dimensional space while maximizing the separation of the categories of observations, in the sense of maximizing the interclass variation and minimizing the intraclass dispersion (e.g. [5, 23]). In both cases, the resulting transformed variables are linear combinations of the original measurements.

Though still rarely applied in complex network research, such optimal stochastic transformations can be a real help in organizing and simplifying the analysis and classification of complex networks (e.g. [5]). In this work, the principal component approach is used in order to decorrelate the diversity entropy signatures obtained for each of the nodes in a given complex network, while the canonical projections method is applied in order to obtain visualizations of the distribution of several realizations of 6 different types of complex network models. Additional information about the canonical projections analysis, which is mathematically more sophisticated than the principal component approach, can be found in [5, 23]. The principal component analysis, used to decorrelate the diversity entropies in this work, is described as follows.

Let $\vec{f}$ be the feature vector containing the S measurements obtained for each observation i = 1,2,...,N. In the present work, each node i (an observation) is mapped into a diversity entropy signature (the feature vector) of dimension S × 1. The elements C(i,j), i,j = 1,2,...,S, of the covariance matrix of such a dataset can be estimated as

$$C(i,j) = \frac{1}{N-1}\sum_{p=1}^{N} \big(v_p(i) - \mu_i\big)\big(v_p(j) - \mu_j\big) \tag{7}$$

where $v_p(a)$ is the a-th measurement of observation p and $\mu_a$ is the average of v(a) over the N observations, a = 1,2,...,S. Observe that C(i,j) = Var(i) whenever i = j. Also, we have that the covariance matrix C is necessarily symmetric.

Let $\gamma_i$, i = 1,2,...,S, be the eigenvalues of the covariance matrix, ordered so that $\gamma_1 \geq \gamma_2 \geq \ldots \geq \gamma_S$, and let $\vec{q}_i$ be the respectively associated eigenvectors. The principal component analysis can be obtained by performing the following stochastic linear transformation

$$\vec{g} = \begin{bmatrix} \longleftarrow & \vec{q}_1 & \longrightarrow \\ \longleftarrow & \vec{q}_2 & \longrightarrow \\ \ldots & \ldots & \ldots \\ \longleftarrow & \vec{q}_m & \longrightarrow \end{bmatrix} \vec{f} \tag{8}$$

where m ≤ S, m ≥ 1, and $\vec{f}$ and $\vec{g}$ have respective dimensions S × 1 and m × 1. Therefore, the new measurements (transformed variables) belong to a space of reduced dimensionality m ≤ S. The new measurements associated to the largest eigenvalues are called the main variables or components. Observe that each of the new measurements is a linear combination of the original measurements, while the eigenvalues $\gamma_i$, i = 1,...,m, correspond to the variances of the new measurements in $\vec{g}$. So, it is reasonable to include in the transformation matrix only the eigenvectors associated to eigenvalues which are particularly large, in order to encompass the greatest part of the original variation of the observations. Also, observe that the relative weight of each of the original measurements used in the linear combinations defining the new variables can provide an indication about the importance of the respective original measurements. Because the feature vectors considered in this work all have the same nature and potential dynamic range (recall that all the elements of the diversity entropy signature are entropies varying between 0 and log(N)), there is no need for preliminary standardization of the original measurements (e.g. [5]).

III. RESULTS AND DISCUSSION

The diversity entropy methodology has been applied at the level of network categories and individual nodes. In the former case, the overall diversity was characterized with respect to the average and standard deviation of the diversity entropies obtained for each realization of the networks. The latter investigation targets the estimation of the diversity entropy signature at the individual node level, which allows the partitioning of each network into subgraphs of similar diversity. These two types of investigations, at the network and individual node levels, are described in the respective following sections.

A. Network Level

We start our diversity investigation by looking at the averages and standard deviations of the diversity entropy signatures obtained for the realizations of each of the 6 considered network models. More specifically, a total of 50 realizations were performed for each of the six complex network models considering N = 100 and two average degrees: (a) ⟨k⟩ = 6 (i.e. m = 3) and (b) ⟨k⟩ = 10 (i.e. m = 5). For each of such realizations, 200 random path-walks were performed starting from each of the nodes, and the respective entropies E(i,h) were estimated for h = 1,2,...,10. The average ⟨E(h)⟩ and standard deviations $\sigma_{E(h)}$ of these diversity entropy values were obtained for each of the 50 network realizations for each of the 6 considered models and are shown in Figures 5 and 6 with respect to m = 3 and m = 5, respectively.

FIG. 5: The average ± standard deviations of the diversity entropies obtained for each of the networks considered for each of the six complex network models assuming N = 100 and m = 3 (i.e. ⟨k⟩ = 6).

FIG. 6: The average ± standard deviations of the diversity entropies obtained for each of the networks considered for each of the six complex network models assuming N = 100 and m = 5 (i.e. ⟨k⟩ = 10).

A series of interesting results can be inferred from the curves in those figures. First, observe that markedly distinctive curves and standard deviations were obtained for the networks belonging to each of the considered models. Two general behaviors can be distinguished in both figures: the steeper increase of the diversity entropy with h obtained for the ER, BA, PN and PA models as opposed to the more gradual increase verified for the WS and GG models. These two types of transient dynamics can be observed for both m = 3 and m = 5. In the case of the transient evolutions observed for the ER, BA, PN and PA models, the diversity entropy tended to reach stabilization near a higher plateau (with diversity entropy approximately equal to 4.2) after the three or four initial steps. This suggests that the self-avoiding paths in these networks tend to reach most nodes after just a few steps. The more gradual increase of entropy observed for the WS and GG models indicates that the moving agent takes substantially more time to cover a smaller portion of the nodes during the transient dynamics. This is a consequence of the fact that, though nearly regular (i.e. similar degrees for all nodes), these two types of networks are characterized by having pairs of nodes which are either connected through many short paths (adjacent nodes) or virtually unconnected. More informally, given two nodes i and j of a network, the adjacency between them can be quantified in terms of the number of short (i.e. up to a maximum length) paths interconnecting those nodes; the higher this number, the more adjacent the pair of nodes is.

Another interesting result regards the maximum entropy values, reached for large values of h. Except for the GG model, all other types of networks tended to reach entropies around 4.2 for both m = 3 and m = 5. Interestingly, quite similar limiting entropy values have been obtained for BA and PN networks irrespective of the average degree. By comparing the respective curves in the two figures, it becomes clear that the higher average degree (i.e. m = 5, implying ⟨k⟩ = 10) tended to reduce the standard deviations along the plateaux of the BA, WS, PN and PA cases. Similar standard deviations can be observed for the GG case, while higher deviations now characterize the plateaux for the ER models. Though at first surprising, the decrease of the standard deviations observed for most models is ultimately a consequence of the fact that increasing the average degree of finite networks tends to make them more regular, implying more self-avoiding paths covering the same set of nodes. Observe that in the extreme situation in which the network is fully connected, all path-walks will involve all nodes, implying null variance of the diversity entropy. The higher average degree also tended to change the shapes of the curves by implying a steeper increase (especially for the second step) along the initial steps, which is also a consequence of the above observed regularizing effect.

Another interesting result which is evident from Figures 5 and 6 is the markedly distinct standard deviations obtained for each model. Confirming previous investigations [18, 19], the PN model presented the more
regular features, with almost null standard deviations of the diversity entropies for either m = 3 or m = 5. Allied with the fast increase of diversity entropy exhibited by this model, the extremely low variance of the diversity entropies makes the PN a model of choice for achieving high and uniform diversity signatures. To a great extent, such properties favoring diversity are a consequence of the fact that, unlike in the WS and GG models, pairs of nodes in the PN structures tend not to be adjacent in the sense of being interconnected by many short paths. Recall that the ER, WS, GG and PN are all network models characterized by high degree regularity, so that what makes them so different regarding diversity is ultimately the adjacency between pairs of nodes, which is optimally broken in the PN model.

In order to complete our analysis of the diversity entropy signatures for networks belonging to the 6 distinct considered models, we now apply the canonical projection method (see Section II D). We use this method to project the original 10-dimensional entropy space into a 2-dimensional space so as to maximize the separation between the clusters of networks belonging to each category. Figure 7 shows the cluster distributions obtained for m = 3 (a) and m = 5 (b).

FIG. 7: The clusters of networks, for m = 3 (a) and m = 5 (b), after being canonically projected from 10 to 2 dimensions so as to maximize the separation between the six categories of networks.

It is clear from Figure 7, where v1 and v2 correspond to the two principal canonical variables, that the 6 categories of networks yielded two supergroups: one formed by {GG, WS} and the other by {ER, BA, PN, PA} (observe the different ranges of values for the two axes). This is in complete agreement with the two main types of diversity dynamics identified for those networks (i.e. steeper and more gradual increase of the entropies). In addition to confirming that previous result, the canonical projections showed that the ER and PN models have marked similarity between their diversities (i.e. the clusters for these two models were mapped nearby in the projected space). The small dispersion of the diversities obtained for the PN model is clearly reflected in the dense cluster obtained for that category of networks. Interestingly, the ER and PA clusters tended to change positions considerably between m = 3 and m = 5.

Additional results can also be obtained by considering the weights of the original measurements in the linear combinations defining the two canonical variables v1 and v2, shown in Table I. We concentrate attention on the absolute values of the weights in Table I. In the case m = 3, the first two canonical variables v1 and v2 are mostly affected by the diversity entropies for h = 1, 2, 4, 7, 8, 9 and 10, which are the main measurements responsible for the optimal separation between the 6 models in this case. The most important measurements in the composition of the two canonical variables for m = 5 are the diversities obtained for h = 1, 2, 3, 4, each of which has a weight of magnitude larger than 0.5 in at least one of the two variables. These measurements, which correspond to the initial steps of the walks, greatly contributed to the separation of the 6 network categories. The dominant contribution of the initial entropies is reasonable because most of the diversity signatures tend to become stable and similar after 3 steps in the case of m = 5 (recall that the diversity entropies rise more abruptly for m = 5 than for m = 3). Such results obtained for m = 5 indicate that, if the main purpose of the analysis is to separate the 6 network types, it is mostly enough to consider the 4 initial entropies in each signature.

            m = 3             m = 5
  h      v1      v2        v1      v2
  1     0.45    0.37      0.51    0.12
  2    -0.03   -0.36     -0.29   -0.58
  3    -0.21   -0.01     -0.63   -0.14
  4    -0.52    0.18     -0.16    0.57
  5    -0.08    0.19      0.07    0.23
  6    -0.09   -0.36      0.31    0.37
  7     0.40   -0.05     -0.05    0.02
  8     0.32    0.39      0.22   -0.16
  9     0.18    0.42      0.19   -0.30
 10    -0.42   -0.45     -0.19   -0.12

TABLE I: The weights of the original measurements assigned by the canonical projection method in order to best separate the 6 categories of complex networks in the 2-dimensional projections for m = 3 and m = 5. Rows are taken to correspond to the steps h = 1, ..., 10.

B. Individual Node Level

Having investigated how the diversity entropies behave in each of the 6 considered network categories, we now turn our attention to the diversity entropy signatures obtained at the level of individual nodes. In order to do so, we selected a network of each type and obtained the respective signatures shown in Figures 8 and 9, with respect to m = 3 and m = 5. Similar signatures were obtained for other realizations of each of the 6 types of networks. The geographical network considered in this analysis is shown in Figure 12.

FIG. 8: The average ± standard deviation of the diversity entropies obtained for each node in a sample of each of the six complex network models, assuming N = 100 and m = 3 (i.e. ⟨k⟩ = 6).

Most of the results and explanations presented in the previous analyses at the network level can be immediately extended to the signatures in these curves. First, the dispersion of the signatures tends to be smaller for m = 5 than for m = 3. Very similar signatures were obtained for all nodes in the PN networks, which confirms the distinctive regularity of this model. The two types of transient dynamics, namely steeper for the ER, BA, PN and PA networks and more gradual for the WS and GG structures, were again observed. The most interesting additional information provided by the presentation of the individual node signatures regards the relative dispersion obtained for each case. Observe that particularly distinct diversity signatures were obtained for the GG structure. This is mainly a consequence of the higher structural modularity and overall adjacency found in this type of network (see Figure 12).

Additional insights about the measurement structure in each of the networks can be obtained by applying principal component analysis to each of the datasets in Figures 8 and 9 in order to obtain 2-dimensional visualizations of the distribution of the respective diversity signatures. The projections obtained for m = 3 are shown in Figure 10 (the projections for m = 5 are very similar and are not shown in this article). Recall that each point in these plots corresponds to one of the nodes in the respective sample network. Distinct clusters, all of which are completely uncorrelated, were obtained for each of the networks. Note the presence of outliers for the cases ER, BA and GG. The distribution obtained for the GG structure presented the largest dispersion of points. Contrariwise, the projection of the diversity entropy signatures for the PN network yielded the most compact cluster, confirming once again the enhanced regularity of this type of network. The dispersions of the ER, BA, WS and PA networks are similar.

Additional insights about the influence of the measurements (i.e. the diversity entropies) in the definition of the clusters in Figure 10 can be obtained by considering the respective weights of the original measurements in the linear combinations defining the two main principal variables pca1 and pca2, which are necessarily completely uncorrelated. Observe that the variance of pca1 is much larger than that of pca2 in all cases. Such weights are given in Table II. It is clear from the values in this table that the initial 3 or 4 diversity entropies are the most relevant measurements in all situations, except for the WS and GG cases. The special importance of the first entropies is a consequence of the fact that the signatures tend to be more different at such initial steps, becoming nearly constant once they reach the plateaux. The distinct composition exhibited by the first principal variable in the case of the WS and GG networks reflects that the individual signatures remain distinct even after 3 or 4 steps. Because of its largest variance, the first principal variable is particularly relevant as a single quantification of the individual node diversities. In the case of the WS and GG networks, this variable corresponds very closely to the arithmetic mean (to a multiplicative factor). This summarizing measurement is henceforth called the overall diversity of each node.

In order to conclude our investigation of the diversity entropy signatures at the individual node level, we consider the GG network chosen for the above examples (see Figure 12) for a more systematic investigation of the diversities. The choice of this type of network is justified because it is the only case among the considered categories which incorporates the spatial positions of each node and because this type of network tends to exhibit structured modularity (i.e. spatial and topological communities).

Because the first principal variable has been verified to correspond very closely to the arithmetic average of the diversity entropies for all network types, we adopt this value in order to summarize the diversity of each node in the chosen network. Figure 11(a) shows an enlarged version of the PCA projection of the diversity entropies obtained for this geographical network. Because of the right-skewed distribution of the density of the points in this projection, we consider a new projection obtained by taking the exponential of the first PCA variable, henceforth represented as exp(pca1). This new projected distribution is shown in Figure 11(b). A more uniform distribution of points is now obtained. We now partition the values of the new variable exp(pca1) into 9 intervals identified by the colors in Figure 11(b). Observe that, because pca1 is very close to the arithmetic average of the diversity entropies for the various values of h, the diversity of the

