ebook img

Linear and Optimization Hamiltonians in Clustered Exponential Random Graph Modeling PDF

0.7 MB·
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Linear and Optimization Hamiltonians in Clustered Exponential Random Graph Modeling

Linear and Optimization Hamiltonians in Clustered Exponential Random Graph Modeling Juyong Park and Soon-Hyung Yook Department of Physics, Kyunghee University, Seoul, Korea Exponentialrandomgraphtheoryisthecomplexnetworkanalogofthecanonicalensembletheory fromstatisticalphysics. Whileithasbeenparticularlysuccessfulinmodelingnetworkswithspecified degree distributions, a na¨ıve model of a clustered network using a graph Hamiltonian linear in the number of triangles has been shown to undergo an abrupt transition into an unrealistic phase of 6 extreme clustering via triangle condensation. Here we study a non-linear graph Hamiltonian that 1 0 explicitly forbids such a condensation and show numerically that it generates an equilibrium phase 2 with specified intermediate clustering. n a I. INTRODUCTION the network ensemble [21]; whether the model is suffi- J cient (i.e., is a good approximation of network data) is 1 1 The study of complex systems found in various disci- to be judged on the given model’s ability to reproduce plines including engineering, biology, sociology that can other (not used as an input to the model) network char- ] be represented as networked systems composed of nodes acteristics such as the degree distribution, cluster size n and edges have garnered much interest from statistical distribution, degree-degree correlation, et cetera. A sig- n physicists in recent years. Building upon a rich and nificant disagreement between the expected characteris- - s long tradition of studies on many-body systems, they tic of a model and the data may indicate that the choice di have successfully adapted the analytical and computa- Hamiltonian needs to be reformulated; for instance, the . tion tools to understanding networks. [1–3] ubiquityofscale-free(power-law)networkswherethede- t a Anetworkmodelingmethodologythatshowsastriking greedistributionisfat-tailedrendersthesimplestErdo¨s- m resemblancetothecanonicalensembletheoryfromstatis- R´enyinetworkmodel(whichhasaPoissoniandegreedis- - tical physics is the Exponential Random Graph (ERG) tribution) inadequate, necessitating the introduction of d theory, originally developed in statistics and currently alternative forms of the graph Hamiltonian. One possi- n o the mostactivelystudiedin the SocialNetworkAnalysis bility is to incorporate explicitly the node degrees e{ki} c (SNA) circles [4–6]. Given that the potential readership (i∈N =1,··· ,nisthenodeindices)themselvestoform [ ofthispaperwillbecomposedofstatisticalphysicist,the the so-called Linear Degree Hamiltonian HLD(G): premise of ERG is perhaps most simply explained using 1 H (G)=θ k (G)+ +θ k (G), (2) v thelanguageofstatisticalphysics. Here,asinthecanoni- LD 1 1 ··· n n 1 calensembletheory,oneconsidersanensembleΓofgraph where θ are the conjugate variables that now control i 3 configurations (microstates) G whose probabilities in Γ { } the expected degrees k in a manner similar to what θ 3 aregivenbyP(G)= e−H(G)/Z whereH(G)isthe { i} G∈Γ did to m in Eq. (1) [22]. On a historical note, the study 2 graph Hamiltonian, aPfunction of network characteristics of H was prompted by the hypothesis that heavily 0 of G, and Z = e−H(G) is the partition function. LD . G∈Γ skewed degree distributions such as the power law may 1 Both in social nPetwork analysis and statistical physics, cause the observed negative correlation between degrees 0 the Hamiltonian H(G) is typically set up to be a linear of connected nodes, while the Erdo¨s-R´enyinetwork pro- 6 function of network variables or network statistics such duces no such correlation in the thermodynamic limit 1 as the number of edges m(G) in the graph. When the : (n ) [5]. On the other hand, it was shown analyti- v network is simple and unweighted (i.e. the number of → ∞ cally that power-law networks generated via Eq. (2) ex- i edge between twonodes is either 0 or 1)it is straightfor- X hibited negative degree correlation, proving the hypoth- wardtoshowthatH(G)=θm(G)generatestheso-called r esis, and thus that the skewed degree distribution was a Erdo¨s-R´enyi random graph in which two nodes are con- indeed responsible for the negative degree-degree corre- nected with probabilityp=1/(1+eθ) [5]. The expected lation. This is, in fact, a typical example of the ERG numberofedgesminanetworkofnnodesisinthiscase, modeling(alsoofgeneralstatisticalmodelingprocedure) therefore, given as procedure – identifying “important” features of the ob- n n(n 1) 1 served system and testing its sufficiency via comparing m= p= − , (1) themodel’s predictionsandrealdata(i.e. the “goodness (cid:18)2(cid:19) 2 1+eθ of fit” of the model in statistical sense. See Ref. [10]) controlled by the conjugate variable θ. In one then and, when a closer agreement is desired, refining the hy- equates m from Eq. (1) with the actual number of edges pothesis and repeating the procedure. This process is minthenetworkdataunderstudy,thisservesasthenull presented schematically in Fig. 1. model of the network under study which the number of Not surprisingly, the development of ERG as a net- edges as the only observable. Note again that only the work modeling framework closely follows the study of number ofedgesm is anexplicitvariableinconstructing graphHamiltoniansofincreasingcomplexity. ERGmod- 2 Network Data II. DEGREE HAMILTONIANS Here we briefly review H (G) = θ k , Eq. (2), LD i i i specifically when the network is sparseP(ki O(1) ∼ ≪ √n). In sucha case itis wellknownthatthe probability pij that nodes i and j are connected is e−θie−θj, leading to the average degree k of node i [5] Selection of Solution h ii Network and Variables Comparison ki = pij =e−θi e−θj h i Xj6=i Xj6=i Exponential Random ∞ =(n 1)e−θi e−θρ(θ)dθ A(n 1)e−θi, (3) Graph Modeling − Z ≡ − −∞ wherethelatterintegralformisvalidforalargenetwork FIG. 1: The schematics of the Exponential Random Graph (n 1), ρ(θ) is the distribution density of θ, and A modelingofnetworkdata. Fromthenetworkdataofinterest ∞≫e−θρ(θ)dθ is thus a constant. Setting k =q , th≡e (top) one selects network variables such as the node degrees −∞ h ii i Rspecified(desired)degreeofnodeiandinvertingEq.(3), {ki}(left)fromwhichonethenformsaHamiltonianH({ki}), we obtain θ = ln q /A(n 1) . H then becomes whose solutions and predictions are compared with the net- i − i − LD work data. Significant disagreements may necessitate a new (cid:0) (cid:1) H (G)= θ k (G) selection of variables or reformulation of theHamiltonian. LD i i iX∈N = k (G)lnq +ln A(n 1) k (G) i i i − − iX∈N (cid:0) (cid:1)iX∈N els of historical import include, in addition to the sim- = k (G)lnq +2M(G)ln A(n 1) , i i − − plest H(G) = θm(G), the Holland and Leinhardt model iX∈N (cid:0) (cid:1) of reciprocity,the Strauss model of clustering, the 2-star (4) model, and the generalized k-star models [5]. We refer where M(G) = 1 k (G) is the number of edges in interestedreaderstointroductoryarticlesandsignificant 2 i i recentworkfromtheSNAcommunityformoredetail[6– G. One can also shPow that the ensemble generated via H is equivalent to the configuration model, a popular 8, 10]. LD anduseful frameworkfor studying graphswith arbitrary Toastatisticalphysicist,thebenefitsofsuchaformal- degree distributions [13]. ismisobvious: onecanutilizeappropriatecomputational Now, if we restrict the ensemble Γ = G to con- (such as the Metropolis-Hastings algorithm) and analyt- { } tain only network configurations G with a fixed num- ical (such as Feynman-diagrammatic method) tools to ber of edges M(G) = M = 1 q (corresponding to study the properties of the model [5, 11, 12]. It should 0 2 i i alsobe notedthatthe Hamiltonianneednotbe linear at the canonical ensemble of partPicles), the second term 2M(G)lnA(n 1)) becomes a constant. Therefore, we all. For instance, when one wishes to construct an expo- − can safely ignore it and use an even simpler form nential random graph model of a network with a speci- fieddegreesequence,H(G)onlyneedstobeafunctionof H (G)= k (G)lnq . (5) LD i i the node degrees {ki(G)} in G, i.e. H(G) = H[{ki(G)}] −iX∈N where i = 1,...,n is the node index. This is suf- ficient to∈gNuara{ntee that}two configurations G, G′ with This is particularly useful in edge-conserving Monte Carlo simulations, where the Metropolis-Hastings algo- anidenticaldegreesequencehavethesameprobabilityin rithm would consist of relocating the edge between a the ensemble, and the aforementioned H is one possi- LD randomlyselectedconnectednodepairtobetweenaran- bility. Thus there is much freedom in choosing the form domly selected unconnected pair with probability 1 if it of the H(G), meaning there exists ample avenues for ex- resultsinalowerenergy,andwithprobabilitye−∆H(G) < ploration of various possible forms of Hamiltonians as 1 when it results in a higher energy. one sees fit, not limited to linear forms. In fact, linear It is important to note that it is the ensemble average forms such as H of Eq. (2) are often not robust in LD k of a node that is to be matched with its prescribed the presence of a perturbation, in the sense that when a h ii degree q , and there is no guaranteethat k =q strictly, composite Hamiltonian H =H +H′ is used the equi- i i i LD even at equilibrium: In fact, P(k q ), the probability librium degree distribution may differ significantly from i| i that a node with a prescribed degree q has degree k at the one specified from H , defeating the modeler’s in- i i LD equilibrium, is tention to generate a desired degree distribution using H . The purpose ofthis paper is to reviewthe cluster- ingLDperturbationandcomparethecharacteristicsoflinear P(ki|qi)= (cid:20) e−(θj+θi) 1−e−(θl+θi) (cid:21), and nonlinear Hamiltonians under it. For simplicity, we {XNk} jY∈Nk l∈YNk′(cid:0) (cid:1) (6) here consider only unweighted and undirected graphs. 3 while retaining the linear form. Instead, we introduce a Specified degree distribution nonlinear Hamiltonian Linear degree hamiltonian Optimization degree hamiltonian H (G)= β k (G) q (8) OD d i i | − | iX∈N k) P( whichwecalltheOptimizationDegreeHamiltonian, being reminiscent of Hamiltonians used in certain opti- mizationproblems such as number partitioning [14] [23]. The P(k) that results from H with β =1 for simplic- OD ity(thepenaltycanbecontrolledviaβ whennecessary) d is shownin Fig. 2 in red, which is indeed a more faithful reproduction of P(q) in comparison with H , showing Degree k LD sharper peaks at k =5 and k =15 of equal heights sim- ilar to P(q). The broadening of the peaks around the FIG. 2: The degree distributions from exponential random specifieddegreesfromtheH incomparisontoH in graph simulations. For simplicity we set the specified degree LD OD distributiontobeP(q=5)= 1 andP(q=15)= 1 (shownin Fig. 2 is persistent in cases of more heterogeneous (thus 2 2 gray). ThelineardegreeHamiltonianHLD =−Pi∈Nkilnqi less artificial) specified P(q) as seen in Fig. 3 where we generates a smooth distribution over a wide range of degrees compareH andH foraPoissonianP(q)andadou- LD OD with Poissonian-like peaks at k = 5 and k = 15 (blue). The ble Gaussian degreedistributionfromtheoptimizationdegreeHamiltonian HOD =Pi∈N|ki−qi|,bycontrast,isnoticeablyclosertothe P(q)=αΦ(q;µ1,σ1)+(1 α)Φ(q;µ2,σ2) (9) specified,withsharperpeaksofroughlyequalheightsatk=5 − and k=10. whereΦ(q,µ,σ)isaGaussianofmeanµandvarianceσ2, and α [0,1] sets the relative weights between the two ∈ Gaussian peaks. For the Poissoniancase we set q =10 whereθ = lnq /A(n 1),and isthesetofallpos- i i k h i − − {N } (Fig. 3 (a)), and for the double Gaussian we try three sible combinations of k nodes from excluding i. From N cases of varying weights and variances (Fig. 3 (b)-(d)). this, the total degree distribution P(k) in equilibrium is ThebehaviorofH andH areconsistentwithwhat given as LD OD we see from Fig. 2: in terms of the goodness of fit to P(k)= P(k q)P(q), (7) P(q) (including the relative heights at the peaks) HOD X{q} | is superior to HLD [24]. whereP(q)istheprescribeddegreedistribution. Itisun- likelythatP(k =q q) 1inEq.(6),andthuswecannot III. ROBUSTNESS OF DEGREE expect P(k) P(q|). ≡To find the general characteristics REPRODUCTION UNDER PERTURBATION: of P(k) from≡Eq. (7) in comparison with P(q), we per- TARGETED CLUSTERING formed a Monte Carlo simulation (using the Metropolis- Hastings method describedabove)ofH for a network Besides degree distribution, a network characteristic LD ofn=500andP(q =5)=P(q =15)= 1 forillustrative that has been widely studied is clustering. Intuitively, a 2 purposes,whoseresultsareshowninFig.2. Inthefigure, clusterednetworkcontains significantlymore triadic clo- theprescribedP(q)isshowningray,andtheequilibrium sures (triangles) than expected in a random graph with P(k) is shownin blue. While P(k) does exhibit peaks at the same number of edges. (A common definition of the k =5andk =15,italsoshowsafairlywidedistribution strength of clustering of a network is given using the so- (although small in comparison with n), and the fluctua- called clustering coefficient, which we present later). tion is visibly larger at k =15 resulting in a lower peak. In exponential random graph literature, studies have If the specified degree q had been the same for all nodes been made on graph Hamiltonians that incorporate the (i.e. aq-regulargraph)itiswellknownthatH creates numberoftrianglesT linearly,thesimplestcasebeingthe LD anErdo¨s-R´enyigraphwith aPoissoniandegreedistribu- Strauss Model with H (G) = θM +τT [15]. The moti- S tion, locally not unlike the peaks in Fig. 2 [5]. Thus we vation for H is that by controlling θ and τ, one could S call the peaks we see in Fig. 2 Poisson-like. hopefullygenerateanetworkwithanydesiredvalueofM The well-documented success of the configuration and T, i.e. a smooth, controllable transition between a model implies that the fluctuations we see in P(k) may non-clusteredconfiguration(smallT)andaclusteredone not be problematic in general,though in certain circum- (largeT). Unfortunately,ithasbeenshownthatH does S stances (we see later such as a case) a more faithful re- notshowsuchabehavior: dependingonθ andτ,thesys- production of the specified degree distribution may be temundergoesafirst-orderphasetransitionfromasparse desired. This means that a graphHamiltonian is needed ER-like phase with vanishing clustering and a nearly- that imposes a larger penalty when k deviates from q fully connected phase [15, 16], of which most real net- i i than H does. It is unclear how H can be modified worksareneither. Morerecently,Fosteret al. performed LD LD 4 (a) (b) Poissonian Double Gaussian (Case I) Specified degree distribution Linear degree hamiltonian k) Optimization degree hamiltonian P( Degree k (c) (d) Double Gaussian Double Gaussian (Case II) (Case III) FIG. 3: The degree distributions generated via HLD and HOD for heterogeneous specified degree distribution. (a) With P(q) a Poissonian (with hqi = 10). In (b)–(d) P(q) is a double Gaussian with peaks at q = 5 and q = 10 with varying relative heights (α ∈ [0,1] for the peak at q = 5, and 1−α for the peak at q = 10) and variances σ1, σ2 of the peaks. (b) (α,σ1,σ2)=(0.5,1.0,1.0). ThisisthemostsimilartoFig.2. (c)(α,σ1,σ2)=(0.75,1.0,1.0)and(d)(α,σ1,σ2)=(0.5,1.0,5.0). Here, HOD again consistently reproduce P(q) more faithfully. anextensivestudy ofthe HamiltonianH(G)=τT onan is at a corner, and s(k ) = ki is the number of pairs i 2 ensembleofnetworksoffixeddegreesequences(andthus of neighbors of node i, also(cid:0)ca(cid:1)lled a two-star centered a fixed number of edges), and found that as τ is tuned, on i. C is therefore the probability that two neighbors i T shows a series of jumps consisting of first-order phase of node i are themselves neighbors. The global mea- transitions [17], each transition indicating the formation sure of clustering is commonly given by two measures. of densely connected local cliques. One is the average of C which we write C, defined as i This pathology renders the linear Hamiltonian for C C = C /N, i.e. the average of the local ≡ h ii i∈N i modelingrealclusterednetworks,wherethetrianglesare clustering coePfficients. The other, which we call C, is distributed over the the network without such extreme defined as “condensation”oftriangles. Thelackofsuchaninterme- e diatephaseinH stemsfromthefactthattheadditionof S 3T 3T a single edgein analreadydensely connectedpartofthe C = , (11) network can lead to a disproportionately large increase ≡ i∈N s(ki) i 21ki(ki−1) in T and decrease in HS(G), resulting in the condensed e P P phaseenergeticallyfavorable. Therefore,itisunderstood where T = 1 t is again the number of triangles that a Hamiltonian or, more generally, a mathematical 3 i∈N i formalism is necessary that explicitly discourages such in the networkP. Therefore this is the probability that a randomly selected two-star is a part of a triangle (3 ex- condensation [16, 18]. istsinthe numeratorbecauseonetrianglecontainsthree Before we find such Hamiltonian in our context of ex- ponential random graphs, let us first review how clus- two-stars). Although C and C are not identical, C = C tering in networks is quantified. It is often done via the when Ci = C0 for all i. In terms of these quantities, e e clustering coefficient C. In wide use are three versions, the aforementioned behavior of the Strauss Hamiltonian onelocal(node-level)andtwoglobal(network-wide). On HS =θM+τT canbesummarizedastheclusteringcoef- the individual node level, the local clustering coefficient ficient (local or global) being either C(or C) 0 (sparse ≃ is defined as ER-like phase) or C(or C) 1 (condensed phase) or, in ≃ e Ci ≡ s(tkii) = 12ki(ktii−1) (ki ≤2), (10) owbteohrekrCowfso(rinkdts)e,.rtmiTe≃adki0ianetgoeractlicuu≃steesfr(riknoimg)cftoohreeffiallacliteitnewtrhaCinle,ditnEi qaw.on(ue8lt)d-, i ∼ where t is the number of triangles of which the node i we propose the following nonlinear targeted clustering i 5 Hamiltonian y =x 1 Linear degree hamiltonian HC = βc ti γis(ki) = βc ti γi ki(ki 1) , with targeted clustering iX∈N | − | iX∈N (cid:12)(cid:12) − 2 − (cid:12)(cid:12) "al Owpitthim tairzgaetitoedn cdleugsrteeeri nhgamiltonian (12) b o gl C whereγi isnowthespecified(i.e. targeted)clusteringco- ! efficient for node i. The difference between H and the C modelofMiloetal.[19],whereH = T T′ andT′is Milo | − | thespecifiednumberoftriangles,isthatH allowsusto C control local clustering. Similar to H of Eq. (8), H OD C explicitly penalizes t when it diverges froma prescribed i value γ s(k ). In studying the effectiveness of H in re- C i i C target producing the specified local clustering, we would also like to have the option of controlling the degrees ki FIG. 4: The global clustering hCi of graph ensembles gener- { } e simultaneously. We have already discussed two Hamil- atedfromthelinear(blue)andtheoptimization (red)degree tonians designed specifically for that purpose, HLD and HamiltoniansperturbedwithtargetedclusteringHamiltonian HOD. Intheremainderofthispaper,therefore,westudy HC =Pi∈N|ti−Ctargets(ki)|. hCei≃Ctarget is the result of the following two composite Hamiltonians node-levellocalclusteringcoefficientsbeing≃Ctarget,regard- less of thedegree distribution. H =H +H = k lnq +β t γ s(k ) 1 LD C i i c i i i − | − | iX∈N(cid:2) (cid:3) (13) (but milder) behavior at a higher value of C 0.6 target and ∼ and up [25]. H =H +H = β k q +β t γ s(k ) Let us now discuss the implications of the findings in 2 OD C d i i c i i i iX∈N(cid:2) | − | | − |(cid:3) Figs.4and5onthetopologyofnetworksgeneratedfrom (14) H1 = HLD +HC and H2 = HLD +HC. First of all, Fig. 4 tells us that, unlike the Strauss clustering per- to find out whether either is capable of generating net- turbation τT, H was able to discourage an extreme C workensemblesexhibitingboththespecifieddegreesand condensation of triangles, resulting in C C by target local clustering. way of C C for both H andhHi.≃However it i target 1 2 AMonteCarlosimulationwasperformedforanetwork was not eno≃ugh to completely overcomeethe cooperative of size n = 500 and k = 10. For simplicity, we again tendency of triangles under H . The telltale sign of h i LD set βd = βc = 1, P(q) = δq,10 (a regular graph), and this is the creation of high-degree nodes Fig. 5 which γi =Ctarget, a universal value for all i, varied between 0 showsthecreationofhigh-degreenodesinH1: nowmany and1. First,Fig.4showsthemeanglobalclustering C triangles exist between the high-degree nodes, forming h i from the simulation, which shows us that both H and a core of densely interconnected high-degree nodes al- 1 e H2 hCi ≃ Ctarget generate networks with the specified though Ci ≃ Ctarget as specified. On the other hand, clustering. This arises from the fact that Ci Ctarget under H2 where P(k) is sharply peaked at the specified e h i ≃ on the individual level as well (not shown). The differ- degree q =10 such cores does not exist; with ki q and ≃ encebetweentheH1 andH2,however,ismoststrikingin Ci ≃Ctarget foralliasspecified,H2 generatesanetwork theequilibriumP(k),showninFig.5. WhenC =0, thattrulyhasauniformdistributionoftriangles,lacking target perturbation H is insignificant since the expected clus- any unspecified, accidental local structures. C tering without it is 0 anyway, and therefore P(k) are We check our claim via the following two quantities: simply as expected – a true Poissonian for H1, and a the degree-degree correlation rdeg (the Pearson corre- sharper peak at q = 10 for H , similar to the ones we lation between the degrees of adjacent nodes) and the 2 saw in Fig. 2. When C = 0, on the other hand, mean corner degree of the triangles in the network, target 6 the peak in P(k) under H1 gradually shifts towards a shown in Figs 6 (a) and (b). First, the plot of hrdegi in smallerk while high-degreenodes arecreatedinorderto Fig. 6 (a) indicates that adjacent degrees in the network compensate for the number of edges M which is a con- arehighlycorrelatedunderH1,sothathigh-degreenodes stant. AsC istunedhigheritresemblesthespecified are indeed connected with other high-degree nodes and target distribution less and less, and at Ctarget 0.4 we even viceversa,whileH2 showsnosucheffect. Thisleadsnat- ≃ observe multiple peaks (at k =5 and 15 – the values for urally to what we see in Fig. 6 (b): under H1, the mean which t C s(k) = 0, meaning the peaks will shift corner degree is significantly higher than k =q, unlike | − target | h i for a different Ctarget and thus are not very meaningful). H2 where it is practicallyequal to q. These observations P(k)underH ,incontrast,isrobust,withoutnoticeable arepresentedvisuallyinFig.6(c)(anactualsnapshotof 2 change up to C = 0.5, already an unusually high an equilibrium configuration of a network with n = 50, target value for real-world networks, until it too shows similar P(q) = δ , and C = 0.4. C are 0.35 0.02 and q,5 target h i ± e 6 Specified degree distribution Linear degree hamiltonian with targeted clustering ) Optimization degree hamiltonian k( with targeted clustering P C =0.0 C =0.1 C =0.2 target target target Degree k C =0.3 C =0.4 C =0.5 target target target FIG. 5: The equilibrium degree distributions P(k) generated from H1 = HLD +HC (blue) and H2 = HOD +HC (red) for various values of Ctarget. The specified degree distribution P(q) = δq,10 is in gray. When Ctarget = 0, both Hamiltonians generate their natural P(k) – a true Poissonian for HLD, and a sharp peak for HOD. As Ctarget is tuned higher, however, P(k) peaks at a smaller k for HLD+HC and even exhibits multiple peaks when Ctarget is too large, while it stays virtually unchanged for HOD+HC up to Ctarget ≃0.5, an unusually high valuein real networks. 0.31 0.02, respectively). As expected, for H (left) we a null model of network data with actual values of the 1 clear±ly see that the ten highest-degreenodes (blue, aver- variables φ˜ v =1,...,l using the Hamiltonian [17] v { | } age degree 9.7) forms a densely interconnected core (en- circled in orange), with ten lowest-degree nodes (yellow, H(G)= β φ (G) φ˜ , (15) v v v averagedegree1.0)pushedtotheperipherywithlowtri- φX∈Φ | − | angle participation rate. For H (right), no significant 2 difference between highest- and lowest-degree nodes ex- thereby enabling the modeler to assess quickly the suf- ists,andthetrianglesaredistributeduniformly,expected ficiency of the particular set of variables in character- of a maximally random configuration given the degree izing the network. An interesting recent application of and local clustering constraints. a related framework was provided by Foster et al. [20]: specifically,theygeneratednetworkswithspecifiedglobal clusteringcoefficientC ordegree-degreecorrelationr us- IV. DISCUSSION AND FUTURE DIRECTIONS ing the optimization hamiltonian and measured their ef- e fect on each other and the modular structure of the net- work (although they kept the the degree sequence fixed Here we have studied two forms of graph Hamiltonian as the network data). In doing so, they demonstrated in exponential random graph theory that take node de- the utility of the Hamiltonian of the form Eq. (15) in grees and local clustering as specified input. The ten- creating network ensembles with desired characteristics. dency of triangles to coalesce in the Strauss model was Naturally,morestudymustbemadeonthepropertiesof shown to persist when the linear clustering perturbation Eq.(15)inrelationtovariousnetworkvariables—global was replaced by an optimized clustering form, albeit in aswellaslocal—inorderto establishitsgeneralutility, a milder fashion, rendering the composite Hamiltonian and also the more recent. In light of the fact that new, unable to generate the specified degree distribution [26]. complex measures of network properties are frequently TheoptimizationdegreeHamiltonian,ontheotherhand, devisedandintroduced,wehopetheformalismprovesto wasabletosatisfyboth,exhibitingsignificantrobustness be a useful tool for network scientists. under the same perturbation. That the optimization Hamiltonian form was able to reproduce both the targeted degree and clustering presents an appealing possibility from the viewpoint of Acknowledgments network modeling via exponential randomgraphtheory: givenasetofnetworkvariablesΦ= φ v =1, ,l ,it TheauthorwouldliketothankDoochulKimforhelp- v { | ··· } mayactasapracticalcomputationalmethodtogenerate ful comments. This work was supported by Kyung Hee 7 (a) (b) Linear degree hamiltonian with targeted clustering Mean degree (10) n Optimization degree hamiltonian e correlatio with targeted clustering ner degree gre cor ee-de Mean gr e D C C target target (c) Linear Degree Hamiltonian Optimization Degree Hamiltonian with Targeted Clustering with Targeted Clustering en highest-degree nodes Ten lowest-degree nodes FIG. 6: (a) The degree-degree correlation rdeg under H1 = HLD+HC and H2 = HOD+HC. H1 generates positive degree correlation for any positive Ctarget, while H2 exhibits very little correlation up to Ctarget ≃ 0.5. (b) The mean corner degree of triangles contained in the networks in equilibrium. Under H1 most triangles exist between high-degree nodes, indicating thepersistence ofthecooperative natureof triangles. (c) Equilibrium topologies ofclustered networksunderH1 (left) andH2 (right). H1 generates a core of high-degree nodes that are densely connected and share a large number of triangles (enclosed in orange oval). H2, in contrast, maintains the specified degree distribution P(q) = δq,10 while the triangles are distributed uniformly, features expected of a maximally random configuration given thethe degree and local clustering constraints. UniversityGrantKHU-20100116andtheKoreaResearch Foundation Grant KRF-20100004910. [1] R.Albert,A.-L.Barab´asi,ReviewofModernPhysics74, [4] O.Frank,D.Strauss,JournaloftheAmericanStatistical 47 (2002). Association 81, 832 (1986). [2] S. N. Dorogovtsev, J. F. F. Mendes, A.-N. Samukhin, [5] J. Park and M. E. J. Newman, Physical Review E 70 Nucl.Phys. B 666, 396 (2003). (2004). [3] M.E.J.Newman,PhysicsToday 61,33,066117 (2008). [6] G. Robins, T. Snijders, P. Wang, M. Handcock, P. Pat- 8 tison, Social Networks 29, 192 (2007). [21] Typically we consider thenumberof nodes n as given. [7] D.R. Hunter,Social Networks 29, 216 (2007). [22] NotetheabsenceofthetemperatureβinEqs.(1)and(2). [8] C. J. Anderson, S. Wasserman, B. Crouch, Social Net- Here, one may consider β as having been absorbed into works 21, 37 (1999). {θ}. [9] T.A.B. Snijders,P.Pattison, G. Robins,M. Handcock, [23] Note that the so-called “curved” Exponential Random Sociological Methodology 36, 99153 (2006). Graph Model is similar to our formalism in that the [10] D. Hunter, S. Goodreau, M. Handcock, Journal of the graph Hamiltonians are nonlinear functions of network American Statistical Association 103, 248268 (2008). variables [10]. [11] N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, [24] For the purposes of this paper we are showing some nu- A. H. Teller, A. Teller, Journal of Chemical Physics 21, mericalexamples.Formoresystematicstudiesonecould 1087 (1953). investigate various moments of the degree distributions [12] J.Park,Journalof Statistical MechanicsP04006 (2010). from the two Hamiltonians, or a difference measure be- [13] M. E. J. Newman, S. H. Strogatz, D. J. Watts, Physical tween two distributions P and Q such as D(P,Q) ≡ ReviewE 64, 026118 (2001). P |P(k)−Q(k)|. k [14] S.Mertens, Physical Review Letters 81, 4281 (1998). [25] We performed similar simulations for the four heteroge- [15] D.Strauss, SIAMReview 28, 513 (1986). neous P(q) shown in Fig. 3 and found similar results. [16] J.Park,M.E.J.Newman,PhysicalReviewE72,026136 [26] ThereversecaseofHOD+τT wasalsonumericallystud- (2005). ied with varying τ. As τ is tuned more negative (thus [17] D. Foster, J. Foster, M. Paczuski, P. Grassberger, Phys- favoringtriangles)herealsooccursasuddenonsetofthe ical Review E 81, 046115 (2010). emergence of a densely connected cluster of high-degree [18] M. E. J. Newman, Physical Review Letters 103, 058701 nodes, signifying a condensation of triangles similar to (2009). HLD+τT.This demonstrates that to create finite clus- [19] R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. tering degree and local clustering optimization are nec- Chklovskii, U.Alon, Science 298, 824 (2002). essary. [20] D. V. Foster, J. G. Foster, P. Grassberger, M. Paczuski, arXiv:1012.2348

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.