ebook img

On large deviation properties of Erdos-Renyi random graphs PDF

0.54 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview On large deviation properties of Erdos-Renyi random graphs

4 0 0 2 n On large deviation properties of Erd¨os-R´enyi a J random graphs 1 2 Andreas Engel1,2, R´emi Monasson2,3, and Alexander K. Hartmann4 ] h 1 Institut fu¨r Physik, Universit¨at Oldenburg, c e 26111 Oldenburg, Germany. m 2 CNRS-Laboratoire de Physique Th´eorique, - 3 rue de l’Universit´e, 67000 Strasbourg, France. t a 3 CNRS-Laboratoire de Physique Th´eorique de l’ENS, t s 24 rue Lhomond, 75005 Paris, France. . t a 4 Institut fu¨r Theoretische Physik, Tammannstraße 1, m 37077 G¨ottingen, Germany - d February 2, 2008 n o c [ Abstract We show that large deviation properties of Erd¨os-R´enyi random 2 graphscanbederivedfromthefreeenergyoftheq-statePottsmodel v of statistical mechanics. More precisely the Legendre transform of 5 thePottsfreeenergywithrespecttolnqisrelatedtothecomponent 3 5 generatingfunctionofthegraphensemble. Thisgeneralizesthewell- 1 known mapping between typical properties of random graphs and 1 the q → 1 limit of the Potts free energy. For exponentially rare 3 graphsweexplicitlycalculatethenumberofcomponents,thesizeof 0 thegiantcomponent,thedegreedistributionsinsideandoutsidethe / giantcomponent,andthedistributionofsmallcomponentsizes. We t a alsoperformnumericalsimulationswhichareinverygoodagreement m with our analytical work. Finally we demonstrate how the same - results can be derived by studying the evolution of random graphs d under the insertion of new vertices and edges, without recourse to n the thermodynamicsof thePotts model. o c PACS: 02.50.-r,05.50.+q, 75.10.Nr : v i X 1 Introduction r a Random graphs have kept being an issue of tremendous interest in proba- bilityandgraphtheoryeversincetheseminalworkbyErdo¨sandR´enyi[1] 1 more than four decades ago. In addition to fixed edge number and fixed edgeprobabilitydistributionsalsorandomgraphswithconstantvertexde- gree[2]orpowerlawdegreedistribution[3,4]havebeeninvestigated. Most ofthe effortsdevotedtothe studyofthe propertiesofrandomgraphshave takenadvantage of the fact that these properties undergo some concentra- tionprocessintheinfinitesize(numberofvertices)limit. Forinstance,the numbers of vertices in the largest component or the number of connected components,whicharestochasticinnature,becomehighlyconcentratedin thislimit,andwithhighprobabilitydonotdifferfromtheiraveragevalues. For large but finite sizes, properties as the one evoked above obviously fluctuate from graph to graph. The understanding of their statistical de- viations are important for several problems in statistical physics, e.g. for the life-time of metastable states and the extremal properties of models defined on random graphs [5], as well as in computer science, e.g. for information-packet transmission in random networks [6, 7], resolution of randomdecisionproblemswithsearchprocedures[8,9,10]andothers. Up to now apparently little attention has been paid to a quantitative charac- terization of large deviations in random graph ensembles [11, 12]. The presentworkisintendedtocontributeto animprovedunderstand- ingofrarefluctuationsinrandomgraphs. Ourmainobjectivewastodevise a microscopic ”mean-field” approach permitting to handle such rare devi- ations in much the same way as for average properties of various similar problems, as e.g. bootstrap and rigidity percolation [13] and spin-glasses [14]. The mean-field approach relies on a statistical stability argument: a largegraphis not stronglymodifiedwhen adding anedge and/ora vertex. Thisstatementcanbetranslatedintosomeself-consistentequationsforthe average value of physical properties of interest, as e.g. the magnetization foraspinsystem,ortheprobabilityofbelongingtothek-coreforbootstrap percolation. We willshowinthe presentworkthatasimilarself-consistent approachcanalsobesuccessfullyusedtoaccesslargedeviationsinrandom graphs. The mainpropertywe focus onthroughoutthis paper is the number of connected components of a random graph. As established by Fortuin and Kasteleyn [15], several properties of random graphs with a typical number of components can be inferred from the knowledge of the thermodynamics of the q-state Potts model on a complete graph for values of q around 1. Wewillshowthatthethermodynamicpropertiesforgeneralvaluesofqcan be used to additionally characterize the properties of random graphs with an atypical number of components. This allows us to verify the validity of our microscopic mean-field approach. This paper is organizedas follows. In Section2, we introduce the basic definitions and notations for the quantities studied. Section 3 is devoted to the derivation of rare graphs properties through the study of the Potts 2 model. WeshowinSection4howtheseresultscanberederivedthroughthe requirementofthe statisticalstabilityofverylargeatypicalgraphsagainst theadditionofavertexanditsattachededges,oranedge. InSection5we describe our numerical procedure to simulate large deviation properties of randomgraphsensembles. SomeconclusionisfinallyproposedinSection6. 2 Basic notions We begin by fixing some vocabulary. For a detailed and precise account on random graphs we refer the reader to the textbook [16]. A graph G is a collection of vertices numbered by i = 1,...,N with edges (i,j), i6=j, i,j =1,...,N connectingthem. Thenumberofedgesisbetween0(forthe emptygraph)andN(N−1)/2(forthe completegraph). Acomponentofa graphisasubsetofconnectedverticeswhicharedisconnectfromtherestof thegraph. ThesizeS ofacomponentisthenumberofverticesitcontains. Hence the empty graph consists of N components of size 1 whereas the completegraphismadefromasinglecomponentofsizeN. Thenumberof components of a graphG is denoted by C(G). We are generally interested in properties of large graphs, N →∞. We will consider random graphs in the sense that an edge between two vertices may be present or absent with a certain probability. The various joint probabilities to be discussed below will be denoted in the form P(x ,x ,...;a ,a ,...) with the x representing the random variables 1 2 1 2 i and the a denoting the parameters of the distribution. In particular we i consider random graphs in which each pair of vertices is connected by an edgewithprobabilityγ/N independently ofallotherpairsofvertices. The parameterγ characterizesthe connectivity of the graph. Since eachvertex establishes edges with probability γ/N with all the other N −1 vertices γ is in the limit N → ∞ just the typical degree of a vertex denoted by d∗, giving the average number of edges emanating from it. Moreprecisely,inthislimitthedegreedofavertexisarandomvariable obeying a Poisson law with parameter γ, γd P(d;γ)=e−γ . (1) d! In particular, P(d=0;γ)=e−γ is the fraction of isolated vertices. Hence theaveragenumberofcomponentsofarandomgraphofthedescribedtype is bounded from below by Ne−γ. Note also that the typical degree d∗ of a vertex remains finite for N → ∞. Typical realization of such random graphs are therefore sparse. The probability P(G;γ,N) of one particular random graph G with N 3 vertices and parameter γ derives from the binomial law, γ L(G) γ (N)−L(G) P(G;γ,N)= 1− 2 N N =e(cid:16)−γ2N(cid:17)+γ(21−(cid:16)γ4+LN(G)(cid:17))+o(1) γ L(G), (2) N (cid:16) (cid:17) where L(G)=O(N) denotes the number of edges of graphG. To describe the decomposition of a large random graph into its components, it is con- venientto introduce the probabilityP(C;γ,N)of arandomgraphwith N vertices to have C components P(C;γ,N)= P(G;γ,N)δ(C,C(G)), (3) G X where δ(a,b) denotes the Kronecker delta. A general observation is that for given γ and large N the probability P(C;γ,N)getssharplypeakedatsometypicalvalueC∗ ofC andtheprob- abilities for values of C significantly different from C∗ being exponentially small in N. To describe this fact more quantitatively we introduce the number of components per vertex c=C/N together with the quantity 1 ω(c,γ)= lim lnP(C;γ,N). (4) N→∞ N Clearlyω(c,γ)≤0andthetypicalvaluec∗ ofchasω(c∗,γ)=0. Averages with P(G;γ,N) are therefore dominated by graphs with a typical number of components. Thefocusofthepresentpaperisonpropertiesofrandomgraphswhich areatypical withrespecttotheir numberofcomponentsC. Inordertoget accesstothepropertiesofthesegraphsweintroducethebiased probability distributions 1 P(G;γ,q,N)= P(G;γ,N)qC(G), (5) Z(γ,q,N) with Z(γ,q,N) defined by Z(γ,q,N)= P(G;γ,N)qC(G) = P(C;γ,N)qC. (6) G C X X The normalization constant Z(γ,q,N) in (5) has hence the meaning of a component generating function of P(G;γ,N). Contrary to averages with P(G;γ,N)thosewithP(G;γ,q,N)aredominatedbygraphswithanatyp- ical number of components which is fixed implicitly with the parameter q. Values of q smaller than 1 shift weight to graphs with few components 4 whereas for q > 1 graphs with many components dominate the distribu- tion. The typical case is obviously recoveredfor q =1. Similar to ω(c,γ) it is convenient to introduce the function 1 ϕ(γ,q)= lim lnZ(γ,q,N). (7) N→∞ N From (6) and (4) it follows to leading order in N that 1 Z(γ,q,N)= dc exp(N[ω(c,γ)+clnq]) (8) Z0 and performing the integral by the Laplace method for large N we find that ϕ(γ,q) and ω(c,γ) are Legendre transforms of each other: ϕ(γ,q)=max[ω(c,γ)+clnq] ω(c,γ)=min[ϕ(γ,q)−clnq] (9) c q ∂ω ∂ϕ q =exp(− ) c=q (10) ∂c ∂q The large deviation properties of the ensemble of random graphs as char- acterizedby ω(c,γ)canhence be inferredfromϕ(γ,q). Inthe nextsection we show how ϕ(γ,q) can be obtained from the statistical mechanics of the Potts model. For later use we also note that from differentiating (7) with respect to γ we find using (6) and (2) to leading order in N γ ∂ϕ ℓ(γ,q)= +γ . (11) 2 ∂γ Here ℓ(γ,q) denotes the average number of edges per vertex in the graph where the average is performed with the distribution (5). 3 Thermodynamics of atypical graphs 3.1 The mean-field Potts model It has long been known [15] that certain characteristics of random graphs are related to the thermodynamic properties of the Potts model [17, 18]. The Potts model is defined in terms of an energy function E({σ }) de- i pending onN spin variables σ ,i=1,...,N, whichmay take onq distinct i valuesσ =0,1,...,q−1. Inthemean-fieldvarianttheenergyfunctionreads q−1 1 E({σ })=− δ(σ ,σ )−h u δ(σ,σ ), (12) i i j σ i N i<j σ=0 i X X X 5 1 0.8 0.6 s0 0.4 0.2 0 0 1 2 3 4 5 6 7 β Figure1: Solutions (β,q)ofthesaddle-pointEq.(16)asfunctionofβ for 0 q = 1,2,4,6 (from left to right). For q ≤ 2 the non-trivial solution s > 0 0 branches off continuously from the high-temperature solution s = 0 at 0 β =q. For q >2 the new solution appears discontinuously at the spinodal point βs <q by a subcritical bifurcation. where hu is an auxiliary field parallel to the direction σ. The thermo- σ dynamic properties of the system at inverse temperature β can be derived from the partition function Z(β,h,q,{u },N)= exp(−βE({σ })) (13) σ i {Xσi} where the sumruns overallqN spinconfigurations{σ }. A standardanal- i ysis (cf. the appendix) gives for the free energy 1 f(β,h,q,{u })=− lim lnZ(β,h,q,{u },N) (14) σ σ N→∞βN at h=0 the result 1 q−1 1 f(β,q)=extr − − s2− lnq s0 (cid:20) 2q 2q 0 β 1+(q−1)s q−1 0 + ln(1+(q−1)s )+ (1−s )ln(1−s ) . (15) 0 0 0 βq βq (cid:21) The saddle-point value s (β,q) extremizing the expression in the brackets 0 is the stable solution of the equation 1+(q−1)s eβs0 = 0. (16) 1−s 0 6 Clearlys =0isalwaysasolutionofthisequation. Itis,however,unstable 0 forlargeβ andanother,non-trivialsolutionbecomesstablewhichdescribes the spontaneous appearance of order in the low temperature phase. Fig. 1 displaysthesolutionsof(16)asfunctionofβ fordifferentvaluesofq. Note the subcriticial bifurcation in s (β,q) for q >2. 0 3.2 Diagrammatic expansion of the Potts model The relation between the Potts model and the random graph ensemble introduced in section 2 becomes apparent when considering the high- temperature expansion of the free energy (14) of the Potts model. Since the Kronecker delta can take only the values zero or unity, the partition function (13) can be recast into the form [15] Z(β,h,q,{uσ},N)= [1+wδ(σi,σj)] eβh σuσ iδ(σi,σ), (17) {Xσi} Yi<j P P where β β 1 w =exp( )−1= +O( ). (18) N N N2 When expanding the product appearing in (17) we obtain a sum of 2N(N−1)/2 terms each of which is in one–to–one correspondence with a graph. The N vertices of this graph represent the Potts variables σ , i whereas an edge (i,j) stands for a factor wδ(σ ,σ ). Performing the trace i j over the configurations {σ } for each term in the sum, i.e. for each graph, i separately,the Kroneckerdeltas constrainthe Potts variablesbelonging to one component of the graph to the same value. As a result we find the Potts partition function as a sum over graphs in the form C(G)−1 Z(β,h,q,{u },N)= wL(G) eβhuσSn (19) σ XG nY=0 (cid:16)Xσ (cid:17) where the product is over all components of the graph and S denotes the n sizeofthen-thcomponent. Wewillassumethatn=0referstothelargest component. From (19) and (18) we find L(G) β Z(β,h=0,q,N)= qC(G). (20) N G (cid:18) (cid:19) X and comparison with (6) and (2) yields to leading order in N γN Z(γ,h=0,q,N)=e 2 Z(γ,q,N). (21) 7 Correspondingly from (7) and (14) it follows that 1 1 f(γ,q)=− − ϕ(γ,q). (22) 2 γ Eqs.(21)and(22)establishtherelationbetweentherandomgraphensem- ble defined in section 2 and the statistical mechanics of the Potts model sketched in section 3.1. In particular we obtain from (22) and (15) γ q−1 ϕ(γ,q)=extr (s2−1)+lnq s0 (cid:20)2 q 0 1+(q−1)s q−1 0 − ln(1+(q−1)s )− (1−s )ln(1−s ) (23) 0 0 0 q q (cid:21) from which ω(c,γ) follows with the help of the Legendre transform (9), (10). The equation for the saddle-point value s (γ,q) in (23) is from (16) 0 1+(q−1)s eγs0 = 0. (24) 1−s 0 Differentiating(19)foru =δ(σ,0)withrespecttoh,sendingfirstN →∞ σ and then h→0 from above,one can show that the stable solution s (γ,q) 0 of(24)is nothingbut the averagefractionofverticesinthelargestcompo- nent s =S /N in an ensemble of random graphs with biased probability 0 0 (5). Hence the phase transition in the Potts model describing the appear- ance of a spontaneous magnetization at sufficiently low temperature 1/β correspondsto a percolationtransitionin the randomgraphensemble giv- ingbirthtoagiantcomponentwithextensivelymanyverticesatsufficiently large connectivity parameter γ. Wealsonoteforlaterconveniencethatusing(24)in(23)theexpression for ϕ(γ,q) can be rewritten as γ q−1 γs ϕ(γ,q)=− (1+s2)− 0 +ln(q−1+eγs0). (25) 2 q 0 q It is finally useful to write (19) with β replaced by γ in the form Z(γ,h,q,{uσ},N)=Z(γ,h=0,q,N)e−γ2N C(G)−1 exp ln( eγhuσSn(G))−C(G)lnq , (26) D (cid:16) nX=0 Xσ (cid:17)E where the average h...i is with respect to the biased probability (5). Sin- gling out a possible giant component of size S = Ns and grouping to- 0 0 gether all small components of the same size we then obtain for the free 8 energy (14) 1 f(γ,h,q,{u })=f(γ,h=0,q)+ − σ 2 1 lim ln exp(N γhs (G) + ψ(S,G)ln eγhuσS −clnq ) . 0 N→∞γN D (cid:2) XS Xσ (cid:3) E (27) Here we have introduced the number of components of size S of graph G divided by N C(G)−1 1 ψ(S,G)= δ(S,S (G)) . (28) n N n=1 X Eq. (27) forms a suitable starting point for the characterization of the distribution of small components from the Potts free energy. 3.3 Properties of atypical graphs The connection between the Potts free energy and the component gen- erating function of Erdo¨s-R´enyi graphs allows to elucidate several large deviation properties of the random graph ensemble. First we get for the averagenumber of edges per vertex from (11) and (23) γ ℓ(γ,q)= 1+(q−1)s2(γ,q) . (29) 2q 0 (cid:0) (cid:1) Thedependenceofℓontherelativesizeofthegiantcomponents (γ,q)for 0 q 6=1 indicates a non-trivial internal organizationof edges in rare graphs. For the number of components of graphs dominating the distribution (5) we find from (23) and (10) γ c(γ,q)= 1−s (γ,q) 1− 1−s (γ,q) . (30) 0 0 2q (cid:0) (cid:1)(cid:16) (cid:0) (cid:1)(cid:17) The above equations already give access to some microscopic information on edges and vertices belonging to the giant component or to the small components. Call ℓ and ℓ the average numbers of edges inside and in out outside the giant component divided by N respectively. Obviously, ℓ + in ℓ = ℓ. In addition, since almost all small components are trees (cf. out section 4.1), the number of these components is related to the number of edges they containthroughc=1−s −ℓ . Fromthese two relations,we 0 out obtain γ ℓ (γ,q) = 2s +(q−2)s2 in 2q 0 0 γ (cid:0) 2 (cid:1) ℓ (γ,q) = 1−s (31) out 0 2q (cid:0) (cid:1) 9 4 3 2.5 3 2 γd(=0.25,q)1.5 γd(=3,q)2 1 1 0.5 0 0 0.01 0.1 q 1 10 0.1 1 q qM 10 Figure 2: Average degrees d (γ,q) (dashed top) and d (γ,q) (full) as in out functions of q according to (32) for γ = 0.25 (left) and γ = 3 (right). The symbols indicate numerical results (diamond=inside, circle=outside thegiantcomponent). Thestatisticalerrorbarsaremuchsmallerthanthe symbol size. from which we deduce the average degrees γ d (γ,q) = 2+(q−2)s in 0 q γ(cid:0) (cid:1) d (γ,q) = 1−s (32) out 0 q (cid:0) (cid:1) ofverticesinsideandoutsidethegiantcomponentrespectively. Thedepen- dence of these degrees on q for one particular value of γ is shown in Fig. 2 together with results from numerical simulations described in section 5. In order to calculate the complete spectrum ω(c,q) using the Legendre transform(9)weneedtoknowϕ(γ,q)forgeneralrealq >0. Wehavehence to study the extremization over s in (23) for fixed γ and variable q. This issomewhatcomplementarytowhatisdoneinthe statisticalmechanicsof the Potts model where the free energy (15) is minimized for integer q ≥2 anddifferentvaluesofβ. Herewehavetokeepinmindthattheextremum in (23) is a minimum if q >1 but a maximum if 0<q <11. The dependence of ϕ(γ,q) on q is qualitatively different for γ ≤ 2 and γ >2asshowninFig.3. Forγ ≤2thestablesolutionof(24)ispositivefor q <γ,goestozeroforq →γ,andisidenticallyzeroforq >γ (cf. leftinset in Fig. 3). Accordingly ϕ(γ,q) shows a second order phase transition at 1The reason for this is that due to the constraint (120) the free energy depends on (q−1)variables,anumberwhichbecomes negativeforq<1. 10

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.