A maximum entropy framework for non-exponential distributions

Jack Peterson,1,2 Purushottam D. Dixit,3 and Ken A. Dill2
1 Oregon State University, Department of Mathematics, Corvallis, OR
2 Laufer Center for Physical and Quantitative Biology, Departments of Physics and Chemistry, Stony Brook University, NY
3 Center for Computational Biology and Bioinformatics, Department of Biomedical Informatics, Columbia University, New York, NY

arXiv:1501.01049v1 [physics.soc-ph] 6 Jan 2015

Probability distributions having power-law tails are observed in a broad range of social, economic, and biological systems. We describe here a potentially useful common framework. We derive distribution functions $\{p_k\}$ for situations in which a 'joiner particle' $k$ pays some form of price to enter a 'community' of size $k-1$, where costs are subject to economies-of-scale (EOS). Maximizing the Boltzmann-Gibbs-Shannon entropy subject to this energy-like constraint predicts a distribution having a power-law tail; it reduces to the Boltzmann distribution in the absence of EOS. We show that the predicted function gives excellent fits to 13 different distribution functions, ranging from friendship links in social networks, to protein-protein interactions, to the severity of terrorist attacks. This approach may give useful insights into when to expect power-law distributions in the natural and social sciences.

Probability distributions are often observed to have power-law tails, particularly in social, economic, and biological systems. Examples include distributions of fluctuations in financial markets [1], the populations of cities [2], the distribution of website links [3], and others [4, 5]. Such distributions have generated much popular interest [6, 7] because of their association with rare but consequential events, such as stock market bubbles and crashes.

If sufficient data is available, finding the mathematical shape of a distribution function can be as simple as curve-fitting, with a follow-up determination of the significance of the mathematical form used to fit it. On the other hand, it is often interesting to know if the shape of a given distribution function can be explained by an underlying generative principle. Principles underlying power-law distributions have been sought in various types of models. For example, the power-law distributions of node connectivities in social networks have been derived from dynamical network evolution models [8–17]. A large and popular class of such models is based on the 'preferential attachment' rule [18–27], wherein it is assumed that new nodes attach preferentially to the largest of the existing nodes. Explanations for power-laws are also given by Ising models in critical phenomena [28–34], network models with thresholded 'fitness' values [35] and random-energy models of hydrophobic contacts in protein interaction networks [36].

However, such approaches are often based on particular mechanisms or processes. They often predict particular power-law exponents, for example. Our interest here is in finding a broader vantage point, as well as a common language, for describing a range of distributions, from power-law to exponential. For deriving exponential distributions, a well-known general principle is the method of Maximum Entropy (Max Ent) in statistical physics [37, 38]. In such problems, you want to choose the best possible distribution from all candidate distributions that are consistent with a certain set of constrained moments, such as the average energy. For this type of problem, which is highly underdetermined, a principle is needed for selecting a 'best' mathematical function from among alternative model distribution functions. To find the mathematical form of the distribution function $p_k$ over states $k = 1, 2, 3, \ldots$, the Max Ent principle asserts that you should maximize the Boltzmann-Gibbs-Shannon (BGS) entropy functional $S[\{p_k\}] = -\sum_k p_k \log p_k$ subject to constraints, such as the known value of the average energy $\langle E \rangle$. This procedure gives the exponential (Boltzmann) distribution, $p_k \propto e^{-\beta E_k}$, where $\beta$ is the Lagrange multiplier that enforces the constraint. This variational principle has been the subject of various historical justifications. It is now commonly understood as the approach that chooses the least-biased model that is consistent with the known constraint(s) [39].

Is there an equally compelling principle that would select fat-tailed distributions, given limited information? There is a large literature that explores this. Inferring non-exponential distributions can be done by maximizing a different mathematical form of entropy, rather than the Boltzmann-Gibbs-Shannon form. Examples of these non-traditional entropies include those of Tsallis [40], Rényi [41], and others [42, 43]. For example, the Tsallis entropy is defined as $\frac{K}{1-q}\left(\sum_k p_k^q - 1\right)$, where $K$ is a constant and $q$ is a parameter for the problem at hand. Such methods otherwise follow the same strategy as above: maximizing the chosen form of entropy subject to an extensive energy constraint gives non-exponential distributions. The Tsallis entropy has been applied widely [44–53].

FIG. 1. $\mu_k$ is the joining cost for a particle to join a size $k-1$ community. This diagram can describe particles forming colloidal clusters, or social processes such as people joining cities, citations added to papers, or link creation in a social network.

However, we adopt an alternative way to infer non-exponential distributions. To contrast our approach, we first switch from probabilities to their logarithms. Logarithms of probabilities can be parsed into energy-like and entropy-like components, as is standard in statistical physics. Said differently, a nonexponential distribution that is derived from a Max Ent principle requires
that there be non-extensivity in either an energy-like or entropy-like term; that is, it is non-additive over independent subsystems, not scaling linearly with system size. Tsallis and others have chosen to assign the non-extensivity to an entropy term, and retain extensivity in an energy term. Here, instead, we keep the canonical BGS form of entropy, and invoke a non-extensive energy-like term. In our view, only the latter approach is consistent with the principles elucidated by Shore and Johnson (SJ) [37] (reviewed in [39]). SJ showed that the Boltzmann-Gibbs-Shannon form of entropy is uniquely the mathematical function that ensures satisfaction of the addition and multiplication rules of probability. SJ asserts that any form of entropy other than BGS will impart a bias that is unwarranted by the data it aims to fit. We regard the SJ argument as a compelling first-principles basis for defining a proper variational principle for modeling distribution functions. Here, we describe a variational approach based on the BGS entropy function, and we seek an explanation for power-law distributions in the form of an energy-like function instead.

I. THEORY

A. Assembly of simple colloidal particles

We frame our discussion in terms of a 'joiner particle' that enters a cluster or community of particles, as shown in Fig. 1. On the one hand, this is a natural way to describe the classical problem of the colloidal clustering of physical particles; it is readily shown (reviewed below) to give an exponential distribution of cluster sizes. On the other hand, this general description also pertains more broadly, such as when people populate cities, links are added to websites, or when papers accumulate citations. We want to compute the distribution, $p_k$, of populations of communities having size $k = 1, 2, \ldots, N$.

To begin, we express a cumulative 'cost' of joining. For particles in colloids, this cost is expressed as a chemical potential, i.e., a free energy per particle. If $\mu_j$ represents the cost of adding particle $j$ to a cluster of size $j-1$, the cumulative cost of assembling a whole cluster of $k$ particles is the sum,

$$w_k = \sum_{j=1}^{k-1} \mu_j. \qquad (1)$$

Max Ent asserts that we should choose the probability distribution that has the maximum entropy amongst all candidate distributions that are consistent with the mean value $\langle w \rangle$ of the total cost of assembly [54],

$$p_k = \frac{e^{-\lambda w_k}}{\sum_i e^{-\lambda w_i}}, \qquad (2)$$

where $\lambda$ is a Lagrange multiplier that enforces the constraint.

In situations where the cost of joining does not depend on the size of the community a particle joins, $\mu_k = \mu_\circ$, where $\mu_\circ$ is a constant. The cumulative cost of assembling the cluster is then

$$w_k = (k-1)\,\mu_\circ. \qquad (3)$$

Substituting into Eq. 2 and absorbing the Lagrange multiplier $\lambda$ into $\mu_\circ$ yields the grand canonical exponential distribution, well known for problems such as this:

$$p_k = \frac{e^{-\mu_\circ k}}{\sum_i e^{-\mu_\circ i}}. \qquad (4)$$

In short, when the joining cost of a particle entry is independent of the size of the community it enters, the community size distribution is exponential.

B. Communal assemblies and 'economies of scale'

Now, we develop a general model of communal assembly based on 'economies of scale'. Consider a situation where the joining cost for a particle depends on the size of the community it joins. In particular, consider situations in which the costs are lower for joining a larger community. Said differently, the 'cost-minus-benefit' function $\mu_k$ is now allowed to be subject to 'economies of scale', which, as we note below, can also be interpreted instead as a form of discount in which the community pays down some of the joining costs for the joiner particle.

To see the idea of an economy-of-scale cost function, imagine building a network of telephones. In this case, a community of size 1 is a single unconnected phone. A community of size 2 is two connected phones, etc. Consider the first phone: The cost of creating the first phone is high because it requires initial investment in the phone assembly plant. And the benefit is low, because there is no value in having a single phone. Now, for the second phone, the cost-minus-benefit is lower. The cost of producing the second phone is lower than the first since the production plant already exists. And the benefit is higher because two connected phones are more useful than one unconnected phone. For the third phone, the cost-minus-benefit is even lower than for the second because the production cost is even lower (economy of scale) and because the benefits increase with the number of phones in the network.

To illustrate, suppose the cost-minus-benefit for the first phone is 150, for the second phone is 80, and for the third phone is 50. To express these cost relationships, we define an 'intrinsic cost' for the first phone (joiner particle), 150 in this example. And, we define the difference in cost-minus-benefit between the first and second phones as the discount provided 'by the first phone' when the second phone 'joins the community' of two phones. In this example, the first phone provides a discount of 70 when the second phone joins. Similarly, the total discount provided by the two-phone community is 100 when the third phone joins the community.

In this language, the existing 'community' is 'paying down' some fraction of the joining costs for the next particle. Mathematically, this communal cost-minus-benefit function can be expressed as

$$\mu_k = \mu_\circ - \frac{k\,\mu_k}{k_0}. \qquad (5)$$

The quantity $\mu_k$ on the left side of Eq. 5 is the total cost-minus-benefit when a particle joins a $k$-mer community. The joining cost has two components, expressed on the right side: each joining event has an intrinsic cost $\mu_\circ$ that must be paid, and each joining event involves some discount that is provided by the community. Because there are $k$ members of the existing community, the quantity $\mu_k/k_0$ is the discount given to a joiner by each existing community particle, where $k_0$ is a problem-specific parameter that characterizes how much of the joining cost burden is shouldered by each member of the community. In the phone example, we assumed $k_0 = 1$. The value of $k_0 = 1$ represents fully equal cost sharing between joiner and community member: each communal particle gives the joining particle a discount equal to what the joiner itself pays. The opposite extreme limit is represented by $k_0 \to \infty$: in this case, the community gives no discount at all to the joining particle.

The idea of communal sharing of cost-minus-benefit is applicable to various domains. It can express that one person is more likely to join a well-populated group on a social-networking site because the many existing links to it make it easier to find (i.e., lower cost) and because its bigger hub offers the newcomer more relationships to other people (i.e., greater benefit). Or, it can express that people prefer larger cities to smaller ones because of the greater benefits that accrue to the joiner in terms of jobs, services and entertainment. (In our terminology, a larger community 'pays down' more of the cost-minus-benefit for the next immigrant to join.) We use the terms 'economy-of-scale' (EOS), or 'communal', to refer to any system that can be described by a cost function such as Eq. 5, in which the community can be regarded as sharing in the joining costs, although other functional forms might also be of value for expressing EOS.

Rearranging Eq. 5 gives $\mu_k = \mu_\circ k_0/(k + k_0)$. The total cost-minus-benefit, $w_k$, of assembling a community of size $k$ is

$$w_k = \mu_\circ k_0 \sum_{j=1}^{k-1} \frac{1}{j + k_0} = \mu_\circ k_0\, \psi(k + k_0) - C, \qquad (6)$$

where $\psi(k) = -\gamma + \sum_{j=1}^{k-1} j^{-1}$ is the digamma function ($\gamma = 0.5772\ldots$ is Euler's constant), and the constant term $C = \mu_\circ k_0 \psi(k_0) + \mu_\circ$ will be absorbed into the normalization.

From this cost-minus-benefit expression (Eq. 6), for a given $k_0$, we can now uniquely determine the probability distribution by maximizing the entropy. Substituting Eq. 6 into Eq. 2 yields

$$p_k = \frac{e^{-\mu_\circ k_0 \psi(k + k_0)}}{\sum_i e^{-\mu_\circ k_0 \psi(i + k_0)}}. \qquad (7)$$

Eq. 7 describes a broad class of distributions. These distributions have a power-law tail for large $k$, with exponent $\mu_\circ k_0$, and a cross-over at $k = k_0$ from exponential to power-law. To see this, expand $\psi(k + k_0)$ asymptotically and drop terms of order $1/k^2$. This yields $w_k \sim \mu_\circ k_0 \ln\left(k + k_0 - \tfrac{1}{2}\right)$, so Eq. 7 obeys a power-law $p_k \sim \left(k + k_0 - \tfrac{1}{2}\right)^{-\mu_\circ k_0}$ for large $k$. $p_k$ becomes a simple exponential in the limit of $k_0 \to \infty$ (zero cost sharing). One quantitative measure of a distribution's position along the continuum from exponential to power-law is the value of its scaling exponent, $\mu_\circ k_0$. A small exponent indicates that the system has extensive social sharing, thus power-law behavior. As the exponent becomes large, the distribution approaches an exponential function. Eq. 7 has a power-law scaling only when the cost of joining a community has a linear dependence on the community size. The linear dependence arises because the joiner particle interacts identically with all other particles in the community.

What is the role of detailed balance in our modeling? Figure 1 shows no reverse arrows from $k$ to $k-1$. The principle of maximum entropy can be regarded as a general way to infer distribution functions from limited information, irrespective of whether there is an underlying kinetic model. So, it poses no problem that some of our distributions, such as scientific citations, are not taken from 'reversible' processes.

FIG. 2. Eq. 7 gives good fits ($P > 0.05$; see SI for details) to 13 empirical distributions, with the values of $\mu_\circ$ and $k_0$ given in Table I. Points are empirical data, and lines represent best-fit distributions. The probability $p_k$ of exactly $k$ is shown in blue, and the probability of at least $k$ (the complementary cumulative distribution, $\sum_{j=k}^{\infty} p_j$) is shown in red. Descriptions and references for these datasets can be found in the SI.
[Panels, log-log probability versus size: GitHub (users per project), Wikipedia (edits), Pretty Good Privacy (interactions), Word adjacency (words), Terrorist attacks (deaths), Facebook (posts), Proteins (fly), Proteins (yeast), Proteins (human) (interactions), Digg (replies), Petster (friends), Word use (times used), Software (class dependencies).]

II. RESULTS

Eq. 7 and Fig. 2 show the central results of this paper. Consider three types of plots. On the one hand, exponential functions can be seen in data by plotting $\log p_k$ vs $k$.
Or, power-law functions are seen by plotting k logp vs logk. Here, we find that plotting logp vs a k k digamma function provides a universal fit to several dis- 10−4 parateexperimentaldatasetsovertheirfulldistributions (Fig. 3). Fig. 2 shows fits of Eq. 7 to 13 datasets, using µ andk asfittingparametersthataredeterminedbya 10−5 ◦ 0 maximum-likelihood procedure. (See SI for dataset and 0 2 4 6 8 µ w10 12 14 16 18 20 ◦ k 1 goodness-of-fit test details.) µ and k characterize the − ◦ 0 intrinsic cost of joining any cluster, and the communal FIG. 3. Eq. 7 fitted to the 13 datasets in Table 1, plotted contribution to sharing that cost, respectively. againstthetotalcosttoassembleasizekcommunity,µ◦w . k−1 Rare events are less rare under fat-tailed distributions Values of µ◦ and k are shown in Table 1. The y-axis has 0 thanunderexponentialdistributions. Fordynamicalsys- been re-scaled by dividing by the maximum p , so that all k tems, theriskofsucheventscanbequantifiedbytheco- curves begin at pk/max(pk)=1. All data sets are fit by the efficient of variation (CV), defined as the ratio of the logy=−x line. See Fig. 2 for fits to individual datasets. standard deviation σ to the mean (cid:104)k(cid:105). For equilib- k rium/steady state systems, the CV quantifies the spread of a probability distribution, and is determined by the III. DISCUSSION power-law exponent, µ k . Systems with small scaling ◦ 0 exponents (µ◦k0 ≤ 3) experience an unbounded, power- We have expressed a range of probability distribu- law growth of their CV as the system size N becomes tions in terms of a generalized energy-like cost func- large, σk/(cid:104)k(cid:105)∼Nβ. This growth is particularly rapid in tion. In particular, we have considered types of costs systemswith1.8<µ◦k0 <2.2,becausetheaveragecom- that can be subject to economies of scale, which we munity size (cid:104)k(cid:105) diverges at µ◦k0 =2. For these systems, have also called ‘community discounts’. We maximize β =1/2isobserved. 
Severalofourdatasetsfallintothis the Boltzmann-Gibbs-Shannon entropy, subject to such ‘high-risk’ category, such as the number of deaths due to cost-minus-benefit functions. This procedure predicts terrorist attacks (Table 1). probability distributions that are exponential functions of a digamma function. Such a distribution function has a power-law tail, but reduces to a Boltzmann dis- TABLE I. Fitting parameters and statistics. tribution in the absence of EOS. This function gives goodfitstodistributionsrangingfromscientificcitations Data set µ◦ k (cid:104)k(cid:105) N µ◦k P andpatents,toprotein-proteininteractions,tofriendship 0 0 networks, and to weblinks and terrorist networks – over GitHub 9(1) 0.21(2) 3.642 120,866 2(2) 0.78 their full distributions, not just in their tails. Wikipedia 1.5(1) 1.3(1) 25.418 21,607 1.9(1) 0.79 Framedinthisway,eachnew‘joinerparticle’mustpay PGP 1(1) 2.6(2) 4.558 10,680 2.6(3) 0.16 an intrinsic buy-in cost to join a community, but that Word adjacency 3.6(4) 0.6(1) 5.243 11,018 2.1(3) 0.09 cost may be reduced by a communal discount (an econ- Terrorists 2.1(2) 1(1) 4.346 9,101 2.2(3) 0.38 omy of scale).Here, we discuss a few points. First, both Facebook wall 1.6(1) 2.3(3) 2.128 10,082 3.6(5) 0.99 exponential and power-law distributions are ubiquitous. Proteins, fly 0.9(2) 5(2) 2.527 878 5(2) 0.89 How can we rationalize this? One perspective is given Proteins, yeast 0.9(1) 4(1) 3.404 2,170 3(1) 0.48 by switching viewpoint from probabilities to their loga- rithms, which are commonly expressed in a language of Proteins, human 0.8(1) 4(1) 3.391 3,165 4(1) 0.52 dimensionless cost functions, such as energy/RT. There Digg 0.68(3) 4.2(3) 5.202 16,844 2.8(2) 0.05 are many forms of energy (gravitational, magnetic, elec- Petster 0.21(3) 15(3) 13.492 1,858 3(1) 0.08 trostatic, springs, interatomic interactions, etc). 
The Word use 2.3(1) 0.8(1) 11.137 18,855 1.9(2) 0.56 ubiquity of the exponential distribution can be seen in Software 0.8(1) 2.1(3) 62.82 2,208 1.7(3) 0.69 terms of the diversity and interchangeability of energies. A broad swath of physics problems can be expressed in terms of the different types of energy and their ability 6 to combine, add or exchange with each other in various ACKNOWLEDGMENTS ways. Here, we indicate that non-exponential distribu- tionstoo,canbeexpressedinalanguageofcosts,partic- We thank A. de Graff, H. Ge, D. Farrell, K. Ghosh, ularlythosethataresharedandaresubjecttoeconomies S. Maslov, and C. Shalizi for helpful discussions, and of scale. Second, where do we expect exponentials vs. K.Sneppen,M.S.Shell,andH.Qianforcommentsonour power-laws? What sets Eq. 5 apart from typical en- manuscript. J.P.thankstheU.S.DepartmentofDefense ergy functions in physical systems is that EOS costs are forfinancialsupportfromaNationalDefenseScienceand bothindependentofdistanceandlong-ranged(thejoiner Engineering Graduate Fellowship. J.P. and K.D. thank particle interacts with all particles in given community). the NSF and Laufer Center for support. P.D. acknowl- Consequently, when the system size becomes large, due edges Department of Energy grant PM-031 from the Of- to the absence of a correlation length-scale, the energy fice of Biological Research. of the system does not increase linearly with system size givingrisetoanon-extensiveenergyfunction. Thisview is consistent with the appearance of power-laws in crit- ical phenomena, where interactions are effectively long- ranged. Third, interestingly, the concept of cost-minus-benefit in Eq. 5 can be further generalized, also leading to either Gaussian or stretched-exponential distributions. A Gaussian distribution results when the cost-minus- benefit function grows linearly with cluster size, µ ∼k; k this would arise if the joiner particle were to pay a tax to each member of a community. 
This leads to a total cost of w ∼ k2 (see Eq. 1). These would be ‘hostile’ k communities, leading to mostly very small communities and few large ones, since a Gaussian function drops off evenfasterwithkthananexponentialdoes. Anexample wouldbeaCoulombicparticleofchargeq joiningacom- munity of k other such charged particles, as in the Born modelofionhydration[55]. Astretched-exponentialdis- tribution can arise if the joiner particle instead pays a taxtoonlyasubsetofthecommunity. Forexample,ina chargedspherewithstrongshielding,ifonlytheparticles at the sphere’s surface interact with the joiner particle, then µ ∼ k2/3 and w ∼ k5/3, leading to a stretched- k k exponential distribution. In these situations, EOS can affect the community-size distribution not only through cost sharing but also through the topology of interac- tions. Finally, we reiterate a matter of principle. On the onehand,non-exponentialdistributionscouldbederived by using a non-extensive entropy-like quantity, such as those of Tsallis, combined with an extensive energy-like quantity. Here, instead, our derivation is based on using the Boltzmann-Gibbs-Shannon entropy combined with a non-extensive energy-like quantity. We favor the latter because it is consistent with the foundational premises of Shore and Johnson [37]. In short, in the absence of energies or costs, the BGS entropy alone predicts a uni- form distribution; any other alternative would introduce bias and structure into p that is not warranted by the k data. Models based on non-extensive entropies, on the other hand, intrinsically prefer larger clusters, but with- out any basis to justify them. The present treatment invokes the same nature of randomness as when physical particles populate energy levels. The present work pro- vides a cost-like language for expressing various different types of probability distribution functions. 7 [1] Mantegna,R.&Stanley,H. Scalingbehaviourinthedy- [22] Yook, S.-H., Jeong, H. & Baraba´si, A.-L. 
Modeling the namicsofaneconomicindex. Nature 376,46–49(1995). internet’s large-scale topology. Proceedings of the Na- [2] Zipf, G. K. Human Behavior and the Principle of Least tional Academy of Sciences 99, 13382–13386 (2002). Effort (Addison-Wesley: Cambridge, 1949). [23] Capocci,A.et al. Preferentialattachmentinthegrowth [3] Broder, A. et al. Graph structure in the web. Computer of social networks: The internet encyclopedia wikipedia. Networks 33, 309 (2000). Physical Review E 74, 036116 (2006). [4] Newman,M. Powerlaws,ParetodistributionsandZipf’s [24] Newman, M. E. Clustering and preferential attachment law. Contemporary Physics 46, 323 (2005). in growing networks. Physical Review E 64, 025102 [5] Clauset, A., Shalizi, C. & Newman, M. Power-law dis- (2001). tributions in empirical data. SIAM Review 51, 661–703 [25] Jeong, H., N´eda, Z. & Baraba´si, A.-L. Measuring pref- (2009). erential attachment in evolving networks. EPL (Euro- [6] Taleb, N. N. The Black Swan: The Impact of the Highly physics Letters) 61, 567 (2003). Improbable (Random House, 2007). [26] Poncela,J.,Go´mez-Garden˜es,J.,Flor´ıa,L.M.,Sa´nchez, [7] Bremmer, I. & Keats, P. The Fat Tail: The Power of A. & Moreno, Y. Complex cooperative networks from Political Knowledge for Strategic Investing (OxfordUni- evolutionarypreferentialattachment. PLoSone 3,e2449 versity Press, 2009). (2008). [8] Va´zquez, A., Flammini, A., Maritan, A. & Vespignani, [27] Peterson, G., Press´e, S. & Dill, K. Nonuniversal power A. Modelingofproteininteractionnetworks. ComPlexUs law scaling in the probability distribution of scientific 1, 38–44 (2003). citations. Proc. Natl. Acad. Sci. USA 107, 16023–16027 [9] Berg, J., Lassig, M. & Wagner, A. Structure and evolu- (2010). tion of protein interaction networks: a statistical model [28] Fisher, M. The renormalization group in the theory of for link dynamics and gene duplications. BMC Evolu- critical behavior. Reviews of Modern Physics 46, 597– tionary Biology 4, 51–63 (2004). 
616 (1974). [10] Maslov, S., Krishna, S., Pang, T. Y. & Sneppen, K. [29] Yeomans, J. Statistical Mechanics of Phase Transitions Toolboxmodelofevolutionofprokaryoticmetabolicnet- (Oxford Science, 1992). works and their regulation. Proceedings of the National [30] Stanley, H. Scaling, universality, and renormaliza- Academy of Sciences 106, 9743–9748 (2009). tion: Three pillars of modern critical phenomena. [11] Pang, T. Y. & Maslov, S. A toolbox model of evolution Rev. Mod. Phys. 71, S358–S366 (1999). ofmetabolicpathwaysonnetworksofarbitrarytopology. [31] Gefen, Y., Mandelbrot, B. B. & Aharony, A. Critical PLoS Computational Biology 7, e1001137 (2011). phenomena on fractal lattices. Physical Review Letters [12] Leskovec, J., Chakrabarti, D., Kleinberg, J., Faloutsos, 45, 855–858 (1980). C.&Ghahramani,Z. Kroneckergraphs: Anapproachto [32] Fisher, D. S. Scaling and critical slowing down in modeling networks. J. Mach. Learn. Res. 11, 985–1042 random-field ising systems. Physical Review Letters 56, (2010). 416–419 (1986). [13] Karagiannis,T.,LeBoudec,J.-Y.&Vojnovic,M. Power [33] Suzuki,M.&Kubo,R.Dynamicsoftheisingmodelnear lawandexponentialdecayofintercontacttimesbetween thecriticalpoint. J. Phys. Soc. Japan 24,51–60(1968). mobile devices. Mobile Computing, IEEE Transactions [34] Glauber, R. J. Time-dependent statistics of the ising on 9, 1377–1390 (2010). model. Journal of mathematical physics 4, 294 (1963). [14] Shou, C. et al. Measuring the evolutionary rewiring [35] Caldarelli,G.,Capocci,A.,DeLosRios,P.&Munoz,M. of biological networks. PLoS computational biology 7, Scale-free networks from varying vertex intrinsic fitness. e1001050 (2011). Physical Review Letters 89, 258702–5 (2002). [15] Fortuna, M. A., Bonachela, J. A. & Levin, S. A. Evolu- [36] Deeds, E., Ashenberg, O. & Shakhnovich, E. A sim- tion of a modular software network. Proceedings of the ple physical model for scaling in protein-protein interac- National Academy of Sciences 108,19985–19989(2011). 
tion networks. Proc. Nat. Acad. Sci. USA 103, 311–316 (2006).
[16] Peterson, G., Pressé, S., Peterson, K. & Dill, K. Simulated evolution of protein-protein interaction networks with realistic topology. PLoS ONE 7, e39052 (2012).
[17] Pang, T. Y. & Maslov, S. Universal distribution of component frequencies in biological and technological systems. Proceedings of the National Academy of Sciences (2013).
[18] Simon, H. On a class of skew distribution functions. Biometrika 42, 425–440 (1955).
[19] de Solla Price, D. A general theory of bibliometric and other cumulative advantage processes. J. Am. Soc. Inform. Sci. 27, 292–306 (1976).
[20] Barabási, A. & Albert, R. Emergence of scaling in random networks. Science 286, 509–512 (1999).
[21] Vázquez, A. Growing network with local rules: Preferential attachment, clustering hierarchy, and degree correlations. Physical Review E 67, 056104 (2003).
[37] Shore, J. & Johnson, R. Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy. IEEE Transactions on Information Theory 26, 26–37 (1980).
[38] Jaynes, E. T. Information theory and statistical mechanics. Physical Review 106, 620 (1957).
[39] Pressé, S., Ghosh, K., Lee, J. & Dill, K. A. The principles of Maximum Entropy and Maximum Caliber in statistical physics. Rev. Mod. Phys. 85, 1115–1141 (2013).
[40] Tsallis, C. Possible generalization of Boltzmann-Gibbs statistics. Journal of Statistical Physics 52, 479–487 (1988).
[41] Rényi, A. On measures of entropy and information. In Fourth Berkeley Symposium on Mathematical Statistics and Probability, 547–561 (1961).
[42] Aczél, J. & Daróczy, Z. On measures of information and their characterizations, vol. 40 (Academic Press New York, 1975).
[43] Amari, S.-i. Differential-geometrical methods in statistics (Springer, 1985).
[44] Lutz, E. Anomalous diffusion and Tsallis statistics in an optical lattice. Physical Review A 67, 051402 (2003).
[45] Douglas, P., Bergamini, S. & Renzoni, F. Tunable Tsallis distributions in dissipative optical lattices. Physical Review Letters 96, 110601 (2006).
[46] Burlaga, L. & Vinas, A. Triangle for the entropic index q of non-extensive statistical mechanics observed by Voyager 1 in the distant heliosphere. Physica A: Statistical Mechanics and its Applications 356, 375–384 (2005).
[47] Pickup, R., Cywinski, R., Pappas, C., Farago, B. & Fouquet, P. Generalized spin-glass relaxation. Physical Review Letters 102, 097202 (2009).
[48] DeVoe, R. G. Power-law distributions for a trapped ion interacting with a classical buffer gas. Physical Review Letters 102, 063001 (2009).
[49] Plastino, A. & Plastino, A. Non-extensive statistical mechanics and generalized Fokker-Planck equation. Physica A: Statistical Mechanics and its Applications 222, 347–354 (1995).
[50] Tsallis, C. & Bukman, D. J. Anomalous diffusion in the presence of external forces: Exact time-dependent solutions and their thermostatistical basis. Physical Review E 54, R2197–R2200 (1996).
[51] Caruso, F. & Tsallis, C. Nonadditive entropy reconciles the area law in quantum systems with classical thermodynamics. Physical Review E 78, 021102 (2008).
[52] Abe, S. Axioms and uniqueness theorem for Tsallis entropy. Physics Letters A 271, 74–79 (2000).
[53] Gell-Mann, M. & Tsallis, C. Nonextensive Entropy: Interdisciplinary Applications (Oxford University Press, USA, 2004).
[54] Dill, K. & Bromberg, S. Molecular Driving Forces: Statistical Thermodynamics in Biology, Chemistry, Physics, and Nanoscience (Garland Science, 2010), 2nd edn.
[55] Born, M. Volumes and heats of hydration of ions (translated). Z. Phys 1, 45–48 (1920).
[56] Choudhury, M. D., Sundaram, H., John, A. & Seligmann, D. D. Social synchrony: Predicting mimicry of user actions in online social media. In Proc. Int. Conf. on Computational Science and Engineering, 151–158 (2009).
[57] Boguna, M., Pastor-Satorras, R., Diaz-Guilera, A. & Arenas, A. Models of social networks based on social distance attachment. Physical Review E 70, 056122 (2004).
[58] Patil, A., Nakai, K. & Nakamura, H. Hitpredict: a database of quality assessed protein-protein interactions in nine species. Nucleic Acids Research 39 (suppl 1), D744–D749 (2011).
[59] Clauset, A., Young, M. & Gleditsch, K. On the frequency of severe terrorist events. J. Conflict Resolution 51, 58 (2007).
[60] Wikimedia Foundation. Wikimedia downloads. http://dumps.wikimedia.org/ (2010).
[61] Viswanath, B., Mislove, A., Cha, M. & Gummadi, K. On the evolution of user interaction in Facebook. In Proceedings of the 2nd ACM SIGCOMM Workshop on Social Networks (WOSN'09) (2009).

Appendix A: Datasets and fitting

In this paper, we have two aims: (i) to use Max Ent to identify a general functional form for situations involving cost-minus-benefit constraints, and (ii) to fit data over a broad range of contexts. For (i), Max Ent predicts an exponential of a digamma function, provided that $k_0$ is known. It also shows how to compute a full distribution if we are given a single quantity, $\langle w \rangle$. For (ii), our aim is just to do simple curve-fitting of data, given the mathematical form from (i). In this case, we have no microscopic model for $k_0$, so we know neither $k_0$ nor $\mu^\circ$. In this case, our objective is not to find the full distribution from $\langle w \rangle$. Rather, for (ii), we are given the full distribution function, and our objective is to find the best values of $k_0$ and $\mu^\circ$ that fit it.

Our fitting procedure is as follows. We use the maximum likelihood estimation function mle in Matlab, with probability distribution specified by Eq. 7. Table I shows estimated $\mu^\circ$ and $k_0$ values, with 95% confidence intervals. We calculate goodness-of-fit P-values using the Monte Carlo simulation procedure (based on the Kolmogorov-Smirnov test) described in [5]. The 13 datasets shown here are above the P = 0.05 statistical significance threshold proposed in [5]. P-values shown in Table S1 are based on 1000 simulations for each dataset. We fit Eq. 7 to 13 data sets:

• Project membership on the social coding website GitHub (downloaded from konect.uni-koblenz.de)
• Replies between users of the social news website Digg [56]
• Friendships between users of the Petster social networking site Hamsterster (downloaded from konect.uni-koblenz.de)
• Occurrences of unique words in the novel Moby Dick [4]
• Class-class dependencies in the software libraries JUNG and javax (downloaded from konect.uni-koblenz.de)
• Interactions between users of the Pretty Good Privacy (PGP) secure data transfer algorithm [57]
• Words occurring immediately after one another in a Spanish book (downloaded from konect.uni-koblenz.de)
• Deaths resulting from terrorist attacks from February 1968 to June 2006 [59]
• Wall posts by users to their walls on the social-networking website Facebook, from a 2009 crawl of New Orleans Facebook [61]
• Pairwise, physical protein-protein interactions (PPI) of proteins detected in small-scale PPI network data, in yeast (Saccharomyces cerevisiae), fruit flies (Drosophila melanogaster), and humans (Homo sapiens) [58]
• Edits made by users of the English-language Wikipedia [60]

An overlay of all fits and datasets is shown in Fig. 3. Individual parameters and fits are shown in Table I and Fig. 2.

FIG. S1. $\mu^\circ$ plotted against $1/k_0$ for the 13 data sets listed in Table I. Error bars are 95% confidence intervals. Linear regression is shown as a solid line, $\mu^\circ = 0.516\,k_0^{-1} - 0.125$ ($R^2 = 0.991$). Each point on this plot represents an empirical data set.
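
The fit-then-test procedure of Appendix A (maximum likelihood fit, followed by a Monte Carlo goodness-of-fit P-value based on the Kolmogorov-Smirnov distance) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' Matlab code: Eq. 7 is not reproduced in this appendix, so a one-parameter geometric distribution on k = 1, 2, ... is swapped in as a stand-in model, and every function name below is hypothetical. The P-value is taken to be the fraction of synthetic data sets, each drawn from the fitted model and refitted, whose KS distance is at least as large as the observed one.

```python
import math
import random


def fit_geometric(data):
    """Closed-form MLE for the stand-in geometric pmf on k = 1, 2, ...: p = 1 / mean."""
    return len(data) / sum(data)


def geometric_cdf(k, p):
    """P(K <= k) under the stand-in geometric model."""
    return 1.0 - (1.0 - p) ** k


def ks_distance(data, p):
    """Largest gap between empirical and fitted CDFs over k = 1..max(data)."""
    n = len(data)
    counts = {}
    for k in data:
        counts[k] = counts.get(k, 0) + 1
    cumulative, worst = 0, 0.0
    for k in range(1, max(data) + 1):
        cumulative += counts.get(k, 0)
        worst = max(worst, abs(cumulative / n - geometric_cdf(k, p)))
    return worst


def sample_geometric(n, p, rng):
    """Draw n values by inverse-transform sampling."""
    if p >= 1.0:
        return [1] * n
    log_q = math.log(1.0 - p)
    # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
    return [max(1, math.ceil(math.log(1.0 - rng.random()) / log_q))
            for _ in range(n)]


def gof_p_value(data, n_sims=1000, seed=0):
    """Fit the model, then estimate a Monte Carlo goodness-of-fit P-value."""
    rng = random.Random(seed)
    p_hat = fit_geometric(data)
    d_obs = ks_distance(data, p_hat)
    exceed = 0
    for _ in range(n_sims):
        synthetic = sample_geometric(len(data), p_hat, rng)
        # Refit each synthetic data set before measuring its KS distance.
        if ks_distance(synthetic, fit_geometric(synthetic)) >= d_obs:
            exceed += 1
    return p_hat, d_obs, exceed / n_sims


if __name__ == "__main__":
    rng = random.Random(1)
    observed = sample_geometric(500, 0.3, rng)  # synthetic stand-in data
    p_hat, d_obs, p_value = gof_p_value(observed, n_sims=200)
    print(f"p_hat={p_hat:.3f}  KS={d_obs:.3f}  P={p_value:.2f}")
```

To apply the same skeleton to Eq. 7, the three model-specific pieces (the fit, the CDF, and the sampler) would be replaced by their two-parameter counterparts in $\mu^\circ$ and $k_0$, with the fit done by numerical maximization of the log-likelihood, since no closed form is assumed here.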
