ebook img

Emergence of Zipf's Law in the Evolution of Communication PDF

0.65 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Emergence of Zipf's Law in the Evolution of Communication

Emergence of Zipf’s Law in the Evolution of Communication Bernat Corominas-Murtra1, Jordi Fortuny Andreu2 and Ricard V. Sol´e1,3,4 1 ICREA-Complex Systems Lab, Universitat Pompeu Fabra -Parc de Recerca Biom`edica de Barcelona (PRBB). Dr Aiguader 88, 08003 Barcelona, Spain 2Centre de Lingu¨´ıstica Te`orica (CLT), Facultat de Lletres, Edifici B, Universitat Auto`noma de Barcelona, 08193 Bellaterra (Barcelona), Spain 3Santa Fe Institute, 1399 Hyde Park Road, New Mexico 87501, USA 4Institut de Biologia Evolutiva. CSIC-UPF. Passeig Mar´ıtim de la Barceloneta, 37-49, 08003 Barcelona, Spain. Zipf’s law seems to be ubiquitous in human languages and appears to be a universal property of complex communicating systems. Following the early proposal made by Zipf concerning the pres- enceofatensionbetweentheeffortsofspeakerandhearerinacommunicationsystem,weintroduce evolution by means of a variational approach to the problem based on Kullback’s Minimum Dis- criminationofInformationPrinciple. Therefore,usingaformalismfullyembeddedintheframework 1 ofinformationtheory, wedemonstratethatZipf’slawistheonlyexpectedoutcomeofanevolving, 1 communicative system under a rigorous definition of the communicative tension described by Zipf. 0 Keywords: Zipf’slaw,scaling,evolutionofcodes,MinimumDiscriminationofInformationPrinciple 2 n a I. INTRODUCTION a b J Ω Noise 3 1 Zipf’slawisoneofthemostcommonpowerlawsfound S in nature and society [1–6]. Although it was early ob- P DECOD ] served in the distribution of money income [7] and city X (n) X (n) O Ω s sizes[1],itwaspopularizedbythelinguistGeorgeKings- P = A leyZipf, who observedthat itaccounts forthe frequency . of words within written texts [2, 3]. Specifically, if we n Noise rankalltheoccurrencesofwordsinatextfromthemost i l common totheleast, Zipf’slawstates thatthe probabil- n ity q(s ) that in a random trial we find the m-th most P DECOD [ m common word (i=1,...,n) falls off as XΩ(n+1) Xs(n+1) 3 v 1 8 q(sm)= Zm−γ, H(Xs)=H(XΩ|Xs) minD(qn||qn+1) 3 9 where FIG. 1: A growing communication system. In (a) possible 0 (cid:88) meaning-signal associations made by the coder module P in . Z = j−γ, which eq. (2) holds is depicted. In (b) we summarize the 8 evolution rules of our communicative system. Suppose that 0 j≤n symmetry between coder and decoder -i.e., eq. (2)- holds for 0 with γ 1. The ubiquity of this scaling behavior sug- the step n (above). At each step (below) a new element is 1 ≈ gested several mechanisms to account for the emergence added to the set Ω and eq. (2) holds again for this new con- : v of this distribution, among many others, see [4, 8–12]. figuration. Furthermore,thenewconfigurationisconstrained Xi Within the context of human language, G. K. Zipf by the MDIP, which introduces a path dependency in the evolutionary process. early conjectured that this scaling law is the outcome of r a a tension between two forces acting in a communication system[3]. FollowingZipf’sproposal,speakersandhear- ers need to simultaneously minimize their efforts, under been provided under realistic, information-theoretic con- whathecalledvocabularybalance,aparticularcaseofthe straints. We can view the proposals made in [10, 11, 13] so-called Principle of Least Effort. This triggers a ten- as static for they consider a fixed size of the code. sionbetweenthetwocommunicativeagents,whiletrying Arecentapproach-whichgoesbeyondthecommunica- to simultaneously minimize their efforts. The speaker’s tive framework- defined the key complexity properties economy would favour a reduction of the size of the vo- of a system to display a statistics of events following cabulary to a single word whereas the hearer’s economy Zipf’s law: An open, unbounded number of accessible would lead to an increase of the size of a vocabulary to states and a linear loss of entropy due to generic inter- a point where there would be a different word for each nal constraints [12]. The linear loss of entropy grasps meaning. The resulting vocabulary would emerge out the intuitive idea that the studied systems are in an in- of this unification-diversification conflict [3]. Although termediate state between order and disorder -or that a bothnumericalandtheoreticalstudieshaveexploredthis possible informative tension is balanced, as we shall see- idea [10, 11, 13], no truly analytic proof of unicity has and the unbounded number of accessible states reflects 2 their open nature. It was shown that, under a very gen- IV we discuss the implications of our results. eral parametrization, and imposing properties of scale- invariance to the solution, Zipf’s law was the only possi- ble outcome. II. THE EVOLUTION OF THE COMMUNICATIVE SYSTEM Now we adapt and enrich the general framework pro- posed in [12] to the communicative context. As we shall see, Zipf’s hypothesis can be interpreted in such a way In this section we mathematically define 1) the com- thatthesystemcanbestudiedwithintheframeworkpro- municativetensiondescribedbyZipfand2)theevolution posed in [12]. Moreover, the parameters that were arbi- or growth of a given code subject to such a tension. We trary in the general mathematical framework mentioned furthermoredefinetherangeofapplicationofourformal- above can now be naturally interpreted in the commu- ism. As we shall see in section III, the proposal made in nicativeframeworkasthekeypiecesofthemathematical thissectiondefinesaframeworkwhosekeypiecetowork statement of Zipf’s hypothesis. with is eq. (6). Beyond the mathematical formalization of the com- municative conflict described by Zipf, we need another A. The explicit description of the communicative ingredient, pointed out -in a different context- in [14], conflict namely the active role played by the evolutionary path followed by the code. As it occurs with other systems growing out of equilibrium, such as scale-free networks The first task is to properly define the communicative [15], wewillconsidertheevolutionofthecommunicative tension between the coder and the decoder and how this exchange under system’s growth. Here the evolution- tension is solved. Following the standard nomenclature arycomponentisvariationallyintroducedbyminimizing used in studies of the evolution of communicating, au- the divergence between code configurations belonging to tonomous agents [19–21], in our system there are two successive time steps. This minimal change follows the agents: the coder agent, P, encoding information from so-called Minimum Discrimination Information Princi- a set of external events, Ω, and the decoder or external ple (henceforth MDIP), a general variational principle observer,whichinfersthebehaviorofΩthroughthecode considered analogous to the Maximum Entropy Princi- provided by the coder agent P. In this way, ple [16], from which statistical mechanics can be prop- Ω= m ,...,m erly formalized [17, 18]. The MDIP states that, under { 1 n} changes in the constraints of the system, the most ex- isthesetofexternaleventsactingastheinputalphabet, pectedprobabilitydistributionistheoneminimizingthe and Kullback-Leiblerdivergence(alsoreferredastoKullback- Leibler entropy or relative entropy) from the original one = s ,...,s 1 n [17]. Such a variational principle constrains the changes S { } of the internal configurations of an statistical ensemble is the set of signals or output alphabet. The coder mod- whentheexternalconditionschangeinthesamewaythat uleP-fig. (1a)-isfullydescribedbyamatrixP(X X ), s Ω | internal configurations of an statistical ensemble change where X is a random variable taking values on the set Ω when we introduce moment constraints in a Jaynesian Ω following the probability measure p; being p(m ) the k formalism. In our context, this information theoretic probability to have symbol m as the input in a given k functional assumes the role of a Lagrangian whose mini- computation. Complementarily, X isarandomvariable s mization along the process defines the possible ensemble taking values on and following the probability distri- S configurations one can observe at a certain point of an bution q which, for a given s , reads: i ∈S evolutionary path. (cid:88) Using the MDIP and the framework provided in [12], q(si)= p(mk)P(si mk), (1) | we provide a proof of unicity for the emergence of Zipf’s k≤n law in evolving codes. We stress that no arbitrary as- i.e., the probability to obtain s as the output of a codi- sumptions are made on the nature of solutions. i fication. We assume that The remainder of the paper is structured as follows: InsectionIIwerigorouslydefinethecommunicativeten- ( m Ω)(cid:88)P(s m )=1. k i k sion intuitively defined by Zipf and explicitly character- ∀ ∈ | i≤n izetheevolutionaryprocessintermsofthemathematical statement of such a tension. In section III we apply the For the decoder agent inferring the input set from the MDIP as the guiding, variational principle which ac- outputsetwithleasteffort, thebestscenarioisaone-to- counts for the possible evolutionary paths of the code. onemappingbetweenΩand . Inthiscase,Pgenerates S Finally, we demonstrate that the consequences of the an unambiguous code, and no supplementary amount of application of both the communicative tension and the information to successfully reconstruct X is required. Ω MDIP account for the emergence of Zipf’s law as the However,fromthecodingdeviceperspective,thiscoding unique possible solution of the evolving code. In section has a high cost. In order to characterize this conflict, 3 let us properly formalize the above intuitive statement: secondrelationbecomesequalitywhenthecodingdevice The decoder agent wants to reconstruct X through the performscompletelyrandomassociations. Itisclearthat Ω intermediation of the coding performed by P. There- eqs. (2) and (3) alone cannot explain the emergence of fore, the amount of bits needed by the decoder of X to Zipf’slawsinceonecouldtunetheparametersof,say,an s unambiguously reconstruct X is exponentialdistributiontoreachthedesiredrelationbe- Ω tweenentropies. Thereforeweneedtointroduceanother (cid:88)(cid:88) H(X ,X )= P(m ,s )logP(m ,s ), ingredient to obtain Zipf’s law as the unique possible so- Ω s i k i k − i≤nk≤n lution of our problem. which is the joint Shannon entropy or, simply, joint en- tropy of the two random variables X ,X [26]. From Ω s B. Evolution thecodificationprocess,thedecoderreceivesH(X )bits, s and thus, the remaining uncertainty it must face will be The unicity in the solution is provided by the evolu- H(X ,X ) H(X )=H(X X ), tion,whichisnowexplicitlyintroduced-seefig(1b). Let Ω s s Ω s − | us suppose that our communicative success grows over where time, thereby increasing the number of input symbols thatPcanencode. Formally,thisimpliesthatthecardi- (cid:88) H(Xs)= q(si)logq(si), nalityofthesetΩdefinedaboveincreases. Weintroduce − i≤n thisfeaturebydefiningasequenceofΩ’sΩ(1),...,Ω(k),... satisfying an inclusive ordering, i.e., (i.e, the entropy of the random variable X ) and s (cid:88) (cid:88) Ω(1) Ω(2) ... Ω(k),..., H(X X )= q(s ) P(m s )logP(m s ), ⊂ ⊂ ⊂ Ω s i k i k i | − | | i≤n k≤n which is introduced, without any loss of generality, as- suming that the conditional entropy of the random variable X con- Ω ditioned to the random variable X . The tension be- s Ω(1) = m , 1 tween the coder and the decoder is solved by imposing a { } Ω(2) = m ,m , symmetricbalancebetweenitsassociatedefforts-seefig. 1 2 { } (1a)-,i.e.: Thecodersendsasmanybitsastheadditional ... ... bits the observer needs to perfectly reconstruct XΩ: Ω(n 1) = m1,...,mn−1 , − { } Ω(n) = m ,...,m ,m . H(X )=H(X X ). (2) 1 n−1 n s Ω s { } | Theaboveansatzisthemathematicalformulationofthe At time step n, P will be able to process the n symbols symmetric balance between the efforts of the coder and of Ω(n).The elements m1,...,mi,... are members of some the decoder. We will refer to this equation as the sym- infinite, countable set Ω˜, i.e., ( i)(Ω(i) Ω˜). Ω˜ can ∀ ⊂ metry condition and, as pointed out in [11], it math- be understood, using a thermodynamical metaphor, as a ematically describes how the communicative tension is reservoirofinformation. Followingthischaracterization, solvedbyusingacooperativestrategybetweenthecoder we say that for every set Ω(i) there is a random variable and the decoder agents. It is worth noting that different XΩ(i),takingvaluesinΩ(i)followingtheorderedproba- equationssharingthesamespiritwereformerlyproposed, bilitydistributionpi. Furthermore,weassumethatexists within the framework of the so-called code-length game a unique µ (0,1) such that ( (cid:15)>0)( N):( n>N), ∈ ∀ ∃ ∀ [10]. From eq. (2), we can state that: (cid:12) (cid:12) (cid:12)H(XΩ(n)) (cid:12) (cid:12) µ(cid:12)<(cid:15). (4) H(XΩ,Xs)=2H(Xs). (cid:12) logn − (cid:12) And knowing the classical inequalities This means that the entropy of the input set is un- bounded when its size increases, which implies that the H(XΩ,Xs)≥H(XΩ) potentialinputsetΩ˜ actsasaninfinitereservoirofinfor- H(XΩ Xs)=H(XΩ,Xs) H(Xs) H(XΩ), mation. The behavior of the output set at the stage n is | − ≤ describedbya randomvariable X (n), which followsthe s wereachageneralrelationbetweentheinformativerich- orderedprobabilitydistributionq , asdefinedineq. (1), n nessoftheinputvariableX andtheinformativerichness Ω taking values on (n) = s ,...,s . We observe that 1 n ofthemessagessentbythecoder,constrainedbyeq. (2): S { } (n) (n + 1), defining a sequence (1),..., (k),... S ⊆ S S S 1 also ordered by inclusion. At every time step, the conse- H(XΩ) H(Xs) H(XΩ). (3) quencesofthesymmetrycondition-seeeq. (2)-depicted 2 ≤ ≤ in eq. (3) are satisfied, which implies that the sequence The first relation becomes equality only in the case of P performing a deterministic codification process. The =H(X (1)),H(X (2)),...,H(X (k)),... s s s H 4 also satisfies the convergence ansatz made over the se- quence of normalized entropies of the input -see eq. (4). Theonlydifferenceisthevalueofthelimit,ν. Thevalue of ν can be bounded by using eqs (3) and (4), thereby obtaining: 1 µ ν µ (5) 2 ≤ ≤ Therefore, in this case, by virtue of eqs. (3), (4) and (5), the convergence condition for the normalized entropies of the sequence of random variables X (1),...,X (n),... s s reads: exists a unique ν (1µ,µ) such that ( (cid:15) > ∈ 2 ∀ 0)( N):( n>N): ∃ ∀ (cid:12) (cid:12) (cid:12)H(Xs(n)) (cid:12) (cid:12) ν(cid:12)<(cid:15). (6) (cid:12) logn − (cid:12) Theaboveequationdepictstwocrucialfactsintheforth- FIG.2: Numericalsimulationofthefinaldistributionqn(n= coming derivations: If the potential informative richness 104) obtained by constraining the growth process with i)the consequences of the symmetry of coding/decoding -see eq. of the input set is unbounded, so is the informative rich- (2)-providedbyeq. (6)andii)theapplicationoftheMDIP ness of the output set, under the constraints imposed by at every step of the growth process. Different convergence the symmetry condition -see eq. (2). values are studied: a) ν = 0.2, b) ν = 0.3 and c) ν = 0.5. The dashed lines are the best fit interpolations, which give estimatedexponentsγ =1.06,1.04and1.01,respectively(all III. THE EMERGENCE OF ZIPF’S LAW with correlation coefficients r<−0.99). UNDER THE MDIP The MDIP is presented in this section as the varia- theMDIP isachievedbyminimizingthefollowingfunc- tional principle guiding the evolution of the code. As we tional [17]: shallseeattheendofthissection,theconsequencesofits application result in a proof of unicity for the emergence   of Zipf’s law in evolving codes. (cid:88) (qn+1,λ)=D(qn qn+1)+λ qn+1(si) 1. L || − i≤n+1 A. The MDIP and its consequences for the evolution of codes We observe that this functional has a role equivalent to the one attributed to the Lagrangian function in a given continuous, differentiable system; therefore, the The question is thus how the probability distribution trajectoriesminimizingitwillgoverntheevolutionofthe q evolves along the growth process. Under the MDIP n system. Furthermore, the symmetry condition on cod- we face a variational problem which is stated as follows: ing/decoding -eq. (2)- imposes that the solutions must During the growth process, the most likely code at step n+1 is the one minimizing the distance with respect to lie in the region defined by eq. (6). The minimum of L is found when q satisfies: the code at step n, consistently with the MDIP. Fur- n+1 thermore,theevolutionofthecodemustsatisfy,alongall tbhyeeeqv.olu(2ti)o.naTrhyestcerpusc,iathlecosynmtrimbuettrioyncoonfdtihtieonMdDepIiPcteids qn+1(si)=(cid:26)1λ1qn(1si)iffiffi=i≤n+n 1, (7) − λ that it naturally introduces the footprints of the path dependence imposed by evolution. Following the ther- being λ the Lagrange multiplier, which is a positive, modynamical metaphor, this variational principle acts, uniqueconstantforallelementsoftheprobabilitydistri- in our context, as a principle on energy minimization bution q . We observe that, for λ=1, D(q q )= n+1 n n+1 || acting over the transitions of successive codes. Putting 0,but,inthiscase,H(X (n))=H(X (n+1)),incontra- s s it formally, let diction to the assumption provided by eq. (6), according towhichinformativerichnessgrowsduringtheevolution- (cid:88) qn(si) ary process. D(q q ) q (s )log n n+1 n i || ≡ qn+1(si) Now we want to find the asymptotic behavior of i≤n+1 q , n under the above justified conditions (6) and n → ∞ be the Kullback-Leibler Divergence of the distribution (7). Thekeyfeaturewederivefromthepathdependency q with respect the distribution q [22]. Therefore, in the evolution imposed by the MDIP is that the fol- n+1 n 5 lowing quotient We observe that all the elements of the above equation are finite constants, since q (s ) n k+j ( k+j n) f(k,k+j)= (8) ∞ ∀ ≤ q (s ) (cid:88)logi n k < . i1+δ ∞ doesnotdependonn. Therefore, alongtheevolutionary i=1 process, as soon as Thus, having q(cid:48) as defined above, n qn(sk),qn(sk+j)>0, lim H(X(cid:48)(n))< . n→∞ s ∞ f(k,k+j) remains invariant. Therefore, during the growth process, due to the con- straint imposed by eq. (6), B. The Emergence of Zipf’s Law (cid:18) (cid:19)(1+δ) m f(m,m+1)> , (10) m+1 The asymptotic behavior of quotient f and, thus, the tail of q , is strongly constrained by the entropy restric- with δ arbitrarily small, provided that n can increase n tion provided by eq. (6) [12]. As shall see, the key of the unboundedly. Otherwise, its normalized entropy -see eq. forthcoming derivations will be the convergence proper- (9)- will have, as an asymptotic value ties of the normalized entropies of a given random vari- H(X (n)) able X having n possible states whose (ordered) proba- s 0, bilities follow a power-law distribution function, namely logn → g(s ) i−γ. The explicit form of these entropies is: i ∝ incontradictiontotheassumptionthatν >0asdepicted   in eq. (6). H(X) 1 γ (cid:88)logi Furthermore,weobservethat,if( δ >0, n>m)( N) logn = lognZ iγ +logZγ. (9) such that ∀ ∃ γ i≤n (cid:18) (cid:19)(1−δ) m Consistently, Z is the normalization constant. ( m>N) f(m,m+1)> , (11) γ ∀ m+1 The first observation is that it can be shown that the convergencepropertiesoftheRiemannζ-functiononR+ then [23] H(X (n)) s lim =1, (cid:88)∞ 1 n→∞ logn ζ(γ)= , iγ again in contradiction to eq. (6), except in the extreme, i=1 pathological case where ν = 1, when the coding pro- strongly constrain the convergence properties of a given cess is completely noisy. To see how we reach this latter probability distribution [12]. Indeed, we find that, if point we observe that statement (11) implies that q is n ( δ >0, n>m)( N) such that: notdominatedbyapower-lawprobabilitydistributionq(cid:48) ∀ ∃ n having exponent 1 δ, namely: (cid:18) m (cid:19)1+δ − ( m>N) f(m,m+1)< , ∀ m+1 q(cid:48)(s )= i−(1−δ), n i Z then ( C < R+) suchthat ( n)(H(X (n)) < C), 1−δ s which∃contra∞dict∈s the assumptions∀of the problem, de- where Z1−δ is the normalization constant. Putting ex- picted by eq. (6). Indeed, primarily, one can observe plicitly the expression of the normalized entropy -see eq. that the above statement implies that q is dominated (9)- for the random variable X(cid:48)(n), one obtains: n s by a power-law having exponent 1+δ, i.e. that qn de-   cays faster than q(cid:48), defined as: H(X(cid:48)(n)) δ(1 δ) (cid:88) logk n lim s = lim  − +δ n→∞ logn n→∞ nδlogn k1−δ i−(1+δ) k≤n qn(cid:48)(si)= Z , 1 δ (cid:18) 1(cid:19) 1+δ = lim − logn +δ n→∞ logn − δ whereZ isthenormalizationconstant. Now,wewrite 1+δ the explicit form of the entropy of X(cid:48)(n) q(cid:48) -to be = 1, written as H(X(cid:48)(n))- when n bys mul∼tiplnying the s → ∞ which is the desired result. Accordingly, since from eq. expression derived in eq. (9) by logn: (6) ν is generally different from 1, ∞ lim H(X(cid:48)(n))= 1+δ (cid:88)logi +log(ζ(1+δ)). (cid:18) m (cid:19)(1−δ) n→∞ s ζ(1+δ) i1+δ f(m,m+1)< m+1 . (12) i=1 6 Thus, combining eq. (10) and (12), we have shown efforthypothesishastwoingredients. First,startingfrom thattheasymptoticsolutionisboundedbythefollowing Zipf’s conjecture, we reach a static symmetry equation chain of inequalities: to solve the communicative tension between coder and decoder. This is consistent with previous work, but re- (cid:18) (cid:19)(1+δ) (cid:18) (cid:19)(1−δ) m m veals itself insufficient to derive Zipf’s law as the unique <f(m,m+1)< . m+1 m+1 solution, for it is easy to check that static equations of the kind of eq. (2) and (3) have infinite arbitrary so- The crucial step is that it can be shown that, if n , lutions, even in the asymptotic regime, due to the pos- →∞ we can set sible parametrizations of the solutions. Secondly -and crucially-weconsiderthatthecodeevolvesthroughtime, δ 0. → and that, consistently, there is a path dependence in its evolution, mathematically stated by imposing a varia- (The mathematical technicalities of this result can be tional principle, the MDIP, between successive states found in [12].) This implies, in turn, that, for n 1: (cid:29) of the code. It is only by imposing evolution (and thus, m path dependence) that we reach Zipf’s law as the only f(m,m+1) ≈ m+1 asymptotic solution. Therefore, the origin of the power- law with exponent γ = 1 derives from three comple- and, from the definition of f provided in eq. (8), we − mentary, very general conditions: conclude that 1 The unbounded informative potential of the code, q (s ) , • n m ∝ m thelossofinformationresultingfromthesymmetry • therebyleadingustoZipf’slawastheuniqueasymptotic condition, depicted in eq. (2), and solution. evolution,anditsassociatedpathdependency,vari- In fig. (2) we numerically explored the behavior of • ationallyimposedbytheapplicationoftheMDIP the rank probability distribution of signals belonging to oversuccessivestatesoftheevolutionofthesystem. a growing code under the assumption of symmetry in coding/decoding provided by eqs. (2) and (6), and the Thereisanother,veryinterestingpoint,intimatelytied MDIP whose consequences in the evolution of qn are to a code exhibiting Zipf’s law and, more specially, the depicted in eq. (7). The outcome perfectly fits with consequencesofthesymmetry condition,themathemati- the mathematical derivations, showing very well-defined calansatzwhichabstractlyencodestheZipf’shypothesis power-laws with exponents close to 1 although the con- ofvocabularybalance: Thepresenceofaninevitableam- vergencevaluesν divergefrom0.2to0.5. Thisnumerical biguityinthecode. Itisacommonobservationthatnat- validation shows that the predicted asymptotic effects - urallanguagesareambiguous,namely,thatlinguisticut- i.e., the convergence of qn to Zipf’s law- are perfectly terances or parts of linguistic utterances can be assigned appreciated even in finite simulations where 105 signals more than one interpretation. If the principle of least are at work. effortisatwork,andthusthereisacooperativestrategy We end this section with a remark on the boundary between the coder and the decoder, then the presence conditionsneededfortheemergenceofZipf’slaw. Inthe of a certain amount of ambiguity is expected, provided section IIB, we imposed that the potential information that the speaker tends to assign more than one meaning richness of the source must be unbounded. Such a con- tocertainsignals. Therefore,ambiguityisabyproductof dition is mathematically stated by (4). We observe that, efficientcommunicationratherthanafingerprintofpoor more than an assumption, equation (4) is a boundary communicative design. condition under which a growing code can (assymptoti- The presented framework is general, and rigorously cally)exhibitZipf’slaw[27]. Inthisway,sinceH(Xs(n)) demonstrates that Zipf’s law is a natural outcome of has a linear relation with H(XΩ(n)), the divergence of a broad class of communication systems evolving under the latter implies the divergence of the former. And it coding/decoding tensions. In other words, Zipf’s law is a required condition, since the entropy of a system emerges in a system where the coder and decoder coe- exhibiting a power law with an exponent equal to 1 di- volve under a general problem of energy minimization. vergeswithn. Otherwise, exponentsarehigher, orother The range of application to real-world phenomena, how- probability distributions can emerge. ever, must be contrasted with the validity of data, for it has been pointed out that many supposed power-law behaviorsshowdeviationswhenthestatisticalanalysisis IV. DISCUSSION performed accurately [24, 25]. It should be noted, how- ever, that a deviation of the predicted behavior need not The results provided in our study define a general ra- be necessarily attributed to a failure of the framework. tionale for the emergence of Zipf’s law in the abundance Oneshouldtakeintoaccountthatotherconstraints,such ofsignalsofevolvingcommunicationsystems. Thevaria- asgeneralmemorylimitations,canplayaroleinshaping tionalapproachtakenhereasaformalpictureoftheleast the final distribution. 7 Acknowledgments ported by NWO research project Dependency in Univer- sal Grammar, the Spanish MCIN Theoretical Linguistics We thank the members of the Complex Systems Lab 2009SGR1079(JF),theJamesS.McDonnellFoundation for useful discussions and an anonymous reviewer for (BCM) and by Santa Fe Institute (RS). his/herconstructivecomments. Thisworkhasbeensup- [1] F. Auerbach, Patermans Geograpische Mittelungen 59, Wiley and Sons. New York, 1959). 74 (1913). [18] A.R.Plastino,H.G.Miller,andA.Plastino,Phys.Rev. [2] G. K. Zipf, The Psycho-biology of language (Routledge E 56, 3927 (1997). (London), 1936). [19] J. Hurford, Lingua 77, 187 (1989). [3] G. K. Zipf, Human Behavior and the Principle of Least [20] M. A. Nowak and D. Krakauer, Proc. Nat. Acad. Sci. Effort (Addison-Wesley (Reading MA), 1949). USA 96, 8028 (1999). [4] P. Bak, C. Tang, and K. Wiesenfeld, Phys. Rev. Lett. [21] N. L. Komarova and P. Niyogi, Art. Int. 154, 1 (2004). 59, 381 (1987). [22] T.M.CoverandJ.A.Thomas,ElementsofInformation [5] X.Gabaix,TheQuarterlyJournalofEconomics114,739 Theory (John Wiley and Sons. New York, 1991). (1999) [23] M. Abramowitz and S. I. (editors), Handbook of mathe- [6] M. E. J. Newman, Contemporary Physics 46 (2005). maticalfunctions,vol.55ofNBS,Appl.Math.Ser.(U.S. [7] V.Pareto,Coursded’economiePolitique (Droz.Geneva, Government Printing office, Washington, D.C., 1965). 1896). [24] I. Kanter and D. A. Kessler, Phys. Rev. Lett. 74, 4559 [8] H. A. Simon, Biometrika 42, 425 (1955). (1995). [9] W. Li, Information Theory, IEEE Transactions on 38, [25] D. Avnir, O. Biham, D. Lidar, and O. Malcai, Science 1842 (1992). 279, 39 (1998). [10] P. Harremo¨es and F. Topsœ, Entropy 3, 191 (2001). [26] Throughout the paper, log≡log . 2 [11] R.Ferrer-i-CanchoandR.V.Sol´e,Proc.Natl.Acad.Sci. [27] We notice that eq. (4) depicts a linear relation between USA 100, 788 (2003). H(X (n))andlogn;i.e.: H(X (n))∼µlogn.Thereare Ω Ω [12] B. Corominas-Murtra and R. V. Sol´e, Phys. Rev. E 82, strong reasons to believe that one could generalize this 011102 (2010). statement by saying that the only condition needed is [13] P. Vogt, in Artificial Life IX Proceedings of the Ninth that, in spite that International Conference on the Simulation and Synthe- sis of Living Systems, edited by I. J. Pollack, M. Bedau, lim H(XΩ(n)) =0, P.Husbands,T.Ikegami,andR.A.Watson(MITPress, n→∞ logn 2004). [14] T. Maillart, D. Sornette, S. Spaeth, and G. von Krogh, if H(XΩ(n)) is a monotonous, growing and unbounded Phys. Rev. Lett. 101, 218701 (2008). functiononn,thenZipf’slawwouldemergeusingsimilar [15] S. N. Dorogovtsev and J. F. F. Mendes, Evolution of arguments to the ones used in this paper. The lack of a Networks (Oxford University Press. New York, 2003). rigorous demonstration for this latter point forces us to [16] E. T. Jaynes, Physical Review 106, 620 (1957). restrictourargumentstotheregionofapplicationofeq. [17] S. Kullback, Information Theory and Statistics (John (4).

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.