Tie Strength Distribution in Scientific Collaboration Networks

by  Qing Ke

Qing Ke and Yong-Yeol Ahn*
Center for Complex Networks and Systems Research, School of Informatics and Computing, Indiana University, Bloomington, IN, USA
(Dated: July 8, 2014)
* [email protected]
arXiv:1401.5027v3 [physics.soc-ph] 6 Jul 2014

Science is increasingly dominated by teams. Understanding patterns of scientific collaboration and their impacts on the productivity and evolution of disciplines is crucial to understand scientific processes. Electronic bibliography offers a unique opportunity to map and investigate the nature of scientific collaboration. Recent work has demonstrated a counter-intuitive organizational pattern of scientific collaboration networks: densely interconnected local clusters consist of weak ties, whereas strong ties play the role of connecting different clusters. This pattern contrasts with many other types of networks, where strong ties form communities while weak ties connect different communities. Although there are many models for collaboration networks, no model reproduces this pattern. In this paper, we present an evolution model of collaboration networks, which reproduces many properties of real-world collaboration networks, including the organization of tie strengths, skewed degree and weight distribution, high clustering and assortative mixing.

I. INTRODUCTION

Teams are increasingly overshadowing solo authors in production of knowledge [1]. Examining patterns of scientific collaboration is therefore crucial to understand the scientific processes, knowledge production [1], research productivity [2], the evolution of disciplines [3], and scientific impact [4, 5], etc. Electronic bibliographic data and the development of network science make it possible to systematically investigate scientific collaboration at a large scale [6–8]. A common approach to studying scientific collaboration is to construct a network of collaboration, where nodes represent authors and two authors are connected by co-authorship [9]. Various aspects of collaboration networks have been widely explored, including basic structural properties [9, 10], evolution [11], robustness [12, 13], assortative mixing [14], and rich-club ordering [15, 16]. Since coauthors usually know each other, collaboration networks have often been considered as proxies of social networks [9]. This viewpoint has been widely adopted, because collaboration networks can be systematically constructed without any subjective bias [9] and the size of these networks can be large.

However, recent studies have revealed that collaboration networks possess unique properties that are not present in other proxies of real-world social networks such as mobile communication networks and online social networks. One example is the atypical distribution of weak and strong ties. Like most other networks, collaboration networks exhibit cohesive groups ('communities') [17–20]. Since Granovetter pioneered the ideas of the relationship between network structure and tie strength, it has been assumed that strong ties tend to exist in the communities, while weak ties tend to connect these groups [17, 21]. Here we refer to 'communities' from a purely structural point of view, ignoring weights, and 'weak' and 'strong' ties refer to the weight of edges. This organizational principle has been repeatedly confirmed in many networks [22–27]. However, scientific collaboration networks exhibit the opposite pattern; weak ties constitute communities, while strong ties connect these research communities [27, 28]. This counter-intuitive observation raises a question: How and why are collaboration networks shaped in this way?

Although there are many models of scientific collaboration networks or similar weighted networks [3, 11, 29–35], the organization of tie strength and its role in global connectivity have not been fully explored. Here we propose that the academic advising system, the patterns of academic career trajectory, and the active inter-group collaboration may provide an explanation. Our key notion is that weak ties are mainly formed from short-term collaborations between students and their advisors, while strong ties are formed through long-term collaborations between groups [28]. Built on this notion, our model reproduces the tie-strength distribution as well as other common properties, such as skewed degree and weight distribution, high clustering, and assortative mixing.

II. STRUCTURE AND LINK WEIGHT

To test the universality of the atypical tie-strength distributions in scientific collaboration reported in [27, 28], we analyze four scientific collaboration networks: Network Science, High-energy Physics, Astrophysics, and Condensed Matter. Link weights in these networks are defined by $w_{ij} = \sum_p \frac{1}{n_p - 1}$, where $n_p$ is the number of authors in paper $p$ in which $i$ and $j$ participated [10, 36]. Although this particular definition of weight is not unique, it has been widely accepted as a standard metric (see Section III in [36] for a detailed and thorough discussion about it). Table I lists basic statistics of these networks. As many studies demonstrated, both degree and link weights are broadly distributed [9, 10].

Name      N       M        <k>    <w>    c      r        Time
Net-sci   379     914      4.823  0.536  0.741  -0.0817  -
Hep-th    5,835   13,815   4.74   0.990  0.506  0.185    1995 – 1999
Astro-ph  14,845  119,652  16.12  0.279  0.670  0.228    1995 – 1999
Cond-mat  36,458  171,735  9.42   0.506  0.657  0.177    1995 – 2005

TABLE I. Structural statistics for weighted scientific collaboration networks, including number of nodes N, number of links M, mean node degree <k>, mean link weight <w>, clustering coefficient c [37], and assortativity coefficient r [14]. Net-sci is based on the coauthorship of scientists working on network science [38]. Hep-th, Astro-ph, and Cond-mat are constructed based on the papers posted on High-energy Physics E-Print Archive (http://arxiv.org/archive/hep-th), Astrophysics E-Print Archive (http://arxiv.org/archive/astro-ph), and Condensed Matter E-Print Archive (http://arxiv.org/archive/cond-mat), respectively [9]. For each network, we only consider the largest connected component. All the 4 networks are downloaded from http://www-personal.umich.edu/~mejn/netdata/.

Figure 1 shows the relationship between link weight $w_{ij}$ and local clustering defined by the overlap measure $O_{ij} = \frac{n_{ij}}{(d_i - 1) + (d_j - 1) - n_{ij}}$, where $n_{ij}$ is the number of common neighbors of nodes $i$ and $j$, and $d_i$ ($d_j$) is the degree of node $i$ ($j$) [23]. $O_{ij}$ quantifies the overlap between the neighbors of two end-points and measures embeddedness of an edge. For instance, $O_{ij} = 0$ indicates that nodes $i$ and $j$ have no common neighbors and the link is likely to connect communities. For a large portion of links, overlap decreases with weight. For a small portion of strongest links (20% of links with $w_{ij} > 0.640$ for Net-sci, 4.3% of links with $w_{ij} > 3.327$ for Hep-th, 7.8% of links with $w_{ij} > 0.765$ for Astro-ph, and 14% of links with $w_{ij} > 0.869$ for Cond-mat), overlap increases with weight. These results indicate that weak ties mainly constitute dense local clusters, whereas strong ties connect these clusters.

FIG. 1. The correlation between link overlap $O_{ij}$ and link weight $w_{ij}$ in scientific collaboration network of (A) Network Science, (B) High-energy Physics, (C) Astrophysics, and (D) Condensed Matter. We use logarithmic binning for $w_{ij}$. The error bars indicate the standard error of the mean $O_{ij}$. For a large portion of links, overlap decreases with weight. For a small portion of strongest links, overlap increases with weight.

In order to further confirm the universality of weight-topology coupling patterns in scientific collaboration networks, we examine network connectivity under link removal [23, 25, 28].
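The quantities used in this section can be illustrated with a short, dependency-free Python sketch. It is not the authors' code: the function names (`build_weighted_network`, `neighbors`, `overlap`, `lcc_fraction`, `removal_curve`) and the four-author toy paper list are hypothetical, chosen only to show how $w_{ij}$, $O_{ij}$, and the relative size of the largest connected component under link removal could be computed from a list of papers. The zero-denominator convention in `overlap` is also an assumption.

```python
from collections import defaultdict
from itertools import combinations


def build_weighted_network(papers):
    """Map each coauthor pair {i, j} to w_ij = sum over shared papers of 1/(n_p - 1)."""
    weights = defaultdict(float)
    for authors in papers:
        authors = sorted(set(authors))
        n_p = len(authors)
        if n_p < 2:
            continue  # single-author papers create no links
        for i, j in combinations(authors, 2):
            weights[frozenset((i, j))] += 1.0 / (n_p - 1)
    return dict(weights)


def neighbors(edges):
    """Adjacency sets built from an iterable of two-node frozensets."""
    adj = defaultdict(set)
    for edge in edges:
        i, j = tuple(edge)
        adj[i].add(j)
        adj[j].add(i)
    return adj


def overlap(adj, i, j):
    """O_ij = n_ij / ((d_i - 1) + (d_j - 1) - n_ij); returns 0 if the denominator is 0 (assumed convention)."""
    n_ij = len(adj[i] & adj[j])
    denom = (len(adj[i]) - 1) + (len(adj[j]) - 1) - n_ij
    return n_ij / denom if denom > 0 else 0.0


def lcc_fraction(nodes, edges):
    """R_LCC = N_LCC / N: relative size of the largest connected component."""
    adj = neighbors(edges)
    seen, largest = set(), 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first traversal of one component
            u = stack.pop()
            size += 1
            for v in adj[u] - seen:
                seen.add(v)
                stack.append(v)
        largest = max(largest, size)
    return largest / len(nodes)


def removal_curve(weights, strongest_first=True):
    """R_LCC after removing the first k links in weight order, for k = 0..M."""
    nodes = {v for edge in weights for v in edge}
    order = sorted(weights, key=weights.get, reverse=strongest_first)
    return [lcc_fraction(nodes, order[k:]) for k in range(len(order) + 1)]


# Hypothetical toy data: three papers among four authors.
papers = [["a", "b", "c"], ["a", "b"], ["c", "d"]]
w = build_weighted_network(papers)
adj = neighbors(w)
print(w[frozenset(("a", "b"))], overlap(adj, "a", "b"))
print(removal_curve(w, strongest_first=True))
```

On real data the two curves from `removal_curve` (strongest first versus weakest first) would be plotted against the fraction of removed links; the toy list here is far too small to say anything about the pattern itself.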
We remove links in descending or ascending order of link weight and track the relative size of the Largest Connected Component (LCC), $R_{LCC}$, as a function of the fraction of removed links. Figure 2 shows that removing strong links breaks the networks into disconnected components faster than removing weak links, indicating that strong links are more important in maintaining global network connectivity. Strong links connect clusters (Fig. 3B), while weak links reside inside communities (Fig. 3C).

FIG. 2. The robustness of scientific collaboration network of (A) Network Science, (B) High-energy Physics, (C) Astrophysics, and (D) Condensed Matter under the removal of strong (weak) ties. The control parameter $f$ means the fraction of removed links. The removal of links is on the basis of their strength $w_{ij}$. The black dashed curves correspond to the removal of links from weak to strong. The red solid curves correspond to the removal of links from strong to weak. The relative size of Largest Connected Component (LCC) $R_{LCC} = N_{LCC}/N$ indicates that the removal of strong links leads to a faster breakdown of networks.

FIG. 3. Visualization of the structure of Network Science collaboration network and link removal process. (A) The whole network structure with 379 nodes and 914 links. The color of each node indicates its community membership obtained by Louvain method [39]. (B) The remaining subgraph after removal of 43% strongest links. The shaded region indicates Largest Connected Component. (C) The remaining subgraph after removal of 43% weakest links.

III. MODEL

Many models have been proposed to explain the known properties of scientific collaboration networks. However, these models either do not consider link weights or do not capture the role of strong ties in maintaining global connectivity. Some models focus on assortative mixing [29, 30]. Some study the self-organizing evolution of collaboration networks as preferential attachment and "rich-get-richer" [11, 31, 32]. Others emphasize the evolution of disciplines [33] or social interaction of scientists [3]. The weak-tie hypothesis has often been considered as an evident truth about networks, and most models that produce community structures assume so [40, 41].

By contrast, our model is based on the following observations: (i) scientific collaboration networks grow in time, as new papers and scientists join continuously; (ii) junior scientists become inactive with high probability; indeed, recent work on analysis of the APS dataset reveals that 40% of authors only publish one paper in their entire career [28]; and (iii) long-term collaboration usually occurs between senior scientists who have their own research groups [28].

Our model has two mechanisms of producing new papers: intra-group and inter-group collaboration. Starting with a research group of an advisor and a student, the collaboration network grows over time. At every time step:

• With probability $c$, each group publishes one paper by itself. The parameter $c$ controls the ratio of total number of authors to total number of papers. Each paper is written by the advisor and $l-1$ co-authors preferentially chosen from the same group based on the students' scientific expertise $e$. The probability to be chosen is proportional to $e$. If a student joins a group at time $\tau$, with initial expertise $e(\tau) = 1$, $e$ increases linearly with time: $e(t) = t - \tau + 1$.

• Each group may publish up to $\alpha$ papers with another group. If the group has not had any external collaboration, it chooses a group randomly and establishes a permanent preferred collaboration relationship. The group tries to write $\alpha$ papers, each with probability $c$. Each paper still has $l$ authors, among which two are the two advisors from each group and $l-2$ are randomly chosen from the pool of students of the two groups with probability proportional to their expertise. The parameter $\alpha$ controls the ratio of inter-group collaborations to that of intra-group.

• Each group has one new student.

• When the expertise reaches a threshold $G$, the student forms a new group with probability $f$ or becomes inactive with probability $1-f$.

Parameter  Meaning
c          Probability to publish paper
l          Number of authors in each paper
G          Expertise threshold for students to graduate
f          Probability of graduates to form new groups
α          Ratio of inter-group to intra-group collaborations

TABLE II. Model parameters and their explanations.

A. Simulation Results

In setting the parameter values (or distributions), we incorporate as many empirically measured values as possible. First, we assume the following parameters to be constants, namely $c = 0.4$, $G = 7$, and $f = 0.2$ in our analysis. The choice for $c = 0.4$ is based on the ratio of total number of scientists to that of papers in the APS dataset. The other parameters $G$ and $f$ are also chosen based on real-world observations [42]. The number of authors in each paper, $l$, is a random variable with the underlying probability distribution obtained from the APS dataset. The only free parameter is $\alpha$ and we have performed a robustness analysis of $\alpha$ and the other parameters in Section III C. Table II shows the meanings of the model parameters.

We have one free parameter, $\alpha$, and we investigate the impact of inter-group collaboration by varying it. Fig. 4A-C demonstrate that the v-shaped pattern between overlap and weight can be reproduced when $\alpha \geq 1$, i.e., when there are active inter-group collaborations. Fig. 4D-F show that, on the other hand, the strong ties increasingly maintain global connectivity if we decrease $\alpha$. As we will demonstrate in the next section, when $\alpha \simeq 3.44$, the difference between intra- and inter-group tie strength is 0. Our model seems to exhibit the most similar weight organization with the real collaboration networks around $\alpha = 1$. All the results below are obtained when $\alpha = 1$. These results indicate that the inter-group collaborations play an important role in explaining the atypical tie-strength distributions in scientific collaboration networks.

Furthermore, our model reproduces other common properties of scientific collaboration networks: (i) skewed distribution of degree and link weights, as shown in Fig. 5, and (ii) strong clustering (average clustering coefficient is 0.55 when $\alpha = 1$).

FIG. 4. Model results with different $\alpha$. Top: correlation between $O_{ij}$ and $w_{ij}$; Bottom: model network robustness to link removal. Left: $\alpha = 0$; Middle: $\alpha = 1$; Right: $\alpha = 3$.

FIG. 5. Our model produces skewed degree and weight distributions. Complementary cumulative (Top) degree and (Bottom) weight distributions. Left: $\alpha = 0$; Middle: $\alpha = 1$; Right: Hep-th.

B. Analytical Results

By calculating the gained tie strength within a group and between groups at each time step, we show that when $\alpha \simeq 3.44$, the difference between intra- and inter-group tie strength is 0. We focus on stationary groups with $G$ students and with total expertise $G(G+1)/2$. With probability $c$, the advisor $a$ in group $g$ writes one paper with the group members. It will add the weight of $\frac{1}{l-1}$ to the link between the advisor and a chosen student. Let $p_i(e)$ be the probability that a student with expertise $e$ is chosen in an intra-group paper (see Appendix B for its calculation). Then the gained link weight between the advisor and the student is

\[ w^{(i)}_{a,e} = \frac{c}{l-1}\, p_i(e). \tag{1} \]

Let $p_i(e_1, e_2)$ be the probability that two students in the group $g$ with expertise $e_1$ and $e_2$ are chosen in an intra-group paper. Then the gained link weight between the two students is

\[ w^{(i)}_{e_1,e_2} = \frac{c}{l-1}\, p_i(e_1, e_2). \tag{2} \]

Meanwhile, the advisor $a$ in group $g$ writes on average $\alpha c$ papers with another group $g'$. The weight between the two advisors increases by

\[ w_{a,a'} = \frac{\alpha c}{l-1}. \tag{3} \]

Let $p_b(e)$ be the probability that a student with expertise $e$ is chosen in an inter-group paper. Then the expected gain, through inter-group collaborations, of weights between an advisor and a student with expertise $e$ either in the same group ($w^{(i')}_{a,e}$) or the other group ($w^{(b)}_{a,e}$) is

\[ w^{(i')}_{a,e} = w^{(b)}_{a,e} = \frac{\alpha c}{l-1}\, p_b(e). \tag{4} \]

Let $p_b(e_1, e_2)$ be the probability that two students with expertise $e_1$ and $e_2$ are chosen in an inter-group paper. Then

\[ w^{(i')}_{e_1,e_2} = w^{(b)}_{e_1,e_2} = \frac{\alpha c}{l-1}\, p_b(e_1, e_2). \tag{5} \]

So the gained tie strength within a group at each time step is

\[ W^{(i)} = \sum_{e=1}^{G} w^{(i)}_{a,e} + \sum_{e_1 \neq e_2} w^{(i)}_{e_1,e_2} + \sum_{e=1}^{G} w^{(i')}_{a,e} + \sum_{e_1 \neq e_2} w^{(i')}_{e_1,e_2}. \tag{6} \]

The total gained tie strength between the two groups is

\[ W^{(b)} = w_{a,a'} + \sum_{e=1}^{G} w^{(b)}_{a,e} + \sum_{e_1=1}^{G} \sum_{e_2=1}^{G} w^{(b)}_{e_1,e_2}. \tag{7} \]

The difference between inter-group tie strength and intra-group tie strength is

\[ \Delta W = W^{(i)} - W^{(b)} = \frac{c}{l-1} \left[ \sum_{e=1}^{G} p_i(e) + \sum_{e_1 \neq e_2} p_i(e_1, e_2) - \alpha - \alpha \sum_{e=1}^{G} p_b(e, e) \right]. \tag{8} \]

$\Delta W = 0$ when

\[ \alpha_c = \frac{\sum_{e=1}^{G} p_i(e) + \sum_{e_1 \neq e_2} p_i(e_1, e_2)}{1 + \sum_{e=1}^{G} p_b(e, e)} \simeq 3.44. \tag{9} \]

Indeed, Fig. 4F shows that with $\alpha = 3$, the removal of weak and strong ties similarly affects the connectivity of the network until about 60% of the edges are removed.

We next derive the number of groups $n_g(t)$ and the number of students $n_s(t)$ at $t$. Let $n_s^{(e)}(t)$ be the number of students at the expertise level $e$ at time step $t$. The expertise $e$ increases in time and the students graduate when the expertise reaches $G$. Graduates create their own group with the probability $f$, namely

\[ n_g(t) = n_g(t-1) + f\, n_s^{(G-1)}(t-1), \tag{10} \]

with $n_g(0) = \ldots = n_g(G-2) = 1$. At each time step, there are the same number of new students as the number of groups

\[ n_s^{(1)}(t) = n_g(t). \tag{11} \]

The number of students with expertise $e \geq 2$ is the same as the number of students with expertise $e-1$ in the previous time step

\[ n_s^{(e)}(t) = \ldots = n_s^{(1)}(t-(e-1)) = n_g(t-e+1). \tag{12} \]

Therefore, the number of groups is

\[ n_g(t) = n_g(t-1) + f\, n_g(t-G+1), \quad (t \geq G-1). \tag{13} \]

The number of students is

\[ n_s(t) = \sum_{e=1}^{G} n_s^{(e)}(t) = \sum_{e=1}^{G} n_g(t-e+1), \quad (t \geq G-1), \tag{14} \]

with $n_s(t) = t+1$ for $t \leq G-2$. The increased number of nodes is the number of groups in the previous time step

\[ N(t) = N(t-1) + n_g(t-1), \tag{15} \]

with $N(0) = 0$ and $N(t) = t+1$ for $1 \leq t \leq G-2$. These analytical results are in agreement with the numerical results, as shown in Fig. 6.

Finally, we offer the calculation of the mean degree $\langle d \rangle$ of model networks in Appendix A.

FIG. 6. Comparison of calculation results with numerical results of (A) number of groups $n_g(t)$ and (B) number of students $n_s(t)$. The numerical results are averaged over 100 repetitions.

C. Robustness Analysis

We now investigate the model's sensitivity to the parameters $G$ and $f$. Figs. 7 and 8 show that our model is robust to the choices of parameters $G$ and $f$. The two observations are still produced when (Fig. 7) $G = 6$ or $G = 8$ and when (Fig. 8) $f = 0.1$ or $f = 0.3$.

FIG. 7. (Top) Weight-overlap correlation and (Bottom) robustness to link removal when (Left) $G = 6$ and (Right) $G = 8$.

FIG. 8. (Top) Weight-overlap correlation and (Bottom) robustness to link removal when (Left) $f = 0.1$ and (Right) $f = 0.3$.

IV. CONCLUSIONS

In this paper we explore the weight organization of scientific collaboration networks. We propose a model, which incorporates intra- and inter-group collaborations and reproduces many properties of real-world collaboration networks. We also provide detailed analysis of our model. Our work also raises further questions such as: How did the collaboration pattern change in time? How do scientific ideas flow through strong and weak ties? Are there any general coupling patterns (or classes) between structure and weights?

ACKNOWLEDGMENTS

The authors would like to thank Lilian Weng, Nicola Perra, Márton Karsai, Filippo Menczer, Alessandro Flammini, Filippo Radicchi, Martin Rosvall, Jie Tang, and Honglei Zhuang for helpful discussions, and John McCurley for editorial assistance.

Appendix A: Calculation of mean degree

In order to get the expected degree of an advisor at graduation, we track it from time step $\tau$, when she joined a stationary group $g$ as a student $a_1$ with initial expertise $e = 1$, to $\tau + G - 1$, when she graduates with expertise $G$. For intra-group collaboration, there are already a total of $G-2$ students in group $g$ at $\tau$. From $t = \tau + 1$ to $t = \tau + G - 2$, another $G-2$ new students join the group. Let $I(a_1, a_2)$ ($\Pr(a_1, a_2)$) be the indicator function (probability) that $a_1$ has collaborated with $a_2$. The number of different student collaborators in the group $g$ for $a_1$ is

\[ d_i = \sum_{a_e \in g} I(a_1, a_e) = \sum_{a_e \in g} 1 \times \Pr(a_1, a_e) = \sum_{a_e \in g} \left[ 1 - \overline{\Pr}(a_1, a_e) \right] = 2(G-2) - \sum_{a_e \in g} \overline{\Pr}(a_1, a_e), \tag{A1} \]

where $\overline{\Pr} = 1 - \Pr$ is the probability of no collaboration.

Consider the student $a_2$ with expertise $e = 2$ at time $t = \tau$; $a_1$ and $a_2$ will be in the same group until the expertise of $a_2$ reaches $G-1$. Let $\bar{p}(e_1, e_2)$ be the probability that two students with expertise $e_1$ and $e_2$ are not both chosen in the one intra-group paper

\[ \bar{p}(e_1, e_2) = (1-c) + c\,(1 - p_i(e_1, e_2)). \tag{A2} \]

Then $\overline{\Pr}(a_1, a_2)$ is the probability that $a_1$ does not collaborate with $a_2$ from $t = \tau$ to $t = \tau + G - 3$

\[ \overline{\Pr}(a_1, a_2) = \prod_{e=1}^{G-2} \bar{p}(e, e+1). \]

Similarly, for students $a_3, \ldots, a_{G-1}$ with expertise $e = 3, \ldots, G-1$ at time $t = \tau$

\[ \overline{\Pr}(a_1, a_3) = \prod_{e=1}^{G-3} \bar{p}(e, e+2), \quad \ldots, \quad \overline{\Pr}(a_1, a_{G-1}) = \prod_{e=1}^{1} \bar{p}(e, e+G-2). \]

From time step $t = \tau + 1$ to $t = \tau + G - 2$, the expertise of $a_1$ increases from $e = 2$ to $G-1$. A new student joins the group at each time step

\[ \overline{\Pr}(a_1, a_G) = \prod_{e=2}^{G-1} \bar{p}(e, e-1), \quad \ldots, \quad \overline{\Pr}(a_1, a_{2G-3}) = \prod_{e=G-1}^{G-1} \bar{p}(e, e-(G-2)). \]

The intra-group degree for student collaborators (Eq. A1) now is

\[ d_i = 2(G-2) - \sum_{j=1}^{G-2} \prod_{e=1}^{G-1-j} \bar{p}(e, e+j) - \sum_{j=2}^{G-1} \prod_{e=j}^{G-1} \bar{p}(e, e-(j-1)). \tag{A3} \]

The inter-group degree $d_b$ is similar except for a different probability form. There are a total of $2(G-2)+1$ different students (one more student in group $g'$ with expertise 1). Let $\bar{p}_b^{(\alpha)}(e_1, e_2)$ be the probability that two students with expertise $e_1$ and $e_2$ are not chosen in any of the $\alpha$ inter-group papers. Then, the number of student collaborators in another group $g'$ is

\[ d_b = 2G-3 - \sum_{j=0}^{G-2} \prod_{e=1}^{G-1-j} \bar{p}_b^{(\alpha)}(e, e+j) - \sum_{j=2}^{G-1} \prod_{e=j}^{G-1} \bar{p}_b^{(\alpha)}(e, e-(j-1)). \tag{A4} \]

Considering the two advisors, we get the degree of the student $a_1$ at graduation

\[ d = d_i + d_b + 2. \tag{A5} \]

After graduation, $a_1$ becomes an advisor with probability $f$. The increased degree at each time step now is the probability of collaborating with the two newly joined students

\[ \Delta d_m = \left( 1 - (1 - p_i(1)) \right) + \left( 1 - (1 - p_b(1))^{\alpha} \right). \tag{A6} \]

The total degree of advisors at time step $t$ is

\[ D_m(t) = \sum_{\tau=G-1}^{t-1} \left\{ n_s^{(G-1)}(\tau) \cdot f \cdot \left[ d + (t - \tau + 1)\, \Delta d_m \right] \right\}. \tag{A7} \]

Using the same idea, we can get the degree of a student with expertise $e$ ($e \leq G-1$) at time step $t$. This is a general case of Equations A3 and A4

\[ d_i^{(e)}(t) = 2(G-2) - \sum_{i=1}^{G-1-e} \prod_{j=1}^{e} \bar{p}(j, j+i) - \sum_{i=1}^{e-1} \prod_{j=1}^{e-i} \bar{p}(j, j+G-1-(e-i)) - \sum_{i=2}^{e} \prod_{j=i}^{e} \bar{p}(j, j-(i-1)), \]

\[ d_b^{(e)}(t) = 2G-3 - \sum_{i=0}^{G-1-e} \prod_{j=1}^{e} \bar{p}_b^{(\alpha)}(j, j+i) - \sum_{i=1}^{e-1} \prod_{j=1}^{e-i} \bar{p}_b^{(\alpha)}(j, j+G-1-(e-i)) - \sum_{i=2}^{e} \prod_{j=i}^{e} \bar{p}_b^{(\alpha)}(j, j-(i-1)), \]

\[ d^{(e)}(t) = d_i^{(e)}(t) + d_b^{(e)}(t) + 2. \tag{A8} \]

So the mean degree $\langle d \rangle$ at time step $t$ is

\[ \langle d \rangle (t) = \frac{1}{N(t)} \left[ D_m(t) + \sum_{e=1}^{G-1} n_s^{(e)}(t)\, d^{(e)}(t) \right]. \tag{A9} \]

Appendix B: Calculation of $p_i(e)$

We describe how to calculate $p_i(e)$, $p_b(e)$, $p_i(e_1, e_2)$, and $p_b(e_1, e_2)$. Recall that $p_i(e)$ is the probability that a student with expertise $e$ is chosen for one intra-group paper. Its distribution is the sum of multiple multivariate Wallenius' noncentral hypergeometric distributions [43] and can be calculated by using package BiasedUrn in R [44]. The calculations for $p_b(e)$, $p_i(e_1, e_2)$, and $p_b(e_1, e_2)$ are similar.

[1] S. Wuchty, B. F. Jones, and B. Uzzi, Science 316, 1036 (2007).
[2] D. B. de Beaver and R. Rosen, Scientometrics 1, 133 (1979).
[3] X. Sun, J. Kaur, S. Milojević, A. Flammini, and F. Menczer, Scientific Reports 3, 1069 (2013).
[4] B. F. Jones, S. Wuchty, and B. Uzzi, Science 322, 1259 (2008).
[5] B. Uzzi, S. Mukherjee, M. Stringer, and B. Jones, Science 342, 468 (2013).
[6] M. E. J. Newman, SIAM Review 45, 167 (2003).
[7] R. Albert and A.-L. Barabási, Rev. Mod. Phys. 74, 47 (2002).
[8] S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D.-U. Hwang, Physics Reports 424, 175 (2006).
[9] M. E. J. Newman, PNAS 98, 404 (2001).
[10] A. Barrat, M. Barthélemy, R. Pastor-Satorras, and A. Vespignani, PNAS 101, 3747 (2004).
[11] A.-L. Barabási, H. Jeong, Z. Neda, E. Ravasz, A. Schubert, and T. Vicsek, Physica A 311, 590 (2002).
[12] P. Holme, B. J. Kim, C. N. Yoon, and S. K. Han, Phys. Rev. E 65, 056109 (2002).
[13] X.-F. Liu, X.-K. Xu, M. Small, and C. K. Tse, PLoS ONE 6, e26271 (2011).
[14] M. E. J. Newman, Phys. Rev. Lett. 89, 208701 (2002).
[15] V. Colizza, A. Flammini, M. A. Serrano, and A. Vespignani, Nature Physics 2, 110 (2006).
[16] T. Opsahl, V. Colizza, P. Panzarasa, and J. J. Ramasco, Phys. Rev. Lett. 101, 168702 (2008).
[17] M. Girvan and M. E. J. Newman, PNAS 99, 7821 (2002).
[18] S. Fortunato, Physics Reports 486, 75 (2010).
[19] M. Rosvall and C. T. Bergstrom, PNAS 105, 1118 (2008).
[20] Y.-Y. Ahn, J. P. Bagrow, and S. Lehmann, Nature 466, 761 (2010).
[21] M. S. Granovetter, American Journal of Sociology 78, 1360 (1973).
[22] N. Friedkin, Social Networks 2, 411 (1980).
[23] J.-P. Onnela, J. Saramäki, J. Hyvönen, G. Szabó, D. Lazer, K. Kaski, J. Kertész, and A.-L. Barabási, PNAS 104, 7332 (2007).
[24] J.-P. Onnela, J. Saramäki, J. Hyvönen, G. Szabó, M. A. de Menezes, K. Kaski, A.-L. Barabási, and J. Kertész, New Journal of Physics 9, 179 (2007).
[25] X.-Q. Cheng, F.-X. Ren, H.-W. Shen, Z.-K. Zhang, and T. Zhou, J. Stat. Mech. 2010, P10011 (2010).
[26] P. A. Grabowicz, J. J. Ramasco, E. Moro, J. M. Pujol, and V. M. Eguiluz, PLoS ONE 7, e29358 (2012).
[27] S. Pajevic and D. Plenz, Nature Physics 8, 429 (2012).
[28] R. K. Pan and J. Saramaki, EPL 97, 18007 (2012).
[29] M. E. J. Newman and J. Park, Phys. Rev. E 68, 036122 (2003).
[30] M. Catanzaro, G. Caldarelli, and L. Pietronero, Phys. Rev. E 70, 037101 (2004).
[31] J. J. Ramasco, S. N. Dorogovtsev, and R. Pastor-Satorras, Phys. Rev. E 70, 036106 (2004).
[32] K. Börner, J. T. Maru, and R. L. Goldstone, PNAS 101, 5266 (2004).
[33] R. Guimera, B. Uzzi, J. Spiro, and L. A. N. Amaral, Science 308, 697 (2005).
[34] K.-I. Goh, B. Kahng, and D. Kim, Phys. Rev. E 72, 017103 (2005).
[35] K.-I. Goh, Y.-H. Eom, H. Jeong, B. Kahng, and D. Kim, Phys. Rev. E 73, 066123 (2006).
[36] M. E. J. Newman, Phys. Rev. E 64, 016132 (2001).
[37] D. J. Watts and S. H. Strogatz, Nature 393, 440 (1998).
[38] M. E. J. Newman, Phys. Rev. E 74, 036104 (2006).
[39] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, J. Stat. Mech. 2008, P10008 (2008).
[40] H.-H. Jo, R. K. Pan, and K. Kaski, PLoS ONE 6, e22687 (2011).
[41] J. M. Kumpula, J.-P. Onnela, J. Saramäki, K. Kaski, and J. Kertész, Phys. Rev. Lett. 99, 228701 (2007).
[42] The expertise threshold for students to graduate, G, reflects the median year to earn a Ph.D. degree reported in [45]; probability of graduates to form new groups, f, is based on the percentage of Ph.D. students who land academic jobs by graduation [46].
[43] J. Chesson, J. Appl. Probab. 13, 795 (1976).
[44] A. Fog, Biasedurn: Biased urn model distributions, http://cran.r-project.org/web/packages/BiasedUrn/index.html (2013), [Online; Accessed on 1-Nov-2013].
[45] T. B. Hoffer and V. Welch, Time to Degree of U.S. Research Doctorate Recipients, http://www.nsf.gov/statistics/infbrief/nsf06312/ (2006), [Online; Accessed on 29-Mar-2014].
[46] J. Weissmann, How Many Ph.D.'s Actually Get to Become College Professors?, http://www.theatlantic.com/business/archive/2013/02/how-many-phds-actually-get-to-become-college-professors/273434/ (2013), [Online; Accessed on 29-Mar-2014].
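Supplementary sketch for Eqs. (13) and (14): the growth recurrences for the number of groups $n_g(t)$ and the number of students $n_s(t)$ can be iterated directly. The Python below is only an illustration under the stated initial conditions, with defaults taken from the parameter values quoted in the text ($G = 7$, $f = 0.2$). The function names `group_counts` and `student_counts` are hypothetical, and the code computes the analytical expectation only; it does not simulate the model or reproduce Fig. 6.

```python
def group_counts(t_max, G=7, f=0.2):
    """Expected number of groups n_g(t) for t = 0..t_max.

    Uses n_g(0) = ... = n_g(G-2) = 1 and, for t >= G-1,
    n_g(t) = n_g(t-1) + f * n_g(t-G+1)   (Eq. 13).
    """
    n_g = [1.0] * min(t_max + 1, G - 1)
    for t in range(G - 1, t_max + 1):
        n_g.append(n_g[t - 1] + f * n_g[t - G + 1])
    return n_g


def student_counts(n_g, G=7):
    """Expected number of students n_s(t) for every t covered by n_g.

    Uses n_s(t) = t + 1 for t <= G-2 and, for t >= G-1,
    n_s(t) = sum_{e=1..G} n_g(t-e+1)     (Eq. 14).
    """
    n_s = []
    for t in range(len(n_g)):
        if t <= G - 2:
            n_s.append(float(t + 1))
        else:
            n_s.append(sum(n_g[t - e + 1] for e in range(1, G + 1)))
    return n_s


# Iterate the recurrences over the time range shown on the Fig. 6 axes (0..70).
n_g = group_counts(70)
n_s = student_counts(n_g)
print(n_g[6], n_s[6])
```

Because each new group appears only after a delay of $G-1$ steps, the recurrence is a linear delay equation; iterating it as a list keeps every earlier value available for the lagged term.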
