Table Of Content

Around Average Behavior: 3-lambda Network Model Milos Kudelka Eliska Ochodkova Sarka Zehnalova FEI,VSB-Technical FEI,VSB-Technical FEI,VSB-Technical UniversityofOstrava UniversityofOstrava UniversityofOstrava 17. listopadu15,70833 17. listopadu15,70833 17. listopadu15,70833 Ostrava-Poruba,Czech Ostrava-Poruba,Czech Ostrava-Poruba,Czech Republic Republic Republic [email protected] [email protected] [email protected] ABSTRACT suchascore-peripherystructure[16]andself-similarity[19]. Underlying processes that take place during the evolution The analysis of networks affects the research of many real 7 of real-world networks are also examined. Some models are phenomena. The complex network structure can be viewed 1 based on analyzing these processes, which allows using the asanetwork’sstateatthetimeoftheanalysisorasaresult 0 formallydescribedunderlyingprocessasagenerativemech- of the process through which the network arises. Research 2 anism. Such a mechanism can generate networks possess- activities focus on both and, thanks to them, we know not n only many measurable properties of networks but also the ing one or more known properties. Models that reveal key a principles include those using the preferential attachment essenceofsomephenomenathatoccurduringtheevolution J to generate network centers [1] or triadic closure i.e. com- ofnetworks. Onetypicalresearchareaistheanalysisofco- 6 authorship networks and their evolution. In our paper, the pletinginterconnectionsintotriangles,capableofgenerating analysis of one real-world co-authorship network and inspi- community structure [5]. ] Models that provide key knowledge about networks are rationfromexistingmodelsformthebasisofthehypothesis I usually inherently simple. Most of them, however, focus on S from which we derive new 3-lambda network model. This thequestion“Howtoconnectanewnodeintothenetwork?” . hypothesis works with the assumption that regular behav- s Our question is, “How does an existing node behave to its c ior of nodes revolves around an average. However, some neighbors and other existing and new nodes during the net- [ anomalies may occur. The 3-lambda model is stochastic and uses the three parameters associated with the average work’s evolution?”. The result of our focus on the behavior 2 ofexistingnodesisasimplemodelwithoutmemoryandwith behavior of the nodes. The growth of the network based v onlythreeparameters. Thisnew3-lambdamodelisinspired on this model assumes that one step of the growth is an 4 bytheevolutionoftheco-authorshipnetwork. Fortheanal- interaction in which both new and existing nodes are par- 7 ysisweusedanetworkgeneratedfromaDBLPdatasetand ticipating. Inthepaperwepresenttheresultsoftheanalysis 2 we worked with the assumption that in each publication is of a co-authorship network and formulate a hypothesis and 1 just one key author who picks out additional co-authors. a model based on this hypothesis. Later in the paper, we 0 In the analytically oriented experiment, we show that with examine the outputs from the network generator based on 1. the3-lambdamodelandshowthatgeneratednetworkshave suchanassumption,thenumberofpublicationswithagiven 0 characteristics known from the environment of real-world number of co-authors corresponds approximately to a Pois- 7 networks. son distribution. Based on the result of this experiment, 1 weformulateasimplehypothesisandtheresultingnetwork : growthmodel. Thishypothesisassumesthatone-stepofthe v Keywords networkgrowthisaninteractioninvolvingexistingandnew i X complexnetworks;graphs;networkmodel;communitystruc- network nodes. In this respect our approach is similar to ture the model of collaborative networks published by Ramasco r a et al. [15] and inspired by the analysis of co-authorship ego networks in research of Arnaboldi et al. [2]. 3-lambda is 1. INTRODUCTION a stochastic model that estimates the number of nodes in Network analysis is a phenomenon that affects research theinteractionusingthePoissondistribution. Intheexper- in many areas. One of the goals of network analysis is to imental part of this paper, we describe the network gener- describethephenomena,properties,andprinciplesthatare ator based on our model and three experiments. The first universal and manifest in nature, society, and in the use of experiment shows that generated networks have character- technology. As a network, we understand an ordered pair isticsknownfromreal-worldnetworks,andhowtheselected G = (V,E) (undirected unweighted graph) of a set V of properties change with a different setting of the generator. nodes and a set E of edges which are unordered pairs of The second experiment shows how the properties of gener- nodesfromG. Thecomplexnetworkstructurecanbeviewed atednetworkchangeduringitsgrowth. Theaimofthethird from the perspective of the network’s state at the time of experimentistocomparesomecharacteristicsoftheDBLP the analysis. Networks can, therefore, be described by the network and large-scale generated networks. properties known from the environment of real-world net- The rest of the paper is organized as follows: Section 2 works, including, in particular, the small-world, free-scale, focusesontherelatedwork. Section3providesourfindings high average clustering coefficient, assortativity [13], com- and hypothesis on the real-world network extracted from munity structure, shrinking diameter [12], but also others the DBLP dataset. In Section 4, we describe the 3-lambda are denser than communities themselves. modelofcollaborativenetworkandnetworkgeneratorbased Processes in the networks take place in time. Networks on this model. Section 5 focuses on three experiments with are from this perspective temporal, and each interaction is generated networks. We conclude and briefly discuss open reflectedinchangestothenetworkstructure. Applicationof problems in Section 6. theprinciplesmentionedaboveinthecourseofnetworkevo- lution allows us to examine how network structure changes 2. RELATEDWORK overtime. Holme&Saramaki[10]presentedatime-varying importanceofnodesandedgestogetherwithasurveyofex- Inthelasttwodecades,theanalysisofreal-worldnetworks isting approaches and the unification of terminology in the has received extraordinary attention. One of the sources of area of temporal networks research. Ramasco at al. [15] data is social networks, which are growing at an enormous studied social collaboration networks as dynamic networks rate. Notableandalonginvestigatedsourceinthisareaare growing in time by the continuous addition of new acts of co-authorship and, in general, collaborative networks. A collaboration and new actors. In real-world networks, par- common feature of this type of network is that underlying ticularly social ones, instances often have strong relations processes proceed in cliques, which then become a funda- defined as interactions that are frequently repeated (nodes mental building block of the network. Barabasi et al. [3] remember them), as well as weak relations representing the presented and analyzed in detail a network model inspired occasional interactions. Karsai et al. [11] explain, how is by the evolution of co-authorship networks. The research creating new relationships and strengthening existing links presented by Ramasco et al.[15] falls into the same area; in networks important for network evolution. it analyzed in detail the development of collaboration net- Therearealsonovelapproachesfocusedonmodelswhich works. Their model combines preferential edge attachment allow generating networks with predictable properties. For withthebipartitestructureanddependsontheactofcollab- instance,Zhangetal. [23]formulatedagenerativemodelas oration. The rise of the giant connected component in the an optimization problem. set of k-cliques of a classical random graph was described In our approach, we do not use preferential attachment by Derenyi at al. [7] as well as a k-clique community, as a in a straightforward manner. It is, however, a side effect unionofallk-cliques. Anovelmodelofmulti-layernetwork of principles related to the nature of the formation of the wasproposedbyBattistonetal. [4],theirmodelcapturesa community structure. In our model, we use a clique as a multi-faceted character of actors in collaborative networks. structural element of the network. A clique is the result of Theuniversallyrecognizedprincipleistheso-calledprefer- interaction among nodes and is the basis of the community entialattachment. Atthemomentoftheconnectingofnew structure. Our networks are generated as temporal because nodestothenetworkduringitsgrowth,thereisapreference oneinteractionistheresultofonestepofthegrowthofthe for selecting high degree nodes. The well-known Barabasi- network. Albertmodel[1]isbasedonexperimentalworkandanalysis of large-scale data. Zuev et al. [24] described how pref- 3. DBLPDATASETANALYSIS erential attachment together with latent network geometry explains the emergence of soft community structure in net- WestudiedtheDBLPdatasetwhichcontainsbasicbibli- works and non-uniform distribution of nodes. ographicinformationofpublicationsfromthecomputersci- Oneofthebasiccharacteristicsofsometypesofnetworks encefield. Thisdataisfreelyavailable1 andcontainshighly (socialandbiological),istheircommunitystructure. Under- relevantinformationaboutpublicationactivityfromthepe- standingtheprinciplesuponwhichcommunitiesemergeisa riod of nearly fifty years, even though they are not com- key task. Growing network model using the triadic closure plete. At the time we downloaded this dataset (July 2016), mechanism is able to display a nontrivial community struc- andafterfirstpre-processing,itcontainedatotalof2988015 ture,aswasproposedbyBianconiatal. [5]. Theadditionof publications. FurthercharacteristicscanbeseeninTable1. links between existing nodes having a common neighbor as alocalprocessleadstotheemergenceofpreferentialattach- ment as is stated by Shekatkar & Ambika [18]. In another Table 1: DBLP dataset modelforgrowingnetworkproposedbyToivonenetal. [20], communitiesarisefromamixtureofrandomattachmentand Total number of publications 2988015 implicit preferential attachment. Total number of authors 1622828 Afrequentfeatureoftheseapproachesisthatcommunities Mean papers per author 5.113 risefromacombinationoflinksbetweenexistingnodeswith Mean authors per paper 2.895 their neighbors to new nodes. Some of these approaches do not use the preferential attachment for node selection Based on the co-authorship of authors, we constructed a because the scale-free property is the result of underlying social network where authors are linked if they co-authored processes. apaper. Theweightoftheedgecorrespondstothenumber Another well-known property of real-world networks is of co-authored papers. Basic characteristics of this network thatcommunitieshaveoverlaps. Anodemaybelongtomore and its maximal connected component are in Table 2. cliques simultaneously, and this property is the basis of the In additional pre-processing of the dataset, we set each Clique Percolation Method presented by Palla et al. [14]. publication a month which corresponded to the date of a Thecliquegraph,whereincliquesofagivenorderarerepre- conference or date of publication in a journal, respectively. sentedasnodesinaweightedgraph,isaconceptualtoolto If the record in the DBPL did not contain the month of understand the k-clique percolation described by Evans [9]. publication, we chose the month for a given year randomly. Yang and Leskovec introduced the Community-Affiliation Graph [22] based on observation, that community overlaps 1http://www.informatik.uni-trier.de/˜ley/db/ Table2: DBLPnetworkanditsmax. connectedcomponent net maxCC 150 ll M1.e8a5n 80 lll M2.e2a3n4 150 l M1.e6a3n8 80 lll M2.e6a2n8 250 l M0.e2a3n8 TToottaall nnuummbbeerr ooff nedogdeess 16595340757425 16470688313018 050 l lllll 040 l lllllll 050 l l l l l 040 l lllllllll 0100 l l DMeenasnitdyegree 5.738e.9-0165 6.839e.6-0162 80 0ll2l 4M2.6e3a9n7 80 0ll3l 6M2.e29a1n1 80 0lll2 M24.e3a7n5 250 0l 3 6 M09.e0a8n5 80 0lll1 M2.e42a2n9 NGulombbalercloufstceorninngecctoeedfficcoimenptonents 04.91475499 0.17411 040 l llllllll 040 l llllll 040 l llllllll 0100 l l l l 040 l llll Mean local cluster coefficient 0.7341 0.7217 0 3 6 9 0 3 6 9 0 3 6 9 0 2 4 0 2 4 6 MNuemanbeerdgoef cwoemigmhtunities (Louvain) 1.746- 1.742622 04080 llllllllllllM2l.le5la7ln2l 04080 llllllllM2.le4la7n9l 04080 llllllM2l.e2la4n3l 04080 llllllllM2l.le3a8ln8l 02060 lllllllM2l.e7la9n5l 0 4 8 13 0 3 6 9 0 3 6 0 3 6 9 0 3 6 9 Furthermore,weassumedthattheauthorexistinginagiven monthistheauthorwhohadatleastonepublicationinthe Figure2: Co-authorshistogramsfortop15authors(bynum- month preceding the given month. In the next step, we ber of publications) dropped all publications that did not contain any already existingauthors. Fortherestofpublications,wehaveiden- tified the main author, who was the first existing author in the second group are those who have previously published the order of the co-authors of these publications. but not yet together with the main author. In the third ThefirstobjectiveoftheexperimentwiththeDBLPdataset group are new authors, i.e. those who have no previous wastodiscoverwhatshapehasthedistributionofthenum- publications. Wecanthenformulateahypothesis,basedon ber of publications depending on the number of co-authors which we define the new network model in the next part of of the main author. Figure 1 shows the distribution for the the paper. firsttwentyvalues(i.e. upto20co-authors)andcumulative distribution of all values (except for one publication with Hypothesis 286 authors). This distribution is compared to a Poisson distribution with a λ value equal to the average number of 1. The variables describing the number of co-authors in co-authors, which is 1.99. different groups are independent random variables. 2. Justasthetotalnumberofco-authors,thesevariables ons ll E(Tmhmeepaoirnrei cctaiocl −ala (uPthooisrsso =n )1.99) 0.81.0 lllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll lll ll In tfhoilslowpaPpeoris,swonedairsetrwibourtkiionng.with a co-authorship net- Total publicati4e+05 l ll CDF20.40.6 l wfmoororedk,e,lt.hweThhipcerhemsiesondetesesldeinsmtpioardlilmeylaacrialcnyolanlbaobotuobrtaeatitsvaiekmenpneletawssoimarkuu.lnaTtivihoenersroea-fl 0. l the development of collaborative networks inspired by ana- +00 llllllllllllllll 0.0 EThmepoirreictaiclal (Poisson) lyzing the co-authorship network. 0e 0 3 6 9 12 15 18 0 20 40 60 80 100 Number of co−authors Number of co−authors 4. 3-LAMBDAMODEL Figure 1: DBLP: Poisson distribution As mentioned in the introduction, the model is based on theassumptionthatonestepofthenetwork’sgrowthisone interaction. Both new and existing nodes are involved in The second objective of the experiment was to find au- the interaction, and after the interaction, there are edges thorswhoaremostoftenintheroleofthemainauthorofa between all involved pairs of nodes. Some (or all) of these publication. The results of this part of the experiment are edgesbetweenpairsofinvolvednodescouldexistbeforethe to be understood only as an estimate based on the above- interaction. The model is inherently temporal since we can mentioned assumptions about the main author. Empirical associate with each node or edge the list of interactions in distribution,togetherwiththetheoreticalvalueofthePois- which they participated during the growth of the network sondistributionforthefirstfifteenauthorswiththehighest (the list of interactions is a list of steps of the growth in number of publications in the role of the first author, is the order in which the interactions took place). It is fur- shown in Figure 2. ther assumed that neither nodes nor edges age, so they are 3.1 Hypothesis not removed during the growth of the network. From this perspective the model is growing. Results of the analysis of the DBLP dataset can not be In the model, nodes involved in the interaction have four easily generalized. However, experiments have shown quite different roles: clearly that the probability of a publication with a certain numberofco-authorsfollowsthePoissondistribution. Ifwe 1. key node of the interaction (proactive) extend thoughts about the main author by what kind and how many different co-authors he/she has, we may divide 2. nodes adjacent to the proactive node (neighbors) co-authors into three groups. In the first group are authors with whom the main author has previously published. In 3. new nodes (newbies) (a)λ1 =0,λ2 =3,λ3 =0 (b)λ1 =2,λ2 =3,λ3 =0 (c)λ1 =2,λ2 =3,λ3 =0.5 Figure 3: 3-lambda model: generated networks with 200 nodes 4. nodesthatarenotadjacenttotheproactivenode(new The minimum and the maximum number of new connec- connections) tions (edges) is in Equations ?? and ??, respectively. It is essential for the model that each interaction always n·(n−1) contains exactly one proactive node and may (but doesn’t MIN =n+e+b·n+e·n+ (2) 2 have to) include nodes in three other roles. How many spots will be represented by each of these three roles is se- b·(b−1) e·(e−1) lected based on the Poisson distribution with a preselected MAX =MIN +e·b+ + (3) λ (neighbors); λ (newbies); and λ (new connections). 2 2 1 2 3 Furthermore, the model assumes that the selection of spe- Thus, if for example we set b = 2,n = 3,e = 1 for the cific existing nodes for neighbor roles (in the number equal interaction, then: to the maximum number of all neighbors at most) and new connections is random. • interaction size is s=1+2+3+1=7 A natural characteristic of the model is that nodes with • after the interaction a total of 21 edges exist between a higher degree are more likely to participate in an interac- pairs of nodes tion in which the proactive node has at least one neighbor. Although a proactive node of the interaction and its neigh- • MIN =3+1+6+3+3=16 in the case of 5 edges borsarebeingselectedatrandom,ahighdegreenodehasa existingbetweenpairsofnodespriortotheinteraction greater chance of being selected as a neighbor of the proac- • MAX = 10+2+1+0 = 19 in the case of 2 edges tive node. existingbetweenpairsofnodespriortotheinteraction λ , λ and λ significantly affect the density of the net- 1 2 3 work. If we assume that a randomly selected interaction Each of λ , λ and λ affects different property of the has, based on the corresponding distributions, b neighbors 1 2 3 network generated by 3-lambda model. of a proactive node, n new nodes and e nodes unconnected totheproactivenode,thenthenumberofnodesinvolvedin • λ (newbies) defines the growth rate of the network 2 this interaction (interaction size) is in Equation ?? andprovidesatree-likenetworkstructure,seeFig. 3a. ToconstructanetworkwithN nodesrequiresapprox- imately N/λ interactions. s=1+b+n+e (1) 2 • λ (neighbors) constitutes network community struc- and the following applies: 1 ture through local connections of existing neighbors • n new nodes, which must connect to a proactive node and new nodes, see Fig. 3b. and each other, is created. • λ (newconnections)ensureslinkingofnodesthatare 3 • Therearebnodesadjacenttotheproactivenodewhich not adjacent, thereby linking communities. The con- must first get connected among each other, then with sequence is emerging of core-periphery network struc- e nodes that are not adjacent to the proactive node ture, see Fig. 3c. (these edges may already exist prior to the interac- 4.1 NetworkGenerator tion). Finally, they must connect with the new n Thenetworkgeneratorusesasimplealgorithmwhichcomes nodes. directly from the model description. The only extra step is • There are e nodes not adjacent to the proactive node, setting up the initial network state. The model is memory- whichmustgetconnectedwithit. Next,theymustget less, which allows working with an arbitrary initial state. connectedamongeachother(theseedgesmayalready For our generator we chose a complete graph with a num- exist prior to the interaction) and n new nodes. ber of nodes equal to the round of (1+λ +λ +λ ) as 1 2 3 the default state. Algorithm 1 describes the whole process participation of new nodes and rather exceptional partic- of generating a network. The generated network is a con- ipation of existing nodes yet not connected to the proac- nected graph; the algorithm starts with a complete graph tive node. In Setting = [3,6,1] predominate new nodes, 2 and each interaction contains at least one existing node. In and the average number of nodes in interaction is 11. In our implementation, for reasons of analysis and visualiza- Setting =[0.45,0.45,0.1]interactioninvolvedtwonodes(a 3 tion, we store the number of interactions for each node and dyad)onaverage,whereinthenumberofneighborsandnew edge, which is not mentioned in the algorithm. nodesarebalancedandnewconnectionoccurrencesareless likely. input : number of nodes N, λ ,λ ,λ 1 2 3 output: generated network G 103 103 choose s=ROUND(1+λ +λ +λ ) cwrheaitlee GV =ha(sVl,eEss)tahsanaNcomn1podleetse2dgoraph3 with s nodes freq.110012 freq.110012 freq.102 choose from V randomly proactive node A b= number of neighbors by Poisson(λ ) (b is the 100 100 100 1 100 101 102 100 101 102 100 101 number of neighbors of A at most) degree degree degree n= number of newbies by Poisson(λ2) (a)Setting1 (b)Setting2 (c)Setting3 e= number of new connections by Poisson(λ ) 3 create a list I with a proactive node A Figure 4: Degree distribution add to list I b randomly selected neighbors of node A from V create n new nodes, add them to V and I add to list I e randomly selected not-neighbors of 103 103 103 node A from V (if such nodes exist) eq.102 eq.102 eq.102 foreifacnhopeadigreoefbneotdweesen(VVi,Vajn)d∈VI deoxists then fr101 fr101 fr101 i j create e and add to E 100 100 100 endif 10in0ter1a0c1t. pe1r0 2nod1e03 10in0ter1a0c1t. pe1r0 2nod1e03 10in0ter1a0c1t. pe1r0 2nod1e03 end (a)Setting1 (b)Setting2 (c)Setting3 end Algorithm 1: 3-lambda model network generator Figure 5: Interactions per node Theaveragenumberofnodesinaninteractionisapproxi- matelys=1+λ1+λ2+λ3. However,theaverageisslightly 30 60 lower because the number of neighbors selected for interac- 12 tion through the simulation of the Poisson distribution is eq.20 eq.8 eq.40 limited by the actual (maximum) number of neighbors of fr fr fr 10 20 4 the proactive node. The complexity of the algorithm is O(s2 · N), which is 0 0 based on the fact that: λ2 0 co3m0mu6n0ity s9iz0e 0 co2m5mu5n0ity s7iz5e 0 c1o0mm2u0nity3 s0ize40 (a)Setting1 (b)Setting2 (c)Setting3 • TogenerateN nodesrequiresapproximatelyN/λ in- 2 Figure 6: Community size teractions (λ is the average number of new nodes in 2 one interaction). 5.1 PropertiesofGeneratedNetwork • The number of edges between nodes in an interaction is in quadraticrelationtothe number of nodes of this Inthefirstexperiment,weshowthepropertiesofnetworks interaction (the interaction takes place in a complete generatedwithdifferentsettings. Wegeneratedforeachset- sub-graph with n nodes and m= n∗(n−1) edges). ting100networkswithapproximately10000nodes. Table3 2 summarizesaveragevalues(andstandarddeviation)ofmea- The calculation of algorithm complexity does not include sured properties for each setting. Measured properties in- thecomplexityofthesimulationofthePoissondistribution cludenumberofnodesnandedgesm,averagedegree<k>, for individual lambdas (in our case, the Knuth’s algorithm average shortest path length <l>, diameter Lmax, average with complexity O(λ) was used). clustering coefficient CC, assortativity r, number of communities detected by Infomap [17] com and Louvain [6] IM com algorithm, and corresponding modularities Q and 5. EXPERIMENTALEVALUATION L IM Q ,respectively. Table4containsaveragevaluesassociated L We are using using three different settings, Setting = with the temporality of the network, i.e. total number of [λ ,λ ,λ ],intheexperiments. InSetting =[1.6,0.35,0.05], interactions I, average number of interactions <i> and av- 1 2 3 1 theaverageinteractioninvolvedthreenodes(atriad),which eragenumberofnodesininteraction<s>. Figures4-6show corresponds approximately to the analyzed co-authorship degreedistribution,numberofinteractionsdistributionand network. This setting presumes the interaction is domi- distributions of community size detected by Infomap algo- natedbyneighborsoftheproactivenodewiththeoccasional rithm. Table 3: Global properties Setting n m <k> <l> l CC r com com Q Q max IM L IM L Setting mean 10001.10 40108.97 8.0209 4.8101 12.07 0.6555 0.14522 609.40 54.70 0.6424 0.7119 1 sd 0.31 398.80 0.0797 0.0421 0.64 0.0029 0.01769 12.37 7.04 0.0062 0.0085 Setting mean 10003.87 88620.40 17.7172 4.2369 8.43 0.8087 0.12961 422.60 42.90 0.6772 0.7334 2 sd 1.53 797.95 0.1600 0.0277 0.63 0.0018 0.00913 10.30 2.78 0.0049 0.0067 Setting mean 10001.17 21494.57 4.2984 6.8965 16.23 0.4784 0.19907 953.30 68.70 0.7276 0.8129 3 sd 0.46 142.64 0.0285 0.0609 0.63 0.0044 0.01222 15.55 3.19 0.0035 0.0038 Table 4: Interactions ment of network properties during its growth. The values for each characteristic are measured when the network has Setting I <i> <s> 10,20,50,100,...,and10000nodes. Theresultsaresumma- Setting1 mean 28561.90 8.0887 2.8323 rized in Table 5. sd 278.93 0.0817 0.0082 Figure 8, similarly to the previous experiment, shows the Setting2 mean 1664.53 1.8278 10.9860 distribution of degree, number of interactions and size of sd 17.69 0.0115 0.0824 communities. Theevolutionofthesepropertiesisportrayed Setting3 mean 22204.43 4.3772 1.9716 when network has 100,200,1000,5000,10000 nodes. sd 202.06 0.0292 0.0069 Key properties (average degree, shortest path, clustering coefficient, assortativity, modularity) happen to stabi- lize their values between 1000 and 10000 nodes. Our ex- The experiment indicates that all three settings gener- perimentsshowthatgeneratednetworkswithothersettings atenetworkswithsmall-worldandscale-freecharacteristics. also have similar behavior. Figure 9 shows the evolution of Thefirstandsecondsettinghavegeneratednetworksofhigh thenumberofinteractionsforfifteennodeswiththehighest average clustering coefficient. Networks have a tendency to numberofinteractionsattheendofthegenerationprocess. be assortative. Assortativity values correspond to all set- The ID of a node represents the moment of its creation. tings to the values known from social networks [13]. Gen- The trend shows how the chances of participating in inter- erated networks also have community structure and a high actions increase for nodes that already have a high number modularity for all settings. of interactions. 5626003721592585226967544808111562806883752725641193315822823324755921115898464198101760826933245706109821997888434267284451594884487835337342654037326741916737778439664432574161512948367714480435859875725219772441187718142644669945745723355657386234392993471345158327396424443897117494440739219313891177346302028171153148096849061582469586790116874767826628413723510841688435485054687869706676426668651670114916194985062972451958279946440768569419089866645195977416179829581455760708558567962713511175662781324021442398738476574557656133495129292193066741090698477414365728910621293890227022711242202857819664524687986717570458964299822533270018731079950189458649313768013516572775842788398879838351970266273141626312939100797783600868757650484065634245300938730817122701921390035971943830459410372347893588129089422726710461116418547470118867106426677295593734273781449749799848368454824798153862216137570322228047165300426158086556941682688134931727125576643539460812262139147619506046881989558582679439969921453607758837965903186617978251371683795775036747785651330613587195952513198636584675462096876855849262388959262457952812476232494228872649980304158329293608795271438003913009176366483785563143551140626170882193931117214341626150042532581122586039634751519157111832991986320655927137588247523101257765870627428526352472639064203276638314614552167708462323954587134567467887086468272489644301025242434347931775863630821125967934729517835018315943184615604071356238648555798422048933676266764022768911190996281849260787967389740222919861378567792500022788011011429098213838276476340251524922819533985222748422642201898182986642796232250239833258599335758514955437724262260323238056732749416036643056306424780746178155536647788365957605144699693837533728548993511845025634479955391978946347010859519652835698584779351261432837402788968924572186390842722808722792543730578375267440873951555586013966414538909572349311121699395698876240271482148343877312174280825146936566597109539421548936350528736642382893454799965415726882166744698229158712018595331467478472626839797172416559321280953249852650109338453816983917796716563234634869696590105458375315122350366556151423394191371516152192702788150193456434420538613635566393981233141332956514625944564896603593620953696466664037249314513481578849719894208241152991421496971104161251127533762787061177378379378373995774539757223250257474291958758095928164197442097999110335424429605219493421218687490090808178020218554430640574799991836946844779402283875139480469595390682342917588549536318853171844748065423037584833426868968832252673738715439639821341545207658678772386932426521371976963286410311339917996768053735621189997064133913943425522738074049273173704013338593450510893259109490394654844785004080083945574358216885665793821972450763109676984129598430363572954712102933315066293132705154031522255563096437566561164597987899788 Interact. per NodeF1122ig05055000000ure109: 2E0volu5t0ion10o0Nfodneosd ine Ns5ew0tw0iotrhk th2e00m0ost in1t0e0r0a0ctIdi oo023567812235511f 01760833Nn03odse Figure 7: Network: 1000 nodes, Setting 1 5.3 Correlations Figure7showsanetworkwith1000nodesand3599edges generated with Setting . A total of 2830 interactions took In this experiment, we compare the analyzed real-world 1 place. The size of nodes and strength of edges correspond network with generated networks. For each setting we gen- to the number of interactions they took part in, the labels eratedanetworkwithonemillionnodes. Thenweexamined indicatetheorderinwhichnodeswerecreated. Thenetwork thecorrelationbetweenthenode’screationtimeandit’sde- hasanoverlappingcommunityandcore-peripherystructure. gree and number of interactions, respectively. The emer- Colored 19 communities were detected by Louvain method, gence of the node is represented by its ID, which is in the modularity is 0.746. order of its creation. Furthermore, we calculated the correlation between degree and number of node’s interactions. 5.2 EvolutionofGeneratedNetwork WedidthesamefortheanalyzedDBLPdataset,wherethe Thesubjectofthesecondexperimentisonenetworkgen- order of nodes is to be understood as an estimate based on erated with Setting . The aim is to show the develop- data pre-processing described in Section 3. 1 Table 5: Evolution of network properties n m <k> <l> l CC r com com Q Q max IM L IM L 10 27.00 5.4000 1.4000 20 0.8195 -0.41738 1 3 0.0000 0.0590 20 53.00 5.3000 2.0684 5 0.5804 -0.09355 2 4 0.0361 0.2398 50 184.00 7.3600 2.5004 7 0.6248 -0.00886 5 5 0.2957 0.3742 100 388.00 7.7600 2.8731 6 0.6793 0.09177 11 9 0.4259 0.4613 200 812.00 8.1200 3.1859 7 0.6824 0.05107 20 11 0.5002 0.5121 500 2112.00 8.4480 3.6095 8 0.6394 0.11911 44 11 0.5540 0.5849 1000 4053.00 8.1060 3.9402 9 0.6592 0.14528 83 17 0.5812 0.6341 2000 8306.00 8.3060 4.2054 10 0.6538 0.14016 151 24 0.5972 0.6714 5000 20316.00 8.1264 4.5344 12 0.6603 0.15823 342 44 0.6203 0.6827 10000 40476.00 8.0952 4.8015 11 0.6557 0.15351 645 64 0.6307 0.7003 101.2 Degree distribution 110000..48 ● ● ●●●●●●●●●●●● 101 ● ●●●●●●●●●●●●●●●●●●● 110012 ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● 110012 ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● 111000123 ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● 100 ●●●●●●●●●●● 100 ●●●●●●●●●●● 100 ●●●●●●●●●●●●●●●●●● 100 ●●●●●●●●●●●●●●●●●●●●●●● 100 ●●●●●●●●●●●●●●●●●●●●●●●●●●● 100 101 100 101 100 101 102 100 101 102 100 101 102 Interactions per node 11110000001...0482 ●●●●●●●●●●●●●●●●●●●●●●●●●●● 110001 ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● 111000012 ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● 111100000123 ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● 110002 ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● 100 101 102 103 100 101 102 103 100 101 102 103 100 101 102 103 100 101 102 103 2.00 ●●● 4 ● 10.0 ● 25 ● ● Community size 1111....02570505 ● ●● ● ● 123 ●●●●●●●● ● ● 257...505 ●●●●●●●●●●●●●●●●●●●●●●●●● ● ● 11205050 ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● ●●● 123400000 ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● 10 20 30 0 10 20 30 40 50 0 20 40 60 0 20 40 60 0 20 40 60 Figure 8: Evolution of network (100, 200, 1000, 5000, 10000 nodes) Table 6: Correlations some properties of the network. Previous experiments demonstrated that despite the ab- Setting ρ(Id,k) ρ(Id,i) ρ(k,i) sence of aging, generated networks have very good proper- Setting1 -0.59 -0.50 0.96 ties. Wehavenot,therefore,forreasonsofsimplicity,incor- Setting2 -0.47 -0.47 0.95 porated any of the known models of aging (e.g. inspired by Setting3 -0.70 -0.58 0.87 [8, 21]) to the 3-lambda network model. DBLP -0.23 -0.18 0.84 6. CONCLUSION Evolution of real-world networks is influenced by many The results summarized in Table 6 show that regardless factors. Thepurposeofnetworkmodelsistodiscoverthese of the setting, the first two correlations are much higher in factors and describe them in a simple way. Our research the 3-lambda model than in the DBLP dataset (Pearson’s focused on analyzing behavioral patterns of nodes existing correlation coefficient was used). Thus, in the presented inaco-authorshipnetworkwhileparticipatinginpublishing model, older nodes have a higher (and continuous) chance activities. We described four roles of nodes involved in in- to participate in interactions than in reality. The cause is teractions. Based on the analysis of the DBLP dataset, we probably the aging of nodes in real-world networks where formulatedthehypothesis,whichassumesthatthenumbers nodes, at different times, no longer participate in interac- ofnodesinvolvedininteractionsrevolvearoundanaverage, tions. This significantly affects the evolution (growth) and and they are independent Poisson variables. Based on this hypothesis, we defined the 3-lambda model of collaborative [12] J. Leskovec, J. Kleinberg, and C. Faloutsos. Graphs network. Themodelhasthreeparametersandhasnomem- over time: densification laws, shrinking diameters and ory. In three experiments, based on three different settings possible explanations. In Proceedings of the eleventh corresponding to dyads, triads and larger groups behavior, ACM SIGKDD international conference on Knowledge we showed that networks generated by the 3-lambda model discovery in data mining, pages 177–187. ACM, 2005. havethecharacteristicsknownfromtheenvironmentofreal- [13] M. E. Newman. Assortative mixing in networks. world social networks. Furthermore, we showed that the Physical review letters, 89(20):208701, 2002. model can be understood as temporal. In one experiment [14] G. Palla, I. Derényi, I. Farkas, and T. Vicsek. we presented the development and stabilization of gener- Uncovering the overlapping community structure of ated network properties in time. For future research there complex networks in nature and society. Nature, remainsomeopenquestions. Theybearrelationtorecogniz- 435(7043):814–818, 2005. ingotherfactorsinfluencingthedevelopmentofthenetwork [15] J. J. Ramasco, S. N. Dorogovtsev, and andtothedetailedstudyofthedependenceandpredictabil- R. Pastor-Satorras. Self-organization of collaboration ityofpropertiesofgeneratednetworksonthesettingofthree networks. Physical review E, 70(3):036106, 2004. network parameters (lambdas). [16] M. P. Rombach, M. A. Porter, J. H. Fowler, and P. J. Mucha. Core-periphery structure in networks. SIAM 7. ACKNOWLEDGMENTS Journal on Applied mathematics, 74(1):167–190, 2014. ThisworkwassupportedbySGS,VSB-TechnicalUniver- [17] M. Rosvall, D. Axelsson, and C. T. Bergstrom. The sity of Ostrava, under the grant“Parallel Processing of Big map equation. The European Physical Journal Special Data”, no. SP2016/97. Topics, 178(1):13–23, 2009. [18] S. M. Shekatkar and G. Ambika. Complex networks 8. REFERENCES with scale-free nature and hierarchical modularity. [1] R. Albert and A.-L. Barabási. Statistical mechanics of The European Physical Journal B, 88(9):1–7, 2015. complex networks. Reviews of modern physics, [19] C. Song, S. Havlin, and H. A. Makse. Self-similarity of 74(1):47, 2002. complex networks. Nature, 433(7024):392–395, 2005. [2] V. Arnaboldi, R. I. Dunbar, A. Passarella, and [20] R. Toivonen, J.-P. Onnela, J. Saramäki, J. Hyvönen, M. Conti. Analysis of co-authorship ego networks. In and K. Kaski. A model for social networks. Physica A: Advances in Network Science: 12th International Statistical Mechanics and its Applications, Conference and School, NetSci-X 2016, Wroclaw, 371(2):851–860, 2006. Poland, January 11-13, 2016, Proceedings, volume [21] K. S. Xu, M. Kliger, and A. O. Hero. Evolutionary 9564, page 82. Springer, 2016. spectral clustering with adaptive forgetting factor. In [3] A.-L. Barabaˆsi, H. Jeong, Z. Néda, E. Ravasz, 2010 IEEE International Conference on Acoustics, A. Schubert, and T. Vicsek. Evolution of the social Speech and Signal Processing, pages 2174–2177. IEEE, network of scientific collaborations. Physica A: 2010. Statistical mechanics and its applications, [22] J. Yang and J. Leskovec. Community-affiliation graph 311(3):590–614, 2002. model for overlapping network community detection. [4] F. Battiston, J. Iacovacci, V. Nicosia, G. Bianconi, In 2012 IEEE 12th International Conference on Data and V. Latora. Emergence of multiplex communities Mining, pages 1170–1175. IEEE, 2012. in collaboration networks. PloS one, 11(1):e0147451, [23] B. Zheng, H. Wu, L. Kuang, J. Qin, W. Du, J. Wang, 2016. and D. Li. A simple model clarifies the complicated [5] G. Bianconi, R. K. Darst, J. Iacovacci, and relationships of complex networks. Scientific reports, S. Fortunato. Triadic closure as a basic generating 4, 2014. mechanism of communities in complex networks. [24] K. Zuev, M. Bogunã´, G. Bianconi, and D. Krioukov. Physical Review E, 90(4):042806, 2014. Emergence of soft communities from geometric [6] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and preferential attachment. Scientific reports, 5, 2015. E. Lefebvre. Fast unfolding of communities in large networks. Journal of statistical mechanics: theory and experiment, 2008(10):P10008, 2008. [7] I. Derényi, G. Palla, and T. Vicsek. Clique percolation in random networks. Physical review letters, 94(16):160202, 2005. [8] S. N. Dorogovtsev and J. F. F. Mendes. Evolution of networks with aging of sites. Physical Review E, 62(2):1842, 2000. [9] T. S. Evans. Clique graphs and overlapping communities. Journal of Statistical Mechanics: Theory and Experiment, 2010(12):P12037, 2010. [10] P. Holme and J. Saramäki. Temporal networks. Physics reports, 519(3):97–125, 2012. [11] M. Karsai, N. Perra, and A. Vespignani. Time varying networks and the weakness of strong ties. Scientific Reports, 4:4001, 2014.