On the Geometry and Extremal Properties of the Edge-Degeneracy Model∗ Nicolas Kim†‡ Dane Wilburne†§ Sonja Petrovi´c§ Alessandro Rinaldo‡ Abstract it dictates the statistical and mathematical behavior of Theedge-degeneracymodelisanexponentialrandomgraph theERGM.Whilethereisnotageneralclassificationof model that uses the graph degeneracy, a measure of the ‘good’and‘bad’networkstatistics,someleadtomodels 6 graph’s connection density, and number of edges in a graph 1 as its sufficient statistics. We show this model is relatively that behave better asymptotically than others, so that 0 well-behaved by studying the statistical degeneracy of this computation and inference on large networks can be 2 model through the geometry of the associated polytope. handled in a reliable way. p Keywords exponential random graph model, degeneracy, InanERGM,theprobabilityofobservinganygiven e k-core, polytope graphdependsonthegraphonlythroughthevalueofits S sufficient statistic, and is therefore modulated by how 6 1 Introduction much or how little the graph expresses those properties 1 Statistical network analysis is concerned with develop- capturedbythesufficientstatistics. Asthereisvirtually ing statistical tools for assessing, validating and model- no restriction on the choice of the sufficient statistics, ] T ing the properties of random graphs, or networks. The the class of ERGMs therefore possesses remarkable S very first step of any statistical analysis is the formal- flexibility and expressive power, and offers, at least in . ization of a statistical model, a collection of probabil- principle, a broadly applicable and statistically sound h ity distributions over the space of graphs (usually, on means of validating any scientific theory on real-life t a a fixed number of nodes n), which will serve as a ref- networks. However, despite their simplicity, ERGMs m erence model for any inferential tasks one may want to are also difficult to analyze and are often thought to [ perform. Statistical models are in turn designed to be behave in pathological ways, e.g., give significant mass 2 interpretableand,atthesametime,tobecapableofre- to extreme graph configurations. Such properties are v producing the network characteristics pertaining to the often referred to as degeneracy; here we will refer to it 0 particularproblemathand. Exponentialrandomgraph as statistical degeneracy [10] (not to be confused with 8 models, or ERGMs, are arguably the most important graph degeneracy below). Further, their asymptotic 1 class of models for networks with a long history. They properties are largely unknown, though there has been 0 0 are especially useful when one wants to construct mod- somerecentworkinthisdirection;forexample,[6]offer . elsthatresembletheobservednetwork,butwithoutthe a variation approach, while in some cases it has been 2 needtodefineanexplicitnetworkformationmechanism. shown that their geometric properties can be exploited 0 6 Intheinterestofspace,wesingleoutclassicalreferences to reveal their extremal asymptotic behaviors [22], see 1 [2], [4], [9] and a recent review paper [8]. also[18]. Thesetypesofresultsareinterestingnotonly : Central to the specification of an ERGM is the mathematically,buthavestatisticalvalue: theyprovide v i choice of sufficient statistic, a function on the space a catalogue of extremal behaviors as a function of the X of graphs, usually vector-valued, that captures the model parameters and illustrate the extent to which r particular properties of a network that are of scientific statistical degeneracy may play a role in inference. a interest. Common examples of sufficient statistics are In this article we define and study the properties the number of edges, triangles, or k-stars, the degree of the ERGM whose sufficient statistics vector consists sequence, etc; for an overview, see [8]. The choice of of two quantities: the edge count, familiar to and of- a sufficient statistic is not to be taken for granted: it ten used in the ERGM family, and the graph degen- dependsontheapplicationathandandatthesametime eracy, novel to the statistics literature. (These quanti- ties may be scaled appropriately, for purpose of asymp- totic considerations; see Section 2.) As we will see, ∗PartiallysupportedbyAFOSRgrant#FA9550-14-1-0141. graph degeneracy arises from the graph’s core struc- †Equalcontribution ‡DepartmentofStatistics,CarnegieMellonUniversity,Email: ture, a property that is new to the ERGM framework [email protected],[email protected]. [11], but is a natural connectivity statistic that gives a §Department of Applied Mathematics, Illinois In- sense of how densely connected the most important ac- stitute of Technology, Email: [email protected], tors in the network are. The core structure of a graph [email protected]. (see Definition 2.1) is of interest to social scientists and Theedge-degeneracy ERGMisthestatisticalmodel other researchers in a variety of applications, including on G whose sufficient statistics are the rescaled graph n theidentificationandrankingofinfluencers(or“spread- degeneracy and the edge count of the observed graph. ers”) in networks (see [14] and [1]), examining robust- Concretely, for G∈G let n nesstonodefailure,andforvisualizationtechniquesfor (cid:18) (cid:18) (cid:19) (cid:19) n large-scale networks [5]. The degeneracy of a graph is (2.1) t(G)= E(G)/ ,degen(G)/(n−1) , simply the statistic that records the largest core. 2 Cores are used as descriptive statistics in several whereE(G)isthenumberofedgesofG. TheEDmodel network applications (see, e.g., [16]), but until recently, on G is the ERGM {P ,θ ∈R2}, where very little was known about statistical inference from n n,θ this type of graph property: [11] shows that cores are (2.2) P (G)=exp{(cid:104)θ,t(G)(cid:105)−ψ(θ)} n,θ unrelated to node degrees and that restricting graph degeneracy yields reasonable core-based ERGMs. Yet, istheprobabilityofobservingthegraphG∈G forthe n there are currently no rigorous statistical models for choice of model parameter θ ∈ R2. The log-partition networks in terms of their degeneracy. The results in function ψ: R2 → R, given by ψ(θ) = (cid:80) e(cid:104)θ,t(G)(cid:105) this paper thus add a dimension to our understating serves as a normalizing constant, so thatGp∈rGonbabilities of cores by exhibiting the behavior of the joint edge- add up to 1 for each choice of θ (notice that ψ(θ)<∞ degeneracy statistic within the context of the ERGM for all θ, as G is finite). n thatcapturesit,andprovideextremalresultscriticalto Noticethatdifferentchoicesofθ =(θ ,θ )willlead 1 2 estimationandinferencefortheedge-degeneracymodel. to rather different distributions. For example, for large Wedefinetheedge-degeneracyERGMinSection2, and positive values of θ and θ the probability mass 1 2 investigate its geometric structure in Sections 3, 4, and concentrates on dense graphs, while negative values 5, and summarize the relevance to statistical inference of the parameters will favor sparse graphs. More in Section 6. interestingly, when one parameter is positive and the other is negative, the model will favor configurations in 2 The edge-degeneracy (ED) model which the edge and degeneracy count will be balanced This section presents the necessary graph-theoretical againsteachother. OurresultsinSection5willprovide tools, establishes notation, and introduces the ED a catalogue of such behaviors in extremal cases and for model. LetG denotethespaceof(labeled,undirected) large n. n (n) The normalization of the degeneracy and the edge simple graphs on n nodes, so |Gn|=2 2 . count in (2.1) and the presence of the coefficient (cid:0)n(cid:1) To define the family of probability distributions 2 in the ED probabilities (2.2) are to ensure a non- over G comprising the ED model, we first define the n trivial limiting behavior as n → ∞, since E(G) and degeneracy statistic. degen(G) scale differently in n (see, e.g., [6] and [22]). Definition 2.1. Let G = (V,E) be a simple, undi- This normalization is not strictly necessary for our theoreticalresultstohold. However,theEDmodel,like rected graph. The k-core of G is the maximal subgraph mostERGMs,isnotconsistent,thusmakingasymptotic of G with minimum degree at least k. Equivalently, the considerations somewhat problematic. k-core of G is the subgraph obtained by iteratively delet- ingverticesofdegreelessthank. The graphdegeneracy Lemma 2.1. The edge-degeneracy model is an ERGM of G, denoted degen(G), is the maximum value of k for that is not consistent under sampling, as in [21]. which the k-core of G is non-empty. Proof. The range of graph degeneracy values when This idea is illustrated in Figure 1, which shows a going from a graph with n vertices to one with n+1 graph G and its 2-core. In this case, degen(G)=4. vertices depends on the original graph; e.g. if there is a 2-star that is not a triangle in a graph with three vertices, the addition of another vertex can form a triangle and increase the graph degeneracy to 2, but if there was not a 2-star then there is no way to increase the graph degeneracy. Since the range is not constant, this ERGM is not consistent under sampling. Figure 1: A small graph G (left) and its 2-core (right). Thus,asthenumberofverticesngrows,itisimportant The degeneracy of this graph is 4. to note the following property of the ED model, not uncommon in ERGMs: inference on the whole network Let us consider P for some small values of n. The n cannot be done by applying the model to subnetworks. polytope P is plotted in Figure 2. 10 Inthenextfewsectionswewillstudythegeometry of the ED model as a means to derive some of its Thecasen=3. Therearefournon-isomorphicgraphs asymptotic properties. The use of polyhedral geometry on 3 vertices, and each gives rise to a distinct edge- inthestatisticalanalysisofdiscreteexponentialfamilies degeneracy vector: is well established: see, e.g., [2], [4], [7], [20], [19]. (cid:32) (cid:33) (cid:32) (cid:33) t =(0,0) t =(1,1) 3 Geometry of the ED model polytope The edge-degeneracy ERGM (2.2) is a discrete expo- (cid:32) (cid:33) (cid:32) (cid:33) nential family, for which the geometric structure of the t =(2,1) t =(3,2) model carries important information about parameter estimation including existence of maximum likelihood estimate (MLE) - see above mentioned references. This Hence P = conv{(0,0),(1,1),(2,1),(3,2)}. Note that 3 geometric structure is captured by the model polytope. in this case, each realizable edge-degeneracy vector lies The model polytope Pn of the ED model on Gn is on the boundary of the model polytope. We will see theconvexhullofthesetofallpossibleedge-degeneracy belowthatn=3istheuniquevalueofnforwhichthere pairs for graphs in Pn. In symbols, are no realizable edge-degeneracy vectors contained in (cid:110) (cid:111) the relative interior of P . P :=conv (E(G),degen(G)),G∈G ⊂R2. n n n Note the use of the unscaled version of the sufficient The case n = 4. On 4 vertices there are 11 non- statisticsindefiningthemodelpolytope. Inthissection, isomorphic graphs but only 8 distinct edge-degeneracy the scaling used in model definition (2.1) has little vectors. Without listing the graphs, the edge- impactonshapeofP ,thus-forsimplicityofnotation- degeneracy vectors are: n wedonotincludeitinthedefinitionofP . Thescaling n (0,0),(1,1),(2,1),(3,1),(3,2),(4,2),(5,2),(6,3). factorswillbere-introduced,however,whenweconsider the normal fan and the asymptotics in Section 4. Here we pause to make the simple observation that In the following, we characterize the geometric P ⊂ P always holds. Indeed, every realizable propertiesofP thatarecrucialtostatisticalinference. n n+1 n edge-degeneracy vector for graphs on n vertices is also First, we arrive at a startling result, Proposition 3.2, realizable for graphs on n + 1 vertices, since adding that every integer point in the model polytope is a re- a single isolated vertex to a graph affects neither the alizable statistic. Second, Proposition 3.4 implies that number of edges nor the graph degeneracy. the observed network statistics will with high proba- bility lie in the relative interior of the model polytope, The case n = 5. There are 34 non-isomorphic graphs which is an important property because estimation al- onn=5verticesbutonly15realizableedge-degeneracy gorithms are guaranteed to behave well when off the vectors. They are: boundary of the polytope. This also implies that the MLE for the edge-degeneracy ERGM exists for many (0,0),(1,1),(2,1),(3,1),(3,2),(4,2),(5,2),(6,3), large graphs. In other words, there are very few net- work observations that can lead to statistical degener- (4,1),(6,2),(7,2),(7,3),(8,3),(9,3),(10,4), acy, that is, bad behavior of the model for which some where the pairs listed on the top row are contained ERGMs are famous. That behavior implies that a sub- in P and the pairs on the second row are contained 4 set of the natural parameters is non-estimable, making in P \ P . Here we make the observation that the 5 4 complete inference impossible. Thus it being avoided proportion of realizable edge-degeneracy vectors lying by the edge-degeneracy ERGM is a desirable outcome. on the interior of the P seems to be increasing with n. n Insummary,Propositions3.4,3.2,3.5andTheorem3.1 ThisphenomenonisaddressedinProposition3.4below. completely characterize the geometry of P and thus n Figure 2 depicts the integer points that define P . 10 solve [17, Problem 4.3] for this particular ERGM. Re- markably, this problem—although critical for our un- derstanding of reliability of inference for such models– has not been solved for most ERGMs except, for exam- Thecaseforgeneraln. Manyoftheargumentsbelow ple, the beta model [20], which relied heavily on known rely on the precise values of the coordinates of the graph-theoretic results. extreme points of P . n P Proof. Suppose that G is a graph on n vertices with 10 9 degen(G) = d ≤ n−1. Our strategy will be to show 8 that for all e such that Un(d)≤e≤Ln(d), there exists a graph G on n vertices such that degen(G) = d and 7 E(G)=e. 6 It is clear that there is exactly one graph (up y ac5 to isomorphism) corresponding to the edge-degeneracy ener vector ((cid:0)d+1(cid:1),d); it is the graph g4 2 e D 3 (3.3) K ∪K ∪...∪K , d+1 1 1 (cid:124) (cid:123)(cid:122) (cid:125) 2 n−d−1 times 1 i.e., the complete graph on d + 1 vertices along with 0 n−d−1 isolated vertices. 0 5 10 15 20Edges25 30 35 40 45 Thus, for each e such that U (d) = (cid:0)d+1(cid:1) < e < n 2 (cid:0)d+1(cid:1)+(n−d−1)·d = L (d), we must show how to 2 n Figure 2: The integer points that define the model constructagraphGonnverticeswithgraphdegeneracy polytope P10. d and e edges. Call the resulting graph Gn,d,e. To construct G , start with the graph in 3.3, which has n,d,e degeneracyd. Labeltheisolatedverticesv ,...,v . Proposition 3.1. Let Un(d) be the minimum number For each j such that 1≤j ≤e−(cid:0)d+1(cid:1), ad1d the end−gde−e1 of edges over all graphs on n vertices with degeneracy d 2 j by making the vertex v such that i≡j mod n−d−1 i and let L (d) be the maximum number of edges over all n adjacent to an arbitrary vertex of K . This process d+1 such graphs. Then, results in a graph with exactly e = (cid:0)d+1(cid:1) + j edges, 2 (cid:18)d+1(cid:19) and since the vertices v1,...,vn−d−1 still have degree U (d)= at most d, our construction guarantees that we have n 2 not increased the graph degeneracy. Hence, we have constructed G . and n,d,e (cid:18) (cid:19) d+1 L (d)= +(n−d−1)·d. The preceding proof also shows the following: n 2 Proposition 3.3. P contains exactly n Proof. First, observe that if degen(G) = d, then there are at least d+1 vertices of G in the d-core. Using this n−1 (cid:88) observation,itisnotdifficulttoseethatU (d)=(cid:0)d+1(cid:1), (3.4) [(n−d−1)·d+1] n 2 since this is the minimum number of edges required to d=0 construct a graph with a non-empty d-core. Hence, the integer points. This is the number of realizable edge- upper boundary of P consists of the points ((cid:0)d+1(cid:1),d) n 2 degeneracy vectors for every n. for 0 ≤ d ≤ n − 1. For the value of L (d), it is n an immediate consequence of [11, Proposition 11] that The following property is useful throughout: L (d)=(cid:0)d+1(cid:1)+(n−d−1)·d. n 2 Lemma 3.1. P is rotationally symmetric. n We use the notation L (d) and U (d) to signify that n n Proof. For n≥3 and d∈{1,...,n−1}, the extreme points (L (d),d) and (U (d),d) lie on the n n lower and upper boundaries of P , respectively. n L (d)−L (d−1)=U (n−d)−U (n−d−1). n n n n Itiswellknowninthetheoryofdiscreteexponential families that the MLE exists if and only if the average Note that the center of rotation is the point ((n − sufficient statistic of the sample lies in the relative 1)n/4,(n−1)/2). The rotation is 180 degrees around interior of the model polytope. This leads us to that point. investigate which pairs of integer points correspond to As mentioned above, the following nice property of realizable edge-degeneracy vectors. the ED model polytope-in conjunction with the partial characterizationoftheboundarygraphsbelow-suggests Proposition 3.2. Every integer point contained in Pn that the MLE for the ED model exists for most large is a realizable edge-degeneracy vector. graphs. Proposition 3.4. Let p denote the proportion of re- be a forest. However, if the forest is not connected, one n alizable edge-degeneracy vectors that lie on the relative could add an edge without increasing the degeneracy, interior of P . Then, and thus it must be a tree. For the second statement, n the complete graph minus one edge has the most edges lim pn =1. among all non-complete graphs. Any other graph has n→∞ either larger degeneracy or fewer edges. Proof. This result follows from analyzing the formula in 3.4 and uses the following lemma: The graphs corresponding to extreme points (L (d),d) for 2 ≤ d ≤ n−3 are called maximally d- Lemma 3.2. Thereare2n−2realizablelatticepointson n degenerate graphs and were studied extensively in [3]. theboundaryofP , andeachisavertexofthepolytope. n Such graphs have many interesting properties, but are quite difficult to fully classify or enumerate. Proof. Since P ⊆ Z+×Z+, and (0,0) ∈ P for all n, n n In the following theorem, we show that the lower we know that (0,0) must be a vertex of P . By the n boundary of P is like a mirrored version of the upper rotational symmetry of P , (n−1,(n−1)n/2) must be n n boundary. This partially characterizes the remaining a vertex, too. It is clear that the points of the form maximally d-degenerate graphs. (U (d),d) and (L (d),d) for d ∈ {1,...,n−2} are the n n only other points on the boundary; we will show that Theorem 3.1. Let G(U (d)) to be the unique (up to n eachofthesepointsisinfactavertex. Forthisitsuffices isomorphism) graph on n nodes with degeneracy d that to observe that Un(d) is strictly concave and Ln(d) is has the minimum number of edges, given by Un(d). strictly convex as a function of d. Hence, no interval Similarly, let G(L (d)) ⊂ G be the set of graphs on n n [Un(d),Ln(d)] is contained in the convex hull of any n nodes with degeneracy d that have the maximum collection of intervals of the same form. Thus, for each number of edges, given by L (d). Then, for all d ∈ n d ∈ {1,...,n−2} the points (Un(d),d) and (Ln(d),d) {0,1,...,n−1}, are vertices of P . n G(U (d))∈G(L (n−d−1)), n n To prove Proposition 3.4, we then compute: where G(U (d)) denotes the graph complement of n (cid:80)n−1[(n−d−1)·d+1]−(2n−2) G(U (d)). p = d=0 →1. n n (cid:80)n−1[(n−d−1)·d+1] d=0 Proof. As we know, G(U (d)) = K ∪ K . n d+1 n−d−1 Taking the complement, 3.1 Extremal graphs of P . Now we turn our at- n tention to the problem of identifying the graphs corre- (3.5) G(U (d))=K +K , n d+1 n−d−1 sponding to extreme points of P . Clearly, the bound- n arypoint(0,0)isuniquelyattainedbytheemptygraph where + denotes the graph join operation. We only K¯n and the boundary point ((cid:0)n2(cid:1),n−1) is uniquely at- need to show that this has Ln(n − d − 1) edges and tained by the compete graph Kn. The proof of Propo- graph degeneracy n−d−1. sition 3.2 shows that the unique graph corresponding Since G(U (d)) has (cid:0)d+1(cid:1) edges, G(U (d)) must n 2 n to the upper boundary point (Un(d),d) is a complete have (cid:0)n(cid:1)−(cid:0)d+1(cid:1) = L (n−d−1) edges. As for the graphond+1verticesunionn−d−1isolatedvertices. 2 2 n graph degeneracy, K +K =K is a subgraph 1 n−d−1 n−d The lower boundary graphs are more complicated, but of G(U (d)). Therefore, degen(G(U (d))) ≥ n−d−1. n n the graphs corresponding to two of them are classified However, degen(G(U (d))) < n−d because there are n in the following proposition. d + 1 vertices of degree n − d − 1, and a non-empty (n−d)-core would require at least n−d+1 vertices of Proposition 3.5. The graphs corresponding to the degreeatleastn−d. Thus,degen(G(U (d)))=n−d−1, lower boundary point (L (1),1) of P are exactly the n n n as desired. trees on n vertices. The unique (up to isomorphism) graphcorrespondingtothelowerboundarypoint(L (n− n 4 Asymptotics of the ED model polytope and 2),n−2) is the complete graph on n vertices minus an its normal fan edge. Since we will let n → ∞ in this section, it will be Proof. First we consider lower boundary graphs with necessary to rescale the polytope P so that it is n edge-degeneracy vector (L (1),1). A graph with de- contained in [0,1]2, for each n. Thus, we divide the n generacy 1 must be acyclic, since otherwise it would graph degeneracy parameter by n − 1 and the edge have degeneracy at least 2. Hence, such a graph must parameter by (cid:0)n(cid:1), as we have already done in (2.1). 2 While this rescaling has little impact on shape of 2. LetLandU befunctionsfrom[0,1]into[0,1]given P discussed in Section 3, it does affect its normal fan, by n a key geometric object that plays a crucial role in our √ √ subsequent analysis. We describe the normal fan of the L(x)=1− 1−x and U(x)= x. normalized polytope next. Then, Proposition 4.1. All of the perpendicular directions to the faces of Pn are: P =(cid:8)(x,y)∈[0,1]2: L(x)≤y ≤U(x)(cid:9). {±(1,−m):m∈{1,2,...,n−1}}. Proof. If α∈[0,1] is such that α(n−1)∈N, then So, after normalizing, we get that the directions are U (α(n−1)) α−α2 n =α2+ ≥α2 2 (cid:0)n(cid:1) n {±(1,− ):α∈{1/(n−1),2/(n−1),...,1}}. 2 αn for any finite value of n. Similarly, Proof. The slopes of the line segments defining each face of P are 1/∆ (d) = 1/d in the unnormalized L (α(n−1)) α2−α n U n =2α−α2+ ≤1−(1−α)2, parametrization. To get the slopes of the normalized (cid:0)n(cid:1) n polytope,justmultiplyeachslopeby(cid:0)n(cid:1)/(n−1)=n/2. 2 2 for any finite value of n. Hence, for any n, Our next goal is to describe the limiting shape of the normalized model polytope and its normal fan as U (α(n−1)) L (α(n−1)) α2 ≤ n ≤ n ≤1−(1−α)2. n → ∞. We first collect some simple facts about the (cid:0)n(cid:1) (cid:0)n(cid:1) 2 2 limiting behavior of the normalized graph degeneracy and edge count. Together with Propositions 4.1 and 4.2, this completes the proof. Proposition 4.2. If α ∈[0,1] such that α(n−1)∈N (so that α parameterizes the normalized graph degener- The convex set P is depicted at the top of Figure 4. acy), then In order to study the asymptotics of extremal U (α(n−1)) lim n =α2. propertiesoftheEDmodel, thefinalstepistodescribe (cid:0)n(cid:1) n→∞ allthenormalstoP. Aswewillseeinthenextsection, 2 these normals will correspond to different extremal Furthermore, due to the rotational symmetry of P , n behaviors of the model. Towards this end, we define L (α(n−1)) the following (closed, pointed) cones lim n =1−(1−α)2. (cid:0)n(cid:1) n→∞ 2 C =cone{(1,−2),(−1,0)}, ∅ Proof. By definition, U (α(n − 1)) = α2n2 + o(n). Ccomplete =cone{(1,0),(−1,2)}, n 2 C =cone{(−1,0),(−1,2)}, Hence, U C =cone{(1,0),(1,−2)}, L U (α(n−1)) α2n2+o(n) n (cid:0)n(cid:1) = 2 n(n−1) →α2. where, for A ⊂ R2, cone(A) denote the set of all conic 2 2 (non-negative) combinations of the elements in A. It is clear that C and C are the normal fan to the We can now proceed to describe the set limit cor- ∅ complete points(0,0)and(1,1)ofP. Asfortheothertwocones, responding to the sequence {P } of model polytopes. n n it is not hard to see that the set of all normal rays to Here,thenotionofsetlimitisthesameasin[22,Lemma the edges of the upper, resp. lower, boundary of P for 4.1]. Let n all n are dense in C , resp. C . As we will show in U L P =cl(cid:8)t∈R2: t=t(G),G∈G ,n=1,2,...(cid:9) the next section, the regions C∅ and Ccomplete indicate n directionsofstatisticaldegeneracy(forlargen)towards be the closure of the set of all possible realizable the empty and complete graphs, respectively. On the statistics (2.1) from the model. Using Propositions 4.1 otherhand,CU andCL containdirectionsofnon-trivial and 4.2 we can characterize P as follows. convergence to extremal configurations of maximal and minimalgraphdegeneracy. SeeFigure3andthemiddle Lemma 4.1. 1. P ⊂P for all n and lim P =P. and lower part of Figure 4. n n n q D 0 1. q E 0.8 acy 0.6 er n e Deg 0.4 2 0. Figure 3: The green regions indicate directions of nontrivial convergence. The bottom-left and top-right 0 redregionsindicatedirectionstowardstheemptygraph 0. K¯ and complete graph K , respectively. n n 0.0 0.2 0.4 0.6 0.8 1.0 Edge 5 Asymptotical Extremal Properties of the ED Model In this section we will describe the behavior of distri- butions from the ED model of the form P , where n,β+rd d is a non-zero point in R2 and r a positive number. In particular, we will consider the case in which d and β are fixed, but n and r are large (especially r). We will show that there are four possible types of extremal behavior of the model, depending on d, the “direction” alongwhichthedistributionbecomesextremal(forfixed n and as r grows unbounded). This dependence can be looselyexpressedasfollows: eachdwillidentifyoneand only one value α(d) of the normalized edge-degeneracy pairs such that, for all n and r large enough, P n,β+rd Figure 4: (Top) the sequence of normalized polytopes willconcentrateonlyongraphswhosenormalizededge- {P } converges outwards, starting from P in the degeneracy value is arbitrarily close to α(d). n n 3 center. (Middle) and (bottom) are the representative In order to state this result precisely, we will first infinite graphs along points on the upper and lower need to make some elementary, yet crucial, geometric boundariesofP, respectively, depictedasgraphons[15] observations. Any d ∈ R2 defines a normal direction for convenience. to one point on the boundary of P. Therefore, each d∈R2 identifiesonepointontheboundaryofP,which we will denote α(d). Specifically, any d ∈ C is in the collinear, α(d) (cid:54)= α(d(cid:48)). Analogous considerations hold ∅ normalconetothepoint(0,0)∈P,sothatα(d)=(0,0) for the points d∈int(CU): non-collinear points map to (the normalized edge-degeneracy of the empty graph) different points along the curve {U(x),x∈[0,1]}. for all d ∈ C (and those points only). Similarly, With these considerations in mind, we now present ∅ any d ∈ C is in the normal cone to the point our main result about the asymptotics of extremal complete (1,1) ∈ P, and therefore α(d) = (1,1) (the normalized properties of the ED model. edge-degeneracy of K ) for all d∈C (and those n complete Theorem 5.1. Let d (cid:54)= 0 and consider the following points only). On the other hand, if d ∈ int(C ), then L cases. d is normal to one point on the upper boundary of P. Assumingwithoutlossofgeneralitythatd=(1,a),α(d) • d∈int(C ). ∅ isthepoint(x,y)alongthecurve{L(x),x∈[0,1]}such Then, for any β ∈ R2 and arbitrarily small (cid:15) ∈ thatL(cid:48)(x)=−1. Noticethat,unlikethepreviouscases, (0,1) there exists a n((cid:15)) such that for all n ≥ n((cid:15)) a if d and d(cid:48) are distinct points in int(C ) that are not there exists a r = r((cid:15),n) such that, for all r ≥ L r((cid:15),n) the empty graph has probability at least 1−(cid:15) only two realizable pairs of normalized edge count and under P . graphdegeneracy,namelyitsendpoints,usingtheresult n,β+rd in [22], one can choose r = r(n,(cid:15),η) large enough so • d∈int(C ). complete that at least 1−(cid:15) of the mass probability of P Then, for any β ∈ R2 and arbitrarily small (cid:15) ∈ n,β+rd concentrates on the graphs in G whose normalized n (0,1) there exists a n((cid:15)) such that for all n ≥ n((cid:15)) edge-degeneracy vector is either one of the two vertices there exists a r = r((cid:15),n) such that, for all r ≥ ine . Theclaimfollowsfromthefactthatthesevertices n r((cid:15),n) the complete graph has probability at least are within η of α(q) For the other case in which d is 1−(cid:15) under P . n,β+rd in the interior of the normal cone to the vertex v , n again the results in [22] yield that one can can choose • d∈int(C ). L Then, for any β ∈ R2 and arbitrarily small (cid:15),η ∈ r = r(n,(cid:15),η) large enough so that at least 1 − (cid:15) of the mass probability of P concentrates on graphs (0,1)thereexistsan((cid:15))suchthatforalln≥n((cid:15),η) n,β+rd in G whose normalized edge-degeneracy vector is v . there exists a r = r((cid:15),n) such that, for all r ≥ k n Since v is within η of α(q) we are done. r((cid:15),η,n) the set of graphs in G whose normalized n n edge-degeneracy is within η of α(d) has probability at least 1−(cid:15) under P . The interpretation of Theorem 5.1 is as follows. n,β+rd If d is a non-zero direction in C and C , then ∅ complete • d∈int(C ). U P will exhibit statistical degeneracy regardless of Then, for any β ∈ R2 and arbitrarily small (cid:15),η ∈ n,β+rd β and for large enough r, in the sense that it will (0,1)thereexistsan((cid:15))suchthatforalln≥n((cid:15),η) concentrate on the empty and fully connected graphs, there exists a r = r((cid:15),n) such that, for all r ≥ respectively. As shown in Figure 3, C and C ∅ complete r((cid:15),η,n) the set of graphs in G whose normalized n are fairly large regions, so that one may in fact expect edge-degeneracy is within η of α(d) has probability statistical degeneracy to occur prominently when the at least 1−(cid:15) under P . n,β+rd model parameters have the same sign. The extremal directions in C and C yields instead non-trivial Remarks. We point out that the directions along the U L behavior for large n and r. In this case, P will boundaries of C and C are not part of our n,β+rd ∅ complete concentrate on graph configurations that are extremal results. Ouranalysiscanalsoaccommodatethosecases, in sense of exhibiting nearly maximal or minimal graph but in the interest of space, we omit the results. More degeneracy given the number of edges. importantly, the value of β does not play a role in the Taken together, these results suggest that care limiting behavior we describe. We further remark that is needed when fitting the ED model, as statistical it is possible to formulate a version of Theorem 5.1 for degeneracy appears to be likely. each finite n, so that only r varies. In that case, by Proposition 4.1, for each n there will only be 2(n−1) 6 Discussion possible extremal configurations, aside from the empty and fully connected graphs. We have chosen instead to ThegoalofthispaperistointroduceanewERGMand let n vary, so that we could capture all possible cases. demonstrate its statistical properties and asymptotic behavior captured by its geometry. The ED model is Proof. We only sketch the proof for the case d ∈ based on two graph statistics that are not commonly int(C ),whichfollowseasilyfromtheargumentsin[22], used jointly and capture complementary information L in particular Propositions 7.2, 7.3 and Corollary 7.4. about the network: the number of edges and the graph The proofs of the other cases are analogous. First, we degeneracy. The latter is extracted from important observe that the assumption (A1)-(A4) from [19] hold information about the network’s connectivity structure for the ED model. Next, let n be large enough such called cores and is often used as a descriptive statistic. that d is not in the normal cone corresponding to the Theexponentialfamilyframeworkprovidesabeau- points (0,0) and (1,1) of P . Then, for each such n, d tiful connection between the model geometry and its n either defines a direction corresponding to the normal statisticalbehavior. Tothatend,wecompletelycharac- of an edge, say e , of the upper boundary of P or d is terizedthemodelpolytopeinSection3forfinitegraphs n n in the interior of the normal cone to a vertex, say v , andSection4forthelimitingcaseasn→∞. Themost n of P . Since P →P, n can be chosen large enough so obvious implication of the structure of the ED model n n thateithertheverticesofe orv (dependingonwhich polytopeisthattheMLEexistsforasignificantpropor- n n oneofthetwocaseswearefacing)arewithinη ofα(d). tionoflargegraphs. Anotheristhatitsimplifiesgreatly Let us first consider the case when d is normal to the problem of projecting noisy data onto the polytope the edge v of P . Since every edge of P contains and finding the nearest realizable point, as one need n n n only worry about the projection. Such projections play [4] L. D. Brown, Fundamentals of statistical exponen- acriticalroleindataprivacyproblems,astheyareused tial families, IMS Lecture Notes, Monograph Series 9 in computing a private estimator of the released data (1986). with good statistical properties; see [12, 13]. Finally, [5] S. Carmi and S. Havlin and S. Kirkpatrick and Y. Shavitt and E. Shir,A model of Internet topol- the structure of the polytope and its normal fan reveal ogyusingk-shelldecomposition,ProceedingsoftheNa- various extremal behaviors of the model, discussed in tionalAcademyofSciences,104(27)(2007),pp.11150- Section 5. 11154. Note that the two statistics in the ED model sum- [6] S. Chatterjee and P. Diaconis, Estimating and under- marize very different properties of the observed graph, standing exponential random graph models, Annals of giving this seemingly simple model some expressive Statistics, 41(5) (2013), pp. 2428–2461. power and flexibility. In graph-theoretic terms, the de- [7] S. E. Fienberg and A. Rinaldo, Maximum likelihood generacy summarizes the core structure of the graph, estimation in log-linear models, Annals of Statistics, within which there can be few or many edges (see [11] 40(2) (2012). fordetails); combiningitwiththenumberofedgespro- [8] A. Goldenberg and A. X. Zheng and S. E. Fienberg duces Erd˝os-Renyi as a submodel. and E. M. Airoldi, A survey of statistical network models, Foundations and Trends in Machine Learning As discussed in Section 2, different choices of the 2(2) (2009), pp. 129–233. parametervector,thatis,valuesoftheedge-degeneracy [9] S.M.Goodreau,Advancesinexponentialrandomgraph pair, lead to rather different distributions, from sparse (p∗) models applied to a large social network, Special to dense graphs as both parameters are negative or Section: AdvancesinExponentialRandomGraph(p∗) positive,respectively,aswellasgraphswhereedgecount Models, Social Networks 29(2) (2007), pp. 231-248. and degeneracy are balanced for mixed-sign parameter [10] M. S. Handcock, Assessing degeneracy in statistical vectors. Our results in Section 5 provide a catalogue modelsofsocialnetworks,workingpaperno.39,Center of such behaviors in extremal cases and for large n. for Statistics and the Social Sciences, University of The asymptotic properties we derive offer interesting Washington (2003). insights on the extremal asymptotic behavior of the [11] V. Karwa and M. J. Pelsmajer and S. Petrovi´c and ED model. However, the asymptotic properties of non- D. Stasi and D. Wilburne, Statistical models for cores decompositionofanundirectedrandomgraph,preprint extremal cases, that is, those of distributions of the arXiv:1410.7357 (v2) (2015). formP forfixedβanddivergingn,remaincompletely n,β [12] V. Karwa and A. Slavkovi´c, Inference using noisy unknown. Whilethisanexceedinglycommonissuewith degrees: differentially private β-model and synthetic ERGMs, whose asymptotics are extremely difficult to graphs, Annals of Statistics 44(1) (2016), pp. 87–112. describe, it would nonetheless be desirable to gain a [13] V. Karwa and A. Slavkovi´c and P. Krivitsky, Differ- betterunderstandingoftheEDmodelwhenthenetwork entially private exponential random graphs, Privacy in is large. In this regard, the variation approach put Statistical Databases, Lecture Notes in Computer Sci- forward by [6], which provides a way to resolve the ence, Springer (2014), pp. 142-155. asymptoticsofERGMsingeneral,maybeaninteresting [14] M. Kitsak and L.K. Gallos and S. Kirkpatrick and direction to pursue in future work. S. Havlin and F. Liljeros and L. Muchnik and H.E.StanleyandH.Makse,Identificationofinfluential spreaders in complex networks, Nature Physics, 6(11) Acknowledgements (2010), pp. 880–893. The authors are grateful to Johannes Rauh for his [15] L. Lova´sz, Large networks and graph limits, American careful reading of an earlier draft of this manuscript Mathematical Society (2012). and for providing many helpful comments. [16] S.PeiandL.MuchnikandJ.AndradeJr.andZ. Zheng and H. Maske, Searching for superspreaders if infor- mationinreal-worldsocialmedia,NatureScientificRe- References ports, (2012). [17] S.Petrovi´c,Asurveyofdiscretemethodsin(algebraic) statistics for networks, Proceedings of the AMS Spe- [1] J. Bae and S. Kim, Identifying and ranking influential cial Session on Algebraic and Geometric Methods in spreaders in complex networks by neighborhood core- Discrete Mathematics, Heather Harrington, Mohamed ness, Physica A: Statistical Mechanics and its Appli- Omar, and Matthew Wright (editors), Contemporary cations, 395 (2014), pp. 549-559. Mathematics (CONM) book series, American Mathe- [2] O. Barndorff-Nielsen, Information and exponential matical Society, to appear. families in statistical theory, Wiley (1978). [18] A. Razborov, On the minimal density of triangles in [3] A. Bickle, The k-cores of a graph, Ph.D. dissertation, graphs,Combin.Probab.Comput.17(2008),pp.603– Western Michigan University (2010). 618. [19] A.RinaldoandS.E.FienbergandY.Zhou,Onthege- ometryofdiscreteexponentialfamilieswithapplication to exponential random graph models, Electronic Jour- nal of Statistics, 3 (2009), pp. 446–484. [20] A. Rinaldo and S. Petrovi´c and S. E. Fienberg, Max- imum likelihood estimation in the β-model, Annals of Statistics 41(3) (2013), pp. 1085–1110. [21] C. R. Shalizi and A. Rinaldo, Consistency under sam- pling of exponential random graph models, Annals of Statistics, 41(2) (2013), pp. 508–535. [22] M. Yin and A. Rinaldo and S. Fadnavis, Asymptotic quantization of exponential random graphs, to appear in the Annals of Applied Probability (2016).