Polynomial Representation for the Expected Length of Minimal Spanning Trees*

Jared Nishikawa†, Peter T. Otto†, Colin Starr†

January 16, 2015

arXiv:1501.03758v1 [math.PR] 15 Jan 2015

Abstract

In this paper, we investigate the polynomial integrand of an integral formula that yields the expected length of the minimal spanning tree of a graph whose edges are uniformly distributed over the interval [0,1]. In particular, we derive a general formula for the coefficients of the polynomial and apply it to express the first few coefficients in terms of the structure of the underlying graph; e.g., number of vertices, edges and cycles.

1 Introduction

In 2002, J.M. Steele [7] derived an integral formula for the expected length of a minimal spanning tree (MST) of a graph with independent edge lengths uniformly distributed over the interval [0,1]. While the formula gives an exact value of the mean length of the MST in terms of the Tutte polynomial of the graph, it yields (at least to us) little intuition of how the MST relates to the structure of the underlying graph.

This provided the motivation for the research project investigated by the Willamette University group of the Willamette Valley REU-RET Consortium for Mathematics Research in the summer of 2008. The authors of this paper were members of that research group and this paper covers the work that began that summer.

The main result of this paper is a general formula for the coefficients of the polynomial integrand in Steele's formula for the expected length of the MST of a simple, finite, connected graph. For the first seven coefficients of the polynomial, we prove a surprising result that expresses the coefficients in terms of features of the underlying graph; e.g., the number of vertices, edges, and cycles.

The remainder of this paper is organized as follows: In Section 2, we state Steele's formula, which is written in terms of the Tutte polynomial of the underlying graph.
In Section 3, we investigate the integrand of the formula and prove that it is a polynomial, expressing the coefficients in terms of characteristics of the graph. We illustrate our results with an example in Section 4 and examine the particular case of the complete graph in Section 5.

*This research was supported by NSF grant #0649068 funding the WiVaM REU-RET in Mathematics.
†Willamette University

Throughout this paper, "graph" means a finite simple graph. We adopt the usual notations: $V(G)$ and $E(G)$ are the vertex and edge sets of $G$, respectively. The rank of $G$ is denoted by $r(G)$, and $r(G) = |V(G)| - k(G)$, where $k(G)$ denotes the number of connected components of $G$.

2 Steele's Formula

Let $G$ be a graph. We assign independent random lengths $\xi_e$ with uniform distribution over the interval [0,1] to the edges $e \in E(G)$. The total length of a minimal spanning tree (MST) of the graph $G$ is denoted by

$$L(G) = \sum_{e \in E(\mathrm{MST}(G))} \xi_e.$$

We are interested in the expected value of $L(G)$, which we denote by $E[L(G)]$. Steele's formula for $E[L(G)]$ is written in terms of the Tutte polynomial of a graph, which we define next.

Definition 2.1. Let $G$ be a graph, and define $S(G)$ to be the set of spanning subgraphs of $G$; i.e., subgraphs of $G$ with vertex set $V(G)$ and edge set a subset of $E(G)$. The Tutte polynomial of $G$ is defined as follows:

$$T(G;x,y) = \sum_{A \in S(G)} (x-1)^{r(G)-r(A)} (y-1)^{|E(A)|-r(A)}.$$

The Tutte polynomial of a graph encodes much information about the graph, but we will only use the definition above in our analysis and refer the reader to [1] for more information.

We will use the following result about the Tutte polynomial in the proof of the main result. The proof is a straightforward calculation from the definition and so we omit it.

Lemma 2.2. Let $G$ be a connected graph with $n$ vertices and $m$ edges.
Then for values of $(x,y)$ satisfying $(x-1)(y-1)=1$, we have

(a) $$T(G;x,y) = (x-1)^{n-1} \left(\frac{x}{x-1}\right)^{m}$$

(b) $$T_x(G;x,y) = (x-1)^{n-2} \left[ \sum_{A \in S(G)} k(A)(y-1)^{|A|} - \left(\frac{x}{x-1}\right)^{m} \right]$$

where $T_x$ denotes the partial derivative of $T$ with respect to $x$.

We now state Steele's integral formula for the expected length of the minimal spanning tree that was proved in [7].

Theorem 2.3. (Steele's formula) Let $G$ be a connected graph and $T(G;x,y)$ the Tutte polynomial of $G$. Then

$$E[L(G)] = \int_0^1 \frac{1-t}{t} \, \frac{T_x\left(G; \frac{1}{t}, \frac{1}{1-t}\right)}{T\left(G; \frac{1}{t}, \frac{1}{1-t}\right)} \, dt \qquad (1)$$

Steele's formula above has been generalized to the case of an arbitrary, but still identical, edge distribution [5] and to edge distributions that are not necessarily identically distributed [6].

3 Integrand in Steele's Formula

3.1 Polynomial integrand

We begin by showing that the integrand in Steele's integral formula is a polynomial of degree less than or equal to the number of edges in the graph.

Theorem 3.1. Let $G$ be a connected graph with $n$ vertices and $m$ edges. Then

$$E[L(G)] = \int_0^1 p_m(t) \, dt$$

where $p_m(t)$ is a polynomial of degree less than or equal to $m$.

Proof. For convenience, we let $|A| = |E(A)|$. By Lemma 2.2, we have

$$\frac{1-t}{t} \, \frac{T_x\left(G; \frac{1}{t}, \frac{1}{1-t}\right)}{T\left(G; \frac{1}{t}, \frac{1}{1-t}\right)} = \sum_{A \in S(G)} k(A) \, t^{|A|} (1-t)^{m-|A|} - 1 = -1 + \sum_{A \in S(G)} k(A) \sum_{j=0}^{m-|A|} \binom{m-|A|}{j} (-1)^{m-|A|-j} \, t^{m-j}.$$

This establishes the result, but we refine the coefficients further. Define

$$p_m(t) = -1 + \sum_{A \in S(G)} k(A) \sum_{j=0}^{m-|A|} \binom{m-|A|}{j} (-1)^{m-|A|-j} \, t^{m-j}.$$

Let $i = m-j$. Then $m-|A|-j = i-|A|$, so we have

$$p_m(t) = -1 + \sum_{A \in S(G)} k(A) \sum_{i=|A|}^{m} \binom{m-|A|}{m-i} (-1)^{i-|A|} \, t^{i}.$$

To find the coefficient of $t^i$, we sum over all $A$ in $S(G)$ such that $|E(A)| \le i$. This yields

$$a_i = \sum_{\ell=0}^{i} (-1)^{i-\ell} \binom{m-\ell}{m-i} \sum_{A \in S_\ell} k(A), \qquad (2)$$

where $S_\ell := \{A \in S(G) : |E(A)| = \ell\}$.
Thus

$$p_m(t) = -1 + \sum_{i=0}^{m} a_i t^i$$

with $a_i$ as above.

In the proof of Theorem 3.1, we derived an initial formula (2) for the coefficients of the polynomial integrand in Steele's formula for the expected length of the MST. In the next section, we derive an easier working form for the coefficients but we end this section with our first main result on the first three coefficients.

Theorem 3.2. Let

$$p_m(t) = -1 + \sum_{i=0}^{m} a_i t^i$$

be the polynomial integrand in Steele's formula for the expected length of the MST of a connected graph $G$ with $n$ vertices and $m$ edges. Then

$$a_0 = n, \quad a_1 = -m, \quad \text{and} \quad a_2 = 0.$$

Proof. The set $S_0$ consists of just the single subgraph of $G$ with no edges and $n$ vertices, which has $n$ connected components. Therefore, $\sum_{A \in S_0} k(A) = n$. Next, the set $S_1$ consists of the $m$ spanning subgraphs with just one edge, each of which has exactly $n-1$ connected components. Therefore, $\sum_{A \in S_1} k(A) = m(n-1)$. Lastly, the set $S_2$ consists of $\binom{m}{2}$ spanning subgraphs with two edges, each of which has exactly $n-2$ connected components. Substituting these values into formula (2) yields

$$a_0 = \sum_{A \in S_0} k(A) = n, \qquad a_1 = -m \sum_{A \in S_0} k(A) + \sum_{A \in S_1} k(A) = -mn + m(n-1) = -m$$

and

$$a_2 = \binom{m}{2} \sum_{A \in S_0} k(A) - (m-1) \sum_{A \in S_1} k(A) + \sum_{A \in S_2} k(A) = \binom{m}{2} n - m(m-1)(n-1) + \binom{m}{2}(n-2) = 0.$$

This completes the proof.

3.2 Coefficients of the polynomial integrand

In the previous theorem, the initial formula (2) for the coefficients is easily applied for the cases $\ell = 0, 1, 2$, because for each such $\ell$, the members of $S_\ell$ all have the same number of connected components. When $k(A)$ is non-constant on $S_\ell$, the enumeration becomes more difficult.

Accordingly, we partition the set $S_\ell$ into subsets with different numbers of connected components.
This can be achieved by partitioning over the ranks of the members of $S_\ell$, since subgraphs in $S_\ell$ with the same rank $r$ have the same number of connected components, namely $n-r$.

Let $k_r^\ell$ be the number of spanning subgraphs of $G$ in $S_\ell$ with rank $r$; i.e., the number of spanning subgraphs of $G$ with $\ell$ edges and $n-r$ connected components. In terms of $k_r^\ell$, formula (2) can be rewritten as

$$a_i = \sum_{\ell=0}^{i} (-1)^{i-\ell} \binom{m-\ell}{m-i} \sum_{r=r_\ell}^{\ell} k_r^\ell (n-r), \qquad (3)$$

where $r_\ell$ is the minimum rank of a graph with $n$ vertices and $\ell$ edges. If $K_q$ is the largest complete graph with $|E(K_q)| < \ell$, then $r_\ell = q$. In other words, $r_\ell$ is the largest integer with $\binom{r_\ell}{2} < \ell$.

We use the fact that $\sum_{r=r_\ell}^{\ell} k_r^\ell = \binom{m}{\ell}$ to reduce the number of terms of $k_r^\ell$ by one in (3). The new general expression for the polynomial coefficients $a_i$ for $i \ge 3$ is stated in Theorem 3.4 below. The proof of the theorem requires a couple of combinatorial identities stated in the following lemma.

Lemma 3.3. For integers $m$, $k$, $i$ and $n$,

(a) $$\binom{m-k}{m-i} \binom{m}{k} = \binom{m}{m-i} \binom{i}{k}$$

(b) $$\sum_{k=0}^{n} (-1)^k \, k \binom{n}{k} = 0$$

Theorem 3.4. Let $a_i$ be the coefficients of the polynomial integrand in Steele's integral formula for the expected length of the MST of a connected graph $G$ with $n$ vertices and $m$ edges. Then for $i \ge 3$

$$a_i = \sum_{\ell=3}^{i} (-1)^{i-\ell} \binom{m-\ell}{m-i} \sum_{r=r_\ell}^{\ell-1} k_r^\ell (\ell-r).$$

Proof. Summing all the terms $k_r^\ell$ for a fixed number of edges $\ell$ yields the total number of spanning subgraphs in $S_\ell$, which equals $\binom{m}{\ell}$.
This implies that $k_\ell^\ell = \binom{m}{\ell} - \sum_{r=r_\ell}^{\ell-1} k_r^\ell$ and thus from formula (3), we get

$$a_i = \sum_{\ell=0}^{i} (-1)^{i-\ell} \binom{m-\ell}{m-i} \left[ \sum_{r=r_\ell}^{\ell-1} k_r^\ell (n-r) + \left( \binom{m}{\ell} - \sum_{r=r_\ell}^{\ell-1} k_r^\ell \right) (n-\ell) \right]$$

$$= \sum_{\ell=0}^{i} (-1)^{i-\ell} \binom{m-\ell}{m-i} \left[ \sum_{r=r_\ell}^{\ell-1} k_r^\ell (\ell-r) + \binom{m}{\ell} (n-\ell) \right]$$

$$= \sum_{\ell=0}^{i} (-1)^{i-\ell} \binom{m-\ell}{m-i} \sum_{r=r_\ell}^{\ell-1} k_r^\ell (\ell-r) + \sum_{\ell=0}^{i} (-1)^{i-\ell} \binom{m-\ell}{m-i} \binom{m}{\ell} (n-\ell).$$

The minimum ranks for $\ell = 0, 1, 2$ are $r_0 = 0$, $r_1 = 1$ and $r_2 = 2$. Therefore, for these values of $\ell$, the summation on $r$ is empty and the corresponding terms of $a_i$ come only from the second summation. This and Lemma 3.3(a) yield

$$a_i = \left[ \sum_{\ell=3}^{i} (-1)^{i-\ell} \binom{m-\ell}{m-i} \sum_{r=r_\ell}^{\ell-1} k_r^\ell (\ell-r) \right] + \left[ \sum_{\ell=0}^{i} (-1)^{i-\ell} \binom{m}{m-i} \binom{i}{\ell} (n-\ell) \right].$$

The second sum equals zero by the Binomial Theorem and Lemma 3.3(b).

The above result gives a general formula for the coefficients of the polynomial integrand in terms of the values $k_r^\ell$. Determining the values of $k_r^\ell$ for large $\ell$ poses a major challenge. We conclude this section with the enumeration for $\ell = 3, 4, 5, 6$ and the corresponding coefficients of $p_m(t)$.

Definition 3.5. For a connected graph $G$, define

(a) $c_i$ = number of cycles of size $i$ in $G$.

(b) $c_{i,1}$ = number of cycles of size $i$ with one chord.

(c) $\bar{c}_{i,1}$ = number of cycles of size $i$ with one chord and one additional edge that is not a chord of the cycle.

(d) $k_i$ = number of complete subgraphs $K_i$ in $G$.
(e) $k_{i,j}$ = number of complete bipartite subgraphs $K_{i,j}$ in $G$.

Lemma 3.6. For $\ell = 3, 4, 5, 6$,

$$\sum_{r=r_\ell}^{\ell-1} k_r^\ell (\ell-r) = \sum_{j=r_\ell}^{\ell} c_j \binom{m-j}{m-\ell} - d_\ell \qquad (4)$$

where

$$d_3 = 0, \quad d_4 = 0, \quad d_5 = k_3^5, \quad d_6 = \bar{c}_{4,1} + c_{5,1} + k_{3,2} + 4k_4.$$

Proof. We show the above result for $\ell = 5$; the other cases are similar in nature. The minimum rank for $\ell = 5$ is $r_5 = 3$ and so the left-hand side of equation (4) is $k_4^5 + 2k_3^5$. The types of subgraphs counted in $k_4^5$ are those with 5 edges and $n-4$ connected components, which have the form shown in Figure 1(a)-(c). Analogously, there is only one type of subgraph counted in $k_3^5$, which is shown in Figure 1(d). Note that the graphs in Figures 1 and 2 that are a one-clique sum of smaller graphs actually represent families that include subgraphs that are disjoint unions of the summands. For example, 1(a) includes $K_3 \cup P_2$, where $K_3$ is the complete graph on three vertices and $P_2$ is a path with two edges.

Now consider the right-hand side of (4). Start with any 3-cycle and choose any other 2 edges in the graph; there are $c_3 \binom{m-3}{2}$ ways to do this. This counts all the types of subgraphs depicted in Figure 1(a) and counts the subgraphs in Figure 1(d) twice. Figure 2 gives a pictorial representation of $c_3 \binom{m-3}{2}$. The subgraphs counted by $c_4 \binom{m-4}{1}$ (start with a 4-cycle and choose any other edge) are of the type shown in Figure 1(b) and Figure 1(d). These are depicted in the right-hand side of Figure 2.

[Figure 1: Representations of the subgraphs counted in $k_4^5$ (panels (a), (b), (c)) and $k_3^5$ (panel (d)).]

[Figure 2: Representations of the subgraphs counted in $c_3 \binom{m-3}{2}$ and $c_4 \binom{m-4}{1}$.]

Lastly, $c_5$ is the number of 5-cycles, which are shown in Figure 1(c). Therefore,

$$k_4^5 + 2k_3^5 = c_3 \binom{m-3}{2} + c_4 \binom{m-4}{1} + c_5 - k_3^5.$$
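The $\ell = 5$ identity just derived can be checked numerically on small graphs. The following Python sketch is an illustration only: the helper names (`rank`, `is_cycle`, `lemma_3_6_sides`) are hypothetical and not from the paper, and it assumes vertices are labelled `0..n-1` and edges are given as pairs. It counts $k_4^5$, $k_3^5$ and the cycle numbers $c_3, c_4, c_5$ by exhaustive enumeration and returns both sides of the identity.

```python
from itertools import combinations
from math import comb


def rank(n, edges):
    """Rank r(A) = n - k(A) of the spanning subgraph with the given edges."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    r = 0
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            r += 1  # an edge joining two components raises the rank by one
    return r


def is_cycle(n, edges):
    """True if the edge set is a single cycle: 2-regular and connected."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return all(d == 2 for d in degree.values()) and rank(n, edges) == len(edges) - 1


def lemma_3_6_sides(n, edges):
    """Return (lhs, rhs) of the l = 5 case of equation (4)."""
    m = len(edges)
    five = list(combinations(edges, 5))
    k4 = sum(1 for A in five if rank(n, A) == 4)  # k_4^5
    k3 = sum(1 for A in five if rank(n, A) == 3)  # k_3^5
    c = {j: sum(1 for A in combinations(edges, j) if is_cycle(n, A))
         for j in (3, 4, 5)}
    lhs = k4 + 2 * k3
    rhs = c[3] * comb(m - 3, 2) + c[4] * comb(m - 4, 1) + c[5] - k3
    return lhs, rhs


K4 = list(combinations(range(4), 2))
K5 = list(combinations(range(5), 2))
print(lemma_3_6_sides(4, K4))  # (12, 12)
print(lemma_3_6_sides(5, K5))  # (282, 282)
```

The enumeration is exponential in the number of edges, so this is only practical for very small graphs; it is meant as a consistency check, not as a way to compute the coefficients in general.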
While initially Lemma 3.6 appears only to complicate the coefficient formula given in Theorem 3.4, the next lemma shows that when it is applied to the coefficient formula, it actually simplifies it. The proof is a straightforward calculation and so we omit it; the reasoning is analogous to the proof of Lemma 3.6. Although we proved the first equation in Lemma 3.7 for $i = 3, 4, 5, 6$, we conjecture that it holds in general for all $i \ge 3$.

Lemma 3.7. For $i = 3, 4, 5, 6$,

$$\sum_{\ell=3}^{i} (-1)^{i-\ell} \binom{m-\ell}{m-i} \sum_{j=r_\ell}^{\ell} c_j \binom{m-j}{m-\ell} = c_i$$

and thus

$$a_i = c_i - \sum_{\ell=3}^{i} (-1)^{i-\ell} \binom{m-\ell}{m-i} d_\ell \qquad (5)$$

Finally, we derive representations for the coefficients $a_3$ through $a_6$ in terms of the structure of the underlying graph $G$. The proof of the theorem is a direct application of Lemmas 3.6 and 3.7 to the general coefficient formula given in Theorem 3.4.

Theorem 3.8. Let

$$p_m(t) = -1 + \sum_{i=0}^{m} a_i t^i$$

be the polynomial integrand in Steele's formula for the expected length of the MST of a connected graph $G$ with $n$ vertices and $m$ edges. Then

$$a_3 = c_3, \quad a_4 = c_4, \quad a_5 = c_5 - k_3^5, \quad a_6 = c_6 + 2k_4 - c_{5,1} - k_{3,2}.$$

4 Application of Results

In this section, we apply Theorems 3.2 and 3.8 to the complete bipartite graph $K_{3,2}$ in order to derive the expected length of the minimal spanning tree of this graph.

Proposition 4.1. Let $p_m(t)$ be the polynomial integrand in Steele's formula for the complete bipartite graph $K_{3,2}$ shown below.

[Figure: the complete bipartite graph $K_{3,2}$.]

Then

$$p_m(t) = 4 - 6t + 3t^4 - t^6$$

and

$$E[L(K_{3,2})] = \int_0^1 (4 - 6t + 3t^4 - t^6) \, dt = 4 - 3 + \frac{3}{5} - \frac{1}{7} = \frac{51}{35}.$$

Proof. For $K_{3,2}$, $n = 5$ and $m = 6$. By Theorem 3.2, we have $a_0 = 5$, $a_1 = -6$, and $a_2 = 0$. Next, we apply Theorem 3.8. $K_{3,2}$ has no 3-cycles, so $a_3 = 0$. The graph has three 4-cycles, so $a_4 = 3$. For the coefficient $a_5$, we note that there are no 5-cycles and also no $k_3^5$-type subgraphs (a 4-cycle with a chord) either, so $a_5 = 0$.
Lastly, for $a_6$, there are no 6-cycles, no $K_4$ subgraphs, no $c_{5,1}$-type subgraphs (since there are no 5-cycles), and one $k_{3,2}$-type subgraph (the entire graph). Therefore, $a_6 = -1$ and we get

$$p_m(t) = -1 + 5 - 6t + 3t^4 - t^6.$$

5 The Complete Graph

The MST problem on $K_n$ has been studied extensively. Frieze [3] proved that

$$\lim_{n \to \infty} E[L(K_n)] = \zeta(3) = \sum_{i=1}^{\infty} i^{-3} = 1.202\ldots$$

In [8], Steele extended this result to general edge distributions and Janson [4] proved a central limit theorem for $L(K_n)$.

We apply our results to the complete graph and derive exact formulas in terms of the number of vertices $n$ for the first seven coefficients of the polynomial integrand in Steele's formula.

Theorem 5.1. Let $p_m(t) = -1 + \sum_{i=0}^{m} a_i t^i$ be the polynomial integrand in Steele's formula for the complete graph on $n$ vertices, denoted by $K_n$. Then

$$a_0 = n, \quad a_1 = -\binom{n}{2}, \quad a_2 = 0, \quad a_3 = \binom{n}{3},$$

$$a_4 = 3\binom{n}{4}, \quad a_5 = 12\binom{n}{5} - 6\binom{n}{4}, \quad a_6 = 60\binom{n}{6} - 60\binom{n}{5} - 2(n-5)\binom{n}{4}.$$

Proof. For the complete graph on $n$ vertices, the number of edges is $m = \binom{n}{2}$ and the number of cycles of length $j$ is given by

$$c_j = \frac{1}{2}(j-1)! \binom{n}{j}.$$

In addition, $k_3^5 = 2c_4$, $c_{5,1} = 5c_5$, $k_4 = \binom{n}{4}$ and $k_{3,2} = \binom{n}{5}\binom{5}{2}$. Substituting these values into Theorems 3.2 and 3.8 gives the stated formulas.

Numerical calculation of $E[L(K_n)]$ had led to the famous conjecture that the convergent sequence is also monotone increasing and concave. This problem was raised at the conference Mathematics and Computer Science II at Versailles in 2002 but no proof has been found. Clearly, our results alone will not answer this question as we have only derived exact formulas for the first seven coefficients. But our results give a hint that there may be a pattern to the coefficients of the polynomial integrand in Steele's formula for the complete graph, which if true, would answer the conjecture.
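For small $n$ the formulas of Theorem 5.1 can be compared against a direct evaluation of formula (2). The Python sketch below is an illustration under stated assumptions, not part of the original analysis: the function names are hypothetical, vertices are labelled `0..n-1`, and all $2^m$ spanning subgraphs are enumerated, so it is only usable for very small graphs. It computes the coefficients $a_0, \ldots, a_m$ from formula (2) and integrates $p_m(t) = -1 + \sum_i a_i t^i$ exactly with rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def components(n, edges):
    """Number of connected components k(A) of a spanning subgraph (union-find)."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    count = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            count -= 1
    return count


def integrand_coefficients(n, edges):
    """Coefficients [a_0, ..., a_m] from formula (2), by brute force."""
    m = len(edges)
    # s[l] = sum of k(A) over all spanning subgraphs A with exactly l edges
    s = [sum(components(n, A) for A in combinations(edges, l))
         for l in range(m + 1)]
    return [sum((-1) ** (i - l) * comb(m - l, m - i) * s[l] for l in range(i + 1))
            for i in range(m + 1)]


def expected_mst_length(n, edges):
    """Exact integral over [0, 1] of p_m(t) = -1 + sum_i a_i t^i."""
    a = integrand_coefficients(n, edges)
    return Fraction(-1) + sum(Fraction(c, i + 1) for i, c in enumerate(a))


def theorem_5_1(n):
    """First seven coefficients for K_n as stated in Theorem 5.1."""
    return [n, -comb(n, 2), 0, comb(n, 3), 3 * comb(n, 4),
            12 * comb(n, 5) - 6 * comb(n, 4),
            60 * comb(n, 6) - 60 * comb(n, 5) - 2 * (n - 5) * comb(n, 4)]


def complete_graph(n):
    return list(combinations(range(n), 2))


print(integrand_coefficients(4, complete_graph(4)))  # [4, -6, 0, 4, 3, -6, 2]
print(theorem_5_1(4))                                # [4, -6, 0, 4, 3, -6, 2]
print(expected_mst_length(4, complete_graph(4)))     # 31/35
```

The same functions reproduce Proposition 4.1 when called with the six edges of $K_{3,2}$, giving $51/35$.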
We end this section with a result that factors the polynomial integrand in Steele's formula for $K_n$, with one of the factors a polynomial of degree less than or equal to the number of edges of $K_{n-1}$.

Theorem 5.2. Let $p_m(t)$ be the polynomial integrand in Steele's formula for the expected length of the MST of the complete graph on $n$ vertices denoted by $K_n$. Then

$$p_m(t) = (1-t)^{n-1} q(t),$$

where $q(t)$ is a polynomial of degree less than or equal to $\binom{n-1}{2}$.

Proof. As in the proof of Theorem 3.1, we have

$$p_m(t) = \sum_{A \in S(G)} k(A) \, t^{|A|} (1-t)^{m-|A|} - 1,$$

where $m = \binom{n}{2}$.

We factor out $(1-t)^{n-1}$ to get

$$p_m(t) = (1-t)^{n-1} \left[ \sum_{A \in S(G)} k(A) \, t^{|A|} (1-t)^{\binom{n}{2}-|A|-(n-1)} - (1-t)^{1-n} \right].$$

Note that $\binom{n}{2} - (n-1) = \binom{n-1}{2}$. Now the sum ranges over spanning subgraphs of size (in edges) from 0 to $\binom{n}{2}$. We split it into two sums as follows:

$$p_m(t) = (1-t)^{n-1} \left[ \sum_{|A| \le \binom{n-1}{2}} k(A) \, t^{|A|} (1-t)^{\binom{n-1}{2}-|A|} + \sum_{|A| > \binom{n-1}{2}} k(A) \, t^{|A|} (1-t)^{\binom{n}{2}-|A|-(n-1)} - (1-t)^{1-n} \right].$$

Clearly, the first sum over $|A| \le \binom{n-1}{2}$ is a polynomial of degree at most $\binom{n-1}{2}$. Call it $q_1(t)$.

For the second sum, we sum over the possible numbers of edges $i > \binom{n-1}{2}$ and count the number of subgraphs with $i$ edges, which for the complete graph is $\binom{\binom{n}{2}}{i}$. Furthermore, for all spanning subgraphs of $K_n$ with $i > \binom{n-1}{2}$ edges, the number of connected components is 1.
Therefore, we have

$$\sum_{|A| > \binom{n-1}{2}} k(A) \, t^{|A|} (1-t)^{\binom{n}{2}-|A|-(n-1)}$$

$$= \sum_{i=\binom{n-1}{2}+1}^{\binom{n}{2}} \binom{\binom{n}{2}}{i} t^i (1-t)^{\binom{n}{2}-i-(n-1)}$$

$$= \sum_{i=0}^{\binom{n}{2}} \binom{\binom{n}{2}}{i} t^i (1-t)^{\binom{n}{2}-i} (1-t)^{-(n-1)} - \sum_{i=0}^{\binom{n-1}{2}} \binom{\binom{n}{2}}{i} t^i (1-t)^{\binom{n-1}{2}-i}$$

$$= (1-t)^{1-n} \sum_{i=0}^{\binom{n}{2}} \binom{\binom{n}{2}}{i} t^i (1-t)^{\binom{n}{2}-i} - \sum_{i=0}^{\binom{n-1}{2}} \binom{\binom{n}{2}}{i} t^i (1-t)^{\binom{n-1}{2}-i}.$$

By the Binomial Theorem, the first sum equals 1, and the second term, taken together with its minus sign, is a polynomial of degree at most $\binom{n-1}{2}$; call it $q_2(t)$.

We now have

$$p_m(t) = (1-t)^{n-1} \left( q_1(t) + (1-t)^{1-n} + q_2(t) - (1-t)^{1-n} \right) = (1-t)^{n-1} \left( q_1(t) + q_2(t) \right),$$

where both $q_1(t)$ and $q_2(t)$ are polynomials of degree less than or equal to $\binom{n-1}{2}$. This completes the proof.

References

[1] B. Bollobás, Modern Graph Theory, Springer Graduate Texts in Mathematics (1998), p. 335.
