Constraint Satisfaction, Packet Routing, and the Lovász Local Lemma

David G. Harris∗   Aravind Srinivasan†

∗ Department of Applied Mathematics, University of Maryland, College Park, MD 20742. Research supported in part by NSF Award CNS-1010789. Email: davidghar-[email protected]

† Department of Computer Science and Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742. Research supported in part by NSF Awards CCR-0208005, ITR CNS-0426683, CNS-0626636, and CNS-1010789. Email: [email protected].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
STOC'13, June 1–4, 2013, Palo Alto, California, USA.
Copyright 2013 ACM 978-1-4503-2029-0/13/06 ...$10.00.

ABSTRACT

Constraint-satisfaction problems (CSPs) form a basic family of NP-hard optimization problems that includes satisfiability. Motivated by the sufficient condition for the satisfiability of SAT formulae that is offered by the Lovász Local Lemma, we seek such sufficient conditions for arbitrary CSPs. To this end, we identify a variable-covering radius–type parameter for the infeasible configurations of a given CSP, and also develop an extension of the Lovász Local Lemma in which many of the events to be avoided have probabilities arbitrarily close to one; these lead to a general sufficient condition for the satisfiability of arbitrary CSPs. One primary application is to packet-routing in the classical Leighton-Maggs-Rao setting, where we introduce several additional ideas in order to prove the existence of near-optimal schedules; further applications in combinatorial optimization are also shown.

1. INTRODUCTION

Constraint-satisfaction problems (CSPs) abstract a large variety of problems in combinatorial optimization. In their basic form, there are $n$ variables where the $i$th variable is required to take values from some finite ordered set $A_i$. A CSP then is just a collection of infeasible configurations $F \subseteq A_1 \times A_2 \times \cdots \times A_n$; the decision problem is whether any configuration outside $F$ exists (i.e., whether the complement of $F$ is nonempty), and the optimization problem is to find such a configuration (if one exists). The $k$-CNF-SAT problem is a well-known instance of CSPs, in which each element of $F$ is a conjunction of $k$ literals formed by the underlying Boolean variables or their complements (with the other $n-k$ being don't-cares). In addition to algorithms and complexity, CSPs have also spurred research in algebra, logic, model theory, probabilistic methods, phase transitions etc.; we just mention the algebraic dichotomy conjecture [7] as one of the central contributions to, and conjectures of, this field. Motivated by the well-known sufficient condition provided by the Lovász Local Lemma (LLL) for the satisfiability of $k$-CNF-SAT formulae [10], this work develops a general sufficient condition for the satisfiability of arbitrary CSPs, formulating a generalization of the LLL in the process.

We start by recalling how a key parameter works well with the LLL, for $k$-CNF-SAT instances (where each clause has exactly $k$ literals, say). Let $e$ denote the base of the natural logarithm. The basic ("symmetric") version of the LLL states that if we have a collection of $m$ "bad" events $B_1, B_2, \ldots, B_m$ such that $\Pr[B_i] \le p$ with each $B_i$ "depending" on at most $d$ other $B_j$, then $e \cdot p \cdot (d+1) \le 1$ is a sufficient condition for $\Pr[\overline{B_1} \cap \overline{B_2} \cap \cdots \cap \overline{B_m}] > 0$. The fact that $m \gg d$ is allowed leads to a number of applications of the LLL [2]. The more general ("asymmetric") version of the LLL allows heterogeneous values for the probability and amount of independence of each $B_i$. For nearly all applications of the LLL, there are now polynomial-time algorithms which can find configurations that avoid all bad events [22, 14, 18, 24]. An easy application of the symmetric version of the LLL shows that if the maximum number $d$ of clauses in which any literal ($X_i$ or $\overline{X_i}$) appears is at most $2^k/(ek) - 1/k$, then the formula is satisfiable (see [30, 12] for further improvements); furthermore, a satisfying assignment can be found in polynomial time [22]. We ask: what is an analog of $d$ for an arbitrary CSP $P$, a suitable upper bound for which implies the satisfiability of $P$?

To understand this, let us observe that any CSP can be written as a simple 0−1 integer linear program as follows. Denote the set $\{1, 2, \ldots, t\}$ by $[t]$. Let $x_{i,j}$ be the indicator variable for the $i$th variable taking on the value $j \in A_i$; thus, $\forall i \in [n]$, $\sum_{j \in A_i} x_{i,j} = 1$. Next, each forbidden configuration $\langle i_1, j_1 \rangle, \ldots, \langle i_k, j_k \rangle$ can be avoided by writing the constraint "$x_{i_1,j_1} + \cdots + x_{i_k,j_k} \le k-1$". Note that this linear packing constraint is (of course) monotone decreasing in the variables $x$. Allowing more general, nonlinear but monotone decreasing constraints could make the description succinct and/or sparse; we will see this in our packet-routing application.

With this in mind, we start with the following formulation of arbitrary CSPs:

Definition 1.1. (CSPs as Integer Programs with Decreasing Boolean Constraints.) Consider the following type of integer program, where $x_{i,j}$ here is the indicator variable for the $i$th variable of the CSP taking on the value $j \in A_i$, and $V = \{x_{i,j} : i \in [n], j \in A_i\}$ denotes the set of all these indicator variables:

(Assignment Constraints) $\forall i \in [n]$, $\sum_{j \in A_i} x_{i,j} = 1$.

(Decreasing Boolean Constraints) For all $k \in [\ell]$, a given Boolean function $B_k$, which is an increasing function of each of the variables in some given subset $S_k \subseteq V$, should be false. (As our goal is to avoid $B_k$, we sometimes refer to these as "bad events".)

(Integrality) $\forall (i,j)$, $x_{i,j} \in \{0,1\}$.

A technical remark. Our integer programs will have the above three types of constraints; a remark is in order. Suppose $|A_i| = 2$, say, and that we have variables $x_{i,1}$ and $x_{i,2}$ which are constrained to sum to 1, by the assignment constraint. Now if $B_k$ is an increasing function of $x_{i,1}$, it is a decreasing function of $x_{i,2}$ on the set of feasible solutions. To avoid any confusion about this, the reader is asked to note that for each $k$, there is some $S_k \subseteq V$ such that $B_k$ is an increasing function of each variable in $S_k$ (e.g., we will have $x_{i,1} \in S_k$ here).

Our new "assignment LLL" is given as Theorem 2.2; its proof departs from the standard inductive proof of the LLL to develop a different induction that combines the "Rödl Nibble" [?] and correlation inequalities. Intriguingly, our approach will set each $x_{i,j}$ to 1 with infinitesimally-small probability: note that each assignment constraint is now very likely to be violated. Nevertheless, this yields a pathway to proving the theorem. We apply the assignment LLL to improve bounds of [1, 3, 8, 15, 16, 19, 20, 25, 31]. However, our approach here does not appear to lead to an algorithmic counterpart of our assignment LLL. In many of our examples, this extended LLL framework can be applied as an almost automatic replacement for the original LLL. Hence, many of these applications will be very direct. In addition, we will examine the case of packet-routing in the classical setting of [20] in detail: we show here that there are many other improvements possible, using the standard LLL and further using our assignment LLL.

The following definition will be crucial:

Definition 1.2. $Y_{i,j} = \{k \mid x_{i,j} \in S_k\}$ indexes those $B_k$ that increase explicitly as a function of $x_{i,j}$.

Informal discussion of the assignment LLL. The key analog of the parameter $d$ from $k$-SAT is $d = \max_{i,j} |Y_{i,j}|$: informally, we can view $d$ as the "variable-covering radius" of the CSP, that is, the maximum number of events that are influenced by any given variable. To understand the assignment LLL informally (albeit somewhat imprecisely), it roughly says that a sufficient condition for the CSP to be satisfiable is the existence of a non-negative vector $z$ (which is a relaxation of the vector $x$ required by the three CSP constraints above) and a corresponding random vector $Z$, as well as some small $\delta > 0$, with the following two properties:

1. the assignment constraints are slightly "over-satisfied", i.e., for all $i$, $\sum_{j \in A_i} z_{i,j} \ge e^{\delta}$, and

2. if $\vec{Z} = (Z_{i,j})$ is the vector of mutually independent Bernoulli random variables with $P(Z_{i,j} = 1) = z_{i,j}$, then for all $(i,j)$ and all $k \in Y_{i,j}$, $P(B_k(\vec{Z}) \mid Z_{i,j} = 1) < \delta/d$.

(We stress that this is an approximate restatement, and is not to be taken literally.) The fact that the probability bound in the second item depends on $d$ alone is one of the key benefits brought by the assignment LLL. In contrast, a typical method of applying the LLL to this type of problem is to select exactly one value for each variable in the CSP (very often independently, with the $i$th CSP variable chosen from some distribution on $A_i$; the distribution is often uniform or given by an LP or other convex program) and to then treat the $B_k$ as the bad events. The advantage of this method is that the assignment constraints are automatically satisfied. Unfortunately: (i) this introduces significant dependencies (e.g., between $B_k$ and $B_{k'}$ that respectively depend on $x_{i,j}$ and $x_{i,j'}$), and more importantly, (ii) some $B_k$ may depend on a large number of variables, even if $d$ is small, thus making the dependencies among the $B_k$ "large." This is the difference between row-sparsity and column-sparsity of the dependency matrix.

We next discuss a variety of applications, and compare them to the best previous results (many proved using the standard LLL).

(a) Packet routing: the Leighton-Maggs-Rao framework. A fundamental packet-routing problem is as follows. Suppose we are given an undirected graph $G$ with $N$ packets, in which we need to route each packet $i$ from vertex $s_i$ to vertex $t_i$ along a given simple path $P_i$. The constraints are that each edge can only carry one packet at a time, and each edge traversal takes unit time for a packet; nodes are allowed to queue packets. The goal is to conduct routings along the paths $P_i$, in order to minimize the makespan $T$ (the time by which all packets have reached their destinations).

Two natural lower-bounds on $T$ are the congestion $C$ (the maximum number of the $P_i$ that contain any given edge of the graph) and the dilation $D$ (the length of the longest $P_i$); thus, $(C+D)/2$ is always a lower-bound, and there exist families of instances with $T \ge (1+\Omega(1)) \cdot (C+D)$ [28]. A seminal result proven in [20] (via iterated application of the LLL, which has become a key tool in its own right [21]) is that in fact $T \le O(C+D)$ for all input instances, using constant-sized queues at the edges; both this result and its approach have been used in much work in networks and combinatorial optimization. This argument was refined and simplified in [29, 25], leading to a (constructive) bound of $23.4(C+D)$ [25]. By introducing several new ideas in scheduling packets along time intervals, we improve this to $7.26(C+D)$ (non-constructively) and $8.84(C+D)$ (constructively), thus approaching the simple lower bound for this fundamental problem.

The packet-routing application above involves increasing Boolean functions $B_k$ that are non-linear. The rest of our applications have linear "packing" constraints; i.e., each bad event $B_k$ is of the form "$\sum_{(i,j)} a_k(i,j)\, x_{i,j} > b_k$", where the given values $\{a_k(i,j)\}$ and $b_k$ are non-negative.

(b) Integrality gap for column-sparse matrices. Consider the special case of our family of CSPs where each $B_k$ is a linear packing constraint: of the form "$\sum_{(i,j)} a_k(i,j)\, x_{i,j} > b_k$", where $a_k(i,j) \in \{0,1\}$ and $b_k \ge 0$. When does such an integer linear program have a feasible solution?
In particular, suppose there exists a solution $x'$ to the natural fractional relaxation¹ of our CSPs in which $\sum_{(i,j)} a_k(i,j)\, x'_{i,j} \le b'_k$ holds for each $k \in [\ell]$; how small should the $b'_k$ be for our CSP to be satisfiable? We work in the classical column-sparse setting [4, 17] where for each $(i,j)$, the number of $k$ for which $a_k(i,j)$ is nonzero is at most some parameter $d$. Consider, for notational simplicity, the case where $b'_k \ge \Omega(\log d)$ for all $k$. There is a classical linear-algebraic approach to such problems [4, 17]; in particular, the work of [17] shows that "$b_k \ge b'_k + d$ for each $k$" suffices. Complementing this, our assignment LLL shows that the condition "$b_k \ge b'_k (1 + d^{-C}) + C' \cdot \sqrt{b'_k \log d}$ for each $k$" is a sufficient condition, where $C$ is any positive constant, and where $C'$ is a constant that only depends on $C$. The related theorems of [31, 19] give a weaker result which, furthermore, holds only when all the $b_k$ are within $O(1)$ of each other. However, the results of [17, 19] are constructive, while ours is not. (Note again that these earlier results hold only when each $B_k$ is linear, while we allow arbitrary increasing functions.)

¹ Wherein the constraint "$\forall (i,j),\ x_{i,j} \in \{0,1\}$" is relaxed to let the $x_{i,j}$ lie in $[0,1]$.

(c) Independent transversals. Let $n$ and $\Delta$ denote as usual the number of vertices and maximum degree respectively, of an undirected graph $G = (V,E)$. It is well-known that $G$ has an independent set of size at least $n/(\Delta+1)$. What if we want independent sets with much more structure? In particular, say that we have a partition of the vertices of $G$ into blocks of size $b$. We wish to select exactly one vertex from each block such that the selected vertices form an independent set in $G$. This is known as an independent transversal.

A well-known result of Alon using a simple LLL-based proof [1, 2] shows that if $b \ge 2e\Delta$, then $G$ has an independent transversal. Thus, at the cost of the loss of a small constant ($2e \sim 5.44$), we get much more structure as compared to the basic "$n/(\Delta+1)$" lower-bound. Using the algorithmic LLL of [24], this can be improved to show that $b \ge 4\Delta$ is sufficient. The existentially optimal, but non-constructive, result of [15, 16] states that $b \ge 2\Delta$ suffices.

Our approach shows that $b \ge 4\Delta + 1$ suffices; moreover, we are able to obtain two improvements compared to previous results. First, if the given partition of $V$ has the property that the average degree of the vertices in each block is at most $d$, then the method of [1] shows that it suffices to have $b \ge 2e(d+1)$, which does not appear to carry over to the corresponding $b \ge 2\Delta$ bound of [15, 16]; we show that $b > 4d$ suffices. (Pegden's method [24] can also be used to derive this bound.) Second, our method extends to the weighted case, where the approach of [15, 16] does not hold to our knowledge. Suppose now that each vertex $i$ has a weight $w_i \ge 0$; it is not hard to observe that $G$ has an independent set of total weight at least $(\sum_i w_i)/(\Delta+1)$, and that there does not always exist an independent transversal of weight more than $(\sum_i w_i)/b$. Here, we show that if our partition satisfies $b > 4\Delta$, then there exists an independent transversal of weight at least

$$\Big(\sum_i w_i\Big) \cdot \frac{\sqrt{b-4\Delta} + \sqrt{b}}{\sqrt{b-4\Delta} + \sqrt{b}\,(2b-1)}$$

(which approaches $(\sum_i w_i)/b$ for large $b$).

We apply our result for independent transversals to the problem of chromatic capacity. Rödl has shown that for any acyclic digraph $D$, there exists an undirected graph $D^*$ such that all of the latter's acyclic orientations contain $D$ as an induced subgraph [26]. Cochand & Duchet introduced the notion of chromatic capacity to explicitly construct $D^*$, using graphs that have a large value for this parameter [8]. We improve the degree-based upper-bound for the chromatic number from [3, 8], as well as a related upper-bound on the "$k$th upper chromatic number" of the reals due to [3].

We present the Assignment LLL and related notions in Sections 2, 3 and 4, and follow up with applications in the subsequent sections.

2. THE ASSIGNMENT LLL

We begin by fixing our notation. For a vector $v$ and scalar $a$, we let $v = \vec{a}$ denote that each entry of $v$ is $a$. Define $A = [n]$ to be the family of all blocks, and let $Z_{i,j}$ be the indicator for selecting the $j$th element from block $i$. We are given some Boolean functions $B_1, B_2, \ldots, B_\ell$, such that each $B_k$ is determined by, and is an increasing function of, a subset $S_k$ of the $Z$ variables.

We refer to any ordered pair $\langle i,j \rangle$ where $j \in A_i$ as an element. So, for example, we might refer to $Y_x$ where $x = \langle i,j \rangle$ is an element. We suppose there are $N$ elements, and sometimes identify the set of elements with the set $[N]$. When discussing blocks $i$ and possible values $j \in A_i$, we will often omit to mention that $j \in A_i$ (e.g., as in $\sum_j p_{i,j}$). This should always be understood as $j$ being restricted to $A_i$, or equivalently that $p_{i,j} = 0$ for $j \notin A_i$.

Unlike in the usual LLL, we cannot simply define a probability distribution and compute the probability of the bad event occurring. We will only have partial control over this probability distribution, and the following definition will be important:

Definition 2.1. (UNC) Given a probability distribution $\Omega$ on the underlying indicator variables $Z$, we say that $\Omega$ satisfies upper negative correlation with respect to the vector $q \in [0,\infty)^N$, or simply "UNC($q$)", if for all $k$ and for all distinct elements $x_1, \ldots, x_k$, we have

$$P_\Omega(Z_{x_1} = \cdots = Z_{x_k} = 1) \le q_{x_1} \cdots q_{x_k}.$$

For an event $E$ and vector $q$, we define $P^*_q(E)$ to be the minimum, over all $\Omega$ satisfying UNC($q$), of $P_\Omega(E)$. (As the number of variables is finite, this minimum is achieved. Also, while we could restrict $q \in [0,1]^N$ without loss of generality, by allowing $q \in [0,\infty)^N$ we simplify the notation considerably.)

Essentially, when computing $P^*_q(E)$, we are not allowing the random variables $Z$ to be positively correlated. For some types of events, such as large-deviation events, this allows us to control the probability very strongly; for other events, such as a union of many events, this is no better than the union bound.

Our main theorem is:

Theorem 2.2 (Assignment LLL). Suppose we are given a CSP for which there exists $\lambda \in [0,\infty)^N$ such that

$$\forall i \in [n], \quad \sum_j \lambda_{i,j} \cdot P^*_\lambda\Big[\bigcap_{k \in Y_{i,j}} \overline{B_k} \;\Big|\; Z_{i,j} = 1\Big] > 1.$$

Then, if no $B_k$ is a tautology, the CSP is feasible.

To prove the theorem, we will study the following probabilistic process. We are given a vector $p \in [0,1]^N$ of probabilities, one for each indicator $Z_x$. Each $Z_x$ is drawn independently as Bernoulli-$p_x$, i.e., $P(Z_x = 1) = p_x$. (If for some element $x$ we have $p_x > 1$, then by an abuse of notation, we take this to mean that $Z_x = 1$ with certainty.) Let $Y = \{B_1, B_2, \ldots, B_\ell\}$ denote the set of all bad events. Our goal is to satisfy all the assignment constraints and avoid all the events in $Y$. If $Y' \subseteq Y$, we use the notation $\exists Y'$ to denote the event that some $B_i \in Y'$ occurs. So in this case, we want to avoid the event $\exists Y$.

We recall a basic lemma concerning increasing and decreasing events, which follows from the FKG inequality [11]:

Lemma 2.3. Let $X_0 \subseteq [N]$. Suppose $B_1$ is an event which depends solely on variables $X_1$, where $X_1$ is disjoint from $X_0$, and let $E^-$ be a decreasing event. Then,

$$P(\forall x \in X_0\ Z_x = 1 \mid B_1, E^-) \le \prod_{x \in X_0} p_x.$$

Similarly, if $E^+$ is increasing, then $P(\forall x \in X_0\ Z_x = 1 \mid B_1, E^+) \ge \prod_{x \in X_0} p_x$.

Proof. We will only prove the first part of this lemma; the second is analogous. We average over all assignments to the variables $Z_x$, for $x \in X_1$. For any such assignment vector $\vec{z}$, the event $\bigwedge_{x \in X_0} Z_x = 1$ is an increasing function, while $E^-$ is a decreasing function in the remaining variables. Hence, by FKG, the probability of this event conditional on $(Z_{X_1} = \vec{z} \wedge E^-)$ is at most its probability conditional on $Z_{X_1} = \vec{z}$ alone. But the events $\bigwedge_{x \in X_0} Z_x = 1$ and $Z_{X_1} = \vec{z}$ involve disjoint sets of variables, so they are independent. Hence this probability is at most the unconditional probability of $\bigwedge_{x \in X_0} Z_x = 1$, namely $\prod_{x \in X_0} p_x$.

If $A' \subseteq A$ is any subset of the blocks, we define the event $\mathrm{Assigned}(A')$ to be the event that, for all $i \in A'$, there is at least one value of $j$ for which $Z_{i,j} = 1$. Our goal is to satisfy the constraint $\mathrm{Assigned}(A)$. Because all the bad events are increasing Boolean functions, if we can find a configuration in which each block has at least one value assigned, we can easily alter it to a feasible configuration in which each block has exactly one value assigned.

We are now ready to state the first lemma concerning this probabilistic process. We want to show that there is a positive probability of satisfying all the assignment constraints and avoiding all bad events. We will show the following key lemma by induction:

Lemma 2.4. Let $0 < \epsilon < 1$. Suppose $p \in [0,1]^N$ is a vector such that for all blocks $i$,

$$\sum_j p_{i,j}\, P^*_{p/\epsilon}(\neg\exists Y_{i,j} \mid Z_{i,j} = 1) - \sum_{j,j'} p_{i,j}\, p_{i,j'} \ge \epsilon.$$

Then for any block $i$, any $Y' \subseteq Y$ a set of bad events, and any $A' \subseteq A$ a set of blocks, we have

$$P(\mathrm{Assigned}(i) \mid \neg\exists Y', \mathrm{Assigned}(A')) \ge \epsilon.$$

Proof. We show this by induction on $|Y'| + |A'|$. We may assume that $i \notin A'$, as otherwise this is vacuous. First, suppose $|Y'| = 0$. Then, $P(\mathrm{Assigned}(i) \mid \neg\exists Y', \mathrm{Assigned}(A'))$ equals $P(\mathrm{Assigned}(i))$ as these events are independent. By Inclusion-Exclusion, the latter probability is at least

$$P(\mathrm{Assigned}(i)) \ge \sum_j p_{i,j} - \sum_{j,j'} p_{i,j}\, p_{i,j'}$$

and it is easy to see that the lemma's hypothesized constraint implies that this is at least $\epsilon$.

Next suppose $|Y'| > 0$. We use Inclusion-Exclusion to estimate $P(\mathrm{Assigned}(i) \mid \neg\exists Y', \mathrm{Assigned}(A'))$. First, consider the probability that a distinct pair $j, j'$ in block $i$ are jointly chosen, conditional on all these events. For this, by Lemma 2.3 we have

$$P(Z_{i,j} = Z_{i,j'} = 1 \mid \mathrm{Assigned}(A'), \neg\exists Y') \le p_{i,j}\, p_{i,j'} \tag{1}$$

as $i \notin A'$ and $\neg\exists Y'$ is decreasing.

Let us fix $j$. We next need to show a lower bound on $P(Z_{i,j} = 1 \mid \mathrm{Assigned}(A'), \neg\exists Y')$. This is easily seen to equal $P(Z_{i,j} = 1)$ if $Y_{i,j} \cap Y' = \emptyset$, so we can assume $Y_{i,j} \cap Y' \ne \emptyset$. Using Bayes' Theorem, we interchange the events $Z_{i,j} = 1$ and $\exists Y_{i,j}$ in the conditional probability to obtain

$$P(Z_{i,j} = 1 \mid \mathrm{Assigned}(A'), \neg\exists Y') \ge p_{i,j} \cdot P(\neg\exists Y_{i,j} \mid Z_{i,j} = 1, \mathrm{Assigned}(A'), \neg\exists(Y' - Y_{i,j})). \tag{2}$$

This approach to handling a conditioning was inspired by [5].

Consider the random variables $Z$ conditioned on the events $\mathrm{Assigned}(A')$, $\neg\exists(Y' - Y_{i,j})$, $Z_{i,j} = 1$. Our key claim now is that these conditional random variables $Z$ (apart from $Z_{i,j}$ itself) satisfy UNC($p/\epsilon$). To show this, we need to upper-bound $P(E_1 \mid E_2)$, where

$$E_1 \equiv (Z_{i'_1,j'_1} = \cdots = Z_{i'_k,j'_k} = 1) \quad\text{and}\quad E_2 \equiv (\mathrm{Assigned}(A'), \neg\exists(Y' - Y_{i,j}), Z_{i,j} = 1),$$

for arbitrary $k$ and $i'_1, j'_1, \ldots, i'_k, j'_k$. Let $I' = \{i'_1, \ldots, i'_k\}$ and $E_3 \equiv (Z_{i,j} = 1, \mathrm{Assigned}(A' - I'), \neg\exists(Y' - Y_{i,j}))$. By simple manipulations, we obtain

$$P(E_1 \mid E_2) \le \frac{P(E_1 \mid E_3)}{P(\mathrm{Assigned}(I') \mid E_3)}. \tag{3}$$

Note that $E_1$ does not share any variables with $(Z_{i,j} = 1, \mathrm{Assigned}(A' - I'))$, and that $\neg\exists(Y' - Y_{i,j})$ is a decreasing event. Hence by Lemma 2.3 the numerator of (3) is at most $p_{i'_1,j'_1} \cdots p_{i'_k,j'_k}$.

Now let us examine the denominator. The variable $Z_{i,j}$ does not affect any of the events mentioned in the denominator, so we may remove it from the conditioning: i.e., $P(\mathrm{Assigned}(I') \mid E_3)$ equals

$$P(\mathrm{Assigned}(I') \mid \mathrm{Assigned}(A' - I'), \neg\exists(Y' - Y_{i,j})),$$

which in turn is at least $\epsilon^{|I'|}$ by iterated application of the induction hypothesis. Putting all of this together,

$$P(E_1 \mid E_2) \le p_{i'_1,j'_1} \cdots p_{i'_k,j'_k} / \epsilon^{|I'|} \le p_{i'_1,j'_1} \cdots p_{i'_k,j'_k} / \epsilon^{k}.$$

So the random variables $Z$ conditionally satisfy UNC($p/\epsilon$) and we have

$$P(\neg\exists Y_{i,j} \mid Z_{i,j} = 1, \mathrm{Assigned}(A'), \neg\exists(Y' - Y_{i,j})) \ge P^*_{p/\epsilon}(\neg\exists Y_{i,j} \mid Z_{i,j} = 1).$$

The right-hand side is substantially simpler, as there is no conditioning to link the variables. Substituting this into (1) and (2), we get

$$P(\mathrm{Assigned}(i) \mid \mathrm{Assigned}(A'), \neg\exists Y') \ge \sum_j p_{i,j}\, P^*_{p/\epsilon}(\neg\exists Y_{i,j} \mid Z_{i,j} = 1) - \sum_{j,j'} p_{i,j}\, p_{i,j'}$$

and by our hypothesis the right-hand side is at least $\epsilon$.

Definition 2.5. (Functions $h$ and $H_i$) Suppose we are given a CSP and a vector $\lambda \in [0,\infty)^N$. For any element $x \in [N]$, let $h_x(\lambda) = P^*_\lambda(\neg\exists Y_x \mid Z_x = 1)$. Then, for any block $A_i$, we define $H_i(\lambda) = \sum_j \lambda_{i,j}\, h_{i,j}(\lambda)$.

Having defined the $H_i$, we can now return to Lemma 2.4 and allow all entries of $p$ to tend to 0 at the same rate, which leads to the following proof of the Assignment LLL:

Proof. (Of Theorem 2.2.) Given a $\lambda \in [0,\infty)^N$ such that $H_i(\lambda) > 1$ for all $i$, we need to show that the CSP is feasible. Let $p = \alpha\lambda$ and let $\epsilon = \alpha$ for some $\alpha > 0$. For $\alpha$ sufficiently small, $p$ is a probability vector. In order to use Lemma 2.4, it suffices to satisfy the constraint for all $i$

$$\sum_j p_{i,j}\, h_{i,j}(p/\epsilon) - \sum_{j,j'} p_{i,j}\, p_{i,j'} \ge \epsilon. \tag{4}$$

Let us fix a block $A_i$. Suppose we allow $\alpha \to 0$. In this case, (4) will be satisfied for some $\alpha > 0$ sufficiently small if we have

$$\sum_j \lambda_{i,j} \cdot P^*_\lambda(\neg\exists Y_{i,j} \mid Z_{i,j} = 1) > 1.$$

As there are only finitely many blocks, there is $\alpha > 0$ sufficiently small which satisfies all constraints simultaneously.

In this case, we claim that there is a positive probability of satisfying $\mathrm{Assigned}(A_i)$, $\overline{B_k}$ for all blocks $A_i$ and all bad events $B_k$, when we assign variables $Z$ independently Bernoulli-$p$. First, $P(\neg\exists Y) \ge \prod_{x \in [N]} P(Z_x = 0)$, since no $B_k$ is a tautology; the latter product is clearly positive for small-enough $\alpha$. Next, by Lemma 2.4 and Bayes' Theorem,

$$P(\mathrm{Assigned}(1) \wedge \cdots \wedge \mathrm{Assigned}(n) \mid \neg\exists Y) \ge \prod_{i=1}^{n} \epsilon > 0.$$

In particular, there is a configuration of the $Z_x$ which satisfies all the constraints simultaneously.

3. COMPUTING $P^*$

In the usual LLL, one can fully specify the underlying random process, so one can compute the probability of a bad event fairly readily. In the assignment LLL, we know that the random variables must satisfy their UNC constraints, but we do not know the full distribution of these variables. This can make it much harder to bound the probability of a bad event.

Roughly speaking, the UNC constraints force the underlying variables to be negatively correlated (or independent). For some types of bad events, this is enough to give strong bounds:

Lemma 3.1. For random variables $Z_{x_1}, \ldots, Z_{x_k}$, let $\mu = \lambda_{x_1} + \cdots + \lambda_{x_k}$. Then

$$P^*_\lambda(Z_{x_1} + \cdots + Z_{x_k} \le \mu(1+\delta)) \ge 1 - \Big(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\Big)^{\mu}.$$

Proof. The Chernoff upper-tail bound applies to negatively correlated random variables [23].

Suppose we have an increasing bad event $B$ which depends on $Z_{x_1}, \ldots, Z_{x_k}$. We are given $\lambda \in [0,1]^N$. Note that $\Omega$ is a probability distribution on $Z_1, \ldots, Z_N$, but we abuse notation to view it as a distribution on $Z_{x_1}, \ldots, Z_{x_k}$ as well. We describe a generic algorithm to compute $P^*_\lambda(B)$ (we sometimes just denote $P^*_\lambda(\cdot)$ as $P^*(\cdot)$).

As $B$ is an increasing event, we can write $B = a_1 \vee \cdots \vee a_n$, where each $a_i \in B$ is an atomic event, and is minimal in the sense that for all $a' < a_i$ we have $a' \notin B$. We assume $0 \notin B$, as otherwise $B$ is a tautology and $P^*(B) = 1$.

We say that a probability distribution $\Omega$ on the variables $Z_{x_1}, \ldots, Z_{x_k}$ is worst-case if $P^*_\lambda(B) = P_\Omega(B)$. By compactness, such an $\Omega$ exists. The basic idea of our algorithm is to view each $P_\Omega(\omega)$ as an unknown quantity, where $\omega \in \Omega$ is an atomic event. We write $q_\omega = P_\Omega(\omega)$ for simplicity. In this case, $P_\Omega(B)$ is the sum

$$P_\Omega(B) = \sum_{\omega \in B} q_\omega.$$

Furthermore, the UNC constraints can also be viewed as linear constraints in the variables $q_\omega$. For each $x'_1, \ldots, x'_{k'}$, we have the constraint

$$\sum_{\omega:\ \omega_{x'_1} = \cdots = \omega_{x'_{k'}} = 1} q_\omega \le \lambda_{x'_1} \cdots \lambda_{x'_{k'}}.$$

This defines an LP, in which we maximize the objective function $P^*(B) = \sum_{\omega \in B} q_\omega$ subject to the UNC constraints. The size of this LP may be enormous, potentially including $2^k$ variables and $2^k$ constraints. However, in many applications $k$ is a parameter of the problem which can be regarded as constant, so such a program may still be tractable. In general, we can reduce the number of variables and constraints of this LP with the following observations.

Proposition 3.2. There is a distribution $\Omega$ with $P_\Omega(B) = P^*(B)$ and such that $\Omega$ is only supported on the atomic events $\{0, a_1, \ldots, a_n\}$.

Proof. Suppose $\Omega$ satisfies the UNC constraints. Then define $\Omega'$ as follows. For each atomic event $\omega$, if $\omega \notin B$, then shift the probability mass of $\omega$ to 0; otherwise, shift the probability mass of $\omega$ to any minimal event $a_i$ underneath it. This preserves all UNC constraints as well as the objective function.

For some types of bad events, there are certain symmetries among classes of variables. In this case, one can assume that the distribution $\Omega$ is symmetric in these variables; hence the probabilities of all such events can be collapsed into a single variable.

Proposition 3.3. Given a group $G \subseteq S_k$, where $S_k$ is the symmetric group on $k$ letters, define the group action of $G$ on a probability distribution $\Omega$ by permutation of the indices and on $\lambda$ by permutation of the coordinates. Suppose $B, \lambda$ are closed under the action of $G$. Then there is a worst-case probability distribution $\Omega$ which is closed under $G$. For this probability distribution $\Omega$, we only need to keep track of a single unknown quantity $q'$ for each orbit of $G$.

Proof. Given a worst-case $\Omega$, let $\Omega' = \frac{1}{|G|} \sum_{g \in G} g\Omega$. As $B$ is closed under $G$, each of the distributions $g\Omega$ has the same probability of the bad event $B$. Since $\lambda$ is closed under $G$, all the UNC constraints are preserved in each $g\Omega$.

We will show how to use these techniques to compute $P^*$ for various applications in later sections.

4. THE LLL DISTRIBUTION

If we are given a distribution $\lambda$ satisfying $H_i(\lambda) > 1$, then we know that there exists a configuration which satisfies all constraints. Furthermore, such a configuration can be found by drawing the indicators $Z$ as independent Bernoulli-$\alpha\lambda$, for some $\alpha > 0$ sufficiently small.

We may wish to learn more about such configurations, other than that they exist. We can use the probabilistic method, by defining an appropriate distribution on the set of feasible configurations. We define the LLL-distribution to be the distribution on feasible configurations $\Gamma$ that is induced by the following process:

1. Draw each $Z_x$ independently with probability $\alpha\lambda_x$.

2. If the resulting configuration $\Gamma$ does not satisfy all constraints $\mathrm{Assigned}(A)$, $\neg\exists Y$, return to step (1).

3. Now each block has at least one element selected. If any block has more than one element, arbitrarily discard all but one.

4. Return the resulting feasible configuration $\Gamma'$.

We refer to step (3) as the alteration step and step (2) as the configuration step.

For $\alpha > 0$ sufficiently small, this process terminates with positive probability, returning a feasible configuration $\Gamma'$. Such a configuration has exactly one element selected from each block. We refer to this distribution as $L_\alpha$.

Note that there are finitely many configurations, and for each of these the probability $P(\Gamma \mid L_\alpha)$ can be viewed as a rational function of the parameter $\alpha$. Hence as $\alpha \to 0$, $P(\Gamma \mid L_\alpha)$ must converge. Furthermore, since for any $\alpha > 0$ sufficiently small, $P(\Gamma \mid L_\alpha)$ defines a probability distribution, so must the limit as $\alpha \to 0$. We define the distribution $L$ to be the limiting distribution as $\alpha \to 0$, and refer to $L$ as the LLL distribution. The reason for this limiting step is that typically the distribution $L$ gives simpler and tighter bounds than any $L_\alpha$. Note however that there is no explicit process that samples from $L$.

We show that in the LLL distribution, the probability that an element $x$ is accepted does not differ too much from $\lambda_x$:

Theorem 4.1. Suppose we are given real numbers $\lambda \in [0,1]^N$ such that for all $i$ we satisfy $H_i(\lambda) > 1$. Then, the distribution $L$ is well-defined. Also, for any $i, j$ we have

$$\frac{\lambda_{i,j}\, h_{i,j}(\lambda)}{\lambda_{i,j}\, h_{i,j}(\lambda) + \sum_{j' \ne j} \lambda_{i,j'}} \le P(Z_{i,j} = 1 \mid L) \le \lambda_{i,j}.$$

Proof. As in Theorem 2.2, we set $p = \alpha\lambda$ and $\epsilon = \alpha$ for some $\alpha > 0$ sufficiently small so that

$$\sum_j p_{i,j}\, h_{i,j}(p/\epsilon) - \sum_{j,j'} p_{i,j}\, p_{i,j'} \ge \epsilon.$$

First, we show an easy upper bound on $P(Z_{i,j} = 1)$:

$$P(Z_{i,j} = 1 \mid L_\alpha) \le P(Z_{i,j} = 1 \mid \mathrm{Assigned}(A), \neg\exists Y) \le p_{i,j}/\epsilon \ \text{(by Lemmas 2.3, 2.4)} = \lambda_{i,j}.$$

Next, a lower bound on the probability of having $Z_{i,j} = 1$ after the alteration step is given by

$$P(Z_{i,j} = 1 \mid L_\alpha) \ge P(Z_{i,j} = 1 \mid \mathrm{Assigned}(A), \neg\exists Y) - 2 \sum_{j < j'} P(Z_{i,j} = Z_{i,j'} = 1 \mid \mathrm{Assigned}(A), \neg\exists Y).$$

Let us examine these in turn. We want to lower-bound $P(Z_{i,j} = 1 \mid \mathrm{Assigned}(A), \neg\exists Y)$ and to upper-bound $P(Z_{i,j} = Z_{i,j'} = 1 \mid \mathrm{Assigned}(A), \neg\exists Y)$. We define the event

$$E \equiv \mathrm{Assigned}(A - i) \wedge \neg\exists Y.$$

We have

$$P(Z_{i,j} = Z_{i,j'} = 1 \mid \mathrm{Assigned}(A), \neg\exists Y) = \frac{P(Z_{i,j} = Z_{i,j'} = 1 \wedge \mathrm{Assigned}(i) \mid E)}{P(\mathrm{Assigned}(i) \mid E)} \le \frac{p_{i,j}\, p_{i,j'}}{\epsilon} \ \text{(by Lemmas 2.3, 2.4)} = O(\alpha).$$

Next, we lower-bound $P(Z_{i,j} = 1 \mid \mathrm{Assigned}(A), \neg\exists Y)$ as

$$P(Z_{i,j} = 1 \mid \mathrm{Assigned}(A), \neg\exists Y) = \frac{P(Z_{i,j} = 1 \mid E)}{P(\mathrm{Assigned}(i) \mid E)} = \frac{P(Z_{i,j} = 1 \mid E)}{\sum_{j'} P(Z_{i,j'} = 1 \mid E) - \sum_{j',j''} P(Z_{i,j'} = Z_{i,j''} = 1 \mid E)}$$

$$\ge \frac{p_{i,j}\, P^*_{p/\epsilon}(\neg\exists Y_{i,j} \mid Z_{i,j} = 1)}{p_{i,j}\, P^*_{p/\epsilon}(\neg\exists Y_{i,j} \mid Z_{i,j} = 1) + \sum_{j' \ne j} p_{i,j'} - O(\alpha^2)} = \frac{\lambda_{i,j}\, h_{i,j}(\lambda)}{\lambda_{i,j}\, h_{i,j}(\lambda) + \sum_{j' \ne j} \lambda_{i,j'}} - O(\alpha).$$

This implies that the probability of $Z_{i,j} = 1$ in the distribution $L_\alpha$ is at least

$$P(Z_{i,j} = 1 \mid L_\alpha) \ge \frac{\lambda_{i,j}\, h_{i,j}(\lambda)}{\lambda_{i,j}\, h_{i,j}(\lambda) + \sum_{j' \ne j} \lambda_{i,j'}} - O(\alpha) - O(\alpha).$$

As $\alpha \to 0$, we have

$$P(Z_{i,j} = 1 \mid L) = \lim_{\alpha \to 0} P(Z_{i,j} = 1 \mid L_\alpha) \ge \frac{\lambda_{i,j}\, h_{i,j}(\lambda)}{\lambda_{i,j}\, h_{i,j}(\lambda) + \sum_{j' \ne j} \lambda_{i,j'}}.$$

Suppose we are given non-negative weights $w_x \ge 0$. By drawing from the LLL distribution, we can show the existence of configurations with high or low weights:

Lemma 4.2. Suppose we are given $\lambda \in [0,1]^N$ such that for all blocks, $H_i(\lambda) > 1$. Fix a block $i$. To simplify notation, let $|A_i| = l$ with the elements sorted by weight so that $0 \le w_{i,1} \le w_{i,2} \le \cdots \le w_{i,l}$. We define the upper and lower weights for block $i$ as

$$W_i^- = \min_{1 < k < l} \frac{\sum_{j=1}^{k} \lambda_{i,j}\, w_{i,j} + \sum_{j=k+1}^{l} \lambda_{i,j}\, h_{i,j}\, w_{i,j}}{\sum_{j=1}^{k} \lambda_{i,j} + \sum_{j=k+1}^{l} \lambda_{i,j}\, h_{i,j}}$$

$$W_i^+ = \max_{1 < k < l} \frac{\sum_{j=1}^{k} \lambda_{i,j}\, w_{i,j} + \sum_{j=k+1}^{l} \lambda_{i,j}\, h_{i,j}\, w_{i,j}}{\sum_{j=1}^{k} \lambda_{i,j} + \sum_{j=k+1}^{l} \lambda_{i,j}\, h_{i,j}}$$

Then the expected weight of block $i$ in the LLL distribution lies in $[W_i^-, W_i^+]$.

Proof. The proof is very similar to Theorem 4.1.

Corollary 4.3. Suppose for all blocks $H_i(\lambda) > 1$. Then there is a feasible configuration with weight at most $W^+ = \sum_{i=1}^{n} W_i^+$. There is also a feasible configuration with weight at least $W^- = \sum_{i=1}^{n} W_i^-$.

5. DIRECT APPLICATIONS

5.1 Independent transversals

It is well-known that a graph $G$ with $n$ vertices and average degree $d$ has an independent set of size at least $n/(d+1)$. What if we want independent sets with much more structure? Suppose we are given a partition of the vertices into sets $V = V_1 \sqcup \cdots \sqcup V_k$, and we wish to select one vertex from each set so that the resulting set is an independent set. (Such a set is known as an independent transversal.)

One important parameter for the independent transversal problem is the size of the classes, i.e., requiring $|V_i| \ge b$ for all $i = 1, \ldots, k$. Alon gives a short LLL-based proof that a sufficient condition for such an independent transversal to exist is to require $b \ge 2e\Delta$ [1], where $\Delta$ is the maximum degree of any vertex in the graph. Haxell provides an elegant topological proof that a sufficient condition is $b \ge 2\Delta$ [15]. The condition of [15] is existentially optimal, in the sense that no condition of the form $b \ge c\Delta$ is possible for $c < 2$.

These bounds are all given in terms of the maximum degree $\Delta$, which can be a crude statistic. The proof of [15] adds vertices one-by-one to partial transversals, which depends very heavily on bounding the maximum degree of any vertex. Suppose we let $d$ denote the maximum average degree of any class $V_i$. This is a more flexible statistic than $\Delta$. Using Pegden's version of the algorithmic LLL, we can show that $b \ge 4d$ suffices [24] (and this is constructive). We will prove this same result using our version of the LLL. This proof offers no advantages over Pegden's approach, but illustrates how our LLL can be applied in practice.

Theorem 5.1.

independent set of weight at least $\frac{w(V)}{\Delta+1}$. We can also give a similar bound on the weight of an independent transversal:

Theorem 5.2. Suppose we have a graph $G = (V,E)$ with $V$ partitioned into classes of size $b > 4\Delta$. Then, $G$ has an independent transversal of weight at least

$$w(V) \cdot \frac{\sqrt{b-4\Delta} + \sqrt{b}}{\sqrt{b-4\Delta} + \sqrt{b}\,(2b-1)},$$

which approaches $w(V)/b$ for large $b$.

Proof. Define

$$\alpha^* = \frac{1 - \sqrt{\frac{b-4\Delta}{b}}}{2\Delta}.$$

Suppose we set $\vec{\lambda} = \alpha$, where $\alpha = \alpha^* + \epsilon$. Then, for $\epsilon > 0$ sufficiently small, we satisfy the constraint of Corollary 4.3. Hence there exists an independent transversal of weight at least $W^-$.

Our bound for $W_i^-$ depends in a rather complicated way on the distribution of weights within a block. If our goal is to show a lower bound of the form $c\, w(V)$ for some constant $c$, then the worst-possible distribution of weights is that a single vertex in each block has all the weight. In this case, our bound $W^-$ gives us that

$$W^- \ge \sum_i \frac{\alpha h\, w(V_i)}{(b-1)\alpha + \alpha h} = w(V)\, \frac{1 - \alpha\Delta}{b - \alpha\Delta}.$$

Summing over all blocks, we see that the expected weight approaches arbitrarily close to $\frac{w(V)(1 - \alpha^*\Delta)}{b - \alpha^*\Delta}$ as $\epsilon \to 0$. As the set of all independent transversals is finite, this implies the existence of an independent transversal of weight at least

$$w(V) \cdot \left(\frac{\sqrt{b-4\Delta} + \sqrt{b}}{\sqrt{b-4\Delta} + \sqrt{b}\,(2b-1)}\right).$$

5.2 Chromatic capacity
Suppose we have a graph G=(V,E) whose R¨odl has shown that for any acyclic digraph D, there ex- vertex set ispartitionedasV =V (cid:116)···(cid:116)V , where foreach 1 k ists an undirected graph D∗ such that all of the latter’s i,(i)theaveragedegreeoftheverticesinV isatmostd,and i acyclic orientations contain D as an induced subgraph [26]. (ii) |V |≥b>4d. Then G has an independent transversal. i Cochand & Duchet introduced the notion of chromatic ca- Proof. Ifany|V |isgreaterthanb,wemaydiscarditshighest- pacity to explicitly construct D∗, using graphs that have i degree vertices to bring it down to size b exactly, while de- a large value for this parameter [8]. Given an undirected creasing its average degree. In this way, we may assume multi-graphG,thechromaticcapacity χcap(G)isthelargest that |V | = b for all i. Our constraint is that for each edge integertsuchthatthereinanedget-labelingsuchthat,for i f ∈E, at most one incident vertex is selected. To use The- any vertex t-labeling, some edge shares the same color as orem2.2,wesetλ=β(cid:126) forsomeβ >0tobechosenshortly. both its endpoints. That is, Fix any v ∈ V with degree dv, from a class Vi. Then the ∃f :E →[t]∀g:V →[t]∃(cid:104)u,v(cid:105)∈E g(u)=g(v)=f(u,v) event that all bad events Y are avoided, given that v itself v isselected,isjusttheeventthatnoneighborofvisselected. See [6, 9] for further study of this para√meter. This occurs with probability hv ≥1−dvβ. So we have It is shown in [3, 8] that χcap(G) ≤ 2e∆−e−1. We (cid:88) can use our results for independent trave√rsals to show that H (λ)= β(1−d β)≥bβ(1−dβ). that the chromatic capacity is at most 2 ∆. (A value pro- i v √ v∈Vi portionalto ∆issometimesofthecorrectorder,asshown by the complete graph [8, 9].) Setting β = 1 shows that b > 4d is a sufficient condition 2d for an independent transversal to exist. Theorem5.3. Let∆denotethemaximumdegreeofagraph G√. 
Then for any graph G, its chromatic capacity is at most We can also use our weighting condition to bound the 2 ∆. weight ofsuchanindependenttransversal,whichisnotpos- sible using [15]. Suppose that we are given weights w for Proof. GivengraphGwithanedge-labelingf ont+1colors, v eachvertexofG. ThereisasimpleargumentthatGhasan constructthegraphH asfollows. Foreachvertexv∈Gand eachcolorj ∈[t+1],constructavertex(v,j)∈H. Foreach where for all k we have bk ≥b(cid:48)k(1+d−c1)+c3(cid:112)b(cid:48)klogd and edge e = (cid:104)u,v(cid:105) ∈ G with color j, construct an edge in H b(cid:48) ≥c logd. k 2 between (u,j) and (v,j). Now partition the vertices of H bygroupingallthevertices(v,1),...,(v,t+1)intoasingle Proof. We suppose d > 1, as when d = 1 the IP is trivial. class. Letusfixc1,c2 >0;wewillshowhowtochoosec3appropri- We can now observe the following simple facts: ately. Define (cid:15)=d−c1. Suppose we have a feasible solution y to the relaxed linear program. In the notation of Theo- 1. Any independent transversal of H corresponds to a rem 2.2, set λ = (1+(cid:15))y and let Z be the associated i,j i,j unique vertex-labeling of G which avoids f(u,v) = random vector. (In case λ >1, we set x =1; this only i,j i,j g(u)=g(v), and vice-versa. helps our analysis. Although we do not discuss the details here, the reader can assume that λ ≤1 for simplicity.) i,j 2. The classes in H contain exactly t+1 vertices. For any given pair (i,j), there are most d constraints that involve Z ; consider any such constraint k. In or- 3. For each vertex v ∈ G, the average degree of the cor- i,j der to avoid the bad event conditional on Z = 1, we responding class in H is dv . i,j t+1 must have the remaining variables in that constraint sum to at most b −1. The UNC constraints apply to these By Theorem 5.1, as long as t+1 > 4 ∆ , there exists k t+1 variables. 
So the sum of all such variables has mean µ = alanbeinlidnegpoefndGe.ntThtriasnissvseartsiaslfieodf Hwhaenndth=en2c√e∆a.valid vertex- (cid:80)(i(cid:48),j(cid:48))(cid:54)=(i,j)ak,i(cid:48),j(cid:48)(1+(cid:15))yi(cid:48),j(cid:48) ≤ (1+(cid:15))b(cid:48)k. We can bound the probability that such a sum deviates from its mean by a Chernoff bound under negative correlation [23], where We can use this result almost immediately to improve a exp(x) denotes ex: related bound of Archer [3] on the upper chromatic number ofthereals. Theupperchromaticnumberofametricspace (b −µ)2 P∗(bad event k|Z =1)≤exp(− k ) S is related to distance-based color-constraints on S [13]. λ i,j (2+(b /µ−1))µ k The argument of [3] uses the chromatic capacity to show (b −(1+(cid:15))b(cid:48))2 thatthe“kthupperchromaticnumber”χˆ(k)oftherealsunder ≤exp(− k k ) theEuclideandistance,isatmost(cid:100)4ek(cid:101). Ourboundonthe (2+((1+b(cid:15)k)b(cid:48) −1))b(cid:48)k k chromatic capacity above, shows that χˆ(k)(R)≤8k. c2logd ≤exp(− 3 ) (cid:112) 5.3 Integrality gap of column-sparse packing (2+c22 b(cid:48)klogd)(1+d−c1) problems By our assumption on b(cid:48) ≥ c logd and our assumption k 2 Consider our family of CSPs where we have a series of d > 1, we can simplify this as exp(−clogd), where c is a linear packing constraints: indexed by k and of the form constant which increases unboundedly with c3. Hence, in “(cid:80) a x ≤ b ”, with non-negative a ,b . In addi- the notation of Definition 2.5, i,j k,i,j i,j k k,i,j k triioesn,otfhseertesiAs t,h.e..u,sAualwaistshigtnhmeecnotncsotrnasitnrtasin(cid:80)t, namxely a=s1e-. hi,j(λ)≥1− (cid:88) exp(−clogd)≥1−d1−c. When does s1uch an innteger linear program hajv∈eAia fie,jasible k: ak,t=1 soluion? Suppose we wish to solve this via LP relaxation. 
Now, summing over all j ∈A , we have i One technique is to start with the LP where the integrality constraints on xi,j ∈ {0,1} are relaxed to x(cid:48)i,j ∈ [0,1], and Hi(λ)≥(cid:88)λi,j(1−d1−c) withthepackingconstraintstightenedto(cid:80)i,jak,i,jx(cid:48)i,j ≤b(cid:48)k j for some b(cid:48)k ≤bk. ≥(cid:88)(1+d−c1)yi,j(1−d1−c) The notion of“covering radius”d here is obvious: as in j [4, 17], we assume that for each (i,j), there are at most ≥(1+d−c1)(1−d1−c), d constraints k with a (cid:54)= 0. Consider, for notational k,i,j simplicity, the case where ak,i,j ∈{0,1}. We will show that whichcanbemadelargerthan1bychoosingc3 sufficiently a bound of large, for any value d>1. Hence, by Theorem 2.2, there is b ≥(1+ 1 )b(cid:48) +Ω((cid:112)b(cid:48) logd) a configuration x that satisfies all the IP constraints. k poly(d) k k 6. PACKETROUTING suffices to guarantee the existence of an integral solution. We state this theorem (with notation such as d etc. as 6.1 Overview above) very carefully, as the various parameters depend on Wewillexaminethepacketroutingprobleminmuchmore each other delicately: detail than the other problems we have considered. This Theorem 5.4. Let c ,c > 0 be any constants. There is analysiswilluseourassignmentLLL.However,manyofthe 1 2 some constant c ≥0 with the following property. If the LP improvementswemakewillbemuchmoreproblem-specific. 3 We will improve some choices of parameters, as well as ex- ∀i, (cid:88) yi,j ≥1; ∀k, (cid:88)ak,i,jyi,j ≤b(cid:48)k; ∀(i,j), yi,j ∈[0,1] aminemorecloselyinstancesinwhichthecongestioniscon- j∈Ai i,j trolled on very small scales. So, we will first examine the packet-routing problem using the standard LLL; this ex- is satisfiable, then so is the integer program tends the analysis of [29] and [25]. We will then show how (cid:88) (cid:88) ∀i, x =1; ∀k, a x ≤b ; ∀(i,j), x ∈[0,1] toenhancethistohandlethenonlinearincreasingfunctions i,j k,i,j i,j k i,j that arise, by using our assignment LLL. 
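As a concrete illustration of what "using the standard LLL" amounts to in this section: Lemma 6.2 in Section 6.2 reduces each refinement step to the numerical condition $e \times P(\mathrm{Binomial}(C, \frac{i'}{i-i'}) > C') \times (C m i^2 + 1) < 1$. The following Python sketch evaluates the left-hand side of that condition. It is only a sanity-check under stated assumptions: the function names `binomial_tail` and `lemma_6_2_lhs` are our own, the binomial tail is summed numerically in floating point rather than bounded analytically, and the sample parameters ($C = 1385$, $i = 1024$, $i' = 2$, $m = 64$) are the ones quoted for the last refinement step in the proof of Lemma 6.4.

```python
import math


def binomial_tail(n: int, p: float, threshold: int) -> float:
    """Return P(X > threshold) for X ~ Binomial(n, p), with 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for k in range(threshold + 1, n + 1):
        # log of C(n, k) * p^k * (1 - p)^(n - k), computed via lgamma
        # so that the huge binomial coefficients never overflow a float.
        log_term = (
            math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * log_p + (n - k) * log_q
        )
        total += math.exp(log_term)
    return total


def lemma_6_2_lhs(C: int, i: int, i_prime: int, m: int, C_prime: int) -> float:
    """Left-hand side of the condition in Lemma 6.2 (hypothetical helper)."""
    p = i_prime / (i - i_prime)  # chance that one packet lands in the i'-interval
    return math.e * binomial_tail(C, p, C_prime) * (C * m * i**2 + 1)


if __name__ == "__main__":
    for C_prime in (19, 20):
        lhs = lemma_6_2_lhs(C=1385, i=1024, i_prime=2, m=64, C_prime=C_prime)
        print(f"C' = {C_prime}: left-hand side = {lhs:.3f}, condition holds: {lhs < 1}")
```

With these inputs the condition holds for $C' = 20$ but fails for $C' = 19$, which is consistent with the choice $C' = 20$ made for that step. The log-space summation is a design choice to avoid overflow; an exact rational computation would serve equally well.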
We begin by reviewing the basic strategy of [29], and its improvement by [25]. The former has a very readable overview of our basic strategy, and we will not include all the details which are covered there. Our choice of parameters will be slightly improved from [29] and [25]. We note that [25] studied a more general version of the packet-routing problem, so their choice of parameters was not (and could not be) optimized.

We are given a graph $G$ with $N$ packets. Each packet has a simple path, of length at most $D$, to reach its endpoint vertex (we refer to $D$ as the dilation). In any timestep, a packet may wait at its current position, or move along the next edge on its path. Our goal is to find a schedule of smallest makespan in which, in any given timestep, an edge carries at most a single packet. We define the congestion $C$ to be the maximum, over all edges, of the number of packets scheduled to traverse that edge. It is clear that $D$ and $C$ are both lower bounds for the makespan, and [20] has shown that in fact a schedule of makespan $O(C+D)$ is possible. [29] provided an explicit constant bound of $39(C+D)$, as well as describing an algorithm to find such a schedule.

While the final schedule only allows one packet to cross an edge at a time, we will relax this constraint during our construction. We consider “infeasible” schedules, in which each packet follows its path but where arbitrarily many packets may pass through each edge at each timestep.² We define an interval to be a consecutive set of times in our schedule, and the congestion of an edge in a given interval to be the number of packets crossing that edge. If we are referring to intervals of length $i$, then we define a frame to be an interval which starts at an integer multiple of $i$.

² Unless specified as “feasible”, a “schedule” will refer to a possibly infeasible schedule.

One can easily form an (infeasible) schedule with delay $D$ and overall congestion $C$. Initially, this congestion may “bunch up” in time, that is, certain edges may have very high congestion in some timesteps and very low congestion in others. So the congestion is not bounded on any smaller interval than the trivial interval of length $D$. During our construction, we will “even out” the schedule, bounding the congestion on successively smaller intervals.

Ideally, one would eventually finish by showing that on each individual timestep (i.e., interval of length 1), the congestion is roughly $C/D$. In this case, one could turn such an infeasible schedule into a feasible schedule, by simply expanding each timestep into $C/D$ separate timesteps.

As [25] showed, it suffices to control the congestion on intervals of length 2. Given our infeasible schedule, we can view each interval of length 2 as defining a new subproblem. In this subproblem, our packets start at a given vertex and have paths of length 2. The congestion of this subproblem is exactly the congestion of the schedule. Hence, if we can schedule problems of length 2, then we can also schedule the 2-intervals of our expanded schedule.

Proposition 6.1. ([25]) Suppose there is an instance $G$ with delay $D = 2$ and congestion $C$. Then there is a feasible schedule of makespan $C+1$. Furthermore, such a schedule can be formed in polynomial time.

The work of [25] used this to improve the bound on the makespan to $23.4(C+D)$; it also speculated that by examining the scheduling for longer, but still small, delays, one could further improve the general packet routing. Unfortunately, we are not able to show a general result for small delays such as $D = 3$. However, as we will see, the schedules that are produced in the larger construction of [29] are far from generic; rather they have relatively balanced congestion on small scales. We will see how to take advantage of this balanced structure to improve the scheduling.

We next review the general construction of [29].

6.2 Using the LLL to find a schedule

The general strategy for this construction is to add random delays to each packet, and then to allow the packet to move through each of its edges in turn without hesitation. This effectively homogenizes the congestion across time. We have the following lemma:

Lemma 6.2. Let $i' < i$, and let $m, C'$ be non-negative integers. Suppose there is a (possibly infeasible) schedule $S$ of length $L$ such that every interval of length $i$ has congestion at most $C$. Suppose that we have

$$e \times P\!\left(\mathrm{Binomial}\!\left(C, \frac{i'}{i - i'}\right) > C'\right) \times (C m i^2 + 1) < 1.$$

Then there is a (possibly infeasible) schedule $S'$ of length $L' = L(1 + 1/m) + i$, in which every interval of length $i'$ has congestion $\le C'$. Furthermore, $S'$ can be constructed in polynomial time.

Proof. We break the schedule $S$ into frames of length $F = mi$, and refine each separately. Within each frame, we add a random delay of length $i - i'$ to each packet separately. Let us fix an $F$-frame for the moment. Associate a bad event to each edge $f$ and $i'$-interval $I$, that the congestion in that interval and edge exceeds $C'$.

For each $I, f$, there are at most $C$ possible packets that could cross, and each does so with probability $p = \frac{i'}{i - i'}$. Hence the probability of the bad event is at most the probability that a Binomial random variable with $C$ trials and probability $p$ exceeds $C'$.

Next consider the dependency. Given an edge $f$ and $i'$-interval $I$, there are at most $C$ packets crossing it, each of which may pass through up to $mi$ other edges in the frame. We refer to the combination of a specific packet passing through a specific edge as a transit. Now, for each transit, there are (depending on the delay assigned to that packet) at most $i$ other $i'$-intervals in which this transit could have been scheduled. Hence the dependency is at most $Cmi^2$.

By the LLL, the condition in the hypothesis guarantees that there is a positive probability that the delays avoid all bad events. In this case, we refine each frame of $S$ to obtain a new schedule $S'$ as desired. We can use the algorithmic LLL to actually find such schedules $S'$ in polynomial time.

So far, this ensures that within each frame, the congestion within any interval of length $i'$ is at most $C'$. In the refined schedule $S'$ there may be intervals that cross frames. To ensure that these do not pose any problems, we insert a delay of length $i'$ between successive frames, during which no packets move at all. This step means that the schedule $S'$ may have length up to $L(1 + 1/m) + i$.

We note one important distinction between this analysis and that of Scheideler [29]. In Scheideler, one associates a bad event to each edge; we have a bad event corresponding to each edge and interval.

Using Lemma 6.2, we can transform the original problem instance (in which $C, D$ may be unbounded), into one in which $C, D$ are small. In order to carry out this analysis properly, one would need to develop a series of separate bounds depending on the sizes of $C, D$. To simplify the exposition, we will assume that $C, D$ are very large, in which case certain rounding effects can be disregarded. When $C, D$ are smaller, we can show stronger bounds but doing this completely requires an extensive case analysis of the parameters.

Lemma 6.3. Assume $C + D \ge 2^{896}$. There is a schedule of length at most $1.004(C+D)$ and in which the congestion on any interval of length $2^{24}$ is at most 17,040,600. Furthermore, this schedule can be produced in polynomial time.

Proof. Define the sequence $a_k$ recursively as follows.

$$a_0 = 256, \qquad a_{k+1} = 2^{a_k}.$$

There is a unique $k$ such that $a_k^{3.5} \le (C+D) < a_{k+1}^{3.5}$. By a slight variant on Lemma 6.2, one can add delays to obtain a schedule of length $C+D$, in which the congestion on any interval of length $i' = a_k^3$ is at most $C' = i'(1 + 4/a_k)$.

At this point, we use Lemma 6.2 repeatedly to control the congestion on intervals of length $a_j^3$, for $j = k-1, \dots, 0$. At each step, this increases the length of the resulting schedule from $L_j$ to $L_j(1 + 1/a_{j+1}) + a_j$, and increases the congestion on the relevant interval from $i(1 + 4/a_k)$ to

$$i\,(1 + 4/a_k) \prod_{j=0}^{k-1} (1 + 4/a_j) \left( \frac{1}{1 - (a_j/a_{j+1})^3} \right).$$

(We use the Chernoff bound to estimate the binomial tail in Lemma 6.2.)

For $C + D \ge a_k^{3.5}$, it is a simple calculation to see that the increase in length is from $C+D$ (after the original refinement) to at most $1.004(C+D)$. In the final step of this analysis, we are bounding the congestion of intervals of length $a_0^3 = 2^{24}$, and the congestion on such an interval is at most 17,040,600.

Furthermore, since all of these steps use the LLL, one can form all such schedules in polynomial time.

See [29] for a much more thorough explanation of this process.

Now that we have reduced to constant-sized intervals, we are no longer interested in asymptotic arguments (which use generic bounds such as the Chernoff bound), and we come down to specific numbers.

Lemma 6.4. There is a feasible schedule of length at most $10.92(C+D)$, which can be constructed in polynomial time.

Proof. For simplicity, we assume $C + D \ge 2^{896}$.

By Lemma 6.3, we form a schedule $S_1$, of length $L_1 \le 1.004(C+D)$, in which each interval of length $2^{24}$ has congestion at most 17,040,600.

Now apply Lemma 6.2 to $S_1$, with $m = 64$, $i' = 1024$, $C' = 1385$ to obtain a schedule $S_2$, of length $L_2 \le 1.0157 L_1 + 2^{24}$, in which each interval of length 1024 has congestion at most 1385.

Now apply Lemma 6.2 to $S_2$ with $m = 64$, $i' = 2$, $C' = 20$, to obtain a schedule $S_3$ of length $L_3 \le 1.0157 L_2 + 1024$, in which each frame of length 2 has congestion at most 20.

Now apply Proposition 6.1 to $S_3$, expanding each 2-frame to a feasible schedule of length 21. The total length of the resulting schedule is at most $\frac{21}{2} L_3 \le 10.92(C+D)$.

6.3 Better scheduling of the final 2-frame

Let us examine the last stage in the construction more closely. In this phase, we are dividing the schedule into intervals of length 2, and we want to control the congestion of each edge in each 2-frame.

For a given edge $f$ and time $t$, we let $c_t(f)$ denote the number of packets scheduled to cross that edge in the four time steps of the original (infeasible) schedule.

Suppose we have two consecutive 2-frames starting at time $t$. The reason for the high value of $C'$ in the final step of the above construction is that it is quite likely that $c_t + c_{t+1}$ or $c_{t+2} + c_{t+3}$ are much larger than their mean. However, it would be quite rare for both these bad events to happen simultaneously. We will construct a schedule in which we insert an “overflow” time between the 2-frames. This overflow handles cases in which either $c_t + c_{t+1}$ is too large or $c_{t+2} + c_{t+3}$ is too large.

Our goal will be to modify either of the 2-frames so as to ensure that the congestion is at most $T$. In order to describe our modification strategy, we first fix, for every packet and frame, a “first edge” and “second edge” in this frame. Some packets may only transit a single edge, which we will arbitrarily label as first or second. As we modify the schedule, some packets that initially had two transits scheduled will be left with only one; in this case, we retain the label for that edge. So, we may assume that every edge is marked as first or second and this label does not change.

We do this by shifting transits into the overflow time. For each 2-frame, there are two overflow times, respectively earlier and later. If we want to shift an edge to the later overflow time, we choose any packet that uses that edge as a second edge (if any), and reschedule the second transit to the later overflow time; similarly if we shift an edge to the earlier overflow time. See Figure 1.

[Figure 1 shows the schedules $S$ and $S'$, each drawn over timesteps 1 through 8.]

Figure 1: The packets in the original schedule $S$ are shifted into overflow times in the schedule $S'$.

Note that in the analysis of [25], the only thing that matters is the total congestion of an edge in each 2-frame. In deciding how to shift packets into the overflow times, we need to be careful to account for how often the edge appears as the first or second transit. If an edge appears exclusively as a “first edge”, we will only be able to shift it into the earlier overflow, and similarly if an edge appears exclusively as a “second edge”.

Keeping this constraint in mind, our goal is to equalize as far as possible the distribution of edges into earlier and later overflows. We do this by the following scheme: