Online Vector Scheduling and Generalized Load Balancing

Xiaojun Zhu*†, Qun Li†, Weizhen Mao† and Guihai Chen*
* State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, P. R. China
† Department of Computer Science, the College of William and Mary, Williamsburg, VA, USA
Email: {njuxjzhu,liqun,wm}@cs.wm.edu, [email protected]

Abstract—We give an approximation-preserving reduction from the vector scheduling problem (VS) to the generalized load balancing problem (GLB). The reduction bridges existing results for the two problems. Specifically, the hardness result for VS also holds for GLB, and any algorithm for GLB can be used to solve VS. Based on this, we obtain two new results. First, GLB does not admit constant-factor approximation algorithms that run in polynomial time unless $P = NP$. Second, there is an online algorithm (vectors arriving in an online fashion) that solves VS with approximation bound $e\log(md)$, where $e$ is the base of the natural logarithm, $m$ is the number of partitions and $d$ is the dimension of the vectors. The algorithm is borrowed from the GLB literature and is very simple in that each vector only needs to minimize the $L_{\ln(md)}$ norm of the resulting load. However, it is unclear whether this algorithm runs in polynomial time, due to the irrational and non-integer nature of $\ln(md)$. We address this issue by rounding $\ln(md)$ up to the next integer. We prove that the resulting algorithm runs in polynomial time with approximation bound $e\log(md)+\frac{e\log(e)}{\ln(md)+1}$, which is in $O(\ln(md))$. This improves the $O(\ln^2 d)$ bound of the existing polynomial-time algorithm for VS.

I. INTRODUCTION

Scheduling with costs is a very well studied problem in combinatorial optimization. The traditional paradigm assumes a single-cost scenario: each job incurs a single cost to the machine that it is assigned to. The load of a machine is the total cost incurred by the jobs it serves. The objective is to minimize the makespan, the maximum machine load. Vector scheduling and generalized load balancing extend this scenario in different directions.
Vector scheduling assumes that each job incurs a vector cost to the machine that it is assigned to. The load of a machine is defined as the maximum cost among all dimensions. The objective is to minimize the makespan. Vector scheduling is a multi-dimensional generalization of the traditional paradigm. It finds application in multi-dimensional resource scheduling in parallel query optimization [1], where it is not suitable to represent different requirements (such as CPU, memory, network resource, etc.) as a single scalar. To solve vector scheduling, there are three approximation algorithms [1]. Two of them are deterministic algorithms based on derandomization of a randomized algorithm, one providing $O(\ln^2 d)$ approximation^1, where $d$ is the dimension of the vectors, and the other providing $O(\ln d)$ approximation with running time polynomial in $n^d$, where $n$ is the number of vectors. The third algorithm is a randomized algorithm, which assigns each vector to a uniformly and randomly chosen partition. It gives $O(\ln dm/\ln\ln dm)$ approximation with high probability, where $m$ is the number of partitions (servers). For fixed $d$, there exists a polynomial time approximation scheme (PTAS) [1]. A PTAS has also been proposed for a wide class of cost functions (rather than max) [2].

Generalized load balancing was recently introduced to model the effect of wireless interference [3], [4]. Each job incurs costs to all machines, no matter which machine it is assigned to. The exact cost incurred by a job to a specific machine depends on which machine the job is assigned to. The load of a machine is the total cost incurred by all the jobs, instead of just the jobs it serves. This model is well suited for wireless transmission, since, in a wireless network, a user may influence all APs in its transmission range due to the broadcast nature of wireless signals. To solve the generalized load balancing problem, the current solution is an online algorithm, adapted from the recent progress in online scheduling on the traditional model [5]. The solution, though it provides good approximation, is rather simple: each job selects the machine that minimizes the $L_\tau$ norm of the resulting loads at all machines, where $\tau$ is a parameter.

To avoid confusion, we keep the two terms job and machine unchanged for generalized load balancing, while referring to job and machine in the vector scheduling model as vector and partition, respectively.

We make two contributions. First, we present an approach to encode any vector scheduling instance by an instance of the generalized load balancing problem. This encoding brings the recent progress in generalized load balancing into the vector scheduling domain, and carries the hardness result of vector scheduling over to the generalized load balancing problem. Second, the existing solution to GLB needs to compute the $L_{\ln l}$ norm ($l$ is the number of machines), but it is unclear whether this norm can be computed in polynomial time. We eliminate this uncertainty by rounding $\ln l$ up to the next integer, guaranteeing polynomial running time. In addition, we prove that the approximation loss due to rounding is small.

(The work was done when the first author was visiting the College of William and Mary.)
^1 In this paper, $e$ denotes the base of the natural logarithm, $\ln(\cdot)$ denotes the natural logarithm, and $\log(\cdot)$ denotes the logarithm base 2.

We conclude this section with the following two definitions.

A. Vector Scheduling

We are given positive integers $n, d, m$. There is a set $V$ of $n$ rational, $d$-dimensional vectors $p_1, p_2, \ldots, p_n$ from $[0,\infty)^d$. Denote vector $p_i = (p_{i1},\ldots,p_{id})$. We need to partition the vectors in $V$ into $m$ sets $A_1,\ldots,A_m$. The problem is to find a partition minimizing $\max_{1\le i\le m}\|A_i\|_\infty$, where $A_i = \sum_{j\in A_i} p_j$ is the sum of the vectors in $A_i$, and the infinity norm $\|A_i\|_\infty$ is the maximum element of the vector $A_i$. For the case $m \ge n$, there is a trivial optimal solution that assigns vectors to distinct partitions. Therefore, we only consider the case $m < n$.
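As a small illustration of the definition (the instance below is our own toy example, not one from the paper), consider $n=3$, $d=2$, $m=2$ with

$$p_1 = (1,2),\quad p_2 = (3,1),\quad p_3 = (2,2).$$

The partition $A_1=\{p_1,p_3\}$, $A_2=\{p_2\}$ yields the sums $A_1=(3,4)$ and $A_2=(3,1)$, so its makespan is $\max\{\|A_1\|_\infty,\|A_2\|_\infty\}=\max\{4,3\}=4$; no partition of these three vectors into two sets does better.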
For ease of presentation, we give an equivalent integer program formulation. Let $x_{ij}$ be the indicator variable such that $x_{ij}=1$ if and only if vector $p_i$ is assigned to partition $A_j$. Then

$$\|A_j\|_\infty = \max_{1\le k\le d} \sum_i x_{ij} p_{ik}.$$

The vector scheduling problem can be rewritten as

(VS)  $\min_x \max_{j,k} \sum_i x_{ij} p_{ik}$
subject to
$\sum_j x_{ij} = 1, \quad \forall i$
$x_{ij} \in \{0,1\}, \quad \forall i,j$

B. Generalized Load Balancing

This formulation first appears in [4]. We reformulate it with slightly different notations. There are a set $M$ of independent machines and a set $J$ of jobs. If job $i$ is assigned to machine $j$, it incurs a non-negative cost $c_{ijk}$ to machine $k$. The load of a machine is defined as its total cost. The problem is to find an assignment (or schedule) minimizing the makespan, the maximum load over all machines. This problem can be formally defined as follows.

(GLB)  $\min_x \max_k \sum_{i,j} x_{ij} c_{ijk}$
subject to
$\sum_j x_{ij} = 1, \quad \forall i\in J$
$x_{ij} \in \{0,1\}, \quad \forall i\in J, j\in M$

where $x \in \{0,1\}^{|J|\times|M|}$ is the assignment matrix with elements $x_{ij}=1$ if and only if job $i$ is assigned to machine $j$. The constraints require each job to be assigned to exactly one machine.

II. ENCODING VECTOR SCHEDULING BY GENERALIZED LOAD BALANCING

We first create a GLB instance for any VS instance, then prove their equivalence. At last, we discuss the hardness of GLB and extend the VS model.

A. Creating GLB Instances

Comparing VS to GLB, we find that they mainly differ in the subscripts of $\max$ and $\sum$. Our construction is inspired by this observation.

Given as input to VS the vector set $V$ and $m$ partitions, we construct the GLB instance as follows. We set the jobs $J=V$. For each partition $A_j$ and its $k$-th dimension, we construct a machine, denoted by the pair $(j,k)$. Thus, the constructed machine set is $M=\{(j,k) \mid j=1,2,\ldots,m \text{ and } k=1,2,\ldots,d\}$. We refer to a machine as a pair of indices so that we can easily map the machine back to its corresponding partition and dimension. For a machine $t=(j,k)$ where $t\in M$, we refer to the partition $j$ as $t_1$ and the dimension $k$ as $t_2$, i.e., $t=(t_1,t_2)$. There are in total $d$ machines ($t$ included) corresponding to the same partition as the machine $t$. We denote by $[t]$ the set of these machines, i.e., $[t]=\{(j,1),(j,2),\ldots,(j,d)\}$, where $j=t_1$. Among these $d$ machines, we select the first one, $(j,1)$, as the anchor machine, denoted by $\bar{t}$, such that a vector chooses partition $A_j$ in VS if and only if the corresponding job chooses $\bar{t}$ in the new GLB problem.

The cost $c_{ist}$ incurred by job $i$ on machine $t$ if $i$ chooses machine $s$ is defined as

$$c_{ist} = \begin{cases} p_{it_2} & \text{if } s=\bar{t} & (1)\\ \infty & \text{if } s\in[t] \wedge s\ne\bar{t} & (2)\\ 0 & \text{if } s\notin[t] & (3)\end{cases}$$

where (1) and (2) are for the situation where $s$ and $t$ correspond to the same partition; they force a job to select only the anchor machines. (3) is for the situation where $s$ and $t$ correspond to different partitions; in this case, there is no load increase.

The resulting GLB instance is defined as VS-GLB:

(VS-GLB)  $\min_{x'} \max_t \sum_{i,s} x'_{is} c_{ist}$
subject to
$\sum_s x'_{is} = 1, \quad \forall i\in J$
$x'_{is} \in \{0,1\}, \quad \forall i\in J, s\in M$

To avoid confusion with the general GLB problem, we intentionally use different notations $x'$, $s$ and $t$. The notation $i$ is kept since it is in one-to-one correspondence with the vectors in VS.

As an example, consider the case $d=1$. All vectors in VS have only one element, and there is only one machine in VS-GLB for each partition in VS. The objective of VS becomes $\max_j \sum_i x_{ij} p_{i1}$. On the other hand, the objective of VS-GLB is $\max_t \sum_i x'_{it} c_{itt} = \max_t \sum_i x'_{it} p_{i1}$. Since any machine $t$ corresponds to a distinct partition $A_j$, simply changing subscripts shows that the two problems are equivalent. For the case $d>1$, the proof is more involved; we defer it to Section II-B.

Theorem 1. The construction of VS-GLB can be done in polynomial time.

Proof: An instance of VS needs $\Omega(nd)$ bits. The constructed VS-GLB instance has $n$ jobs, $md$ machines and $n(md)^2$ costs. Since $m<n$, all three terms are polynomials in $n$ and $d$. The theorem follows immediately.
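To make the construction concrete, the following Python sketch builds the machine set and cost function of VS-GLB from a VS instance (the names, the 0-based indices and the use of `math.inf` are our own illustration choices, not notation from the paper).

import math

def build_vs_glb(vectors, m):
    """Sketch: construct the VS-GLB instance for a VS instance with
    vector set `vectors` (a list of d-dimensional tuples) and m partitions."""
    d = len(vectors[0])
    # One machine per (partition, dimension) pair; (j, 0) plays the role of
    # the anchor machine of partition j (the paper's (j, 1)).
    machines = [(j, k) for j in range(m) for k in range(d)]

    def cost(i, s, t):
        """c_{ist}: load added to machine t when job i chooses machine s."""
        anchor_of_t = (t[0], 0)
        if s == anchor_of_t:        # case (1): s is the anchor of t's partition
            return vectors[i][t[1]]
        if s[0] == t[0]:            # case (2): same partition but not the anchor
            return math.inf
        return 0.0                  # case (3): different partition, no load

    return machines, cost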
The following theorem shows that the constructed VS-GLB problem is equivalent to its corresponding VS problem. Let $T$ be a positive constant.

Theorem 2. There is a feasible solution $x$ to VS with objective value $T$ if and only if there is a feasible solution $x'$ to VS-GLB with the same objective value $T$.

This theorem shows that VS and its corresponding VS-GLB have the same optimal value. In addition, any $c$-approximation solution to VS-GLB, after transformation, is also a $c$-approximation solution to VS, and vice versa. We prove this theorem in Section II-B.

It is worth mentioning that VS-GLB is a special instance of GLB. Since VS-GLB is converted from VS, VS is a special instance of GLB, which implies that VS should have approximation algorithms at least as good as those for GLB. Unfortunately, on the contrary, the literature gives a better approximation algorithm for GLB than for VS. Hence, it is worthwhile to apply algorithms for GLB to VS.

B. Proof of Equivalence

We first study the properties of feasible solutions to VS-GLB in Lemma 1 and Lemma 2, and then prove Theorem 2.

Lemma 1. Given a feasible solution $x'$ to VS-GLB yielding objective value $T$, for any $i\in J$, we have
1) $\forall s\ne\bar{s}$, $x'_{is}=0$;
2) $\exists j$ such that for $s=(j,1)$, $x'_{is}=1$.

Proof: For 1), suppose $x'_{is}=1$ for some $s$ with $s\ne\bar{s}$. Then $x'_{is}c_{iss}=\infty>T$, contradicting $\max_t\sum_{i,s}x'_{is}c_{ist}=T$. For 2), since $\sum_s x'_{is}=1$, there exists some $s$ such that $x'_{is}=1$. Due to 1), we must have $s=\bar{s}$.

Lemma 2. Given a machine $t$, a job $i$, and a feasible solution $x'$ to VS-GLB yielding objective value $T$, we have $\sum_s x'_{is}c_{ist}=x'_{i\bar{t}}\,p_{it_2}$.

Proof: Recall that $[t]=\{(t_1,1),(t_1,2),\ldots,(t_1,d)\}$. We have
$$\sum_s x'_{is}c_{ist} = \sum_{s\in[t]} x'_{is}c_{ist} + \sum_{s\notin[t]} x'_{is}c_{ist} = \sum_{s\in[t]} x'_{is}c_{ist} \quad (4)$$
$$= x'_{i\bar{t}}\,c_{i\bar{t}t} \quad (5)$$
$$= x'_{i\bar{t}}\,p_{it_2} \quad (6)$$
where (4) is due to (3), (5) is due to Lemma 1, and (6) is due to (1).

With the two lemmas, we can now prove the equivalence.

Proof of Theorem 2: "$\Longrightarrow$" Given a feasible solution $x$ to VS, construct a feasible solution $x'$ to VS-GLB as follows. Set $x'_{i\bar{s}}=x_{i\bar{s}_1}$ for every anchor machine $\bar{s}$, and set all other entries to 0. We first show that $x'$ is a feasible solution to VS-GLB. Obviously, $x'$ is an integer assignment. We check that $\sum_s x'_{is}=1$. Observe that $x'_{is}=0$ if $s\ne\bar{s}$, so we only need to consider the $m$ machines $(1,1),(2,1),\ldots,(m,1)$. Since $x$ is a feasible solution to VS, for any $i\in V$ there exists one and only one partition $A_j$ such that $x_{ij}=1$. Our transformation sets $x'_{is}=1$ where $s=(j,1)$. So $\sum_s x'_{is}=1$.

Second, we prove that the objective values of the two feasible solutions are equal:
$$\max_t\sum_{i,s}x'_{is}c_{ist} = \max_t\sum_i\sum_{s\in[t]}x'_{is}c_{ist} \quad (7)$$
$$= \max_t\sum_i x'_{i\bar{t}}\,c_{i\bar{t}t} \quad (8)$$
$$= \max_t\sum_i x_{it_1}\,p_{it_2} = \max_{j,k}\sum_i x_{ij}p_{ik},$$
where (7) is due to the fact that $c_{ist}=0$ if $s\notin[t]$, and (8) is due to our assignment of $x'$, namely $x'_{is}=0$ if $s\ne\bar{s}$.

"$\Longleftarrow$" Given $x'$ for VS-GLB, construct $x$ for VS as follows. Set $x_{ij}=x'_{i\bar{s}}$ where $\bar{s}=(j,1)$. We show that $x$ is a feasible solution to VS. Due to Lemma 1, for any $i$ there exists one $s$ such that $x'_{is}=1$ and $s=\bar{s}$. Therefore, there exists one $j$ such that $x_{ij}=1$. On the other hand, there cannot be two $j$'s both with $x_{ij}=1$, otherwise $x'$ is not a feasible solution to VS-GLB.

For the objective value, we have
$$\max_{j,k}\sum_i x_{ij}p_{ik} = \max_{t=(j,k)}\sum_i x_{it_1}p_{it_2} = \max_t\sum_i x'_{i\bar{t}}\,p_{it_2} = \max_t\sum_{i,s}x'_{is}c_{ist}, \quad (9)$$
where (9) is due to Lemma 2. This completes our proof.
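The forward direction of Theorem 2 can be checked numerically on the toy instance from Section I-A; the short script below (our own illustration, using 0-based indices) builds the induced VS-GLB solution and confirms that the two makespans coincide.

import math

vectors = [(1.0, 2.0), (3.0, 1.0), (2.0, 2.0)]   # n = 3, d = 2
m, d = 2, 2
assign = [0, 1, 0]                               # vector i goes to partition assign[i]

# VS makespan: max over partitions j and dimensions k of the summed coordinates.
vs_obj = max(sum(vectors[i][k] for i in range(3) if assign[i] == j)
             for j in range(m) for k in range(d))

def cost(i, s, t):
    """c_{ist} from the construction, with (j, 0) as the anchor of partition j."""
    if s == (t[0], 0):
        return vectors[i][t[1]]
    return math.inf if s[0] == t[0] else 0.0

# Induced VS-GLB solution: job i chooses the anchor machine (assign[i], 0).
glb_obj = max(sum(cost(i, (assign[i], 0), (j, k)) for i in range(3))
              for j in range(m) for k in range(d))

assert vs_obj == glb_obj == 4.0   # both makespans equal 4, as Theorem 2 asserts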
C. Inapproximability for GLB

It has been proved that no polynomial time algorithm can give a $c$-approximation solution to VS for any $c>1$ unless $NP=ZPP$ [1]. Combining this result with Theorem 2, we have the following theorem.

Theorem 3. For any constant $c>1$, there does not exist a polynomial time $c$-approximation algorithm for GLB, unless $NP=ZPP$.

Proof: Since VS-GLB is a special instance of GLB, any $c$-approximation algorithm for GLB can be used to obtain a $c$-approximation solution to VS-GLB. By Theorem 2, any $c$-approximation solution to VS-GLB is also a $c$-approximation solution to the corresponding VS. Therefore, the approximation algorithm would also be a $c$-approximation algorithm for VS, a contradiction.

We can obtain a stronger result by relaxing the assumption $NP\ne ZPP$ to $P\ne NP$. (It is a relaxation because $P\subseteq ZPP\subseteq NP$.) This can be done by examining the inapproximability proof for VS [1]. That proof relies on the result that no polynomial time algorithm can approximate the chromatic number to within $n^{1-\epsilon}$ for any $\epsilon>0$ unless $NP=ZPP$. Recently, it has been proved that no polynomial time algorithm can approximate the chromatic number to within $n^{1-\epsilon}$ for any $\epsilon>0$ unless $P=NP$ [6]. Thus, we can safely change the assumption $NP\ne ZPP$ to $P\ne NP$.

Theorem 4. For any constant $c>1$, there does not exist a polynomial time $c$-approximation algorithm for GLB, unless $P=NP$.

D. Extending to generalized VS

Our construction of VS-GLB and our proof can be easily extended to a general version of vector scheduling. In the current VS definition, all machines (partitions) are identical, so any job (vector) incurs the same vector cost to all machines. The machines can be generalized to be heterogeneous, so that each job incurs a different vector cost on different machines. Formally, job $i$ incurs vector cost $p_i^{(j)}$ on machine $j$ if $i$ is assigned to machine $j$. The formulation and transformations change slightly as follows. In the integer program formulation of VS, change the objective to $\max_{j,k}\sum_i x_{ij}p_{ik}^{(j)}$. Change $p_{it_2}$ in equation (1) to $p_{it_2}^{(t_1)}$. For Lemma 2, change $x'_{i\bar{t}}p_{it_2}$ to $x'_{i\bar{t}}p_{it_2}^{(t_1)}$. It can be verified that the proof of Theorem 2 is still valid with minor modifications. The online algorithm adopted later is also valid for this general version of vector scheduling. For simplicity, we mainly focus on the original VS model.

III. ONLINE ALGORITHM FOR VS

Based on Theorem 2, we can solve VS via its corresponding VS-GLB. We review the approximation algorithm [3] for GLB, and then modify it to solve VS.

Given a GLB instance and a positive number $\tau$, the algorithm [3] considers jobs one by one (in an arbitrary order) and assigns the current job to a machine so as to minimize the $L_\tau$ norm^2 of the resulting loads of all machines. Specifically, suppose jobs are numbered $1,2,\ldots,n$, the same as the considered order, and suppose the load of machine $k$ after jobs $1,2,\ldots,i-1$ are assigned is $l_k^{i-1}$. Then job $i$ is assigned to the machine
$$\arg\min_j \left(\sum_k \big(l_k^{i-1}+c_{ijk}\big)^\tau\right)^{1/\tau}.$$
This optimization problem can be solved by trying each possible machine. During the optimization, the computation of the last step of the $L_\tau$ norm, $(\cdot)^{1/\tau}$, can be omitted. In addition, because the algorithm does not require a particular order of jobs and each job is assigned only once, it can be implemented in an online fashion. The algorithm was originally proposed for the traditional load balancing problem [5], and recently extended to the GLB problem [3]. The parameter $\tau$ controls the approximation ratio of the algorithm, as shown in the following lemma.

^2 The $L_\tau$ norm of a vector $x=(x_1,x_2,\ldots,x_t)$ is defined as $(\sum_i x_i^\tau)^{1/\tau}$.

Lemma 3 ([5], [3]). Minimizing the $L_\tau$ norm gives a $\frac{\tau}{\ln 2}\, l^{1/\tau}$ approximation ratio for GLB, where $l$ is the number of machines.

Setting $\tau=\ln l$ yields the best approximation ratio, $e\log l$. However, it is unclear whether the computation of the $L_{\ln l}$ norm can be done in polynomial time. We consider this issue later.
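A minimal sketch of this greedy rule (our own code, assuming the costs are given as a dense array `costs[i][j][k]`, the cost job `i` adds to machine `k` when assigned to machine `j`):

def online_glb(num_jobs, num_machines, costs, tau):
    """Greedy online rule: assign each arriving job to the machine that
    minimizes the L_tau norm of the resulting machine loads."""
    load = [0.0] * num_machines
    schedule = []
    for i in range(num_jobs):
        best_j, best_val = None, None
        for j in range(num_machines):
            # The outer (.)^(1/tau) is omitted: it is monotone, so the argmin is unchanged.
            val = sum((load[k] + costs[i][j][k]) ** tau for k in range(num_machines))
            if best_val is None or val < best_val:
                best_j, best_val = j, val
        schedule.append(best_j)
        for k in range(num_machines):
            load[k] += costs[i][best_j][k]
    return schedule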
A. Adapting to VS

To apply the above algorithm to VS, we could first solve VS-GLB and then transform the solution back to VS. This process can be simplified by omitting the transformation between VS and VS-GLB.

Recall that the algorithm assigns vectors one by one. Consider a vector $p_i$ in VS. To solve VS-GLB, this vector should choose a machine that minimizes the $L_\tau$ norm of the resulting loads. Due to the construction of VS-GLB, this vector can only choose among the anchor machines; otherwise, the resulting $L_\tau$ norm would be infinite (definitely not the optimal choice). Thus, this is equivalent to picking from the corresponding partitions in VS. After the assignment of any number of vectors that leads to partitions $A_1,A_2,\ldots,A_m$, the $L_\tau$ norm of the machine loads in VS-GLB is, in fact, equal to
$$f^{(\tau)}(A_1,\ldots,A_m)=\left(\|A_1\|_\tau^\tau+\ldots+\|A_m\|_\tau^\tau\right)^{1/\tau},$$
where
$$\|A_j\|_\tau^\tau=\sum_k\Big(\sum_{i\in A_j}p_{ik}\Big)^\tau.$$
Suppose the assignment of vectors $p_1,p_2,\ldots,p_{i-1}$ leads to partitions $A_1,A_2,\ldots,A_m$. Let $f_{i,j}^{(\tau)}$ be the $L_\tau$ norm of the resulting load if vector $p_i$ chooses partition $A_j$, i.e.,
$$f_{i,j}^{(\tau)}=f^{(\tau)}(A_1,\ldots,A_j\cup\{p_i\},\ldots,A_m).$$
Then, according to the algorithm, vector $p_i$ should be assigned to the partition
$$\arg\min_j f_{i,j}^{(\tau)}.$$
The procedure is described in Algorithm 1. For each incoming vector, it only needs to execute Lines 5-9.

Algorithm 1: Vector Scheduling
Input: $m$, the number of partitions; $d$, the dimension of each vector; $p_1,p_2,\ldots,p_n$, the $n$ vectors to be scheduled; $\tau$, the norm parameter
Output: $A_1,\ldots,A_m$, the $m$ partitions
1  begin
2    for $j$ from 1 to $m$ do
3      $A_j \leftarrow \emptyset$;
4    for $i$ from 1 to $n$ do
5      if $\exists A_j, A_j=\emptyset$ then
6        $A_j \leftarrow A_j\cup\{p_i\}$;
7      else
8        find $j$ to minimize $f_{i,j}^{(\tau)}$;
9        $A_j \leftarrow A_j\cup\{p_i\}$;

Algorithm 1 with $\tau=\ln(md)$ is an $e\log(md)$ approximation algorithm for the corresponding VS-GLB. Thus, we have the following result due to Theorem 2.

Lemma 4. Algorithm 1 with $\tau=\ln(md)$ is an $e\log(md)$ approximation algorithm for VS.
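A direct implementation of Algorithm 1 might look as follows (our own sketch; it recomputes the whole sum for every candidate partition, which is exactly the inefficiency removed in Section III-C).

def vector_scheduling(vectors, m, tau):
    """Sketch of Algorithm 1: an arriving vector joins an empty partition if one
    exists, otherwise the partition minimizing f_{i,j}^{(tau)}."""
    d = len(vectors[0])
    loads = [[0.0] * d for _ in range(m)]     # coordinate-wise sums of each A_j
    parts = [[] for _ in range(m)]            # indices of the vectors in each A_j
    for i, p in enumerate(vectors):
        empty = next((j for j in range(m) if not parts[j]), None)
        if empty is not None:
            j_star = empty
        else:
            # f_{i,j}^{(tau)} without the outer 1/tau power (monotone, argmin unchanged)
            def f(j):
                return sum((loads[jj][k] + (p[k] if jj == j else 0.0)) ** tau
                           for jj in range(m) for k in range(d))
            j_star = min(range(m), key=f)
        parts[j_star].append(i)
        for k in range(d):
            loads[j_star][k] += p[k]
    return parts

With $\tau=\lceil\ln(md)\rceil$ this is the rounded variant analyzed below.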
However, it is unclear whether Algorithm 1 with $\tau=\ln(md)$ terminates within polynomial time. The algorithm requires the computation of $x^{\ln(md)}$ for some $x$. First, the number $\ln(md)$ is irrational, and thus cannot be represented with polynomially many bits at arbitrary resolution. Second, even if we approximate it by a rational number $\tilde\tau$ with acceptable resolution, the number $x^{\tilde\tau}$ may still be irrational; for example, when $\tilde\tau=1.5$, there are many values of $x$ for which $x^{1.5}$ is irrational. Though we could again approximate the result by a rational number, it is complicated to analyze theoretically whether the approximation ratio still holds and how the running time grows with the accuracy of the rational approximation. This problem has not been addressed in the literature.

Our solution is to round $\ln(md)$ up to the next integer $\lceil\ln(md)\rceil$ and compute the $L_{\lceil\ln(md)\rceil}$ norm. This guarantees polynomial running time, but causes a loss in the approximation ratio. We show in the following that the loss is very small.

B. Guaranteeing Polynomial Running Time

To deal with the irrational number issue, we round $\ln(md)$ up to the next integer $\lceil\ln(md)\rceil$. In the following, we analyze the resulting approximation ratio.

Theorem 5. Let $l$ be the number of machines. Minimizing the $L_{\lceil\ln l\rceil}$ norm gives an $e\log(l)+\frac{e\log(e)}{\ln l+1}$ approximation ratio for GLB.

Proof: This result is obtained from Lemma 3 by calculus. Let $g(x)=\frac{x}{\ln 2}\,l^{1/x}$. Its derivative is
$$g'(x)=\frac{l^{1/x}}{\ln 2}\left(1-\frac{\ln l}{x}\right).$$
For $x\ge\ln l$, it holds that $g'(x)\ge 0$, so $g(x)$ is monotonically increasing. Since $\ln l\le\lceil\ln l\rceil\le\ln l+1$, we have
$$g(\lceil\ln l\rceil)-g(\ln l)\le g(\ln l+1)-g(\ln l).$$
In addition, consider the two points $(\ln l, g(\ln l))$ and $(\ln l+1, g(\ln l+1))$. By Lagrange's mean value theorem, there exists $\xi\in[\ln l,\ln l+1]$ such that
$$g(\ln l+1)-g(\ln l)=g'(\xi).$$
Since $\xi\ge\ln l$, we have $l^{1/\xi}\le e$. Additionally, $\xi\le\ln l+1$, so $1-\frac{\ln l}{\xi}\le\frac{1}{\ln l+1}$. Therefore, $g'(\xi)\le\frac{e}{\ln 2}\left(1-\frac{\ln l}{\xi}\right)\le\frac{e\log(e)}{\ln l+1}$. We obtain
$$g(\lceil\ln l\rceil)-g(\ln l)\le g(\ln l+1)-g(\ln l)\le\frac{e\log(e)}{\ln l+1}.$$
Note that $g(\ln l)=e\log(l)$. This completes our proof.

This theorem holds for the general GLB problem, such as the one considered in [3], [5] and [4]; of course, it holds for VS-GLB as well. To give an intuition of the loss, we plot the two approximation ratios with respect to the number of machines in Figure 1. We can see that the loss is small.

Fig. 1. Approximation ratio loss due to rounding: before rounding the approximation ratio is $y=e\log(x)$, and after rounding it becomes $y=e\log(x)+\frac{e\log(e)}{\ln(x)+1}$, where $x$ is the number of machines.

We have the following corollary due to Theorem 5.

Corollary 1. With $\tau=\lceil\ln(md)\rceil$, Algorithm 1 is an $e\log(md)+\frac{e\log(e)}{\ln(md)+1}$ approximation algorithm for VS, and it runs in polynomial time.

The polynomial running time can be shown by the following analysis. We assume $\tau=\lceil\ln(md)\rceil$ if not specified otherwise. The main time-consuming step is to minimize $f_{i,j}^{(\tau)}$ over $j$ for a given $i$. We can omit the computation of the outer $1/\tau$ power, since the function $x^y$ is monotonic for $x\ge 0$ and $y>0$. In computing the $L_\tau$ norm, there is a basic operation, the integer power $a^\tau$ of a number $a$, where $a$ is an element of some vector $A_j$. The naive approach, which multiplies $a$ iteratively, involves $\tau-1$ multiplications. This can be improved by reusing partial products. For example, computing $a^8$ as $((a^2)^2)^2$ needs only 3 multiplications. In general, computing $a^\tau$ requires $\lfloor\log(\tau)\rfloor+\nu(\tau)-1$ multiplications, where $\nu(\tau)$ is the number of 1s in the binary representation of $\tau$ (Chapter 4.6.3 in [7]). In the following, we use $2\log\tau$ as an upper bound on the number of multiplications needed to compute $a^\tau$.

To compute $f_{i,j}^{(\tau)}$ for given $i$ and $j$, we need $d+m-1$ additions (adding $p_i$ to $A_j$, assuming $A_j$ is maintained in each iteration) and $2md\log(\tau)$ multiplications ($md$ numbers, each raised to the power $\tau$). To find the optimal $j$ for a given $i$, we need to compute $f_{i,j}^{(\tau)}$ for all $j$ and select the optimum by comparison. This procedure needs $m(d+m-1)$ additions, $2d\log(\tau)m^2$ multiplications, and $m-1$ comparisons. In summary, it takes $O(d\log(\tau)m^2)$ time for one vector. For the overall algorithm, it takes $O(d\log(\tau)nm^2)$ time.
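The square-and-multiply scheme referenced above can be sketched as follows (our own code); for $\tau=8$ it performs exactly 3 multiplications.

def int_power(a, tau):
    """Compute a**tau for an integer tau >= 1 by binary (square-and-multiply)
    exponentiation, using floor(log2(tau)) + nu(tau) - 1 multiplications,
    where nu(tau) is the number of 1s in tau's binary representation."""
    result = None
    base = a
    while tau > 0:
        if tau & 1:                 # this bit of tau is set: multiply the factor in
            result = base if result is None else result * base
        tau >>= 1
        if tau:
            base = base * base      # square for the next bit
    return result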
The computation can be sped up by exploiting the problem structure: the complexity can be reduced to $O(d\log(\tau)mn)$, dropping one factor of $m$, as shown in the following.

C. Computation Speedup

For VS-GLB, we have the following lemma. Note that this lemma does not hold for the general GLB problem.

Lemma 5. For any $j_1,j_2$, it holds that $f_{i,j_1}^{(\tau)}>f_{i,j_2}^{(\tau)}$ if and only if
$$\|A_{j_1}\cup\{p_i\}\|_\tau^\tau-\|A_{j_1}\|_\tau^\tau > \|A_{j_2}\cup\{p_i\}\|_\tau^\tau-\|A_{j_2}\|_\tau^\tau.$$

Proof: Adding $\|A_1\|_\tau^\tau+\ldots+\|A_m\|_\tau^\tau$ to both sides proves the lemma.

Algorithm 2 shows the final design. For each partition $A_j$, the algorithm maintains two variables, the vector $A_j$ ($\mu_j$ in the algorithm) and its norm $\|A_j\|_\tau^\tau$ ($\delta_j$ in the algorithm). If there is no empty partition, each incoming vector searches over all partitions to find the $j$ that minimizes $\|A_j\cup\{p_i\}\|_\tau^\tau-\|A_j\|_\tau^\tau$ (Lines 12-24). As Lemma 5 shows, this is equivalent to minimizing $f_{i,j}^{(\tau)}$.

Algorithm 2: Sped-up Vector Scheduling with $\tau=\lceil\ln(md)\rceil$
Input: $m$, the number of partitions; $d$, the dimension of each vector; $p_1,p_2,\ldots,p_n$, the $n$ vectors to be scheduled
Output: $A_1,\ldots,A_m$, the $m$ partitions
1  begin
2    for $j$ from 1 to $m$ do
3      $A_j \leftarrow \emptyset$;
4      $\mu_j \leftarrow 0$;   // vector $A_j$
5      $\delta_j \leftarrow 0$;   // scalar $\|A_j\|_\tau^\tau$
6    for $i$ from 1 to $n$ do
7      if $\exists A_j, A_j=\emptyset$ then
8        $A_j \leftarrow A_j\cup\{p_i\}$;
9        $\mu_j \leftarrow p_i$;
10       $\delta_j \leftarrow \|p_i\|_\tau^\tau$;
11     else
12       $j_{min} \leftarrow 1$;   // partition index
13       $\mu_{min} \leftarrow \mu_1+p_i$;
14       $\delta_{min} \leftarrow \|\mu_{min}\|_\tau^\tau$;
15       for $j$ from 2 to $m$ do
16         $\tilde\mu \leftarrow \mu_j+p_i$;   // vector addition
17         $\tilde\delta \leftarrow \|\tilde\mu\|_\tau^\tau$;
18         if $\delta_{min}-\delta_{j_{min}} > \tilde\delta-\delta_j$ then
19           $j_{min} \leftarrow j$;
20           $\mu_{min} \leftarrow \tilde\mu$;
21           $\delta_{min} \leftarrow \tilde\delta$;
22       $\mu_{j_{min}} \leftarrow \mu_{min}$;
23       $\delta_{j_{min}} \leftarrow \delta_{min}$;
24       $A_{j_{min}} \leftarrow A_{j_{min}}\cup\{p_i\}$;

For the running time, consider a new vector that cannot find an empty partition. There are $md$ additions (Lines 13, 16), $2md\log(\tau)$ multiplications (Lines 14, 17), $2(m-1)$ subtractions and $m-1$ comparisons (Line 18). The dominating term is $md\log(\tau)$. This is for one vector; for all $n$ vectors, the running time is $O(mnd\log(\tau))$, compared to $O(m^2nd\log\tau)$ before the speedup. Substituting $\tau=\lceil\ln(md)\rceil$ into the formula yields $O(nmd\ln\ln(md))$ running time, polynomial in the input length (note $m<n$). This analysis, together with Corollary 1 and Lemma 5, gives the following theorem.

Theorem 6. Algorithm 2 is an $e\log(md)+\frac{e\log(e)}{\ln(md)+1}$ approximation algorithm for VS. It runs in $O(nmd\ln\ln(md))$ time.

It should be noted that we treat multiplications as basic operations in the above running time analysis. The running time will be different if we further account for the complexity of the multiplications themselves. Multiplying two $n$-bit integers takes $O(n^{1.59})$ time with a recursive algorithm (Chapter 5.5 in [8]). Applying such an analysis to Algorithm 2, however, requires tracking the length of the binary representation of each numeric value in the vectors, which may be complicated. Nevertheless, it is clear that the multiplications run in polynomial time in the input length, so Algorithm 2 certainly terminates in polynomial time.
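The bookkeeping of Algorithm 2 can be sketched in a few lines (our own code; it keeps $\mu_j$ and $\delta_j$ per partition and applies the comparison from Lemma 5).

import math

def vector_scheduling_fast(vectors, m):
    """Sketch of Algorithm 2 with tau = ceil(ln(m*d)): maintain, per partition,
    its coordinate-wise sum mu_j and the scalar delta_j = ||A_j||_tau^tau, and give
    each vector to the partition with the smallest increase (Lemma 5)."""
    d = len(vectors[0])
    tau = math.ceil(math.log(m * d))
    mu = [[0.0] * d for _ in range(m)]     # mu_j, the summed vector of A_j
    delta = [0.0] * m                      # delta_j = ||A_j||_tau^tau
    parts = [[] for _ in range(m)]
    for i, p in enumerate(vectors):
        empty = next((j for j in range(m) if not parts[j]), None)
        if empty is not None:
            j_star, new_mu = empty, list(p)
        else:
            j_star, new_mu, best_inc = 0, None, None
            for j in range(m):
                cand = [mu[j][k] + p[k] for k in range(d)]
                inc = sum(x ** tau for x in cand) - delta[j]   # Lemma 5 comparison
                if best_inc is None or inc < best_inc:
                    j_star, new_mu, best_inc = j, cand, inc
        parts[j_star].append(i)
        mu[j_star] = new_mu
        delta[j_star] = sum(x ** tau for x in new_mu)
    return parts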
IV. CONCLUSION

In this work, we connect the vector scheduling problem with the generalized load balancing problem, and obtain new results by applying existing results for one problem to the other. On one hand, we show that generalized load balancing does not admit constant-factor approximation algorithms unless $P=NP$. On the other hand, we design a polynomial time algorithm for vector scheduling by taking an existing algorithm for GLB and modifying it to guarantee polynomial running time. The resulting algorithm has two advantages: it can be run in an online fashion, and it provides a better approximation bound for VS than the existing offline polynomial time algorithm. Note that we round $\ln(md)$ to an integer to guarantee polynomial running time, which causes a small loss in the approximation ratio. It is unclear whether computing the $L_{\ln(md)}$ norm directly, without rounding, can be done in polynomial time.

REFERENCES
[1] C. Chekuri and S. Khanna, "On multidimensional packing problems," SIAM J. Comput., vol. 33, pp. 837-851, April 2004.
[2] L. Epstein and T. Tassa, "Vector assignment problems: a general framework," J. Algorithms, September 2003.
[3] F. Xu, C. C. Tan, Q. Li, G. Yan, and J. Wu, "Designing a practical access point association protocol," in Proceedings of INFOCOM'10.
[4] F. Xu, X. Zhu, C. C. Tan, Q. Li, G. Yan, and J. Wu, "SmartAssoc: Decentralized access point selection algorithm to improve throughput," under submission.
[5] I. Caragiannis, "Better bounds for online load balancing on unrelated machines," in Proceedings of SODA'08.
[6] D. Zuckerman, "Linear degree extractors and the inapproximability of max clique and chromatic number," in Proceedings of STOC'06.
[7] D. E. Knuth, The Art of Computer Programming, Volume 2: Seminumerical Algorithms, 2nd ed. Addison-Wesley Longman Publishing Co., Inc., 1981.
[8] J. Kleinberg and E. Tardos, Algorithm Design. Addison-Wesley Longman Publishing Co., Inc., 2005.