Response-Time-Optimized Distributed Cloud Resource Allocation

Matthias Keller and Holger Karl

arXiv:1601.06262v2 [cs.NI] 29 May 2016

Abstract: A current trend in networking and cloud computing is to provide widely distributed compute resources, exemplified by initiatives like Network Function Virtualization. This paves the way for widespread service deployment and can improve service quality; a nearby server can reduce the user-perceived response times. But always using the nearest server is a bad decision if that server is already highly utilized.

This paper investigates the optimal assignment of users to distributed resources: a convex capacitated facility location problem with integrated queuing systems. We determine the response times depending on the number of used resources. This enables service providers to balance between resource costs and the corresponding service quality. We also present a linear problem reformulation showing small optimality gaps and faster solving times; this speed-up enables a swift reaction to demand changes. Finally, we compare solutions obtained by either considering or ignoring queuing systems and discuss the response time reduction achieved by using the more complex model. Our investigations are backed by large-scale numerical evaluations.

Index Terms: cloud computing; virtual network function; network function virtualization; resource management; placement; facility location; queueing model; linearisation; optimization

• Manuscript received January xx, yy; revised January xx, yy. This work was partially supported by the German Research Foundation (DFG) within the Collaborative Research Centre "On-The-Fly Computing" (SFB 901).
• The authors are with the University of Paderborn, Warburger Str. 100, 33098 Paderborn, Germany. E-mail: [email protected], [email protected]

1 INTRODUCTION

1.1 Challenges in Distributed Clouds

A current trend in networking and cloud computing is to provide widely distributed compute resources. Computation will not only take place on desktops or in large data centres, but also at smaller centres or within the network itself, e.g., inside individual in-network server racks located near backbone routers. This trend is known under different labels, for example, Carrier Clouds [6], [13], [51], Distributed Cloud Computing [3], [15], [21], [46], or In-Network Clouds [28], [48], [50]. These In-Network Clouds tend to be less cost-efficient than conventional Clouds due to a worse economy of scale; they are hence often geared towards specific network services (e.g., firewalls, load balancers). Easing a more flexible deployment of these services became popular as Network Function Virtualization [22], not only inside a data centre but also beyond, in wide area networks [17], [51], [52]. We consider not only executing network functions but, more generically, executing applications at those In-Network Clouds, which yields an important advantage [4], [43]: The resources of these Clouds are closer to end users than those of conventional Clouds, have smaller latency between user and cloud resource, and are therefore suitable for running highly interactive applications. Examples of such applications are latency-critical applications [7], [54], user-customized streaming services [8], [18], [30], or Cloud Gaming [37]; the computing tasks range from processing the request and aggregating incoming data streams up to rendering and encoding video streams. In such applications, the crucial quality metric is the user-perceived response time to a request, as the application needs to react quickly to user interactions. Large response times impede usability, increase user frustration [16], [37], or prevent commercial success.

An obvious solution to provide small response times would be to deploy an application at many sites so that each user finds one site nearby. This, however, is infeasible as each utilized site incurs additional costs. We are hence faced with the task of deciding where a user's request shall be processed, using as few sites as possible at the best possible response time. We refer to this task as the assignment problem. This problem's trade-off between cost and quality is intuitive yet difficult to capture in a concrete problem statement and solution.

This difficulty lies in the nature of the response time. It is a sum of three parts:
• the network latency taken to send the request from the user to the cloud resource and to send the answer back, i.e., the round trip time (RTT);
• the actual processing time (PT) of the request;
• the queuing delay (QD) a request incurs at the cloud resource while other requests are currently processed at that resource (??).

Figure 1: Response time as the sum of round trip time RTT, queuing delay QD, processing time PT. (Sketch: a user sends a request to a server consisting of a queue and a CPU.)

Figure 2: Queuing delays QD for different service rates. (Horizontal axis: utilization ρ=λ/µ; vertical axis: queuing delay; curves for µ=0.10, µ=1.00, and µ=10.00.)

In many applications, we can consider the processing time to be significantly smaller than the round trip time. The round trip time depends on the choice where a user's request is processed and, to a much smaller degree, on the network load along the way. The queuing delay, however, depends on the sharing of a resource among many users and is not an effect immediately influenced by the decision for a single user; it depends on the joint decision for all users.

From queuing theory we know that, for a fixed utilization, the queuing delay is shorter for higher service rates. ?? shows queuing delays for different levels of system utilization ρ=λ/µ for three different service rates µ. For instance, web servers answering simple requests have high service rates.
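The queuing-theory statement above can be checked numerically. The following is a minimal sketch, assuming an M/M/1 queue (the model the paper adopts later in Section 3.1); the function name `mm1_queuing_delay` and the sampled utilization levels are illustrative choices, not taken from the paper.

```python
def mm1_queuing_delay(utilization, service_rate):
    """Mean waiting time in the queue (excluding service) of an M/M/1 system.

    Assumption of this sketch: W_q = rho / (mu - lambda) with lambda = rho * mu,
    which simplifies to rho / (mu * (1 - rho)).
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (service_rate * (1.0 - utilization))


# Same utilization levels, three service rates as in Figure 2.
for mu in (0.1, 1.0, 10.0):
    delays = [mm1_queuing_delay(rho, mu) for rho in (0.25, 0.5, 0.75)]
    print(f"mu={mu}: {[round(d, 3) for d in delays]}")
```

For a fixed utilization the delay scales with 1/µ, so a ten times faster server waits a tenth as long at the same relative load.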
Web servers with such high service rates often have negligible queuing delays (say, below 10 ms). This may explain why queuing delays are commonly ignored in the literature (??). We focus in this paper on computation-intensive applications, such as data processing, intrusion detection, or game rendering (see next paragraph), as opposed to light-weight applications with nearly no computation, like web servers just returning static content. The processing time of computation-intensive applications is longer (e.g., up to 1 s) than that of light-weight applications, which implies a lower service rate and a longer queuing delay. A long queuing delay becomes a large portion of the response time, large enough to necessitate considering it when deciding the assignments.

Shah et al. [49] survey intrusion detection systems and cite different measurements of packet processing times of up to 10 ms. Barker et al. [7] study game server map loads lasting 20 to 110 ms in their experiments. Ishii et al. [30] conduct experiments on AWS [1] using a parallel data processing application and observe processing times between 400 and 1800 ms. Lee et al. [37] and Claypool et al. [16] observe a drop in user experience when playing computer games with latency artificially increased to more than 200 ms. In summary, we focus on applications with long processing times, 10 ms to 1 s, and on average internet round trip times, 60 to 600 ms [36].

Different proportions between round trip times and processing times are possible. Deciding the assignment in such scenarios can be simplified by ignoring the less dominant part: Very short round trip times inside data centres, say less than 20 ms, leave queuing delays as the dominant part of the response time. Similarly, long processing times, say more than 1 min, render queuing delays the dominant part. In both cases, round trip times can be ignored; doing so, the assignment problem becomes a simpler mapping problem. On the opposite side, very short processing times, say 0.1 ms¹, result in very low queuing delays, rendering the round trip time the dominant part. When ignoring queuing delays in this case, the assignment problem becomes a simpler, non-convex Facility Location Problem. In summary, dropping the less dominant part is a viable option only if round trip times and processing times are not of the same magnitude. If both times are of the same magnitude, queuing delays also have to be considered when deciding the assignment. Ignoring queuing delays in such cases worsens the assignment, resulting in high response times; we compare both cases with different proportions of round trip times and processing times (??).

1. Corresponding to a service rate of 10,000 req./s of a high-performance web server, e.g., Apache or Nginx, delivering static content.

1.2 Queuing Delay Effects

How does the queuing delay affect the response time? As a toy example, let us consider the network from ?? with three locations of interest: one client c and two possible facility locations f_a, f_b with compute resources to run the application. These resources are equally fast and can serve requests at rate µ=100 req./s. Assume the round trip time between c and f_a is 60 ms and between c and f_b is 70 ms. Requests enter the network at c with arrival rate λ. With this setup, the requests can be served at only f_a, only f_b, or split among f_a and f_b. ?? shows, as a function of the arrival rate, the resulting response times RT=RTT+PT+QD for a few simple strategies S_i.

Figure 3: The lower plot shows response times of different arrival rates λ_c for three different strategies for the topology. The upper plot shows the used topology. (Horizontal axis: arrival rate λ [req/s].)

The first strategy S1 minimizes only the requests' round trip time: Requests are assigned to the nearest facility f_a and, if its capacity µ is exceeded, the remaining requests are assigned to f_b. The dramatic response time growth for λ>80 is the result of too many requests assigned to facility f_a. Let λ_a be the requests assigned to f_a; then f_a's utilization ρ_a is λ_a/µ. To avoid too large utilizations, strategies S2 and S3 limit them to a maximum value ρ̂, with ρ_a, ρ_b ≤ ρ̂ < 1; S2 uses ρ̂=0.9 and S3 uses ρ̂=0.8. On the one hand, S3 with the lower limit has a shorter RT (??) than S2 as the second facility is used earlier. On the other hand, S2 can handle higher arrival rates than S3, 160<λ<180, because a higher limit enables handling more requests in total; the system capacity is λ<2µρ̂. To relax such a predefined upper bound, S4 dynamically adjusts the limit to the current system utilization, ρ̂=λ/2µ. With the same resources at both locations, S4 boils down to evenly splitting the load between the two facilities. Compared to S1..3, the resulting RTs are smaller for λ>40 but larger for λ<40.

So far, all assignment strategies ignore the resulting queuing delays. In contrast, our last strategy S5 additionally reduces the queuing delays of both resources. The strategy's request assignment depends on the round trip times (l_cf) between both resources. The resulting RTs are the lowest of all strategies S1..5. In conclusion, we were able to improve assignments by considering queuing delays.

Hereafter, we list the equations of the expected response times in ??: S1 to S3 result in (1), S4 in (2), and S5 in (3).

\[
f_{\mu,\hat\rho}(\lambda) :=
\begin{cases}
60 + \dfrac{1}{\mu-\lambda}, & \text{if } \lambda \le \hat\rho\mu \\[6pt]
\dfrac{\hat\rho\mu}{\lambda}\left(60 + \dfrac{1}{\mu-\hat\rho\mu}\right) + \dfrac{\lambda-\hat\rho\mu}{\lambda}\left(70 + \dfrac{1}{\mu-\lambda+\hat\rho\mu}\right), & \text{else}
\end{cases}
\tag{1}
\]

\[
g_{\mu}(\lambda) := \frac{1}{2}\left(60 + \frac{1}{\mu-\lambda/2}\right) + \frac{1}{2}\left(70 + \frac{1}{\mu-\lambda/2}\right)
\tag{2}
\]

\[
h_{\mu}(\lambda) := \min_{\lambda_1\in[0,\lambda]} \; \frac{\lambda_1}{\lambda}\left(60 + \frac{1}{\mu-\lambda_1}\right) + \frac{\lambda-\lambda_1}{\lambda}\left(70 + \frac{1}{\mu-\lambda+\lambda_1}\right)
\tag{3}
\]

?? shows strategy S5's request assignments to resource f_a as a fraction of λ on the vertical axis; the remaining requests are assigned to f_b. The horizontal axis shows an increasing arrival rate λ. The different lines correspond to distances l_∆ between f_a and f_b, that is, how much longer it takes to transport a request to f_b instead of f_a. This way, the original toy example is line l_∆=10 and the other lines vary the round trip time to f_b. If the resources have the same round trip time, the assignment results in an even split. However, if one resource is farther away, l_∆>0, at first the nearer resource is preferred and only with increasing arrivals do the assignments converge to an even split. Then, the queuing delay portion of the response time is significantly larger than l_∆.

Figure 4: Request assignments to f_a with different distances l_∆ between f_a and f_b. (Horizontal axis: arrival rate λ [req/s]; vertical axis: assignment to f_a as a fraction of λ; lines for l_∆ = 0, 5, 10, 20, 40, 80.)

1.3 Contribution

This paper discusses finding the optimal assignment between requesting users and compute resources hosting the answering application at different locations in the network. The assignment minimizes the expected average response time for all users. We claim the necessity of considering requests' queuing delays at used compute resources to avoid suboptimal assignments to, e.g., over-utilized resources (when round trip times and queuing delays are of the same magnitude). To prove this claim, we present an extended Facility Location Problem with integrated queuing systems (??), show its convexity, and, for the first time, obtain optimal³ solutions for larger networks using a convex solver (??). Due to the problem's complexity, solving times were already large for medium-sized networks (e.g. ?? ta2 topology), hindering the large-scale evaluation we had envisioned. We were able to shorten the solving times with high accuracy by non-trivially linearising the convex problem (??). This linearised problem has a larger search space than the convex problem; despite this, it solves the original problem significantly faster (empirically shown, ??). Having now an adequate and fast substitute at hand enables us to compare numerous solutions obtained by considering and ignoring queuing delays, supporting our claim by showing a significant response time reduction when considering queuing delays (??). In addition, we show how the response time and queuing delays increase when using fewer compute resources (e.g. fewer locations, ??).

This evaluation in ?? extends our own previous work [33]. We compare four factors influencing queuing delays and in addition vary input randomly in order to verify the statistical relevance of our findings. In summary, we solved and analysed 52,500 configurations.

3. Numerically obtaining solutions by solver with a gap threshold of 10^−6.

2 RELATED WORK

Assignment problems of the form described above have been investigated before. We structure their comparison along four dimensions relevant to this work: their model complexity, simplifications reducing the problem's search space, optimization goals, and solution approaches. Finally, related systems of geographical load balancing are compared.

2.1 Model Complexity

The simplest model considers only the round trip time (RTT) when assigning users to cloud resources; it equates response time with RTT. Clearly, this is a simplification of reality, yet minimizing this average RTT is equivalent to the well-known capacitated Facility Location Problem (FLP). If the problem is further restricted to only use p resources, it becomes a p-median FLP, which is NP-hard [31].

A step closer to reality is also modelling the processing time (PT) in addition to the RTT. But as long as PT is constant, this still stays a Facility Location Problem of the type described above. This can be easily seen by extending the original network topology by pseudo-links (at the server or user side) that represent these processing times via their latencies; this is a common rewriting technique for graph-based problems (including Facility Location Problems).

The real challenge occurs when we also consider the queuing delay (QD). In this case, the additional time cannot be expressed by rewriting the network topology as the QD depends on the assignment decisions: A higher utilization results in a longer wait, possibly trading off against a shorter RTT.

So far, this more general model has been considered only by few works, discussed in the remainder of this section; most use simpler assumptions than ours (??), rendering the problem easier to solve. Vidyarthi et al. [53] allow the same degrees of freedom as we do.
Vidyarthi et al. [53] approximate, similar to us, the non-linear part of the objective function with a piece-wise linear function. However, in contrast to our work, they use a cutting plane technique which iteratively refines the piece-wise function as necessary; it remains unclear how large their linearisation error is. In contrast, our evaluation shows small linearisation errors, and this is achieved by using a simpler technique.

2.2 Simplifications

Other authors investigate slightly different scenarios, so that their problem formulations are similar, yet simpler than ours.

Some authors [55], [59] replace the non-linear QD part with a constant upper bound and, consequently, the resulting problems become simpler to solve. But this also hides QD changes that result from assignment changes. For instance, in a situation where load balancing would reduce the QD, this reduction is not visible as the QD part is constant. Consequently, the resulting solution has further potential for optimization; we exploit this potential.

In another simplification, the assignments are predefined by a rule. Some authors [2], [55], [59] always assign requests to the nearest cloud resource. In such a case, the problem reduces to just finding the best resource location and is easier to solve. The assignments are then predetermined by the rule. However, balancing the assignments could further reduce the QD but is not considered. We do not use any predefined assignment rule, so we have the freedom to change assignments in order to further reduce the response times.

Another group of authors [10], [19] uses a parametrized assignment rule called the gravity rule: Weights determine how users are assigned to cloud resources. These configurable weights are used to repeatedly solve the same problem with new weights reflecting the resource utilizations of the previous solution. This approach is not guaranteed to converge, so the authors propose a heuristic that attenuates the changes in each iteration, enforcing convergence with an unknown linearisation error. In contrast, we solve the problem in one step by using all information to find the global optimum.

Liu et al. [39], Lin et al. [38], and Goudarzi et al. [25] present a similar Facility Location Problem with convex costs such as queuing delays or a resource's energy costs. In contrast to our work, they relax the integer allocation decision variable, simplifying the problem at the cost of a less accurate solution when rounding up the obtained continuous allocations. Our goal, in contrast, is to prevent unexpected expenses by introducing an upper bound on the number of used resources. Continuously relaxing our problem can cause any location to be allocated a bit and, consequently, any site is used and paid for. While the papers [38], [39] only consider queuing delays as a cost function, this paper discusses a holistic queuing system integration and additionally considers splitting and joining (assigning) of the arrival process.

2.3 Optimization Goal

Existing literature uses queuing delays in FLPs with three optimization goals: classic FLP, min/max FLP, and coverage FLP.

Classic FLPs are problems that minimize the average response time, like our problem (??) or others [10], [19], [47], [53], [55], [59], allowing RT variations for individual users.

The min/max problem of Aboolian et al. [2] minimizes the maximum response time. Intuitively, such problems especially improve the RT of users with high RTTs to cloud resources. However, if only one such user exists with resources being far away, assigning this user will negatively affect the assignments of other users: Their assignments are now constrained only by a relaxed upper bound and are likely worse than without the first user. In contrast, classical FLPs do not suffer this way from a worst-case user.

Another type of problem is the coverage problem; the response times of the user assignments are upper bounded [40], [41], [42]. Structurally, a coverage problem is a special, simpler case of a min/max problem; the first has a predefined bound, which is additionally minimized in the second. Intuitively, such problems can be applied in scenarios where service guarantees for a certain maximal response time will be provided and paid for. In contrast, classical FLPs allow minimizing the average response time below the lowest possible response time bound.

2.4 Solution Approaches

A couple of heuristics were proposed for solving related problems which are variants of the NP-hard capacitated FLP [27]. No work so far used solvers to obtain solutions (for non-relaxed problems), and full enumerations are known only for small instances limited to opening five facilities [2]. A greedy dropping heuristic successively removes from the set of candidates the resource whose removal increases the response time by the smallest amount [55]. Greedy adding heuristics successively add the resource which decreases the response time by the largest amount [2], [10], [19]. Another heuristic probabilistically selects set changes of used resources [19] or performs a breadth-first search through "neighbouring solutions", where two solutions are neighbours if their sets of used resources differ in one element [2], [19]. Such heuristics can get stuck in local optima; to mitigate this drawback, meta-heuristics are used as a superstructure [2], [10], [19], [47], [55]. These meta-heuristics typically refine previously generated initial solutions, which are obtained randomly or by combining existing solutions. The hope is that among the found local optima, one solution is very close to the global optimum, but without any guarantee. In contrast, we obtain global optima. This is an important step for heuristic development as only this enables a clear judgement of a heuristic's accuracy, i.e., its solution's gap to the global optimum.

Others [53], [55] may achieve near-optimal solutions by using optimization techniques like branch-and-bound and cutting planes, but their solutions have unknown optimality gaps. In summary, either optima for small input or solutions with unknown optimality gap are obtained. This motivated our work on finding near-optimal solutions with a numerically very small optimality gap.

Liu et al. [39] and Wendell et al. [56] present distributed algorithms for their global Geographical Load Balancing problem by decomposing it into separate subproblems solved by all clients. These subproblems converge to the optimal solution only if they are executed in several synchronized rounds in which assignment and utilization information is exchanged among all clients. Both papers state that these distributed algorithms would obtain the optimal solution faster than gathering everything at a centralised solver. However, we believe that each round introduces a communication delay when sending update information among all clients; they ignored this delay in their evaluations. The resulting total delay over all rounds is likely larger than communicating with a centralised coordinator. In addition, our p-median Facility Location Problem has a global constraint on the maximal number of used resources, preventing it from being easily separated into subproblems.

We observed that problem instances have only been solved exemplarily so far [2], [10], [19], [38], [39], [41], [47], [56]. Consequently, the average performance of these solution approaches is hard to predict. We go beyond this by undertaking a statistical performance evaluation.

Figure 5: Bipartite graph of a Facility Location Problem (a); time-in-system functions at each facility (b) and, alternatively, piece-wise linearised functions (c).

Table 1: Model variables

Input constants:
  G=(V,E): Bipartite graph with V=C∪F, C∩F=∅, with client nodes c∈C and facility nodes f∈F
  l_cf ∈ R>0: Round trip time between c and f
  µ_f ∈ R>0: Service rate as capacity at f
  λ_c ∈ R>0: Arrival rate as demand at c
  T_µ ∈ R>0: Time in queuing system (TiS)
  α_µs, β_µs: s-th basepoint T_µ(α_s)=β_s of T̃_µ
Decision variables:
  x_cf ∈ R≥0: Assignment in demand units
  y_f ∈ {0,1}: Indicator if f is opened (=1)
  z_fs ∈ [0,1]: Weight of s-th basepoint at f
Helper variables:
  Λ; Λ_f: Total Λ=Σ_c λ_c; at f: Λ_f=Σ_c x_cf
  τ ∈ R>0: Sufficiently small value
  ρ=λ/µ ∈ R>0: System utilization

We randomly vary our input data and verify the statistical relevance of our findings.

2.5 Geographical Load Balancing

A system for Geographical Load Balancing (GLB) comprises two parts: The decision part selects appropriate servers, sites, or Virtual Machines for requests of a certain origin; the previous sections consider it. This section focuses on the realisation part, which gathers monitoring information and implements selections. Different middlewares have been proposed [23], [56], [57], [58] which are shared between applications. In this way, each application benefits from instances of the other applications running at diverse sites by sharing monitoring information such as latency to servers or to customers. They realise request assignments, e.g., to close-by or lowly utilised servers, by either configuring the Domain Name System (DNS) or being explicitly queried before a request is sent. Slightly different, Cardellini et al. [14] propose redirecting requests to different sites to balance the load. Policies range from redirecting all, only the largest, or only group requests to selecting sites based on round-robin, site utilization, or connection properties. Our paper focuses on solving the problem and investigates whether the more complex problem with queuing systems is worth the additional effort; our results can be applied to improve geographical load balancing systems.

3 PROBLEM

This section first formalises our scenario model and then details practical realisations. Afterwards, this section discusses the problem's convexity and proposes a problem linearisation minimizing the maximal linearisation error.

3.1 Model

Our scenario is formalized as a capacitated p-median Facility Location Problem [20]. A bipartite graph G=(C∪F, E) has two types of nodes: clients (c∈C) and facilities (f∈F). Clients correspond to locations where user request flows enter the network. Facilities represent candidate locations to execute the application, e.g., data centres. More precisely, a (compute) resource refers to a host at such a data centre executing the application. ??a shows such a graph. The geographically⁴ distributed demand is modelled by the request arrival rate λ_c for each client c. Computing capacity is modelled as the request serving rate µ_f for each facility f. The round trip time l_cf is the time to send data from c to f and back. ?? lists all variables.

4. More precisely, the request arrival and service points are topologically distributed; the round trip time of a path between two points only roughly matches its geographical distance. We use "geographically" for a convenient explanation.

Our first problem formulation recapitulates the known p-median problem P(G, λ, µ, p):

\[
\min_{x} \; \frac{1}{\sum_c \lambda_c} \sum_c \sum_f x_{cf}\, l_{cf} \qquad \text{(objective)} \tag{4}
\]
\[
\text{s.t.} \quad \sum_f x_{cf} = \lambda_c, \quad \forall c \qquad \text{(demand)} \tag{5}
\]
\[
\sum_c x_{cf} \le y_f \mu_f, \quad \forall f \qquad \text{(capacity)} \tag{6}
\]
\[
\sum_f y_f = p \qquad \text{(limit)} \tag{7}
\]

The formulation contains two decision variables: x_cf∈R≥0 describes which part of c's request rate λ_c is assigned to which f; y_f∈{0, 1} describes whether location f is used or not. The objective is to minimize the average response time; but without modelling the queuing delay and service time at facilities, the response time only consists of the round trip time. The RTT is minimized while all demand is served (5) and the capacity is not exceeded (6).

In addition, exactly p locations are used (7). This constraint serves two purposes. First, by limiting the number of locations where the application is deployed, the expenses for the application provider when leasing Cloud resources are bounded. In Facility Location variants where facility opening costs are directly integrated, the resulting total costs are uncertain. Second, stating the problem with this bound allows us to investigate the response time trend while allowing more and more resources (??). The problem without capacity has been known to be NP-hard since 1979 [27]. This problem is a generalization and, thus, also NP-hard.

Until now, the response time has only been the round trip time. To predict the queuing times, the model is extended by queuing systems at each facility (??b). There, the service times are exponentially distributed. The arrivals at each node c are described by a Poisson process. The requests can be assigned to multiple facilities (Σ_f x_cf) and, there, the individual assignments from different nodes are aggregated (Σ_c x_cf). The resulting process is also a Poisson process, because splitting and joining does not change the underlying random distribution. As a result, we have an M/M/1 queuing model [11]. The function for the time in queuing system (TiS) computes the processing time plus the queuing delay (??), T_µ(λ)=1/(µ−λ). Putting everything together, the corresponding formulation of this queuing-extended p-median problem QP(G, λ, µ, p) is:

\[
\min_{x,y} \; \underbrace{\frac{\sum_c \sum_f x_{cf}\, l_{cf}}{\sum_c \lambda_c}}_{\text{average RTT}} + \underbrace{\frac{\sum_f \left(\sum_c x_{cf}\right) \dfrac{1}{\mu_f - \sum_c x_{cf}}}{\sum_c \lambda_c}}_{\text{average TiS}} \tag{8}
\]
\[
\text{s.t.} \quad \sum_f x_{cf} = \lambda_c, \quad \forall c \qquad \text{(demand)} \tag{9}
\]
\[
\sum_c x_{cf} < y_f \mu_f, \quad \forall f \qquad \text{(capacity)} \tag{10}
\]
\[
\sum_f y_f = p \qquad \text{(limit)} \tag{11}
\]

The new objective (8) is to minimize the average response time, which is the sum of the average round trip time and the average time in system (??). Constraint (9) is the same as Constraint (5); all demand must be served. Constraint (10) assures the steady state (λ<µ, cf. [11]) for each queuing system. Finally, Constraint (11) mandates the use of exactly p locations, just like Constraint (7).

3.2 System design

The presented optimisation problem QP is part of a larger system which dispatches requests of a certain origin to sites as decided. Examples of such a system range from Geographical Load Balancing systems (??) to our own Application Deployment Toolkit [34]. They monitor traffic, decide assignments, and reconfigure the dispatching subsystem in time periods. The average request arrival rate λ_c is the averaged number of incoming requests at router c for the last period. By solving problem QP once per period, the request assignments⁵ are decided for the next period. The decision is realised by configuring the dispatching subsystem, e.g., DNS, and allocating cloud resources accordingly. The system is not meant to allow a fine-grained assignment decision for each incoming request, e.g., at line speed. In the longer term, it decides which sites are strategically used and how incoming requests are roughly distributed.

5. The assignment x_cf is the request rate dispatched from c to f, in short request assignment.

3.3 Convex Optimization

Previous work (??) also considered our objective function (8) but did not solve the corresponding problem optimally, except for small graphs via full enumeration. This is because of the non-linearity of the objective function, which necessitates non-linear solvers. There exist a couple of non-linear solvers with different specializations: quadratic, convex, or non-convex objective functions. By determining the complexity class of our objective function, we can choose a suitable solver to efficiently obtain a global optimum.

This section first proves the objective function's convexity and shows that it is not simpler, e.g., quadratic. Afterwards, it describes how we used the convex solver.

Definition 1. A function g is convex if its domain dom(g) is a convex set and if g''(x)≥0 holds ∀x∈dom(g) [12].

Lemma 1. Function g=Σ_i w_i g_i, g: R^n→R, is convex if, ∀i, g_i: R^n→R is convex and w_i∈R>0 [12].

Theorem 1. The objective function (8) of QP is convex with function T_µ computing the sojourn time in an M/M/1 queuing system.

Proof: The domain of T_µ(λ)=1/(µ−λ) is the interval 0≤λ<µ enforced by constraint (10); an interval is always a convex set. The second derivative T''_µ(λ)=2/(µ−λ)^3 is always larger than 0 within its domain. By ??, T_µ(λ) is a convex function.

For a fixed f in the objective function, 0<Λ_f=λ<µ and T_µ(Λ_f) is convex. Each summand Λ_f T_µ(Λ_f)=µ/(µ−Λ_f)−1 is itself convex on this domain, and Λ_f=Σ_c x_cf is linear in x. Then, the sum Σ_f Λ_f T_µ(Λ_f) is also convex (??). The term remains convex after it is multiplied by 1/Λ>0. The left term of the objective function is linear and also convex. Since the sum of two convex functions is convex, the objective function (8) is convex.

With the knowledge of a convex objective function, we can ignore less efficient solvers for more general, non-convex problems. The next more efficient solver class is quadratic, which needs objective functions of the form x^T M x with a symmetric matrix M∈R^(n×n). But our objective function is not of this form, making quadratic solvers inapplicable. Consequently, we have to use a convex solver.

Implementation: We choose the optimization framework CVXOPT [5] from the authors of [12]. QP is a mixed integer problem, which is not directly supported by CVXOPT. Continuously relaxing the problem is not possible (??). We decomposed QP into solving multiple subsets F' of F with |F'|=p:

\[
QP(G=(C\cup F,E), \lambda, \mu, p, \tau) = \min_{F' \subseteq F,\; |F'|=p} \left\{ PQP((C\cup F',E), \lambda, \mu, \tau) \right\} \tag{12}
\]

with the purely convex sub-problem PQP(G, λ_c, µ_f, τ):

\[
\min_{x} \; \frac{\sum_c \sum_f x_{cf}\, l_{cf}}{\sum_c \lambda_c} + \frac{\sum_f \dfrac{\sum_c x_{cf}}{\mu_f - \sum_c x_{cf}}}{\sum_c \lambda_c} \tag{13}
\]
\[
\text{s.t.} \quad \sum_f x_{cf} = \lambda_c, \quad \forall c \qquad \text{(demand)} \tag{14}
\]
\[
\sum_c x_{cf} \le \mu_f - \tau, \quad \forall f \qquad \text{(capacity)} \tag{15}
\]

The decomposition optimally solves QP by solving a non-integer convex subproblem PQP several times for different configurations of the binary variables y_f; these variables indicate which facility is used. First, problem PQP is solved for all facility subsets F'⊆F with |F'|=p. Then, among all solutions of PQP, the one with the minimal value of (13) is selected. Its assignment vector x is taken as QP's decision vector x, and QP's decision vector y is represented by the subset F', ∀f∈F': y_f=1. In this way, the optimal solution for problem QP is found.

CVXOPT solves PQP by checking the domain (constraints) and iterating towards the optimum by using the Jacobi and Hessian matrix (first and second order derivatives) of the objective function (13). Hardcoding such matrices is not feasible for a large number of parameter configurations. We want to have an automated solution obtaining these matrices at runtime. Algebra systems like Maxima⁶ can be used, but need a detour through another system, and computing derivatives of multi-dimensional functions takes time; for small input, more time than solving the problem. To obtain these matrices faster, we found, not too surprisingly, that the structure of (13) and its derivatives is the same for different |C|, |F|.

6. Maxima manual: http://maxima.sourceforge.net/docs/manual/maxima.pdf

Exploiting this property, we were able to deduce a construction rule for both matrices. Using this rule, we construct our Jacobi and Hessian matrices at runtime for different inputs without notable overhead.

In detail, we constructed the Jacobi and Hessian matrices from the objective function (13); here, reintroduced as a convenient copy f (16), f(x_11,...,x_cf).

\[
f(x) = \frac{1}{\Lambda} \sum_{cf} l_{cf}\, x_{cf} + \frac{1}{\Lambda} \sum_f \frac{\Lambda_f}{\mu_f - \Lambda_f}, \qquad \Lambda = \sum_c \lambda_c, \quad \forall f: \Lambda_f = \sum_c x_{cf} \tag{16}
\]

The Jacobi matrix (17) for one function is a vector of partial derivatives (a_11,...,a_cf), one for each variable x_cf.

\[
J_f(x) = (a_{cf})_{cf \in C\times F}, \qquad a_{cf} = \frac{\Lambda_f}{\Lambda(\mu_f - \Lambda_f)^2} + \frac{1}{\Lambda(\mu_f - \Lambda_f)} + \frac{l_{cf}}{\Lambda} \tag{17}
\]

Structurally, f(x) is a sum of terms, and differentiating f(x) can be done by differentiating the terms individually and afterwards summing all terms up. Two types of terms exist (18) with different derivatives (19, 20). In the Jacobi matrix (17), each partial derivative a_cf is g'_{1,cf}(x)+g'_{2,cf}(x).

\[
g_1(x) = \frac{l_{cf}\, x_{cf}}{\Lambda}, \qquad g_2(x) = \frac{\Lambda_f}{\Lambda(\mu_f - \Lambda_f)} \tag{18}
\]
\[
g'_{1,cf}(x) = \frac{d}{dx_{cf}} \frac{l_{ij}\, x_{ij}}{\Lambda} =
\begin{cases} \dfrac{l_{ij}}{\Lambda} & \text{if } cf = ij \\ 0 & \text{else} \end{cases} \tag{19}
\]
\[
g'_{2,cf}(x) = \frac{d}{dx_{cf}} \frac{\Lambda_j}{\Lambda(\mu_j - \Lambda_j)} =
\begin{cases} \dfrac{\Lambda_j}{\Lambda(\mu_j - \Lambda_j)^2} + \dfrac{1}{\Lambda(\mu_j - \Lambda_j)} & \text{if } f = j \\ 0 & \text{else} \end{cases} \tag{20}
\]

Similarly, the Hessian matrix (21) contains second-order partial derivatives which are first derived in x_cf direction (rows) and then in x_de direction (columns).

\[
H_f(x) = (a_{cfde})_{cf \in C\times F,\; de \in C\times F}, \qquad
a_{cfde} = \begin{cases} \dfrac{2\Lambda_f}{\Lambda(\mu_f - \Lambda_f)^3} + \dfrac{2}{\Lambda(\mu_f - \Lambda_f)^2} & \text{if } f = e \\ 0 & \text{else} \end{cases} \tag{21}
\]

Each cell a_cfde (21) is g''_{1,cfde}(x)+g''_{2,cfde}(x) from (22, 23).

\[
g''_{1,cfde}(x) = \frac{d}{dx_{cf}\, dx_{de}} \frac{l_{ij}\, x_{ij}}{\Lambda} = 0 \tag{22}
\]
\[
g''_{2,cfde}(x) = \frac{d}{dx_{cf}\, dx_{de}} \frac{\Lambda_j}{\Lambda(\mu_j - \Lambda_j)} =
\begin{cases} \dfrac{2\Lambda_j}{\Lambda(\mu_j - \Lambda_j)^3} + \dfrac{2}{\Lambda(\mu_j - \Lambda_j)^2} & \text{if } f = j \wedge e = j \\ 0 & \text{else} \end{cases} \tag{23}
\]

3.4 Linear Approximation

While CVXOPT solves the problem optimally, it has to test all subsets F', which takes time. As an alternative, the convex objective function is linearised. This way, well-researched linear solvers can be used to obtain solutions faster.

3.4.1 Piece-wise linear

Any non-linear function g(x): R→R over a finite interval [α_0, α_{m−1}]⊂R can be approximated by a piece-wise linear (PWL) function g̃ [24]. This function consists of m basepoints α_0,..,α_s,..,α_{m−1} and corresponding function values β_s=g(α_s), and is defined in (24) for α_s≤x≤α_{s+1}.

\[
\tilde g(x) := (x - \alpha_s)\, \frac{\beta_{s+1} - \beta_s}{\alpha_{s+1} - \alpha_s} + \beta_s, \qquad \alpha_s \le x \le \alpha_{s+1}, \quad \forall s \in [0, m-2] \tag{24}
\]

As an example, let us consider the part λT_µ(λ) of (8) for µ=1.0. Then g(λ):=λT_1(λ)=λ/(1−λ) is our example function to linearise (for µ=1.0, the utilization ρ equals λ). ??a shows g and two different linearisations g̃_1 and g̃_2. The horizontal axis shows the arrival rates and the vertical axis shows the corresponding TiS. ??b shows the absolute differences between g and either linearisation g̃_1 or g̃_2. These differences denote the linearisation accuracy: The smaller the differences are, the more closely the PWL function resembles the original function. We use the maximum of all absolute differences ε_g̃, defined in (25), to measure the linearisation accuracy. We seek basepoints α_i that minimize this error.

Figure 6: The top plot shows an example function g with two possible linearizations with the same number of basepoints. The bottom plot shows absolute differences between the linearizations and g. Imamoto's linearisation has a smaller maximum difference. (Horizontal axis: arrival rate λ; vertical axis: TiS (top) and ∆TiS (bottom); curves: g(λ)=λ/(1−λ), g̃_1(λ) with Imamoto basepoints, g̃_2(λ) with uniform basepoints.)

3.4.2 Linearisation algorithm

Our algorithm obtains basepoints for convex functions with low error. It is an extended version of Imamoto's algorithm [29]. Imamoto's algorithm iteratively refines m basepoints by moving them individually along the abscissa to reduce the error ε_g̃. Each basepoint's adjustment ∆_s along the abscissa, α_s^new=α_s^old+∆_s, is computed from the basepoint's first-order derivative d/dα_s^old g(α_s^old) and the inter-basepoint distance d_s=α_s−α_{s+1}.

The statement of the paper [29] is that the algorithm computes basepoints which have the maximal linearisation accuracy for the given number of used basepoints. However, the algorithm runs into numerical issues, rendering it useless for some convex functions. When fixing⁷ them, it can no longer be guaranteed that the resulting basepoints form a linearisation with maximal accuracy (minimal error). But the error is still very small, so the algorithm remains a good and fast option to linearise convex functions.

In more detail, we extended Imamoto's algorithm [29] and fixed the following two cases: First, the algorithm iteratively adjusts the current set of basepoints so that the error is successively reduced. These adjustments are weighted in order to allow gradually finer changes so that, in theory, the error after each iteration converges to the minimum error. In practice, floating-point accuracy is limited; sometimes values are too small, changes are not applied, and the algorithm iterates infinitely. We fixed that by additionally aborting if no further basepoint changes are observed.

Second, for special functions the algorithm terminates with a division by zero.
The cause is computing a base- (cid:15)g˜:= max |g˜(x)−g(x)| (25) pointαs’sadjustment∆s dependingonoriginalfunction’s Given a set of bxa∈s[eαp0,oαimnt−s1]resulting in a certain error, dtheeridviaftfievreengc(cid:48)e(αosf)t=wdoαdvsgal(uαess).gT(cid:48)(hαei)dnivuimsioenricbayllzyeerqouoaclcsuzresroif, this error is reduced by placing an additional basepoint ∃i(cid:54)=s : g(cid:48)(αs)−g(cid:48)(αi) = 0. That is if g resembles a linear functionoversomeinterval.Wefixedthatbyremovingall at a point where the absolute difference equals the error. However, more basepoints also increase the number of basepoints αs with g(cid:48)(αs)=g(cid:48)(αi), i<s and inserting these basepoints between basepoints whose g(cid:48)(α) values differ necessary variables for the optimization, which increases fromeachother.Thisassuresthattheerrorneverincreasesor searchspaceandsolvingruntime. canbereduced:Forthoseintervalsofthefunctionwhichare Some functions are hard to approximate with linear nearlylinear,alinearisationoverthewholeintervalyieldsa segments,e.g.,functionswithlargesecond-orderderivative lowerror;thus,removingbasepointswithinthisintervalhas values. If their values are large within the linearisation littleimpact.Insertingthesebasepointsatanothernon-linear interval[α0, αm−1],theerrorwillbelarge.TheTiSfunction’s partofthefunctionimprovesthelinearisationaccuracyas asymptote limλ→µTµ(λ)=∞ approximated by linear seg- thePWLfunctionbecomestighter. mentsresultsinsuchalargererror.Onepossiblecontrolknob is to adjust the interval αm−1<µ. But this also introduces 3.4.3 Formulationoflinearisedproblem an artificial capacity limit: Small values (e.g. αm−1=0.8µ) resultinfewerrequestsservedthanpossible(c.f.??b’sS2). This section describes the problem reformulation using a Consequently,thetotalarrivalrateforwhichsolutionsare PWL function. 
From existing alternatives [45], we used a feasibletoobtainissmaller,(cid:80)cλc/p≤αm−1<µwithpused Special Ordered Set (SOSk) of type k = 2 (SOS2) [9]: In a resource. set of continuous variables, at most k of them, adjacent to eachother,maytakenon-zerovalues.Currentlinearsolvers BothPWLfunctionsin??b,uniformandimamoto,have directlysupportSOS2. thesamenumberofbasepointsbutatdifferentpositions.As A PWL function y˜=g˜(x) is represented by a set shown,uniformlydistributingthebasepointscandramati- callyincreasetheerror(g˜1).Incontrast,thegreybasepoints of m continuous decision variables 0≤zs≤1 with a havesmallerrors(g˜2).Thosebasepointswerecomputedby SOS2(z0(cid:80),..zs,..zm−1) constraint and a convex combina- ouralgorithmsdetailedin??. tion 1= szs. This way, two adjacent values sum up to 1 = zs+zs+1. These values are then used as weights Weevaluatethefirsttwocontrolknobs,thenumberof for the basepoints (αs, βs) obtained previously by the basepointsandthelinearisationinterval’supperbound,in ??.Forthethirdcontrolknob,thebasepointpositions,our 7.This paper’s extended version details our improvements of algorithmdeterminesbasepointswithlowerror. Imamoto’salgorithm. 9 linearisation process (??). This way, the weighted sum of areusuallysolvedfasterthannon-linearproblems.Which all basepoints results in the piece-wise linear problem, problem is solved faster? The answers is not obvious, e.g., (cid:80) (cid:80) x= szsαs,y= szsβs. Q(cid:102)Pislinearbuthasalargersearchspace.Theirruntimesare Using this representation, we linearise the convex part experimentallyevaluatedin??. 
(cid:80) of the objective function (8) ΛfTµ(Λf), Λf= cxcf, and The maximal linearisation error (35) of the objective substituteitbycorrespondingweightedbasepointsums,the function(29)dependsontheerrorsofthelinearisedparts, SOS2constraint,andaconvexcombination.First,wefocus whichisthesumofusedresources,yf=1,andtheirmaximal on one facility location and then add indexes to model all linearisationerrors(cid:15)T˜w. locations.Forlocationf,functionTµ computestheTiS(26). µ Wobijtehctiitvselifnuenacrtiisoend,ΛvefrTsµio(nΛfT˜)µ,b(e2c7o)mtheescΛofn(cid:80)vesxβpsazsrt.AofstΛhfe Λ1 f,(cid:88)yf=1(cid:15)T˜µwf ≤ Λp mT˜awx{(cid:15)T˜w} (35) dependsondecisionvariablexcf,multiplyingxcf withzs turnsthereplacementtermtobequadratic;onlyfunctionTµ The linearisation accuracy drops if more resources are allowed to open (p). To maintain the same linearisation waslinearised,notthewholeobjectivefunction.However, havingalinearandnotquadraticobjectivefunctionwould accuracywhiledoublingpthelinearisationerror(cid:15)T˜w hasto behalved.Thiscanbeachievedbyusingmorebasepointsfor reduce problem complexity and speeds up solving. The quadratic term Λf(cid:80)sβszs needs to be replaced with an thelinearisation.Evenif??indicatesthatlessthantwicethe basepointsarenecessary,increasingthenumberofbasepoints equivalent linear term. This is achieved by “moving” the weightΛf intofunctionTµ,whichbecomesTwµ (28).Using and,hence,increasingtheproblem’ssearchspaceincreases T˜w’sbasepointswilltransformthequadraticintothelinear theruntime. µ (cid:80) term sβszs; the ordinate basepoints are now already weighted. Since the other parts of the objective function 4 EVALUATION werelinear,thewholeobjectivefunctionisnowlinear. This section has four parts. 
First, it presents different TiS 1 function linearisations to find a balance of two conflicting Tµ(x)= µ−x =y, withx=Λf (26) goals:Smallobjectivefunctionapproximationerrorandfew T˜µ :x=(cid:88)zsαs, y˜=(cid:88)zsβs (27) bthaesecpoonivnetsx(amnd) floinrefaarstpcroobmlepmuta(QtioPnv.sS.eQc(cid:102)oPn)da,rseocluomtiopnasreodf s s x for different real networks. Third, the trade-off between Twµ(x)=xTµ(x)= µ−x =y, withx=Λf (28) the number of used locations and resulting response time is discussed. Finally, application and network properties To model all locations, index f is added for each fa- are presented for which considering the QD yields better cility location forming the decision variables zfs and responsetimesthanignoringQD(Q(cid:102)Pvs.P). basepoint variables αfs, βfs. The linearised version Q(cid:102)P(G, λc, µf, p, αs, βs)ishence: 4.1 WeightedTiSLinearization 1 (cid:88) 1 (cid:88) xm,yi,nz Λ xcflcf + Λ βfszfs (29) TpohiinstssefcotrioTnws(dλe)s(c2r8ib)einsthhoewevwaleuaotbiotani.nFothrethcios,nwcreetsehobwasea- cf fs µ (cid:88) simplificationwithonesetofbasepointsadaptedatruntime s.t. xcf =λc, ∀c (demand) (30) fordifferentµvalues.Afterwards,wediscussthetrade-off f betweenafastsolvingtimeandlowapproximationerror. (cid:88) (cid:88) xcf = αfszfs, ∀f (capacity) (31) FunctionTwµ(λ)dependsonµandneedsindividuallin- c s earizationsfordifferentµ;letαsµ, βsµ betheirbasepoints(36). 
(cid:88)(cid:88)s zxfcfs =≤1y,f,SOS2(zf·,..),∀∀ff(force(pflwipl)) ((3323)) ActceoalnnrtreaebrssnepTarowteinw(vdλe/rilniµyt)gt,e=fbnuTansswµicemt(piλoiol)nian(r3Ttlsy8w):α(a∀ρsns,)dβ:(3tsαh7.sµe)F=icusoµnircαnrtedsiso,epnβposeTµnn=dwµdiβncesagn.ntbobafseeµrpeowwinirtithts- c (cid:88) λ (cid:88) (cid:88) yf =p (limit) (34) Twµ(λ)= µ−λ : αsµzs=λ, βsµzs=Twµ(λ) (36) f s s ρ (cid:88) (cid:88) The demand-weighted TiS is represented by term Tw(ρ)= 1−ρ : αszs=ρ, βszs=Tw(ρ) (37) ((cid:80)(cid:80)31ss).βαTffshszzisffssc=aap(cid:80)nadcfityΛthfceo=n(cid:80)csotrrcarxiencsftp;aoltnshodeiinmngepwlaicrcirtialvypaaalcsisrtuyatreecsoanthtsatrftatihniest Tw(µλ)= 1−λ/λµ/µ : (cid:88)ss αszs=µλ, (cid:88)ss βszs=Tw(µλ) (38) queuingsystemisinasteadystatethroughtheupperbound As the ordinate basepoints βs remain unchanged, the of the linearization interval α(m−1) < µ; τ from the old basepoints’ approximation error is also not affected. With constraint(10)becomesobsolete. this handy transformation, we only need to precompute ThesearchspaceofQPconsistsof|F|binaryand|F||C| basepoints of Tw instead of basepoint sets of Tw for each µ realvariables.InadditionQ(cid:102)Phasm|F|real,restrictedSOS differentµinthemodel,whichspeedsupthemodelsetup variables. If both problems were linear we could guess process. that solving the second problem Q(cid:102)P takes longer than QP In the remaining section, we investigate the trade-off becausethesearchspaceislarger.However,linearproblems between a fast solving time and a low-error linearisation. 10 ??comparesthesolutionquality(a)andsolvingtime(b). 7 ® =0:80 (¯ =4:00) m 1 m 1 The horizontal axis lists the groups of different topologies w~Trror of 465 ®®®®mmmm¡¡¡¡1111====0000::::99990689 ((((¯¯¯¯mmmm¡¡¡¡1111====9249:4990:::0000)000))) aarQenxPsidpsot(ihcnnreso(eaus)stsies)mdhaoennswudasmsQt(cid:102)bthhPeeer9oq(5lfui%nraeelsic)tooyufnoorfircfdeseesoanflucochertieoionanfctsehtrohvgbeartoals5uin0opef.drTtehbhaeyeliazvsvoaeeltrvritoaiicnngagesl e ¡ ¡ x. 
foreachgroup.Similarly,theverticalaxisin(b)showsthe o 3 95%confidenceintervalsofthesolvingtimeforeachgroup. r p p 2 TheQ(cid:102)P’ssolutionshaveaverysimilarqualitytoQP’ssolutions. a bs. 1 obtaLionoinkgintgheatQ(cid:102)(bP)’,sosbotlauitniionngs.QHPo’swseovleurt,iothnossteoovkalluoensghearvtheatno a 0 beinterpretedwithcare.OurimplementationofQPhasto processallpossiblecombinations(F(cid:48)),whereasQ(cid:102)Pbenefits 5 10 15 20 25 fromtheMIPsolver’sbranch-and-cutalgorithmtoreducethe number of basepoints (m) searchspace.Tocomparethisstructurallydifferentproblems, werestrictthenumberofcandidatefacilitiesto10andthe Figure7:TheerroroflinearisingTw isshowndependingon number of possible combinations; the major cause of the µ highersolvingtimeofQP.Nevertheless,theabsolutesolving thenumberofbasepointsanddifferentlinearizationintervals [0,..,αm−1]. timesofQ(cid:102)Pareveryshortforallgroups. In conclusion, the linearised problem Q(cid:102)P is an adequate substitutionforouroriginalproblemQP:fastandaccurate. For the first, we need to minimize the number of decision variableszfs or,equivalently,thenumberofbasepointsused 4.3 ResponseTimeReduction for the linearisation. For the second, we investigate two ??alsoshowshowresponsetimeimproveswhenaddinga control knobs (??): many basepoints or small interval end resource.Wecouldverifytwoeffectsdecreasingtheresponse αm−1.??showstheerrorofT˜w dependingonthenumber times:First,usingmorelocationsallowsbetterloadbalanc- ofbasepointsmfordifferentαm−1 values.Weneedasmall ing,whichreducesthequeuingdelays.Thesereductionsare error (down the vertical axis) with small m (left on the largerforhighlyutilizedlocationsthanforlessutilizedones. horizontalaxis)withlargeαm−1.Thelatteralsoartificially Second,morelocationsallownearerlocations,reducingthe limits the resource capacity and renders solving an input roundtriptimes.Inconclusion,theaverageresponsetimeof infeasiblethatcouldinfactbesolvedwithlargerαm−1. 
QP(..., p)decreasesmonotonicallyinnumberofresources. For our evaluation, we set αm−1=0.96 and m=6 with Then,serviceprovidersearningmoremoneybyconnecting error(cid:15)Tw=2.67asagoodcompromisebetweenthenumber userswithlowerresponsetimesfaceadiminishingreturn. µ of decision variables, approximation error, and artificial Atonebreakingpointp∗ thecostforaddingaresourcewill capacitylimit. exceedtheadditionallyearnedmoney.Thispointdepends onthetopology,servicetimes,andservicemonetization.By 4.2 Comparison:Convexvs.Linear using QP with different p values, the service provider can determinep∗ inadvancetoavoidprofitloss. We choose the following structurally different topologies Foracloserlook,??showsnotonlytheaverageresponse fromSndLib[44]:ta2,zib54withmanynodes(around50); time but also the time in system and round trip times yuan,bwinwithfewnodes(around10);atlanta,norway groupedalongthehorizontalaxisthesamewaylikein??. fordensenetworks(node:edgeratio1:2).Alltopologiesare Wecantracethetwoeffectsofresponsetimereductions:First, connected. We approximate the latency between nodes by loadbalancingacrossmoreresourcesreducesthequeuing theirgeographicaldistance[32].Weassumethatdatacentres delay and with it the time in queuing system. Second, the arebuiltatwellconnectednodes/routers,soweselectedthe roundtriptimeisreducedasnearerresourcesareused. 10 nodes with the highest degree8 to be data centres. We setarelativelylowservicerateofµ=100req./storeflectour 4.4 ConsideringQueueingDelays computation-intensive example applications [7], [37], [54]. Thispaperpresentsarefinementoftheassignmentproblem The service rates were the same for all data centres. 
User requestsarriveatallnodesandthearrivalrateλc forusers (Pin(4))byadditionallyconsideringqueuingdelays.This at site c is randomly generated: Each value is uniformly refinement increases problem’s complexity, accuracy, and drawnfroman[0,1]intervaland,afterwards,allvaluesare solvingtime.Onlyasignificantlylowerresponsetimewould normalized to (cid:80)cλc=470req./s. This value, together with make these drawbacks worthwhile; for instances, small theservicerateµ=100req./s,ensuresfeasibilityfor5ormore queuing delays compared to large round trip times will facilities.Werandomlygenerated50differentrealizationsfor rendertherefinementunnoticeable.Thissectioninvestigates allarrivalrates.Foreachofthese50setsofarrivalrates,we multiple scenario factors influencing queuing delays and consideredp∈[5, 10]facilities,resultingin300configurations judgestherefinementgainbycomparingsolutions’response pertopology.EachconfigurationwassolvedusingeitherQP times obtained by either ignoring or considering queuing orQ(cid:102)P. delays. Forthis,weperformasecondexperimentwithaslightly 8.Forsamedegreesthenodeidisthetiebreaker. differentconfigurationasin??.Thefirstexperimentshowed