ebook img

Energy-aware Load Balancing Policies for the Cloud Ecosystem PDF

0.23 MB·
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Energy-aware Load Balancing Policies for the Cloud Ecosystem

Energy-aware Load Balancing Policies for the Cloud Ecosystem Ashkan Paya and Dan C. Marinescu Computer Science Division Department of Electrical Engineering and Computer Science University of Central Florida, Orlando, FL 32816, USA Email:ashkan [email protected], [email protected] January 13, 2014 Abstract the heart of utility computing. In the last few years pack- 4 1 aging computing cycles and storage and offering them as a 0Theenergyconsumptionofcomputerandcommunicationsys- metered service became a reality. Large farms of computing 2tems does not scale linearly with the workload. A system and storage platforms have been assembled and a fair num- nuses a significant amount of energy even when idle or lightly ber of Cloud Service Providers (CSPs) offer computing and aloaded. A widely reported solution to resource management storageservicesbasedonthreedifferentdeliverymodelsSaaS J in large data centers is to concentrate the loadon a subset of (Software as a Service), PaaS (Platform as a Service), and 9servers and, whenever possible, switch the rest of the servers IaaS (Infrastructure as a Service). tooneofthepossiblesleepstates. Weproposeareformulation Reduction of energy consumption thus, of the carbonfoot- ] Cofthetraditionalconceptofloadbalancingaimingtooptimize print of cloud related activities, is increasingly more impor- Dtheenergyconsumptionofalarge-scalesystem: distributethe tant for the society. Indeed, as more and more applica- .workload evenly to the smallest set of servers operating at an tionsrunonclouds,moreenergyis requiredto supportcloud csoptimal energylevel, while observing QoS constraints,suchas computing than the energy required for many other human- [the response time. Our model applies to clustered systems; related activities. While most of the energy used by data the modelalsorequiresthatthe demandforsystemresources centers is directly related to cloud computing, a significant 1 vtoincreaseataboundedrateineachreallocationinterval. In fractionis also used by the networkinginfrastructure used to 8this paper we report the VM migration costs for application accessthecloud. Thisfractionisincreasing,aswirelessaccess 9scaling. becomes more popular and wireless communication is energy 1 intensive. In this paper we are only concerned with a single 2 aspectofenergyoptimization,minimizingtheenergyusedby 1.1 Introduction and Motivation cloud servers. 0 Unfortunately, computer and communication systems are 4 1 notenergyproportionalsystems,inotherwords,theirenergy The concept of “load balancing” dates back to the time the : consumptiondoesnotscalelinearlywiththeworkload;anidle vfirst distributed computing systems were implemented in the system consumes a rather significant fraction, often as much ilate 1970s and early 1980s. It means exactly what the name X as50%,oftheenergyusedtodeliverpeakperformance. Cloud implies, to evenly distribute the workload to a set of servers r elasticity, one of the main attractions for cloud users, comes atomaximizethethroughput,minimizetheresponsetime,and at a stiff price as the cloud resource managementis based on increasethesystemresiliencetofaultsbyavoidingoverloading over-provisioning. This means that a cloud service provider one or more systems in the distributed environment. hasto investinalargerinfrastructurethana“typical”orav- Distributed systems became popular after communication eragecloud loadwarrants. At the same time, cloud elasticity networks allowed multiple computing engines to effectively impliesthatmostofthetimecloudserversoperatewithalow communicate with one another and the networking software load, but still use a large fraction of the energy necessary to became an integralcomponent of an operating system. Once deliver peak performance. The low average cloud server uti- processes were able to easily communicate with one another lization [1, 3, 10, 15] affects negatively the common measure 1 using sockets , the client-server paradigm became the pre- of energy efficiency, the performance per Watt of power and ferredmethod to developdistributedapplications; it enforces amplifies the ecologicalimpact of cloud computing. modularity, provides a complete isolation of clients from the The strategy for resource management in a computing servers, and enables the development of stateless servers. cloud we discuss is to concentrate the load on a subset of The client-server model proved to be not only enduring, servers and, whenever possible, switch the rest of the servers but also increasingly successful; three decades later, it is at to a sleep state. In a sleep state the energy consumption is 1 The sockets were introduced by BSD (Berkeley Systems Distribu- very low. This observation implies that the traditional con- tion)Unixin1977 cept of load balancing could be reformulated to optimize the 1 energy consumption of a large-scale system as follows: dis- processorshavereasonablygoodenergy-proportionalprofiles, tribute evenly the workload to the smallest set of servers op- significantimprovementsin memory and disk subsystems are erating at an optimal energy level, while observing QoS con- necessary. straints, such as the response time. An optimal energy level The processorsusedin serversconsume less than one-third is one when the normalized system performance, defined as of their peak power at very-low load and have a dynamic the ratio of the current performance to the maximum per- range of more than 70% of peak power; the processors used formance, is delivered with the minimum normalized energy in mobile and/or embedded applications are better in this consumption, defined as the ratio of the current energy con- respect. According to [5] the dynamic power range of other sumption to the maximal one. components of a system is much narrower: less than 50% for Fromthe largenumberofquestions posedbyenergy-aware DRAM,25%fordiskdrives,and15%fornetworkingswitches. loadbalancingpoliciesdiscussedinSection3,wediscussonly The largest consumer of power of a system is the proces- theenergycostsformigratingaVMwhenwedecidetoeither sor, followed by memory, and storage systems. The power switch a server to a sleep state or force it to operate within consumption can vary from 45W to 200W per multi-core the boundaries of an energy optimal regime. CPU; newer processors include power saving technologies. Large servers often use 32 − 64 dual in-line memory mod- ules(DIMMs);thepowerconsumptionofoneDIMMisinthe 2 Operating Efficiency of a System 5−21 W range. Server secondary memory cooling requires additionalpower;aserverwith2−4harddiskdrives(HDDs) In this section we discuss energy proportional systems, the consumes 24−48 W. dynamic range of different subsystems including memory, A strategy to reduce energy consumption by disk drives secondary storage, and interconnection networks. Then we is to concentrate the workload on a small number of disks overviewmethodstoreducetheenergyconsumptionofacom- and allow the others to operate in a low-power mode. One puter system and discuss sleep states. of the techniques to accomplish this is based on replication. Operating efficiency. The operating efficiency of a sys- A replication strategy based on a sliding window is reported tem is captured by an expression of “performance per Watt in [25]; measurement results indicate that it performs better of power.” It is reported that during the last two decades than LRU, MRU, and LFU2 policies for a range of file sizes, the performance of computing systems has increased much file availability, and number of client nodes and the power faster than their operating efficiency; for example, during requirement is reduced by as much as 31%. the period 1998 till 2007, the performance of supercomput- Anothertechnique isbasedondatamigration. Thesystem ers has increased 7,000% while their operating efficiency has in[11]usesdatastorageinvirtualnodesmanagedwithadis- increased only 2,000%. Recall that power is the amount of tributed hash table; the migration is controlled by two algo- energyconsumedperunitoftimeanditismeasuredinWatts, rithms, a short-term optimization algorithm used for gather- or Joules/second. ingorspreadingvirtualnodesaccordingtothedailyvariation Energy proportional systems. In an ideal world, the of the workload so that the number of active physical nodes energy consumed by an idle system should be near zero and is reducedto a minimum, anda long-termoptimization algo- grow linearly with the system load. In real life, even systems rithm, usedfor coping with changesin the popularity of data whose power requirements scale linearly, when idle use more over a longer period, e.g., a week. than half the power they use at full load. Data collected A number of proposals have emerged for energy propor- over a long period of time shows that the typical operating tional networks; the energy consumed by such networks is region for data center servers is far from an optimal energy proportional with the communication load. For example, in consumption region as we shall see in Section 3. [1] the authors argue that a data center network based on Energy-proportionalsystems could lead to large savings in a flattened butterfly topology is more energy and cost effi- energy costs for computing clouds. An energy-proportional cient. High-speed channels typically consist of multiple serial systemconsumesno powerwhenidle, verylittle powerunder lanes with the same data rate; a physical unit is stripped a light loadand, gradually,more power as the loadincreases. across all the active lanes. Channels commonly operate ple- By definition, an ideal energy-proportional system is always siochronously3 and are always on, regardless of the load, be- operatingat 100%efficiency. Humans are a goodapproxima- cause they must still send idle packets to maintain byte and tionofanidealenergyproportionalsystem;thehumanenergy line alignment across the multiple lines. An example of an consumption is about 70 W at rest, 120 W on average on a energy proportional network is InfiniBand. daily basis, and can go as high as 1,000−2,000 W during a Sleep states. A comprehensive document [12] elaborated strenuous, short time effort [5]. by Hewlett-Packard, Intel, Microsoft, Phoenix Technologies, Dynamic range of subsystems. The dynamic range is and Toshiba describes the advanced configurationand power the difference between the upper and the lower limits of the energy consumption of a system function of the load placed 2LRU (Least Recently Used), MRU (Most Recently Used), and on the system. A large dynamic range means that a system LFU(Least Frequently Used) are replacement policies used by memory hierarchiesforcachingandpaging. is able to operate at a lower fraction of its peak energy when 3Different parts of the system are almost, but not quite perfectly, its load is low. Different subsystems of a computing system synchronized; in this case, the core logic in the router operates at a behave differently in terms of energy efficiency; while many frequencydifferentfromthatoftheI/Ochannels. 2 Table 1: Estimated averagepower use of volume, mid-range, and high-end servers (in Watts) along the years [13]. Type 2000 2001 2002 2003 2004 2005 2006 Vol 186 193 200 207 213 219 225 Mid 424 457 491 524 574 625 675 High 5,534 5,832 6,130 6,428 6,973 7,651 8,163 interface (ACPI) specifications which allow an operating sys- The alternative to the wasteful resource management pol- tem (OS) to effectively manage the power consumption of icywhentheserversarealways on,regardlessoftheirload,is the hardware. Several types of sleep sates, are defined: C- todevelopenergy-aware load balancingpolicies. Suchpolicies states (C1-C6) for the CPU, D-states (D0-D3) for modems, combinedynamicpowermanagementwithloadbalancingand hard-drives,andCD-ROM,andS-states(S1-S4)forthebasic attempt to identify servers operating outside their optimal input-output system (BIOS). powerregimeanddecideifandwhentheyshouldbeswitched The C-states, allow a computer to save energy when the toasleepstateorwhatotheractionsshouldbetakentoopti- CPU is idle. In a sleep state the idle units of a CPU have mize the energy consumption. The term server consolidation their clock signal and the power cut. The higher the state is sometimes used to describe the process of switching idle number,thedeepertheCPUsleepmode,thelargertheenergy systems to a sleep state. saved, and the longer the time for the CPU to return to the Challenges and metrics for energy-aware load bal- state C0 which corresponds to the CPU fully operational. In ancing. Some of the questions posed by energy-aware load states C1 to C3 the clock signal and the power of different balancing are: CPUunits arecut, whileinstatesC4toC6the CPUvoltage 1. Under what conditions should a server be switched to a is reduced. For example, in the C1 state the main internal sleep state? CPUclockis stoppedby the softwarebut the bus interface is and the advanced programmableinterrupt controller (APIC) 2. What sleep state should the server be switched to? are running,while in state C3 allinternalclocks arestopped, and in state C4 the CPU voltage is reduced. 3. How much energy is necessary to switch a server to a Economy of scale. Economy of scale affects the energy sleep state and then switch it back to an active state? efficiency of data processing [8]. For example, Google re- 4. Howmuchtimeittakestoswitchaserverinasleepstate portsthattheannualenergyconsumptionforanEmailservice to a running state? variessignificantlydepending onthe businesssize andcanbe 15 times larger for a small one [10]. Cloud computing can 5. How much energy is necessary to migrate a VM running bemoreenergyefficientthanon-premisecomputingformany on a server to another one? organizations [4, 18]. The powerconsumptionof servershas increasedovertime. 6. HowmuchenergyisnecessarytostartaVMonthetarget Table 1 [13] shows the evolution of the average power con- server? sumption for volume (Vol) servers - servers with a price less 7. How to choose the target for the migration of a VM? than $ 25 K, mid-range (Mid) servers - servers with a price between $25 K and $499 K, and high-end (High) servers - 8. How much time it takes to migrate a VM? servers with a price tag larger than $500 K. The energy to transport data is a significant component of Two basic metrics ultimately determine the quality of an the total energy cost. According to [4] ”a public cloud could energy-awareloadbalancingpolicy: (1)theamountofenergy consume three to four times more power than a private one saved;and(2)thenumberofviolationsitcauses. Inpractice, due to increased energy consumption in transport.” the metrics depend on the system load and other resource management policies, e.g., the admission control policy and the QoS guarantees offered. The load can be slow- or fast- 3 Energy optimization in large-scale varying, have spikes or be smooth, can be predicted or is data centers totally unpredictable; the admission control can restrict the acceptance of additional load when the available capacity of the servers is low. What we can measure in practice is the Motivation. Recently, Gartner research reported that the average energy used and the average server setup time. The average server utilization in large data-centers is 18% [21], setuptime variesdepending onthe hardwareand the operat- while the utilization of x86 servers is even lower, 12%. These ing system and can be as large as 260 seconds [9]; the energy resultsconfirmearlierestimationsthattheaverageserveruti- consumptionduringthesetupphaseisclosethe maximalone lizationisinthe10−30%range[5]. A2010survey[6]reports for the server. that idle servers contribute 11 million tones of unnecessary The time to switchthe serversto a running state is critical CO2 emissions each year and that the total yearly costs for when the load is fast varying, the load variations are very idle servers is $19 billion. steep, and the spikes are unpredictable. The decisions when 3 4 Clustered Cloud Models to switch servers to a sleep state and back to a running state are less critical when a strict admission control policy is in place;thennewservicerequestsforlargeamountsofresources Clustering. Hierarchical organization has long been recog- can be delayed until the system is able to turn on a number nized as an effective way to cope with system complexity. of sleeping servers to satisfy the additional demand. Clustering supports scalability, as the number of systems in- Policies and mechanisms to implement the policies. crease we add new clusters. Clustering also supports prac- Severalpolicies have been proposedto decide when to switch ticality, server decisions are based primarily on local state a server to a sleep state. The reactive policy [22] responds information gathered from the members of the cluster; such to the current load, it switches the servers to a sleep state information is more accurate and available with lower over- when the load decreases and switches them to the running head then information from a very large population. state when the load increases. Generally, this policy leads to An energy-aware model. In [19]we introduceda model SLA violations and could work only for slowly-varying and of large-scale system with a clustered organization. In this predictable loads. To reduce SLA violations one canenvision modelwedistinguishseveralregimesofoperationforaserver a reactive with extra capacity policy when one attempts to based on the energy efficiency. have a safety margin and keep a fraction of the total num- We assume that the normalized performance of server S k ber of servers, e.g., 20%, running above those needed for the depends on the energy level a (t)=f [b (t)] and distinguish k k k current load. The autoscale policy [9] is a very conservative five operating regions of a server, an optimal one, two sub- reactive policy in switching serversto sleepstate to avoidthe optimal, and two undesirable, Figure 1: power consumption and the delay in switching them back to running state. This can be advantageous for unpredictable, R1 - undesirable low region spiky loads. β0 ≤b (t)≤βsopt,l 0≤a (t)≤αsopt,l (1) A very different approach is taken by two version of pre- k k k k k dictive policies [7, 24]. The moving window averages one es- R2 - lower suboptimal region timates the workload by measuring the average request rate ina window ofsize ∆seconds anduse this averageto predict βsopt,l ≤b (t)≤βopt,l αsopt,l ≤a (t)≤αopt,l. (2) k k k k k k theloadduringthenextsecond(second∆+1)andthenslide the window one second to predict the load for second ∆+2, R3 - optimal region andsoon. Thepredictive linearregressionpolicyusesalinear regressionto predict the future load. βkopt,l ≤bk(t)≤βkopt,h αokpt,l ≤ak(t)≤αokpt,h. (3) Anoptimalpolicycanbedefinedasonewhichdoesnotpro- duceanySLAviolationsandguaranteesthatallserversoper- R4 - upper suboptimal region ateintheiroptimalenergyregime. Optimalityisalocalprop- βopt,h ≤b (t)≤βsopt,h αopt,h ≤a (t)≤αsopt,h. (4) erty of a server and can be easily determined by the energy k k k k k k management component of the hypervisor. Recall that Eopt, k R5 - undesirable high (h) region the optimalenergylevelofserverS is one whenthe normal- k ized system performance, defined as the ratio of the current βsopt,h ≤b (t)≤1 αsopt,h ≤a (t)≤1 (5) performance to the maximum performance is delivered with k k k k the minimum normalizedenergy consumption, defined as the The classificationcaptures the current system load and al- ratio of the current energy consumption to the maximal one. lows us to distinguish the actions to be taken to return to The boundariesofthe optimalregionsaredefinedasEkopt±δ the optimal regime or region. When the system is operating with δ = (0.05−0.1)×Eopt. In a heterogeneous environ- in the upper suboptimal or undesirable regions one or more k mentthe normalizedsystemperformanceandthe normalized VMs shouldbe migratedelsewhereto lowerthe loadthe load energy consumption differ from server to server. of the server. When operating in the lower suboptimal or Themechanismstoimplementenergy-awareloadbalancing undesirable regions the system is lightly loaded; then addi- policies should satisfy several conditions: tionalloadshouldbe broughtin,or,alternatively,thesystem should be a candidate for switching to a sleep state. This 1. Scalability - work well for large farms of servers. classification also captures the urgency of the actions taken; suboptimal regions do not require an immediate attention, 2. Effectiveness - lead to substantial energy and cost sav- while undesirable regions do. The time spent operating in ings. eachnon-optimalregionis alsoimportant. Of courseone can 3. Practicality-useefficientalgorithmsandrequireasinput furtherrefinethemodelanddefinealargernumberofregions onlydatathatcanbemeasuredwithlowoverheadand,at but this could complicate the algorithms. thesametime,accuratelyreflectsthestateofthesystem. An important characteristic of the model is that the rate of workload increase is limited. This requirement is moti- 4. Consistency - the policies should be aligned with the vated by the fact that effective admission controlpolicies are global system objectives and with the contactual obliga- rarely effective because the available capacity of the a cloud tionsspecifiedbyServiceLevelAgreements,e.g.,observe is difficult to estimate; it should be a part of a Service Level deadlines, minimize the response time, and so on. Agreement. 4 a-normalized performance 1 (5)Undesirable-high αsopt,high k (4)Suboptimal-high αopt,high k (3)Optimal αopt,low k (2)Suboptimal-low αsopt,low k (1)Undesirable-low βk0 βkopt,low βksopt,high 1 βksopt,low βkopt,high b-normalized energy Figure1: Normalizedperformanceversusnormalizedserverenergyconsumption;theboundariesofthe fiveoperatingregions are shown. A homogeneous cloud model. To estimate the energy The ratio of energy consumption for the two scenarios is savings by distributing the work load to the smallest set of E n b servers operating at an optimal energy mode we consider a ref = × avg (10) simple model. The model assumes a homogeneous environ- Eopt n−nsleep bopt ment all servers have the same peak performance and the We require that the volume of computations carried out same energy consumption at the peak performance. More- under the two scenarios be the same over,it ignores the overheadof migrating computations from oneservertoanotherduringtheenergyoptimizationprocess. n a We compare the energy consumption for two scenarios: na =(n−n )a ⇒ = opt. (11) avg sleep opt n−n a sleep avg • Reference cloud operation - the n physical platforms op- erate at normalized performance levels uniformly dis- It follows that tributed in the interval [a ,a ]. We assume that min max E a b the average normalized energy consumption per opera- ref = opt × avg (12) tion in this range is bavg. Then the energy consumption Eopt aavg bopt is For example, when b = 0.6, a = 0.3, b = 0.8, and avg avg opt a =0.9 then opt E =nb (6) ref avg E ref =2.25. (13) and the number of operations is E opt a −a In this case the optimal operation reduces the energy con- C =na with a = max min (7) ref avg avg 2 sumption to less than half. A heterogeneous cloud model. Amore complex model • Optimal energy-operation - a subset, n < n servers is used for the second type of simulation experiments. Some sleep are switched to a sleep state and the remaining (n − ofthe modelparametersare: τk thereallocationinterval, λi,k n ) platforms operate at a normalized performance the largest rate of increase in demand for CPU cycles of the sleep level aopt and the normalized energy consumption per applicationAi,k onserverSk,qk(t+τk)andpk(t+τk)thecosts operationisbopt =bavg+ǫ. Thentheenergyconsumption for horizontal and vertical scaling, respectively for server Sk is in the next reallocation interval, and jk(t+τk), cost of com- E =(n−n )b (8) munication and data transfer to or from the leader for the opt sleep opt next reallocation interval. The average server load is uni- and the number of operations is formly distributed in the [0.1−0.9] range of the normalized performance. The serversare connected to the leader by star Copt =(n−nsleep)aopt (9) topology. 5 5 Simulation Experiments The parameters defining the energy regimes for server S , k αsopt,l,αopt,l,αopt,h, and αsopt,h are randomly chosen from k k k k uniform distributions in the [0.20 - 0.25], [0.25 - 0.45], [0.55 The effect of the system load. In [19] we report on sim- - 0.80] and [0.80 - 0.85] range, respectively. Each applica- ulation experiments designed to evaluate the effectiveness of tion has a unique λ . Initially, all servers are operating in i,k algorithms to balance the load while attempting to optimize the C0 mode. A server Sk maintains static information such theenergyconsumptionwhileacceptingadditionalload. One as the serverId, and the boundaries of the energy regimes, of the questions we addressed was whether the system load αsopt,l,αopt,l,αopt,h, and αsopt,h; it also maintains dynamic k k k k has an effect on the resource management strategy to force information such as the number of applications, the server theserversinaclustertooperatewithintheboundariesofthe load, the regime of operation and CPU state. The leader is optimalregion. In[19]weexperimentedwithclusterssizes20, informed periodically about the regime of each server of the 40,60,and80 servers;the experiments we reportnow are for cluster. the clusters with 102,103, and 104 servers. For each cluster Attheendofthecurrentreallocationinterval,R (t),server k size we considered two load distributions: S evaluates the operation regime for the next reallocation k (i)Lowaverageload-aninitialloaduniformlydistributed in interval, R (t+τ ) based on the server load, CPU cycles de- k k the interval 20−40% of the server capacity. Figures 2 (a), mand of each individual application and performance of the (c), and (e) show the distribution of the number of servers server. Ifneeded,italsocalculatesthecostsforhorizontaland in the five operating regions for clusters with 102,103 and verticalscalingandcommunicationwiththe leaderwhichde- 104 servers, respectively, before and after load balancing. As pendsontheregimeofoperationandnumberofapplications. expected, when average load 30% of the server capacity, the In other words, S computes a (t+τ ),q (t+τ ),p (t+τ ) k k k k k k k initialserverdistributionisconcentratedinoperatingregions and j (t+τ ). After determining the regime of operation in k k the next reallocation interval, R (t+τ ). If the regime is: attheleftandintheoptimalregionR3. Afterloadbalancing, k k the majority of the servers operate within the boundaries of the optimal and the two suboptimal regimes, and almost 4% 1. R1 - then Sk notifies the leader. Upon receiving the no- in the undesirable regimes. tification, the searchesfor serversoperatingin the R4 or (ii) High average load - initial server load uniformly dis- R5 regimes,as wellas,other serversoperatingin the R1 tributedinthe 60−80%ofthe servercapacity. Figures2(b), or R2 regimes. If such servers are identified, the leader (d), and (f) show the distribution of the number of serversin calculates the cost of transferring VMs from/to such the five operating regions for clusters with 102,103 and 104 serversandsendsthisinformationtoserverS . Uponre- k servers, respectively, before and after load balancing. In this ceivingthisinformation,S determinesifitshouldgather k case the average load is 70% of the server capacity and the additional workload from servers operating in either R4 initialserverdistributionisconcentratedinoperatingregions or R5 regimes, or if it should transfer its own workload at the right of and in the optimal region. After load balanc- to servers operating in the R1 or R2 regimes and then ing,themajorityoftheserversoperatewithintheboundaries switch itself to sleep. of the optimal and the two suboptimal regimes, and almost 4% in the undesirable regimes. 2. R2 - then Sk notifies the leader that it is willing to ac- High-cost versus low-cost application scaling. Elas- ceptadditionalworkload. Theleadersearchesforservers ticity is one ofthe main attractionof cloudcomputing; cloud operating in the R4 and R5 regimes and informs them elasticity allowsapplicationto seamlesslyscale up and down. that Sk is willing to accept some of their workload. Fi- In the next set of simulation experiments we investi- nally, Sk negotiates directly with the potential partners gate horizontal and vertical application scaling for non- for load balancing. homogeneous clusters. Horizontal scaling requires the cre- ation of additional VMs to run the application on lightly 3. R3 - no action is necessary. loadedservers. Horizontalscalingincurs highercostsforload balancing than vertical scaling. The higher costs are due to communication with the leader to identify the potential tar- 4. R4 -thenSk notifiestheleaderthatitisoverloaded. The gets and then to transport the VM image to one or more leader searches for servers operating in the R1 and R2 of them. Vertical scaling allows VM running an application regimesandinformsthemthatS wishestotransfersome k to acquire additional resources from the local server; vertical of its workload to them. Finally, S negotiates directly k scalinghaslowercosts,but itis only feasible ifthe serverhas with the potential partners for load balancing. sufficient free capacity. Weconductsixexperimentsforthreeclustersizes,102,103, 5. R5 - then Sk notifies the leader. Upon receiving the no- and 104 and two different initial loads for each of them, 30% tification,the leadersearchesforserversoperatinginthe and 70% average server loads. We study the evolution of a R1 or R2 regimes and requests Sk to negotiate directly clusterforsome40reallocationintervals. Weareinterestedin withthepotentialpartnersforloadbalancing. Ifnosuch the average ratio of high-cost in cluster horizontal scaling to serversarefound,theleaderwakesuponeormoreservers low-cost local vertical scaling and in the standard deviation in C3 or C6 (sleep) states and informs Sk. of this ratio. 6 Initial And Final Number of Servers in Each Regime Initial And Final Number of Servers in Each Regime 80 70 Initial State Initial State Final State Final State 70 60 60 50 ers50 ers erv erv40 S S of 40 of er er b b30 m m Nu30 Nu 20 20 10 10 0 0 1 2 3 4 5 1 2 3 4 5 Regime of Operation Regime of Operation (a) Cluster size: 102. Average load: 30% (b) Cluster size 102. Average load: -70% Initial And Final Number of Servers in Each Regime Initial And Final Number of Servers in Each Regime 800 700 Initial State Initial State Final State Final State 700 600 600 500 ers500 ers erv erv400 S S of 400 of er er b b300 m m Nu300 Nu 200 200 100 100 0 0 1 2 3 4 5 1 2 3 4 5 Regime of Operation Regime of Operation (a) Cluster size: 103. Average load: 30% (b) Cluster size 103. Average load: -70% Initial And Final Number of Servers in Each Regime Initial And Final Number of Servers in Each Regime 7000 7000 Initial State Initial State Final State Final State 6000 6000 5000 5000 ers ers erv4000 erv4000 S S of of er er b3000 b3000 m m u u N N 2000 2000 1000 1000 0 0 1 2 3 4 5 1 2 3 4 5 Regime of Operation Regime of Operation (a) Cluster size: 104. Average load: 30% (b) Cluster size 104. Average load: -70% Figure 2: The effect of average server load on the distribution of the servers in the five operating regimes, R1, R2, R3, R4, and R5, before and after energy optimization and load balancing. Average load: (a), (c), (e) - 30%; (b), (d), (f) - 70%. The cluster size: (a) and (b) - 102; (c) and (d) - 103; (e) and (f) - 104. 7 3 7 Ratio Ratio 6 2.5 ns ns o o ecisi ecisi5 D 2 D al al Loc Loc4 o o er t1.5 er t ust ust3 −cl −cl n n o of I 1 o of I2 ati ati R R 0.5 1 0 0 0 5 10 15 20 25 30 35 40 0 5 10 15 20 25 30 35 40 Reallocation Interval Reallocation Interval (a) (b) 1.6 8 Ratio Ratio 1.4 7 ns ns ecisio1.2 ecisio6 D D al al 5 Loc 1 Loc o o er t er t4 ust0.8 ust −cl −cl3 n n o of I0.6 o of I ati ati2 R R 0.4 1 0.2 0 0 5 10 15 20 25 30 35 40 0 5 10 15 20 25 30 35 40 Reallocation Interval Reallocation Interval (c) (d) 1 7 Ratio Ratio 0.9 6 ns ns ecisio0.8 ecisio5 D D al 0.7 al Loc Loc4 o o er t0.6 er t ust ust3 −cl0.5 −cl n n o of I o of I2 ati0.4 ati R R 0.3 1 0.2 0 0 5 10 15 20 25 30 35 40 0 5 10 15 20 25 30 35 40 Reallocation Interval Reallocation Interval (e) (f) Figure 3: Time series of in-cluster to local decisions ratios for 40 reallocation intervals. Average load: (a), (c), (e) - 30%; (b), (d), (f) - 70%. The cluster size: (a) and (b) - 102; (c) and (d) - 103; (e) and (f) - 104. For low average load (a), (c) and (e) low-cost local decisions become dominant after about 20 reallocation intervals. For high average load (b), (d), and (f) low-cost local decisions become dominant after some 5 reallocation intervals. 8 2 3 4 Table 2: In-cluster to local decisions ratios for 30% and 70% average server load for three cluster sizes, 10 , 10 , and 10 . Plot Cluster Average # of servers Average Standard sizes load in sleep state ratio deviation (a) 102 30% 0 0.6490 0.5229 (b) 102 70% 0 0.5540 0.9088 (c) 103 30% 8 0.4739 0.2602 3 (d) 10 70% 0 0.5248 1.1311 (e) 104 30% 796 0.4294 0.1998 (f) 104 70% 0 0.4843 0.9323 Numerical results of these experiments are summarized in other servers identified by the cluster leader, a lightly loaded Table 2. For low averageloadFigures 3 (a), (c) and (e) show servers is switched to one of the sleep states. thatlow-costlocaldecisionsbecome dominantafter about20 There are multiple sleep states; the higher the state num- reallocation intervals. For high average load Figures 3 (b), ber, the larger the energy saved, and the longer the time for (d), and (f) low-cost local decisions become dominant after the CPU to return to the state C0 which corresponds to a some 5 reallocation intervals. As expected for low-average fully operational system. For simplicity we chose only two load after loadbalancing a number of serversare switched to theC3 sleepstate;thisnumberincreasesfrom0to8andthen sleep states C3 and C6 in the simulation. If the overall load to 796 when the clusters size increases from 102 to 103 and of the cluster is more than 60% of the cluster capacity we then to 104 servers. do not switch any server to a C6 state because in the next future the probability that the system will require additional Theaverageratioofhightolowcostscalingforthe40real- computing cycles is high. Switching from the C6 state to locationintervalsisinthe0.42to0.65range,anddecreasesas C0 requires more energy and takes more time. On the other the cluster size increases but has a large standard deviation hand, when the total cluster load is less than 60% of its ca- due to the large variations for the first reallocation intervals. pacity we switch to C6 because it is so unlikely that for the As the systemstabilizes this ratio tends to havelowervalues. next interval and the interval after that system needs extra computational unit. 6 Summary The simulation results reported in Section 5 show that the loadbalancingalgorithmsareeffectiveandthatlow-costverti- calscalingoccursevenwhenaclusteroperatesunderaheavy The average server utilization in large data-centers is 18% load. The larger the cluster size the lower the ratio of high- [21]. When idle the servers of a data center use more than cost in-cluster versus low-cost local decisions. half the power they use at full load. The alternative to the wasteful resource management policy when the servers are The QoS requirements for the three cloud delivery models always on,regardlessoftheirload,istodevelopenergy-aware are different thus, the mechanisms to implement a cloud re- load balancing policies. Suchpolicies combine dynamic power sourcemanagementpolicybasedonthisideashouldbediffer- management with load balancing. ent. To guarantee real-time performance or a short response There are ample opportunities to reduce the energy neces- time, the servers supporting SaaS applications such as data- sary to power the servers of a large-scale data center and streamingoron-linetransactionprocessing(OLTEP)maybe shrink the carbon footprint of cloud computing activities, required to operate within the boundaries of a sub-optimal eventhoughthisisonlyafractionofthetotalenergyrequired region in terms of energy consumption. bytheeverincreasingappetiteforcomputingandstorageser- There are cases when the instantaneous demand for re- vices. Tooptimizetheresourcemanagementoflargefarmsof sourcescannotbe accuratelypredictedandsystemareforced servers we redefine the concept of load balancing and exploit to operate in a non-optimal region before additional systems the technological advances and the power management func- canbeswitchedfromasleepstatetoanactiveone. Typically, tions of individual servers. In the process of balancing the PaaS applications run for extended periods of time and the load we concentrate it on a subset of servers and, whenever smallest set of serves operating at an optimal power level to possible, switch the rest of the servers to a sleep state. guarantee the required turnaround time can be determined Fromthe largenumberofquestions posedbyenergy-aware accurately. load balancing policies we discuss only the energy costs for migratingaVMwhenwedecidetoeitherswitchaservertoa This is also true for many IaaS applications in the area sleep state or force it to operate within the boundaries of an of computational science and engineering. There is always a energy optimal regime. The policies analyzed in this paper pricetopayforanadditionalfunctionalityofasystem,sothe aim to keep the servers of a cluster within the boundaries of future workshould evaluate the overheadand the limitations the optimal operating regime. After migrating the VMs to of the algorithms required by these mechanisms. 9 References [13] J. G. Koomney. “Estimating total power con- sumtion by servers in the US and world.” [1] D. Abts. “The Cray XT4 and Seastar 3-D torus inter- http://hightech.lbl.gov/documents/data centers connect.” Encyclopedia of Parallel Computing, Part 3, svrpwrusecompletefinal.pdf (accessed on May 11, Ed. David Padua, pp. 470–477,Springer, 2011. 2013). [2] D. Abts, M. R. Marty, P. M. Wells, P. Klausler, and [14] E. Le Sueur and G. Heiser. “Dynamic voltage and fre- H. Liu. “Energy proportional datacenter networks.” quency scaling: the laws of diminishing returns.” Proc. ACM IEEE Int. Symp. on Comp. Arch. (ISCA’10), pp. Workshop on Power Aware Computing and Systems, 338–347,2010. HotPower’10, pp. 2–5, 2010. [3] D. Ardagna, B. Panicucci, M. Trubian, and L. Zhang. [15] D. C. Marinescu. “Cloud Computing; Theory and Prac- “Energy-aware autonomic resource allocation in multi- tice.” Morgan Kaufmann, 2013. tier virtualizedenvironments.” IEEE Trans. on Services [16] M. Mazzucco, D. Dyachuk, and R. Deters. “Maximiz- Computing, 5(1):2–19, 2012. ing cloud providersrevenues via energy awareallocation [4] J. Baliga, R.W.A. Ayre, K. Hinton, and R.S. Tucker. policies.” Proc. IEEE 3rd Int. Conf. on Cloud Comput- “Green cloud computing: balancing energy in process- ing, pp. 131–138,2010. ing,storage,andtransport.” Proc. IEEE,99(1):149-167, [17] M. P. Mills “An overview of the electricity used 2011. by the global digital ecosystem.” http://www.tech- pundit.com/wp-content/uploads/2013/07/Cloud Begins [5] L. A. Barroso and U. Ho¨zle. “The case for energy- proportional computing.” IEEE Computer, 40(12):33– With Coal.pdf, 2013. (Accessed on September 22, 2013) 37, 2007. [18] NRDC and WSP 2012. “The carbon emissions of server computing for small-to medium-sized organi- [6] M. Blackburn and A. Hawkins. “Unused server survey zation - a performance study of on-premise vs. the results analysis.” www.thegreengrid.org/media/White cloud.” http://www.wspenvironmental.com/media/docs/ Papers/Unused%20Server%20Study WP 101910 v1. ourlocations/usa/NRDC-WSP Cloud Computing.pdf ashx?lang=en (Accessed on December 6, 2013). October 2012 (Accessed on November 10, 2013). [7] P. Bodik, R. Griffith, C. Sutton, A. Fox, M Jordan, and [19] A. Paya and D. C. Marinescu. “Energy- D. Patterson. “Statistical machine learning makes auto- aware application scaling on a cloud.” matic control practical for Internet datacenters.” Proc. http://arxiv.org/pdf/1307.3306v1.pdf,July 2013. Conf. on Hot Topics in Cloud Computing.,pp.1-8,2009. [20] C. Preist and P. Shabajee. “Energy use in the media [8] A. Gandhi, and M. Harchol-Balter. “How data cen- cloud.” Proc IEEE 2nd Int. Conf. on Cloud Computing ter size impacts the effectiveness of dynamic power Technology and Science, pp. 581–586,2010. management.” Proc. 49th Annual Allerton Conference on Communication, Control, and Computing, Urbana- [21] B. Snyder. “Server virtualization has stalled, despite Champaign, pp. 1864–1869,2011. thehype.”http://www.infoworld.com/print/146901 (Ac- cessed on December 6, 2013) [9] A.Ghandi, M.Harchol-Balter,R.Raghunathan,andM. A. Kozuch. “Autoscale: dynamic, robust capacity man- [22] B. Urgaonkar and C. Chandra. “Dynamic provisioning agement for multi-tier data centers.” ACM Trans. on ofmulti-tierInternetapplications.” Proc. 2nd Int. Conf, Computer Systems, 30(4):1–26, 2012. on Automatic Computing, pp. 217-228,2005. [10] Google. “Google’sgreencomputing: efficiency at scale.” [23] H.N.Van,F.D.Tran,andJ.-M.Menaud. “Performance http://static.googleusercontent.com/external content/ andpowermanagementforcloudinfrastructures.” Proc. untrusted dlcp/www.google.com/en/us/green/pdfs/google IEEE 3rd Int. Conf. on Cloud Computing, pp. 329–336, -green-computing.pdf (Accessed on August 29, 2013). 2010. [11] K. Hasebe, T. Niwa, A. Sugiki, and K. Kato. “Power- [24] A. Verma, G. Dasgupta, T. K. Nayak, P. DE, and R. saving in large-scale storage systems with data migra- Kothari. “Server workload analysis for power mini- tion.” Proc IEEE 2nd Int. Conf. on Cloud Computing mization using consolidation.” Proc. USENIX’09 Conf., Technology and Science, pp. 266–273,2010. pp.28–28, 2009. [25] S. V. Vrbsky, M. Lei, K. Smith, and J. Byrd. “Data [12] Hewlet-Packard/Intel/Microsoft/Phoenix/Toshiba. replication and power consumption in data grids.” Proc “Advanced configuration and power interface spec- IEEE 2nd Int. Conf. on Cloud Computing Technology ifications, revision 5.0” http://www.acpi.info/ and Science, pp. 288–295,2010. DOWNLOADS/ACPIspec50.pdf, 2011. (Accessed on November 10, 2013). 10

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.