FlexCloud: A Flexible and Extendible Simulator for Performance Evaluation of Virtual Machine Allocation

Minxian Xu, Wenhong Tian, Xinyang Wang, Qin Xiong

arXiv:1501.05789v1 [cs.DC] 23 Jan 2015

Dr. Tian, Mr. Wang and Miss Xiong are with the School of Computer Science, University of Electronic Science and Technology of China (UESTC), 610054; Mr. Xu is with the School of Software Engineering of UESTC.

Abstract: Cloud data centers aim to provide reliable, sustainable and scalable services for all kinds of applications. Resource scheduling is one of the keys to cloud services. To model and evaluate different scheduling policies and algorithms, we propose FlexCloud, a flexible and scalable simulator that enables users to simulate the process of initializing cloud data centers, allocating virtual machine requests and providing performance evaluation for various scheduling algorithms. FlexCloud can be run on a single computer with a JVM to simulate large-scale cloud environments with a focus on infrastructure as a service; it adopts agile design patterns to ensure flexibility and extensibility; it models virtual machine migrations, which existing tools lack; and it provides user-friendly interfaces for customized configurations and replaying. Compared to existing simulators, FlexCloud combines features for supporting public cloud providers, load-balance and energy-efficiency scheduling. FlexCloud has an advantage in computing time and memory consumption to support large-scale simulations. The detailed design of FlexCloud is introduced and a performance evaluation is provided.

Index Terms: Cloud Data Centers; Resource Scheduling Algorithms; Virtual Machine Allocation; Performance Evaluation; Flexibility and Extensibility

1 INTRODUCTION

With recent advancements in virtualization, Grid computing, Web computing, utility computing and related technologies, Cloud computing has developed rapidly. Cloud computing aims to provide both infrastructure and services on demand through the Internet or an intranet [20], and its benefits can be summarized as hiding and abstraction of complexity, virtualized resources and efficient use of distributed resources. Cloud computing allows the sharing, allocation and aggregation of software, computational and storage network resources on demand. Currently, quite a few IT enterprise products, like Amazon EC2 [4], Google App Engine [14], IBM Blue Cloud [17] and Microsoft Azure [22], have shown their practice of emerging Cloud computing platforms. Since many challenging issues remain to be resolved [20] [1] [25], Cloud computing is still considered to be in its infancy. Youseff et al. [19] introduce a detailed ontology that dissects the Cloud into five main layers from top to bottom: Cloud application (SaaS), Cloud software environment (PaaS), Cloud software infrastructure (IaaS), software kernel and hardware (HaaS), and illustrate their interrelations as well as their inter-dependency on preceding technologies. From a structural perspective, a Cloud data center can be regarded as a distributed network containing many computing nodes, storage nodes and network devices. Each node is composed of a series of resources such as CPU, memory, network bandwidth and so on. In this paper, we focus on Infrastructure as a Service (IaaS) in Cloud data centers, and propose a general and flexible definition and model that could be used by various cloud providers.

An essential technology in a Cloud data center is resource scheduling. One challenging problem related to scheduling in Cloud data centers is to consider the allocation and migration of reconfigurable virtual machines together with the integrated features of the hosting physical machines. Unlike existing load-balancing scheduling algorithms that consider only physical servers with one factor such as CPU, the new algorithms treat CPU, memory and network bandwidth in an integrated way for both physical machines (PMs) and virtual machines (VMs). In addition, real-time virtual machine allocation for multiple parallel jobs and physical machines is taken into consideration. With the development of cloud computing, the size and density of cloud data centers have become huge, and new problems need to be solved as a result. For instance: how to manage physical and virtual resources intensively and use them dynamically, in order to improve elasticity and flexibility and thereby improve service and reduce cost and risk; and how to help customers build flexible, dynamic infrastructure that adapts to business growth while ensuring sustainable development in the future.

Because of the uncertainty of network environments, it is extremely hard to study all these problems extensively on a real Internet platform. In addition, the network conditions cannot be predicted or controlled accurately, yet they affect the validation of strategies. A reasonable approach in research is to develop a simulation system that supports visualized modeling and simulation of large-scale applications in cloud infrastructure. A data center simulation system can describe the application workload statement, which includes user information, data center position, the number of users and data centers, and the amount of resources in each data center. Using this information, the data center simulation system generates requests and allocates these requests to virtual machines. By using a data center scheduling simulation system, researchers can evaluate strategies such as distributing data center resources reasonably, selecting a data center to match special requirements, reducing costs, finding efficient scheduling algorithms and so on.

The major contributions of this paper are as follows:
• the proposal of a new cloud simulator, FlexCloud, with a lightweight design to simulate cloud environments;
• the design and implementation of a flexible and extendible architecture model in which resources, request specifications and scheduling algorithms can be easily added;
• the validation of the simulator, which has been carried out by comparing realistic data with actual results collected from the Lawrence Livermore National Lab [16] trace;
• the performance comparison with CloudSim, which shows that FlexCloud has strengths in time cost and memory consumption.

The remaining parts of this paper are organized as follows: Section 2 describes the related work on virtual machine allocation algorithms and compares existing cloud computing simulators in several categories. At the end of that section, the major contributions of this paper are given. Section 3 presents the architectural model of the newly proposed simulator from several aspects, including its layered architecture, scenario, datacenter modeling, VM request modeling, scheduling algorithm modeling, implemented performance metrics, VM migration modeling and scheduling process modeling. Section 4 presents implementation details and the design patterns adopted in FlexCloud. Section 5 and Section 6 present the validation and evaluation of FlexCloud respectively. Finally, this paper ends with brief conclusions and a discussion of future work.

2 RELATED WORKS

A considerable amount of research has been conducted on resource scheduling algorithms, which are significant for cloud data centers. Mastroianni et al. [8] present a self-organizing and adaptive approach for the consolidation of VMs on CPU and RAM resources. Wood et al. [28] introduce techniques for virtual machine migration and propose some migration algorithms. Zhang et al. [29] compare major load balancing scheduling algorithms for traditional web servers. Singh et al. [5] propose a novel load balancing algorithm called VectorDot for handling hierarchical and multi-dimensional resource constraints by considering both servers and storage in Cloud computing. Doyle et al. [18] propose a system named Stratus to determine the routing decisions for data center requests.

Buyya et al. introduce the GridSim [23] toolkit for modeling and simulation of distributed resource management for grid computing. Dumitrescu and Foster [7] introduce the GangSim tool for grid scheduling. Buyya et al. [24] introduce modeling and simulation of Cloud computing environments at the application level; a few simple scheduling algorithms such as time-shared and space-shared are discussed and compared. CloudSim [24] is a Cloud computing simulator that provides: modeling of large-scale cloud computing infrastructure; models for the data center, service agency, scheduling and distribution strategies; virtual engines, which help to create and manage several independent and collaborative virtual services in a data center node; and flexible switching between space-sharing and time-sharing of processing cores. CloudAnalyst [6] aims to achieve optimal scheduling among user groups and data centers based on the current configuration. Both CloudSim and CloudAnalyst are based on SimJava [12] and GridSim [23]. Moreover, CloudSim and CloudAnalyst treat a Cloud data center as a large resource pool and consider only application-level workloads, so they may not be suitable for Infrastructure as a Service (IaaS) simulation, where each virtual machine is considered a resource to be requested and allocated. A CloudSim-based simulation tool considering a DVFS energy model is proposed in [27]. Kliazovich et al. propose an energy-aware simulation environment named GreenCloud for Cloud data centers [11]. Nunez et al. [3] introduce a new simulator of cloud infrastructure named iCanCloud, written in C++, and compare its performance with CloudSim.

Table 1 compares some state-of-the-art cloud simulators with FlexCloud, the simulator proposed in this paper. We compare these cloud simulators in several categories.

TABLE 1
Summary of Cloud Simulators

Items | CloudSim | MDCSim | GreenCloud | iCanCloud | FlexCloud
Platform | any | CSIM | NS2 | OMNET, MPI | any
Programming Language | Java | C++/Java | C++/OTcl | C++ | Java
Availability | Open Source | Commercial | Open Source | Open Source | Open Source
Graphical Support | Limited (via CloudAnalyst) | None | Limited (via Nam) | Full | Full
Physical Models | None | None | Limited (via plug-in) | Full | Full
Models for public cloud | None | None | None | Amazon | Amazon
Support for Parallel experiments | No | No | No | Yes | No
Support for Energy Consumption Model | Yes | Yes | Yes | No | Yes
Support for Migration algorithms | Yes | No | No | No | Yes

Platform: CloudSim and FlexCloud are both implemented in Java, so they can be executed on any machine with a JVM installed. Built on GridSim and SimJava, CloudSim is heavy to execute. MDCSim is written in CSIM, while GreenCloud and iCanCloud are based on NS2 and OMNET respectively.

Language: The implementation languages of the simulators are related to their platforms. CloudSim and FlexCloud are implemented in Java, MDCSim can be implemented in C++ and Java, and GreenCloud needs a combination of C++ and OTcl, which is difficult for developers.

Availability: Only MDCSim is commercial; the other four simulators are free or open source. FlexCloud can be fetched from [13].

Graphical support: MDCSim does not support interface operations. The original CloudSim supports no graphical interface, but with CloudAnalyst a graphical interface is supported. However, CloudAnalyst does not provide full support: over a whole scheduling process, only the configurations and results can be presented. We therefore label it limited, and the same reasoning applies to GreenCloud. FlexCloud and iCanCloud allow the whole scheduling process to be shown on the interface.

Physical models: iCanCloud and FlexCloud provide detailed simulation of the physical analogs for scheduling. GreenCloud needs a plug-in to simulate them.

Models for public cloud providers: Both iCanCloud and FlexCloud use the model suggested by Amazon, in which physical machine and virtual machine specifications are pre-defined.

Parallel experiments: Support for running experiments on multiple machines together is a main feature of iCanCloud, and that feature is under development. We are working to implement this function in FlexCloud as well.

Power consumption model: Except for iCanCloud, the other four simulators support power consumption modeling.

Migration algorithm: CloudSim and FlexCloud support migration algorithms, while the other three simulators do not yet support them.

In our teaching practice at our university, we adopted CloudSim, a mature simulator, as an assisting teaching tool, but according to student feedback, CloudSim is somewhat complex to use and heavy to execute. That complexity is also a feature of iCanCloud. MDCSim, as a commercial tool, is not appropriate for research. Apart from that, it is not easy to use several languages together in GreenCloud, since it is implemented in C++ and OTcl.

The main contribution of FlexCloud is that it is implemented with a lightweight design, is flexible to extend and is easy to start with. Besides the benefits for teaching, we also cooperate with a company doing research in resource scheduling to extend the functions of FlexCloud to multi-datacenter environments. They would use FlexCloud to explore suitable algorithms for their company applications.

3 THE ARCHITECTURAL MODEL OF FLEXCLOUD

Fig. 1 shows an overview of the FlexCloud architecture with its layered components. The top layer is the Client Layer, which provides the interface for the user to configure request properties and receive result feedback from the lower layers. At this layer, a GUI implemented with Java Swing lets the user configure algorithm types, set PM and VM specifications and select the scheduling algorithms. After all settings are completed, the defined configurations are submitted to the lower layer and a sequence of scheduling steps is processed. Comparison diagrams as well as result outputs are sent back as feedback to the Client Layer. Below it, a Requests Broker is implemented at the Broker Layer, acting as a mediator between the Client Layer and the Scheduler Layer. This layer is responsible for verifying the inputs from the Client Layer and transforming the settings into commands recognized at the Scheduler Layer. For instance, the number of VM requests submitted from the Client Layer is written into a configuration file, which can be read during scheduling at the Scheduler Layer. The Scheduler Layer implements the core functions of the FlexCloud system. At this layer, the scheduling process is defined: the VM Requests Generation component generates the VM requests with the properties configured on the user interface; the Datacenter Scheduler component runs the selected algorithms to allocate VMs to the corresponding PMs; the VM Requests Allocation component manages the allocated VMs, including checking the allocation conditions and removing VMs at the end of their lifecycles. At the bottom, the Resource Layer contains a Resource Management component that provides the resources VM requests require and supports services for the higher levels. Besides this component, the physical resources, such as servers, network and storage, are the resources of the whole system.

Fig. 2 shows an application scenario with FlexCloud. This figure shows the three main components: user,
FlexCloud scheduler center and other computing centers. The FlexCloud scheduler center is responsible for the following main tasks: (1) accepting the VM requests sent by users; (2) managing the computing centers that are in service; (3) finding an available computing unit to allocate requests to; (4) sending feedback information to users. Computing centers represent a pool of Physical Machines (PMs) or Virtual Machines (VMs), each one configured with a pre-defined specification such as CPU, memory and storage. Users are represented as a component that submits a set of jobs to be allocated to a specific PM in a computing center. These submissions are sent directly to the FlexCloud scheduler center. The requests are then managed by this module and allocated to a specific PM in the corresponding data center. After all requests have been processed, a feedback report is sent back to the user.

Fig. 1. Layered FlexCloud architecture

Fig. 2. A scenario architecture with FlexCloud

3.1 Modeling the datacenter in FlexCloud

From the computing resource point of view, a datacenter consists of a number of physical servers (PMs), network devices, storage and other related equipment. A PM contains several kinds of resources, like CPU, memory, storage and bandwidth. Before VM requests arrive, the PMs are in the turned-on state, which means that the PhysicalMachine class has been instantiated in FlexCloud. The number of instances depends on the number of PMs that provide services.

TABLE 2 lists the three suggested types of heterogeneous PMs in FlexCloud; the configuration can be set dynamically. The type and property values, like CPU, memory, storage and power, are recorded in a configuration XML file (shown in Fig. 3), which is loaded into the system. Because these property values are kept in an XML file, modifications are easy, whether changing an existing value or adding new property elements. For instance, if more types of PMs are needed, the pair <pminfo> type-id </pminfo> can be created and the other values of this PM type can also be added. Besides the load balancing algorithms, we also implement energy-saving algorithms that require a new property named power consumption. This property is added in the configuration XML file, and corresponding methods are added in class PhysicalMachine. These methods are responsible for accessing the property values.

TABLE 2
3 types of physical machines (PMs) suggested

PM Pool Type | Compute Units | Memory | Storage
Type 1 | 16 units | 30 GB | 3380 GB
Type 2 | 52 units | 136 GB | 3380 GB
Type 3 | 40 units | 14 GB | 3380 GB

Fig. 3. PM specification in XML file

More detailed information related to comparison indices can be found in Section 3.4. In the datacenter model, the remaining resource capacity decides whether a VM request can be allocated to a PM. At the initialization stage, a PM has its full resource capacity available to offer services. Every allocation or removal operation updates the available capacity value and influences the allocation of later requests.

3.2 Modeling VM requests in FlexCloud

We use a simple example, shown in Fig. 4, to illustrate how VM requests are modeled in FlexCloud. Slots #1, #2, ..., #6 represent time slots in discrete time, where a slot can be treated as a second or a minute. For instance, VM2 occupies time slots 3 to 5, so the lifecycle of VM2 is 3 slots. The value 0.0625 is the proportion of resource occupation, meaning that VM2 occupies 6.25% of the resources of the PM it is allocated to during time slots 3 to 5. In our model, several VM requests can share the capacity of the same PM in the same time slot only if the capacity is sufficient.

Fig. 4. An example of VM requests

TABLE 3 shows the corresponding CPU, memory and storage values for different VMs. For extensibility, these property values are also recorded in a configuration XML file. Once a VM request is allocated to a PM, the remaining resource capacity is decreased by the value of that request, and the capacity is increased again when the request is released.

TABLE 3
8 types of virtual machines (VMs) in Amazon EC2

Compute Units | Memory | Storage | VM Type
1 unit | 1.7 GB | 160 GB | 1-1(1)
4 units | 7.5 GB | 850 GB | 1-2(2)
8 units | 15 GB | 1690 GB | 1-3(3)
6.5 units | 17.1 GB | 420 GB | 2-1(4)
13 units | 34.2 GB | 850 GB | 2-2(5)
26 units | 68.4 GB | 1690 GB | 2-3(6)
5 units | 1.7 GB | 350 GB | 3-1(7)
20 units | 7 GB | 1690 GB | 3-2(8)

In FlexCloud, several VM request generation approaches have been implemented, in which requests can be generated following Poisson, Normal and Random distributions. When a specific distribution is selected, the start time or duration of the generated requests follows that distribution. In Section 5, we show the data collected from different distributions. Moreover, FlexCloud can import request data from a file in the Generate VMs step (see Section 3.6), which means it can be tested with realistic data.

3.3 Modeling Scheduling Algorithms in FlexCloud

Four kinds of scheduling algorithms are provided in FlexCloud, based on scheduling goals and request types. By request type, scheduling algorithms can be divided into online algorithms and offline algorithms; the difference lies in whether all request information is known before scheduling. In an online algorithm, requests arrive and are processed one by one, while in an offline algorithm the request sequence can be reordered by processing time or end time, because all request information has been collected before scheduling. The other division principle is by goal: we consider load balancing and energy saving in FlexCloud.

When comparing the effects of different algorithms, the scheduling process is the same except for the scheduling algorithm itself. For online load balancing comparison, the Random, Round-Robin (Round) and List Scheduling (LS) algorithms have been implemented. Under the layered architectural model and the related design patterns (introduced in a later section), newly created algorithms can be added to the scheduling algorithm library without influencing the existing algorithms.

We take the Random algorithm, one of the simplest algorithms, as an example to show how a scheduling algorithm can be modeled, mapped and extended in FlexCloud. Fig. 5 shows the pseudo-code of the Random algorithm. After the datacenter scheduler has initialized the PMs and VM requests, the Random algorithm randomly generates an index in the range 0 to M-1 (line 3) for PMs. The VM request is then allocated to the PM with the generated index (line 5). If the allocation succeeds, which is determined by checking whether the PM has available resources, the allocated PM updates its remaining capacity (line 7); if the allocation fails, another index is generated and the VM request is allocated again with the new index (lines 8-9). As VM requests have lifecycles, they are released from the hosting PM after their end time expires, to prepare for the allocation of other VM requests (lines 11 to 12). This example is based on a single data center; with multiple data centers, the index generation process would involve data center id generation, rack id generation and PM id generation rather than PM id generation only.

Fig. 5. The pseudocode of Random algorithm

The process of the R-R and LS algorithms is quite similar to Random, except for the way the index is generated for PMs or VMs. In the R-R algorithm, the index is generated in a round-robin way, while in the LS algorithm the index refers to the PM with the least average utilization. The more complex algorithms that consider migration operations, the Post Migration algorithm and the Prepartition algorithm introduced in Section 3.5, can also be modeled based on these principles. For offline load balancing algorithms, a request processing procedure should be added before line 4. This procedure may change the order of requests by processing time, end time or other features. The process of online/offline energy-saving algorithms is similar to that of online/offline load balancing algorithms.

3.4 Performance Metrics in FlexCloud

In this section, we introduce the major performance metrics used in FlexCloud.

For load balancing algorithms:

Average utilization: Each PM has a utilization value during the scheduling process, and the average utilization is the arithmetic mean over all PMs in the data center.

PM resource: $PM_i(i, PCPU_i, PMem_i, PStorage_i)$, where $i$ is the index number of the PM and $PCPU_i$, $PMem_i$, $PStorage_i$ are the CPU, memory and storage capacities that the PM can provide.

VM resource: $VM_j(j, VCPU_j, VMem_j, VStorage_j, T_j^{start}, T_j^{end})$, where $j$ is the VM type ID, $VCPU_j$, $VMem_j$, $VStorage_j$ are the CPU, memory and storage requirements of $VM_j$, and $T_j^{start}$, $T_j^{end}$ are the start time and end time, which represent the life cycle of a VM.

Time slot: we consider a time span from 0 to $T$ divided into parts of the same length. The $n$ parts can be defined as $[(t_1 - t_0), (t_2 - t_1), \ldots, (t_n - t_{n-1})]$, and each time slot $T_k$ denotes the time span $(t_k - t_{k-1})$.

Average CPU utilization of $PM_i$ during slots 0 to $T_n$:

$$PCPU_i^U = \frac{\sum_{k=0}^{n} \left(PCPU_i^{T_k} \times T_k\right)}{\sum_{k=0}^{n} T_k} \tag{1}$$

The memory utilization $PMem_i^U$ and storage utilization $PStorage_i^U$ of both PMs and VMs can be computed in the same way. Similarly, the average CPU utilization of a VM can be computed.

Integrated load imbalance value $ILB_i$ of $PM_i$: In statistics, the variance is widely used as a measure of how far a set of values is spread out. Using the variance, an integrated load imbalance value $ILB_i$ of server $i$ is defined as

$$ILB_i = \frac{(Avg_i - CPU_u^A)^2}{3} + \frac{(Avg_i - Mem_u^A)^2}{3} + \frac{(Avg_i - Storage_u^A)^2}{3} \tag{2}$$

where

$$Avg_i = \frac{PCPU_i^U + PMem_i^U + PStorage_i^U}{3} \tag{3}$$

and $CPU_u^A$, $Mem_u^A$, $Storage_u^A$ are respectively the average utilization of CPU, memory and storage in a Cloud data center. $ILB_i$ indicates the load imbalance level by comparing the utilization of CPU, memory and network bandwidth of a single server itself.

Makespan: defined in the traditional way; the capacity makespan of all PMs can be formulated as

$$\text{capacity makespan} = \max_i (L_i) \tag{4}$$

Load efficiency (skew of makespan): defined as the minimal average load divided by the maximal average load over all machines:

$$\mathrm{skew}(\text{makespan}) = \frac{\min_i (L_i)}{\max_i (L_i)} \tag{5}$$

where $L_i$ is the load of PM $i$. The skew shows the load balancing efficiency to some degree.

Capacity makespan: In any allocation of VM requests to PMs, let $A(i)$ denote the set of VM requests allocated to machine $PM_i$. Under this allocation, machine $PM_i$ has total load

$$L_i = \sum_{j \in A(i)} c_j t_j \tag{6}$$

Based on the above definitions and equations, we have developed another metric, the capacity skew of a load balancing algorithm, for the new situation as follows.

Skew of capacity makespan: defined as the minimal capacity makespan over the maximal capacity makespan on all machines (referring to equation (6)):

$$\mathrm{skew}(\text{capacity makespan}) = \frac{\min_i \sum_{j \in A(i)} c_j t_j}{\max_i \sum_{j \in A(i)} c_j t_j} \tag{7}$$

where $c_j$ is the capacity (for example CPU) requested by $VM_j$ and $t_j$ is the span of request $j$ (i.e., the length of the processing time of request $j$).

For energy saving algorithms, the following indices are provided:
1) The total number of PMs turned on during the scheduling;
2) Rejected number of VM requests: VM requests that cannot be served by the data center resources;
3) Total energy consumption: the energy consumption of all PMs (including the VMs allocated on them); a lower total energy consumption value reflects a better energy saving effect for a given set of requests.

In [1], the authors found that CPU utilization is typically proportional to the overall system load, and proposed the power model defined in equation (8):

$$P(u) = k P_{max} + (1 - k) P_{max} u \tag{8}$$

where $P_{max}$ is the maximum power consumed when the server is fully utilized; $k$ is the fraction of power consumed by the idle server (studies show that on average it is about 70%); and $u$ is the CPU utilization. This paper focuses on CPU power consumption, which accounts for the main part of the energy compared to other resources such as memory, disk storage and network devices. In FlexCloud, we use the power model defined in (8). Equation (8) is further reduced to (9):

$$P = P_{min} + (P_{max} - P_{min}) u \tag{9}$$

where $P_{min}$ is the power of a given PM when its CPU utilization is zero (the PM is idle without any VM running). In a real environment, the CPU utilization may change over time due to workload variability. Thus, the CPU utilization is a function of time and is represented as $u(t)$. Therefore, the total energy consumption of a PM ($E_i$) can be defined as the integral of the power consumption function over a period of time, as in (10):

$$E_i = \int_{t_0}^{t_1} P(u(t))\, dt \tag{10}$$

If $u(t)$ is constant over time, for example when the average utilization is adopted so that $u(t) = u$, then $E_i = P(u) \times (t_1 - t_0)$.

The total energy consumption of a cloud data center is computed as in (11):

$$E_{DC} = \sum_{i=1}^{n} E_i \tag{11}$$

It is the sum of the energy consumed by all PMs. Note that the energy consumption of all VMs on the PMs is included.
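As a minimal sketch of equations (8) to (11), the following Java snippet computes the power of a PM from its CPU utilization and sums the energy over several PMs under the constant-utilization assumption. The class and method names (EnergyModelSketch, power, energy, totalEnergy) and the sample wattage are illustrative assumptions, not part of FlexCloud's actual code.

```java
// Illustrative sketch of equations (8)-(11). All names and sample values are
// hypothetical; they are not taken from the FlexCloud source code.
class EnergyModelSketch {

    // Equation (9): P = P_min + (P_max - P_min) * u, with u in [0, 1].
    static double power(double pMin, double pMax, double u) {
        if (u < 0.0 || u > 1.0) {
            throw new IllegalArgumentException("utilization must be in [0, 1]");
        }
        return pMin + (pMax - pMin) * u;
    }

    // Equation (10) with constant utilization: E_i = P(u) * (t1 - t0).
    static double energy(double pMin, double pMax, double u, double t0, double t1) {
        return power(pMin, pMax, u) * (t1 - t0);
    }

    // Equation (11): E_DC is the sum of the energy consumed by all PMs.
    static double totalEnergy(double[] perPmEnergy) {
        double sum = 0.0;
        for (double e : perPmEnergy) {
            sum += e;
        }
        return sum;
    }

    public static void main(String[] args) {
        double pMax = 200.0;       // assumed maximum power of one PM
        double pMin = 0.7 * pMax;  // equation (8) with k = 0.7 means P_min = k * P_max
        double busyPm = energy(pMin, pMax, 0.5, 0.0, 10.0); // PM at 50% utilization
        double idlePm = energy(pMin, pMax, 0.0, 0.0, 10.0); // idle PM, still turned on
        System.out.println(totalEnergy(new double[] {busyPm, idlePm}));
    }
}
```

The sketch makes the link between (8) and (9) explicit: substituting $P_{min} = k P_{max}$ into (9) gives back (8), so an idle but turned-on PM still contributes $P_{min}$ for every time unit, which is why the number of turned-on PMs is itself an energy-saving index.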
Also, confidence intervals can be calculated for different metrics as follows. Let $x_1, x_2, x_3, \ldots, x_n$ be the calculated metrics (such as $ILB_{tot}$ and $E_{cdc}$ values) from $n$ repeated simulations. Then the mean is

$$x_{mean} = \frac{1}{n} \sum_{i=1}^{n} x_i \tag{12}$$

the standard deviation $s$ is

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_{mean} - x_i)^2}{n - 1}} \tag{13}$$

and the confidence interval at 95% confidence is given by

$$\left(x_{mean} - 1.96 \frac{s}{\sqrt{n}},\ x_{mean} + 1.96 \frac{s}{\sqrt{n}}\right) \tag{14}$$

The above are the basic metrics already implemented in FlexCloud. Other metrics could also be included for further research.

3.5 Modeling Virtual Machine Migrations in FlexCloud

Existing simulation tools lack virtual machine migration modeling. In [32], detailed migration algorithms are introduced and compared. In this section, we provide a brief introduction to virtual machine migration modeling in FlexCloud. The key differences from allocation are the migration objectives and the choice of source and destination PMs. Two typical migration algorithms are introduced in FlexCloud:

Post Migration algorithm: First, it processes the requests in the same way as LPT (Longest Processing Time first) does. Then the average capacity makespan of all jobs is calculated. The up-threshold and low-threshold of the capacity makespan for the post migration are calculated from the average capacity makespan and a factor (in this paper we set the factor to 0.1, so the up-threshold is the average capacity makespan multiplied by 1.1 and the low-threshold is the average multiplied by 0.9). Of course, the factor can be set dynamically to meet different requirements; however, the larger the factor is, the higher the imbalance is. A migration list is formed by collecting the VMs taken from PMs with a capacity makespan higher than the low-threshold. VMs are taken from a PM only if the operation does not cause the capacity makespan of the PM to fall below the low-threshold. After that, the VMs in the migration list are re-allocated to PMs with a capacity makespan less than the up-threshold. A VM is allocated to a new PM only if the operation does not cause the capacity makespan of the PM to exceed the up-threshold. There may still be some VMs left in the list; finally, the algorithm allocates the remaining VMs to the PMs with the lowest capacity makespan until the list is empty.

Capacity makespan Prepartition Algorithm: a novel work proposed by ourselves. For a given set of VM reservations, consider $m$ PMs in a data center and denote by $OPT$ the optimal solution for a given set of $J$ VM reservations. First define

$$P_0 = \max\left\{ \max_{j=1}^{J} CM_j,\ \frac{1}{m} \sum_{j=1}^{J} CM_j \right\} \le OPT \tag{15}$$

$P_0$ is a lower bound on $OPT$. The Capacity makespan Prepartition algorithm is introduced in detail in [32]. It first computes the balance value by equation (15), defines the partition value ($k$) and finds the length of each partition (i.e. $\lceil P_0/k \rceil$, which is the maximum time length a VM can continuously run on a PM). For each request, Prepartition equally partitions it into multiple subintervals of length $\lceil P_0/k \rceil$ if its $CM$ is larger than $\lceil P_0/k \rceil$, then finds a PM with the lowest average capacity makespan and available capacity, and updates the load on each PM. After all requests are allocated, the algorithm computes the capacity makespan of each PM and finds the total number of partitions (migrations). In practice, the scheduler has to record all possible subintervals of each request and their hosting PMs, so that migrations of VMs can be conducted in advance to reduce overheads.

FlexCloud can therefore evaluate the performance of different migration algorithms; the evaluation process is similar to that of allocation algorithms.

3.6 The Scheduling Process in FlexCloud

The major steps of the scheduling process in FlexCloud are as follows:
1) Booting PMs: it loads the configuration XML file containing the PM specifications set by the user from the user interface. After the needed information is collected, instances of PMs are created to prepare for VM allocation.
2) Generating traces (VM requests): it loads the configuration XML file containing VM specifications and VM traces from the user interface.
3) Comparing scheduling algorithms: two or more iterators collect the compared algorithms and compared indices. All selected algorithms are compared and the corresponding indices are collected.
4) Output results: the comparison results are output in text and diagram formats.

To build a more flexible system, the scheduling process defines only the basic framework, and customizations can be built on top of this process. Before booting PMs, the PM specifications can be modified in the configuration file. For generating VM request traces, more VM request creation methods can be implemented besides the configuration file. Various algorithms can be developed in the algorithm scheduling process, and other result formats may be adopted for better visual effects. Fig. 6 shows a scheduling process combining user interface configurations and the basic scheduling process.

Fig. 6. Scheduling process in FlexCloud

4 IMPLEMENTATION OF FLEXCLOUD

In this section, we introduce the detailed implementation of FlexCloud from the point of view of design patterns. The design principles mainly aim at satisfying agile system goals with flexibility and extendibility.

4.1 Main Features in FlexCloud

Considering these design principles, FlexCloud has the following main novel features:

(1) FlexCloud is built on the Java platform and can be run on a single computer with a JVM installed to simulate large-scale cloud infrastructure as a service (IaaS). A computer with 4 GB memory can simulate larger scale applications. We tested the condition under which the Java environment throws an OutOfMemory exception by increasing the number of requests gradually. With a 4 GB memory computer, experiments can simulate the scheduling process of more than 100,000 requests. We extended our tests to computers with 2 GB memory; that configuration can simulate between 25,000 and 50,000 requests.

(2) A user-friendly GUI is provided, and many customized configurations can be set to satisfy various simulation assumptions. The basic operations include: select algorithm type, set VM numbers, set average duration, set start time, set total number of PMs, and select comparison algorithms and indices. Of course, without the GUI, the user can also simulate a cloud datacenter and scheduling process in a .java class file.

(3) A scheduling process framework is defined, and each step of the process can be extended easily in an agile style.

(4) New scheduling algorithms and performance metrics are flexible and easy to add; currently load-balancing and energy-efficiency scheduling algorithms are considered.

(5) Virtual machine migration is modeled, which is still lacking in current simulation tools.

4.2 Design Patterns in FlexCloud

FlexCloud has adopted several design patterns to meet the extendibility goal. Fig. 7 shows the decorator pattern, which meets the requirement that request generation approaches may differ. Classes CreateVMDecoratorA and CreateVMDecorateB extend the createVM() method of CreateVM and add a new behavior method. With this pattern, when new request generation approaches are needed, we can rewrite the method addedBehavior(). With this method, functions of class CreateVM can be dynamically added or deleted. For instance, if a new resource is needed, the resource collection code can be put in the addedBehavior() method rather than changing the existing code or adding new classes.

Fig. 7. Decorator pattern in FlexCloud

Fig. 8. Abstract Factory pattern in FlexCloud

Fig. 8 shows the Abstract Factory pattern, which meets the requirement of different compositions of request generation approaches and scheduling algorithms. An instance of LoadBalanceFactory combines a request generation approach from CreateVM with scheduling algorithms from the generalization of AllocateAlgorihm.
These different combinations can pro- (4) New scheduling algorithms and performance duce diverse scheduling process, like set requests 10 TABLE4 order byprocessing timeand scheduledby asubtype TheoreticalandsimulationresultscomparisonofLS of class OnlineAlgorithm. Adopting this design pat- algorithm tern can avoid fixed composition and gain a better extendable effect. LSIndices Theoretical Simulation AverageUtilization 0.5 0.5 ImbalanceDegree 0.0 0.0 Makespan 0.5 0.5 Skew(makespan) 1 1 Capacity makespan 50 50 Skew(capacity makespan) 1 1 rithms and indices should extend from their base class. The strength for this design pattern is also Fig.9. StrategypatterninFlexCloud a favorable extendibility. It’s intelligent when new algorithms and indices are implemented, the Iterator would compare results only if they are appended to Strategy pattern in Fig.9 defines a series encapsu- it. late scheduling algorithm classes: Random, Round- Robin, and OLRSA [18] for OnlineAlgorithm and LPT(LongestProcessTime)etc.forOfflineAlgorithm, 4.3 CommunicationsBetweenEntities which can be substituted with each other, enabling schedulingalgorithmsbeindependentonthechanges fromusers.Withstrategydesignpattern,whenanew algorithm is joined, only allocate() method in new joined algorithm should be implemented. After that, thenewjoinedalgorithmcanworkassameasexisting algorithms. Fig.11. Sequencediagramofbasicprocess Fig.11 depicts the flow of communication among important FlexCloud entities. At the beginning of the simulation, a DataCenter entity sends necessary messages that BootPM entity needs to start PMs pro- viding services in datacenter. CreateVM entity would also accept messages it needs to create VM requests. AlgorithmIteratorandIndexIteratorentitiesacttorun scheduling algorithms and send calculated indices values back to DataCenter entity. Thecommunicationflowdescribedaboveisabasic flow in a simulated experiment. Some variations in Fig.10. 
IteratorpatterninFlexCloud this flow are possible depending on the scheduling process. For example, before bootPM() and creat- Fig.10 shows the Iterator pattern to meet the re- eVM(), the message sent by a datacenter should be quirement of algorithm and indices results compar- verified. ison in data centers. After user have selected the comparison indices and algorithms on user interface, 5 VALIDATION OF FLEXCLOUD the selected algorithms and indices would be added toseparatelist,atthesametime,Iteratorforalgorithm To validate the accuracy of FlexCloud, we have de- and index, Index Iterator and Algorithm Iterator, signed some test cases to compare the theoretical would be generated. When outputting results, the results and simulation results. In this section, we use Iterator would schedule algorithms in Iterator one LS (List Scheduling), LPT(Longest Processing Time by one and output indices results with showIndex() First), EDF(End-time Decreasing First) algorithms to method in order. To satisfy the Iterator, both algo- compare theoretical and simulation results.
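As an illustration of the Strategy pattern from Section 4.2, the following is a minimal Java sketch, not FlexCloud's actual source. Only the allocate() method name and the algorithm names Round-Robin and Random come from the text; the interface AllocationStrategy, the classes RandomChoice and Scheduler, the method signature, and the use of integer PM indices are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class StrategySketch {

    /** Hypothetical strategy interface; the text only names the allocate() method. */
    interface AllocationStrategy {
        /** Returns, for each request in order, the index of the PM it is placed on. */
        List<Integer> allocate(int requestCount, int pmCount);
    }

    /** Round-Robin: request i is placed on PM (i mod pmCount). */
    static class RoundRobin implements AllocationStrategy {
        @Override
        public List<Integer> allocate(int requestCount, int pmCount) {
            List<Integer> placement = new ArrayList<>();
            for (int i = 0; i < requestCount; i++) {
                placement.add(i % pmCount);
            }
            return placement;
        }
    }

    /** Random: each request goes to a uniformly chosen PM (seeded for repeatability). */
    static class RandomChoice implements AllocationStrategy {
        private final Random rng;

        RandomChoice(long seed) {
            this.rng = new Random(seed);
        }

        @Override
        public List<Integer> allocate(int requestCount, int pmCount) {
            List<Integer> placement = new ArrayList<>();
            for (int i = 0; i < requestCount; i++) {
                placement.add(rng.nextInt(pmCount));
            }
            return placement;
        }
    }

    /** Hypothetical context: it depends only on the interface, so strategies are interchangeable. */
    static class Scheduler {
        private AllocationStrategy strategy;

        Scheduler(AllocationStrategy strategy) {
            this.strategy = strategy;
        }

        void setStrategy(AllocationStrategy strategy) {
            this.strategy = strategy;
        }

        List<Integer> run(int requestCount, int pmCount) {
            return strategy.allocate(requestCount, pmCount);
        }
    }

    public static void main(String[] args) {
        Scheduler scheduler = new Scheduler(new RoundRobin());
        System.out.println(scheduler.run(5, 3)); // prints [0, 1, 2, 0, 1]

        // Swap the algorithm without changing Scheduler.
        scheduler.setStrategy(new RandomChoice(42L));
        System.out.println(scheduler.run(5, 3).size()); // prints 5
    }
}
```

Adding a further algorithm in this sketch means writing one more class that implements allocate(); Scheduler stays unchanged, which is the extendibility property the text attributes to the pattern.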
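The thresholds used by the Post Migration algorithm in the migration-modeling discussion reduce to simple arithmetic. The Java sketch below is an illustration under stated assumptions rather than FlexCloud code: the class and method names are hypothetical, and the average of 50 and the per-PM capacity makespan values in main are made-up inputs. It computes the two thresholds for a given factor and lists the PMs above the low-threshold, which the text names as the PMs that VMs may be taken from.

```java
import java.util.ArrayList;
import java.util.List;

public class PostMigrationThresholds {

    /** Up-threshold: the average capacity makespan scaled by (1 + factor). */
    static double upThreshold(double average, double factor) {
        return average * (1.0 + factor);
    }

    /** Low-threshold: the average capacity makespan scaled by (1 - factor). */
    static double lowThreshold(double average, double factor) {
        return average * (1.0 - factor);
    }

    /** Indices of PMs whose capacity makespan is strictly above the low-threshold. */
    static List<Integer> candidateSources(double[] pmCapacityMakespans, double low) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < pmCapacityMakespans.length; i++) {
            if (pmCapacityMakespans[i] > low) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double average = 50.0; // hypothetical average capacity makespan
        double factor = 0.1;   // the factor value used in the paper
        double up = upThreshold(average, factor);
        double low = lowThreshold(average, factor);
        System.out.printf("up=%.1f low=%.1f%n", up, low); // prints up=55.0 low=45.0

        double[] pms = {40.0, 50.0, 60.0}; // hypothetical per-PM capacity makespans
        System.out.println(candidateSources(pms, low)); // prints [1, 2]
    }
}
```

A larger factor widens the band between the two thresholds, so fewer VMs qualify for migration, which matches the remark that a larger factor leads to higher imbalance.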
