
A Workflow for Fast Evaluation of Mapping Heuristics Targeting Cloud Infrastructures

Roman Ursu∗, Khalid Latif∗, David Novo∗, Manuel Selva∗, Abdoulaye Gamatie∗, Gilles Sassatelli∗, Dmitry Khabi†, Alexey Cheptsov†
∗LIRMM / CNRS - University of Montpellier, France, {firstname.lastname}@lirmm.fr
†High-Performance Computing Center of Stuttgart (HLRS), Germany, {lastname}@hlrs.de

arXiv:1601.07420v1 [cs.DC] 27 Jan 2016

Abstract. Resource allocation is today an integral part of cloud infrastructure management, needed to exploit resources efficiently. Cloud infrastructure centers generally use custom-built heuristics to define the resource allocations. The management tools of these centers therefore need a fast yet reasonably accurate simulation and evaluation platform to define the resource allocation for cloud applications. This work proposes a framework that allows users to easily specify mappings for cloud applications described in the AMALTHEA format, used in the context of the DreamCloud European project, and to assess the quality of these mappings. The two quality metrics provided by the framework are execution time and energy consumption.

I. INTRODUCTION

The use of cloud infrastructures is common in today's scientific research. A typical cloud system is a sophisticated tool that combines the computing, networking and storage capabilities of multiple compute resources to deliver a significant performance advantage over traditional desktop systems. Due to their size, cost and complexity of operation, cloud infrastructures are almost never used exclusively by a single user. Instead, a number of users compete for access, and thus mechanisms for effective resource sharing among multiple consumers need to be in place. The process of allocating resources efficiently and scheduling computational tasks onto these allocated resources is based on resource allocation and scheduling priorities. A typical scheduler targets multiple goals, including maximization of throughput and minimization of response time (i.e., latency). Other optimization criteria such as CPU fairness or energy-awareness are also becoming more important. Since some of these goals conflict with each other, the job of the scheduler is to find a reasonable allocation strategy. The approach of the DreamCloud project is to enable a closed-loop control mechanism that allows for dynamic resource allocation. In order to assess the quality of the resource allocation policy, the use of dedicated fast simulators is essential.

Contributions of this paper. To address the above requirement, this work proposes a configurable framework for fast performance evaluation of different resource allocation strategies targeting cloud infrastructures. The implemented framework connects AMALTHEA [1] application models, used to describe an application as a task graph in the context of the DreamCloud project (see Section III for the details of the model), to the SimGrid [2] simulation platform. It aims to be an open-source and flexible framework supporting a variety of state-of-the-art mapping algorithms.

Outline. In the rest of the paper, Section II discusses related work. Section III introduces the AMALTHEA application modeling language. The core engine of our simulation framework, which allows AMALTHEA applications to be simulated on top of SimGrid, is presented in Section IV. The proposed framework is introduced in Section V. Section VI presents the results of using this flow to evaluate the quality of different mappings for a scientific application case study executed on top of a typical cloud infrastructure. Finally, conclusions and perspectives are presented in Section VII.

II. RELATED WORK

A variety of cloud application simulation frameworks have been proposed. CloudSim [3] is a Java-based software framework that supports modeling and simulation of large-scale cloud computing components and environments, including data centers, virtual machines, service brokers, and provisioning policies. The features of CloudSim include a virtualization engine that aids in the creation and management of multiple, independent, and co-hosted virtualized services on a data center node. Additionally, CloudSim provides the flexibility to switch between space-shared and time-shared allocation of processing cores to virtualized services. These features would speed up the development of new application provisioning algorithms for cloud computing. However, it is not possible to define task-level allocation policies in CloudSim.

iCanCloud [4] is an open-source C++ based simulation platform designed to model and simulate cloud computing systems. A user can customize the core hypervisor class, which in turn is the core of iCanCloud. The Amazon model has been integrated with the simulator. From the DreamCloud perspective, being a commercial tool, it does not provide complete access to flexibly deal with application requirements. There is no support for power consumption analysis. Additionally, the focus of iCanCloud is to define and integrate new brokering policies and to minimize the user cost for public cloud infrastructure based on classical scheduling heuristics, whereas the main focus of the DreamCloud project is dynamic resource allocation.

GreenCloud [5] is a sophisticated packet-level cloud simulator with a focus on energy consumption for cloud communications. It is a C++/OTcl based tool built on the NS-2 platform. GreenCloud provides fine-grained modeling of the energy consumption of the IT equipment in a data center, such as computer servers, network switches and communication links. From the DreamCloud perspective, GreenCloud does not provide the detailed timing analysis that is an important requirement for evaluating dynamic resource allocations. To introduce detailed timing analysis, a designer needs to define timing models in the NS-2 environment, which is a rather outdated and very complex tool.

SimGrid [2] is an open-source and mature C-based tool for studying the behavior of large-scale distributed systems such as grids, clouds, HPC or P2P systems. It can be used to evaluate heuristics, prototype applications or even assess legacy MPI applications. Task-level scheduling is supported by SimGrid. A power consumption analysis tool has been integrated into the latest version of SimGrid. Additionally, the developers are very responsive. Given these facts, and following a basic study, SimGrid was selected for integration into our design framework for performance evaluation of applications executed on top of cloud infrastructures.

III. APPLICATION MODEL

The DreamCloud project addresses resource allocation issues in future multicore and distributed systems. From the applications perspective, the project focuses on both the automotive and the cloud computing domains. The AMALTHEA [1] application model has been chosen to capture application specifications from these two domains because it provides clean abstractions that can capture both of them. AMALTHEA is based on three main entities: labels representing data elements, runnables denoting the smallest units of code schedulable by an OS, and tasks defined as graphs of runnables.

In AMALTHEA, an application model is represented as a directed task graph, where vertices represent tasks and directed edges represent either inter-task activation or communication. Figure 1 shows an arbitrary AMALTHEA application model with seven tasks and three labels. As an example of inter-task communication, task T1 communicates with task T3 via label L3. For this purpose, T1 performs a write operation on L3 while T3 performs a read operation on L3. The size of L3 represents the exchanged data volume. At a lower granularity level, a task is composed of runnables, as illustrated in Figure 1. Task T4 is composed of eight runnables. The runnable graph also provides information about the precedence relationships between runnables.

Fig. 1: An illustrative representation of an application as a task graph and of a task as a runnable graph, with inter-task activation and communication.

A runnable is a set of "abstract" instructions. In the AMALTHEA application model, such instructions are categorized as computation and communication instructions. Computation instructions cover operations such as arithmetic or logical operations. For quick simulations, the actual computational instructions are abstracted away and represented by their associated platform-dependent execution time in clock cycles. Communication instructions comprise read/write accesses to labels, which can be further classified into two categories: local and remote accesses. Local accesses take place when the target label is mapped on the same node as the requesting runnable. Remote accesses take place when the target label is mapped on a different node than the requesting runnable. In this case, the runnable goes into the wait state to wait for the read/write response and makes the core available for other runnables to execute their instructions. Figure 1 details the instructions of runnable R45 from the runnable graph of task T4. The computation instructions are represented by their execution delay values. The communication instructions are represented as read/write accesses to their corresponding labels. Their simulation consists of the actual transfer of a data size corresponding to that of the labels accessed in the read/write requests.

IV. FROM SIMGRID TO AMALTHEA

We now review how we implemented the simulation of AMALTHEA on top of SimGrid. Before going into the details, we describe the basic architecture of SimGrid.

A. SimGrid Architecture

SimGrid has been an active project for more than 10 years. Since its start, it has been evolving and taking new systems into consideration. Initially, it was dedicated to grid simulation and focused on the study of centralized scheduling simulations (e.g., off-line scheduling of a Directed Acyclic Graph (DAG) on a heterogeneous set of distributed computing nodes). Nowadays, SimGrid deals with a variety of distributed systems ranging from grids to peer-to-peer systems, as well as arbitrarily defined hybrid systems combining different computing systems. The latest stable and publicly available version of SimGrid (3.2) implements the architecture shown in Figure 2.

Fig. 2: SimGrid architecture and the contribution of this work. SimGrid provides three different APIs allowing user code to write custom simulations. This work relies on the MSG API.

SimGrid provides three APIs, all based on a common kernel, allowing users to write custom simulations. SURF is this internal kernel. It provides the core functionalities to simulate a virtual platform. It is implemented at a very low level and is not intended to be used by end users. The SimDag, MSG, and SMPI APIs are implemented on top of the SURF layer. MSG is a simple programming environment, and was the first distributed programming environment provided within SimGrid. It is still the most commonly used programming environment and can be used to build rather realistic simulations. The SimDag API provides a specific programming environment for DAG applications, while the SMPI API provides support for simulating MPI applications. To simulate the rich set of abstractions provided by the AMALTHEA model described in the previous section, we rely on the MSG API.

B. AMALTHEA Execution on SimGrid

To execute AMALTHEA applications on top of SimGrid, we implemented a simulation core engine that reads the in-memory object representation of an AMALTHEA model and converts it to SimGrid function calls. Each runnable of the initial application is converted into a SimGrid process. Runnable activation dependencies and communications through labels are then implemented using SimGrid messages.

In order to fit AMALTHEA applications into SimGrid, we simplified the AMALTHEA semantics by gathering, for each runnable, all the read instructions together, all the computing instructions together and all the write instructions together. As a result, each runnable process first waits to receive activation messages from all the runnables it depends on. Then, the runnable process sends the messages corresponding to the read label instructions it contains. The computational load of the process required by SimGrid is then computed by adding up the execution times of all the compute instructions of the runnable. Finally, the runnable process performs its write label operations and activates all subsequent runnables.

V. WORKFLOW

In order to simulate AMALTHEA applications on top of a cloud infrastructure, we propose the workflow depicted in Figure 3. This workflow comprises the following 4 steps:

• Step 1. The application specifications are modeled in AMALTHEA, as discussed in Section III.
• Step 2. The AMALTHEA model is fed to a parser. The job of the parser is to translate the AMALTHEA model into a format that can then be easily simulated using SimGrid. This parser has been written in C++ to be easily integrated with SimGrid.
• Step 3. The mapper then executes the mapping algorithms by allocating tasks and labels onto the simulation platform. The description of the platform is provided to the mapper. Because this work focuses on providing a modular workflow to evaluate mapping strategies, the proposed workflow does not provide elaborate mapping strategies but rather a clean and simple interface that lets users implement any mapping algorithm. Nevertheless, to illustrate the complete workflow, basic mapping strategies are provided by default.
• Step 4. At this stage, the AMALTHEA runnables are executed on the simulation platform as described in Section IV-B. Ensuring the AMALTHEA execution semantics on top of the SimGrid MSG API was the biggest implementation challenge of the workflow.

Fig. 3: Simulation workflow from application specifications to simulation results.

As results, the simulation platform provides information about execution time, execution steps and energy consumption. All these results are provided in the form of text files. The energy consumption results are provided both globally and per host. Moreover, the workflow provides a timeline view of the execution of runnable processes on the different hosts of the platform. This timeline view also includes runnable dependencies and label read and write information. It is provided as a standard timeline trace file that can be visualized using the Vite open source tool (http://vite.gforge.inria.fr).

VI. EXPERIMENTAL RESULTS

We now show how the proposed workflow can be used to evaluate an application model and a cloud infrastructure provided by the High Performance Computing Center Stuttgart (HLRS), which is involved in the DreamCloud project. In all the following experiments, to illustrate the workflow, we use a mapping algorithm that randomly allocates the runnables and the labels on the underlying hardware infrastructure.

A. eScience Application

The application use case is based on an eScience application, the MS2 genetic algorithm [6], which performs molecular dynamics simulation. The goal of the simulation is to predict thermophysical properties of condensed fluids, which is required for the design and optimization of chemical engineering processes. The eScience workflow as an AMALTHEA model is depicted in Figure 4. MS2 is used to optimize the algorithm that tries to fit the parameters of an existing model to data collected through experiments. The algorithm is executed until a good enough fit is found, a time limit is hit or a fixed number of iterations have been executed. This is handled by the task called Check termination. In this application model, each AMALTHEA task contains a single runnable, as depicted in Figure 4. As shown for the runnable R7, each of these runnables first reads input data from an input label, then performs a specific number of computation instructions and finally writes output data to an output label. In all the following experiments, 32 MS2 tasks are considered.

Fig. 4: The eScience case study application.

B. Cloud Infrastructure

The cloud infrastructure model we use in the following experiments, also provided by HLRS, is shown in Figure 5. It contains two nodes, named node 0 and node 1. Node 0 contains three hosts (node0,0, node0,1 and node0,2) while node 1 contains two hosts (node1,0 and node1,1). The configuration details of each host are given in the figure. The Frontend host acts as the master, and resource allocations are performed by this host. The bandwidth of each interconnection link is also provided.

Fig. 5: The HLRS cloud platform.

C. Results

1) Preliminary mapping evaluation: We simulate the execution of the eScience application on the cloud infrastructure described previously. During each simulation instance, we trigger only one run of the application, i.e., the execution of the application completes as soon as the task named Check termination finishes. In a more general execution setting, the eScience application could iterate several times until the data fit the model (see Section VI-A).

We consider two variants of the cloud infrastructure: a heterogeneous and a homogeneous platform version. The former is depicted in Figure 5, while the latter is obtained by removing from the heterogeneous platform the host HOST_0_2, which is a GPU.

We conducted two sets of experiments. In the first experiment, we simulated the execution of 100 random static mappings of the eScience application on the homogeneous platform. The characterization of each mapping in terms of execution time, energy dissipation and simulation time is shown in Figure 6. The red and blue dots in the figure point to the worst and best mappings in terms of execution time.

Fig. 6: Simulation on homogeneous platform. Panels: (a) Execution time, (b) Energy consumption, (c) Simulation time.

Fig. 7: Simulation on heterogeneous platform. Panels: (a) Execution time, (b) Energy consumption, (c) Simulation time.

The second experiment is similar to the first one, with the difference that the heterogeneous platform is considered. The corresponding results are shown in Figure 7. Here, for the mappings with the shortest and longest execution times, the detailed temporal evolution of the execution is depicted in Figures 8(a) and 8(b) respectively. We can notice that in this heterogeneous case, the mapping with the shortest execution time may not be the most efficient one from the energy point of view. Indeed, mapping number 28 may be a good candidate, as it provides an execution time close to the shortest one while using around 15 percent less energy.

Fig. 8: Zoom into the best and worst scenarios in simulating the execution on the homogeneous platform. Panels: (a) Best mapping scenario, (b) Worst mapping scenario.

The execution time in Figure 8(a) is shorter than in Figure 8(b) because in the former case the MS2 tasks are more evenly distributed over the hosts. Indeed, in Figure 8(b), the host HOST_0_1 executes 15 MS2 tasks and the host HOST_0_0 only 4 tasks. Thus, in Figure 8(b) the host HOST_0_1 is overloaded compared to the other hosts: it has to provide computational resources to more tasks than the other hosts. Consequently, the tasks on the host HOST_0_1 need more time to execute. Indeed, in Figure 8(b) the tasks executing on the host HOST_0_1 need 50% more time to finish than the tasks executing on the other hosts.

2) Large-scale mapping evaluation: We extend the previous evaluation to a larger number of mappings in order to get better insight into the impact of the mapping on execution time. Figure 9 shows the evaluation of 6000 mappings of the eScience application on the heterogeneous platform. As for the experiments in the previous section, these mappings are randomly chosen.

Fig. 9: Large scale simulation on heterogeneous platform. Panels, each plotted against the mapping count (0 to 6000): (a) Execution time (seconds), (b) Energy consumption (Joules), (c) Simulation time (seconds).

The shortest execution time is approximately 0.3064 seconds and the longest execution time is 0.6938 seconds. The reason for this difference is that in the first case, 34.4% of the MS2 tasks are mapped on the host HOST_0_2, which is almost 100 times more powerful than the other hosts. In the second case, only 18.8% of the MS2 tasks are mapped on the host HOST_0_2. Thus the execution of the first mapping finishes sooner than that of the second mapping. Table I shows the execution times (E time(s)) in seconds of different mappings. Each line corresponds to one mapping. For each mapping, the percentage of the MS2 tasks that were mapped on the different hosts is specified. The mapping m_good is the best mapping in terms of execution time among the 6000 mappings depicted in Figure 9. The mapping m_bad is the worst mapping among those 6000 mappings. For the mapping m_best, all the tasks of the eScience application were mapped to the most powerful host, HOST_0_2. In the case of m_worst, all the tasks were mapped on the host HOST_0_1. When mapping all the tasks on the most powerful host, the execution time is almost 10 times shorter than if all the tasks are mapped on another host (0.1181 s versus 1.0270 s, a factor of about 8.7). This result was expected. Indeed, among the 5 hosts of our platform, 4 hosts have identical performance and 1 host provides 100 times more flops than any other host.

TABLE I: Execution times and MS2 task mappings for different mappings

Mapping    H_0_0   H_0_1   H_0_2   H_1_0   H_1_1   E time(s)
m_good     25%     12.5%   34.4%   9.4%    18.7%   0.3064
m_bad      53%     6.3%    18.8%   9.4%    12.5%   0.6938
m_worst    0       100%    0       0       0       1.0270
m_best     0       0       100%    0       0       0.1181

VII. CONCLUSION AND PERSPECTIVES

This work proposes an integrated simulation workflow that makes it possible to quickly assess the quality of mapping algorithms targeting cloud infrastructures according to different criteria (i.e., energy efficiency or execution time). This simulation workflow takes as input application specifications expressed in the AMALTHEA model, chosen as the application model to be used in the context of the DreamCloud project. The specification of custom mappings is handled by implementing a clean interface whose main role is to specify where AMALTHEA entities should be located on the targeted cloud architecture.

We plan to extend the mapping design space exploration introduced in Section VI with optimization techniques such as simulated annealing, to allow end users to identify which mapping strategies are best for particular applications, and why. We will also improve the quality of the provided results by taking into account more architectural parameters.

ACKNOWLEDGMENT

The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under the DreamCloud Project: http://www.dreamcloud-project.org, grant agreement no. 611411. The authors would also like to thank the SimGrid team for the provided support.

REFERENCES

[1] "AMALTHEA: Model Based Open Source Development Environment for Automotive Multi-Core Systems," http://www.amalthea-project.org/.
[2] H. Casanova, A. Giersch, A. Legrand, M. Quinson, and F. Suter, "Versatile, scalable, and accurate simulation of distributed applications and platforms," Journal of Parallel and Distributed Computing, vol. 74, no. 10, pp. 2899–2917, Jun. 2014. [Online]. Available: http://hal.inria.fr/hal-01017319
[3] R. Buyya, R. Ranjan, and R. N. Calheiros, "Modeling and simulation of scalable cloud computing environments and the cloudsim toolkit: Challenges and opportunities," CoRR, vol. abs/0907.4878, 2009. [Online]. Available: http://arxiv.org/abs/0907.4878
[4] A. Núñez, J. Vázquez-Poletti, A. Caminero, G. Castañé, J. Carretero, and I. Llorente, "icancloud: A flexible and scalable cloud infrastructure simulator," Journal of Grid Computing, vol. 10, no. 1, pp. 185–209, 2012. [Online]. Available: http://dx.doi.org/10.1007/s10723-012-9208-5
[5] D. Kliazovich, P. Bouvry, Y. Audzevich, and S. Khan, "Greencloud: A packet-level simulator of energy-aware cloud computing data centers," in Global Telecommunications Conference (GLOBECOM 2010), 2010 IEEE, Dec 2010, pp. 1–5.
[6] C. W. Glass, S. Reiser, G. Rutkai, S. Deublein, A. Köster, G. Guevara-Carrion, A. Wafai, M. Horsch, M. Bernreuther, T. Windmann, H. Hasse, and J. Vrabec, "A molecular simulation tool for thermodynamic properties, new version release," Computer Physics Communications, vol. 185, no. 12, pp. 3302–3306, 2014. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0010465514002550
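
Illustrative sketch of the runnable execution semantics (Section IV-B). The four phases each runnable process goes through (wait for activations, grouped reads, one aggregated computation, grouped writes) can be mimicked without SimGrid by a small timing model. The Python sketch below is only an illustration under strong assumptions and is not the authors' C++ engine: the names `Runnable` and `finish_times`, the single shared `bandwidth` value, and the absence of contention between runnables on the same host are all hypothetical simplifications introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Runnable:
    name: str
    deps: list = field(default_factory=list)    # runnables that must finish first
    reads: list = field(default_factory=list)   # labels read before computing
    cycles: float = 0.0                          # sum of all compute instructions
    writes: list = field(default_factory=list)  # labels written after computing


def finish_times(runnables, label_size, host_of, label_host, speed, bandwidth):
    """Return {runnable name: finish time}.

    Assumes `runnables` is listed in dependency order and that hosts never
    become a bottleneck (no contention), which the real simulator does model.
    """
    done = {}
    for r in runnables:
        host = host_of[r.name]
        # Phase 1: wait for the activation message of every predecessor.
        t = max((done[d] for d in r.deps), default=0.0)
        # Phase 2: grouped reads; only labels on another host cost transfer time.
        t += sum(label_size[l] / bandwidth for l in r.reads if label_host[l] != host)
        # Phase 3: a single computational load for the whole runnable.
        t += r.cycles / speed[host]
        # Phase 4: grouped writes, after which successors can be activated.
        t += sum(label_size[l] / bandwidth for l in r.writes if label_host[l] != host)
        done[r.name] = t
    return done


# Hypothetical two-runnable example: r1 writes label L1, r2 reads it remotely.
app = [
    Runnable("r1", cycles=10.0, writes=["L1"]),
    Runnable("r2", deps=["r1"], reads=["L1"], cycles=20.0),
]
times = finish_times(
    app,
    label_size={"L1": 100.0},
    host_of={"r1": "A", "r2": "B"},
    label_host={"L1": "A"},
    speed={"A": 10.0, "B": 20.0},
    bandwidth=100.0,
)
```

In this example the write of L1 is local to host A and therefore free in the model, while the read by r2 on host B is remote and adds one transfer delay, which mirrors the local/remote distinction of Section III.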
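
Illustrative sketch of random mapping exploration (Sections V and VI-C). The experiments draw random static mappings and rank them by execution time. The Python sketch below reproduces only that exploration loop with a toy cost model; it does not call SimGrid and will not reproduce the values of Table I. The relative host speeds are an assumption derived from the remark that one host provides 100 times more flops, and `WORK`, `estimated_time` and `explore` are hypothetical names introduced for illustration.

```python
import random

HOSTS = ["HOST_0_0", "HOST_0_1", "HOST_0_2", "HOST_1_0", "HOST_1_1"]
# Assumed relative speeds: four identical hosts and one host 100 times faster.
SPEED = {h: 1.0 for h in HOSTS}
SPEED["HOST_0_2"] = 100.0
N_TASKS = 32   # number of MS2 tasks considered in the experiments
WORK = 1.0     # hypothetical work per task, in arbitrary units


def random_mapping(rng):
    """Assign every task to a host chosen uniformly at random."""
    return [rng.choice(HOSTS) for _ in range(N_TASKS)]


def estimated_time(mapping):
    """Toy makespan: the host with the largest load/speed ratio finishes last."""
    load = {h: 0 for h in HOSTS}
    for host in mapping:
        load[host] += 1
    return max(load[h] * WORK / SPEED[h] for h in HOSTS)


def explore(n_mappings, seed=0):
    """Score n random mappings and return them sorted from fastest to slowest."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    scored = []
    for _ in range(n_mappings):
        mapping = random_mapping(rng)
        scored.append((estimated_time(mapping), mapping))
    scored.sort(key=lambda pair: pair[0])
    return scored


results = explore(100)
best_time, best_mapping = results[0]
worst_time, worst_mapping = results[-1]
```

Under this toy model, placing all tasks on HOST_0_2 is 100 times faster than placing them on any other host, whereas the simulated ratio reported in Table I is smaller; the difference is a reminder that the sketch ignores the other tasks of the application and all communication.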
