
Multicloud Resource Allocation PDF

163 Pages·2017·2.17 MB·French
by Hao Zhuang

Multicloud Resource Allocation: Cooperation, Optimization and Sharing

Thesis No. 7483 (2017), presented on 10 March 2017 at the School of Computer and Communication Sciences, Distributed Information Systems Laboratory, Doctoral Program in Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne, for the degree of Docteur ès Sciences, by Hao ZHUANG.

Accepted on the proposal of the jury: Prof. B. Faltings, jury president; Prof. K. Aberer, thesis director; Prof. Ph. Cudré-Mauroux, examiner; Prof. H. Pan, examiner; Prof. B. Ford, examiner.

Switzerland, 2017

To my dear parents & family members.

Acknowledgements

At this moment, a myriad of gratitude fills my heart. I would like to take this opportunity to give my sincere thanks to those who offered generous help during my four years of PhD studies.

First and foremost, I would like to thank my supervisor, Prof. Karl Aberer, for giving me the opportunity to do my PhD studies under his guidance. His enlightening guidance and inspiring instructions over these years have helped me greatly. He gave me great freedom to work on the topics that I felt passionate about, and I am very grateful for that.

Secondly, I would like to express my sincere gratitude to Dr. Rameez Rahman for his valuable guidance and mentorship. He not only gave me a lot of help in finding interesting problems and in thesis writing but, more importantly, showed me how to conduct scientific research with critical thinking. Thanks for your encouragement and all the discussions we had. You are my lifetime comrade and I wish you all the best for the future.

Thirdly, I would like to thank Prof. Boi Faltings, Prof. Bryan Ford, Prof. Philippe Cudré-Mauroux and Prof. Pan Hui for agreeing to be my thesis committee members, and for providing valuable comments and suggestions to improve my thesis work.

I would like to thank all my colleagues in LSIR, both the alumni and the current members. My story with LSIR started with Zhixian Yan; thank you for your recommendation, and I wish you all the best for the future.
Also, great thanks to my officemates Jean-Paul and Surender, for sharing your research experience and discussions. Thanks to Xin and Thanasios, for helping me overcome the cold start of my PhD life. My special thanks go to our CloudSpaces team, Hamza and Rameez; thank you for all the technical discussions, debates and your great efforts on our project. My gratitude is extended to other members in LSIR, Tian, Michele, Julia, Jean-Eudes, Julien, Thanh, Amit, Alex, Alevtina, Martin, Remi, Panayiotis, Berker, Hung, Tri, Alexandra, Matteo, Mehdi and all members in our lab. Last but not least, many thanks to Chantal for all the help and support throughout my stay here.

Also, I would like to thank our small Chinese community BC lunch group members, Xifan Tang, Xiaolu Sun, Tian Guo, Jian Zhang, Zhou Xue, Bin Fan, Hanjie Pan, Cheng Wang, Bin Jin, Jingjing Wang, Runwei Zhang, Xinchao Wang, Bin Ding, Wei Zhuo, Shenqi Xie, Ye Pu, Min Ye, Monkey King and Zhufei Chu. We always shared ideas as well as stories during lunchtime, which kept me highly motivated and enthusiastic.

A special gratitude and love goes to my big family. I thank my parents for their deep abiding love. I thank my two elder sisters, Jing Zhuang and Yan Zhuang, for their care, and also my lovely niece and nephew, Shiqi and Shengbo. I thank my two brothers-in-law, Feng Mao and Yang Yu, for their love for our family. This thesis would have been impossible without all your support.

Last but not least, I dedicate this thesis to my wife, Huizhen LI, and my soon-to-be-born baby girl Xixi. I am indebted to my wife for her constant love, encouragement, understanding and support during the past 11 years. My dearest sweety, we have journeyed far on life's path together. I cannot wait to see what wonderful and colorful trails await us in our future. Also, our incoming baby girl Xixi, as your name suggests, we hope you can live a happy and healthy life. You will always be the source of our happiness and we love you forever and ever!

Hao Zhuang
Lausanne, 8th January, 2017

Abstract

Nowadays our daily life is powered not only by water, electricity, gas and telephony but by the "cloud" as well.
Due to the high penetration of cloud-based services and applications into every aspect of our life and the unprecedented increase of digital data, cloud computing has become the 5th utility that processes and stores these applications and data. Big cloud vendors such as Amazon, Microsoft and Google have built large-scale centralized data centers to achieve economies of scale, on-demand resource provisioning, high resource availability and elasticity. However, those massive data centers also bring about many other problems, e.g., bandwidth bottlenecks, privacy, security, huge energy consumption, and legal and physical vulnerabilities. One of the possible solutions for those problems is to employ multicloud architectures. In this thesis, our work provides research contributions to multicloud resource allocation from three perspectives: cooperation, optimization and data sharing. We address the following problems in the multicloud: how resource providers cooperate in a multicloud, how to reduce information leakage in a multicloud storage system and how to share big data in a cost-effective way. More specifically, we make the following contributions:

Cooperation in the decentralized cloud. Recently, due to increasing concerns about privacy and data control, many small data centers (SDCs) established by different providers are emerging in an attempt to meet demand locally. However, SDCs can suffer from resource inelasticity due to their relatively scarce resources, resulting in a loss of performance and revenue. In this work, we propose a decentralized cloud model in which a group of SDCs can cooperate with each other to improve performance. Moreover, we design a general strategy function for SDCs to evaluate the performance of cooperation based on different dimensions of resource sharing. Through extensive simulations using a realistic data center model, we show that the strategies based on reciprocity are more effective than other strategies, e.g., those using prediction based on historical data. Our results show that the reciprocity-based strategy can thrive in a heterogeneous environment with competing strategies.
Multicloud optimization on information leakage. Many schemes have recently been advanced for storing data on multiple clouds. Distributing data over different cloud storage providers (CSPs) automatically provides users with a certain degree of information leakage control, for no single point of attack can leak all the information. However, unplanned distribution of data chunks can lead to high information disclosure even while using multiple clouds. In this work, we first study an important information leakage problem caused by unplanned data distribution in multicloud storage services. Then, we present StoreSim, an information-leakage-aware storage system in the multicloud. StoreSim aims to store syntactically similar data on the same cloud, thereby minimizing the user's information leakage across multiple clouds. We design an approximate algorithm to efficiently generate similarity-preserving signatures for data chunks based on MinHash and Bloom filter, and also design a function to compute the information leakage based on these signatures. Next, we present an effective storage plan generation algorithm based on clustering for distributing data chunks with minimal information leakage across multiple clouds. Finally, we evaluate our scheme using two real datasets from Wikipedia and GitHub. We show that our scheme can reduce the information leakage by up to 60% compared to unplanned placement. Furthermore, our analysis in terms of system attackability demonstrates that our scheme makes attacks on information much more complex.

Smart data sharing. Moving large amounts of distributed data into the cloud or from one cloud to another can incur high costs in both time and bandwidth. The optimization on data sharing in the multicloud can be conducted from two different angles:

• Inter-cloud scheduling. Existing centralized solutions for data sharing, such as Dropbox, replicate data to all interested parties, which is prohibitively costly given the large size of datasets. A more practical solution is to use a Peer-to-Peer (P2P) approach to replicate data in a self-organized manner. However, existing P2P approaches focus on minimizing downloading time without taking into account the bandwidth cost.
In this work, we present CoShare, a P2P-inspired decentralized cost-effective sharing system for data replication. CoShare allows users to specify their requirements on data sharing tasks and maps these requirements into resource requirements for data transfer. Through extensive simulations, we demonstrate that CoShare finds the desirable tradeoffs for a given cost and performance while varying user requirements and request arrival rates.

• Making big data smaller. The sheer size of big data imposes great challenges on storing, sharing and processing such data in the multicloud. These challenges can be addressed by data summarization, which transforms the original dataset into a smaller, yet still useful subset. In this work, we take Twitter data and its applications based on topic models as a case study. We aim to reduce the size of the Twitter dataset while preserving topics in the original big dataset. Existing work finds such small subsets with objective functions based on data properties such as representativeness or informativeness but does not exploit social contexts, which are distinct characteristics of social data. Through analyzing Twitter data, we discover two social contexts which are important for topic generation and dissemination, namely (i) CrowdExp topic score, which captures the influence of both the crowd and the expert users in Twitter, and (ii) Retweet topic score, which captures the influence of Twitter users' actions. We conduct extensive experiments on two real-world Twitter datasets using two applications. The experimental results show that, by leveraging social contexts, our proposed solution can reduce the total size of data without performance degradation on topic-related applications.
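The similarity-preserving signatures mentioned for StoreSim can be illustrated with plain MinHash. The following is only a minimal sketch under stated assumptions, not the thesis's algorithm: the function names (`shingles`, `minhash_signature`, `estimated_similarity`), the 4-byte shingle size and the 64 hash functions are all hypothetical choices made for illustration, and the Bloom filter step and the leakage function described in the abstract are left out.

```python
import hashlib


def shingles(data: bytes, k: int = 4) -> set[bytes]:
    """Return the set of overlapping k-byte substrings of a data chunk."""
    if len(data) < k:
        return {data}
    return {data[i:i + k] for i in range(len(data) - k + 1)}


def minhash_signature(data: bytes, num_hashes: int = 64, k: int = 4) -> list[int]:
    """Build a MinHash signature: one minimum hash value per salted hash function.

    Assumption: each "hash function" is BLAKE2b with a different salt.
    """
    chunk_shingles = shingles(data, k)
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(4, "big")
        signature.append(
            min(
                int.from_bytes(
                    hashlib.blake2b(s, digest_size=8, salt=salt).digest(), "big"
                )
                for s in chunk_shingles
            )
        )
    return signature


def estimated_similarity(sig_a: list[int], sig_b: list[int]) -> float:
    """Estimate Jaccard similarity as the fraction of matching signature slots."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


chunk_a = b"multicloud storage systems distribute data chunks across providers"
chunk_b = b"multicloud storage systems distribute data blocks across providers"
chunk_c = b"0123456789" * 5

sig_a = minhash_signature(chunk_a)
sig_b = minhash_signature(chunk_b)
sig_c = minhash_signature(chunk_c)

print("a vs b:", estimated_similarity(sig_a, sig_b))
print("a vs c:", estimated_similarity(sig_a, sig_c))
```

Chunks whose signatures agree in many slots are syntactically similar, so a placement scheme in the spirit of the abstract would try to keep them on the same cloud rather than spreading them across providers.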
Keywords: Multicloud resource allocation, decentralized cloud, multicloud storage system, multicloud optimization, data sharing, small data center, data transfer, information leakage, system attackability, data summarization, social context

Résumé

Nowadays, our daily life is powered not only by water, electricity, gas and telephony but also by "cloud computing". Owing to the strong penetration of cloud-based services and applications into every aspect of our life, and to the unprecedented growth of digital data, cloud computing is becoming the fifth utility that processes and stores these applications and data. Large cloud providers such as Amazon, Microsoft and Google build centralized data centers to achieve economies of scale and to supply on-demand resources in abundance and with high elasticity. However, these massive data centers also give rise to many problems: bandwidth congestion, privacy protection, data security, enormous energy consumption, and legal and physical vulnerability. One possible solution to these problems is to use multicloud architectures. In this thesis, we address multicloud resource allocation from three perspectives: cooperation, optimization and data sharing. We thus tackle the following problems: (1) how resource providers collaborate in a multicloud; (2) how to reduce information leakage in a multicloud storage system; (3) how to share a large volume of data cost-effectively. More precisely, we make the following contributions:

Cooperation in the decentralized cloud. Recently, owing to growing concerns about privacy and data control, many small data centers (SDCs) established by different providers are emerging in an attempt to meet local demand. However, SDCs can suffer from limited resource elasticity because of their relatively scarce resources, which leads to a loss of performance and revenue. In this work, we propose a decentralized cloud model in which a group of SDCs can cooperate to improve performance. In addition, we introduce a general strategy function for SDCs to evaluate the performance of cooperation based on the different dimensions of resource sharing. Through extensive simulations using a realistic data center model, we show that strategies based on reciprocity are more effective than other strategies, such as those using prediction based on historical data. Our results show that the reciprocity-based strategy can thrive in a heterogeneous environment with competing strategies.

Multicloud optimization on information leakage. Many systems have recently been developed for storing data on multiple clouds. Distributing data over different cloud storage providers automatically gives users a certain degree of control over information leakage, since no single point of attack can disclose all the information. However, unplanned distribution of data blocks can cause high information disclosure even when multiple clouds are used. In this work, we first study an important information leakage problem caused by unplanned data distribution in multicloud storage services. Next, we present StoreSim, an information-leakage-aware multicloud storage system. StoreSim aims to store syntactically similar data on the same cloud, which minimizes the user's information leakage across multiple clouds. We introduce an approximation algorithm to efficiently generate similarity-preserving signatures for data blocks based on MinHash and Bloom filters, together with a function to compute the information leakage based on these signatures. Next, we present an efficient storage plan generation algorithm, based on clustering, for distributing data blocks while minimizing information leakage across multiple clouds. Finally, we evaluate our system using two real datasets from Wikipedia and GitHub. We show that it can reduce information leakage by up to 60% compared to unplanned placement. Moreover, our analysis in terms of attacks on the system demonstrates that our system makes attacks on information much more complex.

Smart data sharing. Moving large amounts of distributed data to the cloud or from one cloud to another can incur high costs in terms of time and bandwidth. The optimization of data sharing in the multicloud can be carried out from two different angles:

• Inter-cloud organization. Existing centralized solutions for data sharing, such as Dropbox, which replicate the data to all interested parties, are prohibitively costly given the large size of datasets. A more practical solution is to use a peer-to-peer (P2P) approach to replicate the data in a self-organized manner. However, existing P2P approaches focus on minimizing download time without taking the bandwidth cost into account. In this work, we present CoShare, a P2P-inspired decentralized sharing system for data replication. CoShare allows users to specify their requirements on data sharing tasks and maps these requirements into the resources required for data transfer. Through numerous simulations, we show that CoShare can find the desirable tradeoffs for a given cost and performance, while varying user requirements and request arrival rates.

• Making big data smaller. The large volume of data imposes considerable challenges for storing, sharing and processing such data in the multicloud. These challenges can be addressed by summarizing the data, thereby transforming the original dataset into a smaller, yet still useful subset. In this work, we consider Twitter data and its applications based on topic models as a case study. The aim is to reduce the size of the Twitter dataset while preserving
