
Architecting Efficient Data Centers PDF

166 Pages·2012·2.33 MB·English
by David Max Meisner

Preview Architecting Efficient Data Centers

Architecting Efficient Data Centers
by
David Max Meisner

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Computer Science and Engineering) in The University of Michigan 2012

Doctoral Committee:
Assistant Professor Thomas F. Wenisch, Chair
Professor Trevor N. Mudge
Associate Professor Kevin Pipe
Assistant Professor Prabal Dutta
Associate Professor Christoforos Kozyrakis, Stanford University

© David Max Meisner 2012
All Rights Reserved

ACKNOWLEDGEMENTS

There are many people I would like to thank who have supported me and made this work possible. First, a great thanks to my advisor, Tom, for his guidance. Clearly, this work would not be possible without his support. I also owe a great deal of gratitude to my coauthors: Luiz Barroso, Ricardo Bianchini, Qingyuan Deng, Brian Gold, Steven Pelley, Luiz Ramos, Chris Sadler, Jack Underwood, James VanGilder, Wolf Weber, Junjie Wu, and Pooya Zandevakili. Also, I wish to acknowledge Laura Faulk and DCO for helping obtain the PowerNap traces. Thanks so much to my family – Dad, Mom, Mandy, Peter and Rachel – for their support and patience throughout graduate school. And a great thank you to all my friends for making graduate school tolerable. Finally, I would like to thank my thesis committee for their involvement and feedback.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
LIST OF FIGURES
LIST OF TABLES
ABSTRACT

CHAPTER

1. Introduction
    1.1 Server underutilization and inefficiency
    1.2 PowerNap and DreamWeaver
    1.3 Online Data-Intensive Services
    1.4 Evaluation Methodology
    1.5 Efficient Cluster Provisioning
    1.6 Thesis Organization

2. Background
    2.1 Data Center Efficiency Design Goals and Metrics
        2.1.1 Computational Efficiency
        2.1.2 Energy-Proportionality
        2.1.3 Infrastructure Efficiency
    2.2 Taxonomy of Low-Power Modes
        2.2.1 Idle Low-Power Modes
        2.2.2 Active Low-Power Modes
    2.3 Power Management Challenges

3. Related Work
    3.1 Power Management
        3.1.1 Idle Low-Power Modes
        3.1.2 Active Low-Power Modes
        3.1.3 Cluster-grain techniques
        3.1.4 Scheduling for energy efficiency
        3.1.5 Low-power Processors
    3.2 Modeling and Simulation
        3.2.1 Power Modeling
        3.2.2 Data Center Simulation
    3.3 Memory-resident key-value stores

4. Understanding Server Utilization and Idleness
    4.1 Understanding Server Utilization
    4.2 Quantifying and Analyzing Idleness

5. Evaluation Methodology
    5.1 Requirements of Data Center-level Evaluation
    5.2 Shortcomings of Existing Methodologies
    5.3 Stochastic Queuing Simulation
        5.3.1 Overview
        5.3.2 SQS Methodology
        5.3.3 Workload Models
        5.3.4 Output Variables
        5.3.5 Simulation Sequence
    5.4 Parallelization
        5.4.1 Distributed Simulation
    5.5 Evaluation
        5.5.1 Case Study - Power Capping
        5.5.2 Performance
        5.5.3 Parallel Simulation
    5.6 Peak-Power Modeling
        5.6.1 Data Center Power Provisioning
        5.6.2 Understanding SMPSU Behavior
        5.6.3 Server Peak Power Accounting

6. The PowerNap Server Architecture - A Coordinated Idle Low-Power Mode
    6.1 PowerNap
        6.1.1 PowerNap Performance and Power Model
        6.1.2 Analysis
        6.1.3 Implementation Requirements
    6.2 Emulating PowerNap Transition Performance Impact
        6.2.1 Cache Effects
        6.2.2 Transition Latency
    6.3 PowerNap Mechanisms
        6.3.1 Hardware Mechanisms
        6.3.2 Software Mechanisms
        6.3.3 Evidence of PowerNap Capabilities in the Wild
    6.4 Shortcomings
        6.4.1 Multicore Servers
        6.4.2 Highly Utilized Services
        6.4.3 Reliability

7. DreamWeaver: Architectural Support for Deep Sleep
    7.1 DreamWeaver
        7.1.1 Hardware mechanisms: the Dream Processor
        7.1.2 Weave Scheduling
    7.2 Prototype Evaluation
        7.2.1 Methodology
        7.2.2 Results
    7.3 Power Savings Evaluation
        7.3.1 Methodology
        7.3.2 Results
        7.3.3 Discussion

8. Redundant Array for Inexpensive Load Sharing (RAILS)
    8.1 Power Supply Unit background
    8.2 RAILS Design
    8.3 Evaluation
    8.4 Cost Analysis

9. Online Data-Intensive Services - The Need for Coordinated Active Low-Power Modes
    9.1 Online Data-Intensive Services
        9.1.1 Understanding Online Data-Intensive Services
    9.2 Cluster-scale Characterization
        9.2.1 Experimental Methodology
        9.2.2 Activity Graphs: Compactly representing component-level utilization
        9.2.3 Characterization Results
        9.2.4 Analysis: Evaluating Available/Potential Power Modes
    9.3 Leaf Node Performance Model
        9.3.1 Modeling L_Service
        9.3.2 Modeling L_Wait
        9.3.3 Validation
    9.4 Evaluating Latency-Power Tradeoffs
        9.4.1 Active system power modes: Voltage and frequency scaling
        9.4.2 Processor idle low-power modes
        9.4.3 Full-system Power Modes: Query batching with PowerNap
        9.4.4 Comparing Power Modes

10. Architecting Cost-Effective Data Center Memcached Systems
    10.1 Background
        10.1.1 Memcached Clusters
    10.2 Methodology
        10.2.1 Understanding memcached as a workload
        10.2.2 Load Testing Framework
        10.2.3 Systems under tests
    10.3 Single-server Characterization
        10.3.1 Microarchitectural Inefficiency
        10.3.2 Multicore non-scalability
        10.3.3 I/O subsystem: Quality vs. quantity
    10.4 Designing cost-optimal memcached clusters
        10.4.1 Design Space Exploration
        10.4.2 Understand memcached cluster economics
        10.4.3 Implications of the optimal design space
        10.4.4 The challenge of scaling a cluster
    10.5 Summary

11. Conclusion

Bibliography

LIST OF FIGURES

Figure

2.1 Low-power mode taxonomy.
4.1 Server utilization histogram.
4.2 Enterprise data center utilization traces.
4.3 Server power breakdown.
4.4 Busy and idle period cumulative distributions.
4.5 Busy and idle period weighted cumulative distributions.
4.6 Fine-grain utilization traces.
4.7 Full-system idleness varies widely as a function of arrival and request size patterns.
4.8 Gamma distribution.
4.9 Effect of C_v on system idleness.
4.10 Effect of C_v on median idle period length.
5.1 Overview of the SQS methodology.
5.2 SQS workload model.
5.3 The sequence of phases in a SQS simulation.
5.4 Parallel execution on a cluster.
5.5 Simulation time scaling.
5.6 Sensitivity to workload distribution variance.
5.7 Sensitivity to accuracy and target variables.
5.8 Parallel simulation.
5.9 Example data center trace: Average utilization masks important spikes.
5.10 Example PDU circuit breaker curve.
5.11 Simplified switched-mode power supply design and instrumentation to measure power.
5.12 Systems at idle.
5.13 Effect of modulation frequency.
5.14 Delay of a step function in utilization.
5.15 Frequency response.
5.16 Simplified server SMPSU model.
5.17 Predicted peak power closely follows measured value.
6.1 PowerNap.
6.2 PowerNap and DVFS analytic models.
6.3 PowerNap and DVFS power and response time scaling.
6.4 Cache effect.
6.5 Emulating PowerNap.
6.6 M/M/k analysis of full-system idleness under weak scaling.
7.1 Voltage and frequency scaling.
7.2 Full-system idle low-power mode.
7.3 DreamWeaver.
7.4 Weave Scheduling example.
7.5 DreamWeaver prototype vs. simulation validation.
7.6 Comparison of power savings for 4-core system.
7.7 Comparison of power savings for 32-core system.
7.8 Sensitivity to transition time.
7.9 Sensitivity to number of cores.
7.10 Sensitivity to utilization.
8.1 Power supply efficiency.
8.2 RAILS PSU design.
8.3 Power supply pricing.
8.4 Power Delivery Solution Comparison.
9.1 Example diurnal pattern in queries per second (QPS) for a Web search cluster.
9.2 Example leaf node query latency distribution at 65% of peak QPS.
9.3 Web search cluster topology.
9.4 Idealized low-power mode.
9.5 CPU activity graphs.
9.6 Memory activity graphs.
9.7 Disk activity graphs.
9.8 Power savings potential for available low-power modes.
9.9 Per-query slowdown regression.
9.10 Arrival process greatly influences quantile predictions.
9.11 Distribution of time between query arrivals.
9.12 Distribution of query service times.
9.13 Performance model validation.
9.14 System power vs. latency trade-off for processor and memory scaling.
9.15 System power savings for CPU idle low-power modes.
9.16 System power vs. latency trade-off for query batching with PowerNap.
9.17 Power consumption at each qps level for a fixed 95th-percentile increase.
10.1 Memcached performance decomposition.
10.2 Example cluster operations.
10.3 Workload characteristics.
10.4 Microarchitectural inefficiency with Memcached.
10.5 Caching behavior.
10.6 Virtual memory behavior.
10.7 Branch prediction.
10.8 Memcached’s lack of scalability.
10.9 NIC features.
10.10 Scaling a memcached cluster.
10.11 Understanding design space optimization decisions.
10.12 Memcached cluster design space exploration.

Description:
Doctor of Philosophy (Computer Science and Engineering). Prior work has proposed using non-commodity networking hardware (e.g., InfiniBand) [101] to improve.
