GreenDM: A Versatile Tiering Hybrid Drive for the Trade-Off Evaluation of Performance, Energy, and Endurance

A Dissertation Presented by
Zhichao Li
to
The Graduate School
in Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy in Computer Science
Stony Brook University

Technical Report FSL-14-01
May 2014

© Copyright by Zhichao Li 2014

Stony Brook University
The Graduate School

Zhichao Li

We, the dissertation committee for the above candidate for the Doctor of Philosophy degree, hereby recommend acceptance of this dissertation.

Erez Zadok – Dissertation Advisor
Associate Professor, Computer Science Department

Scott D. Stoller – Chairperson of Defense
Professor, Computer Science Department

Michael Ferdman – Third Inside Member
Assistant Professor, Computer Science Department

Dr. Andrew W. Leung – Outside Member
Software Architect, Computer Science, Formation Data Systems

This dissertation is accepted by the Graduate School

Charles Taber
Interim Dean of the Graduate School

Abstract of the Dissertation

GreenDM: A Versatile Tiering Hybrid Drive for the Trade-Off Evaluation of Performance, Energy, and Endurance
by Zhichao Li
for the Degree of Doctor of Philosophy in Computer Science
Stony Brook University
2014

There are trade-offs among performance, energy, and device endurance for storage systems. These trade-offs become more complex in storage systems comprising different storage technologies. Designs optimized for one dimension or workload often suffer in another. Therefore, it is important to study the trade-offs so as to be able to adapt the system to workloads. As different types of drives have different traits, tiering hybrid drives are studied more closely. However, previous tiering hybrids are often designed for high throughput, efficient energy consumption, or improving endurance, leaving empirical study of the trade-offs unexplored. Past endurance studies also lack a concrete model and metric to help study the trade-offs.
Lastly, previous designs are often based on inflexible policies that cannot adapt easily to changing conditions.

We designed and developed GreenDM, a versatile tiering hybrid drive that combines Flash-based SSDs with traditional HDDs; we present our endurance model to study the aforementioned trade-offs. GreenDM presents a block interface and requires no modifications to existing application software. GreenDM migrates hot data to the faster SSD and cold data to the slower HDD. GreenDM offers tunable parameters useful in adapting the system to many workloads. We have designed, developed, and carefully evaluated GreenDM with a variety of workloads using commodity SSD and HDD drives. We demonstrated the importance of versatility to be able to adapt to various workloads.

Our thesis is that one must study the trade-offs among performance, energy, and endurance, especially in the ever more popular tiered storage systems, to enable adaptation to workloads. Our system is versatile so that it can adapt to different workloads to achieve certain trade-offs by adjusting the important system parameters. We also provide several interesting observations along the cost dimension. We developed a cost model for GreenDM and evaluated it under realistic cost metrics. Future storage system designs have to consider multiple optimization dimensions: performance, energy, endurance, and dollar cost.

We close with several interesting directions for long-term future research. First, it will be interesting to provide automated control knobs for users to trade off performance, energy efficiency, and endurance. Second, one could extend the two-tier system to three tiers and explore more tiering policies. Third, it would be useful to add security as an additional dimension to further explore these trade-offs. Fourth, one could experiment with different storage devices and policies in the future, and help build more efficient storage systems to achieve high performance at minimum cost.
Fifth and last, it would be interesting to provide control support at the CPU level as well to further justify the trade-offs among performance, energy, and endurance.

To my family that supports me from afar.

Contents

List of Figures
List of Tables
Acknowledgments

1 Introduction

2 Background
  2.1 Trade-Offs
  2.2 Tiering vs. Caching Hybrid Justification
  2.3 Endurance Study
    2.3.1 Summary
    2.3.2 Hardware Failure Factors
    2.3.3 Models

3 Lessons Learned
  3.1 Elements of Past Study
    3.1.1 Power: Power Consumption in Enterprise-Scale Backup Storage Systems
    3.1.2 Energy and Performance: On the Energy Consumption and Performance of Systems Software
  3.2 Put Together
    3.2.1 GreenDM: A Versatile Tiering Hybrid Drive for the Trade-Off Evaluation of Performance, Energy, and Endurance
    3.2.2 Cost Evaluation
    3.2.3 Caching Follow-Up
    3.2.4 Capacity Ratio Follow-Up
  3.3 Conclusion

4 Power Consumption in Enterprise-Scale Backup Storage Systems
  4.1 Related Work
  4.2 Methodology
      Experimental setup
      Benchmarks
  4.3 Discussion
    4.3.1 Controller Measurements
      Controller idle power
      Controller under load
      Power-managed controller
    4.3.2 Enclosure Measurements
      Enclosure idle power
      Enclosure under load
      Power managed enclosure
    4.3.3 System-Level Measurements
  4.4 Conclusions

5 On the Energy Consumption and Performance of Systems Software
  5.1 Related Work
    5.1.1 Energy Efficiency
    5.1.2 Energy Consumption of Data Compression
  5.2 Background
    5.2.1 Compression Algorithms
    5.2.2 I/O Schedulers
    5.2.3 Power and Energy Consumption
  5.3 Motivation
    5.3.1 System Identification
    5.3.2 Problems Encountered
  5.4 Methodology
    5.4.1 Experimental Setup
    5.4.2 Benchmarks
  5.5 Evaluation
    5.5.1 Nonlinearity
    5.5.2 Instability
    5.5.3 Multi-Dimensionality
  5.6 Conclusions

6 GreenDM: A Versatile Tiering Hybrid Drive for the Trade-Off Evaluation of Performance, Energy, and Endurance
  6.1 Design and Implementation
    6.1.1 Design Goals
    6.1.2 Architecture
    6.1.3 Data Management
      Mapping table
      Data separation
      Decoupled mapping
      Data promotion
      Data demotion
      Migration throttling
      Serving directly from RAM
      Versatility
    6.1.4 Power Management
    6.1.5 Endurance Model
    6.1.6 Implementation Details
      Concurrency control
      Metadata management
      Statistics export
      Development cost
  6.2 Evaluation
    6.2.1 Experimental Setup
    6.2.2 Benchmarks
    6.2.3 Web-Search Trace Workload
    6.2.4 FIU Online Trace Workload
    6.2.5 File-Server Workload
    6.2.6 Summary
  6.3 Related Work
  6.4 Limitations
  6.5 Conclusion

7 Cost Evaluation
  7.1 Cost Model
  7.2 Working Example
  7.3 Cost Results
    7.3.1 Web-Search Trace Workload
    7.3.2 FIU Online Trace Workload
    7.3.3 File-Server Workload
    7.3.4 Summary
  7.4 Related Work
  7.5 Limitations
  7.6 Conclusion

8 Caching Follow-Up
  8.1 Design and Implementation
      Capacity
      Management Unit
      Data Movement
      Read/Write Policy
  8.2 Evaluation
    8.2.1 Web-Search Trace Workload
    8.2.2 Online Trace Workload
    8.2.3 File-server Workload
    8.2.4 Summary
  8.3 Related Work
  8.4 Limitations
  8.5 Conclusion

9 Capacity Ratio Follow-Up
  9.1 Evaluation
    9.1.1 Web-Search Trace Workload
    9.1.2 Online Trace Workload
    9.1.3 File-server Workload
  9.2 Summary

10 Future
  10.1 Automation of the Control Knobs
  10.2 Three-Tier and N-Tier Storage Systems
  10.3 Security as a Fourth Additional Dimension
  10.4 Support New Storage Devices
  10.5 Provide Control Support at the CPU Level

11 Conclusion

Bibliography