Researching system administration

by

Eric Arnold Anderson

B.S. M.S. Carnegie Mellon University 1994

A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Science in the GRADUATE DIVISION of the UNIVERSITY of CALIFORNIA at BERKELEY

Committee in charge:
Professor Dave Patterson, Chair
Professor Doug Tygar
Professor Peter Menell

Spring 2002

The dissertation of Eric Arnold Anderson is approved.
University of California at Berkeley, Spring 2002

Copyright 2002 by Eric Arnold Anderson

Abstract

System administration is a phenomenally important, yet surprisingly ignored sub-field of Computer Science. We hypothesize that this avoidance is because approaches for performing academic research on system administration problems are not well known. To reduce the difficulty of performing research, we present a small set of principles that can be used to evaluate solutions, a classification of existing research on system administration, and three approaches to research on system administration that we illustrate with the research that we have done.

First, we demonstrate the approach of “Let the human handle it” with the CARD cluster monitoring system. We show that CARD is more flexible and scalable than earlier approaches. We also show that monitoring is necessary for system administration, but that this research approach is not a complete solution to system administration problems.

Second, we demonstrate the approach of “Rewrite everything” with the River I/O programming infrastructure. We show that River adapts around performance anomalies, improving the performance consistency of I/O kernels. By rewriting the entire application, we could explore a substantially different approach to program structuring, but this research approach limits the completeness of the resulting system.

Third, we demonstrate the approach of “Sneak in-between” with the Hippodrome iterative storage system designer. We show that Hippodrome can find an appropriate storage system to support an I/O workload without requiring human intervention. We show that by using hooks in existing operating systems we can quickly get to a more complete system, but that this research approach can be restricted by the existing interfaces.

Finally, we describe a substantial number of open research directions based both on the classification that we developed of existing research, and on the systems that we built. We conclude that the field of system administration is ripe for exploration, and that we have helped provide a foundation for that exploration.

Professor Dave Patterson
Dissertation Committee Chair

Contents

List of Figures

1 Introduction
  1.1 Overview of system administration
  1.2 Principles of system administration

2 The field of system administration
  2.1 A model of tasks
  2.2 A model of problem sources
    2.2.1 Examination of the different categories
  2.3 Historical trends of the LISA conference
    2.3.1 Task model trends
    2.3.2 Source model trends
  2.4 Examination of important tasks
    2.4.1 SW installation: OS, application, packaging and customization
    2.4.2 Backup
    2.4.3 Configuration: site, host, network, site move
    2.4.4 Accounts
    2.4.5 Mail
    2.4.6 Monitoring: system, network, host, data display
    2.4.7 Printing
    2.4.8 Trouble tickets
    2.4.9 Secure root access
  2.5 Conclusions and analysis

3 CARD: extensible, scalable monitoring for clusters of computers
  3.1 Four problems and our solutions
    3.1.1 Overview
    3.1.2 Handling rapid evolution using relational tables
    3.1.3 Recovering from failures using timestamps
    3.1.4 Data scalability using hierarchy
    3.1.5 Data transfer efficiency using a hybrid push/pull protocol
    3.1.6 Visualization scalability using aggregation
  3.2 Implementation
    3.2.1 Storing relational tables
    3.2.2 Building the hierarchy with the hybrid push/pull protocol
    3.2.3 Visualization applet
    3.2.4 Gathering data for the leaf databases
  3.3 Experience
  3.4 Re-implementing CARD
  3.5 Related work
  3.6 Conclusion

4 River: infrastructure for adaptable computation
  4.1 Introduction
  4.2 The River system
    4.2.1 The data model
    4.2.2 The programming model
  4.3 Experimental validation
    4.3.1 Hardware and software environment
    4.3.2 Distributed queue performance
    4.3.3 Graduated declustering
    4.3.4 Supporting a trace-driven simulator
    4.3.5 One-pass hash join
    4.3.6 One-pass external sort
  4.4 Related work
    4.4.1 Parallel file systems
    4.4.2 Programming environments
    4.4.3 Databases
  4.5 Applying river to system administration problems
  4.6 System administration problems in Euphrates
  4.7 Conclusions

5 Hippodrome: running circles around storage administration
  5.1 Introduction
  5.2 System overview
    5.2.1 Today’s manual loop
    5.2.2 The iterative loop
    5.2.3 Automating the loop
    5.2.4 Balancing system load
    5.2.5 Hippodrome
    5.2.6 Hippodrome vs. control loops
    5.2.7 Breaking the loop
  5.3 Experimental overview
    5.3.1 Workloads
    5.3.2 Experimental infrastructure
  5.4 Experimental results
    5.4.1 Synthetic workloads
    5.4.2 PostMark
    5.4.3 Summary
  5.5 Related work
  5.6 Conclusions

6 Future directions
  6.1 Software installation: OS, application, packaging and customization
    6.1.1 Packaging
    6.1.2 Selection
    6.1.3 Merging
    6.1.4 Caching
    6.1.5 End-user customization
  6.2 Backup
  6.3 Configuration: site, host, network, site move
  6.4 Accounts
  6.5 Mail
  6.6 Monitoring: system, network, host, data display
  6.7 Printing
  6.8 Trouble tickets
  6.9 Secure root access
  6.10 Future work on CARD
  6.11 Future work on River
  6.12 Future work on Hippodrome

7 Conclusions
  7.1 Summary
  7.2 Research approaches
  7.3 Themes

Bibliography

List of Figures

1.1 Estimated principle importance
2.1 Task categories
2.2 Problem source state transitions
2.3 Time/category breakdown for papers, pt. 1
2.4 Time/category breakdown for papers, pt. 2
2.5 Time/problem source breakdown for papers
2.6 Category vs. principle
3.1 A hierarchy of databases
3.2 Snapshot of the interface
3.3 Architecture of our system
3.4 Implementation properties
3.5 Architecture of the forwarder
3.6 Architecture of joinpush
4.1 Graduated declustering
4.2 Distributed queue scaling
4.3 DQ read performance under perturbation
4.4 DQ write performance under perturbation
4.5 Graduated declustering scaling
4.6 GD performance under read perturbation
4.7 Parallel external sort scaling
4.8 Perturbing the sort partitioner
5.1 Three stages of the loop
5.2 Loop advancement
5.3 Problems with the simple loop
5.4 Workload characteristics generated by Hippodrome’s analysis stage
5.5 Common parameters for synthetic workloads
5.6 Experimental infrastructure
5.7 Behavior with maximum synthetic workload
5.8 Behavior with half-max synthetic workload
5.9 Behavior under phased workload
5.10 Relative imbalance with phased workload
5.11 Behavior under PostMark workload
5.12 Plateaus reached with PostMark workload
7.1 Principles vs. system

Chapter 1

Introduction

System administration has great economic importance. Studies indicate the cost per year of administering systems as one to ten times the cost of the actual hardware [Gro97, And95, Coub]. Moreover, system administrators are in remarkable demand, with average salaries growing by over 10% per year [SAN].
As a consequence, many companies have made reducing total cost of ownership one of their primary goals [Mica, Micb, Pacb].

Despite this substantial commercial interest, there is little academic work on system administration. Only a few schools have classes on system administration [Nem, Coua, Ext], and only a small number of research projects have specifically targeted system administration [BR98, Asa00].

We choose to focus on system administration of large sites because we believe that the problems faced by large sites are more complex than those faced by end users, and because we believe that if we can make the large sites manageable, they will be able to support the end users. Indeed, some researchers and business people have proposed having only a web browser on the machines used by end users, and hosting all of the applications at centralized, large sites. This centralization reduces the administration problem for the end users, but at best leaves the problems the same for the new centralized sites.

This dissertation serves three related purposes. First, it identifies approaches for academic research on system administration. Second, it demonstrates the approaches by examining three systems, each built using a different research approach. Third, it enables future research both by identifying principles for evaluating system administration research, and by identifying directions of future research.

We start by describing principles for evaluating system administration research in section 1.2. We identify and explain the principles to help researchers avoid some work of deploying their systems. The principles help identify areas where a particular solution to a system administration problem both assists and complicates the job of system administrators. We use these principles to evaluate the systems we built, but as we did not identify the principles until after we had developed all of the systems, they did not influence