Lustre™ 2.0 Operations Manual

Part No. 821-2076-10
July 2010

Copyright © 2010, Oracle and/or its affiliates. All rights reserved.

This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited.

The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.

If this is software or related software documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, the following notice is applicable:

U.S. GOVERNMENT RIGHTS Programs, software, databases, and related documentation and technical data delivered to U.S. Government customers are "commercial computer software" or "commercial technical data" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, the use, duplication, disclosure, modification, and adaptation shall be subject to the restrictions and license terms set forth in the applicable Government contract, and, to the extent applicable by the terms of the Government contract, the additional rights set forth in FAR 52.227-19, Commercial Computer Software License (December 2007). Oracle USA, Inc., 500 Oracle Parkway, Redwood City, CA 94065.

This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous applications, including applications which may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all appropriate fail-safe, backup, redundancy, and other measures to ensure the safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this software or hardware in dangerous applications.

Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. UNIX is a registered trademark licensed through X/Open Company, Ltd.

This software or hardware and documentation may provide access to or information on content, products, and services from third parties. Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content, products, or services.

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 United States License. To view a copy of this license and obtain more information about Creative Commons licensing, visit Creative Commons Attribution-ShareAlike 3.0 United States or send a letter to Creative Commons, 171 2nd Street, Suite 300, San Francisco, California 94105, USA.

Contents

Preface

Part I Lustre Architecture

1. Introduction to Lustre
  1.1 Introducing the Lustre File System
    1.1.1 Lustre Key Features
  1.2 Lustre Components
    1.2.1 Lustre Networking (LNET)
    1.2.2 Management Server (MGS)
  1.3 Lustre Systems
  1.4 Files in the Lustre File System
    1.4.1 Lustre File System and Striping
    1.4.2 Lustre Storage
      1.4.2.1 OSS Storage
      1.4.2.2 MDS Storage
    1.4.3 Lustre System Capacity
  1.5 Lustre Configurations
  1.6 Lustre Networking
  1.7 Lustre Failover

2. Understanding Lustre Networking
  2.1 Introduction to LNET
  2.2 Supported Network Types
  2.3 Designing Your Lustre Network
    2.3.1 Identify All Lustre Networks
    2.3.2 Identify Nodes to Route Between Networks
    2.3.3 Identify Network Interfaces to Include/Exclude from LNET
    2.3.4 Determine Cluster-wide Module Configuration
    2.3.5 Determine Appropriate Mount Parameters for Clients
  2.4 Configuring LNET
    2.4.1 Module Parameters
      2.4.1.1 Using Usocklnd
      2.4.1.2 OFED InfiniBand Options
    2.4.2 Module Parameters - Routing
      2.4.2.1 LNET Routers
    2.4.3 Downed Routers
  2.5 Starting and Stopping LNET
    2.5.1 Starting LNET
      2.5.1.1 Starting Clients
    2.5.2 Stopping LNET

Part II Lustre Administration

3. Installing Lustre
  3.1 Preparing to Install Lustre
    3.1.1 Supported Linux Distribution, Architecture and Interconnect
    3.1.2 Required Lustre Software
    3.1.3 Required Tools and Utilities
    3.1.4 (Optional) High-Availability Software
    3.1.5 Debugging Tools
    3.1.6 Environmental Requirements
    3.1.7 Memory Requirements
      3.1.7.1 Client Memory Requirements
      3.1.7.2 MDS Memory Requirements
      3.1.7.3 OSS Memory Requirements
  3.2 Installing Lustre from RPMs
  3.3 Installing Lustre from Source Code
    3.3.1 Patching the Kernel
      3.3.1.1 Introducing the Quilt Utility
      3.3.1.2 Get the Lustre Source and Unpatched Kernel
      3.3.1.3 Patch the Kernel
    3.3.2 Create and Install the Lustre Packages
    3.3.3 Installing Lustre with a Third-Party Network Stack

4. Configuring Lustre
  4.1 Configuring the Lustre File System
      4.1.0.1 Simple Lustre Configuration Example
      4.1.0.2 Module Setup
    4.1.1 Scaling the Lustre File System
  4.2 Additional Lustre Configuration
  4.3 Basic Lustre Administration
    4.3.1 Specifying the File System Name
    4.3.2 Starting Lustre
    4.3.3 Mounting a Server
    4.3.4 Unmounting a Server
    4.3.5 Working with Inactive OSTs
    4.3.6 Finding Nodes in the Lustre File System
    4.3.7 Mounting a Server Without Lustre Service
    4.3.8 Specifying Failout/Failover Mode for OSTs
    4.3.9 Running Multiple Lustre File Systems
    4.3.10 Setting and Retrieving Lustre Parameters
      4.3.10.1 Setting Parameters with mkfs.lustre
      4.3.10.2 Setting Parameters with tunefs.lustre
      4.3.10.3 Setting Parameters with lctl
      4.3.10.4 Reporting Current Parameter Values
    4.3.11 Regenerating Lustre Configuration Logs
    4.3.12 Changing a Server NID
    4.3.13 Removing and Restoring OSTs
      4.3.13.1 Removing an OST from the File System
      4.3.13.2 Restoring an OST in the File System
    4.3.14 Aborting Recovery
    4.3.15 Determining Which Machine is Serving an OST
  4.4 More Complex Configurations
    4.4.1 Failover
  4.5 Operational Scenarios
    4.5.1 Changing the Address of a Failover Node

5. Service Tags
  5.1 Introduction to Service Tags
  5.2 Using Service Tags
    5.2.1 Installing Service Tags
    5.2.2 Discovering and Registering Lustre Components
    5.2.3 Service Tag Registration Information

6. Configuring Lustre - Examples
  6.1 Simple TCP Network
    6.1.1 Lustre with Combined MGS/MDT
      6.1.1.1 Installation Summary
      6.1.1.2 Configuration Generation and Application
    6.1.2 Lustre with Separate MGS and MDT
      6.1.2.1 Installation Summary
      6.1.2.2 Configuration Generation and Application

7. More Complicated Configurations
  7.1 Multihomed Servers
    7.1.1 Modprobe.conf
    7.1.2 Start Servers
    7.1.3 Start Clients
  7.2 Elan to TCP Routing
    7.2.1 Modprobe.conf
    7.2.2 Start servers
    7.2.3 Start clients
  7.3 Load Balancing with InfiniBand
    7.3.1 Setting Up modprobe.conf for Load Balancing
  7.4 Multi-Rail Configurations with LNET

8. Failover
  8.1 What is Failover?
    8.1.1 Failover Capabilities
    8.1.2 Types of Failover Configurations
  8.2 Failover Functionality in Lustre
    8.2.1 MDT Failover Configuration (Active/Passive)
    8.2.2 OST Failover Configuration (Active/Active)
    8.2.3 Lustre Failover and MMP
      8.2.3.1 Working with MMP
Description: