Apache HBase Guide

Important Notice

© 2010-2021 Cloudera, Inc. All rights reserved.

Cloudera, the Cloudera logo, and any other product or service names or slogans contained in this document are trademarks of Cloudera and its suppliers or licensors, and may not be copied, imitated or used, in whole or in part, without the prior written permission of Cloudera or the applicable trademark holder. If this documentation includes code, including but not limited to, code examples, Cloudera makes this available to you under the terms of the Apache License, Version 2.0, including any required notices. A copy of the Apache License Version 2.0, including any notices, is included herein. A copy of the Apache License Version 2.0 can also be found here: https://opensource.org/licenses/Apache-2.0

Hadoop and the Hadoop elephant logo are trademarks of the Apache Software Foundation. All other trademarks, registered trademarks, product names and company names or logos mentioned in this document are the property of their respective owners. Reference to any products, services, processes or other information, by trade name, trademark, manufacturer, supplier or otherwise does not constitute or imply endorsement, sponsorship or recommendation thereof by us.

Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Cloudera.

Cloudera may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Cloudera, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property. For information about patents covering Cloudera products, see http://tiny.cloudera.com/patents.
The information in this document is subject to change without notice. Cloudera shall not be liable for any damages resulting from technical errors or omissions which may be present in this document, or from use of this document.

Cloudera, Inc.
395 Page Mill Road
Palo Alto, CA 94306
[email protected]
US: 1-888-789-1488
Intl: 1-650-362-0488
www.cloudera.com

Release Information
Version: Cloudera Enterprise 5.11.x
Date: February 3, 2021

Table of Contents

Apache HBase Guide .... 9
  Installation .... 9
  Upgrading .... 9
  Configuration Settings .... 9
  Managing HBase .... 10
  HBase Security .... 10
  HBase Replication .... 10
  HBase High Availability .... 10
  Troubleshooting HBase .... 10
  Upstream Information for HBase .... 11
HBase Installation .... 12
  New Features and Changes for HBase in CDH 5 .... 12
    CDH 5.4 HBase Changes .... 12
    CDH 5.3 HBase Changes .... 14
    SlabCache Has Been Deprecated .... 14
    checkAndMutate(RowMutations) API .... 14
    CDH 5.2 HBase Changes .... 14
    CDH 5.1 HBase Changes .... 17
    CDH 5.0.x HBase Changes .... 21
  Installing HBase .... 22
  Starting HBase in Standalone Mode .... 23
    Installing the HBase Master .... 23
    Starting the HBase Master .... 23
    Installing and Starting the HBase Thrift Server .... 24
    Installing and Configuring HBase REST .... 24
  Configuring HBase in Pseudo-Distributed Mode .... 25
    Modifying the HBase Configuration .... 25
    Creating the /hbase Directory in HDFS .... 26
    Enabling Servers for Pseudo-distributed Operation .... 26
    Installing and Starting the HBase Thrift Server .... 27
  Deploying HBase on a Cluster .... 27
    Choosing Where to Deploy the Processes .... 28
    Configuring for Distributed Operation .... 28
  Accessing HBase by using the HBase Shell .... 29
    HBase Shell Overview .... 29
    Setting Virtual Machine Options for HBase Shell .... 29
    Scripting with HBase Shell .... 29
  Configuring HBase Online Merge .... 30
  Using MapReduce with HBase .... 30
  Troubleshooting HBase .... 31
    Table Creation Fails after Installing LZO .... 31
    Thrift Server Crashes after Receiving Invalid Data .... 31
    HBase is using more disk space than expected .... 31
Upgrading HBase .... 33
  Upgrading HBase from a Lower CDH 5 Release .... 33
Configuration Settings for HBase .... 35
  Using DNS with HBase .... 35
  Using the Network Time Protocol (NTP) with HBase .... 35
  Setting User Limits for HBase .... 35
  Using dfs.datanode.max.transfer.threads with HBase .... 37
  Configuring BucketCache in HBase .... 37
  Configuring Encryption in HBase .... 37
  Configuring Cell Level TTL in HBase .... 37
  Using Hedged Reads .... 38
  Accessing HBase by using the HBase Shell .... 38
    HBase Shell Overview .... 38
    Setting Virtual Machine Options for HBase Shell .... 39
    Scripting with HBase Shell .... 39
  Configuring HBase Online Merge .... 39
  Configuring RegionServer Grouping .... 40
  Configuring the BlockCache .... 43
  Configuring the Scanner Heartbeat .... 43
  Troubleshooting HBase .... 43
Managing HBase .... 44
  Creating the HBase Root Directory .... 44
  Graceful Shutdown .... 44
  Configuring the HBase Thrift Server Role .... 45
  Enabling HBase Indexing .... 45
  Adding a Custom Coprocessor .... 45
  Disabling Loading of Coprocessors .... 46
  Enabling Hedged Reads on HBase .... 46
  Advanced Configuration for Write-Heavy Workloads .... 46
  Managing HBase .... 47
    Creating the HBase Root Directory .... 47
    Graceful Shutdown .... 47
    Configuring the HBase Thrift Server Role .... 48
    Enabling HBase Indexing .... 48
    Adding a Custom Coprocessor .... 48
    Disabling Loading of Coprocessors .... 49
    Enabling Hedged Reads on HBase .... 49
    Advanced Configuration for Write-Heavy Workloads .... 49
  Starting and Stopping HBase .... 50
    Starting or Restarting HBase .... 50
    Stopping HBase .... 50
  Accessing HBase by using the HBase Shell .... 51
    HBase Shell Overview .... 51
    Setting Virtual Machine Options for HBase Shell .... 52
    Scripting with HBase Shell .... 52
  Using HBase Command-Line Utilities .... 52
    PerformanceEvaluation .... 52
    LoadTestTool .... 53
    wal .... 54
    hfile .... 55
    hbck .... 55
    clean .... 56
  Configuring HBase Garbage Collection .... 56
    Configure HBase Garbage Collection Using Cloudera Manager .... 57
    Configure HBase Garbage Collection Using the Command Line .... 57
    Disabling the BoundedByteBufferPool .... 57
  Configuring the HBase Canary .... 58
    Configure the HBase Canary Using Cloudera Manager .... 58
    Configure the HBase Canary Using the Command Line .... 59
  Checking and Repairing HBase Tables .... 59
    Running hbck Manually .... 59
  Hedged Reads .... 60
    Enabling Hedged Reads for HBase Using Cloudera Manager .... 60
    Enabling Hedged Reads for HBase Using the Command Line .... 61
    Monitoring the Performance of Hedged Reads .... 61
  Configuring the Blocksize for HBase .... 61
    Configuring the Blocksize for a Column Family .... 62
    Monitoring Blocksize Metrics .... 62
  Configuring the HBase BlockCache .... 62
    Contents of the BlockCache .... 62
    Deciding Whether To Use the BucketCache .... 63
    Bypassing the BlockCache .... 63
    Cache Eviction Priorities .... 63
    Sizing the BlockCache .... 64
    About the Off-heap BucketCache .... 64
    Configuring the Off-heap BucketCache .... 64
  Configuring the HBase Scanner Heartbeat .... 69
    Configure the Scanner Heartbeat Using Cloudera Manager .... 69
    Configure the Scanner Heartbeat Using the Command Line .... 69
  Limiting the Speed of Compactions .... 70
    Configure the Compaction Speed Using Cloudera Manager .... 70
    Configure the Compaction Speed Using the Command Line .... 70
  Reading Data from HBase .... 71
    Hedged Reads .... 73
    Enabling Hedged Reads for HBase Using the Command Line .... 73
  HBase Filtering .... 73
  Writing Data to HBase .... 81
  Importing Data Into HBase .... 83
    Choosing the Right Import Method .... 83
    Using CopyTable .... 83
    Importing HBase Data From CDH 4 to CDH 5 .... 84
    Using Snapshots .... 86
    Using BulkLoad .... 87
    Using Cluster Replication .... 89
    Using Pig and HCatalog .... 91
    Using the Java API .... 93
    Using the Apache Thrift Proxy API .... 93
    Using the REST Proxy API .... 94
    Using Flume .... 94
    Using Spark .... 96
    Using Spark and Kafka .... 97
    Using a Custom MapReduce Job .... 99
  Configuring and Using the HBase REST API .... 99
    Installing the REST Server .... 99
    Using the REST API .... 100
  Configuring HBase MultiWAL Support .... 107
    Configuring MultiWAL Support Using Cloudera Manager .... 107
    Configuring MultiWAL Support Using the Command Line .... 107
  Storing Medium Objects (MOBs) in HBase .... 108
    Configuring Columns to Store MOBs .... 108
    HBase MOB Cache Properties .... 109
    Configuring the MOB Cache Using Cloudera Manager .... 109
    Configuring the MOB Cache Using the Command Line .... 110
    Testing MOB Storage and Retrieval Performance .... 110
    Compacting MOB Files Manually .... 110
  Configuring the Storage Policy for the Write-Ahead Log (WAL) .... 111
  Exposing HBase Metrics to a Ganglia Server .... 112
    Expose HBase Metrics to Ganglia Using Cloudera Manager .... 112
    Expose HBase Metrics to Ganglia Using the Command Line .... 112
Managing HBase Security .... 113
  HBase Authentication .... 113
  Configuring HBase Authorization .... 113
    Understanding HBase Access Levels .... 114
    Enable HBase Authorization .... 115
    Configure Access Control Lists for Authorization .... 116
  Configuring the HBase Thrift Server Role .... 117
  Other HBase Security Topics .... 117
HBase Replication .... 118
  Common Replication Topologies .... 118
  Notes about Replication .... 119
  Requirements .... 119
  Deploying HBase Replication .... 119
  Configuring Secure Replication .... 121
  Disabling Replication at the Peer Level .... 123
  Stopping Replication in an Emergency .... 123
  Creating the Empty Table On the Destination Cluster .... 124
  Initiating Replication When Data Already Exists .... 124
  Understanding How WAL Rolling Affects Replication .... 125
  Configuring Secure HBase Replication .... 126
RestoringDataFromAReplica.........................................................................................................................126 VerifyingthatReplicationisWorking...............................................................................................................126 Replication Caveats..........................................................................................................................................128 HBase High Availability.........................................................................................129 EnablingHBaseHighAvailabilityUsingClouderaManager.............................................................................129 EnablingHBaseHighAvailabilityUsingtheCommandLine.............................................................................129 HBase Read Replicas........................................................................................................................................129 TimelineConsistency..........................................................................................................................................................130 Keeping Replicas Current...................................................................................................................................................130 EnablingReadReplicaSupport..........................................................................................................................................130 ConfiguringRackAwarenessforReadReplicas.................................................................................................................133 ActivatingReadReplicasOnaTable..................................................................................................................................133 RequestingaTimeline-ConsistentRead.............................................................................................................................133 
Troubleshooting HBase ... 135
  Table Creation Fails after Installing LZO ... 135
  Thrift Server Crashes after Receiving Invalid Data ... 135
  HBase is using more disk space than expected ... 135
  HiveServer2 Security Configuration ... 136
    Enabling Kerberos Authentication for HiveServer2 ... 137
    Using LDAP Username/Password Authentication with HiveServer2 ... 138
    Configuring LDAPS Authentication with HiveServer2 ... 139
    Pluggable Authentication ... 140
    Trusted Delegation with HiveServer2 ... 140
    HiveServer2 Impersonation ... 141
    Securing the Hive Metastore ... 141
    Disabling the Hive Security Configuration ... 142
  Hive Metastore Server Security Configuration ... 142
  Using Hive to Run Queries on a Secure HBase Server ... 143
Appendix: Apache License, Version 2.0 ... 145

Apache HBase Guide

Apache HBase is a scalable, distributed, column-oriented datastore. Apache HBase provides real-time read/write random access to very large datasets hosted on HDFS.

Installation

HBase is part of the CDH distribution. On a cluster managed by Cloudera Manager, HBase is included with the base CDH installation and does not need to be installed separately.

On a cluster not managed using Cloudera Manager, you can install HBase manually, using packages or tarballs with the appropriate command for your operating system.

To install HBase on RHEL-compatible systems:

$ sudo yum install hbase

To install HBase on Ubuntu and Debian systems:

$ sudo apt-get install hbase

To install HBase on SLES systems:

$ sudo zypper install hbase

Note: See also Starting HBase in Standalone Mode on page 23, Configuring HBase in Pseudo-Distributed Mode on page 25, and Deploying HBase on a Cluster on page 27 for more information on configuring HBase for different modes.

For more information, see HBase Installation on page 12.

Upgrading

For information about upgrading HBase from a lower version of CDH 5, see Upgrading HBase from a Lower CDH 5 Release on page 33.

Note: To see which version of HBase is shipping in CDH 5, check the Version and Packaging Information. For important information on new and changed components, see the CDH 5 Release Notes.

Important: Before you start, make sure you have read New Features and Changes for HBase in CDH 5 on page 12, and check the Known Issues in CDH 5 and Incompatible Changes and Limitations for HBase.

Configuration Settings

HBase has a number of settings that you need to configure. For information, see Configuration Settings for HBase on page 35.
By default, HBase ships configured for standalone mode. In this mode of operation, a single JVM hosts the HBase Master, an HBase RegionServer, and a ZooKeeper quorum peer. HBase stores your data in a location on the local filesystem, rather than using HDFS. Standalone mode is only appropriate for initial testing.

Pseudo-distributed mode differs from standalone mode in that each of the component processes runs in a separate JVM. It differs from distributed mode in that all of the separate processes run on the same server, rather than on multiple servers in a cluster. For more information, see Configuring HBase in Pseudo-Distributed Mode on page 25.

Managing HBase

You can manage and configure various aspects of HBase using Cloudera Manager. For more information, see Managing the HBase Service.

HBase Security

For the most part, securing an HBase cluster is a one-way operation, and moving from a secure to an unsecure configuration should not be attempted without contacting Cloudera support for guidance. For an overview of security in HBase, see Managing HBase Security on page 113.

For information about authentication and authorization with HBase, see HBase Authentication and Configuring HBase Authorization.

HBase Replication

If your data is already in an HBase cluster, replication is useful for getting the data into additional HBase clusters. In HBase, cluster replication refers to keeping the state of one cluster synchronized with that of another cluster, using the write-ahead log (WAL) of the source cluster to propagate the changes. Replication is enabled at column family granularity. Before enabling replication for a column family, create the table and all column families to be replicated on the destination cluster.

Cluster replication uses an active-push methodology. An HBase cluster can be a source (also called active, meaning that it writes new data), a destination (also called passive, meaning that it receives data using replication), or can fulfill both roles at once. Replication is asynchronous, and the goal of replication is eventual consistency.
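The setup order described above (create the table on the destination first, then enable replication per column family on the source) can be sketched as a small helper that only prints HBase shell commands instead of running them. The peer ID, ZooKeeper cluster key, table name, and column family name are placeholders, and the function names are hypothetical. The `create`, `add_peer`, `disable`, `alter ... REPLICATION_SCOPE`, and `enable` commands are standard HBase shell commands, but their exact syntax varies between HBase versions, so treat this as an assumption to verify against your release.

```shell
#!/bin/sh
# Sketch only: prints HBase shell commands; it does not contact any cluster.
# All argument values (peer ID, cluster key, table, family) are placeholders.

# Commands for the DESTINATION cluster: the table and its column family
# must exist there before replication is enabled on the source.
destination_commands() {
  table="$1"; family="$2"
  printf "create '%s', '%s'\n" "$table" "$family"
}

# Commands for the SOURCE cluster: register the destination as a peer,
# then turn on replication for a single column family.
source_commands() {
  peer_id="$1"; cluster_key="$2"; table="$3"; family="$4"
  printf "add_peer '%s', '%s'\n" "$peer_id" "$cluster_key"
  printf "disable '%s'\n" "$table"
  printf "alter '%s', {NAME => '%s', REPLICATION_SCOPE => '1'}\n" "$table" "$family"
  printf "enable '%s'\n" "$table"
}
```

For example, `source_commands 1 'zk1.example.com:2181:/hbase' example_table example_family` prints four lines that can be reviewed and then pasted or piped into `hbase shell` on the source cluster. Printing rather than executing keeps the sketch safe to run on a host without HBase installed.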
When data is replicated from one cluster to another, the original source of the data is tracked with a cluster ID, which is part of the metadata. In CDH 5, all clusters that have already consumed the data are also tracked. This prevents replication loops.

For more information about replication in HBase, see HBase Replication on page 118.

HBase High Availability

Most aspects of HBase are highly available in a standard configuration. A cluster typically consists of one Master and three or more RegionServers, with data stored in HDFS. To ensure that every component is highly available, configure one or more backup Masters. The backup Masters run on different hosts from the active Master.

For information about configuring high availability in HBase, see HBase High Availability on page 129.

Troubleshooting HBase

The Cloudera HBase packages have been configured to place logs in /var/log/hbase. Cloudera recommends tailing the .log files in this directory when you start HBase to check for any error messages or failures.

For information about HBase troubleshooting, see Troubleshooting HBase on page 31.
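As a rough illustration of the log check recommended above, the following sketch lists the lines in the .log files that look like errors. The function name is hypothetical, and the ERROR/FATAL pattern assumes log4j-style level names surrounded by spaces, so adjust the pattern if your log layout differs.

```shell
#!/bin/sh
# Sketch only: print ERROR/FATAL lines from HBase .log files.
# The default directory is the one named in the text; the severity
# pattern is an assumption about the log format.
scan_hbase_logs() {
  log_dir="${1:-/var/log/hbase}"
  for f in "$log_dir"/*.log; do
    # Skip the unexpanded glob when the directory has no .log files.
    [ -f "$f" ] || continue
    # -H prefixes each match with its file name; a file with no
    # matches is not treated as a failure.
    grep -HE ' (ERROR|FATAL) ' "$f" || true
  done
}
```

Calling `scan_hbase_logs` with no argument scans /var/log/hbase. To follow the logs live while HBase starts, `tail -F /var/log/hbase/*.log` is the usual alternative.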