Troubleshooting Cisco APIC-EM Single and Multi-Host

Cisco Application Policy Infrastructure Controller Enterprise Module Troubleshooting Guide, Release 1.4.x

The following information may be used to troubleshoot Cisco APIC-EM single and multi-host:
• Recovery Procedures for Cisco APIC-EM Node Failures
• Removing a Single Host from a Multi-Host Cluster
• Removing a Faulted Host from a Multi-Host Cluster
• Resetting the Cisco APIC-EM
• Adding a New Host to a Multi-Host Cluster
• Shutting Down and Starting Up a Host in a Multi-Host Cluster
• Confirming the Multi-Host Cluster Configuration Values
• Changing the Settings in a Multi-Host Cluster

Recovery Procedures for Cisco APIC-EM Node Failures

The following table describes recommended procedures to take to resolve a Cisco APIC-EM single node failure scenario.

Table 1: Single Host Recovery Procedures

Node Failure Scenario: Power outage
Symptoms and Recovery Procedures:
In most cases, the node should recover automatically when the power is restored. In rare situations, some of the APIC-EM services may not come up cleanly due to some transient conditions. In such cases, you would need to execute the following steps to ensure that the node comes back online cleanly:
1 If you have not already done so, restart the power on the failed host.
2 Reset the host.
  See Resetting the Cisco APIC-EM.

Node Failure Scenario: Bad or faulty hardware
Symptoms and Recovery Procedures:
Perform the following steps to recover from a node failure scenario due to bad or faulty hardware:
1 RMA the bad or faulty hardware.
2 Reinstall new hardware.
  See the Cisco Application Policy Infrastructure Controller Enterprise Module Installation Guide.
3 Install Cisco APIC-EM controller software on the new hardware.
  See the Cisco Application Policy Infrastructure Controller Enterprise Module Installation Guide.
4 Restore your database backup using the controller's GUI.
  See the Cisco Application Policy Infrastructure Controller Enterprise Module Administrator Guide.
5 Ensure that you have installed and enabled any applications that were previously running on the controller.
  See the Cisco Application Policy Infrastructure Controller Enterprise Module Administrator Guide.
6 If applicable to your configuration, add the new host to the cluster.
  See Adding a New Host to a Multi-Host Cluster.

Node Failure Scenario: Controller software upgrade failure
Symptoms and Recovery Procedures:
In this case, to recover from the upgrade failure and return to the current Cisco APIC-EM version, perform the following steps:
1 Restore your database backup using the controller's GUI.
  See the Cisco Application Policy Infrastructure Controller Enterprise Module Administrator Guide.
2 Ensure that you have installed and enabled any applications that were previously running on the controller.
  See the Cisco Application Policy Infrastructure Controller Enterprise Module Administrator Guide.

The following table describes recommended procedures to take to resolve a Cisco APIC-EM multi-host (node) failure scenario.

Table 2: Multi-Host Recovery Procedures

Node Failure Scenario: Power outage causing one or more of the cluster nodes to go down.
Symptoms and Recovery Procedures:
In most cases, the host(s) should rejoin the Cisco APIC-EM cluster on its own when the power is restored. In rare situations, some of the Cisco APIC-EM services may not form the cluster with the existing Cisco APIC-EM hosts. In such cases, you would need to execute the following steps to ensure that the failed host joins the cluster:
1 If you have not already done so, restart the power on the failed host.
2 Reset the host.
  See Resetting the Cisco APIC-EM.
Note: If after a power outage, the host does not come back up, then follow the procedures directly below for recovering from bad or faulty hardware.

Node Failure Scenario: Bad or faulty hardware on one of the cluster nodes.
Symptoms and Recovery Procedures:
In this case, you would need to first remove the faulty (bad) host from the cluster and then add the new host to the cluster. Perform the following steps:
1 RMA the bad or faulty hardware.
2 Reinstall new hardware.
  See the Cisco Application Policy Infrastructure Controller Enterprise Module Installation Guide.
3 Install Cisco APIC-EM controller software on the new hardware.
  See the Cisco Application Policy Infrastructure Controller Enterprise Module Installation Guide.
4 Restore your database backup using the controller's GUI.
  See the Cisco Application Policy Infrastructure Controller Enterprise Module Administrator Guide.
5 Ensure that you have installed and enabled any applications that were previously running on the controller.
  See the Cisco Application Policy Infrastructure Controller Enterprise Module Administrator Guide.
6 Add the new host to the cluster.
  See Adding a New Host to a Multi-Host Cluster.

Node Failure Scenario: Network connectivity issues between the cluster nodes.
Symptoms and Recovery Procedures:
In most cases, the node(s) should rejoin the Cisco APIC-EM cluster on its own when the network connectivity is restored. In rare situations, some of the Cisco APIC-EM services may not form the cluster with the existing Cisco APIC-EM nodes. In such cases, you would need to execute the following steps to ensure that the failed node joins the cluster:
1 Reset the host.
  See Resetting the Cisco APIC-EM.

Node Failure Scenario: Controller software upgrade failure on one of the cluster hosts.
Symptoms and Recovery Procedures:
In this case, to recover from the upgrade failure and
return to the current Cisco APIC-EM version, perform the following steps:
1 Restore your database backup using the controller's GUI.
  See the Cisco Application Policy Infrastructure Controller Enterprise Module Administrator Guide.
2 Ensure that you have installed and enabled any applications that were previously running on the controller.
  See the Cisco Application Policy Infrastructure Controller Enterprise Module Administrator Guide.

Node Failure Scenario: Hardware upgrade on one of the cluster nodes.
Symptoms and Recovery Procedures:
Gracefully shut down the host, upgrade the hardware (RAM, CPU, etc.), and restart the host.
See Shutting Down and Starting Up a Host in a Multi-Host Cluster.

Removing a Single Host from a Multi-Host Cluster

To troubleshoot an issue with a multi-host cluster, you may need to remove a single host from a multi-host cluster. This procedure describes how to remove one of the hosts running Cisco APIC-EM from a multi-host cluster. You use the Cisco APIC-EM configuration wizard to perform this procedure.

Note: The configuration wizard option to remove a host only appears if the host on which you are running the configuration wizard is part of a multi-host cluster. If the host is not part of a multi-host cluster, then the option to remove a host does not display. When performing this procedure, controller downtime occurs. For this reason, we recommend that you perform this procedure during a maintenance time period.

Before You Begin

You should have installed the Cisco APIC-EM on a multi-host cluster as described in the Cisco Application Policy Infrastructure Controller Enterprise Module Installation Guide.
You must perform this procedure on the single host that is to be removed from the multi-host cluster.
The multi-host cluster should still be operational.

Step 1 Using a Secure Shell (SSH) client, log in to the host (appliance, server, or virtual machine) with the IP address that you specified using the configuration wizard.
Note: The IP address to enter for the SSH client is the IP address that you configured for the network adapter. This IP address connects the appliance to the external network.
Step 2 When prompted, enter your Linux username ('grapevine') and password for SSH access.
Step 3 Enter the following command to access the configuration wizard.
  $ config_wizard
Note: The config_wizard command is in the PATH of the 'grapevine' user, and not the "root" user. Either run the command as the "grapevine" user, or fully qualify the command as the "root" user. For example:
  /home/grapevine/bin/config_wizard
Step 4 Review the Welcome to the APIC-EM Configuration Wizard! screen and choose the option to remove the host from the cluster:
• Remove this host from its APIC-EM cluster
Step 5 A message appears with the following options:
• [cancel]: Exit the configuration wizard.
• [proceed]: Begin the process to remove this host from its cluster.
Choose proceed>> to begin. After choosing proceed>>, the configuration wizard begins to remove this host from its cluster.
Step 6 At the end of this process, you must then either run the configuration wizard again to configure the host as a new Cisco APIC-EM or join the Cisco APIC-EM to a cluster.
Important: If you wish to use this host again as either a stand-alone controller or operating within a cluster, then you must run the configuration wizard again and re-install the Cisco APIC-EM. Do not attempt to use this host again as either a standalone host or within a cluster without re-installing the Cisco APIC-EM.

Removing a Faulted Host from a Multi-Host Cluster

Perform the steps in the following procedure to remove a faulted or inoperative host (running Cisco APIC-EM) from a multi-host cluster. You use the Cisco APIC-EM configuration wizard to perform this procedure. A host becomes faulted when it can no longer participate in the cluster due to hardware or software issues.
After following this procedure on a three host cluster (moving from three hosts to two hosts), you will lose high-availability protection against loss of a host. After following this procedure for a two host cluster, then the cluster will become inoperable until that second host is brought back up and added to the cluster.
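
The two cluster-size cases described above can be restated as a short shell sketch. The function name removal_impact is hypothetical and is not a Cisco APIC-EM command; it only echoes the consequences stated in this section and does not query a real cluster.

```shell
#!/bin/sh
# Hypothetical helper, not part of Cisco APIC-EM: restates the impact of
# removing one faulted host for the cluster sizes described in this section.
# Argument: number of hosts in the cluster before the removal.
removal_impact() {
  case "$1" in
    3) echo "2 hosts remain: high-availability protection against loss of a host is lost" ;;
    2) echo "1 host remains: cluster is inoperable until the second host is brought back up and added to the cluster" ;;
    *) echo "cluster size $1 is not covered by this section" ;;
  esac
}

removal_impact 3
removal_impact 2
```

The sketch makes the asymmetry explicit: removing a host from a three host cluster costs redundancy, while removing a host from a two host cluster costs availability.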

Note: The fact that the host becomes "faulted" results in replacement instances of the services on the faulted host being grown on the remaining hosts in the cluster. During the time period when the replacement instances are being grown and depending on the types of services being grown, certain Cisco APIC-EM functionality may not be available.

Before You Begin

You have installed the Cisco APIC-EM on a multi-host cluster following the procedure described in the Cisco Application Policy Infrastructure Controller Enterprise Module Installation Guide.
You must perform this procedure on an active host in the multi-host cluster. You cannot perform this procedure on the faulted host that is to be removed from the multi-host cluster. A faulted host is displayed as red in the System Health tab view in the Home page of the controller's GUI.

Note: You should always first attempt to bring the faulted host back online. After determining that the faulted host can no longer participate in the cluster, then try to remove the faulted host using the Remove this host from its APIC-EM cluster configuration wizard option (as described in the previous procedure). You should only follow this procedure and the Remove a faulted host from this APIC-EM cluster configuration wizard option, if that other option is tried first and is unsuccessful in removing the host.

Step 1 Using a Secure Shell (SSH) client, log in to the host (appliance, server, or virtual machine) with the IP address that you specified using the configuration wizard.
Note: The IP address to enter for the SSH client is the IP address that you configured for the network adapter. This IP address connects the appliance to the external network.
Step 2 When prompted, enter your Linux username ('grapevine') and password for SSH access.
Step 3 Enter the following command to access the configuration wizard.
  $ config_wizard
Note: The config_wizard command is in the PATH of the 'grapevine' user, and not the "root" user. Either run the command as the "grapevine" user, or fully qualify the command as the "root" user. For example:
  /home/grapevine/bin/config_wizard
Step 4 Review the Welcome to the APIC-EM Configuration Wizard! screen and choose the option to forcibly remove the faulted host from the cluster:
• Remove a faulted host from this APIC-EM cluster
Step 5 A message appears with the following options:
• <Remove IP Address from cluster>: Forcibly removes the faulted host (identified by its IP address) from the multi-host cluster.
• <exit>: Exit the configuration wizard without removing the faulted host.
Choose <Remove IP Address from cluster> to begin. After choosing <Remove IP Address from cluster>, the configuration wizard begins to remove this faulted host from its cluster.
Step 6 At the end of this process, you must then either run the configuration wizard again to configure the host as a new controller or join the controller to a cluster.
Important: If you wish to use this host again as either a stand-alone controller or operating within a cluster, then you must run the configuration wizard again and re-install the Cisco APIC-EM. Do not attempt to use this host again as either a standalone host or within a cluster without re-installing the Cisco APIC-EM.

Resetting the Cisco APIC-EM

You can troubleshoot a Cisco APIC-EM deployment by resetting the controller back to configuration values that were originally set using the configuration wizard the first time. A reset of the controller is helpful when the controller has gotten itself into an unstable state and other troubleshooting activities have not resolved the situation.

Note: In a multi-host environment, you need to perform this procedure on only a single host. After performing this procedure on a single host, the other two hosts will be automatically reset.

Before You Begin

You have installed the Cisco APIC-EM following the procedure described in the Cisco Application Policy Infrastructure Controller Enterprise Module Installation Guide.

Step 1 Using a Secure Shell (SSH) client, log in to the host (physical or virtual) with the IP address that you specified using the configuration wizard.
Note: The IP address to enter for the SSH client is the IP address that you configured for the network adapter. This IP address connects the host to the external network.
Step 2 When prompted, enter your Linux username ('grapevine') and password for SSH access.
Step 3 Navigate to the bin directory on the Grapevine root. The bin directory contains the grapevine scripts.
Step 4 Enter the reset_grapevine command at the prompt to run the reset grapevine script.
  $ reset_grapevine
The reset_grapevine command returns the configuration settings back to values that you configured when running the configuration wizard for the first time. The configuration settings are saved to a .JSON file. This .JSON file is located at: /etc/grapevine/controller-config.json. The reset_grapevine command uses the data in the controller-config.json file to return to the earlier configuration settings, so do not delete this file. If you delete this file, you must run the configuration wizard again and reenter your configuration data.
Important: The reset_grapevine command will terminate if the SSH connection is disconnected for any reason. To avoid this, we recommend that you use tmux (terminal multiplexer), which is already installed on the controller, to run the reset_grapevine command in the session. You can use the following commands for tmux:

  tmux new -s session_name reset_grapevine
    Command to create a new session using tmux for reset_grapevine.
    For example, you can enter the following command:
    tmux new -s session100 reset_grapevine
  tmux ls
    Command to view the list of tmux sessions.
  tmux attach -t session_name
    Command to attach to a tmux session.
    For example, you can enter the following command:
    tmux attach -t session200
  To get more information about tmux, you can run the man tmux command.

After entering the reset_grapevine command, you are then prompted to reenter your Grapevine password.
Step 5 Enter your Grapevine password a second time.
  [sudo] password for grapevine:********
You are then prompted to delete all virtual disks. The virtual disks are where the Cisco APIC-EM database resides. For example, data about devices that the controller discovered are saved on these virtual disks. If you enter yes (y), all of this data is deleted. If you enter no (n), then the new cluster will come up populated with your existing data once the reset procedure completes.
Step 6 Enter n to prevent the deletion of all of the virtual disks.
  THIS IS A DESTRUCTIVE OPERATION
  Do you want to delete all VIRTUAL DISKS in your APIC-EM cluster? (y/n):n
You are then prompted to delete all Cisco APIC-EM authentication timeout policies, user password policies, and user accounts other than the primary administrator account.
Step 7 Enter n to prevent the deletion of all authentication timeout policies, user password policies, and user accounts other than the primary administrator account.
  THIS IS A DESTRUCTIVE OPERATION
  Do you want to delete authentication timeout policies, user password policies, and Cisco APIC-EM user accounts other than the primary administrator account? (y/n): n
You are then prompted to delete any imported certificates.
Step 8 Enter n to prevent the deletion of any imported certificates.
  THIS IS A DESTRUCTIVE OPERATION
  Do you want to delete the imported certificates? (y/n): n
You are then prompted to delete any backups.
Step 9 Enter n to prevent the deletion of any backups.
  THIS IS A DESTRUCTIVE OPERATION
  Do you want to delete the backups? (y/n): n
The controller then resets itself with the configuration values that were originally set using the configuration wizard the first time. When the controller is finished resetting, you are presented with a command prompt from the controller.
Step 10 Using the Secure Shell (SSH) client, log out of the host.

Adding a New Host to a Multi-Host Cluster

Perform the steps in this procedure to configure Cisco APIC-EM on your host and to join it to another, pre-existing host to create a cluster. Configuring the Cisco APIC-EM on multiple hosts to create a cluster is best practice for both high availability and scale.

Caution:
• When joining a host to a cluster as described in the procedure below, there is no merging of the data on the two hosts. The data that currently exists on the host that is joining the cluster is erased and replaced with the data that exists on the cluster that is being joined to.
• When joining the additional hosts to form a cluster, be sure to join only a single host at a time. You should not join multiple hosts at the same time, as doing so will result in unexpected behavior.
• You should also expect some service downtime when adding hosts to or removing hosts from a cluster, since the services are then redistributed across the hosts. Be aware that during the service redistribution, there will be downtime.

Before You Begin

You must have performed the following prerequisites:
• You must have either received a Cisco APIC-EM Controller Appliance with the Cisco APIC-EM pre-installed or you must have downloaded, verified, and installed the Cisco ISO image onto a second server or virtual machine.
• You must have already configured Cisco APIC-EM on the first host (server or virtual machine) in your planned multi-host cluster following the steps in the previous procedure.
• Additionally, you must have checked the controller's health on the first host using the SYSTEM HEALTH tab in the GUI. The SYSTEM HEALTH tab is directly accessible from the HOME page. For information