Oracle9i Data Mining Administrator’sGuide Release2(9.2) March2002 PartNo. A95959-01 1 Introduction ThisdocumentdescribeshowtoinstalltheOracle9iDataMining(ODM) softwareandhowtoperformotheradministrativefunctionscommontoall ODMadministrationonbothUNIXandWindowsplatforms. 1.1 IntendedAudience Thisadministrator’sguideisintendedforanyoneplanningtoinstalland runOracle9iDataMining—eitheradatabaseadministratororasystem administrator. 1.2 Structure Thisguidecontainsthefollowingsections: (cid:1) Section2,"Overview":BrieflydescribesOracle9iDataMiningrelease2. (cid:1) Section3,"Oracle9iDataMiningInstallation":Describesthegeneric installationsteps(platform-specificinformationisinthe platform-specificreleasenotes). (cid:1) Section4,"Oracle9iDataMiningAdministration":Describestopicsof interesttoadministrators,includingimprovingODMperformance, startingandstoppingtheODMtaskmonitor,detectingerrors,etc. 1.3 WheretoFindFurther Information ThedocumentationsetforOracle9iDataMiningispartoftheOracle9i DatabaseDocumentationLibrary;theODMdocumentsetconsistsofthe followingdocuments: Oracleisaregisteredtrademark,andOracle9i,SQL*Net,andSQL*PlusaretrademarksorregisteredtrademarksofOracle Corporation.Othernamesmaybetrademarksoftheirrespectiveowners. Copyright (cid:1)2002,OracleCorporation. AllRightsReserved. (cid:1) Oracle9iDataMiningAdministrator’sGuide,Release2(9.2)(this document,whichincludesinstallationinstructionsthatarethesame acrossallplatforms). (cid:1) Oracle9iDataMiningConcepts,Release2(9.2). FordetailedinformationabouttheODMAPI,seetheODMJavadocinthe directory$ORACLE_HOME/dm/doconanysystemwhereODMisinstalled. TopreparetheJavadocforuseraccess,seeSection4.3. 1.3.1 RelatedManuals FormoreinformationabouttheOracle9idatabase,see: (cid:1) Oracle9iAdministrator’sGuide,Release2(9.2) (cid:1) Releasenotesforyourplatform. (cid:1) OracleUniversalInstallerConceptsGuide (cid:1) Oracle9iDatabaseMigration 1.4 Conventions Inthismanual,WindowsreferstotheWindowsNT,Windows2000,and WindowsXPoperatingsystems. TheSQLinterfacetoOracle9iisreferredtoasSQL.Thisinterfaceisthe Oracle9iimplementationoftheSQLstandardANSIX3.135-1992,ISO 9075:1992,commonlyreferredtoastheANSI/ISOSQLstandardorSQL92. Inexamples,animpliedcarriagereturnoccursattheendofeachline, unlessotherwisenoted.YoumustpresstheReturnkeyattheendofaline ofinput. 2 Overview Oracle9iDataMining(ODM)allowsapplicationprogrammerstoperform dataminingintheOracle9idatabase.Allmodel-buildingandscoring functionsareaccessiblethroughaJava-basedAPI.TheOracle9idatabase providestheinfrastructureforapplicationdeveloperstobuildintegrated applications,withcompleteprogrammaticcontrolofdataminingfunctions todeliverdataminingwithinthedatabase. ODMrelease2hasmanynewfeatures;fordetails,seeOracle9iDataMining Concepts. ODMrelease2runsonRealApplicationClusters. 2 3 Oracle9i Data Mining Installation ThissectioncontainsgenericODMrequirements,aninstallationoverview, anddetailedinstallationsteps. 3.1 ODM Requirements ODMisanoptiontoOracle9iEnterpriseEdition. ODMhasthefollowinggeneralsoftwarerequirements: (cid:1) TheODMAPIrequiresJava1.3.1_01 (cid:1) ODMusesthejava.sqlpackagethatisincludedinJDK1.2knownas theJDBC2.0API 3.1.1 ODMonaRealApplicationCluster(RAC) IfyouplantorunODMonaRAC,theODMtablespacemustbeatleast 250MB.(InaRACenvironment,tablespacesarebuiltonarawdevice;they arenotgrowableasonUFS.) 3.2 InstallationOverview ThisdocumentprovidesthegenericinstructionsforinstallingOracle9iData Mining. BeforeyouinstallODM,confirmthatyoursystemsatisfiesthesoftwareand hardwarerequirementsforOracle9iEnterpriseEdition,asdescribedinthe releasenotesforyourplatform.Youshouldalsoensurethatyoursystem containsenoughspaceforthetablesthatyouplantouseduringdata mining. TherearethreecommoncasesforinstallingODM: (cid:1) Oracle9iandODMarenotinstalledonyoursystem(Section3.2.1). (cid:1) Oracle9irelease1(orearlier)isinstalledonyoursystem(Section3.2.2) (cid:1) Oracle9irelease2isinstalledonyoursystem(Section3.2.2) ToinstallODMonanOracle9iRealApplicationCluster,seeSection3.3. 3.2.1 Oracle9iNotInstalled IfyouareinstallingODMonasystemwherethedatabaseisnotinstalled, therearetwobasicwaystoinstallOracle9iEnterpriseEdition: 1. Createadatabaseusingtheseeddatabase(Section3.2.1.1). 2. Createadatabasebutdonotusetheseeddatabase,thatis,createa customdatabase(Section3.2.1.2). 3 3.2.1.1 ODMInstallationWithaSeedDatabase Oracleprovidesaseedor preconfigureddatabasethatautomaticallyincludesfeaturesthatresultina highlyeffectiveandeasiertomanagedatabase. FollowthesestepstoinstallOracle9iandODM: 1. StartOracleUniversalInstaller(OUI).Fordetails,seetheOracle UniversalInstallerConceptsGuide. 2. SelecttheEnterpriseEditionofOracle9i.Select"CreateGeneralPurpose Database".Ifyoudonotwishtocreateageneralpurposedatabase,see Section3.2.1.2forinformationaboutinstallingODMwithacustom database. 3. Whentheinstallationcompletessuccessfully,unlocktheODMand ODM_MTRaccountsandchangethedefaultpasswordsusingthe followingcommands: alter user odm identified by new_ODM_password account unlock; alter user odm_mtr identified by new_MTR_password account unlock; wherenew_ODM_password isthenewpasswordfortheODMaccountand new_MTR_password isthenewpasswordfortheODM_MTRaccount. 4. LogintothedatabaseasODMuserandstarttheODMMonitorwith thefollowingSQL*Pluscommand: exec DM_START_MONITOR Aftersuccessfulinstallation,allODMsoftwareislocatedinthe$ORACLE_ HOME/dmdirectory. 3.2.1.2 ODMInstallationWithaCustomDatabase Creatingacustomdatabase takeslongerthancreatingaseedone,butgivesyoufullcontroltospecify andchangealldatabaseparameters. ThesearethemajorstepsrequiredtoinstallODMwithoutusingaseed database: 1. InstallOracle9iEnterpriseEditionandcreateacustomdatabase.See Section3.2.3forinformationaboutdatabaseparametersrequiredby ODM. 2. RuntheOracleDatabaseConfigurationAssistant(DBCA)tooltoinstall theODMoption.DBCAisdescribedintheOracle9iDatabase Administrator'sGuide. FollowthesestepstoinstallODMafteryouhavecreatedyourcustom database: 1. StartDBCA.PicktheODMoption. 4 2. LogintothedatabaseasODMuserandstarttheODMMonitorwith thefollowingSQL*Pluscommand: exec DM_START_MONITOR Aftersuccessfulinstallation,allODMsoftwareislocatedinthe$ORACLE_ HOME/dmdirectory. 3.2.2 Oracle9iorEarlierInstalled IfOracle9irelease1orearlierisinstalledonyoursystem,followthesesteps: 1. UpgradethedatabasetoOracle9iRelease2.IfODM9.0.1isinstalledon yoursystem,itisupgradedtoODMrelease2whenthedatabaseis upgraded. 2. IfODMisnotinstalledonyoursystem,installandconfigureODM release2asdescribedinSection3.2.1. Forinformationaboutupgradingthedatabase,seeOracle9iDatabase Migration.ForinformationaboutupgradingODM,seeSection3.4. 3.2.3 ODMDatabaseParameters ThedefaultvalueofinitialparametersinanOraclepreconfigured(seed) databaseisgenerallysufficientforrunningODM.Ifyouarecreatinga customdatabase,thefollowingparametersettingscanbeusedasageneral guidelineforthedatabase.TheODM-specificparametersmustbeset exactlyasshown.Werecommendthatyoutuneotherparametersbasedon yourhardwareresourcecapacity,volumeofyourinputdatasets,andother characteristicsofyoursystem: ########################################### # Cache and I/O ########################################### db_block_size=8192 db_cache_size=67108864 db_file_multiblock_read_count=16 ########################################### # Cursors and Library Cache ########################################### open_cursors=300 ########################################### # Miscellaneous ########################################### compatible=9.2.0.0.0 5 ########################################### # Pools ########################################### java_pool_size=67108864 large_pool_size=10M shared_pool_size=67108864 ########################################### # Optimizer ########################################### hash_join_enabled=TRUE ########################################### # Processes and Sessions ########################################### processes=150 ########################################### # Sort, Hash Joins, Bitmap Indexes ########################################### sort_area_size=5242880 sort_area_retained_size=2097152 ########################################### # ODM Specific ########################################### aq_tm_processes = 1 job_queue_processes =10 ########################################### # Parallel setting, adjust according to CPU number ########################################### parallel_max_servers = 32 parallel_min_servers = 2 3.3 ODM InstallationonaRealApplicationCluster ODMinstallationonaRealApplicationCluster(RAC)issimilartoODM installationonanon-RACsystem.IfyouuseOracleUniversalInstallerto createthepreconfigureddatabaseonRAC,ODMwillbeinstalledinthis databasejustasitisinanon-RACenvironment. IfyouchoosetocreateacustomdatabaseonyourRACandinstallODM there,werecommendthatyouconfiguretheODMtablespacewitharaw devicepartitionofatleast250MB. 6 3.4 UpgradingODM ODMupgradeispartoftheOracledatabaseupgradeprocess.IfyourODM 9.0.1releaseisinstalledinanOracle9irelease1databaseandyourODM schemanameisODM,theDatabaseUpgradeAssistant(DBUA)will upgradeODMfrom9.0.1to9.2.0atthesametimethatthedatabaseis upgraded.AllODM9.2.0relatedfileswillbelocatedinthe$ORACLE_ HOME/dmdirectory.Whenthedatabaseupgradecompletes,ODMwillbeat the9.2.0releaselevel. Note: Forrelease2ofODM,ODMupgradehasthe followingrestrictions: (cid:1) YoucannotdowngradefromODMrelease2. (cid:1) User-createdbinboundariestablesarenotupgraded. FordetailedinformationaboutupgradinganOracledatabase,seethe Oracle9iDatabaseMigrationmanual. TheODMupgradehastwoparts: (cid:1) ODMschemaupgrade (cid:1) ODMJavaobjectupgrade 3.4.1 ODMSchemaUpgrade WhenyouupgradeODMfromrelease1torelease2,theODMschema repositoryisupgradedfrom9.0.1to.9.2.0.Therearemanychangesinthe ODMschemadefinitionfor9.2.0. Aftertheupgrade,allODM9.0.1modelswillcontinuetofunctioninthe ODM9.2.0environment.AssociationRulesandNaiveBayesmodelsare fullymigratedtoODM9.2.0. ThedatabaseupgradeutilitycallsODMupgradescriptsduringthe databaseupgradeprocess;theODMscriptsperformthefollowingactions: (cid:1) UpgradeODM9.0.1tabledefinitionandmigrate9.0.1ODMmodel representationto9.2.0ODMmodels (cid:1) RemoveODM9.0.1tablesanddependentobjects (cid:1) RemoveODM9.0.1Javaobjects (cid:1) UpgradeallJavaobjectsto9.2.0 (cid:1) UpgradeODM9.0.1JavaobjectsresidedinODMschema(BLOB) (cid:1) InstallODM9.2.0newschemaobjects 7 (cid:1) InstallODM_MTR9.2.0schemaobjects Allupgradescriptsareinthe$ORACLE_HOME/dm/admindirectory.The top-levelupgradescriptforODMis $ORACLE_HOME/rdbms/admin/odmdbmig.sql 3.4.2 ODMJavaObjectUpgrade Allminingmodelsrecordedinschemawillbemigratedto9.2.0format;that is,allNaiveBayesandAssociationRulesmodelsareupgraded. YoucannotdowngradeODMJavaobjectsatthisrelease. 3.5 DeinstallingODM Ifyouwanttopreserveexistingminingmodels,youshouldnotdeinstall ODM.ToinstallanewerversionofODM,upgradeasdescribedinOracle9i DatabaseMigrationandinSection3.4. IfyouwishtodeinstallODM,youshouldstoptheODMTaskMonitoras describedinSection4.8.ThenyoucandeinstallODMusingOUI,justas youwouldanyotherdatabasecomponent. 4 Oracle9i Data Mining Administration ThissectioncontainsinformationofinteresttoODMadministrators. ForinformationaboutadministeringanOracle9idatabase,seetheOracle9i DatabaseAdministrator'sGuide. 4.1 ImprovingODMPerformance ToimproveODMperformance,enableparallelismbysettingthedatabase initializationparametersPARALLEL_MAX_SERVERSandPARALLEL_ MIN_SERVERSbasedonthecharacteristicsofyoursystem. 4.2 ChangingDefault ODMPasswords YoushouldchangetheODMdefaultpasswordsafterinstallation completes.YouchangeODMpasswordsjustasyouchangeanyother databasepasswords. 4.3 ODM API Documentation DocumentationfortheODMAPI,createdusingJavadoc,isinthefile $ORACLE_HOME/dm/doc/odmjdoc.tar.Youshoulduntarthisfileso thatuserscandisplayitinabrowser. 8 4.4 ODM ConfigurationParameters ThefollowingODMconfigurationparametersresideintheODM_ CONFIGURATIONtable.Theseparametersmayrequiremodificationfor yourenvironment. ABN_ALG_SETTING_NF_DEPTH Datatypeisint;defaultis10.SpecifiesthemaximumdepthofanyNetwork FeatureforABNsetting. ABN_ALG_SETTING_NUM_NF Datatypeisint;defaultis10.SpecifiesthemaximumnumberofNetwork FeaturesforABNsetting. ABN_ALG_SETTING_NUM_PRUNED_NF Datatypeisint;defaultis5.Specifiesmaximumnumberofconsecutive prunedNetworkFeaturesforABNsetting. AI_BUILD_SEQ_PER_PARTITION Datatypeisint;defaultis50000. AUTO_BIN_CATEGORICAL_NUM Datatypeisint;defaultis5.Specifiesthenumberofbinsusedby automatedbinningforcategoricalattributes.Thisvalueshouldbe>=2. AUTO_BIN_CATEGORICAL_OTHER DatatypeisSTRING;defaultisOTHER.Specifiesthenameofthe"Other" bingeneratedduringTop-ncategoricalbinning. AUTO_BIN_CL_NUMERICAL_NUM Datatypeisint;defaultis100.Specifiesthemaximumnumberofbins allowedfornumericalattributesforclustering.Usefulvaluesarebetween2 and100.ThisparameterisusedinconjunctionwithCL_ALG_SETTING_ KM_BIN_FACTORandCL_ALG_SETTING_OC_BIN_FACTOR. AUTO_BIN_NUMERICAL_NUM Datatypeisint;defaultis5.Specifiesthenumberofbinsusedby automatedbinningfornumericalattributes.Thisvalueshouldbe>=2. CLASSIFICATION_APPLY_SEQ_PER_PARTITION Datatypeinint;defaultis50000.Specifiesthemaximumnumberofunique sequenceIDsperpartitionusedbyclusteringapply. CLASSIFICATION_BUILD_SEQ_PER_PARTITION Datatypeisint;defaultis50000.Keepsthecomputationsconstrainedto memory-sizedchunksanddeterminesthesizeoftherandomsampleused forMDLcomputations(scoringwithinbuild).Thereisnomaximumvalue; thisvalueshouldnotbesmallerthan1000. 9 CLUSTERING_APPLY_SEQ_PER_PARTITION Datatypeisint;defaultis50000.Constrainsthescoringtomemory-sized chunksofthedataandloopthroughsuchchunks.Thevalueforthis parameterdependsonthesortingarea(SA)andthenumberofclusters.The largertheSA,thelargerthisparametercanbe.Aroughformulaforthis parameteris CLUSTERING_APPLY_SEQ_PER_PARTITION=SA/(100*Num_Clusters) where"SA"isthesizeofthesortingareaand"Num_Clusters"isthenumber ofclusters. CL_ALG_SETTING_CHI2_LOW DatatypeisNUMBER;defaultis1.353.Controlsthelevelofstatistical significanceforO-Clustertodetermineifmoredataisnecessarytorefinea model. CL_ALG_SETTING_KM_BIN_FACTOR DatatypeisNUMBER;defaultis2.Factorusedinautomaticbinnumber computationforthek-meansalgorithm.Increasingthisvaluewillincrease resolutionbyincreasingthenumberofbins.However,thenumberofbinsis alsocappedbyAUTO_BIN_CL_NUMERICAL_NUM. CL_ALG_SETTING_KM_BUFFER Datatypeisint;defaultis10000.Numberofrowsusedbythein-memory bufferusedbyk-means.Foraninstallationwithlimitedmemory,this numbershouldbesmallerthanthedefaultdatasize.Summarizationis activatedfordatasetslargerthanthebuffersize. CL_ALG_SETTING_KM_FACTOR DatatypeisNUMBER;defaultis20.Controlsthenumberofpoints producedbydatasummarizationfork-means.Thelargerthevalue,the morepoints.Theformulaforthenumberofpointsis: NumberofPoints=CL_ALG_SETTING_KM_FACTOR* Num_Attributes* Num_Clusters where"Num_Attributes"isthenumberofattributesand"Num_Clusters"is thenumberofclusters. Thenumberofpointsmustbe<=1000.Thisparametercanbeanypositive value;however,asmallnumberofsummarizationpointscanproducepoor accuracy. CL_ALG_SETTING_MIN_CHI2_POINTS Datatypeisint;defaultis10.Controlstheminimumnumberofrows requiredbyO-Clustertofindacluster.Fordatatableswithaverysmall numberofrows,thisnumbershouldbesettoavaluebetween2and10. 10