SUBMITTED TO IEEE TRANS. KNOWLEDGE AND DATA ENG., VOL. XX, NO. Y, MONTH 1999

Applications of Abduction: Testing Very Long Qualitative Simulations

Tim Menzies, Robert F. Cohen, Sam Waugh, Simon Goss

T. Menzies is with the AI Department, School of Computer Science and Engineering, University of New South Wales, Kensington, Australia, 2052. Email: [email protected]; Url: www.cse.unsw.edu.au/~timm.
Robert F. Cohen is with the Department of Computer Science, University of Newcastle, Callaghan, Australia 2308. Email: [email protected]; Url: www.cs.newcastle.edu.au/~rfc.
Sam Waugh and Simon Goss are with the Defence Science and Technology Organisation Air Operations Division, PO Box 4331, Melbourne, Australia, 3001. Email: sam.waugh,[email protected]

November 19, 2000 DRAFT

Abstract

We can test a theory of “X” by checking if that theory can reproduce known behaviour of “X”. In the general case, this check for time-based simulations is only practical for short simulation runs. We show that, given certain reasonable language restrictions, the complexity of this check reduces to the granularity of the measurements. That is, provided a very long simulation run is only measured infrequently, this check is feasible.

Keywords

Validation, complexity, abduction, qualitative reasoning.

I. INTRODUCTION

We need thorough methods for testing our knowledge bases (KBs). Modern knowledge acquisition (KA) theorists view KB construction as the construction of inaccurate surrogate models of reality [1,2]. Agnew, Ford & Hayes [3] comment that “expert-knowledge is comprised of context-dependent, personally constructed, highly functional but fallible abstractions”. Practitioners confirm just how inaccurate KBs can be. Silverman [4] cautions that systematic biases in expert preferences may result in incorrect/incomplete knowledge bases. Compton [5] reports expert systems in which there was always one further important addition, one more significant and essential change.
Working systems can contain multiple undetected errors. Preece & Shinghal [6] document five fielded expert systems that contain numerous logical anomalies. Myers [7] reports that 51 experienced programmers could only ever find 5 of the 15 errors in a simple 63 line program, even given unlimited time and access to the source code and the executable.

Potentially inaccurate and evolving theories must be validated, lest they generate inappropriate output for certain circumstances. Testing can only demonstrate the presence of bugs (never their absence) and so must be repeated whenever new data is available or a program has changed. That is, validation is an essential, on-going process throughout the lifetime of a knowledge base. This view motivated Feldman & Compton [8], then Menzies & Compton [9], to develop QMOD/HT4: a general technique for automatically validating theories in vague domains. A vague domain is (i) poorly-measured; and/or (ii) lacks a definitive oracle; and/or (iii) is indeterminate/non-monotonic. Validation in such vague domains necessitates making assumptions about unmeasured variables and maintaining mutually exclusive assumptions in separate worlds. Many domains tackled by modern KA are vague; i.e. this definition of test is widely applicable.

[Figure 1: directed graph over fish growth rate, fish population change, fish density, and fish catch, with ++ and -- edges.]
Fig. 1. A theory. The theory connects devices with two states (up ↑ and down ↓). In this theory $x \xrightarrow{++} y$ denotes that y↑ and y↓ can be explained by x↑ and x↓ respectively; and $x \xrightarrow{--} y$ denotes that y↑ and y↓ can be explained by x↓ and x↑ respectively.

Formally, QMOD/HT4 is abduction and abduction is known to be NP-hard [10]; i.e. theoretically the system is impractical since its runtimes are very likely to be exponential on theory size ($O(2^V)$ for a theory with $V$ variables).
Nevertheless, abductive validation has been able to detect previously invisible and significant flaws in the theories published in the international neuroendocrinological literature [8,9]. Further, Menzies has shown [11] that abductive validation is practical for at least the sample of fielded expert systems studied by Preece & Shinghal. However, standard abductive validation is restricted to non-temporal theories with the invariant that no variable can have two different values (§II). In the case of time-based simulation, this invariant is inappropriate since variables can have different values at different times.

One way to extend abductive validation to temporal abductive validation is to rename variables at each time point in the simulation. For example, all the variables in Figure 1 could be renamed, e.g. $fishCatch_1, fishCatch_2, \ldots, fishCatch_T$, where $T$ is some timepoint. A naive renaming strategy would use these renamed variables to create $T$ copies of the theory, as seen in Figure 2. This naive renaming strategy incurs a severe computational cost; i.e. if a non-temporal theory has $V$ variables and is $O(2^V)$, then for $T$ timepoints, its temporal equivalent is $O(2^{V \times T})$ (§III).

This paper presents a non-naive renaming strategy. This new strategy is based on the following simple intuition. Much of the search space shown in Figure 2 is the same structure, repeated over and over again. It seems at least possible that no path can be found through $T+1$ copies that can't be found in $T$ copies (since the space is essentially the same).
If this were true, then we could reduce the search space of temporal abductive validation by not copying the structure at all.

[Figure 2: the theory of Figure 1 copied for times T=0 (inputs), T=1, T=2..99,999,999, T=100,000,000, T=100,000,001 to T=999,999,999, and T=1,000,000,000.]
Fig. 2. The theory of Figure 1, copied $10^9+1$ times. The copy at time $T=0$ is used for the inputs to the simulation. Dashed edges denote links between variables at different times. For space reasons, some of the copies are not shown.

We show below (in §III) that this intuition is incorrect, unless we carefully restrict how variables are linked in our theories. A linking policy is the method of connecting variables at $T_i$ to $T_{i+1}$. We will show that if we apply the implicit symmetric edge linking policy (described below), then the search space of a theory with $N$ states saturates after $N$ renamings; i.e. if a proof does not terminate in time $N$, then no such proof exists (§IV). The practical implication of saturation is that, under certain circumstances, we can ignore a large subset of the renamed variables at unmeasured time points. Suppose all the variables in Figure 1 had $N = 2$ states (e.g. the states up ↑ and down ↓). Suppose further we had run that model for a billion ($10^9$) time steps. Suppose further that we only have data from that simulation at three time points: say, initially, at $T = 10^8$, and at $T = 10^9$. Without implicit symmetric edge linking, when searching for proofs of these measurements, we would have to explore the space created by all $10^9$ renamings shown in Figure 2. However, with implicit symmetric edge linking, we need only explore the 3 copies shown in Figure 3 (§V).
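The difference in scale between the two renaming strategies can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-edge theory and the function name `rename_copies` are hypothetical, the run is shortened to 1,000 steps, and the cross-time links (the dashed edges of Figure 2) are left out. Whether the unmeasured copies can soundly be skipped depends on the linking policy, which this sketch does not model.

```python
def rename_copies(edges, timepoints):
    """Return one renamed copy of the theory per timepoint.

    Cross-time links (the dashed edges of Figure 2) are deliberately omitted.
    """
    return {
        t: [(f"{src}_{t}", sign, f"{dst}_{t}") for src, sign, dst in edges]
        for t in timepoints
    }


# Hypothetical two-edge theory; these are not the edges of Figure 1.
edges = [("a", "++", "b"), ("b", "--", "c")]

naive = rename_copies(edges, range(1001))        # every timepoint 0..1000
measured = rename_copies(edges, [0, 100, 1000])  # measured timepoints only

print(len(naive), len(measured))  # number of copies built by each strategy
print(measured[100])              # the copy renamed for timepoint 100
```

The naive strategy builds one copy per simulated step, so its size tracks the length of the run; the second call builds one copy per measurement, so its size tracks the granularity of the measurements.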
[Figure 3: the theory of Figure 1 copied for times T=0 (inputs), T=100,000,000, and T=1,000,000,000.]
Fig. 3. Assuming the simulation is measured only three times, and the 2-state variables are connected with implicit symmetric edges, then the search space from the $10^9+1$ copies of Figure 2 reduces to the 3 copies shown here.

Our approach requires a language that is more restrictive than that used in standard qualitative reasoning such as QSIM [12,13]. Nevertheless, we argue that this restriction is both practical and desirable:
• §VI is an experimental demonstration that these language restrictions still permit the simulation and validation of real-world theories.
• In our related work section (§VII), we will note that standard qualitative reasoning systems cannot guarantee a tractable simulation for all models represented in that system. By comparison, we can guarantee a tractable simulation for all theories written in our language, provided that a very long simulation run is only measured infrequently.

II. NON-TEMPORAL ABDUCTIVE VALIDATION

This section contains our standard description of non-temporal abductive validation.

A. Tutorial

Abduction is the search for assumptions A which, when combined with some theory T, achieve some set of goals OUT without causing some contradiction [14]. That is:
• $EQ_1$: $T \cup A \vdash OUT$;
• $EQ_2$: $T \cup A \nvdash \bot$.

Menzies' HT4 abductive inference engine [15] caches the proof trees used to satisfy $EQ_1$ and $EQ_2$. These are then sorted into worlds W: maximal consistent subsets (maximal with respect to size). In the case of multiple worlds being generated, the best world(s) are those with maximum cover: the intersection of that world and the OUTputs.

[Figure 4: directed graph over current account balance, trade deficit, investor confidence, foriegn sales, company profits, corporate spending, wages restraint, domestic sales, public confidence, and inflation, with ++ and -- edges.]
Fig. 4. A theory.

For example, consider the task of checking that we can achieve certain OUTputs using some INputs across the KB shown in Figure 4. We denote x=up as x↑ and x=down as x↓. In that figure, $x \xrightarrow{++} y$ denotes that y↑ and y↓ can be explained by x↑ and x↓ respectively; and $x \xrightarrow{--} y$ denotes that y↑ and y↓ can be explained by x↓ and x↑ respectively. Edges in our theories are optional inferences. The utility of using an edge is assessed via its eventual contribution to world coverage. In the case of the observed OUTputs being {investorConfidence↑, wagesRestraint↑, inflation↓}, and the observed INputs being {foriegnSales↑, domesticSales↓}, HT4 can connect OUTputs back to INputs using the proofs of Table I. These proofs may contain controversial assumptions; i.e. if we can't believe that a variable can be both up and down¹, then we can declare the known values for companyProfits and corporateSpending to be controversial. Since corporateSpending is fully dependent on companyProfits (see Figure 4), the key conflicting assumptions are {companyProfits↑, companyProfits↓} (denoted base controversial assumptions or A.b). We can use A.b to find consistent belief sets called worlds W using an approach inspired by the ATMS [16]. A proof P[i] is in W[j] if that proof does not conflict with the environment ENV[j] (a maximal consistent subset of A.b). In our example, ENV[1]={companyProfits↑} and ENV[2]={companyProfits↓}. Hence, W[1]={P[2],P[5],P[6]} and W[2]={P[1],P[2],P[3],P[4]} (see Figure 5 and Figure 6).

¹ Note: in the temporal abductive case, this rule would be that a variable can't be up and down at the same time.

TABLE I
PROOFS FROM FIGURE 4 CONNECTING OUT = {investorConfidence↑, wagesRestraint↑, inflation↓} BACK TO INPUTS = {foriegnSales↑, domesticSales↓}.

P[1]: domesticSales↓, companyProfits↓, inflation↓
P[2]: foriegnSales↑, publicConfidence↑, inflation↓
P[3]: domesticSales↓, companyProfits↓, corporateSpending↓, wagesRestraint↑
P[4]: domesticSales↓, companyProfits↓, inflation↓, wagesRestraint↑
P[5]: foriegnSales↑, publicConfidence↑, inflation↓, wagesRestraint↑
P[6]: foriegnSales↑, companyProfits↑, corporateSpending↑, investorConfidence↑

[Figure 5: the portion of Figure 4 over foriegn sales, company profits, corporate spending, investor confidence, public confidence, inflation, and wages restraint.]
Fig. 5. World #1 is generated from Figure 4 by combining P[2], P[5], and P[6]. World #1 assumes companyProfits↑ and covers 100% of the known OUTputs.

[Figure 6: the portion of Figure 4 over foriegn sales, domestic sales, company profits, corporate spending, public confidence, inflation, and wages restraint.]
Fig. 6. World #2 is generated from Figure 4 by combining P[1], P[2], P[3], and P[4]. World #2 assumes companyProfits↓ and covers 67% of the known OUTputs.

HT4 defines cover to be the size of the intersection of a world and the OUTput set. The cover of Figure 5 is 3 (100%) and the cover of Figure 6 is 2 (67%). Note that since there exists a world with 100% cover, all the OUTputs can be explained; i.e. this theory has passed the abductive validation test.

In essence, abductive validation answers the following question: “what portions of a theory of X can reproduce the largest % of known behaviour of X?”. This algorithm will work in two hard cases:
1. Only some subset of known behaviour can be explained.
2. A theory is globally inconsistent, but contains useful portions. For example, observe in Figure 4 that the theory authors disagree on the connection from inflation to wagesRestraint.

Note that this inference procedure ignored certain possible inferences; e.g. in W[1], tradeDeficit↓ and currentAccountBalance↑. HT4 does not compute ATMS-style total envisionments; i.e. all state assignments consistent with known facts. A total envisionment would have included tradeDeficit↓ and currentAccountBalance↑. Nor does HT4 compute QSIM-style attainable envisionments [12,13]; i.e. the subset of total envisionments downstream of the INputs. An attainable envisionment would have included currentAccountBalance↑. Rather, HT4 restricts itself to the inferences that connect INputs to OUTputs. That is, HT4 only computes relevant envisionments; i.e. the subset of the attainable envisionments which are upstream of the OUTputs. The extra inferences of non-relevant envisionments may result in pointless world generation. For example, if somehow currentAccountBalance↑ was incompatible with wagesRestraint↑, then total or attainable envisionments would divide W[1] into at least two additional worlds, each of which would contain some subset of the literals shown in Figure 5. The new additional world containing currentAccountBalance↑ would subsequently be ignored since the new additional world with wagesRestraint↑ would be returned during the search for maximum cover.

B. Complexity

This section describes our algorithm for solving the core problem of HT4: finding the base controversial assumption set A.b. This algorithm will be shown to be theoretically NP-hard, experimentally exponential, but practical for certain problems.

HT4 executes in four phases: the facts sweep, the forwards sweep, the backwards sweep, and the worlds sweep.
Firstly, the facts sweep removes all variable assignments inconsistent with known FACTS (typically, FACTS = IN ∪ OUT). Using a hash table, the facts sweep runs in linear time.

Secondly, the forwards sweep finds the conflicting assumption set (denoted A.c) as a side-effect of computing the transitive closure of IN (denoted IN*). In a theory comprising a directed graph with vertices V, edges E, and fanout $F = |E|/|V|$, the worst-case complexity of the forwards sweep is acceptable at $O(|V|^3)$. Note that if the theory lacked invariants, then the validation process could stop at this point since the transitive closure would find the OUTputs reachable from the INputs. However, in theories with invariants, it may be the case that we can only consistently use portions of the theory and INputs to achieve some subset of the OUTputs.

Thirdly, the backwards sweep grows proofs backwards from a member of OUT back to IN while maintaining several invariants. (i) Proofs can only use members of IN*; i.e. only those literals downstream of the INputs. (ii) Proofs maintain a forbids set; i.e. a set of literals that are incompatible with the literals used in the proof. For example, the literals used in P[1] forbid the literals {domesticSales↑, companyProfits↑, inflation↑}. (iii) The upper-most A.c found along the way is recorded as that proof's guess. The union of all the guesses of all the proofs will be A.b. (iv) A proof must not contain loops. (v) A proof must not contain items that contradict other items in the proof; i.e. a proof's members must not intersect with its forbids set.

    procedure worldsSweep
    begin
      ENV := maximalConsistentSubsets(A.b)
      for i := 1 to size(ENV)
      begin
        W[i] := ∅;
        for p ∈ P
          if p.forbids ∩ ENV[i] = ∅
          then W[i] := W[i] + p;
      end
    end

Fig. 7. The worlds sweep of HT4.
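The first three sweeps can be sketched in Python on the running example. This is a minimal illustration, not HT4 itself, and it rests on several assumptions: the edge list contains only the edges implied by reading each proof of Table I as a path (the full theory of Figure 4 has more), the facts sweep is folded into the forwards sweep, the guess bookkeeping of invariant (iii) is omitted, and all function names are hypothetical.

```python
FLIP = {"up": "down", "down": "up"}

# Assumed edge list: only the edges implied by the proofs of Table I,
# not the full theory of Figure 4.
EDGES = [
    ("domesticSales", "++", "companyProfits"),
    ("foriegnSales", "++", "companyProfits"),
    ("foriegnSales", "++", "publicConfidence"),
    ("companyProfits", "++", "inflation"),
    ("companyProfits", "++", "corporateSpending"),
    ("publicConfidence", "--", "inflation"),
    ("corporateSpending", "--", "wagesRestraint"),
    ("corporateSpending", "++", "investorConfidence"),
    ("inflation", "--", "wagesRestraint"),
]
IN = {("foriegnSales", "up"), ("domesticSales", "down")}
OUT = {("investorConfidence", "up"), ("wagesRestraint", "up"),
       ("inflation", "down")}


def contradicts(literal, literals):
    """True if `literals` gives the other value to the same variable."""
    var, state = literal
    return (var, FLIP[state]) in literals


def forward_sweep(edges, inputs, facts):
    """Return IN*: literals downstream of the inputs that agree with facts."""
    reached, todo = set(inputs), list(inputs)
    while todo:
        var, state = todo.pop()
        for src, sign, dst in edges:
            if src != var:
                continue
            literal = (dst, state if sign == "++" else FLIP[state])
            # Facts sweep folded in: skip literals contradicting a known fact.
            if literal not in reached and not contradicts(literal, facts):
                reached.add(literal)
                todo.append(literal)
    return reached


def backward_sweep(edges, goal, inputs, in_star):
    """Return all loop-free, contradiction-free proofs of `goal`."""
    proofs = []

    def grow(literal, path):
        if literal in inputs:
            proofs.append([literal] + path)
            return
        var, state = literal
        used = [literal] + path
        for src, sign, dst in edges:
            if dst != var:
                continue
            parent = (src, state if sign == "++" else FLIP[state])
            # Invariants (i), (iv) and (v) of the backwards sweep.
            if (parent in in_star and parent not in used
                    and not contradicts(parent, set(used))):
                grow(parent, used)

    grow(goal, [])
    return proofs


in_star = forward_sweep(EDGES, IN, IN | OUT)
a_c = {var for var, state in in_star if (var, FLIP[state]) in in_star}
proofs = [p for goal in sorted(OUT)
          for p in backward_sweep(EDGES, goal, IN, in_star)]

print(sorted(a_c))
for proof in proofs:
    print(proof)
```

On this reduced edge set the sketch finds both values of companyProfits and corporateSpending in IN*, and the backwards sweep yields six proofs that match the rows of Table I.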
We can demonstrate informally and formally that the backwards sweep is a slow process:
• Informally: If the average size of a proof is $X$, then the worst-case backwards sweep is $O(X^F)$, where $F$ is the fanout of the theory. To make matters worse, the backwards sweep cannot cull its search at a local propagation level. The utility of an edge may not be apparent till we have examined the search space accessed after using that edge.
• Formally: Bylander et al. [10] show that general abduction is NP-hard. For our particular implementation, we can repeat that prior result since we can show that satisfying invariant (v) is NP-hard. Clearly, we can find a theory to generate any directed graph. Gabow et al. [17] showed that finding a directed path across a directed graph that has at most one of a set of forbidden pairs is NP-hard. Our forbidden pairs are assignments of different values to the same variable; e.g. the pairs x↑ & x↓ and x↓ & x↑ are forbidden.

Fourthly, once A.b is known, the proofs can be sorted into worlds via the worlds sweep. HT4 extracts all the objects O referenced in A.b. A world-defining environment ENV[i] is created for each combination of objects and their values. In our example, ENV[1]={companyProfits↑} and ENV[2]={companyProfits↓}. The worlds sweep is simply two nested loops over each ENV[i] and each P[j] (see Figure 7). A proof P[j] belongs in world W[i] if its forbids set does not intersect the assumptions ENV[i] that define that world. The worlds sweep is exponential at $O(|P| \times |ENV|) = O(X^F \times |ENV|)$.
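The worlds sweep of Figure 7 can be sketched in Python over the proofs of Table I. This is a hedged illustration rather than the HT4 implementation: the function names `forbids`, `environments`, `worlds_sweep`, and `cover` are hypothetical, A.b is supplied by hand instead of being computed by the earlier sweeps, and worlds are indexed from 0 rather than 1.

```python
import itertools

FLIP = {"up": "down", "down": "up"}

# The six proofs of Table I.
PROOFS = {
    "P[1]": [("domesticSales", "down"), ("companyProfits", "down"),
             ("inflation", "down")],
    "P[2]": [("foriegnSales", "up"), ("publicConfidence", "up"),
             ("inflation", "down")],
    "P[3]": [("domesticSales", "down"), ("companyProfits", "down"),
             ("corporateSpending", "down"), ("wagesRestraint", "up")],
    "P[4]": [("domesticSales", "down"), ("companyProfits", "down"),
             ("inflation", "down"), ("wagesRestraint", "up")],
    "P[5]": [("foriegnSales", "up"), ("publicConfidence", "up"),
             ("inflation", "down"), ("wagesRestraint", "up")],
    "P[6]": [("foriegnSales", "up"), ("companyProfits", "up"),
             ("corporateSpending", "up"), ("investorConfidence", "up")],
}
OUT = {("investorConfidence", "up"), ("wagesRestraint", "up"),
       ("inflation", "down")}
A_B = {("companyProfits", "up"), ("companyProfits", "down")}


def forbids(proof):
    """Literals incompatible with the literals used in the proof."""
    return {(var, FLIP[state]) for var, state in proof}


def environments(a_b):
    """One environment per combination of the objects in A.b and their values."""
    objects = sorted({var for var, _ in a_b})
    choices = [[(var, state) for state in ("up", "down") if (var, state) in a_b]
               for var in objects]
    return [set(combo) for combo in itertools.product(*choices)]


def worlds_sweep(proofs, envs):
    """Two nested loops: a proof joins a world if it does not forbid its ENV."""
    return [{name for name, proof in proofs.items()
             if not (forbids(proof) & env)}
            for env in envs]


def cover(world, proofs, out):
    """Size of the intersection of a world's literals and the OUTput set."""
    literals = {literal for name in world for literal in proofs[name]}
    return len(literals & out)


envs = environments(A_B)
worlds = worlds_sweep(PROOFS, envs)
for env, world in zip(envs, worlds):
    print(sorted(env), sorted(world), cover(world, PROOFS, OUT))
```

The first world contains P[2], P[5], and P[6] with a cover of 3, and the second has a cover of 2, in line with Figures 5 and 6. Read literally, the test in Figure 7 also admits P[5] to the second world in this sketch, because P[5] assumes nothing about companyProfits; the cover values are unaffected.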