Logic Programming, Abduction and Probability:
a top-down anytime algorithm for estimating prior and posterior probabilities

David Poole*
Department of Computer Science, University of British Columbia,
Vancouver, B.C., Canada V6T 1Z2
poole@cs.ubc.ca

March 15, 1993

Abstract

Probabilistic Horn abduction is a simple framework that combines probabilistic and logical reasoning into a coherent practical framework. The numbers can be consistently interpreted probabilistically, and all of the rules can be interpreted logically. The relationship between probabilistic Horn abduction and logic programming is at two levels. At the first level, probabilistic Horn abduction is an extension of pure Prolog that is useful for diagnosis and other evidential reasoning tasks. At another level, current logic programming implementation techniques can be used to implement probabilistic Horn abduction efficiently. This forms the basis of an "anytime" algorithm for estimating arbitrary conditional probabilities. The focus of this paper is on the implementation.

* Scholar, Canadian Institute for Advanced Research

1 Introduction

Probabilistic Horn abduction [22, 21, 23] is a framework for logic-based abduction that incorporates probabilities with assumptions. It is being used as a framework for diagnosis [22] that incorporates both pure Prolog and discrete Bayesian networks [15] as special cases [21]. This paper is about the relationship of probabilistic Horn abduction to logic programming. This
simple extension to logic programming provides a wealth of new applications in diagnosis, recognition and evidential reasoning [23].

This paper also presents a logic-programming solution to the problem in abduction of searching for the "best" diagnoses first. The main features of the approach are:

- We use Horn clause abduction. The procedures are simple, both conceptually and computationally (for a certain class of problems). We develop a simple extension of SLD resolution to implement our framework.

- The search algorithms form "anytime" algorithms that can give an estimate of the conditional probability at any time. We do not generate the unlikely explanations unless we need to. We have a bound on the probability mass of the remaining explanations, which allows us to know the error in our estimates.

- A theory of "partial explanations" is developed. These are partial proofs that can be stored in a priority queue until they need to be further expanded. We show how this is implemented in a Prolog interpreter in Appendix A.

2 Probabilistic Horn abduction

The formulation of abduction used is a simplified form of Theorist [25, 19] with probabilities associated with the hypotheses. It is simplified in being restricted to definite clauses with simple forms of integrity constraints (similar to that of Goebel et al. [10]). This can also be seen as a generalisation of an ATMS [27] to be non-propositional.

The language is that of pure Prolog (i.e., definite clauses) with special disjoint declarations that specify a
set of disjoint hypotheses with associated probabilities. There are some restrictions on the forms of the rules and on the probabilistic dependence allowed. The language presented here is that of [23] rather than that of [22, 21].

The main design consideration was to make the language the simplest extension to pure Prolog that also includes probabilities (not just numbers associated with rules, but numbers that follow the laws of probability, and so can be consistently interpreted as probabilities [23]). We also make very strong independence assumptions; this is not intended to be a temporary restriction on the language that we want eventually to remove, but is a feature. We can represent any probabilistic information using only independent hypotheses [23]; if there is any dependence amongst hypotheses, we invent a new hypothesis to explain that dependency.

2.1 The language

Our language uses the Prolog conventions, and has the same definitions of variables, terms and atomic symbols.

Definition 2.1 A definite clause (or just "clause") is of the form

    a.    or    a ← a1 ∧ · · · ∧ an.

where a and each ai are atomic symbols.

Definition 2.2 A disjoint declaration is of the form

    disjoint([h1 : p1, ..., hn : pn]).

where the hi are atoms, and the pi are real numbers, 0 ≤ pi ≤ 1, such that p1 + · · · + pn = 1. Any variable appearing in one hi must appear in all of the hj (i.e., the hi share the
same variables). The hi will be referred to as hypotheses.

Any ground instance of a hypothesis in one disjoint declaration cannot be an instance of another hypothesis in any of the disjoint declarations (either the same declaration or a different declaration), nor can it be an instance of the head of any clause. This restricts the language so that we cannot encode arbitrary clauses as disjoint declarations. Each instance of each disjoint declaration is independent of all of the other instances of disjoint declarations and of the clauses.

Definition 2.3 A probabilistic Horn abduction theory (which will be referred to as a "theory") is a collection of definite clauses and disjoint declarations. Given theory T, we define:

F_T, the facts, is the set of definite clauses in T together with the clauses of the form

    false ← h_i ∧ h_j

where h_i and h_j both appear in the same disjoint declaration in T, and i ≠ j. Let F'_T be the set of ground instances of elements of F_T.

H_T, the set of hypotheses, is the set of h_i such that h_i appears in a disjoint declaration in T. Let H'_T be the set of ground instances of elements of H_T.

P_T is a function H'_T → [0, 1]. P_T(h'_i) = p_i where h'_i is a ground instance of hypothesis h_i, and h_i : p_i is in a disjoint declaration in T.

Where T is understood from context, we omit the subscript.

Definition 2.4 [25, 18] If g is a closed formula, an explanation of g from ⟨F, H⟩ is a set D of elements of H' such that

- F ∪ D ⊨ g, and
- F ∪ D ⊭ false.

The first condition says that D is a sufficient cause for g, and the second says that D is possible.

Definition 2.5 A minimal explanation of g is an explanation of g such that no strict subset is an explanation of g.

2.2 Assumptions about the rule base

Probabilistic Horn abduction also makes some assumptions about the rule base. It can be argued that these assumptions are natural, and do not really restrict what can be represented [23]. Here we list these assumptions, and use them in order to show how the algorithms work.

The first assumption we make is about the relationship between hypotheses and rules:

Assumption 2.6 There are no rules with head unifying with a member of H.

Instead of having a rule implying a hypothesis, we can invent a new atom, make the hypothesis imply this atom, have all of the rules imply this atom, and use this atom instead of the hypothesis.

Assumption 2.7 (acyclicity) If F' is the set of ground instances of elements of F, then it is possible to assign a natural number to every ground atom such that for every rule in F' the atoms in the body of the rule are strictly less than the atom in the head. This assumption is discussed by Apt and Bezem [1].

Assumption 2.8 The rules in F' for a ground non-assumable atom are covering.

That is, if the rules for a in F' are

    a ← B_1
    a ← B_2
    ...
    a ← B_m

and a is true, then one of the B_i is true. Thus Clark's completion [4] is valid for every non-assumable. Often we get around this assumption by adding
a rule

    a ← some_other_reason_for_a

and making some_other_reason_for_a a hypothesis [23].

Assumption 2.9 The bodies of the rules in F' for an atom are mutually exclusive.

Given the above rules for a, this means that for each i ≠ j, B_i and B_j cannot both be true in the domain under consideration.¹ We can make this true by adding extra conditions to the rules to make sure they are disjoint.

Associated with each possible hypothesis is a prior probability. We use this prior probability to compute arbitrary probabilities. Our final assumption is that logical dependencies impose the only statistical dependencies on the hypotheses. In particular we assume:

Assumption 2.10 Ground instances of hypotheses that are not inconsistent (with F_T) are probabilistically independent.

The only ground instances of hypotheses that are inconsistent with F_T are those of the form h_iθ and h_jθ, where h_i and h_j are different elements of the same disjoint declaration. Thus different disjoint declarations define independent hypotheses, and different instances of hypotheses are also independent. Note that the hypotheses in a minimal explanation are always logically independent.

The language has been carefully set up so that the logic does not force any dependencies amongst the hypotheses. If we could prove that some hypotheses implied other hypotheses or their negations, the hypotheses could not be independent. The language is deliberately designed to be too weak to be able to state such logical dependencies between hypotheses. See [23] for more justification of these assumptions.
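The consistency and independence conditions above suggest a small computational primitive: a set of ground hypotheses is consistent exactly when it contains at most one member of each disjoint declaration, and by assumption 2.10 the prior of a consistent set is the product of the individual priors. The following Python sketch (our own illustration, not the paper's code; the function names are ours) uses the two disjoint declarations from the example that appears later in the paper.

```python
# The two disjoint declarations from the paper's running example:
#   disjoint([b:0.3, c:0.7]).   disjoint([e:0.6, f:0.3, g:0.1]).
DISJOINT = [
    {"b": 0.3, "c": 0.7},
    {"e": 0.6, "f": 0.3, "g": 0.1},
]

def consistent(hyps):
    """A set of ground hypotheses is consistent with F_T unless it
    contains two different members of the same disjoint declaration."""
    return all(len(hyps & decl.keys()) <= 1 for decl in DISJOINT)

def prior(hyps):
    """Under assumption 2.10, consistent hypotheses are independent,
    so the prior of a set of hypotheses is the product of their priors."""
    assert consistent(hyps)
    priors = {h: p for decl in DISJOINT for h, p in decl.items()}
    p = 1.0
    for h in hyps:
        p *= priors[h]
    return p
```

For instance, `prior({"c", "e"})` is 0.7 × 0.6 ≈ 0.42, while `consistent({"b", "c"})` is False because b and c appear in the same disjoint declaration.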
2.3 Some Consequences of the Assumptions

Lemma 2.11 Under assumption 2.10, if there is at least one hypothesis with a free variable, then distinct ground terms denote different individuals.

Proof: If t1 and t2 are distinct ground terms, then t1 ≠ t2. Otherwise, suppose t1 = t2. If h(X) is a hypothesis, then h(t1) cannot be independent of h(t2), as they are logically equivalent, which violates assumption 2.10. ∎

¹ We do not insist that we are able to prove ¬(B_i ∧ B_j). Typically we cannot. This is a semantic restriction. There is, however, one syntactic check we can make: if there is a set of independent (see assumption 2.10) hypotheses from which both B_i and B_j can be derived, then the assumption is violated.

The following lemma can be easily proved:

Lemma 2.12 Under assumptions 2.6 and 2.9, minimal explanations of ground atoms or conjunctions of ground atoms are mutually inconsistent.

Lemma 2.13 A minimal explanation of a ground atom cannot contain any free variables.

Proof: If a minimal explanation of a ground atom contains free variables, there will be infinitely many ground explanations of the atom (substituting different ground terms for the free variables). Each of these ground explanations will have the same probability. Thus this probability must be zero (as we can sum the probabilities of disjoint formulae, and this sum must be less than or equal to one). This contradicts the fact that the explanations are minimal, the hypotheses are independent, and each prior is positive. ∎

Lemma 2.14 [5, 19] Under assumptions 2.6, 2.7 and 2.8, if expl(g, T) is the set of minimal explanations of ground g from theory T, and comp(T) is Clark's completion of the non-assumables (and includes Clark's equational theory²), then

    comp(T) ⊨ ( g ↔ ∨_{e_i ∈ expl(g,T)} e_i )

² For the case we have here, Console et al.'s results [5] can be extended to acyclic theories (the set of ground instances of an acyclic theory is a hierarchical theory, and we only really need to consider the ground instances).

The following is a corollary of lemmata 2.14 and 2.12:

Lemma 2.15 Under assumptions 2.6, 2.7, 2.8, 2.9 and 2.10, if expl(g, T) is the set of minimal explanations of conjunction of atoms g from probabilistic Horn abduction theory T:

    P(g) = P( ∨_{e_i ∈ expl(g,T)} e_i )
         = Σ_{e_i ∈ expl(g,T)} P(e_i)

Thus to compute the prior probability of any g we sum the probabilities of the explanations of g. Poole [23] proves this directly using a semantics that incorporates the above assumptions.

To compute arbitrary conditional probabilities, we use the
definition of conditional probability:

    P(α | β) = P(α ∧ β) / P(β)

Thus to find arbitrary conditional probabilities P(α | β), we find P(β), which is the sum of the probabilities of the explanations of β, and P(α ∧ β), which can be found by explaining α from the explanations of β. Thus arbitrary conditional probabilities can be computed by summing the prior probabilities of explanations.

It remains only to compute the prior probability of an explanation D of g. Under assumption 2.10, if {h1, ..., hn} are part of a minimal explanation, then

    P(h1 ∧ · · · ∧ hn) = ∏_{i=1}^{n} P(hi)

To compute the prior of the minimal explanation we multiply the priors of the hypotheses. The posterior probability of the explanation is proportional to this.

2.4 An example

In this section we show an example that we use later in the paper. It is intended to be as simple as possible to show how the algorithm works. Suppose we have the rules and hypotheses:³

    rule((a :- b, h)).
    rule((a :- q, e)).
    rule((q :- h)).
    rule((q :- b, e)).
    rule((h :- b, f)).
    rule((h :- c, e)).
    rule((h :- g, b)).
    disjoint([b:0.3, c:0.7]).
    disjoint([e:0.6, f:0.3, g:0.1]).

³ Here we have used the syntax rule((H :- B)) to represent the rule H ← B. This is the syntax that is accepted by the code in Appendix A.

There are four minimal explanations of a, namely {c, e}, {b, e}, {f, b} and {g, b}. The priors of the explanations are as follows:

    P(c ∧ e) = 0.7 × 0.6 = 0.42

Similarly P(b ∧ e) = 0.18, P(f ∧ b) = 0.09 and P(g ∧ b) = 0.03. Thus

    P(a) = 0.42 + 0.18 + 0.09 + 0.03 = 0.72

There are two explanations of e ∧ a, namely {c, e} and {b, e}. Thus P(e ∧ a) = 0.60, and the conditional probability of e given a is P(e | a) = 0.60 / 0.72 = 0.833.

What is important about this example is that all of the probabilistic calculations reduce to finding the probabilities of explanations.

2.5 Tasks

The following tasks are what we expect to implement:

1. Generate the explanations of some goal (conjunction of atoms), in order.

2. Estimate the prior probability of some goal. This is implemented by enumerating some of the explanations of the goal.

3.
Estimate the posterior probabilities of the explanations of a goal (i.e., the probabilities of the explanations given the goal).

4. Estimate the conditional probability of one formula given another; that is, determine P(α | β) for any α and β.

Each of these tasks is computed using the preceding task. The last task is the essential one that we need in order to make decisions under uncertainty. All of these will be implemented by enumerating the explanations of a goal, and estimating the probability mass in the explanations that have not been enumerated. It is this problem that we consider for the next few sections; we then return to the tasks we want to compute.

3 A top-down proof procedure

In this section we show how to carry out a best-first search of the explanations. In order to do this we build a notion of a partial proof that we can add to a priority queue, and restart when necessary.

3.1 SLD-BF resolution

In this section we outline an implementation based on logic programming technology and a branch-and-bound search. The implementation keeps a priority queue of sets of hypotheses that could be extended into explanations ("partial explanations"). At any time the set of all the explanations is the set of already generated explanations, plus those explanations that can be generated from the partial explanations in the priority queue.

Definition 3.1 A partial explanation is a structure ⟨g ← C, D⟩ where g is an atom (or conjunction of atoms), C is a conjunction of atoms and D is a set of hypotheses.

Figure 1 gives an algorithm for finding
explanations of q in order of probability (most likely first). We have the following data structures:

Q is a set of partial explanations (implemented as a priority queue).

Π is a set of explanations of g that have been generated (initially empty).

NG is the set of pairs ⟨h_i, h_j⟩ such that h_i and h_j are different hypotheses in a disjoint declaration (N.B. h_i and h_j share the same variables).

At each step we choose an element ⟨g ← C, D⟩ of the priority queue Q. Which element is chosen depends on the search strategy (e.g., we could choose the element with maximum prior probability of D).
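The loop just described can be sketched in a few lines of Python (our own illustration, specialised to the ground example theory of Section 2.4; it is not the paper's Prolog implementation from Appendix A, and the names are ours). A partial explanation ⟨g ← C, D⟩ is a pair of the remaining subgoals C and the hypotheses D assumed so far; the priority is the prior of D, which can only shrink as hypotheses are added, so completed explanations come off the queue most-likely-first.

```python
import heapq
import itertools

# Ground version of the example theory of Section 2.4.
RULES = {
    "a": [["b", "h"], ["q", "e"]],
    "q": [["h"], ["b", "e"]],
    "h": [["b", "f"], ["c", "e"], ["g", "b"]],
}
DISJOINT = [{"b": 0.3, "c": 0.7}, {"e": 0.6, "f": 0.3, "g": 0.1}]
PRIOR = {h: p for decl in DISJOINT for h, p in decl.items()}

def consistent(D):
    # No two distinct hypotheses from the same disjoint declaration.
    return all(len(D & decl.keys()) <= 1 for decl in DISJOINT)

def prior(D):
    p = 1.0
    for h in D:
        p *= PRIOR[h]
    return p

def explanations(goal):
    """Yield explanations of `goal` most-likely-first.

    Each queue entry is a partial explanation <goal <- C, D>, keyed by
    the prior of D (negated, because heapq is a min-heap)."""
    tie = itertools.count()  # tie-breaker for entries with equal priority
    Q = [(-1.0, next(tie), (goal,), frozenset())]
    while Q:
        negp, _, C, D = heapq.heappop(Q)
        if not C:                      # nothing left to prove: D explains goal
            yield set(D), -negp
            continue
        g, rest = C[0], C[1:]
        if g in PRIOR:                 # g is assumable: add it to D
            D2 = D | {g}
            if consistent(D2):         # prune inconsistent partial explanations
                heapq.heappush(Q, (-prior(D2), next(tie), rest, D2))
        else:                          # resolve g against every rule for g
            for body in RULES.get(g, []):
                heapq.heappush(Q, (negp, next(tie), tuple(body) + rest, D))

found = list(explanations("a"))
```

Running this reproduces Section 2.4: the explanations of a come out in the order {c, e}, {b, e}, {b, f}, {b, g} with priors ≈ 0.42, 0.18, 0.09, 0.03, summing to P(a) ≈ 0.72; paths such as {b, c, e} are pruned as soon as the hypothesis set becomes inconsistent.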