NASA Technical Reports Server (NTRS) 19940029537: From conditional oughts to qualitative decision theory PDF

In Proceedings of the Ninth Conference on Uncertainty in Artificial Intelligence, Washington, D.C., July 1993.
TECHNICAL REPORT R-192-II, N94-34043, May 1993

From Conditional Oughts to Qualitative Decision Theory

Judea Pearl
Cognitive Systems Laboratory
University of California, Los Angeles, CA 90024
judea@cs.ucla.edu

Content Areas: Commonsense Reasoning, Probabilistic Reasoning, Reasoning about Action

Abstract

The primary theme of this investigation is a decision theoretic account of conditional ought statements (e.g., "You ought to do A, if C") that rectifies glaring deficiencies in classical deontic logic. The resulting account forms a sound basis for qualitative decision theory, thus providing a framework for qualitative planning under uncertainty. In particular, we show that adding causal relationships (in the form of a single graph) as part of an epistemic state is sufficient to facilitate the analysis of action sequences, their consequences, their interaction with observations, their expected utilities and, hence, the synthesis of plans and strategies under uncertainty.

1 INTRODUCTION

In natural discourse, "ought" statements reflect two kinds of considerations: requirements to act in accordance with moral convictions or peer's expectations, and requirements to act in the interest of one's survival, namely, to avoid danger and pursue safety. Statements of the second variety are natural candidates for decision theoretic analysis, albeit qualitative in nature, and these will be the focus of our discussion. The idea is simple. A sentence of the form "You ought to do A if C" is interpreted as shorthand for a more elaborate sentence: "If you observe, believe, or know C, then the expected utility resulting from doing A is much higher than that resulting from not doing A". [1] The longer sentence combines several modalities that have been the subjects of AI investigations: observation, belief, knowledge, probability ("expected"), desirability ("utility"), causation ("resulting from"), and, of course, action ("doing A"). With the exception of utility, these modalities have been formulated recently using qualitative, order-of-magnitude abstractions of probability theory (Goldszmidt & Pearl 1992, Goldszmidt 1992). Utility preferences themselves, we know from decision theory, can be fairly unstructured, save for obeying asymmetry and transitivity. Thus, paralleling the order-of-magnitude abstraction of probabilities, it is reasonable to score consequences on an integer scale of utility: very desirable ($U = O(1/\epsilon)$), very undesirable ($U = -O(1/\epsilon)$), bearable ($U = O(1)$), and so on, mapping each linguistic assessment into the appropriate $\pm O(1/\epsilon^i)$ utility rating. This utility rating, when combined with the infinitesimal rating of probabilistic beliefs (Goldszmidt & Pearl 1992), should permit us to rate actions by the expected utility of their consequences, and a requirement to do A would then be asserted iff the rating of doing A is substantially (i.e., a factor of $1/\epsilon$) higher than that of not doing A.

[1] An alternative interpretation, in which "doing A" is required to be substantially superior to both "not doing A" and "doing not-A" is equally valid, and could be formulated as a straightforward extension of our analysis.

This decision theoretic agenda, although conceptually straightforward, encounters some subtle difficulties in practice. First, when we deal with actions and consequences, we must resort to causal knowledge of the domain and we must decide how such knowledge is to be encoded, organized, and utilized. Second, while theories of actions are normally formulated as theories of temporal changes (Shoham 1988, Dean & Kanazawa 1989), ought statements invariably suppress explicit references to time, strongly suggesting that temporal information is redundant, namely, it can be reconstructed if required, but glossed over otherwise. In other words, the fact that people comprehend, evaluate and follow non-temporal ought statements suggests that people adhere to some canonical, yet implicit assumptions about temporal progression of events, and that no account can be complete without making these assumptions explicit. Third, actions in decision theory are predesignated explicitly to a few distinguished atomic variables, while statements of the type "You ought to do A" are presumed applicable to any arbitrary proposition A. [2] Finally, decision theoretic methods, especially those based on static influence diagrams, treat both the informational relationships between observations and actions and the causal relationships between actions and consequences as instantaneous (Chapter 6, Pearl 1988, Shachter 1986). In reality, the effect of our next action might be to invalidate currently observed properties, hence any non-temporal account of ought must carefully distinguish properties that are influenced by the action from those that will persist despite the action, and must explicate therefore some canonical assumptions about persistence.

[2] This has been an overriding assumption in both the deontic logic and the preference logic literatures.

These issues are the primary focus of this paper. We start by presenting a brief introduction to infinitesimal probabilities and showing how actions, beliefs, and causal relationships are represented by ranking functions $\kappa(\omega)$ and causal networks $\Gamma$ (Section 2). In Section 3 we present a summary of the formal results obtained in this paper, including an assertability criterion for conditional oughts. Sections 4 and 5 explicate the assumptions leading to the criterion presented in Section 3. In Section 4 we introduce an integer-valued utility ranking $\mu(\omega)$ and show how the three components, $\kappa(\omega)$, $\Gamma$, and $\mu(\omega)$, permit us to calculate, semi-qualitatively, the utility of an arbitrary proposition $\varphi$, the utility of a given action A, and whether we ought to do A. Section 5 introduces conditional oughts, namely, statements in which the action is contingent upon observing a condition C. A calculus is then developed for transforming the conditional ranking $\kappa(\omega \mid C)$ into a new ranking $\kappa_A(\omega \mid C)$, representing the beliefs an agent will possess after implementing action A, having observed C. These two ranking functions are then combined with $\mu(\omega)$ to form an assertability criterion for the conditional statement $O(A \mid C)$: "We ought to do A, given C". In Section 6 we compare our formulation to other accounts of ought statements, in particular deontic logic, preference logic, counterfactual conditionals, and quantitative decision theory.

2 INFINITESIMAL PROBABILITIES, RANKING FUNCTIONS, CAUSAL NETWORKS, AND ACTIONS

1. (Ranking Functions). Let $\Omega$ be a set of worlds, each world $\omega \in \Omega$ being a truth-value assignment to a finite set of atomic variables $(X_1, X_2, \ldots, X_n)$ which in this paper we assume to be bi-valued, namely, $X_i \in \{true, false\}$. A belief ranking function $\kappa(\omega)$ is an assignment of non-negative integers to the elements of $\Omega$ such that $\kappa(\omega) = 0$ for at least one $\omega \in \Omega$. Intuitively, $\kappa(\omega)$ represents the degree of surprise associated with finding a world $\omega$ realized, and worlds assigned $\kappa = 0$ are considered serious possibilities. $\kappa(\omega)$ can be considered an order-of-magnitude approximation of a probability function $P(\omega)$ by writing $P(\omega)$ as a polynomial of some small quantity $\epsilon$ and taking the most significant term of that polynomial, i.e.,

$$P(\omega) \approx C \epsilon^{\kappa(\omega)} \tag{1}$$

Treating $\epsilon$ as an infinitesimal quantity induces a conditional ranking function $\kappa(\varphi \mid \psi)$ on propositions which is governed by Spohn's calculus (Spohn 1988):

$$\kappa(\Omega) = 0$$

$$\kappa(\varphi) = \begin{cases} \min_{\omega \models \varphi} \kappa(\omega) & \text{if } \varphi \text{ has a model} \\ \infty & \text{otherwise} \end{cases}$$

$$\kappa(\varphi \mid \psi) = \kappa(\varphi \wedge \psi) - \kappa(\psi) \tag{2}$$

2. (Stratified Rankings and Causal Networks (Goldszmidt & Pearl 1992)). A causal network is a directed acyclic graph (dag) in which each node corresponds to an atomic variable and each edge $X_i \rightarrow X_j$ asserts that $X_i$ has a direct causal influence on $X_j$. Such networks provide a convenient data structure for encoding two types of information: how the initial ranking function $\kappa(\omega)$ is formed, and how external actions would influence the agent's belief ranking $\kappa(\omega)$. Formally, causal networks are defined in terms of two notions: stratification and actions.

A ranking function $\kappa(\omega)$ is said to be stratified relative to a dag $\Gamma$ if

$$\kappa(\omega) = \sum_i \kappa(X_i(\omega) \mid pa_i(\omega)) \tag{3}$$

where $pa_i(\omega)$ are the parents of $X_i$ in $\Gamma$ evaluated at state $\omega$. Given a ranking function $\kappa(\omega)$, any edge-minimal dag $\Gamma$ satisfying Eq. (3) is called a Bayesian network of $\kappa(\omega)$ (Pearl 1988). A dag $\Gamma$ is said to be a causal network of $\kappa(\omega)$ if it is a Bayesian network of $\kappa(\omega)$ and, in addition, it admits the following representation of actions.

3. (Actions) The effect of an atomic action $do(X_i = true)$ is represented by adding to $\Gamma$ a link $DO_i \rightarrow X_i$, where $DO_i$ is a new variable taking values in $\{do(x_i), do(\neg x_i), idle\}$ and $x_i$ stands for $X_i = true$. Thus, the new parent set of $X_i$ is $pa'_i = pa_i \cup \{DO_i\}$ and it is related to $X_i$ by

$$\kappa(X_i(\omega) \mid pa'_i(\omega)) = \begin{cases} \infty & \text{if } DO_i = do(y) \text{ and } X_i(\omega) \neq y \\ \kappa(X_i(\omega) \mid pa_i(\omega)) & \text{if } DO_i = idle \\ 0 & \text{if } DO_i = do(y) \text{ and } X_i(\omega) = y \end{cases} \tag{4}$$

The effect of performing action $do(x_i)$ is to transform $\kappa(\omega)$ into a new belief ranking, $\kappa_{x_i}(\omega)$, given by

$$\kappa_{x_i}(\omega) = \begin{cases} \kappa'(\omega \mid do(x_i)) & \text{for } \omega \models x_i \\ \infty & \text{for } \omega \models \neg x_i \end{cases} \tag{5}$$

where $\kappa'$ is the ranking dictated by the augmented network $\Gamma \cup \{DO_i \rightarrow X_i\}$ and Eqs. (3) and (4).

This representation embodies the following aspects of actions:

(i) An action $do(x_i)$ can affect only the descendants of $X_i$ in $\Gamma$.

(ii) Fixing the value of $pa_i$ (by some appropriate choice of actions) renders $X_i$ unaffected by any external intervention $do(x_k)$, $k \neq i$.

3 SUMMARY OF RESULTS

The assertability condition we are about to develop in this paper requires the specification of an epistemic state $ES = (\kappa(\omega), \Gamma, \mu(\omega))$ which consists of three components:

$\kappa(\omega)$ - an ordinal belief ranking function on $\Omega$.
$\Gamma$ - a causal network of $\kappa(\omega)$.
$\mu(\omega)$ - an integer-valued utility ranking of worlds, where $\mu(\omega) = \pm i$ assigns to $\omega$ a utility $U(\omega) = \pm O(1/\epsilon^i)$, $i = 0, 1, 2, \ldots$.

The main results of this paper can be summarized as follows:

1. Let $W_i^+$ and $W_i^-$ be the formulas whose models receive utility ranking $+i$ and $-i$, respectively, and let $\kappa'(\omega)$ denote the ranking function that prevails after establishing the truth of event $\varphi$, where $\varphi$ is an arbitrary proposition (i.e., $\kappa'(\neg\varphi) = \infty$ and $\kappa'(\varphi) = 0$). The expected utility rank of $\varphi$ is characterized by two integers

$$n^+ = \max_i [0;\ i - \kappa'(W_i^+ \wedge \varphi)]$$
$$n^- = \max_i [0;\ i - \kappa'(W_i^- \wedge \varphi)] \tag{6}$$

and is given by

$$\mu(\varphi; \kappa'(\omega)) = \begin{cases} \text{ambiguous} & \text{if } n^+ = n^- > 0 \\ n^+ - n^- & \text{otherwise} \end{cases} \tag{7}$$

2. A conditional ought statement $O(A \mid C)$ is assertable in ES iff

$$\mu(A; \kappa_A(\omega \mid C)) > \mu(true; \kappa(\omega \mid C)) \tag{8}$$

where A and C are arbitrary propositions and the ranking $\kappa_A(\omega \mid C)$ (to be defined in step 3) represents the beliefs that an agent anticipates holding, after implementing action A, having observed C.

3. If A is a conjunction of atomic propositions, $A = \bigwedge_{j \in J} A_j$, where each $A_j$ stands for either $X_j = true$ or $X_j = false$, then the post-action ranking $\kappa_A(\omega \mid C)$ is given by the formula

$$\kappa_A(\omega \mid C) = \kappa(\omega) - \sum_{i \in J \cup R} \kappa(X_i(\omega) \mid pa_i(\omega)) + \min_{\omega'} \Big[ \sum_{i \notin J} S_i(\omega, \omega') + \kappa(\omega' \mid C) \Big] \tag{9}$$

where R is the set of root nodes and

$$S_i(\omega, \omega') = \begin{cases} s_i & \text{if } X_i(\omega) \neq X_i(\omega'),\ pa_i \neq \emptyset \text{ and } \kappa(\neg X_i(\omega) \mid pa_i(\omega)) = 0 \\ s_i & \text{if } X_i(\omega) \neq X_i(\omega') \text{ and } pa_i = \emptyset \\ 0 & \text{otherwise} \end{cases} \tag{10}$$

$S_i(\omega, \omega')$ represents persistence assumptions: It is surprising (to degree $s_i \geq 1$) to find $X_i$ change from its pre-action value of $X_i(\omega')$ to a post-action value of $X_i(\omega)$ if there is no causal reason for the change.

If A is a disjunction of actions, $A = \bigvee_l A_l$, where each $A_l$ is a conjunction of atomic propositions, then

$$\kappa_A(\omega \mid C) = \min_l \kappa_{A_l}(\omega \mid C) \tag{11}$$

4 FROM UTILITIES AND BELIEFS TO GOALS AND ACTIONS

Given a proposition $\varphi$ that describes some condition or event in the world, what information is needed before we can evaluate the merit of obtaining $\varphi$, or, at the least, whether $\varphi_1$ is "preferred" to $\varphi_2$? Clearly, if we are to apply the expected utility criterion, we should define two measures on possible worlds, a probability measure $P(\omega)$ and a utility measure $U(\omega)$. The first rates the likelihood that a world $\omega$ will be realized, while the second measures the desirability of $\omega$. Unfortunately, probabilities and utilities in themselves are not sufficient for determining preferences among propositions. The merit of obtaining $\varphi$ depends on at least two other factors: how the truth of $\varphi$ is established, and what control we possess over which model of $\varphi$ will eventually prevail. We will demonstrate these two factors by example.

Consider the proposition $\varphi$ = "The ground is wet". In the midst of a drought, the consequences of this statement would depend critically on whether we watered the ground (action) or we happened to find the ground wet (observation). Thus, the utility of a proposition $\varphi$ clearly depends on how we came to know $\varphi$, by mere observation or by willful action. In the first case, finding $\varphi$ true may provide information about the natural process that led to the observation $\varphi$, and we should change the current probability from $P(\omega)$ to $P(\omega \mid \varphi)$. In the second case, our actions may perturb the natural flow of events, and $P(\omega)$ will change without shedding light on the typical causes of $\varphi$. We will denote the probability resulting from externally enforcing the truth of $\varphi$ by $P_\varphi(\omega)$, which will be further explicated in Section 5 in terms of the causal network $\Gamma$. [3]

[3] The difference between $P(\omega \mid \varphi)$ and $P_\varphi(\omega)$ is precisely the difference between conditioning and "imaging" (Lewis 1973), and between belief revision and belief update (Alchourron et al. 1985, Katsuno & Mendelzon 1991, Goldszmidt & Pearl 1992). It also accounts for the difference between indicative and subjunctive conditionals - a topic of much philosophical discussion (Harper et al. 1980).

However, regardless of whether the probability function $P(\omega \mid \varphi)$ or $P_\varphi(\omega)$ results from learning $\varphi$, we are still unable to evaluate the merit of $\varphi$ unless we understand what control we have over the opportunities offered by $\varphi$. Simply taking the expected utility $U(\varphi) = \sum_\omega [P(\omega \mid \varphi) U(\omega)]$ amounts to assuming that the agent is to remain totally passive until Nature selects a world $\omega$ with probability $P(\omega \mid \varphi)$, as in a game of chance. It ignores subsequent actions which the agent might be able to take so as to change this probability. For example, event $\varphi$ might provide the agent with the option of conducting further tests so as to determine with greater certainty which world would eventually be realized. Likewise, in case $\varphi$ stands for "Joe went to get his gun", our agent might possess the wisdom to protect itself by escaping in the next taxicab.
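The ranking calculus of Section 2 and the expected utility rank of Eqs. (6)-(7) are small enough to try out directly. The following is a minimal sketch, not code from the paper: the function names (`rank`, `cond_rank`, `condition`, `utility_rank`), the representation of worlds as tuples of booleans, and the two-variable toy ranking at the bottom are all assumptions made here for illustration.

```python
import math
from itertools import product

INF = math.inf  # rank of a proposition that has no model


def rank(kappa, prop):
    """kappa(phi): the smallest rank among the models of phi (INF if none)."""
    return min((k for w, k in kappa.items() if prop(w)), default=INF)


def cond_rank(kappa, phi, psi):
    """Eq. (2): kappa(phi | psi) = kappa(phi and psi) - kappa(psi)."""
    k_psi = rank(kappa, psi)
    if k_psi == INF:
        raise ValueError("cannot condition on an impossible proposition")
    return rank(kappa, lambda w: phi(w) and psi(w)) - k_psi


def condition(kappa, psi):
    """World-level ranking kappa(w | psi); worlds violating psi get INF."""
    k_psi = rank(kappa, psi)
    if k_psi == INF:
        raise ValueError("cannot condition on an impossible proposition")
    return {w: (k - k_psi if psi(w) else INF) for w, k in kappa.items()}


def utility_rank(kappa_prime, mu, phi):
    """Eqs. (6)-(7): n+ - n-, or 'ambiguous' when n+ = n- > 0.

    Since kappa'(W_i and phi) is a minimum over worlds, i - kappa'(W_i and phi)
    equals the maximum of |mu(w)| - kappa'(w) over the matching worlds.
    """
    gains = [mu[w] - k for w, k in kappa_prime.items() if phi(w) and mu[w] > 0]
    losses = [-mu[w] - k for w, k in kappa_prime.items() if phi(w) and mu[w] < 0]
    n_plus = max([0] + gains)
    n_minus = max([0] + losses)
    if n_plus == n_minus > 0:
        return "ambiguous"
    return n_plus - n_minus


# Hypothetical toy state: a world is (rain, umbrella); rain is surprising to
# degree 1, and being in the rain without an umbrella has utility rank -1.
toy_kappa = {(r, u): (1 if r else 0) for r, u in product([True, False], repeat=2)}
toy_mu = {w: (-1 if (w[0] and not w[1]) else 0) for w in toy_kappa}
rain = lambda w: w[0]
anything = lambda w: True

print(utility_rank(toy_kappa, toy_mu, anything))               # 0
print(utility_rank(condition(toy_kappa, rain), toy_mu, rain))  # -1
```

In the toy state the disaster has utility rank 1 but is also surprising to degree 1, so its contribution to the expected utility is only O(1) and the overall rank is 0; once rain is established, the same disaster becomes a serious possibility and the rank drops to -1.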
In practical decision analysis the utility of being in a situation $\varphi$ is computed using a dynamic programming approach, which assumes that subsequent to realizing $\varphi$ the agent will select the optimal sequence of actions from those enabled by $\varphi$. This computation is rather exhaustive and is often governed by some form of myopic approximation (Chapter 6, Pearl 1988). Ought statements normally refer to a single action A, tacitly assuming that the choice of subsequent actions, if available, is rather obvious and their consequences are well understood. We say, for example, "You ought to get some food", assuming that the food would subsequently be eaten and not be left to rot in the car. In our analysis, we will make a similar myopic approximation, assuming either that action A is terminal or that the consequences of subsequent actions (if available) are already embodied in the functions $P(\omega)$ and $\mu(\omega)$. We should keep in mind, however, that the result of this myopic approximation is not applicable to all actions; in sequential planning situations, some actions may be selected for the sole purpose of enabling certain subsequent actions.

Denote by $P'(\omega)$ the probability function that would prevail after obtaining $\varphi$. [4] Let us examine next how the expected utility criterion $U(\varphi) = \sum_\omega P'(\omega) U(\omega)$ translates into the language of ranking functions. Let us assume that U takes on values in $\{-O(1/\epsilon), O(1), +O(1/\epsilon)\}$, read as {very undesirable, bearable, very desirable}. For notational simplicity, we can describe these linguistic labels as a utility ranking function $\mu(\omega)$ that takes on the values $-1$, $0$, and $+1$, respectively. Our task, then, is to evaluate the rank $\mu(\varphi)$, as dictated by the expected value of $U(\omega)$ over the models of $\varphi$.

[4] $P'(\omega) = P(\omega \mid \varphi)$ in case $\varphi$ is observed, and $P'(\omega) = P_\varphi(\omega)$ in case $\varphi$ is enacted. In both cases $P'(\varphi) = 1$.

Let the sets of worlds assigned the ranks $-1$, $0$, and $+1$ be represented by the formulas $W^-$, $W^0$, and $W^+$, respectively, and let the intersections of these sets with $\varphi$ be represented by the formulas $\varphi^-$, $\varphi^0$, and $\varphi^+$, respectively. The expected utility of $\varphi$ is given by $-(C_-/\epsilon) P'(W^-) + C_0 P'(W^0) + (C_+/\epsilon) P'(W^+)$, where $C_-$, $C_0$, and $C_+$ are some positive coefficients. Introducing now the infinitesimal approximation for $P'$, in the form of the ranking function $\kappa'$, we obtain

$$U(\varphi) = \begin{cases} -O(1/\epsilon) & \text{if } \kappa'(\varphi^-) = 0 \text{ and } \kappa'(\varphi^+) > 0 \\ O(1) & \text{if } \kappa'(\varphi^-) > 0 \text{ and } \kappa'(\varphi^+) > 0 \\ +O(1/\epsilon) & \text{if } \kappa'(\varphi^-) > 0 \text{ and } \kappa'(\varphi^+) = 0 \\ \text{ambiguous} & \text{if } \kappa'(\varphi^-) = 0 \text{ and } \kappa'(\varphi^+) = 0 \end{cases} \tag{12}$$

The ambiguous status reflects a state of conflict $U(\varphi) = -C_-/\epsilon + C_+/\epsilon$, where there is a serious possibility of ending in either terrible disaster or enormous success. Recognizing that ought statements are often intended to avert such situations (e.g., "You ought to invest in something safer"), we may take a risk-averse attitude and rank ambiguous states as low as $U = -O(1/\epsilon)$ (other attitudes are, of course, perfectly legitimate). This attitude, together with $\kappa'(\varphi) = 0$, yields the desired expression for $\mu(\varphi; \kappa'(\omega))$:

$$\mu(\varphi; \kappa'(\omega)) = \begin{cases} -1 & \text{if } \kappa'(W^- \mid \varphi) = 0 \\ 0 & \text{if } \kappa'(W^- \vee W^+ \mid \varphi) > 0 \\ +1 & \text{if } \kappa'(W^- \mid \varphi) > 0 \text{ and } \kappa'(W^+ \mid \varphi) = 0 \end{cases} \tag{13}$$

The three-level utility model is, of course, only a coarse rating of desirability. In a multi-level model, where $W_i^+$ and $W_i^-$ are the formulas whose models receive utility ranking $+i$ and $-i$, respectively [5], and $i = 0, 1, 2, \ldots$, the ranking of the expected utility of $\varphi$ is given by Eq. (7) (Section 3).

[5] In practice, the specification of $U(\omega)$ is done by defining an integer-valued variable V (connoting "value") as a function of a select set of atomic variables. $W_i^+$ would correspond then to the assertion $V = +i$, $i = 0, 1, 2, \ldots$.

Having derived a formula for the utility rank of an arbitrary proposition $\varphi$, we are now in a position to formulate our interpretation of the deontic expression $O(A \mid C)$: "You ought to do A if C, iff the expected utility associated with doing A is much higher than that associated with not doing A". We start with a belief ranking $\kappa(\omega)$ and a utility ranking $\mu(\omega)$, and we wish to assess the utilities associated with doing A versus not doing A, given that we observe C. The observation C would transform our current $\kappa(\omega)$ into $\kappa(\omega \mid C)$. Doing A would further transform $\kappa(\omega \mid C)$ into $\kappa'(\omega) = \kappa_A(\omega \mid C)$, while not doing A would render $\kappa(\omega \mid C)$ unaltered, so $\kappa'(\omega) = \kappa(\omega \mid C)$. Thus, the utility rank associated with doing A is given by $\mu(A; \kappa_A(\omega \mid C))$, while that associated with not doing A is given by $\mu(C; \kappa(\omega \mid C)) = \mu(true; \kappa(\omega \mid C))$. Consequently, we can write the assertability criterion for conditional ought as

$$O(A \mid C) \quad \text{iff} \quad \mu(A; \kappa_A(\omega \mid C)) > \mu(true; \kappa(\omega \mid C)) \tag{14}$$

where the function $\mu(\varphi; \kappa(\omega))$ is given in Eq. (13).

We remark that the transformation from $\kappa(\omega \mid C)$ to $\kappa_A(\omega \mid C)$ requires causal knowledge of the domain, which will be provided by the causal network $\Gamma$ (Section 5). Once we are given $\Gamma$ it will be convenient to encode both $\kappa(\omega)$ and $\mu(\omega)$ using integer-valued labels on the links of $\Gamma$. Moreover, it is straightforward to apply Eqs. (7) and (14) to the usual decision theoretic tasks of selecting an optimal action or an optimal information-gathering strategy (Chapter 6, Pearl 1988).

Example 1:

To demonstrate the use of Eq. (14), let us examine the assertability of "If it is cloudy you ought to take an umbrella" (Boutilier 1993). We assume three atomic propositions, c - "Cloudy", r - "Rain", and u - "Having an Umbrella", which form eight worlds, each corresponding to a complete truth assignment to c, r, and u. To express our belief that rain does not normally occur in a clear day, we assign a $\kappa$ value of 1 (indicating one unit of surprise) to any world satisfying $r \wedge \neg c$ and a $\kappa$ value of 0 to all other worlds (indicating a serious possibility that any such world may be realized). To express the fear of finding ourselves in the rain without an umbrella, we assign a $\mu$ value of $-1$ to worlds satisfying $r \wedge \neg u$ and a $\mu$ value of 0 to all other worlds. Thus, $W^+ = false$, $W^0 = \neg(r \wedge \neg u)$, and $W^- = r \wedge \neg u$.

In this simple example, there is no difference between $\kappa_A(\omega)$ and $\kappa(\omega \mid A)$ because the act A = "Taking an umbrella" has the same implications as the finding "Having an umbrella". Thus, to evaluate the two expressions in Eq. (14), with $A = u$ and $C = c$, we first note that

$$\kappa(W^- \mid u, c) = \kappa(r \wedge \neg u \mid u, c) = \infty$$
$$\kappa(W^- \vee W^+ \mid u, c) = \infty$$

so

$$\mu(u; \kappa(\omega \mid u, c)) = 0$$

Similarly,

$$\kappa(W^- \mid c) = \kappa(r \wedge \neg u \mid c) = 0$$

hence

$$\mu(c; \kappa(\omega \mid c)) = -1 \tag{15}$$

Thus, $O(u \mid c)$ is assertable according to the criterion of Eq. (14).

Note that although $\kappa(\omega)$ does not assume that normally we do not have an umbrella with us ($\kappa(u) > 0$), the advice to take an umbrella is still assertable, since leaving u to pure chance might result in harsh consequences (if it rains).

Using the same procedure, it is easy to show that the example also sanctions the assertability of $O(\neg r \mid c, \neg u)$, which stands for "If it is cloudy and you don't have an umbrella, then you ought to undo (or stop) the rain". This is certainly useless advice, as it does not take into account one's inability to control the weather. Controllability information is not encoded in the ranking functions $\kappa$ and $\mu$; it should be part of one's causal theory and can be encoded in the language of causal networks using costly preconditions that, until satisfied, would forbid the action $do(A)$ from having any effect on A. [6]

[6] In decision theory it is customary to attribute direct costs to actions, which renders $\mu(\omega)$ action-dependent. An alternative, which is more convenient when actions are not enumerated explicitly, is to attribute costs to preconditions that must be satisfied before (any) action becomes effective.

5 COMBINING ACTIONS AND OBSERVATIONS

In this section we develop a probabilistic account for the term $\kappa_A(\omega \mid C)$, which stands for the belief ranking that would prevail if we act A after observing C, i.e., the A-update of $\kappa(\omega \mid C)$. First we note that this update cannot be obtained by simply applying the update formula developed in (Eq. (2.2), Goldszmidt & Pearl 1992),

$$\kappa_A(\omega) = \begin{cases} \kappa(\omega) - \kappa(A \mid pa_A(\omega)) & \omega \models A \\ \infty & \omega \models \neg A \end{cases} \tag{16}$$

where $pa_A(\omega)$ are the parents (or immediate causes) of A in the causal network $\Gamma$ evaluated at $\omega$. The formula above was derived under the assumption that $\Gamma$ is not loaded with any observations (e.g., C) and renders $\kappa_A(\omega)$ undefined for worlds $\omega$ that are excluded by previous observations and reinstated by A.

To motivate the proper transformation from $\kappa(\omega)$ to $\kappa_A(\omega \mid C)$, we consider two causal networks, $\Gamma'$ and $\Gamma$ respectively representing the agent's epistemic states before and after the action (see Figure 1). Although the structures of the two networks are almost the same ($\Gamma$ contains additional root nodes representing the action $do(A)$), it is the interactions between the corresponding variables that determine which beliefs are going to persist in $\Gamma$ and which are to be "clipped" by the influence of action A.

[Figure 1: Persistence interactions between two causal networks (pre-action network $\Gamma'$ and post-action network $\Gamma$).]

Let every variable $X'_i$ in $\Gamma'$ be connected to the corresponding variable $X_i$ in $\Gamma$ by a directed link $X'_i \rightarrow X_i$ that represents persistence by default, namely, the natural tendency of properties to persist, unless there is a cause for a change. Thus, the parent set of each $X_i$ in $\Gamma$ has been augmented with one more variable: $X'_i$. To specify the conditional probability of $X_i$, given its new parent set $\{pa_{X_i} \cup X'_i\}$, we need to balance the tendency of $X_i$ to persist (i.e., be equal to $X'_i$) against its tendency to obey the causal influence exerted by $pa_{X_i}$. We will assume that persistence forces yield to causal forces and will perpetuate only those properties that are not under any causal influence to terminate. In terms of ranking functions, this assumption reads:

$$\kappa(X_i(\omega) \mid pa_i(\omega), X'_i(\omega')) = \begin{cases} s_i + \kappa(X_i(\omega) \mid pa_i(\omega)) & \text{if } X_i(\omega) \neq X'_i(\omega') \text{ and } \kappa(\neg X_i(\omega) \mid pa_i(\omega)) = 0 \\ s_i & \text{if } pa_i = \emptyset \text{ and } X_i(\omega) \neq X'_i(\omega') \\ \kappa(X_i(\omega) \mid pa_i(\omega)) & \text{otherwise} \end{cases} \tag{17}$$

where $\omega'$ and $\omega$ specify the truth values of the variables in the corresponding networks, $\Gamma'$ and $\Gamma$, and $s_i \geq 1$ is a constant characterizing the tendency of $X_i$ to persist. Eq. (17) states that the past value of $X_i$ may affect the normal relation between $X_i$ and its parents only when it differs from the current value and, at the same time, the parents of $X_i$ do not compel the change. In such a case, the inequality $X_i(\omega) \neq X'_i(\omega')$ contributes $s_i$ units of surprise to the normal relation between $X_i$ and its parents. [7] The unique feature of this model, unlike the one proposed in (Goldszmidt & Pearl 1992), is that persistence defaults can be violated by causal factors without forcing us to conclude that such factors are abnormal.

[7] This is essentially the persistence model used by Dean and Kanazawa (Dean & Kanazawa 1989), in which $s_i$ represents the survival function of $X_i$. The use of ranking functions allows us to distinguish crisply between changes that are causally supported, $\kappa(\neg X_i(\omega) \mid pa_i(\omega)) > 0$, and those that are unsupported, $\kappa(\neg X_i(\omega) \mid pa_i(\omega)) = 0$.

Eq. (17) specifies the conditional rank $\kappa(X \mid pa_X)$ for every variable X in the combined networks and, hence, it provides a complete specification of the joint rank $\kappa(\omega, \omega')$. [8] The desired expression for the post-action ranking $\kappa_A(\omega)$ can then be obtained by marginalizing over $\omega'$:

$$\kappa_A(\omega) = \min_{\omega'} \kappa(\omega, \omega') \tag{18}$$

[8] The expressions, familiar in probability theory,

$$P(\omega, \omega') = \prod_j P(X_j(\omega, \omega') \mid pa_j(\omega, \omega')), \qquad P(\omega) = \sum_{\omega'} P(\omega, \omega')$$

translate into the ranking expressions

$$\kappa(\omega, \omega') = \sum_j \kappa(X_j(\omega, \omega') \mid pa_j(\omega, \omega')), \qquad \kappa(\omega) = \min_{\omega'} \kappa(\omega, \omega')$$

where j ranges over all variables in the two networks.

We need, however, to account for the fact that some variables in network $\Gamma$ are under the direct influence of the action A, and hence the parents of these nodes are replaced by the action node $do(A)$. If A consists of a conjunction of atomic propositions, $A = \bigwedge_{j \in J} A_j$, where each $A_j$ stands for either $X_j = true$ or $X_j = false$, then each $X_i$, $i \in J$, should be exempt from incurring the spontaneity penalty specified in Eq. (17). Additionally, in calculating $\kappa(\omega, \omega')$ we need to sum $\kappa(X_i(\omega) \mid pa_i(\omega), X'_i(\omega'))$ only over $i \notin J$, namely, over variables not under the direct influence of A. Thus, collecting terms and writing $\kappa(\omega) = \sum_i \kappa(X_i(\omega) \mid pa_i(\omega))$, we obtain

$$\kappa_A(\omega \mid C) = \kappa(\omega) - \sum_{i \in J \cup R} \kappa(X_i(\omega) \mid pa_i(\omega)) + \min_{\omega'} \Big[ \sum_{i \notin J} S_i(\omega, \omega') + \kappa(\omega' \mid C) \Big] \tag{19}$$

where R is the set of root nodes and

$$S_i(\omega, \omega') = \begin{cases} s_i & \text{if } X_i(\omega) \neq X_i(\omega'),\ pa_i \neq \emptyset \text{ and } \kappa(\neg X_i(\omega) \mid pa_i(\omega)) = 0 \\ s_i & \text{if } X_i(\omega) \neq X_i(\omega') \text{ and } pa_i = \emptyset \\ 0 & \text{otherwise} \end{cases} \tag{20}$$

Eq. (19) demonstrates that the effect of observations and actions can be computed as an updating operation on epistemic states, these states being organized by a fixed causal network, with the only varying element being $\kappa$, the belief ranking. Long streams of observations and actions could therefore be processed as a sequence of updates on some initial state, without requiring analysis of long chains of temporally indexed networks, as in Dean and Kanazawa (1989).

To handle disjunctive actions such as "Paint the wall either red or blue" one must decide between two interpretations: "Paint the wall red or blue regardless of its current color" or "Paint the wall either red or blue but, if possible, do not change its current color" (see Katsuno & Mendelzon 1991 and Goldszmidt & Pearl 1992). We will adopt the former interpretation, according to which "$do(A \vee B)$" is merely a shorthand for "$do(A) \vee do(B)$". This interpretation is particularly convenient for ranking systems, because for any two propositions, A and B, we have

$$\kappa(A \vee B) = \min[\kappa(A); \kappa(B)] \tag{21}$$

Thus, if we do not know which action, A or B, will be implemented but consider either to be a serious possibility, then

$$\kappa_{A \vee B}(\omega) = \min[\kappa_A(\omega); \kappa_B(\omega)] \tag{22}$$

Accordingly, if A is a disjunction of actions, $A = \bigvee_l A_l$, where each $A_l$ is a conjunction of atomic propositions, then

$$\kappa_A(\omega \mid C) = \min_l \kappa_{A_l}(\omega \mid C) \tag{23}$$

Example 2

To demonstrate the interplay between actions and observations, we will test the assertability of the following dialogue:

Robot 1: It is too dark in here.
Robot 2: Then you ought to push the switch up.
Robot 1: The switch is already up.
Robot 2: Then you ought to push the switch down.

The challenge would be to explain the reversal of the "ought" statement in response to the new observation "The switch is already up". The inferences involved in this example revolve around identifying the type of switch Robot 1 is facing, that is whether it is normal (n) or abnormal ($\neg n$) (a normal switch is one that should be pushed up (u) to turn the light on (l)). The causal network, shown in Figure 2, involves three variables:

L - the current state of the light (l vs $\neg l$),
S - the current position of the switch (u vs $\neg u$), and
T - the type of switch at hand (n vs $\neg n$).

The variable L stands in functional relationship to S and T, via

$$l = (n \wedge u) \vee (\neg n \wedge \neg u) \tag{24}$$

or, equivalently, $\kappa = \infty$ unless l satisfies the relation above.

Since initially the switch is believed to be normal, we set $\kappa(\neg n) = 1$, resulting in the following initial ranking:

S     T     L     κ(ω)
u     n     l     0
¬u    n     ¬l    0
u     ¬n    ¬l    1
¬u    ¬n    l     1

[Figure 2: Causal network for Example 2. Nodes S (switch position: u = up, ¬u = down) and T (type of switch: n = normal, ¬n = abnormal) are the parents of L (light: l = on, ¬l = not on).]

We also assume that Robot 1 prefers light to darkness, by setting

$$\mu(\omega) = \begin{cases} -1 & \text{if } \omega \models \neg l \\ 0 & \text{if } \omega \models l \end{cases} \tag{25}$$

The first statement of Robot 1 expresses an observation $C = \neg l$, yielding

$$\kappa(\omega \mid C) = \begin{cases} 0 & \text{for } \omega = \neg u \wedge n \wedge \neg l \\ 1 & \text{for } \omega = u \wedge \neg n \wedge \neg l \\ \infty & \text{for all other worlds} \end{cases} \tag{26}$$

To evaluate $\kappa_A(\omega \mid C)$ for $A = u$, we now invoke Eq. (19), using the spontaneity functions

$$S_T(\omega, \omega') = 1 \ \text{ if } T(\omega) \neq T(\omega')$$
$$S_L(\omega, \omega') = 0 \ \text{ if } L(\omega) \neq L(\omega') \tag{27}$$

because $L(\omega)$, being functionally determined by $pa_L(\omega)$, is exempt from conforming to persistence defaults. Moreover, for action $A = u$ we also have $\kappa(u \mid pa_A) = \kappa(u) = 0$, hence

$$\kappa_A(\omega \mid C) = \kappa(\omega) - \kappa(T(\omega)) + \min_{\omega' \in \{\omega'_1, \omega'_2\}} \{ I[T(\omega) \neq T(\omega')] + \kappa(\omega' \mid C) \}, \quad \text{for } \omega = \omega_1, \omega_2 \tag{28}$$

where $I[p]$ equals 1 (or 0) if p is true (or false), and

$$\omega_1 = u \wedge n \wedge l \qquad \omega'_1 = \neg u \wedge n \wedge \neg l$$
$$\omega_2 = u \wedge \neg n \wedge \neg l \qquad \omega'_2 = u \wedge \neg n \wedge \neg l \tag{29}$$

All other worlds are excluded by either $A = u$ or $C = \neg l$. Minimizing Eq. (19) over the two possible $\omega'$ worlds yields

$$\kappa_A(\omega \mid C) = \begin{cases} 0 & \text{for } \omega = \omega_1 \\ 1 & \text{for } \omega = \omega_2 \end{cases} \tag{30}$$

We see that $\omega_2 = u \wedge \neg n \wedge \neg l$ is penalized with one unit of surprise for exhibiting an unexplained change in switch type (initially believed to be normal).

It is worth noting how $\omega_1$, which originally was ruled out (with $\kappa = \infty$) by the observation $\neg l$, is suddenly reinstated after taking the action $A = u$. In fact, Eq. (19) first restores all worlds to their original $\kappa(\omega)$ value and then adjusts their value in three steps. First it excludes worlds satisfying $\neg A$, then adjusts the $\kappa(\omega)$ of the remaining worlds by an amount $\kappa(A \mid pa_A(\omega))$, and finally makes an additional adjustment for violation of persistence.

From Eqs. (26) and (28), we see that $\kappa_A(l \mid C) = 0 < \kappa(l \mid C) = \infty$, hence the action $A = u$ meets the assertability criterion of Eq. (14) and the first statement, "You ought to push the switch up", is justified. At this point, Robot 2 receives a new piece of evidence: $S = u$. As a result, $\kappa(\omega \mid \neg l)$ changes to $\kappa(\omega \mid \neg l, u)$ and the calculation of $\kappa_A(\omega \mid C)$ needs to be repeated with a new set of observations, $C = \neg l \wedge u$. Since $\kappa(\omega' \mid \neg l, u)$ permits only one possible world $\omega' = u \wedge \neg n \wedge \neg l$, the minimization of Eq. (19) can be skipped, yielding (for $A = \neg u$)

$$\kappa_A(\omega \mid C) = \begin{cases} 0 & \text{for } \omega = \neg u \wedge \neg n \wedge l \\ 1 & \text{for } \omega = \neg u \wedge n \wedge \neg l \end{cases} \tag{31}$$

which, in turn, justifies the opposite "ought" statement ("Then you ought to push the switch down"). Note that although finding a normal switch is less surprising than finding an abnormal switch, a spontaneous transition to such a state would violate persistence and is therefore penalized by obtaining a $\kappa$ of 1.

6 RELATIONS TO OTHER ACCOUNTS

6.1 DEONTIC AND PREFERENCE LOGICS

Ought statements of the pragmatic variety have been investigated in two branches of philosophy, deontic logic and preference logic. Surprisingly, despite an intense effort to establish a satisfactory account of "ought" statements (Von Wright 1963, Van Fraassen 1973, Lewis 1973), the literature of both logics is loaded with paradoxes and voids of principle. This raises the question of whether "ought" statements are destined to forever elude formalization or that the approach taken by deontic logicians has been misdirected. I believe the answer involves a combination of both.

Philosophers hoped to develop deontic logic as a separate branch of conditional logic, not as a synthetic amalgam of logics of belief, action, and causation. In other words, they have attempted to capture the meaning of "ought" using a single modal operator $O(\cdot)$, instead of exploring the couplings between "ought" and other modalities, such as belief, action, causation, and desire. The present paper shows that such an isolationistic strategy has little chance of succeeding. Whereas one can perhaps get by without explicit reference to desire, it is absolutely necessary to have both probabilistic knowledge about the effect of observations on the likelihood of events and causal knowledge about actions and their consequences.

We have seen in Section 3 that to ratify the sentence "Given C, you ought to do A", we need to know not merely the relative desirability of the worlds delineated by the propositions $A \wedge C$ and $\neg A \wedge C$, but also the feasibility or likelihood of reaching any one of those worlds in the future, after making our choice of A. We also saw that this likelihood depends critically on how C is confirmed, by observation or by action. Since this information cannot be obtained from the logical content of A and C, it is not surprising that "almost every principle which has been proposed as fundamental to a preference logic has been rejected by some other source" (Mullen 1979).

basic features of causal relationships such as asymmetry, transitivity, and complicity with temporal order. To the best of my knowledge, there has been no attempt to translate causal sentences into specifications of the Stalnaker-Lewis selection function. [9] Such specifications were partially provided in (Goldszmidt & Pearl 1992), through the imaging function $\omega^*(\omega)$, and are further refined in this paper by invoking the persistence model (Eq. (19)). Note that a directed acyclic graph is the only ingredient one needs to add to the traditional notion of epistemic state so as to specify a causality-based selection function.

From this vantage point, our calculus provides, in essence, a new account of subjunctive conditionals that is more reflective of those used in decision making. The account is based on giving the subjunctive the following causal interpretation: "Given C, if I were to perform A, then I believe B would come about", written $A > B \mid C$, which in the language of ranking function reads

$$\kappa(\neg B \mid C) = 0 \quad \text{and} \quad \kappa_A(\neg B \mid C) > 0 \tag{33}$$

The equality states that $\neg B$ is considered a serious possibility prior to performing A, while the inequality renders $\neg B$ surprising after performing A. This account, which we call Decision Making Conditionals (DMC), avoids several paradoxes of conditional log-
ics (see Nute 1992) and is further described in (Pearl In fact, the decision theoretic account embodied in Eq. 1993). (14) can be used to generate counterexamples to most of the principles suggested in the literature, simply by 6.3 OTHER DECISION THEORETIC selecting a combination of to, /_, and r that defies the ACCOUNTS proposed principle. Since any such principle must be valid in all epistemic states and Since we have enor- Poole (1992) has proposed a quantitative decision- mous freedom in choosing these three components, it theoretic account of defaults, taking the utility of A, is not surprising that only weak principles, such as given evidence e, to be O(A]C) ==_ _O(_A[C), survive the test. Among the few that do survive, we find the sure-thing principle: p(Ale) = E,o p(w, A)P(wle) (34) O(A]C) h O(A]',C) _ O(A) (32) This requires a specification of an action-dependent read as "If you ought to doA given C and you ought preference function for each (w, A) pair. Our proposal to doA given "-C, then you ought to do A without (in line with (Stalnaker 1972)) attributes the depen- examining C". But one begins to wonder about the dence of p on A to beliefs about the possible conse- value of assembling a logic from a sparse collection of quences of A, thereby keeping the utility of each conse- such impoverished survivors when, in practice, a full quence constant. In this way, we see more clearly how specification of I¢, /z, and r would be required. the structure of causal theories should affect the choice of actions. For example, suppose A and e are incom- 6.2 COUNTERFACTUAL CONDITIONALS patible ("If the light is on (e), turn it off (A)"), taking (34) literally (without introducing temporal indices) Stalnaker (1972) was the first to make the connection would yield absurd results. 
Additionally, Poole's is a between actions and counterfactual statements, and he calculus of incremental improvements of utility, while proposed using the probability of the counterfactual 9Gibbard and Harper (Gibbard &Harper 1980) develop conditional (as opposed to the conditional probability, a quantitative theory of rational decisions that is based which is more appropriate for indicative conditionals) on Stalnaker's suggestion and explicitly attributes causal in the calculation of expected utilities. Stalnaker's the- character to counterfactual conditionals. However, they ory does not provide an explicit connection between assume that probabilities of counterfactuals are given in subjunctive conditionals and causation, however. Al- axivance and do not specify either how such probabilities though the selection function used in the Stalnaker- are encoded or how they relate to probabilities of ordinary Lewis nearest-world semantics can be thought of as a propositions. Likewise, a criterion for accepting a counter- generalization of, and a surrogate for, causal knowl- factual conditional, given other counterfactuals and other edge, it is too general, as it is not constrained by the propositions, is not provided. 185 oursisconcerned with substantial improvements, as is in Proceedings of the 3rd International Confer- typical of ought statements. ence on Knowledge Representation and Reason- Boutilier (1993) has developed a modal logic account ing, Morgan Kaufmann, San Mateo, CA, 661-672, of conditional goals which embodies considerations October 1992. similar to ours. It remains to be seen whether causal Harper, L., Stalnaker, R., and Pearce, G. (Eds.), Ifs, relationships such as those governing the interplay D. Reidel Dordrecht, Holland, 1980. among actions and observations can easily be encoded into his formalism. 
Katsuno, H., and Mendelzon, A.O., "On the differ- ence between updating a knowledge base and re- vising it," in Principles of Knowledge Represen- 7 CONCLUSION tation and Reasoning: Proceedings of the end In- ternational Conference, Morgan Kaufmann, San Mateo, CA, 387-394, 1991. By pursuing the semantics of ought statements this pa- per develops an account of qualitative decision theory Lewis, D., Counlerfactuals, Harvard University and a framework for qualitative planning under uncer- Press, Cambridge, MA, 1973. tainty. The two main features of this account are: Nute, D., "Logic, conditional," in Stuart C. Shapiro 1. Order-of-magnitude specifications of probabilities (Ed.), Encyclopedia of Artificial Intelligence, 2nd and utilities are combined to produce qualitative ex- Edition, John Wiley, New York, 854-860, 1992. pected utilities of actions and consequences, condi- tioned on observations (Eq. (7)). Mullen, J.D., "Does the logic of preference rest on a 2. A single causal network, combined with universal mistake?", Metaphilosophy, 10,247-255, 1979. assumptions of persistence is sufficient for specifying Pearl, J., Probabilistic Reasoning in Intelligent Sys- the dynamics of beliefs under any sequence of actions tems, Morgan Kaufmann, San Mateo, CA, 1988. and observations (Eq. (9)). Pearl, J., "From Adams' conditionals to default ex- pressions, causal conditionals, and counterfactu- Acknowledgements als," UCLA, Technical Report (R-193), 1993. To appear in Festschrifl for Ernest Adams, Cam- This work benefitted from discussions with Craig bridge University Press, 1993. Boutilier, Adnan Darwiche, Mois6s Goldszmidt, and Sek-Wah Tan. The research was partially supported Poole, D., "Decision-theoretic defaults," 9th Cana- by Air Force grant #AFOSR 90 0136, NSF grant dian Conference on AI, Vancouver, Canada, 190- 197, May 1992. #IRI-9200918, Northrop Micro grant #92-123, and Rockwell Micro grant #92-122. 
Shachter, R.D., "Evaluating influence diagrams," Op- erations Research, 34,871-882, 1986. References Shoham, Y., Reasoning About Change: Time and Causation from the Standpoint of Artificial Intel- Alchourron, C., Gardenfors, P., and Makinson, D., ligence, MIT Press, Cambridge, MA, 1988. "On the logic of theory change: Partial meet con- traction and revision functions," Journal of Sym- Spohn, W., "Ordinal conditional functions: A dy- bolic Logic, 50, 510-530, 1985. namic theory of epistemie states," in W.L. Harper and B. Skyrms (Eds.), Causation in Decision, Be- Boutilier, C., "A modal characterization Of defeasi- lief Change, and Statistics, D. Reidel, Dordrecht, ble deontic conditionals and conditional goals," Holland, 105-134, 1988. Working Notes of the AAAI Spring Symposium Series, Stanford, CA, 30-39, March 1993. Stalnaker, R., "Letter to David Lewis" in Harper, L., Stalnaker, R., and Pearce, G. (Eds.), Ifs, D. Dean, T., and Kanazawa, K., "A model for reasoning Reidel Dordrecht, Holland, 151-152, 1980. about persistence and causation," Computational Intelligence, 5, 142-150, 1989. Van Fraassen, B.C., "The logic of conditional obliga- tion," in M. Bunge (Ed.), Exact Philosophy, D. Gibbard, A., and Harper, L., "Counterfactuals and Reidel, Dordrecht, Holland, 151-172, 1973. two kinds of expected utility," in Harper, L., Stal- Von Wright, G.H., The Logic of Preference, Uni- naker, R., and Pearce, G. (Eds.), Ifs, D. Reidel versity of Edinburgh Press, Edinburgh, Scotland, Dordrecht, Holland, 153-190, 1980. 1963. Goldszmidt, M., "Qualitative probabilities: A nor- mative framework for commonsense reasoning," Technical Report (R-190), UCLA, Ph.D. Disser- tation, October 1992. Goldszmidt, M., and Pearl, J., "Rank-based systems: A simple approach to belief revision, belief up- date, and reasoning about evidence and actions," 186
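The arithmetic of Example 2 (the switch and the light) can be checked mechanically. The following is a minimal Python sketch, not the paper's own procedure: it does not implement the general update of Eq. (19), only the specialised form of Eq. (28), and it assumes that the action itself is unsurprising ($\kappa(A \mid pa_A) = 0$) for both $A = u$ and $A = \neg u$. All names (`kappa`, `condition`, `act`, `rank_of`) are hypothetical and chosen for illustration; worlds are encoded as boolean triples (S, T, L) with `True` standing for $u$, $n$, and $l$ respectively.

```python
import math
from itertools import product

INF = math.inf
# A world is a triple (S, T, L): S=True is "up", T=True is "normal", L=True is "on".
WORLDS = list(product([True, False], repeat=3))


def kappa(world):
    """Initial ranking: Eq. (24) as a hard constraint, plus kappa(not n) = 1."""
    s, t, l = world
    if l != (s == t):          # l = (n and u) or (not n and not u)
        return INF
    return 0 if t else 1


def condition(evidence):
    """kappa(w | C): shift ranks so the best world satisfying C gets rank 0."""
    k_c = min(kappa(w) for w in WORLDS if evidence(w))
    return {w: (kappa(w) - k_c if evidence(w) else INF) for w in WORLDS}


def act(switch_up, posterior):
    """Specialised Eq. (28); assumes kappa(A | pa_A) = 0 (illustrative assumption)."""
    result = {}
    for w in WORLDS:
        s, t, _ = w
        if s != switch_up or kappa(w) == INF:
            result[w] = INF    # excluded by the action or by Eq. (24)
            continue
        k_type = 0 if t else 1                       # kappa(T(w))
        # One unit of surprise whenever the switch type differs from the pre-action world.
        best = min((t != wp[1]) + posterior[wp] for wp in WORLDS)
        result[w] = kappa(w) - k_type + best
    return result


def rank_of(ranking, proposition):
    """Rank of a proposition: the minimum rank over the worlds satisfying it."""
    return min(ranking[w] for w in WORLDS if proposition(w))


def light_on(w):
    return w[2]


# First observation: C = "light is off"; candidate action A = "push up".
post1 = condition(lambda w: not w[2])
after_up = act(True, post1)

# Second observation: C = "light is off and switch is up"; candidate action A = "push down".
post2 = condition(lambda w: (not w[2]) and w[0])
after_down = act(False, post2)

print(rank_of(after_up, light_on), rank_of(post1, light_on))
print(rank_of(after_down, light_on), rank_of(post2, light_on))
```

Under these assumptions, `after_up` assigns rank 0 to $u \wedge n \wedge l$ and rank 1 to $u \wedge \neg n \wedge \neg l$, as in Eq. (30), while `after_down` assigns rank 1 to $\neg u \wedge n \wedge \neg l$ and rank 0 to $\neg u \wedge \neg n \wedge l$. In both cases the rank of "light on" drops from $\infty$ to 0 after the action, which is the comparison the example uses to justify first "push the switch up" and then "push the switch down".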
