An Argumentation-Based Framework to Address the Attribution Problem in Cyber-Warfare

Paulo Shakarian (1), Gerardo I. Simari (2), Geoffrey Moores (1), Simon Parsons (3), Marcelo A. Falappa (4)

(1) Dept. of Electrical Engineering and Computer Science, U.S. Military Academy, West Point, NY
(2) Dept. of Computer Science, University of Oxford, Oxford, UK
(3) Dept. of Computer Science, University of Liverpool, Liverpool, UK
(4) Dep. de Cs. e Ing. de la Computación, Univ. Nac. del Sur, Bahía Blanca, Argentina and CONICET

[email protected], [email protected], geoff[email protected], [email protected], [email protected]

Abstract

Attributing a cyber-operation through the use of multiple pieces of technical evidence (i.e., malware reverse-engineering and source tracking) and conventional intelligence sources (i.e., human or signals intelligence) is a difficult problem not only due to the effort required to obtain evidence, but the ease with which an adversary can plant false evidence. In this paper, we introduce a formal reasoning system called the InCA (Intelligent Cyber Attribution) framework that is designed to aid an analyst in the attribution of a cyber-operation even when the available information is conflicting and/or uncertain. Our approach combines argumentation-based reasoning, logic programming, and probabilistic models to not only attribute an operation but also explain to the analyst why the system reaches its conclusions.

1 Introduction

An important issue in cyber-warfare is the puzzle of determining who was responsible for a given cyber-operation – be it an incident of attack, reconnaissance, or information theft. This is known as the "attribution problem" [1]. The difficulty of this problem stems not only from the amount of effort required to find forensic clues but also the ease with which an attacker can plant false clues to mislead security personnel. Further, while techniques such as forensics and reverse-engineering [2], source tracking [3], honeypots [4], and sinkholing [5] are commonly employed to find evidence that can lead to attribution, it is unclear how this evidence is to be combined and reasoned about.
In a military setting, such evidence is augmented with normal intelligence collection, such as human intelligence (HUMINT), signals intelligence (SIGINT), and other means – this adds further complications to the task of attributing a given operation. Essentially, cyber-attribution is a highly technical intelligence analysis problem where an analyst must consider a variety of sources, each with its associated level of confidence, to provide a decision maker (e.g., a military commander) insight into who conducted a given operation.

As it is well known that people's ability to conduct intelligence analysis is limited [6], and due to the highly technical nature of many cyber evidence-gathering techniques, an automated reasoning system would be best suited for the task. Such a system must be able to accomplish several goals, among which we distinguish the following main capabilities:

1. Reason about evidence in a formal, principled manner, i.e., relying on strong mathematical foundations.
2. Consider evidence for cyber attribution associated with some level of probabilistic uncertainty.
3. Consider logical rules that allow for the system to draw conclusions based on certain pieces of evidence and iteratively apply such rules.
4. Consider pieces of information that may not be compatible with each other, decide which information is most relevant, and express why.
5. Attribute a given cyber-operation based on the above-described features and provide the analyst with the ability to understand how the system arrived at that conclusion.

In this paper we present the InCA (Intelligent Cyber Attribution) framework, which meets all of the above qualities. Our approach relies on several techniques from the artificial intelligence community, including argumentation, logic programming, and probabilistic reasoning. We first outline the underlying mathematical framework and provide examples based on real-world cases of cyber-attribution (cf. Section 2); then, in Sections 3 and 4, we formally present InCA and attribution queries, respectively. Finally, we discuss conclusions and future work in Section 5.

2 Two Kinds of Models

Our approach relies on two separate models of the world. The first, called the environmental model (EM), is used to describe the background knowledge and is probabilistic in nature. The second one, called the analytical model (AM), is used to analyze competing hypotheses that can account for a given phenomenon (in this case, a cyber-operation). The EM must be consistent – this simply means that there must exist a probability distribution over the possible states of the world that satisfies all of the constraints in the model, as well as the axioms of probability theory. On the contrary, the AM will allow for contradictory information, as the system must have the capability to reason about competing explanations for a given cyber-operation. In general, the EM contains knowledge such as evidence, intelligence reporting, or knowledge about actors, software, and systems. The AM, on the other hand, contains ideas the analyst concludes based on the information in the EM. Figure 1 gives some examples of the types of information in the two models.

  EM: "Malware X was compiled on a system using the English language."
  AM: "Malware X was compiled on a system in English-speaking country Y."

  EM: "Malware W and malware X were created in a similar coding style."
  AM: "Malware W and malware X are related."

  EM: "Country Y and country Z are currently at war."
  AM: "Country Y has a motive to launch a cyber-attack against country Z."

  EM: "Country Y has a significant investment in math-science-engineering (MSE) education."
  AM: "Country Y has the capability to conduct a cyber-attack."

Figure 1: Example observations – EM vs. AM.
Note that an analyst (or automated system) could assign a probability to statements in the EM column, whereas statements in the AM column can be true or false depending on a certain combination (or several possible combinations) of statements from the EM. We now formally describe these two models as well as a technique for annotating knowledge in the AM with information from the EM – these annotations specify the conditions under which the various statements in the AM can potentially be true.

Before describing the two models in detail, we first introduce the language used to describe them. Variable and constant symbols represent items such as computer systems, types of cyber operations, actors (e.g., nation states, hacking groups), and other technical and/or intelligence information. The set of all variable symbols is denoted with V, and the set of all constants is denoted with C. For our framework, we shall require two subsets of C, C_act and C_ops, that specify the actors that could conduct cyber-operations and the operations themselves, respectively. In the examples in this paper, we will use capital letters to represent variables (e.g., X, Y, Z). The constants in C_act and C_ops that we use in the running example are specified in the following example.

Example 2.1 The following (fictitious) actors and cyber-operations will be used in our examples:

  C_act = {baja, krasnovia, mojave}    (1)
  C_ops = {worm123}    (2)
□

The next component in the model is a set of predicate symbols. These constructs can accept zero or more variables or constants as arguments, and map to either true or false. Note that the EM and AM use separate sets of predicate symbols – however, they can share variables and constants. The sets of predicates for the EM and AM are denoted with P_EM and P_AM, respectively. In InCA, we require P_AM to include the binary predicate condOp(X,Y), where X is an actor and Y is a cyber-operation. Intuitively, this means that actor X conducted operation Y. For instance, condOp(baja, worm123) is true if baja was responsible for cyber-operation worm123. A sample set of predicate symbols for the analysis of a cyber attack between two states over contention of a particular industry is shown in Figure 2; these will be used in examples throughout the paper.

A construct formed with a predicate and constants as arguments is known as a ground atom (we shall often deal with ground atoms). The sets of all ground atoms for EM and AM are denoted with G_EM and G_AM, respectively.

Example 2.2 The following are examples of ground atoms over the predicates given in Figure 2.

  G_EM: origIP(mw123sam1, krasnovia), mwHint(mw123sam1, krasnovia), inLgConf(krasnovia, baja), mseTT(krasnovia, 2)

  G_AM: evidOf(mojave, worm123), motiv(baja, krasnovia), expCw(baja), tgt(krasnovia, worm123)
□

For a given set of ground atoms, a world is a subset of the atoms that are considered to be true (ground atoms not in the world are false). Hence, there are 2^|G_EM| possible worlds in the EM and 2^|G_AM| worlds in the AM, denoted with W_EM and W_AM, respectively.

Clearly, even a moderate number of ground atoms can yield an enormous number of worlds to explore. One way to reduce the number of worlds is to include integrity constraints, which allow us to eliminate certain worlds from consideration – they simply are not possible in the setting being modeled. Our principal integrity constraint will be of the form:

  oneOf(A′)

where A′ is a subset of ground atoms. Intuitively, this says that any world where more than one of the atoms from set A′ appear is invalid. Let IC_EM and IC_AM be the sets of integrity constraints for the EM and AM, respectively, and let the sets of worlds that conform to these constraints be W_EM(IC_EM) and W_AM(IC_AM), respectively.

Atoms can also be combined into formulas using standard logical connectives: conjunction (and), disjunction (or), and negation (not). These are written using the symbols ∧, ∨, ¬, respectively. We say a world w satisfies a formula f, written w |= f, based on the following inductive definition:

• if f is a single atom, then w |= f iff f ∈ w;
• if f = ¬f′ then w |= f iff w ⊭ f′;
• if f = f′ ∧ f′′ then w |= f iff w |= f′ and w |= f′′; and
• if f = f′ ∨ f′′ then w |= f iff w |= f′ or w |= f′′.
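To make these semantics concrete, the following is a minimal sketch (in Python, which we use for all illustrative sketches in this paper; the encoding of worlds as sets of strings and formulas as nested tuples is our own, not part of InCA) of the satisfaction relation and the oneOf integrity constraint just defined.

```python
# Worlds are sets of ground atoms (strings); formulas are atoms or
# nested tuples ("not", f), ("and", f1, f2), ("or", f1, f2).
def satisfies(w, f):
    """Inductive definition of w |= f from Section 2."""
    if isinstance(f, str):                      # single atom
        return f in w
    op = f[0]
    if op == "not":
        return not satisfies(w, f[1])
    if op == "and":
        return satisfies(w, f[1]) and satisfies(w, f[2])
    if op == "or":
        return satisfies(w, f[1]) or satisfies(w, f[2])
    raise ValueError(f"unknown connective: {op}")

def one_of(w, A):
    """oneOf(A'): a world is valid iff at most one atom of A' holds."""
    return len(w & A) <= 1

w = {"origIP(mw123sam1,krasnovia)", "mseTT(krasnovia,2)"}
f = ("and", "mseTT(krasnovia,2)", ("not", "inLgConf(krasnovia,baja)"))
print(satisfies(w, f))  # True: both conjuncts hold in w
```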
We use the notation formula_EM and formula_AM to denote the set of all possible (ground) formulas in the EM and AM, respectively. Also, note that we use the notation ⊤, ⊥ to represent tautologies (formulas that are true in all worlds) and contradictions (formulas that are false in all worlds), respectively.

P_EM:
  origIP(M,X): Malware M originated from an IP address belonging to actor X.
  malwInOp(M,O): Malware M was used in cyber-operation O.
  mwHint(M,X): Malware M contained a hint that it was created by actor X.
  compilLang(M,C): Malware M was compiled in a system that used language C.
  nativLang(X,C): Language C is the native language of actor X.
  inLgConf(X,X′): Actors X and X′ are in a larger conflict with each other.
  mseTT(X,N): There are at least N top-tier math-science-engineering universities in country X.
  infGovSys(X,M): Systems belonging to actor X were infected with malware M.
  cybCapAge(X,N): Actor X has had a cyber-warfare capability for N years or less.
  govCybLab(X): Actor X has a government cyber-security lab.

P_AM:
  condOp(X,O): Actor X conducted cyber-operation O.
  evidOf(X,O): There is evidence that actor X conducted cyber-operation O.
  motiv(X,X′): Actor X had a motive to launch a cyber-attack against actor X′.
  isCap(X,O): Actor X is capable of conducting cyber-operation O.
  tgt(X,O): Actor X was the target of cyber-operation O.
  hasMseInvest(X): Actor X has a significant investment in math-science-engineering education.
  expCw(X): Actor X has experience in conducting cyber-operations.

Figure 2: Predicate definitions for the environmental and analytical models in the running example.

2.1 Environmental Model

In this section we describe the first of the two models, namely the EM or environmental model. This model is largely based on the probabilistic logic of [7], which we now briefly review.

First, we define a probabilistic formula, which consists of a formula f over atoms from G_EM, a real number p in the interval [0,1], and an error tolerance ǫ ∈ [0, min(p, 1−p)]. A probabilistic formula is written as f : p ± ǫ. Intuitively, this statement is interpreted as "formula f is true with probability between p − ǫ and p + ǫ" – note that we make no statement about the probability distribution over this interval. The uncertainty regarding the probability values stems from the fact that certain assumptions (such as probabilistic independence) may not be suitable in the environment being modeled.
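As a small illustration of the interval semantics, the sketch below (our own encoding, anticipating the satisfaction test for distributions that is formalized later in this section) checks whether a fully specified distribution over worlds satisfies f : p ± ǫ.

```python
# A probabilistic formula f : p +/- eps is a triple (sat, p, eps), where
# sat maps a world to True iff the world satisfies f.
def pr_satisfies(Pr, sat, p, eps):
    """Pr is a dict: world (frozenset of atoms) -> probability.
    The total mass of worlds satisfying f must land in [p-eps, p+eps]."""
    mass = sum(prob for w, prob in Pr.items() if sat(w))
    return p - eps <= mass <= p + eps

w1, w2 = frozenset({"govCybLab(baja)"}), frozenset()
Pr = {w1: 0.85, w2: 0.15}
print(pr_satisfies(Pr, lambda w: "govCybLab(baja)" in w, 0.8, 0.1))  # True
```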
Example 2.3 To continue our running example, consider the following set Π_EM:

  f1 = govCybLab(baja) : 0.8 ± 0.1
  f2 = cybCapAge(baja, 5) : 0.2 ± 0.1
  f3 = mseTT(baja, 2) : 0.8 ± 0.1
  f4 = mwHint(mw123sam1, mojave) ∧ compilLang(worm123, english) : 0.7 ± 0.2
  f5 = malwInOp(mw123sam1, worm123) ∧ malwareRel(mw123sam1, mw123sam2) ∧ mwHint(mw123sam2, mojave) : 0.6 ± 0.1
  f6 = inLgConf(baja, krasnovia) ∨ ¬cooper(baja, krasnovia) : 0.9 ± 0.1
  f7 = origIP(mw123sam1, baja) : 1 ± 0

Throughout the paper, let Π′_EM = {f1, f2, f3}. □

We now consider a probability distribution Pr over the set W_EM(IC_EM). We say that Pr satisfies probabilistic formula f : p ± ǫ iff the following holds:

  p − ǫ ≤ Σ_{w ∈ W_EM(IC_EM) s.t. w |= f} Pr(w) ≤ p + ǫ.

A set Π_EM of probabilistic formulas is called a knowledge base. We say that a probability distribution over W_EM(IC_EM) satisfies Π_EM if and only if it satisfies all probabilistic formulas in Π_EM.

It is possible to create probabilistic knowledge bases for which there is no satisfying probability distribution. The following is a simple example of this (a conjunction cannot be more probable than the corresponding disjunction):

  condOp(krasnovia, worm123) ∨ condOp(baja, worm123) : 0.4 ± 0;
  condOp(krasnovia, worm123) ∧ condOp(baja, worm123) : 0.6 ± 0.1.

Formulas and knowledge bases of this sort are inconsistent. In this paper, we assume that information is properly extracted from a set of historic data and hence consistent (recall that inconsistent information can only be handled in the AM, not the EM). A consistent knowledge base could also be obtained as a result of curation by experts, such that all inconsistencies were removed – see [8, 9] for algorithms for learning rules of this type.

The main kind of query that we require for the probabilistic model is the maximum entailment problem: given a knowledge base Π_EM and a (non-probabilistic) formula q, identify p, ǫ such that all valid probability distributions Pr that satisfy Π_EM also satisfy q : p ± ǫ, and there does not exist p′, ǫ′ s.t. [p − ǫ, p + ǫ] ⊃ [p′ − ǫ′, p′ + ǫ′], where all probability distributions Pr that satisfy Π_EM also satisfy q : p′ ± ǫ′. That is, given q, can we determine the probability (with maximum tolerance) of statement q given the information in Π_EM? The approach adopted in [7] to solve this problem works as follows. First, we must solve the linear program defined next.

Definition 2.1 (EP-LP-MIN) Given a knowledge base Π_EM and a formula q:

• create a variable x_i for each w_i ∈ W_EM(IC_EM);
• for each f_j : p_j ± ǫ_j ∈ Π_EM, create the constraint:

    p_j − ǫ_j ≤ Σ_{w_i ∈ W_EM(IC_EM) s.t. w_i |= f_j} x_i ≤ p_j + ǫ_j;

• finally, we also have the constraint:

    Σ_{w_i ∈ W_EM(IC_EM)} x_i = 1.

The objective is to minimize the function:

    Σ_{w_i ∈ W_EM(IC_EM) s.t. w_i |= q} x_i.

We use the notation EP-LP-MIN(Π_EM, q) to refer to the value of the objective function in the solution to the EP-LP-MIN constraints.

Let ℓ be the result of the process described in Definition 2.1. The next step is to solve the linear program a second time, but instead maximizing the objective function (we shall refer to this as EP-LP-MAX) – let u be the result of this operation. In [7], it is shown that ǫ = (u − ℓ)/2 and p = ℓ + ǫ is the solution to the maximum entailment problem. We note that although the above linear program has an exponential number of variables in the worst case (i.e., no integrity constraints), the presence of constraints has the potential to greatly reduce this space. Further, there are also good heuristics (cf. [8, 10]) that have been shown to provide highly accurate approximations with a reduced-size linear program.

Example 2.4 Consider KB Π′_EM from Example 2.3 and a set of ground atoms restricted to those that appear in that program. Hence, we have:

  w1 = {govCybLab(baja), cybCapAge(baja,5), mseTT(baja,2)}
  w2 = {govCybLab(baja), cybCapAge(baja,5)}
  w3 = {govCybLab(baja), mseTT(baja,2)}
  w4 = {cybCapAge(baja,5), mseTT(baja,2)}
  w5 = {cybCapAge(baja,5)}
  w6 = {govCybLab(baja)}
  w7 = {mseTT(baja,2)}
  w8 = ∅

and suppose we wish to compute the probability for the formula:

  q = govCybLab(baja) ∨ mseTT(baja,2).

For each formula in Π′_EM we have a constraint, and for each world above we have a variable. An objective function is created based on the worlds that satisfy the query formula (here, worlds w1–w4, w6, w7). Hence, EP-LP-MIN(Π′_EM, q) can be written as:

  min x1 + x2 + x3 + x4 + x6 + x7 w.r.t.:
  0.7 ≤ x1 + x2 + x3 + x6 ≤ 0.9
  0.1 ≤ x1 + x2 + x4 + x5 ≤ 0.3
  0.8 ≤ x1 + x3 + x4 + x7 ≤ 1
  x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 = 1

We can now solve EP-LP-MIN(Π′_EM, q) and EP-LP-MAX(Π′_EM, q) to get the solution 0.9 ± 0.1. □
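The linear programs of Definition 2.1 are small enough in the running example to solve directly. The sketch below reproduces Example 2.4; the use of scipy is our own choice (any LP solver would do), and it recovers the bounds [0.8, 1.0], hence 0.9 ± 0.1.

```python
# A minimal sketch of EP-LP-MIN / EP-LP-MAX for Example 2.4.
from itertools import combinations
from scipy.optimize import linprog

atoms = ["govCybLab(baja)", "cybCapAge(baja,5)", "mseTT(baja,2)"]
worlds = [frozenset(c) for r in range(len(atoms) + 1)
          for c in combinations(atoms, r)]                 # 2^3 = 8 worlds

kb = [(lambda w: "govCybLab(baja)" in w, 0.8, 0.1),        # f1
      (lambda w: "cybCapAge(baja,5)" in w, 0.2, 0.1),      # f2
      (lambda w: "mseTT(baja,2)" in w, 0.8, 0.1)]          # f3

def q(w):  # query q = govCybLab(baja) v mseTT(baja,2)
    return "govCybLab(baja)" in w or "mseTT(baja,2)" in w

obj = [1.0 if q(w) else 0.0 for w in worlds]   # sum of x_i over worlds |= q
A_ub, b_ub = [], []
for sat, p, tol in kb:                         # p-tol <= row.x <= p+tol
    row = [1.0 if sat(w) else 0.0 for w in worlds]
    A_ub.append(row)
    b_ub.append(p + tol)
    A_ub.append([-v for v in row])
    b_ub.append(-(p - tol))
A_eq, b_eq = [[1.0] * len(worlds)], [1.0]      # distribution sums to 1

lo = linprog(obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq).fun
hi = -linprog([-v for v in obj], A_ub=A_ub, b_ub=b_ub,
              A_eq=A_eq, b_eq=b_eq).fun        # maximize = minimize negation
eps = (hi - lo) / 2
print(f"P(q) = {lo + eps:.2f} +/- {eps:.2f}")  # expected: 0.90 +/- 0.10
```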
2.2 Analytical Model

For the analytical model (AM), we choose a structured argumentation framework [11] due to several characteristics that make such frameworks highly applicable to cyber-warfare domains. Unlike the EM, which describes probabilistic information about the state of the real world, the AM must allow for competing ideas – it must be able to represent contradictory information. The algorithmic approach allows for the creation of arguments based on the AM that may "compete" with each other to describe who conducted a given cyber-operation. In this competition – known as a dialectical process – one argument may defeat another based on a comparison criterion that determines the prevailing argument. Resulting from this process, the InCA framework will determine arguments that are warranted (those that are not defeated by other arguments), thereby providing a suitable explanation for a given cyber-operation.

The transparency provided by the system can allow analysts to identify potentially incorrect input information and fine-tune the models or, alternatively, collect more information. In short, argumentation-based reasoning has been studied as a natural way to manage a set of inconsistent information – it is the way humans settle disputes. As we will see, another desirable characteristic of (structured) argumentation frameworks is that, once a conclusion is reached, we are left with an explanation of how we arrived at it and information about why a given argument is warranted; this is very important information for analysts to have. In this section, we recall some preliminaries of the underlying argumentation framework used, and then introduce the analytical model (AM).

Defeasible Logic Programming with Presumptions

DeLP with Presumptions (PreDeLP) [12] is a formalism combining Logic Programming with Defeasible Argumentation. We now briefly recall the basics of PreDeLP; we refer the reader to [13, 12] for the complete presentation. The formalism contains several different constructs: facts, presumptions, strict rules, and defeasible rules. Facts are statements about the analysis that can always be considered to be true, while presumptions are statements that may or may not be true. Strict rules specify logical consequences of a set of facts or presumptions (similar to an implication, though not the same) that must always occur, while defeasible rules specify logical consequences that may be assumed to be true when no contradicting information is present. These constructs are used in the construction of arguments, and are part of a PreDeLP program, which is a set of facts, strict rules, presumptions, and defeasible rules. Formally, we use the notation Π_AM = (Θ, Ω, Φ, ∆) to denote a PreDeLP program, where Ω is the set of strict rules, Θ is the set of facts, ∆ is the set of defeasible rules, and Φ is the set of presumptions. In Figure 3, we provide an example Π_AM. We now describe each of these constructs in detail.

Θ:  θ1a = evidOf(baja, worm123)
    θ1b = evidOf(mojave, worm123)
    θ2  = motiv(baja, krasnovia)

Ω:  ω1a = ¬condOp(baja, worm123) ← condOp(mojave, worm123)
    ω1b = ¬condOp(mojave, worm123) ← condOp(baja, worm123)
    ω2a = condOp(baja, worm123) ← evidOf(baja, worm123), isCap(baja, worm123), motiv(baja, krasnovia), tgt(krasnovia, worm123)
    ω2b = condOp(mojave, worm123) ← evidOf(mojave, worm123), isCap(mojave, worm123), motiv(mojave, krasnovia), tgt(krasnovia, worm123)

Φ:  φ1 = hasMseInvest(baja) –≺
    φ2 = tgt(krasnovia, worm123) –≺
    φ3 = ¬expCw(baja) –≺

∆:  δ1a = condOp(baja, worm123) –≺ evidOf(baja, worm123)
    δ1b = condOp(mojave, worm123) –≺ evidOf(mojave, worm123)
    δ2  = condOp(baja, worm123) –≺ isCap(baja, worm123)
    δ3  = condOp(baja, worm123) –≺ motiv(baja, krasnovia), tgt(krasnovia, worm123)
    δ4  = isCap(baja, worm123) –≺ hasMseInvest(baja)
    δ5a = ¬isCap(baja, worm123) –≺ ¬expCw(baja)
    δ5b = ¬isCap(mojave, worm123) –≺ ¬expCw(mojave)

Figure 3: A ground argumentation framework.

Facts (Θ) are ground literals representing atomic information or its negation, using strong negation "¬". Note that all of the literals in our framework must be formed with a predicate from the set P_AM. Note that information in this form cannot be contradicted.

Strict Rules (Ω) represent non-defeasible cause-and-effect information that resembles an implication (though the semantics is different, since the contrapositive does not hold) and are of the form L0 ← L1, ..., Ln, where L0 is a ground literal and {L_i}_{i>0} is a set of ground literals.
Presumptions (Φ) are ground literals of the same form as facts, except that they are not taken as being true but rather defeasible, which means that they can be contradicted. Presumptions are denoted in the same manner as facts, except that the symbol –≺ is added. While any literal can be used as a presumption in InCA, we specifically require all literals created with the predicate condOp to be defeasible.

Defeasible Rules (∆) represent tentative knowledge that can be used if nothing can be posed against it. Just as presumptions are the defeasible counterpart of facts, defeasible rules are the defeasible counterpart of strict rules. They are of the form L0 –≺ L1, ..., Ln, where L0 is a ground literal and {L_i}_{i>0} is a set of ground literals. Note that with both strict and defeasible rules, strong negation is allowed in the head of rules, and hence may be used to represent contradictory knowledge.

Even though the above constructs are ground, we allow for schematic versions with variables that are used to represent sets of ground rules. We denote variables with strings starting with an uppercase letter; Figure 4 shows a non-ground example.

Θ:  θ1 = evidOf(baja, worm123)
    θ2 = motiv(baja, krasnovia)

Ω:  ω1 = ¬condOp(X, O) ← condOp(X′, O), X ≠ X′
    ω2 = condOp(X, O) ← evidOf(X, O), isCap(X, O), motiv(X, X′), tgt(X′, O), X ≠ X′

Φ:  φ1 = hasMseInvest(baja) –≺
    φ2 = tgt(krasnovia, worm123) –≺
    φ3 = ¬expCw(baja) –≺

∆:  δ1 = condOp(X, O) –≺ evidOf(X, O)
    δ2 = condOp(X, O) –≺ isCap(X, O)
    δ3 = condOp(X, O) –≺ motiv(X, X′), tgt(X′, O)
    δ4 = isCap(X, O) –≺ hasMseInvest(X)
    δ5 = ¬isCap(X, O) –≺ ¬expCw(X)

Figure 4: A non-ground argumentation framework.

When a cyber-operation occurs, InCA must derive arguments as to who could have potentially conducted the action. Derivation follows the same mechanism of Logic Programming [14]. Since rule heads can contain strong negation, it is possible to defeasibly derive contradictory literals from a program. For the treatment of contradictory knowledge, PreDeLP incorporates a defeasible argumentation formalism that allows the identification of the pieces of knowledge that are in conflict, and through the previously mentioned dialectical process decides which information prevails as warranted.

This dialectical process involves the construction and evaluation of arguments that either support or interfere with a given query, building a dialectical tree in the process. Formally, we have:

Definition 2.2 (Argument) An argument ⟨A, L⟩ for a literal L is a pair of the literal and a (possibly empty) set of elements of the AM (A ⊆ Π_AM) that provides a minimal proof for L meeting the requirements: (1.) L is defeasibly derived from A, (2.) Ω ∪ Θ ∪ A is not contradictory, and (3.) A is a minimal subset of ∆ ∪ Φ satisfying 1 and 2, denoted ⟨A, L⟩.

Literal L is called the conclusion supported by the argument, and A is the support of the argument. An argument ⟨B, L⟩ is a subargument of ⟨A, L′⟩ iff B ⊆ A. An argument ⟨A, L⟩ is presumptive iff A ∩ Φ is not empty. We will also use Ω(A) = A ∩ Ω, Θ(A) = A ∩ Θ, ∆(A) = A ∩ ∆, and Φ(A) = A ∩ Φ.
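To illustrate the constructs just defined, the following sketch (our own simplified encoding; literals are strings, strong negation is written "~", and the program fragment abbreviates Figure 3) performs defeasible derivation by naive forward chaining, showing that contradictory literals can indeed be derived from a single program.

```python
facts        = {"evidOf(baja,worm123)", "motiv(baja,krasnovia)"}   # Theta
presumptions = {"hasMseInvest(baja)", "~expCw(baja)"}              # Phi
strict_rules = []                                                  # Omega
defeasible_rules = [                                               # Delta
    ("condOp(baja,worm123)", ["evidOf(baja,worm123)"]),            # delta1a
    ("isCap(baja,worm123)",  ["hasMseInvest(baja)"]),              # delta4
    ("~isCap(baja,worm123)", ["~expCw(baja)"]),                    # delta5a
]

def derive(base):
    """All literals defeasibly derivable from base by forward chaining."""
    known, changed = set(base), True
    while changed:
        changed = False
        for head, body in strict_rules + defeasible_rules:
            if head not in known and all(b in known for b in body):
                known.add(head)
                changed = True
    return known

lits = derive(facts | presumptions)
print("isCap(baja,worm123)" in lits, "~isCap(baja,worm123)" in lits)
# True True: both a literal and its negation are derivable; it is the
# dialectical process, not derivation, that decides which one prevails.
```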
We will also δ5 = ¬isCap(X,O)–≺¬expCw(X) use Ω(A) = A∩Ω, Θ(A) = A∩Θ, ∆(A) = A∩∆, and Φ(A)=A∩Φ. Figure 4: A non-ground argumentation framework. hA1,condOp(baja,worm123)i A1 ={θ1a,δ1a} 2. There is at least one set H′ ⊆ F, Ω(A1)∪Ω(A2)∪ hA ,condOp(baja,worm123)i A ={φ ,φ ,δ ,ω , H′ is non-contradictory, suchthat thereis a derivation 2 2 1 2 4 2a θ1a,θ2} for L2 from Ω(A1)∪Ω(A2)∪H′∪∆(A2), there is no hA3,condOp(baja,worm123)i A3 ={φ1,δ2,δ4} derivation forL2 fromΩ(A1)∪Ω(A2)∪H′,andthereis hA4,condOp(baja,worm123)i A4 ={φ2,δ3,θ2} noderivation for L1 from Ω(A1)∪Ω(A2)∪H′∪∆(A1). hA ,isCap(baja,worm123)i A ={φ ,δ } 5 5 1 4 Intuitively, the principle of specificity says that, in the hA ,¬condOp(baja,worm123)i A ={δ ,θ ,ω } 6 6 1b 1b 1a presenceoftwoconflictinglinesofargumentaboutapropo- hA ,¬isCap(baja,worm123)i A ={φ ,δ } 7 7 3 5a sition,theonethatusesmoreoftheavailableinformationis moreconvincing. Aclassicexampleinvolvesabird,Tweety, Figure 5: Example ground arguments from Figure 3. and arguments stating that it both flies (because it is a bird) and doesn’t fly (because it is a penguin). The latter Note that our definition differs slightly from that of [15] argumentuses more information about Tweety – it is more where DeLP is introduced, as we include strict rules and specific – and is thus the stronger of the two. facts as part of the argument. The reason for this will be- come clear in Section 3. Arguments for our scenario are Definition 2.4 ([12]) Let ΠAM = (Θ,Ω,Φ,∆) be a Pre- shown in the following example. DeLP program. An argument hA1,L1i is preferred to hA ,L i, denoted with A ≻ A iff any of the following 2 2 1 2 Example 2.5 Figure 5 shows example arguments based on conditions hold: the knowledge base from Figure 3. Note that the following relationship exists: 1. hA ,L i and hA ,L i are both factual arguments and 1 1 2 2 hA ,L i≻ hA ,L i. hA ,isCap(baja,worm123)i is a sub-argument of 1 1 PS 2 2 5 hA ,condOp(baja,worm123)i and 2 2. hA ,L i is a factual argument and hA ,L i is a pre- 1 1 2 2 hA3,condOp(baja,worm123)i. (cid:4) sumptive argument. Given argument hA ,L i, counter-arguments are argu- 1 1 3. hA ,L i and hA ,L i are presumptive arguments, and ments that contradict it. Argument hA ,L i counterargues 1 1 2 2 2 2 or attacks hA ,L i literal L′ iff there exists a subargument 1 1 (a) ¬(Φ(A )⊆Φ(A )), or 1 2 hA,L′′iofhA ,L is.t.setΩ(A )∪Ω(A )∪Θ(A )∪Θ(A )∪ 1 1 1 2 1 2 (b) Φ(A )=Φ(A ) and hA ,L i≻ hA ,L i. {L ,L′′} is contradictory. 1 2 1 1 PS 2 2 2 Example 2.6 Consider the arguments from Example 2.5. Generally,ifA,B areargumentswithrules X andY, resp., The following are some of the attack relationships between andX ⊂Y,thenAisstrongerthanB. Thisalsoholdswhen them: A1, A2, A3, and A4 all attack A6; A5 attacks A7; A and B use presumptions P1 and P2, resp., and P1 ⊂P2. and A7 attacks A2. (cid:4) Example 2.7 The following are relationships between ar- A proper defeater of an argument hA,Li is a counter- guments from Example 2.5, based on Definitions 2.3 argument that – by some criterion – is considered to be and 2.4: better than hA,Li; if the two are incomparable according A and A are incomparable (blocking defeaters); 1 6 to this criterion, the counterargumentis said to be a block- A ≻A , and thus A defeats A ; 6 2 6 2 ing defeater. 
An important characteristic of PreDeLP is A ≻A , and thus A defeats A ; 6 3 6 3 that the argument comparison criterion is modular, and A ≻A , and thus A defeats A ; 6 4 6 4 thus the most appropriate criterion for the domain that A5 and A7 are incomparable (blocking defeaters). (cid:4) is being represented can be selected; the default criterion used in classical defeasible logic programming (from which A sequence of arguments called an argumentation line PreDeLP is derived) is generalized specificity [16], though thus arises from this attack relation, where each argument an extension of this criterion is required for arguments us- defeats its predecessor. To avoid undesirable sequences, ing presumptions [12]. We briefly recall this criterion next that may represent circular or fallacious argumentation – the first definition is for generalized specificity, which is lines,inDeLPanargumentation line isacceptable ifitsat- subsequently used in the definition of presumption-enabled isfies certain constraints (see [13]). A literal L is warranted specificity. if there exists a non-defeated argument A supporting L. Clearly, there can be more than one defeater for a par- Definition 2.3 Let Π = (Θ,Ω,Φ,∆) be a PreDeLP AM ticular argument hA,Li. Therefore, many acceptable argu- program andlet F betheset of all literals that have adefea- mentation lines could arise from hA,Li, leading to a tree sible derivation from Π . An argument hA ,L i is pre- AM 1 1 structure. The tree is built from the set of all argumenta- ferred to hA ,L i, denoted with A ≻ A iff the two 2 2 1 PS 2 tion lines rooted in the initial argument. In a dialectical following conditions hold: tree, every node (except the root) represents a defeater of 1. ForallH ⊆F,Ω(A )∪Ω(A )∪H isnon-contradictory: its parent, and leaves correspondto undefeated arguments. 1 2 if there is a derivation for L from Ω(A )∪Ω(A )∪ Eachpathfromthe rootto aleafcorrespondsto adifferent 1 2 1 ∆(A ) ∪ H, and there is no derivation for L from acceptable argumentation line. A dialectical tree provides 1 1 Ω(A )∪Ω(A )∪H, then there is a derivation for L a structure for considering allthe possible acceptable argu- 1 2 2 from Ω(A )∪Ω(A )∪∆(A )∪H. mentationlines that canbe generatedfor deciding whether 1 2 2 an argument is defeated. We call this tree dialectical be- af(θ )= origIP(worm123,baja)∨ 1 cause it represents an exhaustive dialectical1 analysis for malwInOp(worm123,o)∧ the argument in its root. For argument hA,Li, we denote (cid:0)mwHint(worm123,baja)∨ its dialectical tree with T(hA,Li). (cid:0)(compilLang(worm123,c)∧ Givenaliteral LandanargumenthA,Li,inorderto de- nativLang(baja,c)) cide whether or not a literal L is warranted, every node in af(θ )= inLgConf(baja,kras(cid:1)n(cid:1)ovia) 2 the dialectical tree T(hA,Li) is recursively marked as “D” af(ω )= True 1 (defeated) or “U” (undefeated), obtaining a marked dialec- af(ω )= True 2 tical tree T∗(hA,Li) where: af(φ )= mseTT(baja,2)∨govCybLab(baja) 1 • All leaves in T∗(hA,Li) are marked as “U”s, and af(φ2)= malwInOp(worm123,o′)∧ infGovSys(krasnovia,worm123) • LethB,qibeaninner nodeofT∗(hA,Li). Then,hB,qi af(φ )= cybCapAge(baja,5) 3 willbemarkedas“U”iffeverychildofhB,qiismarked af(δ )= True 1 as “D”. Node hB,qi will be markedas “D” iff it has at af(δ )= True 2 least a child marked as “U”. af(δ )= True 3 Gisivmeanrkaergdu“mUe”n,tthhAe,nLTi∗o(vheAr,ΠhiA)Mw,airfrtahnetsroLotaonfdTt∗h(ahAt,LLii)s aaff((δδ54))== TTrruuee warranted fromΠ . 
We can then extend the idea of a dialectical tree to a dialectical forest. For a given literal L, a dialectical forest F(L) consists of the set of dialectical trees for all arguments for L. We shall denote a marked dialectical forest, the set of all marked dialectical trees for arguments for L, as F∗(L). Hence, for a literal L, we say it is warranted if there is at least one argument for that literal in the dialectical forest F∗(L) that is labeled "U", not warranted if there is at least one argument for literal ¬L in the forest F∗(¬L) that is labeled "U", and undecided otherwise.

3 The InCA Framework

Having defined our environmental and analytical models (Π_EM and Π_AM, respectively), we now define how the two relate, which allows us to complete the definition of our InCA framework.

The key intuition here is that, given a Π_AM, every element of Ω ∪ Θ ∪ ∆ ∪ Φ might only hold in certain worlds in the set W_EM – that is, worlds specified by the environment model. As formulas over the environmental atoms in set G_EM specify subsets of W_EM (i.e., the worlds that satisfy them), we can use these formulas to identify the conditions under which a component of Ω ∪ Θ ∪ ∆ ∪ Φ can be true. Recall that we use the notation formula_EM to denote the set of all possible formulas over G_EM. Therefore, it makes sense to associate elements of Ω ∪ Θ ∪ ∆ ∪ Φ with a formula from formula_EM. In doing so, we can in turn compute the probabilities of subsets of Ω ∪ Θ ∪ ∆ ∪ Φ using the information contained in Π_EM, which we shall describe shortly. We first introduce the notion of an annotation function, which associates elements of Ω ∪ Θ ∪ ∆ ∪ Φ with elements of formula_EM; Figure 6 shows an example.

  af(θ1) = origIP(worm123, baja) ∨ (malwInOp(worm123, o) ∧ (mwHint(worm123, baja) ∨ (compilLang(worm123, c) ∧ nativLang(baja, c))))
  af(θ2) = inLgConf(baja, krasnovia)
  af(ω1) = True
  af(ω2) = True
  af(φ1) = mseTT(baja, 2) ∨ govCybLab(baja)
  af(φ2) = malwInOp(worm123, o′) ∧ infGovSys(krasnovia, worm123)
  af(φ3) = cybCapAge(baja, 5)
  af(δ1) = True
  af(δ2) = True
  af(δ3) = True
  af(δ4) = True
  af(δ5) = True

Figure 6: Example annotation function.

We also note that, by using the annotation function (see Figure 6), we may have certain statements that appear as both facts and presumptions (likewise for strict and defeasible rules). However, these constructs would have different annotations, and thus be applicable in different worlds. Suppose we added the following presumptions to our running example:

  φ3 = evidOf(X, O) –≺, and
  φ4 = motiv(X, X′) –≺.

Note that these presumptions are constructed using the same formulas as facts θ1, θ2. Suppose we extend af as follows:

  af(φ3) = malwInOp(M, O) ∧ malwareRel(M, M′) ∧ mwHint(M′, X)
  af(φ4) = inLgConf(Y, X′) ∧ cooper(X, Y)

So, for instance, unlike θ1, φ3 can potentially be true in any world of the form:

  {malwInOp(M, O), malwareRel(M, M′), mwHint(M′, X)}

while θ1 cannot be considered in any of those worlds.

With the annotation function, we now have all the components to formally define an InCA framework.

Definition 3.1 (InCA Framework) Given environmental model Π_EM, analytical model Π_AM, and annotation function af, I = (Π_EM, Π_AM, af) is an InCA framework.

Given the setup described above, we consider a world-based approach – the defeat relationship among arguments will depend on the current state of the world (based on the EM). Hence, we now define the status of an argument with respect to a given world.

Definition 3.2 (Validity) Given InCA framework I = (Π_EM, Π_AM, af), argument ⟨A, L⟩ is valid w.r.t. world w ∈ W_EM iff ∀c ∈ A, w |= af(c).

In other words, an argument is valid with respect to w if the rules, facts, and presumptions in that argument are present in w – the argument can then be built from information that is available in that world.
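Definition 3.2 is directly executable once af is available as, e.g., a map from program elements to world predicates; the sketch below (our own encoding) checks argument A5 = {φ1, δ4} against two of the worlds of Example 2.4, matching Example 3.1.

```python
af = {  # annotations of A5's elements, per Figure 6
    "phi1":   lambda w: "mseTT(baja,2)" in w or "govCybLab(baja)" in w,
    "delta4": lambda w: True,
}

def valid(argument, world):
    """Definition 3.2: every element's annotation must hold in world."""
    return all(af[c](world) for c in argument)

A5 = ["phi1", "delta4"]
w6, w5 = {"govCybLab(baja)"}, {"cybCapAge(baja,5)"}
print(valid(A5, w6), valid(A5, w5))  # True False (cf. Example 3.1)
```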
In this paper, we extend the notion of validity to argumentation lines, dialectical trees, and dialectical forests in the expected way (an argumentation line is valid w.r.t. w iff all arguments that comprise that line are valid w.r.t. w).

Example 3.1 Consider worlds w1, ..., w8 from Example 2.4 along with the argument ⟨A5, isCap(baja, worm123)⟩ from Example 2.5. This argument is valid in worlds w1–w4, w6, and w7. □

We now extend the idea of a dialectical tree w.r.t. worlds – so, for a given world w ∈ W_EM, the dialectical (resp., marked dialectical) tree induced by w is denoted by T_w(⟨A, L⟩) (resp., T∗_w(⟨A, L⟩)). We require all arguments and defeaters in these trees to be valid with respect to w. Likewise, we extend the notion of dialectical forests in the same manner (denoted with F_w(L) and F∗_w(L), respectively). Based on these concepts we introduce the notion of warranting scenario.

Definition 3.3 (Warranting Scenario) Let I = (Π_EM, Π_AM, af) be an InCA framework and L be a ground literal over G_AM; a world w ∈ W_EM is said to be a warranting scenario for L (denoted w ⊢_war L) iff there is a dialectical forest F∗_w(L) in which L is warranted and F∗_w(L) is valid w.r.t. w.

Example 3.2 Following from Example 3.1, argument ⟨A5, isCap(baja, worm123)⟩ is warranted in worlds w3, w6, and w7. □

Hence, the set of worlds in the EM where a literal L in the AM must be true is exactly the set of warranting scenarios – these are the "necessary" worlds, denoted:

  nec(L) = {w ∈ W_EM | w ⊢_war L}.

Now, the set of worlds in the EM where AM literal L can be true is the following – these are the "possible" worlds, denoted:

  poss(L) = {w ∈ W_EM | w ⊬_war ¬L}.

The following example illustrates these concepts.

Example 3.3 Following from Example 3.1:

  nec(isCap(baja, worm123)) = {w3, w6, w7} and
  poss(isCap(baja, worm123)) = {w1, w2, w3, w4, w6, w7}. □

Hence, for a given InCA framework I, if we are given a probability distribution Pr over the worlds in the EM, then we can compute an upper and lower bound on the probability of literal L (denoted P_{L,Pr,I}) as follows:

  ℓ_{L,Pr,I} = Σ_{w ∈ nec(L)} Pr(w),
  u_{L,Pr,I} = Σ_{w ∈ poss(L)} Pr(w),

and

  ℓ_{L,Pr,I} ≤ P_{L,Pr,I} ≤ u_{L,Pr,I}.

Now let us consider the computation of probability bounds on a literal when we are given a knowledge base Π_EM in the environmental model, which is specified in I, instead of a probability distribution over all worlds. For a given world w ∈ W_EM, let for(w) = (∧_{a∈w} a) ∧ (∧_{a∉w} ¬a) – that is, a formula that is satisfied only by world w. Now we can determine the upper and lower bounds on the probability of a literal w.r.t. Π_EM (denoted P_{L,I}) as follows:

  ℓ_{L,I} = EP-LP-MIN(Π_EM, ∨_{w ∈ nec(L)} for(w)),
  u_{L,I} = EP-LP-MAX(Π_EM, ∨_{w ∈ poss(L)} for(w)),

and

  ℓ_{L,I} ≤ P_{L,I} ≤ u_{L,I}.

Hence, P_{L,I} = (ℓ_{L,I} + (u_{L,I} − ℓ_{L,I})/2) ± (u_{L,I} − ℓ_{L,I})/2.

Example 3.4 Following from Example 3.1, for argument ⟨A5, isCap(baja, worm123)⟩ we can compute P_{isCap(baja,worm123),I} (where I = (Π′_EM, Π_AM, af)). Note that for the upper bound, the linear program we need to set up is as in Example 2.4. For the lower bound, the objective function changes to: min x3 + x6 + x7. From these linear constraints, we obtain: P_{isCap(baja,worm123),I} = 0.75 ± 0.25. □

4 Attribution Queries

We now have the necessary elements required to formally define the kind of queries that correspond to the attribution problems studied in this paper.

Definition 4.1 Let I = (Π_EM, Π_AM, af) be an InCA framework, S ⊆ C_act (the set of "suspects"), O ∈ C_ops (the "operation"), and E ⊆ G_EM (the "evidence"). An actor A ∈ S is said to be a most probable suspect iff there does not exist A′ ∈ S such that P_{condOp(A′,O),I′} > P_{condOp(A,O),I′}, where I′ = (Π_EM ∪ Π_E, Π_AM, af′) with Π_E defined as {c : 1 ± 0 | c ∈ E}.

Given the above definition, we refer to Q = (I, S, O, E) as an attribution query, and A as an answer to Q.
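Operationally, answering an attribution query is a loop over the suspects; the sketch below is a minimal illustration (the helper prob_condOp and its toy return values are placeholders standing in for the EP-LP computation of Section 3, not part of InCA's definition).

```python
def most_probable_suspects(suspects, operation, evidence, prob_condOp):
    """Return the suspects maximizing the point probability of
    condOp(A, O) once evidence is added to the EM as {c : 1 +/- 0}."""
    scores = {A: prob_condOp(A, operation, evidence)[0] for A in suspects}
    best = max(scores.values())
    return [A for A, p in scores.items() if p == best], scores

# Toy stand-in probabilities (p, eps), for illustration only:
toy = {"baja": (0.75, 0.25), "mojave": (0.40, 0.10)}
answer, scores = most_probable_suspects(
    ["baja", "mojave"], "worm123", set(), lambda A, O, E: toy[A])
print(answer, scores)  # ['baja'] {'baja': 0.75, 'mojave': 0.4}
```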
We note that in the above definition, the items of evidence are added to the environmental model with a probability of 1. While in general this may be the case, there are often instances in the analysis of a cyber-operation where the evidence may be true with some degree of uncertainty. Allowing for probabilistic evidence is a simple extension to Definition 4.1 that does not cause any changes to the results of this paper.

To understand how uncertain evidence can be present in a cyber-security scenario, consider the following. In Symantec's initial analysis of the Stuxnet worm, they found that the routine designed to attack the S7-417 logic controller was incomplete, and hence would not function [18]. However, industrial control system expert Ralph Langner claimed that the incomplete code would run provided a missing data block is generated, which he thought was possible [19]. In this case, though the code was incomplete, there was clearly uncertainty regarding its usability. This situation provides a real-world example of the need to compare arguments – in this case, in the worlds where both arguments are valid, Langner's argument would likely defeat Symantec's by generalized specificity (the outcome, of course, will depend on the exact formalization of the two). Note that Langner was later vindicated by the discovery of an older sample, Stuxnet 0.5, which generated the data block (see http://www.symantec.com/connect/blogs/stuxnet-05-disrupting-uranium-processing-natanz).

InCA also allows for a variety of scenarios relevant to the attribution problem. For instance, we can easily allow for the modeling of non-state actors by extending the available constants – for example, traditional groups such as Hezbollah, which has previously wielded its cyber-warfare capabilities in operations against Israel [1]. Likewise, InCA can also be used to model cooperation among different actors in performing an attack, including the relationship between non-state actors and nation-states, such as the potential connection between Iran and militants stealing UAV feeds in Iraq, or the much-hypothesized relationship between hacktivist youth groups and the Russian government [1]. Another aspect that can be modeled is deception where, for instance, an actor may leave false clues in a piece of malware to lead an analyst to believe a third party conducted the operation.
Such a deception scenario can be easily created by adding additional rules in the AM that allow for the creation of such counter-arguments. Another type of deception that could occur involves attacks being launched from a system not in the responsible party's area, but under their control (e.g., see [5]). Again, modeling who controls a given system can be easily accomplished in our framework, and doing so would simply entail extending an argumentation line. Further, campaigns of cyber-operations can also be modeled, as well as relationships among malware and/or attacks (as detailed in [20]).

As with all of these abilities, InCA provides the analyst the means to model a complex situation in cyber-warfare but saves them from carrying out the reasoning associated with such a situation. Additionally, InCA results are constructive, so an analyst can "trace back" results to better understand how the system arrived at a given conclusion.

5 Conclusion

In this paper we introduced InCA, a new framework that allows the modeling of various cyber-warfare/cyber-security scenarios in order to help answer the attribution question by means of a combination of probabilistic modeling and argumentative reasoning. This is the first framework, to our knowledge, that addresses the attribution problem while allowing for multiple pieces of evidence from different sources, including traditional (non-cyber) forms of intelligence such as human intelligence. Further, our framework is the first to extend Defeasible Logic Programming with probabilistic information. Currently, we are implementing InCA and the associated algorithms and heuristics to answer these queries. We also feel that there are some key areas to explore relating to this framework, in particular:

• Automatically learning the EM and AM from data.
• Conducting attribution decisions in near real time.
• Identifying additional evidence that must be collected in order to improve a given attribution query.
• Improving scalability of InCA to handle large datasets.

Future work will be carried out in these directions, focusing on the use of both real and synthetic datasets for empirical evaluations.

Acknowledgments

This work was supported by UK EPSRC grant EP/J008346/1 ("PrOQAW"), ERC grant 246858 ("DIADEM"), NSF grant #1117761, the Army Research Office under the Science of Security Lablet grant (SoSL) and project 2GDATXR042, and DARPA project R.0004972.001. The opinions in this paper are those of the authors and do not necessarily reflect the opinions of the funders, the U.S. Military Academy, or the U.S. Army.

References

[1] P. Shakarian, J. Shakarian, and A. Ruef, Introduction to Cyber-Warfare: A Multidisciplinary Approach. Syngress, 2013.

[2] C. Altheide, Digital Forensics with Open Source Tools. Syngress, 2011.

[3] O. Thonnard, W. Mees, and M. Dacier, "On a multicriteria clustering approach for attack attribution," SIGKDD Explorations, vol. 12, no. 1, pp. 11–20, 2010.

[4] L. Spitzner, "Honeypots: Catching the insider threat," in Proc. of ACSAC 2003. IEEE Computer Society, 2003, pp. 170–179.

[5] "Shadows in the Cloud: Investigating Cyber Espionage 2.0," Information Warfare Monitor and Shadowserver Foundation, Tech. Rep., 2010.

[6] R. J. Heuer, Psychology of Intelligence Analysis. Center for the Study of Intelligence, 1999.

[7] N. J. Nilsson, "Probabilistic logic," Artif. Intell., vol. 28, no. 1, pp. 71–87, 1986.

[8] S. Khuller, M. V. Martinez, D. S. Nau, A. Sliva, G. I. Simari, and V. S. Subrahmanian, "Computing most probable worlds of action probabilistic logic programs: scalable estimation for 10^30,000 worlds," AMAI, vol. 51, no. 2–4, pp. 295–331, 2007.

[9] P. Shakarian, A. Parker, G. I. Simari, and V. S. Subrahmanian, "Annotated probabilistic temporal logic," TOCL, vol. 12, no. 2, p. 14, 2011.

[10] G. I. Simari, M. V. Martinez, A. Sliva, and V. S. Subrahmanian, "Focused most probable world computations in probabilistic logic programs," AMAI, vol. 64, no. 2–3, pp. 113–143, 2012.

[11] I. Rahwan and G. R. Simari, Argumentation in Artificial Intelligence. Springer, 2009.

[12] M. V. Martinez, A. J. García, and G. R. Simari, "On the use of presumptions in structured defeasible reasoning," in Proc. of COMMA, 2012, pp. 185–196.

[13] A. J. García and G. R. Simari, "Defeasible logic programming: An argumentative approach," TPLP, vol. 4, no. 1–2, pp. 95–138, 2004.

[14] J. W. Lloyd, Foundations of Logic Programming, 2nd Edition. Springer, 1987.

[15] G. R. Simari and R. P. Loui, "A mathematical treatment of defeasible reasoning and its implementation," Artif. Intell., vol. 53, no. 2–3, pp. 125–157, 1992.
[16] F. Stolzenburg, A. J. García, C. I. Chesñevar, and G. R. Simari, "Computing generalized specificity," Journal of Applied Non-Classical Logics, vol. 13, no. 1, pp. 87–113, 2003.

[17] P. M. Dung, "On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games," Artif. Intell., vol. 77, pp. 321–357, 1995.

[18] N. Falliere, L. O. Murchu, and E. Chien, "W32.Stuxnet Dossier Version 1.4," Symantec Corporation, Feb. 2011.

[19] R. Langner, "Matching Langner Stuxnet analysis and Symantec dossier update," Langner Communications GmbH, Feb. 2011.

[20] "APT1: Exposing one of China's cyber espionage units," Mandiant (tech. report), 2013.