MANCaLog: A Logic for Multi-Attribute Network Cascades
(Technical Report)

Paulo Shakarian
Network Science Center and Dept. of Electrical Engineering and Computer Science
U.S. Military Academy
West Point, NY 10996
[email protected]

Gerardo I. Simari
Dept. of Computer Science
University of Oxford
Wolfson Building, Parks Road
Oxford OX1 3QD, UK
[email protected]

Robert Schroeder
CORE Lab, Defense Analysis Dept.
Naval Postgraduate School
Monterey, CA 93943
[email protected]
arXiv:1301.0302v2 [cs.AI] 19 Jan 2013

This is an extended version of MANCaLog: A Logic for Multi-Attribute Network Cascades, which appears in: Proceedings of the 12th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2013), Ito, Jonker, Gini, and Shehory (eds.), May 6–10, 2013, Saint Paul, Minnesota, USA.

ABSTRACT
The modeling of cascade processes in multi-agent systems in the form of complex networks has in recent years become an important topic of study due to its many applications: the adoption of commercial products, spread of disease, the diffusion of an idea, etc. In this paper, we begin by identifying a desiderata of seven properties that a framework for modeling such processes should satisfy: the ability to represent attributes of both nodes and edges, an explicit representation of time, the ability to represent non-Markovian temporal relationships, representation of uncertain information, the ability to represent competing cascades, allowance of non-monotonic diffusion, and computational tractability. We then present the MANCaLog language, a formalism based on logic programming that satisfies all these desiderata, and focus on algorithms for finding minimal models (from which the outcome of cascades can be obtained) as well as how this formalism can be applied in real world scenarios. We are not aware of any other formalism in the literature that meets all of the above requirements.

Categories and Subject Descriptors
I.2.4 [Artificial Intelligence]: Knowledge Representation Formalisms and Methods – Representation Languages

General Terms
Languages, Algorithms

Keywords
Complex Networks, Cascades, Logic Programming

1. INTRODUCTION AND RELATED WORK
An epidemic working through a population, cascading electrical power failures, product adoption, and the spread of a mutant gene are all examples of diffusion processes that can happen in multi-agent systems structured as complex networks. These network processes have been studied in a variety of disciplines, including computer science [10], biology [11], sociology [8], economics [12], and physics [15]. Much existing work in this area is based on pre-existing models in sociology and economics – in particular the work of [8, 12]. However, recent examinations of social networks – both analysis of large data sets and experimental – have indicated that there may be additional factors to consider that are not taken into account by these models. These include the attributes of nodes and edges, competing diffusion processes, and time. In this paper, we outline seven design criteria (Section 1.1) for such a framework and introduce MANCaLog (Section 2), which is to the best of our knowledge the first logical language for modeling diffusion in complex networks that meets these criteria. MANCaLog is a rule-based framework (inspired by logic programming) that can richly express how agents adopt or fail to adopt certain behaviors, and how these behaviors cascade through a network. We also introduce fixed-point based algorithms that allow for the calculation of the result of the diffusion process in Section 3. Note that these algorithms are proven not only to be correct, but also to run in polynomial time. Hence, our approach can not only better express many aspects of cascades in complex networks, but it can do so in a reasonable amount of time. We conclude by discussing applications of MANCaLog in Section 4.
Proofs of all results stated in this paper can be found in the appendix.

1.1 Desiderata of Properties
We begin by identifying a set of criteria that we believe a framework for reasoning about cascades in complex networks should satisfy.

1. Multiply labeled and weighted nodes and edges. Many existing frameworks for studying diffusion in complex networks assume that there is only one type of vertex that may become "active" [10] or may "mutate" [11, 15] and only one possible relationship between nodes. In reality, nodes and edges often have different properties. For instance, labels on edges can be used to differentiate between strong and weak ties (edge types) – a concept that is well studied [7]. Recently, such attributes of nodes have been shown to impact influence in a network [1].

2. Explicit Representation of Time. Most work in the literature assumes static models, with the exception of the recent developments in [4, 5, 6], which assume the existence of a timestamped log referring to actions taken in the network in order to learn how nodes influence each
other. Though [4] tackles the problem of predicting the time at which a certain node will take an action, the authors make several simplifying assumptions such as monotonicity of probability functions, probabilistic independence, sub-modularity and, most importantly for this criterion, a modeling of time solely based on temporal decay of influence. We seek a richer model of temporal relationships between conditions in the network structure, the current state of the cascades in process, and how influence propagates.

3. Non-Markovian Temporal Relationships. Apart from time being explicitly represented, the temporal dependencies should be able to span multiple units of time. Hence, the "memoryless" mode of a standard Markov process, where only the information of the current state is required, is insufficient. Here, we strive to create a framework where dependencies can be from other earlier time steps. This issue has been previously studied with respect to more general logic programming frameworks such as [13], but to our knowledge has not been applied to social networks.

4. Representation of Uncertainty. As in practice it is not always possible to judge the attributes of all individuals in a network, an element of uncertainty must be included. However, in connection with point 7, this should not be at the expense of tractability. For instance, the probabilistic models of [10] are normally addressed with simulation (and hence do not scale well) as the computation of the expected number of activated nodes is a #P-hard problem [3].

5. Competing Cascades. Often, in real-world situations there will be competing cascading processes. For example, in evolutionary graph theory [11], "mutants" and "residents" compete for nodes in the network – the success of one hinges on the failure of the other.

6. Non-Monotonic Cascades. In much existing work on cascades in complex networks, the number of nodes attaining a certain property at each time step can only increase. However, if we allow for competing cascades in the same model, we cannot have such a strong restriction as the success of one cascade may come at the expense of another.

7. Tractability. The social networks of interest in today's data mining problems often have millions of nodes. It is reasonable to expect that soon billion-node networks will be commonplace. Any framework for dealing with these problems must be solvable in a reasonable amount of time and offer areas for practical improvement for further scalability.

Figure 1: Simple online social network G_soc (nodes 1–10). Solid edges are labeled with strTie, while dashed edges are labeled with wkTie. White nodes are labeled with male, while gray nodes are labeled with fem. Arrows represent the direction of the edge; double-headed edges represent two edges with the same label.

1.2 Related Work
The above criteria can be summarized as the desire to design the most expressive language for network cascades possible while still allowing computation of the outcome of a diffusion process to be completed in a tractable amount of time. As a comparison, let us briefly describe some relevant related work. Perhaps the best known general model for representing diffusion in complex networks is the independent cascade/linear threshold (IC/LT) model of [10]. However, although this framework was shown to be capable of expressing a wide variety of sociological models, it assumes the Markov property and does not allow for the representation of multiple attributes on vertices and edges. A more recent framework, social network optimization problems (SNOPs) [14] uses logic programming to allow for the representation of attributes, but this framework does not allow for competing processes or non-monotonic cascades. A related logic programming framework, competitive diffusion (CD) [2] allows for competitive diffusion and non-monotonic processes but does not explicitly represent time and also makes Markovian assumptions. Further, we also note that the semantics of CD yields a "most probable interpretation" that is not a unique solution. Hence, a given model in that framework can lead to multiple, and possibly contradictory, outcomes to a cascade (this problem is avoided in MANCaLog). Another popular class of models is Evolutionary Graph Theory (EGT) [11], which is highly related to the voter model (VM) [15]. Although this framework allows for competing processes and non-monotonic diffusion, it also makes Markovian assumptions while not explicitly representing time. Further, determining the outcome of a cascade in those models is NP-hard, while determining the outcome in MANCaLog can be accomplished in polynomial time. Table 1 lists how these models compare to MANCaLog when considering our design criteria.

2. FRAMEWORK

2.1 Syntax and Semantics
In this paper we assume that agents are arranged in a directed graph (or network) G = (V,E), where the set of nodes corresponds to the agents, and the edges model the relationships between them. We also assume a set of labels ℒ, which is partitioned into two sets: fluent labels ℒ_f (labels that can change over time) and non-fluent labels ℒ_nf (labels that do not); labels can be applied to both the nodes and edges of the network. We will use the notation 𝒢 = V ∪ E to be the set of all components (nodes and edges) in the network. Thus, c ∈ 𝒢 could be either a node or an edge.

Example 2.1. We will use the sample online social network G_soc shown in Figure 1 as the running example; 𝒢_soc is used to denote the set of components of G_soc. Here we have ℒ_nf = {male, fem, strTie, wkTie} representing male, female, strong ties and weak ties, respectively. Additionally, we have ℒ_f = {visPgA, visPgB} representing visiting webpage A and visiting webpage B, respectively. ■

In this paper, we present a logical language where we use atoms, referring to labels and weights, to describe properties of the nodes and edges. Though labels themselves could be modeled as atoms instead of predicates (to model non-ground labelings that allow for greater expressibility), for simplicity of presentation we leave this to future work. The first piece of the syntax is the network atom.
Table 1: Comparison with other models
Criterion                                     MANCaLog  IC/LT [10]  SNOP [14]  CD [2]  EGT/VM [11]
1. Labels                                     Yes       No          Yes        Yes     No
2. Explicit Representation of Time            Yes       No          Yes        No      Yes
3. Non-Markovian Temporal Relationships       Yes       No          No         No      No
4. Uncertainty                                Yes       Yes         Yes        Yes     Yes
5. Competing Cascades                         Yes       No          No         Yes     Yes
6. Non-monotonic Cascades                     Yes       No          No         Yes     Yes
7. Tractability                               PTIME     #P-hard     PTIME      PTIME   NP-hard

Definition 2.1 (Network Atom). Given label L ∈ ℒ and weight interval bnd ⊆ [0,1], then ⟨L,bnd⟩ is a network atom. A network atom is fluent (resp., non-fluent) if L ∈ ℒ_f (resp., L ∈ ℒ_nf). We use NA to denote the set of all possible network atoms.

Network atoms describe properties of nodes and edges. The definition is intuitive: L represents a property of the vertex or edge, and associated with this property is some weight that may have associated uncertainty – hence represented as an interval bnd, which can be open or closed. An invalid bound is represented by ∅, which is equivalent to all other invalid bounds.

Definition 2.2 (World). A world W is a set of network atoms such that for each L ∈ ℒ there is no more than one network atom of the form ⟨L,bnd⟩ in W.

A network formula over NA is defined using conjunction, disjunction, and negation in the usual way. If a formula contains only non-fluent (resp., fluent) atoms, it is a non-fluent (resp., fluent) formula.

Definition 2.3 (Satisfaction of Worlds). Given a world W and network formula f, satisfaction of W by f (denoted W ⊨ f) is defined:

• If f = ⟨L,[0,1]⟩ then W ⊨ f.
• If f = ⟨L,∅⟩ then W ⊭ f.
• If f = ⟨L,bnd⟩, with bnd ≠ ∅ and bnd ≠ [0,1], then W ⊨ f iff there exists ⟨L,bnd′⟩ ∈ W s.t. bnd′ ⊆ bnd.
• If f = ¬f′ then W ⊨ f iff W ⊭ f′.
• If f = f1 ∧ f2 then W ⊨ f iff W ⊨ f1 and W ⊨ f2.
• If f = f1 ∨ f2 then W ⊨ f iff W ⊨ f1 or W ⊨ f2.

For some arbitrary label L ∈ ℒ, we will use the notation Tr = ⟨L,[0,1]⟩ and F = ⟨L,∅⟩ to represent a tautology and contradiction, respectively. For ease of notation (and without loss of generality), we say that if there does not exist some bnd s.t. ⟨L,bnd⟩ ∈ W, then this implies that ⟨L,[0,1]⟩ ∈ W.

Example 2.2. Following from Example 2.1, the network atom ⟨fem,[1,1]⟩ can be used to identify a node as a woman. Likewise, the world W = {⟨fem,[1,1]⟩, ⟨male,[0,0]⟩, ⟨visPgA,[1,1]⟩, ⟨visPgB,[0,0]⟩} might be used to identify a woman who visits webpage A. Clearly, we have that W ⊨ ⟨fem,[1,1]⟩ ∧ ¬⟨visPgA,[0.5,0.9]⟩ ∧ ¬⟨visPgB,[0.1,0.7]⟩. Note that the network atoms formed with strTie and wkTie are not present; this could be due to the fact that such a world is used to describe a node and not an edge, and hence there is no information about those two labels. As such is the case, W ⊨ ⟨strTie,[0,1]⟩ ∧ ⟨wkTie,[0,1]⟩. ■

The idea is to use MANCaLog to describe how properties (specified by labels) of the nodes in the network change over time. We assume that there is some natural number t_max that specifies the total amount of time we are considering, and we use τ = {t | t ∈ [0,t_max]} to denote the set of all time points. How well a certain property can be attributed to a node is based on a weight (to which the bnd bound in the network atom refers). As time progresses, a weight can either increase/decrease and/or become more/less certain. We now introduce the MANCaLog fact, which states that some network atom is true for a node or edge during certain times.

Definition 2.4 (MANCaLog Fact). If [t1,t2] ⊆ [0,t_max], c ∈ 𝒢, and a ∈ NA, then (a,c) : [t1,t2] is a MANCaLog fact. A fact is fluent (resp., non-fluent) if atom a is fluent (resp., non-fluent). All non-fluent facts must be of the form (a,c) : [0,t_max]. Let ℱ be the set of all facts and ℱ_nf, ℱ_f be the set of all non-fluent and fluent facts, respectively.

Example 2.3. Following from Example 2.2, the following facts are based on Figure 1:

F1 = (⟨male,[1,1]⟩, 1) : [0,t_max]
F2 = (⟨fem,[1,1]⟩, 2) : [0,t_max]
F3 = (⟨male,[1,1]⟩, 3) : [0,t_max]
F4 = (⟨strTie,[1,1]⟩, (1,2)) : [0,t_max]
F5 = (⟨strTie,[1,1]⟩, (2,1)) : [0,t_max]
F6 = (⟨wkTie,[1,1]⟩, (2,3)) : [0,t_max]
F7 = (⟨visPgA,[0.8,1.0]⟩, 1) : [0,t_max]
F8 = (⟨visPgA,[0.5,1.0]⟩, 2) : [0,t_max]

For instance, agent 1 is male, and has a strong tie to agent 2, who is female. ■

Next, we introduce integrity constraints (ICs).

Definition 2.5. Given fluent network atom a and conjunction of network atoms b, an integrity constraint is of the form a ↩ b.
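Definitions 2.1–2.3 can be made concrete with a short sketch. The Python below is only an illustration under stated assumptions: the class names (`Atom`, `Not`, `And`, `Or`) and the function `satisfies` are hypothetical, bounds are restricted to closed intervals stored as `(lo, hi)` pairs, and `None` stands in for the invalid bound ∅. It checks the formula from Example 2.2 against the world W given there.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union

# Closed interval (lo, hi); None stands for the invalid bound (an assumption).
Interval = Optional[Tuple[float, float]]
FULL = (0.0, 1.0)


@dataclass(frozen=True)
class Atom:
    label: str
    bnd: Interval


@dataclass(frozen=True)
class Not:
    f: "Formula"


@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Or:
    left: "Formula"
    right: "Formula"


Formula = Union[Atom, Not, And, Or]
# A world maps each label to at most one bound (Definition 2.2).
World = Dict[str, Tuple[float, float]]


def satisfies(world: World, f: Formula) -> bool:
    """Case analysis of Definition 2.3."""
    if isinstance(f, Atom):
        if f.bnd is None:          # <L, empty> is never satisfied
            return False
        if f.bnd == FULL:          # <L, [0,1]> is always satisfied
            return True
        # A label missing from the world is read as <L, [0,1]>.
        lo, hi = world.get(f.label, FULL)
        return f.bnd[0] <= lo and hi <= f.bnd[1]   # bnd' is a subset of bnd
    if isinstance(f, Not):
        return not satisfies(world, f.f)
    if isinstance(f, And):
        return satisfies(world, f.left) and satisfies(world, f.right)
    if isinstance(f, Or):
        return satisfies(world, f.left) or satisfies(world, f.right)
    raise TypeError(f"not a network formula: {f!r}")


# World W and the formula from Example 2.2.
W: World = {"fem": (1.0, 1.0), "male": (0.0, 0.0),
            "visPgA": (1.0, 1.0), "visPgB": (0.0, 0.0)}
EXAMPLE_2_2 = And(Atom("fem", (1.0, 1.0)),
                  And(Not(Atom("visPgA", (0.5, 0.9))),
                      Not(Atom("visPgB", (0.1, 0.7)))))
```

Calling `satisfies(W, EXAMPLE_2_2)` evaluates the conjunction atom by atom; the two negated atoms hold because [1,1] is not contained in [0.5,0.9] and [0,0] is not contained in [0.1,0.7].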
Intuitively, integrity constraint ⟨L,bnd⟩ ↩ b means that if at a certain time point a component (vertex or edge) of the network has a set of properties specified by conjunction b, then at that same time the component's weight for label L must be in interval bnd.

Example 2.4. Following from the previous examples, the integrity constraint ⟨male,[0,0]⟩ ↩ ⟨fem,[1,1]⟩ would require any node designated as a female to not be male. ■

We now define MANCaLog rules. The idea behind rules is simple: an agent that meets some criteria is influenced by the set of its neighbors who possess certain properties. The amount of influence exerted on an agent by its neighbors is specified by an influence function, whose precise effects will be described later on when we discuss the semantics. As a result, a rule consists of four major parts: (i) an influence function, (ii) neighbor criteria, (iii) target criteria, and (iv) a target. Intuitively, (i) specifies how the neighbors influence the agent in question, (ii) specifies which of the neighbors can influence the agent, (iii) specifies the criteria that cause the agent to be influenced, and (iv) is the property of the agent that changes as a result of the influence.
We will discuss each of these parts in turn, and then define rules in terms of these elements. First, we define influence functions and neighbor criteria.

Definition 2.6 (Influence Function). An influence function is a function ifl : N × N → [0,1] × [0,1] that satisfies the following two axioms:
1. ifl can be computed in constant (O(1)) time.
2. For x′ > x we have ifl(x′,y) ⊆ ifl(x,y).
We use IFL to denote the set of all influence functions.

Intuitively, an influence function takes the number of qualifying influencers and the number of eligible influencers and returns a bound on the new value for the weight of the property of the target node that changes. In practice, we expect the time complexity of such a function to be a polynomial in terms of the two arguments. However, as both arguments are naturals bounded by the maximum degree of a node in the network, this value will be much smaller than the size of the network – we thus treat it as a constant here.

Example 2.5. The well-known "tipping model" originally introduced in [8, 12] states that an agent adopts a behavior when a certain fraction of his incoming neighbors do so. A common tipping function is the majority threshold where at least half of the agent's neighbors must previously adopt the behavior. We can represent this using the following influence function:

\[
tip(x,y) = \begin{cases} [1.0,1.0] & \text{if } x/y \geq 0.5 \\ [0.0,1.0] & \text{otherwise} \end{cases}
\]

This function says that an agent adopts a certain behavior if at least half of his incoming neighbors have some property (strong ties, weak ties, meet some requirement of gender, income, etc.) and that we have no information otherwise. In our framework, we can leverage the bounds associated with the influence function to create a "soft" tipping function:

\[
sftTp(x,y) = \begin{cases} [0.7,1.0] & \text{if } x/y \geq 0.5 \\ [0.0,1.0] & \text{otherwise} \end{cases}
\]

Intuitively, the above function says that an agent adopts a behavior with a weight of at least 0.7 if at least half of the incoming neighbors that have some attribute meet some criteria, and we have no information otherwise. Another possibility is to have an influence function that may reduce the weight that an agent adopts a certain behavior:

\[
ngTp(x,y) = \begin{cases} [0.0,0.2] & \text{if } x = y \\ [0.0,1.0] & \text{otherwise} \end{cases}
\]

The ngTp function says that an agent will adopt a behavior with a weight no greater than 0.2 if all of the incoming neighbors possessing some property meet some criteria, and that we have no information otherwise. ■

Definition 2.7 (Neighbor Criterion). If g_edge, g_node are non-fluent network formulas (formed over edges and nodes, respectively), h is a conjunction of network atoms, and ifl is an influence function, then (g_edge, g_node, h)_ifl is a neighbor criterion.

Formulas g_node and h in a neighbor criterion specify the (non-fluent and fluent, respectively) criteria on a given neighbor, while formula g_edge specifies the non-fluent criteria on the directed edge from that neighbor to the node in question.
The next component is the "target criteria", which are the criteria that an agent must satisfy in order to be influenced by its neighbors. Ideas such as "susceptibility" [1] can be integrated into our framework via this component. We represent these criteria with a formula of non-fluent network atoms. The final component, the "target", is simply the label of the target agent that is influenced by its neighbors. Hence, we now have all the pieces to define a rule.

Definition 2.8 (Rule). Given fluent label L, natural number ∆t, target criteria f and neighbor criteria (g_edge, g_node, h)_ifl, a MANCaLog Rule is of the form:

\[
r = L \xleftarrow{\Delta t} f,\ (g_{edge}, g_{node}, h)_{ifl}
\]

We will use the notation head(r) to denote L.

Note that the target (also referred to as the head) of the rule is a single label; essentially, the body of the rule characterizes a set of nodes, and this label is the one that is modified for each node in this set. More specifically, the rule is essentially saying that when certain conditions for an agent and its neighbors are met, the bnd bound for the network atom formed with label L on that agent changes. Later, in the semantics, we introduce network interpretations, which map components (nodes and edges) of the network to worlds at a given point in time. The rule dictates how this mapping changes in the next time step.

Definition 2.9 (MANCaLog Program). A program P is a set of rules, facts, and integrity constraints s.t. each non-fluent fact F ∈ ℱ_nf appears no more than once in the program. Let 𝒫 be the set of all programs.

Example 2.6. Following from the previous examples, we can have a MANCaLog program that leverages the sftTp and ngTp influence functions in rules that are more expressive
than previous models. Consider the following rules:

\[
\begin{aligned}
R_1 &= visPgA \xleftarrow{2} \langle fem,[1,1]\rangle,\ (\langle strTie,[0.9,1]\rangle,\ Tr,\ \langle visPgA,[0.9,1.0]\rangle)_{sftTp} \\
R_2 &= visPgB \xleftarrow{1} \langle male,[1,1]\rangle,\ (Tr,\ Tr,\ \langle visPgB,[0.8,1.0]\rangle)_{sftTp} \\
R_3 &= visPgA \xleftarrow{3} \langle male,[1,1]\rangle,\ (Tr,\ \langle fem,[1,1]\rangle,\ \neg\langle visPgA,[0.7,1.0]\rangle)_{ngTp}
\end{aligned}
\]

Rule R1 says that a female agent in the network visits page A with a weight of at least 0.7 (this is specified in the sftTp influence function) if at least half of her strong ties (with weight of at least 0.9) visited the page (with a weight of at least 0.9) two days ago. The rest of the rules can be read analogously. ■

We now introduce our first semantic structure: the network interpretation.

Definition 2.10 (Network Interpretation). A network interpretation is a mapping of network components to sets of network atoms, NI : 𝒢 → 2^NA. We will use 𝒩ℐ to denote the set of all network interpretations.

We note that not all labels will necessarily apply to all nodes and edges in the network. For instance, certain labels may describe a relationship while others may only describe a property of an individual in the network. If a given label L does not describe a certain component c of the network, then in a valid network interpretation NI, ⟨L,[0,1]⟩ ∈ NI(c).

Example 2.7. Consider G′_soc, the induced subgraph of G_soc that has only nodes {1,2,3,4,5}. Table 2 shows the contents of NI1, an example network interpretation. ■

Table 2: Example network interpretation, NI1.
Comp.   male    fem     strTie  wkTie   visPgA      visPgB
1       [1,1]   [0,0]   -       -       [0.9,1.0]   [0.8,1.0]
2       [0,0]   [1,1]   -       -       [0.0,0.3]   [0.0,0.2]
3       [1,1]   [0,0]   -       -       [0.6,1.0]   [0.0,0.2]
4       [0,0]   [1,1]   -       -       [0.0,0.2]   [0.9,1.0]
5       [1,1]   [0,0]   -       -       [0.0,0.2]   [0.7,1.0]
(1,2)   -       -       [1,1]   [0,0]   -           -
(2,1)   -       -       [1,1]   [0,0]   -           -
(1,3)   -       -       [0,0]   [1,1]   -           -
(2,3)   -       -       [0,0]   [1,1]   -           -
(3,4)   -       -       [1,1]   [0,0]   -           -
(4,3)   -       -       [1,1]   [0,0]   -           -
(4,5)   -       -       [1,1]   [0,0]   -           -

We define a MANCaLog interpretation as follows.

Definition 2.11 (Interpretation). A MANCaLog interpretation I is a mapping of natural numbers in the interval [0,t_max] to network interpretations, i.e., I : N → 𝒩ℐ. Let ℐ be the set of all possible interpretations.

2.2 Satisfaction
First, we define what it means for an interpretation to satisfy a fact and a rule.

Definition 2.12 (Fact Satisfaction). An interpretation I satisfies MANCaLog fact (a,c) : [t1,t2], written I ⊨ (a,c) : [t1,t2], iff ∀t ∈ [t1,t2], I(t)(c) ⊨ a.

Example 2.8. Consider interpretation I1, where I1(0) = NI1 (from Example 2.7), and MANCaLog facts F7 and F8 from Example 2.3. In this case, I1 ⊨ F7 and I1 ⊭ F8. ■

For non-fluent facts, we introduce the notion of strict satisfaction, which enforces the bound in the interpretation to be set to exactly what the fact dictates.

Definition 2.13 (Strict Fact Satisfaction). Interpretation I strictly satisfies MANCaLog fact (a,c) : [t1,t2] iff ∀t ∈ [t1,t2], a ∈ I(t)(c).

Next, we define what it means for an interpretation to satisfy an integrity constraint.

Definition 2.14 (IC Satisfaction). An interpretation I satisfies integrity constraint a ↩ b iff for all t ∈ τ and c ∈ 𝒢, I(t)(c) ⊨ ¬b ∨ a.

Before we define what it means for an interpretation to satisfy a rule, we require two auxiliary definitions that are used to define the bound enforced on a label by a given rule, and the set of time points that are affected by a rule.

Definition 2.15 (Bound function). For a given rule r = L ←^{∆t} f, (g_edge, g_node, h)_ifl, node v, and network interpretation NI,

\[
Bound(r,v,NI) = ifl\Big(\big|Qual(v,g_{edge},g_{node},h,NI)\big|,\ \big|Elig(v,g_{edge},g_{node},NI)\big|\Big),
\]

where

\[
Elig(v,g_{edge},g_{node},NI) = \big\{ v' \in V \mid NI(v') \models g_{node} \wedge (v',v) \in E \wedge NI\big((v',v)\big) \models g_{edge} \big\}
\]

and

\[
Qual(v,g_{edge},g_{node},h,NI) = \big\{ v' \in Elig(v,g_{edge},g_{node},NI) \mid NI(v') \models h \big\}.
\]

Intuitively, the bound returned by the function depends on the influence function and the number of qualifying and eligible nodes that influence it.

Definition 2.16 (Target Time Set). For interpretation I, node v, and rule r = L ←^{∆t} f, (g_edge, g_node, h)_ifl, the target time set of I, v, r is defined as follows:

\[
TTS(I,v,r) = \big\{ t \in [0,t_{max}] \mid I(t-\Delta t)(v) \models f \big\}
\]

We also extend this definition to a program P, for a given c ∈ 𝒢 and L ∈ ℒ, as follows:

\[
TTS(I,c,L,P) = \bigcup_{r \in P,\ head(r)=L} TTS(I,c,r)\ \cup\ \big\{ t \in [t_1,t_2] \mid (\langle L,bnd\rangle,c):[t_1,t_2] \in P \big\}\ \cup\ \big\{ t \mid \langle L,bnd\rangle \hookleftarrow b \in P \wedge I(t)(c) \models b \big\}
\]

We can now define satisfaction of a rule by an interpretation.

Definition 2.17. An interpretation I satisfies a rule r = L ←^{∆t} f, (g_edge, g_node, h)_ifl iff for all v ∈ V and t ∈ TTS(I,v,r) it holds that

\[
I(t)(v) \models \big\langle L,\ Bound\big(r,v,I(t-\Delta t)\big) \big\rangle.
\]
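The influence functions of Example 2.5 and the Bound, Elig, and Qual functions of Definition 2.15 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the helper names (`atom`, `TRUE`, `elig`, `qual`, `bound`) are hypothetical, criteria are modeled as plain Python predicates over a world, bounds are closed `(lo, hi)` pairs, and the guard for `y == 0` in `sft_tp` is an added assumption (the source does not say what x/y means when there are no eligible neighbors). The data is the node and edge content of Table 2.

```python
from typing import Callable, Dict, Hashable, Set, Tuple

Interval = Tuple[float, float]
World = Dict[str, Interval]
Pred = Callable[[World], bool]
NO_INFO: Interval = (0.0, 1.0)


def sft_tp(x: int, y: int) -> Interval:
    # "Soft" tipping function; treating y == 0 as "no information" is an assumption.
    return (0.7, 1.0) if y > 0 and x / y >= 0.5 else NO_INFO


def ng_tp(x: int, y: int) -> Interval:
    # Weight capped at 0.2 when every eligible neighbor qualifies.
    return (0.0, 0.2) if x == y else NO_INFO


def atom(label: str, lo: float, hi: float) -> Pred:
    """Predicate for <label, [lo, hi]>: the world's bound must lie inside [lo, hi]."""
    def holds(w: World) -> bool:
        l, u = w.get(label, NO_INFO)
        return lo <= l and u <= hi
    return holds


TRUE: Pred = lambda w: True  # plays the role of Tr


def elig(v, g_edge: Pred, g_node: Pred, ni, edges) -> Set[int]:
    # Incoming neighbors whose node and connecting edge meet the non-fluent criteria.
    return {u for (u, tgt) in edges
            if tgt == v and g_node(ni[u]) and g_edge(ni[(u, v)])}


def qual(v, g_edge: Pred, g_node: Pred, h: Pred, ni, edges) -> Set[int]:
    # Eligible neighbors that additionally satisfy h.
    return {u for u in elig(v, g_edge, g_node, ni, edges) if h(ni[u])}


def bound(ifl, v, g_edge: Pred, g_node: Pred, h: Pred, ni, edges) -> Interval:
    return ifl(len(qual(v, g_edge, g_node, h, ni, edges)),
               len(elig(v, g_edge, g_node, ni, edges)))


STR, WK = {"strTie": (1.0, 1.0), "wkTie": (0.0, 0.0)}, {"strTie": (0.0, 0.0), "wkTie": (1.0, 1.0)}
EDGES = {(1, 2): STR, (2, 1): STR, (1, 3): WK, (2, 3): WK,
         (3, 4): STR, (4, 3): STR, (4, 5): STR}
NI1: Dict[Hashable, World] = {
    1: {"male": (1.0, 1.0), "fem": (0.0, 0.0), "visPgA": (0.9, 1.0), "visPgB": (0.8, 1.0)},
    2: {"male": (0.0, 0.0), "fem": (1.0, 1.0), "visPgA": (0.0, 0.3), "visPgB": (0.0, 0.2)},
    3: {"male": (1.0, 1.0), "fem": (0.0, 0.0), "visPgA": (0.6, 1.0), "visPgB": (0.0, 0.2)},
    4: {"male": (0.0, 0.0), "fem": (1.0, 1.0), "visPgA": (0.0, 0.2), "visPgB": (0.9, 1.0)},
    5: {"male": (1.0, 1.0), "fem": (0.0, 0.0), "visPgA": (0.0, 0.2), "visPgB": (0.7, 1.0)},
    **EDGES,
}

# Neighbor criterion of R2: (Tr, Tr, <visPgB, [0.8, 1.0]>) with sftTp.
r2_bound_5 = bound(sft_tp, 5, TRUE, TRUE, atom("visPgB", 0.8, 1.0), NI1, EDGES)
r2_bound_3 = bound(sft_tp, 3, TRUE, TRUE, atom("visPgB", 0.8, 1.0), NI1, EDGES)
```

Under these assumptions, node 5 has a single incoming neighbor (node 4), which qualifies, and node 3 has three incoming neighbors of which two (nodes 1 and 4) qualify, so both calls return the [0.7, 1.0] bound of sftTp.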
Example 2.9. Let I be the interpretation from Exam- networkatom(cid:104)L,∅(cid:105)∈(cid:62)(t)(c). Clearly,nootherinterpreta-
1
ple 2.8. Suppose that (cid:104)visPgB,[0.8,1.0](cid:105) ∈ I(1)(5). In this tioncanbebelow⊥asthe[(cid:96),u]boundonallnetworkatoms
case,I |=R . LetI beequivalenttoI exceptthatwehave foreachtimestepandeachcomponentis[0,1];similarly,no
1 2 2 1
(cid:104)visPgB,[0.0,0.5](cid:105)∈I2(1)(3). In this case, I2 (cid:54)|=R2. (cid:4) otherinterpretationisabove(cid:62),sinceforanyinterpretation
I for which there exists (cid:104)L,bnd(cid:105) ∈ I(t)(c) where bnd (cid:54)= ∅,
We now define satisfaction of programs, and introduce we have ∅⊆bnd. We can prove (see the full version of the
canonical interpretations, in which time points that are not paper for details) that with (cid:62) and ⊥, (cid:104)I,(cid:118)(cid:105) is a complete
“targets”retain information from the last time step. lattice. We can now arrive at the notion of minimal model
for a MANCaLog program.
Definition 2.18. For interpretation I and program P:
I is a model for P iff it satisfies all rules, integrity con- Definition 2.24 (Minimal Model). GivenprogramP,
straints, and fluent facts in that program, strictly satisfies theminimalmodelofP isa(canonical)interpretationI s.t.
all non-fluent facts in the program, and for all L∈L, c∈G I |=P and for all (canonical) interpretation I(cid:48) s.t. I(cid:48) |=P,
and t∈/ TTS(I,c,L,P), (cid:104)L,[0,1](cid:105)∈I(c)(t). we have that I (cid:118)I(cid:48).
I is a canonical model for P iff it satisfies all rules, in-
Suppose we have some algorithm A that takes as input
tegrity constraints, and fluent facts in P, strictly satisfies
a program P and returns an interpretation I (where I does
all non-fluent facts in P, and for all L ∈ L, c ∈ G, and
notnecessarilysatisfyP)s.t.forallI(cid:48)whereI(cid:48) |=P,I (cid:118)I(cid:48).
t ∈/ TTS(I,c,L,P), (cid:104)L,[0,1](cid:105) ∈ I(c)(t) when t = 0 and
It is easy to show that if A(P) |= P then P is consistent.
(cid:104)L,bnd(cid:105)∈I(t)(c) where (cid:104)L,bnd(cid:105)∈I(t−1)(c), otherwise.
Likewise, if A(P) = (cid:62) then P is inconsistent, as all mod-
Example 2.10. Following from previous examples, if we els must then have a tighter weight bound for the network
consider interpretation I and program P = {F ,R }, we atomsthananinvalidinterpretation(hence,makingsuchan
1 7 2
havethat(cid:104)visPgB,[0.0,0.2](cid:105)mustbeinI (1)(2)inorderfor interpretation invalid as well). Clearly, any such algorithm
1
I1 to be canonical. (cid:4) Awouldprovideasoundandcompleteanswertotheconsis-
tencyproblem. Likewise,ifweconsidertheentailmentprob-
2.3 ConsistencyandEntailment lem,itiseasytoshowthatforfactF =((cid:104)L,bnd(cid:105),c):[t ,t ],
1 2
In this section we discuss consistency and entailment in P (canonically)entailsF ifftheminimalmodelofP (canon-
MANCaLogprograms,andexploretheuseofminimalmod- ically) satisfies F. This is because for minimal model A(P)
els towards computing answers to these problems. of P, for any time t∈[t1,t2], if A(P)(t)(c)|=(cid:104)L,bnd(cid:105) then
there is network atom (cid:104)L,bnd(cid:48)(cid:105) ∈ A(P)(t)(c) s.t. bnd(cid:48) ⊆
Definition 2.19 (Consistency). A MANCaLog pro- bnd. We note that for any other interpretation I of P with
gram P is (canonically) consistent iff there exists a (canon- (cid:104)L,bnd(cid:48)(cid:48)(cid:105)∈I(t)(c) we have that bnd(cid:48) ⊇bnd(cid:48)(cid:48). Hence, hav-
ical) model I of P. ingaminimalmodelallowsustosolveanyentailmentquery.
We can think of a minimal model of a MANCaLog program
Definition 2.20 (Entailment). AMANCaLogprogram astheoutcomeofadiffusionprocessinamulti-agentsystem
$P$ (canonically) entails MANCaLog fact $F$ iff for all (canonical) models $I$ of $P$, it holds that $I \models F$.

Now we define an ordering over models and define the concept of minimal model. We then show that if we can find a minimal model then we can answer consistency, entailment, and tight entailment queries. To do so, we first define a pre-order over interpretations.

Definition 2.21 (Preorder over Interpretations). Given interpretations $I, I'$ we say $I \sqsubseteq^{pre} I'$ if and only if for all $t, v, L$, if there exists $\langle L, bnd \rangle \in I(t)(v)$ then there must exist $\langle L, bnd' \rangle \in I'(t)(v)$ s.t. $bnd' \subseteq bnd$.

Next, we define an equivalence relation for interpretations, denoted with $\sim$; we will use the notation $[I]$ for the set of all interpretations equivalent to $I$ w.r.t. $\sim$. This allows us to define a partial ordering.

Definition 2.22. Two interpretations $I, I'$ are equivalent (written $I \sim I'$) iff for all $P \in \mathcal{P}$, $I \models P$ iff $I' \models P$.

Definition 2.23 (Partial Ordering). Given classes of interpretations $[I], [I']$ that are equivalent w.r.t. $\sim$, we say that $[I]$ precedes $[I']$, written $[I] \sqsubseteq [I']$, iff $I \sqsubseteq^{pre} I'$.

The partial ordering is clearly reflexive, antisymmetric, and transitive. Note that we will often use $I \sqsubseteq I'$ as shorthand for $[I] \sqsubseteq [I']$. We define two special interpretations, $\bot$ and $\top$, such that $\forall t \in \tau, c \in G$, $\bot(t)(c) = \emptyset$ and there exists

modeled as a complex network. Hence, a question such as “how many agents will adopt the product with a weight of at least 0.9 in two months?” can be easily answered once the minimal model is obtained.

3. FIXED POINT MODEL COMPUTATION

In this section we introduce a fixed-point operator that produces the non-canonical minimal model of a MANCaLog program in polynomial time. This is followed by an algorithm to find a canonical minimal model also in polynomial time. First, we introduce three preliminary definitions.

Definition 3.1. For a given MANCaLog program $P$, $c \in G$, $L \in \mathcal{L}$, and $t \in \tau$ we define function
\[ FBnd(P,c,t,L) = \bigcap_{(\langle L,bnd \rangle, c):[t_1,t_2] \in P \text{ s.t. } t \in [t_1,t_2]} bnd \]

Definition 3.2. For a given MANCaLog program $P$, interpretation $I$, $c \in G$, $L \in \mathcal{L}$, and $t \in \tau$ we define function
\[ IBnd(P,I,c,t,L) = \bigcap_{\langle L,bnd \rangle \hookleftarrow a \in P \text{ s.t. } I(t)(c) \models a} bnd \]

Definition 3.3. Given MANCaLog program $P$, interpretation $I$, $v \in V$, $L \in \mathcal{L}$, and $t \in \tau$, we define
\[ RBnd(P,I,v,t,L) = \bigcap_{r \in P \text{ s.t. } t \in TTS(I,v,L,P) \cap TTS(I,v,r)} Bound(r,v,I(t-\Delta t)) \]
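Definitions 3.1 to 3.3 all compute an intersection of weight bounds. The following Python sketch illustrates that operation for the fact case (Definition 3.1) only. It rests on assumptions that are not part of the paper: bounds are restricted to closed intervals, an intersection over no facts is taken to be $[0,1]$, and the `Fact` record and the names `intersect` and `fbnd` are hypothetical.

```python
from functools import reduce
from typing import List, NamedTuple, Optional, Tuple

# Assumption: a bound is a closed interval (lo, hi) within [0, 1];
# None stands for the empty bound.
Bound = Optional[Tuple[float, float]]
FULL: Bound = (0.0, 1.0)


def intersect(a: Bound, b: Bound) -> Bound:
    """Intersect two closed intervals; None if the result is empty."""
    if a is None or b is None:
        return None
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


class Fact(NamedTuple):
    """Hypothetical encoding of a fact (<L, bnd>, c) : [t1, t2]."""
    label: str
    bound: Tuple[float, float]
    component: str
    t1: int
    t2: int


def fbnd(facts: List[Fact], c: str, t: int, label: str) -> Bound:
    """Intersect the bounds of all facts on (c, label) whose interval contains t."""
    applicable = [
        f.bound
        for f in facts
        if f.component == c and f.label == label and f.t1 <= t <= f.t2
    ]
    # Assumption: with no applicable facts the result is the trivial bound [0, 1].
    return reduce(intersect, applicable, FULL)


facts = [
    Fact("adopt", (0.5, 1.0), "v1", 0, 3),
    Fact("adopt", (0.0, 0.8), "v1", 2, 5),
]
```

Here `fbnd(facts, "v1", 2, "adopt")` intersects both facts, while at `t = 4` only the second fact applies. Open and half-open intervals, which the paper allows, would need an extra flag per endpoint.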
We can now introduce the operator.

Definition 3.4 ($\Gamma$ Operator). For a given MANCaLog program $P$, we define the operator $\Gamma_P : \mathcal{I} \to \mathcal{I}$ as follows: For a given $I$, for each $t \in \tau$, $c \in G$, and $L \in \mathcal{L}$, add $\langle L, bnd \rangle$ to $\Gamma_P(I)(t)(c)$ where $bnd$ is defined as:
\[ bnd = bnd_{prv} \cap FBnd(P,c,t,L) \cap IBnd(P,I,c,t,L) \cap RBnd(P,I,c,t,L) \]
where $\langle L, bnd_{prv} \rangle \in I(t)(c)$.

It is easy to show that $\Gamma$ can be computed in polynomial time (the proof is in the full version). Next, we introduce notation for repeated applications of $\Gamma$.

Definition 3.5 (Iterated Applications of $\Gamma$). Given natural number $i > 0$, interpretation $I$, and program $P$, we define $\Gamma_P^i(I)$, the multiple applications of $\Gamma$, as follows:
\[ \Gamma_P^i(I) = \begin{cases} \Gamma_P(I) & \text{if } i = 1 \\ \Gamma_P(\Gamma_P^{i-1}(I)) & \text{otherwise} \end{cases} \]

We can prove that the iterated $\Gamma$ operator converges after a polynomial number of applications:

Theorem 3.1. Given interpretation $I$ and program $P$, there exists a natural number $k$ s.t. $\Gamma_P^k(I) = \Gamma_P^{k+1}(I)$, and
\[ k \in O\left(|P| \cdot d_*^{in} \cdot t_{max} \cdot |E|\right) \]
where $d_*^{in}$ is the maximum in-degree in the network.

Proof (sketch). For a given vertex $i \in V$, we will use the notation $d_i^{in}$ to denote the number of incoming neighbors (of any edge type). First note that for a given $t \in \tau$, $i \in V$, and $L \in \mathcal{L}$, a given rule $r$ can tighten the bound on a network atom formed with $L$ no more than $(d_i^{in} + 1) \cdot (d_*^{in} + 1)$ times. At each application of $\Gamma$, at least one network atom must tighten. Hence, as there are only $O\left(|P| \cdot d_*^{in} \cdot t_{max} \cdot |E|\right)$ tightenings possible, this is also the bound on the number of applications of $\Gamma$.

In the following, we will use the notation $\Gamma_P^*$ to denote the iterated application of $\Gamma$ after a number of steps sufficient for convergence; Theorem 3.1 means that we can efficiently compute $\Gamma_P^*$. We also note that as a single application of $\Gamma$ can be computed in polynomial time, this implies that we can find a minimal model of a MANCaLog program in polynomial time. We now prove the correctness of the operator. We do this first by proving a key lemma that, when combined with a claim showing that for consistent program $P$, $\Gamma_P^*$ is a model of $P$, tells us that $\Gamma_P^*$ is a minimal model for $P$. Following directly from this, we have that $P$ is inconsistent iff $\Gamma_P^* = \top$.

Lemma 3.2. If $I \models P$ and $I' \sqsubseteq I$ then $\Gamma(I') \sqsubseteq I$.

Theorem 3.3. If program $P$ is consistent then $\Gamma_P^*$ is a minimal model for $P$.

These results, when taken together, prove that tight entailment and consistency problems for MANCaLog can be solved in polynomial time, which is precisely what we set out to accomplish as part of our desiderata described in Section 1.1.

Next, we develop an algorithm for the canonical versions of consistency and tight entailment, and show that we can bound the running time of the algorithm with a polynomial. We also note that subsequent runs of the convergence of $\Gamma$ will likely complete quicker in practice, as the initial interpretation is the last interpretation calculated (cf. line 11). We also show that the interpretation produced by the algorithm is a canonical minimal model. Following from that, a program is inconsistent iff the algorithm returns $\top$.

Algorithm 1 CANON_PROC
Require: Program $P$
Ensure: Interpretation $I$
 1: $cur\_interp = \Gamma_P^*(\bot)$;
 2: Initialize matrix array $cur\_free[\cdot][\cdot]$ where for $v \in V$, and $L \in \mathcal{L}$, $cur\_free[v][L] = \tau - TTS(cur\_interp, v, L, P) - \{0\}$;
 3: Initialize array $vl\_pr[\cdot]$ where for each $t \in [1, t_{max}]$, $vl\_pr[t] = \{(v,L) \mid t \in cur\_free[v][L]\}$;
 4: for $t = 1, \ldots, t_{max}$ do
 5:   if $vl\_pr[t] \neq \emptyset$ then
 6:     for $(v,L) \in vl\_pr[t]$ do
 7:       Remove $\langle L, bnd \rangle$ from $I(t)(v)$;
 8:       Let $a$ be the atom in $I(t-1)(v)$ of the form $\langle L, bnd' \rangle$;
 9:       Add $a$ to $I(t)(v)$;
10:     end for
11:     Set $cur\_interp = \Gamma_P^*(cur\_interp)$;
12:   end if
13:   For $v \in V$, and $L \in \mathcal{L}$, $cur\_free[v][L] = \tau - TTS(cur\_interp, v, L, P) - \{0, \ldots, t\}$
14:   For each $t \in [t+1, t_{max}]$, $vl\_pr[t] = \{(v,L) \mid t \in cur\_free[v][L]\}$
15: end for
16: return $I$

Proposition 3.1. Algorithm CANON_PROC performs no more than $1 + t_{max} \cdot \min(|\mathcal{L}|, |P|) \cdot |V|$ calculations of the convergence of $\Gamma$.

Theorem 3.4. If $P$ is consistent, then CANON_PROC($P$) is the minimal canonical model of $P$.

4. APPLICATIONS

In this section, we will briefly discuss work in progress on how MANCaLog can be applied in real world settings.

It is widely acknowledged that modeling influence in multi-agent systems (most usefully modeled as complex networks) is highly desirable for many practical problems as varied as viral marketing, prevention of drug use, vaccination, and power plant failure. Though MANCaLog programs are a rich model to work with, the acquisition of rules is the principal hurdle to overcome; this is mainly due to this richness of representation, since for each rule we must provide a set of conditions on the agents being influenced, conditions on their neighbors and their ties to their neighbors, and how capable these neighbors are of influencing them. A domain expert is likely able to provide important insights into these components, but the best way to obtain these rules is undoubtedly to leverage the presence of large amounts of data in domains like Twitter (with about 340M messages sent per day, available through public APIs), Facebook (over 950M users with more complex information; not publicly available, but data can be requested through apps), and blogging and photo hosting sites such as Blogger and Flickr (which have millions of users as well).
…
[Figure 2 components: Data Sources; Clustering; Influence & Susceptibility; Rule Generation; Expert Knowledge; MANCaLog Program]

Figure 2: An architecture for obtaining MANCaLog programs from available data sources.

Concretely, we have begun working towards this goal by extracting several time-series, multi-attribute network data sets on which to apply MANCaLog. For instance, to study the proliferation of research on different topics, we looked at research on “niacin” indexed by Thomson Reuters Web of Knowledge (http://wokinfo.com). This topic was chosen due to its interest to a variety of disciplines, such as medicine, biology, and chemistry; this gives the data more variety compared with more discipline-specific topics. We extracted an author-paper bipartite network consisting of 3,790 papers with 10,465 authors and 16,722 edges (cf. Figure 3); from this data we can easily focus on various kinds of networks (co-author, citation, etc.). We have also collected attribute and time-series data for this network, as well as the subjects of the papers; the propagation of these subjects is a good starting point to test methods for the acquisition of MANCaLog rules. We are harvesting larger data sets from various online social networks. Further details can be found in the full version of the paper.

A proposed learning architecture. We are currently developing a MANCaLog learning architecture (depicted in Fig. 2) based on the use of state-of-the-art data analysis, clustering, and influence learning techniques as building blocks for the acquisition of MANCaLog rules from data sets. The key question is not just the identification of the best techniques to adopt, but how to adapt them and combine them in such a way as to produce meaningful and useful outputs. Consider the diagram in Fig. 2: the data first flows from raw data sources to the cluster identification component, which has the goal of identifying sets of agents behaving as groups (for instance, teens influencing other teens of the same sex in the consumption of music, or scientists of a certain field influencing the research topics of others in a related field) [16, 9]; the main output here is a set of conditions on nodes and edges that characterize groups of nodes. Once clusters are identified, the influence recognition component will make use of both the clusters and the data sources to recognize what kind of influence is present in the system [1, 5, 6]; the main output of this component is the influence function to be used in the MANCaLog rules. The rule generation component then takes the output of the cluster identification and influence recognition components, along with the raw data (e.g., to analyze time stamps) and produces MANCaLog rules; the output of this component is involved in a refinement cycle with experts who can provide feedback on the rules being produced (such as possible combinations of rules, identification of cases of overfitting, etc.).

Figure 3: (Left) Visualization of a multi-attribute time-series author-paper network from 1952 to 2012. (Top-Right) Close-up of the data inside the small box in the main figure. (Bottom-Right) Close-up showing node attributes. In all cases, authors are colored green and papers are colored red. Data extracted from Thomson Reuters Web of Knowledge.

5. CONCLUSION

In this paper, we presented the MANCaLog language for modeling cascades in multi-agent systems organized in the form of complex networks. We started by establishing seven criteria in the form of desiderata for such a formalism, and proved that MANCaLog meets all of them; to the best of our knowledge, this has not been accomplished by any previous model in the literature. We also note that MANCaLog is the first language of its kind to consider network structure in the semantics, potentially opening the door for algorithms that leverage features of network topology in more efficiently answering queries. Our current work involves implementing the algorithms described in this paper, as well as the real-world applications described in Section 4; though our algorithms have polynomial time complexity, it is likely that further optimizations will be needed in practice to ensure scalability for very large data sets.

In the near future, we shall also explore various types of queries that have been studied in the literature, such as finding agents of maximum influence, identifying agents that cause a cascade to spread more quickly, and identifying agents that can be influenced in order to halt a cascade.

6. ACKNOWLEDGMENTS

P.S. is supported by the Army Research Office (project 2GDATXR042). G.I.S. is supported under (UK) EPSRC grant EP/J008346/1 – PrOQAW. The opinions in this paper are those of the authors and do not necessarily reflect the opinions of the funders, the U.S. Military Academy, or the U.S. Army.
7. REFERENCES

[1] S. Aral and D. Walker. Identifying Influential and Susceptible Members of Social Networks. Science, 337(6092):337–341, 2012.
[2] M. Broecheler, P. Shakarian, and V. Subrahmanian. A scalable framework for modeling competitive diffusion in social networks. In Proc. of SocialCom. IEEE, 2010.
[3] W. Chen, C. Wang, and Y. Wang. Scalable influence maximization for prevalent viral marketing in large-scale social networks. In Proc. of KDD ’10, pages 1029–1038. ACM, 2010.
[4] A. Goyal, F. Bonchi, and L. Lakshmanan. Discovering leaders from community actions. In Proc. of CIKM, pages 499–508. ACM, 2008.
[5] A. Goyal, F. Bonchi, and L. Lakshmanan. Learning influence probabilities in social networks. In Proc. of WSDM, pages 241–250. ACM, 2010.
[6] A. Goyal, F. Bonchi, and L. Lakshmanan. A data-based approach to social influence maximization. Proc. of VLDB, 5(1):73–84, 2011.
[7] M. Granovetter. The Strength of Weak Ties. The American Journal of Sociology, 78(6):1360–1380, 1973.
[8] M. Granovetter. Threshold models of collective behavior. The American Journal of Sociology, 83(6):1420–1443, 1978.
[9] A. Jain. Data clustering: 50 years beyond K-means. Pattern Recognition Letters, 31(8):651–666, 2010.
[10] D. Kempe, J. Kleinberg, and E. Tardos. Maximizing the spread of influence through a social network. In Proc. of KDD ’03, pages 137–146. ACM, 2003.
[11] E. Lieberman, C. Hauert, and M. A. Nowak. Evolutionary dynamics on graphs. Nature, 433(7023):312–316, 2005.
[12] T. C. Schelling. Micromotives and Macrobehavior. W.W. Norton and Co., 1978.
[13] P. Shakarian, A. Parker, G. I. Simari, and V. S. Subrahmanian. Annotated probabilistic temporal logic. ACM Trans. on Comp. Logic, 12(2), 2011.
[14] P. Shakarian, V. Subrahmanian, and M. L. Sapino. Using Generalized Annotated Programs to Solve Social Network Optimization Problems. In Proc. of ICLP (tech. comm.), 2010.
[15] V. Sood, T. Antal, and S. Redner. Voter models on heterogeneous networks. Physical Review E, 77(4):041121, 2008.
[16] T. Warren Liao. Clustering of time series data–a survey. Pattern Recognition, 38(11):1857–1874, 2005.

8. APPENDIX

8.1 Set of interpretations form a complete lattice

With top interpretation $\top$ and bottom interpretation $\bot$, $\langle \mathcal{I}, \sqsubseteq \rangle$ is a complete lattice.

Proof. Let $\mathcal{I}'$ be a subset of $\mathcal{I}$. We can create $\inf(\mathcal{I}')$ as follows. We build interpretation $I'$. For each $t \in \tau, c \in G, L \in \mathcal{L}$, let $\ell_1$ be the least of the set $\bigcup_{I \in \mathcal{I}'} \{\ell \mid \langle L,[\ell,u] \rangle \in I(t)(c), \langle L,[\ell,u) \rangle \in I(t)(c)\}$ and $\ell_2$ be the least of the set $\bigcup_{I \in \mathcal{I}'} \{\ell \mid \langle L,(\ell,u] \rangle \in I(t)(c), \langle L,(\ell,u) \rangle \in I(t)(c)\}$. Then, for each $t \in \tau, c \in G, L \in \mathcal{L}$ let $u_1$ be the greatest element of the set $\bigcup_{I \in \mathcal{I}'} \{u \mid \langle L,[\ell,u] \rangle \in I(t)(c), \langle L,(\ell,u] \rangle \in I(t)(c)\}$ and $u_2$ be the greatest of the set $\bigcup_{I \in \mathcal{I}'} \{u \mid \langle L,[\ell,u) \rangle \in I(t)(c), \langle L,(\ell,u) \rangle \in I(t)(c)\}$. If there is any interpretation $I$ in $\mathcal{I}$ where there is not some $bnd$ s.t. $\langle L,bnd \rangle \in I(t)(c)$ then add $\langle L,[0,1] \rangle$ to $I'(t)(c)$. If $\ell_2 \leq \ell_1$ and $u_1 \geq u_2$ then add $\langle L,(\ell_2,u_1] \rangle$ to $I'(t)(c)$. If $\ell_2 \leq \ell_1$ and $u_2 > u_1$ then add $\langle L,(\ell_2,u_2) \rangle$ to $I'(t)(c)$. If $\ell_2 > \ell_1$ and $u_2 > u_1$ then add $\langle L,[\ell_1,u_2) \rangle$ to $I'(t)(c)$. Finally, if $\ell_2 > \ell_1$ and $u_1 \geq u_2$ then add $\langle L,[\ell_1,u_1] \rangle$ to $I'(t)(c)$. Clearly, $I' = \inf(\mathcal{I}')$.

In the next part of the proof, we show we can create $\sup(\mathcal{I}')$ as follows. We build interpretation $I'$. For each $t \in \tau, c \in G, L \in \mathcal{L}$ let $\ell_1$ be the greatest of the set $\bigcup_{I \in \mathcal{I}'} \{\ell \mid \langle L,[\ell,u] \rangle \in I(t)(c), \langle L,[\ell,u) \rangle \in I(t)(c)\}$ and $\ell_2$ be the greatest of the set $\bigcup_{I \in \mathcal{I}'} \{\ell \mid \langle L,(\ell,u] \rangle \in I(t)(c), \langle L,(\ell,u) \rangle \in I(t)(c)\}$. Then, for each $t \in \tau, c \in G, L \in \mathcal{L}$ let $u_1$ be the least element of the set $\bigcup_{I \in \mathcal{I}'} \{u \mid \langle L,[\ell,u] \rangle \in I(t)(c), \langle L,(\ell,u] \rangle \in I(t)(c)\}$ and $u_2$ be the least of the set $\bigcup_{I \in \mathcal{I}'} \{u \mid \langle L,[\ell,u) \rangle \in I(t)(c), \langle L,(\ell,u) \rangle \in I(t)(c)\}$. If $\max(\ell_1,\ell_2) > \min(u_1,u_2)$ or $(\ell_2 > \ell_1) \wedge (u_2 < u_1) \wedge (\ell_2 = u_2)$ then add $\langle L,\emptyset \rangle$ to $I'(t)(c)$. If $\ell_2 > \ell_1$ and $u_1 \leq u_2$ then add $\langle L,(\ell_2,u_1] \rangle$ to $I'(t)(c)$. If $\ell_2 > \ell_1$ and $u_2 < u_1$ then add $\langle L,(\ell_2,u_2) \rangle$ to $I'(t)(c)$. If $\ell_2 \leq \ell_1$ and $u_2 < u_1$ then add $\langle L,[\ell_1,u_2) \rangle$ to $I'(t)(c)$. Finally, if $\ell_2 \leq \ell_1$ and $u_1 \leq u_2$ then add $\langle L,[\ell_1,u_1] \rangle$ to $I'(t)(c)$. Clearly, $I' = \sup(\mathcal{I}')$.

As both $\inf(\mathcal{I}')$ and $\sup(\mathcal{I}')$ exist and are clearly in $\mathcal{I}$ then the statement follows.

8.2 A single application of Γ can be computed in polynomial time

For interpretation $I$, $\Gamma(I)$ can be computed by conducting $O(|P| \cdot |V| \cdot t_{max} \cdot d_*^{in})$ satisfaction checks where $d_*^{in}$ is the maximum in-degree of a node in the network. (This combined with the assumption that the influence function is computed in constant time results in polynomial time computation for a single application of $\Gamma$.)
Proof. We note that a given rule will require the most satisfaction checks, as a rule will potentially affect a network atom of a certain label for each vertex-time point pair. By the definition of $RBnd$, a given rule clearly causes no more than $O(d_*^{in})$ satisfaction checks. As the number of rules is no more than $|P|$, the statement follows.
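The bound on a single application of $\Gamma$ is what makes iterating the operator to convergence practical. The Python sketch below shows only the generic iterate-until-unchanged loop. It is an illustration under stated assumptions, not the paper's operator: `toy_gamma` is a hypothetical stand-in that ignores time, labels, and rules, the three-node chain is invented for the example, and bounds are closed intervals.

```python
from typing import Callable, Dict, Tuple

# Assumption: an interpretation maps a node name to a closed bound (lo, hi).
Interp = Dict[str, Tuple[float, float]]


def intersect(a: Tuple[float, float], b: Tuple[float, float]) -> Tuple[float, float]:
    """Intersect two closed intervals (lo > hi would denote an empty bound)."""
    return (max(a[0], b[0]), min(a[1], b[1]))


def iterate_to_fixpoint(
    gamma: Callable[[Interp], Interp], start: Interp, max_steps: int = 10_000
) -> Tuple[Interp, int]:
    """Apply gamma until nothing changes; return the fixed point and the
    number of applications that actually tightened some bound."""
    current = start
    for steps in range(max_steps):
        nxt = gamma(current)
        if nxt == current:
            return current, steps
        current = nxt
    raise RuntimeError("no fixed point reached within max_steps")


# Hypothetical toy network: a chain v0 -> v1 -> v2.
EDGES = [("v0", "v1"), ("v1", "v2")]


def toy_gamma(interp: Interp) -> Interp:
    """Stand-in operator: a 'fact' pins v0 to [0.6, 1], and every node's lower
    bound is raised to that of its in-neighbor in the previous interpretation."""
    new = dict(interp)
    new["v0"] = intersect(new["v0"], (0.6, 1.0))
    for u, v in EDGES:
        new[v] = intersect(new[v], (interp[u][0], 1.0))
    return new


bottom: Interp = {"v0": (0.0, 1.0), "v1": (0.0, 1.0), "v2": (0.0, 1.0)}
fixpoint, k = iterate_to_fixpoint(toy_gamma, bottom)
```

In this toy run each application tightens exactly one bound along the chain, which mirrors the counting argument used for the convergence bound: every application before the fixed point must tighten at least one atom.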
8.3 Proof of Theorem 3.1

Given interpretation $I$ and program $P$, there exists a natural number $k$ s.t. $\Gamma_P^k(I) = \Gamma_P^{k+1}(I)$, and
\[ k \in O\left(|P| \cdot d_*^{in} \cdot t_{max} \cdot |E|\right) \]
where $d_*^{in}$ is the maximum in-degree in the network.

Proof. For a given vertex $i \in V$, we will use the notation $d_i^{in}$ to denote the number of incoming neighbors (of any edge type) and $d_*^{in} = \max_i d_i^{in}$. First we show that for a given $t \in \tau$, $i \in V$, and $L \in \mathcal{L}$, a given rule $r$ can tighten the bound on a network atom formed with $L$ no more than $(d_i^{in} + 1) \cdot (d_*^{in} + 1)$ times. This is because a given rule adjusts the bound on a network atom based on the number of eligible and qualifying neighbors, which are bounded by $d_i^{in}$ and $d_*^{in}$ respectively. At each application of $\Gamma$, at least one network atom must tighten. Hence, as there are only
\[ O\left(|P| \cdot d_*^{in} \cdot t_{max} \cdot \sum_i d_i^{in}\right) = O\left(|P| \cdot d_*^{in} \cdot t_{max} \cdot |E|\right) \]
tightenings possible, this is also the bound on the number of applications of $\Gamma$.

8.4 Proof of Lemma 3.2

If $I \models P$ and $I' \sqsubseteq I$ then $\Gamma(I') \sqsubseteq I$.

Proof. Suppose, BWOC, that $\Gamma(I') \sqsupset I$. Then, there exists some $L \in \mathcal{L}$, $t \in \tau$ and $c \in G$ s.t. $\langle L,bnd \rangle \in I(t)(c)$, $\langle L,bnd' \rangle \in I'(t)(c)$, and $\langle L,bnd'' \rangle \in \Gamma(I')(t)(c)$ s.t. $bnd \supset bnd''$ and $bnd' \supseteq bnd''$. There are four things that affect $bnd''$: facts, rules, integrity constraints and $bnd'$. Clearly, we need not consider the effect that either facts or $bnd'$ have on $bnd''$, as $I$ satisfies all facts and $I' \sqsubseteq I$. We also note that a given integrity constraint imposed by Definition 3.2 can tighten $bnd''$ no more than the associated bound in any model. Hence, there must be some rule $r = L \stackrel{\Delta t}{\leftarrow} f, (g_{edge}, g_{node}, h)_{ifl}$ that causes $bnd''$ to become less than $bnd$. As $bnd'' \neq bnd'$, we know that $t \in TTS(\Gamma(I'),c,r) \cap TTS(I',c,r)$. As a result, we have $\Gamma(I')(t-\Delta t)(c) \models f$ and $I'(t-\Delta t)(c) \models f$. Further, as $I \models P$, $I' \sqsubseteq I$, and no rule can modify a non-fluent atom, we have
\[ |Elig(v, g_{edge}, g_{node}, \Gamma(I')(t-\Delta t))| = |Elig(v, g_{edge}, g_{node}, I'(t-\Delta t))| = |Elig(v, g_{edge}, g_{node}, I(t-\Delta t))|. \]
Further, we know that as $I' \sqsubseteq I$, it must be the case that
\[ |Qual(v, g_{edge}, g_{node}, h, I(t-\Delta t))| \geq |Qual(v, g_{edge}, g_{node}, h, I'(t-\Delta t))|. \]
This implies, by Axiom 2, that $Bound(r,v,I(t-\Delta t)) \subseteq Bound(r,v,I'(t-\Delta t))$. This then implies that $bnd \subseteq bnd''$, which is a contradiction.

8.5 Proof of Theorem 3.3

$\Gamma_P^*$ is a minimal model for $P$.

Proof. Claim: If program $P$ is consistent then $\Gamma_P^*$ is a model of $P$.
Suppose, BWOC, that there is a fact in $P$ that $\Gamma_P^*$ does not satisfy. However, by the definition of $\Gamma$ and the definition of a fact, $\Gamma_P^*$ must satisfy all facts as the bound on the weight associated with each fact is included in the intersection. Further, we can also see by the definition of $\Gamma$ that $\Gamma_P^*$ strictly satisfies all non-fluent facts in $P$. We also note that the final application of the $\Gamma$ operator ensures that all integrity constraints are satisfied by $\Gamma_P^*$. Now, suppose, BWOC, that there is a rule in $P$ that $\Gamma_P^*$ does not satisfy. However, with each application of $\Gamma$, for each rule, we include the bound on the weight returned by the $Bound$ function for each time step in the target time set associated with that rule. As $\Gamma$ is applied to convergence, and new bounds are intersected with each application, then we know that all time points in any associated target time set are considered in the intersection.

Proof of Theorem: The above claim tells us that $\Gamma_P^* \models P$. Now consider interpretation $I$ s.t. $I \models P$. As $\bot \sqsubseteq I$, multiple applications of Lemma 3.2 tell us that $\Gamma_P^* \sqsubseteq I$. Hence, the statement follows.

8.6 Proof of Theorem 3.4

If $P$ is consistent, then CANON_PROC($P$) is the minimal canonical model of $P$.

Proof. CLAIM 1: If $P$ is consistent, then CANON_PROC($P$) is a canonical model of $P$.

Clearly, $I$ = CANON_PROC($P$) satisfies all facts and integrity constraints in $P$. Hence, we shall consider programs that only consist of rules in this proof. We say $I$ $L$-canonically satisfies $P$ iff $I$ canonically satisfies $\{r \in P \mid head(r) = L\}$. Clearly, $I$ canonically satisfies $P$ if for all $L \in \mathcal{L}$, $P$ is $L$-canonically satisfied by $I$. We say that $I$ is an $(L,c,q)$-canonically consistent interpretation if for $c \in G$, for the first $t \in \tau - TTS(I,c,L,P) - \{0\}$, $I(t)(c) \models \langle L,bnd \rangle$ where $\langle L,bnd \rangle \in I(t-1)(c)$. Consider some $L \in \mathcal{L}$ and $c \in G$. Clearly, $I$ is an $(L,c,0)$-model for $P$. Let us assume, for some value $q$, that $I$ is an $(L,c,q-1)$-model for $P$. Let time point $t$ be the $q$-th element of $\tau - TTS(I,c,L,P) - \{0\}$. Consider the time step before time $t$ is considered in the for-loop at line 4 of CANON_PROC, which causes the condition at line 5 to be true. By line 13, $\tau - TTS(I,c,L,P) - \{0\} \subseteq cur\_free[c][L]$. This means that $t$ is a member of both. Hence, when $t$ is considered at line 4, the condition at line 5 is true, causing $\langle L,bnd \rangle \in I(t)(c) \cap I(t-1)(c)$, and as the element $\langle L,bnd \rangle \in I(t-1)(c)$ is not changed here, we have shown that $I$ is an $(L,c,q)$-model for $P$. By the for-loop at line 6, for all $L \in \mathcal{L}$ and $c \in G$, $I$ is an $(L,c,q)$-model for $P$. Hence, at the for-loop at line 4, we can be assured that for $L \in \mathcal{L}$ and $c \in G$ that $I$ $(L,c,|\tau - TTS(I,c,L,P) - \{0\}|)$-satisfies $P$, which means that $I$ canonically satisfies $P$.

CLAIM 2: If $I$ is a canonical model for $P$, $cur\_interp \sqsubseteq I$ is an interpretation that also strictly satisfies all non-fluent facts in $P$, and $cur\_interp'$ is $cur\_interp$ after being manipulated in lines 6-10 of CANON_PROC, then $cur\_interp' \sqsubseteq I$.

We note that by the definition of satisfaction of a non-fluent fact, and the fact that both $cur\_interp$ and $I$ must strictly satisfy all non-fluent facts in $P$, we know that for all $c \in G$ and $L \in \mathcal{L}$ that:
\[ TTS(I,c,L,P) = TTS(cur\_interp,c,L,P) = TTS(cur\_interp',c,L,P) \]
Let us assume that lines 6-10 of the algorithm are changing $cur\_interp$ when the outer loop is considering time $t$ and that the condition at line 5 is true. Clearly,
\[ t \in \tau - TTS(I,v',L',P) - \{0\} \]