Language Generation and Verification in the NRL Protocol Analyzer
Catherine Meadows
Center for High Assurance Computing Systems
Naval Research Laboratory
Washington, DC 20375
USA
Abstract

The NRL Protocol Analyzer is a tool for proving security properties of cryptographic protocols, and for finding flaws if they exist. It is used by having the user first prove a number of lemmas stating that infinite classes of states are unreachable, and then performing an exhaustive search on the remaining state space. One main source of difficulty in using the tool is in generating the lemmas that are to be proved. In this paper we show how we have made the task easier by automating the generation of lemmas involving the use of formal languages.

1 Introduction

The NRL Protocol Analyzer is a tool for proving security properties of cryptographic protocols, and for finding flaws if they exist. In its most basic form, it is a search tool. A goal (usually an insecure state) is presented to it, and it attempts to find all paths to that state. However, exhaustive search in itself is not an adequate means of verifying the security of cryptographic protocols. This is because the state space is assumed to be infinite. For example, it is necessary to assume that an unbounded number of executions of a protocol may have taken place, and that a principal can be engaging in an arbitrarily large number of protocol executions at any given time. Moreover, for the purposes of analysis, very large sets, such as the set of keys available, or the set of words that can be produced by encrypting a word over and over again, are assumed to be infinite.

In order to deal with these problems, we have developed several ways in which users of the Analyzer can prove lemmas about the unreachability of infinite classes of states. One of the most important of these involves induction on formal languages. The user defines a formal language and uses the Analyzer to prove that, if an intruder trying to break the protocol has found a word in that language, then the intruder must have already known a word in that language. This, together with the fact that the intruder knows no words in the language initially (if that is the case), can be used to prove inductively that the intruder can never learn a word in that language. The procedure for proving a language unreachable has been automated, and is documented in [5].

Although automation of the language verification procedure was helpful, until recently it was up to the user to define the language himself or herself. This was not an easy procedure for complicated protocols, and required close inspection of Analyzer output, as well as of the output of the language verifier whenever it failed in a proof. Moreover, it was often possible to define a language that could be proved unreachable, but was actually somewhat smaller than necessary. It was difficult to detect when this had occurred, but failure to prove the largest possible language unreachable could result in an unmanageably large search space.

Fortunately, it is possible to describe a heuristic procedure for defining formal languages that avoids many of these problems, and this has been automated in the most recent version of the Analyzer. Although this procedure does not guarantee the largest possible language, we have used it to prove unreachability of languages that are large enough to be useful, and we have found that it saves a significant amount of labor. In this paper we describe how this procedure works.
Report documentation (Standard Form 298): report date 1996; dates covered 00-00-1996 to 00-00-1996; title: Language Generation and Verification in the NRL Protocol Analyzer; performing organization: Naval Research Laboratory, Center for High Assurance Computer Systems, 4555 Overlook Avenue, SW, Washington, DC 20375; distribution: approved for public release, distribution unlimited; security classification: unclassified; number of pages: 14.
2 How Languages Are Used in the Analyzer

The NRL Protocol Analyzer is written in Prolog and relies upon equational unification. It employs the worst-case model used by Dolev and Yao [1], in which the network is controlled by a hostile intruder who can read all messages, destroy messages, and create or modify messages. Since the intruder can read all messages, any message sent may be assumed to have been received by the intruder, and since the intruder controls what messages are received, any message received is assumed to have been sent by the intruder. Thus we can think of the protocol as an algebraic system that is manipulated by the intruder. We also assume that the intruder may be a legitimate participant in the protocol, and so has access to some (but not all) of the secret keys used, and also has the ability to perform operations such as encryption available to legitimate participants. As in the work of Dolev and Yao, words in the Protocol Analyzer model obey a set of rewrite rules specified by the user. For example, the user may want to specify that encryption of a word with a key, followed by decryption with the same key, reduces to the original word.

In the Protocol Analyzer, words sent in messages or stored in local state variables are represented by terms that are made up of function symbols, constants and variables. A protocol itself is specified as a set of state transitions in which the input state is described by messages received and the values of local state variables, and the output state is described in terms of messages sent and new values of local state variables. Actions local to the intruder, such as the intruder's performing encryption or decryption, are represented internally in the same way. States are specified in terms of a set of words known by the intruder and sets of local state variables and their values. An example of a local state variable would be one holding the key that an honest principal is using to converse with another. An example of an insecure state would be one in which that state variable contains the word K and K is known by the intruder.

The Protocol Analyzer finds a complete description of all states preceding a specified state in the following way. For each state transition a subset O of the output is paired up with a subset S of the state description. The Analyzer uses a narrowing algorithm [7] to find a complete set of substitutions $\Sigma$ to the variables in O and S such that the words in O can be made equal to the words in S by the application of rewrite rules. By complete, we mean that, if $\tau$ is a substitution such that $\tau O$ is reducible to $\tau S$, then there is a $\sigma$ in $\Sigma$ such that $\tau = \mu\sigma$ for some substitution $\mu$. The preceding state consists of the input to the state transition and the part of the state description that was not used to produce O.

Languages can arise when we attempt to find out how the intruder can find a word, and we find ourselves in an infinite regression. Consider the following very simple protocol, with two rules:¹

Encrypt-Decrypt Protocol

Protocol Rule 1
If the intruder knows X and Y, then he or she can find e(X,Y), where e(X,Y) denotes the encryption of Y with key X.

Protocol Rule 2
If the intruder knows X and Y, then he or she can find d(X,Y), where d(X,Y) denotes the decryption of Y with key X.

The words used in the encrypt-decrypt protocol also obey two rewrite rules, d(X,e(X,Y)) → Y and e(X,d(X,Y)) → Y.

¹ It is actually trivial to verify that the intruder can't learn any words in this protocol, since the intruder knows no words initially and every rule that produces a word requires that the intruder knew a word previously. Thus the language technique is overkill here. However, we find the simplicity of this protocol makes it helpful as an initial example.

Suppose that we want to find all the words that the intruder can know. We ask how the intruder can find Z, where Z is a variable that can stand for any irreducible word. Using Protocol Rule 1, the Analyzer will tell us that this can be done if the intruder knows d(W,Z) and W for some W, or if Z = e(X,Y) and the intruder knows X and Y. We label these solutions 1 and 2. Using Protocol Rule 2, the Analyzer will tell us this can be done if the intruder knows e(W,Z) and W for some W, or if Z = d(X,Y) and the intruder knows X and Y. We label these solutions 3 and 4.

Suppose that we continue our search on solution 3 and ask how the intruder can find e(W,Z). Using Protocol Rule 1, the Analyzer will tell us that it can be found if the intruder can find X = W and Y = Z (solution 3.1), or if the intruder can find X = X1 and Y = d(X1,e(W,Z)) (solution 3.2), in which case e(X,Y) = e(X1,d(X1,e(W,Z))) will reduce to e(W,Z). Next, using Protocol Rule 2, the Analyzer will tell us that e(W,Z) can be found if the intruder can find X = X1 and Y = e(X1,e(W,Z)) (solution 3.3), in which case d(X,Y) = d(X1,e(X1,e(W,Z))).

We can rule out the solution 3.1 found using Protocol Rule 1, since it requires the intruder to know Z, which is the word he or she is trying to find. Thus, we only need to know how the intruder can find the words
in the second two solutions. For 3.2 and 3.3, if we ask the Protocol Analyzer how to find d(X1,e(W,Z)) and e(X1,e(W,Z)), the reader can verify that the intruder can find the first if he or she can find e(X2,d(X1,e(W,Z))) or d(X2,d(X1,e(W,Z))), and the second if he or she can find e(X2,e(X1,e(W,Z))) or d(X2,e(X1,e(W,Z))).

If we keep applying the Protocol Analyzer, we will generate ever longer and longer words in this fashion. Thus, our search will be made easier if we can prove that, whenever Z is a word not already known by the intruder, then it is impossible for the intruder to learn e(X,Z) for any X.

We note that the patterns of words we obtained by looking for e(X,Z) showed a certain regularity. This leads us to define the following language A, whose definition is dependent upon the state of the intruder's knowledge:

1. A → e(L,K), where L is the set of all irreducible words and K is the set of all irreducible words not currently known by the intruder.

2. A → e(L,A)

3. A → d(L,A)

We now prove that A is unreachable by taking each language rule of A, substituting variables for the terms, running the Protocol Analyzer on the resulting words, and examining the words that must be input by the intruder. In each case, we try to determine that one of these words must also belong to the language. We will illustrate this procedure by showing that the intruder's learning a word satisfying the second language rule implies that the intruder must already know a word belonging to A. The procedure for the other two language rules is similar.

We take the word e(C,B), where B is assumed to be a member of A. Applying Protocol Rule 1, which says that if the intruder can produce X and Y, then he or she can produce e(X,Y), gives us two solutions. In the first solution, C is unified with X and B is unified with Y. In the second, Y is unified with d(X,e(C,B)). The output e(X,d(X,e(C,B))) reduces to e(C,B). The first solution requires the intruder's previous knowledge of B, which is assumed to be a member of A. We now consider the second solution. This requires the intruder to know Y = d(X,e(C,B)). But, since e(C,B) is a member of the language A according to the second language rule, d(X,e(C,B)) belongs to A by the third language rule. Thus this solution requires the intruder's previous knowledge of a member of A.

Applying Protocol Rule 2, which says that if the intruder knows X and Y, then he or she can learn d(X,Y), gives us one solution, in which Y is unified with e(X,e(C,B)), giving output d(X,e(X,e(C,B))), which reduces to e(C,B). Thus the intruder must know X and Y = e(X,e(C,B)). Since e(C,B) is a member of A, e(X,e(C,B)) must belong to A by the second language rule, and we are done.

This procedure of proving a language unreachable has been automated in the Protocol Analyzer.

3 Notation and Definitions

In this section we outline some of the basic notation and definitions used in the rest of the paper.

3.1 Elementary Definitions

Definition: Let X be a term. The size of X is the number of subterms of X.

Thus, the size of g(Y) is 2, since g(Y) and Y are the subterms. Likewise, the size of g(Z,b(t),r(s(q))) is 7.

Definition: Let X be a term and Y be a subterm of X. We define an occurrence $\omega$ of Y in X as follows. If Y = X, then the occurrence of Y in X is $\epsilon$. If Y is the i'th argument of X, then the occurrence of Y in X is i. If the occurrence of Z in X is $i_1.\ldots.i_k$, and Y is the j'th argument of Z, then the occurrence of Y in X is $i_1.\ldots.i_k.j$. If $\omega = i_1.\ldots.i_k$ is an occurrence, we call each $i_j$ a component of $\omega$.

Note that there can be more than one occurrence of a subterm in a term. Thus, if X = f(g,h(b),t(g)), g occurs at 1 and 3.1.

We can also compare occurrences, as follows.

Definition: Let $\omega_1$ and $\omega_2$ be two occurrences. We say that $\omega_1 > \omega_2$ if either the number of components of $\omega_1$ is greater than that of $\omega_2$, or $\omega_1$ and $\omega_2$ have the same number of components and i < j, where i and j are the first nonequal components of $\omega_1$ and $\omega_2$, respectively.

Thus, for example, 1.1.1 > 3.2, and 3.2 > 3.1.

Definition: A substitution is a function from variables to terms. If a substitution assigns term T to variable V, we represent this by V/T. The identity substitution is designated by $\iota$.

Definition: Let X and Y be two terms. We say that a substitution $\sigma$ is a unifier of X and Y if $\sigma X = \sigma Y$. We say that $\sigma$ is a most general unifier, or mgu, of X and Y if $\sigma$ is a unifier and, for any other unifier $\tau$, $\tau = \mu\sigma$ for some $\mu$. Most general unifiers are unique up to renaming of variables.

The following definition is somewhat nonstandard, but we use it because it describes the type of unification that is used in the Protocol Analyzer.
Definition: Let X and Y be two terms, and let E be a set of equations. We say that $\sigma$ is a left-handed unifier of X and Y with respect to E (or simply a left-handed equational unifier of X and Y when we can avoid confusion), if $\sigma X$ can be made equal to $\sigma Y$ by applying the equations from E to $\sigma X$.

This differs from the usual definition of equational unifier, in which the equations can be applied to both terms being unified. However, we are interested in the case in which E is a set of reduction rules, and $\sigma Y$ is assumed to be irreducible.

3.2 Definitions Related to the Analyzer

A protocol is specified in the Analyzer as a set of state transition rules. These are stored as Prolog clauses. We also assume, as is the case for Prolog, that quantification of variables in rules is existential. Thus, whenever a rule is used, its variables are renamed, so that substitutions made to variables in one use of the rule have no relation to substitutions made in any subsequent use of the rule.

The Analyzer is used by having the user specify a goal G consisting of words to be learned by the intruder, values of local state variables, and/or a sequence of events that should have occurred. The Analyzer returns a set of solutions for G. Each solution consists of an output state S (derived from the output of a state transition rule), a left-handed equational unifier $\sigma_S$ of a subset T of S and a subset H of G, and an input state $S'$ that immediately precedes $\sigma_S G$, consisting of $\sigma_S R$, where R was the input of the rule, and any elements of $\sigma_S G$ not in $\sigma_S H$. Note that, as we saw from the example in Section 2, the output of a rule can have more than one left-handed equational unifier with a goal; thus more than one solution can be generated from a single rule. The Analyzer can be used to query $S'$ or a portion of it; it will return a set of solutions as before, each consisting of a state $S''$, a left-handed equational unifier $\sigma_{S''}$ of $S''$ and $S'$, and an input state $S'''$.

Solutions are referred to as follows. Suppose that a goal $G_N$ is identified by an integer N. The solutions for $G_N$ are identified by N.1, ..., N.k. The solutions found for N.i are identified by N.i.1, ..., N.i.n, and so on. If $N.i_1.\ldots.i_t$ identifies a solution, we refer to the state output by the rule that produced $N.i_1.\ldots.i_t$ as $S_{N.i_1.\ldots.i_t}$, to the input state of the solution as $T_{N.i_1.\ldots.i_t}$, and to the restriction of the left-handed equational unifier of $S_{N.i_1.\ldots.i_t}$ and $T_{N.i_1.\ldots.i_{t-1}}$ to the variables in $T_{N.i_1.\ldots.i_{t-1}}$ as $\sigma_{N.i_1.\ldots.i_t}$. We refer to $N.i_1.\ldots.i_k$ as the index of the solution $S_{N.i_1.\ldots.i_k}$.

To see how this works, in our example in Section 2, our original goal was the state in which the intruder knew the word Z. Suppose that we assigned that goal the integer 1. The solutions we generated by trying to find the word Z would be indexed as 1.1, 1.2, 1.3, and 1.4. When we asked how to find the word e(W,Z) in solution 1.3, the answers we got would be indexed as 1.3.1, 1.3.2, and 1.3.3.

Let $T_{N.i_1.\ldots.i_t}$, $T_{N.i_1.\ldots.i_{t-1}}$, ..., $G_N$ be a sequence of goals found by the Analyzer, where $G_N$ is the original goal. Let $\tau(N.i_1.\ldots.i_t;\ N.i_1.\ldots.i_j)$ denote the composition of $\sigma_{N.i_1.\ldots.i_j}$, $\sigma_{N.i_1.\ldots.i_{j+1}}$, through $\sigma_{N.i_1.\ldots.i_t}$. Then $T_{N.i_1.\ldots.i_t}$, $\tau(N.i_1.\ldots.i_t;\ N.i_1.\ldots.i_t)\,T_{N.i_1.\ldots.i_{t-1}}$, ..., $\tau(N.i_1.\ldots.i_t;\ N.i_1)\,G_N$ represents a path through the protocol. That is, it is possible to proceed from the state described by $T_{N.i_1.\ldots.i_t}$ to the state described by $\tau(N.i_1.\ldots.i_t;\ N.i_1.\ldots.i_t)\,T_{N.i_1.\ldots.i_{t-1}}$, and so forth until ultimately the state described by $\tau(N.i_1.\ldots.i_t;\ N.i_1)\,G_N$ is reached. We call this a path from $T_{N.i_1.\ldots.i_t}$ to $\tau(N.i_1.\ldots.i_t;\ N.i_1)\,G_N$, or a path from $T_{N.i_1.\ldots.i_t}$ to $G_N$ when we can avoid confusion. We refer to the triple ($N.i_1.\ldots.i_t$, $T_{N.i_1.\ldots.i_t}$, $\tau(N.i_1.\ldots.i_t;\ N.i_1)\,G_N$) as an input state triple. We say that (M, T, $\mu G$) precedes (R, S, $\tau G$) if they are on the same path and R is a prefix of M.

To see how this works, consider the encrypt-decrypt protocol again. We start with goal word Z, with label 1. Consider solution 1.2, which used Protocol Rule 1, which says that if the intruder knows X and Y, he or she can produce e(X,Y), to deduce that if Z = e(X,Y) the intruder can learn Z if he or she knows X and Y. In this case $\sigma_{1.2}$ is the substitution Z/e(X,Y). Suppose that we apply Protocol Rule 1 to Y again, to obtain solution 1.2.2, in which the intruder can learn Y if Y = e(X1,Y1) and the intruder knows X1 and Y1. In this case $\sigma_{1.2.2}$ = Y/e(X1,Y1), and $\tau(1.2.2;\ 1.2)$, the composition of $\sigma_{1.2}$ and $\sigma_{1.2.2}$, is Z/e(X,e(X1,Y1)). The input state triple is (1.2.2, {X1,Y1,X}, e(X,e(X1,Y1))).

4 How Languages are Represented and Verified in the Protocol Analyzer

A language rule is represented in the Protocol Analyzer database as a clause of the form

    languagerule(N,langmember(W,Langname),
        Conditions)

where N is an integer identifying the rule, W is a word, and Conditions is a set of conditions, which may include conditions saying that certain subterms of W are members of languages. Thus, an example of a language rule would be

    languagerule(5,langmember(e(A,B),seskey),
        langmember(B,seskey)).
which would be the Analyzer's internal representation of the language rule

    Seskey → e(L,Seskey).

Since for the remainder of this paper we will be describing the way in which the Protocol Analyzer deals with languages internally, we will use this internal representation of these language rules from now on.

The exact way in which language membership is verified is described in [5], so we do not go into detail here. Briefly, the outline is this.

A word belongs to a language Lang if and only if it is of the form $\sigma W$, where

    languagerule(N,langmember(W,Lang),C)

is a language rule, $\sigma W$ is irreducible, and $\sigma C$ is true. The condition C consists of the conjunction of conditions of the form langmember(W,Lang), not(W = V), and lookedfor(Y), where Y is a subterm of W.² The first condition is self-explanatory. The second, not(W = V), is interpreted to mean that there is no unifier $\sigma$ of W and V so that $\sigma$ is the identity on W (that is, V does not subsume W). The third, lookedfor(Y), is interpreted to mean that the intruder has not yet learned the word Y.

² A user defining a language actually has more leeway than this in defining conditions, but this describes the form of the conditions generated by the procedure described in this paper.

We use these facts to implement two Prolog procedures. One, expandconditionsallsubs, given a condition C returns a complete set S of substitutions $\sigma$ and conditions E such that E implies $\sigma C$. By complete we mean that, if $\tau$ is a substitution, then there is a (possibly empty) subset T of S such that, if $(\sigma_i, E_i) \in T$, then $\tau = \mu_i\sigma_i$ and $\tau C$ holds if and only if the logical disjunction of all $\mu_i E_i$ in T holds. The other procedure, expandconditionsalways, given a condition C returns a condition E such that E implies C.

The procedure expandconditionsallsubs is computed as follows. For each occurrence of langmember(X,L) in a condition C where X is not a variable, it finds a rule languagerule(M,langmember(Y,L),D), and the most general unifier $\tau$ of X and Y. It then replaces langmember($\tau$X,L) in $\tau C$ with $\tau D$. It continues making these substitutions and replacements until no further nonvariable occurrences of langmember(X,L) can be found. The resulting condition is E, and this together with the substitution $\sigma$ obtained by composing the most general unifiers obtained and restricting to the variables in C is called an expansion pair ($\sigma$,E). When queried repeatedly, expandconditionsallsubs will produce the set of all expansion pairs. If expandconditionsallsubs finds no such unifiers, it produces the single expansion pair ($\iota$,C).

As an example, we consider the language enckey described below, and consider the condition langmember(e(W,Z),enckey). The three language rules stored as Prolog clauses are of the form:

    languagerule(1,
        langmember(e(X,key(A)),enckey),ok).
    languagerule(2,langmember(e(X,Y),enckey),
        langmember(Y,enckey)).
    languagerule(3,langmember(d(X,Y),enckey),
        langmember(Y,enckey)).

For the first application of expandconditionsallsubs, we unify e(W,Z) with e(X,key(A)) from Rule 1. The resulting condition is ok, that is, e(X,key(A)) is always in the language. For the second, we unify e(W,Z) with e(X,Y) from Rule 2 to obtain the condition langmember(Z,enckey). In the case of Rule 3, we fail to unify d(X,Y) with e(W,Z). Thus, there are two expansion pairs produced: (Z/key(A),ok) and ($\iota$,langmember(Z,enckey)), where $\iota$ is the identity substitution.

The procedure expandconditionsalways is computed as follows. As in the case of expandconditionsallsubs, for each occurrence of langmember(X,L) in a condition C where X is not a variable, it finds a rule languagerule(M,langmember(Y,L),D), and a most general unifier $\tau$ of X and Y. However, it only succeeds if such a $\tau$ can be found that is the identity on C, that is, if Y subsumes X. If it does succeed, it replaces langmember($\tau$X,L) = langmember(X,L) in $\tau C$ = C with $\tau D$. It continues making these substitutions and replacements until no further nonvariable occurrences of langmember(X,L) can be found. It computes all conditions that can be calculated this way, and returns the disjunction of these conditions.

Consider again the language enckey and the condition langmember(e(W,Z),enckey). For the first language rule, e(X,key(A)) does not subsume e(W,Z) because the substitution Z/key(A) is not the identity on Z. On the other hand, for the second language rule, the substitution is the identity. For the third language rule, there is no unifier. Thus the final result of expandconditionsalways is langmember(Z,enckey). This condition will imply langmember(e(W,Z),enckey) no matter what substitutions are made to W and Z.

We now use the procedures expandconditionsallsubs and expandconditionsalways to produce a proof that knowledge of a member of a language implies previous knowledge. Our strategy is, for each language rule N defining a word W, to attempt to construct paths to
W, working backwards from W. We identify the state in which the intruder knows W as goal N, corresponding to our notation in Section 3. We use a breadth-first search strategy, first finding all states that can immediately precede goal N, then the states that can immediately precede each of those states, and so forth.

Each time we produce an input state, we attempt to prove that it contains a member of the language for all possible substitutions making W a member of the language. This is done as follows, by a procedure we call the language membership verification procedure. Let ($N.i_1.\ldots.i_k$, $T_{N.i_1.\ldots.i_k}$, $\sigma(N.i_1.\ldots.i_k;\ N.i_1)\{W,C\}$) be an input state triple produced by a search for {W,C}. We would like to show that, whenever $\sigma(N.i_1.\ldots.i_k;\ N.i_1)W$ is a member of Lang, then there is a word known by the intruder in $T_{N.i_1.\ldots.i_k}$ that is a member of Lang. We begin by determining all cases in which $\sigma(N.i_1.\ldots.i_k;\ N.i_1)C$ can hold. We do this by invoking expandconditionsallsubs on $\sigma_{N.i_1.\ldots.i_k}C$. For each expansion pair ($\tau$,E) produced, we look at each word $\tau V$ in $\tau T_{N.i_1.\ldots.i_k}$. We execute expandconditionsalways on langmember($\tau$V,Lang) to produce a condition F that implies langmember($\tau$V,Lang). We then attempt to show that E implies F. If, for each expansion pair ($\tau$,E), there is some $\tau V$ such that we can prove that this holds, then we will have proved our result for the input state $T_{N.i_1.\ldots.i_k}$. We mark the solution $S_{N.i_1.\ldots.i_k}$ as a success and do not attempt to prove the result for any solution preceding $S_{N.i_1.\ldots.i_k}$. Otherwise, we use the Analyzer to produce all solutions that can immediately precede $S_{N.i_1.\ldots.i_k}$ and perform the language membership verification procedure on the input state for each solution.

We continue in this fashion until one of three things happens: we have shown that all paths to W must contain a member of the language Lang; we encounter a path from an initial state to W that cannot be proved to contain a member of the language; or we encounter a path of length Q that cannot be proved to contain a member of the language, where Q is a parameter maintained by the system. In the first case, we will have succeeded in proving that intruder knowledge of W implies previous intruder knowledge of a member of Lang, and in the other two cases we will have failed.

As an example, consider the language enckey again, and consider the second language rule, which has language member e(W,Z) with condition langmember(Z,enckey). Suppose that we ask the Analyzer how the intruder can find e(W,Z), and it tells us that this can be done if the intruder knows d(R,Z). If we label e(W,Z) with the integer 1, the corresponding input triple will be (1.1, d(R,Z), {e(W,Z), langmember(Z,enckey)}). Computing expandconditionsallsubs on langmember(Z,enckey), we will obtain the single expansion pair ($\iota$,langmember(Z,enckey)). For this expansion pair, we attempt to compute expandconditionsallsubs on langmember(d(R,Z),enckey). The only rule we can apply is languagerule(3, langmember(d(X,Y),enckey), langmember(Y,enckey)). This results in the condition langmember(Z,enckey). Since langmember(Z,enckey) implies langmember(Z,enckey), we are done.

A more detailed description of this process, with further examples and an outline of how it is implemented in Prolog, is given in [5].

5 How the Protocol Analyzer Generates Languages

5.1 Overview of this Section

We will present the Protocol Analyzer's language generation procedure in the following way. First, we will describe the general procedure for generating languages. Then we will describe in detail the various types of language rules that can be generated when we fail to show that a state contains a word belonging to the language we are trying to define. Once this is done, we will focus more broadly and describe the language generation process itself, dividing it into stages and describing each stage in detail.

5.2 How Rules are Generated

The strategy the Protocol Analyzer uses to generate languages is to start with one language rule, supplied by the user. It then attempts to prove the language unreachable by proving that knowledge of a word in the language implies previous knowledge of a word in that language. In each case in which it fails to do so, it either creates a rule that implies that one of the words the intruder must know previously is in the language, or modifies an old rule so that the particular word that is being verified is no longer in the language. The Analyzer now attempts to prove the new language unreachable, and adds or modifies rules as before. It continues this process until it either succeeds in proving the language unreachable or is unable to create any new rules. There is also the possibility that the Analyzer may get into an infinite loop, in which case it will fail after a certain number of iterations, which can be specified by the user.

The only input required from the user is to name the language, to input the first language rule, called the seedword rule, and to choose the search depth and strategy. The procedure for generating the seedword
rule is fairly straightforward. Languages usually arise out of the user's trying to prove a particular word unreachable. What the user often finds instead is a set of words that contain that word, or a portion of that word, as a subword, and which defines the language. We call the original word the user is trying to find the seedword of the language. Thus we begin by having the user specify the seedword S. In some cases, the seedword may contain a subword that is being looked for by the intruder. We allow the user to specify this one condition. Thus, if the user is trying to find out how to find e(X,Y), where Y is not known by the intruder, and specifies the name "encrypt" for the language, the Analyzer will construct the initial seedword rule

    languagerule(1,langmember(e(X,Y),encrypt),
        lookedfor(Y)).

The basic scenario for generating rules is this. Suppose that we are given a rule of the form

    languagerule(N,langmember(W,Lang),C)

and we are attempting to prove that, if C holds, then every path leading to a state S in which the intruder knows W and the conditions in C hold contains a state in which the intruder knows a member of Lang. That is, for each path, we want to show that there is an input triple (M, T, $\sigma\{W,C\}$) such that T contains a word of Lang. We attempt to prove that T contains a member of Lang by running expandconditionsallsubs on $\sigma C$, and for each expansion pair ($\tau$,E) generated, attempt to prove that E implies that $\tau T$ contains a word in Lang by showing that, for at least one X in $\tau T$, E implies the result of running expandconditionsallsubs on langmember(X,Lang). If we fail to do so, we generate

We add to the condition $C'$ the condition not(U = R) (that is, we are saying that U in general cannot be found, except possibly for the case U = R).

A rule of type III is generated when W contains a lookedfor word Y, and the input word contains Y but not W. We generate a new rule or set of rules that say that the input word is in the language as long as Y is a lookedfor word.

These rules are described in detail below.

5.3 The Different Types of Rules

5.3.1 Rules of Type I

We have already encountered rules of type I in Section 2. Suppose that $\tau T$ contains a word U either containing W as a subword or a word X such that langmember(X,Lang) appears in E. We can make $\tau T$ contain a member of Lang by adding the rules

    languagerule(Ni, langmember(Vi,Lang),
        langmember(Yi,Lang))

where the Vi and Yi are created as follows.

Let V be the smallest subterm of U containing W (or X) such that membership of V in Lang would imply membership of U in Lang. If there is more than one such subterm, choose the one with the least occurrence. Let $i_1.i_2.\ldots.i_k$ be the least occurrence of W (or X) in V. Let V1 be the result of replacing the term occurring at $i_1$ in V by the variable Y1. The language rule

    languagerule(N1, langmember(V1,Lang),
        langmember(Y1,Lang))

will guarantee that V is in Lang as long as the term occurring at $i_1$ is. Similarly, if Z is the term occurring
a new rule.
at i1:i2:::ij inV,where j <k, let Vj+1 be the result of
Thewayinwhichthenewruleisgenerated depends
replacinginZ the termoccurring ati1:i2:::ij:ij+1 with
upon the structure of the words in (cid:28)T. We classify
the variable Yj+1. The language rule
rules as of type I, II, or III, depending upon how they
are generated. languagerule(Nj+1, langmember(Vj+1,Lang),
Brie(cid:13)y, a rule of type I is generated when an input langmember(Yj+1,Lang))
word is generated containing a word Z known to be a
member of the language as a subword. We replace Z will guarantee that Z is in Lang as long as the term
or some subword containing Z with a variable Y, and occurringati1:i2:::ij:ij+1. Together,alltheseruleswill
generate a rule or set of rules saying that this word is imply that V is in Lang as long as W (or X is), and
in the language as long as Y is in the language. hence that U is in Lang as long as W or X is.
A rule of type II is generated when we (cid:12)nd that We call rules generated in this way rules of type I.
(cid:27)W can be obtained for some (cid:27). Let R be a subword For example, suppose that we started out with the
occurring in(cid:27)W such that R is in the language. Then seedword rule
R = (cid:22)U, fromsome language rule
languagerule(1,langmember(e(X,Y),encrypt),
0
languagerule(N,langmember(U,Lang),C). ok)
and the Analyzer discovered an input state triple (M,T,{e(X,Y),ok}) where T is the state in which the intruder knows e(R,(Q,d(Z,e(X,Y)))), where ( , ) is the concatenation function. The result of applying expandconditionsallsubs to the condition lookedfor(Y) is the expansion pair (ι,ok), where ι is the identity. Clearly, ok does not imply that e(R,(Q,d(Z,e(X,Y)))) is a member of encrypt, so we attempt to generate a rule of Type I. The smallest subterm of e(R,(Q,d(Z,e(X,Y)))) whose membership in encrypt would imply membership of the whole word in encrypt is (Q,d(Z,e(X,Y))). So in this case we would generate two rules:

languagerule(2,langmember((Q,Y1),encrypt),
    langmember(Y1,encrypt)).

languagerule(3,langmember(d(Z,Y2),encrypt),
    langmember(Y2,encrypt)).

Notice that it would also be possible to generate the single language rule

languagerule(2,
    langmember((Q,d(Z,Y1)),encrypt),
    langmember(Y1,encrypt)).

However, we prefer to generate the multiple rules, first, because they result in a larger language, and secondly, because they result in simpler, more uniform-appearing languages for which it is easier to develop faster algorithms for verifying membership.

5.3.2 Rules of Type II

In a number of cases, it will not be possible to produce a rule or rules of Type I. But it may be that τσW contains a subterm V satisfying

languagerule(Q,langmember(X,Lang),C')

for some language rule, that is, there is a substitution μ such that V = μX and μC' holds. If μ is not the identity, we can modify the language rule to

languagerule(Q, langmember(X,Lang),
    (C', not(X=V))).

We also have the option, if C' contains a condition of the form lookedfor(Z), and Z, and U = μZ, of modifying the language rule to be

languagerule(Q, langmember(X,Lang),
    (C', not(Z=U))).

The latter can be helpful if language rules of type III are to be generated, as we will see in the next section. We call either type of rule a rule of Type II.

For example, suppose we are trying to generate the language encrypt2 from the seedword rule

languagerule(1,langmember(e(X,Y),encrypt2),
    lookedfor(Y))

and that the Analyzer generated the input triple (M,φ,{e(key(A),rand(A,N)),ok}) where φ is the empty set. Then, depending upon which strategy we are using, we can generate one of the following rules of type II that will guarantee that e(key(A),rand(A,N)) no longer satisfies Rule 1:

languagerule(1,langmember(e(X,Y),encrypt2),
    (lookedfor(Y),
    not(e(X,Y) = e(key(A),rand(A,N)))))

or

languagerule(1,langmember(e(X,Y),encrypt2),
    (lookedfor(Y),
    not(Y = rand(A,N)))).

At this point in the implementation of the Protocol Analyzer, we restrict ourselves to modifying seedword rules and rules of Type III when we generate rules of Type II. Rules of Type III are described in the next section.

5.3.3 Rules of Type III

A third type of rule, which arises more rarely than the other two, is used only in the case in which C contains a condition lookedfor(Y) where Y is a subterm of W. Suppose that we have an input triple (M,T,σ{W;C}) and an expansion pair (τ;E) such that τT contains a word U containing τσY as a subterm, but U does not contain τσW. This can be used to generate what we call rules of Type III.

Our procedure for generating rules of Type III is similar to that for generating rules of Type I. We add the rules

languagerule(N_i, langmember(V_i,Lang),
    langmember(Y_i,Lang))

where the V_i and Y_i are created as follows.

Let V be the smallest subterm of U containing τσY such that membership of V in Lang would imply membership of U in Lang. If there is more than one such subterm, choose the one with the least occurrence. Let i_1.i_2. ... .i_k be the least occurrence of V in U. Let V_1 be V after the term occurring at i_1 has been replaced by
the variable Y_1. Similarly, if Z is the term occurring at i_1.i_2. ... .i_j, where j < k, let V_{j+1} be Z after the term occurring at i_1.i_2. ... .i_j.i_{j+1} has been replaced with the variable Y_{j+1}.

For j from 1 to k-1, we add the rules

languagerule(N_j, langmember(V_j,Lang),
    langmember(Y_j,Lang)).

For j = k, we add the rule

languagerule(N_k, langmember(V_k,Lang),
    (lookedfor(Y_k), C))

where Y_k = Y and V_k is the smallest subterm of U not equal to Y containing Y.

The remaining part of the condition C is constructed as follows. Suppose that the current form of the seedword rule is

languagerule(1,langmember(W,Lang),
    lookedfor(X)).

Then C is empty. On the other hand, if the current form of the seedword rule is

languagerule(1,langmember(W,Lang),
    lookedfor(X),D),

where D is the concatenation of conditions of the form not(X = S), then C is set equal to D. If the seedword rule is of any other form, the procedure fails.

For example, suppose that we attempted to define a language with the seedword rule

languagerule(1,langmember(e(X,Y),encrypt2),
    lookedfor(Y))

and at some point this had been replaced with the rule of Type II

languagerule(1,langmember(e(X,Y),encrypt2),
    (lookedfor(Y),not(Y = rand(A,N)))).

Suppose that the Analyzer found an input state triple of the form (N,T,{e(X,Y),lookedfor(Y),not(Y = rand(A,N))}). Suppose, furthermore, that T was found to contain a word d(Z,Y) for some Z. Then we could construct a rule of Type III of the form

languagerule(3,langmember(d(X,Y),encrypt),
    (lookedfor(Y),not(Y = rand(A,N)))).

5.4 Strategies for Generating Rules

Our next problem is to choose a strategy for generating rules. In many cases we will have a choice between generating a rule of type I, II, or III. The strategy we have chosen is to prefer rules of type I over rules of type II, since adding rules of type I makes the language larger (which is preferable), and adding rules of type II makes it smaller. However, although rules of type III also extend the language, it turns out that they in turn must satisfy additional constraints in order to make them consistent with the initial seedword rule, which in turn also limits the size of the language. Thus we generate rules of type III only as a last resort.

The use of rules of Type III puts a constraint on the generation of rules of Type II in the following way. Suppose that we have generated a rule of type II

languagerule(N, langmember(W,Lang),
    Conditions)

where Conditions contains a condition of the form not(W = Z). Suppose that next we generate a rule of type III

languagerule(M,langmember(V,Lang),
    Conditions).

If W contains variables not in V, the rule of type III generated may be vacuous. For example, suppose that the rule of type II is

languagerule(1,langmember(e(X,Y),encrypt),
    (lookedfor(Y), not(e(X,Y) = e(key(A),Y))))

and the rule of type III is

languagerule(3,langmember(d(Z,Y),encrypt),
    (lookedfor(Y),
    not(e(X,Y) = e(key(A),Y)))).

The second language rule says that a word in the language must satisfy the condition that there is no X such that e(X,Y) = e(key(A),Y), which is patently false. The fact that the first language rule says that this is the case for a particular value of X does not imply the result for the second language rule.

We get around this by using a different strategy for computing rules of Type II when we expect to encounter rules of Type III. Given a rule

languagerule(N,langmember(W,Lang),
    Conditions)
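As a rough illustration of the vacuity problem discussed in Section 5.4, the following Python sketch compares the variables on the left-hand side of an inequality condition not(W = Z) with the variables in the head of a Type III rule that inherits the condition. It is not part of the Analyzer. The tuple encoding of terms, the convention that names beginning with an upper-case letter are variables, and the function names variables and unbound_variables are all assumptions made for this example.

```python
# Terms are modelled as nested tuples: ("e", "X", "Y") stands for e(X,Y).
# Assumption of this sketch: a string starting with an upper-case letter
# is a variable, as in the Prolog-style rules of the text.

def variables(term):
    """Return the set of variable names occurring in a term."""
    if isinstance(term, str):
        return {term} if term[:1].isupper() else set()
    # term is (functor, arg1, arg2, ...): skip the functor name
    result = set()
    for arg in term[1:]:
        result |= variables(arg)
    return result


def unbound_variables(inequality_lhs, type3_head):
    """Variables of the inequality's left-hand side missing from the head.

    A non-empty result flags a Type III rule that may be vacuous.
    """
    return variables(inequality_lhs) - variables(type3_head)


head = ("d", "Z", "Y")                            # d(Z,Y), the Type III head
print(unbound_variables(("e", "X", "Y"), head))   # {'X'}
print(unbound_variables("Y", head))               # set()
```

The first call corresponds to the condition not(e(X,Y) = e(key(A),Y)), where X does not occur in d(Z,Y). The second corresponds to a condition of the form not(Y = rand(A,N)), whose left-hand side mentions only Y, which does occur in the head.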