Simulating Reachability using First-Order Logic with Applications to Verification of Linked Data Structures T.Lev-Ami1,N.Immerman2(cid:63),T.Reps3(cid:63)(cid:63),M.Sagiv1,S.Srivastava2(cid:63),and G.Yorsh1(cid:63)(cid:63)(cid:63) 1 SchoolofComp.Sci.,TelAvivUniv., {tla,msagiv,gretay}@post.tau.ac.il 2 Dept.ofComp.Sci. Univ.ofMassachusetts,Amherst {immerman,siddharth}@cs.umass.edu 3 Comp.Sci.Dept.,Univ.ofWisconsin,[email protected] This paper shows how to harness existing theorem provers for first-order logic to automatically verify safety properties of imperative programs that perform dynamic storage allocation and destructive updating of pointer-valued structure fields. One of the main obstacles is specifying and proving the (absence) of reachability properties amongdynamicallyallocatedcells. Themaintechnicalcontributionsaremethodsforsimulatingreachabilityinacon- servativewayusingfirst-orderformulas—theformulasdescribeasupersetofthesetof programstatesthatcanactuallyarise.Thesemethodsareemployedforsemi-automatic program verification (i.e., using programmer-supplied loop invariants) on programs suchasmark-and-sweepgarbagecollectionanddestructivereversalofasinglylinked list. (The mark-and-sweep example has been previously reported as being beyond the capabilitiesofESC/Java.) 1 Introduction This paper explores how to harness existing theorem provers for first-order logic to prove reachability properties of programs that manipulate dynamically allocated data structures. The approach that we use involves simulating reachability in a conserva- tivewayusingfirst-orderformulas—i.e.,theformulasdescribeasupersetofthesetof programstatesthatcanactuallyarise. Automaticallyestablishingsafetyandlivenesspropertiesofsequentialandconcur- rentprogramsthatpermitdynamicstorageallocationandlow-levelpointermanipula- tionsischallenging.Dynamicallocationcausesthestatespacetobeinfinite;moreover, a program is permitted to mutate a data structure by destructively updating pointer- valuedfieldsofnodes.Thesefeaturesremainevenifaprogramminglanguagehasgood capabilities for data abstraction. Abstract-datatype operations are implemented using loops,procedurecalls,andsequencesoflow-levelpointermanipulations;consequently, itishardtoprovethatadata-structureinvariantisreestablishedonceasequenceofop- erationsisfinished[1].InlanguagessuchasJava,concurrencyposesyetanotherchal- lenge: establishing the absence of deadlock requires establishing the absence of any cycleofthreadsthatarewaitingforlocksheldbyotherthreads. Reachability is crucial for reasoning about linked data structures. For instance, to establishthatamemoryconfigurationcontainsnogarbageelements,wemustshowthat (cid:63)SupportedbyNSFgrantCCR-0207373andaGuggenheimfellowship. (cid:63)(cid:63)SupportedbyONRundercontractsN00014-01-1-{0796,0708}. (cid:63)(cid:63)(cid:63)PartiallysupportedbytheIsraeliAcademyofScience everyelementisreachablefromsomeprogramvariable.Othercaseswherereachability isausefulnotioninclude – Specifyingacyclicityofdata-structurefragments,i.e.,everyelementreachablefrom nodencannotreachn – Specifyingtheeffectofprocedurecallswhenreferencesarepassedasarguments: onlyelementsthatarereachablefromaformalparametercanbemodified – Specifyingtheabsenceofdeadlocks – Specifying safety conditions that allow establishing that a data-structure traversal terminates,e.g.,thereisapathfromanodetoasink-nodeofthedatastructure. The verification of such properties presents a challenge. Even simple decidable frag- mentsoffirst-orderlogicbecomeundecidablewhenreachabilityisadded[2,3].More- over,theutilityofmonadicsecond-orderlogicontreesisratherlimitedbecause(i)many programs allow non-tree data structures, (ii) expressing postconditions of procedures (whichisessentialformodularreasoning)requiresreferringtothepre-statethatholds before the procedure executes, and thus cannot, in general, be expressed in monadic second-order logic on trees—even for procedures that manipulate only singly-linked lists,suchasthein-situlist-reversalprogramshowninFig.1,and(iii)thecomplexity isprohibitive. Whileourworkwasactuallymotivatedbyourexperienceusingabstractinterpreta- tion–and,inparticular,theTVLAsystem[4–6]–toestablishpropertiesofprograms that manipulate heap-allocated data structures, in this paper, we consider the problem ofverifyingdata-structureoperations,assumingthatwehaveuser-suppliedloopinvari- ants.ThisissimilartotheapproachtakeninsystemslikeESC/Java[7],andPale[8]. Thecontributionsofthepapercanbesummarizedasfollows: Handling FO(TC) formulas using FO theorem provers. We want to use first- order theorem provers and we need to discuss the transitive closure of certain binary predicates,f.However,first-ordertheoremproverscannothandletransitiveclosure.We solvethisconundrumbyaddinganewrelationsymbolf foreachsuchf,togetherwith tc first-orderaxiomsthatassurethatf isinterpretedcorrectly.Thetheoreticaldetailsof tc howthisisdonearepresentedinSections3and4.Thefactthatweareabletohandle transitiveclosureeffectivelyandreasonablyautomaticallyisamajorcontributionand quitesurprising. As explained in Section 3, the axioms that we add to control the behavior of the added predicates, f , must be sound but not necessarily complete. One way to think tc aboutthisisthatwearesimulatingaformula,χ,inwhichtransitiveclosureoccurs,with apurefirst-orderformulaχ(cid:48).Ifouraxiomsarenotcompletethenweareallowingχ(cid:48)to denotemorestoresthanχdoes.Thisismotivatedbythefactthatabstractioncanbean aid in the verification of many properties; that is, a definite answer can sometimes be obtainedevenwheninformationhasbeenlost(inaconservativemanner).Thismeans thatourmethodsaresoundbutpotentiallyincomplete. If χ(cid:48) is proven valid in FO then χ is also valid in FO(TC); however, if we fail to provethatχ(cid:48) isvalid,itisstillpossiblethatχisvalid:thefailurewouldbeduetothe incompleteness of the axioms, or the lack of time or space for the theorem prover to completetheproof. It is easy to write a sound axiom, T [f], that is “complete” in the very limited 1 sensethateveryfinite,acyclicmodelsatisfyingT [f]mustinterpretf asthereflexive, 1 tc transitiveclosureofitsinterpretationoff.However,inpracticethisisnotworthmuch because, as is well-known, finiteness is not expressible in first-order logic. Thus, the properties that we want to prove do not follow from T [f]. We do prove that T [f] is 1 1 complete for positive transitive-closure properties. The real difficulties lie in proving propertiesinvolvingthenegationofTC[f]. Inductionaxiomscheme.Tosolvetheaboveproblem,weaddaninductionaxiom scheme. Although in general, there is no complete, recursively-enumerable axioma- tization of transitive closure, we have found that on the examples we have tried, T 1 plusinductionallowsustoautomaticallyproveallofourdesiredproperties.Wethink of the axioms that we use as aides for the first-order theorem prover that we employ (Spass [9]) to prove the properties in question. Rather than giving Spass many in- stances of the induction scheme, our experience is that it finds the proof faster if we giveitseveralaxiomsthataresimplertousethaninduction.Asalreadymentioned,the hardpartistoshowthatcertainpathsdonotexist. Coloringaxiomschemes.Inparticular,weusethreeaxiomschemes,havingtodo withpartitioningmemoryintoasmallsetofcolors.Wecallinstancesoftheseschemes “coloringaxioms”.Ourcoloringaxiomsaresimple,andareeasilyprovedusingSpass (in under ten seconds) from the induction axioms. For example, the first coloring axiom scheme, NoExit[A,f], says that if no f-edges leave color class, A, then no f- pathsleaveA.ItturnsoutthattheNoExitaxiomschemeimplies–andthusisequivalent to – the induction scheme. However, we have found in practice that explicitly adding other coloring axioms (which are consequences of NoExit) enables Spass to prove propertiesthatitotherwisefailsat. Wefirstassumethattheprogrammerprovidesthecolorsbymeansoffirst-orderfor- mulaswithtransitiveclosure.Ourinitialexperienceindicatesthatthegeneratedcolor- ingaxiomsareusefultoSpass.Inparticular,itprovidestheabilitytoverifyprograms like the mark phase of a mark-and-sweep garbage collector. This example has been previouslyreportedasbeingbeyondthecapabilitiesofESC/Java.TVLAalsosucceeds onthisexample;howeverournewapproachprovidesverificationmethodsthatcanin someinstancesbemoreprecisethanTVLA. Prototypeimplementation.Perhapsmostexciting,wehaveimplementedtheheuris- tics for selecting colors and their corresponding axioms in a prototype using Spass. Wehaveusedthistoautomaticallychooseusefulcoloraxiomsandthenverifyseveral small heap-manipulating programs. More work needs to be done here, but the initial resultsareveryencouraging. Strengthening Nelson’s results. Greg Nelson considered a set of axiom schemes for reasoning about reachability in function graphs, i.e., graphs in which there is at mostonef-edgeleavinganynode[10].Heleftopenthequestionofwhetherhisaxiom schemes were complete for function graphs. We show that Nelson’s axioms are prov- able from T plus our induction axioms. We also show that Nelson’s axioms are not 1 complete:infact,theydonotimplyNoExit. Outline.Theremainderofthepaperisorganizedasfollows:Section2explainsour notation and the setting; Section 3 introduces the induction axiom scheme and fills in ourformalframework;Section4statesthecoloringaxiomschemes;Section5explains thedetailsofourheuristics;Section6describessomerelatedwork;Section7describes somefuturedirections. 2 Preliminaries Thissectiondefinesthebasicnotationsusedinthispaperandthesetting. 2.1 Notation Syntax: A relational vocabulary τ = {p ,p ,...,p } is a set of relation symbols, 1 2 k each of fixed arity. We write first-order formulas over τ with quantifiers ∀ and ∃, logical connectives ∧, ∨, →, ↔, and ¬, where atomic formulas include: equality, p (v ,v ,...v ), and TC[f](v ,v ), where p ∈ τ is of arity a and f ∈ τ is bi- i 1 2 ai 1 2 i i nary. Here TC[f](v ,v ) denotes the existence of a finite path of 0 or more f edges 1 2 fromv tov .AformulawithoutTCiscalledafirst-orderformula. 1 2 We use the following precedence of logical operators: ¬ has highest precedence, followedby∧and∨,followedby→and↔,and∀and∃havelowestprecedence. Semantics: A model, A, of vocabulary τ, consists of a non-empty universe, |A|, andarelationpA overtheuniverseinterpretingeachrelationsymbolp ∈ τ.Wewrite A|=ϕtomeanthattheformulaϕistrueinthemodelA. 2.2 Setting Weareprimarilyinterestedinformulasthatarisewhileprovingthecorrectnessofpro- grams.Weassumethattheprogrammerspecifiespreandpost-conditionsforprocedures andloopinvariantsusingfirst-orderformulaswithtransitiveclosureonbinaryrelations. Thetransformerforaloopbodycanbeproducedautomaticallyfromtheprogramcode. Forinstance,toestablishthepartialcorrectnesswithrespecttoauser-suppliedspec- ificationofaprogramthatcontainsasingleloop,weneedtoestablishthreeproperties: First, the loop invariant must hold at the beginning of the first iteration; i.e., we must showthattheloopinvariantfollowsfromthepreconditionandthecodeleadingtothe loop.Second,theloopinvariantprovidedbytheusermustbemaintained;i.e.,wemust showthatiftheloopinvariantholdsatthebeginningofaniterationandtheloopcon- dition also holds, the transformer causes the loop invariant to hold at the end of the iteration.Finally,thepostconditionmustfollowfromtheloopinvariantandthecondi- tionforexitingtheloop. Ingeneral,theseformulasareoftheform ψ [τ]∧T[τ,τ(cid:48)]→ψ [τ(cid:48)] 1 2 whereτ isthevocabularyofthebeforestate,τ(cid:48)isthevocabularyoftheafterstate,and T is the transformer, which may use both the before and after predicates to describe themeaningofthemoduletobeexecuted.Ifsymbolf denotesthevalueofapredicate beforetheoperationthenf(cid:48)denotesthevalueofthesamepredicateaftertheoperation. Aninterestingspecialcaseistheproofofthemaintenanceformulaofaloopinvari- ant.Thishastheform: LC[τ]∧LI[τ]∧T[τ,τ(cid:48)]→LI[τ(cid:48)] Here LC is the condition for entering the loop and LI is the loop invariant. LI[τ(cid:48)] indicatesthattheloopinvariantremainstrueafterthebodyoftheloopisexecuted. The challenge is that the formulas of interest contain transitive closure; thus, the validity of these formulas cannot be directly proven using a theorem prover for first- orderlogic. 3 AxiomatizationofTransitiveClosure Theoriginalformulathatwewanttoprove,χ,containstransitiveclosure,whichfirst- order theorem provers cannot handle. To address this problem, we replace χ by the newformula,χ(cid:48),whereallappearancesofTC[f]havebeenreplacedbythenewbinary relationsymbol,f . tc Weshowinthispaperthatfromχ(cid:48),wecanoftenautomaticallygenerateanappro- priatefirst-orderaxiom,σ,withthefollowingtwoproperties: 1. ifσ →χ(cid:48)isvalidinFOthenχisvalidinFO(TC). 2. Atheoremproversuccessfullyprovesthatσ →χ(cid:48)isvalidinFO. We now explain the theory behind this process. A TC model, A, is a model such thatiff andf areinthevocabularyofA,then(f )A =(fA)(cid:63);i.e.,Ainterpretsf tc tc tc asthereflexive,transitiveclosureofitsinterpretationoff. A first-order formula ϕ is TC valid iff it is true in all TC models. We say that an axiomatization,Σ,isTCsoundifeveryformulathatfollowsfromΣisTCvalid.Since first-orderreasoningissound,Σ isTCsoundiffeveryσ ∈Σ isTCvalid. WesaythatΣisTCcompleteifforeveryTC-validϕ,Σ |=ϕ.IfΣisTCcomplete andTCsound,thenforallfirst-orderϕ, Σ |=ϕ ⇔ ϕisTCvalid ThusaTC-completesetofaxiomsprovesexactlythefirst-orderformulas,χ(cid:48),such thatthecorrespondingFO(TC)formula,χ,isvalid. All the axiomatizations that we consider are TC sound. There is no recursively enumerableTC-completeaxiomsystem(see[11,12]). 3.1 SomeTC-SoundAxioms WebeginwithourfirstTCaxiomscheme.Foranybinaryrelationsymbol,f,let, T [f] ≡ ∀u,v.f (u,v) ↔ (u=v)∨∃w.f(u,w)∧f (w,v) 1 tc tc We first observe that T [f] is “complete” in a very limited way for finite, acyclic 1 graphs,i.e.,T [f]exactlycharacterizesthemeaningoff forallfinite,acyclicgraphs. 1 tc Thereasonthisislimited,isthatitdoesnotgiveusacompletesetoffirst-orderaxioms because,asiswellknown,thereisnofirst-orderaxiomatizationof“finite”. Proposition1. AnyfiniteandacyclicmodelofT [f]isaTCmodel. 1 Proof:LetA |= T [f]whereAisfiniteandacyclic.Leta ,b ∈ |A|.Assumethereis 1 0 anf-pathfroma tob.SinceA|=T [f],itiseasytoseethatA|=f (a ,b). 0 1 tc 0 Conversely,supposethatA|=f (a ,b).Ifa =b,thenthereisapathoflength0 tc 0 0 froma tob.Otherwise,byT [f],thereexistsana ∈ |A|suchthatA |= f(a ,a )∧ 0 1 1 0 1 f (a ,b). Note that a (cid:54)= a since A is acyclic. If a = b then there is an f-path of tc 1 1 0 1 length1fromatob.Otherwisetheremustexistana ∈|A|suchthatA|=f(a ,a )∧ 2 1 2 f (a ,b)andsoon,generatingaset{a ,a ,...}.Noneofthea canbeequaltoa , tc 2 1 2 i j forj <i,byacyclicity.Thus,byfiniteness,somea =b.HenceAisaTCmodel. (cid:50) i LetT(cid:48)[f]bethe←directionofT [f]: 1 1 T(cid:48)[f] ≡ ∀u,v.f (u,v) ← (u=v)∨∃w.f(u,w)∧f (w,v) 1 tc tc Proposition2. Letf occuronlypositivelyinϕ.IfϕisTCvalid,thenT(cid:48)[f]|=ϕ. tc 1 Proof: Suppose that T(cid:48)[f] (cid:54)|= ϕ. Let A |= T(cid:48)[f] ∧ ¬ϕ. Note that f occurs only 1 1 tc negativelyin¬ϕ.Furthermore,sinceA|=T(cid:48)[f],itiseasytoshowbyinductiononthe 1 lengthofthepath,thatifthereisanf-pathfromatobinA,thenA|=f (a,b).Define tc A(cid:48)tobethemodelformedfromAbyinterpretingf inA(cid:48)as(fA)(cid:63).ThusA(cid:48)isaTC tc modelanditonlydiffersfromAbythefactthatwehaveremovedzeroormorepairs from(f )A toform(f )A(cid:48).BecauseA |= ¬ϕandf occursonlynegativelyin¬ϕ, tc tc tc itfollowsthatA(cid:48) |=¬ϕ,whichcontradictstheassumptionthatϕisTCvalid. (cid:50) Proposition2showsthatprovingpositivefactsoftheformf (u,v)iseasy;itisthe tc taskofprovingthatpathsdonotexistthatismoresubtle. Proposition 1 shows that what we are missing, at least in the acyclic case, is that thereisnofirst-orderaxiomatizationoffiniteness.Traditionally,whenreasoningabout the natural numbers, this problem is mitigated by adding induction axioms. We next introduce an induction scheme that, together with T , seems to be sufficient to prove 1 anypropertyweneedconcerningTC. Notation:Ingeneral,wewilluseF todenotethesetofallbinaryrelationsymbols, f, such that TC[f] occurs in a formula we are considering. If ϕ[f] is a formula in (cid:94) whichf occurs,letϕ[F] = ϕ(f).Thus,forexample,T [F]istheconjunctionof 1 f∈F theaxiomT [f]forallbinaryrelationsymbols,f,underconsideration. 1 Definition1. Foranyfirst-orderformulasZ(u),P(u),andbinaryrelationsymbol,f, lettheinductionprinciple,IND[Z,P,f],bethefollowingfirst-orderformula: (∀z.Z(z)→P(z)) ∧ (∀u,v.P(u)∧f(u,v)→P(v)) →∀u,z.Z(z)∧f (z,u)→P(u) tc TheinductionprinciplesaysthatifeveryzeropointsatisfiesP,andP ispreserved whenfollowingedges,theneverypointreachablefromazeropointsatisfiesP.Obvi- ouslythisprincipleissound. Asaneasyapplicationoftheinductionprinciple,considerthefollowingcousinof T [f], 1 T [f] ≡ ∀u,v.f (u,v) ↔ (u=v)∨∃w.f (u,w)∧f(w,v) 2 tc tc ItiseasytoseethatneitherofT [f],T [f]impliestheother.However,inthepres- 1 2 ence of the induction principle they do imply each other. For example, it is easy to prove T [f] from T [f] using IND[Z,P,f] where Z(v) ≡ v = u and P(v) ≡ u = 2 1 v∨∃w.f (u,w)∧f(w,v).Here,foreachuweuseIND[Z,P,f]toprovebyinduc- tc tionthateveryvreachablefromusatisfiestheright-handsideofT [f]. 2 Arelatedaxiomschemethatwehavefoundusefulisthetransitivityofreachability: Trans[f] ≡ ∀u,v,w.f (u,w)∧f (w,v)→f (u,v) tc tc tc 4 ColoringAxioms WenextdescribethreeTC-soundaxiomsschemesthatarenotimpliedbyT [F]∧T [F], 1 2 and are provable from the induction principle of Section 3. We will see in the sequel thatthesecoloringaxiomsareveryusefulinprovingthatpathsdonotexist,permitting us to verify a variety of algorithms. In Section 5, we will present some heuristics for automaticallychoosingparticularinstancesofthecoloringaxiomschemesthatenable ustoproveourgoalformulas. ThefirstcoloringaxiomschemeistheNoExitaxiomscheme: (∀u,v.A(u)∧¬A(v)→¬f(u,v)) → ∀u,v.A(u)∧¬A(v)→¬f (u,v) (1) tc foranyfirst-orderformulaA(u),andbinaryrelationsymbol,f,NoExit[A,f]saysthat ifnof-edgeleavescolorclassA,thennof-pathleavescolorclassA. Observethatalthoughitisverysimple,NoExit[A,f]doesnotfollowfromT [f]∧ 1 T [f]. Let G = (V,f,f ,A) consist of two disjoint cycles: V = {1,2,3,4}, f = 2 1 tc {(cid:104)1,2(cid:105),(cid:104)2,1(cid:105),(cid:104)3,4(cid:105),(cid:104)4,3(cid:105)},andA = {1,2}.Letf haveall16possibleedges.Thus tc G satisfiesT [f]∧T [f]butviolatesNoExit[A,f].Evenforacyclicmodels,NoExit[A,f] 1 1 2 doesnotfollowfromT [f]∧T [f]becausethereareinfinitemodelsinwhichtheim- 1 2 plicationdoesnothold(see[12]). NoExit[A,f]followseasilyfromtheinductionprinciple:ifnoedgesleaveA,then induction tells us that everything reachable from a point in A satisfies A. Similarly, NoExit[A,f]impliestheinductionaxiom,IND[Z,A,f],foranyformulaZ. ThesecondcoloringaxiomschemeistheGoOutaxiom:foranyfirst-orderformulas A(u),B(u),andbinaryrelationsymbol,f,GoOut[A,B,f]saysthatiftheonlyedges leavingcolorclassAaretoB,thenanypathfromapointinAtoapointnotinAmust passthroughB. (∀u,v.A(u)∧¬A(v)∧f(u,v)→B(v))→ (2) ∀u,v.A(u)∧¬A(v)∧f (u,v)→∃b.B(b)∧f (u,b)∧f (b,v) tc tc tc To see that GoOut[A,B,f] follows from the induction principle, assume that the onlyedgesoutofAenterB.ForanyfixeduinA,weprovebyinductionthatanypoint vreachablefromuiseitherinAorhasapredecessor,binB,thatisreachablefromu. The third coloring axiom scheme is the NewStart axiom, which is useful in the contextofdynamicallychanginggraphs:foranyfirst-orderformulaA(u),andbinary relationsymbolsf andg,thinkoff asthepreviousedgerelationandg asthecurrent edgerelation.NewStart[A,f,g]saysthatiftherearenonewedgesbetweenAnodes, thenanynewpathfromAmustleaveAtomakeitschange: (∀u,v.A(u)∧A(v)∧g(u,v)→f(u,v))→ (3) ∀u,v.g (u,v)∧¬f (u,v)→∃b.¬A(b)∧g (u,b)∧g (b,v) tc tc tc tc NewStart[A,f,g]followsfromtheinductionprinciplebyaproofthatissimilarto theproofofGoOut[A,B,f] Weremarkthatthespiritbehindourconsiderationofthecoloringaxiomsissimilar tothatfoundinapaperofGregNelson’sinwhichheintroducedasetofreachability axiomsforafunctionalpredicate,f,i.e.,thereisatmostonef edgeleavinganypoint [10].Nelsonaskedwhetherhisaxiomschemesarecompleteforthefunctionalsetting. WeremarkthatNelson’saxiomschemesareprovablefromT plusourinductionprin- 1 ciple.However,Nelson’saxiomschemesarenotcomplete:weconstructedafunctional graphsatisfyingNelson’saxiomsbutviolatingNoExit[A,f],(see[12]). AtleastoneofNelson’saxiomschemesdoesseemorthogonaltoourcoloringax- ioms and may be useful in certain proofs. Nelson’s fifth axiom scheme states that the points reachable from a given point are linearly ordered. The soundness of the axiom scheme is due to the fact that f is functional. We make use of a simplified version of Nelson’sorderingaxiomscheme:LetFunc[f]≡∀u,v,w.f(u,v)∧f(u,w)→v =w; then, Order[f] ≡ Func[f]→∀u,v,w.f (u,v)∧f (u,w) → f (v,w)∨f (w,v) tc tc tc tc 5 HeuristicsforUsingtheColoringAxioms This section presents heuristics for using the coloring axioms. Toward that end, it an- swersthefollowingquestions: – Howcanthecoloringaxiomsbeusedbyatheoremprovertoproveχ? – Whenshouldaspecificinstanceofacoloringaxiombegiventothetheoremprover whiletryingtoproveχ? – Whatpartoftheprocesscanbeautomated? We first present a running example that will be used in later sections to illustrate the heuristics. We then explain how the coloring axioms are useful, describe the search space for useful axioms, give an algorithm for exploring this space, and conclude by discussingaprototypeimplementationwehavedevelopedthatprovestheexamplepre- sentedandothers. 5.1 ReverseSpecification TheheuristicsdescribedinSections5.2–5.4areillustratedonproblemsthatariseinthe verification of partial correctness of a list reversal procedure. Other examples proven usingthistechniquecanbefoundinthefullversionofthispaper[12]. Theprocedurereverse,showninFig.1,performsin-placereversalofasinglylinked list, destructively updating the list. The precondition requires that the input list be acyclic and unshared. For simplicity, we assume that there is no garbage. The post- conditionensuresthattheresultinglistisacyclicandunshared.Also,itensuresthatthe nodes reachable from the formal parameter on entry to reverse are exactly the nodes reachablefromthereturnvalueofreverseattheexit.Mostimportantly,itensuresthat eachedgeintheoriginallistisreversedinthereturnedlist. The specification for reverse is shown in Node reverse(Node x){ Fig. 2. We use unary predicates to represent [0] Node y = null; program variables and binary predicates to [1] while (x != null){ represent data-structure fields. Fig. 2(a) de- [2] Node t = x.next; finessomeshorthands.Tospecifythataunary [3] x.next = y; predicatezcanpointtoasinglenodeatatime [4] y = x; and that a binary predicate f of a node can [5] x = t; pointtoatmostonenode(apartialfunction), [6] } we use unique[z] and func[f] . To specify [7] return y; thattherearenocyclesoff-fieldsinthegraph, } we use acyclic[f]. To specify that the graph doesnotcontainnodessharedbyf-fields,(i.e., Fig.1. AsimpleJava-likeimplemen- nodes with 2 or more incoming f-fields), we tation of the in-place reversal of a useunshared[f].Tospecifythatallnodesin singly-linkedlist. the graph are reachable from z or z by fol- 1 2 lowing f-fields, we use total[z ,z ,f]. Another helpful shorthand is r (v) which 1 2 x,f specifiesthatvisreachablefromthenodepointedtobyxusingf-edges. ThepreconditionofthereverseprocedureisshowninFig.2(b).Weusethepredi- catesxeandnetorecordthevaluesofthevariablexandthenextfieldatthebeginning oftheprocedure.Thepreconditionrequiresthatthelistpointedtobyxbeacyclicand unshared. It also requires that unique[z] and func[f] hold for all unary predicates z thatrepresentprogramvariablesandallbinarypredicatesf thatrepresentfields,respec- tively.Forsimplicity,weassumethatthereisnogarbage,i.e.,allnodesarereachable fromx. Thepost-conditionisshowninFig.2(c).Itensuresthattheresultinglistisacyclic andunshared.Also,itensuresthatthenodesreachablefromtheformalparameterxon entry to the procedure are exactly the nodes reachable from the return value y at the exit.Mostimportantly,wewishtoshowthateachedgeintheoriginallistisreversedin thereturnedlist(seeEq.(11)). A loop invariant is given in Fig. 2(d). It describes the state of the program at the beginningofeachloopiteration.Everynodeisinoneoftwodisjointlistspointedbyx andy(Eq.(12)).Thelistsareacyclicandunshared.Everyedgeinthelistpointedtoby xisexactlyanedgeintheoriginallist(Eq.(14)).Everyedgeinthelistpointedtobyy isthereverseofanedgeintheoriginallist(Eq.(15)).Theonlyoriginaledgegoingout ofyistox(Eq.(16)). ThetransformerisgiveninFig.2(e),usingtheprimedpredicatesn(cid:48),x(cid:48),andy(cid:48) to describethevaluesofpredicatesn,x,andy,respectively,attheendoftheiteration. unique[z]=def ∀v ,v .z(v )∧z(v )→v =v (4) 1 2 1 2 1 2 func[f]=def ∀v ,v ,v.f(v,v )∧f(v,v )→v =v (5) 1 2 1 2 1 2 acyclic[f]=def ∀v ,v .¬f(v ,v )∨¬TC[f](v ,v ) (6) (a) 1 2 1 2 2 1 unshared[f]=def ∀v ,v ,v.f(v ,v)∧f(v ,v)→v =v (7) 1 2 1 2 1 2 total[z ,z ,f]=def ∀v.∃w.(z (w)∨z (w))∧TC[f](w,v) (8) 1 2 1 2 r (v)=def ∃w.x(w)∧TC[f](w,v) (9) x,f (b) pre=def total[xe,xe,ne]∧acyclic[ne]∧unshared[ne]∧unique[xe]∧func[ne] (10) (c) post=def total[y,y,n]∧acyclic[n]∧unshared[n]∧∀v ,v .ne(v ,v )↔n(v ,v ) (11) 1 2 1 2 2 1 LI[x,y,n]=def total[x,y,n] ∧ ∀v.(¬r (v)∨¬r (v))∧ (12) x,n y,n acyclic[n]∧unshared[n] ∧ unique[x]∧unique[y]∧func[n]∧ (13) (d) ∀v ,v .(r (v )→(ne(v ,v )↔n(v ,v )))∧ (14) 1 2 x,n 1 1 2 1 2 ∀v ,v .(r (v)∧¬y(v )→(ne(v ,v )↔n(v ,v )))∧ (15) 1 2 y,n 1 1 2 2 1 ∀v ,v ,v.y(v )→(x(v )↔ne(v ,v )) (16) 1 2 1 2 1 2 T =def ∀v.(y(cid:48)(v)↔x(v)) ∧ ∀v.(x(cid:48)(v)↔∃w.x(w)∧n(w,v))∧ (e) ∀v ,v .n(cid:48)(v ,v )↔((n(v ,v )∧¬x(v ))∨(x(v )∧y(v ))) (17) 1 2 1 2 1 2 1 1 2 Fig.2.Examplespecificationofreverseprocedure:(a)shorthands,(b)preconditionpre, (c) postcondition post, (d) loop invariant LI[x,y,n], (e) transformer T (effect of the loopbody). 5.2 ProvingFormulasusingtheColoringAxioms AllthecoloringaxiomshavetheformA ≡ P → C ,whereP andC areclosed A A A A formulas.WecallP theaxiom’spremiseandC theaxiom’sconclusion.Foranaxiom A A tobeuseful,thetheoremproverwillhavetoprovethepremise(asasubgoal)andthen usetheconclusionintheproofofthegoalformulaχ.Foreachofthecoloringaxioms, wenowexplainwhenthepremisecanbeproved,howitsconclusioncanhelp,andgive anexample. NoExit.ThepremiseP [C,f]statesthattherearenof-edgesexitingcolor NoExit classC.WhenC isaunarypredicateappearingintheprogram,thepremiseissome- times a direct result of the loop invariant. Another color that will be used heavily throughout this section is reachability from a unary predicate, i.e., unary reachability, formally defined in Eq. (9). Let’s examine two cases. P [r ,f] is immediate NoExit x,f fromthedefinitionofr andthetransitivityoff .P [r ,f(cid:48)]actuallystates x,f tc NoExit x,f thatthereisnof pathfromxtoanedgeforwhichf(cid:48)holdsbutf doesn’t,i.e.,achange in f(cid:48) with respect to f. Thus, we use the absence of f-paths to prove the absence of f(cid:48)-paths.Inmanycases,thechangeisanimportantpartoftheloopinvariant,andpaths fromandtoitarepartofthespecification. AsketchoftheproofbyrefutationofP [r ,n(cid:48)]thatarisesinthereverse NoExit x(cid:48),n exampleisgiveninFig.3.Thenumbersinbracketsarethestagesoftheproof. 1. Thenegationofthepremiseexpandsto: ∃u ,u ,u .x(cid:48)(u )∧n (u ,u )∧¬n (u ,u )∧n(cid:48)(u ,u ) 1 2 3 1 tc 1 2 tc 1 3 2 3 2. Sinceu isreachablefromu andu isnot,bytransitivityofn ,wehave¬n(u ,u ). 2 1 3 tc 2 3 3. By the definition of n(cid:48) in the transformer, the only edge in which n differs from n(cid:48) isoutofx(oneoftheclausesgeneratedfromEq.(17)is∀v ,v .¬n(cid:48)(v ,v )∨ 1 2 1 2 ¬n(v ,v )∨x(v )).Thus,x(u )holds. 1 2 1 2 4. Bythedefinitionofx(cid:48)ithasanincomingnedgefromx.Thus,n(u ,u )holds. 2 1 Thelistpointedtobyxmustbeacyclic,whereaswehaveacyclebetweenu andu ; 1 2 i.e.,wehaveacontradiction.Thus,P [r ,n(cid:48)]musthold. NoExit x(cid:48),n n[4] (cid:124)(cid:124) x(cid:48)[1] (cid:47)(cid:47)(cid:71)(cid:64)(cid:70)(cid:65)u(cid:69)(cid:66)1(cid:68)(cid:67)(cid:76)(cid:76)(cid:76)¬(cid:76)(cid:76)n(cid:76)tc(cid:76)n[(cid:37)(cid:37)1t]c[1]n(cid:114)(cid:121)(cid:121) (cid:114)(cid:48)[(cid:114)1(cid:114)](cid:114)(cid:114)(cid:114)(cid:47)(cid:47)(cid:71)(cid:64)(cid:70)(cid:65)u(cid:69)(cid:66)2(cid:68)(cid:67)(cid:111)(cid:111) x[3] (cid:71)(cid:64)(cid:70)(cid:65)u(cid:69)(cid:66)3(cid:68)(cid:67)(cid:105)(cid:105) ¬n[2] Fig.3.ProvingP [r ,n(cid:48)]. NoExit x,n C [C,f] states there are no f paths (f edges) exiting C. This is useful NoExit tc becauseprovingtheabsenceofpathsisthedifficultpartofprovingformulaswithTC. GoOut. The premise P [A,B,f] states that all f edges going out of color GoOut classA,gotoB.WhenAandBareunarypredicatesthatappearintheprogram,again thepremisesometimesholdsasadirectresultoftheloopinvariant.Aninterestingspe- cialcaseiswhenB isdefinedas∃w.A(w)∧f(w,v).Inthiscasethepremiseisim- mediate.NotethatinthiscasetheconclusionisprovablealsofromT .However,from 1 experience,theaxiomisveryusefulforimprovingperformance(2ordersofmagnitude whenprovingtheacyclicpartofreverse’spostcondition). C [A,B,f]statesthatallpathsoutofAmustpassthroughB.Thus,under GoOut the premise P [A,B,f], if we know that there is a path from A to somewhere GoOut outsideofA,weknowthatthereisapathtotherefromB.IncaseallnodesinB are reachablefromallnodesinA,togetherwiththetransitivityoff thismeansthatthe tc nodesreachablefromBareexactlythenodesoutsideofAthatarereachablefromA. Forexample,C [y(cid:48),y,n(cid:48)]allowsustoprovethatonlytheoriginallistpointed GoOut tobyyisreachablefromy(cid:48)(inadditiontoy(cid:48)itself). NewStart.ThepremiseP [C,g,h]statesthatallgedgesbetweennodesin NewStart C arealsohedges.Thiscanmeantheiterationhasnotaddededgesorhasnotremoved