Lazy Automata Techniques for WS1S
Tomáš Fiedor1, Lukáš Holík1, Petr Janků1, Ondřej Lengál1,2, and Tomáš Vojnar1
1 FIT, Brno University of Technology, IT4Innovations Centre of Excellence, Czech Republic
2 Institute of Information Science, Academia Sinica, Taiwan
arXiv:1701.06282v2 [cs.LO] 24 Jan 2017
Abstract. We present a new decision procedure for the logic WS1S. It originates from the classical approach, which first builds an automaton accepting all models of a formula and then tests whether its language is empty. The main novelty is to test the emptiness on the fly, while constructing a symbolic, term-based representation of the automaton, and to prune those parts of the constructed state space that are irrelevant to the test. The pruning is done by a generalization of two techniques used in antichain-based language inclusion and universality checking of finite automata: subsumption and early termination. The richer structure of the WS1S decision problem allows us, however, to elaborate on these techniques in novel ways. Our experiments show that the proposed approach can in many cases significantly outperform the classical decision procedure (implemented in the MONA tool) as well as recently proposed alternatives.
1 Introduction
Weak monadic second-order logic of one successor (WS1S) is a powerful language for reasoning about regular properties of finite words. It has found numerous uses, from software and hardware verification through controller synthesis to computational linguistics, and further on. Some more recent applications of WS1S include verification of pointer programs and deciding related logics [1,2,3,4,5] as well as synthesis from regular specifications [6]. Most of the successful applications were due to the tool MONA [7], which implements classical automata-based decision procedures for WS1S and WS2S (a generalization of WS1S to finite binary trees). The worst-case complexity of WS1S is nonelementary [8] and, despite many optimizations implemented in MONA and other tools, the complexity sometimes strikes back. Authors of methods translating their problems to WS1S/WS2S are then forced to either find workarounds to circumvent the complexity blowup, such as in [2], or, often restricting the input of their approach, give up translating to WS1S/WS2S altogether [9].
The classical WS1S decision procedure builds an automaton $\mathcal{A}_\varphi$ accepting all models of the given formula $\varphi$ in the form of finite words, and then tests $\mathcal{A}_\varphi$ for language emptiness. The bottleneck of the procedure is the size of $\mathcal{A}_\varphi$, which can be huge because the derivation of $\mathcal{A}_\varphi$ involves many nested automata product constructions and complementation steps, preceded by determinization. The main point of this paper is to avoid the state-space explosion involved in the classical explicit construction by representing automata symbolically and testing the emptiness on the fly, while constructing $\mathcal{A}_\varphi$, and by omitting the state space irrelevant to the emptiness test. This is done using two main principles: lazy evaluation and subsumption-based pruning. These principles have, to some degree, already appeared in the so-called antichain-based testing of language universality and inclusion of finite automata [10]. The richer structure of the WS1S decision problem allows us, however, to elaborate on these principles in novel ways and utilize their power even more.
Overview of our algorithm. Our algorithm originates in the classical WS1S decision procedure as implemented in MONA, in which models of formulae are encoded by finite words over a multi-track binary alphabet where each track corresponds to a variable of $\varphi$. In order to come closer to this view of formula models as words, we replace the input formula $\varphi$ by a language term $t_\varphi$ describing the language $L_\varphi$ of all word encodings of its models.
In $t_\varphi$, the atomic formulae of $\varphi$ are replaced by predefined automata accepting languages of their models. Boolean operators ($\wedge$, $\vee$, and $\neg$) are turned into the corresponding set operators ($\cap$, $\cup$, and complement) over the languages of models. An existential quantification $\exists X$ becomes a sequence of two operations. First, a projection $\pi_X$ removes information about valuations of the quantified variable $X$ from symbols of the alphabet. After the projection, the resulting language $L$ may, however, contain some but not necessarily all encodings of the models. In particular, encodings with some specific numbers of trailing $\bar{0}$'s, used as a padding, may be missing. Here, $\bar{0}$ denotes the symbol with 0 in each track. To obtain a language containing all encodings of the models, $L$ must be extended to include encodings with any number of trailing $\bar{0}$'s. This corresponds to taking the (right) $\bar{0}^*$-quotient of $L$, written $L - \bar{0}^*$, which is the set of all prefixes of words of $L$ with the remaining suffix in $\bar{0}^*$. We give an example WS1S formula $\varphi$ in (1) and its language term $t_\varphi$ in (2). The dotted operators represent operators over language terms. See Fig. 2 for the automata $\mathcal{A}_{\mathrm{Sing}(X)}$ and $\mathcal{A}_{Y=X+1}$.

$\varphi \equiv \exists X : \mathrm{Sing}(X) \wedge (\exists Y : Y = X + 1)$    (1)
$t_\varphi \equiv \pi_X\big(\big\{\mathcal{A}_{\mathrm{Sing}(X)} \cap (\pi_Y(\mathcal{A}_{Y=X+1}) - \bar{0}^*)\big\}\big) - \bar{0}^*$    (2)
The main novelty of our work is that we test emptiness of $L_\varphi$ directly over $t_\varphi$. The term is used as a symbolic representation of the automata that would be explicitly constructed in the classical procedure: inductively on the structure of the term, starting from the leaves and combining the automata of sub-terms by standard automata constructions that implement the term operators. Instead of first building automata and only then testing emptiness, we test it on the fly during the construction. This offers opportunities to prune large portions of the state space that turn out not to be relevant for the test.
A sub-term $t_\psi$ of $t_\varphi$, corresponding to a sub-formula $\psi$, represents final states of the automaton $\mathcal{A}_\psi$ accepting the language encoding models of $\psi$. Predecessors of the final states represented by $t_\psi$ correspond to quotients of $t_\psi$. All states of $\mathcal{A}_\psi$ could hence be constructed by quotienting $t_\psi$ until a fixpoint is reached. By working with terms, our procedure can often avoid building large parts of the automata when they are not necessary for answering the emptiness query. For instance, when testing emptiness of the language of a term $t_1 \cup t_2$, we adopt the lazy approach (in this particular case the so-called short-circuit evaluation) and first test emptiness of the language of $t_1$; if it is non-empty, we do not need to process $t_2$. Testing language emptiness of terms arising from quantified sub-formulae is more complicated since they translate to $-\bar{0}^*$ quotients. We evaluate the test on $t - \bar{0}^*$ by iterating the $-\bar{0}$ quotient from $t$. We either conclude with the positive result as soon as one of the iterations computes a term with a non-empty language, or with the negative one if the fixpoint of the quotient construction is reached. The fixpoint condition is that the so-far computed quotients subsume the newly constructed ones, where subsumption is a relation under-approximating inclusion of languages represented by terms. Subsumption is also used to prune the set of computed terms so that only an antichain of the terms maximal wrt subsumption is kept.
Besides lazy evaluation and subsumption, our approach can benefit from multiple further optimizations. For example, it can be combined with the explicit WS1S decision procedure, which can be used to transform arbitrary sub-terms of $t_\varphi$ to automata. These automata can then be rather small due to minimization, which cannot be applied in the on-the-fly approach (the automata can, however, also explode due to determinisation and product construction, hence this technique comes with a trade-off). We also propose a novel way of utilising BDD-based encoding of automata transition functions in the MONA style for computing quotients of terms. Finally, our method can exploit various methods of logic-based pre-processing, such as anti-prenexing, which, in our experience, can often significantly reduce the search space of fixpoint computations.
Experiments. We have implemented our decision procedure in a prototype tool called GASTON and compared its performance with other publicly available WS1S solvers on benchmarks from various sources. In the experiments, GASTON managed to win over all other solvers on various parametric families of WS1S formulae that were designed (mostly by authors of other tools) to stress-test WS1S solvers. Moreover, GASTON was able to significantly outperform MONA and other solvers on a number of formulae obtained from various formal verification tasks. This shows that our approach is applicable in practice and has a great potential to handle more complex formulae than those so far obtained in WS1S applications. We believe that the efficiency of our approach can be pushed much further, making WS1S scale enough for new classes of applications.
Related work. As already mentioned above, MONA [7] is the usual tool of choice for deciding WS1S formulae. The efficiency of MONA stems from many optimizations, both higher-level (such as automata minimization, the encoding of first-order variables used in models, or the use of BDDs to encode the transition relation of the automaton) as well as lower-level (e.g. optimizations of hash tables, etc.) [11,12]. Apart from MONA, there are other related tools based on the explicit automata procedure, such as JMOSEL [13] for a related logic M2L(Str), which implements several optimizations (such as second-order value numbering [14]) that allow it to outperform MONA on some benchmarks (MONA also provides an M2L(Str) interface on top of the WS1S decision procedure), or the procedure using symbolic finite automata of D'Antoni et al. in [15].
Our work was originally inspired by antichain techniques for checking universality and inclusion of finite automata [16,10,17], which use symbolic computation and subsumption to prune large state spaces arising from subset construction. In [18], which is a starting point for the current paper, we discussed a basic idea of generalizing these techniques to a WS1S decision procedure. In the current paper, we have turned the idea of [18] into an algorithm efficient in practice by roughly the following steps: (1) reformulating the symbolic representation of automata from nested upward and downward closed sets of automata states to more intuitive language terms, (2) generalizing the procedure originally restricted to formulae in the prenex normal form to arbitrary formulae, (3) introducing lazy evaluation, and (4) adding many other important optimizations.
Recently, a couple of logic-based approaches for deciding WS1S appeared. Ganzow and Kaiser [19] developed a new decision procedure for the weak monadic second-order logic on inductive structures, within their tool TOSS, which is even more general than WSkS. Their approach completely avoids automata; instead, it is based on Shelah's composition method. The TOSS tool is quite promising as it outperforms MONA on some of the benchmarks. It, however, lacks some features needed to perform a meaningful comparison on benchmarks used in practice. Traytel [20], on the other hand, uses the classical decision procedure, recast in the framework of coalgebras. The work focuses on testing equivalence of a pair of formulae, which is performed by finding a bisimulation between derivatives of the formulae. While it is shown that it can outperform MONA on some simple artificial examples, the implementation is not optimized enough and is easily outperformed by the rest of the tools on other benchmarks.
2 Preliminaries on Languages and Automata
A word over a finite alphabet $\Sigma$ is a finite sequence $w = a_1 \cdots a_n$, for $n \geq 0$, of symbols from $\Sigma$. Its $i$-th symbol $a_i$ is denoted by $w[i]$. For $n = 0$, the word is the empty word $\epsilon$. A language $L$ is a set of words over $\Sigma$. We use the standard language operators of concatenation $L.L'$ and iteration $L^*$. The (right) quotient of a language $L$ wrt the language $L'$ is the language $L - L' = \{u \mid \exists v \in L' : uv \in L\}$. We abuse notation and write $L - w$ to denote $L - \{w\}$, for a word $w \in \Sigma^*$.
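The right quotient can be illustrated on finite languages. The following is only a minimal sketch: the function name `right_quotient` and the use of Python strings as words are assumptions made for illustration, and the finite set `zeros` merely stands in for the infinite language $0^*$.

```python
def right_quotient(language, suffixes):
    """Return {u | there is v in `suffixes` such that u + v is in `language`}."""
    return {
        word[: len(word) - len(v)]      # strip the suffix v (also correct for v == "")
        for word in language
        for v in suffixes
        if word.endswith(v)
    }


# Words over a one-track alphabet {"0", "1"}; "0" plays the role of padding.
L = {"10", "100"}
zeros = {"", "0", "00"}                 # finite stand-in for the language 0*
print(sorted(right_quotient(L, zeros)))
```

Quotienting by the padding language adds the shorter word "1" to the result, which is the effect the $\bar{0}^*$-quotient is used for later in the text.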
A finite automaton (FA) over an alphabet $\Sigma$ is a quadruple $\mathcal{A} = (Q, \delta, I, F)$ where $Q$ is a finite set of states, $\delta \subseteq Q \times \Sigma \times Q$ is a set of transitions, $I \subseteq Q$ is a set of initial states, and $F \subseteq Q$ is a set of final states. The pre-image of a state $q \in Q$ over $a \in \Sigma$ is the set of states $pre[a](q) = \{q' \mid (q', a, q) \in \delta\}$, and it is the set $pre[a](S) = \bigcup_{q \in S} pre[a](q)$ for a set of states $S$.
The language $\mathcal{L}(q)$ accepted at a state $q \in Q$ is the set of words that can be read along a run ending in $q$, i.e. all words $a_1 \cdots a_n$, for $n \geq 0$, such that $\delta$ contains transitions $(q_0, a_1, q_1), \ldots, (q_{n-1}, a_n, q_n)$ with $q_0 \in I$ and $q_n = q$. The language $\mathcal{L}(\mathcal{A})$ of $\mathcal{A}$ is then the union $\bigcup_{q \in F} \mathcal{L}(q)$ of languages of its final states.
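The definitions above can be sketched in a few lines of code. The class `FA` and its method names are hypothetical and serve only as an illustration; the example automaton `SING` is our reading of an automaton for Sing(X) over a single track (one occurrence of 1 surrounded by 0's) and is an assumption, not a transcription of any tool's data structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FA:
    states: frozenset
    delta: frozenset      # transitions as triples (source, symbol, target)
    initial: frozenset
    final: frozenset

    def pre(self, symbol, targets):
        """pre[symbol](targets): states with a `symbol`-transition into `targets`."""
        return frozenset(s for (s, a, t) in self.delta if a == symbol and t in targets)

    def accepts(self, word):
        """True iff some run over `word` from an initial state ends in a final state."""
        current = set(self.initial)
        for a in word:
            current = {t for (s, b, t) in self.delta if s in current and b == a}
        return bool(current & self.final)


# Assumed automaton for Sing(X) over one track: exactly one position carries 1.
SING = FA(
    states=frozenset({"p", "q"}),
    delta=frozenset({("p", "0", "p"), ("p", "1", "q"), ("q", "0", "q")}),
    initial=frozenset({"p"}),
    final=frozenset({"q"}),
)

print(SING.accepts("0100"), SING.accepts("11"))
print(SING.pre("1", {"q"}))
```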
3 WS1S
In this section, we give a minimalistic introduction to the weak monadic second-order logic of one successor (WS1S) and outline its explicit decision procedure based on representing sets of models as regular languages and finite automata. See, for instance, Comon et al. [21] for a more thorough introduction.
3.1 Syntax and Semantics of WS1S
WS1S allows quantification over second-order variables, which we denote by upper-case letters $X, Y, \ldots$, that range over finite subsets of $\mathbb{N}_0$. Atomic formulae are of the form (i) $X \subseteq Y$, (ii) $\mathrm{Sing}(X)$, (iii) $X = \{0\}$, and (iv) $X = Y + 1$. Formulae are built from the atomic ones using the logical connectives $\wedge$, $\vee$, $\neg$, and the quantifier $\exists \mathcal{X}$ where $\mathcal{X}$ is a finite set of variables (we write $\exists X$ if $\mathcal{X}$ is a singleton $\{X\}$). A model of a WS1S formula $\varphi(\mathcal{X})$ with the set of free variables $\mathcal{X}$ is an assignment $\rho : \mathcal{X} \rightarrow 2^{\mathbb{N}_0}$ of the free variables $\mathcal{X}$ of $\varphi$ to finite subsets of $\mathbb{N}_0$ for which the formula is satisfied, written $\rho \models \varphi$. Satisfaction of atomic formulae is defined as follows: (i) $\rho \models X \subseteq Y$ iff $\rho(X) \subseteq \rho(Y)$, (ii) $\rho \models \mathrm{Sing}(X)$ iff $\rho(X)$ is a singleton set, (iii) $\rho \models X = \{0\}$ iff $\rho(X) = \{0\}$, and (iv) $\rho \models X = Y + 1$ iff $\rho(X) = \{x\}$, $\rho(Y) = \{y\}$, and $x = y + 1$. Satisfaction for formulae obtained using Boolean connectives is defined as usual. A formula $\varphi$ is valid, written $\models \varphi$, iff all assignments of its free variables to finite subsets of $\mathbb{N}_0$ are its models, and satisfiable if it has a model. Wlog we assume that each variable in a formula is quantified at most once.
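3.2 Models as Words
Let $\mathcal{X}$ be a finite set of variables. A symbol $\tau$ over $\mathcal{X}$ is a mapping of all variables in $\mathcal{X}$ to the set $\{0, 1\}$, e.g. $\tau = \{X_1 \mapsto 0, X_2 \mapsto 1\}$ for $\mathcal{X} = \{X_1, X_2\}$, which we will write as $\tau = \left[\begin{smallmatrix} X_1:0 \\ X_2:1 \end{smallmatrix}\right]$ below. The set of all symbols over $\mathcal{X}$ is denoted as $\Sigma_\mathcal{X}$. We use $\bar{0}$ to denote the symbol in $\Sigma_\mathcal{X}$ that maps all variables to 0, i.e. $\bar{0} = \{X \mapsto 0 \mid X \in \mathcal{X}\}$.
An assignment $\rho : \mathcal{X} \rightarrow 2^{\mathbb{N}_0}$ may be encoded as a word $w_\rho$ of symbols over $\mathcal{X}$ in the following way: $w_\rho$ contains 1 in the $(i+1)$-st position of the row for $X$ iff $i \in \rho(X)$. Notice that there exists an infinite number of encodings of $\rho$: the shortest encoding is $w_\rho^s$ of the length $n + 1$, where $n$ is the largest number appearing in any of the sets assigned to a variable of $\mathcal{X}$ in $\rho$, or $-1$ when all these sets are empty. The rest of the encodings are all those corresponding to $w_\rho^s$ extended with an arbitrary number of $\bar{0}$'s appended to its end. For example, $\left[\begin{smallmatrix} X_1:0 \\ X_2:1 \end{smallmatrix}\right]$, $\left[\begin{smallmatrix} X_1:00 \\ X_2:10 \end{smallmatrix}\right]$, $\left[\begin{smallmatrix} X_1:000 \\ X_2:100 \end{smallmatrix}\right]$, $\left[\begin{smallmatrix} X_1:000\ldots0 \\ X_2:100\ldots0 \end{smallmatrix}\right]$ are all encodings of the assignment $\rho = \{X_1 \mapsto \emptyset, X_2 \mapsto \{0\}\}$. We use $\mathcal{L}(\varphi) \subseteq \Sigma_\mathcal{X}^*$ to denote the language of all encodings of a formula $\varphi$'s models, where $\mathcal{X}$ are the free variables of $\varphi$.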
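The encoding of assignments as words can be sketched as follows. The function `encode`, its `padding` parameter, and the representation of a symbol as a tuple of bits (one bit per variable, in the order given by `variables`) are assumptions chosen for illustration only.

```python
def encode(assignment, variables, padding=0):
    """Encode `assignment` (variable -> finite set of naturals) as a list of symbols.

    Each symbol is a tuple of bits, one per variable in the order of `variables`.
    With padding == 0 the result is the shortest encoding; a positive padding
    appends that many all-zero symbols.
    """
    # Largest number used by any variable, or -1 if all sets are empty.
    largest = max((n for var in variables for n in assignment[var]), default=-1)
    length = largest + 1 + padding
    return [
        tuple(1 if i in assignment[var] else 0 for var in variables)
        for i in range(length)
    ]


rho = {"X1": set(), "X2": {0}}
print(encode(rho, ["X1", "X2"]))             # shortest encoding
print(encode(rho, ["X1", "X2"], padding=2))  # two padding symbols appended
```

Every value of `padding` yields another encoding of the same assignment, which mirrors the infinitely many encodings discussed above.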
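For two sets $\mathcal{X}$ and $\mathcal{Y}$ of variables and any two symbols $\tau_1, \tau_2 \in \Sigma_\mathcal{X}$, we write $\tau_1 \sim_\mathcal{Y} \tau_2$ iff $\forall X \in \mathcal{X} \setminus \mathcal{Y} : \tau_1(X) = \tau_2(X)$, i.e. the two symbols differ (at most) in the values of variables in $\mathcal{Y}$. The relation $\sim_\mathcal{Y}$ is generalized to words such that $w_1 \sim_\mathcal{Y} w_2$ iff $|w_1| = |w_2|$ and $\forall 1 \leq i \leq |w_1| : w_1[i] \sim_\mathcal{Y} w_2[i]$. For a language $L \subseteq \Sigma_\mathcal{X}^*$, we define $\pi_\mathcal{Y}(L)$ as the language of words $w$ that are $\sim_\mathcal{Y}$-equivalent with some word $w' \in L$. Seen from the point of view of encodings of sets of assignments, $\pi_\mathcal{Y}(L)$ encodes all assignments that may differ from those encoded by $L$ (only) in the values of variables from $\mathcal{Y}$. If $\mathcal{Y}$ is disjoint with the free variables of $\varphi$, then $\pi_\mathcal{Y}(\mathcal{L}(\varphi))$ corresponds to the so-called cylindrification of $\mathcal{L}(\varphi)$, and if it is their subset, then $\pi_\mathcal{Y}(\mathcal{L}(\varphi))$ corresponds to the so-called projection [21]. We use $\pi_Y$ to denote $\pi_{\{Y\}}$ for a variable $Y$.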
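Consider formulae over the set $V$ of variables. Let $free(\varphi)$ be the set of free variables of $\varphi$, and let $\mathcal{L}^V(\varphi) = \pi_{V \setminus free(\varphi)}(\mathcal{L}(\varphi))$ be the language $\mathcal{L}(\varphi)$ cylindrified wrt those variables of $V$ that are not free in $\varphi$. Let $\varphi$ and $\psi$ be formulae and assume that $\mathcal{L}^V(\varphi)$ and $\mathcal{L}^V(\psi)$ are languages of encodings of their models cylindrified wrt $V$. Languages of formulae obtained from $\varphi$ and $\psi$ using logical connectives are defined by equations (3) to (6).

$\mathcal{L}^V(\varphi \vee \psi) = \mathcal{L}^V(\varphi) \cup \mathcal{L}^V(\psi)$    (3)
$\mathcal{L}^V(\varphi \wedge \psi) = \mathcal{L}^V(\varphi) \cap \mathcal{L}^V(\psi)$    (4)
$\mathcal{L}^V(\neg\varphi) = \Sigma_V^* \setminus \mathcal{L}^V(\varphi)$    (5)
$\mathcal{L}^V(\exists \mathcal{X} : \varphi) = \pi_\mathcal{X}(\mathcal{L}^V(\varphi)) - \bar{0}^*$    (6)

Equations (3)-(5) above are straightforward: Boolean connectives translate to the corresponding set operators over the universe of encodings of assignments of variables in $V$. Existential quantification $\exists \mathcal{X} : \varphi$ translates into a composition of two language transformations. First, $\pi_\mathcal{X}$ makes the valuations of variables of $\mathcal{X}$ arbitrary, which intuitively corresponds to forgetting everything about values of variables in $\mathcal{X}$ (notice that this is a different use of $\pi_\mathcal{X}$ than the cylindrification since here variables of $\mathcal{X}$ are free variables of $\varphi$). The second step, removing suffixes of $\bar{0}$'s from the model encodings, is necessary since $\pi_\mathcal{X}(\mathcal{L}^V(\varphi))$ might be missing some encodings of models of $\exists \mathcal{X} : \varphi$. For example, suppose that $V = \{X, Y\}$ and the only model of $\varphi$ is $\{X \mapsto \{0\}, Y \mapsto \{1\}\}$, yielding $\mathcal{L}^V(\varphi) = \left[\begin{smallmatrix} X:10 \\ Y:01 \end{smallmatrix}\right]\left[\begin{smallmatrix} X:0 \\ Y:0 \end{smallmatrix}\right]^*$. Then $\pi_Y(\mathcal{L}^V(\varphi)) = \left[\begin{smallmatrix} X:10 \\ Y:?? \end{smallmatrix}\right]\left[\begin{smallmatrix} X:0 \\ Y:? \end{smallmatrix}\right]^*$ does not contain the shortest encoding $\left[\begin{smallmatrix} X:1 \\ Y:? \end{smallmatrix}\right]$ (where each '?' denotes an arbitrary value) of the only model $\{X \mapsto \{0\}\}$ of $\exists Y : \varphi$. It only contains its variants with at least one $\bar{0}$ appended to it. This generally happens for models of $\varphi$ where the largest number in the value of the variable $Y$ being eliminated is larger than the maximum number found in the values of the free variables of $\exists Y : \varphi$. The role of the $-\bar{0}^*$ quotient is to include the missing encodings of models with a smaller number of trailing $\bar{0}$'s into the language.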
The standard approach to decide satisfiability of a WS1S formula $\varphi$ with the set of variables $V$ is to construct an automaton $\mathcal{A}_\varphi$ accepting $\mathcal{L}^V(\varphi)$ and check emptiness of its language. The construction starts with simple pre-defined automata $\mathcal{A}_\psi$ for $\varphi$'s atomic formulae $\psi$ (see Fig. 2 for examples of automata for selected atomic formulae and e.g. [21] for more details) accepting cylindrified languages $\mathcal{L}^V(\psi)$ of models of $\psi$. These are simple regular languages. The construction then continues by inductively constructing automata $\mathcal{A}_{\varphi'}$ accepting languages $\mathcal{L}^V(\varphi')$ of models for all other sub-formulae $\varphi'$ of $\varphi$, using equations (3)-(6) above. The language operators used in the rules are implemented using standard automata-theoretic constructions (see [21]).
4 Satisfiability via Language Term Evaluation
This section introduces the basic version of our symbolic algorithm for deciding satisfiability of a WS1S formula $\varphi$ with a set of variables $V$. Its optimized version is the subject of the next section. To simplify presentation, we consider the particular case of ground formulae (i.e. formulae without free variables), for which satisfiability corresponds to validity. Satisfiability of a formula with free variables can be reduced to this case by prefixing it with existential quantification over the free variables. If $\varphi$ is ground, the language $\mathcal{L}^V(\varphi)$ is either $\Sigma_V^*$ in the case $\varphi$ is valid, or empty if $\varphi$ is invalid. Then, to decide the validity of $\varphi$, it suffices to test if $\epsilon \in \mathcal{L}^V(\varphi)$.
Our algorithm evaluates the so-called language term $t_\varphi$, a symbolic representation of the language $\mathcal{L}^V(\varphi)$, whose structure reflects the construction of $\mathcal{A}_\varphi$. It is a (finite) term generated by the following grammar:

$t ::= \mathcal{A} \mid t \cup t \mid t \cap t \mid \overline{t} \mid \pi_X(t) \mid t - \alpha \mid t - \alpha^* \mid T$

where $\mathcal{A}$ is a finite automaton over the alphabet $\Sigma_V$, $\alpha$ is a symbol $\tau \in \Sigma_V$ or a set $S \subseteq \Sigma_V$ of symbols, and $T$ is a finite set of terms. We use marked variants of the operators to distinguish the syntax of language terms manipulated by our algorithm from the cases when we wish to denote the semantical meaning of the operators. A term of the form $t - \alpha^*$ is called a star quotient, or shortly a star, and a term $t - \tau$ is a symbol quotient. Both are also called quotients. The language $\mathcal{L}(t)$ of a term $t$ is obtained by taking the languages of the automata in its leaves and combining them using the term operators. Terms with the same language are language-equivalent. The special terms $T$, having the form of a set, represent intermediate states of fixpoint computations used to eliminate star quotients. The language of a set $T$ equals the union of the languages of its elements. The reason for having two ways of expressing a union of terms is a different treatment of $\cup$ and $T$, which will be discussed later. We use the standard notion of isomorphism of two terms, extended with having two set terms isomorphic iff they contain isomorphic elements.
A formula $\varphi$ is initially transformed into the term $t_\varphi$ by replacing every atomic sub-formula $\psi$ in $\varphi$ by the automaton $\mathcal{A}_\psi$ accepting $\mathcal{L}^V(\psi)$, and by replacing the logical connectives with dotted term operators according to equations (3)-(6) of Section 3.2. The core of our algorithm is evaluation of the $\epsilon$-membership query $\epsilon \in t_\varphi$, which will also trigger further rewriting of the term.
The $\epsilon$-membership query on a quotient-free term is evaluated using equivalences (7) to (12). Equivalences (7) to (11) reduce tests on terms to Boolean combinations of tests on their sub-terms and allow pushing the test towards the automata at the term's leaves. Equivalence (12) then reduces it to testing intersection of the initial states $I(\mathcal{A})$ and the final states $F(\mathcal{A})$ of an automaton.

$\epsilon \in T$ iff $\epsilon \in t$ for some $t \in T$    (7)
$\epsilon \in t \cup t'$ iff $\epsilon \in t$ or $\epsilon \in t'$    (8)
$\epsilon \in t \cap t'$ iff $\epsilon \in t$ and $\epsilon \in t'$    (9)
$\epsilon \in \overline{t}$ iff not $\epsilon \in t$    (10)
$\epsilon \in \pi_X(t)$ iff $\epsilon \in t$    (11)
$\epsilon \in \mathcal{A}$ iff $I(\mathcal{A}) \cap F(\mathcal{A}) \neq \emptyset$    (12)
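Equivalences (7) to (12) translate directly into a recursive test. The sketch below is only illustrative: the tuple encoding of terms (`"set"`, `"or"`, `"and"`, `"not"`, `"proj"`, `"fa"`), the function name `eps_in`, and the `initial` table mapping automaton names to their initial states are all assumptions and do not describe the data structures of any existing tool.

```python
def eps_in(term, initial):
    """Decide epsilon-membership for a quotient-free term.

    Terms are nested tuples (a hypothetical encoding):
      ("set", frozenset_of_terms)         set term T
      ("or", t1, t2) / ("and", t1, t2)    union / intersection
      ("not", t)                          complement
      ("proj", X, t)                      projection pi_X(t)
      ("fa", name, final_states)          automaton leaf with its current final states
    `initial` maps an automaton name to its set of initial states.
    """
    kind = term[0]
    if kind == "set":                                    # rule (7)
        return any(eps_in(t, initial) for t in term[1])
    if kind == "or":                                     # rule (8), short-circuits
        return eps_in(term[1], initial) or eps_in(term[2], initial)
    if kind == "and":                                    # rule (9), short-circuits
        return eps_in(term[1], initial) and eps_in(term[2], initial)
    if kind == "not":                                    # rule (10)
        return not eps_in(term[1], initial)
    if kind == "proj":                                   # rule (11)
        return eps_in(term[2], initial)
    if kind == "fa":                                     # rule (12)
        _, name, final = term
        return bool(initial[name] & final)
    raise ValueError(f"not a quotient-free term: {kind}")


INITIAL = {"sing": {"p"}, "succ": {"r"}}
leaf_q = ("fa", "sing", frozenset({"q"}))
leaf_t = ("fa", "succ", frozenset({"t"}))
print(eps_in(("and", leaf_q, ("proj", "Y", leaf_t)), INITIAL))
```

Python's `or` and `and` already evaluate their right operand only when needed, so the sketch exhibits the short-circuit behaviour of rules (8) and (9) for free.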
Equivalences (7) to (11) do not apply to quotients, which arise from quantified sub-formulae (cf. equation (6) in Section 3.2). A quotient is therefore (in the basic version) first rewritten into a language-equivalent quotient-free form. This rewriting corresponds to saturating the set of final states of an automaton in the explicit decision procedure with all states in their $pre^*$-image over $\bar{0}$. In our procedure, we use rules (13) and (14).

$\pi_X(T) - \bar{0}^* \rightarrow \pi_X(T - \pi_X(\bar{0})^*)$    (13)

Rule (13) transforms the term into a form in which a star quotient is applied on a plain set of terms rather than on a projection. A star quotient of a set is then eliminated using a fixpoint computation that saturates the set with all quotients of its elements wrt the set of symbols $S = \pi_X(\bar{0})$. A single iteration is implemented using rule (14). There, $T \ominus S$ is the set $\{t - \tau \mid t \in T \wedge \tau \in S\}$ of quotients of terms in $T$ wrt symbols of $S$. (Note that (14) uses the identity $S^* = \{\epsilon\} \cup S^* S$.)

$T - S^* \rightarrow \begin{cases} T & \text{if } T \ominus S \sqsubseteq T \\ (T \cup (T \ominus S)) - S^* & \text{otherwise} \end{cases}$    (14)

Termination of the fixpoint computation is decided based on the subsumption relation $\sqsubseteq$, which is some relation that under-approximates language inclusion of terms. When the condition holds, then the language of $T$ is stable wrt quotienting by $S$, i.e. $\mathcal{L}(T) = \mathcal{L}(T - S^*)$. In the basic algorithm, we use term isomorphism for $\sqsubseteq$; later, we provide a more precise subsumption relation with a good trade-off between precision and cost. Note that an iteration of rule (14) can be implemented efficiently by the standard worklist algorithm, which extends $T$ only with quotients $T' \ominus S$ of terms $T'$ that were added to $T$ in the previous iteration.
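The worklist implementation of rule (14) can be sketched on the simplest kind of terms, namely automaton leaves identified with their sets of final states. Everything below is an assumption made for illustration: the function names, the transition set `DELTA` (our reading of the automaton for $Y = X + 1$, with symbols written as pairs of bits for $X$ and $Y$), and the two candidate subsumption relations passed as parameters.

```python
# Assumed transitions of an automaton for Y = X + 1 over symbols (x_bit, y_bit).
DELTA = {
    ("r", (0, 0), "r"),
    ("r", (1, 0), "s"),
    ("s", (0, 1), "t"),
    ("t", (0, 0), "t"),
}


def pre(symbol, targets):
    """Final states after quotienting a leaf by `symbol` (cf. pre[symbol])."""
    return frozenset(s for (s, a, t) in DELTA if a == symbol and t in targets)


def saturate(terms, symbols, quotient, subsumed_by):
    """Saturate `terms` with symbol quotients until every new quotient is subsumed."""
    result = list(terms)
    worklist = list(terms)            # terms whose quotients are still to be computed
    while worklist:
        term = worklist.pop()
        for tau in symbols:
            new = quotient(term, tau)
            if any(subsumed_by(new, old) for old in result):
                continue              # nothing new: this quotient is already covered
            result.append(new)
            worklist.append(new)
    return result


S = [(0, 0), (0, 1)]                  # pi_Y of the all-zero symbol
start = [frozenset({"t"})]
by_equality = saturate(start, S, lambda t, tau: pre(tau, t), lambda a, b: a == b)
by_inclusion = saturate(start, S, lambda t, tau: pre(tau, t), lambda a, b: a <= b)
print(by_equality)
print(by_inclusion)
```

With equality playing the role of isomorphism, the fixpoint also contains the leaf with no final states; with inclusion of final-state sets as the subsumption relation, that leaf is never added, so the computation stabilizes on a smaller set.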
The set $T \ominus S$ introduces quotient terms of the form $t - \tau$, for $\tau \in \Sigma_V$, which also need to be eliminated to facilitate the $\epsilon$-membership test. This is done using rewriting rules (15) to (19), where $pre[\tau](\mathcal{A})$ is $\mathcal{A}$ with its set of final states $F$ replaced by $pre[\tau](F)$.

$(t \cup t') - \tau \rightarrow (t - \tau) \cup (t' - \tau)$    (15)
$(t \cap t') - \tau \rightarrow (t - \tau) \cap (t' - \tau)$    (16)
$\overline{t} - \tau \rightarrow \overline{t - \tau}$    (17)
$\pi_X(t) - \tau \rightarrow \pi_X(t - \pi_X(\tau))$    (18)
$\mathcal{A} - \tau \rightarrow pre[\tau](\mathcal{A})$    (19)

If $t$ is quotient-free, then rules (15)-(18) applied to $t - \tau$ push the symbol quotient down the structure of $t$ towards the automata in the leaves, where it is eliminated by rule (19). Otherwise, if $t$ is not quotient-free, it can be re-written using rules (13)-(19). In particular, if $t$ is a star quotient of a quotient-free term, then the quotient-free form of $t$ can be obtained by iterating rule (14), combined with rules (15)-(19) to transform the new terms in $T$ into a quotient-free form. Finally, terms with multiple quotients can be rewritten to the quotient-free form inductively on their structure. Every inductive step rewrites some star quotient of a quotient-free sub-term into the quotient-free form. Note that this procedure is bound to terminate since the terms generated by quotienting a star have the same structure as the original term, differing only in the states in their leaves. As the number of the states is finite, so is the number of the terms.
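Rules (15) to (19) can be read as a recursive function that pushes a symbol quotient to the leaves. The following sketch rests on several assumptions of ours: terms are nested tuples, a symbol is a tuple of bits, a projection stores the index of the projected variable, and the quotient by the set of symbols $\pi_X(\tau)$ in rule (18) is represented as a set term collecting the quotients by each symbol in that set. The transition table `DELTA` is our reading of the automaton for $Y = X + 1$.

```python
def quotient(term, tau, delta):
    """Push the symbol quotient `term - tau` down to the automata in the leaves.

    Hypothetical term encoding: ("or", t1, t2), ("and", t1, t2), ("not", t),
    ("proj", var_index, t), ("set", frozenset_of_terms), ("fa", name, final_states).
    `delta` maps an automaton name to its set of (source, symbol, target) triples.
    """
    kind = term[0]
    if kind in ("or", "and"):                        # rules (15) and (16)
        return (kind, quotient(term[1], tau, delta), quotient(term[2], tau, delta))
    if kind == "not":                                # rule (17)
        return ("not", quotient(term[1], tau, delta))
    if kind == "proj":                               # rule (18)
        i = term[1]
        # pi_X(tau): symbols that agree with tau everywhere except possibly at X.
        variants = {tau[:i] + (b,) + tau[i + 1:] for b in (0, 1)}
        inner = frozenset(quotient(term[2], v, delta) for v in variants)
        return ("proj", i, ("set", inner))
    if kind == "set":                                # element-wise, by the union semantics
        return ("set", frozenset(quotient(t, tau, delta) for t in term[1]))
    if kind == "fa":                                 # rule (19)
        _, name, final = term
        pre = frozenset(s for (s, a, t) in delta[name] if a == tau and t in final)
        return ("fa", name, pre)
    raise ValueError(f"unexpected term kind: {kind}")


# Assumed transitions of an automaton for Y = X + 1; symbols are (x_bit, y_bit).
DELTA = {"succ": {("r", (0, 0), "r"), ("r", (1, 0), "s"),
                  ("s", (0, 1), "t"), ("t", (0, 0), "t")}}
leaf = ("fa", "succ", frozenset({"t"}))
print(quotient(leaf, (0, 1), DELTA))
print(quotient(("proj", 1, leaf), (0, 0), DELTA))
```

The result has the same shape as the input term and differs only in the final states stored in the leaves, which is the observation the termination argument above relies on.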
[Figure 1: computation tree of $\epsilon$-membership tests on sub-terms of $t_\varphi$; edges carry circled labels ① to ⑲ giving the order of evaluation.]
Fig. 1. Example of deciding validity of the formula $\varphi \equiv \exists X : \mathrm{Sing}(X) \wedge (\exists Y : Y = X + 1)$
Example 1. We will show the workings of our procedure using an example of testing satisfiability of the formula $\varphi \equiv \exists X : \mathrm{Sing}(X) \wedge (\exists Y : Y = X + 1)$. We start by rewriting $\varphi$ into a term $t_\varphi$ representing its language $\mathcal{L}^V(\varphi)$:

$t_\varphi \equiv \pi_X(\{\{q\} \cap \pi_Y(\{t\} - \pi_Y(\bar{0})^*)\} - \pi_X(\bar{0})^*)$

(we have already used rule (13) twice). In the example, a set $R$ of states will denote an automaton obtained from $\mathcal{A}_{\mathrm{Sing}(X)}$ or $\mathcal{A}_{Y=X+1}$ (cf. Fig. 2) by setting the final states to $R$. Red nodes in the computation tree denote $\epsilon$-membership tests that failed and green nodes those that succeeded. Grey nodes denote tests that were not evaluated. As noted previously, it holds that $\models \varphi$ iff $\epsilon \in t_\varphi$. The sequence of computation steps for determining the $\epsilon$-membership test is shown using the computation tree in Fig. 1. The nodes contain $\epsilon$-membership tests on terms and the test of each node is equivalent to a conjunction or disjunction of tests of its children. Leaves of the form $\epsilon \in R$ are evaluated as testing intersection of $R$ with the initial states of the corresponding automaton. In the example, we also use the lazy evaluation technique (described in Section 5.2), which allows us to evaluate $\epsilon$-membership tests on partially computed fixpoints.

[Figure 2: a) the automaton $\mathcal{A}_{\mathrm{Sing}(X)}$ with states $p$, $q$; b) the automaton $\mathcal{A}_{Y=X+1}$ with states $r$, $s$, $t$.]
Fig. 2. Example automata
The computation starts at the root of the tree and proceeds along the edges in the order given by their circled labels. Edges ② and ④ were obtained by a partial unfolding of a fixpoint computation by rule (14) and immediately applying the $\epsilon$-membership test on the obtained terms. After step ③, we conclude that $\epsilon \notin \{q\}$ since $\{p\} \cap \{q\} = \emptyset$, which further refutes the whole conjunction below ②, so the overall result depends on the sub-tree starting by ④. The steps ⑤ and ⑨ are another application of rule (14), which transforms $\pi_X(\bar{0})$ to the symbols $\left[\begin{smallmatrix} X:0 \\ Y:0 \end{smallmatrix}\right]$ and $\left[\begin{smallmatrix} X:1 \\ Y:0 \end{smallmatrix}\right]$ respectively. The branch ⑤ pushes the $-\left[\begin{smallmatrix} X:0 \\ Y:0 \end{smallmatrix}\right]$ quotient to the leaf term using rules (16) and (9) and eventually fails because the set of predecessors of $\{q\}$ over the symbol $\left[\begin{smallmatrix} X:0 \\ Y:0 \end{smallmatrix}\right]$ in $\mathcal{A}_{\mathrm{Sing}(X)}$ is the empty set. On the other hand, the evaluation of the branch ⑨ continues using rule (16), succeeding in the branch ⑩. The branch ⑫ is further evaluated by projecting the quotient $-\left[\begin{smallmatrix} X:1 \\ Y:0 \end{smallmatrix}\right]$ wrt $Y$ (rule (18)) and unfolding the inner star quotient zero times (⑭, failed) and once (⑯). The unfolding of one symbol eventually succeeds in step ⑲, which leads to concluding validity of $\varphi$. Note that thanks to the lazy evaluation, none of the fixpoint computations had to be fully unfolded. □
5 An Efficient Algorithm
In this section, we show how to build an efficient algorithm based on the symbolic term rewriting approach from Section 4. The optimization opportunities offered by the symbolic approach are to a large degree orthogonal to those of the explicit approach. The main difference is in the available techniques for reducing the explored automata state space. While the explicit construction in MONA profits mainly from calling automata minimization after every step of the inductive construction, the symbolic algorithm can use generalized subsumption and lazy evaluation. Neither of the two approaches seems to be compatible with both of these kinds of techniques (at least in their pure variant, disregarding the possibility of a combination of the two approaches discussed below).
Efficient data structures have a major impact on performance of the decision procedure. The efficiency of the explicit procedure implemented in MONA is to a large degree due to the BDD-based representation of automata transition relations. BDDs compactly represent transition functions over large alphabets and provide efficient implementation of operations needed in the explicit algorithm. Our symbolic algorithm can, on the other hand, benefit from a representation of terms as DAGs where all occurrences of the same sub-term are represented by a unique DAG node. Moreover, we assume the nodes to be associated with languages rather than with concrete terms (allowing the term associated with a node to change during its further processing, without a need to transform the DAG structure as long as the language of the term does not change).
We also show that although our algorithm uses a completely different data structure than the explicit one, it can still exploit a BDD-based representation of transitions of the automata in the leaves of terms. Moreover, our symbolic algorithm can also be combined with the explicit algorithm. Particularly, it turns out that, sometimes, it pays off to translate to automata sub-formulae larger than the atomic ones. Our procedure can then be viewed as an extension of MONA that takes over once MONA stops managing. Lastly, optimizations on the level of formulae often have a huge impact on the performance of our algorithm. The technique that we found most helpful is the so-called anti-prenexing. We elaborate on all these optimizations in the rest of this section.
5.1 Subsumption
Our first technique for reducing the explored state space is based on the notion of subsumption between terms, which is similar to the subsumption used in antichain-based universality and inclusion checking over finite automata [10]. We define subsumption as the relation $\sqsubseteq_s$ on terms that is given by equivalences (20)-(25).

$T \sqsubseteq_s T'$ iff $\forall t \in T\ \exists t' \in T' : t \sqsubseteq_s t'$    (20)
$t_1 \cup t_2 \sqsubseteq_s t'_1 \cup t'_2$ iff $t_1 \sqsubseteq_s t'_1$ and $t_2 \sqsubseteq_s t'_2$    (21)
$t_1 \cap t_2 \sqsubseteq_s t'_1 \cap t'_2$ iff $t_1 \sqsubseteq_s t'_1$ and $t_2 \sqsubseteq_s t'_2$    (22)
$\overline{t} \sqsubseteq_s \overline{t'}$ iff $t \sqsupseteq_s t'$    (23)
$\pi_X(t) \sqsubseteq_s \pi_X(t')$ iff $t \sqsubseteq_s t'$    (24)
$\mathcal{A} \sqsubseteq_s \mathcal{A}'$ iff $F(\mathcal{A}) \subseteq F(\mathcal{A}')$    (25)

Notice that, in rule (20), all terms of $T$ are tested against all terms of $T'$, while in rule (21), the left-hand side term $t_1$ is not tested against the right-hand side term $t'_2$ (and similarly for $t_2$ and $t'_1$). The reason why $\cup$ is order-sensitive is that the terms on different sides of the $\cup$ are assumed to be built from automata with disjoint sets of states (originating from different sub-formulae of the original formula), and hence the subsumption test on them can never conclude positively. The subsumption under-approximates language inclusion and can therefore be used for $\sqsubseteq$ in rule (14). It is far more precise than isomorphism and its use leads to an earlier termination of fixpoint computations.
Moreover, $\sqsubseteq_s$ can be used to prune star quotient terms $T - S^*$ while preserving their language. Since the semantics of the set $T$ is the union of the languages of its elements, elements subsumed by others can be removed while preserving the language. $T$ can thus be kept in the form of an antichain of $\sqsubseteq_s$-incomparable terms. The pruning corresponds to using the rewriting rule (26).

$T \rightarrow T \setminus \{t\}$ if there is $t' \in T \setminus \{t\}$ with $t \sqsubseteq_s t'$    (26)
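The structural subsumption of equivalences (20) to (25) and the antichain pruning of rule (26) can be sketched as follows. The tuple encoding of terms and the function names `subsumed` and `prune` are hypothetical and chosen only for this illustration; leaves of the same automaton are compared by inclusion of their final-state sets, as in (25).

```python
def subsumed(t1, t2):
    """Structural subsumption t1 <=_s t2 on tuple-encoded terms (an assumed encoding)."""
    if t1[0] != t2[0]:
        return False                                   # different shapes never subsume
    kind = t1[0]
    if kind == "set":                                  # (20): every element is covered
        return all(any(subsumed(a, b) for b in t2[1]) for a in t1[1])
    if kind in ("or", "and"):                          # (21), (22): position-wise
        return subsumed(t1[1], t2[1]) and subsumed(t1[2], t2[2])
    if kind == "not":                                  # (23): direction is reversed
        return subsumed(t2[1], t1[1])
    if kind == "proj":                                 # (24)
        return t1[1] == t2[1] and subsumed(t1[2], t2[2])
    if kind == "fa":                                   # (25): inclusion of final states
        return t1[1] == t2[1] and t1[2] <= t2[2]
    return False


def prune(terms):
    """Keep only subsumption-maximal terms, i.e. an antichain (cf. rule (26))."""
    kept = []
    for t in terms:
        if any(subsumed(t, k) for k in kept):
            continue                                   # t is covered by a kept term
        kept = [k for k in kept if not subsumed(k, t)] + [t]
    return kept


small = ("and", ("fa", "A", frozenset({1})), ("not", ("fa", "B", frozenset({1, 2}))))
large = ("and", ("fa", "A", frozenset({1, 2})), ("not", ("fa", "B", frozenset({1}))))
print(subsumed(small, large), subsumed(large, small))
print(prune([small, large]) == [large])
```

Note how the complement flips the direction of the comparison: the term `large` has the smaller set of final states under its complement, which is what makes it the larger term overall.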
5.2 Lazy Evaluation
The top-down nature of our technique allows us to postpone evaluation of some of the computation branches in case the so-far evaluated part is sufficient for determining the result of the evaluated $\epsilon$-membership or subsumption test. We call this optimization lazy evaluation. A basic variant of lazy evaluation short-circuits elimination of quotients from branches of $\cup$ and $\cap$. When testing whether $\epsilon \in t \cup t'$ (rule (8)), we first evaluate, e.g., the test $\epsilon \in t$, and when it holds, we can completely avoid exploring $t'$ and evaluating quotients there. When testing $\epsilon \in t \cap t'$, we can proceed analogously if one of the two terms is shown not to contain $\epsilon$. Rules (21) and (22) offer similar opportunities for short-circuiting evaluation of subsumption of $\cup$ and $\cap$.
Let us note that subsumption is in a different position than $\epsilon$-membership since correctness of our algorithm depends on the precision of the $\epsilon$-membership test, but subsumption may be evaluated in any way that under-approximates inclusion of languages of terms (and over-approximates isomorphism in order to guarantee termination). Hence, the $\epsilon$-membership test must enforce eliminating quotients until it can conclude the result, while there is a choice in the case of the subsumption. If subsumption is tested on quotients, it can either eliminate them, or it can return the (safe) negative answer. However, this choice comes with a trade-off. Subsumption eliminating quotients is more expensive but also more precise. The higher precision allows better pruning of the state space and earlier termination of fixpoint computation, which, according to our empirical experience, pays off.
Lazy evaluation can also reduce the number of iterations of a star. The iterations can be computed on demand, only when required by the tests. The idea is to try to conclude a test $\epsilon \in T - S^*$ based on the intermediate state $T$ of the fixpoint computation. This can be done since $\mathcal{L}(T)$ always under-approximates $\mathcal{L}(T - S^*)$, hence if $\epsilon \in \mathcal{L}(T)$, then $\epsilon \in \mathcal{L}(T - S^*)$. Continuing the fixpoint computation is then unnecessary.
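The on-demand unfolding of a star can be sketched as a search that stops at the first term containing $\epsilon$. As before, this is only an illustration under assumptions of ours: leaves are identified with sets of final states, `DELTA` is our reading of the automaton for $Y = X + 1$ with assumed initial state `r`, and inclusion of final-state sets serves as the subsumption relation.

```python
from collections import deque

# Assumed transitions of an automaton for Y = X + 1; symbols are (x_bit, y_bit).
DELTA = {
    ("r", (0, 0), "r"),
    ("r", (1, 0), "s"),
    ("s", (0, 1), "t"),
    ("t", (0, 0), "t"),
}
INITIAL = {"r"}


def pre(symbol, targets):
    return frozenset(s for (s, a, t) in DELTA if a == symbol and t in targets)


def eps_in_star(terms, symbols, quotient, eps_in, subsumed_by):
    """Lazily decide epsilon-membership in `terms - symbols*`.

    Quotients are computed only until some term contains epsilon (positive answer)
    or until every pending term is subsumed by an explored one (fixpoint, negative).
    """
    explored = []
    pending = deque(terms)
    while pending:
        term = pending.popleft()
        if any(subsumed_by(term, old) for old in explored):
            continue
        if eps_in(term):
            return True               # early termination: no need to finish the fixpoint
        explored.append(term)
        pending.extend(quotient(term, tau) for tau in symbols)
    return False


def contains_eps(final):
    return bool(INITIAL & final)


def step(final, tau):
    return pre(tau, final)


def included(a, b):
    return a <= b


start = [frozenset({"t"})]
print(eps_in_star(start, [(0, 0), (0, 1)], step, contains_eps, included))
print(eps_in_star(start, [(0, 0), (0, 1), (1, 0), (1, 1)], step, contains_eps, included))
```

In the second call the search stops as soon as the leaf with final state `r` is reached, without exploring the remaining pending quotients.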
The above mechanism alone is, however, rather insufficient in the case of nested stars. Assume that an inner star fixpoint computation was terminated in a state $T - S^*$