Lazy Automata Techniques for WS1S

Tomáš Fiedor¹, Lukáš Holík¹, Petr Janků¹, Ondřej Lengál¹,², and Tomáš Vojnar¹

¹ FIT, Brno University of Technology, IT4Innovations Centre of Excellence, Czech Republic
² Institute of Information Science, Academia Sinica, Taiwan

arXiv:1701.06282v2 [cs.LO] 24 Jan 2017

Abstract. We present a new decision procedure for the logic WS1S. It originates from the classical approach, which first builds an automaton accepting all models of a formula and then tests whether its language is empty. The main novelty is to test the emptiness on the fly, while constructing a symbolic, term-based representation of the automaton, and prune the constructed state space from parts irrelevant to the test. The pruning is done by a generalization of two techniques used in antichain-based language inclusion and universality checking of finite automata: subsumption and early termination. The richer structure of the WS1S decision problem allows us, however, to elaborate on these techniques in novel ways. Our experiments show that the proposed approach can in many cases significantly outperform the classical decision procedure (implemented in the MONA tool) as well as recently proposed alternatives.

1 Introduction

Weak monadic second-order logic of one successor (WS1S) is a powerful language for reasoning about regular properties of finite words. It has found numerous uses, from software and hardware verification through controller synthesis to computational linguistics, and further on. Some more recent applications of WS1S include verification of pointer programs and deciding related logics [1,2,3,4,5] as well as synthesis from regular specifications [6]. Most of the successful applications were due to the tool MONA [7], which implements classical automata-based decision procedures for WS1S and WS2S (a generalization of WS1S to finite binary trees). The worst case complexity of WS1S is nonelementary [8] and, despite many optimizations implemented in MONA and other tools, the complexity sometimes strikes back. Authors of methods translating their problems to WS1S/WS2S are then forced to either find workarounds to circumvent the complexity blowup, such as in [2], or, often restricting the input of their approach, give up translating to WS1S/WS2S altogether [9].

The classical WS1S decision procedure builds an automaton $\mathcal{A}_\varphi$ accepting all models of the given formula $\varphi$ in a form of finite words, and then tests $\mathcal{A}_\varphi$ for language emptiness. The bottleneck of the procedure is the size of $\mathcal{A}_\varphi$, which can be huge due to the fact that the derivation of $\mathcal{A}_\varphi$ involves many nested automata product constructions and complementation steps, preceded by determinization. The main point of this paper is to avoid the state-space explosion involved in the classical explicit construction by representing automata symbolically and testing the emptiness on the fly, while constructing $\mathcal{A}_\varphi$, and by omitting the state space irrelevant to the emptiness test. This is done using two main principles: lazy evaluation and subsumption-based pruning. These principles have, to some degree, already appeared in the so-called antichain-based testing of language universality and inclusion of finite automata [10]. The richer structure of the WS1S decision problem allows us, however, to elaborate on these principles in novel ways and utilize their power even more.

Overview of our algorithm. Our algorithm originates in the classical WS1S decision procedure as implemented in MONA, in which models of formulae are encoded by finite words over a multi-track binary alphabet where each track corresponds to a variable of $\varphi$.
In order to come closer to this view of formula models as words, we replace the input formula $\varphi$ by a language term $t_\varphi$ describing the language $L_\varphi$ of all word encodings of its models.

In $t_\varphi$, the atomic formulae of $\varphi$ are replaced by predefined automata accepting languages of their models. Boolean operators ($\wedge$, $\vee$, and $\neg$) are turned into the corresponding set operators ($\cap$, $\cup$, and complement) over the languages of models. An existential quantification $\exists X$ becomes a sequence of two operations. First, a projection $\pi_X$ removes information about valuations of the quantified variable $X$ from symbols of the alphabet. After the projection, the resulting language $L$ may, however, contain some but not necessarily all encodings of the models. In particular, encodings with some specific numbers of trailing $\bar{0}$'s, used as a padding, may be missing. Here, $\bar{0}$ denotes the symbol with 0 in each track. To obtain a language containing all encodings of the models, $L$ must be extended to include encodings with any number of trailing $\bar{0}$'s. This corresponds to taking the (right) $\bar{0}^*$-quotient of $L$, written $L - \bar{0}^*$, which is the set of all prefixes of words of $L$ with the remaining suffix in $\bar{0}^*$. We give an example WS1S formula $\varphi$ in (1) and its language term $t_\varphi$ in (2). The dotted operators represent operators over language terms. See Fig. 2 for the automata $\mathcal{A}_{\mathrm{Sing}(X)}$ and $\mathcal{A}_{Y=X+1}$.

\[ \varphi \equiv \exists X : \mathrm{Sing}(X) \wedge (\exists Y : Y = X + 1) \tag{1} \]
\[ t_\varphi \equiv \pi_X\bigl(\bigl\{ \mathcal{A}_{\mathrm{Sing}(X)} \cap (\pi_Y(\mathcal{A}_{Y=X+1}) - \bar{0}^*) \bigr\}\bigr) - \bar{0}^* \tag{2} \]

The main novelty of our work is that we test emptiness of $L_\varphi$ directly over $t_\varphi$. The term is used as a symbolic representation of the automata that would be explicitly constructed in the classical procedure: inductively to the term's structure, starting from the leaves and combining the automata of sub-terms by standard automata constructions that implement the term operators. Instead of first building automata and only then testing emptiness, we test it on the fly during the construction. This offers opportunities to prune out large portions of the state space that turn out not to be relevant for the test.
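The translation of a formula into a language term can be illustrated by a small sketch. It is only an illustration under stated assumptions: formulae are encoded as nested tuples, the function name `to_term` and the ASCII rendering (`&`, `|`, `~`, `pi_X`, `0*`) are hypothetical, and the set braces around the body of a projection are omitted for brevity.

```python
def to_term(phi):
    """Render the language term of a formula given as a nested tuple.

    Assumed (hypothetical) formula encoding:
      ("atom", name), ("and", f, g), ("or", f, g), ("not", f), ("exists", var, f)
    """
    op = phi[0]
    if op == "atom":
        # an atomic formula becomes a leaf automaton accepting its models
        return f"A[{phi[1]}]"
    if op == "and":
        return f"({to_term(phi[1])} & {to_term(phi[2])})"   # intersection
    if op == "or":
        return f"({to_term(phi[1])} | {to_term(phi[2])})"   # union
    if op == "not":
        return f"~{to_term(phi[1])}"                        # complement
    if op == "exists":
        # projection followed by the right quotient by 0*
        return f"(pi_{phi[1]}({to_term(phi[2])}) - 0*)"
    raise ValueError(f"unknown operator {op!r}")


# The example formula: exists X. Sing(X) and (exists Y. Y = X + 1)
phi = ("exists", "X",
       ("and", ("atom", "Sing(X)"),
               ("exists", "Y", ("atom", "Y=X+1"))))
term = to_term(phi)
```

The rendered string has the same nesting as the example term: the inner quantifier contributes one projection and one quotient, and the outer quantifier wraps the intersection in another pair.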
A sub-term $t_\psi$ of $t_\varphi$, corresponding to a sub-formula $\psi$, represents final states of the automaton $\mathcal{A}_\psi$ accepting the language encoding models of $\psi$. Predecessors of the final states represented by $t_\psi$ correspond to quotients of $t_\psi$. All states of $\mathcal{A}_\psi$ could hence be constructed by quotienting $t_\psi$ until fixpoint. By working with terms, our procedure can often avoid building large parts of the automata when they are not necessary for answering the emptiness query. For instance, when testing emptiness of the language of a term $t_1 \cup t_2$, we adopt the lazy approach (in this particular case the so-called short-circuit evaluation) and first test emptiness of the language of $t_1$; if it is non-empty, we do not need to process $t_2$. Testing language emptiness of terms arising from quantified sub-formulae is more complicated since they translate to $-\bar{0}^*$ quotients. We evaluate the test on $t - \bar{0}^*$ by iterating the $-\bar{0}$ quotient from $t$. We either conclude with the positive result as soon as one of the iterations computes a term with a non-empty language, or with the negative one if the fixpoint of the quotient construction is reached. The fixpoint condition is that the so-far computed quotients subsume the newly constructed ones, where subsumption is a relation under-approximating inclusion of languages represented by terms. Subsumption is also used to prune the set of computed terms so that only an antichain of the terms maximal wrt subsumption is kept.

Besides lazy evaluation and subsumption, our approach can benefit from multiple further optimizations. For example, it can be combined with the explicit WS1S decision procedure, which can be used to transform arbitrary sub-terms of $t_\varphi$ to automata. These automata can then be rather small due to minimization, which cannot be applied in the on-the-fly approach (the automata can, however, also explode due to determinisation and product construction, hence this technique comes with a trade-off). We also propose a novel way of utilising BDD-based encoding of automata transition functions in the MONA style for computing quotients of terms.
Finally, our method can exploit various methods of logic-based pre-processing, such as anti-prenexing, which, in our experience, can often significantly reduce the search space of fixpoint computations.

Experiments. We have implemented our decision procedure in a prototype tool called GASTON and compared its performance with other publicly available WS1S solvers on benchmarks from various sources. In the experiments, GASTON managed to win over all other solvers on various parametric families of WS1S formulae that were designed (mostly by authors of other tools) to stress-test WS1S solvers. Moreover, GASTON was able to significantly outperform MONA and other solvers on a number of formulae obtained from various formal verification tasks. This shows that our approach is applicable in practice and has a great potential to handle more complex formulae than those so far obtained in WS1S applications. We believe that the efficiency of our approach can be pushed much further, making WS1S scale enough for new classes of applications.

Related work. As already mentioned above, MONA [7] is the usual tool of choice for deciding WS1S formulae. The efficiency of MONA stems from many optimizations, both higher-level (such as automata minimization, the encoding of first-order variables used in models, or the use of BDDs to encode the transition relation of the automaton) as well as lower-level (e.g. optimizations of hash tables, etc.) [11,12]. Apart from MONA, there are other related tools based on the explicit automata procedure, such as JMOSEL [13] for a related logic M2L(Str), which implements several optimizations (such as second-order value numbering [14]) that allow it to outperform MONA on some benchmarks (MONA also provides an M2L(Str) interface on top of the WS1S decision procedure), or the procedure using symbolic finite automata of D'Antoni et al. in [15].
Our work was originally inspired by antichain techniques for checking universality and inclusion of finite automata [16,10,17], which use symbolic computation and subsumption to prune large state spaces arising from subset construction. In [18], which is a starting point for the current paper, we discussed a basic idea of generalizing these techniques to a WS1S decision procedure. In the current paper we have turned the idea of [18] to an algorithm efficient in practice by roughly the following steps: (1) reformulating the symbolic representation of automata from nested upward and downward closed sets of automata states to more intuitive language terms, (2) generalizing the procedure originally restricted to formulae in the prenex normal form to arbitrary formulae, (3) introduction of lazy evaluation, and (4) many other important optimizations.

Recently, a couple of logic-based approaches for deciding WS1S appeared. Ganzow and Kaiser [19] developed a new decision procedure for the weak monadic second-order logic on inductive structures, within their tool TOSS, which is even more general than WSkS. Their approach completely avoids automata; instead, it is based on Shelah's composition method. The TOSS tool is quite promising as it outperforms MONA on some of the benchmarks. It, however, lacks some features in order to perform meaningful comparison on benchmarks used in practice. Traytel [20], on the other hand, uses the classical decision procedure, recast in the framework of coalgebras. The work focuses on testing equivalence of a pair of formulae, which is performed by finding a bisimulation between derivatives of the formulae. While it is shown that it can outperform MONA on some simple artificial examples, the implementation is not optimized enough and is easily outperformed by the rest of the tools on other benchmarks.

2 Preliminaries on Languages and Automata

A word over a finite alphabet $\Sigma$ is a finite sequence $w = a_1 \cdots a_n$, for $n \geq 0$, of symbols from $\Sigma$. Its $i$-th symbol $a_i$ is denoted by $w[i]$. For $n = 0$, the word is the empty word $\epsilon$. A language $L$ is a set of words over $\Sigma$.
We use the standard language operators of concatenation $L.L'$ and iteration $L^*$. The (right) quotient of a language $L$ wrt the language $L'$ is the language $L - L' = \{u \mid \exists v \in L' : uv \in L\}$. We abuse notation and write $L - w$ to denote $L - \{w\}$, for a word $w \in \Sigma^*$.

A finite automaton (FA) over an alphabet $\Sigma$ is a quadruple $\mathcal{A} = (Q, \delta, I, F)$ where $Q$ is a finite set of states, $\delta \subseteq Q \times \Sigma \times Q$ is a set of transitions, $I \subseteq Q$ is a set of initial states, and $F \subseteq Q$ is a set of final states. The pre-image of a state $q \in Q$ over $a \in \Sigma$ is the set of states $\mathit{pre}[a](q) = \{q' \mid (q', a, q) \in \delta\}$, and it is the set $\mathit{pre}[a](S) = \bigcup_{q \in S} \mathit{pre}[a](q)$ for a set of states $S$.

The language $\mathcal{L}(q)$ accepted at a state $q \in Q$ is the set of words that can be read along a run ending in $q$, i.e. all words $a_1 \cdots a_n$, for $n \geq 0$, such that $\delta$ contains transitions $(q_0, a_1, q_1), \ldots, (q_{n-1}, a_n, q_n)$ with $q_0 \in I$ and $q_n = q$. The language $\mathcal{L}(\mathcal{A})$ of $\mathcal{A}$ is then the union $\bigcup_{q \in F} \mathcal{L}(q)$ of languages of its final states.

3 WS1S

In this section, we give a minimalistic introduction to the weak monadic second-order logic of one successor (WS1S) and outline its explicit decision procedure based on representing sets of models as regular languages and finite automata. See, for instance, Comon et al. [21] for a more thorough introduction.

3.1 Syntax and Semantics of WS1S

WS1S allows quantification over second-order variables, which we denote by upper-case letters $X, Y, \ldots$, that range over finite subsets of $\mathbb{N}_0$. Atomic formulae are of the form (i) $X \subseteq Y$, (ii) $\mathrm{Sing}(X)$, (iii) $X = \{0\}$, and (iv) $X = Y + 1$.
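Formulae are built from the atomic ones using the logical connectives $\wedge$, $\vee$, $\neg$, and the quantifier $\exists \mathcal{X}$ where $\mathcal{X}$ is a finite set of variables (we write $\exists X$ if $\mathcal{X}$ is a singleton $\{X\}$). A model of a WS1S formula $\varphi(\mathcal{X})$ with the set of free variables $\mathcal{X}$ is an assignment $\rho : \mathcal{X} \to 2^{\mathbb{N}_0}$ of the free variables $\mathcal{X}$ of $\varphi$ to finite subsets of $\mathbb{N}_0$ for which the formula is satisfied, written $\rho \models \varphi$. Satisfaction of atomic formulae is defined as follows: (i) $\rho \models X \subseteq Y$ iff $\rho(X) \subseteq \rho(Y)$, (ii) $\rho \models \mathrm{Sing}(X)$ iff $\rho(X)$ is a singleton set, (iii) $\rho \models X = \{0\}$ iff $\rho(X) = \{0\}$, and (iv) $\rho \models X = Y + 1$ iff $\rho(X) = \{x\}$, $\rho(Y) = \{y\}$, and $x = y + 1$. Satisfaction for formulae obtained using Boolean connectives is defined as usual. A formula $\varphi$ is valid, written $\models \varphi$, iff all assignments of its free variables to finite subsets of $\mathbb{N}_0$ are its models, and satisfiable if it has a model. Wlog we assume that each variable in a formula is quantified at most once.

3.2 Models as Words

Let $\mathcal{X}$ be a finite set of variables. A symbol $\tau$ over $\mathcal{X}$ is a mapping of all variables in $\mathcal{X}$ to the set $\{0, 1\}$, e.g. $\tau = \{X_1 \mapsto 0, X_2 \mapsto 1\}$ for $\mathcal{X} = \{X_1, X_2\}$, which we will write as $\tau = \left[\begin{smallmatrix} X_1:0 \\ X_2:1 \end{smallmatrix}\right]$ below. The set of all symbols over $\mathcal{X}$ is denoted as $\Sigma_{\mathcal{X}}$. We use $\bar{0}$ to denote the symbol in $\Sigma_{\mathcal{X}}$ that maps all variables to 0, i.e. $\bar{0} = \{X \mapsto 0 \mid X \in \mathcal{X}\}$.

An assignment $\rho : \mathcal{X} \to 2^{\mathbb{N}_0}$ may be encoded as a word $w_\rho$ of symbols over $\mathcal{X}$ in the following way: $w_\rho$ contains 1 in the $(i+1)$-st position of the row for $X$ iff $i \in X$ in $\rho$. Notice that there exists an infinite number of encodings of $\rho$: the shortest encoding is $w_\rho^s$ of the length $n + 1$, where $n$ is the largest number appearing in any of the sets that is assigned to a variable of $\mathcal{X}$ in $\rho$, or $-1$ when all these sets are empty. The rest of the encodings are all those corresponding to $w_\rho^s$ extended with an arbitrary number of $\bar{0}$'s appended to its end. For example, $\left[\begin{smallmatrix} X_1:0 \\ X_2:1 \end{smallmatrix}\right]$, $\left[\begin{smallmatrix} X_1:00 \\ X_2:10 \end{smallmatrix}\right]$, $\left[\begin{smallmatrix} X_1:000 \\ X_2:100 \end{smallmatrix}\right]$, $\left[\begin{smallmatrix} X_1:000\ldots0 \\ X_2:100\ldots0 \end{smallmatrix}\right]$ are all encodings of the assignment $\rho = \{X_1 \mapsto \emptyset, X_2 \mapsto \{0\}\}$. We use $\mathcal{L}(\varphi) \subseteq \Sigma_{\mathcal{X}}^*$ to denote the language of all encodings of a formula $\varphi$'s models, where $\mathcal{X}$ are the free variables of $\varphi$.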
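The encoding of assignments as multi-track words can be sketched as follows. The function names `encode` and `decode` and the representation of a symbol as a tuple of bits (one bit per variable, in a fixed variable order) are assumptions made for this illustration only.

```python
def encode(assignment, variables, padding=0):
    """Encode an assignment (variable -> finite set of naturals) as a word.

    Each symbol is a tuple with one bit per variable, in the order given by
    `variables`. With padding=0 the result is the shortest encoding; a positive
    `padding` appends that many all-zero symbols.
    """
    # n is the largest number in any assigned set, or -1 if all sets are empty
    largest = max((max(s) for s in assignment.values() if s), default=-1)
    length = largest + 1 + padding
    return [tuple(1 if i in assignment[v] else 0 for v in variables)
            for i in range(length)]


def decode(word, variables):
    """Recover the assignment from any of its encodings."""
    return {v: {i for i, symbol in enumerate(word) if symbol[k] == 1}
            for k, v in enumerate(variables)}


variables = ["X1", "X2"]
rho = {"X1": set(), "X2": {0}}
shortest = encode(rho, variables)            # one symbol: X1 track 0, X2 track 1
padded = encode(rho, variables, padding=2)   # the same model with two trailing zeros
```

All encodings of the same assignment decode to the same sets, which is why the language of a formula has to contain every padded variant of each model.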
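For two sets $\mathcal{X}$ and $\mathcal{Y}$ of variables and any two symbols $\tau_1, \tau_2 \in \Sigma_{\mathcal{X}}$, we write $\tau_1 \sim_{\mathcal{Y}} \tau_2$ iff $\forall X \in \mathcal{X} \setminus \mathcal{Y} : \tau_1(X) = \tau_2(X)$, i.e. the two symbols differ (at most) in the values of variables in $\mathcal{Y}$. The relation $\sim_{\mathcal{Y}}$ is generalized to words such that $w_1 \sim_{\mathcal{Y}} w_2$ iff $|w_1| = |w_2|$ and $\forall 1 \leq i \leq |w_1| : w_1[i] \sim_{\mathcal{Y}} w_2[i]$. For a language $L \subseteq \Sigma_{\mathcal{X}}^*$, we define $\pi_{\mathcal{Y}}(L)$ as the language of words $w$ that are $\sim_{\mathcal{Y}}$-equivalent with some word $w' \in L$. Seen from the point of view of encodings of sets of assignments, $\pi_{\mathcal{Y}}(L)$ encodes all assignments that may differ from those encoded by $L$ (only) in the values of variables from $\mathcal{Y}$. If $\mathcal{Y}$ is disjoint with the free variables of $\varphi$, then $\pi_{\mathcal{Y}}(\mathcal{L}(\varphi))$ corresponds to the so-called cylindrification of $\mathcal{L}(\varphi)$, and if it is their subset, then $\pi_{\mathcal{Y}}(\mathcal{L}(\varphi))$ corresponds to the so-called projection [21]. We use $\pi_Y$ to denote $\pi_{\{Y\}}$ for a variable $Y$.

Consider formulae over the set $\mathbb{V}$ of variables. Let $\mathit{free}(\varphi)$ be the set of free variables of $\varphi$, and let $\mathcal{L}^{\mathbb{V}}(\varphi) = \pi_{\mathbb{V} \setminus \mathit{free}(\varphi)}(\mathcal{L}(\varphi))$ be the language $\mathcal{L}(\varphi)$ cylindrified wrt those variables of $\mathbb{V}$ that are not free in $\varphi$. Let $\varphi$ and $\psi$ be formulae and assume that $\mathcal{L}^{\mathbb{V}}(\varphi)$ and $\mathcal{L}^{\mathbb{V}}(\psi)$ are languages of encodings of their models cylindrified wrt $\mathbb{V}$. Languages of formulae obtained from $\varphi$ and $\psi$ using logical connectives are defined by equations (3) to (6).

\[ \mathcal{L}^{\mathbb{V}}(\varphi \vee \psi) = \mathcal{L}^{\mathbb{V}}(\varphi) \cup \mathcal{L}^{\mathbb{V}}(\psi) \tag{3} \]
\[ \mathcal{L}^{\mathbb{V}}(\varphi \wedge \psi) = \mathcal{L}^{\mathbb{V}}(\varphi) \cap \mathcal{L}^{\mathbb{V}}(\psi) \tag{4} \]
\[ \mathcal{L}^{\mathbb{V}}(\neg\varphi) = \Sigma_{\mathbb{V}}^* \setminus \mathcal{L}^{\mathbb{V}}(\varphi) \tag{5} \]
\[ \mathcal{L}^{\mathbb{V}}(\exists \mathcal{X} : \varphi) = \pi_{\mathcal{X}}(\mathcal{L}^{\mathbb{V}}(\varphi)) - \bar{0}^* \tag{6} \]

Equations (3)–(5) above are straightforward: Boolean connectives translate to the corresponding set operators over the universe of encodings of assignments of variables in $\mathbb{V}$. Existential quantification $\exists \mathcal{X} : \varphi$ translates into a composition of two language transformations.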
First, $\pi_{\mathcal{X}}$ makes the valuations of variables of $\mathcal{X}$ arbitrary, which intuitively corresponds to forgetting everything about values of variables in $\mathcal{X}$ (notice that this is a different use of $\pi_{\mathcal{X}}$ than the cylindrification since here variables of $\mathcal{X}$ are free variables of $\varphi$). The second step, removing suffixes of $\bar{0}$'s from the model encodings, is necessary since $\pi_{\mathcal{X}}(\mathcal{L}^{\mathbb{V}}(\varphi))$ might be missing some encodings of models of $\exists \mathcal{X} : \varphi$. For example, suppose that $\mathbb{V} = \{X, Y\}$ and the only model of $\varphi$ is $\{X \mapsto \{0\}, Y \mapsto \{1\}\}$, yielding $\mathcal{L}^{\mathbb{V}}(\varphi) = \left[\begin{smallmatrix} X:1\,0\,0^* \\ Y:0\,1\,0^* \end{smallmatrix}\right]$. Then $\pi_Y(\mathcal{L}^{\mathbb{V}}(\varphi)) = \left[\begin{smallmatrix} X:1\,0\,0^* \\ Y:?\,?\,?^* \end{smallmatrix}\right]$ does not contain the shortest encoding $\left[\begin{smallmatrix} X:1 \\ Y:? \end{smallmatrix}\right]$ (where each '?' denotes an arbitrary value) of the only model $\{X \mapsto \{0\}\}$ of $\exists Y : \varphi$. It only contains its variants with at least one $\bar{0}$ appended to it. This generally happens for models of $\varphi$ where the largest number in the value of the variable $Y$ being eliminated is larger than the maximum number found in the values of the free variables of $\exists Y : \varphi$. The role of the $-\bar{0}^*$ quotient is to include the missing encodings of models with a smaller number of trailing $\bar{0}$'s into the language.

The standard approach to decide satisfiability of a WS1S formula $\varphi$ with the set of variables $\mathbb{V}$ is to construct an automaton $\mathcal{A}_\varphi$ accepting $\mathcal{L}^{\mathbb{V}}(\varphi)$ and check emptiness of its language. The construction starts with simple pre-defined automata $\mathcal{A}_\psi$ for $\varphi$'s atomic formulae $\psi$ (see Fig. 2 for examples of automata for selected atomic formulae and e.g. [21] for more details) accepting cylindrified languages $\mathcal{L}^{\mathbb{V}}(\psi)$ of models of $\psi$. These are simple regular languages. The construction then continues by inductively constructing automata $\mathcal{A}_{\varphi'}$ accepting languages $\mathcal{L}^{\mathbb{V}}(\varphi')$ of models for all other sub-formulae $\varphi'$ of $\varphi$, using equations (3)–(6) above. The language operators used in the rules are implemented using standard automata-theoretic constructions (see [21]).
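The projection example can be replayed on an explicit automaton. The sketch below is a minimal illustration, assuming a three-state automaton for the single model $\{X \mapsto \{0\}, Y \mapsto \{1\}\}$ with symbols written as pairs (bit of X, bit of Y); the helper names `accepts`, `project`, `pre` and `zero_star_quotient` are hypothetical. The quotient is realised in the usual explicit way, by saturating the final states with their predecessors over the all-zero symbol.

```python
def accepts(delta, initial, final, word):
    """Standard acceptance: some run from an initial state ends in a final state."""
    current = set(initial)
    for a in word:
        current = {q for (p, b, q) in delta if p in current and b == a}
    return bool(current & set(final))


def project(delta, track):
    """Projection over one track: the bit at index `track` becomes arbitrary."""
    return {(p, sym[:track] + (bit,) + sym[track + 1:], q)
            for (p, sym, q) in delta for bit in (0, 1)}


def pre(delta, symbol, states):
    """Pre-image of a set of states over one symbol."""
    return {p for (p, a, q) in delta if a == symbol and q in states}


def zero_star_quotient(delta, final, zero):
    """Saturate the final states with all states that reach them over zero*."""
    result = set(final)
    worklist = list(final)
    while worklist:
        q = worklist.pop()
        for p in pre(delta, zero, {q}):
            if p not in result:
                result.add(p)
                worklist.append(p)
    return result


# Assumed automaton for the language [X:1,Y:0][X:0,Y:1][X:0,Y:0]*
delta = {(0, (1, 0), 1), (1, (0, 1), 2), (2, (0, 0), 2)}
initial, final = {0}, {2}

projected = project(delta, track=1)                  # forget the Y track
before = accepts(projected, initial, final, [(1, 0)])
saturated = zero_star_quotient(projected, final, (0, 0))
after = accepts(projected, initial, saturated, [(1, 0)])
```

Before the saturation, the one-symbol word is rejected because the projected automaton still insists on reading the second position; after it, the state reached by the first symbol is final and the shortest encoding is accepted.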
4 Satisfiability via Language Term Evaluation

This section introduces the basic version of our symbolic algorithm for deciding satisfiability of a WS1S formula $\varphi$ with a set of variables $\mathbb{V}$. Its optimized version is the subject of the next section. To simplify presentation, we consider the particular case of ground formulae (i.e. formulae without free variables), for which satisfiability corresponds to validity. Satisfiability of a formula with free variables can be reduced to this case by prefixing it with existential quantification over the free variables. If $\varphi$ is ground, the language $\mathcal{L}^{\mathbb{V}}(\varphi)$ is either $\Sigma_{\mathbb{V}}^*$ in the case $\varphi$ is valid, or empty if $\varphi$ is invalid. Then, to decide the validity of $\varphi$, it suffices to test if $\epsilon \in \mathcal{L}^{\mathbb{V}}(\varphi)$.

Our algorithm evaluates the so-called language term $t_\varphi$, a symbolic representation of the language $\mathcal{L}^{\mathbb{V}}(\varphi)$, whose structure reflects the construction of $\mathcal{A}_\varphi$. It is a (finite) term generated by the following grammar:

\[ t ::= \mathcal{A} \mid t \cup t \mid t \cap t \mid \overline{t} \mid \pi_X(t) \mid t - \alpha \mid t - \alpha^* \mid T \]

where $\mathcal{A}$ is a finite automaton over the alphabet $\Sigma_{\mathbb{V}}$, $\alpha$ is a symbol $\tau \in \Sigma_{\mathbb{V}}$ or a set $S \subseteq \Sigma_{\mathbb{V}}$ of symbols, and $T$ is a finite set of terms. We use marked variants of the operators to distinguish the syntax of language terms manipulated by our algorithm from the cases when we wish to denote the semantical meaning of the operators. A term of the form $t - \alpha^*$ is called a star quotient, or shortly a star, and a term $t - \tau$ is a symbol quotient. Both are also called quotients. The language $\mathcal{L}(t)$ of a term $t$ is obtained by taking the languages of the automata in its leaves and combining them using the term operators. Terms with the same language are language-equivalent. The special terms $T$, having the form of a set, represent intermediate states of fixpoint computations used to eliminate star quotients. The language of a set $T$ equals the union of the languages of its elements. The reason for having two ways of expressing a union of terms is a different treatment of $\cup$ and $T$, which will be discussed later. We use the standard notion of isomorphism of two terms, extended with having two set terms isomorphic iff they contain isomorphic elements.
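One possible in-memory representation of the term grammar is sketched below. It is an assumption of this illustration, not the data structure of any tool: the class names (`Aut`, `UnionT`, `InterT`, `Compl`, `Proj`, `SymQuot`, `StarQuot`, `SetTerm`) are hypothetical, an automaton leaf is identified by a name plus its current set of final states, and frozen dataclasses are used so that structural equality plays the role of term isomorphism.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Symbol = Tuple[int, ...]               # one bit per track, here ordered (X, Y)


@dataclass(frozen=True)
class Aut:                             # leaf: automaton with the given final states
    name: str
    final: FrozenSet[str]


@dataclass(frozen=True)
class UnionT:                          # t union t'
    left: object
    right: object


@dataclass(frozen=True)
class InterT:                          # t intersect t'
    left: object
    right: object


@dataclass(frozen=True)
class Compl:                           # complement of t
    term: object


@dataclass(frozen=True)
class Proj:                            # pi_X(t)
    var: str
    term: object


@dataclass(frozen=True)
class SymQuot:                         # t - alpha (a symbol or a set of symbols)
    term: object
    symbols: FrozenSet[Symbol]


@dataclass(frozen=True)
class StarQuot:                        # t - alpha*
    term: object
    symbols: FrozenSet[Symbol]


@dataclass(frozen=True)
class SetTerm:                         # a finite set T of terms
    elems: FrozenSet[object]


def quotient_free(t):
    """True iff the term contains neither symbol quotients nor star quotients."""
    if isinstance(t, (SymQuot, StarQuot)):
        return False
    if isinstance(t, Aut):
        return True
    if isinstance(t, (UnionT, InterT)):
        return quotient_free(t.left) and quotient_free(t.right)
    if isinstance(t, (Compl, Proj)):
        return quotient_free(t.term)
    if isinstance(t, SetTerm):
        return all(quotient_free(e) for e in t.elems)
    raise TypeError(f"not a term: {t!r}")


# A term for: exists X. Sing(X) and (exists Y. Y = X + 1)
pi_x_zero = frozenset({(0, 0), (1, 0)})
pi_y_zero = frozenset({(0, 0), (0, 1)})
sing = Aut("Sing(X)", frozenset({"q"}))
succ = Aut("Y=X+1", frozenset({"t"}))
inner = Proj("Y", StarQuot(SetTerm(frozenset({succ})), pi_y_zero))
t_phi = Proj("X", StarQuot(SetTerm(frozenset({InterT(sing, inner)})), pi_x_zero))
```

Because set terms hold a `frozenset`, two set terms with the same elements compare equal regardless of insertion order, which matches the extension of isomorphism to set terms described above.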
A formula $\varphi$ is initially transformed into the term $t_\varphi$ by replacing every atomic sub-formula $\psi$ in $\varphi$ by the automaton $\mathcal{A}_\psi$ accepting $\mathcal{L}^{\mathbb{V}}(\psi)$, and by replacing the logical connectives with dotted term operators according to equations (3)–(6) of Section 3.2. The core of our algorithm is evaluation of the $\epsilon$-membership query $\epsilon \in t_\varphi$, which will also trigger further rewriting of the term.

The $\epsilon$-membership query on a quotient-free term is evaluated using equivalences (7) to (12). Equivalences (7) to (11) reduce tests on terms to Boolean combinations of tests on their sub-terms and allow pushing the test towards the automata at the term's leaves. Equivalence (12) then reduces it to testing intersection of the initial states $I(\mathcal{A})$ and the final states $F(\mathcal{A})$ of an automaton.

\[ \epsilon \in T \ \text{ iff } \ \epsilon \in t \text{ for some } t \in T \tag{7} \]
\[ \epsilon \in t \cup t' \ \text{ iff } \ \epsilon \in t \text{ or } \epsilon \in t' \tag{8} \]
\[ \epsilon \in t \cap t' \ \text{ iff } \ \epsilon \in t \text{ and } \epsilon \in t' \tag{9} \]
\[ \epsilon \in \overline{t} \ \text{ iff } \ \text{not } \epsilon \in t \tag{10} \]
\[ \epsilon \in \pi_X(t) \ \text{ iff } \ \epsilon \in t \tag{11} \]
\[ \epsilon \in \mathcal{A} \ \text{ iff } \ I(\mathcal{A}) \cap F(\mathcal{A}) \neq \emptyset \tag{12} \]

Equivalences (7) to (11) do not apply to quotients, which arise from quantified sub-formulae (cf. equation (6) in Section 3.2). A quotient is therefore (in the basic version) first rewritten into a language-equivalent quotient-free form. This rewriting corresponds to saturating the set of final states of an automaton in the explicit decision procedure with all states in their $\mathit{pre}^*$-image over $\bar{0}$. In our procedure, we use rules (13) and (14).

\[ \pi_X(T) - \bar{0}^* \ \to \ \pi_X(T - \pi_X(\bar{0})^*) \tag{13} \]

Rule (13) transforms the term into a form in which a star quotient is applied on a plain set of terms rather than on a projection. A star quotient of a set is then eliminated using a fixpoint computation that saturates the set with all quotients of its elements wrt the set of symbols $S = \pi_X(\bar{0})$. A single iteration is implemented using rule (14).
\[ T - S^* \ \to \ \begin{cases} T & \text{if } T \ominus S \sqsubseteq T \\ (T \cup (T \ominus S)) - S^* & \text{otherwise} \end{cases} \tag{14} \]

There, $T \ominus S$ is the set $\{t - \tau \mid t \in T \wedge \tau \in S\}$ of quotients of terms in $T$ wrt symbols of $S$. (Note that (14) uses the identity $S^* = \{\epsilon\} \cup S^*S$.) Termination of the fixpoint computation is decided based on the subsumption relation $\sqsubseteq$, which is some relation that under-approximates language inclusion of terms. When the condition holds, then the language of $T$ is stable wrt quotienting by $S$, i.e. $\mathcal{L}(T) = \mathcal{L}(T - S^*)$. In the basic algorithm, we use term isomorphism for $\sqsubseteq$; later, we provide a more precise subsumption relation with a good trade-off between precision and cost. Note that an iteration of rule (14) can be implemented efficiently by the standard worklist algorithm, which extends $T$ only with quotients $T' \ominus S$ of terms $T'$ that were added to $T$ in the previous iteration.

The set $T \ominus S$ introduces quotient terms of the form $t - \tau$, for $\tau \in \Sigma_{\mathbb{V}}$, which also need to be eliminated to facilitate the $\epsilon$-membership test. This is done using rewriting rules (15) to (19), where $\mathit{pre}[\tau](\mathcal{A})$ is $\mathcal{A}$ with its set of final states $F$ replaced by $\mathit{pre}[\tau](F)$.

\[ (t \cup t') - \tau \ \to \ (t - \tau) \cup (t' - \tau) \tag{15} \]
\[ (t \cap t') - \tau \ \to \ (t - \tau) \cap (t' - \tau) \tag{16} \]
\[ \overline{t} - \tau \ \to \ \overline{t - \tau} \tag{17} \]
\[ \pi_X(t) - \tau \ \to \ \pi_X(t - \pi_X(\tau)) \tag{18} \]
\[ \mathcal{A} - \tau \ \to \ \mathit{pre}[\tau](\mathcal{A}) \tag{19} \]

If $t$ is quotient-free, then rules (15)–(18) applied to $t - \tau$ push the symbol quotient down the structure of $t$ towards the automata in the leaves, where it is eliminated by rule (19). Otherwise, if $t$ is not quotient-free, it can be re-written using rules (13)–(19). In particular, if $t$ is a star quotient of a quotient-free term, then the quotient-free form of $t$ can be obtained by iterating rule (14), combined with rules (15)–(19) to transform the new terms in $T$ into a quotient-free form.
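The basic procedure (without lazy evaluation or the refined subsumption) can be sketched in a few dozen lines. Everything concrete below is an assumption of the sketch: terms are nested tuples, the two automata are the usual automata for $\mathrm{Sing}(X)$ and $Y = X + 1$ over tracks $(X, Y)$ with assumed initial states `p` and `r`, structural equality of tuples stands in for term isomorphism, and the quotient by the set of symbols in rule (18) is represented as a set term holding the quotients by each symbol of that set.

```python
BITS = (0, 1)
# Assumed automata over tracks (X, Y): name -> (transitions, initial states).
AUTOMATA = {
    "sing": ({("p", (0, y), "p") for y in BITS}
             | {("p", (1, y), "q") for y in BITS}
             | {("q", (0, y), "q") for y in BITS}, {"p"}),
    "succ": ({("r", (0, 0), "r"), ("r", (1, 0), "s"),
              ("s", (0, 1), "t"), ("t", (0, 0), "t")}, {"r"}),
}


def aut(name, final):
    return ("aut", name, frozenset(final))


def proj_symbol(tau, k):
    """pi_k(tau): all symbols that agree with tau outside track k."""
    return frozenset(tau[:k] + (b,) + tau[k + 1:] for b in BITS)


def quot(t, tau):
    """Symbol quotient of a quotient-free term, rules (15) to (19)."""
    kind = t[0]
    if kind == "aut":                                    # rule (19)
        delta = AUTOMATA[t[1]][0]
        return aut(t[1], {p for (p, a, q) in delta if a == tau and q in t[2]})
    if kind in ("or", "and"):                            # rules (15), (16)
        return (kind, quot(t[1], tau), quot(t[2], tau))
    if kind == "not":                                    # rule (17)
        return ("not", quot(t[1], tau))
    if kind == "set":                                    # element-wise quotient
        return ("set", frozenset(quot(e, tau) for e in t[1]))
    if kind == "proj":                                   # rule (18)
        k, body = t[1], t[2]
        members = body[1] if body[0] == "set" else {body}
        return ("proj", k, ("set", frozenset(
            quot(e, s) for e in members for s in proj_symbol(tau, k))))
    raise ValueError(kind)


def eliminate(t):
    """Rewrite every ("star", T, S) into a saturated set term, rule (14)."""
    kind = t[0]
    if kind == "aut":
        return t
    if kind in ("or", "and"):
        return (kind, eliminate(t[1]), eliminate(t[2]))
    if kind == "not":
        return ("not", eliminate(t[1]))
    if kind == "proj":
        return ("proj", t[1], eliminate(t[2]))
    if kind == "set":
        return ("set", frozenset(eliminate(e) for e in t[1]))
    terms = {eliminate(e) for e in t[1]}                 # kind == "star"
    worklist = list(terms)
    while worklist:                                      # worklist fixpoint
        e = worklist.pop()
        for tau in t[2]:
            new = quot(e, tau)
            if new not in terms:                         # isomorphism check
                terms.add(new)
                worklist.append(new)
    return ("set", frozenset(terms))


def eps_in(t):
    """Epsilon-membership on a quotient-free term, equivalences (7) to (12)."""
    kind = t[0]
    if kind == "aut":
        return bool(AUTOMATA[t[1]][1] & t[2])
    if kind == "or":
        return eps_in(t[1]) or eps_in(t[2])
    if kind == "and":
        return eps_in(t[1]) and eps_in(t[2])
    if kind == "not":
        return not eps_in(t[1])
    if kind == "proj":
        return eps_in(t[2])
    if kind == "set":
        return any(eps_in(e) for e in t[1])
    raise ValueError(kind)


X, Y, ZERO = 0, 1, (0, 0)
inner = ("proj", Y, ("star", frozenset({aut("succ", {"t"})}), proj_symbol(ZERO, Y)))
phi = ("proj", X, ("star", frozenset({("and", aut("sing", {"q"}), inner)}),
                   proj_symbol(ZERO, X)))
bad = ("proj", X, ("star", frozenset({("and", aut("sing", {"q"}),
                                       ("not", aut("sing", {"q"})))}),
                   proj_symbol(ZERO, X)))
phi_valid = eps_in(eliminate(phi))
bad_valid = eps_in(eliminate(bad))
```

The sketch eliminates every star eagerly and only then runs the membership test, so it explores more terms than necessary; it is meant only to make the interplay of rules (14) to (19) with equivalences (7) to (12) concrete. The saturation terminates because each generated term differs from a previously seen one only in the final-state sets of its leaves.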
Finally, terms with multiple quotients can be rewritten to the quotient-free form inductively to their structure. Every inductive step rewrites some star quotient of a quotient-free sub-term into the quotient-free form. Note that this procedure is bound to terminate since the terms generated by quotienting a star have the same structure as the original term, differing only in the states in their leaves. As the number of the states is finite, so is the number of the terms.

[Figure: computation tree of $\epsilon$-membership tests, with edges labelled 1 to 19]
Fig. 1. Example of deciding validity of the formula $\varphi \equiv \exists X : \mathrm{Sing}(X) \wedge (\exists Y : Y = X + 1)$

Example 1.
We will show the workings of our procedure using an example of testing satisfiability of the formula $\varphi \equiv \exists X.\ \mathrm{Sing}(X) \wedge (\exists Y.\ Y = X + 1)$. We start by rewriting $\varphi$ into a term $t_\varphi$ representing its language $\mathcal{L}^{\mathbb{V}}(\varphi)$:

\[ t_\varphi \equiv \pi_X\bigl(\bigl\{ \{q\} \cap \pi_Y(\{t\} - \pi_Y(\bar{0})^*) \bigr\} - \pi_X(\bar{0})^*\bigr) \]

(we have already used rule (13) twice). In the example, a set $R$ of states will denote an automaton obtained from $\mathcal{A}_{\mathrm{Sing}(X)}$ or $\mathcal{A}_{Y=X+1}$ (cf. Fig. 2) by setting the final states to $R$. Red nodes in the computation tree denote $\epsilon$-membership tests that failed and green nodes those that succeeded. Grey nodes denote tests that were not evaluated. As noted previously, it holds that $\models \varphi$ iff $\epsilon \in t_\varphi$. The sequence of computation steps for determining the $\epsilon$-membership test is shown using the computation tree in Fig. 1. The nodes contain $\epsilon$-membership tests on terms and the test of each node is equivalent to a conjunction or disjunction of tests of its children. Leaves of the form $\epsilon \in R$ are evaluated by testing intersection of $R$ with the initial states of the corresponding automaton. In the example, we also use the lazy evaluation technique (described in Section 5.2), which allows us to evaluate $\epsilon$-membership tests on partially computed fixpoints.

[Figure: two automata, (a) with states p, q and (b) with states r, s, t]
Fig. 2. Example automata: (a) $\mathcal{A}_{\mathrm{Sing}(X)}$, (b) $\mathcal{A}_{Y=X+1}$

The computation starts at the root of the tree and proceeds along the edges in the order given by their circled labels. Edges 2 and 4 were obtained by a partial unfolding of a fixpoint computation by rule (14) and immediately applying the $\epsilon$-membership test on the obtained terms. After step 3, we conclude that $\epsilon \notin \{q\}$ since $\{p\} \cap \{q\} = \emptyset$, which further refutes the whole conjunction below 2, so the overall result depends on the sub-tree starting by 4. The steps 5 and 9 are another application of rule (14), which transforms $\pi_X(\bar{0})$ to the symbols $\left[\begin{smallmatrix} X:0 \\ Y:0 \end{smallmatrix}\right]$ and $\left[\begin{smallmatrix} X:1 \\ Y:0 \end{smallmatrix}\right]$ respectively. The branch 5 pushes the $-\left[\begin{smallmatrix} X:0 \\ Y:0 \end{smallmatrix}\right]$ quotient to the leaf term using rules (16) and (9) and eventually fails because the set of predecessors of $\{q\}$ over the symbol $\left[\begin{smallmatrix} X:0 \\ Y:0 \end{smallmatrix}\right]$ in $\mathcal{A}_{\mathrm{Sing}(X)}$ is the empty set. On the other hand, the evaluation of the branch 9 continues using rule (16), succeeding in the branch 10. The branch 12 is further evaluated by projecting the quotient $-\left[\begin{smallmatrix} X:1 \\ Y:0 \end{smallmatrix}\right]$ wrt $Y$ (rule (18)) and unfolding the inner star quotient zero times (step 14, failed) and once (step 16). The unfolding of one symbol eventually succeeds in step 19, which leads to concluding validity of $\varphi$. Note that thanks to the lazy evaluation, none of the fixpoint computations had to be fully unfolded. □

5 An Efficient Algorithm

In this section, we show how to build an efficient algorithm based on the symbolic term rewriting approach from Section 4. The optimization opportunities offered by the symbolic approach are to a large degree orthogonal to those of the explicit approach. The main difference is in the available techniques for reducing the explored automata state space. While the explicit construction in MONA profits mainly from calling automata minimization after every step of the inductive construction, the symbolic algorithm can use generalized subsumption and lazy evaluation. None of the two approaches seems to be compatible with both these techniques (at least in their pure variant, disregarding the possibility of a combination of the two approaches discussed below).
Efficient data structures have a major impact on performance of the decision procedure. The efficiency of the explicit procedure implemented in MONA is to a large degree due to the BDD-based representation of automata transition relations. BDDs compactly represent transition functions over large alphabets and provide efficient implementation of operations needed in the explicit algorithm. Our symbolic algorithm can, on the other hand, benefit from a representation of terms as DAGs where all occurrences of the same sub-term are represented by a unique DAG node. Moreover, we assume the nodes to be associated with languages rather than with concrete terms (allowing the term associated with a node to change during its further processing, without a need to transform the DAG structure as long as the language of the term does not change).

We also show that although our algorithm uses a completely different data structure than the explicit one, it can still exploit a BDD-based representation of transitions of the automata in the leaves of terms. Moreover, our symbolic algorithm can also be combined with the explicit algorithm. Particularly, it turns out that, sometimes, it pays off to translate to automata sub-formulae larger than the atomic ones. Our procedure can then be viewed as an extension of MONA that takes over once MONA stops managing. Lastly, optimizations on the level of formulae often have a huge impact on the performance of our algorithm. The technique that we found most helpful is the so-called anti-prenexing. We elaborate on all these optimizations in the rest of this section.

5.1 Subsumption

Our first technique for reducing the explored state space is based on the notion of subsumption between terms, which is similar to the subsumption used in antichain-based universality and inclusion checking over finite automata [10]. We define subsumption as the relation $\sqsubseteq_s$ on terms that is given by equivalences (20)–(25). Notice that, in rule (20), all terms of $T$ are tested against all terms of $T'$, while in rule (21), the left-hand side term $t_1$ is not tested against the right-hand side term $t_2'$ (and similarly for $t_2$ and $t_1'$).
\[ T \sqsubseteq_s T' \ \text{ iff } \ \forall t \in T\ \exists t' \in T' : t \sqsubseteq_s t' \tag{20} \]
\[ t_1 \cup t_2 \sqsubseteq_s t_1' \cup t_2' \ \text{ iff } \ t_1 \sqsubseteq_s t_1' \text{ and } t_2 \sqsubseteq_s t_2' \tag{21} \]
\[ t_1 \cap t_2 \sqsubseteq_s t_1' \cap t_2' \ \text{ iff } \ t_1 \sqsubseteq_s t_1' \text{ and } t_2 \sqsubseteq_s t_2' \tag{22} \]
\[ \overline{t} \sqsubseteq_s \overline{t'} \ \text{ iff } \ t \sqsupseteq_s t' \tag{23} \]
\[ \pi_X(t) \sqsubseteq_s \pi_X(t') \ \text{ iff } \ t \sqsubseteq_s t' \tag{24} \]
\[ \mathcal{A} \sqsubseteq_s \mathcal{A}' \ \text{ iff } \ F(\mathcal{A}) \subseteq F(\mathcal{A}') \tag{25} \]

The reason why $\cup$ is order-sensitive is that the terms on different sides of the $\cup$ are assumed to be built from automata with disjoint sets of states (originating from different sub-formulae of the original formula), and hence the subsumption test on them can never conclude positively. The subsumption under-approximates language inclusion and can therefore be used for $\sqsubseteq$ in rule (14). It is far more precise than isomorphism and its use leads to an earlier termination of fixpoint computations.

Moreover, $\sqsubseteq_s$ can be used to prune star quotient terms $T - S^*$ while preserving their language. Since the semantics of the set $T$ is the union of the languages of its elements, elements subsumed by others can be removed while preserving the language. $T$ can thus be kept in the form of an antichain of $\sqsubseteq_s$-incomparable terms. The pruning corresponds to using the rewriting rule (26).

\[ T \ \to \ T \setminus \{t\} \quad \text{if there is } t' \in T \setminus \{t\} \text{ with } t \sqsubseteq_s t' \tag{26} \]

5.2 Lazy Evaluation

The top-down nature of our technique allows us to postpone evaluation of some of the computation branches in case the so-far evaluated part is sufficient for determining the result of the evaluated $\epsilon$-membership or subsumption test. We call this optimization lazy evaluation. A basic variant of lazy evaluation short-circuits elimination of quotients from branches of $\cup$ and $\cap$.
When testing whether $\epsilon \in t \cup t'$ (rule (8)), we first evaluate, e.g., the test $\epsilon \in t$, and when it holds, we can completely avoid exploring $t'$ and evaluating quotients there. When testing $\epsilon \in t \cap t'$, we can proceed analogously if one of the two terms is shown not to contain $\epsilon$. Rules (21) and (22) offer similar opportunities for short-circuiting evaluation of subsumption of $\cup$ and $\cap$.

Let us note that subsumption is in a different position than $\epsilon$-membership since correctness of our algorithm depends on the precision of the $\epsilon$-membership test, but subsumption may be evaluated in any way that under-approximates inclusion of languages of terms (and over-approximates isomorphism in order to guarantee termination). Hence, the $\epsilon$-membership test must enforce eliminating quotients until it can conclude the result, while there is a choice in the case of the subsumption. If subsumption is tested on quotients, it can either eliminate them, or it can return the (safe) negative answer. However, this choice comes with a trade-off. Subsumption eliminating quotients is more expensive but also more precise. The higher precision allows better pruning of the state space and earlier termination of fixpoint computation, which, according to our empirical experience, pays off.

Lazy evaluation can also reduce the number of iterations of a star. The iterations can be computed on demand, only when required by the tests. The idea is to try to conclude a test $\epsilon \in T - S^*$ based on the intermediate state $T$ of the fixpoint computation. This can be done since $\mathcal{L}(T)$ always under-approximates $\mathcal{L}(T - S^*)$, hence if $\epsilon \in \mathcal{L}(T)$, then $\epsilon \in \mathcal{L}(T - S^*)$. Continuing the fixpoint computation is then unnecessary.

The above mechanism alone is, however, rather insufficient in the case of nested stars. Assume that an inner star fixpoint computation was terminated in a state $T - S^*$

