ebook img

Decidable Logics for Transductions and Data Words PDF

0.4 MB·
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Decidable Logics for Transductions and Data Words

Decidable Logics for Transductions and Data ∗ Words Luc Dartois Emmanuel Filiot Nathan Lhote Universite´ libre de Bruxelles Universite´ libre de Bruxelles Universite´ libre de Bruxelles F.R.S.-FNRS LaBRI, Universite´ de Bordeaux 7 Abstract—We introduce a logic, called LT, to express proper- results exist. At the computational level, (1-way) finite au- 1 ties of transductions, i.e. binary relations from input to output tomata with outputs,called transducers,have been studied for 0 (finite)words.InLT,theinput/outputdependenciesaremodeled long [6], [7]. They have been extended to 2-way transducers 2 via an origin function which associates with any position of the byallowingtheinputheadtomoveinbothdirections.Among outputword,theinputpositionfromwhichitoriginates.Thelogic eb LT canexpressallMSO-definablefunctions,andisincomparable the most important results are the decidability of equivalence with MSO-transducers for relations. Despite its high expressive [8] and the closure under composition [9] of deterministic 2- F power,weshow,amongotherinterestingproperties,thatLT has way transducers. More recently, a 1-way deterministic model 5 decidable satisfiability and equivalence problems. of transducers with word registers, called streaming string 1 The transduction logic LT is shown to be expressively equiv- transducers, has been introduced and shown to be equivalent ] tarlaenntsdtuocatiolnosgiwciftohrodraigtianwtoorddast,aLwDo,rdusp(ttohesoomrieginbiojefcatinonouftrpoumt todeterministic2-waytransducers[10].Atthealgebraiclevel, L positionbecomes thedataof that position).ThelogicLD,which a canonical object, called bimachines, has been introduced F is interesting in itself and extends in expressive power known for functions definable by 1-way transducers [11], which has s. logics for data words, is shown to have decidable satisfiability. beenrecentlythe basisof a moregeneralstudyof definability c I. INTRODUCTION problemsfor such functions[12]. Finally, at the logical level, [ onlyonelogicalformalismisknown,calledMSO-transducers Language theory and applications. The theory of regular 2 (MSOT), which was interestingly shown to be equivalent, for v languages of finite words is rich and robust, founded on theclassoffunctions,todeterministic2-waytransducers[13]. 0 threeimportantpillars:computation(automatatheory),algebra MSOT have been introduced by Courcelle [14] in a more 7 (theory of monoids) and logic (monadic second-order logic). 6 generalcontext,asaformalismtodefinefunctionsofarbitrary Ithasbeensuccessfullyextendedtootherclassesoflanguages 3 relationalstructures.Themainideaistodefinethen-arypred- [1],[2]andstructures,suchastrees[3]andinfinitewords[4]. 0 icatesoftheoutputstructure,byformulasofmonadicsecond- . A well-knownapplicationofthe logic-automataconnection 1 orderlogic(MSO)withnfreefirst-ordervariables,interpreted for(finiteandinfinite)wordlanguagesisthetheoryofmodel- 0 over the input structure. Moreover, the input structure can be checking, in the domain of computer-aided verification [5], 7 copied a fixed number of times and the MSO formulas are 1 where system computations are modeled by automata, and parameterisedby the copies their free variables belong to. By : system specifications are written in high-level formalism, a v taking a linear-order ≤ and unary predicates for labels, one i logic.Specificationsarethenturnedintoautomataandmodel- X obtains the signature of words and hence MSOT can define checkingthenreducestolanguageinclusion.Manylogicsand functional word transductions. MSOT have been extended r automata models have been proposed in this domain, with a with non-determinism (called NMSOT) to define relations in different modeling features, expressiveness, and complexities general. Non-determinism is obtained by having a fixed tuple with respect to the model-checking problem, contributing to offreemonadicsecond-ordervariablesthatcanbeusedinany the success of automata and logic techniques in verification. of the formulas of the NMSOT. NMSOT are known to corre- Theory of transductions. In this paper, we go beyond spond to non-deterministic streaming string transducers [15] languages and consider binary relations of finite words, aka but not to non-deterministic 2-way transducers [13]. transductions. They relate input words over some finite al- phabet Σ to output words over some alphabet Γ. E.g., the Need for a new logic. Despite the appealing connection transductionτdouble duplicateseachinputsymbol((ab,aabb)∈ between MSOT, deterministic 2-way transducers and deter- τdouble), and τshuffle associates a word with all its permutations ministic streaming string transducers (which all define the ((ab,ab),(ab,ba)∈τshuffle). A transductionis said to be func- class of so-called regular functions), it turns out that MSOT tional if it is a (partial) function. The theory of transductions and NMSOT are not satisfactory as a specification language is not as advanced as the theory of languages, but important fortransductions,inthesensethattheyarestilltoooperational, thus precluding novel applications such as model-checking ∗ThisworkissupportedbytheARCprojectTransform(Wallonia-Brussels data-processing systems. For instance, specifying the order Federation), theFNRS PDRproject Flare (J013116F), and the French ANR project ExtStream(ANR-13-JS02-0010). of the output word as MSO-formulas over the input word somehow forces the designer to describe the left-right moves a client identifier, and m ∈ {0,1}∗ is the message to be of a 2-way machine running on the input word. Even in the processed by task t. The server gets as input a list of task non-deterministicsetting,oncesomeinterpretationoftheextra requests# m ...# m , andtreatstheminanyorder,thus i1 1 in n parameters has been fixed, then the resulting transduction is outputtingaword# t(m )...# t(m )forsome iσ(1) σ(1) iσ(n) σ(n) functional, and one is forced to fully specify it. In a model- permutationσof{1,...,n}.Now,onewouldliketoverifythe checking scenario, if one focuses on some critical property following property: every request is processed exactly once. rather than on the full system specification, it is desirable to Therefore, we are not interested here on whether the task t have a formalism that leaves parts of the system unspecified. is realised correctly, but rather on whether every request is Thus,havingahighlevelofnon-determinismiscrucialinthis treated. This could be expressed by requiring the system to context.ItishoweverimpossibleinNMSOT,atrivialexample output # whenever it processes a request # m of Client i, i i being that nothing is specified: the universal transduction of and then requiring that the origin mapping defined by this non empty words Σ+×Γ+ is not definable in NMSOT. input/output behaviour is bijective and label-preserving. A In this paper, our objective is to propose a logic tailored to possible example of such a correct behaviour is: transductions, with the following requirements: it should (i) be expressive, by that we mean it should at least capture the #2 0 1 0 #1 1 1 0 #3 0 0 robustclassofregularfunctions,(ii)offerahighlevelofnon- #1 ............ #3 ............ #2 ............... determinism (to leave parts of the system unspecified), (iii) where the dots represent any word in {0,1}∗, whose origins be decidable with respect to satisfiability. are not specified (and so not depicted). The property can be The proposed logic L for transductions. The logic we T expressedinL bysayingthatoisalabel-preservingbijection define in this paper is based on a simple idea: a pair of T between hash positions (using special quantifiers over input words of a transduction can be seen as a single structure, and outputs positions): with two orders ≤in and ≤out on respectively the input and outputpositions,andunarypredicatesforthelabels.However, ∀inx k # (x)→(∃outy o(y)=x∧# (y)) ∧ i=1 i i such a structure is not rich enough to express interesting de- ∀outx∀outyV k (# (x)∧# (y)∧o(x)=o(y))→x=y pendencies between input and output positions, and therefore i,j=1 i j V we also assume the existence of an origin mapping, which If one wants to verify that the system processes the requests maps any output position to some input position, intuitively inthesameorderastheoneitgetsthem,thenweexpressthat the position from which it originates. As noticed in [16], the input/output orders are preserved between hash positions: any known transducer model, including NMSOT, implicitly k produce some origin information when translating an input ∀outx∀outy (#i(x)∧#j(y)∧x≤out y)→o(x)≤in o(y) word. For instance, a transition (p,a,q) with output bc of a i,^j=1 finite-state transducer not only translate a into bc, but also None of the properties described before are definable in produces the origin of b and c as the current position. In this NMSOT, because NMSOT forces to fully specify the result paper,a transductionis notonlya setof pairsof words,butit of a transduction (hence here we must also specify what the isa set ofproductions(u,(v,o)) whereu isthe inputword,v resultst(m )shouldbe).Also,τ isnotNMSOT-definable. i shuffle is the outputword,ando mapsanypositionof v to a position Contributions on L . We show, and it is our main result, of u. E.g. τ can be naturally extended with origin: T shuffle that L has decidable satisfiability problem. As a result, T a b c a a b c a the equivalence problem, is decidable for transductions (with origininformation)definedinL .Ontheexpressivenessside, T a c a b a c b a we prove that any L -transduction has regular domain, that T Fig.1. Twoproductions ofτshuffle L can express any function definable in MSOT, and is in- T A production (u,(v,o)) can be seen as a structure with comparablewithNMSOT.Wealsoshowthatthefunctionality theorders≤in,≤out,unarypredicatesforthelabels,andsome problem (whether any input word is mapped to at most one originfunctionsymbolo.Itturnsoutthatevenfirst-orderlogic pair(v,o))andthebounded-originproblem(whetheranyinput is undecidable over such structures. We propose a restriction, position is the origin of at most k output positions, for some the logic LT, defined intuitively as follows. We take the two- constant k) are decidable. The decidability of LT is obtained variable fragment of first-order logic over this signature (i.e. via a reduction to a new logic for data words. only two variable names can be used in the formulas), but Connection with data words and the logic L . A trans- D extend it with all MSO-definable binary predicates over ≤in duction is non-erasing if for any of its production (u,(v,o)), and Σ, hence being able only to talk about the input word. every u position is the origin of at least some v position. Example. Suppose some server receives task requests from We show that every L -transduction can be turned into a T a fixed number k of clients. For simplicity we assume that non-erasing one while preserving some interesting properties, there is only one task, which consists in realizing some including satisfiability. There is a simple bijection between functiont:{0,1}∗ →{0,1}∗. To requestfortask processing, non-erasing transductions and so called typed data words. Client i ∈ {1,...,k} sends the word # m, where # is Let u = a ...a ∈ Σ∗ and v = b ...b ∈ Γ∗. Any i i 1 n 1 m production (u,(v,o)) can be encoded as the sequence w = for transductions are consequences of known results for data (b ,a ,o(1))...(b ,a ,o(m)).E.g.,theleftproduction words. It is the case for instance of the undecidability of the 1 o(1) m o(m) of Fig. 1 is encoded as (a,a,1)(c,c,3)(a,a,4)(b,b,2). The logic FO2[Σ,Γ,≤in,≤out,Sout] where Sout is the successor sequence w is called a typed data word, the integers being over output positions by using [20], and the decidability of the data, and the symbolao(i) being the type of the data o(i). FO2[Σ,Γ,≤in,≤out] by using [17]. In a typed data word w = (γ ,σ ,d )...(γ ,σ ,d ), if two 1 1 1 k k k Organisation of the paper. In Section II, we introduce the tripleshavethesamedatumd =d , thentheyarerequiredto i j logic L and give the main results. In Section III, we give T have the same type σ =σ . The data define a total preorder i j the connection between transductions and data words, and (cid:22) on the positionsof w byi(cid:22)j if d ≤d . Then,typeddata i j introduce the logic L . Section IV is devoted to proving D words can be seen as structures with the linear-order ≤ on the decidability of L . Finally, in Section V, we prove the D positions, the preorder (cid:22), and unary predicates for labels and decidabilityof L and givesa few consequencesof the proof T types.ThelogicL translatesintoalogicwecallL ontyped T D techniques adopted for the decidability of L . Most proofs D data word structures, defined as follows. Itis the two-variable are only sketched but can be found in the appendix. first-order logic over Γ and ≤, extended with all binary MSO-predicates over (cid:22) and Σ with the following restriction: II. LOGICWITH ORIGIN FORTRANSDUCTIONS in MSO-predicates, monadic second-order variables X range A. Words and Transductions oversetsofpositionsclosedfordataequality(ifx∈X andy has the same data as x, then y ∈X). Hence, MSO predicates Given a (finite) alphabet Σ, a word u of length n ≥ 0 can really be seen as formulaswhich quantifies over data and (denoted by |u|) is a function from {1,...,n} into Σ. When sets of data, and can express properties such as “there is an n=0,thedomainofthisfunctionisemptyandwedenoteby even number of data of some given type t”. ǫ this function, called the empty word. We define dom(u)= The logic L strictly extends the logic FO2[Γ,(cid:22),S ,≤] {1,...,n} and for all i∈dom(u), i is called the ith position D d ofu, andu(i)the ithsymbolofu (orletter).Theset of(non- over (untyped) data words (where S is the successor over d empty) words over Σ is denoted by Σ∗ (Σ+). data), for which the satisfiability problem is known to be Let Σ and Γ be two disjoint alphabets. An origin-free EXPSPACE-c [17]. Still, we show that LD has decidable transductionis a subset of Σ∗×Γ∗. A productionwith origin satisfiabilityproblem.Theproofextendsthetechniquesof[17] (orproductionforshort)fromΣ to Γ isa pair(u,(v,o)) such which further reduce the problem to a constraint satisfaction thatu∈Σ+,v ∈Γ∗andoisa(total)mappingfromdom(v)to problemonthe two-dimensionalplane.Thisextensionisnon- dom(u), called the origin mapping. We denote by PR(Σ,Γ) trivial as the MSO predicates induce new constraints that are thesetofproductionsfromΣtoΓ.Atransductionwithorigin handledusingqueryautomatarunningverticallyonthe plane. (or just transduction) τ from Σ to Γ is a set of productions Unlikein[17],wealsouseanautomataapproachtorecognise (u,(v,o)) ∈ PR(Σ,Γ). Intuitively, a pair (u,(v,o)) means sequences of types σ ...σ which can be extended to typed 1 n that the word u is translated into v, and that any position data words (γ ,σ ,d )...(γ ,σ ,d ) that are models of 1 1 1 n n n i ∈ dom(v) has been “produced” while processing position the formula, thus proving regularity of such sequences. The o(i) of u. Somehow, a transduction with origin not only says automataapproachallowsustogetmoreresults,amongwhich whatis translated into what, butalso how. As noticedin [16], the decidability of the data boundedness problem (is there most transducer models in the literature can be interpreted some bound k such that any data occurs at most k times in as machines for transductions with origin. We say that τ is any modelof the formula). We believe that the results on L D functional (or is a function) if for all u, there is at most one are of independent interest for the data word community. pair (v,o) such that (u,(v,o))∈τ, and rather denote it by f Other related works. The study of transductions with origin insteadofτ.Thedomainofatransductionτ isthesetofwords was first initiated in [16] where the goal was to obtain an usuchthatthereexists(v,o)suchthat(u,(v,o))∈τ.Finally, algebraic characterisation of (functional) transductions with the origin-free projection of τ is the origin-free transduction origindefinableinMSOT.Transductionswithoriginhavebeen {(u,v)|∃(u,(v,o))∈τ}. extendedto finite trees in [18] in which the complexityof the Example 1. Let Σ be a finite alphabet, and let Γ={σ |σ ∈ equivalenceproblemisanalysedforvarioustransducerclasses. Σ} (remind that we require the alphabets to be disjoint). By Logics for data words have been studied extensively over extension, we also write u ∈ Γ∗ for the annotated version of the past few years [17], [19], [20], but have been focused any word u∈Σ∗. We give several examples of transductions so far on establishing a decidability frontier for the two- (with origin) from Σ to Γ. First, consider the identity f id variable fragment of first-order logic with combinations of defined for all u ∈ Σ∗ by f (u) = (u,id) where id is the id various predicates (on data: equality, linear-order, successor, identity function over dom(u). As an example, consider the and on positions: linear-order, successor). See [17] for a following production (u,(v,o)) in f : id summary. To the best of our knowledge, it is the first time full MSO predicates have been considered to speak about inputu a b c a a b the data. Through the encoding of transductions with origin origino as data words, some undecidability and decidability results outputv a b c a a b The next transduction τshuffle associates an input word to δ(x), forallδ ∈Σ∪Γ,thetwo orderpredicates≤in and≤out, all its permutations, i.e. (u,(v,o)) ∈ τ iff o is a label- and the function symbol o, interpreted as follows: shuffle preserving bijection from dom(v) to dom(u). For example, • the domain of M is dom(u)⊎dom(v) consider the following two productions (with same input • σM ={i∈dom(u)|u(i)=σ} for any σ ∈Σ, word): • γM ={i∈dom(v)|v(i)=γ} for any γ ∈Γ, a b c a a b c a • ≤Min is the natural linear order on dom(u), • ≤Mout is the natural linear order on dom(v), a c a b a c b a • oM is the function3 i 7→ o(i) if i ∈ dom(v), and i 7→ i if i∈dom(u). B. FO and MSO Logics We also use the predicates =, <in and <out, which are all A signature S is a (possibly infinite) set of predicate sym- definable in the logics that we consider. For an MSO[TΣ,Γ]- bols P with an arity ar(P) ∈ N and a (possibly infinite) set formula φ, we write (u,(v,o)) |= φ instead of M |= φ, and of functional symbols f with an arity ar(f)∈N+ (functions implicitly assume that (u,(v,o)) is given as a structure over of arity 0 are called constants). A (finite) structure M over TΣ,Γ. We may write MSO[Σ,Γ,≤in,≤out,o] for MSO[TΣ,Γ] a signature S consists of a finite domain D together with an to underline which symbols are used in the signature. interpretation.M ofeachpredicatesymbolP byanar(P)-ary Any MSO[TΣ,Γ] sentence φ defines a transduction JφK, relation1 PM ⊆Dar(P), and of each functional symbol f by which is the set of productions (u,(v,o)) such that an ar(f)-ary (total) function2 from Dar(f) into D. (u,(v,o)) |= φ. We say that a transduction τ is definable Let X1,X2 be two disjoint countable sets of first- and in MSO[TΣ,Γ] if τ =JφK for some sentence φ∈MSO[TΣ,Γ]. (monadic) second-order variables respectively. The monadic- We also say that an origin-free transduction τ′ is definable in second order logic over S (denoted by MSO[S]) is the set of MSO[TΣ,Γ] if there exists an (origin)transductionτ definable formulas generated by the following grammar: in MSO[TΣ,Γ] and whose origin-free projection is τ′. Example 2. We first defineseveralmacrosthatwillbe useful φ ::= ∃x φ|∃X φ|φ∧φ|¬φ|X(t)|P(t ,...,t )|⊤ 1 n throughout the paper. The formula in(x) ≡ x ≤in x (resp. where x∈X1, X ∈X2, P is a predicate of S of arity n, and out(x) ≡ x ≤out x) hold true if x belongs to the input t,t ,...,t are terms over the function symbols. The atom word (resp. output word). Now for α ∈ {in,out}, we define 1 n X(t) means that t belongs to X (X is viewed as a unary the guarded quantifiers ∃αx φ and ∀αx φ as shortcuts for predicate).Universalquantifiers∀x φ and ∀X φ are naturally ∃xα(x)∧φand∀xα(x)→φ(notethat¬∃αxφisequivalent defined by ¬∃x ¬φ and ¬∃X ¬φ. We also define the false to ∀αx ¬φ). We could also define similarly guarded second- formula ⊥ ≡ ¬⊤. We may write φ(x ,...,x ,X ,...,X ) order quantifiers. Even if it is not necessary, we will often 1 n 1 m to emphasise that the first- and second-orderfree variables of use guarded quantifiers for the sake of formula readability. a formula φ are respectively x ,...,x and X ,...,X . The following formulas express that the origin mapping is 1 n 1 m WithoutdefiningthesemanticsofMSO(thereadercansee, respectively an injection, surjection and bijection from output e.g.,[1]),letusmentionthat,overastructureMwithdomain positions to input positions: D,first-ordervariablesareinterpretedbyelementsofD while φ ≡ ∀outx,y (o(x)=o(y))→x=y second-order variables are interpreted by subsets of D. inj φ ≡ ∀inx∃outy x=o(y) The first-order logic fragment over S (denoted by FO[S]), surj φ ≡ φ ∧φ is the set of MSO[S] formulas without quantification over bij inj surj second-order variables nor free second-order variables. The We can use it to define the transduction τ of Example 1: shuffle two-variablefragmentoffirst-orderlogic,denotedby FO2[S], is the restriction of FO[S] to formulas in which only two φshuffle ≡φbij∧∀outx σ(o(x))→σ(x) variables appear (but can be reused). The existential MSO σ^∈Σ fragment (∃MSO[S]) is the set of MSO[S] formulas of type If we also require the origin mapping to be order-preserving, ∃X1...∃Xn φ where φ is in FO[S,X1,...,Xn], where each we get a formula defining the identity transduction fid: fXraigimsevnietwofed∃MasSaOu,ndaernyopterdedbiyca∃teM.SSOim2i[lSa]rliys,tthheestewtoo-fvfaorriambule- φorder-pres. ≡ ∀outx∀outy x≤out y →o(x)≤in o(y) φ ≡ φ ∧φ las of type ∃X ...∃X φ where φ is in FO2[S,X ,...,X ]. id shuffle order-pres. 1 n 1 n Given a class C of MSO[T ] formulas, we say that C is Σ,Γ C. FO and MSO Logics for Transductions decidable if the following satisfiability problem is decidable: Any production (u,(v,o)) ∈ PR(Σ,Γ) is seen as a struc- given an MSO[TΣ,Γ] formula φ, does there exist a production tureM overthe signatureT composedofunarypredicates (u,(v,o)) from Σ to Γ such that (u,(v,o)) |= φ. In other Σ,Γ words, it asks whether dom(JφK) 6= ∅. It turns out that the 1With the convention that D0 = {()} (empty tuple), a predicate symbol logic MSO[TΣ,Γ] is undecidable, even if restricted to FO: ofarity0iseitherinterpreted byfalse(emptyset)ortrue({()}) 2functions of arity 0 are constant functions, and therefore are interpreted byasingleelement ofD. 3Tomakeitsdomaintotal,oisinterpretedalsoondom(u),bytheidentity. Proposition 3. Over transductions, the logic FO[Σ,Γ,≤in has two free variables x1 and x2 may be sometimes written ,≤out,o] is undecidable. {φ[x1/t1,x2/t2]},i.e.φinwhichthexi havebeensubstituted by t . We keep the brackets { and } to emphasise the fact it Proof. It is a consequence of a stronger result: Prop. 5. i is a binary MSO formula which speaks about the input word. However, if one restricts to the two-variable fragment, Hence, the latter formula can be written as: one regains decidability. This can be shown by using the decidability from [17] of the logic FO2 with one linear-order φorder-pres. ≡∀outx,y (x≤out y)→{o(x)≤in o(y)} and one total pre-order interpreted over data words, and a NotethatallformulasofExample2canbetranslatedintoL , T correspondence between data words and transductions given modulo brackets (see Appendix). in Section III, or as a particular case of a more expressive Identity on a regular language Since the MSO predicates logic that we define in Section II-D. onlyspeakabouttheinputword,itistemptingtothinkthatthe Proposition 4. Over transductions, the logic FO2[Σ,Γ,≤in expressivepowerontheoutputisrestrictedtoFO2.Itturnsout ,≤out,o] is decidable. thatthankstotheoriginmapping,theregularpropertiesofthe input words transfer to the output word. As an example, any Itisknownthatthesuccessorpredicatecannotbeexpressed regular domain restriction of f is L -definable. Let L⊆Σ∗ id T in the two-variablefirst-orderlogic.However,if oneaddsthis bearegularlanguageandletφL ∈MSO[≤in,Σ]beasentence predicate to the signature, the first-order logic is undecidable, definingL.Then,therestrictionf | isdefinedbyφ ∧{φ }. id L id L even if restricted to the two-variable fragment. The proof of Dyck languages Consider the (origin-free) transduction thisresultisanadaptationtotransductionsoftheundecidabil- (ab)n 7→[n]nwheretheoutputalphabetconsistsofΓ={[,]}. ity, overdata words,of FO2 with a linear-orderand successor By taking any bijective origin mapping such that each [ is predicates over positions, and a linear-order on data [20]. mapped to a and each ] to b, , e.g. as follows: Proposition 5. Over transductions, the logic FO2[Σ,Γ,≤in a b a b a b a b ,≤out,Sout,o] is undecidable. [ [ [ [ ] ] ] ] D. The decidable logic L for transductions with origin T The logic LT extends FO2[Σ,Γ,≤in,≤out,o], which is thenoneobtainsatransductiondefinablebysomeLT-formula decidable,withanybinarypredicatedefinableinMSO[≤in,Σ] φ. Indeed,the language (ab)∗ being regular,it is definable by which is only allowed to quantify over input positions, in an MSO[≤in,Σ]-formula φ(ab)∗. Then, φ is defined by: order to capture regular properties of the input words and, as we will see, of the output words as well, while preserving φdom ≡ {φ(ab)∗} decidability. This extension will also be expressive enough to φrange ≡ ∀outx∀outy [(x)∧](y)→x≤out y φ ≡ ∀outx [(x)→{a(o(x))} ∧ ](x)→{b(o(x))} capturefunctionaltransductionsdefinableinMSOT[14],[13]. lab φ ≡ φ ∧φ ∧φ ∧φ We let MSObin[≤in,Σ] be the set of n-ary predicates, for dom range bij lab n≤2, written {φ} where φ is an MSO[≤in,Σ]-formula with Moregenerally,onecanassociatewithanyword(ab)n theset n free variables and the quantifications are over the input. of all well-parenthesised words of length n over Γ. Thus, the We use the brackets { and } to explicit the fact that {φ} rangeofthistransductionisa Dycklanguageover{[,]}.This is a predicate symbol and not just an MSO formula. Over a transductionis realisedby the formulaφ ∧φ ∧φ ∧φ′ dom bij lab production(u,(v,o)), ann-arypredicate{φ}isinterpretedas where φ′ expressesthat for any two outputpositionsx and y, ann-aryrelationoveru,naturallydefinedbytheinterpretation ifo(y)isthesuccessorofo(x)andislabelledb,thenx≤out y: of the MSO formula on u. The logic, denoted by L , is the two-variable fragment of φ′ ≡∀outx∀outy {Sin(o(x),o(y))∧b(o(y))}→x≤out y T first-order logic over the outputsymbol predicates, the linear- This is correct due to the fact that a word w over Γ is well- order ≤out, and all predicates in MSObin[≤in,Σ], i.e. parenthesised if, and only if, for all its prefixes, the number LT :=FO2[Γ,≤out,o,MSObin[≤in,Σ]] oenfs[urisedgbreyattehretfhoarnmtuhleaφnu′mwbheicrhoffo]r,ceasndtheeqpuraoldounctiwon. T[hoifsains It should be clear that LT is a fragment of MSO[Σ,Γ,≤out a to appear before the production ] of its successor b. ,≤in,o] and as such, there is no need to define its semantics. Remark 7. According to the previous examples, one can Example 6. Order-preservation Preservation of the in- express in L the transduction τ defined as the shuffle over T 1 put/output orders through origin is expressed by the LT- the language a∗b∗, and also τ2 : (ab)n 7→ anbn. Hence formula ∀outx,y (x≤out y)→{x′ ≤in y′}(o(x),o(y)). the composition τ2 ◦ τ1 : anbn 7→ anbn has a non-regular Note that we could equivalently replace x′ and y′ by any domain. However, as we will see in Section V, the domain variable (even x and y), without changing the semantics: the of an L -transduction is always regular, which means that T formula x′ ≤in y′ defines a binary relation on the input word, LT-transductions are not closed under composition. which is used as an interpretation of the predicate {x′ ≤in One of our main result is the decidability of L : y′}. To ease the notations, any predicate {φ}(t ,t ) where φ T 1 2 Theorem 8. Over transductions, the logic L is decidable. the order predicate x ≤ y of the output structure are defined T by MSO formulas with respectively one and two free first- The proof of this theorem relies on a reduction to a data order variables, interpreted over the input structure. Formally, wordlogic,calledL andintroducedinSectionIII,whichhas D an MSO-transducer (MSOT) is a tuple T = the same expressiveness as L , modulo encodings of trans- T ductions as data words and conversely, and which is shown k,φ , φc (x) ≤k , φc(x) ≤k , φc,d(x,y) ≤k to be decidable in Section IV. From the technics developed (cid:18) dom pos c=1 γ c=1,γ∈Γ (cid:16) ≤ (cid:17)c,d=1(cid:19) (cid:0) (cid:1) (cid:0) (cid:1) in the decidability proof,we extractseveralinteresting results where k ∈ N and the formulas φ , φc , φc and φc,d are such as regularity of domains of LT-definable transductions, dom pos γ ≤ MSO[Σ,≤]-formulas.We referthereaderforinstanceto[21], decidabilityoffunctionalityandoforigin-boundedness.These [13], [22] for the (origin-free)semantics of these transducers. results are given in Section V. Theoriginsemanticsisobtainedbyaddingtheoriginmapping An importantand desirableconsequenceofthe decidability which associates with any copies c of input position x, the of L is the decidability of the equivalence problem, which T input position x. We call MSO-transduction a transduction asks, given two L -sentences φ ,φ , whether Jφ K = Jφ K. T 1 2 1 2 definable in this formalism. This amounts to check whether ¬(φ ↔φ ) is unsatisfiable. 1 2 Example 12. As an example, we show how to define the Corollary9. TheequivalenceproblemforL -definabletrans- T transduction f : u 7→ uu with origin mapping o(i) = ((i− ductions is decidable. 1)mod|u|)+1byanMSO-transducer.Weusetwocopies(k= By extending the alphabets, the decidability of L can be 2),thepositionformulasarealltrueandthelabelformulasare T trivially extended to that of ∃MSO2: φc(x) = γ(x), c = 1,2. The order formulas are φi,i(x,y) = γ ≤ Corollary 10. Over transductions, the logic ∃MSO2[Γ,≤out x≤y for i=1,2, φ≤1,2(x,y) is true and φ≤2,1(x,y) is false. ,o,MSObin[≤in,Σ]] is decidable. defiTnheisctorpaynsfdourcmtiuolnasisc1d(xefi)n=ab∀leyiyn<LoTut axs→fol{loow(xs). 6=Weo(fiyr)s}t Note that ∃MSO2[Γ,≤out,o,MSObin[≤in,Σ]] is not closed andc2(x)=¬c1(x)∧∀yy <out x→(c1(y)∨{o(x)6=o(y)}). under negation, and therefore we do not get decidability of Then the order is enforced by the order formula: ψ (x,y)= ≤ equivalence for this logic as a consequence of this corollary. x≤out y ↔ (c1(x)∧c2(y))∨( i=1,2ci(x)∧ci(y)∧{o(x)≤in Finally, we conclude this section by showing that LT lies o(y)}) . No(cid:0)w the function f iWs defined in LT by somehowatthedecidabilityfrontier,ifonewantsthepowerof (cid:1) MSO on the input (which is necessary to capture Courcelle’s ∀outx(c1(x)∨c2(x))∧ γ(x)↔{γ(o(x)}∧∀outy ψ≤(x,y) MSO transductions). Adding just the successor relation over γ_∈Γ the output words leads indeed to undecidability. The latter encoding can be (effectively) generalised to any Proposition 11. Over transductions, FO2[Γ,≤out MSO-transduction: ,Sout,o,MSObin[≤in,Σ]] is undecidable. Theorem 13. Any MSO-transduction is L -definable. T Proof. As shown by Prop. 4, the logic FO2[Σ,Γ,≤in,≤out We conjecture the converse of Theorem 13 is true, i.e. ,Sout,o] is already undecidable. whether for all (origin-free) functions f :Σ∗ →Γ∗ definable in L , f is an MSO-transduction. As a good sign towards T E. Expressiveness of L T this conjecture, we observe that functions definable in L T In this section, we compare our logic to MSO-transducers, are linear-size increase, just as MSO-definable functions. We as introduced by Courcelle in [14], which define (partial) indeed show, in Section V, that for all L -formula φ (not T functionsfromlogicalstructuresoversomesignaturetological necessarily defining a function), there exists a linear-size structures over a possibly different signature. Casted to the increase uniformisation of it. wordsignaturewith unarypredicatesσ(x) forlabelsandwith AnappealingconsequenceofTheorem13andthedecidabil- a linear order ≤ between positions, it defines a logical for- ityofL isthedecidabilityofthemodel-checkingproblemfor T malism forword-to-wordtransductions,whichis knownto be expressive classes of transductions. Given a system realizing equivalentto deterministic 2-way finite state transducers[13]. some functionaltransductionf (with origin),defined in some While these formalisms have been defined without origin expressive operational model such as a deterministic 2-way semantics, they intrinsically bear origin information [16]. We transducer T , and given some specification φ defined in f S first show that any transduction definable by some MSO- L , one can decide whether T |= φ , i.e. for all inputs T f S transducer is definable in L . u ∈ dom(f), (u,f(u)) ∈ Jφ K. It suffices to convert T T S f In an MSO-transducer,the output word structure is defined intosomeMSO-transducer(basedontheequivalencebetween bytakingafixednumberkofcopiesoftheinputworddomain. deterministic 2-way transducers and MSO-transducers [13]) Nodes of these copies can be filtered out by MSO formulas which in turn is converted into some L -formula φ , thanks T f with one free first-order variable. In particular, the nodes of toThm.13, andthentotestthe(un)satisfiabilityofφ ∧¬φ . f S the c-th copy are the input positions that satisfy some given MSO-transductions have been extended to relations (called MSO formula φc (x). The output label predicates γ(x) and NMSO-transductions), by using a set of monadic second- pos order parameters X ,...,X . Every formulas of NMSO- {d ,...,d } = {1,...,m} for some m (i.e. it is closed5 by 1 n 1 n transductions can use X ,...,X as free variables. Once an ≤). Note that by definition of typed data words, if d = d , 1 n i j interpretation of these variables as sets of positions is fixed, then positions i and j carry the same data type σ = σ di dj the transduction becomes functional. Therefore, the maximal from Σ. We let dom(w)={1,...,n} be the set of positions number of output words for the same input word is bounded of w. by the number of interpretations of X ,...,X . The data of a typed data word w induce a total preorder6 1 n (cid:22) on the positions of w defined by i (cid:22) j if d ≤ d . This Proposition 14. The class of NMSO- and L -transductions i j T preorder induces itself an equivalence relation ∼ defined by are incomparable. Any NMSO-transduction is expressible in i ∼ j iff i (cid:22) j and j (cid:22) i, which means that the positions i ∃MSO2[Γ,≤out,o,MSObin[≤in,Σ]]. and j carry the same datum. Indeed consider the transduction τ defined as the set of Given two disjoint alphabets Σ and Γ, we denote by even pairs (u,vv) where v is a subword of u of even length. It is TDW(Σ,Γ) the set of typed data words over alphabets Γ definableinNMSObyusingaparameterX whichdefinesthe and Σ. One might think that the data labels are redundant inputpositionsthatarekept.Then,τ isdefinedalmostasin with the alphabet. However,our goalis to define, similarly to even Example 12: The position formulas φc (x) are just changed the previoussection, a logic where differentrestrictionsapply pos to X(x) and the domain formula states that the predicate tothepositionalphabetandthedatatypes.Thus,havingtypes X is of even size. τ is not L -definable as it requires a for data will allow for more expressive power of the logic. even T quantification over the output within the scope of a second A typed data word w = (γ ,σ ,d )...(γ ,σ ,d ) will 1 1 1 n n n order quantification over the input. Conversely, neither the equivalently be seen as a structure M over the signature universal transduction Σ+ ×Γ+ nor the shuffle are NMSO- D ={(δ(x)) ,≤,(cid:22)} definable, due to the bound on the number of images of an Σ,Γ δ∈Σ∪Γ input word, while they are L -definable. The inclusion is a with the following interpretations: T trivial extension of Thm. 13 by existentially quantifying the • ≤M⊆dom(w)2 is the natural linear order on dom(u), parameters Xi. Going back to model checking, one can then • (cid:22)M⊆dom(w)2 isthetotalpreorderi(cid:22)M j iffdi ≤dj, check if all models of an NSST A satisfy an LT-formula • γM ={i∈dom(w)|γi =γ}, for γ ∈Γ, ϕ. Indeed since NSST are equivalent to NMSOT, one can • σM ={i∈dom(w)|σi =σ}, for σ ∈Σ. construct a formula ϕA ∈∃MSO2[Γ,≤out,o,MSObin[≤in,Σ]] We will also consider the predicate ∼ interpreted by the and check if ϕA∧¬ϕ is satisfiable. equivalence relation induced by (cid:22)M (the predicate ∼ will be definable in the data logic we will consider). As for III. TYPED DATAWORDS productions, we will write w |= φ, for some logical formula Inthissection,wemakeaconnectionbetweentransductions φ over D instead of M |= φ, and implicitly assume that Σ,Γ and data words, introduce a new logic LD for data words, w is given as a structure. which is shown to be decidable in Section IV. We show that B. The logic L for typed data words modulosomeencodings,thelogicsL andL areequivalent. D D T A data word is a word where each position is equipped The logic MSO[DΣ,Γ] is known to be undecidable [20] with a label from a finite alphabet and a value from an (even the first-order fragment). We define here a fragment of infinite domain, its datum. They are especially used to model MSO[DΣ,Γ], calledLD, whichwillbeshownto bedecidable. computationaltracesindistributedenvironments,wherevalues Intuitively,a formulaofLD canbeseenasanFO2 formula correspond to process numbers. Data trees are also used in using the linear order of the positions and some additional database theory to model XML documents, where the values binarydata predicates. The logic LD is indeedbuilt on top of model XML attributes. Logics for data words have already MSO n-arypredicates,for n≤2, whichare allowedto speak beenconsidered(see [19]).Inparticular,itwasprovedin[20] only about the data. Precisely, we define MSObin[Σ,(cid:22)] to be that FO with equality over data in undecidable, but FO2 is. the set of n-ary predicates written {φ}, for n ≤ 2, where φ In [17], the authors proved, over an ordered data domain, the is an MSO-formula with n-free first-order variables, over the decidabilityofFO2equippedwithboththeorderandsuccessor unary predicates σ(x) and the preorder (cid:22), with the following over data, but adding these data predicates is done at the cost semantical restriction7: second-order variables are interpreted of removing the successor for the linear order over positions. by∼-closedsets ofpositions. Overtypeddata words(seen as structuresoverD ), predicates{φ} are interpretedbyn-ary Σ,Γ A. Typed data words relations on positions, n≤2, defined by formulas φ. We consider here typed data words, where our data do- 5We make this assumption wlog, because the logic we define will only main is ordered, and each data also carries a label (type) beabletocomparetheorderofdata, andtherefore cannotdistinguish typed from a finite alphabet. Formally, a typed data word w data words up to renaming of data, as long as the order is preserved. E.g. over two disjoint alphabets4 Γ and Σ is a sequence w = (a,b,1)(c,d,3)(e,f,2) and(a,b,2)(c,d,5)(e,f,4)will be undistinguish- ablebythelogic. (γ1,σd1,d1)...(γn,σdn,dn) from (Γ × Σ × N)∗ such that 6Werecall thatapreorderisareflexive andtransitive relation. 7Notethatthesemanticalrestrictioncouldalsobeenforcedinthelogicby 4Assumingdisjointness isdonewlog.Wedoitforsimplifyingtheproofs. guardingquantifiers ∃Xψ by∃X[∀x∀y x∈X∧y∼x)→y∈X]→ψ. Duetothesemanticalrestriction,formulasinMSO [Σ,(cid:22)] erasingifallproductionsofJφK are non-erasing.Satisfiability bin cannotdistinguish positionswith the same data and therefore, of L is reducible to satisfiability of non-erasing formula: T they can be thought of as formulas which quantify over data Proposition 17. For any L -formula ϕ there exists a non- and sets of data. As an example, the formula ∀y x (cid:22) y T erasing L -formula ϕ′ such that dom(JϕK) =dom(Jϕ′K). In expressesthatthedataofpositionxisthesmallestdata,andit T particular, ϕ is satisfiable if, and only if, ϕ′ is. holdstrueforanyx′ withthesamedata.Then,thelogicL is D definedasthetwo-variablefirst-orderfragmentofMSO[D ] The reason we consider non-erasing transductions is be- Σ,Γ with the MSO-predicates we just defined. cause there are straightforward encodings of non-erasing productions to typed data words, over the same alpha- L :=FO2[Γ,≤,MSO [Σ,(cid:22)]] D bin bets, and conversely. A non-erasing production (u,(v,o)) can be encoded as the typed data word t2d(u,(v,o)) = (v(1),u(d ),d )...(v(n),u(d ),d ) where d = o(i). The 1 1 k k i Example 15. We let x ∼ y stands for x (cid:22) y ∧ y (cid:22) x. inverse of t2d(u,(v,o)) is defined, for all typed data words First, letusmentionthatMSObin[Σ,(cid:22)] predicatescanexpress w =(γ1,σd1,d1)...(γk,σdk,dk),bythenon-erasingproduc- any regular properties about the data, in the following sense. tiont2d−1(w)=(u,(v,o)) suchthatv =γ ...γ ,o(i)=d , 1 k i Given a typed data word w, the total preorder (cid:22) between and u = σ ...σ where n = max d . As an example, 1 n i i positions in dom(w) can be extended to a total order ≤∼ on consider the following production and typed data word: the equivalenceclasses of dom(w)/ , by [i] ≤ [j] if i(cid:22) ∼ ∼ ∼ ∼ j. Then, any typed data word induces a word σ1...σn ∈Σ∗ # $ @ # # 3 2 1 3 5 4 @ $ # @ # # such that σi is the type of the elementsof the ith equivalence a b c c a b a b c c a b class, for ≤ . Any regular property of these induced words ∼ overΣ transferinto aregularpropertyaboutthedata oftyped Proposition 18. Non-erasing productions of PR(Σ,Γ) and data words (it suffices to replace in the MSO-formula on Σ- typed data words of TDW(Σ,Γ) are in bijection by t2d. words expressing the property, the linear order by (cid:22) and the equality by ∼). Examples of properties are: n is even, which This encoding transfers to definability in LD and LT: transfers into “there is an even number of data”, or σ1...σn Theorem 19. A non-erasing transduction τ is L -definable T containsanevennumberofσ ∈Σ,forsomeσ,whichstransfer iff t2d(τ) is L -definable. Conversely, a language of typed D into “there is an even number of data of type σ”. data words L is L -definable iff t2d−1(L) is L -definable. D T One can also define partial specifications like “occurences of a symbols are ordered by their data”, which is stated by Sketch of Proof. To go from LT to LD, the main idea is to φa =∀x∀y(a(x)∧a(y))→({x(cid:22)y}↔x≤y). makeasyntactictransformationthatmimicstheencodingt2d: ≤ once inconsistent use of terms have been removed (such as Ournextexampleisthatofaschedulerwithasetofordered processus (whose id are the data) that can be urgent (data e.g., o(x) ≤out y), terms on(x) are replaced by x, predicates type u) or non urgent (data type n). The property that urgent ≤in by (cid:22) and ≤out by ≤. The converse is similar. processusare treatedfirstandin id-orderis expressedbyψ = ThankstoProposition17,thesatisfiabilityofL reducesto T ∀x∀y{u(x) ∧ (n(y)∨ x (cid:22) y)} → x ≤ y. To enforce that satisfiability of L over non-erasingtransductions.Therefore, T processusare onlytreatedonce,itsufficestostate thatnotwo the combination of the theorems 16 and 19 will lead to the positionscarrythesamedata:ψbij ≡∀x∀y {x∼y}→x=y. decidability of L (Theorem 8). This result, together with T Moregenerally,allLT formulasofExample6canbeturned other interesting results on LT, is proved in Section V. intoformulasofL ,aswewillsee,allowingustodefine,for D example,alanguageoftypeddatawordswithΓ={[,]}such IV. DECIDABILITY OFLD that its projection on Γ is the set of well-parenthesised words This section is devoted to proving the decidability of the overΓ,orlanguagesoftypeddatawordswhoseprojectionson logic L , and hereby of the logic L . The proof scheme D T Γareshuffleclosuresofregularlanguages.(theshuffleclosure uses sets of labelled points, as done in [17], and unfolds as ofaregularlanguageListhesetofwordsγ ...γ forπ follows. We first get a normal form for the logic as formulas π(1) π(n) a bijection from {1,...,n} to {1,...,n}, and γ1...γn ∈L. with quantifier depth at most 2 (Scott normal form). Then Forexample,the closureof(abc)∗ isthe setof wordswith an we observe that data words can be seen as two-dimensional equal number of a, b, and c, which is not even context-free. structures with the horizontal axis representing the linear order and the vertical axis representing the order over data. Our second main resultis the nexttheorem,whose proofis Normalised L -formula are then translated into a set of the main goal of Section IV. D constraints over sets of labelled points. Theorem16. Ontypeddatawords,thelogicL isdecidable. Then,thesatisfiabilitytestoftheseconstraintsisbasedona D boundedabstractionofhorizontallines(setsoflabelledpoints C. From transductions to data words and back aligned horizontally) called profiles. We define an automaton Aproduction(u,(v,o))∈PR(Σ,Γ)issaidtobenon-erasing recognizing sequences of profiles which can be effectively if o is a surjective function, and an L -formula φ is non- turned into sequences of lines (thus forming a labelled point T system) satisfying the constraints. Hence, the decidability of a′ b c an L -formula reduces to testing emptiness of this profile D a′ b automaton. c′ b A. Scott normal form b′ a a a ThefirststepistonormaliseanyformulainL intoaScott D normalform(SNF).TheproceduretoputaformulainSNFis Fig.2. Anexampleoflabelledpointsystem(LPS).Thecorrespondingtyped the same as FO2 logics in general(see [23] for instance). We datawordis(b,c′,2)(a,b′,1)(b,a′,4)(a,b′,1)(c,a′,3)(b,a′,4)(a,b′,1). prove it in our context along with some preservation property about the data. Since we aim to get stronger properties than satisfiability,westateastrongerresult,yettheproofissimilar. (γ ,L(j ),j )...(γ ,L(j ),j ) (note that thanks to condi- 1 1 1 n n n Lemma 20. For any L -formula ϕ over an alphabet Γ and tion(2),thesetofdataisdownward-closed).Conversely,from D a type alphabet Σ, one can construct an LD-formula φ over anytypeddatawordw =(γ1,σd1,d1)...(γn,σdn,dn)weget Γ×Γ′ and the type alphabet Σ such that: the LPS d2p−1(w)=(L,P) where P ={(γi,i,di)|1≤i≤ • up to projection on Γ, φ and ϕ have the same models, n} and L={(σdi,di |1≤i≤maxi(di)}. • φisoftheform∀x∀yψ(x,y)∧ mi=1∀x∃yψi(x,y)where Proposition 21. Typed data words of TDW(Σ,Γ) and la- the formulasψ and ψi, i=1,.V..,m, are quantifierfree. belled point systems of LPS(Σ,Γ) are in bijection by d2p. Sketch of proof. Theideaistoiterativelyreplacesubformulas ConstraintsonLPSarebuiltoverlabelpredicates,andsome of quantifier depth one by a new predicate, and preserve the horizontal and vertical predicates. A label predicate γ ∈ Γ modelsby addinga new conjunct.At each step, a subformula is satisfied by a point p of P if p is labelled γ. Horizontal ∃y ψ(x,y) (resp. ∀y ψ(x,y)) where ψ(x,y) is a quantifier predicates are just horizontal directions →,←, which are free formula is replaced in ϕ by a new predicate P(x). We satisfiedbyapairofpoints((γ,i,j),(γ,i′,j′))if,respectively, add a conjunct ∀x∃y(P(x) → ψ(x,y)) (resp. ∀x∀y(P(x) → i < i′ and i′ < i. Vertical predicates are any MSO-definable ψ(x,y))). The number of steps, and so the number of predi- binary predicate over line labels Σ and the vertical order, catesadded,isequaltothenumberofquantificationsinϕ. denotedby ≤. The pair of points((γ,i,j),(γ,i′,j′)) satisfies B. 2-dimensional constraints over labelled point systems a vertical predicate ψ(x,y) ∈ MSO[Σ,≤] if (L,x = j,y = j′)|=ψ(x,y). Similarly to [17], we translate typed data words into sets An existential constraint is a pair (γ,E) where γ ∈Γ and of labelled points in a plane and positive formulas of L of D E is a possibly empty set of tuples (γ′,d,ψ) such that d is a quantifier depth at most two into sets of constraints, while horizontaldirectionandψisaverticalpredicate.Given(L,P) preserving satisfiability. The main differences with [17] are anLPS,apointpofP satisfiesan∃-constraint(γ,E)ifeither thateachhorizontallinewillbeadditionallylabelledbyadata pdoesnotsatisfyγ,orthereexistsapointq ofP anda triple type,andthattheconstraintswillusearbitraryMSOdefinable (γ′,d,ψ) of E s.t. q satisfies γ′, and (p,q) satisfies d and ψ. binarypredicatesinterpretedonthewordoflinelabels,instead In the latter case, we call q a valid witness of p for (γ,E). of just the order and successor over lines. A universal constraint is a tuple (γ,γ′,d,Ψ) where d is a Formally, we call labelled point system (LPS) over two horizontal direction and Ψ a vertical predicate. A pair (p,q) disjoint alphabets Σ and Γ a pair of partial assignments of finite domains L:N→Σ and P :N2 →Γ such that: of points of P satisfy a ∀-constraint (γ,γ′,d,Ψ) if p is not labelledbyγ,q isnotlabelledbyγ′,or(p,q)doesnotsatisfy 1) dom(L)anddom(P)’s1st-projectionaredownward-closed either d or Ψ. Then a universal constraint can be thought of 2) dom(P)’s 2nd-projection equals dom(L) as a forbidden pattern over pairs of points. 3) notwopointssharethesameabscissa,i.e.if(i,j),(i,j′)∈ An instance of the (two-dimensional) MSO labelled point dom(P), then j =j′. problem (MSOLPP) is a pair C = (C ,C ) of sets of ∃ ∀ Aswe willsee,theconditions1to 3willensurethe existence existential and universal constraints respectively. An LPS of an isomorphism from LPS to typed data words. A point M = (L,P) is a solution of (or model for) C, denoted p of P is a triple (γ,i,j) such that P(i,j) = γ. A line is M |= C, if every point of P satisfies all constraints in C ∃ a set of points with same ordinate. The line label of a point and every pair of points satisfy all constraints in C . ∀ p=(γ,i,j)∈P isthelabelσ suchthatL(j)=σ.Wedenote by LPS(Σ,Γ) the set of LPS over Σ and Γ. C. From L to MSOLPP D Thanks to the finiteness and downward-closedness of the Proposition 22. From any L -sentence φ in SNF over Σ domain of L, L is nothing but a word over Σ. As illus- D and Γ, one can construct an instance C of MSOLPP such trated in Fig. 2, any LPS (L,P) can be viewed as a typed that for any typed data word w ∈ TDW(Σ,Γ), w |= φ iff data word d2p(L,P). Indeed, thanks to condition (3) and d2p(w)∈LPS(Σ,Γ) is a solution of C. downward-closedness of P, there is exactly one point per abscissa up to some bound n. By condition (3), we have Sketch of proof. The key to the proof is that the conjunct P = {(γ ,1,j ),...,(γ ,n,j )}, and we set d2p(L,P) = ∀x∀y ψ(x,y) is translated into a universal constraint while 1 1 n n conjuncts of the form ∀x∃y ψi(x,y) become existential con- a′ b c straints. We consider atomic types for pairs of points, which are complete sets of truth values for all predicates (MSO, a′ b labelsanddirections).Thentheconjunct∀x∀y ψ(x,y)defines b thesetoftypesthatareallowed(thetypesthatsatisfyψ(x,y)), c′ • • • • • • whosecomplementisthentranslatedintouniversalconstraints. Using simple formula manipulations, m ∀x∃y ψ (x,y) is b′ a a a i=1 i m aretowmriitctentypinet.oT∀hxenγV∈fΓo(rγ(exac)h→lett∃eyrℓWγ=1oVtfγ,ℓΓ), wwheerceontsγt,rℓucist aann F(big,.(qx3,q.y) ,A↑)(bl,in(qex,qwa′it)h,↑p)r(ocfi,le(q:x,qcy′),{q,x↑),(qaa,′}(q,x(,bq,x·))(a,,↓(()qqax′,q,qxx)) ,↓) (qa′,qy) (qa′,qa′) (qa′,qy) (qa′,qx) existential constraint over some of the types t . γ,ℓ D. Automata for binary predicates inthesequence,forallk =1,...,n.Bydefinition,thenumber Itiswell-known(seee.g.[24])thatanybinaryMSO[Σ,≤]- ofprofilesisboundedbyN =|Σ|·2|SΨ|(|Γ|·(2(2|SΨ|2+1+1))!. predicate ψ(x,y) over Σ-labelled words, can be equivalently 2) Profile of a line: Formal definitions can be found in defined by a non-deterministic finite automaton (called here Appendix. Given an LPS M = (L,P) and an instance C of a predicate automaton) A = (Q ,Σ,I ,∆ ,F ) equipped MSOLPP,weconstructtheprofileofalinekofM asfollows. ψ ψ ψ ψ ψ with a set SP ⊆ Q2 of selecting pairs with the following The line label is the kth label of L, and the set of states S is ψ ψ semantics: for any word u ∈ Σ∗ and any pair of positions the set of reachable and co-reachable states of the predicate (i,j) of u, we have u |= ψ(i,j) if, and only if, there exists automataofC beforereachingpositionk oftheverticalword. an acceptingrun π of A and a pair (p,q)∈SP s.t. π is in ThenforeachpointpofP,weconstructaclauseoftype(γ,·) ψ ψ state p beforereadingu(i) and in state q beforereadingu(j). if p is on line k, or a clause containing the label of p, the set of all pairs of states such that there exists an accepting run Example 23. Let us consider as an example the binary thatreachesthefirststateonkandthesecondonthelineofp, betweenpredicateBet (x,y)=∃z σ(z)∧(x<z)∧(z <y), σ and the direction(↑ or ↓) toward p. These clauses are ordered which cannot be expressed using only two variables. The by the abscissa of the correspondingpoint, and we only keep automaton for this predicate is: the leftmost and rightmost occurrence of each clause. Σ σ Σ qx qσ qy qf Example24. Letusgiveanexamplemakinguseoflinelabels and which illustrates predicate automata. Figure 3 shows the Σ Σ Σ Σ relevant points for the profile of the second line of the LPS E. Profiles from Figure 2, when considering the binary predicate Beta′ as defined in Example 23. The complete profile sequence of InordertoobtainthedecidabilityofMSOLPP,weconsider the same LPS is given in the Appendix. bounded abstractions of horizontal lines of labelled point Notice that the point labelled by ‘a’ in the middle of the systems.Theabstractionofahorizontallineiscalledaprofile. bottom line does not appear in any clause since it is not Intuitively, given an LPS, the profile of a line in this system relevant to the satisfaction of constraints. holds information regarding some points on the line, but also regarding their relative position with respect to other points 3) Valid profiles: A profile λ = (σ,S,A ...A ) satisfies 1 n of the system that are relevant to the satisfiability of the an existential constraint c = (γ,E) if for every i such that constraints. A profile will be considered valid if for every A = (γ,·) there exists a tuple (γ′,d,ψ) of E, j ≤ n i existential constraint and every point, it describes a valid such that i ∼ j, (p,q) ∈ SP and v ∈ {↑,↓} such d ψ witness for it, and it never declares a pair of points which that (γ′,{(p,q)},v) ∈ A . It satisfies a universal constraint j violates a universal constraint. Two successive profiles are (γ,γ′,d,ψ) if there does not exist i and j such that i ∼ j, d consistentiftheyagreeonthepointsdescribedbyeachprofile, A =(γ′,·) and (γ′,{(p,q)},v)∈A for some (p,q)∈SP i j ψ meaningthatifa pointisdeclaredinoneprofile,onecanfind andv averticaldirection.GivenaninstanceC ofMSOLPP,a a correspondingpointinthe otherone.A sequenceofprofiles profile is C-valid (or just valid)if it satisfies everyconstraint. will be maximal if the profiles do not withhold information, A sequence of valid profiles is also called valid sequence. which is relevant to the satisfiability of constraints. 4) Consistency: The states informationcontained in a pro- 1) Profiles: Let C be an instance of MSOLPP over Σ and filedeclares,onagivenline,everyacceptingrunthatcanoccur Γ, and Ψ the set of MSO-predicates occurring in C. For all and influence the satisfaction of a constraint. When reading ψ∈Ψ,weletA withsetofstatesQ andsetofselectingpairs a sequence of profiles, we then have to ensure that what is ψ ψ SP be the predicate automatonfor ψ. Let S = Q . declared by a profile is sound and complete. To this end, we ψ Ψ ψ∈Ψ ψ A C-profile(orjustprofile)isa tupleλ=(σ,S,AU1...An) define below the consistency of two consecutive profiles. The where σ ∈ Σ is a line label, S ⊆ S and A ...A is a formal definition is technical and can be found in Appendix. Ψ 1 n sequence of elements from Γ×({·}∪P(S×S )×{↑,↓}), Informally, two profiles are consistent if everything that is Ψ called clauses, such that any clause A appearsat most twice declared by one of the profiles is acknowledged by the other. k

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.