Untwisting two-way transducers in elementary time

Félix Baschenis (Université de Bordeaux, LaBRI)
Olivier Gauwin (Université de Bordeaux, LaBRI)
Anca Muscholl (Université de Bordeaux, LaBRI)
Gabriele Puppis (CNRS, LaBRI)

arXiv:1701.02502v1 [cs.FL] 10 Jan 2017

This work was partially supported by the ANR projects ExStream (ANR-13-JS02-0010) and DeLTA (ANR-16-CE40-0007).

Abstract. Functional transductions realized by two-way transducers (equivalently, by streaming transducers and by MSO transductions) are the natural and standard notion of "regular" mappings from words to words. It was shown recently (LICS'13) that it is decidable if such a transduction can be implemented by some one-way transducer, but the given algorithm has non-elementary complexity. We provide an algorithm of different flavor solving the above question, that has double exponential space complexity. We further apply our technique to decide whether the transduction realized by a two-way transducer can be implemented by a sweeping transducer, with either known or unknown number of passes.

I. INTRODUCTION

Since the early times of computer science, transducers have been identified as a fundamental notion of computation, where one is interested how objects can be transformed into each other. Numerous fields of computer science are ultimately concerned with transformations, ranging from databases to image processing, and an important issue is to perform transformations with low costs, whenever possible.

The most basic form of transformers are devices that process an input and produce outputs during the processing, using finite memory. Such devices are called finite-state transducers. Word-to-word finite-state transducers were considered in very early work in formal language theory [1, 2, 3], and it was soon clear that they are much more challenging than finite-state word acceptors - the classical finite-state automata. One essential difference between transducers and automata over words is that the capability to process the input in both directions strictly increases the expressive power in the case of transducers, whereas this does not for automata [4, 5]. In other words, two-way word transducers are strictly more expressive than one-way word transducers.

We consider in this paper functional transducers, that compute functions from words to words. Two-way word transducers capture very nicely the notion of regularity in this setting. Regular word functions, i.e. functions computed by functional two-way transducers, inherit many of the characterizations and algorithmic properties of the robust class of regular languages. Engelfriet and Hoogeboom [6] showed that monadic second-order definable graph transductions, restricted to words, are equivalent to two-way transducers; this justifies the notation of "regular" word functions, in the spirit of classical results in automata theory and logic by Büchi, Elgot, Rabin and others. Recently, Alur and Černý [7] proposed an enhanced version of one-way transducers called streaming transducers, and showed that they are equivalent to the two previous models. A streaming transducer processes the input word from left to right, and stores (partial) output words in finitely many, write-only registers.

Two-way transducers raise challenging questions about resource requirements. One crucial resource is the number of times the transducer needs to re-process the input word. In particular, the case where the input can be processed in a single pass, from left to right, is very attractive as it corresponds to the setting of streaming, where the (possibly very large) inputs do not need to be stored in order to be processed. Recently, it was shown in [8] that it is decidable whether the transduction defined by a functional two-way transducer can be implemented by a one-way transducer. However, the decision procedure of [8] has non-elementary complexity, and it is very natural to ask whether one can do better. We gave in [9, 10] an exponential space algorithm in the special case of sweeping transducers: head reversals are only allowed at the extremities of the input. However, sweeping transducers are known to be strictly less expressive than two-way transducers.

In this paper we provide an algorithm of elementary complexity for deciding whether the transduction defined by a functional two-way transducer can be implemented by a one-way transducer: the decision algorithm has double exponential space complexity, and an equivalent one-way transducer (if it exists), can be constructed with triple exponential size. The known lower bound [9] is double exponential size. Our techniques can be further adapted to characterize definability of transductions by other models of transducers, e.g. to characterize sweeping transducers within the class of two-way transducers.

Related work. Besides the papers mentioned above, there are several recent results around the expressivity and the resources of two-way transducers, or equivalently, streaming transducers. First-order definable transductions were shown to be equivalent to transductions defined by aperiodic streaming transducers [11] and to aperiodic two-way transducers [12]. An effective characterization of aperiodicity for one-way transducers was obtained in [13].

In [10, 14] the minimization of the number of registers of deterministic streaming transducers, resp., passes of functional sweeping transducers, was shown to be decidable. An algebraic characterization of (not necessarily functional) two-way transducers over unary alphabets was provided in [15].
It was shown that in this case sweeping transducers have the same expressivity. The expressivity of non-deterministic input-unary or output-unary two-way transducers was investigated in [16].

Overview. Section II introduces basic notations for two-way transducers, and Section III states the main result. Section IV is devoted to the effect of pumping runs on outputs, and Section V introduces the main tool for our characterization. Section VI handles the construction of an equivalent one-way transducer. Finally, Section VII describes a procedure to decide whether a functional transducer is equivalent to a sweeping transducer.

II. PRELIMINARIES

Two-way automata and transducers. We start with some basic notations and definitions for two-way automata (resp., transducers). We assume that every input word $u = a_1 \cdots a_n$ has two special delimiting symbols $a_1 = \vdash$ and $a_n = \dashv$ that do not occur elsewhere: $a_i \notin \{\vdash,\dashv\}$ for all $i = 2,\dots,n-1$.

A two-way automaton $A = \langle Q,\Sigma,\vdash,\dashv,\delta,q_0,F\rangle$ has a finite state set $Q$, input alphabet $\Sigma$, transition relation $\delta \subseteq Q \times (\Sigma \cup \{\vdash,\dashv\}) \times Q \times \{\mathrm{left},\mathrm{right}\}$, initial state $q_0 \in Q$, and set of final states $F \subseteq Q$. By convention, left transitions on $\vdash$ are not allowed. A configuration of $A$ has the form $u\,q\,v$, with $uv \in \{\vdash\}\cdot\Sigma^*\cdot\{\dashv\}$ and $q \in Q$. A configuration $u\,q\,v$ represents the situation where the current state of $A$ is $q$ and its head reads the first symbol of $v$ (on input $uv$). If $(q,a,q',\mathrm{right}) \in \delta$, then there is a transition from any configuration of the form $u\,q\,a\,v$ to the configuration $u\,a\,q'\,v$, which we denote $u\,q\,a\,v \xrightarrow{a,\mathrm{right}} u\,a\,q'\,v$. Similarly, if $(q,a,q',\mathrm{left}) \in \delta$, then there is a transition from any configuration of the form $u\,b\,q\,a\,v$ to the configuration $u\,q'\,b\,a\,v$, denoted as $u\,b\,q\,a\,v \xrightarrow{a,\mathrm{left}} u\,q'\,b\,a\,v$. A run on $w$ is a sequence of transitions. It is successful if it starts in the initial configuration $q_0\,w$ and ends in a configuration $w\,q$ with $q \in F$ (note that this latter configuration does not allow additional transitions). The language of $A$ is the set of input words that admit a successful run of $A$.

The definition of two-way transducers is similar to that of two-way automata, with the only difference that now there is an additional output alphabet $\Gamma$ and the transition relation is a finite subset of $Q \times (\Sigma \cup \{\vdash,\dashv\}) \times \Gamma^* \times Q \times \{\mathrm{left},\mathrm{right}\}$, which associates an output over $\Gamma$ with each transition of the underlying two-way automaton. Formally, given a two-way transducer $T = \langle Q,\Sigma,\vdash,\dashv,\Gamma,\delta,q_0,F\rangle$, we have a transition of the form $u\,b\,q\,a\,v \xrightarrow{a,d\,|\,w} u'\,q'\,v'$, outputting $w$, whenever $(q,a,w,q',d) \in \delta$ and either $u' = u\,b\,a$, $v' = v$ or $u' = u$, $v' = b\,a\,v$, depending on whether $d = \mathrm{right}$ or $d = \mathrm{left}$. The output associated with a run $\rho = u_1\,q_1\,v_1 \xrightarrow{a_1,d_1|w_1} \cdots \xrightarrow{a_n,d_n|w_n} u_{n+1}\,q_{n+1}\,v_{n+1}$ of $T$ is the word $\mathrm{out}(\rho) = w_1 \cdots w_n$. A transducer $T$ defines a relation consisting of all pairs $(u,w)$ such that $w = \mathrm{out}(\rho)$, for some successful run $\rho$ on $u$.

The domain of $T$, denoted $\mathrm{dom}(T)$, is the set of input words that have a successful run. For transducers $T,T'$, we write $T' \subseteq T$ to mean that $\mathrm{dom}(T') \subseteq \mathrm{dom}(T)$ and the transductions computed by $T,T'$ coincide on $\mathrm{dom}(T')$.

We say that $T$ is functional if for each input $u$, at most one output $w$ can be produced by any possible successful run on $u$. Finally, we say that $T$ is one-way if it does not have transition rules of the form $(q,a,w,q',\mathrm{left})$.

[Fig. 1. Graphical presentation of a run by means of crossing sequences. The figure shows a run through states $q_0,\dots,q_8$ on the input word $a_1 a_2 a_3 a_4$, drawn over positions 0 to 4.]

Crossing sequences. The first basic notion is that of crossing sequence. We follow the convenient presentation from [17], which appeals to a graphical representation of runs of a two-way transducer where each configuration is seen as point (location) in a two-dimensional space. Let $u = a_1 \cdots a_n$ be an input word (recall that $a_1 = \vdash$ and $a_n = \dashv$) and let $\rho$ be a run of a two-way automaton (or transducer) $T$ on $u$. The positions of $\rho$ are the numbers from 0 to $n$, corresponding to "cuts" between two consecutive letters of the input. For example, position 0 is just before the first letter $a_1$, position $n$ is just after the last letter $a_n$, and any other position $x$, with $1 \le x < n$, is between the letters $a_x$ and $a_{x+1}$.

We say that a transition $u\,q\,v \xrightarrow{a,d} u'\,q'\,v'$ of $\rho$ crosses position $x$ if either $d = \mathrm{right}$ and $|u| = x$, or $d = \mathrm{left}$ and $|u'| = x$. A location of $\rho$ is any pair $(x,y)$ for which there are at least $y+1$ transitions in $\rho$ crossing position $x$; the component $y$ of a location is called level. Each location is associated with a state. Formally, we say that $q$ is the state at location $\ell = (x,y)$ in $\rho$, and we denote this by writing $\rho(\ell) = q$, if the $(y+1)$-th transition that crosses $x$ ends up in state $q$. The crossing sequence at position $x$ of $\rho$ is the tuple $\rho|x = (q_0,\dots,q_h)$, where the $q_y$'s are all the states at locations of the form $(x,y)$, for $y = 0,\dots,h$.

As suggested by Fig. 1, any run can be represented as an annotated path between locations. For example, if a location $(x,y)$ is reached by a rightward transition, then the head of the automaton has read the symbol $a_x$; if it is reached by a leftward transition, then the head has read the symbol $a_{x+1}$. Note that in a successful run $\rho$ every crossing sequence has odd length and every rightward (resp. leftward) transition reaches a location with even (resp. odd) level. We can identify four types of transitions between locations, depending on the parities of the levels (the reader may refer again to Fig. 1):
• $(x,2y) \xrightarrow{a_{x+1},\mathrm{right}} (x+1,2y')$,
• $(x,2y) \xrightarrow{a_{x+1},\mathrm{left}} (x,2y+1)$,
• $(x+1,2y'+1) \xrightarrow{a_{x+1},\mathrm{left}} (x,2y+1)$,
• $(x,2y+1) \xrightarrow{a_x,\mathrm{right}} (x,2y+2)$.

Hereafter, we will identify runs with the corresponding annotated paths between locations. It is also convenient to define a total order $\unlhd$ on the locations of a run $\rho$ by letting $\ell_1 \unlhd \ell_2$ if $\ell_2$ is reachable from $\ell_1$ by following the path described by $\rho$; the order $\unlhd$ on locations is called run order. Given two locations $\ell_1 \unlhd \ell_2$ of a run $\rho$, we write $\rho[\ell_1,\ell_2]$ for the factor of the run that starts in $\ell_1$ and ends in $\ell_2$. Note that the latter is also a run and hence the notation $\mathrm{out}(\rho[\ell_1,\ell_2])$ is permitted. Two runs $\rho_1,\rho_2$ can be concatenated, provided that $\rho_1$ ends in location $(x,y)$, $\rho_2$ starts in location $(x,y')$, such that $y' = y \pmod 2$ and $(x,y)$, $(x,y')$ are labelled by the same state. We denote by $\rho_1\rho_2$ the run resulting from concatenating $\rho_1$ with $\rho_2$. Clearly, we have $\rho[\ell_1,\ell_2]\,\rho[\ell_2,\ell_3] = \rho[\ell_1,\ell_3]$ for all locations $\ell_1 \unlhd \ell_2 \unlhd \ell_3$.

Normalization. Without loss of generality, we will assume that successful runs of functional transducers are normalized, meaning that they never visit two locations with the same position, the same state, and both either at even or at odd level. Indeed, if this were not the case, say if a successful run $\rho$ visited two locations $\ell_1 = (x,y)$ and $\ell_2 = (x,y')$ such that $\rho(\ell_1) = \rho(\ell_2)$ and $y,y'$ are both even or both odd, then the output produced by $\rho$ between $\ell_1$ and $\ell_2$ should be empty, as otherwise by repeating the factor $\rho[\ell_1,\ell_2]$ of $\rho$ we could obtain successful runs that produce different outputs on the same input, thus contradicting the assumption that the transducer is functional. Now that we know that the output of $\rho$ produced between $\ell_1$ and $\ell_2$ is empty, we could drop the factor $\rho[\ell_1,\ell_2]$, thus obtaining a successful run with the same output. It is easy to see that, in every normalized successful run, the crossing sequences have length at most $2|Q|-1$. We define $h_{\max} = 2|Q|-1$. Moreover, by $c_{\max}$ we denote the capacity of the transducer, which is the maximal length of the output of a transition.

III. TWO-WAY TRANSDUCERS VS ONE-WAY TRANSDUCERS

In this section we state our main result, which is the existence of an elementary algorithm for checking whether a two-way transducer is equivalent to some one-way transducer. We call such transducers one-way definable. Before stating our result, we give a few examples.

Example 1. We consider two-way transducers that accept any input $u$ from a given regular language $R$ and output the word $uu$. We will argue how, depending on $R$, these transducers may or may not be one-way definable.
1) If $R = (a+b)^*$ there is no equivalent one-way transducer, as the output language is not regular. If $R$ is finite, then the transduction mapping $u \in R$ to $uu$ can be implemented by a one-way transducer that guesses $u$ (this requires as many states as the size of $R$), checks the input, and outputs two copies of the guessed word.
2) A special case of transduction with finite domain is given by $R_n = \{a_0 w_0 \cdots a_{2^n-1} w_{2^n-1} : a_0,\dots,a_{2^n-1} \in \{a,b\}\}$, where $n \in \mathbb{N}$ and each $w_i$ is the binary encoding of the counter $i = 0,\dots,2^n-1$. It is easy to see (cf. Proposition 15 [9]) that the transduction mapping $u \in R_n$ to $uu$ can be implemented by a two-way transducer with quadratically many states w.r.t. $n$, while every equivalent one-way transducer has at least $2^{2^n}$ states, since it needs to guess a word of length $2^n$.
3) Consider now the periodic language $R = (abc)^*$. The function that maps $u \in R$ to $uu$ can be easily implemented by a one-way transducer: it suffices to output two letters (i.e., $ab$, $ca$, $bc$, in turn) for each input letter, while checking that the input is in $R$.

Example 2. We consider a slightly more complicated transduction that is defined on input words of the form $u_1 \# \dots \# u_n$, where each factor $u_i$ is over the alphabet $\Sigma = \{a,b,c\}$. The output of the transduction is of the form $w_1 \# \dots \# w_n$, where each $w_i$ is either $u_i u_i$ or just $u_i$, depending on whether or not $u_i \in (abc)^*$ and $u_{i+1}$ has even length, with $u_{n+1} = \varepsilon$.

The obvious way to implement the transduction is by means of a two-way transducer that performs multiple passes on the factors of the input: a first left-to-right pass is performed on $u_i \# u_{i+1}$ to produce the first copy of $u_i$ and to check whether $u_i \in (abc)^*$ and $|u_{i+1}|$ is even; if so, a second pass on $u_i$ is performed to produce another copy of $u_i$.

The transduction can also be implemented by a one-way transducer: when entering a factor $u_i$, the transducer guesses whether or not $u_i \in (abc)^*$ and $|u_{i+1}|$ is even; depending on this it outputs either $(abcabc)^{|u_i|/3}$ or $u_i$, and checks that the guess is correct.

Our main result is:

Theorem 3. There is an algorithm that from a functional two-way transducer $T$ constructs in triple exponential time a one-way transducer $T'$ with the following properties:
• $T' \subseteq T$,
• $\mathrm{dom}(T) = \mathrm{dom}(T')$ iff $T$ is one-way definable.
Moreover, the second property above can be checked in double exponential space w.r.t. $|T|$.

We remark that a similar characterization for a much more restricted class of transducers (sweeping transducers) appeared in [9]. The proof of Theorem 3, however, is more technical, as it requires a better understanding of the structure of the runs of two-way transducers and a non-trivial generalization of the combinatorial arguments from [9].

The proof of the theorem spans along the next three sections. In Section IV, we present the basic concepts for reasoning on runs of two-way automata. This includes the definition of a finite semigroup for describing the shapes of two-way runs, as well as Ramsey-type arguments that are used to bound the length of the outputs produced by pieces of runs without loops. In Section V we provide the main combinatorial arguments for characterizing one-way definability. The crucial notion will be that of inversion, that captures behaviours of the two-way transducer that are problematic for one-way definability. Finally, in Section VI we exploit the combinatorial results and the Ramsey-type arguments to derive the existence of suitable decompositions of runs that lead to the construction of equivalent one-way transducers.

IV. UNTANGLING RUNS OF TWO-WAY TRANSDUCERS

This section is devoted to untangling the structure of runs of two-way transducers. Whereas the classical transformation of two-way automata into one-way automata based on crossing sequences is rather simple, we will need a much deeper understanding of runs of two-way transducers, because of the additional outputs. In a nutshell, being one-way definable is related to periodicities (with bounded periods) in the output, and these periodicities are generated by loops in the run. We will actually work with so called idempotent loops, that generate periodicities in the output in a "nice" way. We will derive the existence of idempotent loops with bounded outputs using Ramsey-based arguments.

We fix throughout the paper a functional two-way transducer $T$, an input word $u$, and a successful run $\rho$ of $T$ on $u$. We assume that $\rho$ is normalized, i.e., every state occurs at most once in each crossing sequence of $\rho$ at levels of a given parity.

For simplicity, we denote by $\omega$ the length of the input word $u$. We will consider intervals of positions of the form $I = [x_1,x_2]$, with $0 \le x_1 < x_2 \le \omega$. The containment relation $\subseteq$ on intervals is defined by $[x_3,x_4] \subseteq [x_1,x_2]$ if $x_1 \le x_3 < x_4 \le x_2$.

Factors, flows, and effects. A factor of a run $\rho$ is a contiguous subsequence of $\rho$. A factor intercepted by an interval $I = [x_1,x_2]$ is a maximal factor of $\rho$ that visits only positions $x \in I$, and never uses a left transition from position $x_1$ or a right transition from position $x_2$.

Fig. 2 gives an example of an interval $I$ that intercepts the factors $\alpha,\beta,\gamma,\delta,\zeta$. The numbers that annotate the endpoints of the factors represent their levels.

[Fig. 2. Intercepted factors.]

Every factor $\alpha$ intercepted by an interval $I = [x_1,x_2]$ is of one of the four types below, depending on its first location $(x,y)$ and its last location $(x',y')$:
• $\alpha$ is an LL-factor if $x = x' = x_1$,
• $\alpha$ is an RR-factor if $x = x' = x_2$,
• $\alpha$ is an LR-factor if $x = x_1$ and $x' = x_2$,
• $\alpha$ is an RL-factor if $x = x_2$ and $x' = x_1$.
In Fig. 2 we see that $\alpha$ is an LL-factor, $\beta,\delta$ are LR-factors, $\zeta$ is an RR-factor, and $\gamma$ is an RL-factor.

Definition 4. Let $\rho$ be a run and $I = [x_1,x_2]$ an interval of $\rho$. Let $h_i$ be the length of the crossing sequence $\rho|x_i$ for both $i = 1$ and $i = 2$.
The flow $F_I$ of $I$ is a directed graph with set of nodes $\{0,\dots,\max(h_1,h_2)-1\}$ and set of edges consisting of all $(y,y')$ such that there exists a factor of $\rho$ intercepted by $I$ that starts at location $(x_i,y)$ and ends at location $(x_j,y')$, for $i,j \in \{1,2\}$.
The effect $E_I$ of $I$ is the triple $(F_I,c_1,c_2)$, where $c_i = \rho|x_i$ is the crossing sequence at $x_i$.

For example, the interval $I$ of Fig. 2 has the flow graph $0 \mapsto 1 \mapsto 3 \mapsto 4 \mapsto 2 \mapsto 0$. It is easy to see that every node of a flow $F_I$ has at most one incoming and at most one outgoing edge. More precisely, if $y < h_1$ is even, then it has one outgoing edge (corresponding to an LR- or LL-factor intercepted by $I$), and if it is odd it has one incoming edge (corresponding to an RL- or LL-factor intercepted by $I$). Similarly, if $y < h_2$ is even, then it has one incoming edge (corresponding to an LR- or RR-factor), and if it is odd it has one outgoing edge (corresponding to an RL- or RR-factor).

In the following we consider generic effects that are not necessarily associated with intervals of specific runs. The definition of such effects should be clear: these are triples consisting of a graph (called flow) and two crossing sequences of lengths $h_1,h_2 \le h_{\max}$, with sets of nodes of the form $\{0,\dots,\max(h_1,h_2)-1\}$, that satisfy the in/out-degree properties stated above.

It is convenient to distinguish the edges in a flow based on the parity of the source and target nodes. Formally, we partition any flow $F$ into the following subgraphs:
• $F_{LR}$ consists of all edges of $F$ between pairs of even nodes,
• $F_{RL}$ consists of all edges of $F$ between pairs of odd nodes,
• $F_{LL}$ consists of all edges of $F$ from an even node to an odd node,
• $F_{RR}$ consists of all edges of $F$ from an odd node to an even node.

We denote by $\mathcal{F}$ (resp. $\mathcal{E}$) the set of all flows (resp. effects) augmented with a dummy element $\bot$. We equip both sets $\mathcal{F}$ and $\mathcal{E}$ with a semigroup structure, where the corresponding products $\circ$ and $\odot$ are defined below (similar definitions appear in [18]). We need this semigroup structure in order to identify idempotent loops, that play a crucial role in our characterization of one-way definability.

Definition 5. For two graphs $G,G'$, we denote by $G \cdot G'$ the graph with edges of the form $(y,y'')$ such that $(y,y')$ is an edge of $G$ and $(y',y'')$ is an edge of $G'$, for some node $y'$ that belongs to both $G$ and $G'$. Similarly, we denote by $G^*$ the graph with edges $(y,y')$ such that there exists a (possibly empty) path in $G$ from $y$ to $y'$.
The product of two flows $F,F'$ is the unique flow $F \circ F'$ (if it exists) such that:
• $(F \circ F')_{LR} = F_{LR} \cdot (F'_{LL} \cdot F_{RR})^* \cdot F'_{LR}$,
• $(F \circ F')_{RL} = F'_{RL} \cdot (F_{RR} \cdot F'_{LL})^* \cdot F_{RL}$,
• $(F \circ F')_{LL} = F_{LL} \,\cup\, F_{LR} \cdot (F'_{LL} \cdot F_{RR})^* \cdot F'_{LL} \cdot F_{RL}$,
• $(F \circ F')_{RR} = F'_{RR} \,\cup\, F'_{RL} \cdot (F_{RR} \cdot F'_{LL})^* \cdot F_{RR} \cdot F'_{LR}$.
If no flow $F \circ F'$ exists with the above properties, then we let $F \circ F' = \bot$.
The product of two effects $E = (F,c_1,c_2)$ and $E' = (F',c'_1,c'_2)$ is either the effect $E \odot E' = (F \circ F',c_1,c'_2)$ or the dummy element $\bot$, depending on whether $F \circ F' \neq \bot$ and $c_2 = c'_1$.

For example, let $F$ be the flow of interval $I$ in Fig. 2. Then $(F \circ F)_{LL} = \{(0,1),(2,3)\}$, $(F \circ F)_{RR} = \{(1,2),(3,4)\}$, and $(F \circ F)_{LR} = \{(4,0)\}$; one can quickly verify this with the help of Fig. 3.

[Fig. 3. Pumping a loop in a run. The panels show the interval $I$, one additional copy of $I$, and 2 copies of $I$.]

It is also easy to see that $(\mathcal{F},\circ)$ and $(\mathcal{E},\odot)$ are finite semigroups, and that for every run $\rho$ and every pair of consecutive intervals $I = [x_1,x_2]$ and $J = [x_2,x_3]$ of $\rho$, $F_{I \cup J} = F_I \circ F_J$ and $E_{I \cup J} = E_I \odot E_J$. In particular, the function $E$ that associates each interval $I$ of $\rho$ with the corresponding effect $E_I$ can be seen as a semigroup homomorphism.

Note that, in a normalized successful run, there are at most $|Q|^{h_{\max}}$ distinct crossing sequences and at most $4^{h_{\max}}$ distinct flows, since there are at most $h_{\max}$ edges in a flow, and each one has one of the 4 possible types LL, ..., RR. Hence there are at most $(2|Q|)^{2h_{\max}}$ distinct effects.

Loops and components. Loops of a two-way run are the basic building blocks for characterizing one-way definability. We will consider special types of loops, called idempotent loops, when showing that outputs generated in non left-to-right manner are essentially periodic.

Definition 6. A loop of $\rho$ is an interval $L = [x_1,x_2]$ whose endpoints have the same crossing sequences, i.e. $\rho|x_1 = \rho|x_2$. It is said to be idempotent if $E_L = E_L \odot E_L$ and $E_L \neq \bot$.

For example, the interval $I$ of Fig. 2 is a loop, if one assumes that the crossing sequences at the borders of $I$ are the same. However, by comparing with Fig. 3, it is easy to see that $I$ is not idempotent. On the other hand, the loop consisting of 2 copies of $I$ is idempotent.

Given a loop $L = [x_1,x_2]$ and a number $m \in \mathbb{N}$, we can introduce $m$ new copies of $L$ and connect the intercepted factors in the obvious way. Fig. 3 shows how to do this for $m = 1$ and $m = 2$. The operation that we just described is called pumping, and results in a new run of the transducer $T$ on the word

$\mathrm{pump}_L^{m+1}(u) := u[0,x_1] \cdot \big(u[x_1+1,x_2]\big)^{m+1} \cdot u[x_2+1,n]$.

We denote by $\mathrm{pump}_L^{m+1}(\rho)$ the pumped$^{1}$ run on $\mathrm{pump}_L^{m+1}(u)$.

$^{1}$ Using similar constructions, one could remove a loop $L$ from a run $\rho$, resulting in the run $\mathrm{pump}_L^{0}(\rho)$. As we do not need this, the operation $\mathrm{pump}_L$ will always be parametrized by a positive number $m+1$.

The goal in this section is to describe the shape of the pumped run $\mathrm{pump}_L^{m+1}(\rho)$ (and the produced output as well) when $L$ is an idempotent loop. We will focus on idempotent loops because pumping non-idempotent loops may induce permutations of factors that are difficult to handle. For example, if we consider again the non-idempotent loop $I$ to the left of Fig. 3, the factor of the run between $\beta$ and $\gamma$ (to the right of $I$, highlighted in red) precedes the factor between $\gamma$ and $\delta$ (to the left of $I$, again in red), but this ordering is reversed when a new copy of $I$ is added.

When pumping a loop $L$, subsets of factors intercepted by $L$ are glued together to form longer factors intercepted by the unioned copies of $L$. The concept of component that we introduce below aims at identifying the groups of factors that are glued together.

Definition 7. A component of a loop $L$ is any strongly connected component of its flow $F_L$ (note that this is also a cycle, since every node in it has in/out-degree 1). Given a component $C$, we denote by $\min(C)$ (resp. $\max(C)$) the minimum (resp. maximum) node in $C$. We say that $C$ is left-to-right (resp. right-to-left) if $\min(C)$ is even (resp., odd). An $(L,C)$-factor is a factor of the run that is intercepted by $L$ and corresponds to an edge of $C$.

[Fig. 4. Pumping an idempotent loop with three components. The panels show the loop $L$ and 2 copies of $L$.]

For example, the loop $I$ of Fig. 3 contains a single component $C = \{0 \mapsto 1 \mapsto 3 \mapsto 4 \mapsto 2 \mapsto 0\}$ which is left-to-right. Another example is given in Fig. 4, where the loop $L$ has three components $C_1,C_2,C_3$ (ordered from bottom to top): $\alpha_1,\alpha_2,\alpha_3$ are the $(L,C_1)$-factors, $\beta_1,\beta_2,\beta_3$ are the $(L,C_2)$-factors, and $\gamma_1$ is the unique $(L,C_3)$-factor.

We will usually list the $(L,C)$-factors based on their order of occurrence in the run.
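The flow product of Definition 5 and the components of Definition 7 are easy to experiment with on small graphs. The following Python sketch is not part of the paper's construction: the function names (`compose`, `star`, `split`, `flow_product`, `components`) and the encoding of a flow as a set of (source, target) pairs are assumptions made here for illustration only, and the check that the result is a well-formed flow (the case where the product is the dummy element) is omitted. It recomputes the example given after Definition 5 for the flow of the interval I of Fig. 2.

```python
from itertools import product


def compose(g, h):
    """G . G': all (y, y'') with (y, y') in G and (y', y'') in G'."""
    return {(a, d) for (a, b), (c, d) in product(g, h) if b == c}


def star(g, nodes):
    """G*: pairs joined by a possibly empty path of G (reflexive on `nodes`)."""
    closure = {(n, n) for n in nodes}
    frontier = set(closure)
    while frontier:
        frontier = compose(frontier, g) - closure
        closure |= frontier
    return closure


def split(flow):
    """Partition the edges by parity of source and target (LR, RL, LL, RR)."""
    kind = {(0, 0): "LR", (1, 1): "RL", (0, 1): "LL", (1, 0): "RR"}
    parts = {name: set() for name in kind.values()}
    for s, t in flow:
        parts[kind[(s % 2, t % 2)]].add((s, t))
    return parts


def flow_product(f, f2, nodes):
    """Edges of F o F', following the four equations of Definition 5.

    Assumption of this sketch: the well-formedness check that would
    return the dummy element is left out.
    """
    a, b = split(f), split(f2)
    bounce_lr = star(compose(b["LL"], a["RR"]), nodes)  # (F'_LL . F_RR)*
    bounce_rl = star(compose(a["RR"], b["LL"]), nodes)  # (F_RR . F'_LL)*
    lr = compose(compose(a["LR"], bounce_lr), b["LR"])
    rl = compose(compose(b["RL"], bounce_rl), a["RL"])
    ll = a["LL"] | compose(compose(compose(a["LR"], bounce_lr), b["LL"]), a["RL"])
    rr = b["RR"] | compose(compose(compose(b["RL"], bounce_rl), a["RR"]), b["LR"])
    return lr | rl | ll | rr


def components(flow):
    """Cycles of a flow whose nodes all have in/out-degree 1 (Definition 7)."""
    succ = dict(flow)
    seen, cycles = set(), []
    for start in sorted(succ):
        cycle, node = [], start
        while node in succ and node not in seen:
            seen.add(node)
            cycle.append(node)
            node = succ[node]
        if cycle and node == start:
            cycles.append(cycle)
    return cycles


# Flow of the interval I of Fig. 2: 0 -> 1 -> 3 -> 4 -> 2 -> 0.
F = {(0, 1), (1, 3), (3, 4), (4, 2), (2, 0)}
NODES = range(5)
FF = flow_product(F, F, NODES)
print(split(FF))
print(FF == F, flow_product(FF, FF, NODES) == FF)
print([(c, "left-to-right" if min(c) % 2 == 0 else "right-to-left")
       for c in components(F)])
```

The LL, RR and LR parts computed for F∘F agree with the values stated after Definition 5. The sketch also reports that F∘F differs from F while F∘F is its own square, which matches the remark after Definition 6 at the level of flows only (crossing sequences are ignored here), and it finds the single left-to-right component 0, 1, 3, 4, 2 of the flow.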
The following lemma (proved in the appendix) describes the precise shape and order of the $(L,C)$-factors when the loop $L$ is idempotent. It can be used to reason on the shape of runs obtained by pumping idempotent loops.

Lemma 8. If $C$ is a left-to-right (resp. right-to-left) component of an idempotent loop $L$, then the $(L,C)$-factors are in the following order: $k$ LL-factors (resp. RR-factors), followed by one LR-factor (resp. RL-factor), followed by $k$ RR-factors (resp. LL-factors), for some $k \ge 0$.

We also need to introduce the notions of anchor (Def. 9) and trace (Def. 10).

Definition 9. Let $C$ be a component of an idempotent loop $L = [x_1,x_2]$. The anchor of $C$ inside $L$, denoted$^{2}$ $\mathrm{an}(C)$, is either the location $(x_1,\max(C))$ or the location $(x_2,\max(C))$, depending on whether $C$ is left-to-right or right-to-left.

$^{2}$ In denoting the anchor (and similarly the trace) of a component $C$ inside a loop $L$, we omit the annotation specifying $L$, since this is often understood from the context.

Intuitively, the anchor $\mathrm{an}(C)$ of a component $C$ of $L$ is the source location of the unique LR- or RL-factor intercepted by $L$ that corresponds to an edge of $C$ (recall Lemma 8).

Definition 10. Let $C$ be a component of some idempotent loop $L$ and let $(i_0,i_1),(i_1,i_2),\dots,(i_{k-1},i_k),(i_k,i_{k+1})$ be a cycle of $C$, where $i_0 = i_{k+1} = \max(C)$. For every $j = 0,\dots,k$, let $\beta_j$ be the factor intercepted by $L$ that corresponds to the edge $(i_j,i_{j+1})$ of $C$. The trace of $C$ inside $L$ is the run $\mathrm{tr}(C) = \beta_0\,\beta_1 \cdots \beta_k$ (note that this is not necessarily a factor of the original run $\rho$).

Intuitively, the trace $\mathrm{tr}(C)$ is obtained by concatenating the $(L,C)$-factors together, where the first factor is the (unique) LR-/RL-factor that starts at the anchor $\mathrm{an}(C)$ and the remaining ones are the LL-factors interleaved with the RR-factors.

For example, by referring again to the components $C_1,C_2,C_3$ of Fig. 4, we have the following traces: $\mathrm{tr}(C_1) = \alpha_2\,\alpha_1\,\alpha_3$, $\mathrm{tr}(C_2) = \beta_2\,\beta_1\,\beta_3$, and $\mathrm{tr}(C_3) = \gamma_1$.

As shown by the following proposition (proved in the appendix), iterations of idempotent loops translate to iterations of traces $\mathrm{tr}(C)$ of components.

Proposition 11. Let $L$ be an idempotent loop of $\rho$ with components $C_1,\dots,C_k$, listed according to the order of their anchors: $\mathrm{an}(C_1) \lhd \cdots \lhd \mathrm{an}(C_k)$. For all $m \in \mathbb{N}$, we have

$\mathrm{pump}_L^{m+1}(\rho) = \rho_0\ \mathrm{tr}(C_1)^m\ \rho_1 \cdots \rho_{k-1}\ \mathrm{tr}(C_k)^m\ \rho_k$

where
• $\rho_0$ is the prefix of $\rho$ that ends at $\mathrm{an}(C_1)$,
• $\rho_i$ is the factor $\rho[\mathrm{an}(C_i),\mathrm{an}(C_{i+1})]$, for all $1 \le i < k$,
• $\rho_k$ is the suffix of $\rho$ that starts at $\mathrm{an}(C_k)$.

For example, referring to the left hand-side of Fig. 4, the run $\rho_0$ goes until the first location marked by a black dot. The runs $\rho_1$ and $\rho_2$, resp., are between the first and the second black dot, and the second and third black dot. Finally, $\rho_3$ is the suffix starting at the last black dot. The pumped run $\mathrm{pump}_L^{m+1}(\rho)$ for $m = 2$ is depicted to the right of Fig. 4.

Ramsey-type arguments. We conclude the section by describing a technique that can be used for bounding the length of the outputs produced by factors of the run $\rho$. This technique is based on Ramsey-type arguments and relies on Simon's "factorization forest" theorem [19, 20], which we recall below.

Let $X$ be a set of positions of $\rho$. A factorization forest for $X$ is an unranked tree, where the nodes are intervals $I$ with endpoints in $X$, labelled with the corresponding effect $E_I$, the ancestor relation is given by the containment order on intervals, the leaves are the minimal intervals $[x_1,x_2]$, with $x_2$ successor of $x_1$ in $X$, and for every internal node $I$ with children $J_1,\dots,J_k$, we have:
• $I = J_1 \cup \cdots \cup J_k$,
• $E_I = E_{J_1} \odot \cdots \odot E_{J_k}$,
• if $k > 2$, then $E_I = E_{J_1} = \cdots = E_{J_k}$ is an idempotent of the semigroup $(\mathcal{E},\odot)$.

We will make use of the following three constants defined from the transducer $T$: the maximum number $c_{\max}$ of letters output by a single transition, the maximal length $h_{\max} = 2|Q|-1$ of a crossing sequence, and the maximal size $e_{\max} = (2|Q|)^{2h_{\max}}$ of the effect semigroup $(\mathcal{E},\odot)$. By $B = c_{\max} \cdot h_{\max} \cdot (2^{3e_{\max}} + 4)$ we will denote the main constant appearing in all subsequent sections.

Theorem 12 (Factorization forest theorem [19, 20]). For every set $X$ of positions of $\rho$, there is a factorization forest for $X$ of height at most $3e_{\max}$.

It is easy to use the above theorem to show that every run that produces an output longer than $B$ contains an idempotent loop with non-empty output. Below, we present a result in the same spirit, but refined in a way that it can be used to find anchors of components of loops inside specific intervals.

In order to state it formally, we need to consider subsequences of $\rho$ induced by sets of locations that are not necessarily intervals. Recall that $\rho[\ell_1,\ell_2]$ denotes the factor of $\rho$ delimited by two locations $\ell_1 \unlhd \ell_2$. Similarly, given any set $Z$ of (possibly non-consecutive) locations, we denote by $\rho \mid Z$ the subsequence of $\rho$ induced by $Z$. A transition of $\rho \mid Z$ is a transition from some $\ell$ to $\ell'$, where both $\ell,\ell'$ belong to $Z$. The output $\mathrm{out}(\rho \mid Z)$ is the concatenation of the outputs of the transitions of $\rho \mid Z$ (in the order given by $\rho$). An example of subrun $\rho \mid Z$ is represented by the thick arrows in the accompanying figure, where $Z = [\ell_1,\ell_2] \cap (I \times \mathbb{N})$.

[Figure: a subrun $\rho \mid Z$ (thick arrows) between the locations $\ell_1$ and $\ell_2$, restricted to the interval $I = [x_1,x_2]$.]

Theorem 13. Let $I = [x_1,x_2]$ be an interval of positions, $K = [\ell_1,\ell_2]$ an interval of locations, and $Z = K \cap (I \times \mathbb{N})$. If $|\mathrm{out}(\rho \mid Z)| > B$, then there exist an idempotent loop $L$ and a component $C$ of $L$ such that
• $x_1 < \min(L) < \max(L) < x_2$ (in particular, $L \subsetneq I$),
• $\ell_1 \lhd \mathrm{an}(C) \lhd \ell_2$ (in particular, $\mathrm{an}(C) \in K$),
• $\mathrm{out}(\mathrm{tr}(C)) \neq \varepsilon$.

[Fig. 5. An inversion with components intercepting the highlighted factors. The figure marks the loops $L_1$, $L_2$ and the anchors $\mathrm{an}(C_1)$, $\mathrm{an}(C_2)$.]

V. INVERSIONS AND PERIODS

As suggested by Examples 1 and 2, a typical phenomenon that may prevent a transducer from being one-way definable is that of an inversion. An inversion essentially corresponds to a long output produced from right to left. The main result in this section is Proposition 16, that shows that the output produced between the locations delimiting an inversion must be periodic, with bounded period.

Definition 14. An inversion of $\rho$ is a tuple $(L_1,C_1,L_2,C_2)$ such that
• $L_i$ is an idempotent loop, for both $i = 1,2$,
• $C_i$ is a component of $L_i$, for both $i = 1,2$,
• $\mathrm{an}(C_1) \unlhd \mathrm{an}(C_2)$,
• $\mathrm{an}(C_i) = (x_i,y_i)$, for both $i = 1,2$, and $x_1 \ge x_2$,
• both $\mathrm{out}(\mathrm{tr}(C_1))$ and $\mathrm{out}(\mathrm{tr}(C_2))$ are non-empty.

Fig. 5 gives an example of an inversion involving the loop $L_1$ with its first component and the loop $L_2$ with its second component (we highlighted the anchors and the factors corresponding to these components).

Definition 15. A word $w = a_1 \cdots a_n$ has period $p$ if $a_i = a_{i+p}$ for all pairs of positions $i,i+p$ of $w$.

For example, $w = abcabcab$ has period 3.

One-way definability of functional two-way transducers essentially amounts to showing that the output produced by every inversion has bounded period. The proposition below shows a slightly stronger periodicity property, which refers to the output produced inside the inversion extended on both sides by the trace outputs. We will need this stronger property later, when dealing with overlapping portions of the run delimited by different inversions.

Proposition 16. If $T$ is one-way definable, then for every inversion $(L_1,C_1,L_2,C_2)$ of a successful run $\rho$ of $T$, the word

$\mathrm{out}\big(\mathrm{tr}(C_1)\big)\ \mathrm{out}\big(\rho[\mathrm{an}(C_1),\mathrm{an}(C_2)]\big)\ \mathrm{out}\big(\mathrm{tr}(C_2)\big)$

has period $p$ that divides both $|\mathrm{out}(\mathrm{tr}(C_1))|$ and $|\mathrm{out}(\mathrm{tr}(C_2))|$. Moreover, $p \le B$.

[…] factor, then they have as period the greatest common divisor of the two original periods. Below, we state a slightly stronger variant of Fine-Wilf's theorem, which contains an additional claim showing how to align a common factor of the words $w_1,w_2$ so as to form a third word $w_3$ that contains a prefix of $w_1$ and a suffix of $w_2$. The additional claim will be fully exploited in the proof of Proposition 26.

Lemma 17 (Fine-Wilf's theorem). If $w_1 = w'_1\,w\,w''_1$ has period $p_1$, $w_2 = w'_2\,w\,w''_2$ has period $p_2$, and the common factor $w$ has length at least $p_1 + p_2 - \gcd(p_1,p_2)$, then $w_1$, $w_2$, and $w_3 = w'_1\,w\,w''_2$ have period $\gcd(p_1,p_2)$.

Two further combinatorial results are heavily used in the proof of Proposition 16. The first one is a result of Kortelainen [22], which was later improved and simplified by Saarela [23]. It is related to word equations with iterated factors, like those that arise from considering outputs of pumped versions of a run. To improve readability, we highlight the important iterations of factors inside the considered equations.

Theorem 18 (Theorem 4.3 in [23]). Consider a word equation

$v_0\,v_1^{m}\,v_2 \cdots v_{k-1}\,v_k^{m}\,v_{k+1} \;=\; w_0\,w_1^{m}\,w_2 \cdots w_{k'-1}\,w_{k'}^{m}\,w_{k'+1}$

where $m$ is the unknown and $v_i,w_j$ are words. Then the set of solutions of the equation is either finite or $\mathbb{N}$.

The second combinatorial result considers a word equation with iterated factors parametrized by two unknowns $m_1,m_2$ that occur in opposite order in the left, respectively right hand-side of the equation. This type of equation arises when we compare the output associated with an inversion of $T$ and the output produced by an equivalent one-way transducer $T'$.

Lemma 19. Consider a word equation of the form

$v_0^{(m_1,m_2)}\,v_1^{m_1}\,v_2^{(m_1,m_2)}\,v_3^{m_2}\,v_4^{(m_1,m_2)} \;=\; w_0\,w_1^{m_2}\,w_2\,w_3^{m_1}\,w_4$

where $m_1,m_2$ are the unknowns, $v_1,v_3$ are non-empty words, and $v_0^{(m_1,m_2)}, v_2^{(m_1,m_2)}, v_4^{(m_1,m_2)}$ are words that may contain factors of the form $v^{m_1}$ or $v^{m_2}$, for a generic word $v$. If the above equation holds for all $m_1,m_2 \in \mathbb{N}$, then the words $v_1\,v_1^{m_1}\,v_2^{(m_1,m_2)}\,v_3^{m_2}\,v_3$ are periodic with period $\gcd(|v_1|,|v_3|)$, for all $m_1,m_2 \in \mathbb{N}$.

The last ingredient used in the proof of Proposition 16 is a bound on the period of the output produced by an inversion. For this, we introduce a suitable notion of minimality of loops and loop components:

Definition 20. Consider pairs $(L,C)$ consisting of an idempotent loop $L$ and a component $C$ of $L$.
• On such pairs, we define the relation $\subset$ by $(L',C') \subset (L,C)$ if $L' \subsetneq L$ and at least one $(L',C')$-factor is
contained in some pL,Cq-factor. ‚ ApairpL,Cqisoutput-minimalifforallpairspL1,C1qĂ The basic combinatorial argument for proving Proposi- pL,Cq, we have outptrpC1qq“ε. tion 16 is a classical result in word combinatorics called Fine and Wilf’s theorem [21]. Essentially, the theorem says that, Note that the relation Ă is not a partial order in general (it whenever two periodic words w ,w share a sufficiently long is however antisymmetric). Lemma 21 below shows that the 1 2 7 length of the output trace of C inside L is bounded whenever (cid:96)2 (cid:96)2 pL,Cq is output-minimal. Lemma 21. For every output-minimal pair pL,Cq, Z(cid:96)(cid:1)x Z(cid:0) |outptrpCqq|ďB. (cid:96) x Proof sketch. We use a Ramsey-type argument here: if Z Z |outptrpCqq|ąB, then Theorem 13 can be applied to exhibit (cid:1)(cid:96)x (cid:1) (cid:96) (cid:96) an idempotent loop strictly inside L and a component C of it 1 1 Fig.6. Outputsthatneedtobeboundedinadiagonalandinablock. withnon-emptytraceoutput.Thiswouldcontradicttheoutput- minimality of pL,Cq. VI. ONE-WAYDEFINABILITY We remark that the above lemma cannot be used directly to bound the period of the output produced Proposition 16 is the main combinatorial argument for by an inversion. The reason is that we cannot assume characterizingtwo-waytransducersthatareone-waydefinable. that inversions are built up from output-minimal pairs. In this section we provide the remaining arguments. Roughly, Acounter-exampleisgivenin the idea is to decompose every successful run ρ into factors the figure to the right, which thatproducelongoutputseitherinaleft-to-rightmanner(“di- shows a run where the only anpC2q agonals”), or based on an almost periodic pattern (“blocks”). inversion pL ,C ,L ,C q We say that a word w is almost periodic with bound p if 1 1 2 2 contains pairs that are not anpC1q w “ w0 w1 w2 for some words w0,w2 of length at most p output-minimal: the factors and some word w1 of period at most p. 
that produce long outputs are We illustrate the following definition in Fig. 6. those in red, but they occur Definition 22. Consider a factor ρr(cid:96) ,(cid:96) s of the run, where outside ρranpC q,anpC qs. L2 L1 1 2 1 2 (cid:96) “px ,y q, (cid:96) “px ,y q, and x ďx . We call ρr(cid:96) ,(cid:96) s 1 1 1 2 2 2 1 2 1 2 We are now ready to prove Proposition 16. Here we only ‚ a diagonal if for all x P rx1,x2s, there is a location (cid:96)x present the key ideas, and refer the reader to the appendix for at position x such that (cid:96) (cid:2) (cid:96) (cid:2) (cid:96) and the words 1 x 2 (cid:1) more details. outpρ | Z q and outpρ | Z q have length at most B, Proof sketch of Proposition 16. In the first half of the proof `wrhxe,rωesZˆ(cid:96)(cid:1)xN(cid:96)x“˘;r(cid:96)x,(cid:96)2sX`r0,(cid:1)x(cid:96)sxˆN˘andZ(cid:1)(cid:96)x “r(cid:96)1,(cid:96)xsX we pump the two loops L and L so that we obtain also 1 2 loops in the assumed equivalent one-way transducer T1. We ‚ a block if the word outpρr(cid:96)1,(cid:96)2sq is almost periodic with then consider the outputs of the pumped runs of T and T1, bound B, and outpρ|Z(cid:0)q and outpρ`|Z(cid:1)q have l˘ength which contain iterated factors parametrized by two natural at most B, where`Z(cid:0) “ r(cid:96)1,˘(cid:96)2s X r0,x1sˆN and Z “r(cid:96) ,(cid:96) s X rx ,ωsˆN . numbers m1,m2. As those outputs must agree due to the (cid:1) 1 2 2 equivalence of T,T1, we get an equation as in Lemma 19, Intuitively, the output of a diagonal ρr(cid:96) ,(cid:96) s can be simulated 1 2 where the word v1 belongs to outptrpC1qq` and the word while scanning the input interval rx1,x2s from left to right, v3 belongs to outptrpC2qq`. 
Lemma 19 shows that the word sincetheoutputsofρ|Z (cid:1)andρ|Z arebounded.Asimilar describedbytheequationhasperiodpdividinggcdp|v1|,|v3|q, argument applies to a blo(cid:96)xck ρr(cid:96)1,(cid:96)(cid:1)2s(cid:96)x, where in addition, one and Lemma 17 shows that p even divides |outptrpC1qq| and exploits the fact that the output is almost periodic. Roughly, |outptrpC2qq|. Finally, we use Theorem 18 to transfer the the idea is that one can simulate the output of a block by periodicity property from the word of the equation to the outputting symbols according to a periodic pattern, and in a word w “ outptrpC1qqoutpρranpC1q,anpC2qsqoutptrpC2qq number that is determined from the transitions on urx1,x2s produced by the original run of T. This is possible because and the guessed (bounded) outputs on Z and Z . (cid:0) (cid:1) the word of the equation is obtained by iterating factors of w. The general idea for turning a two-way transducer T into In particular, by reasoning separately on the parameters that an equivalent one-way transducer T1 is to guess (and check) define those iterations, and by stating the periodicity property a factorization of a successful run of T into factors that are as an equation in the form required by Theorem 18, one can either diagonals or blocks, and properly arranged following prove that the periodicity equation holds on all parameters, the order of positions. and thus in particular on w. Inthesecondhalfoftheproofweshowthattheperiodpis Dśefinition 23. A decomposition of ρ is a factorization ρr(cid:96) ,(cid:96) s of ρ into diagonals and blocks, where (cid:96) “ bounded by B. This requires a refinement of the previous ar- i i i`1 i px ,y q and x ăx for all i. gumentsandinvolvespumpingtherunofT simultaneouslyon i i i i`1 three different loops. The idea is that by pumping we manage The one-way transducer T1 whose existence is stated by to find inversions with some output-minimal pair pL0,C0q. 
In Theorem 3 simulates T precisely on those inputs u that this way we show that the period p also divides outptrpC0qq, have some successful run admitting a decomposition. To which is bounded by B according to Lemma 21. provide further intuition on the notion of decomposition, we 8 S˚-class is an interval of locations witnessed by a series (cid:96) 5 of inversions pL ,C ,L ,C q such that anpC q (cid:2) (cid:96) 2i 2i 2i`1 2i`1 2i 4 anpC q(cid:2)anpC q(cid:2)anpC q. 2i`2 2i`1 2i`3 (cid:96)3 The next result exploits the shape of a non-singleton S˚- class, the assumption that ρ satisfies the periodicity property (cid:96) 2 statedinP2,andLemma17,toshowthattheoutputproduced inside an S˚-class has bounded period. (cid:96) 1 Proposition 26. If ρ satisfies the periodicity property stated u # u # u # u 1 2 3 4 in P`2 and (cid:96)˘(cid:2)(cid:96)1 are two locations in the same S˚-class, then Fig.7. Adecompositionofarunofatwo-waytransducer. out ρr(cid:96),(cid:96)1s has period at most B. The S˚-classes considered so far cannot be directly used consider again the transduction of Example 2 and the two- as blocks for the desired decomposition of ρ, since the x- way transducer T that implements it in the most natural way. coordinates of their endpoints might not be in the appropriate Fig.7showsanexampleofarunofT onaninputoftheform order. The next definition takes care of this, by enlarging the u1 #u2 #u3 #u4, where u2,u4 P pabcq˚, u1u3 R pabcq˚, S˚-classes according to x-coordinates of anchors. and u has even length. The factors of the run that produce 3 long outputs are highlighted by the bold arrows. The first and Definition 27. Let K “ r(cid:96),(cid:96)1s be a non-singleton S˚-class, third factors of the decomposition, i.e. 
ρr(cid:96) ,(cid:96) s and ρr(cid:96) ,(cid:96) s, let anpKq be the restriction of K to the locations that are 1 2 3 4 are diagonals (represented by the blue hatched areas); the anchors of components of inversions, and let XanpKq “ tx : second and fourth factors ρr(cid:96) ,(cid:96) s and ρr(cid:96) ,(cid:96) s are blocks Dypx,yqPanpKqu be the projection of anpKq on positions. 2 3 4 5 (represented by the red hatched areas). We define blockpKq“r(cid:96)1,(cid:96)2s, where Theorem 24. Let T be a functional two-way transducer. The ‚ (cid:96)1 is` the la˘test location px,yq (cid:2) (cid:96) such that x “ min X , following are equivalent: anpKq P1) T is one-way definable. ‚ (cid:96)2 is`the ear˘liest location px,yq (cid:4) (cid:96)1 such that x “ max X P2) For all inversions pL ,C ,L ,C q of all successful runs anpKq 1 1 2 2 (notethatthelocation(cid:96) existssince(cid:96)istheanchorofthefirst of T, the word 1 ` ˘ ` ˘ ` ˘ component of an inversion, and (cid:96) exists for similar reasons). 2 out trpC q out ρranpC q,anpC qs out trpC q 1 1 2 2 Lemma 28. If K “ r(cid:96),(cid:96)1s is a non-singleton S˚-class, then has period pďB dividing |outptrpC1qq|, |outptrpC2qq|. ρr(cid:96)1,(cid:96)2s is a block, where r(cid:96)1,(cid:96)2s“blockpKq. P3) Every successful run of T admits a decomposition. Proof sketch. The periodicity of outpρr(cid:96),(cid:96)1sq is obtained by The implication from P1 to P2 was already shown in applying Proposition 26. Then Theorem 13 is applied twice: Proposition 16. The rest of this section is devoted to prove first to bound outpρr(cid:96) ,(cid:96)sq and outpρr(cid:96)1,(cid:96) sq (hence proving 1 2 the implications from P2 to P3 and from P3 to P1. The that outpρr(cid:96) ,(cid:96) sq is almost periodic with bound B), and 1 2 issues related to the complexity of the characterization will second, to bound outpρ|Z q and outpρ|Z q, as introduced (cid:0) (cid:1) be discussed further below. in Definition 22. 
From periodicity to existence of decompositions (P2ÑP3). Thenextlemmashowsthatblocksdonotoverlapalongthe As usual, we fix a successful run ρ of T. We will prove input axis: a slightly stronger result than the implication from P2 to P3, namely: if every inversion of ρ satisfies the periodicity Lemma 29. Suppose that K1 and K2 are two different non- property stated in P2, then ρ admits a decomposition (note singleton S˚-classes such that (cid:96) (cid:1) (cid:96)1 for all (cid:96) P K1 and that this is independent of whether other runs satisfy or not (cid:96)1 P K2. Let blockpK1q “ r(cid:96)1,(cid:96)2s and blockpK2q “ r(cid:96)3,(cid:96)4s, P2). To identify the blocks of a possible decomposition of ρ with (cid:96)2 “px2,y2q and (cid:96)3 “px3,y3q. Then x2 ăx3. weconsiderasuitableequivalencerelationbetweenlocations: Proof sketch. If x ě x , one can exhibit an inversion 2 3 Definition 25. A location (cid:96) is covered by an inversion between a component of a loop in K and another one in 1 pL1,C1,L2,C2q if anpC1q (cid:2) (cid:96) (cid:2) anpC2q. We define the K2, and deduce that K1 “K2. relation S by letting (cid:96) S (cid:96)1 if (cid:96),(cid:96)1 are covered by the For the sake of brevity, we call S˚-block any factor of the same inversion. We define the equivalence relation S˚ as the form ρ|blockpKq that is obtained by applying Definition 27 reflexive and transitive closure of S. to a non-singleton S˚-class K. The results obtained so far Locations covered by the same inversion pL ,C ,L ,C q imply that every location covered by an inversion is also 1 1 2 2 yield an interval w.r.t. the run ordering (cid:2). Thus every non- covered by an S˚-block (Lemma 28), and that the order of singleton S˚-class can be seen as a union of such intervals, occurrence of S˚-blocks is the same as the order of positions say K ,...,K , that are two-by-two overlapping, namely, (Lemma 29). So the S˚-blocks can be used as factors for 1 m K XK ‰H for all iăm. 
In particular, a non-singleton the decomposition of ρ we are looking for. Below, we show i i`1 9 that the remaining factors of ρ, which do not overlap the S˚- Proof. Consider an input word u. By Theorem 24 we know blocks, are diagonals. This will complete the construction of that u P UA iff there exist a successful run ρ of T on u and a decomposition of ρ. an inversion I “ pL ,C ,L ,C q of ρ such that no positive 1 1 2 2 Formally, we say that a factor ρr(cid:96) ,(cid:96) s overlaps another number pďB is a period of the word 1 2 factorρr(cid:96) ,(cid:96) sifr(cid:96) ,(cid:96) sXr(cid:96) ,(cid:96) s‰H,(cid:96) ‰(cid:96) ,and(cid:96) ‰(cid:96) . ` ˘ ` ˘ ` ˘ 3 4 1 2 3 4 2 3 1 4 w “ out trpC q out ρranpC q,anpC qs out trpC q . ρ,I 1 1 2 2 Lemma30. Letρr(cid:96) ,(cid:96) sbeafactorofρthatdoesnotoverlap 1 2 anyS˚-block,with(cid:96)1 “px1,y1q,(cid:96)2 “px2,y2q,andx1 ăx2. The latter condition on wρ,I can be rephrased as follows: Then ρr(cid:96)1,(cid:96)2s is a diagonal. there is a function f :t1,...,BuÑt1,...,|wρ,I|u such that w rfppqs‰w rfppq`ps for all positive numbers pďB. ρ,I ρ,I lPorcoaotifosnke(cid:96)t1ch.(cid:2)If(cid:96)ρ(cid:2)r(cid:96)1(cid:96),2(cid:96)2sforiswnhoitcha|doiuatgpoρna|l,Zw(cid:96)(cid:1)eq|cąanBfindanda Remeacxal“lthpa2t|QB|q“2hmcamx,axa¨hndmaQx¨pi2s3etmhaex`st4aqt,ewshpearceehomfaxth“e t2w|Qo-|w´a1y, |outpρ | Z q| ą B (recall Definition 22). By applying again transducer T. This means that the run ρ, the inversion I, (cid:1)(cid:96) Theorem 13, we derive the existence of an inversion between and the function f described above can all be guessed within (cid:96)1 and (cid:96)2, and thus of an S˚-block overlapping ρr(cid:96)1,(cid:96)2s. double exponential space, namely, using a number of states that is at most a triple exponential w.r.t. |T|. In particular, we From decompositions to one-way definability (P3ÑP1). can construct in 2EXPSPACE an NFA recognizing UA. 
Hereafter,wedenotebyU thelanguageofwordsuPdompTq As a consequence of the previous lemma and of Theo- suchthatallsuccessfulrunsofT onuadmitadecomposition. rem24,wehavethattheemptinessofthelanguagedompTqX So far, we know that if T is one-way definable (P1), then UA, and hence the one-way definability of T, can be decided U “ dompTq (P3). This reduces the one-way definability problem for T to the containment problem dompTq Ď U. in 2EXPSPACE: We will see later how the latter problem can be decided in Corollary 33. The problem of deciding whether a functional double exponential space by further reducing it to checking two-way transducer is one-way definable is in 2EXPSPACE. the emptiness of the intersection of the languages dompTq and UA, where UA is the complement of U. VII. DEFINABILITYBYSWEEPINGTRANSDUCERS Below, we show how to construct a one-way transducer T1 Atwo-waytransduceriscalledsweepingifeverysuccessful of triple exponential size such that runofitperformsreversalsonlyattheextremitiesoftheinput T1 Ď T and dompT1q Ě U. word,i.e.whenreadingthesymbols$or%.Similarly,wecall it k-pass sweeping if it is sweeping and every successful run In particular, the existence of such a transducer T1 proves performs at most k´1 reversals. Clearly, a 1-pass sweeping the implication from P3 to P1 of Theorem 24. It also proves transducer is the same as a one-way transducer. the second item of Theorem 3, because when T is one-way In this section we are considering the following question: definable, U “dompTq, and hence T and T1 are equivalent. givenafunctionaltwo-waytransducer,isitequivalenttosome Intuitively, given an input u, the one-way transducer T1 k-pass sweeping transducer? We call such transducers k-pass will guess a successful run ρ of T on u and a decomposition sweeping definable. If the parameter k is not given a priori, of ρ, and then use the decomposition to simulate the output then we denote them as sweeping definable transducers. produced by ρ. 
Note that T1 accepts at least all the words of In[10]webuiltuponthecharacterizationofone-waydefin- U, possibly more. As a matter of fact, it would be difficult to ability for (the restricted class of) sweeping transducers [9] in construct a transducer whose domain coincides with U, since order to determine the minimal number of passes required by checkingmembershipinU involvesauniversalquantification. sweeping transductions. Essentially, the idea was to consider The proof of the following result is in the appendix. a generalization of the notion of inversion, called k-inversion, and proving that k-pass sweeping definability is equivalent to Proposition 31. Given a functional two-way transducer T, asking that every k-inversion generates a periodic output. one can construct in 3EXPTIME a one-way transducer T1 We show that we can follow the same approach for such that T1 ĎT and dompT1qĚU. two-way transducers. More precisely, we first define a co- inversion in a way similar to Definition 14, namely, as a tuple Deciding one-way definability. Recall that T is one-way pL ,C ,L ,C q consisting of two idempotent loops L ,L , definable iff dompTq Ď U, so iff dompTqXUA “ H. The 1 1 2 2 1 2 a component C of L , and a component C of L such that lemma below exploits the characterization of Theorem 24 to 1 1 2 2 show that the language UA can be recognized by an NFA ‚ anpC1q(cid:2)anpC2q, UA of triple exponential size. The lemma actually shows that ‚ outptrpC1qq,outptrpC2qq‰ε, and the NFA recognizing UA can be constructed using double ‚ anpCiq“pxi,yiq for i“1,2, then x1 ďx2. exponential workspace. The only difference compared to inversions is the ordering of the positions of the anchors, which is now reversed. Lemma 32. Given a functional two-way transducer T, one Alternating inversions and co-inversions leads to: can construct in 2EXPSPACE an NFA recognizing UA. 10
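As a side illustration of Definition 15 and of the length bound in Lemma 17, the following Python sketch checks periods of small words by brute force. It is only a sketch under stated assumptions: the helper names `has_period`, `periods`, and `fine_wilf_bound` are hypothetical and do not come from the paper, and the code covers only the classical single-word form of Fine and Wilf's theorem, not the additional alignment claim about $w_3$.

```python
from __future__ import annotations

from math import gcd


def has_period(w: str, p: int) -> bool:
    """Definition 15: w has period p if w[i] == w[i + p]
    for every pair of positions i, i + p inside w."""
    if p <= 0:
        raise ValueError("a period must be a positive integer")
    # range(len(w) - p) is empty when p >= len(w), so the check is vacuous there.
    return all(w[i] == w[i + p] for i in range(len(w) - p))


def periods(w: str) -> list[int]:
    """All periods p of w with 1 <= p <= len(w), in increasing order."""
    return [p for p in range(1, len(w) + 1) if has_period(w, p)]


def fine_wilf_bound(p1: int, p2: int) -> int:
    """Length threshold p1 + p2 - gcd(p1, p2) used in Lemma 17."""
    return p1 + p2 - gcd(p1, p2)


if __name__ == "__main__":
    # The example word from Definition 15.
    print(has_period("abcabcab", 3))

    # A word of length 8 = fine_wilf_bound(4, 6) with periods 4 and 6
    # also has period gcd(4, 6) = 2.
    long_enough = "abababab"
    print(periods(long_enough))

    # One letter shorter than the bound: periods 4 and 6, but not 2.
    too_short = "abaaaba"
    print(periods(too_short))
```

For instance, `periods("abcabcab")` yields 3, 6 and 8, while the word `abaaaba` of length 7 has periods 4 and 6 but not 2, which suggests that the threshold $p_1 + p_2 - \gcd(p_1,p_2) = 8$ cannot be lowered for these two periods.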
