
Regular Expression Subtyping for XML Query and Update Languages



James Cheney
University of Edinburgh

arXiv:0801.0714v1 [cs.PL] 4 Jan 2008

Abstract. XML database query languages such as XQuery employ regular expression types with structural subtyping. Subtyping systems typically have two presentations, which should be equivalent: a declarative version in which the subsumption rule may be used anywhere, and an algorithmic version in which the use of subsumption is limited in order to make typechecking syntax-directed and decidable. However, the XQuery standard type system circumvents this issue by using imprecise typing rules for iteration constructs and defining only algorithmic typechecking, and another extant proposal provides more precise types for iteration constructs but ignores subtyping. In this paper, we consider a core XQuery-like language with a subsumption rule and prove the completeness of algorithmic typechecking; this is straightforward for XQuery proper but requires some care in the presence of more precise iteration typing disciplines. We extend this result to an XML update language we have introduced in earlier work.

1 Introduction

The Extensible Markup Language (XML) is a World Wide Web Consortium (W3C) standard for tree-structured data. Regular expression types for XML [13] have been studied extensively in XML processing languages such as XDuce [12] and CDuce [1], as well as projects to extend general-purpose programming languages with XML features such as Xtatic [9] and OCamlDuce [8].

Several other W3C standards, such as XQuery, address the use of XML as a general format for representing data in databases. Static typechecking is important in XML database applications because type information is useful for optimizing queries and avoiding expensive run-time checks and revalidation. The XQuery standard [5] provides for structural subtyping based on regular expression types.

However, XQuery’s type system is imprecise in some situations involving iteration (for-expressions). In particular, if the variable $x has type¹ a[b[]∗, c[]?], then the XQuery expression

    for $y in $x/* return $y

has type (b[]|c[])∗ in XQuery, but in fact the result will always match the regular expression type b[]∗, c[]?. The reason for this inaccuracy is that XQuery’s type system typechecks a for loop by converting the type of the body of the expression (here, $x/* with type b[]∗, c[]?) to the “factored” form (α1|...|αn)^q, where q is a quantifier such as ?, +, or ∗ and each αi is an atomic type (i.e. a data type such as string or single element type a[τ]).

¹ We use the notation for regular expression types from Hosoya, Vouillon and Pierce [13] in preference to the more verbose XQuery or XML Schema syntaxes.

More precise type systems have been contemplated for XQuery-like languages, including a precursor to XQuery designed by Fernandez, Siméon, and Wadler [7]. More recently, Colazzo et al. [4] have introduced a core XQuery language called µXQ, equipped with a regular expression-based type system that provides more precise types for iterations using techniques similar to those in [7]. In µXQ, the above expression can be assigned the more accurate type b[]∗, c[]?.

Accurate typing for iteration constructs is especially important in typechecking XML updates. We are developing a statically-typed update language called FLUX [3] in which ideas from µXQ are essential for typechecking updates involving iteration. Using XQuery-style factoring for iteration in FLUX would make it impossible to typecheck updates that modify data without modifying the overall schema of the database, which is a very common case. For example, using XQuery-style factoring for iteration in FLUX, we would not be able to verify statically that given a database of type a[b[string]∗, c[]?], an update that modifies the text inside some of the b elements produces an output that is still of type a[b[string]∗, c[]?], rather than a[(b[string]|c[])∗].

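The loss of precision caused by factoring can be made concrete with a small sketch. The code below is not taken from the paper or from the XQuery specification: the class names, the `atoms`/`bounds`/`factor` helpers, and the capped length bounds used to pick the quantifier are all assumptions chosen for illustration, and only element atoms are modelled. It collects the atomic types of a type and a single quantifier, which is roughly what the factored form (α1|...|αn)^q retains.

```python
from dataclasses import dataclass

# Hypothetical AST for regular expression types (names are illustrative).
@dataclass(frozen=True)
class Elem:            # n[tau]
    label: str
    content: object

@dataclass(frozen=True)
class Empty:           # ()
    pass

@dataclass(frozen=True)
class Seq:             # tau1, tau2
    left: object
    right: object

@dataclass(frozen=True)
class Alt:             # tau1 | tau2
    left: object
    right: object

@dataclass(frozen=True)
class Star:            # tau*
    body: object

MANY = 2  # stands for "more than one item"

def atoms(t):
    """Collect the top-level atomic types of t, in order, without duplicates."""
    if isinstance(t, Elem):
        return [t]
    if isinstance(t, Empty):
        return []
    if isinstance(t, Star):
        return atoms(t.body)
    out = atoms(t.left)                      # Seq or Alt
    return out + [a for a in atoms(t.right) if a not in out]

def bounds(t):
    """Return (min, max) sequence length of t, capped at 1 and MANY."""
    if isinstance(t, Elem):
        return (1, 1)
    if isinstance(t, Empty):
        return (0, 0)
    if isinstance(t, Star):
        return (0, MANY if bounds(t.body)[1] > 0 else 0)
    (lo1, hi1), (lo2, hi2) = bounds(t.left), bounds(t.right)
    if isinstance(t, Seq):
        return (min(lo1 + lo2, 1), min(hi1 + hi2, MANY))
    return (min(lo1, lo2), max(hi1, hi2))    # Alt

def factor(t):
    """Return (atoms, quantifier) describing the factored form (a1|...|an)^q."""
    quant = {(1, 1): "", (0, 1): "?", (1, MANY): "+", (0, MANY): "*", (0, 0): ""}
    return atoms(t), quant[bounds(t)]

# The type b[]*, c[]? from the text, with c[]? written as c[] | ().
b = Elem("b", Empty())
c = Elem("c", Empty())
t = Seq(Star(b), Alt(c, Empty()))
factored_atoms, q = factor(t)
```

For the example type, `factor` yields the atoms b[] and c[] with quantifier `*`, that is (b[]|c[])∗: the ordering constraint "all b elements precede an optional c" is no longer expressed.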
One question left unresolved in previous work on both µXQ and FLUX is the relationship between declarative and algorithmic presentations of the type system (in the terminology of [14, Ch. 15–16]). Declarative derivations permit arbitrary uses of the subsumption rule:

$$\frac{\Gamma \vdash e : \tau \qquad \tau <: \tau'}{\Gamma \vdash e : \tau'}$$

whereas algorithmic derivations limit the use of this rule in order to ensure that typechecking is syntax-directed and decidable. The declarative and algorithmic presentations of a system should agree. If they do, then declarative typechecking is decidable; if they disagree, then the algorithmic system is incomplete relative to the high-level declarative system: it rejects programs that should typecheck.

The XQuery standard circumvented this issue by directly defining typechecking to be algorithmic. In contrast, neither subsumption nor subtyping was considered in µXQ, in part because subtyping interacts badly with µXQ’s “path correctness” analysis (as argued by Colazzo et al. [4], Section 4.4). Subsumption was considered in our initial work on FLUX [3], but we were initially unable to establish that declarative typechecking was decidable, even in the absence of recursion in types, queries, or updates.

In this paper we consider declarative typechecking for µXQ and FLUX extended with recursive types, recursive functions, and recursive update procedures. To establish that typechecking remains decidable, it suffices (following Pierce [14, Ch. 16]) to define an algorithmic typechecking judgment and prove its completeness; that is, that declarative derivations can always be normalized to algorithmic derivations. For XQuery proper, this appears straightforward because of the use of factoring when typechecking iterations. However, for µXQ’s more precise iteration type discipline, completeness of algorithmic typechecking does not follow by the “obvious” structural induction. Instead, we must establish a stronger property by considering the structure of regular expression types. We also extend these results to FLUX.

The structure of the rest of the paper is as follows.
Section 2 reviews regular expression types and subtyping. Section 3 introduces the core language µXQ, discusses examples highlighting the difficulties involving subtyping in µXQ, and proves decidability of declarative typechecking. We also review the FLUX core update language in Section 4, discuss examples, and extend the proof of decidability of declarative typechecking to FLUX. Sections 5–6 sketch related and future work and conclude.

2 Background

For the purposes of this paper, XML values are trees built up out of booleans b ∈ Bool = {true, false}, strings w ∈ Σ∗ over some alphabet Σ, and labels l, m, n ∈ Lab, according to the following syntax:

$$\bar{v} ::= b \mid w \mid n[v] \qquad\qquad v ::= \bar{v}, v \mid ()$$

Values include tree values v̄ ∈ Tree and forest values v ∈ Val. We write v, v′ for the result of appending two forest values (considered as lists).

We consider a regular expression type system with structural subtyping, similar to those considered in several transformation and query languages for XML [13, 4, 7]. The syntax of types and type environments is as follows.

$$\begin{aligned}
\text{Atomic types} \quad & \alpha ::= \mathsf{bool} \mid \mathsf{string} \mid n[\tau] \\
\text{Sequence types} \quad & \tau ::= \alpha \mid () \mid \tau|\tau' \mid \tau,\tau' \mid \tau^* \mid X \\
\text{Type definitions} \quad & \tau_0 ::= \alpha \mid () \mid \tau_0|\tau_0' \mid \tau_0,\tau_0' \mid \tau_0^* \\
\text{Type signatures} \quad & E ::= \cdot \mid E, \mathsf{type}\ X = \tau_0
\end{aligned}$$

We call types of the form α ∈ Atom atomic types (or sometimes tree or singular types), and types τ ∈ Type of all other forms sequence types (or sometimes forest or plural types). A value of singular type must always be a sequence of length one (that is, a tree); plural types may have values of any length. There exist plural types with only values of length one, but which are not syntactically singular (for example string|bool). As usual, the + and ? quantifiers can be defined as follows: τ+ = τ, τ∗ and τ? = τ|(). We abbreviate n[()] as n[].

Note that in contrast to Hosoya et al. [13], but following Colazzo et al. [4], we include both Kleene star and type variables. In [13], it was shown that Kleene star can be translated away by introducing type variables and definitions, modulo a syntactic restriction on top-level occurrences of type variables. In contrast, we allow Kleene star, but further restrict type variables.
Recursive and mutually recursive declarations are allowed, but type variables may not appear at the top level of a type definition τ0: for example, type X = nil[] | cons[a[], X] and type Y = leaf[] | node[X, X] are allowed but type X′ = () | a[], X′ and type Y′ = b[] | Y′, Y′ are not. The equation for X′ defines the regular tree language a[]∗, and would be permitted in XDuce, while that for Y′ defines a context-free tree language that is not regular.

An environment E is well-formed if all type variables appearing in definitions are themselves declared in E. Given a well-formed environment E, we write E(X) for the definition of X. A type denotes the set of values [[τ]]_E, defined as follows.

$$\begin{aligned}
[\![\mathsf{string}]\!]_E &= \Sigma^* \\
[\![\mathsf{bool}]\!]_E &= \mathit{Bool} \\
[\![()]\!]_E &= \{()\} \\
[\![n[\tau]]\!]_E &= \{\, n[v] \mid v \in [\![\tau]\!]_E \,\} \\
[\![X]\!]_E &= [\![E(X)]\!]_E \\
[\![\tau|\tau']\!]_E &= [\![\tau]\!]_E \cup [\![\tau']\!]_E \\
[\![\tau,\tau']\!]_E &= \{\, v,v' \mid v \in [\![\tau]\!]_E,\ v' \in [\![\tau']\!]_E \,\} \\
[\![\tau^*]\!]_E &= \{()\} \cup \{\, v_1,\ldots,v_n \mid v_1 \in [\![\tau]\!]_E, \ldots, v_n \in [\![\tau]\!]_E \,\}
\end{aligned}$$

Formally, [[τ]]_E must be defined by a least fixed point construction which we take for granted. Henceforth, we treat E as fixed and define [[τ]] = [[τ]]_E.

In addition, we define a binary subtyping relation on types. A type τ1 is a subtype of τ2 (τ1 <: τ2), by definition, if [[τ1]] ⊆ [[τ2]]. Our types can be translated to XDuce types, so subtyping reduces to XDuce subtyping; although this problem is EXPTIME-complete in general, the algorithm of [13] is well-behaved in practice. Therefore, we shall not give explicit inference rules for checking or deciding subtyping, but treat it as a “black box”.

3 Query language

We review an XQuery-like core language based on µXQ [4]. In µXQ, we distinguish between tree variables x̄ ∈ TVar, introduced by for, and forest variables, x ∈ Var, introduced by let. We write x̂ ∈ Var ∪ TVar for an arbitrary variable.
The other syntactic classes of our variant of µXQ include booleans, strings, and labels introduced above, function names F ∈ FSym, expressions e ∈ Expr, and programs p ∈ Prog; the abstract syntax of expressions and programs is defined as follows:

$$\begin{aligned}
e ::={} & () \mid e,e' \mid n[e] \mid w \mid x \mid \mathsf{let}\ x = e\ \mathsf{in}\ e' \mid F(e_1,\ldots,e_n) \\
\mid{} & b \mid \mathsf{if}\ c\ \mathsf{then}\ e\ \mathsf{else}\ e' \mid \bar{x} \mid \bar{x}/\mathsf{child} \mid e :: n \mid \mathsf{for}\ \bar{x} \in e\ \mathsf{return}\ e' \\
p ::={} & \mathsf{query}\ e : \tau \mid \mathsf{declare\ function}\ F(x_1{:}\tau_1,\ldots,x_n{:}\tau_n) : \tau\ \{e\};\ p
\end{aligned}$$

The distinguished variables x̄ in for x̄ ∈ e return e′(x̄) and x in let x = e in e′(x) are bound in e′(x). Here and elsewhere, we employ common conventions such as considering expressions containing bound variables equivalent up to α-renaming and employing a richer concrete syntax including parentheses.

To simplify the presentation, we split µXQ’s projection operation x̄/child::l into two expressions: child projection (x̄/child), which returns the children of x̄, and node name filtering (e::n), which evaluates e to an arbitrary sequence and selects the nodes labeled n. Thus, the ordinary child axis expression x̄/child::n is syntactic sugar for (x̄/child)::n and the “wildcard” child axis is definable as x̄/child::∗ = x̄/child. Built-in operations such as string equality may be provided as additional functions F.

Colazzo et al. [4] provided a denotational semantics of µXQ queries with the descendant axis but without recursive functions. This semantics is sound with respect to the typing rules in the next section and can be extended to handle recursive functions using operational techniques (as in the XQuery standard). However, we omit the semantics since it is not needed in the rest of the paper.

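Although the semantics is omitted, the split of x̄/child::n into child projection and name filtering can be illustrated on values. The following is only a sketch under assumed conventions that are not in the paper: an element n[v] is encoded as a Python pair `(label, forest)`, a forest is a tuple of trees, strings and booleans stand for themselves, and the function names are hypothetical.

```python
def child(tree):
    """x/child: the children of an element; other trees are given no children
    here (an assumption, since the typing rule only covers element types)."""
    if isinstance(tree, tuple):
        _label, children = tree
        return children
    return ()

def name_filter(forest, n):
    """e::n: keep the trees of the forest that are elements labelled n."""
    return tuple(t for t in forest if isinstance(t, tuple) and t[0] == n)

def child_axis(tree, n):
    """x/child::n, expressed as the sugar (x/child)::n described above."""
    return name_filter(child(tree), n)

# A value of type a[b[]*, c[]?]: <a><b/><b/><c/></a>
x = ("a", (("b", ()), ("b", ()), ("c", ())))
bs = child_axis(x, "b")
everything = child(x)   # the wildcard axis x/child::* is just x/child
```

Here `bs` contains the two b elements and `everything` is the full child list, matching the reading of the wildcard axis as plain child projection.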
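Judgment Γ ⊢ e : τ

$$\frac{\bar{x}{:}\alpha \in \Gamma}{\Gamma \vdash \bar{x} : \alpha} \qquad \frac{x{:}\tau \in \Gamma}{\Gamma \vdash x : \tau} \qquad \frac{}{\Gamma \vdash w : \mathsf{string}} \qquad \frac{b \in \mathit{Bool}}{\Gamma \vdash b : \mathsf{bool}}$$

$$\frac{}{\Gamma \vdash () : ()} \qquad \frac{\Gamma \vdash e : \tau}{\Gamma \vdash n[e] : n[\tau]} \qquad \frac{\Gamma \vdash e : \tau \qquad \Gamma \vdash e' : \tau'}{\Gamma \vdash e,e' : \tau,\tau'} \qquad \frac{\Gamma \vdash e_1 : \tau_1 \qquad \Gamma, x{:}\tau_1 \vdash e_2 : \tau_2}{\Gamma \vdash \mathsf{let}\ x = e_1\ \mathsf{in}\ e_2 : \tau_2}$$

$$\frac{\Gamma \vdash c : \mathsf{bool} \qquad \Gamma \vdash e_1 : \tau_1 \qquad \Gamma \vdash e_2 : \tau_2}{\Gamma \vdash \mathsf{if}\ c\ \mathsf{then}\ e_1\ \mathsf{else}\ e_2 : \tau_1|\tau_2} \qquad \frac{\bar{x}{:}n[\tau] \in \Gamma}{\Gamma \vdash \bar{x}/\mathsf{child} : \tau} \qquad \frac{\Gamma \vdash e : \tau \qquad \tau :: n \Rightarrow \tau'}{\Gamma \vdash e :: n : \tau'}$$

$$\frac{\Gamma \vdash e_1 : \tau_1 \qquad \Gamma \vdash \bar{x}\ \mathsf{in}\ \tau_1 \to e_2 : \tau_2}{\Gamma \vdash \mathsf{for}\ \bar{x} \in e_1\ \mathsf{return}\ e_2 : \tau_2} \qquad \frac{F(\vec{\tau}) : \tau_0 \in \Delta \qquad \Gamma \vdash e_i : \tau_i}{\Gamma \vdash F(\vec{e}) : \tau_0} \qquad \frac{\Gamma \vdash e : \tau \qquad \tau <: \tau'}{\Gamma \vdash e : \tau'}$$

Judgment Γ ⊢ p prog

$$\frac{\Gamma \vdash e : \tau}{\Gamma \vdash \mathsf{query}\ e : \tau\ \ \mathsf{prog}} \qquad \frac{F \text{ not declared in } p \qquad F(\vec{\tau}) : \tau_0 \in \Delta \qquad \Gamma, \vec{x}{:}\vec{\tau} \vdash e : \tau_0 \qquad \Gamma \vdash p\ \mathsf{prog}}{\Gamma \vdash \mathsf{declare\ function}\ F(\vec{x}{:}\vec{\tau}) : \tau_0\ \{e\};\ p\ \ \mathsf{prog}}$$

Fig. 1. Query and program well-formedness rules

Judgment τ :: n ⇒ τ′

$$\frac{}{n[\tau] :: n \Rightarrow n[\tau]} \qquad \frac{\alpha \neq n[\tau]}{\alpha :: n \Rightarrow ()} \qquad \frac{}{() :: n \Rightarrow ()} \qquad \frac{E(X) :: n \Rightarrow \tau}{X :: n \Rightarrow \tau} \qquad \frac{\tau_1 :: n \Rightarrow \tau_2}{\tau_1^* :: n \Rightarrow \tau_2^*}$$

$$\frac{\tau_1 :: n \Rightarrow \tau_1' \qquad \tau_2 :: n \Rightarrow \tau_2'}{\tau_1,\tau_2 :: n \Rightarrow \tau_1',\tau_2'} \qquad \frac{\tau_1 :: n \Rightarrow \tau_1' \qquad \tau_2 :: n \Rightarrow \tau_2'}{\tau_1|\tau_2 :: n \Rightarrow \tau_1'|\tau_2'}$$

Judgment Γ ⊢ x̄ in τ → e : τ′

$$\frac{}{\Gamma \vdash \bar{x}\ \mathsf{in}\ () \to e : ()} \qquad \frac{\Gamma \vdash \bar{x}\ \mathsf{in}\ E(X) \to e : \tau}{\Gamma \vdash \bar{x}\ \mathsf{in}\ X \to e : \tau} \qquad \frac{\Gamma, \bar{x}{:}\alpha \vdash e : \tau}{\Gamma \vdash \bar{x}\ \mathsf{in}\ \alpha \to e : \tau} \qquad \frac{\Gamma \vdash \bar{x}\ \mathsf{in}\ \tau_1 \to e : \tau_2}{\Gamma \vdash \bar{x}\ \mathsf{in}\ \tau_1^* \to e : \tau_2^*}$$

$$\frac{\Gamma \vdash \bar{x}\ \mathsf{in}\ \tau_1 \to e : \tau_1' \qquad \Gamma \vdash \bar{x}\ \mathsf{in}\ \tau_2 \to e : \tau_2'}{\Gamma \vdash \bar{x}\ \mathsf{in}\ \tau_1,\tau_2 \to e : \tau_1',\tau_2'} \qquad \frac{\Gamma \vdash \bar{x}\ \mathsf{in}\ \tau_1 \to e : \tau_1' \qquad \Gamma \vdash \bar{x}\ \mathsf{in}\ \tau_2 \to e : \tau_2'}{\Gamma \vdash \bar{x}\ \mathsf{in}\ \tau_1|\tau_2 \to e : \tau_1'|\tau_2'}$$

Fig. 2. Auxiliary judgments

3.1 Type system

Our type system for queries is essentially that introduced for µXQ by [4], excluding the path correctness component. We consider typing environments Γ and global declaration environments ∆, defined as follows:

$$\Gamma ::= \cdot \mid \Gamma, x{:}\tau \mid \Gamma, \bar{x}{:}\alpha \qquad\qquad \Delta ::= \cdot \mid \Delta, F(\vec{\tau}) : \tau_0$$

Note that in Γ, tree variables may only be bound to atomic types. As usual, we assume that variables in type environments are distinct; this convention implicitly constrains all inference rules. We also write Γ <: Γ′ to indicate that dom(Γ) = dom(Γ′) and Γ(x̂) <: Γ′(x̂) for all x̂ ∈ dom(Γ).

The main typing judgment for queries is Γ ⊢ e : τ; we also define a program well-formedness judgment Γ ⊢ p prog which typechecks the bodies of functions. Following [4], there are two auxiliary judgments, Γ ⊢ x̄ in τ → e : τ′, used for typechecking for-expressions, and τ :: n ⇒ τ′, used for typechecking label matching expressions e::n. The rules for these judgments are shown in Figures 1 and 2.

We consider the typing rules to be implicitly parameterized by a fixed global declaration environment ∆.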
Functions in XQuery have global scope so we assume that the declarations for all the functions declared in the program have already been added to ∆ by a preprocessing pass. Additional declarations for built-in functions might be included in ∆ as well.

The rules involving type variables in Figure 2 look up the variable’s definition in E. These judgments only inspect the top level of a type; they do not inspect the contents of element types n[τ]. Since type definitions τ0 have no top-level type variables, both judgments are terminating. (This was argued in detail by Colazzo et al. [4, Lem. 4.6].)

3.2 Examples

We first revisit the example in the introduction in order to illustrate the operation of the rules. Recall that x̄/∗ is translated to x̄/child in our core language. Writing Γ0 for the environment x̄:a[b[]∗, c[]?], the derivation is

$$\dfrac{\Gamma_0 \vdash \bar{x}/\mathsf{child} : b[]^*, c[]? \qquad \dfrac{D}{\Gamma_0 \vdash \bar{y}\ \mathsf{in}\ b[]^*, c[]? \to \bar{y} : b[]^*, c[]?}}{\Gamma_0 \vdash \mathsf{for}\ \bar{y} \in \bar{x}/\mathsf{child}\ \mathsf{return}\ \bar{y} : b[]^*, c[]?}$$

where the subderivation D is

$$D = \dfrac{\dfrac{\dfrac{\Gamma_0, \bar{y}{:}b[] \vdash \bar{y} : b[]}{\Gamma_0 \vdash \bar{y}\ \mathsf{in}\ b[] \to \bar{y} : b[]}}{\Gamma_0 \vdash \bar{y}\ \mathsf{in}\ b[]^* \to \bar{y} : b[]^*} \qquad \dfrac{\dfrac{\Gamma_0, \bar{y}{:}c[] \vdash \bar{y} : c[]}{\Gamma_0 \vdash \bar{y}\ \mathsf{in}\ c[] \to \bar{y} : c[]}}{\Gamma_0 \vdash \bar{y}\ \mathsf{in}\ c[]? \to \bar{y} : c[]?}}{\Gamma_0 \vdash \bar{y}\ \mathsf{in}\ b[]^*, c[]? \to \bar{y} : b[]^*, c[]?}$$

Note that this derivation does not use subsumption anywhere. Suppose we wished to show that the expression has type b[]∗, (c[]?|d[]∗), a supertype of the above type. There are several ways to do this: first, we can simply use subsumption at the end of the derivation. Alternatively, we could have used subsumption in one of the subderivations such as x̄:a[b[]∗, c[]?], ȳ:c[]? ⊢ ȳ : c[]?, to conclude, for example, that x̄:a[b[]∗, c[]?], ȳ:c[]? ⊢ ȳ : c[]?|d[]∗. This is valid since c[]? <: c[]?|d[]∗.

Suppose, instead, that we actually wanted to show that the above expression has type (b[d[]∗]|c[]?)∗, also a supertype of the derived type. There are again several ways of doing this. Besides using subsumption at the end of the derivation, we might have used it on x̄:a[b[]∗, c[]?] ⊢ x̄/child : b[]∗, c[]? to obtain x̄:a[b[]∗, c[]?] ⊢ x̄/child : (b[d[]∗]|c[]?)∗. To complete the derivation, we would then need to replace derivation D with D′ (again writing Γ0 for x̄:a[b[]∗, c[]?]):

$$D' = \dfrac{\dfrac{\dfrac{\Gamma_0, \bar{y}{:}b[d[]^*] \vdash \bar{y} : b[d[]^*]}{\Gamma_0 \vdash \bar{y}\ \mathsf{in}\ b[d[]^*] \to \bar{y} : b[d[]^*]} \qquad \dfrac{\dfrac{\Gamma_0, \bar{y}{:}c[] \vdash \bar{y} : c[]}{\Gamma_0 \vdash \bar{y}\ \mathsf{in}\ c[] \to \bar{y} : c[]}}{\Gamma_0 \vdash \bar{y}\ \mathsf{in}\ c[]? \to \bar{y} : c[]?}}{\Gamma_0 \vdash \bar{y}\ \mathsf{in}\ b[d[]^*]|c[]? \to \bar{y} : b[d[]^*]|c[]?}}{\Gamma_0 \vdash \bar{y}\ \mathsf{in}\ (b[d[]^*]|c[]?)^* \to \bar{y} : (b[d[]^*]|c[]?)^*}$$

Not only does D′ have a different structure than D, but it also requires subderivations that were not syntactically present in D.

The above example illustrates why eliminating uses of subsumption is tricky. If subsumption is used to weaken the type of the first argument of a for-expression according to τ1′ <: τ1, then we need to know that we can transform the corresponding derivation D of Γ ⊢ x̄ in τ1 → e : τ2 to a derivation D′ of Γ ⊢ x̄ in τ1′ → e : τ2′ for some τ2′ <: τ2. But as illustrated above, the derivations D and D′ may bear little resemblance to one another.

Now we consider typechecking a recursive query. Suppose we have type Tree = tree[leaf[string] | node[Tree∗]] and function definition

    declare function leaves(x : Tree) : leaf[string]∗ {
      x/leaf, for z̄ ∈ x/node/∗ return leaves(z̄)
    };

This uses a construct e/n that is not in core µXQ, but we can expand e/n to for ȳ ∈ e return ȳ/child::n; thus, we can derive a rule

$$\frac{\Gamma \vdash e : l[\tau] \qquad \tau :: n \Rightarrow \tau'}{\Gamma \vdash e/n : \tau'} \iff \dfrac{\Gamma \vdash e : l[\tau] \qquad \dfrac{\dfrac{\Gamma, \bar{y}{:}l[\tau] \vdash \bar{y}/\mathsf{child} : \tau \qquad \tau :: n \Rightarrow \tau'}{\Gamma, \bar{y}{:}l[\tau] \vdash \bar{y}/\mathsf{child} :: n : \tau'}}{\Gamma \vdash \bar{y}\ \mathsf{in}\ l[\tau] \to \bar{y}/\mathsf{child} :: n : \tau'}}{\Gamma \vdash \mathsf{for}\ \bar{y} \in e\ \mathsf{return}\ \bar{y}/\mathsf{child} :: n : \tau'}$$

Using this derived rule and the fact that x : Tree and the definition of Tree, we can see that x/leaf : leaf[string] and x/node : node[Tree∗], and so x/node/∗ : tree[leaf[string]|node[Tree∗]]∗. So each iteration of the for-loop can be typechecked with z̄ : tree[leaf[string]|node[Tree∗]]. To check the function call leaves(z̄), we need subsumption to see that tree[leaf[string]|node[Tree∗]] <: Tree. It follows that leaves(z̄) : leaf[string]∗, so the for-loop has type (leaf[string]∗)∗. Again using subsumption, we can conclude that

$$x/\mathsf{leaf},\ \mathsf{leaves}(x/\mathsf{node}/{*}) : \mathsf{leaf}[\mathsf{string}], (\mathsf{leaf}[\mathsf{string}]^*)^* <: \mathsf{leaf}[\mathsf{string}]^*.$$

Notice that although we could have used subsumption in several more places, we really needed it in only two places: when typechecking a function call, and when checking the result of a function against its declared type.

3.3 Decidability

The standard approach (see e.g. Pierce [14, Ch. 16]) to deciding declarative typechecking is to define algorithmic judgments that are syntax-directed and decidable, and then show that the algorithmic system is complete relative to the declarative system.

Definition 1 (Algorithmic derivations). The algorithmic typechecking judgments Γ ⊢◮ e : τ and Γ ⊢◮ x̄ in τ0 → e : τ are defined by taking the rules of Figures 1 and 2, removing the subsumption rule, and replacing the function application rule with

$$\frac{F(\vec{\tau}) : \tau \in \Delta \qquad \Gamma \vdash_{\blacktriangleright} e_i : \tau_i' \qquad \tau_i' <: \tau_i}{\Gamma \vdash_{\blacktriangleright} F(\vec{e}) : \tau}$$

It is straightforward to show that algorithmic derivability is decidable and sound with respect to the declarative system:

Lemma 1 (Decidability). For any x̄, e, n, there exist computable partial functions f_n, g_e, h_{x̄,e} such that for any Γ, τ0, we have:
1. f_n(τ0) is the unique τ such that τ0 :: n ⇒ τ.
2. g_e(Γ) is the unique τ such that Γ ⊢◮ e : τ, when it exists.
3. h_{x̄,e}(Γ, τ0) is the unique τ such that Γ ⊢◮ x̄ in τ0 → e : τ, when it exists.

Theorem 1 (Algorithmic Soundness). (1) If Γ ⊢◮ e : τ is derivable then Γ ⊢ e : τ is derivable. (2) If Γ ⊢◮ x̄ in τ0 → e : τ is derivable then Γ ⊢ x̄ in τ0 → e : τ is derivable.

The corresponding completeness property (the main result of this section) is:

Theorem 2 (Algorithmic Completeness). (1) If Γ ⊢ e : τ then there exists τ′ <: τ such that Γ ⊢◮ e : τ′. (2) If Γ ⊢ x̄ in τ1 → e : τ2 then there exists τ2′ <: τ2 such that Γ ⊢◮ x̄ in τ1 → e : τ2′.

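To make part 1 of Lemma 1 concrete, the function f_n can be read off the rules for τ :: n ⇒ τ′ in Figure 2, one case per type constructor. The sketch below is an illustration only: the tagged-tuple encoding of types, the dictionary `env` standing for the signature E, and the name `filter_type` are assumptions made here, not notation from the paper.

```python
# Types as tagged tuples (an encoding assumed for this sketch):
#   ("elem", n, t)   n[t]        ("base", "string") or ("base", "bool")
#   ("empty",)       ()          ("seq", t1, t2)    t1, t2
#   ("alt", t1, t2)  t1 | t2     ("star", t)        t*
#   ("var", "X")     type variable, looked up in env (the signature E)

def filter_type(t, n, env):
    """Compute the unique t' with t :: n => t', following Figure 2.

    Terminates because type definitions have no top-level type variables,
    so looking up a variable always exposes a constructor.
    """
    tag = t[0]
    if tag == "elem":
        # n[t] :: n => n[t]; any other atomic type filters to ()
        return t if t[1] == n else ("empty",)
    if tag in ("base", "empty"):
        return ("empty",)
    if tag == "star":
        return ("star", filter_type(t[1], n, env))
    if tag in ("seq", "alt"):
        return (tag, filter_type(t[1], n, env), filter_type(t[2], n, env))
    if tag == "var":
        return filter_type(env[t[1]], n, env)
    raise ValueError(f"unknown type constructor: {tag!r}")

# type Tree = tree[leaf[string] | node[Tree*]]
leaf = ("elem", "leaf", ("base", "string"))
node = ("elem", "node", ("star", ("var", "Tree")))
env = {"Tree": ("elem", "tree", ("alt", leaf, node))}

content = env["Tree"][2]                       # leaf[string] | node[Tree*]
leaf_part = filter_type(content, "leaf", env)  # leaf[string] | ()
```

Note that the result keeps the shape of the input type: filtering leaf[string] | node[Tree∗] by leaf gives leaf[string] | () rather than a simplified type, which is exactly what the rules produce before any subtyping is applied.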
Given a decidable subtyping relation <:, a typical proof of completeness involves showing by induction that occurrences of the subsumption rule can be “permuted” downwards in the proof past other rules, except for function applications. Completeness for µXQ requires strengthening this induction hypothesis. To see why, recall the following rules, in which one premise of each is labeled ∗:

$$\frac{\overset{*}{\Gamma \vdash e_1 : \tau_1} \qquad \Gamma, x{:}\tau_1 \vdash e_2 : \tau_2}{\Gamma \vdash \mathsf{let}\ x = e_1\ \mathsf{in}\ e_2 : \tau_2} \qquad \frac{\overset{*}{\Gamma \vdash e_1 : \tau_1} \qquad \Gamma \vdash \bar{x}\ \mathsf{in}\ \tau_1 \to e_2 : \tau_2}{\Gamma \vdash \mathsf{for}\ \bar{x} \in e_1\ \mathsf{return}\ e_2 : \tau_2} \qquad \frac{\overset{*}{\Gamma \vdash e : \tau} \qquad \tau :: n \Rightarrow \tau'}{\Gamma \vdash e :: n : \tau'}$$

If the subderivation labeled ∗ in the above rules follows by subsumption, however, we cannot do anything to get rid of the subsumption rule using the induction hypotheses provided by Theorem 2. Instead we need an additional lemma that ensures that the judgments are all downward monotonic. Downward monotonicity means, informally, that if we make the “input” types in a derivable judgment smaller, then the judgment remains derivable with a smaller “output” type.

Lemma 2 (Downward monotonicity).
1. If τ1 :: n ⇒ τ2 and τ1′ <: τ1 then τ1′ :: n ⇒ τ2′ for some τ2′ <: τ2.
2. If Γ ⊢◮ e : τ and Γ′ <: Γ then Γ′ ⊢◮ e : τ′ for some τ′ <: τ.
3. If Γ ⊢◮ x̄ in τ1 → e : τ2 and Γ′ <: Γ and τ1′ <: τ1 then Γ′ ⊢◮ x̄ in τ1′ → e : τ2′ for some τ2′ <: τ2.

The downward monotonicity lemma is almost easy to prove by direct structural induction (simultaneously on all judgments). The cases for (2) involving expression-directed typechecking are all straightforward inductive steps; however, for the cases involving type-directed judgments, the induction steps do not go through. The difficulty is illustrated by the following cases. For derivations of the form

$$\frac{\tau_1 :: n \Rightarrow \tau_2}{\tau_1^* :: n \Rightarrow \tau_2^*} \qquad\qquad \frac{\Gamma \vdash \bar{x}\ \mathsf{in}\ \tau_1 \to e : \tau_2}{\Gamma \vdash \bar{x}\ \mathsf{in}\ \tau_1^* \to e : \tau_2^*}$$

we are stuck: knowing that τ1′ <: τ1∗ does not necessarily tell us anything about a subtyping relationship between τ1′ and τ1. For example, if τ1′ = aa and τ1 = a, then we have aa <: a∗ but not aa <: a. Instead, we need to proceed by an analysis of regular expression types and subtyping.

We briefly sketch the argument, which involves an excursion into the theory of regular languages over partially ordered alphabets.
Here, the “alphabet” is the set of atomic types and the regular sets are the sets of sequences of atomic types that are subtypes of a type τ. The homomorphic extension of a (possibly partial) function h : Atom ⇀ Type on atomic types is defined as

$$\begin{aligned}
\hat{h}(()) &= () & \hat{h}(\alpha) &= h(\alpha) & \hat{h}(\tau^*) &= \hat{h}(\tau)^* \\
\hat{h}(\tau_1,\tau_2) &= \hat{h}(\tau_1),\hat{h}(\tau_2) & \hat{h}(\tau_1|\tau_2) &= \hat{h}(\tau_1)|\hat{h}(\tau_2) & \hat{h}(X) &= \hat{h}(E(X))
\end{aligned}$$

(Note again that this definition is well-founded, since type variables cannot be expanded indefinitely.) If h is partial, then ĥ is defined only on types whose atoms are in dom(h). We can then show the following general property of partial homomorphic extensions:

Lemma 3. If h : Atom ⇀ Type is downward monotonic, then its homomorphic extension ĥ : Type ⇀ Type is downward monotonic.

It then suffices to show that f_n and h_{x̄,e} are partial homomorphic extensions of downward monotone functions on atomic types; for f_n, the required function is simple and obviously monotone, and for h_{x̄,e}(Γ, −), the required generating function is g_e(Γ, x̄:(−)). Thus, we need to show that g_e and h_{x̄,e} are downward monotonic and that h_{x̄,e}(Γ, −) is the partial homomorphic extension of g_e(Γ, x̄:(−)) simultaneously by mutual induction. This, finally, is a straightforward induction over derivations. More detailed proofs are included in the appendix.

4 Update language

We now introduce the core FLUX update language, which extends the syntax of queries with statements s ∈ Stmt, procedure names P ∈ PSym, tests φ ∈ Test, directions d ∈ Dir, and two new cases for programs:

$$\begin{aligned}
s ::={} & \mathsf{skip} \mid s; s' \mid \mathsf{if}\ e\ \mathsf{then}\ s\ \mathsf{else}\ s' \mid \mathsf{let}\ x = e\ \mathsf{in}\ s \mid P(\vec{e}) \\
\mid{} & \mathsf{insert}\ e \mid \mathsf{delete} \mid \mathsf{rename}\ n \mid \mathsf{snapshot}\ x\ \mathsf{in}\ s \mid \phi?s \mid d[s] \\
\phi ::={} & n \mid * \mid \mathsf{bool} \mid \mathsf{string} \\
d ::={} & \mathsf{left} \mid \mathsf{right} \mid \mathsf{children} \mid \mathsf{iter} \\
p ::={} & \cdots \mid \mathsf{update}\ s : \tau \Rightarrow \tau' \mid \mathsf{declare\ procedure}\ P(\vec{x}{:}\vec{\tau}) : \tau \Rightarrow \tau'\ \{s\};\ p
\end{aligned}$$

Updates include standard programming constructs such as the no-op skip, sequential composition, conditionals, and let-binding.
The basic update operations include insertion insert e, which inserts a value into an empty part of the database; deletion delete, which deletes part of the database; and rename n, which renames a part of the database provided it is a single tree. The “snapshot” operation snapshot x in s binds x to part of the database and then applies an update s, which may refer to x. Note that the snapshot operation is the only way to read from the current database state.

Updates also include tests φ?s which test the top-level type of a singular value and conditionally perform an update, otherwise do nothing. The node label test n?s checks whether the tree is of type n[τ], and if so executes s; the wildcard test ∗?s checks that the value is a tree. Similarly, bool?s and string?s test whether a value is a boolean or string. The ? operator binds tightly; for example, φ?s; s′ = (φ?s); s′.

Finally, updates include navigation operators that change the selected part of the tree, and perform an update on the sub-selection. The left and right operators perform an update (typically, an insert) on the empty sequence located to the left or right of a value. The children operator applies an update to the child list of a tree value. The iter operator applies an update to each tree value in a forest.

We distinguish between singular (unary) updates which apply only when the context is a tree value and plural (multi-ary) updates which apply to a sequence. Tests φ?s are always singular. The children operator applies a plural update to all of the children of a single node; the iter operator applies a singular update to all of the elements of a sequence. Other updates can be either singular or plural in different situations. Our type system tracks multiplicity as well as input and output types in order to ensure that updates are well-behaved.

FLUX updates operate on a part of the database that is “in focus”, which helps ensure that updates are deterministic and relatively easy to typecheck. Only the navigation operations left, right, children, iter can change the focus. We lack space to formalize the semantics of updates in the main body of the paper; the semantics of updates is essentially the same as in [3] except for the addition of procedures.

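Since the semantics is not formalized here, the following interpreter is only one possible reading of the informal descriptions above, not the definition from [3]. Everything about the encoding is an assumption: statements are tagged tuples, an element n[v] is a pair `(label, forest)`, the focus is a tuple of trees, `insert` takes a literal forest rather than an expression, and only label tests are covered; let, snapshot, conditionals and procedures are left out.

```python
def apply_update(s, focus):
    """Apply statement s to the forest in focus and return the new forest."""
    tag = s[0]
    if tag == "skip":
        return focus
    if tag == "seq":                       # s1; s2
        return apply_update(s[2], apply_update(s[1], focus))
    if tag == "insert":                    # only into an empty part
        if focus != ():
            raise ValueError("insert requires an empty focus")
        return s[1]
    if tag == "delete":
        return ()
    if tag == "rename":                    # only a single element tree
        (tree,) = focus
        _old, children = tree
        return ((s[1], children),)
    if tag == "test":                      # n?s on a single tree
        (tree,) = focus
        if isinstance(tree, tuple) and tree[0] == s[1]:
            return apply_update(s[2], focus)
        return focus
    if tag == "children":                  # update the child list of one tree
        (tree,) = focus
        label, children = tree
        return ((label, apply_update(s[1], children)),)
    if tag == "iter":                      # update each tree of the forest
        out = ()
        for tree in focus:
            out += apply_update(s[1], (tree,))
        return out
    if tag == "left":                      # update the empty sequence on the left
        return apply_update(s[1], ()) + focus
    if tag == "right":
        return focus + apply_update(s[1], ())
    raise ValueError(f"unknown statement: {tag!r}")

# Replace the text inside every b child of the root, leaving c untouched:
#   children[ iter[ b?children[ delete; insert "new" ] ] ]
replace_text = ("children", ("seq", ("delete",), ("insert", ("new",))))
update = ("children", ("iter", ("test", "b", replace_text)))

db = (("a", (("b", ("old",)), ("c", ()))),)
result = apply_update(update, db)
```

The example mirrors the update discussed in the introduction: the text inside the b elements changes while the overall shape a[b[string]∗, c[]?] of the database is preserved.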
4.1 Type system

In typechecking updates, we extend the global declaration context ∆ with procedure declarations:

$$\Delta ::= \cdots \mid \Delta, P(\vec{\tau}) : \tau_1 \Rightarrow \tau_2$$

There are two typing judgments for updates: singular well-formedness Γ ⊢1 {α} s {τ′} (that is, in type environment Γ, update s maps tree type α to type τ′), and plural well-formedness Γ ⊢∗ {τ} s {τ′} (that is, in type environment Γ, update s maps type τ to type τ′). Several of the rules are parameterized by a multiplicity a ∈ {1, ∗}. In addition, there is an auxiliary judgment Γ ⊢iter {τ} s {τ′} for typechecking iterations. The rules for update well-formedness are shown in Figure 3. We also need an auxiliary subtyping relation involving atomic types and tests: we say that α <: φ if [[α]] ⊆ [[φ]]. This is characterized by the rules:

$$\mathsf{bool} <: \mathsf{bool} \qquad \mathsf{string} <: \mathsf{string} \qquad n[\tau] <: n \qquad n[\tau] <: *$$

Remark 1. In most other XML update proposals (including XQuery! [11] and the draft XQuery Update Facility [2]), side-effecting update operations are treated as expressions that return (). Thus, we could perhaps typecheck such updates as expressions of type (). This would work fine as long as the types of values reachable from the free variables in Γ can never change; however, the updates available in these languages can and do change the values of variables. Thus, to make this approach sound Γ would to be
