Table Of ContentOn the implementation of
onstru
tion fun
tions
for non-free
on
rete data types
1 2 3
Frédéri
Blanqui , Thérèse Hardin , and Pierre Weis
1
I2NRIA&LORIA,BP 239, 54506 Villers-lès-Nan
y Cedex, Fran
e
7
0 3 UPMC, LIP6, 104, Av.du Pr. Kennedy,75016 Paris, Fran
e
INRIA,Domaine deVolu
eau, BP 105, 78153 Le ChesnayCedex, Fran
e
0
2
n
a Abstra
t. Many algorithms use
on
rete data types with some addi-
J tionalinvariants.Thesetofvaluessatisfyingtheinvariantsisoftenaset
5 of representatives for the equivalen
e
lasses of some equational theory.
For instan
e, a sorted list is a parti
ular representative wrt
ommuta-
] tivity.Theories like asso
iativity, neutral element,idempoten
e, et
. are
O
also very
ommon. Now,when onewants to
ombinevariousinvariants,
L it may be di(cid:30)
ult to (cid:28)nd the suitable representatives and to e(cid:30)
iently
. implementtheinvariants.Thepreservationofinvariantsthroughoutthe
s
c whole program is even more di(cid:30)
ult and error prone. Classi
ally, the
[ programmersolves this problem using a
ombinationof two te
hniques:
the de(cid:28)nition of appropriate
onstru
tion fun
tions for the representa-
1
tives and the
onsistent usage of these fun
tions ensured via
ompiler
v
1 veri(cid:28)
ations. The
ommon way of ensuring
onsisten
y is to use an ab-
3 stra
tdatatypefortherepresentatives;unfortunately,patternmat
hing
0 on representatives is lost. A more appealing alternative is to de(cid:28)ne a
1
on
rete data type with private
onstru
tors so that both
ompiler ver-
0 i(cid:28)
ation and pattern mat
hing on representatives are granted. In this
7 paper,we detailthenotionof privatedata typeandstudytheexisten
e
0
of
onstru
tion fun
tions. We also des
ribe a prototype,
alled Mo
a,
/
s that addresses the entire problem of de(cid:28)ning
on
rete data types with
c invariants: it generates e(cid:30)
ient
onstru
tion fun
tions for the
ombina-
:
v tion of
ommon invariants and builds representatives that belong to a
i
on
rete data type with private
onstru
tors.
X
r
a
1 Introdu
tion
Manyalgorithmsusedatatypeswithsomeadditionalinvariants.Everyfun
tion
reating a new value from old ones must be de(cid:28)ned so that the newly
reated
value satisfy the invariants whenever the old ones so do.
One way to easily maintain invariants is to use abstra
t data types (ADT):
theimplementationofanADTishiddenand
onstru
tionandobservationfun
-
tionsareprovided.AvalueofanADT
anonlybeobtainedbyre
ursivelyusing
the
onstru
tionfun
tions.Hen
e,aninvariant
anbeensuredbyusingappropri-
ate
onstru
tion fun
tions. Unfortunately, abstra
t data types pre
lude pattern
mat
hing, a very useful feature of modern programming languages [10,11,16,
15℄. There have been various attempts to
ombine both features in some way.
In [23℄, P. Wadler proposed the me
hanisms of views. A view on an ADT
α γ in :
is given by providing a
on
rete data type (CDT) and two fun
tions
α → γ out : γ → α in◦out = id out◦in = id
γ α
and su
h that and . Then,
α γ in
a fun
tion on
an be de(cid:28)ned by mat
hing on (by impli
itly using ) and
γ α
the values of type obtained by mat
hing
an be inje
ted ba
k into (by
out in out
impli
itlyusing ).However,byleavingtheappli
ationsof and impli
it,
in out
we
an easily get in
onsisten
ies whenever and are not inverses of ea
h
other.Sin
eitmaybedi(cid:30)
ulttosatisfythis
ondition(
onsiderforinstan
ethe
translations between
artesian and polar
oordinates), these views have never
been implemented. Following the suggestion of W. Burton and R. Cameron
in
to use the fun
tion only [3℄, some propositions have been made for various
programminglanguages but none has been implemented yet [4,17℄.
In [3℄, W. Burton and R. Cameron proposed another very interesting idea
whi
h seems to have attra
ted very little attention. An ADT must provide
on-
stru
tion and observation fun
tions. When an ADT is implemented by a CDT,
they propose to also export the
onstru
tors of the CDT but only for using
them as patterns in pattern mat
hing
lauses. Hen
e, the
onstru
tors of the
underlying CDT
an be used for pattern mat
hing but not for building values:
onlythe
onstru
tionfun
tions
anbeusedforthatpurpose.Therefore,one
an
both ensure some invariantsand o(cid:27)er pattern mat
hing. These types have been
introdu
ed in OCaml by the third author [24℄ under the name of
on
rete data
type with private
onstru
tors, or private data type (PDT) for short.
Now, many invariants on
on
rete data types
an be related to some equa-
list [] ::
tional theory. Take for instan
e the type of with the
onstru
tors and .
v ..v v ..v
1 n 1 n
Givensomeelements , thesortedlistwhi
helementsare isaparti
-
v1 vn []
ular representative of the equivalen
e
lass of ::..:: :: modulo the equation
x y l y x l
:: :: = :: :: . Requiring that, in addition, the list does not
ontain the same
x x l x l
element twi
e is a parti
ular representativemodulo the equation :: :: = :: .
empty singleton
Considernowthetypeofjoinlistswiththe
onstru
tors , and
append
, for whi
h
on
atenation is of
onstant
omplexity. Sorting
orresponds
append
to asso
iativity and
ommutativity of . Requiring that no argument of
append empty empty append
is
orresponds to neutrality of wrt . We have a
stru
ture of
ommutative monoid.
More generally, given some equational theory on a
on
rete data type, one
maywonderwhetherthereexistsarepresentativeforea
hequivalen
e
lassand,
C(t1...tn)
ifso,whetherarepresentativeof
anbee(cid:30)
iently
omputedknowing
t ...t
1 n
that are themselves representatives.
In [21,22℄, S. Thompson des
ribes a me
hanism introdu
ed in the Miranda
fun
tional programminglanguagefor implementing su
h non-free
on
rete data
types without pre
luding pattern mat
hing. The idea is to provide
onditional
rewriterules,
alledlaws, thatareimpli
itly appliedaslongaspossibleonevery
newly
reated value. This
an alsobe a
hieved by usinga PDT whi
h
onstru
-
tion fun
tions (primed
onstru
tors in [21℄) apply as long as possible ea
h of
the laws. Then, S. Thompson studies how to prove the
orre
tness of fun
tions
de(cid:28)ned by pattern mat
hing on su
h lawful types. However, few hints are given
on how to
he
k whether the laws indeed implement the invariants one has in
mind. For this reason and be
ause reasoningon lawful types is di(cid:30)
ult, the law
me
hanism was removed from Miranda.
In this paper, we propose to spe
ify the invariants by unoriented equations
(instead of rules). We will
all su
h a type a relational data type (RDT). Se
-
tions 2 and 3 introdu
e private and relational data types. Then, we study when
an RDT
an be implemented by a PDT, that is, when there exist
onstru
tion
fun
tions
omputing some representative for ea
h equivalen
e
lass. Se
tion 4
provides some general existen
e theorem based on rewriting theory. But rewrit-
ingmaybeine(cid:30)
ient.Se
tion5provides,forsome
ommonequationaltheories,
onstru
tion fun
tions more e(cid:30)
ient than the ones based on rewriting. Se
tion
6 presents Mo
a, an extension of OCaml with relational data types whose
on-
stru
tionfun
tionsareautomati
allygenerated.Finally,Se
tion7dis
ussessome
possible extensions.
2 Con
rete data types with private
onstru
tors
We (cid:28)rst re
all the de(cid:28)nition of a (cid:28)rst-order term algebra. It will be useful for
de(cid:28)ning the values of
on
rete and private data types.
De(cid:28)nition 1 (First-order term algebra) Asorted term algebra de(cid:28)nitionis
A = (S,C,Σ) S C
a triplet where Σi:sCa→noSn+-empty set of sorts, is a non-empty
set of
onstru
tor symbols and is a signature mapping a non-empty
C : σ1...σnσn+1 ∈ Σ
sequen
e of sorts to every
onstru
tor symbol. We write
Σ(C) = σ1...σnσn+1 X = (Xσ)σ∈S
to denote the fa
t that . Let be a family
T (A,X) σ
σ
of pairwise disjoint sets of variables. The sets of terms of sort are
indu
tively de(cid:28)ned as follows:
x∈X x∈T (A,X)
σ σ
(cid:21) If , then .
(cid:21) If C :σ1...σn+1 ∈Σ and ti ∈Tσi(A,X), then C(t1,...,tn)∈Tσn+1(A,X).
T (A) σ
σ
Let be the set of terms of sort
ontaining no variable.
S0
Inthefollowing,weassumegivenaset ofprimitivetypeslikeint,string,
C0 Σ0
...andaseΣt 0(0o)f=priimnittive
onstants0,1,"foo",...Let bethe
orresponding
signature ( , ...).
In this paper, we
all
on
rete data type (CDT) an indu
tive type à la ML
de(cid:28)ned by a set of
onstru
tors. More formally:
De(cid:28)nition 2 (Con
rete data type) A
on
retedatatypede(cid:28)nitionisatriplet
Γ =(γ,C,Σ) γ C
Σ : C → (S0 ∪w{hγer}e)+ is a sort, isa non-empty set ofC
o∈nsCtruΣ
t(oCr)sy=mσbo1l.s.σannγd
is a signature su
h that, for all , .
Val(γ) γ T (A ) A =
γ Γ Γ
The set of values of type is the set of terms where
(S0∪{γ},C0∪C,Σ0∪Σ)
.
This de(cid:28)nition of CDTs
orresponds to a small but very useful subset of all
thepossibletypesde(cid:28)nableinML-likeprogramminglanguages.Forthepurpose
of this paper, it is not ne
essaryto use a more
omplex de(cid:28)nition.
4
Example 1 The following type
exp is a CDT de(cid:28)nition with two
onstant
onstru
tors of sort
exp and a binary operator of sort
exp
exp
exp.
type
exp = Zero | One | Opp of
exp | Plus of
exp *
exp
Now, a private data type de(cid:28)nition is like a CDT de(cid:28)nition together with
onstru
tion fun
tions as in abstra
t data types. Constru
tors
an be used as
patterns as in
on
rete data types but they
annot be used for value
reation
(ex
eptinthede(cid:28)nitionof
onstru
tionfun
tions).Forbuildingvalues,onemust
use
onstru
tion fun
tions as in abstra
t data types. Formally:
De(cid:28)nition 3 (Private data type) A private data type de(cid:28)nition is a pair
Π = (Γ,F) Γ = (π,C,Σ) F
where is a CDT de(cid:28)nition and is a family of
on-
stru
tion fun
tions (fC)C∈C su
h that, for all C :σ1..σnπ ∈ Σ, fC :Tσ1(AΓ)×
...×T (A ) → T (A ) Val(π) π
σn Γ π Γ . Let be the set of the values of type , that
is, the set of terms that one
an build by using the
onstru
tion fun
tions only.
f : Tπ(AΓ) → Tπ(AΓ) C : σ1..σnπ ∈ Σ
The fun
tion su
h that, for all and
ti ∈ Tσi(AΓ), f(C(t1..tn)) = fC(f(t1)..f(tn)), is
alled the normalization fun
-
F
tion asso
iated to .
This is quite immediate to see that:
Val(π) f
Lemma 1. is the image of .
PDTshavebeenimplementedinOCamlbythethirdauthor[24℄.Extendinga
programminglanguagewithPDTsisnotverydi(cid:30)
ult:oneonlyneedstomodify
the
ompiler to parse the PDT de(cid:28)nitions and
he
k that the
onditions on the
use of
onstru
tors are ful(cid:28)lled.
Notethat
onstru
tionfun
tionshaveno
onstraintingeneral:thefullpower
of the underlying programming language is available to de(cid:28)ne them.
π
It should also be noted that, be
ause the set of values of type is a subset
γ π
ofthesetofvaluesoftheunderlyingCDT , afun
tion on de(cid:28)ned bypattern
mat
hingmaybeatotalfun
tioneventhoughitisnotde(cid:28)nedonallthepossible
γ π
asesof . De(cid:28)ning afun
tionwithpatternsthatmat
hno valueoftype does
notharmsin
ethe
orresponding
odewillneverberun.Ithoweverrevealsthat
thedeveloperisnotawareofthedistin
tion betweenthe valuesofthePDT and
those of the underlying CDT, and thus
an be
onsidered as a programming
error. To avoid this kind of errors, it is important that a PDT
omes with a
learidenti(cid:28)
ation ofits setofpossiblevalues.Togoonestep further,one
ould
provideatoolfor
he
kingthe
ompletenessandusefulnessofpatternsthattakes
into a
ount the invariants, when it is possible. We leave this for future work.
Example 2 Letusnowstartourrunningexamplewiththetypeexpdes
ribing
operations on arithmeti
expressions.
4
Examples are written with OCaml [10℄, they
an be readily translated in any pro-
gramminglanguageo(cid:27)eringpattern-mat
hingwithtextualpriority,asHaskell,SML,
et
.
type exp = private Zero | One | Opp of exp | Plus of exp * exp
This type exp is indeed a PDT built upon the CDT
exp. Prompted by the
keywordprivate,the OCaml
ompilerforbidstheuseofexp
onstru
tors(out-
side the modulemy_exp.ml
ontaining the de(cid:28)nition ofexp)ex
ept in patterns.
If Zero is supposed to be neutral by the writer of my_exp.ml, then he/she will
provide
onstru
tion fun
tions as follows:
let re
zero = Zero and one = One and opp x = Opp x
and plus = fun
tion
| (Zero,y) -> y
| (y,Zero) -> y
| (x,y) -> Plus(x,y)
3 Relational data types
Wementionedintheintrodu
tionthat,often,theinvariantsupon
on
retedata
typesaresu
hthatthesetofvaluessatisfyingthemisindeedasetofrepresenta-
tivesforthe equivalen
e
lassesofsomeequationaltheory.Wethereforepropose
to spe
ify invariantsby a set ofunoriented equationsand study to whi
h extent
su
h a spe
i(cid:28)
ation
an be realized with an abstra
t or private data type. In
ase of a private data type however, it is important to be able to des
ribe the
set of possible values.
De(cid:28)nition 4 (Relational data type) A relational data type (RDT) de(cid:28)ni-
(Γ,E) Γ = (π,C,Σ) E
tion is a pair where is a CDT de(cid:28)nition and is a (cid:28)nite
Tπ(AΓ,X) =E
set of equationson . Let be the smallest
ongruen
e relation
on-
E (Γ,F)
taining . Su
h an RDT is implementable by a PDT if the family of
F =(fC)C∈C E
onstru
tion fun
tions is valid wrt :
C :σ1..σnπ vi ∈Val(σi) fC(v1..vn)=E C(v1..vn)
(Corre
tness) For all and , .
C : σ1..σnσ vi ∈ Val(σi) D : τ1..τpσ ∈ Σ
(Completeness) For all , , and
wi ∈Val(τi) fC(v1..vn)=fD(w1..wp) C(v1..vn)=E D(w1..wp)
, whenever .
We aregoing to seethat the existen
eof avalid family of
onstru
tion fun
-
tions is equivalent to the existen
e of a valid normalization fun
tion:
f :T (A )→T (A )
π Γ π Γ
De(cid:28)nition 5 (Valid normalization fun
tion) Amap
(Γ,E) Γ =(π,C,Σ)
is a valid normalization fun
tion for an RDT with if:
t∈Tπ(AΓ) f(t)=E t
(Corre
tness) For all , .
t,u∈Tπ(AΓ) f(t)=f(u) t=E u
(Completeness) For all , whenever .
f ◦ f = f
Note that a valid normalization fun
tion is idempotent ( ) and
=E λxy.f(x)=f(y)
provides a de
ision pro
edure for (the boolean fun
tion ).
Theorem 6 The normalization fun
tion asso
iated to a valid family is a valid
normalization fun
tion.
Proof.
t ∈ T C :
π
(cid:21) Corre
tness. We pro
eed by indu
tion on the size of . We have
σ1..σnπ ∈ Σ ti t = C(t1..tn) f(t) = fC(f(t1)..
and su
h that . By de(cid:28)nition,
f(tn)) f(ti) =E ti
. By indu
tion hypothesis, . Sin
e the family is valid and
f(t1)..f(tn) fC(f(t1)..f(tn))=E C(f(t1)..f(tn)) f(t)=E t
are values, . Thus, .
t,u∈Tπ t=E u t=C(t1..tn) u=
(cid:21) Completeness. Let su
hthat . We have and
D(u1..up) f(t)=fC(f(t1)..f(tn)) f(u)=fD(f(u1)..f(up))
.Byde(cid:28)nition, and .
f(ti) =E ti f(uj) =E uj C(f(t1)..f(tn)) =E
By
orre
tness, and . Hen
e,
D(f(u1)..f(up)) f(t1)..f(tn) fC(f(t1)
..f(tn))=fD(f(.t1S)i.n.f
e(ttnh)e)familyisf(vta)li=dafn(du) arevalues, (cid:4)
. Thus, .
f : T (A ) → T (A )
π Γ π Γ
Conversely, given , one
an easily de(cid:28)ne a family of
f
onstru
tionfun
tionsthatisvalidwhenever isavalidnormalizationfun
tion.
Γ =
De(cid:28)nition 7 (Asso
iated family of
onstr. fun
tions) Given a CDT
(π,C,Σ) f :T (A )→T (A )
π Γ π Γ
and a fun
tion , the family of
onstru
tion fun
-
f (fC)C∈C C : σ1..σnπ ∈ Σ
tions asso
iated to is the family su
h that, for all
and ti ∈Tσ1(AΓ), fC(t1,...,tn)=f(C(t1,...,tn)).
Theorem 8 The family of
onstru
tion fun
tions asso
iatedto avalid normal-
ization fun
tion is valid.
E = {
Example 3 We
an
hoose
exp as the underlying CDT and Plus x
}
Zero = x to de(cid:28)ne a RDT implementable by the PDT exp, with the valid
family of
onstru
tion fun
tions zero, one, opp, plus.
4 On the existen
e of
onstru
tion fun
tions
In this se
tion, we provide a general theorem for the existen
e of valid families
of
onstru
tion fun
tions based on rewriting theory. We re
all the notions of
rewriting and
ompletion. The interested reader may (cid:28)nd more details in [8℄.
(l,r)
Standard rewriting. A rewrite rule is an ordered pair of terms written
l →r l
. A rule is left-linear if no variable o
urs twi
e in its left hand side .
Pos(t) t
Asusual,theset ofpositionsin isde(cid:28)nedasasetofwordsonpositive
p∈Pos(t) t| t p t[u]
p p
integers. Given , let be the subterm of at position and be
t t| u
p
the term with repla
ed by .
R
Given a (cid:28)nite set of rewrite rules, the rewriting relation is de(cid:28)ned as
t →R u p ∈ Pos(t) l → r ∈ R θ
follows: i(cid:27) there are , and a substitution su
h
t| =lθ u=t[rθ] t R u
p p
that and . A term is an -normal form if there is no su
h
t→R u =R →R
that . Let be the symmetri
, re(cid:29)exive and transitive
losure of .
≻
A redu
tion ordering is a well-founded ordering (there is no in(cid:28)nitely de-
t0 ≻ t1 ≻ ... C(..t..) ≻ C(..u..)
reasing sequen
e ) stable by
ontext ( whenever
t≻u tθ ≻uθ t≻u R
)andsubstitution( whenever ).If isin
ludedinaredu
tion
→R
ordering, then is well-founded (terminating, strongly normalizing).
→R t,u,v u←∗R t→∗R v
We say that wis
on(cid:29)uent iuf,→fo∗r awll t←er∗msv su
h that ,
R R
t←he∗r→e e∗xists a term su
h that →∗ ←∗ . This means that the relation
R R R R
is in
luded in the relation (
omposition of relations is written
by juxtaposition).
→R →R
If is
on(cid:29)uent, then every term has at most one normal form. If is
→R
well-founded,theneverytermhasat leastonenormalform.Therefore,if is
on(cid:29)uent and terminating, then every term has a unique normal form.
E
Standard
ompletion. Given a (cid:28)nite set of equations and a redu
tion or-
≻
dering , the standard Knuth-Bendix
ompletion pro
edure [2℄ tries to (cid:28)nd a
R
(cid:28)nite set of rewrite rules su
h that:
• R ≻
is in
luded in ,
• →R
is
on(cid:29)uent,
• R E =E ==R
and have same theory: .
Note that
ompletion may fail or not terminate but, in
ase of su
essful
R =E t=E u
termination, -normalizationprovidesa de
isionpro
edurefor sin
e
R t u
i(cid:27) the -normal forms of and are synta
ti
ally equal.
However,sin
e permutation theories like
ommutativity or asso
iativityand
ommutativity together (written AC for short) are in
luded in no redu
tion
ordering,dealingwiththemrequiresto
onsiderrewritingwithpatternmat
hing
modulo these theories and
ompletion modulo these theories. In this paper, we
restri
t our attention to AC.
Com
De(cid:28)nition 9 (Asso
iative-
ommutative equations) Let bethesetof
C E
ommutative
onstru
tors,i.e. thesetof
onstru
tors su
hthat
ontainsan
C(x,y)=C(y,x) E E
AC
equationoftheform .Then,let bethesubsetof madeof
the
ommutativityandasso
iativityequationsforthe
ommutative
onstru
tors,
=AC EAC E¬AC =E \EAC
be the smallest
ongruen
e relation
ontaining and .
R
Rewriting modulo AC. Given a set ofrewriterules,rewriting with pattern
AC t →R,AC u p ∈ Pos(t)
mat
hing modulo is de(cid:28)ned as follows: i(cid:27) there are ,
l →r∈R θ t| = lθ u=t[rθ]
p AC p
≻andAaCsubstitution su
hthat t,t′,u,aun′d t.A=redut′
tion
AC
ourd=eAriCngu′ tis′ ≻ u-′
omptat≻iblue if, for all term→s R,AC su
h that ACand
(←∗ =, →∗ i)(cid:27)⊆(→∗ . T=he re←la∗tion ) is
on(cid:29)uent modulo if
R,AC AC R,AC R,AC AC R,AC .
E AC
Completion modulo AC. Given a (cid:28)nite set of equations and an -
≻ AC
ompatible redu
tion ordering ,
ompletion modulo [18℄ tries to (cid:28)nd a
R
(cid:28)nite set of rules su
h that:
• R ≻
is in
luded in ,
• →R,AC AC
is
on(cid:29)uent modulo ,
• E and R∪EAC have same theory: =E ==R∪EAC.
E
De(cid:28)nition 10 A theory has a
omplete presentation if there is an AC-
om-
AC E¬AC
patible redu
tion ordering for whi
h the -
ompletion of su
essfully
terminates.
Many interesting systems have a
omplete presentation: (
ommutative) mo-
noids, (abelian) groups, rings, et
. See [13,5℄ for a
atalog. Moreover, there are
automated tools implementing
ompletion modulo AC. See for instan
e [6,12℄.
R,AC
A term may have distin
t -normal forms but, by
on(cid:29)uen
e modulo
AC AC
, all normal forms are -equivalent and one
an easily de(cid:28)ne a notion of
AC
normal form for -equivalent terms [13℄:
AC
De(cid:28)nition 11 ( -normal form) Givenanasso
iativeand
ommutative
on-
C C C
stru
tor , -left-
ombs (resp. -right-
ombs) and their leaves are indu
tively
de(cid:28)ned as follows:
t C t C C
(cid:21) If isnotheadedby ,then isbotha -left-
ombanda -right-
omb.The
t leaves(t)=[t]
leaves of is the one-element list .
t C u C C(t,u) C
(cid:21) If isnotheadedby and isa -right-
omb,then isa -right-
omb.
C(t,u) t::leaves(u)
The leaves of is the list .
t C u C C(u,t) C
(cid:21) If is not headed by and is a -left-
omb, then is a -left-
omb.
C(u,t) leaves(u)@[t] @
The leaves of is the list , where is the
on
atenation.
orient
Let bea fun
tion asso
iatinga kind of
ombs (left orright) to everyAC-
≤ t AC
onstru
tor.Let beatotalorderingonterms.Then,aterm isin -normal
orient ≤
form wrt and if:
t C orient(C)
(cid:21) Every subterm of headed by an AC-
onstru
tor is an -
omb
≤
whose leaves are in in
reasing order wrt .
t C(u,v) C
(cid:21) For every subterm of of the form with
ommutative but non-
u≤v
asso
iative,we have .
AC
As it is well-known, one
an put any term in -normal form:
orient ≤
Theorem 12 Whateverthefun
tion andtheordering are,everyterm
t AC t↓ orient ≤ t= t↓
AC AC AC
has an -normal form wrt and , and .
A
Proof. Let be the set ofrules obtained by
hoosingan orientationfor the
E orient
AC
asso
iativity equations of a
ording to :
orient(C) C(x,C(y,z))→C(C(x,y),z)
(cid:21) If is (cid:16)left(cid:17), then take .
orient(C) C(C(x,y),z)→C(x,C(y,z))
(cid:21) If is (cid:16)right(cid:17), then take .
→A
isa
on(cid:29)uentandterminatingrelationputtingeverysubtermheadedby
orient comb
anAC-
onstru
torintoa
ombforma
ordingto .Let beafun
tion
A sort
omputing the -normal form of a term. Let now be a fun
tion permuting
the leavesof
ombs and the argumentsof
ommutativebut non-asso
iative
on-
≤ sort◦comb
stru
torsto puAtCthem in in
reasingorderwrt .sTohrte(nc,otmhbe(ftu))n
=tiont (cid:4)
AC
omputes the -normal form of any term and .
AC
This naturally provides a de
ision pro
edure for -equivalen
e: the fun
-
λxy.sort(comb(x)) = sort(comb(y)) R,AC
tion . It follows that -normalization
AC
together with -normalization provides a valid normalization fun
tion, hen
e
the existen
e of a valid family of
onstru
tion fun
tions:
E
Theorem 13 If has a
omplete presentation,then there exists a valid family
of
onstru
tion fun
tions.
E R
Proof. Assume that has a
omplete presentation . We de(cid:28)ne the
om-
putation of normal forms as it is generally implemented in rewriting tools. Let
step R,AC
be a fun
tion making an -rewrite step if there is one, orfailing if the
norm step
term is in normal form. Let be the fun
tion applying until a normal
R E
form is rea
hed. Sin
e is a
omplete presentation of , by de(cid:28)nition of the
sort◦comb◦norm
ompletionpro
edure, isavalidnormalizationfun
tion.Thu(cid:4)s,
by Theorem 8, the asso
iated family of
onstru
tion fun
tions is valid.
The
onstru
tion fun
tions des
ribed in the proofarenotvery e(cid:30)
ientsin
e
they are based on rewriting with pattern mat
hing modulo AC, whi
h is NP-
omplete [1℄, and do not take advantage of the fa
t that, by de(cid:28)nition of PDTs,
theyareonlyappliedto termsalreadyin normalform.We
anthereforewonder
whetherthey
anbede(cid:28)nedinamoree(cid:30)
ientwayforsome
ommonequational
theories like the ones of Figure 1.
Fig.1. Some
ommonequations onbinary
onstru
tors
Name Abbrev De(cid:28)nition Example
Assoc(C) C(C(x,y),z)=C(x,C(y,z))(x+y)+z=x+(y+z)
asso
iativity
Com(C) C(x,y)=C(y,x) x+y=y+x
ommutativity
Neu(C,E) C(x,E)=x x+0=x
neutrality
Inv(C,I,E) C(x,I(x))=E x+(−x)=0
inverse
Idem(C) C(x,x)=x x∧x=x
idempoten
e
Nil(C,A) C(x,x)=A x⊕x=⊥
nilpoten
e (ex
lusive or)
Rewritingprovidesalsoawayto
he
kthevalidityof
onstru
tionfun
tions:
E R F =(fC)C∈C
Theorem 14 If has a
omplete presentation and is a family
C : σ1..σnπ ∈ Σ vi ∈ Val(σi) fC(v1..vn)
su
h that, for all and terms , is an
R,AC C(v1..vn) AC F
-normal form of in -normal form, then is valid.
Proof.
C : σ1..σnπ ∈ Σ vi ∈ Val(σi) fC(v1..vn)
(cid:21) Corre
tness. Let and . Sin
e is an
R,AC C(v1..vn) fC(v1..vn)=E C(v1..vn)
-normal form of , we
learly have .
C : σ1..σnπ ∈ Σ vi ∈ ValF(σi) D : τ1..τpπ ∈ Σ
(cid:21) Completeness. Let , , , and
wi ∈ValF(τi) C(v1..vn)=E D(w1..wp) R
su
h that . Sin
e is a
omplete pre-
E norm(C(v1..vn)) =AC norm(D(w1..wp)) fC(v1..vn) =
fseDn(twat1i.o.wnpo)f , . Thus, (cid:4)
.
It follows that rewriting provides a natural way to explain what are the
AC
possible values of an RDT: values are -normal forms mat
hing no left hand
R
side of a rule of .
5 Towards e(cid:30)
ient
onstru
tion fun
tions
When there is no
ommutative symbol,
onstru
tion fun
tions
an be easily
implemented by simulating innermost rewriting as follows:
VPos(t) p∈Pos(t)
De(cid:28)nition 15 (Linearization) Let be the set of positions
t| x∈X ρ:VPos(t)→X
p
su
h that is a variable . Let be an inje
tive mapping
lin(t) t
and be the term obtained by repla
ing in every subterm at position
p ∈ VPos(t) ρ(p) Eq(t)
by . Let now be the
onjun
tion of true and of the
ρ(p)=ρ(q) t| =t| p,q ∈VPos(t)
p q
equations su
h that and .
R F(R)
De(cid:28)nition 16 Given a set of rewrite rules, let be the family of
on-
(fC)C∈C
stru
tion fun
tions de(cid:28)ned as follows:
• l → r ∈ R l = C(l1,...,ln)
For every rule with \, add to the de(cid:28)nition of
fC the
lause lin(l1),...,lin(ln) when Eq(l) -> lin(r), where bt is the term
t C
obtained by repla
ingin everyo
urren
eofa
onstru
tor by a
allto its
f
C
onstru
tion fun
tion .
• f
C
Terminate the de(cid:28)nition of by the default
lause x -> C(x).
E = ∅ E R
AC
Theorem 17 Assume that and has a
omplete presentation .
F(R) E
Then, is valid wrt (whatever the order of the non-default
lauses is).
Wenow
onsiderthe
ase of
ommutativesymbols.Wearegoingto des
ribe
a modular way of de(cid:28)ning the
onstru
tion fun
tions by pursuing our running
example,with the typeexp.Assume thatPlusisde
laredto be asso
iativeand
ommutative only. The
onstru
tion fun
tions
an then be de(cid:28)ned as follows:
let zero = Zero and one = One and opp x = Opp x
and plus = fun
tion
| Plus(x,y), z -> plus (x, plus (y,z))
| x, y -> insert_plus x y
and insert_plus x = fun
tion
| Plus(y,_) as u when x <= y -> Plus(x,u)
| Plus(y,t) -> Plus (y, insert_plus x t)
| u when x > u -> Plus(u,x)
| u -> Plus(x,u)
sort ◦comb
One
an easily see that plus does the same job as the fun
tion
A
used in Theorem 12 but in a slightly more e(cid:30)
ient way sin
e -normalization
and sorting are interleaved.
{ ( ,x)
AssumemoreoverthatZeroisneutral.TheAC-
ompletionof Plus Zero
= x} { ( ,x)→ x} x y
gives Plus Zero . Hen
e, if and are terms in normal form,
(x,y) x = y =
then Plus
an be rewritten modulo AC only if Zero or Zero.
Thus, the fun
tion plus needs to be extended with two new
lauses only: