
Static Program Analysis Reading List (UCLA CS232) PDF

382 Pages·2017·7.865 MB·English

Preview Static Program Analysis Reading List (UCLA CS232)

How to Solve Set Constraints

Jens Palsberg

October 1, 2008

We will explain how to solve a particular class of set constraints in cubic time.

1 Set Constraints

Let C be a finite set of constants, and let V be a set of variables. A set constraint is a conjunction of constraints of the forms

    c ∈ v
    c ∈ v ⇒ v′ ⊆ v″

where c ∈ C and v, v′, v″ ∈ V. For a particular set constraint, we use C to denote the finite set of constants that occur in that set constraint, and we use V to denote the finite set of variables that occur in that set constraint. We use 2^C to denote the powerset of C. A set constraint has solution ϕ : V → 2^C if

• for each conjunct of the form c ∈ v, we have c ∈ ϕ(v), and
• for each conjunct of the form c ∈ v ⇒ v′ ⊆ v″, we have c ∈ ϕ(v) ⇒ ϕ(v′) ⊆ ϕ(v″).

We say that a set constraint is satisfiable if it has a solution.

Theorem 1 Every set constraint is satisfiable.

Proof. The mapping λv : V. C, which sends every variable to the full set C, is a solution of the set constraint. □

For two mappings ϕ, ψ : V → 2^C, we define the binary intersection ϕ ⊓ ψ as:

    ϕ ⊓ ψ = λv : V. (ϕ(v) ∩ ψ(v))

Theorem 2 For a given set constraint, the binary intersection of two solutions is itself a solution.

Proof. Suppose ϕ, ψ : V → 2^C are both solutions. Let us examine each of the conjuncts of the set constraint.

• For a conjunct of the form c ∈ v, we have from ϕ, ψ being solutions that c ∈ ϕ(v) and c ∈ ψ(v). From c ∈ ϕ(v) and c ∈ ψ(v) we have c ∈ (ϕ(v) ∩ ψ(v)), which is equivalent to c ∈ (ϕ ⊓ ψ)(v).
• For a conjunct of the form c ∈ v ⇒ v′ ⊆ v″, we have from ϕ, ψ being solutions that c ∈ ϕ(v) ⇒ ϕ(v′) ⊆ ϕ(v″) and c ∈ ψ(v) ⇒ ψ(v′) ⊆ ψ(v″). We want to show c ∈ (ϕ ⊓ ψ)(v) ⇒ (ϕ ⊓ ψ)(v′) ⊆ (ϕ ⊓ ψ)(v″). Suppose c ∈ (ϕ ⊓ ψ)(v). From c ∈ (ϕ ⊓ ψ)(v) and the definition of ⊓, we have c ∈ (ϕ(v) ∩ ψ(v)), so we have c ∈ ϕ(v) and c ∈ ψ(v). From c ∈ ϕ(v) and c ∈ ϕ(v) ⇒ ϕ(v′) ⊆ ϕ(v″), we have ϕ(v′) ⊆ ϕ(v″). From c ∈ ψ(v) and c ∈ ψ(v) ⇒ ψ(v′) ⊆ ψ(v″), we have ψ(v′) ⊆ ψ(v″). From ϕ(v′) ⊆ ϕ(v″) and ψ(v′) ⊆ ψ(v″), we have (ϕ(v′) ∩ ψ(v′)) ⊆ (ϕ(v″) ∩ ψ(v″)), which is equivalent to (ϕ ⊓ ψ)(v′) ⊆ (ϕ ⊓ ψ)(v″), and that is the desired result. □

For the space V → 2^C, we define an ordering ⊆ as follows. We say that ϕ ⊆ ψ if and only if for all v ∈ V : ϕ(v) ⊆ ψ(v). For a set S ⊆ (V → 2^C), we say that an element ϕ ∈ S is the ⊆-least element of S if for all ψ ∈ S : ϕ ⊆ ψ.

Theorem 3 Every set constraint has a ⊆-least solution.

Proof. Let a particular set constraint be given. The space of possible solutions of the set constraint is V → 2^C, which is a finite set. Let S ⊆ (V → 2^C) be the set of solutions of the set constraint. From V → 2^C being finite, we have that S is finite. From Theorem 1 we have that S is nonempty. So, S is a nonempty, finite set. Let S = { ϕ_1, ϕ_2, ϕ_3, ..., ϕ_n }. Let ϕ = (...((ϕ_1 ⊓ ϕ_2) ⊓ ϕ_3) ... ⊓ ϕ_n). From Theorem 2 we have that ϕ is a solution of the set constraint, so ϕ ∈ S. Additionally, we have that for all i ∈ 1..n : ϕ ⊆ ϕ_i. So, ϕ is the ⊆-least solution of the set constraint. □

2 Solving Set Constraints Efficiently

We will translate the problem of solving a set constraint into a graph problem. For a given set constraint, the initial graph has one node for each element of V, and an empty set of edges. Additionally, each node v is associated with a bit vector B_v of length |C| in which every bit is initially 0. For a bit vector B_v and a constant c ∈ C, we denote the entry for c in B_v by B_v[c]. Finally, every bit in every bit vector is associated with a list L_v[c] of constraints of the form v′ ⊆ v″; all those lists are initially empty.

The initial graph represents the mapping λv : V. ∅. The idea is that for a given v ∈ V, the bit vector B_v represents a subset of C. When all the bits are 0, the bit vector B_v represents the empty set. An edge in the graph implies a subset relationship. If the graph has an edge from v to v′, it implies that the set represented by the bit vector associated with v is a subset of the set represented by the bit vector associated with v′.

We now process each conjunct of the set constraint in turn. The processing will add edges, change bits from 0 to 1, and add constraints to the constraint lists associated with bits in the bit vectors. We will use the following two procedures propagate and insert-edge as key subroutines.

    procedure propagate(v : Node, c : Constant) {
      if (B_v[c] == 0) {
        B_v[c] = 1
        for (each element (v′ ⊆ v″) of L_v[c]) { insert-edge(v′, v″) }
        for (each edge (v, v′)) { propagate(v′, c) }
      }
    }

    procedure insert-edge(v, v′ : Node) {
      insert an edge (v, v′)
      for (c such that B_v[c] == 1) { propagate(v′, c) }
    }

For a constraint of the form c ∈ v, we execute propagate(v, c). For a constraint of the form c ∈ v ⇒ v′ ⊆ v″, we execute:

    if (B_v[c] == 0) { add the constraint (v′ ⊆ v″) to the list L_v[c] }
    else { insert-edge(v′, v″) }

When we have processed all the constraints, the resulting graph represents the ⊆-least solution of the set constraint.

We will now analyze the time complexity of the constraint processing. We will do the analysis for a set constraint with |C| = O(n), |V| = O(n), O(n) conjuncts of the form c ∈ v, and O(n²) conjuncts of the form c ∈ v ⇒ v′ ⊆ v″.
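Before the cost accounting continues, the constraint processing described above can be sketched in Python. This is a minimal sketch, not the note's own code: the class and method names (`SetConstraintSolver`, `member`, `conditional`) are hypothetical, each bit vector B_v is modelled as a Python set, and the recursion of propagate is replaced by an explicit worklist so that deep graphs do not hit Python's recursion limit.

```python
from collections import defaultdict


class SetConstraintSolver:
    """Sketch of the graph-based solver; all names are illustrative."""

    def __init__(self, variables):
        # bits[v] plays the role of B_v: the constants whose bit is 1.
        self.bits = {v: set() for v in variables}
        # succ[v] lists the targets of edges (v, v').
        self.succ = {v: [] for v in variables}
        # pending[(v, c)] plays the role of L_v[c].
        self.pending = defaultdict(list)

    def _drain(self, work):
        # Worklist version of propagate: each item is a pair (v, c).
        while work:
            v, c = work.pop()
            if c in self.bits[v]:
                continue  # bit already 1: stop, as in the pseudocode
            self.bits[v].add(c)
            # Release the constraints that were waiting on this bit.
            for src, dst in self.pending.pop((v, c), []):
                self._add_edge(src, dst, work)
            # Push c along every outgoing edge of v.
            for w in self.succ[v]:
                work.append((w, c))

    def _add_edge(self, src, dst, work):
        # insert-edge: record the edge and schedule every set bit of src.
        self.succ[src].append(dst)
        for c in self.bits[src]:
            work.append((dst, c))

    def member(self, c, v):
        """Process a conjunct c ∈ v."""
        self._drain([(v, c)])

    def conditional(self, c, v, src, dst):
        """Process a conjunct c ∈ v ⇒ src ⊆ dst."""
        if c in self.bits[v]:
            work = []
            self._add_edge(src, dst, work)
            self._drain(work)
        else:
            self.pending[(v, c)].append((src, dst))
```

After all conjuncts have been fed to `member` and `conditional`, the dictionary `bits` holds the ⊆-least solution; a conditional whose guard never becomes true simply stays in `pending` and contributes nothing.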
For each conjunct, the algorithm performs a constant amount of work except for the calls to the propagate subroutine. So, the total time is O(n²) plus the time spent by propagate. The propagate routine itself performs a constant amount of work except for recursive calls. So the key to analyzing the time spent by propagate is to determine the number of calls to the propagate routine. The processing of the set constraint generates O(n³) immediate calls to propagate. The recursion involved in each call to propagate stops when finding a bit that is 1. So, for each c ∈ C, the total work of all calls of the form propagate(v, c) is given by the number of edges in the graph, which is O(n²). To sum up, the total time is O(n²) + (O(n) × O(n²)) = O(n³).

Information Processing Letters 98 (2006) 150–155
www.elsevier.com/locate/ipl

Optimal register allocation for SSA-form programs in polynomial time

Sebastian Hack*, Gerhard Goos

Institut für Programmstrukturen und Datenorganisation, Adenauerring 20a, 76131 Karlsruhe, Germany

Received 11 February 2005; received in revised form 23 November 2005
Available online 17 February 2006
Communicated by F. Meyer auf der Heide

* Corresponding author. E-mail addresses: [email protected] (S. Hack), [email protected] (G. Goos).
0020-0190/$ – see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.ipl.2006.01.008

Abstract

This paper gives a constructive proof that the register allocation problem for a uniform register set is solvable in polynomial time for SSA-form programs.

Keywords: Combinatorial problems; Graph algorithms

1. Introduction

Register allocation is the task in a compiler which maps (temporary) variables to processor registers. The most prominent approach is to map this task to a graph coloring problem. The nodes of the so-called interference graph are formed by the temporaries of the program. Whenever the compiler finds out that two temporaries cannot be held in the same register (due to interference), an edge is drawn between the corresponding nodes in the interference graph. Colors correspond to processor registers. Thus, having k registers, a k-coloring of the interference graph forms a correct register assignment.

Chaitin et al. [4] show that for each undirected graph a program can be given which has this graph as its interference graph. So, since graph coloring is NP-complete, register allocation must also be. However, if one only considers programs in SSA-form (e.g., refer to [5]) the situation changes. It turns out that interference graphs of SSA-form programs belong to the class of chordal graphs, which is in turn a subset of the class of perfect graphs. It is well known that chordal graphs can be colored in quadratic time. This also answers the question posed by Andersson [1] whether interference graphs are perfect.

This paper is structured as follows: First, we describe the model of a program used in this paper. Then we give a new definition of liveness for SSA-form programs. After quoting basic facts from graph theory, we will prove that interference graphs of SSA-form programs always have perfect elimination orders and show how they are determined.

2. SSA-form programs

Here, we consider a program (in SSA-form) given by its control flow graph (CFG) whose nodes are made up of labeled single instructions. We will therefore identify the node and its label in the following. Let us call the set of labels L. The CFG has one distinct start node which has no predecessor nodes and is denoted by start. The instruction corresponding to a node is of the following form:

    ℓ : (d_1, ..., d_m) ← τ(u_1, ..., u_n).

We denote the operation τ at a label ℓ by Op_ℓ. We call Res_ℓ = {d_1, ..., d_m} the ordered set of result values and Arg_ℓ = {u_1, ..., u_n} the ordered set of argument values at the label ℓ. All d_1, ..., d_m and u_1, ..., u_n are elements of the set of (abstract) values V of the considered program. Given a label ℓ, let us denote by Arg_ℓ(i) the ith argument to the operation at ℓ. Each label has an ordered set of k predecessor labels which we will denote by P_ℓ^1, ..., P_ℓ^k.

We will also write ℓ′ →^i ℓ if ℓ′ = P_ℓ^i. If we do not care about the position, we simply write ℓ′ → ℓ to denote that ℓ′ is a predecessor to ℓ. A path p is an ordered set {ℓ_1, ..., ℓ_n} of at least two nodes, for which ℓ_1 → ℓ_2, ℓ_2 → ℓ_3, ..., ℓ_{n−1} → ℓ_n holds. To indicate that a set p = {ℓ_1, ..., ℓ_n} is a path, we write p : ℓ_1 → ··· → ℓ_n.

Finally, we say a label ℓ dominates another label ℓ′ if each path from start to ℓ′ contains ℓ, writing ℓ ⪯ ℓ′. Note, since ⪯ is reflexive, ℓ ⪯ ℓ.

We require the program to be given in SSA-form, which means that each variable is statically only assigned once. Usually, when a program is transferred into SSA-form, for each definition of a non-SSA variable x, an SSA variable x_i is created. The usages of the variables then have to be adjusted to refer to the corresponding SSA variable. This is not always possible. It might be dependent on the control flow which definition of the variable is applicable at the usage. Consider the left program in Fig. 1. At label ℓ_4, it is dependent on the control flow whether the definition at ℓ_2 or the one at ℓ_3 is relevant for ℓ_4. In SSA-form programs, a special instruction, called φ-function, is used to disambiguate multiple definitions by control flow. A φ-function copies its ith parameter to its result if it was reached via P_ℓ^i. Note that a basic block can have multiple φ-functions. So, since the program is in SSA-form, for each SSA-variable¹ v there is exactly one label D_v for which v ∈ Res_{D_v}.

¹ SSA variables are often called values.

Fig. 1. Program fragment and its equivalent in SSA-form.

    Program fragment:
      if ... then
        ℓ_2: x ← ···
      else
        ℓ_3: x ← ···
      end
      ℓ_4: y ← x + 1

    SSA-form:
      if ... then
        ℓ_2: x_1 ← ···
      else
        ℓ_3: x_2 ← ···
      end
      ℓ_4: x_3 ← φ(x_1, x_2)
      ℓ_4′: y_1 ← x_3 + 1

Fig. 2. Liveness at φ-functions.

      ℓ_1: a ← ···
      if ... then
        ℓ_2: ···
      else
        ℓ_3: b ← ···
      end
      ℓ_4: y ← φ′(a, b)

Since we allow only one instruction per label, we replace the set of all φ-operations in a basic block

    y_1 = φ(x_11, ..., x_1n),
    ...
    y_m = φ(x_m1, ..., x_mn)

by the more concise φ′-operation:

    ℓ : (y_1, ..., y_m) = φ′(x_11, ..., x_1n, ..., x_m1, ..., x_mn)

which sets y_i = x_ij if ℓ was reached via P_ℓ^j. It is convenient to define Arg′_ℓ(j) = {x_ij | 1 ≤ i ≤ m}, subsuming all operands of a φ′-operation which refer to P_ℓ^j. Note that φ′ is totally equivalent to the traditional φ, since SSA semantics states that all φ-operations in a basic block are evaluated simultaneously. In the following, everything stated for φ′-operations implicitly holds for φ-operations. Thus we will only use φ′.

2.1. Liveness in SSA-form programs

To perform register allocation on SSA-form programs, a precise notion of liveness is needed. The standard definition of liveness

    A variable v is live at a label ℓ, if there is a path from ℓ to a usage of v not containing a definition of v.

cannot be straightforwardly transferred to SSA-form programs. The problem arises with φ- and accordingly φ′-operations. Consider the program in Fig. 2. Surely, a is not live at label ℓ_3 although there is a path from ℓ_3 to a usage of a, namely ℓ_4. The cause for this oddity is that the usual notion of usage does not hold for φ′-operations. In addition to their arguments, a φ′-operation also uses control flow information to produce its result. To make the traditional definition of liveness work, we have to incorporate the predecessor by which a label was reached into the notion of usage:
Definition 1 (Usage).

    usage : ℕ × L × V → 𝔹,
    (i, ℓ, v) ↦ v ∈ Arg_ℓ        if Op_ℓ ≠ φ′,
                v ∈ Arg′_ℓ(i)    if Op_ℓ = φ′.

Now, a usage is not only dependent on a label and a value but also on a number which represents the predecessor by which the label was reached. In our example in Fig. 2, usage(1, ℓ_4, a) is true, since a is indeed used if ℓ_4 is entered via ℓ_2. usage(2, ℓ_4, a) is false, since a is not used if ℓ_4 is reached through ℓ_3. If the operation at a label is not φ′, this definition resembles the common concept of usage by simply ignoring the predecessor index.

The traditional definition of liveness quoted above uses paths which end in usages of some variable to define liveness. In this traditional setting, usages and paths are unrelated. With Definition 1, paths and usages are no longer unrelated. So it is straightforward to merge them in one term.

Definition 2 (Usepath). A path p : ℓ_1 → ··· → ℓ_n is a usepath from ℓ_1 to ℓ_n concerning a value v, if v is used at ℓ_n regarding this path. More formally:

    usepath : L^n × V → 𝔹,
    (p : ℓ_1 → ··· → ℓ_n, v) ↦ usage(i, ℓ_n, v)                if p = ℓ_1 →^i ℓ_n,
                               usepath(ℓ_2 → ··· → ℓ_n, v)    otherwise.

Referring to the example in Fig. 2, ℓ_1, ℓ_2, ℓ_4 is a usepath of a and ℓ_1, ℓ_3, ℓ_4 is a usepath of b.

Using this definition of usage together with the traditional definition of liveness stated above, one obtains a realistic model of liveness in SSA programs:

Definition 3 (Liveness). A value v is live at a label ℓ_1 iff there exists a label ℓ_n with usepath(ℓ_1 → ℓ_2 → ··· → ℓ_n, v) and D_v ∉ {ℓ_2, ..., ℓ_n}.

We use the definition of usepaths to re-formulate the notion of a strict program coined by Budimlić et al. [3].

Definition 4 (Strict program). A program is called strict, iff for each value v each path from start to some label ℓ with usepath(start → ··· → ℓ, v) contains the definition of v.

From now on, we will only consider strict programs.² The next lemma is essential for the rest of this paper and has also been given by Budimlić relying on a slightly different liveness definition.

² Surely, each non-strict program can be turned into a strict one by inserting instructions which initially define the variables by an arbitrary value.

Lemma 5. Each label ℓ at which a value v is live is dominated by D_v.

Proof. Suppose ℓ is not dominated by D_v. Then there is a path from start to ℓ not containing D_v. From the fact that v is live at ℓ, it follows that there is a usepath of v from ℓ to some ℓ′. So there is a usepath from start to ℓ′ not containing D_v, which contradicts the definition of a strict program. □

2.2. Common facts about SSA-form programs

Since the definition of liveness given above seems rather unusual, we shortly derive some well-known facts about SSA-form programs from our definition. These facts are not vital for the rest of the paper and are only given to clarify certain properties of SSA-form programs.

Corollary 6. Each value v, used in a non-φ′-operation at a label ℓ, is dominated by its definition.

Proof. Then, for each predecessor P_ℓ^j of ℓ, usage(j, ℓ, v) holds. With Definition 3, v is live at each P_ℓ^j. With Lemma 5, D_v dominates each predecessor of ℓ. Thus D_v also dominates ℓ. □

Corollary 7. If a value v ∈ Arg′_ℓ(i) for a φ′-operation at a label ℓ and some i, then the definition of v dominates P_ℓ^i.

Proof. Surely, usage(i, ℓ, v) holds. So p : P_ℓ^i → ℓ is a usepath concerning v. So, after Definition 3, v is live at P_ℓ^i. Thus, with Lemma 5, D_v ⪯ P_ℓ^i. □

Corollary 8. Let ℓ be a label with Op_ℓ ≠ φ′. Each pair of values v, w ∈ Arg_ℓ interfere.

Proof. Due to Definition 3, v and w are live at each predecessor of that label. So v and w interfere. □

Often, one can read statements like:
• φ′-operations do not cause interferences.
• Concerning liveness, φ′-operations can be treated as if they had no arguments.
• φ′-operations extend the lifetimes of their arguments to the end of the respective predecessor label.

All these statements try to describe Corollary 8 the other way around. With the definition of usepaths, they are covered implicitly. In our model the basic assumption is that the property of usage is always tied to a value and a path, which makes Corollary 8 the "special" case.

3. Graph theory

Here we quote definitions from basic graph theory and the theory of perfect graphs important to this paper. Let G = (V, E) be an undirected graph. If there is an edge from v ∈ V_G to w ∈ V_G, we write vw ∈ E_G. We leave out the G in E_G, V_G if it is clear from the context which graph is considered. We call a graph G complete, iff for each v, w ∈ V, there is an edge vw ∈ E. We call G′ an induced subgraph of G, if V_{G′} ⊆ V_G and for all nodes v, w ∈ V_{G′}, vw ∈ E_G → vw ∈ E_{G′} holds.

Definition 9 (Simplicial vertex). A vertex v ∈ V_G is called simplicial, if v and its neighbors induce a complete subgraph in G.

Definition 10 (Perfect Elimination Order, PEO). We call a linearly ordered sequence of vertices v_1, ..., v_n a perfect elimination order, if each v_i is simplicial in G − {v_1, ..., v_{i−1}}, where G − {a_1, ..., a_m} is the graph obtained by deleting all vertices {a_1, ..., a_m} and their incident edges from the graph.

The class of graphs for which perfect elimination orders exist are also called chordal or triangulated graphs. Gavril [6] gives an algorithm for coloring chordal graphs in O(|V|²). The algorithm constructs a PEO for a given chordal graph by searching and removing a simplicial node from the graph each step. Afterwards, the nodes are inserted into the graph in reverse order. Each node is assigned a color which is not occupied by a neighbor of the node to insert. It is further proven that this algorithm leads to a minimal coloring of the graph.

4. Interference graphs of SSA-programs

We say two values v and v′ interfere, iff there is a label ℓ where v and v′ are live (regarding Definition 3). Now, we can define the interference graph IG = (V, E) of an SSA-form program. The set of vertices is made up by the values occurring in the program, V_IG = V. Since nodes in the interference graph and values are identical, we identify both terms in the following. We draw an edge between two values v and v′ iff they interfere and write vv′ ∈ E_IG. The following lemmas lead to a theorem that connects the dominance relation of a program to perfect elimination orders in the interference graph of that program. Lemmas 11 and 12 have also been shown by Budimlić et al. [3] and are given here for the sake of completeness.

Lemma 11 shows that each edge in the interference graph is directed according to the dominance relationship of the values their nodes represent.

Lemma 11. If two values v and w are live at some label ℓ, either D_v dominates D_w or vice versa.

Proof. By Lemma 5, D_v and D_w dominate ℓ. Thus, either D_v dominates D_w or D_w dominates D_v. □

The next lemma shows that what is trivial in basic blocks also holds for complete programs in SSA-form: if one value starts living before another (it dominates the other) and both interfere, the value is still alive at the definition of the other.

Lemma 12. If v and w interfere and D_v ⪯ D_w, then v is live at D_w.

Proof. Assume v is not live at D_w. Then there is no usepath from D_w to some ℓ′ concerning v. Since all labels where w is live are dominated by D_w, there is no label where v and w are simultaneously live. So v and w do not interfere, which contradicts the proposition. □

Lemma 13 shows how the dominance order relation is reflected by the interference graph. It says that all values dominating a value v and interfering with v form a clique in the interference graph. This is used later on to connect the perfect elimination order to the dominance relation.

Lemma 13. Let ab, bc ∈ E and ac ∉ E. If D_a ⪯ D_b, then D_b ⪯ D_c.

Proof. Due to Lemma 11, either D_b ⪯ D_c or D_c ⪯ D_b. Assume D_c ⪯ D_b. Then (with Lemma 12), c is live at D_b. Since a and b also interfere and D_a ⪯ D_b, a is also live at D_b. So, a and c are live at D_b, which cannot be by precondition. □

Lemma 14. A value v can extend a perfect elimination order, if each value whose definition is dominated by D_v is already contained in the PEO.

Proof. To extend a PEO, v must be simplicial. Assume v is not simplicial. Then there exist two neighbors a, b for which va, vb ∈ E but ab ∉ E (by Definition 9). Due to the proposition, all values whose definitions are dominated by D_v have already been removed from IG. Thus, D_a dominates D_v. By Lemma 13, D_v dominates D_b, which contradicts the proposition. Thus, v is simplicial. □

Theorem 15. The interference graph of an SSA-form program P is chordal.

Proof. Consider the tree T of immediate dominators (cf. [9]) concerning the control flow graph of P. We start with an empty PEO and apply Lemma 14 recursively on T starting at the leaves. This constructs a PEO for the interference graph of P. Since each graph which has a PEO is chordal, cf. [7], the interference graph of P is chordal. □

As we can see by Theorem 15, a post-order visitation of the dominator tree of a program yields a PEO. Since the vertices are colored reversely along a PEO, a pre-order visitation of the dominator tree defines a sequence in which the values can be colored optimally using the algorithm described in [6]. Since the liveness analysis annotates the set of live values to each label, we always have the set of neighbors present upon coloring a value. Thus, we do not have to construct the interference graph itself.

5. Leaving the SSA-form

As no real-world processor has a φ-instruction, the compiler has to destroy the SSA-form of the program at some point in time. Conventionally, φ-functions are replaced by copies in their predecessor blocks to implement the control flow dependent copy as described in Section 2. In doing so, one modifies the interference graph of the program since new interferences are introduced, as shown in Fig. 3. x_3 is now interfering with y_1 and y_2, which has not been the case in the interference graph of the SSA-form program. These interferences are introduced due to the fact that the atomic, simultaneous evaluation by φ′-functions (as mentioned in Section 2) is broken down to a sequential set of operations. In the worst case, these new interferences render the interference graph un-chordal, which also might invalidate our register allocation. We can however destroy the SSA-form of the program and convert a register allocation with k registers of an SSA-form program into a register allocation with exactly the same number of registers for the resulting non-SSA-form program.

Fig. 3. A program fragment in SSA-form and after destroying SSA using copy instructions.

    SSA-form:
      if ... then
        ℓ_2: x_1 ← ···
        ℓ_3: y_1 ← ···
      else
        ℓ_4: x_2 ← ···
        ℓ_5: y_2 ← ···
      end
      ℓ_6: (x_3, y_3) ← φ′(x_1, y_1, x_2, y_2)

    After destroying SSA:
      if ... then
        ℓ_2: x_1 ← ···
        ℓ_3: y_1 ← ···
        ℓ_4: x_3 ← x_1
        ℓ_5: y_3 ← y_1
      else
        ℓ_4: x_2 ← ···
        ℓ_5: y_2 ← ···
        ℓ_4: x_3 ← x_2
        ℓ_5: y_3 ← y_2
      end

Consider a φ′-function

    ℓ : (y_1, ..., y_m) ← φ′(x_11, ..., x_1n, ..., x_m1, ..., x_mn)

at some label ℓ. Arriving at ℓ and coming from P_ℓ^j, the x_ij are copied at once into the y_i according to SSA semantics. Consider a valid register allocation. The simultaneous assignment given by the φ′-function corresponds to an l-to-m mapping of registers, where l ≤ m ≤ k, and k represents the number of registers available.³ Furthermore, all y_i are assigned to different registers, since all y_i interfere. So the question of removing a φ′-function reduces to implementing l-to-m mappings between registers on the control flow edges to the φ′'s label using ordinary processor instructions and m registers.

³ Note that the same value x can be assigned to different y_i by a φ′-function. E.g., (y_1, y_2) = φ′(a_1, b_1, a_1, b_2).

Theorem 16. Any simultaneous assignment from l registers to m registers, where l ≤ m, can be implemented with m registers using only copy and swap instructions.

Proof. Consider the following simultaneous assignment, whose right-hand side has m entries:

    (y_1, ..., y_m) ← (x_1, ..., x_1, ..., x_l, ..., x_l).

In general, there may be multiple y_i to which the same x_j is assigned. For each x_j we arbitrarily pick one of the y_i to which it is assigned and denote it by [x_j]. Note that this induces an equivalence relation ∼ on the y_1, ..., y_m: y_i ∼ y_j if there is some x_k which is assigned to y_i and y_j. Thus, y_i and y_j are members of the equivalence class [x_k]. We denote the set of the x_j by X and the set of the equivalence classes [x_j] by [X].

Consider a register allocation ρ : V → R of the SSA-form program obtained by the algorithm described in the last section. Let π be a function mapping ρ(x_j) to ρ([x_j]). Note that since all y_i interfere, all [x_j] also interfere, and by the fact that all x_j interfere, π is injective. π may also be partial, since l might be smaller than k = |R|.

Each register in ρ([X]) which is not in ρ(X) can be assigned immediately since its value is not needed anymore. So we apply the following recursive scheme: Let y = π(x). If y ∈ [X] and y ∉ X we issue a copy from ρ(x) to ρ(y) and recursively consider the mapping π|X\{x}.

At the end of this recursive procedure, either all elements of [X] are processed, and thus all of X since π is injective, or the remaining subset of [X] equals the one of X. Thus this rest represents a permutation of registers which can be, as known from basic linear algebra, implemented by a sequence of swap instructions. If the processor does not possess swap instructions, one can use three xors, adds or subs (cf. [10]).

Finally, each y_i ∈ [x_j] can be processed by copying ρ([x_j]) to ρ(y_i). □

6. Conclusions

We have shown that the interference graphs of strict SSA-form programs are chordal, which leads to a coloring algorithm running in quadratic time. Furthermore, the coloring algorithm does not need to have the interference graph materialized but uses a coloring sequence induced by the dominance relation of the program. We also showed how a register allocation of an SSA-form program using m registers can be turned into a register allocation of a corresponding non-SSA program using also no more than m registers, by implementing the φ′-functions properly.

7. Related work

At the time this paper was submitted, chordal graphs played no role in register allocation. Meanwhile, they have drawn the attention of other researchers in the area. The paper which initiated our research on the topic is by Andersson [1], who investigated interference graphs in real-world compilers and found that all of them were 1-perfect (1-perfectness means that the chromatic number of the graph equals the size of its largest clique). The result of the quest for a proof of this observation is this paper. Independently of us, Brisk proved the perfectness of strict SSA-form programs [2]. In his proof he also shows their chordality without referring to it. Pereira and Palsberg extended Andersson's studies and found that, with SSA-optimizations enabled, 95% of the (non-SSA!) interference graphs of the Java standard library were chordal. They use this fact to derive new spilling and coalescing heuristics for graph coloring register allocators. Finally, the authors of this paper published a more technical proof (without using perfect elimination orders) of this paper's result in a technical report [8].

Acknowledgements

We thank our colleagues Michael Beck, Marco Gaertler, Götz Lindenmaier and especially Rubino Geiß for many fruitful discussions. We also thank the anonymous referees for their suggestions helping to improve this paper.

References

[1] C. Andersson, Register allocation by optimal graph coloring, in: G. Hedin (Ed.), CC 2003, Lecture Notes in Comput. Sci., vol. 2622, Springer-Verlag, Heidelberg, 2003, pp. 33–45.
[2] P. Brisk, F. Dabiri, J. Macbeth, M. Sarrafzadeh, Polynomial time graph coloring register allocation, in: 14th Internat. Workshop on Logic and Synthesis, ACM Press, New York, 2005.
[3] Z. Budimlić, K.D. Cooper, T.J. Harvey, K. Kennedy, T.S. Oberg, S.W. Reeves, Fast copy coalescing and live-range identification, in: Proc. ACM SIGPLAN 2002 Conference on Programming Language Design and Implementation, ACM Press, New York, 2002, pp. 25–32.
[4] G.J. Chaitin, M.A. Auslander, A.K. Chandra, J. Cocke, M.E. Hopkins, P.W. Markstein, Register allocation via coloring, J. Comput. Languages 6 (1981) 45–57.
[5] R. Cytron, J. Ferrante, B.K. Rosen, M.N. Wegman, F.K. Zadeck, An efficient method of computing static single assignment form, in: Symp. on Principles of Programming Languages, ACM Press, New York, 1989, pp. 25–35.
[6] F. Gavril, Algorithms for minimum coloring, maximum clique, minimum covering by cliques, and independent set of a chordal graph, SIAM J. Comput. 1 (2) (1972) 180–187.
[7] M.C. Golumbic, Algorithmic Graph Theory and Perfect Graphs, Academic Press, New York, 1980.
[8] S. Hack, Interference graphs of programs in SSA-form, Tech. Rep. 2005-25, Universität Karlsruhe, June 2005.
[9] T. Lengauer, R.E. Tarjan, A fast algorithm for finding dominators in a flowgraph, Trans. Programm. Languages Systems 1 (1) (1979) 121–141.
[10] H.S. Warren, Hacker's Delight, Addison-Wesley, Reading, MA, 2003.

Register Allocation via Coloring of Chordal Graphs

Fernando Magno Quintão Pereira and Jens Palsberg

UCLA Computer Science Department, University of California, Los Angeles

Abstract. We present a simple algorithm for register allocation which is competitive with the iterated register coalescing algorithm of George and Appel. We base our algorithm on the observation that 95% of the methods in the Java 1.5 library have chordal interference graphs when compiled with the JoeQ compiler. A greedy algorithm can optimally color a chordal graph in time linear in the number of edges, and we can easily add powerful heuristics for spilling and coalescing.
Our experiments show that the new algorithm produces better results than iterated register coalescing for settings with few registers and comparable results for settings with many registers.

1 Introduction

Register allocation is one of the oldest and most studied research topics of computer science. The goal of register allocation is to allocate a finite number of machine registers to an unbounded number of temporary variables such that temporary variables with interfering live ranges are assigned different registers. Most approaches to register allocation have been based on graph coloring. The graph coloring problem can be stated as follows: given a graph G and a positive integer K, assign a color to each vertex of G, using at most K colors, such that no two adjacent vertices receive the same color. We can map a program to a graph in which each node represents a temporary variable and edges connect temporaries whose live ranges interfere. We can then use a coloring algorithm to perform register allocation by representing colors with machine registers.

In 1982 Chaitin [8] reduced graph coloring, a well-known NP-complete problem [18], to register allocation, thereby proving that also register allocation is NP-complete. The core of Chaitin's proof shows that the interference relations between temporary variables can form any possible graph. Some algorithms for register allocation use integer linear programming and may run in worst-case exponential time, such as the algorithm of Appel and George [2]. Other algorithms use polynomial-time heuristics, such as the algorithm of Briggs, Cooper, and Torczon [5], the Iterated Register Coalescing algorithm of George and Appel [12], and the Linear Scan algorithm of Poletto and Sarkar [16]. Among the polynomial-time algorithms, the best in terms of resulting code quality appears to be iterated register coalescing. The high quality comes at the price of handling spilling and coalescing of temporary variables in a complex way. Figure 1
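To make the coloring view of register allocation concrete, here is a minimal Python sketch of greedy coloring for a chordal interference graph. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the graph is a plain dictionary from each temporary to the set of temporaries it interferes with, and the elimination order is found by naively searching for and removing a simplicial vertex at each step, which is simple but slower than the linear-time bound quoted in the abstract.

```python
import itertools


def perfect_elimination_order(adj):
    """Return a perfect elimination order of an undirected graph.

    adj maps each vertex to the set of its neighbours. A ValueError is
    raised when no simplicial vertex is left, i.e. the graph is not chordal.
    """
    remaining = {v: set(ns) for v, ns in adj.items()}
    order = []
    while remaining:
        for v, ns in remaining.items():
            # v is simplicial if its remaining neighbours form a clique.
            if all(b in remaining[a] for a in ns for b in ns if a != b):
                break
        else:
            raise ValueError("graph is not chordal")
        order.append(v)
        # Delete v together with its incident edges.
        for n in remaining.pop(v):
            remaining[n].discard(v)
    return order


def greedy_color(adj):
    """Color vertices in reverse elimination order with the lowest free color."""
    color = {}
    for v in reversed(perfect_elimination_order(adj)):
        used = {color[n] for n in adj[v] if n in color}
        color[v] = next(c for c in itertools.count() if c not in used)
    return color
```

Reading each color as a machine register, `greedy_color` assigns different registers to interfering temporaries; if the number of distinct colors exceeds the K available registers, some temporaries would have to be spilled, which this sketch does not attempt.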
