ebook img

From Safety To Termination And Back: SMT-Based Verification For Lazy Languages PDF

0.18 MB·
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview From Safety To Termination And Back: SMT-Based Verification For Lazy Languages

From Safety To Termination And Back: SMT-Based Verification For Lazy Languages NikiVazou EricL.Seidel RanjitJhala UCSanDiego 4 1 0 Abstract eager,call-by-valuelanguages,butwepresumedthattheorderof 2 SMT-based verifiershave long been an effective means of ensur- evaluation would surely prove irrelevant, and that the soundness ingsafetypropertiesofprograms.Whilethesetechniquesarewell guaranteeswouldtranslatetoHaskell’slazy,call-by-needregime. n Tooursurprise,weweretotallywrong. understood, we show that they implicitly require eager seman- a J tics; directlyapplying them toa lazy language isunsound due to 1.LazinessPrecludesPartialCorrectnessOurfirstcontributionis the presence of divergent sub-computations. We recover sound- todemonstratethatrefinementtypingisunsoundinthepresenceof 4 ness by composing the safety analysis with a termination analy- lazyevaluation(§2).Considertheprogram: 2 sis.Ofcourse,terminationisitselfachallengingproblem,butwe foo :: n:Nat -> {v:Nat | v < n} show how the safety analysis can be used to ensure termination, ] foo n = if (n > 0) then (n-1) else (foo n) L therebybootstrappingsoundnessfortheentiresystem.Thus,while safetyinvariantshavelongbeenrequiredtoprovetermination,we P bar :: z:Pos -> x:Int -> Int show how termination proofs can be to soundly establish safety. . bar z x = 2014 ‘div‘ z s We have implemented our approach in LIQUIDHASKELL,a Re- c finementType-basedverifierforHaskell.Wedemonstrateitseffec- [ main = let (a,b) = (0, foo 0) in bar a b tivenessviaanexperimentalevaluationusingLIQUIDHASKELLto verifysafety,functional correctnessandterminationpropertiesof A standard refinement type checker will happily verify the 1 real-worldHaskelllibraries,totalingover10,000linesofcode. above program. The refinement type signature for foo captures v the partial correctness property: the function foo requires non- 7 2 1. Introduction negativeinputsandensuresthatitsoutput(if oneisproduced!)will bestrictlylessthanitsinput.Consequently,thecheckerconcludes 2 SMT-basedverifiers,basedonFloyd-HoareLogic(e.g.EscJava[12]), thatatthecall-siteforbar,thevaluezequals0andthevaluex 6 or combined with abstract interpretation (e.g. SLAM [1]), have is some non-negative integer that is strictly less than 0. In other . 1 been highly effective at the automated verification of imperative words,thechecker concludesthattheenvironment isinconsistent 0 programs. In the functional setting, these techniques are general- andhencetriviallytypechecks. 4 izedasrefinementtypes,whereinvariantsareencodedbycompos- The issue isindependent of refinement typing and affects any 1 ingtypeswithSMT-decidablerefinementpredicates[28,38].For Floyd-Hoarelogic-basedverifier.Forexample,ahypotheticalESC- : example Scala would verify the following code which restates the above v i type Pos = {v:Int | v > 0} usingclassicalrequiresandensuresclauses: X type Nat = {v:Int | v >= 0} def foo(n:Int):Int = r //@ requires (0 <= n) a are the basic type Int refined with logical predicates that state //@ ensures (0 <= \result && \result < n) that “the values” v described by the typeare respectively strictly if (n > 0) return (n-1) else return foo(n) positive and non-negative. We encode pre- and post-conditions (contracts)usingrefinedfunctiontypeslike def bar(z:Int, x:=>Int) = div :: n:Nat -> d:Pos -> {v:Nat | v <= n} //@ requires 0 < z return (2014 div z) whichstatesthatthefunctiondivrequiresinputsthatarerespec- tivelynon-negativeandpositive,andensuresthattheoutputisless def main = {val (a,b) = (0,foo(0));bar(a,b)} thanthefirstinputn.Ifaprogramcontainingdivstaticallytype- checks, we can rest assured that executing the program will not Oneshouldnotbealarmedasthisdeductionisperfectlysound leadtoanyunpleasant divide-by-zero errors.Severalgroupshave under eager, call-by-value semantics. In both cases, the verifier demonstratedthatrefinementscanbeusedtostaticallyverifyprop- determines that the call to bar is dead code – the call is safe ertiesrangingfromsimplearraysafety[27,38]tofunctionalcor- becauseitisnotinvokedatall.Thisreasoningisquiteunsoundfor rectnessofdatastructures[18],securityprotocols[3,5],andcom- Haskell’s lazy, call-by-need semantics, and Scala’s lazy, call-by- pilercorrectness[13]. nameparameters(indicated bythe:=>typeannotation). Inboth Given the remarkable effectiveness of the technique, we em- cases, the program execution would skip blithely over the call to barkedontheprojectofdeveloping arefinementtypebasedveri- foo,plungeheadlongintothediv,andcrash. fierforHaskell,assumingthatthestandardsoundnessproofsfrom Asweshow,theproblemisthatwithlazyevaluation,onecan Floyd-Hoarelogicsandrefinementtypeswouldcarryoverdirectly. onlytrustarefinementorinvariantifoneisguaranteedthatevalu- Ofcourse,thepreviouslogicsandsystemswerealldevelopedfor atingthecorrespondingtermwillnotdiverge.Thatis,theclassical 1 2014/1/27 Floyd-Hoareseparationof“partial”and“total”correctnessbreaks because,tocheckthatthesecondparameteryhastypePosatthe down,andevensafetyverificationrequirescheckingtermination. calltodiv,thesystemwillissueasubtypingquery 2. Termination for and by Refinement Typing The prognosis x:{x≥0}, y:{y≥0} ⊢{v=y}(cid:22){v>0} seems dire: to solve one problem it appears we must first solve a harder one! Our second contribution is to demonstrate that re- whichreducestotheinvalidSMTquery finementtypescanthemselvesbeusedtoprovetermination(§2). In particular, weshow how toadapt theclassical idea of ranking (x≥0)∧(y≥0) ⇒(v=y)⇒(v>0) functions[32],asembodied viasizedtypes[2,15],tothesetting Ontheotherhand,thesystemwillaccepttheprogram of Refinement Typing. The key idea isto ensure that each recur- sivecallismadewithparametersofstrictlydecreasingNat-valued good :: Nat -> Nat -> Int size. good x y = x ‘div‘ (y + 1) We show that refinements naturally encode sized types and Here,thecorrespondingsubtypingqueryis generalize them in useful ways by allowing (1) different notions of size to account for recursive data types, (2) lexicographically x:{x≥0}, y:{y≥0} ⊢{v=y+1}(cid:22){v>0} (1) orderedrankingfunctionstosupportcomplexformsofrecursion, andmostimportantly(3)theuseofauxiliaryrelationalinvariants whichreducestothevalidSMTquery (circularly, via refinement types!) to verify that sizes decrease in (x≥0)∧(y≥0) ⇒(v=y+1)⇒(v>0) (2) non-structurallyrecursivefunctions. Our use of refinements to prove termination makes proving 2.1 LazinessMakesRefinementTypingUnsound soundness interesting in two ways. First, to check Haskell code- bases,wecannotrequirethateverytermterminates,andsowemust Next,letuslookindetailattheprogramwithfooandbarfrom supportprogramscontainingtermsthatmaydiverge.Second,there §1.Astandardrefinementtypecheckerfirstverifiesthesignature is a circularity in the soundness proof itself as termination is re- of foo in the classical rely-guarantee fashion, by (inductively) quiredtoproverefinementssoundandviceversa.Weaddressthese assuming its type and checking that its body yields the specified issuesbydeveloping acorecalculus(§3)withoptimisticseman- output.Inthethenbranch,theoutputsubtypingobligation tics[11],provingthosesemanticsequivalenttocall-by-nameeval- n:{n≥0}, :{n>0} ⊢{v=n−1}(cid:22){0≤v∧v<n} uation, andthenproving soundness withrespect totheoptimistic semantics(§4).Thus,whileitiswellknownthatsafetyproperties reducestothevalidSMTformula (invariants) areneeded toprove termination[8],we show for the firsttimehowterminationpropertiesareneededtoprovesafety. n≥0 ∧ n>0 ⇒(v=n−1)⇒ (0≤v∧v<n) 3. Refinement Types for Real-World Haskell We have imple- Intheelsebranch,theoutputisprovenbyinductivelyusingthe mented our technique by extending LIQUIDHASKELL [34]. Our assumedtypeforfoo.Next,insidemainthebinderbisassigned third contribution is an experimental evaluation of the effective- theoutputtypeoffoowiththeformalnreplacedwiththeactual ness of our approach by using LIQUIDHASKELL to check sub- 0.Thus,thesubtypingobligationatthecalltobaris stantial,real-worldHaskelllibrariestotalingover10,000linesof a:{a=0}, b:{0≤b∧b<0} ⊢{v=a}(cid:22){v>0} code (§ 5). The verification of these libraries requires precisely analyzing recursive structureslikelistsand trees,tracking there- whichreducestotheSMTquery lationshipsbetweentheircontents,inferringinvariantsinthepres- enceoflow-levelpointerarithmetic,andliftingtheanalysisacross a=0 ∧ (0≤b∧b<0) ⇒(v=a)⇒(v>0) polymorphic, higher-order functions. We demonstrate that by us- whichistriviallyvalidastheantecedentisinconsistent. ing LIQUIDHASKELL we were able to prove termination and a Unfortunately, this inconsistency is unsound under Haskell’s varietyofcriticalsafetyandfunctionalcorrectnesspropertieswith lazy evaluation! Since b is not required, the program will dive a modest number of manually specified hints, and even find and headlong into evaluating the div and hence crash, rendering the fixasubtlecorrectness bugrelatedtounicode handling in TEXT. typechecker’sguaranteemeaningless. Tothebest ofour knowledge, thisisthemostsubstantial evalua- tionofrefinementtypesonthirdpartycode,anddemonstratesthat ReconcilingLazinessAndRefinementsOnemaybetemptedtoget SMT-based safety- and termination-verification can be practical aroundtheunsoundnessviaseveraldifferentroutes.First,onemay andeffectiveforhigher-order,functionallanguages. betemptedtopointthefingerofblameatthe“inconsistency”itself. Unfortunately,thiswouldbemisguided,sincesuchinconsistencies are not a bug but a crucial feature of refinement type systems. 2. Overview Theyenable,amongotherthings,pathsensitivitybyincorporating Westart withan overview of our contributions. Afterquickly re- information from run-time tests (guards) and hence let us verify capitulating the basics of refinement types we illustrate why the thatexpressionsthatthrowcatastrophicexceptions(e.g.error e) approach isunsound inthepresence oflazyevaluation. Next,we areindeedunreachabledeadcodeandwillnotexplodeatrun-time. showthatwecanrecoversoundnessbycouplingrefinementswith Second,onemightuseaCPStransformation[25,36]toconvertthe a termination analysis. Fortunately, we demonstrate how we can programintocall-by-value.Weconfesstobesomewhatwaryofthe bootstrapoffofrefinementstosolvetheterminationproblem,and prospectoftranslatinginferredtypesanderrorsbacktothesource henceobtainasoundandpracticalverifierforHaskell. level after such a transformation. Previous experience shows that SMT-BasedRefinementTypeCheckingRecalltherefinementtype theabilitytomaptypesanderrorstosourceiscriticalforusability. aliases Pos and Nat and the specification for div from § 1. A Third, one may want some form of strictness analysis [20] to refinementtypesystemwillusethesetoreject statically predict which expressions must be evaluated, and only use refinements for those expressions. This route is problematic bad :: Nat -> Nat -> Int as it is unclear whether one can develop a sufficiently precise bad x y = x ‘div‘ y strictnessanalysis.Moreimportantly,itisoftenusefultoaddghost valuesintotheprogramforthesolepurposeofmakingrefinement 2 2014/1/27 typescomplete[33].Byconstructionthesevaluesarenotusedby Ourapproachisproperlyviewedasanoptimizationthatstrength- the program, and would be thrown away by a strictnessanalysis, ens the “direct” refinement (with a bottom disjunct) with a ter- thusprecludingverification. minationanalysisthatletsuseliminatethebottomdisjunctinthe commoncaseofterminatingterms.Ingeneral,thisapproachleaves 2.2 EnsuringSoundnessWithTermination openthepossibilityofdirectlyreasoningaboutbottom(e.g.when Thecruxoftheproblemisthatwhenweestablishthat reasoning about infinite streams), as we do not require that all termsbeprovablyterminating,butonlytheoneswith(non-trivial) Γ⊢e:{v:Int|p} refinementswithout thebottomdisjunct.Thedual approach–us- ingsafetyinvariantstostrengthenandproveterminationisclassi- whatwehaveguaranteedisthatif ereducestoanintegern,thenn cal[8];thisisthefirsttimeterminationhasbeenusedtostrengthen satisfiesthelogicalpredicatep[n/v][14,19].Thus,toaccountfor andprovesafety! divergingcomputations,weshouldproperlyviewtheabovetyping Asanaside,readersfamiliarwithfullydependentlytypedlan- judgmentasweakenedwithabottomdisjunct guageslikeAgda[21]andCoq[4]maybeunsurprisedatthetermi- Γ⊢e:{v:Int|v=⊥∨p} nationrequirement. However, therolethatterminationplayshere isquitedifferent.Inthosesettings,arbitrarytermsmayappear in Now,considertheexpressionlet x = e in e’.Inanea- types;terminationensuresthesemanticsarewell-definedandfacil- ger setting, we can readily eliminate the bottom disjunct and as- itatestypeequivalence.Incontrast,refinementlogicsarecarefully sumethatxsatisfiesp[x/v]whenanalyzinge’becauseifedi- designedtoprecludearbitraryterms;theyonlyallowlogicalpred- verges,thene’isnotevaluated.Inotherwords,themerefactthat icates over well-defined, decidable theories, which crucially also evaluation of e’ began allows us to conclude that x6=⊥, and so includeprogramvariables.Aswesaw,thischoiceissoundunder wecaneliminatethebottomdisjunctwithoutcompromisingsound- call-by-value,butproblematicundercall-by-name,asinthelatter ness.However,inalazysettingwecannotdropthebottomdisjunct settingevenamerefirst-ordervariablecancorrespondtoanunde- becausewemaywellevaluatee’evenifediverges! fineddivergingcomputation. One way forward isto take the bull by the horns and directly reason about divergence and laziness using the bottom disjunct. 2.3 EnsuringTerminationWithRefinements Thatis,toweakeneachrefinementwiththebottomdisjunct.While sound, such a scheme is imprecise as the hypotheses will be too Howshallweprovetermination?Fortunately,thereisagreatdeal weak to let us prove interesting relational invariants connecting ofresearchonthisproblem.Theheart ofalmostalltheproposed different programvariables.Forinstance, thesubtyping queryfor solutions isthe classical notion of ranking functions [32]: in any good(1),ifweignorethexbinder,becomes: potentially looping computation, prove that some well-founded metricstrictlydecreaseseverytimearoundtheloop. y:{y=⊥∨y≥0} ⊢{v=⊥∨v=y+1}(cid:22){v=⊥∨v>0} SizedTypesInthecontextoftypedfunctional languages, thepri- whichboilsdowntotheSMTquery marysourceofloopingcomputationsisrecursivefunctions.Thus, the above principle can be formalized by associating a notion of (y=⊥∨y≥0) ⇒(v=⊥∨v=y+1)⇒(v=⊥∨v>0) size with each type, and verifying that in each recursive call, the whichisinvalid,causingustorejectaperfectlysafeprogram! function isinvoked witharguments whose sizeisstrictlysmaller Onemighttrytomakethedirectapproachalittlelessna¨ıveby than the current inputs. Thus, one route to recovering soundness somehow axiomatizing the semantics of operators like + to stip- wouldbetoperformafirstphaseof sizeanalysislike[2,15,29] ulate that the result (e.g. v above) is only non-bottom when the toverifytheterminationofrecursivefunctions,andthencarryout operands(e.g.yand1)arenon-bottom.Inessence,suchanaxiom- refinement typing in a second phase. However, (as confirmed by atization would end up encoding lazy evaluation inside the SMT ourevaluation)provingthatsizesdecreaseoftenrequiresauxiliary solver’slogic,andwouldindeedmaketheabovequeryvalid.How- invariantsofthekindrefinementsaresupposedtoestablishinthe ever,itisquiteuncleartoushowtodesignsuchanaxiomatization firstplace[8].Instead,like[37],wedevelopameansofencoding inasystematicfashion.Worse,evenwithsuchanaxiomatization, sizesandprovingterminationcircularly,viarefinementtypes. wewouldneverbeabletoverifytrivalprogramslike UsingTheValueAsTheSizeConsiderthefunction baz :: x:Int -> y:Int -> {z:Int | x > y} -> Int fib :: Nat -> Int baz x y z = assert (x > y) 0 fib 0 = 1 as the bottom disjunct on z – which is never evaluated – would fib 1 = 1 precludetheverifierfromusingtherefinementrelatingthevalues fib n = fib (n-1) + fib (n-2) ofxandythatisneededtoprovetheassertion.Theaboveexample is not contrived; it illustrates a common idiom of using ghost Recall that in our setting Nat is simply non-negative Integer. variablesthatcarry“proofs”aboutotherprogramvariables. Thus,wecannaturallyassociateasizewithitsownvalue. The logical conclusion of the above line of inquiry is that to restoresoundnessandpreserveprecision,weneedameansofpre- TerminationviaEnvironment-WeakeningToverifysafety,astan- ciselyeliminatingthebottomdisjunct,i.e.ofdeterminingwhena dardrefinementtypechecker(orHoare-logicbasedprogramveri- term is definitely not going to diverge. That is, we need a termi- fier)wouldcheckthebodyassuminganenvironment: nation analysis. With such an analysis, using Reynolds’ [26] ter- n:Nat, fib:Nat→Int minology, we could additionally type each term as either trivial, meaning it must terminate, or serious, meaning it may not termi- Consequently, it would verify that at the recursive call-site the nate.Furnishedwiththisinformation,therefinementtypechecker argumentn−2isaNat,bycheckingtheSMTvalidityof maysoundly dropthebottomdisjunctfortrivialexpressions, and keepthebottomdisjunct(orjustenforcetherefinementtrue)for seriousterms. (n≥0∧ n6=0∧n6=1)⇒ (v=n−2)⇒v≥0 3 2014/1/27 andthecorrespondingformulaforn−1,therebyguaranteeingthat Thistypeisonlyvalidifmapprovablyterminates.Wesimultane- fibrespectsitssignature.Fortermination,wetweaktheprocedure ously verify termination and the type as before, by checking the bycheckingthebodyinatermination-weakenedenvironment bodyinthetermination-weakenedenvironment n:Nat, fib:{n′:Nat|n′ <n}→Int ys:[a] map:(a→b)→{ys′:[a]|lenys′ <lenys}→ListEqbys′ wherewehaveweakenedthetypeoffibbystipulatingthatitonly berecursivelycalledwithNatvaluesn′ thatarestrictlylessthan To ensure that the body recursively uses map per its weakened thecurrentparametern.Thebodystilltypechecksas specification,therecursivecallmap f xsgenerates thesubtyp- ingquery (n≥0∧ n6=0∧n6=1) ⇒ (v=n−2)⇒v≥0∧v<n ys:{lenys=1+lenxs} is a valid SMT formula. We prove (Theorem 3) that since the xs:{lenxs≥0} ⊢ {ys′ =xs}(cid:22){lenys′ <lenys} bodytypechecksundertheweakenedassumptionfortherecursive binder,thefunctionwillterminateonallNats. Thankstothecaseunfolding,thetypeofysisstrengthenedwith Lexicographic Termination Our environment-weakening tech- themeasurerelationshipwiththetailxs[18].Hence,thesubtyping niquegeneralizestoincludeso-calledlexicographicallydecreasing abovereducestothevalidSMTquery measures.Forexample,considertheAckermannfunction. lenys=1+lenxs ack :: Nat -> Nat -> Nat ∧lenxs≥0 ⇒ (ys′ =xs) ⇒ (lenys′ <lenys) ack m n | m == 0 = n + 1 Hence, usingrefinementsand environment-weakening, wesimul- | n == 0 = ack (m-1) 1 taneouslyverifyterminationandtheoutputtypeformap. | otherwise = ack (m-1) (ack m (n-1)) WitnessingTerminationSometimes,thedecreasingmetriccannot beassociatedwithasingleparameterorlexicographicallyordered Wecannot provethatsomeargument alwaysdecreases; thefunc- sequence of parameters, but is instead an auxiliary value that is tionterminatesbecauseeitherthefirstargumentstrictlydecreases a function of the parameters. For example, here is the standard or the first argument remains the same and the second argument mergefunctionfromtheeponymoussortingprocedure: strictlydecreases, i.e.thepair of argumentsstrictlydecreases ac- cording to awell-founded lexicographic ordering. Toaccount for merge xs@(x:xs’) ys@(y:ys’) suchfunctions, wegeneralize thenotionof weakening toallowa | x < y = x : merge xs’ ys sequence of witness arguments while requiring that (1) each wit- | otherwise = y : merge xs ys’ ness argument be non-increasing, and (2) the last witness argu- neitherparameterprovablydecreasesinbothcalls,butthesumof mentbestrictlydecreasing ifthepreviousargumentswereequal. thesizesoftheparametersstrictlydecreases. Theserequirementscanbeencoded bygeneralizingthenotionof Inthesecases,wejustexplicatethemetricwithaghostparam- environment-weakening:wecheckthebodyofackunder eterthatactsasawitnessfortermination: m:Nat, merge d xs@(x:xs’) ys@(y:ys’) n:Nat, | x < y = x : merge dx xs’ ys ack:m′:Nat→{n′:Nat|m′ ≤m∨m′ =m⇒n′ <n}→Nat | otherwise = y : merge dy xs ys’ where Thus,whilesoundrefinementtypingrequiresprovingtermination, dx = length xs’ + length ys onthebrightside,refinementsmakeprovingterminationeasy. dy = length xs + length ys’ Measuring The Size of Structures Consider the function map wherelength :: zs:[a] -> {v:Nat | v = len zs} definedoverthestandardlisttype. returnsthenumberofelementsinthelist.Now,thesystemverifies map f [] = [] thatdequalsthesumofthesizesofthetwolists,andthatineach map f ys@(x:xs) = f x : map f xs recursivecalldxanddyarestrictlysmallerthand,therebyproving thatmergeterminates. Inmap,therecursivecallismadetoa“smaller”input.Weformal- izethenotionofsizewithmeasures. 3. Language measure len :: [a] -> Nat Next,wepresentacorecalculusλ↓thatformalizesourapproachof len [] = 0 refinementtypesunderlazyevaluation.Insteadofprovingsound- len (x:xs’) = 1 + (len xs’) ness directly on lazy, call-by-name (CBN) semantics, we prove soundness with respect to an optimistic (OPT) semantics where Withtheabovedefinition,themeasurestrengthensthetypeofthe (provably) terminating termsare eagerly evaluated, and we sepa- dataconstructorsto: ratelyproveanequivalencerelatingtheCBNandOPTevaluation [] :: {v: [a] | len v = 0} strategies.Withthisinmind,letusseethesyntax(§3.1),thedy- (:) :: x:a -> xs:[a] namicsemantics(§3.2),andfinally,thestaticsemantics(§4). -> {v:[a] | len v = 1 + len xs} 3.1 Syntax wherelenissimplyanuninterpretedfunctioninSMTlogic[18]. Figure1summarizesthesyntaxofexpressionsandtypes. Wecannowverifythatmapdoesnotchangethelengthofthelist: ConstantsTheprimitiveconstantsincludebasicvaluesliketrue, type ListEq a YS = {v:[a] | len v = len YS} false, 0, 1, etc., arithmetic and logical operators like +,−,≤,/, ∧,¬.Inaddition,weincludeaspecial(untypable)crashconstant map :: (a -> b) -> ys:[a] -> ListEq b ys thatmodelserrors.Primitiveoperationswillreturnacrashwhen 4 2014/1/27 SeriousandTrivialExpressionsFollowingReynolds[26],wesay Expressions e ::= x | c | λx.e | ee thatanexpressioneistrivialife֒→∗ vforsomevaluev6=crash. | letx=eine | µf.λx.e n otherwisewecalltheexpressionserious.Thatis,anexpressionis BasicTypes b ::= nat | bool trivialiff itreducestoanon-crashvalueunderCBN.Notetheset Label l ::= ↑|↓ of trivialexpressions isclosedunder evaluation, i.e.ifeistrivial ande֒→e′(foranydefinitionofFin)thene′isalsotrivial. Type τ ::= {v:bl |e} | x:τ →τ Optimistic Evaluation We define optimistic evaluation as the Figure1. Syntaxofλ↓TermsandTypes small-steprelation֒→oobtainedbydefining e∈Fin⇔eistrivial∧enotavalue In§4wewillprovesoundnessofrefinementtypingwithrespect Contexts C ::= • | vC | letx=Cine tooptimisticevaluation.Toshowsoundnessunderlazyevaluation, Values v ::= c | λx.e | µf.λx.e weadditionallyproved[31]thefollowingtheoremthatrelateslazy andoptimisticevaluation. C[e] ֒→ C[e′] ife∈Fin,e֒→e′ Theorem1(OptimisticEquivalence). e֒→∗ c⇔e֒→∗ c. (λx.e)ex ֒→ e[ex/x] ifex 6∈Fin n o (µf.λx.e)ex ֒→ e[ex/x][µf.λx.e/f] ifex 6∈Fin letx=exine ֒→ e[ex/x] ifex 6∈Fin 4. TypeSystem e1e2 ֒→ e′1e2 ife1 ֒→e′1 Next,wedevelopthestaticsemanticsforλ↓ andprovesoundness cv ֒→ [[c]](v) withrespecttotheOPT,andhenceCBN,evaluationstrategies.We developthetypesysteminthreesteps.First,wepresentageneral Figure2. OperationalSemanticsofλ↓ typesystemwhereterminationistrackedabstractlyvialabelsand showhowtoachievesoundrefinementtypingusingagenericter- minationoracle(§4.1).Next,wedescribeaconcreteinstantiation invoked withinputs outside their domain, e.g. when / isinvoked oftheoracleusingrefinements(§4.2).Thisdecouplingallowsus witha0divisor,oranassertiscalledwithfalse. tomakeexplicittheexactrequirementsforsoundness,andalsohas ExpressionsInadditiontotheprimitiveconstantsc,λ↓expressions thepragmaticbenefitofleavingopenthedoorforemployingother includethevariables x,λ-abstractionsλx.eandapplications ee, kindsofterminationanalyses. let-binders let x = e in e, and the fix operator µf.λx.e for definingpotentiallydivergingrecursivefunctions. Basic Types Figure 1 summarizes the syntax of λ↓ types. Basic types in λ↓ are natural numbers nat and booleans bool. We in- clude only two basic types for simplicity; it isstraightforward to 3.2 DynamicSemantics addanytypewhosevaluescanbesized,andhenceorderedusinga Wedefinethedynamicbehaviorofλ↓ programsinFigure2using well-foundedrelation,asdiscussedin§2.3. asmall-stepcontextualoperationalsemantics. LabelsWeusetwolabelstodistinguishterminatinganddiverging ForcingEvaluationInourrules,contextsandvaluesaredefinedas terms.Intuitively,thelabel↓appearsintypesthatdescribeexpres- usual.Inadditiontotheusuallazysemantics,oursoundnessproof sionsthatalwaysterminate,whilethelabel↑impliesthattheex- requires a “helper” optimistic semantics. Thus, we parameterize pressionsmaynotterminate. the small step rules with a force predicate e ∈ Fin that is used todeterminewhethertoforceevaluationoftheexpressioneorto Types Types in λ↓ include the standard basic refinement type b deferevaluationuntilthevalueisneeded. annotatedwithalabel.Thetype{v:bl | e}describesexpressions oftypebthatsatisfytherefinemente.Intuitively,ifthelabellis↓ EvaluationRulesWehavestructuredthesmall-steprulessothatif thentheexpressiondefinitelyterminates,otherwiseitmaydiverge. afunction parameteror let-bindere ∈ Finthenweeagerlyforce evaluationofe(thefirstrule),andotherwisewe(lazily)substitute Finally,λ↓typesincludedependentfunctiontypesx:τ →τ. therelevantbinderwiththeunevaluatedparameterexpression(the SafetyWeformalizesafetypropertiesbygivingvariousprimitive second,thirdandfourthrule).Thefifthruleevaluatesthefunction constantstheappropriaterefinementtypes.Forexample, inanapplication,asitsvalueisalwaysneeded. 3 : {v:nat↓ |v=3} Constants The final rule, application of a constant, requires the (+) : x:nat↓ →y:nat↓ →{v:nat↓ |v=x+y} argumentbereducedtoavalue;inasinglesteptheexpressionis (/) : x:nat↓ →{v:nat↓ |v6=0}→nat↓ reducedtotheoutputoftheprimitiveconstantoperation. errorτ : {v:nat↓ |false}→τ EagerandLazyEvaluationTounderstandtheroleplayedbyFin, consider anexpressionoftheform(λx.e)ex.Ifex ∈ Finholds, Weassumethatforanyconstantcwithtype thenweeagerlyforceevaluationofex(viathefirstrule),otherwise Ty(c) =. x:{v:bl |e}→τ weβ-reduce bysubstitutingex insidethebodye(viathesecond rule). Note that if ex is already a value, then trivially, only the foranyvaluev,iftheformulaeisvalidthenthevalue[[c]](v)isnot second rule can be applied. The same idea generalizes to the let equal tocrashand canbetypedasτ.Thus,wedefinesafetyby andfixcases.Thus,wegetlazy(call-by-name)andeager(call-by- requiringthatatermdoesnotevaluatetocrash. value)semanticsbyinstantiatingFinappropriately . SeriousandTrivialTypesAtypeτ isseriousifτ = {v:b↑ | e}, e∈Fin ⇔ false Call-By-Name otherwise,itistrivial. e∈Fin ⇔ enotavalue Call-By-Value NotationWewritebltoabbreviatetheunrefinedtype{v:bl |true}. Wewrite֒→v (resp.֒→n)forthesmall-steprelationobtainedvia Weensurethatserioustypes(whichare,informallyspeaking, as- theCBVandCBNinstantiationsofFin.Wewritee ֒→j e′(resp. signedtopotentiallydivergingterms)areunrefined.Wewritebfor e֒→∗ e′)ifereducestoe′inatmostj(resp.finitelymany)steps. bl,whenlabellcanbeeither↓or↑. 5 2014/1/27 Well-Formedness Γ⊢τ SubtypingA judgment Γ ⊢ τ1 (cid:22) τ2 states that the type τ1 is a subtype of the type τ2 under environment Γ. That is, informally Γ′ =Trivial(Γ,v:b↓) Γ′⊢e:bool speaking,whenthefreevariablesofτ1andτ2areboundtovalues WF-↓ Γ⊢{v:b↓ |e} described by Γ, the set of values described by τ1 is contained in the set of values described by τ2. The interesting rules are the Γ⊢τ Γ,x:τ ⊢τ′ two rules that check subtyping of basic types. Rule (cid:22)-↑ checks WF-↑ WF-FUN subtyping on serious basic types, and requires that the supertype Γ⊢{v:b↑ |true} Γ⊢x:τ →τ′ isunrefined.Rule(cid:22)-↓checkssubtypingontrivialbasictypes.As usual,subtypingreducestoimplicationchecking:theembeddingof Subtyping Γ⊢τ1 (cid:22)τ2 theenvironmentΓstrengthenedwiththeinterpretationofe1inthe logicshouldimplytheinterpretationofe2intherefinementlogic. SmtValid([[Γ]]⇒[[e1]]⇒[[e2]]) Thecrucialdifferenceisinthedefinitionofembedding.Here,we (cid:22)-↓ Γ⊢{v:b↓ |e1}(cid:22){v:b↓ |e2} keeponlythetrivialbinders: (cid:22)-↑ [[Γ]] =. ^{[[e[x/v]]]|x:{v:bl |e}∈Trivial(Γ)} Γ⊢{v:bl |e}(cid:22){v:b↑ |true} TypingAjudgmentΓ ⊢ e : τ statesthattheexpressionehasthe Γ⊢τ2 (cid:22)τ1 Γ,x:τ2⊢τ1′ (cid:22)τ2′ typeτ underenvironment Γ.Thatis,whenthefreevariablesine (cid:22)-FUN Γ⊢x:τ1 →τ1′ (cid:22)x:τ2 →τ2′ areboundtovaluesdescribedbyΓ,theexpressionewillevaluate toavaluedescribedbyτ.Mostoftherulesarestandard,wediscuss Typing Γ⊢e:τ onlytheinterestingones. Γ⊢e:τ1 Γ⊢τ1(cid:22)τ2 Γ⊢τ2 A variable expression x has a type if a binding x:τ exists T-SUB in the environment. The rule that is used for typing depends on Γ⊢e:τ2 the structure of τ. If τ is basic and trivial, the rule T-↓ is used which,asusual,refinesthebasictypewiththesingletonrefinement T-CON x:{v:b↓ |e}∈Γ T-↓ v = x[23].Otherwise,therule T-VARisusedtotypexwithτ. Γ⊢c:Ty(c) Γ⊢x:{v:b↓ |v=x} If τ isbasicand serious itshould be unrefined, so itstype isnot strengthenedwiththesingletonrefinement. x:τ ∈Γ τ 6={v:b↓ |e} Thedependent applicationruleT-APPisstandard: thetypeof Γ⊢x:τ T-VAR theexpressione1 e2 isτ wherexisreplacedwiththeexpression e2.Notethatifτxisseriousthene2maydivergeandsoshouldnot Γ,x:τx ⊢e:τ Γ⊢x:τx →τ appearinanytype.Inthiscase,well-formednessensuresthatxwill Γ⊢(λx.e):x:τx →τ T-FUN ne2o.t)aOpupreasrysintesmidealτlo,wansdtrsivoiaτl[aer2g/uxm]e=ntsτto(wbehipcahssdeodestonaotfucnocnttiaoinn Γ⊢e1:(x:τx →τ) Γ⊢e2 :τx thatexpectsaseriousinput,butnottheotherwayaround. T-APP Γ⊢e1e2:τ[e2/x] Soundness With a Termination Oracle We prove soundness via preservationandprogresstheoremswithrespecttoOPTevaluation, Γ⊢ex :τx Γ,x:τx ⊢e:τ Γ⊢τ assumingtheexistenceofaterminationoraclethatassignslabels T-LET Γ⊢letx=exine:τ totypessothatseriousexpressionsgetserioustypes. Γ,x:τx,f:x:τx →τ ⊢e:τ Γ⊢x:τx →τ Hypothesis(TerminationOracle). If∅ ⊢ e : τ andτ isatrivial T-REC Γ⊢µf.λx.e:x:τx →τ type,theneisatrivialexpression. Theproofcanbefoundin[31],andreliesontwofacts:(1)We Figure3. Type-checkingforλ↓ defineconstantssothattheapplicationofaconstantisdefinedfor everyvalue(asrequiredforprogress)butpreservestypingforval- ues that satisfy their input refinements (as required for preserva- 4.1 Type-checking tion).(2)Variableswithserioustypesdonotappearinanyrefine- Next,wepresentthestaticsemanticsofλ↓bydescribingthetype- ments.Inparticular,theydonotappearinanytypeorenvironment; checkingjudgmentsandrules.AtypeenvironmentΓisasequence whichmakesitsafetotriviallysubstitutethemwithanyexpression oftypebindingsx:τ.Weuseenvironmentstodefinethreekindsof whilstpreservingtyping.Withthis,wecanprovethatapplication rules: Well-formedness, Subtyping and Typing, which are mostly ofaseriousexpressionspreservestyping. standard[3,19].Thechangesarethatserioustypesareunrefined, Wecombinepreservationandprogresstoobtainthefollowing andbinderswithserioustypescannotappearinrefinements. result.Thecrash-freedomguaranteestatesthatifatermeiswell- typed,itwillnotcrashundercall-by-namesemantics.Frompreser- Well-formedness A judgment Γ ⊢ τ states that the refinements vation and progress (and the fact that the constant crash has no in τ are boolean values in the environment Γ restricted only to type),wecanshowthatawell-typedtermwillnotcrashunderopti- thebinderswhosetypesaretrivial.Thekeyruleis WF-↓,which checksthattherefinementeofatrivialbasictypeisabooleanvalue misticevaluation(֒→o).Hence,wecanuseTheorem1toconclude undertheenvironmentΓ′whichcontainsallofthetrivialbindings thatthewell-typedtermcannotcrashunderlazyevaluation(֒→n). inΓextendedwiththebinding v : b↓.Togetthetrivialbindings Similarly,type-preservationistranslatedfrom֒→o to֒→n byob- weusethefunctionTrivialdefinedas: servingthatatermreducestoaconstantunder֒→oiffitreducesto . thesameconstantunder֒→n. Trivial(∅) = ∅ Trivial(x:τ,Γ) =. x:τ,Trivial(Γ) ifτ istrivial Theorem2(Safety). AssumingtheTerminationHypothesis, . Trivial(x:τ,Γ) = Trivial(Γ) otherwise • Crash-Freedom:If∅⊢e:τ thene6֒→∗ crash. n TheruleWF-↑ensuresserioustypesareunrefined. • Type-Preservation:If∅⊢e:τ ande֒→∗n cthen∅⊢c:τ. 6 2014/1/27 5. Evaluation τ isserious Γ⊢x:τx →τ Γ,x:τx,f :y:τx →τ ⊢e:τ WeimplementedourtechniquebyextendingLIQUIDHASKELL[34]. T-REC-↑ Γ⊢µf.λx.e:x:τx →τ Next,wedescribethetool,thebenchmarks,andaquantitativesum- mary of our results. We then present a qualitative discussion of τ istrivial Γ⊢x:τx →τ howLIQUIDHASKELLwasusedtoverifysafety,termination,and τx ={v:nat↓ |ex} τy ={v:nat↓ |ex∧v<x} functionalcorrectnesspropertiesofalargelibrary,anddiscussthe Γ,x:τx,f :y:τy →τ[y/x]⊢e:τ strengthsandlimitationsunearthedbythestudy. T-REC-↓ Γ⊢µf.λx.e:x:τx →τ Implementation LIQUIDHASKELL takes as input: (1) A Haskell source file, (2) Refinement type specifications, including refined Figure4. TypingrulesforTermination datatypedefinitions,measures,predicateandtypealiases,andfunc- tionsignatures,and(3)Predicatefragmentscalledqualifierswhich areusedtoinferrefinementtypesusingtheabstractinterpretation 4.2 TerminationAnalysis framework of Liquid typing [27]. The verifier returns as output, SAFEorUNSAFE,dependingonwhetherthecodemeetsthespec- Finally, we present an instantiation of the termination oracle that ificationsornot,and,importantlyfordebuggingthecode(orspec- essentiallygeneralizesthenotionofsizedtypes[2,15,29]tothe ification!)theinferredtypesforallsub-expressions. refinementsetting,andhencecompletethedescriptionofasound refinementtypesystemforλ↓.Asdescribedin§2,thekeyideais 5.1 BenchmarksandResults tochangetherulefortypingrecursivefunctions,sothatthebodyof thefunctionischeckedunderatermination-weakenedenvironment Our goal was to use LIQUIDHASKELL to verify a suite of real- wherethefunctioncanonlybecalledonsmallerinputs. world Haskell programs, to evaluate whether our approach is ef- ficient enough for large programs, expressive enough to specify Termination-Weakened EnvironmentsWeformalize this idea by keycorrectnessproperties,andpreciseenoughtoverifyidiomatic splitingtheruleT-RECthattypesfixpointsintothetworulesshown Haskellcodes.Thus,weusedtheselibrariesasbenchmarks: inFigure4.TheruleT-REC-↑canonlybeappliedwhentheoutput type τ is serious, meaning that a recursive function typed with • GHC.ListandData.List,whichtogetherimplementmany thisrulecan(wheninvoked)diverge.Incontrast,ruleT-REC-↓is standardlistoperations; weverifyvarioussizerelatedproper- usedtotypearecursivefunctionthatalwaysterminates,i.e.whose ties, output type istrivial. The rulerequires the argument x of such a • Data.Set.Splay, which implements a splay-tree based functiontobeatrivial(terminating)naturalnumber.Furthermore, functional set data type; we verify that all interface functions when typing the body e, the type of the recursive function f : terminateandreturnwellorderedtrees, y:τy →τ isweakenedtoenforcethatthefunctionisinvokedwith anargumentythatisstrictlysmallerthanxateachrecursivecall- • Data.Map.Base,which implements a functional map data site.Forclarityofexposition,werequirethatthedecreasingmetric type;weverifythatallinterfacefunctionsterminateandreturn bethevalueofthefirstparameter.Itisstraightforwardtogeneralize binary-searchorderedtrees[34], therequirementtoanywell-foundedmetriconthearguments. • VECTOR-ALGORITHMS, which includes a suite of “impera- Weprovethatourmodifiedtypesystemsatisfiesthetermination tive” (i.e. monadic) array-based sorting algorithms; we verify oracle hypothesis. That is, in λ↓ trivial types are only ascribed thecorrectnessofvectoraccessing,indexing,andslicing. totrivialexpressions.ThisdischargestheTerminationHypothesis • BYTESTRING,alibraryformanipulatingbytearrays,weverify neededbyTheorem2andprovessafetyofλ↓. termination,low-levelmemorysafety,andhigh-levelfunctional Theorem 3 (Termination). If ∅ ⊢ e:τ and τ is trivial then e is correctnessproperties, trivial. • TEXT, a library for high-performance unicode text process- ing;weverifyvariouspointersafetyandfunctionalcorrectness Thefullproofisin[31],herewesummarizethekeyparts. properties(5.2),duringwhichwefindasubtlebug. Well-formedTermsWecallatermewell-formedwithrespecttoa Wechosethesebenchmarksastheyrepresentawidespectrumof typeτ,writtenτ |=e,if:(1)ifτ ≡{v:b↓ |e′}thene֒→∗ v,for n idiomatic Haskell codes: the first three are widely used libraries somevaluev,and,(2)ifτ ≡ x:τ′ → τ′thene ֒→∗ v,forsome x n basedonrecursivedatastructures,thefourthandfifthperformsub- valuev,andforallex suchthat∅ ⊢ ex:τx′ andτx′ |= ex,wehave tle, low-level arithmetic manipulation of array indices and point- τ′[ex/x]|=eex. ers, and the last is a rich, high-level library with sophisticated Well-formedSubstitutionsAsubstitutionθ iseitherempty(∅)or application-specific invariants. Theselast threelibrariesareespe- of theformθ;[e/x].Asubstitutionθ iswell-formed withrespect ciallyrepresentativeastheypervasivelyinterminglehighlevelab- toanenvironment Γ,writtenΓ |= θ,ifeither bothareempty, or stractionslikehigher-orderloops,folds,andfusion,withlow-level the environment and the substitution are respectively Γ;x:τ and pointer manipulations inorder todeliver high-performance. They θ[e/x],and(1)Γ|=θ,(2)∅⊢ θe:θτ,and(3)θτ |=θe.Now,we areanappealingtargetfor LIQUIDHASKELL,asrefinementtypes canconnectrefinementtypesandtermination. areanidealwaytostaticallyenforcecriticalinvariantsthatareout- side the scope of run-time checking as even Haskell’s highly ex- Lemma1(Termination). IfΓ⊢e:τ andΓ|=θ,thenθτ |=θe. pressivetypesystem. The Termination Theorem 3 is an immediate corollary of Results Table 1 summarizes our experiments, which covered 39 Lemma1whereΓistheemptyenvironment. Sinceweareusing modules totaling 10204 non-comment lines of source code and refinements to prove termination, our termination proof requires 1652 lines of specifications. The results are on a machine with soundness and vice versa. We resolve this circularity by proving anIntelXeonX5660and32GBofRAM(nobenchmarkrequired Preservation, Progress, and the Termination Lemma by mutual morethan1GB.)Theupshot isthat LIQUIDHASKELLisveryef- induction,toobtainTheorem2withouttheTerminationHypothe- fectiveonreal-worldcodebases.Thetotaloverheadduetohints, sis[31]. i.e.thesum of Annot,Qualif,and Wit(Hintisincluded inAn- 7 2014/1/27 Module LOC Specs Annot Qualif Rec Serious Hint Wit Time(s) GHC.List 310 29/34 8/8 3/3 37 5 1 0 23 Data.List 504 10/18 7/7 0/0 50 2 5 3 38 Data.Set.Splay 149 27/37 5/5 0/0 17 0 4 3 25 Data.Map.Base 1396 123/173 12/12 0/0 88 0 9 2 219 Data.ByteString 1035 94/117 10/10 5/10 49 2 7 21 249 Data.ByteString.Char8 503 20/26 5/5 0/0 9 0 5 2 19 Data.ByteString.Fusion 447 27/42 0/0 8/8 23 0 0 6 55 Data.ByteString.Internal 272 44/101 1/1 10/10 5 0 0 0 12 Data.ByteString.Lazy 673 69/89 24/24 24/96 59 4 20 1 338 Data.ByteString.Lazy.Char8 406 13/19 4/4 0/0 7 0 4 0 20 Data.ByteString.Lazy.Internal 55 18/43 2/2 0/0 8 0 0 0 2 Data.ByteString.Unsafe 110 13/19 0/0 0/0 0 0 0 0 3 Data.Text 801 73/153 6/6 1/5 35 0 3 11 221 Data.Text.Array 167 35/87 2/2 7/9 1 0 0 1 7 Data.Text.Encoding 189 8/25 1/1 8/10 4 1 0 3 197 Data.Text.Foreign 84 7/11 0/0 2/2 7 0 0 2 6 Data.Text.Fusion 177 2/2 3/3 9/22 5 3 0 0 106 Data.Text.Fusion.Size 112 10/22 1/1 2/2 7 0 0 0 6 Data.Text.Internal 55 22/49 10/12 9/23 0 0 0 0 5 Data.Text.Lazy 801 74/150 12/12 1/5 50 0 12 1 215 Data.Text.Lazy.Builder 147 6/31 1/1 2/4 4 0 1 1 28 Data.Text.Lazy.Encoding 127 2/2 0/0 0/0 4 0 0 0 11 Data.Text.Lazy.Fusion 84 4/11 1/1 0/0 3 1 0 0 9 Data.Text.Lazy.Internal 66 19/38 5/5 2/4 5 0 0 0 3 Data.Text.Lazy.Search 104 13/68 5/5 0/0 6 0 5 0 161 Data.Text.Private 25 2/4 0/0 0/0 1 0 0 1 2 Data.Text.Search 61 5/15 0/0 2/2 4 0 0 3 20 Data.Text.Unsafe 74 10/33 5/5 2/7 0 0 0 0 6 Data.Text.UnsafeChar 51 8/11 0/0 2/2 0 0 0 0 4 Data.Vector.Algorithms.AmericanFlag 270 7/17 2/2 3/3 7 0 2 7 44 Data.Vector.Algorithms.Combinators 26 0/0 0/0 0/0 0 0 0 0 0 Data.Vector.Algorithms.Common 33 25/82 0/0 9/9 1 0 0 0 3 Data.Vector.Algorithms.Heap 170 11/31 1/1 0/0 4 0 1 2 53 Data.Vector.Algorithms.Insertion 51 4/14 0/0 0/0 2 0 0 1 4 Data.Vector.Algorithms.Intro 142 13/35 3/3 0/0 6 0 3 2 26 Data.Vector.Algorithms.Merge 71 0/0 3/3 1/1 4 0 3 4 25 Data.Vector.Algorithms.Optimal 183 3/12 0/0 0/0 0 0 0 0 36 Data.Vector.Algorithms.Radix 195 4/16 0/0 0/0 5 0 0 3 8 Data.Vector.Algorithms.Search 78 3/15 0/0 0/0 3 0 0 3 10 Total 10204 857/1652 139/141 112/237 520 18 85 83 2240 Table1. Aquantitativeevaluationofourexperiments.LOCisthenumberofnon-commentlinesofsourcecodeasreportedbysloccount.Specsisthe number(/line-count)oftypespecifications andaliases,datadeclarations, andmeasuresprovided.Annotisthenumber(/line-count)ofotherannotations provided, theseinclude invariants andhints forthetermination checker. Qualifisthenumber(/line-count) ofprovided qualifiers. Recisthenumberof recursivefunctionsinthemodule.Seriousisthenumberoffunctionsmarkedaspotentiallynon-terminating.Hintisthenumberofterminationhintsgiven toLIQUIDHASKELL,whichspecifywhichparameterdecreases(bydefault:thefirstparameterthathasasizemetric).Witisthenumberoffunctionsthat requiredtheadditionofaghostterminationwitnessparameter.Timeisthetime,inseconds,requiredtorunLIQUIDHASKELLonthemodule. not),is4.5%ofLOC.Thespecificationsthemselvesaremachine tersusingUTF-16,inwhichcharactersarestoredineithertwoor checkable versions of the comments placed around functions de- fourbytes. TEXTisavastlibraryspanning39modulesand5700 scribing safe usage and behavior. Our default metric, namely the linesofcode,howeverwefocusonthe17modulesthatarerelevant first parameter with an associated size measure, suffices to prove to the above properties. While we have verified exact functional 67%of (recursive) functionsterminating.30% requireahint(i.e. correctnesssizepropertiesforthetop-levelAPI,wefocushereon thepositionofthedecreasingargument)orawitness(3%required thelow-levelfunctionsandinteractionwithunicode. both),andtheremaining3%weremarkedaspotentiallydiverging. Arraysand TextsA Text consistsof an(immutable) Arrayof Ofthe18functionsmarkedaspotentiallydiverging,wesuspect6 16-bit words, an offset into the Array, and a length describing actuallyterminatebutwereunabletoproveso.Whilethereismuch thenumberofWord16sintheText.TheArrayiscreatedand roomforimprovingtherunningtimes,thetoolisfastenoughtobe filledusingamutableMArray.Allwriteoperationsin TEXTare usedinteractively,verifyahandfulofAPIfunctionsandassociated performedonMArraysintheSTmonad,buttheyarefrozeninto helpersinisolation. Arrays before being used by the Text constructor. We write a measure denoting the size of an MArray and use it to type the 5.2 CaseStudy:Text writeandfreezeoperations. Next,togiveaqualitativesenseofthekindsofpropertiesanalyzed during the course of our evaluation, we present a brief overview measure malen :: MArray s -> Int oftheverificationofTEXT,whichisthestandardlibraryusedfor predicate EqLen A MA = alen A = malen MA seriousunicodetextprocessinginHaskell. predicate Ok I A = 0 <= I < malen A TEXT uses byte arraysand streamfusion to guarantee perfor- type VO A = {v:Int| Ok v A} mancewhileprovidingahigh-levelAPI.InourevaluationofLIQ- UIDHASKELLonTEXT[22],wefocusedontwotypesofproper- unsafeWrite :: m:MArray s ties:(1)thesafetyofarrayindexandwriteoperations,and(2)the -> VO m -> Word16 -> ST s () functional correctness of the top-level API. These are both made unsafeFreeze :: m:MArray s moreinterestingbythefactthat TEXTinternallyencodescharac- -> ST s {v:Array | EqLen v m} 8 2014/1/27 Reasoning about Unicode The function writeChar (abbrevi- | otherwise -> do ating UnsafeChar.unsafeWrite) writes a Char into an let (z’,c) = f z x MArray. TEXT uses UTF-16 to represent characters internally, d <- writeChar arr i c meaningthateveryCharwillbeencodedusingtwoorfourbytes loop z’ s’ (i+d) (oneortwoWord16s). where j | ord x < 0x10000 = i | otherwise = i + 1 writeChar marr i c | n < 0x10000 = do Let’sfocusontheYield x s’case.Wefirstcomputethemax- unsafeWrite marr i (fromIntegral n) imumindexjtowhichwewillwriteanddeterminethesafetyof return 1 a write. If it is safe to write to j we call the provided function | otherwise = do fontheaccumulator zandthecharacterx,andwritetheresult- unsafeWrite marr i lo ing character c into the array. However, we know nothing about unsafeWrite marr (i+1) hi c,inparticular,whethercwillbestoredasoneortwoWord16s! return 2 Thus, LIQUIDHASKELLflagsthecall to writeCharas unsafe. where n = ord c The error can be fixed by lifting f z x into the where clause m = n - 0x10000 anddefiningthewriteindexjbycomparingord c(notord x). lo = fromIntegral LIQUIDHASKELL(andtheauthors)readilyacceptedourfix. $ (m ‘shiftR‘ 10) + 0xD800 hi = fromIntegral 5.3 CodeChanges $ (m .&. 0x3FF) + 0xDC00 Our case studies also highlighted some limitations of LIQUID- TheUTF-16encodingcomplicatesthespecificationofthefunction HASKELLthatwewilladdressinfuturework. Inmost cases,we aswecannotsimplyrequireitobelessthanthelengthofmarr; could alter the code slightly to facilitate verification. We briefly ifiweremalen marr - 1andcrequiredtwoWord16s,we summarizetheimportantcategorieshere;referto[31]fordetails. wouldperformanout-of-boundswrite.Weaccountforthissubtlety Ghost parameters are sometimes needed in order to materialize withapredicatethatstatesthereisenoughRoomtoencodec. valuesthatarenotneededforthecomputation,butarenecessaryto provethespecification;provingterminationmayrequireadecreas- predicate OkN I A N = Ok (I+N-1) A ing value that is a function of several parameters. Infuture work predicate Room I A C = if ord C < 0x10000 itwillbeinterestingtoexploretheuseofadvancedtechniquesfor then OkN I A 1 synthesizingrankingwitnesses[8]toeliminatesuchparameters. else OkN I A 2 Lazybinderssometimesgetinthewayofverification.Acommon type OkSiz I A = {v:Nat | OkN I A v} pattern inHaskell code isto define all local variables ina single type OkChr I A = {v:Char | Room I A v} whereclauseandusethemonlyinasubsetofallbranches.LIQ- UIDHASKELLflagsafewsuchdefinitionsasunsafe,notrealizing Room i marr csays“ifcisencodedusingoneWord16,then that the values will only be demanded in a specific branch. Cur- i must be less than malen marr, otherwise i must be less rently,wemanuallytransformthecodebypushingbindersinwards than malen marr - 1.” OkSiz I A is an alias for a valid totheusagesite.Thistransformationcouldbeeasilyautomated. numberofWord16sremainingaftertheindexIofarrayA.OkChr specifiestheCharsforwhichthereisroom(towrite)atindexI Assumes which can be thought of as “hybrid” run-time checks, inarrayA.ThespecificationforwriteCharstatesthatgivenan had to be placed in a couple of cases where the verifier loses arraymarr,anindexi,andavalidCharforwhichthereisroom information. Onesource istheintroduction of assumptions about atindexi,theoutputisamonadicactionreturningthenumberof mathematical operators that arecurrently conservatively modeled Word16occupiedbythechar. intherefinementlogic(e.g.thatmultiplicationiscommutativeand associative).Thesemayberemovedbyusingmoreadvancednon- writeChar :: marr:MArray s lineararithmeticdecisionprocedures. -> i:Nat -> OkChr i marr 6. RelatedWork -> ST s (OkSiz i marr) Nextwesituateourworkwithcloselyrelatedlinesofresearch. BugThus,clientsofwriteCharshouldonlycallitwithsuitable Dependent Types are the basis of many verifiers, or more gener- indicesandcharacters.UsingLIQUIDHASKELLwefoundanerror ally, proof assistants. In this setting arbitrary terms may appear inoneclient,mapAccumL,whichcombinesamapandafoldover insidetypes, so toprevent logical inconsistencies, and enable the aStream,andstorestheresultofthemapinaText.Consider checkingoftypeequivalence,alltermsmustterminate.“Full”de- theinnerloopofmapAccumL. pendentlytypedsystemslikeCoq[4],Agda[21],andIdris[6]typ- outer arr top = loop icallyusevariousstructuralcheckswhererecursionisallowedon where sub-termsofADTstoensurethatalltermsterminate.Onecanfake loop !z !s !i = “lightweight”dependenttypesinHaskell[7,10,24].Inthisstyle, case next0 s of theinvariantsareexpressedinarestricted[16]totalindexlanguage Done -> return (arr, (z,i)) and relationships (e.g. x < y and y < z) are combined (e.g. Skip s’ -> loop z s’ i x < z)byexplicitlyconstructingatermdenotingtheconsequent Yield x s’ from terms denoting the antecedents. On the plus side this “con- | j >= top -> do structive”approachensuressoundness.Itisimpossibletowitness let top’ = (top + 1) ‘shiftL‘ 1 inconsistencies,asdoingsotriggersdivergingcomputations.How- arr’ <- new top’ ever,itisunclearhoweasyitistouserestrictedindiceswithexplic- copyM arr’ 0 arr 0 top itlyconstructed relationstoverifythecomplex properties needed outer arr’ top’ z s i forlargelibraries. 9 2014/1/27 RefinementTypesareaformofdependenttypeswhereinvariants [4] Y.BertotandP.Caste´ran. Coq’Art:TheCalculusofInductiveCon- areencodedviaacombinationoftypesandSMT-decidablelogical structions. SpringerVerlag,2004. refinementpredicates[28,38].Refinementtypesofferahighlyau- [5] K.Bhargavan,C.Fournet,M.Kohlweiss,A.Pironti,andP.-YStrub. tomatedmeansofverificationandhavebeenappliedtocheckava- Implementingtlswithverifiedcryptographicsecurity.InIEEES&P, rietyofprogramproperties,includingfunctionalcorrectnessofdata 2013. structures[9,18],securityprotocols[3,5]andcompilers[13].The [6] Edwin Brady. Idris: general purpose programming with dependent language of refinements is restricted to ensure consistency, how- types. InPLPV,2013. ever,programvariables(binders),ortheirsingletonrepresentatives [7] M.T.Chakravarty,G.Keller,andS.L.Peyton-Jones.Associatedtype [38],arecruciallyallowedintherefinements.Asdiscussedin§2 synonyms.InICFP,2005. this is sound under call-by-value evaluation, but under Haskell’s [8] B.Cook,A.Podelski,andA.Rybalchenko. Termination proofsfor semanticsanyinnocent binder canbepotentiallydiverging, caus- systemscode. InPLDI,2006. ingunsoundness.Finally,ourimplementationisbasedonLIQUID- [9] J.Dunfield. RefinedtypecheckingwithStardust. InPLPV,2007. HASKELL[34],whichwasunsoundasitassumedCBVevaluation. [10] R.A.EisenbergandS.Weirich.Dependentlytypedprogrammingwith Size-basedTerminationAnalyseshavebeenusedtoverifytermi- singletons. InHaskellSymposium,2012. nationofrecursivefunctions,eitherusingthe“size-changeprinci- [11] R.Ennals andS.Peyton-Jones. Optimistic evaluation: Anadaptive ple”[2,17],orviathetypesystem[15,29]byannotatingtypeswith evaluationstrategyfornon-strictprograms.InICFP,2003. sizeindicesandverifyingthattheargumentsofrecursivecallshave [12] C.Flanagan,K.R.M.Leino,M.Lillibridge,G.Nelson,J.B.Saxe,and smaller indices. In work closely related to ours, Xi [37] encoded R.Stata. ExtendedstaticcheckingforJava. InPLDI,2002. sizesviarefinementtypestoprovetotalityofprograms.Whatdif- [13] C. Fournet, N. Swamy, J. Chen, P-E´. Dagand, P-Y. Strub, and ferentiatestheaboveworkfromoursisthatwedonotaimtoprove B.Livshits. Fullyabstractcompilationtojavascript. InPOPL,2013. thatallexpressionsconverge;onthecontrary,underalazysetting diverging expressions arewelcome. Weusesizeanalysistotrack [14] M.Greenberg,B.C.Pierce,andS.Weirich.Contractsmademanifest. divergingtermsinordertoexcludethemfromthelogic. JFP,2012. [15] J.Hughes,L.Pareto,andA.Sabry.Provingthecorrectnessofreactive Static Checkers like ESCJava [12] are a classical way of verify- systemsusingsizedtypes. InPOPL,1996. ing correctness through assertions and pre- and post-conditions. [16] L. Jia, J. Zhao, V. Sjo¨berg, and S. Weirich. Dependent types and OnecanviewRefinementTypesasatype-basedgeneralizationof programequivalence. InPOPL,2010. thisapproach. Classical contract checkers check “partial”(as op- [17] Neil D.Jones andNina Bohr. Termination analysis oftheuntyped posedto“total”)correctness(i.e.safety)foreager,typicallyfirst- lamba-calculus. InRTA,2004. order, languages and need not worry about termination. We have shown that inthelazysetting, even “partial”correctness requires [18] M.Kawaguchi, P.Rondon,andR.Jhala. Type-baseddatastructure verification. InPLDI,2009. proving“total”correctness![39]describesastaticcontractchecker for Haskell that uses symbolic execution. The (checker’s) termi- [19] K.W. Knowles and C. Flanagan. Hybrid type checking. ACM nation requires that recursive procedures only be unrolled up to TOPLAS,2010. some fixed depth. While this approach removes inconsistencies, [20] A.Mycroft.Thetheoryandpracticeoftransformingcall-by-needinto it yields weaker, “bounded” soundness guarantees. Zeno [30] is call-by-value. InESOP,1980. another automatic prover for Haskell which proves properties by [21] U. Norell. Towards a practical programming language based on unrollingrecursivedefinitions,rewriting,andgoal-splitting,using dependenttypetheory. PhDthesis,Chalmers,2007. sophisticatedproof-searchtechniquestoensureconvergence.Asit [22] B.O’SullivanandT.Harper. text-0.11.2.3:Anefficientpackeduni- isbasedonrewriting,“Zenomightloopforever”whenfacedwith codetexttype. http://hackage.haskell.org/package/text-0.11.2.3. non-terminating functions, but will not conclude erroneous facts. [23] X.Ou,G.Tan,Y.Mandelbaum,andD.Walker. Dynamictypingwith Finally,[35]describesanovelcontractcheckingtechniquethaten- dependenttypes. InIFIPTCS,2004. codesHaskellprogramsintofirst-orderlogic.Intriguingly,thepa- [24] S.L.Peyton-Jones,D.Vytiniotis,S.Weirich,andG.Washburn.Sim- per shows how the encoding, which models the denotational se- pleunification-basedtypeinferenceforGADTs.InICFP,2006. manticsofthecode,issimplifiedbylazyevaluation. [25] GordonPlotkin.Call-by-name,call-by-valueandthelambdacalculus. Unlike the previous contract checkers for Haskell, our type- TheoreticalComputerScience,1975. basedapproachdoesnotrelyonheuristicsforunrollingrecursive procedures,andinsteadusesSMTandabstractinterpretation[27] [26] JohnC.Reynolds. Definitionalinterpretersforhigher-orderprogram- minglanguages. In25thACMNationalConference,1972. toinfersignatures,whichweconjecturemakesLIQUIDHASKELL more predictable. Of course, this requires LIQUIDHASKELL be [27] P.Rondon,M.Kawaguchi,andR.Jhala.Liquidtypes.InPLDI,2008. provided logical qualifiers (predicate fragments) which form the [28] J. Rushby, S. Owre, and N. Shankar. Subtypes for specifications: basisoftheanalysis’abstractdomain.Inourexperience,however, Predicatesubtypinginpvs.IEEETSE,1998. this isnot an onerous burden asmost qualifiers can be harvested [29] D.SereniandN.D.Jones. Terminationanalysisofhigher-orderfunc- from API specifications, and the overall workflow is predictable tionalprograms. InAPLAS,2005. enoughtoenabletheverificationoflarge,real-worldcodebases. [30] W.Sonnex,S.Drossopoulou,andS.Eisenbach. Zeno:Anautomated proverforpropertiesofrecursivedatastructures. InTACAS,2012. [31] From Safety to Termination and back: SMT-Based verification for References Lazylanguages. SupplementaryMaterials. [1] T.Ball,R.Majumdar, T.Millstein, andS.K.Rajamani. Automatic [32] A. M.Turing. Oncomputable numbers, with an application tothe predicateabstractionofCprograms.InPLDI,2001. eintscheidungsproblem. InLMS,1936. [2] G.Barthe,M.J.Frade,E.Gime´nez,L.Pinto,andT.Uustalu. Type- [33] H. Unno, T. Terauchi, and N. Kobayashi. Automating relatively basedterminationofrecursivedefinitions.MathematicalStructuresin completeverificationofhigher-orderfunctionalprograms. InPOPL, ComputerScience,2004. 2013. [3] J.Bengtson,K.Bhargavan,C.Fournet,A.D.Gordon,andS.Maffeis. [34] N. Vazou, P.Rondon, and R. Jhala. Abstract refinement types. In Refinementtypesforsecureimplementations. ACMTOPLAS,2011. ESOP,2013. 10 2014/1/27

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.