Haskell’05
September 30, 2005 • Tallinn, Estonia

Proceedings of the ACM SIGPLAN 2005 Haskell Workshop

Sponsored by the Association for Computing Machinery Special Interest Group on Programming Languages (SIGPLAN)

The Association for Computing Machinery
1515 Broadway
New York, New York 10036

Copyright © 2005 by the Association for Computing Machinery, Inc. (ACM). Permission to make digital or hard copies of portions of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyright for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permission to republish from: Publications Dept., ACM, Inc. Fax +1 (212) 869-0481 or <[email protected]>.

For other copying of articles that carry a code at the bottom of the first or last page, copying is permitted provided that the per-copy fee indicated in the code is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.

Notice to Past Authors of ACM-Published Articles

ACM intends to create a complete electronic archive of all articles and/or other material previously published by ACM. If you have written a work that has been previously published by ACM in any journal or conference proceedings prior to 1978, or any SIG Newsletter at any time, and you do NOT want this work to appear in the ACM Digital Library, please inform [email protected], stating the title of the work, the author(s), and where and when published.
ISBN: 1-59593-071-X

Additional copies may be ordered prepaid from:

ACM Order Department
PO Box 11405
New York, NY 10286-1405

Phone: 1-800-342-6626 (US and Canada)
       +1-212-626-0500 (all other countries)
Fax: +1-212-944-1318
E-mail: [email protected]

ACM Order Number 565052

Printed in the USA

Foreword

It is my great pleasure to welcome you to the ACM SIGPLAN 2005 Haskell Workshop. The purpose of the workshop is to discuss experience with Haskell, and future developments for the language. The scope of the workshop includes all aspects of the design, semantics, theory, application, implementation, and teaching of Haskell. The 2005 Haskell workshop takes place on 30 September, 2005, in Tallinn, Estonia, in affiliation with the 2005 International Conference on Functional Programming (ICFP’05).

The call for papers attracted 29 submissions. Each paper was evaluated by at least three international referees. During a five-day electronic meeting, the program committee selected ten of the submissions for presentation at the workshop as full papers, based on the referee reports. The program committee also selected a tool demonstration, the abstract of which is included in these proceedings. The workshop program also includes the annual "The Future of Haskell" discussion. David Roundy accepted the program committee's invitation to be an invited speaker at the 2005 Haskell workshop, and an abstract of his talk is included in these proceedings.

Putting together the 2005 Haskell workshop was very much a team effort. First of all, I would like to thank the authors for their excellent papers. Also, I would like to thank the program committee and additional reviewers who put a lot of effort into evaluating the submissions and providing constructive feedback to the authors. Finally, I would like to thank Patricia Johann, the ICFP’05 Workshop Chair, and Lisa M. Tolles, Sheridan Printing, for their help with organizing the workshop and producing the proceedings.
Daan Leijen
Issaquah, WA, USA, July 2005

Table of Contents

2005 Haskell Workshop Organization

9:00 – 10:30 Session 1 ● Session Chair: D. Leijen (University of Utrecht)
• Darcs: Distributed Version Management in Haskell
  D. Roundy (Cornell University)
• Visual Haskell – A full-featured Haskell Development environment
  K. Angelov (no affiliation), S. Marlow (Microsoft Research Ltd.)
• Haskell Ready to Dazzle the Real World
  M. M. Schrage, A. van IJzendoorn, L. C. van der Gaag (Utrecht University)

11:00 – 12:30 Session 2 ● Session Chair: to be determined
• Dynamic Applications From the Ground Up
  D. Stewart, M. M. T. Chakravarty (University of New South Wales)
• Haskell Server Pages through Dynamic Loading
  N. Broberg (Chalmers University of Technology)
• Haskell on a Shared-Memory Multiprocessor
  T. Harris, S. Marlow, S. P. Jones (Microsoft Research Ltd.)

14:00 – 15:30 Session 3 ● Session Chair: to be determined
• Verifying Haskell Programs Using Constructive Type Theory
  A. Abel, M. Benke, A. Bove, J. Hughes, U. Norell (Chalmers University of Technology)
• Putting Curry-Howard to Work
  T. Sheard (Portland State University)
• There and Back Again – Arrows for Invertible Programming
  A. Alimarine, S. Smetsers, A. van Weelden, M. van Eekelen, R. Plasmeijer (Radboud University Nijmegen)

16:00 – 17:15 Session 4 ● Session Chair: to be determined
• TypeCase: A Design Pattern for Type-Indexed Functions
  B. C. d. S. Oliveira, J. Gibbons (Oxford University)
• Polymorphic String Matching
  R. S. Bird (Oxford University)
• Halfs: A Haskell Filesystem
  I. Jones (Galois Connections)

17:15 – 18:00 Session 5 ● Session Chair: to be determined

Author Index

2005 Haskell Workshop Organization

Program Chair: Daan Leijen (Universiteit Utrecht, The Netherlands)

Program Committee:
  Martin Erwig (Oregon State University, USA)
  John Hughes (Chalmers University of Technology, Sweden)
  Mark Jones (OGI School of Science & Engineering at OHSU, USA)
  Ralf Lämmel (Microsoft Corp., USA)
  Andres Löh (University of Bonn, Germany)
  Andrew Moran (Galois Connections Inc., USA)
  Simon Thompson (University of Kent, UK)
  Malcolm Wallace (University of York, UK)

Additional reviewers:
  Lennart Augustsson
  David Burke
  Paul Graunke
  Bastiaan Heeren
  Steve Kollmansberger
  Brett Letner
  Ulf Norell
  Deling Ren
  Peter White

Sponsor: ACM SIGPLAN

Darcs: Distributed Version Management in Haskell

David Roundy
Cornell University
[email protected]

Abstract

A common reaction from people who hear about darcs, the source control system I created, is that it sounds like a great tool, but it is a shame that it is written in Haskell. People think that because darcs is written in Haskell it will be a slow memory hog with very few contributors to the project. I will give a somewhat historical overview of my experiences with the Haskell language, libraries and tools.

I will begin with a brief overview of the darcs advanced revision control system, how it works and how it differs from other version control systems. Then I will go through various problems and successes I have had in using the Haskell language and libraries in darcs, roughly in the order I encountered them. In the process I will give a bit of a tour through the darcs source code. In each case, I will tell about the problem I wanted to solve, what I tried, how it worked, and how it might have worked better (if that is possible).

Categories and Subject Descriptors  J.0 [Computer Applications]: GENERAL

General Terms  Languages

Haskell’05, September 30, 2005, Tallinn, Estonia. Copyright © 2005 ACM 1-59593-071-X/05/0009...$5.00.

1. Introduction

Darcs is a distributed revision control system¹. It differs from most other modern revision control systems in that it is "change-oriented" rather than "version-oriented", which is to say that in darcs the fundamental objects that are tracked are the changes made, rather than a sequence of states. The change-oriented philosophy of darcs has a number of advantages, but requires a considerable amount of "patch arithmetic" to handle merging and reordering of changes in a lossless manner.

¹ Revision control systems are also known as "version control systems", "source code management" or "software configuration management"...the acronyms never end. I prefer revision control system, but SCM is more commonly used.

I started writing darcs using C++ in the spring of 2002. Starting in the fall of that year, I rewrote darcs in Haskell. Partly this was because I was sick of C++, and partly because there were so many bugs in the existing C++ code that a complete rewrite seemed necessary. I chose Haskell mostly because I did not want to stay with C++, but did want a strongly typed language so that at least some of my mistakes could get caught at compile time. I had heard of Haskell once or twice on slashdot, and it sounded appealing. A bit of experimentation suggested that its syntax and expressiveness were worthwhile.

Darcs is an interesting application for a pure functional programming language, in that it is very IO-intensive, and IO is not commonly thought of as being a strong point of functional languages. On the other hand, much of darcs' code involves the manipulation of patches, which is purely functional code. My experience, however, has been that both sides of darcs have benefited from the choice of Haskell as the programming language.

2. Laziness and unsafeInterleaveIO

One of the key operations in darcs that needed to be implemented was a "diff" algorithm, which would take two directories and return a set of changes describing the difference between them. I knew which algorithms I wanted to use, and just needed to implement them. Unfortunately, there is a lot of tedious directory traversal required. In C++, this directory traversal code had to be interspersed with the diff code, since we cannot afford to store the contents of both directory trees in memory.

Haskell allows a nicer approach. I wrote one function to lazily read an entire directory tree into memory (a "Slurpy" data type), and another to do the recursive diff itself. This required that I learn to use unsafeInterleaveIO, which was pretty easy, and resulted in Haskell code which was far cleaner than the earlier C++ code.

This was the first feature of Haskell that struck me as being a major improvement over other languages. By separating execution order from code layout, one is able to write cleaner, more modular code. This should not be news to anyone involved with Haskell, but it is worth reporting that this has indeed been helpful in darcs.

There is a limit here, however. Having been so excited about the separation between directory IO and directory manipulation made possible by lazy IO, I went a bit overboard, and did everything using pure functions combined with lazy IO. This ended up leading to scenarios where the entire directory tree needed to be held in memory because a function was not sufficiently lazy. Much of the recent optimization work (largely done by Ian Lynagh) has involved switching to working directly in the IO monad more often, in order to robustly obtain much more modest memory usage. We were able to retain the cleanliness of the code by creating a monad class allowing us to write code that can either be executed in the IO monad or used as a pure function.

3. Object-oriented-like data structures

One feature of darcs that came almost directly from the earlier C++ version was a framework for handling separate subcommands. Darcs is invoked using commands similar (in some ways) to those of CVS, e.g. the command "get" is invoked by

    darcs get http://darcs.net

In C++, this was handled by an abstract parent class from which one class was descended for each subcommand. The main darcs function then checked the command line against the names of the different subcommands, and the arguments were processed using getopt according to the list of legal flags for that subcommand. This code layout translated very naturally into Haskell, DarcsCommand being the following data structure:

    data DarcsCommand = DarcsCommand {
        command_name :: String,
        command_darcsoptions :: [DarcsOption],
        command_command ::
            [DarcsFlag] -> [String] -> IO (),
        command_help, command_description :: String,
        command_extra_args :: Int,
        command_extra_arg_help :: [String],
        command_prereq ::
            [DarcsFlag] -> IO (Either String FilePath),
        command_get_arg_possibilities :: IO [String],
        command_argdefaults :: [String] -> IO [String]
    }

This framework is even somewhat more natural than it was in C++, since we are not forced to define a separate type for each object.

4. QuickCheck

One of the problems I had with the initial C++ darcs was that I had no unit testing code. Within two weeks of the first darcs record, I started using QuickCheck to test the patch functions, and the same day I fixed a bug that was discovered by QuickCheck.

QuickCheck makes it very easy to define properties that functions must have, which are then tested with randomly generated data. A simple example is:

    prop_readPS_show :: Patch -> Bool
    prop_readPS_show p =
        case readPatchPS $ packString $ show p of
        Just (p',_) -> p' == p
        Nothing -> False

The trouble with QuickCheck is in creating valid patches (and sequences of patches) to use as input. One can define custom generators, but it is hard to determine if a sequence of tests is valid. All too often the bugs found using QuickCheck have been bugs in the generation of random patches rather than bugs in darcs itself.

Still, QuickCheck has been invaluable in testing darcs as it has moved forward. My one gripe with QuickCheck has been that it does not seem to be possible for the code calling QuickCheck to discover if the test passed or failed.

5. Foreign Function Interface

The Foreign Function Interface (FFI) has been absolutely essential in darcs, and I have very little but good to say about it. My biggest gripe would be that when I was first learning to use it there were so many tools layered over it (GreenCard, HDirect, etc) that it was quite a while before I realized how easy the FFI is to use in its raw form. Darcs' first use of the FFI came about while adding support for http downloads using libcurl. This feature turned out to be quite easy to add. Most FFI imports in darcs are as simple as defining

    foreign import ccall "hscurl.h get_curl"
        get_curl :: CString -> CString -> CString
                 -> CString -> CInt -> IO CInt

and using withCString to convert Haskell Strings into CStrings.

6. Efficient string handling

In the beginning, darcs used String to handle file contents. Eventually I realized that this required too much memory and was too slow, so I switched to PackedString. This led to a major improvement in speed, but still required four bytes per character, since it worked with Unicode characters. However, the IO routines darcs uses guarantee that each character will contain only a single byte. So after some frustration I implemented my own version of PackedString called FastPackedString.

The original version of FastPackedString was implemented using UArrays to store the characters. Besides using one quarter of the memory, FastPackedString allowed the splitting of a string without copying memory, although this causes the original string to be held in memory, which could be problematic. This feature allows darcs to split a file into lines and store both the original file and the split file at only the cost of the locations of the line endings.

Both memory use and file access speed remained a problem, and I decided to try using mmap to read file contents. So I rewrote FastPackedString using a ForeignPtr to store the actual data. Interestingly, just this conversion gave a 15% speedup, suggesting a problem either with the efficiency of UArray or with my use of it. The ForeignPtr storage allowed me to call optimized C library functions such as memcmp to efficiently perform certain FastPackedString functions, such as (==), which were bottlenecks. This is another case of the usefulness of the FFI.

One thing led to another, and soon I was doing most of my optimization within FastPackedString by writing fast C routines that were then called from Haskell. This says good things about the FFI, in that it allowed me to easily write hand-tuned code in C to optimize key functions, but it is less good news that I found this so much easier than writing efficient functions in Haskell.

It would be nice to be able to access a particular chunk of memory both as a ForeignPtr and as a UArray. This would require that the memory not be modified by the ForeignPtr calls, so it would be an "unsafe" function. But when I know that a chunk of memory is not going to be modified, I would prefer to access it with either C code or pure Haskell code, rather than IO code using Ptrs. This illustrates a limitation of the FFI, which is that it cannot interact in a friendly way with Haskell data structures except by copying. Thus, if I want to be able to use C library functions with a chunk of memory I have to go all the way and store the data in a ForeignPtr, which eliminates the possibility of also accessing it as pure Haskell data.

7. Handles and zlib and threads

The first approach to writing compressed files in darcs using zlib was based on a simple function which wrote the file one character at a time:

    gzWriteFile :: FilePath -> String -> IO ()

This is memory-efficient as long as the String is generated lazily, and it was pretty simple to write using the FFI, but was horribly slow. Making one library call per character is a bad idea in any language, but is particularly painful in Haskell.

So I decided to write a function that would open a compressed file for writing, and return a Handle so I could then use the same patch-writing code for writing to either compressed or uncompressed files. This would be hard to do in C, but it seemed like in a functional language like Haskell it should not be a problem. It turned out to be very problematic indeed.

My first attempt was under a hundred lines of pure (concurrent) Haskell. It created a pipe and used forkIO to generate a thread reading from one end and writing to the compressed file, while the other end of the pipe was attached to the Handle to which we wish to write. It seemed like an elegant solution, but there was a race condition that caused trouble if darcs exited before the spawned thread finished writing to disk.

The second attempt used the FFI and fork to spawn an OS process from C, which read from one end of a pipe and wrote to disk. This code was buggy (not to mention complex), and I soon switched to using pthreads to spawn an OS thread to read from the pipe and write compressed data to disk. This worked, and was race-free, but was a continual portability problem. There is a pthreads library available on Windows, so we had an efficient cross-platform solution. However, the use of pthreads caused more users to have trouble compiling darcs than anything else.

Eventually, we moved back to a function quite similar to the original function that wrote a lazy String:

    gzWriteFilePSs :: FilePath -> [PackedString] -> IO ()

This function differs from the original gzWriteFile in that it writes a whole block of data with a single FFI call rather than one character at a time. This makes a huge difference in performance. The process of creating such a [PackedString] is nicely handled by "Printer", a formatting module by Ian Lynagh.

It would be very nice if the standard libraries were more extensible. This is another case where the FFI is helpful, but we are forced to choose between using the FFI to get something done and using the standard Haskell facilities, namely the Handle-based IO routines.

8. Starting other processes

There are numerous instances in which darcs needs to execute an external program. Examples include ssh, a text editor or sendmail. In some cases, such as ssh, we would like to provide the input, and capture the output for display to the user. At first, all external programs were started by calls to system. This is a fragile way of starting a program, since any shell meta-characters must be escaped. We may be able to do better with rawSystem, but we still would need to be careful when passing arguments that contain spaces. In general, it would be preferable (on POSIX systems) to start external programs with fork and execvp.

We had major difficulties with a "virtual timer expired" error, and eventually found that we had to turn off the VT ALARM after forking and before execing; the solution was found in the GHC source code to system and friends. This is a fundamental problem, but it illustrates the point that as wonderful as the FFI is, there are pitfalls that can cause serious trouble.

In this particular case, hopefully the new System.Process module will prove helpful. We have not yet started using it, mostly because we still want compatibility with older versions of GHC.

9. Error handling and cleanup

An issue that persisted for quite a while was the failure to clean up properly when darcs is interrupted. The function that one would think to use for this purpose is bracket

    bracket :: IO a -> (a->IO b) -> (a->IO c) -> IO c

which allows one to perform an initialization, run a calculation and then afterwards clean up, with the cleanup being performed even if the function throws an exception. The Haskell standard library has no less than three separate versions of bracket (two of which are identical). The key is to use Control.Exception.bracket, which causes the cleanup function to be run even if the code exits with exitWith or if it receives an asynchronous exception.

Additional confusion results from the fact that POSIX signals still cause a program to die without running the cleanup. We dealt with this problem by introducing on POSIX systems a signal handler that throws an asynchronous exception when a signal is received. On Windows, we use the FFI to call setConsoleCtrlHandler to achieve a similar effect.

None of this was prohibitively hard, but it should be easier to write a robust IO function that creates a temporary file and then removes that file when it is finished.

A very useful idiom for this sort of function is the "withSomething" idiom, which shows up scattered across the standard libraries. The idea is to write a function such as

    withSomething :: XXX -> (b -> IO a) -> IO a

where XXX is some appropriate input that allows you to create an object of type b that involves a resource that needs to be freed when the function is complete. These functions are most often implemented with bracket. A few examples of this idiom in darcs are

    withSignalsHandled :: IO a -> IO a
    withRepoLock :: IO a -> IO a
    withLock :: String -> IO a -> IO a
    withTemp :: (String -> IO a) -> IO a
    withOpenTemp :: ((Handle, String) -> IO a) -> IO a
    withTempDir :: String -> (String -> IO a) -> IO a
    withNamedTemp :: String -> (String -> IO a) -> IO a

One of the keys to writing robust code is writing functions that cannot be easily misused, and this is one area where I feel Haskell is particularly strong. As discussed above, it has taken some work to write a robust and correct withLock function, but once that function has been properly written, it is almost impossible to use it in such a way that a lock file is left behind when darcs exits (the exception being cases such as kill -9 or reckless use of the FFI).

10. Optimization experiences

My experiences optimizing darcs have been mixed. Optimizing Haskell code seems to usually boil down to making the code either more strict or more lazy. Increasing laziness is often helpful in reducing memory usage, while increasing strictness in low-level functions usually makes them faster. The trouble is that it is not always easy to tell which category a function falls into, and it is rarely obvious how a given change will affect the laziness of a function. Profiling has been a very helpful tool when optimizing, although sometimes the profiling itself changes the program's timing behavior.

I have had more success with time optimizations than memory optimizations, although Ian Lynagh has been very successful with the latter. Time optimization most often consists of working on the lowest level functions, which are called in the innermost loops. In many cases (particularly in FastPackedString, where darcs spends much of its time) optimization has consisted of rewriting a key function in C or calling a C library function, having chosen that function on the basis of profiling. At a higher level, one can often rewrite a function so that it calls more efficient lower-level functions, as was the case with gzWriteFile and gzWriteFilePSs. In both cases, optimization is reasonably straightforward.

At higher levels, we more often want laziness than strictness, since lazy evaluation at the highest level costs very little in time, but improves the memory usage and consequently garbage-collecting efficiency and locality of access to memory. Ian has made a number of improvements in the memory efficiency of darcs. Most of his improvements have revolved around arranging to never hold an entire parsed patch in memory, but instead to consume the patch as we lazily parse it. However, relatively subtle changes can have disastrous effects by causing a patch to be retained in memory.
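The retention problem described above can be illustrated with a small sketch that is unrelated to the darcs source. The names meanTwoPass and meanOnePass are hypothetical and chosen only for this illustration. The first function traverses its input twice, so a lazily produced list has to stay in memory between the two traversals. The second consumes the list once with a strict accumulator, so each element can be collected as soon as it has been used.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

import Data.List (foldl')

-- Hypothetical example, not darcs code.
-- Two traversals: the whole list must stay alive between `sum` and
-- `length`, so a lazily generated input ends up fully held in memory.
meanTwoPass :: [Double] -> Double
meanTwoPass xs = sum xs / fromIntegral (length xs)

-- One traversal with a strict accumulator: the running total and the
-- element count are forced at every step, and no reference to the
-- already-consumed prefix of the list is kept.
meanOnePass :: [Double] -> Double
meanOnePass xs = total / fromIntegral count
  where
    (total, count) = foldl' step (0, 0 :: Int) xs
    step (!s, !n) x = (s + x, n + 1)

main :: IO ()
main = do
  let n = 1000000 :: Int
  -- Both calls compute the same mean; they differ in how much of the
  -- input list is retained while the computation runs.
  print (meanTwoPass (map fromIntegral [1 .. n]))
  print (meanOnePass (map fromIntegral [1 .. n]))
```

The two definitions look almost interchangeable, which is the point made above: whether data is retained depends on how many consumers still hold a reference to it, not on how the code reads. Running such a program with GHC's heap profiling is one way to see the difference, although the exact behavior depends on the compiler version and optimization flags.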