AUTOMATING ABSTRACT INTERPRETATION OF ABSTRACT MACHINES

James Ian Johnson

April 2015

arXiv:1504.08033v1 [cs.PL] 29 Apr 2015

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy to the Faculty of the College of Computer and Information Science, Northeastern University, Boston, Massachusetts, USA

colophon

This document was typeset using the typographical look-and-feel classicthesis developed by André Miede. The style was inspired by Robert Bringhurst’s seminal book on typography “The Elements of Typographic Style”. classicthesis is available for both LaTeX and LyX:

http://code.google.com/p/classicthesis/

The PDF construction used pdfTeX. pdflatex -v:

pdfTeX 3.1415926-2.4-1.40.13 (TeX Live 2012/Debian)
kpathsea version 6.1.0
Copyright 2012 Peter Breitenlohner (eTeX)/Han The Thanh (pdfTeX).
There is NO warranty. Redistribution of this software is covered by the terms of both the pdfTeX copyright and the Lesser GNU General Public License.
For more information about these matters, see the file named COPYING and the pdfTeX source.
Primary author of pdfTeX: Peter Breitenlohner (eTeX) Han The Thanh (pdfTeX).
Compiled with libpng 1.2.49; using libpng 1.2.49
Compiled with zlib 1.2.7; using zlib 1.2.7
Compiled with poppler version 0.20.4

Figures were constructed with Racket 6’s plot library, graphviz’s dot version 2.26.3 (20100126.1600), and Inkscape 0.48.4 r9939 (Jan 22 2014).

Final Version as of May 1, 2015 (classicthesis version 1.0).

Dedicated to my grandparents, Mary Jane & Robert Johnson, and Imogene & Willard Speck

ABSTRACT

Static program analysis is a valuable tool for any programming language that people write programs in. The prevalence of scripting languages in the world suggests programming language interpreters are relatively easy to write.
Users of these languages lament their inability to analyze their code; therefore, programming language analyzers (abstract interpreters) are not easy to write. This thesis more deeply investigates a systematic method of creating abstract interpreters from traditional interpreters, called Abstracting Abstract Machines.

Abstract interpreters are difficult to develop due to technical, theoretical, and pragmatic problems. Technical problems include engineering data structures and algorithms. I show that modest and simple changes to the mathematical presentation of abstract machines result in 1000 times better running time: just seconds for moderately sized programs.

In the theoretical realm, abstraction can make correctness difficult to ascertain. We need a reason to trust analysis techniques. Previous analysis techniques, if they have a correctness proof at all, have to bridge multiple formulations of a language’s semantics to be proved correct. I provide proof techniques for proving the correctness of regular, pushdown, and stack-inspecting pushdown models of abstract computation by leaving computational power to an external factor: allocation. Each model is equivalent to the concrete (Turing-complete) semantics when the allocator creates fresh addresses. Even if we don’t trust the proof, we can run models concretely against test suites to better trust them. If the allocator reuses addresses from a finite pool, then the structure of the semantics collapses to one of these three sound automata models, without any foray into automata theory.

In the pragmatic realm, I show that the systematic process of abstracting abstract machines is automatable. I develop a meta-language for expressing abstract machines similar to other semantics engineering languages. The language’s special feature is that it provides an interface to abstract allocation. The semantics guarantees that if allocation is finite, then the semantics is a sound and computable approximation of the concrete semantics.
I demonstrate the language’s expressiveness by formalizing the semantics of a Scheme-like language with temporal higher-order contracts, and automatically deriving a computable abstract semantics for it.

ACKNOWLEDGMENTS

Readers unfamiliar with Jorge Cham’s PhD comics are likely not PhDs or PhD students. For those not in the know: the trials, tribulations, trivialities and sometimes moral turpitude of the PhD as depicted in these works of comedy really happen all the time. I know. I’m a data point. Nothing in my life has ever been as difficult as these six years, and I could not have done it without the help I received from faculty, colleagues, friends and of course family.

First of all, I thank my committee:

• David Van Horn, my advisor. Our relationship started with him as a postdoc with some cool ideas and great presentation skills. His philosophy and approach to research are both fundamentally pedagogical and progressive: everything he does, however complicated it was before, becomes easy, obvious, and better. Sure, this makes publishing difficult (I think reviewers get off on being confused), but I found this way of operating enviable. His focus on the long game calmed my indignation at rejection. His willingness to hear me out with a half-baked idea kept me from censoring my creativity. And his sense of humor kept our conversations enjoyable.

• Olin Shivers, my co-advisor. Previously my advisor, Olin gives his students room to explore and grow as researchers. He’s famously entertaining, and always has his eyes on a shiny future.

• Mitchell Wand (Mitch) introduced me to programming languages research and a new way of thinking about proof. His ability to cut through arguments forced me to think more precisely, to get to the heart of the matter. I thought I had mathematical maturity before I met Mitch, but after working with him for a year on hygienic macros, well... This man knows semantics, and since I had the privilege of our time together, I feel I know too.
• Cormac Flanagan has done a vast amount of work in practical program analysis. I appreciate the effort he’s put in to reviewing this dissertation.

My co-authors’ help and support improved our publications more than I could have: thanks to Matthew Might, Ilya Sergey, and again, David Van Horn.

I thank the other Northeastern faculty who helped me in this process: Matthias Felleisen, for having my back; Amal Ahmed, for her help with the harder correctness arguments in this document; Panagiotis Manolios (Pete), for first teaching me formal methods; and Thomas Wahl for his perspective from the model-checking community. I would also like to thank J Strother Moore for welcoming me at the ACL2 seminar in my final year at the University of Texas, and generously funding my attending the 2009 ACL2 Workshop.

I could not have done so much of my work without the development team for Racket, most notably Matthew Flatt. Thanks for all the bugfixes.

My colleagues in the lab are by far my greatest learning asset. We spent countless hours together working, learning, complaining and joking. My time in the PhD was immensely humbling, not because of the difficulty of the work, but from knowing all of you tremendously talented people.
• Claire Alvis: undefined amount of fun
• Dan Brown: categorically helpful
• Harsh Raju Chamarthi: theorem disprover
• Stephen Chang: father, proof-reader, friend
• Ryan Culpepper: master macrologist
• Christos Dimoulas: keeping us honest with contracts
• Carl Eastlund: macro logician
• Tony Garnock-Jones: happy to subscribe to your conversation topics
• Evgeny (Eugene) Goldberg: satisfying conversationalist
• Dave Herman: web freedom fighter and rusty macrologist
• Mitesh Jain: reimagining correctness
• Jamie Perconti: very cool if true
• Tim Smith: my go-to for obscure automata theory
• Vincent St-Amour: telling us how we’re doing it wrong
• Paul Stansifer: fun partner and macro advocate
• Asumu Takikawa: affable, helpful and 全て上手
• Sam Tobin-Hochstadt: inspired crazy macro hacker turned professor
• Aaron Turon: concurrently brilliant and a good person
• Dimitrios Vardoulakis: full stack analyst

I thank Neil Toronto for his plot library in Racket, and for all the time he spent helping me use it to produce the plots in this dissertation.

I thank my friends for keeping me from floating off into jargon-land every time I open my mouth.

• Matthew Martinez (Mattousai): you’re my best college buddy. Best wishes for your life in Ireland.
• Nicholas Marquez (Alex): may you and Alex find other Alexes to happily Alex your Alex while you Alex with Alex.
• Daniel Davee (Mage): I know you’re the physicist, but quantum mechanics will not make undecidable problems decidable.

Finally and most importantly, I thank my family for all their love and support.

• Shaunie: I love you and your ability to put up with me.
• V: I hope you never ever have to read this document.
• Mom & Dad: the condo will appreciate, and we appreciate the condo.
• Grandpa Johnson: for all the stories and ego-boosting.

CONTENTS

1 introduction and contributions
  1.1 My thesis
  1.2 Structure of the dissertation
  1.3 The case for abstract machines
  1.4 Previously published material

i systematic constructions

2 abstracting abstract machines
  2.1 Standardizing non-standard semantics: alloc and tick
  2.2 Widening for polynomial complexity
3 engineering engineered semantics
  3.1 Overview
  3.2 Abstract interpretation of λIF
  3.3 From machine semantics to baseline analyzer
  3.4 Implementation techniques
  3.5 Evaluation
4 pushdown analysis via relevant allocation
  4.1 Tradeoffs of approximation strength
  4.2 Refinement of AAM for exact stacks
  4.3 Stack inspection and recursive metafunctions
  4.4 Relaxing contexts for delimited continuations
  4.5 Short-circuiting via “summarization”

ii algorithmic constructions

5 a language for abstract machines
  5.1 Representing an abstract machine
  5.2 Discussion of the design space
  5.3 The grammar of patterns and rules
  5.4 Term equality
  5.5 Pattern matching
  5.6 Expression evaluation
  5.7 Running a machine
6 a language for aam
  6.1 Introduction
  6.2 Representing an abstract abstract machine
  6.3 Overview of running
  6.4 Store refinements
  6.5 Design motivation by example
  6.6 Externals and NDTerm
  6.7 Term Equality
  6.8 Pattern Matching
  6.9 Expression evaluation