ebook img

Modern Compiler Design PDF

829 Pages·2012·4.154 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Modern Compiler Design

Dick Grune • Kees van Reeuwijk • Henri E. Bal Ceriel J.H. Jacobs • Koen Langendoen Modern Compiler Design Second Edition Dick Grune Ceriel J.H. Jacobs Vrije Universiteit Vrije Universiteit Amsterdam, The Netherlands Amsterdam, The Netherlands Kees van Reeuwijk Koen Langendoen Vrije Universiteit Delft University of Technology Amsterdam, The Netherlands Delft, The Netherlands Henri E. Bal Vrije Universiteit Amsterdam, The Netherlands Additional material to this book can be downloaded from http://extras.springer.com. ISBN 978-1-4614-4698-9 ISBN 978-1-4614-4699-6 (eBook) DOI 10.1007/978-1-4614-4699-6 Springer New York Heidelberg Dordrecht London Library of Congress Control Number: 2012941168 © Springer Science+Business Media New York 2012 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein. Printed on acid-free paper Springer is part of Springer Science+Business Media (www.springer.com) Preface Twelve years have passed since the first edition of Modern Compiler Design. For manycomputersciencesubjectsthiswouldbemorethanalifetime,butsincecom- piler design is probably the most mature computer science subject, it is different. Anadultpersondevelopsmoreslowlyanddifferentlythanatoddlerorateenager, andsodoescompilerdesign.Thepresentbookreflectsthat. Improvements to the book fall into two groups: presentation and content. The ‘look and feel’ of the book has been modernized, but more importantly we have rearrangedsignificantpartsofthebooktopresenttheminamorestructuredmanner: largechaptershavebeensplitandtheoptimizingcodegenerationtechniqueshave been collected in a separate chapter. Based on reader feedback and experiences in teachingfromthisbook,bothbyourselvesandothers,materialhasbeenexpanded, clarified,modified,ordeletedinalargenumberofplaces.Wehopethatasaresult of this the reader feels that the book does a better job of making compiler design andconstructionaccessible. The book adds new material to cover the developments in compiler design and construction over the last twelve years. Overall the standard compiling techniques and paradigms have stood the test of time, but still new and often surprising opti- mizationtechniqueshavebeeninvented;existingoneshavebeenimproved;andold oneshavegainedprominence.Examplesofthefirstare:proceduralabstraction,in which routines are recognized in the code and replaced by routine calls to reduce size; binary rewriting, in which optimizations are applied to the binary code; and just-in-timecompilation,inwhichpartsofthecompilationaredelayedtoimprove theperceivedspeedoftheprogram.Anexampleofthesecondisatechniquewhich extendsoptimalcodegenerationthroughexhaustivesearch,previouslyavailablefor tinyblocksonly,tomoderate-sizebasicblocks.Andanexampleofthethirdistail recursionremoval,indispensableforthecompilationoffunctionallanguages.These developmentsaremainlydescribedinChapter9. Although syntax analysis is the one but oldest branch of compiler construction (lexical analysis being the oldest), even in that area innovation has taken place. Generalized(non-deterministic)LRparsing,developedbetween1984and1994,is nowusedincompilers.ItiscoveredinSection3.5.8. Newhardwarerequirementshavenecessitatednewcompilerdevelopments.The mainexamplesaretheneedforsizereductionoftheobjectcode,bothtofitthecode intosmallembeddedsystemsandtoreducetransmissiontimes;andforlowerpower v vi Preface consumption,toextendbatterylifeandtoreduceelectricitybills.Dynamicmemory allocationinembeddedsystemsrequiresabalancebetweenspeedandthrift,andthe question is how compiler design can help. These subjects are covered in Sections 9.2,9.3,and10.2.8,respectively. With age comes legacy. There is much legacy code around, code which is so oldthatitcannolongerbemodifiedandrecompiledwithreasonableeffort.Ifthe sourcecodeisstillavailablebutthereisnocompileranymore,recompilationmust startwithagrammarofthesourcecode.Forfiftyyearsprogrammersandcompiler designershaveusedgrammarstoproduceandanalyzeprograms;nowlargelegacy programs are used to produce grammars for them. The recovery of the grammar from legacy source code is discussed in Section 3.6. If just the binary executable program is left, it must be disassembled or even decompiled. For fifty years com- pilerdesignershavebeencalledupontodesigncompilersandassemblerstoconvert sourceprogramstobinarycode;nowtheyarecalledupontodesigndisassemblers and decompilers, to roll back the assembly and compilation process. The required techniquesaretreatedinSections8.4and8.5. Thebibliography Theliteraturelisthasbeenupdated,butitsusefulnessismorelimitedthanbefore, fortworeasons.Thefirstisthatbythetimeitappearsinprint,theInternetcanpro- videmoreup-to-dateandmoreto-the-pointinformation,inlargerquantities,thana printedtextcanhopetoachieve.Itisourcontentionthatanybodywhohasunder- stood a larger part of the ideas explained in this book is able to evaluate Internet informationoncompilerdesign. The second is that many of the papers we refer to are available only to those fortunate enough to have login facilities at an institute with sufficient budget to obtain subscriptions to the larger publishers; they are no longer available to just anyone who walks into a university library. Both phenomena point to paradigm shiftswithwhichreaders,authors,publishersandlibrarianswillhavetocope. Thestructureofthebook Thisbookisconceptuallydividedintotwoparts.Thefirst,comprisingChapters1 through10,isconcernedwithtechniquesforprogramprocessingingeneral;itin- cludesachapteronmemorymanagement,bothinthecompilerandinthegenerated code. The second part, Chapters 11 through 14, covers the specific techniques re- quiredby thevariousprogrammingparadigms. The interactionsbetween theparts ofthebookareoutlinedintheadjacenttable.Theleftmostcolumnshowsthefour phasesofcompilerconstruction:analysis,contexthandling,synthesis,andrun-time systems.Chaptersinthiscolumncoverboththemanualandtheautomaticcreation Preface vii of the pertinent software but tend to emphasize automatic generation. The other columns show the four paradigms covered in this book; for each paradigm an ex- ample of a subject treated by each of the phases is shown. These chapters tend to containmanualtechniquesonly,allautomatictechniqueshavingbeendelegatedto Chapters2through9. inimperative infunctional inlogic inparallel/ andobject- programs programs distributed oriented (Chapter12) (Chapter13) programs programs (Chapter14) (Chapter11) Howtodo: analysis −− −− −− −− (Chapters2&3) context identifier polymorphic staticrule Lindastatic handling identification typechecking matching analysis (Chapters4&5) synthesis codefor codeforlist structure marshaling (Chapters6–9) while- comprehension unification statement run-time stack reduction Warren replication systems machine Abstract (nochapter) Machine The scientific mind would like the table to be nice and square, with all boxes filled —in short “orthogonal”— but we see that the top right entries are missing andthatthereisnochapterfor“run-timesystems”intheleftmostcolumn.Thetop right entries would cover such things as the special subjects in the program text analysis of logic languages, but present text analysis techniques are powerful and flexible enough and languages similar enough to handle all language paradigms: there is nothing to be said there, for lack of problems. The chapter missing from the leftmost column would discuss manual and automatic techniques for creating run-timesystems.Unfortunatelythereislittleornotheoryonthissubject:run-time systems are still crafted by hand by programmers on an intuitive basis; there is nothingtobesaidthere,forlackofsolutions. Chapter1introducesthereadertocompilerdesignbyexaminingasimpletradi- tionalmodularcompiler/interpreterindetail.Severalhigh-levelaspectsofcompiler constructionarediscussed,followedbyashorthistoryofcompilerconstructionand introductionstoformalgrammarsandclosurealgorithms. Chapters2and3treattheprogramtextanalysisphaseofacompiler:theconver- sionoftheprogramtexttoanabstractsyntaxtree.Techniquesforlexicalanalysis, lexicalidentificationoftokens,andsyntaxanalysisarediscussed. Chapters 4 and 5 cover the second phase of a compiler: context handling. Sev- eral methods of context handling are discussed: automated ones using attribute grammars, manual ones using L-attributed and S-attributed grammars, and semi- automatedonesusingsymbolicinterpretationanddata-flowanalysis. viii Preface Chapters6through9coverthesynthesisphaseofacompiler,coveringbothin- terpretationandcodegeneration.Thechaptersoncodegenerationaremainlycon- cernedwithmachinecodegeneration;theintermediatecoderequiredforparadigm- specificconstructsistreatedinChapters11through14. Chapter10concernsmemorymanagementtechniques,bothforuseinthecom- pilerandinthegeneratedprogram. Chapters11through14addressthespecialproblemsincompilingforthevarious paradigms–imperative,object-oriented,functional,logic,andparallel/distributed. Compilers for imperative and object-oriented programs are similar enough to be treatedtogetherinonechapter,Chapter11. Appendix B contains hints and answers to a selection of the exercises in the book. Such exercises are marked by a (cid:2) followed the page number on which the answerappears.AlargersetofanswerscanbefoundonSpringer’sInternetpage; thecorrespondingexercisesaremarkedby(cid:2)www. Severalsubjectsinthisbookaretreatedinanon-traditionalway,andsomewords ofjustificationmaybeinorder. Lexicalanalysisisbasedonthesamedotteditemsthataretraditionallyreserved for bottom-up syntax analysis, rather than on Thompson’s NFA construction. We see the dotted item as the essential tool in bottom-up pattern matching, unifying lexicalanalysis,LRsyntaxanalysis,bottom-upcodegenerationandpeep-holeop- timization.Thetraditionallexicalalgorithmsarejustlow-levelimplementationsof itemmanipulation.Weconsiderthedifferenttreatmentoflexicalandsyntaxanaly- sistobeahistoricalartifact.Also,thedifferencebetweenthelexicalandthesyntax levelstendstodisappearinmodernsoftware. Considerable attention is being paid to attribute grammars, in spite of the fact thattheirimpactoncompilerdesignhasbeenlimited.Yettheyaretheonlyknown way of automating context handling, and we hope that the present treatment will helptolowerthethresholdoftheirapplication. Functionsasfirst-classdataarecoveredinmuchgreaterdepthinthisbookthanis usualincompilerdesignbooks.AfteragoodstartinAlgol60,functionslostmuch status as manipulatable data in languages like C, Pascal, and Ada, although Ada 95 rehabilitated them somewhat. The implementation of some modern concepts, for example functional and logic languages, iterators, and continuations, however, requires functions to be manipulated as normal data. The fundamental aspects of the implementation are covered in the chapter on imperative and object-oriented languages;specificsaregiveninthechaptersonthevariousotherparadigms. Additionalmaterial,including moreanswers toexercises,and alldiagrams and allcodefromthebook,areavailablethroughSpringer’sInternetpage. Useasacoursebook Thebookcontainsfartoomuchmaterialforacompilerdesigncourseof13lectures of two hours each, as given at our university, so a selection has to be made. An Preface ix introductory,moretraditionalcoursecanbeobtainedbyincluding,forexample, Chapter1; Chapter2upto2.7;2.10;2.11;Chapter3upto3.4.5;3.5upto3.5.7; Chapter4upto4.1.3;4.2.1upto4.3;Chapter5upto5.2.2;5.3; Chapter6;Chapter7upto9.1.1;9.1.4upto9.1.4.4;7.3; Chapter10upto10.1.2;10.2upto10.2.4; Chapter11upto11.2.3.2;11.2.4upto11.2.10;11.4upto11.4.2.3. AmoreadvancedcoursewouldincludeallofChapters1to11,possiblyexclud- ingChapter4.ThiscouldbeaugmentedbyoneofChapters12to14. Anadvancedcoursewouldskipmuchoftheintroductorymaterialandconcen- trateonthepartsomittedintheintroductorycourse,Chapter4andallofChapters 10to14. Acknowledgments Weowemanythankstothefollowingpeople,whosupplieduswithhelp,remarks, wishes, and food for thought for this Second Edition: Ingmar Alting, José Fortes, BertHuijben,JonathanJoubert,SaraKalvala,FrankLippes,PaulS.Moulson,Pras- antK.Patra,CarloPerassi,MarcoRossi,MoolySagiv,GertJanSchoneveld,Ajay Singh,EvertWattel,andFreekZindel.Theirinputrangedfromsimplecorrectionsto detailedsuggestionstomassivecriticism.SpecialthanksgotoStefanieScherzinger, whose thorough and thoughtful criticism of our outline code format induced us to improveitconsiderably;anyremainingimperfectionsshouldbeattributedtostub- bornnessonthepartoftheauthors.Thepresentationoftheprogramcodesnippets in the book profited greatly from Carsten Heinz’s listings package; we thank himformakingthepackageavailabletothepublic. WearegratefultoAnnKostant,MelissaFearon,andCourtneyClarkofSpringer US,who,throughfastandcompetentwork,haveclearedmanyobstaclesthatstood in the way of publishing this book. We thank them for their effort and pleasant cooperation. WemournthedeathofIrinaAthanasiu,whodidnotlivelongenoughtolendher expertiseinembeddedsystemstothisbook. We thank the Faculteit der Exacte Wetenschappen of the Vrije Universiteit for theirsupportandtheuseoftheirequipment. Amsterdam, DickGrune March2012 KeesvanReeuwijk HenriE.Bal CerielJ.H.Jacobs Delft, KoenG.Langendoen x Preface AbridgedPrefacetotheFirstEdition(2000) In the 1980s and 1990s, while the world was witnessing the rise of the PC and the Internet on the front pages of the daily newspapers, compiler design methods developedwithlessfanfare,developmentsseenmainlyinthetechnicaljournals,and –moreimportantly–inthecompilersthatareusedtoprocesstoday’ssoftware.These developments were driven partly by the advent of new programming paradigms, partly by a better understanding of code generation techniques, and partly by the introductionoffastermachineswithlargeamountsofmemory. The field of programming languages has grown to include, besides the tradi- tionalimperativeparadigm,theobject-oriented,functional,logical,andparallel/dis- tributed paradigms, which inspire novel compilation techniques and which often requiremoreextensiverun-timesystemsthandoimperativelanguages.BURStech- niques(Bottom-UpRewritingSystems)haveevolvedintoverypowerfulcodegen- erationtechniqueswhichcopesuperblywiththecomplexmachineinstructionsets ofpresent-daymachines.Andthespeedandmemorysizeofmodernmachinesallow compilationtechniquesandprogramminglanguagefeaturesthatwereunthinkable before.Moderncompilerdesignmethodsmeetthesechallengeshead-on. Theaudience Ouraudiencearestudentswithenoughexperiencetohaveatleastusedacompiler occasionallyandtohavegivensomethoughttotheconceptofcompilation.When thesestudentsleavetheuniversity,theywillhavetobefamiliarwithlanguagepro- cessorsforeachofthemodernparadigms,usingmoderntechniques.Althoughcur- riculum requirements in many universities may have been lagging behind in this respect, graduates entering the job market cannot afford to ignore these develop- ments. Experiencehasshownusthataconsiderablenumberoftechniquestraditionally taughtincompilerconstructionarespecialcasesofmorefundamentaltechniques. Oftenthesespecialtechniquesworkforimperativelanguagesonly;thefundamental techniqueshaveamuchwiderapplication.Anexampleisthestackasanoptimized representationforactivationrecordsinstrictlylast-in-first-outlanguages.Therefore, thisbook • focusesonprinciplesandtechniquesofwideapplication,carefullydistinguish- ing between the essential (= material that has a high chance of being useful to the student) and the incidental (= material that will benefit the student only in exceptionalcases); • providesafirstlevelofimplementationdetailsandoptimizations; • augmentstheexplanationsbypointersforfurtherstudy. Thestudent,afterhavingfinishedthebook,canexpectto: Preface xi • haveobtainedathoroughunderstandingoftheconceptsofmoderncompilerde- signandconstruction,andsomefamiliaritywiththeirpracticalapplication; • beabletostartparticipatingintheconstructionofalanguageprocessorforeach ofthemodernparadigmswithaminimaltrainingperiod; • beabletoreadtheliterature. Thefirsttwoprovideafirmbasis;thethirdprovidespotentialforgrowth. Acknowledgments Weowemanythankstothefollowingpeople,whowerewillingtospendtimeand effort on reading drafts of our book and to supply us with many useful and some- timesverydetailedcomments:MirjamBakker,RaoulBhoedjang,WilfredDittmer, ThomerM.Gil,BenN.Hasnai,BertHuijben,JacoA.Imthorn,JohnRomein,Tim Rühl, and the anonymous reviewers. We thank Ronald Veldema for the Pentium codesegments. WearegratefultoSimonPlumtree,GaynorRedvers-Mutton,DawnBooth,and Jane Kerr of John Wiley & Sons Ltd, for their help and encouragement in writing this book. Lambert Meertens kindly provided information on an older ABC com- piler,andRalphGriswoldonanIconcompiler. We thank the Faculteit Wiskunde en Informatica (now part of the Faculteit der Exacte Wetenschappen) of the Vrije Universiteit for their support and the use of theirequipment. DickGrune [email protected],http://www.cs.vu.nl/~dick HenriE.Bal [email protected],http://www.cs.vu.nl/~bal CerielJ.H.Jacobs [email protected],http://www.cs.vu.nl/~ceriel KoenG.Langendoen [email protected],http://pds.twi.tudelft.nl/~koen Amsterdam,May2000

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.