Formal Languages and Compilation PDF

408 pages · 2013 · 6.148 MB

Preview Formal Languages and Compilation

Texts in Computer Science

Stefano Crespi Reghizzi · Luca Breveglieri · Angelo Morzenti

Formal Languages and Compilation

Second Edition

For further volumes: www.springer.com/series/3191

Stefano Crespi Reghizzi, Dipartimento Elettronica Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy
Luca Breveglieri, Dipartimento Elettronica Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy
Angelo Morzenti, Dipartimento Elettronica Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy

Series Editors
David Gries, Department of Computer Science, Cornell University, Ithaca, NY, USA
Fred B. Schneider, Department of Computer Science, Cornell University, Ithaca, NY, USA

ISSN 1868-0941
ISSN 1868-095X (electronic)
Texts in Computer Science
ISBN 978-1-4471-5513-3
ISBN 978-1-4471-5514-0 (eBook)
DOI 10.1007/978-1-4471-5514-0
Springer London Heidelberg New York Dordrecht
© Springer-Verlag London 2009, 2013

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher's location, in its current version, and permission for use must always be obtained from Springer.
Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

Preface

The textbook derives from the well-known identically titled volume¹ published in 2009: two younger co-authors have brought their experience to enrich and streamline the book, without dulling its original structure. The selection of materials and the presentation style have not substantially changed. The book reflects many years of teaching compiler courses and of doing research on formal language theory and formal methods, on compiler and language technology, and to a lesser extent on natural language processing. The most important change concerns the central topic of language parsing. It is a completely new, systematic, and unified presentation of the most important parsing algorithms, including also parallel parsing.

Goals. In the turmoil of information technology developments, the subject of the book has kept the same fundamental principles for half a century, and has preserved its conceptual importance and practical relevance.
This state of affairs, in a topic that is central to computer science and is based on established principles, might lead some people to believe that the corresponding textbooks are by now consolidated, much as the classical books on mathematics and physics. In reality, this is not the case: there exist fine classical books on the mathematical aspects of language and automata theory, but as far as the application to compiling is concerned, the best books are sort of encyclopedias of algorithms, design methods, and practical tricks used in compiler design. Indeed, a compiler is a microcosm, and features many different aspects ranging from algorithmic wisdom to computer hardware. As a consequence, the textbooks have grown in size, and compete with respect to their coverage of the latest developments on programming languages, processor architectures and clever mappings from the former to the latter.

To put things in order, it is better to separate such complex topics into two parts, basic and advanced, which to a large extent correspond to the two subsystems that make a compiler: the user-language specific front-end, and the machine-language specific back-end. The basic part is the subject of this book. It covers the principles and algorithms to be used for defining the syntax of languages and for implementing simple translators. It does not include: the specialized know-how needed for various classes of programming languages (imperative, functional, object oriented, etc.), the computer architecture related aspects, and the optimization methods used to improve the machine code produced by the compiler.

¹ S. Crespi Reghizzi, Formal Languages and Compilation (Springer, London, 2009).

Organization and Features. In other textbooks the bias towards practical aspects has reduced the attention to fundamental concepts. This has prevented their authors from taking advantage of the improvements and simplifications made possible by decades of extensive use, and from avoiding the irritating variants and repetitions that are found in the original papers.
Moving from these premises, we decided to present, in a simple minimalist way, the principles and methods used in designing language syntax and syntax-directed translators.

Chapter 2 covers regular expressions and context-free grammars, with emphasis on the structural adequacy of syntactic definitions and a careful analysis of ambiguity and how to avoid it.

Chapter 3 presents finite-state recognizers and their conversion back and forth to regular expressions and grammars.

Chapter 4 presents push-down machines and parsing algorithms, focusing attention on the LL, LR and Earley methods. We have substantially improved the standard presentation of such algorithms, by unifying the concepts and notations used in various approaches, and by extending the method coverage with a reduced definitional apparatus. An example that expert readers and instructors should appreciate is the unification of the top-down (LL) and bottom-up (LR) parsing algorithms, as well as the tabular (Earley) one, within a novel practical framework. In this way, the effort and space spared have made room for advanced methods typically not present in similar textbooks. First, our parsing algorithms apply to the Extended BNF grammars, which are the de facto standard in the language reference manuals. Second, we provide a parallel parsing algorithm that takes advantage of the many processing units of modern microprocessors, to speed up the analysis of large files.

The book is not restricted to syntax: Chap. 5, the last one, studies translations, semantic functions (attribute grammars), and the static program analysis by data flow equations. This provides a comprehensive understanding of the compilation process, and covers the essential elements of a syntax-directed translator.

The presentation is illustrated by many small yet realistic examples and pictures, to ease the understanding of the theory and the transfer to application.
Theoretical models of automata, transducers and formal grammars are extensively used, whenever practical motivations exist, without insisting too much on their formal definition. Algorithms are described in a pseudo-code to avoid the disturbing details of a programming language, yet they are straightforward to convert to executable procedures.

This book should be welcomed by those willing to teach or to learn the essential concepts of syntax-directed compilation, without the need of relying on software tools and implementations. We believe that learning by doing is not always the best approach, and that over-commitment to practical work may sometimes obscure the conceptual foundations. In the case of formal languages, the elegance and simplicity of the underlying theory allow the students to acquire the fundamental paradigms of language structures, to avoid pitfalls such as ambiguity, and to adequately map structure to meaning. In this field, the relevant algorithms are simple enough to be practiced by paper and pencil. Of course, students should be encouraged to enroll in a hands-on laboratory and to experiment with syntax-directed tools (like flex and bison) on realistic cases.

Intended Audiences. This is primarily a textbook targeted to graduate (or upper-division undergraduate) students in computer science or computer engineering. We list as prerequisites: familiarity with some programming language models and with algorithm design; and the ability to use elementary mathematical and logical notation. If the reader or student has already taken a course on theoretical computer science including a few elements of formal language and automata theory, the time and effort needed to learn the first chapters can be reduced. But it is fair to say that this book does not compete with the many available books on the theory of formal languages and computation: we usually do not include mathematical proofs of properties, and we rely instead on examples and informal arguments. Yet mathematically oriented readers will find here many motivating examples and applications.
A large collection of problems and solutions complementing the numerous examples in the book is available on the authors' course web site at Politecnico di Milano. Similarly, a comprehensive set of lecture slides is also available on the course web site.

The Authors Thank. The colleagues who have taught this book to computer engineering students, Giampaolo Agosta and Licia Sbattella; the Italian National Research Group on Formal Languages, in particular Antonio Restivo, Alessandra Cherubini, Dino Mandrioli, Matteo Pradella and Pierluigi San Pietro; our past and present Ph.D. students and teaching assistants, Alessandro Barenghi, Marcello Bersani, Andrea Di Biagio, Simone Campanoni, Silvia Lovergine, Michele Scandale, Ettore Speziale, Martino Sykora and Michele Tartara. We also acknowledge the support of STMicroelectronics Company for R.&D. projects on compilers for advanced microprocessor architectures.

The first author remembers the late Antonio Grasselli, a pioneer of computer science studies, who first fascinated him with a subject combining linguistic, mathematical, and technological aspects.

Milan, Italy
July 2013

Stefano Crespi Reghizzi
Luca Breveglieri
Angelo Morzenti

Contents

1 Introduction
  1.1 Intended Scope and Audience
  1.2 Compiler Parts and Corresponding Concepts
2 Syntax
  2.1 Introduction
    2.1.1 Artificial and Formal Languages
    2.1.2 Language Types
    2.1.3 Chapter Outline
  2.2 Formal Language Theory
    2.2.1 Alphabet and Language
    2.2.2 Language Operations
    2.2.3 Set Operations
    2.2.4 Star and Cross
    2.2.5 Quotient
  2.3 Regular Expressions and Languages
    2.3.1 Definition of Regular Expression
    2.3.2 Derivation and Language
    2.3.3 Other Operators
    2.3.4 Closure Properties of REG Family
  2.4 Linguistic Abstraction
    2.4.1 Abstract and Concrete Lists
  2.5 Context-Free Generative Grammars
    2.5.1 Limits of Regular Languages
    2.5.2 Introduction to Context-Free Grammars
    2.5.3 Conventional Grammar Representations
    2.5.4 Derivation and Language Generation
    2.5.5 Erroneous Grammars and Useless Rules
    2.5.6 Recursion and Language Infinity
    2.5.7 Syntax Trees and Canonical Derivations
    2.5.8 Parenthesis Languages
    2.5.9 Regular Composition of Context-Free Languages
    2.5.10 Ambiguity
    2.5.11 Catalogue of Ambiguous Forms and Remedies
    2.5.12 Weak and Structural Equivalence
    2.5.13 Grammar Transformations and Normal Forms
  2.6 Grammars of Regular Languages
    2.6.1 From Regular Expressions to Context-Free Grammars
    2.6.2 Linear Grammars
    2.6.3 Linear Language Equations
  2.7 Comparison of Regular and Context-Free Languages
    2.7.1 Limits of Context-Free Languages
    2.7.2 Closure Properties of REG and CF
    2.7.3 Alphabetic Transformations
    2.7.4 Grammars with Regular Expressions
  2.8 More General Grammars and Language Families
    2.8.1 Chomsky Classification
  References
3 Finite Automata as Regular Language Recognizers
  3.1 Introduction
  3.2 Recognition Algorithms and Automata
    3.2.1 A General Automaton
  3.3 Introduction to Finite Automata
  3.4 Deterministic Finite Automata
    3.4.1 Error State and Total Automata
    3.4.2 Clean Automata
    3.4.3 Minimal Automata
    3.4.4 From Automata to Grammars
  3.5 Nondeterministic Automata
    3.5.1 Motivation of Nondeterminism
    3.5.2 Nondeterministic Recognizers
    3.5.3 Automata with Spontaneous Moves
    3.5.4 Correspondence Between Automata and Grammars
    3.5.5 Ambiguity of Automata
    3.5.6 Left-Linear Grammars and Automata
  3.6 Directly from Automata to Regular Expressions: BMC Method
  3.7 Elimination of Nondeterminism
    3.7.1 Construction of Accessible Subsets
  3.8 From Regular Expression to Recognizer
    3.8.1 Thompson Structural Method
    3.8.2 Berry–Sethi Method
  3.9 Regular Expressions with Complement and Intersection
    3.9.1 Product of Automata
  3.10 Summary of Relations Between Regular Languages, Grammars, and Automata
  References
