
Automata Theory and Formal Languages: Fundamental Notions, Theorems, and Techniques

287 pages · 2022 · 3.158 MB

Preview Automata Theory and Formal Languages: Fundamental Notions, Theorems, and Techniques

Undergraduate Topics in Computer Science

Alberto Pettorossi
Automata Theory and Formal Languages: Fundamental Notions, Theorems, and Techniques

Series Editor
Ian Mackie, University of Sussex, Brighton, UK

Advisory Editors
Samson Abramsky, Department of Computer Science, University of Oxford, Oxford, UK
Chris Hankin, Department of Computing, Imperial College London, London, UK
Mike Hinchey, Lero – The Irish Software Research Centre, University of Limerick, Limerick, Ireland
Dexter C. Kozen, Department of Computer Science, Cornell University, Ithaca, NY, USA
Andrew Pitts, Department of Computer Science and Technology, University of Cambridge, Cambridge, UK
Hanne Riis Nielson, Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby, Denmark
Steven S. Skiena, Department of Computer Science, Stony Brook University, Stony Brook, NY, USA
Iain Stewart, Department of Computer Science, Durham University, Durham, UK

‘Undergraduate Topics in Computer Science’ (UTiCS) delivers high-quality instructional content for undergraduates studying in all areas of computing and information science. From core foundational and theoretical material to final-year topics and applications, UTiCS books take a fresh, concise, and modern approach and are ideal for self-study or for a one- or two-semester course. The texts are all authored by established experts in their fields, reviewed by an international advisory board, and contain numerous examples and problems, many of which include fully worked solutions.

The UTiCS concept relies on high-quality, concise books in softback format, and generally a maximum of 275–300 pages. For undergraduate textbooks that are likely to be longer, more expository, Springer continues to offer the highly regarded Texts in Computer Science series, to which we refer potential authors.
Alberto Pettorossi
University of Rome Tor Vergata, Rome, Italy
IASI-CNR, Rome, Italy

ISSN 1863-7310
ISSN 2197-1781 (electronic)
Undergraduate Topics in Computer Science
ISBN 978-3-031-11964-4
ISBN 978-3-031-11965-1 (eBook)
https://doi.org/10.1007/978-3-031-11965-1

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2022

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

In this book we present some basic notions and results on Automata Theory, Formal Language Theory, Computability Theory, and Parsing Theory.
In particular, we consider the class of regular languages, which are related to the class of finite automata, and the class of context-free languages, which are related to the class of pushdown automata. For finite automata we also study the problem of their minimization and the characterization of their behaviour using regular expressions. For context-free languages we illustrate how to derive their grammars in Chomsky and Greibach normal form. We study the relationship between deterministic and nondeterministic pushdown automata and the context-free languages they accept. We also present some fundamental techniques for parsing both regular and context-free languages.

Then we consider more powerful automata and we illustrate the relationship between linear bounded automata and context-sensitive languages, and between Turing Machines and type 0 languages. Chapter 6 of the book is dedicated to the analysis of various decidability and undecidability problems in context-free languages.

In the Supplementary Topics chapter we deal with other classes of machines and languages, such as counter machines, stack automata, and abstract families of languages. We also present some additional properties of finite automata, regular grammars, and context-free grammars, we present a sufficient condition for the existence of a bijection between sets, and we prove the existence of functions that are not computable.

This book was written for a course on ‘Automata, Languages, and Translators’ taught at the University of Roma Tor Vergata. A theorem with number k.m.n is in Chapter k, Section m, and within that section it is identified by the number n. An analogous numbering system is used for algorithms, corollaries, definitions, examples, exercises, figures, and remarks. I use ‘iff’ as an abbreviation for ‘if and only if’.
Many thanks to my colleagues of the Department of Informatics, Systems, and Production and the Department of Civil Engineering and Informatics of the University of Roma Tor Vergata, and the IASI Institute of the National Research Council of Italy. I am also grateful to all my students and co-workers for their help and encouragement, and in particular to Alessandro Cacciotti, Lorenzo Clemente, Emanuele De Angelis, Corrado Di Pietro, Fabio Fioravanti, Fulvio Forni, Fabio Lecca, Maurizio Proietti, Marco Scarlino, and Valerio Senni. Thanks also to Ms. Michela Castrica, Mr. Ralf Gerstner, Mr. Ronan Nugent, and Mr. Wayne Wheeler of Springer for their most appreciated cooperation and help.

Previous editions of this book were published by the Aracne Publishing Company, Ariccia (RM), Italy.

Roma, June 2022
Alberto Pettorossi

Contents

Preface
1. Formal Grammars and Languages
  1.1. Free Monoids
  1.2. Formal Grammars
  1.3. The Chomsky Hierarchy
  1.4. Chomsky Normal Form and Greibach Normal Form
  1.5. Epsilon Productions
  1.6. Derivations in Context-Free Grammars
  1.7. Substitutions and Homomorphisms
2. Finite Automata and Regular Grammars
  2.1. Deterministic and Nondeterministic Finite Automata
  2.2. Nondeterministic Finite Automata and S-extended Type 3 Grammars
  2.3. Finite Automata and Transition Graphs
  2.4. Left Linear and Right Linear Regular Grammars
  2.5. Finite Automata and Regular Expressions
  2.6. Arden Rule
  2.7. Axiomatization of Equations Between Regular Expressions
  2.8. Minimization of Finite Automata
  2.9. Pumping Lemma for Regular Languages
  2.10. A Parser for Regular Languages
    2.10.1. A Java Program for Parsing Regular Languages
  2.11. Generalizations of Finite Automata
    2.11.1. Moore Machines
    2.11.2. Mealy Machines
    2.11.3. Generalized Sequential Machines
  2.12. Closure Properties of Regular Languages
  2.13. Decidability Properties of Regular Languages
3. Pushdown Automata and Context-Free Grammars
  3.1. Pushdown Automata and Context-Free Languages
  3.2. From PDA’s to Context-Free Grammars and Back: Some Examples
  3.3. Deterministic PDA’s and Deterministic Context-Free Languages
  3.4. Deterministic PDA’s and Grammars in Greibach Normal Form
  3.5. Simplifications of Context-Free Grammars
    3.5.1. Elimination of Nonterminal Symbols That Do Not Generate Words
    3.5.2. Elimination of Symbols Unreachable from the Start Symbol
    3.5.3. Elimination of Epsilon Productions
    3.5.4. Elimination of Unit Productions
    3.5.5. Elimination of Left Recursion
  3.6. Construction of the Chomsky Normal Form
  3.7. Construction of the Greibach Normal Form
  3.8. Theory of Language Equations
  3.9. Summary on the Transformations of Context-Free Grammars
  3.10. Self-Embedding Property of Context-Free Grammars
  3.11. Pumping Lemma for Context-Free Languages
  3.12. Ambiguity and Inherent Ambiguity
  3.13. Closure Properties of Context-Free Languages
  3.14. Basic Decidable Properties of Context-Free Languages
  3.15. Parsers for Context-Free Languages
    3.15.1. The Cocke-Younger-Kasami Parser
    3.15.2. The Earley Parser
  3.16. Parsing Classes of Deterministic Context-Free Languages
  3.17. Closure Properties of Deterministic Context-Free Languages
  3.18. Decidable Properties of Deterministic Context-Free Languages
4. Linear Bounded Automata and Context-Sensitive Grammars
  4.1. Recursiveness of Context-Sensitive Languages
5. Turing Machines and Type 0 Grammars
  5.1. Turing Machines
  5.2. Equivalence Between Turing Machines and Type 0 Languages
6. Decidability and Undecidability in Context-Free Languages
  6.1. Preliminary Definitions and Theorems
  6.2. Basic Decidability and Undecidability Results
    6.2.1. Basic Undecidable Properties of Context-Free Languages
  6.3. Decidability in Deterministic Context-Free Languages
  6.4. Undecidability in Deterministic Context-Free Languages
  6.5. Undecidable Properties of Linear Context-Free Languages
7. Supplementary Topics
  7.1. Iterated Counter Machines and Counter Machines
  7.2. Stack Automata
  7.3. Relationships Among Various Classes of Automata
  7.4. Decidable Properties of Classes of Languages
  7.5. Algebraic and Closure Properties of Classes of Languages
  7.6. Abstract Families of Languages
  7.7. Finite Automata to/from S-extended Regular Grammars
  7.8. Context-Free Grammars over Singleton Terminal Alphabets
  7.9. The Bernstein Theorem
  7.10. Existence of Functions That Are Not Computable
List of Algorithms and Programs
Index
Bibliography

CHAPTER 1
Formal Grammars and Languages

In this chapter we introduce some basic notions and some notations we will use in the book. In particular, we introduce the notions of a free monoid, a formal grammar and its generated language, the Chomsky hierarchy, the Kuroda normal form, the Chomsky normal form, and the Greibach normal form. We also examine the effects of the presence of the epsilon production in formal grammars. Finally, we study the derivations in context-free languages and the notions of a substitution and a homomorphism.

1.1. Free Monoids

The set of natural numbers $\{0, 1, 2, \ldots\}$ is denoted by $N$.

Given a set $A$, $|A|$ denotes the cardinality of $A$, and $2^A$ denotes the powerset of $A$, that is, the set of all subsets of $A$. Instead of $2^A$, we will also write $\mathit{Powerset}(A)$. The set of all finite subsets of $A$ is denoted by $P_{\mathrm{fin}}(A)$.

We say that a set $S$ is countable iff either $S$ is finite or there exists a bijection between $S$ and the set $N$ of natural numbers.

Let us consider a countable set $V$, also called an alphabet. The elements of $V$ are called symbols.
The free monoid generated by the set $V$ is the set, denoted $V^*$, consisting of all finite sequences of symbols in $V$, that is,

$$V^* = \{ v_1 \ldots v_n \mid n \geq 0 \text{ and for } i = 1, \ldots, n,\ v_i \in V \}.$$

The unary operation $^*$ (pronounced ‘star’) is called the Kleene star, or the Kleene closure, or the $^*$closure (pronounced ‘the star closure’). Sequences of symbols are also called words or strings. The length of a sequence $v_1 \ldots v_n$ is $n$. The sequence of length 0 is called the empty sequence or empty word and it is denoted by $\varepsilon$. The length of a sequence $w$ is denoted by $|w|$. For all $w \in V^*$, for all $a \in V$, the number of occurrences of the symbol $a$ in the word $w$ is denoted by $|w|_a$.

Given two sequences $w_1$ and $w_2$ in $V^*$, their concatenation, denoted $w_1 \cdot w_2$ or simply $w_1 w_2$, is the sequence in $V^*$ defined by recursion on the length of $w_1$ as follows:

$$
w_1 \cdot w_2 =
\begin{cases}
w_2 & \text{if } w_1 = \varepsilon \\
v_1 \, ((v_2 \ldots v_n) \cdot w_2) & \text{if } w_1 = v_1 v_2 \ldots v_n \text{ with } n > 0.
\end{cases}
$$

We have that $|w_1 \cdot w_2| = |w_1| + |w_2|$. The concatenation operation $\cdot$ is associative and its neutral element is the empty sequence $\varepsilon$.

Any set of sequences which is a subset of $V^*$ is called a language (or a formal language) over the alphabet $V$.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022. A. Pettorossi, Automata Theory and Formal Languages, Undergraduate Topics in Computer Science, https://doi.org/10.1007/978-3-031-11965-1_1
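The recursive definition of concatenation given in Section 1.1 can be tried out with a minimal Python sketch. It is not taken from the book: representing words as tuples of symbols, and the names `EPSILON`, `concat`, `occurrences`, and `words_up_to`, are assumptions made only for illustration. Since the free monoid over a nonempty alphabet is infinite, `words_up_to` enumerates only the finite slice of words whose length does not exceed a given bound.

```python
from itertools import product

# Assumption for this sketch: a word is a tuple of symbols,
# and the empty word (epsilon) is the empty tuple.
EPSILON = ()


def concat(w1, w2):
    """Concatenate w1 and w2 by recursion on the length of w1."""
    if len(w1) == 0:
        # Base case: w1 is the empty word, so the result is w2.
        return w2
    # Recursive case: w1 = v1 v2 ... vn with n > 0.
    v1, rest = w1[0], w1[1:]
    return (v1,) + concat(rest, w2)


def occurrences(w, a):
    """Number of occurrences of the symbol a in the word w."""
    return sum(1 for symbol in w if symbol == a)


def words_up_to(alphabet, max_len):
    """Yield every word over `alphabet` of length at most `max_len`."""
    for n in range(max_len + 1):
        for w in product(alphabet, repeat=n):
            yield w


if __name__ == "__main__":
    u, v, w = ("a", "b"), ("b",), ("a", "a")
    # Length is additive under concatenation.
    print(len(concat(u, v)) == len(u) + len(v))
    # Concatenation is associative on this sample.
    print(concat(concat(u, v), w) == concat(u, concat(v, w)))
    # The empty word is the neutral element on both sides.
    print(concat(EPSILON, u) == u and concat(u, EPSILON) == u)
```

The recursion follows the two cases of the definition directly, which keeps the correspondence with the text easy to check. It is not meant to be efficient: Python's built-in tuple concatenation `w1 + w2` computes the same result without recursion.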
