ebook img

A course in formal languages, automata and groups PDF

162 Pages·2009·1.746 MB·English
by  ChiswellIan
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview A course in formal languages, automata and groups

Universitext Forothertitlespublishedinthisseries,goto www.springer.com/series/223 Ian Chiswell A Course in Formal Languages, Automata and Groups ABC IanChiswell DepartmentofPureMathematics SchoolofMathematicalSciences QueenMary,UniversityofLondon LondonE14NS,UK [email protected] Editorialboard: SheldonAxler,SanFranciscoStateUniversity VincenzoCapasso,Universita`degliStudidiMilano CarlesCasacuberta,UniversitatdeBarcelona AngusMacIntyre,QueenMary,UniversityofLondon KennethRibet,UniversityofCalifornia,Berkeley ClaudeSabbah,CNRS,E´colePolytechnique EndreSu¨li,UniversityofOxford WojborWoyczyn´ski,CaseWesternReserveUniversity ISBN978-1-84800-939-4 e-ISBN978-1-84800-940-0 DOI 10.1007/978-1-84800-940-0 BritishLibraryCataloguinginPublicationData AcataloguerecordforthisbookisavailablefromtheBritishLibrary LibraryofCongressControlNumber:2008939035 MathematicsSubjectClassification(2000):03D10,03D20,20F10,20F65,68Q05, 68Q42,68Q45 Hopcroft/Ullman,FormalLanguagesandTheirRelationtoAutomata(adaptedmaterialfromChapter5 (Section4.2,Section4.3,Theorem5.1,Theorem5.2,andTheorem5.3)andChapter12(Theorem12.4 andTheorem12.9)),(cid:2)c 1969.ReproducedbypermissionofPearsonEducation,Inc. (cid:2)c Springer-VerlagLondonLimited2009 Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permittedundertheCopyright,DesignsandPatentsAct1988,thispublicationmayonlybereproduced, storedortransmitted,inanyformorbyanymeans,withthepriorpermissioninwritingofthepublishers, or in the case of reprographic reproduction in accordance with the terms of licenses issued by the CopyrightLicensingAgency.Enquiriesconcerningreproductionoutsidethosetermsshouldbesentto thepublishers. Theuseofregisterednames,trademarks,etc.,inthispublicationdoesnotimply,evenintheabsenceofa specificstatement,thatsuchnamesareexemptfromtherelevantlawsandregulationsandthereforefree forgeneraluse. Thepublishermakesnorepresentation,expressorimplied,withregardtotheaccuracyoftheinformation containedinthisbookandcannotacceptanylegalresponsibilityorliabilityforanyerrorsoromissions thatmaybemade. Printedonacid-freepaper SpringerScience+BusinessMedia springer.com Preface Thisbookisbasedonnotesforamaster’scoursegivenatQueenMary,University ofLondon,inthe1998/9session.SuchcoursesinLondonarequiteshort,andthe courseconsistedessentiallyofthematerialinthefirstthreechapters,togetherwith a two-hour lecture on connections with group theory. Chapter 5 is a considerably expandedversionofthis. Forthecourse,themainsourceswerethebooksbyHopcroftandUllman([20]), byCohen([4]),andbyEpsteinetal.([7]).Someusewasalsomadeofalaterbook by Hopcroft and Ullman ([21]). The ulterior motive in the first three chapters is to give a rigorous proof that various notions of recursively enumerable language are equivalent. Three such notions are considered. These are: generated by a type 0 grammar, recognised by a Turing machine (deterministic or not) and defined by means of a Go¨del numbering, having defined “recursively enumerable” for sets of natural numbers. It is hoped that this has been achieved without too many argu- mentsusingcomplicatednotation.Thisisaproblemwiththeentiresubject,andit is important to understand the idea of the proof, which is often quite simple. Two particular places that are heavy going are the proof at the end of Chapter 1 that a languagerecognisedbyaTuringmachineistype0,andtheproofinChapter2that aTuringmachinecomputablefunctionispartialrecursive. Chapter1beginsbydiscussinggrammarsandtheChomskyhierarchy,thenthe notion of machine recognition. It is shown that the class of regular languages co- incides with the class recognised by a finite state automaton, whether or not we restrict to deterministic machines, and whether or not blank squares are allowed on the tape. There is also a discussion of Turing machines and the languages they recognise, including the result mentioned above, that a language recognised by a Turing machine is type 0. There are also further characterisations of regular lan- guages, including Kleene’s theorem that they are precisely the rational languages. Thechapterendswithabriefdiscussionofmachinerecognitionofcontext-sensitive languages,whichwasnotincludedinthecourse. Chapter2isaboutcomputablefunctions,andbeginswithastandarddiscussion ofprimitiverecursive,recursiveandpartialrecursivefunctions,andofprimitivere- cursiveandrecursivepredicates.Thenvariousprecisenotionsofcomputabilityare v vi Preface considered.Theseare:computationbyregisterprograms,byabacusmachinesand byTuringmachines.Inallcases,itisshownthatthecomputablefunctionsarepre- ciselythepartialrecursivefunctions.Theaccountfollows[4],exceptthatmodular machinesarenotused.ThisentailsgivingadirectproofthatTuringmachinecom- putableimpliespartialrecursive.Asmentionedabove,thisisheavygoing,although brieferthanifthetheoryofmodularmachineshadbeendeveloped.Toeasematters, theproofofatechnicallemmahasbeenplacedinanappendix. Chapter3beginswithanaccountofrecursivelyenumerablesetsofnaturalnum- bers.RecursivelyenumerablelanguagesaredefinedbymeansofGo¨delnumberings, andwethenproceedtotheproofofthemainresult,previouslymentioned,charac- terisingrecursivelyenumerablelanguages.Thecommentsoncomplexityattheend ofthechapterwerenotincludedinthecourse,andareintendedforuseinChapter5. Chapter 4 is about context-free languages and is material not included in the course. It is considerably heavier going than the previous three chapters. Much of thematerialfollowsthebooksofHopcroftandUllman,includingtheirmorerecent one with Motwani ([22]). Some of the results are needed in Chapter 5. However, theulteriormotiveforthischapteristoclarifytherelationshipbetweenLR(k)lan- guages and deterministic (context-free) languages. Neither [20] nor [21] seems to giveacompleteaccountofthis. Chapter5isonconnectionswithgrouptheory,whichisasubjectofgreatinter- esttotheauthor,andaprimarymotivationforstudyingformallanguagetheory.It beginswiththeauthor’sphilosophicalmusingsontheideaofagrouppresentation, whicharequiteelementary.Thereisabriefdiscussionoffreegroups,freeproducts andHNN-extensions.Mostoftherestofthechapterisdevotedtothewordproblem forgroups.WeproveAnisimov’stheoremthatagrouphasregularwordproblemif, and only if, it is finite. The highlight is a reasonably self-contained account of the resultofMullerandSchupp.Thissaysthatagrouphascontext-freewordproblem if and only if it is free by finite. It makes use of Dunwoody’s result that a finitely presented group is accessible. To give a proof of this would have been too great a digression. A discussion of groups with word problem in other language classes isalsogiven.Thechapterendswithabriefdiscussionof(synchronous)automatic groups, including a proof of the characterisation by means of the fellow traveller property. Expanding the lectures has given Chapter 5 a theme, which is the interplay be- tweengrouptheory,geometry(specifically,theCayleygraph)andformallanguage theory.Itseemslikelythatthereisalotmoretobesaidonthissubject. TheproofsofseveralresultshavebeenplacedinAppendixA,usuallytoimprove theflowofthemaintext.Insomecases,theseweregivenashandoutstotheclass. AppendicesBandCwerealsohandouts,althoughAppendixBhasbeenexpanded to include a brief discussion of universal Turing machines. Appendix D contains solutions to selected exercises. A complete solutions manual, password protected, isavailabletoinstructorsviatheSpringerwebsite.Toapplyforapassword,visitthe bookwebpageatwww.springer.comoremailtextbooks@springer.com.Thenumber ofexercisesisfairlysmall,andtheyvaryindifficulty;someofthemcanbeusedas Preface vii templatesforsimilarexercises(onlytheexercisesinChapters1and2wereactually usedinthecourse). The impetus for the development of formal language theory comes from com- puter science, and as already noted, it can be at times quite complicated. Despite this,itisanelegantpartofpuremathematics.Thebookiswrittenbyamathemati- cianandintendedformathematicians.Nevertheless,itishopeditmaybeofsome interesttocomputerscientists. The book can be viewed only as an introduction to the subject (the audience consistedofgraduatestudentsinmathematics).Forfurtherreadingonformallan- guages,see,forexample,[33]and[34]. Theprerequisiteforunderstandingthebookissomeexposuretoabstractmath- ematics, including an understanding of some basic ideas, such as mapping, Carte- sian product and equivalence relation (note that “mapping” and “function” mean the same thing throughout the book). At various points the reader is assumed to befamiliarwiththecombinatorialideaofagraph.Thisincludesbothdirectedand undirectedgraphsandtheideaofatree.Generally,verticesofagrapharedenoted bycirclesordots,butinthecaseofparsingtrees(Chapter4)theyareindicatedonly by their labels. Of course, in Chapter 5, some knowledge of basic group theory is assumed.Also,thereaderneedstoknowatleastthedefinitionofasemigroupand amonoid.Noadvancedmathematicalknowledgeisneeded. Concerning notation, words in a formal language are elements of a Cartesian product An, where n is an integer, and in this context are usually written without commasandparentheses.InothercaseswhereCartesianproductsareinvolved,for example the transitions of a machine or the definition of grammars and machines, commas and parentheses are used. The exception is in writing the transitions of a Turingmachine,inordertoconformwithwhatappearstobetheusualpractice.Our definitions of grammars and machines are quite formal. This seems the best way to proceed, although it has gone out of fashion when defining basic mathematical objects(suchasagroup).Asusual,Rdenotesthesetofrealnumbers,Qthesetof rational numbers, Z the set of integers and N the set of natural numbers, which in thisbookmeans{0,1,2,...}. TheauthorthanksSarahRees,ClaasRo¨verandRichardThomasfortheirhelpful conversationsandemailmessages.Inparticular,severaloftheargumentsinChapter 5weresuggestedbyRichardThomas.HealsowarmlythanksDanielCohenforhis veryusefulandperceptivecommentsonthemanuscript. Alistoferratawillbeavailableonthebookwebpageatwww.springer.com. QueenMary,UniversityofLondon IanChiswell SchoolofMathematicalSciences June2008 Contents Preface............................................................ v 1 GrammarsandMachineRecognition............................. 1 2 RecursiveFunctions............................................ 21 3 RecursivelyEnumerableSetsandLanguages...................... 49 4 Context-freeLanguages......................................... 59 5 ConnectionswithGroupTheory ................................. 93 A ResultsandProofsOmittedintheText ...........................131 B TheHaltingProblemandUniversalTuringMachines ..............139 C Cantor’sDiagonalArgument....................................141 D SolutionstoSelectedExercises...................................143 References.........................................................151 Index .............................................................153 ix Chapter 1 Grammars and Machine Recognition Byalanguagewehaveinmindawrittenlanguage.Suchalanguage,whethernatural or a programming language, has an alphabet, and words are formed by writing stringsoflettersinthealphabet.(Inthecaseofsomenaturallanguages,thealphabet forthispurposemaynotbewhatisnormallydescribedasthealphabet.)However, to develop a mathematical theory, we need precise definitions of these ideas. An alphabetconsistsofanumberofletters,whicharewritteninacertainway.However, the letters are not physical entities, but abstract concepts. If one writes “a” twice, thetwocopieswillnotlookidentical,butonehopestheyaresufficientlyclosetobe recognisedasrepresentingtheabstractconceptofthefirstletterofthealphabet. To the pure mathematician, this presents no problem. An alphabet is just a set. Thewordsarethenjustfinitesequencesofelementsofthealphabet.Allowingsuch awide-rangingdefinitionwillturnouttobeveryconvenient.Thealphabetcaneven beinfinite,althoughinthisbookitisusuallyfinite.(Theexceptions areinthede- finition of abacus machines in Chap. 2, and the discussion of free products and HNN-extensionsinChap.5.) Thus, let A be a set and let Am be the set of all finite sequences a ...a with 1 m a ∈Afor1≤i≤m.ElementsofAarecalledlettersorsymbols,andelementsof i AmarecalledwordsorstringsoverAoflengthm. Note:misanaturalnumber;A0={ε},whereεistheemptywordhavingnoletters, and A1 can be identified with A. The set Am (m≥2) can be identified with the Cartesian product (cid:2)A×A×(cid:3)(cid:4)...×A(cid:5), but its elements are written without the usual mcopies commasandparentheses. (cid:6) (cid:6) Definition. PutA+= Am,A∗= Am=A+∪{ε}. m≥1 m≥0 Ifα=a ...a ,β=b ...b ∈A∗,defineαβtobea ...a b ...b (anelement 1 m 1 n 1 m 1 n ofAm+n).ThisgivesabinaryoperationonA∗ (andonA+)calledconcatenation.It isassociative:α(βγ)=(αβ)γandαε=εα=α.ThusA+isasemigroup(thefree ∗ semigrouponA)andA isamonoid(thefreemonoidonA).Denotethelengthofa wordαby|α|.Asusual,wecandefineαn,wheren∈N,by:α0=ε,αn+1=αnα. I.Chiswell,ACourseinFormalLanguages,AutomataandGroups, 1 DOI10.1007/978-1-84800-940-0 1, (cid:2)c Springer-VerlagLondonLimited2009 2 1 GrammarsandMachineRecognition If αis a word over an alphabet A, a subword of αis a word γ∈A∗ such that α=βγδfor some β, δ∈A∗. If α=βγ, then βis called a prefix of αand γis calledasuffixofα. ∗ Definition. AlanguagewithalphabetAisasubsetofA . We shall consider languages defined in a particular way, using what is called a rewritingsystem.Thisisessentiallyasetofrules,eachofwhichallowssomestring u, whenever it occurs in a word, to be replaced by another string v. Such a rule is specifiedbytheorderedpair(u,v),leadingtothefollowingformaldefinitions. Definition. ArewritingsystemonAisasubsetofA∗×A∗. If R is a rewriting system and (α,β)∈R, then for any u, v∈A∗, we say that uαv rewritestouβv.Elementsof.Rarewrittenasα−→βratherthan(α,β). Definition. Foru,v∈A∗,u−→vmeansthereisafinitesequenceu=u ,...,u =v 1 n ofelementsofA∗ suchthatu rewritestou .for1≤i≤n−1.Suchasequenceis i i+1 calledanR-derivationofvfromu.(Writeu−→vifnecessary.) R Definition. Agrammarisaquadruple(V ,V ,P,S)where N T (1) V , V are disjoint finite sets (the set of non-terminal and terminal symbols N T respectively). (2) S∈V (thestartsymbol). N (3) PisafiniterewritingsystemonV ∪V . N T (ElementsofParecalledproductionsinthiscontext.) Definition. ThelanguageL generatedbyGis G . (cid:7) (cid:8) L = w∈V∗|S−→w G T (alanguagewithalphabetV ). T Definition. A production is context-free if it has the form A−→α, where A∈V N and α∈(V ∪V )+. It is context-sensitive if it has the form βAγ−→βαγ, where N T A∈V ,α,β,γ∈(V ∪V )∗,α(cid:8)=ε. N N T The reason for the names is that in using a context-free production A−→αin a derivation, A can be replaced in a word by the word αregardless of the context (the strings of letters that appear to the left and right of A in the word). With a context-sensitiveproductionβAγ−→βαγ,whetherornotitcanbeusedtoreplace Abyγdependsonthecontext(βmustoccurtotheleft,andγtotherightofAinthe word).Note,however,thatβ=γ=εisallowedinthedefinitionofcontext-sensitive production,socontext-freeproductionsarecontext-sensitive. TheChomskyhierarchy.Thisisasequenceoffourclassesofgrammars(andcor- respondingclassesoflanguages),eachcontainedinthenext. A grammar G as defined above is said to be of type 0. It is of type 1 if all productionshavetheformα−→βwith|α|≤|β|.

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.