
Finite-State Techniques: Automata, Transducers and Bimachines PDF

315 Pages·2019·3.559 MB·English

Preview Finite-State Techniques: Automata, Transducers and Bimachines

Cambridge Tracts in Theoretical Computer Science 60

Finite-State Techniques

Finite-state methods are the most efficient mechanisms for analysing textual and symbolic data, providing elegant solutions for an immense number of practical problems in computational linguistics and computer science. This book for graduate students and researchers gives a complete coverage of the field, starting from a conceptual introduction and building to advanced topics and applications. The central finite-state technologies are introduced with mathematical rigour, ranging from simple finite-state automata to transducers and bimachines as ‘input-output’ devices. Special attention is given to the rich possibilities of simplifying, transforming and combining finite-state devices.

All algorithms presented are accompanied by full correctness proofs and executable source code in a new programming language, C(M), which focuses on transparency of steps and simplicity of code. Thus, by enabling readers to obtain a deep formal understanding of the subject and to put finite-state methods to real use, this book closes the gap between theory and practice.

STOYAN MIHOV is Associate Professor at the Bulgarian Academy of Sciences (IICT) and a lecturer at Sofia University. He has published several efficient automata constructions and approximate search methods, which are widely used for natural language processing and information retrieval. Dr Mihov has led the development of multiple award-winning systems for language and speech processing.

KLAUS U. SCHULZ is Professor of Information and Language Processing at the Ludwig-Maximilians-Universität in Munich. He has published over 100 articles in distinct fields of computer science, with contributions in approximate search and transducer technology. He was head of many projects in text-correction and digital humanities, on both a national and European level.
CAMBRIDGE TRACTS IN THEORETICAL COMPUTER SCIENCE 60

Editorial Board
S. Abramsky, Department of Computer Science, University of Oxford
P. H. Aczel, School of Computer Science, University of Manchester
Y. Gurevich, Microsoft Research
J. V. Tucker, Department of Computer Science, Swansea University

Titles in the series
A complete list of books in the series can be found at www.cambridge.org/computer-science.
Recent titles include the following:
30. M. Anthony & N. Biggs Computational Learning Theory
31. T. F. Melham Higher Order Logic and Hardware Verification
32. R. Carpenter The Logic of Typed Feature Structures
33. E. G. Manes Predicate Transformer Semantics
34. F. Nielson & H. R. Nielson Two-Level Functional Languages
35. L. M. G. Feijs & H. B. M. Jonkers Formal Specification and Design
36. S. Mauw & G. J. Veltink (eds) Algebraic Specification of Communication Protocols
37. V. Stavridou Formal Methods in Circuit Design
38. N. Shankar Metamathematics, Machines and Gödel’s Proof
39. J. B. Paris The Uncertain Reasoner’s Companion
40. J. Desel & J. Esparza Free Choice Petri Nets
41. J.-J. Ch. Meyer & W. van der Hoek Epistemic Logic for AI and Computer Science
42. J. R. Hindley Basic Simple Type Theory
43. A. S. Troelstra & H. Schwichtenberg Basic Proof Theory
44. J. Barwise & J. Seligman Information Flow
45. A. Asperti & S. Guerrini The Optimal Implementation of Functional Programming Languages
46. R. M. Amadio & P.-L. Curien Domains and Lambda-Calculi
47. W.-P. de Roever & K. Engelhardt Data Refinement
48. H. Kleine Büning & T. Lettmann Propositional Logic
49. L. Novak & A. Gibbons Hybrid Graph Theory and Network Analysis
50. J. C. M. Baeten, T. Basten & M. A. Reniers Process Algebra: Equational Theories of Communicating Processes
51. H. Simmons Derivation and Computation
52. D. Sangiorgi & J. Rutten (eds) Advanced Topics in Bisimulation and Coinduction
53. P. Blackburn, M. de Rijke & Y. Venema Modal Logic
54. W.-P. de Roever et al. Concurrency Verification
55. Terese Term Rewriting Systems
56. A. Bundy et al. Rippling: Meta-Level Guidance for Mathematical Reasoning
57. A. M. Pitts Nominal Sets
58. S. Demri, V. Goranko & M. Lange Temporal Logics in Computer Science
59. B. Jacobs Introduction to Coalgebra

Finite-State Techniques
Automata, Transducers and Bimachines

STOYAN MIHOV
Bulgarian Academy of Sciences

KLAUS U. SCHULZ
Ludwig-Maximilians-Universität München

University Printing House, Cambridge CB2 8BS, United Kingdom
One Liberty Plaza, 20th Floor, New York, NY 10006, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
314–321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre, New Delhi – 110025, India
79 Anson Road, #06–04/06, Singapore 079906

Cambridge University Press is part of the University of Cambridge. It furthers the University’s mission by disseminating knowledge in the pursuit of education, learning, and research at the highest international levels of excellence.

www.cambridge.org
Information on this title: www.cambridge.org/9781108485418
DOI: 10.1017/9781108756945

© Cambridge University Press 2019

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 2019
Printed and bound in Great Britain by Clays Ltd, Elcograf S.p.A.

A catalogue record for this publication is available from the British Library.

Library of Congress Cataloging-in-Publication Data
Names: Mihov, Stoyan, 1968– author. | Schulz, K. U. (Klaus Ulrich), 1957– author.
Title: Finite-state techniques: automata, transducers and bimachines / Stoyan Mihov, Klaus U. Schulz.
Description: New York, NY: Cambridge University Press, 2019. | Series: Cambridge tracts in theoretical computer science; 60
Identifiers: LCCN 2019000810 | ISBN 9781108485418 (hardback)
Subjects: LCSH: Sequential machine theory.
Classification: LCC QA267.5.S4 M525 2019 | DDC 511.3/50285635–dc23
LC record available at https://lccn.loc.gov/2019000810

ISBN 978-1-108-48541-8 Hardback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.
Contents

Preface

PART I FORMAL BACKGROUND

1 Formal Preliminaries
  1.1 Sets, Functions and Relations
  1.2 Lifting Functions to Sets and Tuples
  1.3 Alphabets, Words and Languages
  1.4 Word Tuples, String Relations and String Functions
  1.5 The General Monoidal Perspective
  1.6 Summing Up
  1.7 Exercises for Chapter 1

2 Monoidal Finite-State Automata
  2.1 Basic Concept and Examples
  2.2 Closure Properties of Monoidal Finite-State Automata
  2.3 Monoidal Regular Languages and Monoidal Regular Expressions
  2.4 Equivalence Between Monoidal Regular Languages and Monoidal Automaton Languages
  2.5 Simplifying the Structure of Monoidal Finite-State Automata
  2.6 Summing Up
  2.7 Exercises for Chapter 2

3 Classical Finite-State Automata and Regular Languages
  3.1 Deterministic Finite-State Automata
  3.2 Determinization of Classical Finite-State Automata
  3.3 Additional Closure Properties for Classical Finite-State Automata
  3.4 Minimal Deterministic Finite-State Automata and the Myhill–Nerode Equivalence Relation
  3.5 Minimization of Deterministic Finite-State Automata
  3.6 Coloured Deterministic Finite-State Automata
  3.7 Pseudo-Determinization and Pseudo-Minimization of Monoidal Finite-State Automata
  3.8 Summing Up
  3.9 Exercises for Chapter 3

4 Monoidal Multi-Tape Automata and Finite-State Transducers
  4.1 Monoidal Multi-Tape Automata
  4.2 Additional Closure Properties of Monoidal Multi-Tape Automata
  4.3 Classical Multi-Tape Automata and Letter Automata
  4.4 Monoidal Finite-State Transducers
  4.5 Classical Finite-State Transducers
  4.6 Deciding Functionality of Classical Finite-State Transducers
  4.7 Summing Up
  4.8 Exercises for Chapter 4

5 Deterministic Transducers
  5.1 Deterministic Transducers and Subsequential Transducers
  5.2 A Determinization Procedure for Functional Transducers with the Bounded Variation Property
  5.3 Deciding the Bounded Variation Property
  5.4 Minimal Subsequential Finite-State Transducers: Myhill–Nerode Relation for Subsequential Transducers
  5.5 Minimization of Subsequential Transducers
  5.6 Numerical Subsequential Transducers
  5.7 Summing Up
  5.8 Bibliographic Notes
  5.9 Exercises for Chapter 5

6 Bimachines
  6.1 Basic Definitions
  6.2 Equivalence of Regular String Functions and Classical Bimachines
  6.3 Pseudo-Minimization of Monoidal Bimachines
  6.4 Direct Composition of Classical Bimachines
  6.5 Summing Up
  6.6 Exercises for Chapter 6

PART II FROM THEORY TO PRACTICE

7 The C(M) language
  7.1 Basics and Simple Examples
  7.2 Types, Terms and Statements in C(M)

8 C(M) Implementation of Finite-State Devices
  8.1 C(M) Implementations for Automata Algorithms
  8.2 C(M) Programs for Classical Finite-State Transducers
  8.3 C(M) Programs for Deterministic Transducers
  8.4 C(M) Programs for Bimachines

9 The Aho–Corasick Algorithm
  9.1 Formal Construction – First Version
  9.2 Linear Computation of the Aho–Corasick Automaton
  9.3 Space-Efficient Variant – Construction of the Aho–Corasick f-Automaton

10 The Minimal Deterministic Finite-State Automaton for a Finite Language
  10.1 Formal Construction
  10.2 C(M) Implementation of the Construction – First Version
  10.3 Efficient Construction of the Minimal Dictionary Automaton
  10.4 Adapting the Language of Minimal Dictionary Automata
  10.5 The Minimal Subsequential Transducer for a Finite Two-Sided Dictionary

11 Constructing Finite-State Devices for Text Rewriting
  11.1 Simple Text Rewriting Based on Regular Relations
  11.2 Using Deterministic Machines for Simple Text Rewriting
  11.3 Leftmost-Longest Match Text Rewriting
  11.4 Regular Relations for Leftmost-Longest Match Rewriting

References
Index

Preface

Finite-state techniques provide theoretically elegant and computationally efficient solutions for various (hard, non-trivial) problems in text and natural language processing. Due to its importance in many fundamental applications, the theory of finite-state automata and related finite-state machines has been extensively studied and its development still continues.
This textbook describes the basics of finite-state technology, following a combined mathematical and implementational point of view. It is written for advanced undergraduate and graduate students in computer science, computational linguistics and mathematics. Though concepts are introduced in a mathematically rigorous way and correctness proofs for all procedures are given, the book is not meant as a purely theoretical introduction to the subject. The ultimate goal is to bring students to a position where they can both understand and implement complex finite-state based procedures for practically relevant tasks. Some of the specific features of the book are the following.

1. The spectrum of finite-state machines that are covered is not restricted to classical finite-state automata and ‘recognition’ tools. We also treat important ‘input-output’ and ‘translation’ devices such as multi-tape automata, finite-state transducers and bimachines. All these machines can be used, for example, for efficient text rewriting, information extraction from textual corpora, and morphological analysis.

2. After a conceptual introduction, full implementations/executable programs are given for all procedures, including a documentation of the programming code. In this way it is possible to observe the concrete behaviour of algorithms for examples selected by the readers. It is also not difficult to enhance the given programs by means of self-written trace functionalities, which helps even better understanding of particular details of the algorithms and programs presented.
