
Speech and Language Processing PDF

975 pages · 1999 · 4.378 MB

Preview Speech and Language Processing

PRENTICE HALL SERIES IN ARTIFICIAL INTELLIGENCE
Stuart Russell and Peter Norvig, Editors

GRAHAM  ANSI Common Lisp
MUGGLETON  Logical Foundations of Machine Learning
RUSSELL & NORVIG  Artificial Intelligence: A Modern Approach
JURAFSKY & MARTIN  Speech and Language Processing

Speech and Language Processing
An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition

Daniel Jurafsky and James H. Martin

Draft of September 28, 1999. Do not cite without permission.

Contributing writers: Andrew Kehler, Keith Vander Linden, Nigel Ward

Prentice Hall, Englewood Cliffs, New Jersey 07632

Library of Congress Cataloging-in-Publication Data
Jurafsky, Daniel S. (Daniel Saul)
Speech and Language Processing / Daniel Jurafsky, James H. Martin.
p. cm.
Includes bibliographical references and index.
ISBN

Publisher: Alan Apt

© 2000 by Prentice-Hall, Inc.
A Simon & Schuster Company
Englewood Cliffs, New Jersey 07632

The author and publisher of this book have used their best efforts in preparing this book. These efforts include the development, research, and testing of the theories and programs to determine their effectiveness. The author and publisher shall not be liable in any event for incidental or consequential damages in connection with, or arising out of, the furnishing, performance, or use of these programs.

All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission in writing from the publisher.

Printed in the United States of America
10 9 8 7 6 5 4 3 2 1

Prentice-Hall International (UK) Limited, London
Prentice-Hall of Australia Pty. Limited, Sydney
Prentice-Hall Canada, Inc., Toronto
Prentice-Hall Hispanoamericana, S.A., Mexico
Prentice-Hall of India Private Limited, New Delhi
Prentice-Hall of Japan, Inc., Tokyo
Simon & Schuster Asia Pte. Ltd., Singapore
Editora Prentice-Hall do Brasil, Ltda., Rio de Janeiro

For my parents — D.J.
For Linda — J.M.

Summary of Contents

1 Introduction

I Words
2 Regular Expressions and Automata
3 Morphology and Finite-State Transducers
4 Computational Phonology and Text-to-Speech
5 Probabilistic Models of Pronunciation and Spelling
6 N-grams
7 HMMs and Speech Recognition

II Syntax
8 Word Classes and Part-of-Speech Tagging
9 Context-Free Grammars for English
10 Parsing with Context-Free Grammars
11 Features and Unification
12 Lexicalized and Probabilistic Parsing
13 Language and Complexity

III Semantics
14 Representing Meaning
15 Semantic Analysis
16 Lexical Semantics
17 Word Sense Disambiguation and Information Retrieval

IV Pragmatics
18 Discourse
19 Dialogue and Conversational Agents
20 Generation
21 Machine Translation

A Regular Expression Operators
B The Porter Stemming Algorithm
C C5 and C7 tagsets
D Training HMMs: The Forward-Backward Algorithm

Bibliography
Index

Contents

1 Introduction
  1.1 Knowledge in Speech and Language Processing
  1.2 Ambiguity
  1.3 Models and Algorithms
  1.4 Language, Thought, and Understanding
  1.5 The State of the Art and The Near-Term Future
  1.6 Some Brief History
    Foundational Insights: 1940’s and 1950’s
    The Two Camps: 1957–1970
    Four Paradigms: 1970–1983
    Empiricism and Finite State Models Redux: 1983–1993
    The Field Comes Together: 1994–1999
    A Final Brief Note on Psychology
  1.7 Summary
  Bibliographical and Historical Notes

I Words

2 Regular Expressions and Automata
  2.1 Regular Expressions
    Basic Regular Expression Patterns
    Disjunction, Grouping, and Precedence
    A simple example
    A More Complex Example
    Advanced Operators
    Regular Expression Substitution, Memory, and ELIZA
  2.2 Finite-State Automata
    Using an FSA to Recognize Sheeptalk
    Formal Languages
    Another Example
    Nondeterministic FSAs
    Using an NFSA to accept strings
    Recognition as Search
    Relating Deterministic and Non-deterministic Automata
  2.3 Regular Languages and FSAs
  2.4 Summary
  Bibliographical and Historical Notes
  Exercises

3 Morphology and Finite-State Transducers
  3.1 Survey of (Mostly) English Morphology
    Inflectional Morphology
    Derivational Morphology
  3.2 Finite-State Morphological Parsing
    The Lexicon and Morphotactics
    Morphological Parsing with Finite-State Transducers
    Orthographic Rules and Finite-State Transducers
  3.3 Combining FST Lexicon and Rules
  3.4 Lexicon-free FSTs: The Porter Stemmer
  3.5 Human Morphological Processing
  3.6 Summary
  Bibliographical and Historical Notes
  Exercises

4 Computational Phonology and Text-to-Speech
  4.1 Speech Sounds and Phonetic Transcription
    The Vocal Organs
    Consonants: Place of Articulation
    Consonants: Manner of Articulation
    Vowels
  4.2 The Phoneme and Phonological Rules
  4.3 Phonological Rules and Transducers
  4.4 Advanced Issues in Computational Phonology
    Harmony
    Templatic Morphology
    Optimality Theory
  4.5 Machine Learning of Phonological Rules
  4.6 Mapping Text to Phones for TTS
    Pronunciation dictionaries
    Beyond Dictionary Lookup: Text Analysis
    An FST-based pronunciation lexicon
  4.7 Prosody in TTS
    Phonological Aspects of Prosody
    Phonetic or Acoustic Aspects of Prosody
    Prosody in Speech Synthesis
