Speech and Language Processing
PRENTICE HALL SERIES
IN ARTIFICIAL INTELLIGENCE
Stuart Russell and Peter Norvig, Editors
FORSYTH & PONCE Computer Vision: A Modern Approach
GRAHAM ANSI Common Lisp
JURAFSKY & MARTIN Speech and Language Processing
NEAPOLITAN Learning Bayesian Networks
RUSSELL & NORVIG Artificial Intelligence: A Modern Approach
Speech and Language Processing
An Introduction to Natural Language Processing,
Computational Linguistics, and Speech Recognition
Daniel Jurafsky and James H. Martin
Upper Saddle River, New Jersey 07458
Library of Congress Cataloging-in-Publication Data
Jurafsky, Daniel S. (Daniel Saul)
Speech and Language Processing / Daniel Jurafsky, James H. Martin.
p. cm.
Includes bibliographical references and index.
ISBN 0-13-095069-6 FIXTHIS
Editor-in-Chief: FIXTHISSTUFF
Publisher: Tracy Dunkelberger
Editorial/production supervision: Scott Disanno
Editorial assistant:
Executive managing editor:
Cover design director:
Cover design execution:
Manufacturing manager:
Manufacturing buyer:
Assistant vice-president of production and manufacturing:
Cover design: Daniel Jurafsky, James H. Martin, and Linda Martin. FIXTHIS The front cover drawing
is the action for the Jacquard Loom (Usher, 1954). The back cover drawing is Alexander Graham Bell’s
Gallows telephone (Rhodes, 1929).
This book was set in Times-Roman and TIPA (IPA) by the authors using LaTeX2e.
© 2008 by Prentice-Hall, Inc.
Pearson Higher Education
Upper Saddle River, New Jersey 07458
The author and publisher of this book have used their best efforts in preparing this book. These efforts
include the development, research, and testing of the theories and programs to determine their effectiveness.
The author and publisher shall not be liable in any event for incidental or consequential damages in
connection with, or arising out of, the furnishing, performance, or use of these programs.
All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission
in writing from the publisher.
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
ISBN 0-13-095069-6
FIXTHISTOO
Prentice-Hall International (UK) Limited, London
Prentice-Hall of Australia Pty. Limited, Sydney
Prentice-Hall Canada, Inc., Toronto
Prentice-Hall Hispanoamericana, S.A., Mexico
Prentice-Hall of India Private Limited, New Delhi
Prentice-Hall of Japan, Inc., Tokyo
Prentice-Hall Asia Pte. Ltd., Singapore
Editora Prentice-Hall do Brasil, Ltda., Rio de Janeiro
For —D.J.
For — J.M.
Summary of Contents
Preface . . . xxiii
1 Introduction . . . 1
I Words
2 Regular Expressions and Automata . . . 17
3 Words & Transducers . . . 45
4 N-grams . . . 83
5 Part-of-Speech Tagging . . . 123
6 Hidden Markov and Maximum Entropy Models . . . 173
II Speech
7 Phonetics . . . 215
8 Speech Synthesis . . . 249
9 Automatic Speech Recognition . . . 287
10 Speech Recognition: Advanced Topics . . . 337
11 Computational Phonology . . . 365
III Syntax
12 Formal Grammars of English . . . 389
13 Parsing with Context-Free Grammars . . . 431
14 Statistical Parsing . . . 465
15 Features and Unification . . . 495
16 Language and Complexity . . . 537
IV Semantics and Pragmatics
17 Representing Meaning . . . 553
18 Computational Semantics . . . 593
19 Lexical Semantics . . . 627
20 Computational Lexical Semantics . . . 653
21 Computational Discourse . . . 697
V Applications
22 Information Extraction . . . 741
23 Question Answering and Summarization . . . 783
24 Dialogue and Conversational Agents . . . 829
25 Machine Translation . . . 879
Bibliography 929
Index 981
Contents
Preface xxiii
1 Introduction 1
1.1 Knowledge in Speech and Language Processing . . . 2
1.2 Ambiguity . . . 4
1.3 Models and Algorithms . . . 5
1.4 Language, Thought, and Understanding . . . 6
1.5 The State of the Art . . . 8
1.6 Some Brief History . . . 9
1.6.1 Foundational Insights: 1940s and 1950s . . . 9
1.6.2 The Two Camps: 1957–1970 . . . 10
1.6.3 Four Paradigms: 1970–1983 . . . 11
1.6.4 Empiricism and Finite State Models Redux: 1983–1993 . . . 12
1.6.5 The Field Comes Together: 1994–1999 . . . 12
1.6.6 The Rise of Machine Learning: 2000–2007 . . . 13
1.6.7 On Multiple Discoveries . . . 13
1.6.8 A Final Brief Note on Psychology . . . 14
1.7 Summary . . . 15
Bibliographical and Historical Notes . . . 15
I Words
2 Regular Expressions and Automata 17
2.1 Regular Expressions . . . 17
2.1.1 Basic Regular Expression Patterns . . . 18
2.1.2 Disjunction, Grouping, and Precedence . . . 21
2.1.3 A Simple Example . . . 22
2.1.4 A More Complex Example . . . 23
2.1.5 Advanced Operators . . . 24
2.1.6 Regular Expression Substitution, Memory, and ELIZA . . . 25
2.2 Finite-State Automata . . . 26
2.2.1 Using an FSA to Recognize Sheeptalk . . . 26
2.2.2 Formal Languages . . . 30
2.2.3 Another Example . . . 31
2.2.4 Non-Deterministic FSAs . . . 32
2.2.5 Using an NFSA to Accept Strings . . . 33
2.2.6 Recognition as Search . . . 35
2.2.7 Relating Deterministic and Non-Deterministic Automata . . . 38
2.3 Regular Languages and FSAs . . . 38
2.4 Summary . . . 40
Bibliographical and Historical Notes . . . 42
Exercises . . . 42
3 Words & Transducers 45
3.1 Survey of (Mostly) English Morphology . . . 47
3.1.1 Inflectional Morphology . . . 48
3.1.2 Derivational Morphology . . . 50
3.1.3 Cliticization . . . 51
3.1.4 Non-concatenative Morphology . . . 52
3.1.5 Agreement . . . 52
3.2 Finite-State Morphological Parsing . . . 53
3.3 Building a Finite-State Lexicon . . . 54
3.4 Finite-State Transducers . . . 57
3.4.1 Sequential Transducers and Determinism . . . 59
3.5 FSTs for Morphological Parsing . . . 60
3.6 Transducers and Orthographic Rules . . . 63
3.7 Combining FST Lexicon and Rules . . . 65
3.8 Lexicon-Free FSTs: The Porter Stemmer . . . 68
3.9 Word and Sentence Tokenization . . . 69
3.9.1 Segmentation in Chinese . . . 70
3.10 Detecting and Correcting Spelling Errors . . . 72
3.11 Minimum Edit Distance . . . 74
3.12 Human Morphological Processing . . . 77
3.13 Summary . . . 79
Bibliographical and Historical Notes . . . 80
Exercises . . . 81
4 N-grams 83
4.1 Counting Words in Corpora . . . 84
4.2 Simple (Unsmoothed) N-grams . . . 86
4.3 Training and Test Sets . . . 91
4.3.1 N-gram Sensitivity to the Training Corpus . . . 92
4.3.2 Unknown Words: Open versus closed vocabulary tasks . . . 94
4.4 Evaluating N-grams: Perplexity . . . 95
4.5 Smoothing . . . 97
4.5.1 Laplace Smoothing . . . 98
4.5.2 Good-Turing Discounting . . . 101
4.5.3 Some advanced issues in Good-Turing estimation . . . 102
4.6 Interpolation . . . 103
4.7 Backoff . . . 105
4.7.1 Advanced: Details of computing Katz backoff α and P* . . . 106
4.8 Practical Issues: Toolkits and Data Formats . . . 107
4.9 Advanced Issues in Language Modeling . . . 109
4.9.1 Advanced Smoothing Methods: Kneser-Ney Smoothing . . . 109
4.9.2 Class-based N-grams . . . 111
4.9.3 Language Model Adaptation and Using the Web . . . 111
4.9.4 Using Longer Distance Information: A Brief Summary . . . 112
4.10 Advanced: Information Theory Background . . . 113
4.10.1 Cross-Entropy for Comparing Models . . . 116