
Speech and Language Processing PDF

629 Pages·2021·22.486 MB·

Preview Speech and Language Processing

Speech and Language Processing
An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition
Third Edition draft

Daniel Jurafsky, Stanford University
James H. Martin, University of Colorado at Boulder

Copyright ©2020. All rights reserved. Draft of September 21, 2021. Comments and typos welcome!

Summary of Contents

1 Introduction
2 Regular Expressions, Text Normalization, Edit Distance
3 N-gram Language Models
4 Naive Bayes and Sentiment Classification
5 Logistic Regression
6 Vector Semantics and Embeddings
7 Neural Networks and Neural Language Models
8 Sequence Labeling for Parts of Speech and Named Entities
9 Deep Learning Architectures for Sequence Processing
10 Machine Translation and Encoder-Decoder Models
11 Transfer Learning with Contextual Embeddings and Pre-trained Language Models
12 Constituency Grammars
13 Constituency Parsing
14 Dependency Parsing
15 Logical Representations of Sentence Meaning
16 Computational Semantics and Semantic Parsing
17 Information Extraction
18 Word Senses and WordNet
19 Semantic Role Labeling
20 Lexicons for Sentiment, Affect, and Connotation
21 Coreference Resolution
22 Discourse Coherence
23 Question Answering
24 Chatbots & Dialogue Systems
25 Phonetics
26 Automatic Speech Recognition and Text-to-Speech
Bibliography
Subject Index

Contents

1 Introduction
2 Regular Expressions, Text Normalization, Edit Distance
  2.1 Regular Expressions
  2.2 Words
  2.3 Corpora
  2.4 Text Normalization
  2.5 Minimum Edit Distance
  2.6 Summary
  Bibliographical and Historical Notes
  Exercises
3 N-gram Language Models
  3.1 N-Grams
  3.2 Evaluating Language Models
  3.3 Sampling sentences from a language model
  3.4 Generalization and Zeros
  3.5 Smoothing
  3.6 Kneser-Ney Smoothing
  3.7 Huge Language Models and Stupid Backoff
  3.8 Advanced: Perplexity’s Relation to Entropy
  3.9 Summary
  Bibliographical and Historical Notes
  Exercises
4 Naive Bayes and Sentiment Classification
  4.1 Naive Bayes Classifiers
  4.2 Training the Naive Bayes Classifier
  4.3 Worked example
  4.4 Optimizing for Sentiment Analysis
  4.5 Naive Bayes for other text classification tasks
  4.6 Naive Bayes as a Language Model
  4.7 Evaluation: Precision, Recall, F-measure
  4.8 Test sets and Cross-validation
  4.9 Statistical Significance Testing
  4.10 Avoiding Harms in Classification
  4.11 Summary
  Bibliographical and Historical Notes
  Exercises
5 Logistic Regression
  5.1 Classification: the sigmoid
  5.2 Learning in Logistic Regression
  5.3 The cross-entropy loss function
  5.4 Gradient Descent
  5.5 Regularization
  5.6 Multinomial logistic regression
  5.7 Interpreting models
  5.8 Advanced: Deriving the Gradient Equation
  5.9 Summary
  Bibliographical and Historical Notes
  Exercises
6 Vector Semantics and Embeddings
  6.1 Lexical Semantics
  6.2 Vector Semantics
  6.3 Words and Vectors
  6.4 Cosine for measuring similarity
  6.5 TF-IDF: Weighing terms in the vector
  6.6 Pointwise Mutual Information (PMI)
  6.7 Applications of the tf-idf or PPMI vector models
  6.8 Word2vec
  6.9 Visualizing Embeddings
  6.10 Semantic properties of embeddings
  6.11 Bias and Embeddings
  6.12 Evaluating Vector Models
  6.13 Summary
  Bibliographical and Historical Notes
  Exercises
7 Neural Networks and Neural Language Models
  7.1 Units
  7.2 The XOR problem
  7.3 Feedforward Neural Networks
  7.4 Feedforward networks for NLP: Classification
  7.5 Feedforward Neural Language Modeling
  7.6 Training Neural Nets
  7.7 Training the neural language model
  7.8 Summary
  Bibliographical and Historical Notes
8 Sequence Labeling for Parts of Speech and Named Entities
  8.1 (Mostly) English Word Classes
  8.2 Part-of-Speech Tagging
  8.3 Named Entities and Named Entity Tagging
  8.4 HMM Part-of-Speech Tagging
  8.5 Conditional Random Fields (CRFs)
  8.6 Evaluation of Named Entity Recognition
  8.7 Further Details
  8.8 Summary
  Bibliographical and Historical Notes
  Exercises
9 Deep Learning Architectures for Sequence Processing
  9.1 Language Models Revisited
  9.2 Recurrent Neural Networks
  9.3 RNNs as Language Models
  9.4 RNNs for other NLP tasks
  9.5 Stacked and Bidirectional RNN architectures
  9.6 The LSTM
  9.7 Self-Attention Networks: Transformers
  9.8 Transformers as Language Models
  9.9 Contextual Generation and Summarization
  9.10 Summary
  Bibliographical and Historical Notes
10 Machine Translation and Encoder-Decoder Models
  10.1 Language Divergences and Typology
  10.2 The Encoder-Decoder Model
  10.3 Encoder-Decoder with RNNs
  10.4 Attention
  10.5 Beam Search
  10.6 Encoder-Decoder with Transformers
  10.7 Some practical details on building MT systems
  10.8 MT Evaluation
  10.9 Bias and Ethical Issues
  10.10 Summary
  Bibliographical and Historical Notes
  Exercises
11 Transfer Learning with Contextual Embeddings and Pre-trained Language Models
12 Constituency Grammars
  12.1 Constituency
  12.2 Context-Free Grammars
  12.3 Some Grammar Rules for English
  12.4 Treebanks
  12.5 Grammar Equivalence and Normal Form
  12.6 Lexicalized Grammars
  12.7 Summary
  Bibliographical and Historical Notes
  Exercises
13 Constituency Parsing
  13.1 Ambiguity
  13.2 CKY Parsing: A Dynamic Programming Approach
  13.3 Span-Based Neural Constituency Parsing
  13.4 Evaluating Parsers
  13.5 Partial Parsing
  13.6 CCG Parsing
  13.7 Summary
  Bibliographical and Historical Notes
  Exercises
14 Dependency Parsing
  14.1 Dependency Relations
  14.2 Dependency Formalisms
  14.3 Dependency Treebanks
  14.4 Transition-Based Dependency Parsing
  14.5 Graph-Based Dependency Parsing
  14.6 Evaluation
  14.7 Summary
  Bibliographical and Historical Notes
  Exercises
15 Logical Representations of Sentence Meaning
  15.1 Computational Desiderata for Representations
  15.2 Model-Theoretic Semantics
  15.3 First-Order Logic
  15.4 Event and State Representations
  15.5 Description Logics
  15.6 Summary
  Bibliographical and Historical Notes
  Exercises
16 Computational Semantics and Semantic Parsing
17 Information Extraction
  17.1 Relation Extraction
  17.2 Relation Extraction Algorithms
  17.3 Extracting Times
  17.4 Extracting Events and their Times
  17.5 Template Filling
  17.6 Summary
  Bibliographical and Historical Notes
  Exercises
18 Word Senses and WordNet
  18.1 Word Senses
  18.2 Relations Between Senses
  18.3 WordNet: A Database of Lexical Relations
  18.4 Word Sense Disambiguation
  18.5 Alternate WSD algorithms and Tasks
  18.6 Using Thesauruses to Improve Embeddings
  18.7 Word Sense Induction
  18.8 Summary
  Bibliographical and Historical Notes
  Exercises
19 Semantic Role Labeling
  19.1 Semantic Roles
  19.2 Diathesis Alternations
  19.3 Semantic Roles: Problems with Thematic Roles
  19.4 The Proposition Bank
  19.5 FrameNet
  19.6 Semantic Role Labeling
  19.7 Selectional Restrictions
  19.8 Primitive Decomposition of Predicates
  19.9 Summary
  Bibliographical and Historical Notes
  Exercises
20 Lexicons for Sentiment, Affect, and Connotation
  20.1 Defining Emotion
  20.2 Available Sentiment and Affect Lexicons
  20.3 Creating Affect Lexicons by Human Labeling
  20.4 Semi-supervised Induction of Affect Lexicons
  20.5 Supervised Learning of Word Sentiment
  20.6 Using Lexicons for Sentiment Recognition
  20.7 Using Lexicons for Affect Recognition
  20.8 Lexicon-based methods for Entity-Centric Affect
  20.9 Connotation Frames
  20.10 Summary
  Bibliographical and Historical Notes
  Exercises
21 Coreference Resolution
  21.1 Coreference Phenomena: Linguistic Background
  21.2 Coreference Tasks and Datasets
  21.3 Mention Detection
  21.4 Architectures for Coreference Algorithms
  21.5 Classifiers using hand-built features
  21.6 A neural mention-ranking algorithm
  21.7 Evaluation of Coreference Resolution
  21.8 Winograd Schema problems
  21.9 Gender Bias in Coreference
  21.10 Summary
  Bibliographical and Historical Notes
  Exercises
22 Discourse Coherence
  22.1 Coherence Relations
  22.2 Discourse Structure Parsing
  22.3 Centering and Entity-Based Coherence
  22.4 Representation learning models for local coherence
  22.5 Global Coherence
  22.6 Summary
  Bibliographical and Historical Notes
  Exercises
23 Question Answering
  23.1 Information Retrieval
  23.2 IR-based Factoid Question Answering
  23.3 Entity Linking
  23.4 Knowledge-based Question Answering
  23.5 Using Language Models to do QA
  23.6 Classic QA Models
  23.7 Evaluation of Factoid Answers
  Bibliographical and Historical Notes
  Exercises
24 Chatbots & Dialogue Systems
  24.1 Properties of Human Conversation
  24.2 Chatbots
  24.3 GUS: Simple Frame-based Dialogue Systems
  24.4 The Dialogue-State Architecture
  24.5 Evaluating Dialogue Systems
  24.6 Dialogue System Design
  24.7 Summary
  Bibliographical and Historical Notes
  Exercises
25 Phonetics
  25.1 Speech Sounds and Phonetic Transcription
  25.2 Articulatory Phonetics
  25.3 Prosody
  25.4 Acoustic Phonetics and Signals
  25.5 Phonetic Resources
  25.6 Summary
  Bibliographical and Historical Notes
  Exercises
26 Automatic Speech Recognition and Text-to-Speech
  26.1 The Automatic Speech Recognition Task
  26.2 Feature Extraction for ASR: Log Mel Spectrum
  26.3 Speech Recognition Architecture
  26.4 CTC
  26.5 ASR Evaluation: Word Error Rate
  26.6 TTS
  26.7 Other Speech Tasks
  26.8 Summary
  Bibliographical and Historical Notes
  Exercises
Bibliography
Subject Index

CHAPTER 1
Introduction

La dernière chose qu’on trouve en faisant un ouvrage est de savoir celle qu’il faut mettre la première.
[The last thing you figure out in writing a book is what to put first.]
Pascal

CHAPTER 2
Regular Expressions, Text Normalization, Edit Distance

User: I am unhappy.
ELIZA: DO YOU THINK COMING HERE WILL HELP YOU NOT TO BE UNHAPPY
User: I need some help, that much seems certain.
ELIZA: WHAT WOULD IT MEAN TO YOU IF YOU GOT SOME HELP
User: Perhaps I could learn to get along with my mother.
ELIZA: TELL ME MORE ABOUT YOUR FAMILY
User: My mother takes care of me.
ELIZA: WHO ELSE IN YOUR FAMILY TAKES CARE OF YOU
User: My father.
ELIZA: YOUR FATHER
User: You are like my father in some ways.
Weizenbaum (1966)

The dialogue above is from ELIZA, an early natural language processing system that could carry on a limited conversation with a user by imitating the responses of a Rogerian psychotherapist (Weizenbaum, 1966). ELIZA is a surprisingly simple program that uses pattern matching to recognize phrases like “I need X” and translate them into suitable outputs like “What would it mean to you if you got X?”. This simple technique succeeds in this domain because ELIZA doesn’t actually need to know anything to mimic a Rogerian psychotherapist. As Weizenbaum notes, this is one of the few dialogue genres where listeners can act as if they know nothing of the world. ELIZA’s mimicry of human conversation was remarkably successful: many people who interacted with ELIZA came to believe that it really understood them and their problems, many continued to believe in ELIZA’s abilities even after the program’s operation was explained to them (Weizenbaum, 1976), and even today such chatbots are a fun diversion.
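The “I need X” transformation described above can be sketched in a few lines of Python. This is only an illustration of the pattern-matching idea, not Weizenbaum's program: the rule table `RULES`, the function `respond`, and the `FALLBACK` reply are hypothetical names introduced here, and the sketch assumes that X ends at the first comma or sentence-final punctuation mark.

```python
import re

# Hypothetical rule table: each entry pairs a compiled pattern with a
# response template. Only the "I need X" rule is described in the text.
RULES = [
    (re.compile(r"\bI need ([^,.!?]+)", re.IGNORECASE),
     "WHAT WOULD IT MEAN TO YOU IF YOU GOT {}"),
]

# Assumed default reply when no rule matches (not taken from the text).
FALLBACK = "PLEASE GO ON"


def respond(utterance: str) -> str:
    """Return the response for the first rule whose pattern matches."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            # Reuse the captured phrase X, upper-cased like ELIZA's output.
            return template.format(match.group(1).strip().upper())
    return FALLBACK


print(respond("I need some help, that much seems certain."))
```

For the second user turn of the dialogue, the sketch reproduces ELIZA's reply, WHAT WOULD IT MEAN TO YOU IF YOU GOT SOME HELP. Nothing in the code models what "help" means, which is the point the paragraph makes about this domain.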
Of course modern conversational agents are much more than a diversion; they can answer questions, book flights, or find restaurants, functions for which they rely on a much more sophisticated understanding of the user’s intent, as we will see in Chapter 24. Nonetheless, the simple pattern-based methods that powered ELIZA and other chatbots play a crucial role in natural language processing.

We’ll begin with the most important tool for describing text patterns: the regular expression. Regular expressions can be used to specify strings we might want to extract from a document, from transforming “I need X” in ELIZA above, to defining strings like $199 or $24.99 for extracting tables of prices from a document.

We’ll then turn to a set of tasks collectively called text normalization, in which regular expressions play an important part. Normalizing text means converting it to a more convenient, standard form. For example, most of what we are going to do with language relies on first separating out or tokenizing words from running text, the task of tokenization. English words are often separated from each other by whitespace, but whitespace is not always sufficient. New York and rock ’n’ roll are sometimes treated as large words despite the fact that they contain spaces, while sometimes we’ll need to separate I’m into the two words I and am. For processing tweets or texts we’ll need to tokenize emoticons like :) or hashtags like #nlproc.
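Two of the uses mentioned above, extracting prices and tokenizing informal text, can be sketched with Python's standard `re` module. This is a rough illustration under stated assumptions, not the approach the chapter goes on to develop: the names `PRICE`, `TOKEN`, `CONTRACTIONS`, `extract_prices`, and `tokenize` are hypothetical, prices are assumed to be a dollar sign followed by digits and optional two-digit cents, apostrophes are assumed to be ASCII, and the emoticon pattern covers only a few simple shapes.

```python
import re

# Assumed price format: "$", one or more digits, optional ".dd" cents.
PRICE = re.compile(r"\$\d+(?:\.\d{2})?")

# Token pattern, tried left to right at each position.
TOKEN = re.compile(r"""
    [:;=][-^]?[)(DPp]      # simple emoticons such as :) or ;-(
  | \#\w+                  # hashtags such as #nlproc
  | \w+(?:'\w+)?           # words, keeping an internal apostrophe (I'm)
  | [^\w\s]                # any other single punctuation mark
""", re.VERBOSE)

# Illustrative contraction table; a real one would be much larger.
CONTRACTIONS = {"i'm": ["I", "am"]}


def extract_prices(text):
    """Return every substring that looks like a dollar price."""
    return PRICE.findall(text)


def tokenize(text, expand=False):
    """Split text into tokens; optionally expand known contractions."""
    tokens = TOKEN.findall(text)
    if not expand:
        return tokens
    expanded = []
    for token in tokens:
        # Replace a known contraction by its parts, else keep the token.
        expanded.extend(CONTRACTIONS.get(token.lower(), [token]))
    return expanded


print(extract_prices("The lamp is $199 and the cable is $24.99."))
print(tokenize("I'm at #nlproc :)"))
print(tokenize("I'm at #nlproc :)", expand=True))
```

Splitting on whitespace alone would leave the sentence-final period attached to $24.99, which is why the sketch matches tokens by pattern rather than by separator. Multiword expressions such as New York are not handled here; treating them as single tokens would need a lexicon or a further pattern.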
