WHAT’S THAT SUPPOSED TO MEAN? MODELING THE PRAGMATIC MEANING OF UTTERANCES

A DISSERTATION SUBMITTED TO THE DEPARTMENT OF LINGUISTICS AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

Marie-Catherine de Marneffe
November 2012

© 2012 by Marie-Catherine H.J.N.L. de Marneffe. All Rights Reserved. Re-distributed by Stanford University under license with the author. This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License. http://creativecommons.org/licenses/by-nc/3.0/us/

This dissertation is online at: http://purl.stanford.edu/vt913xr7954

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Christopher Manning, Primary Adviser

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Daniel Jurafsky

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Beth Levin

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Christopher Potts

Approved for the Stanford University Committee on Graduate Studies.
Patricia J. Gumport, Vice Provost Graduate Education

This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file in University Archives.

Abstract

Many strands of natural language processing work, by and large, capture only the literal meaning of sentences.
However, in even our most mundane interactions, much of what we communicate is not said explicitly but rather inferred from the context. If I ask a friend to lunch and she replies, I had a very large breakfast, I will infer that she does not want to go, even though she (perhaps deliberately) avoided saying so directly. This dissertation focuses on building computational models of such pragmatic enrichment. I aim at capturing aspects of pragmatic meaning, the kind of information that a reader will reliably extract from an utterance within a discourse. I investigate three phenomena for which humans readily make inferences.

The first study concentrates on interpreting answers to yes/no questions which do not straightforwardly convey a ‘yes’ or ‘no’ answer. I focus on questions involving scalar modifiers (Was the movie wonderful? It was worth seeing.) and numerical answers (Are your kids little? I have a 10 year-old and a 7 year-old.). To determine whether the intended answer is yes or no, we need to evaluate how worth seeing relates to wonderful, and how 10 and 7 year-old relate to little. Can we automatically learn from real texts what meanings people assign to these modifiers? I exploit the availability of a large amount of text to learn meanings from words and sentences in contexts. I show that we can ground scalar modifier meaning based on large unstructured databases, and that such meanings can drive pragmatic inference.

The second study detects conflicting statements. If an article about a factory says that 100 people were working inside the plant where the police defused the rockets, whereas a second article about the same factory reports that 100 people were injured, and we understand these statements, we will infer that they are contradictory. I created the first available corpus of contradictions which, departing from the traditional view in formal semantics, I have defined as pieces of text that are extremely unlikely to be considered true simultaneously.
I argue that such a definition, rather than a logical notion of contradiction, better fits people’s intuitions of what a contradiction is. Through a detailed analysis of such naturally-occurring conflicting statements, I identified linguistic factors which give rise to contradiction. I then used a logistic regression model to learn the best way of weighing these different factors, and put this model to use to predict whether a new set of sentence pairs was contradictory.

The third study targets veridicality – whether events described in a text are viewed as actual, non-actual or uncertain. What do people infer from a sentence such as At a news conference, Mr. Fournier accused Paribas of planning to pay for the takeover by selling parts of the company? Is Paribas going to pay for the takeover by selling parts of the company? I show that not only lexical semantic properties but also context and world knowledge shape veridicality judgments. Since such judgments are not always categorical, I suggest they should be modeled as distributions. I build and describe a classifier, which balances both lexical and contextual factors and can faithfully model human judgments of veridicality distributions.

Together these studies illustrate how computer systems begin to recover hearers’ readings by exploiting probabilistic methods and learning from large amounts of data in context. My dissertation highlights the importance of modeling pragmatic meaning to reach real natural language understanding. Humans rely on context in their everyday use of language. Computer programs must do likewise, and the work presented here shows that it is feasible to automatically capture some aspects of pragmatic meaning.

Acknowledgements

My thanks go first to my committee, without whom this dissertation would not exist. To Chris Manning, my advisor: I am so glad you answered your office phone on Christmas Eve eight years ago and accepted me as a visitor in the NLP group!
This was the start of my American journey, and of an amazing time at Stanford. You have been a wonderful advisor. You have shaped my thoughts, always critiqued my work in a constructive way, and helped me gain confidence in myself. You have guided me all the way, put up with my multiple leg injuries, and I know you truly cared about me. For all your guidance, encouragement and support: thank you! To Chris Potts: I am grateful for the many invaluable insights and your very concrete help in my work. Thank you for the fruitful collaboration, and for your constant optimism. To Beth Levin: I am indebted for the help you have always provided me with, intellectually and personally. You have been so generous with your time, reading and commenting on so many of my writings. You caught imprecise or unclear formulations, and helped me communicate better what I meant to say. Your comments were always to the point! To Dan Jurafsky: for your contagious energy, for the very useful feedback on my presentations, for helping me to keep an eye on the big picture, and for teaching me to say ‘no’.

This dissertation does not reflect all the work I did during my time at Stanford, and I owe a great debt of gratitude to other faculty members with whom I interacted closely on various projects. To Joan Bresnan: you have always asked the right questions, and interacting with you has been extremely encouraging. To Eve Clark: I have discovered the fascinating aspect of language acquisition thanks to you, and as my children are emerging to language, it is quite captivating to experience “live” everything I have learned from you. I hope to still be able to do research in that area too. To Meghan Sumner: you opened my eyes to the field of speech perception and experimental design. You also helped me tremendously to find a good balance between personal life, motherhood and work. Thank you for all the wonderful advice, and for believing I could do it all.
I am also grateful to Peter Sells who guided me during my first year in the PhD program, and to Tom Wasow for his kindness and advice when I faced important life decisions. To my mentors in Belgium, Leila du Castillon, Lambert Isebaert, Philippe Delsarte, André Thayse and Cédrick Fairon: you played a significant role in my life and helped me shape my academic path. To Jean-luc Doumont, for teaching me your vision of public speaking, and your invaluable help in preparing my interviews.

To the linguistics and computer science students and postdocs who have been here with me, for your friendship, your advice and your help. Special thanks to Bill McCartney, Jenny Finkel, Nate Chambers, Nola Stephens, Anubha Kothari, Middy Tice, Olga Dmitrieva, Yuan D’Antilio and Spence Green: you have been good listeners when I needed it, and helped me in various ways in my research. And of course to my cohort: Matt Adams, Uriel Cohen Priva, Scott Grimm, Jason Grafmiller and Tyler Schnoebelen, thank you for the great six years together and the warm companionship!

To my Belgian friends who preceded me in the Bay Area or followed a path similar to mine: Maureen Heymans, Cédric Dupont, Virginie Lousse, Lieven Caboor, Sylvie Denuit (the best babysitter ever!), Gaëtanelle Gilquin, Caroline De Wagter, Daniel Wolf, and to Charlene Noll who also counts in this group! Thank you for your tremendous support, you have been an oasis of home and I could not have made it without you. To my American mom, Beverly Pacheco: I am grateful for your generosity, and for making me feel so welcome in your family. To my parents for their love and upbringing which made me who I am. To my sister, Daphné, and my brothers, Olivier and Guillaume, who I can always count on despite the long distance. To Goéric, my husband, for always believing in me and for pushing us to embark on this American journey eight years ago.
And last but not least, to our children, Timothée and Aliénor, whose smiles put everything in perspective and force me to stop and smell the roses along the way.

Contents

Abstract
Acknowledgements
1 Introduction
  1.1 Different levels of meaning
  1.2 How humans use language
  1.3 Early days of natural language understanding
  1.4 Computational models of pragmatic meaning
    1.4.1 Underspecification in dialogue: The case of gradable adjectives
    1.4.2 Detection of conflicting information
    1.4.3 Assessment of veridicality
2 The Stanford dependency representation
3 Learning the meaning of gradable adjectives
  3.1 Introduction
  3.2 Uncertainty in indirect answers
    3.2.1 Quantifying indirect answers to yes/no questions
    3.2.2 Quantifying uncertain indirect answers
  3.3 Previous work
  3.4 Gradable adjectives
  3.5 Corpus description
    3.5.1 Types of question-answer pairs
    3.5.2 Answer assignment
  3.6 Methods
    3.6.1 Learning modifier scales and inferring yes/no answers
    3.6.2 Interpreting numerical answers
  3.7 Evaluation and results
  3.8 Error analysis
  3.9 Discussion
4 Detection of conflicting information
  4.1 Introduction
    4.1.1 Definition of the RTE task
    4.1.2 Definition of “contradiction detection”
  4.2 A corpus of conflicting information
    4.2.1 Annotation guidelines for marking contradictions in the RTE datasets
    4.2.2 Annotation results
  4.3 Typology of conflicting information
  4.4 System description
    4.4.1 Filtering non-coreferent events
    4.4.2 Contradiction features
  4.5 Evaluation and results
  4.6 Error analysis
  4.7 Discussion
5 Assessment of veridicality
  5.1 Introduction
  5.2 Veridicality and modality
  5.3 Corpus annotation
    5.3.1 FactBank corpus
    5.3.2 Annotations from the reader’s perspective
    5.3.3 An alternative scale
  5.4 Lessons from the new annotations
    5.4.1 The impact of pragmatic enrichment
    5.4.2 The uncertainty of pragmatic enrichment
  5.5 A system for veridicality assessment
  5.6 Evaluation and results
  5.7 Error analysis
  5.8 Discussion
6 Conclusion
Bibliography