
Off-line answer extraction for Question Answering

176 Pages·2008·1.25 MB·English

Preview Off-line answer extraction for Question Answering

University of Groningen

Off-line answer extraction for Question Answering
Mur, Jori

IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below.

Document Version: Publisher's PDF, also known as Version of record
Publication date: 2008
Link to publication in University of Groningen/UMCG research database

Citation for published version (APA):
Mur, J. (2008). Off-line answer extraction for Question Answering. [Thesis fully internal (DIV), Rijksuniversiteit Groningen]. s.n.

Copyright
Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons). The publication may also be distributed here under the terms of Article 25fa of the Dutch Copyright Act, indicated by the “Taverne” license. More information can be found on the University of Groningen website: https://www.rug.nl/library/open-access/self-archiving-pure/taverne-amendment.

Take-down policy
If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.

Downloaded from the University of Groningen/UMCG research database (Pure): http://www.rug.nl/research/portal. For technical reasons the number of authors shown on this cover page is limited to 10 maximum.
Download date: 14-02-2023

Jori Mur
Off-line answer extraction for Question Answering

This research was carried out in the project Question Answering using Dependency Relations, which is part of the research programme for Interactive Multimedia Information eXtraction, IMIX, financed by NWO.

The work in this thesis has been carried out under the auspices of the LOT school and the Center for Language and Cognition Groningen (CLCG) of the Faculty of Arts of the University of Groningen.
Groningen Dissertations in Linguistics 69
ISSN 0928-0030
© 2008, Jori Mur
ISBN: 978-90-367-3567-4
Cover design: © 2008, Roelie van der Molen
Printed by Grafimedia, Groningen
Document prepared with LaTeX2ε and typeset in pdfTeX.

RIJKSUNIVERSITEIT GRONINGEN
Off-line answer extraction for Question Answering

Thesis for the degree of Doctor of Arts at the Rijksuniversiteit Groningen, on the authority of the Rector Magnificus, Dr. F. Zwarts, to be defended in public on Thursday 23 October 2008 at 14:45 by Jori Mur, born on 28 June 1980 in Hardenberg.

Supervisor (promotor): Prof.dr.ir. J. Nerbonne
Co-supervisor (copromotor): Dr. G. Bouma
Assessment committee: Prof.dr. P. Hendriks, Prof.dr. M. de Rijke, Prof.dr. B.L. Webber

Preface

There are many people who helped me write this thesis, and in this preface I take the opportunity to thank them. I am first indebted to Gosse Bouma, who has been a fine supervisor throughout these four years. Judging by the stories of many other PhD students, I believe it is by no means commonplace that you were always available to answer questions, comment on papers and give advice. I have always appreciated that a lot. I am also grateful to my professor John Nerbonne for his valuable comments on my work and for reading my chapters so quickly every time I handed one in. Furthermore, I want to thank my reading committee, Petra Hendriks, Bonnie Webber and Maarten de Rijke, for their comments on my work. I thank all my colleagues of CLCG for creating such a nice working environment. Afraid of forgetting someone, I will not mention you all by name, but there are a few people that I want to thank in particular. First Lonneke and Ismail, for being the best roommates I could wish for. It was really quiet and empty when you left, and I am happy that John came to keep me from feeling lonely the last couple of months. I thank Gertjan, Gosse, Ismail, Jörg, and Lonneke as fellow members of the Groningen QA CLEF team.
It was great working with you in this project and participating in CLEF together. I thank all the people from Wednesday sports, especially Roel for organising it every year. How I am going to miss these weekly hours. Jacky, besides being a great colleague you deserve a lot of gratitude for all the work you have put into the Spanish lunches. Muchas Gracias! I hope you do not stay too long on the other side of the world. Further, I wish to thank all the quiz people, and especially Jacky and later Erik for organising it. I will continue to join, although I am a bit disappointed in how useful my CLEF knowledge turns out to be, that is, not at all. I am grateful to all my co-schildpadden, Lonneke, Jacky, Erik-Jan, Ismail, Geoffrey, and Jantien, for the weekly meetings where we could vent our frustrations and support each other while writing our theses. I am greatly indebted to Roelie for designing my website four years ago. I am very happy that you also agreed to design the cover of this thesis. Tanke wol! Special thanks in advance to Therese and Jelena; I am very happy that you will stand by my side as my paranimfs at my public defense. Tack så mycket, hvala lepa! Lonneke, I have mentioned you already a couple of times, but you cannot be mentioned enough. Without you I would not have finished this work. Thank you for your support, your friendship and for all the fun we had together. We started almost at the same time and I am very happy that we will have our defence on the same day. A warm thanks goes to all my parents, Mam, Paul, Pap, and Hette, for always supporting me and believing in me. I am very happy with all of you! Finally I want to thank Fokke for all his love and support. Thank you, my love, for reading all my chapters every time, for supporting me whenever I was feeling down, and for simply always being there for me.

Contents

1 Introduction
  1.1 Question Answering: motivation and background
  1.2 This thesis
    1.2.1 Off-line answer extraction: motivation and background
    1.2.2 Research questions and claims
    1.2.3 Chapter overview

2 Off-line Answer Extraction: Initial experiment
  2.1 Experimental Setting
    2.1.1 Joost
    2.1.2 Alpino
    2.1.3 Corpus
    2.1.4 Question set
    2.1.5 Answer set
  2.2 Initial experiment
    2.2.1 Patterns
    2.2.2 Questions and answers
    2.2.3 Evaluation methods and results
    2.2.4 Discussion of results and error analysis
  2.3 Conclusion

3 Extraction based on dependency relations
  3.1 Introduction
  3.2 Answer Extraction
    3.2.1 Extraction with Surface Patterns
    3.2.2 Extraction with Syntactic Patterns
      3.2.2.1 Equivalence rules
      3.2.2.2 D-score
  3.3 Experiments
    3.3.1 Extraction task
    3.3.2 Question Answering task
  3.4 Discussion of results
  3.5 Conclusion

4 Coreference resolution for off-line answer extraction
  4.1 Introduction
  4.2 Coreference resolution
    4.2.1 Choosing an approach
    4.2.2 Coreference resolution process
      4.2.2.1 Preprocessing
      4.2.2.2 Resolving Pronouns
      4.2.2.3 Resolving Common Nouns
      4.2.2.4 Resolving Named Entities
    4.2.3 Evaluation and results
      4.2.3.1 Trade-off recall and precision
      4.2.3.2 MUC-score
      4.2.3.3 Results
      4.2.3.4 Error analysis
  4.3 Using coreference information for answer extraction
    4.3.1 Extraction task
    4.3.2 Question Answering task
    4.3.3 Discussion of results
  4.4 Related work
  4.5 Conclusion

5 Extraction based on learned patterns
  5.1 Introduction
    5.1.1 Bootstrapping techniques
    5.1.2 Aims and overview
  5.2 Bootstrapping algorithm
    5.2.1 Pattern induction
    5.2.2 Pattern filtering
    5.2.3 Fact extraction
  5.3 Experiment
    5.3.1 Evaluation
    5.3.2 Results
  5.4 Discussion of results
  5.5 General discussion on learning patterns
  5.6 Conclusion

6 Conclusions
  6.1 Summary of main findings
  6.2 Future work

Bibliography

A Patterns
  A.1 Capital
    A.1.1 Surface patterns
    A.1.2 Dependency patterns
  A.2 Currency
    A.2.1 Surface patterns
    A.2.2 Dependency patterns
  A.3 Date of Birth
    A.3.1 Surface patterns
    A.3.2 Dependency patterns
  A.4 Founder
    A.4.1 Surface patterns
    A.4.2 Dependency patterns
  A.5 Function
    A.5.1 Surface patterns
    A.5.2 Dependency patterns
  A.6 Location of Birth
    A.6.1 Surface patterns
    A.6.2 Dependency patterns

Samenvatting (summary in Dutch)

GRODIL
