Integration of world knowledge for natural language understanding PDF

252 Pages·2012·3.78 MB·English

Preview Integration of world knowledge for natural language understanding

ATLANTIS THINKING MACHINES, VOLUME 3
Series Editor: Kai-Uwe Kühnberger
Institute of Cognitive Science, University of Osnabrück, Germany
(ISSN: 1877-3273)

Aims and scope of the series

This series publishes books resulting from theoretical research on and reproductions of general Artificial Intelligence (AI). The book series focuses on the establishment of new theories and paradigms in AI. At the same time, the series aims at exploring multiple scientific angles and methodologies, including results from research in cognitive science, neuroscience, theoretical and experimental AI, biology and from innovative interdisciplinary methodologies.

All books in this series are co-published with Springer.

For more information on this series and our other book series, please visit our website at: www.atlantis-press.com/publications/books

AMSTERDAM – PARIS – BEIJING
© ATLANTIS PRESS

Integration of World Knowledge for Natural Language Understanding

Ekaterina Ovchinnikova
USC ISI
4676 Admiralty Way
Marina del Rey, CA 90292
USA

Atlantis Press
8, square des Bouleaux
75019 Paris, France

For information on all Atlantis Press publications, visit our website at: www.atlantis-press.com

Copyright

This book, or any parts thereof, may not be reproduced for commercial purposes in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system known or to be invented, without prior permission from the Publisher.

Atlantis Thinking Machines

Volume 1: Enaction, Embodiment, Evolutionary Robotics. Simulation Models for a Post-Cognitivist Science of Mind - Marieke Rohde, Ezequiel A. Di Paolo
Volume 2: Real-World Reasoning: Toward Scalable, Uncertain Spatiotemporal, Contextual and Causal Inference - Ben Goertzel, Nil Geisweiller, Lúcio Coelho, Predrag Janičić, Cassio Pennachin

ISBNs
Print: 978-94-91216-52-7
E-Book: 978-94-91216-53-4
ISSN: 1877-3273

© 2012 ATLANTIS PRESS

Foreword

Inference-based natural language understanding (NLU) was a thriving area of research in the 1970s and 1980s.
It resulted in good theoretical work and in interesting small-scale systems. But in the early 1990s it foundered on three difficulties:

• Parsers were not accurate enough to produce predicate-argument relations reliably, so that inference had no place to start.
• Inference processes were not efficient enough nor accurate enough.
• There was no large knowledge base designed for NLU applications.

The first of these difficulties has been overcome by progress in statistical parsing. The second problem is one that many people, including Ekaterina Ovchinnikova, are working on now. The research described in this volume addresses the third difficulty, and indeed shows considerable promise in overcoming it. For this reason, I believe Dr. Ovchinnikova's work has a real potential to reignite interest in inference-based NLU in the computational linguistics community.

A key notion in her work is that there already exists sufficient world knowledge in a variety of resources, at a level of precision that enables their translation into formal logic. To my mind, the most important of these are WordNet and FrameNet, especially the latter, and she describes the kind of information one can get out of these resources. She exploits in particular the hierarchical information and the glosses in WordNet, generating 600,000 axioms. She also describes how one can utilize FrameNet to generate about 50,000 axioms representing relations between words and frames, and about 5000 axioms representing frame-frame relations. Her analysis of FrameNet is quite thorough, and I found this part of her work inspiring.

She also critically discusses foundational ontologies such as DOLCE, SUMO, and OpenCyc, and domain-specific ontologies of the sort being constructed for the Semantic Web. She examines the problems raised by semi-formal ontologies, like YAGO and ConceptNet, which have been gleaned from text or Netizens and which may be more difficult to translate into formal logic. She also shows how to use distributional data for a default mode of processing when the required knowledge is not available.

Her use of knowledge from a variety of resources, combined into a single system, leads to the very hard problem of ensuring consistency in such a knowledge base. She engages in a very close study of the kinds of conceptual inconsistencies that occur in FrameNet and in Description Logic ontologies. She then provides algorithms for finding and resolving inconsistencies in these resources. I found this part of her work especially impressive.

She examines three forms of inference – standard deduction, weighted abduction, and reasoning in description logics, explicating the strengths and weaknesses of each.

Finally she evaluates her work on inference-based NLU by applying her reasoning engines to the Recognizing Textual Entailment problem. She uses the RTE-2 test set and shows that her approach, with no special tuning to the RTE task, achieves state-of-the-art performance. She also evaluates her approach, with similarly excellent results, on the Semantic Role Labeling task and on paraphrasing noun-noun dependencies, both of which fall out as a by-product of weighted abduction.

So the research described here is very exciting indeed. It is a solid achievement on its own and it promises to open doors to much greater progress in automatic natural language understanding in the very near future.

Jerry R. Hobbs
Information Sciences Institute
University of Southern California
Marina del Rey, California

Acknowledgments

The research presented in this book is based on my PhD thesis, that would not have been possible without the help of many people. In the first place I would like to thank my thesis advisor Kai-Uwe Kühnberger who has scientifically and organizationally supported all my research adventures giving me freedom to try whatever I thought was interesting.

I am indebted to Jerry Hobbs who has invited me to visit the Information Sciences Institute where I have spent the most productive six months of my dissertation work. Jerry has introduced me to the exciting field of abductive reasoning and encouraged me to combine this approach with my research efforts, which turned to be highly successful.

I owe my deepest gratitude to Frank Richter who has supported me from the very beginning of my research career. Whenever I needed scientific advice or organizational support, Frank was always there to help.

My gratitude especially goes to the ISI colleagues. I particularly benefited from discussions with Eduard Hovy. Thanks to Rutu Mulkar-Mehta who has developed and supported the Mini-TACITUS system, I managed to implement the extensions to the system that many of my research results are based upon. I very much thank Niloofar Montazeri who shared with me the tedious work on recognizing textual entailment challenge.

I thank Nicola Guarino for giving me an opportunity to spend a couple of weeks at the Laboratory of Applied Ontology. Many thanks to the LOA colleagues Laure Vieu, Alessandro Oltramari, and Stefano Borgo for a fruitful collaboration on the topic of conceptual consistency.

The following gratitudes go to the researchers from all around the world who have directly contributed to this work. I very much thank Tonio Wandmacher for being my guide into the world of distributional semantics and for constantly challenging my trust in inference-based approaches. I am grateful to Johan Bos, the developer of the Boxer and Nutcracker systems, who helped me to organize experiments involving these systems. I would like to thank Anselmo Peñas for collaborating with me on the issue of paraphrasing noun dependencies. I thank Michael McCord for making the ESG semantic parser available for my experiments. I thank Helmar Gust who agreed to write a review of my thesis.

Concerning the financial side, I would like to thank the German Academic Exchange service (DAAD) for according me a three year graduate scholarship. I also thank the Doctorate Programme at the University of Osnabrück for supporting my conference and scientific trips financially.

I would like to thank Johannes Dellert, Ilya Oparin, Ulf Krumnack, Konstantin Todorov, and Sascha Alexeyenko for valuable comments, hints, and discussions. Special thanks to Ilya for keeping asking me when my thesis was going to be finished.

I am grateful to Irina V. Azarova who gave me a feeling of what computational linguistics really is.

I express my particular gratitude to my parents Andrey and Elena for their continued support and encouragement, which I was always able to count on.

Finally, I sincerely thank my husband Fedor who has greatly contributed to the realization of this book. Thank you for valuable discussions, introduction into statistical data processing, manifold technical and software support, cluster programming necessary for large-scale experiments, and all other things, which cannot be expressed by words.

E.O., November 2011, Los Angeles

Contents

Foreword
Acknowledgments
List of Figures
List of Tables
List of Algorithms

1. Preliminaries
   1.1 Introduction
   1.2 Objectives
   1.3 How to Read This Book

2. Natural Language Understanding and World Knowledge
   2.1 What is Natural Language Understanding?
   2.2 Representation of Meaning
       2.2.1 Meaning Representation in Linguistic Theories
       2.2.2 Linguistic Meaning in Artificial Intelligence
   2.3 Shared Word Knowledge for Natural Language Understanding
       2.3.1 Linguistic vs. World Knowledge
       2.3.2 Natural Language Phenomena Requiring Word Knowledge to be Resolved
   2.4 Concluding Remarks

3. Sources of World Knowledge
   3.1 Lexical-semantic Resources
       3.1.1 Hand-crafted Electronic Dictionaries
       3.1.2 Automatically Generated Lexical-semantic Databases
   3.2 Ontologies
       3.2.1 Foundational Ontologies
       3.2.2 Domain-specific Ontologies
   3.3 Mixed Resources
       3.3.1 Ontologies Learned from Text
       3.3.2 Ontologies Learned from Structured Sources: YAGO
       3.3.3 Ontologies Generated Using Community Efforts: ConceptNet
   3.4 Concluding Remarks

Description:
This book concerns non-linguistic knowledge required to perform computational natural language understanding (NLU). The main objective of the book is to show that inference-based NLU has the potential for practical large scale applications. First, an introduction to research areas relevant for NLU i