
Entity-Oriented Search PDF

358 Pages·2018·12.137 MB·English

Preview Entity-Oriented Search

The Information Retrieval Series, Volume 39

Krisztian Balog
Entity-Oriented Search

Series Editors: ChengXiang Zhai, Maarten de Rijke
Editorial Board: Nicholas J. Belkin, Charles Clarke, Diane Kelly, Fabrizio Sebastiani
More information about this series at http://www.springer.com/series/6128

Krisztian Balog
University of Stavanger
Stavanger, Norway

ISSN 1387-5264 The Information Retrieval Series
ISBN 978-3-319-93933-9
ISBN 978-3-319-93935-3 (eBook)
https://doi.org/10.1007/978-3-319-93935-3
Library of Congress Control Number: 2018946540

© The Editor(s) (if applicable) and the Author(s) 2018. This book is an open access publication.

Open Access This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this book are included in the book’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Szüleimnek (To my parents)

Preface

I have not yet reached my goal... But I forget what is behind, and I struggle for what is ahead. I run toward the goal, so I can win the prize of being called to heaven. This is the prize God offers because of what Christ Jesus has done.
(Philippians 3:12–14, CEV)

The idea of writing this book stemmed from a series of tutorials that I gave with colleagues on “entity linking and retrieval for semantic search.” There was no single text on this topic that would cover all the material that I wished to introduce to someone who is new to this field. With this book, I set out to fill that gap. I hope that by making the book open access, many will be able to use it and benefit from it.

For me, writing this book, in many ways, was like running a marathon. No one forced me to do it, yet I thought that, for some reason, it’d be a good idea to challenge myself to do it. Then, along the way, there comes inevitably a point where one asks: Why am I doing this to myself? But then, in the end, crossing the finish line certainly feels like an accomplishment. In time, this experience might even be remembered as if it was a walk in the park.¹ In any case, it was a good run.

¹ Note to self: No, it wasn’t.

I wish to express my gratitude to a number of people who played a role in making this book happen. First of all, I would like to thank Ralf Gerstner, executive editor for Computer Science at Springer, for seeing me through to the successful completion of this book and for always being a gentleman when it came to my deadline extension requests. I also want to thank the Information Retrieval Series editors Maarten de Rijke and ChengXiang Zhai for the comments on my book proposal.

A very special thanks to Jamie Callan and to anonymous Reviewer #2 for reviewing the book and for making numerous valuable suggestions for improvements.

The following colleagues provided feedback on drafts of specific chapters at various stages of completion, and I would like to thank them for their insightful comments: Marek Ciglan, Arjen de Vries, Kalervo Järvelin, Miguel Martinez, Edgar Meij, Kjetil Nørvåg, Doug Oard, Heri Ramampiaro, Ralf Schenkel, Alberto Tonon, and Chenyan Xiong.

I want to thank Edgar Meij and Daan Odijk for the collaboration on the entity linking and retrieval tutorials, which planted the idea of this book. Working with you was always easy, enjoyable, and fun. My gratitude goes to all my co-authors for the joint work that contributed to the material that is presented in this book.

I am especially grateful to the Department of Electrical Engineering and Computer Science at the University of Stavanger for providing a pleasant work environment, where I could devote a substantial amount of time to writing this book.

I would like to thank my PhD students for giving me their honest opinion and offering constructive criticism on drafts of the book. They are, in gender-first-then-alphabetical order: Faegheh Hasibi, Jan Benetka, Heng Ding, Darío Garigliotti, Trond Linjordet, and Shuo Zhang. Special thanks, in addition, to Faegheh for the thorough checking of technical details and for suggestions on the organization of the material; to Darío for tidying up my references; to Jan for prettifying the figures and illustrations; to Trond for injecting entropy and for the careful proofreading and numerous suggestions for language improvements; to Shuo and Heng for the oriental perspective and for telling me that I use too many words.

Last but not least, I want to thank my friends and family for their outstanding support throughout the years. You know who you are.
Stavanger, Norway
April 2018
Krisztian Balog

Website

http://eos-book.org

This book is accompanied by the above website. The website provides a variety of supplementary material, corrections of mistakes, and related resources.

Contents

1 Introduction
  1.1 What Is an Entity?
    1.1.1 Named Entities vs. Concepts
    1.1.2 Properties of Entities
    1.1.3 Representing Properties of Entities
  1.2 A Brief Historical Outlook
    1.2.1 Information Retrieval
    1.2.2 Databases
    1.2.3 Natural Language Processing
    1.2.4 Semantic Web
  1.3 Entity-Oriented Search
    1.3.1 A Bird’s-Eye View
    1.3.2 Tasks and Challenges
    1.3.3 Entity-Oriented vs. Semantic Search
    1.3.4 Application Areas
  1.4 About the Book
    1.4.1 Focus
    1.4.2 Audience and Prerequisites
    1.4.3 Organization
    1.4.4 Terminology and Notation
  References
2 Meet the Data
  2.1 The Web
    2.1.1 Datasets and Resources
  2.2 Wikipedia
    2.2.1 The Anatomy of a Wikipedia Article
    2.2.2 Links
    2.2.3 Special-Purpose Pages
    2.2.4 Categories, Lists, and Navigation Templates
    2.2.5 Resources
