
Query Understanding for Search Engines PDF

228 Pages·2020·4.6 MB·English

Preview Query Understanding for Search Engines

The Information Retrieval Series

Yi Chang • Hongbo Deng
Editors

Query Understanding for Search Engines

The Information Retrieval Series
Volume 46

Series Editors
ChengXiang Zhai, University of Illinois, Urbana, IL, USA
Maarten de Rijke, University of Amsterdam, The Netherlands and Ahold Delhaize, Zaandam, The Netherlands

Editorial Board Members
Nicholas J. Belkin, Rutgers University, New Brunswick, NJ, USA
Charles Clarke, University of Waterloo, Waterloo, ON, Canada
Diane Kelly, University of Tennessee at Knoxville, Knoxville, TN, USA
Fabrizio Sebastiani, Consiglio Nazionale delle Ricerche, Pisa, Italy

Information Retrieval (IR) deals with access to and search in mostly unstructured information, in text, audio, and/or video, either from one large file or spread over separate and diverse sources, in static storage devices as well as on streaming data. It is part of both computer and information science, and uses techniques from e.g. mathematics, statistics, machine learning, database management, or computational linguistics. Information Retrieval is often at the core of networked applications, web-based data management, or large-scale data analysis.

The Information Retrieval Series presents monographs, edited collections, and advanced textbooks on topics of interest for researchers in academia and industry alike. Its focus is on the timely publication of state-of-the-art results at the forefront of research and on theoretical foundations necessary to develop a deeper understanding of methods and approaches.

This series is abstracted/indexed in Scopus.
More information about this series at http://www.springer.com/series/6128

Yi Chang • Hongbo Deng
Editors

Query Understanding for Search Engines

Editors
Yi Chang, Jilin University, Jilin, China
Hongbo Deng, Alibaba Group, Zhejiang, China

ISSN 1871-7500
ISSN 2730-6836 (electronic)
The Information Retrieval Series
ISBN 978-3-030-58333-0
ISBN 978-3-030-58334-7 (eBook)
https://doi.org/10.1007/978-3-030-58334-7

© Springer Nature Switzerland AG 2020

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Foreword

Web search engines have made such incredible advances in the past three decades that interacting with them has become part of all connected people’s daily routines. With such success, come high expectations, and searchers do not tolerate search engines not perfectly understanding and satisfying their needs.
Regular Web searchers rarely realize however that understanding their queries is a really hard technical challenge, which is yet to be completely solved.

One of the sources that is of challenge is that queries are just an approximate projection of users’ needs: they are not necessarily well, and almost never fully expressed. I like to use Plato’s cave allegory to explain this point. In Plato’s allegory, chained prisoners facing the back of the cave can only see the shadows of objects passing behind them. They mistake them for the actual objects, as they are not exposed to any other reality. In the same spirit, I see queries as shadows of our needs. Unless search engines develop full telepathic capabilities (hey, a SciFi fan like me can always dream), they can only work with queries as a proxy to users’ needs and as such, may never be able to fully comprehend them. Nevertheless, search engines need to do their best at understanding queries, as hard as it may be, if they want to have a shot at satisfying users.

On the positive side however, query understanding has made a clear progress in the past two decades, together with the evolution of search engines, and this even without telepathic capabilities :-). The increasing availability of personal/contextual signals about searchers (as long as privacy is enforced), especially with new mediums, such as voice search in mobile or digital assistants, makes me hopeful for the future.

I am delighted to see that two prominent researchers in the field such as Yi Chang and Hongbo Deng have taken upon themselves to rally search experts from leading academia and industrial research institutions in order to dive deep into this important topic. The book contributors examine the different elements of query understanding and most notably:

1. core understanding, where the search engine tries to associate deeper semantic meaning with the issued query; this covers, for instance, query classification, query tagging, or inferring the intent behind queries,
2. query rewrite, which consists of augmenting or transforming queries in such a way that the search engine can manipulate them and produce better results, and finally,
3. query suggestion, one of my favorite topics (as I had the privilege to lead the team that launched Google Suggest more than a decade ago), which consists in assisting the searcher in expressing their needs. Following the Plato allegory, this mechanism helps the “shadow” to be closer to the real user’s need, via dynamic query autocompletion, related query suggestions, etc.

I am sure that researchers and practitioners in the field, from students to experts, will greatly benefit from reading this book, which provides a framework to “understand understanding” (pun intended), as well as a comprehensive overview of the state of the art. I sincerely hope that it will inspire developers to improve their solutions and researchers to continue innovating in that area, which remains as fascinating as ever.

Haifa, Israel
Yoelle Maarek

Contents

1 An Introduction to Query Understanding
  Hongbo Deng and Yi Chang
2 Query Classification
  Jiafeng Guo and Yanyan Lan
3 Query Segmentation and Tagging
  Xuanhui Wang
4 Query Intent Understanding
  Zhicheng Dou and Jiafeng Guo
5 Query Spelling Correction
  Yanen Li
6 Query Rewriting
  Hui Liu, Dawei Yin, and Jiliang Tang
7 Query Auto-Completion
  Liangda Li, Hongbo Deng, and Yi Chang
8 Query Suggestion
  Zhen Liao, Yang Song, and Dengyong Zhou
9 Future Directions of Query Understanding
  David Carmel, Yi Chang, Hongbo Deng, and Jian-Yun Nie

Editors and Contributors

About the Editors

Dr. Yi Chang is the Dean of School of Artificial Intelligence, Jilin University, China.
He was a Technical Vice President at Huawei Research America and a research director at Yahoo Research before that. His research interests include information retrieval, data mining, machine learning, natural language processing, and artificial intelligence. He has published more than 100 papers on premium conferences or journals, and he has served as the conference general chair for ACM WSDM’2018 and ACM SIGIR’2020. He was elected as an ACM Distinguished Scientist in 2018, for his contributions to intelligent algorithms for search engines.

Dr. Hongbo Deng is a senior staff engineer and director in the Search and Recommendation Business Unit at Alibaba Group. Before that, he was a senior software engineer at Google and a senior research scientist at Yahoo! Labs. His research interests include information retrieval, Web search, data mining, recommendation system, and natural language processing. He obtained his Ph.D. from the Department of Computer Science and Engineering at The Chinese University of Hong Kong. He has published more than 40 papers on top conferences and journals and won several best paper awards, including the Best Paper Award in SIGKDD 2016 and the Vannevar Bush Best Paper Award in JCDL 2012. In addition, he has been actively serving as a program committee member in KDD, WWW, SIGIR, WSDM, and CIKM as well as co-organizing several workshops. Dr. Hongbo Deng is a senior member of ACM.
Contributors

David Carmel, Amazon Research, Haifa, Israel
Yi Chang, Jilin University, Jilin, China
Hongbo Deng, Alibaba Group, Zhejiang, China
Zhicheng Dou, Renmin University of China, Beijing, China
Jiafeng Guo, Chinese Academy of Sciences, Beijing, China
Yanyan Lan, Chinese Academy of Sciences, Beijing, China
Yanen Li, LinkedIn Inc., Mountain View, CA, USA
Liangda Li, Yahoo Research, Sunnyvale, CA, USA
Zhen Liao, Facebook Inc., Menlo Park, CA, USA
Hui Liu, Michigan State University, East Lansing, MI, USA
Jian-Yun Nie, University of Montreal, Montreal, QC, Canada
Yang Song, Google Research, Mountain View, CA, USA
Jiliang Tang, Michigan State University, East Lansing, MI, USA
Xuanhui Wang, Google Research, Mountain View, CA, USA
Dawei Yin, Baidu Inc., Beijing, China
Dengyong Zhou, Google Research, Mountain View, CA, USA
