Studies in Computational Intelligence, Volume 487

Cai-Nicolas Ziegler

Social Web Artifacts for Boosting Recommenders
Theory and Implementation

Series Editor: J. Kacprzyk, Warsaw, Poland
For further volumes: http://www.springer.com/series/7092

PD Dr. Cai-Nicolas Ziegler
PAYBACK GmbH (American Express)
Albert-Ludwigs-Universität Freiburg i.Br.
München
Germany

ISSN 1860-949X
ISSN 1860-9503 (electronic)
ISBN 978-3-319-00526-3
ISBN 978-3-319-00527-0 (eBook)
DOI 10.1007/978-3-319-00527-0
Springer Cham Heidelberg New York Dordrecht London
Library of Congress Control Number: 2013937342
© Springer International Publishing Switzerland 2013

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

Foreword

I first met Dr. Ziegler when he was a Ph.D. student spending a few months visiting our GroupLens Research lab in Minnesota. From the first I could tell he was a researcher of unusual vision, not content to work within the bounds of the previous literature on recommenders, but wanting to understand how the early recommender tools could be reshaped to meet needs that their users didn’t even imagine they had yet. He was particularly interested in understanding the fit between recommenders as these magical devices that were expected to surprise and delight their users, and the users’ real information needs across a variety of interests.

Dr. Ziegler’s time with GroupLens was fruitful, producing an excellent paper that characterized the problem of topic diversification. The key insight in this work is that even once a recommender algorithm has zoomed in accurately on a user’s interests, providing a set of results composed of the items that are individually predicted to be the most interesting may lead to a bored user. For instance, a user who loves Star Trek movies might like to have one new Star Trek movie recommended, but will certainly be disappointed to have a list containing only the ten most recent Star Trek movies. Paradoxically, the recommender must recommend items that are less preferred in order to produce a list that is more preferred. Dr. Ziegler’s paper defined this problem carefully, gave a firm mathematical foundation rich enough to support a variety of approaches to diversification, and demonstrated that in practice users prefer the more diverse lists.

This work is therefore characteristic of Dr. Ziegler. He found a core problem that was poorly understood, gave it a strong foundation, and helped the community see its importance.
His paper on this topic is highly cited in the research literature to this day.

This book contains a rich set of examples of this research approach in practice, in several key domains. In addition to a framing of the recommendation problem, there are three deep contributions of the book to a richer understanding of recommendation: topic diversification (which we have already discussed), taxonomy-driven filtering, and trust models.

The key idea behind taxonomy-driven filtering is that users often have different levels of interest for different parts of the taxonomy of an information space. For instance, one user who works with the Java programming language may be particularly interested in new work on the type system, while another user may be most interested in Java-based Web containers. Recommenders that are aware of the possibility of these differences can gain predictive power. In the early days of recommender systems, taxonomies such as these were not available to use, even if the algorithmic tools had been available. A key contribution of the book is to demonstrate that today there are two rich sources of such information. First, several Social Web projects are creating large, open information taxonomies, such as the category hierarchy in Wikipedia. Second, powerful methods of text processing enable the automatic extraction of taxonomies from textual information spaces. Looking to the future, we can predict that such tools will soon work for music, photos, and even movies. Mining and making use of these taxonomies opens the potential for powerful new approaches to recommendation.

The key idea behind trust models is that humans have notions of trust that are not always compatible with the recommendations from a “black box” recommender.
Exposing how the recommender model works, and, crucially, exposing which other humans have contributed to a set of recommendations can have a big influence on how much the recipient of the recommendations trusts them. This book explores trust models based on homophily between members of a recommender community. Over the long term we can expect trust models like these that cross communities, that can be manipulated by the end user (“no, I don’t trust that guy!”), and that provide explanations for why a recommendation can be trusted (“your friends Alice and Beth both liked this quadricopter, so you probably will too”).

Dr. Ziegler is a visionary scientist, and this book demonstrates his keen insight into new approaches to thinking about recommendation that are now being explored by hundreds of other scientists worldwide. In reading this book you will engage with important problems in recommendation, and will see how thinking deeply about user needs leads to fresh insights into technological possibilities.

Minneapolis, USA
John Riedl
March 2013

Preface

Recommender systems, those software programs that learn from human behavior and make predictions of what products or services we are expected to appreciate and thus purchase, have become an integral part of our everyday life. They proliferate across electronic commerce around the globe. Take, for instance, Amazon.com, the first commercially successful and prominent example of such systems, making use of a broad range of recommender system types: The company is said to have experienced significant double-digit growth in sales solely through personalization, thus representing an impressive uplift. Numerous systems have followed, and today recommender systems exist for virtually all sorts of consumable goods, e.g., books, movies, music, and even jokes.

At the same time, a new evolution on the Web has started to take shape, known as “participation age”, “collective wisdom”, and – most widely used today – “Web 2.0” or “Social Web”: Consumer-generated media and content have become rife, and social networks have emerged and are pulling significant shares of the overall Web traffic.
In line with these developments, novel information and knowledge structures have become readily available on the Web: people’s personal ties and trust links, and large human-crafted taxonomies for organizing and categorizing all kinds of items. One example is the massive DMOZ Open Directory Project, which has taken on the challenge of categorizing the entire Web with its classification system.

This textbook presents approaches to exploit the new Social Web fountain of knowledge, zeroing in first and foremost on two information artifacts, namely classification taxonomies and trust networks. These two are used to improve the performance of product-focused recommender systems: While classification taxonomies are appropriate means to fight the sparsity problem prevalent in many productive recommender systems, interpersonal trust ties – when used as proxies for interest similarity – are able to mitigate the recommenders’ scalability problem.

While maintaining the principal focus of improving product recommender systems through taxonomies and trust, several digressions from this main theme are included, such as the use of Web 2.0 taxonomies for computing the semantic proximity of named entity pairs, or the recommending of technology synergies based on Wikipedia and our semantic proximity framework. These slight digressions, however, make the book even more valuable by adding perspectives of what else can be achieved with those precious instruments of knowledge that can be created from the Web 2.0’s overtly accessible raw material of data and information.

München, Germany
Cai-Nicolas Ziegler
March 2013

Acknowledgements

Most of the research presented in this textbook was conducted during my Ph.D. period at the Albert-Ludwigs-Universität Freiburg i.Br., Germany, as well as at GroupLens Research at the University of Minnesota, USA.

Above all, I would like to thank Prof. Dr. Georg Lausen, my supervisor at DBIS, the Institute of Databases and Information Systems in Freiburg. He has been my mentor throughout my Ph.D. period, and has continued to be so ever since. I owe him a lot and value him not only for his work, but also for the person he is.
I would also like to thank my second supervisor, Prof. Dr. Joseph A. Konstan, as well as Prof. Dr. John Riedl, both from the GroupLens Research lab in Minneapolis. These two subject matter experts have provided fresh new input from a different, more HCI-focused perspective, which makes this book even more valuable to the reader.

A big thanks also goes to Prof. Dr. Dr. Lars Schmidt-Thieme, who introduced me to methods of quantifying the performance of recommender systems in offline experiments. It is now many years ago that I first came to his office in order to discuss collaborative filtering, and that I came out of it with a wealth of new insights and knowledge.

My gratitude also goes to the researchers who helped me along the way, particularly Dr. Paolo Massa, Zvi Topol, Ernesto Díaz-Avilés, Prof. Dr. Jennifer Golbeck, Dr. Sean M. McNee, Prof. Dr. Dan Cosley, Dr. Maximilian Viermetz, and Dr. Stefan Jung. It extends to Ron Hornbaker and Erik Benson, who maintain the All Consuming and BookCrossing communities, respectively. They provided the community data that made the online user studies possible.

Having covered the research side of contributions, I now turn to the more emotional ones: my family. My parents Klaus and Angelika have always been there for me, as has my “little” brother Chris. And, for sure, my beloved wife Miriam, the best thing that ever happened to me in my life.

To my parents, and Chris, my “little” brother.