UC San Diego Electronic Theses and Dissertations

Title: Latent feature models for dyadic prediction
Permalink: https://escholarship.org/uc/item/4xw874p5
Author: Menon, Aditya Krishna
Publication Date: 2013
Peer reviewed | Thesis/dissertation
eScholarship.org, powered by the California Digital Library, University of California

UNIVERSITY OF CALIFORNIA, SAN DIEGO

Latent feature models for dyadic prediction

A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Science

by

Aditya Krishna Menon

Committee in charge:
  Professor Charles Elkan, Chair
  Professor Gert Lanckriet
  Professor Ramamohan Paturi
  Professor Lawrence Saul
  Professor Nuno Vasconcelos

2013

Copyright
Aditya Krishna Menon, 2013
All rights reserved.

The Dissertation of Aditya Krishna Menon is approved and is acceptable in quality and form for publication on microfilm and electronically:

Chair

University of California, San Diego
2013

DEDICATION

To my parents

EPIGRAPH

Nothing happens here,
Nothing gets done,
But you get to like it.
David McComb

TABLE OF CONTENTS

Signature Page
Dedication
Epigraph
Table of Contents
List of Figures
List of Tables
Acknowledgements
Vita
Abstract of the Dissertation
Chapter 1 Introduction
  1.1 Recap: the value of supervised learning
  1.2 From supervised learning to dyadic prediction
    1.2.1 Collaborative filtering and friends
    1.2.2 Dyadic prediction: an informal overview
  1.3 Questions to be addressed
  1.4 Contributions of this dissertation
  1.5 Organization of this dissertation
Chapter 2 Overview of Dyadic Prediction
  2.1 A formal definition of dyadic prediction
  2.2 Example instantiations of the framework
    2.2.1 Collaborative filtering
    2.2.2 Link prediction
    2.2.3 Response prediction
    2.2.4 Item response theory
  2.3 Generality of the framework
    2.3.1 Train and test distributions
    2.3.2 Label space
    2.3.3 Side-information
  2.4 Relationship to existing frameworks
    2.4.1 Supervised learning
    2.4.2 Matrix completion
    2.4.3 Weighted link prediction
    2.4.4 Random effects models and ANOVA
  2.5 Overview of dyadic prediction models
    2.5.1 Unsupervised models
    2.5.2 Feature-based models
    2.5.3 Clustering models
    2.5.4 Latent feature models
  2.6 Analysis of the latent feature approach
    2.6.1 Strengths and weaknesses
    2.6.2 Training latent feature models
    2.6.3 Connections to other models
    2.6.4 A comment on the independence assumption
Chapter 3 LFL: a Log-Linear Model for Dyadic Prediction
  3.1 Motivation: a generic dyadic prediction model
  3.2 A first attempt at a log-linear model
    3.2.1 Log-linear models in general
    3.2.2 Applying the log-linear framework to dyadic prediction
    3.2.3 A weakness of the model: the propensity problem
  3.3 LFL: a log-linear model with latent features
    3.3.1 Adding latent features to the log-linear model
    3.3.2 Exploiting side-information
    3.3.3 Training the model
    3.3.4 Making predictions
  3.4 Analysis of the LFL model
    3.4.1 Strengths and weaknesses of the LFL model
    3.4.2 Different perspectives on the model
    3.4.3 Do we get meaningful probabilities?
  3.5 Extensions and variations on the LFL model
    3.5.1 Alternate factorizations
    3.5.2 Fixing a base class
    3.5.3 Finer-grained weights for side-information
  3.6 Comparison to existing models
    3.6.1 PCA and probabilistic variants
    3.6.2 Statistical network models
    3.6.3 Other models
  3.7 Experimental design
    3.7.1 Aims of the experiments
    3.7.2 Hyperparameter selection procedure
    3.7.3 Practical details on training procedure
    3.7.4 Implementation details
  3.8 Experimental results
    3.8.1 Are local optima a concern?
    3.8.2 Is the model powerful?
    3.8.3 Application to IRT: does the model work in practice?
  3.9 Conclusion
  3.10 Acknowledgements
Chapter 4 Modelling Rating Distributions: Application to Collaborative Filtering
  4.1 Advantages of LFL for collaborative filtering
    4.1.1 Modelling rating distributions
    4.1.2 Addressing the cold-start problem
  4.2 Is LFL appropriate for collaborative filtering?
    4.2.1 Ordinal nature of collaborative filtering labels
    4.2.2 Is predicting the mode appropriate?
    4.2.3 Is maximizing log-likelihood appropriate?
  4.3 Modifying LFL for collaborative filtering problems
    4.3.1 Modifying the training objective function
    4.3.2 Modifying the underlying model
    4.3.3 Which approach is better?
  4.4 Analysis of the model
    4.4.1 Matrix factorization perspective
    4.4.2 Are rating-specific weights meaningful?
  4.5 Further extensions of the model
    4.5.1 Anchor points for prediction
    4.5.2 Other approaches to capturing ordinal structure
    4.5.3 Incorporating collaborative filtering specific extensions
  4.6 Comparison to existing models
    4.6.1 Comparison to existing models for rating distributions
    4.6.2 Comparison to existing schemes for cold-start correction
  4.7 Experimental design
    4.7.1 Aims of the experiments
  4.8 Experimental results
    4.8.1 Does the choice of scoring and training scheme affect performance?
    4.8.2 Comparison on benchmark datasets
    4.8.3 Results in the cold-start setting
    4.8.4 Are the learned probabilities meaningful?
  4.9 Conclusion
  4.10 Acknowledgements
Chapter 5 Application to Link Prediction
  5.1 Link prediction: overview and existing models
    5.1.1 Problem definition
    5.1.2 Desiderata for a link prediction model
    5.1.3 Existing link prediction methods
    5.1.4 Do existing methods meet the desiderata?
  5.2 Applying the LFL model to link prediction
    5.2.1 Does LFL meet the desiderata?
    5.2.2 Handling generic graphs: undirected, directed, multirelational
  5.3 Overcoming class imbalance for unweighted graphs
  5.4 Experimental design
    5.4.1 Aims of the experiments
    5.4.2 Description of datasets
    5.4.3 Evaluation methodology
  5.5 Experimental results
    5.5.1 Results for binary edges
    5.5.2 Results for nominal edges
  5.6 Conclusion
  5.7 Acknowledgements
Chapter 6 Predicting Clickthrough Rates: Application to Response Prediction
  6.1 Background and related work
    6.1.1 The response prediction problem
    6.1.2 Challenges in response prediction
    6.1.3 Formal definitions
    6.1.4 Existing models
  6.2 From collaborative filtering to response prediction
    6.2.1 A dyadic interpretation of response prediction
    6.2.2 Overview of our latent feature model
  6.3 A confidence-weighted factorization model
    6.3.1 Confidence-weighted factorization
    6.3.2 Comparison to existing methods
  6.4 Incorporating side-information
    6.4.1 A joint factorization and feature model
    6.4.2 An iterative refinement procedure
  6.5 Incorporating hierarchies
    6.5.1 Hierarchical regularization
    6.5.2 Agglomerate fitting
    6.5.3 Residual fitting
    6.5.4 Putting it all together: a hybrid method
    6.5.5 Handling cold-start pages and ads
  6.6 Experimental design
    6.6.1 Aims of the experiments
    6.6.2 Datasets used
    6.6.3 Methods compared
    6.6.4 Evaluation methodology
  6.7 Experimental results