UC San Diego
UC San Diego Electronic Theses and Dissertations
Title
Latent feature models for dyadic prediction
Permalink
https://escholarship.org/uc/item/4xw874p5
Author
Menon, Aditya Krishna
Publication Date
2013
Peer reviewed|Thesis/dissertation
eScholarship.org Powered by the California Digital Library
University of California
UNIVERSITY OF CALIFORNIA, SAN DIEGO
Latent feature models for dyadic prediction
A dissertation submitted in partial satisfaction of the
requirements for the degree of Doctor of Philosophy
in
Computer Science
by
Aditya Krishna Menon
Committee in charge:
Professor Charles Elkan, Chair
Professor Gert Lanckriet
Professor Ramamohan Paturi
Professor Lawrence Saul
Professor Nuno Vasconcelos
2013
Copyright
Aditya Krishna Menon, 2013
All rights reserved.
The Dissertation of Aditya Krishna Menon is approved and is acceptable
in quality and form for publication on microfilm and electronically:
Chair
University of California, San Diego
2013
DEDICATION
To my parents
EPIGRAPH
Nothing happens here,
Nothing gets done,
But you get to like it.
David McComb
TABLE OF CONTENTS
Signature Page ........ iii
Dedication ........ iv
Epigraph ........ v
Table of Contents ........ vi
List of Figures ........ xii
List of Tables ........ xiv
Acknowledgements ........ xvi
Vita ........ xix
Abstract of the Dissertation ........ xxi
Chapter 1 Introduction ........ 1
1.1 Recap: the value of supervised learning ........ 1
1.2 From supervised learning to dyadic prediction ........ 3
1.2.1 Collaborative filtering and friends ........ 3
1.2.2 Dyadic prediction: an informal overview ........ 5
1.3 Questions to be addressed ........ 7
1.4 Contributions of this dissertation ........ 8
1.5 Organization of this dissertation ........ 10
Chapter 2 Overview of Dyadic Prediction ........ 12
2.1 A formal definition of dyadic prediction ........ 12
2.2 Example instantiations of the framework ........ 13
2.2.1 Collaborative filtering ........ 13
2.2.2 Link prediction ........ 14
2.2.3 Response prediction ........ 15
2.2.4 Item response theory ........ 16
2.3 Generality of the framework ........ 16
2.3.1 Train and test distributions ........ 16
2.3.2 Label space ........ 18
2.3.3 Side-information ........ 19
2.4 Relationship to existing frameworks ........ 20
2.4.1 Supervised learning ........ 20
2.4.2 Matrix completion ........ 21
2.4.3 Weighted link prediction ........ 22
2.4.4 Random effects models and ANOVA ........ 23
2.5 Overview of dyadic prediction models ........ 24
2.5.1 Unsupervised models ........ 25
2.5.2 Feature-based models ........ 26
2.5.3 Clustering models ........ 28
2.5.4 Latent feature models ........ 29
2.6 Analysis of the latent feature approach ........ 40
2.6.1 Strengths and weaknesses ........ 40
2.6.2 Training latent feature models ........ 41
2.6.3 Connections to other models ........ 42
2.6.4 A comment on the independence assumption ........ 45
Chapter 3 LFL: a Log-Linear Model for Dyadic Prediction ........ 47
3.1 Motivation: a generic dyadic prediction model ........ 47
3.2 A first attempt at a log-linear model ........ 48
3.2.1 Log-linear models in general ........ 48
3.2.2 Applying the log-linear framework to dyadic prediction ........ 50
3.2.3 A weakness of the model: the propensity problem ........ 51
3.3 LFL: a log-linear model with latent features ........ 51
3.3.1 Adding latent features to the log-linear model ........ 51
3.3.2 Exploiting side-information ........ 53
3.3.3 Training the model ........ 53
3.3.4 Making predictions ........ 55
3.4 Analysis of the LFL model ........ 56
3.4.1 Strengths and weaknesses of the LFL model ........ 56
3.4.2 Different perspectives on the model ........ 57
3.4.3 Do we get meaningful probabilities? ........ 60
3.5 Extensions and variations on the LFL model ........ 60
3.5.1 Alternate factorizations ........ 61
3.5.2 Fixing a base class ........ 62
3.5.3 Finer-grained weights for side-information ........ 63
3.6 Comparison to existing models ........ 64
3.6.1 PCA and probabilistic variants ........ 64
3.6.2 Statistical network models ........ 66
3.6.3 Other models ........ 66
3.7 Experimental design ........ 68
3.7.1 Aims of the experiments ........ 68
3.7.2 Hyperparameter selection procedure ........ 69
3.7.3 Practical details on training procedure ........ 70
3.7.4 Implementation details ........ 72
3.8 Experimental results ........ 73
3.8.1 Are local optima a concern? ........ 73
3.8.2 Is the model powerful? ........ 74
3.8.3 Application to IRT: does the model work in practice? ........ 75
3.9 Conclusion ........ 80
3.10 Acknowledgements ........ 81
Chapter 4 Modelling Rating Distributions: Application to Collaborative Filtering ........ 82
4.1 Advantages of LFL for collaborative filtering ........ 82
4.1.1 Modelling rating distributions ........ 83
4.1.2 Addressing the cold-start problem ........ 85
4.2 Is LFL appropriate for collaborative filtering? ........ 87
4.2.1 Ordinal nature of collaborative filtering labels ........ 87
4.2.2 Is predicting the mode appropriate? ........ 88
4.2.3 Is maximizing log-likelihood appropriate? ........ 89
4.3 Modifying LFL for collaborative filtering problems ........ 91
4.3.1 Modifying the training objective function ........ 91
4.3.2 Modifying the underlying model ........ 92
4.3.3 Which approach is better? ........ 94
4.4 Analysis of the model ........ 96
4.4.1 Matrix factorization perspective ........ 96
4.4.2 Are rating-specific weights meaningful? ........ 97
4.5 Further extensions of the model ........ 98
4.5.1 Anchor points for prediction ........ 98
4.5.2 Other approaches to capturing ordinal structure ........ 99
4.5.3 Incorporating collaborative filtering specific extensions ........ 100
4.6 Comparison to existing models ........ 101
4.6.1 Comparison to existing models for rating distributions ........ 101
4.6.2 Comparison to existing schemes for cold-start correction ........ 106
4.7 Experimental design ........ 108
4.7.1 Aims of the experiments ........ 108
4.8 Experimental results ........ 109
4.8.1 Does the choice of scoring and training scheme affect performance? ........ 109
4.8.2 Comparison on benchmark datasets ........ 110
4.8.3 Results in the cold-start setting ........ 119
4.8.4 Are the learned probabilities meaningful? ........ 122
4.9 Conclusion ........ 123
4.10 Acknowledgements ........ 125
Chapter 5 Application to Link Prediction ........ 130
5.1 Link prediction: overview and existing models ........ 130
5.1.1 Problem definition ........ 130
5.1.2 Desiderata for a link prediction model ........ 131
5.1.3 Existing link prediction methods ........ 133
5.1.4 Do existing methods meet the desiderata? ........ 136
5.2 Applying the LFL model to link prediction ........ 139
5.2.1 Does LFL meet the desiderata? ........ 139
5.2.2 Handling generic graphs: undirected, directed, multirelational ........ 141
5.3 Overcoming class imbalance for unweighted graphs ........ 146
5.4 Experimental design ........ 149
5.4.1 Aims of the experiments ........ 149
5.4.2 Description of datasets ........ 150
5.4.3 Evaluation methodology ........ 152
5.5 Experimental results ........ 153
5.5.1 Results for binary edges ........ 153
5.5.2 Results for nominal edges ........ 160
5.6 Conclusion ........ 161
5.7 Acknowledgements ........ 162
Chapter 6 Predicting Clickthrough Rates: Application to Response Prediction ........ 163
6.1 Background and related work ........ 164
6.1.1 The response prediction problem ........ 164
6.1.2 Challenges in response prediction ........ 165
6.1.3 Formal definitions ........ 166
6.1.4 Existing models ........ 167
6.2 From collaborative filtering to response prediction ........ 168
6.2.1 A dyadic interpretation of response prediction ........ 169
6.2.2 Overview of our latent feature model ........ 169
6.3 A confidence-weighted factorization model ........ 171
6.3.1 Confidence-weighted factorization ........ 171
6.3.2 Comparison to existing methods ........ 173
6.4 Incorporating side-information ........ 175
6.4.1 A joint factorization and feature model ........ 175
6.4.2 An iterative refinement procedure ........ 176
6.5 Incorporating hierarchies ........ 178
6.5.1 Hierarchical regularization ........ 178
6.5.2 Agglomerate fitting ........ 180
6.5.3 Residual fitting ........ 182
6.5.4 Putting it all together: a hybrid method ........ 182
6.5.5 Handling cold-start pages and ads ........ 183
6.6 Experimental design ........ 183
6.6.1 Aims of the experiments ........ 183
6.6.2 Datasets used ........ 184
6.6.3 Methods compared ........ 185
6.6.4 Evaluation methodology ........ 186
6.7 Experimental results ........ 187