Table Of ContentModeling Attractiveness and Multiple Clicks in Sponsored
Search Results
∗ ∗
Dinesh Govindaraj Tao Wang S V N Vishwanathan
MicrosoftBingAds PurdueUniversity PurdueUniversity
Bangalore WestLafayette,IN WestLafayette,IN
digovind@microsoft.com taowang@purdue.edu vishy@stat.purdue.edu
ABSTRACT 1. INTRODUCTION
4
1 Click modelsareanimportanttoolforleveraginguserfeed- Whenausersubmitsaquery,searchenginesreturnorganic
0 back,andareusedbycommercialsearch enginesforsurfac- search results as a list of ranked URLs. Sometimes URLs
2 ing relevant search results. However, existing click models linkedtolandingpagesofads(sponsoredsearchresults)are
are lacking in two aspects. First, they do not share infor- also displayed along with the organic search results. Spon-
n
mationacrosssearchresultswhencomputingattractiveness. sored search URLs displayed above the organic search re-
a
Second, they assume that users interact with thesearch re- sults are called mainline ads, while those displayed on the
J
sults sequentially. Based on our analysis of theclick logs of sidearecalled sidebar ads. Searchenginesgenerate revenue
1
a commercial search engine, we observe that the sequential when users click on ads. In this paper we are interested
scan assumption does not always hold, especially for spon- in modeling how users interact with and click on sponsored
]
R sored search results. To overcome the above two limita- search results. Inparticular,wefocusonmainlineadssince
tions,weproposeanewclickmodel. Ourkeyinsightisthat amajorityofuserclicksandhencerevenuecanbeattributed
I
. sharinginformationacrosssearchresultshelpsinidentifying to theseads.
s
important words or key-phrases which can then be used to
c
[ accurately compute attractiveness of a search result. Fur- Miningclick-throughlogsisanimportantcomponentofthe
thermore, we argue that the click probability of a position never ending quest of commercial search engines to surface
1 as well as its attractiveness changes during a user session the most relevant (sponsored) search results in response to
v anddependsontheuser’spastclickexperience. Ourmodel an user query. To understand this, consider the following
5 seamlessly incorporatestheeffectofexternalities(qualityof scenario: Suppose documents d and d′ are displayed in re-
5
other search results displayed in response to a user query), sponse to a query q. If the numberof clicks that d receives
2 user fatigue, as well as pre and post-click relevance of a is disproportionately high compared to d′, one can reason-
0 sponsored search result. We propose an efficient one-pass ably conclude that d is more relevant than d′ for q. While
.
1 inference scheme and empirically evaluate the performance thisapproachof learningfrom the“wisdom of thecrowd”is
0 of our model via extensive experiments using the click logs cheaper than employing a human labeler, it is not without
4 of a large commercial search engine. its fair share of difficulties. For instance, it is well known
1 that user clicks display a position bias, that is, documents
: athigherrankedpositionsaremorelikelytobeclickedthan
v Categories andSubject Descriptors
lower ranked documents. There is a rich body of research
i
X H.4[InformationSystemsApplications]: Miscellaneous which tries to infer an unbiased estimate of the document
r relevancefromclick-throughlogsbyexplicitlymodelinguser
a General Terms behaviorusing click models.
Algorithms
Almost all existing click models are designed for organic
search, and make the simplifying assumption that users in-
Keywords
teract with thesearch resultssequentially [12, 13]. In other
ClickModels,Userbehavior,Querylog,ClickandBrowsing words,theyassumethatauserexaminesanURLatposition
data analysis, Sponsored Search,Optimization, Ranking i+1 only after examining the URLs at positions 1,...,i.
Furthermore, since an overwhelming majority of user ses-
∗
Work donewhen visiting Microsoft Bing Ads,Bangalore. sions end after one click, many models focus exclusively on
this scenario. However, based on the analysis of the click
logs of a commercial search engine, we observed that user
interaction with sponsored search results do not confirm to
these assumptions. Approximately 10% of the usersessions
with clicks contain more than one click. Furthermore, in
approximately 30% of multi-click sessions at least one pair
of clicks is in reverse order (e.g., a click at position 3 fol-
lowed by a click at position 1). This situation is depicted
graphically in Figure 1.
0.5 Inthispaper,wepresentanovelstatisticalclickmodelwhich
modelsreverseclicks,andincorporatesanewmethodfores-
timatingattractiveness. Likeinpreviouswork[15,16,5]we
0.4 also assume a separable model, that is, the user propensity
for clicking on a search result is a product of the proba-
s
k bility of examining that position and the attractiveness of
c
Cli 0.3 the document displayed at that position. However, our key
of insight is that the examination probability of a position as
n well as the attractiveness of a position changes over time
o
ti 0.2 and dependson theuser’s past click experience.
c
a
Fr Our contributions can be summarized as follows: We pro-
0.1 pose a new click model for user interaction with sponsored
searchresults. Ourmodelcanhandlemultipleclicksessions
aswellasusersessionswheretheorderofclicksisreversed.
Furthermore, we present a new method for estimating at-
0
tractiveness which shares information across multiplespon-
−3 −2 −1 1 2 3
sored search results displayed in response to a user query.
ConsecutiveClick Distance
Ourmodelisflexibleandcaneasilyincorporateuserfatigue,
pre-click and post-click relevance of an ad, and externali-
ties. We derive an efficient one-pass inference mechanism
Figure 1: If the user clicks on position i followed by forestimatingparameters. Finally,weshowthatourmodel
position j, then the click distance is defined as j−i. comprehensively outperforms existing click models in large
Since at most 4 mainline ads are shown, the click scale empirical evaluation using real-life data from a large
distance lies in the range [−3,3]. We plot a normal- commercial search engine.
ized histogram of the distance between consecutive
clicks.
2. CLICK MODELS: A BRIEF REVIEW
In what follows we will use theterms documents, ads, urls,
and sponsored search results interchangeably. Almost all
We conjecture that users scan all the sponsored search re- click prediction models make the following separability as-
sults before they begin clicking. This is because sponsored sumptions: (1) A document is clicked if and only if it is
search results are shown in a small block at the beginning examined(orviewed);(2)Theprobabilitythatadocument
of the page and usually occupy only 4 to 8 lines. Perhaps, is clicked is independent of its position given that it was
eyetracking studies[9] which areoften used to supportthe viewed; (3) The probability that a document is viewed is
linear scan assumption in organic search may not apply to independent of the document given the position, and is in-
sponsored search1. Therefore, meaningful click models for dependentof theother adspresented.
sponsored search need to model reverseclicks.
Unlike organic search, where short snippets of the landing ExaminationHypothesis. Let Ei (resp. Ci) be a random
pageofasearchresultisdisplayed,sponsoredsearchresults variablewhichisequalto1ifadocumentdiatpositioniwas
are designed to grab attention. Towards this end, they em- examined (resp. clicked). Based on the above assumptions
ploy titles that are short, catchy, and some words (which one can write:
match query terms) are displayed in bold font. An spon-
sored search result might look very attractive to an user, P(Ci=1|di)=P(Ci=1∩Ei=1|di)
but after clicking on it she might realize that the landing =P(Ci=1|di,Ei =1)P(Ei=1). (1)
page does not contain the information that she wants. In
other words, not only the attractiveness but also the post- | rele{vzance }|positi{oznbia}s
click relevance of a sponsored search result are important Inotherwords, theprobabilityof a documentbeingclicked
factors for usersatisfaction in sponsored search. canbefactoredintotheproductoftherelevanceofthedocu-
mentandthepositionbias. Thisistheso-calledexamination
Somewhatsurprisingly,mostexistingclickmodelseitherdo hypothesis[15] or separability assumption [1].
not distinguish between attractiveness and post-click rele-
vance or use very simple methods of estimating attractive-
ness. However, consider the following: the presence of the Cascade Model and Extensions. The cascade model [6]
words“bonded and insured”in the title can make a spon- assumes thatusers scan thedocumentsfrom top tobottom
soredsearchresultmoreattractiveforthequery“plumber”. withoutskipping,thatis,usersexaminetheadssequentially:
Ifwecansharesuchinformationacrossallsponsoredsearch
resultsdisplayedinresponsetothequery“plumber”,wecan P(Ei=1|Ei−1 =1)=1 and P(Ei=1|Ei−1 =0)=0. (2)
accurately estimate attractiveness.
The cascade model [6] further assumes that users exam-
inedocumentsuntiltheyfindan appropriateone, andthen
1Validating our conjecture using carefully designed user abandon thesearch after thefirst click:
studies is an interesting research direction in its own right,
but is unfortunately beyondthescope of thecurrent work P(Ei=1|Ei−1 =1,Ci−1)=1−Ci−1. (3)
The basic cascade model can only deal with query sessions Effect of Externalities. Externalities [8] refer to the ob-
with a single click. The dependent click model (DCM) [11] servation that the click on a document also dependson the
extendsthisto multi-click sessions, by modifying (3) to qualityofotherdocumentspresentedonthesamesearchre-
sult page. That is, the examination event will be affected
P(Ei=1|Ei−1 =1,Ci−1)=(λi−1)Ci−1. (4) not only by the documents shown above a certain position,
but also by thedocumentsbelow a certain position.
Here λi is estimated from the empirical probability that a
browsing session did not end after a click at position i.
Much of the early work on click models considers the dis-
playeddocumentsasbeingindependentofeachother. Some
The user browsing model (UBM) [7] extends the DCM by
recentworkhastriedtotakeexternalitiesintoaccount. For
assuming that the examination probability depends on the
instance, [17] proposed a conditional random fields based
last clicked position in the same query session. In other
model to click prediction to consider the externalities., [19]
words
considersexternalitiesbutrestricttheirattentiontosessions
P(Ei=1|C1,...,Ci)=βri, (5) with just two ads. Ontheother hand,[16] showed that the
relevanceofadocumentatapositionisnotconstantandit
where ri denotes the position of the previous click, that is, isaffectedbytheclicksinotherpositions. Inthecontextof
ri := argmaxjI(Cj =1). The Bayesian browsing model news recommendation, [2] extended the basic DBN model
(BBM) [14] models relevance P(Ci = 1|di,Ei = 1) as a to take into account a sub-modular information gain score,
random variable, and usesa Bayesian approach to estimate which affectstherelevanceof adocument. Althougha sim-
its value. ilar extension is also possible in our setting, for the sake of
brevitywe will omit discussing this for our model.
DynamicBayesianNetworkModel. ThedynamicBayesian 3. MODEL DESCRIPTION
network (DBN) [4] model differs from theabove click mod-
3.1 Notation
els in two important ways. First, it distinguishes between
Since we have to deal with reverse clicks, our notation is
attractiveness (also called perceived relevance or pre-click
non-standard. Given a query q, we assume that the user is
relevance in literature [16]) and post-click relevance. Sec-
presented with a ranked list of n documents or sponsored
ond, it incorporates userfatigue. This leads to
search results D = {d1,...,dn}, and she might choose to
P(Ci−1 =1|di−1,Ei−1 =1)=θdi−1 (6) click on 0≤nˆ ≤n documents. We consider this as a query
P(Ei=1|Ei−1 =1,Ci−1)=λi−1(cid:0)1−ρdi−1(cid:1)Ci−1. (7) vseasrsiiaobnl.e; cLietfocri1∈≤{i0,≤1,nˆ...d,enn}otbesetahempuolstiitnioomnioafl trhaendio-tmh
(Hreersep,.ρadtti−ra1ct(irveesnpe.sθsd)i−of1)dodceunmoteensttdhi−e1pdosistp-claliycekdsaattipsfoascittiioonn ccl1i:cik=, a(nc1d,cw2,e··le·t,cbio)t,hancd0 canbdecanˆs+h1orttohabnedefqouracl1t:nˆo+01.. Let
i−1. For simplicity, one can set λi−1 = λ for all i [4], or
likeintheDCMestimateλi−1 from empiricalprobabilities.
Start
Similarly, θ is estimated (essentially) by counting the
di−1
numberoftimesadocumentwasdisplayedandthenumber
oftimesitwasclicked. Aswewillseelaterinthepaper,esti-
matingtheattractivenessbyusinganaive-Bayeslikemodel
leads to information sharing and hence betterestimates for
Yes
theattractiveness. Abandon?
Recently [3] extended the DBN by learning a per-query fa- Prev. clicks Attractiveness
tigueparameterandaddingavariabletoindicateiftheuser
No
islikelytointeractwithsponsoredsearchresultsordirectly
skip over them. A related model to DBN is the click chain
Select a position to click
model (CCM) [10].
Temporal Hidden Click Model. Some recent work [18]
has focused on incorporating revisiting behaviors into click No
models. However,thetemporalhiddenclickmodel(THCM) Satisfied?
proposed by the authors of [18] does not distinguish be-
tween attractivenessandpost-clickrelevance. Furthermore,
THCM allows for only two kindsof transitions namely Yes
P(Ei=1|Ei−1 =1)=α (8) Stop
P(Ei−2=1|Ei−1 =1)=γ, (9)
withα+γ ≤1. Notethatthetransitionprobabilitiesdonot
depend on the location i−1. Furthermore, the probability Figure 2: User browsing behavior.
of examining a position decays exponentially with distance
from thecurrent location. The user browsing behavior in sponsored search can be ex-
plained byFigure2. Afterclickingonadocument,theuser 3.2 Graphical Model
has two choices: she can abandon the session or she can Following standard practice, we will place a Beta prior on
choose to return and click on any other document that has η
i
not been clicked so far. Users abandon a session for one of
two reasons: ηi ∼Beta(αηi,βiη) (14)
and a Dirichlet prior on γi
• In the abandon with satisfaction case, the user found
the information she was looking for in the sponsored γi ∼Dirichlet(αγi). (15)
searchresults. WecapturethisbehaviorviaaBernoulli
ABayesiannetworkspecificationofourmodelcanbefound
random variable si, which takes on a value of 1 when in Figure 3. We denote all vectors in bold and matrices in
theuserissatisfiedafterclickingthedocumentdci−1 at
position ci−1. This, in turn,dependson thepost-click
relevance oftheclickeddocument,whichwedenoteas αη
ρ . In otherwords,
dci−1
P(si =1|ci−1,D)=ρdci−1. (10) i=1:nˆ+1
By default, we will set P(si=1|c0,D)=0. a ti
• Intheabandon withoutsatisfaction case,theusermay
choose to skip ahead to the organic search results or
close the browsing session. This behavior is captured
via another Bernoulli random variable ti, which de- D ci−1 ci
pendsupon theperseverance of theuser. Wewill let
P(ti=1|si =0,ci−1)=ηi, (11)
whereηidenotestheprobabilitythatuserwillabandon
thesession after i clicks.
si vi
Asonecanexpect,theperseveranceoftheuserdecreasesas
thenumberofclicksincrease. Thisallowsourmodeltocap-
ture the effect of externalities in the following natural way: Aγ
The userfatigue parametermultiplied bythepre-click rele-
vance gives theinstantaneous pre-click relevance of a docu-
ment. If an ad appears alongside other highly relevant ads Figure 3: Our click prediction model represented as
which have already received clicks, then its instantaneous a Bayesian Network. Shaded nodes are observed.
pre-click relevance reduces. In other words, the number of
previous clicks observed is a direct measure of externalities capital and bold. If we let s=(s1,...,sn), t=(t1,...,tn),
(relevance of other documents). On the other hand, when V =(v1,...,vn), αη =((αη1,β1η),...,(αηn,βnη)), and Aγ =
the user decides to continue, then two factors influence the (αγ,...,αγ), then our click model can be described as fol-
1 n
position of thenext click: lows:
P(c,s,t,a,V|αη,Aγ,D)= (16)
• She will choose a position based on attractiveness of
P(a|D)×P(s|D)×P(αη|D)×P(Aγ|D)×
the documents presented. We denote the attractive-
tnheasstotfhtehaetdtroaccutmiveenntesastopfostihtieondocciubmyeanctis,aantddaiffsseuremnet nˆY+1P(ci|si,ti,a,vi)×P(ti|ci−1,αη)×
positions is independent of each other. Furthermore, i=1
let a denote then-dimensional vectorof ai and P(vi|ci−1,Aγ)×P(si|ci−1,D),
P(aci =1|D)=θdci. (12) where we define
• The position of the previous click influences the posi- P(ci =0|si =1)=1 (17a)
tionofthenextclick,theso-calledpositionbias effect. P(ci =0|ti =1)=1 (17b)
Ttohri,swishocaseptduirsetrdibuustiniognviisaginvednimbeynγsicoi−n1alrandomvec- P(ci=j|si,ti=0,a,vi)=P(aj =1)·P(vi,j =1)(1.7c)
P(vi,j =1|c1:i−1)=γci−1,j. (13) Iftheuserhasabandonedthesession(si=1orti =1)then
Hereγci−1,j denotestheprobabilitythattheusertran- subsequent valueof ci is set to 0. This is captured in (17a)
sitionstopositionj afterthepreviousclickatposition and (17b) above. On the other hand, if the user decides to
ci−1. We allow for non-zero γci−1,j even if j < ci−1. continue,thentheprobabilityof aclick atthej-thposition
This is different from existing click models which as- (withj 6=0)dependsontheattractivenessofthedocument
sumethatusersscanthelistlinearly,thatisci>ci−1. as well as theposition bias as can beseen from (17c).
3.3 Inference
Table 1: Because of confidentiality concerns we
Global parameters of our model include ηi, and γi for i =
cannot report the size of our data set. Instead, we
1,...,n. Furthermore, for each document d we need to in-
report data statistics which are normalized by the
fer its attractiveness θ and post-click relevance ρ . The
d d
hyper-parameters αη and Aγ are estimated from historical number of sessions in the 0-10 bucket. For each
decile we also show the fraction of sessions which
data, and the other parameters such as ηi, γi, and ρd can
received one to four clicks.
be estimated efficiently by one pass through the click-logs.
DetailscanbefoundinAppendixA.Nextwediscussanovel
Freq. Frac. 1 Click 2 Click 3 Click 4 Click
method for computingθ .
d 0-10 1.0 91.6% 7.3% 1% 0.1%
10-50 0.39 91.5% 7.3% 1.1% 0.1%
3.4 Estimating Attractiveness
50-100 0.16 92.3% 6.6% 1% 0.1%
Asponsoredsearchresultddisplayedinresponsetoaquery 100-500 0.37 93.5% 5.7% 0.7% 0.1%
qcontainsthreecomponentsnamelyatitle,description,and
500-1000 0.15 95.4% 4.1% 0.42% 0.08%
adisplayURL.WeusethefollowingnaiveBayeslikescheme
>1000 1.0 97.94% 1.9% 0.15% 0.01%
toestimateattractiveness: Foreachwordw,excludingcom-
monly occurring stop words, which appears either in the
title, description or the display URL2 we compute Nw the Table 2: First click distribution across the four
numberofoccurrencesofwinDq,andNˆw thenumberofoc- mainline positions.
currencesofw inthesubsetofDq thatreceivedaclick. For
infrequentqueries(searchvolumelessthan50inaoneweek 1 2 3 4
window) we let Dq be the entire training corpus, while for 0.708 0.163 0.0787 0.0503
frequent queries (which occur with frequency greater than
50) we restrict Dq to the documents displayed in response
to q. Let |d| denotethenumberof words in d and estimate 1. Dynamic Bayesian Network (DBN) model: The user
perseverance parameter in DBN (see (7)) was set to
θd = |d1|∀Xw∈dNNˆww. (18) λvai−li1da=tioλn,=ba0s.e0d1ofonrtahlelrie.coTmhmisewnadsatfioounnodfv[4ia].cross-
Inotherwords,foreachwordw∈dweestimatethefraction 2. Independent Click Model (ICM): Since the cascade
of times it occurred in a clicked document for query q, sum model[6]canonlyhandlesingleclicksessions,onecan
the contribution due to each word independently and nor- extendeditbysettingtheuserperseveranceparameter
malizebythelengthofdtoobtainθ . Theadvantageofour λ in (4) to one [11]. This yields theICM.
d
methodisthatweshareinformationacrossdocumentsinthe
3. PositionModel(PM):Inthismodel,theclicksequence
collectionDq. Forinstance,ifawordw occursfrequentlyin
onlydependsonthepositionofanad,andisindepen-
documentswhich are clicked in response to a query,then it
dent of its contents or attractiveness. In other words,
is very likely to be a relevant word for that query and our
the first click is at position one, second click on po-
method gives it a high weight. Another advantage of our
sition two and so on. Because of position bias (see
method is that we are able to address the cold start prob-
Table 2), thisis astrong baseline.
lem; we can estimate the attractiveness of a new sponsored
searchresultbasedonthewordsthatappearintheadcopy. 4. Attractiveness model (AM): The click sequence only
depends on the attractiveness of an ad (section 3.4),
4. EXPERIMENTAL EVALUATION andisindependentofitsposition. Inotherwords,the
Wecollected threeweeksofclick logs from alarge commer- most attractivead is clicked first and so on.
cial searchengine,andusedthefirstweekdataforcomput-
ingpriors,thesecondweekdatafortraining,andthirdweek
4.2 EvaluationPlan
data for testing. Wefilter thedataand only retain sessions
whereatleastonemainlineadwasdisplayed. Ourtestsetup Inthefirstpartofourexperimentalevaluationwearemainly
isclosertoareal-worlddeploymentscenario,andsomewhat interested in understanding how well our algorithm is able
differentfromthecommonlyusedpracticeofrandomlysplit- to infer post-click relevance. After all, this is the main mo-
tingavailabledataintoatrainingandtestset. Inparticular, tivation behind studying click models. Since other models
we retain all queries in the test set, even if they have not donotdistinguishbetweenattractivenessandpost-clickrel-
been observed in the training set. In this case, our estima- evance, we will only compare our algorithm with the dy-
tionispurelybasedonpriors,whicharecomputedfromthe namicBayesiannetwork(DBN)modelof[4]. Detailsofthis
first week of data. We set αγ +βγ =10 and αη+βη =10. experiment can be found in Section 4.3.
i i i i
Table 1 summarizes our training data statistics. Note that
forhighdecile(>1000) queries,therearesignificantlyfewer For our second set of experiments (Sections 4.4 to 4.8) we
three and four click sessions than for tail queries. focus on verifying that our model is able to predict the se-
quence of user clicks accurately. Click models are usually
4.1 Baselines evaluated by computing average perplexity, or the closely
related log-likelihood on the test set. Recall that the per-
Forourexperimentswewillcomparetheperformanceofour
plexity for a single user session is computedas
algorithm against thefollowing four baselines:
2Forsimplicity thedisplayURListreatedasasingleword. p=2−n1Pni=1Cilogqi+(1−Ci)log(1−qi), (19)
where qi is theprobability of observing a click at position i 1
aspredictedbythemodelandCi indicatesifanactualuser OurModel
click was observed at that position. DBN
However, we believe that perplexity alone does not tell the
0.9
full story. For instance, the perplexity scores of our model
n
(1.1932) andDBN (1.1984) are verysimilar, buttheydiffer o
si
vastly in terms of how well they predict a sequence of user ci
e
clicks. Similarly,amodelwhichsimplypredictsqiastheem- Pr
pirical click-through rate, especially for high decile queries, 0.8
achieves very low perplexity but is unable to explain a se-
quenceofuserclicks(alsoseetheresultsfortheAMbelow).
Therefore, weadaptastrongerevaluationcriterionwhichis
designed to answer thefollowing naturalquestions:
0.7
0.2 0.4 0.6 0.8 1
Recall
• How well does the model predict the first click posi-
tion?
• How well does the model predict two, three, and all Figure4: AreaUnderCurveforpredictingrelevance
four click sequences? This is a very stringent evalua- of Ads
tioncriterionformultipleclickmodels,andthemodel
wins only if it predicts all clicks in the sequence cor-
4.4 Predicting the FirstClicked Ad
rectly.
This isamulti-class classification problemwith imbalanced
• Even if the model makes mistakes in predicting the class probabilities. Therefore PM which predicts that the
actual click sequence,we want tounderstandwhether first click happens at position one and has an accuracy of
the actual click sequence ranks high in terms of log- 72.27% on our test dataset. The AM has an accuracy of
likelihood. 73.5% while the DBN and ICM accuracies are 72.7% and
71.7% respectively. In our model we predict the ad which
• Does the model predict the top two and three clicks
has the maximum click propensity as the first clicked ad.
correctly? Inotherwords,thepredictedclicksequence
This results in an accuracy of 79.59%, a gain of over 8.2%
need not be in the same order as the actual click se-
as compared to the other models. To further understand
quencebutthemodelneedstoaccuratelypredictwhic
theperformance of models across different query deciles we
hadswereclicked. Forinstance,iftheactualclick se-
plottheaccuracyofdifferentmodelsinFigure5. Notethat
quence was {1,2} and the model predicts {2,1} then
weconsistentlyoutperformallothermodelsacrossallquery
thisisconsideredtobeacorrectpredictionasperthis
deciles, with the gains being more substantial for the lower
metric. Note that predicting the top four clicks in a
decile queries which are harder tolearn.
four click session will always be100% accurate.
• How does the model fare in terms of predicting and 1.4
rankingof sessions with reverse clicks? OurModel
1.2 DBN
ICM
Intuitively, predicting well on higher decile queries is eas-
PM
ier than lower decile queries. In order to understand the 1
strengthsandweaknessesofvariousmodel,foralltheabove AM
y
caseswereporttheirperformanceacrossdifferentquerydeciles. c
ra 0.8
u
c
4.3 Predicting Post-ClickRelevance c
A
0.6
We have access to approximately 10,000 (query, ad) pairs
fromthetestdatasetwhichhavebeenlabeledasrelevantor
irrelevant by trained human editors. Note that the human 0.4
editors look into the landing pages to determine the labels.
Following [4] we rank these (q,d) pairs using a score which
0.2
iscalculatedasθ ×ρ (attractivenesstimespost-clickrele-
d d
vscaonrcee,).anTdhewedomcueamseunretsparreecirsaionnkevdsbraesceadll.onTthheerceosmulptsutfeodr 0-10 10-50 50-100 00-500 0-1000 >1000
our model and DBN are plotted in Figure 4. Clearly our 1 50
proposed model outperforms DBN consistently across re-
Query Volume
calls. In particular note that our model has high precision
atlowrecallandisthereforeabletorankrelevantdocuments
higheronthelistascomparedtoDBN.Ourmodelachieves
an AUC of 0.8653 compared to DBN which is only able to Figure5: Accuracy of firstclicksequenceprediction
achieve 0.7843. across different query deciles.
0.6
OurModel OurModel OurModel
0.8
DBN 0.6 DBN DBN
ICM ICM ICM
PM PM PM
0.4 0.6
AM AM AM
acy 0.4
ur
Acc 0.4
0.2
0.2
0.2
0 0 0
0-10 10-50 50-100 100-500 500-1000 >1000 0-10 10-50 50-100 100-500 500-1000 >1000 0-10 10-50 50-100 100-500 500-1000 >1000
QueryVolume QueryVolume QueryVolume
Figure 6: Accuracy of two (L), three (M), four (R) click sequence prediction across different query deciles.
4.5 Predicting the Entire ClickSequence
Table 4: Overall Actual Click Sequence Ranking.
We compute click propensity for each click sequence with
Numberof Clicks Average Rank
the same length as the actual click sequence. We say that
1 1.25 ± 0.62
our model predicted the click sequence only when we pre-
2 2.25 ± 2.27
dict theentireclick sequencecorrectly. Table3summarizes
3 2.67 ± 4.17
theresults. Ourmodelgains8.6%(resp.25%)ontwo-click-
4 1.87 ± 3.36
sequence prediction and 3.4% (resp. 37%) on three-click-
sequencepredictionoverDBN(resp.PM).Themodelaccu-
racies are comparable when predictingfour click sequences,
which are only a very small fraction of thedata. click sequence among the top 3 or 4 of all the possible per-
mutations of click sequence.
Although AM is very competitive when predicting the first
Figure7showshowtherankingoftheactualclicksequence
click,itisunabletopredictlongerclicksequencesaccurately.
varies across different query deciles. As can be seen from
This is because users do not decide to click on an ad solely
Table1,longerclicksequencesaremorefrequentinthelower
based on its attractiveness, position bias and past click ex-
periencealso playsanimportantroleandneedstobetaken deciles,andhenceourmodelhasmoredatatolearninthese
into account. deciles. Consequently the average rank of the actual click
sequence as predicted by our model for two, three and four
clicks is lower for lower decile queries.
Table 3: Accuracy of Predicting the Entire Click
Sequence. 4.7 Predicting Top Clicks
# Clicks OurModel DBN PM AM ICM
Herewe ignore theorder and focus on understandingif our
2 36.71 33.79 29.35 25.44 10.94
modelisabletopredictthepositions oftheclicksintwoand
3 32.26 31.27 23.54 12.80 23.55
threeclicksessions. Table5shows weout perform all other
4 47.94 48.58 48.58 6.86 48.58
models in predicting top 2 and 3 click positions. Figure 8
shows accuracies across querydeciles.
Figure6showstheaccuracyofthemodelsfortwo,threeand
fourclickpredictionacrossdifferentquerydeciles. Notethat
we consistently outperform other models across all query Table 5: Accuracy of Predicting the Locations of
deciles in two click sequence prediction. In three click se- Top Clicks.
quenceprediction,DBNperformsslightlybetterintopdeciles # Clicks OurModel DBN PM AM ICM
thanourmodel. Sincenumberof queriesandsession intail 2 54.93 51.11 43.89 48.5 16.07
decilesarehigherthantopdeciles,weachievebetteroverall 3 62.46 59.14 44.98 59.7 45.01
performance than DBN.
4.8 Predicting ReverseClickSequences
4.6 Ranking oftheActual ClickSequence
Inour finalexperiment wefocus only on thesessions where
In the previous section we focused on predicting the entire at least onepair of clicks was observed in reverse order(re-
click sequence correctly. Here we focus on how we rank the verse click sessions). As before, we focus on the accuracy
actual click sequences in terms of log-likelihood. For this, (Table 6), rank of the actual click sequence (Table 7), and
we compute the log likelihood for all permutations of click accuracy of predicting the location of the top clicks (Ta-
sequences. We then sort the click sequences based on the ble 8). Figure 9 shows the accuracy and rank of the actual
computed log likelihood and check the rank of the actual click sequencefordifferent deciles, andFigure 10shows the
click sequence in this list. Table 4 summarizes our results. accuracy of predicting positions of clicks in two and three
As can be seen, our model on the average ranks the actual click sessions.
1 1
OurModel OurModel
DBN DBN
0.8 ICM 0.8 ICM
PM PM
AM AM
y y
ac 0.6 ac 0.6
ur ur
c c
c c
A A
0.4 0.4
0.2 0.2
0-10 10-50 50-100 100-500 500-1000 >1000 0-10 10-50 50-100 100-500 500-1000 >1000
QueryVolume QueryVolume
Figure 8: Accuracy of predicting positions of two (L) and three (R) click sessions across different query
deciles.
0.12
FirstTwoClicks FirstTwoClicks
0.1 FirstThreeClicks nce 8 FirstThreeClicks
AllFourClicks ue AllFourClicks
q
8·10−2 Se 6
y ck
Accurac 6·10−2 ctualCli 4
4·10−2 A
of
k 2
2·10−2 an
R
0 0
0-10 10-50 50-100 100-500 500-1000 >1000 0-10 10-50 50-100 100-500 500-1000 >1000
QueryVolume QueryVolume
Figure 9: Accuracy (L) of predicting the entire click sequence, and ranking (R) of the actual click sequence
across different query deciles for sessions where at least one pair of clicks was observed in reverse order.
1
OurModel OurModel
0.8 DBN DBN
PM 0.8 PM
AM AM
0.6 ICM ICM
y y
ac ac 0.6
ur ur
c c
c c
A A
0.4
0.4
0.2 0.2
0-10 10-50 50-100 100-500 500-1000 >1000 0-10 10-50 50-100 100-500 500-1000 >1000
QueryVolume QueryVolume
Figure 10: Accuracy of predicting positions of two (L) and three (R) click sessions across different query
deciles for sessions where at least one pair of clicks was observed in reverse order.
SingleClick Two Click Three Click Four Click
Table 8: Accuracy of Predicting Locations of Top
Clicks when Actual Click Sequence has Reverse Or-
der.
10
# Clicks OurModel DBN PM AM ICM
e
c
n 2 55.74 51.12 45.66 49.50 16.09
e
u 3 61.51 58.20 44.74 59.14 44.75
q 8
e
S
k
c
Cli 6 Online sponsored search auctions are priced using a Gener-
alized Second Price (GSP) auction mechanism. Inherent in
al
u thismodelistheassumptionthattheclicksontheadshap-
t
Ac 4 penindependentofeachother. Wearecurrentlyworkingon
of developingpricingmechanismswhichwilltakeintoaccount
k theclickingbehaviorpredictedbyourmodel. Oureffortsare
an 2 also directed towardsimprovingthereverseclick prediction
R
accuracy of our model by using stronger priors. Finally, we
arealsoworkingtowardsmakingourmodelrobusttonoise.
0
0-10 10-50 50-100 00-500 0-1000 >1000 APPENDIX
1 50 A. PARAMETER ESTIMATION
QueryVolume Our training data consists of m sessions, and we assume
that thedocument set Dk was displayed in thek-thsession
inresponsetoqueryqk andweobservedaclicksequenceck
of length nˆk. Ifwe assume that the sessions are iid, then
Figure 7: Actual click sequence ranking across dif-
ferent query deciles. m
P ck | Dk = P ck|Dk . (20)
(cid:16)n o n o(cid:17) Y (cid:16) (cid:17)
k=1
Table 6: Accuracy of Predicting the Entire Click Letckdenotethelocationofthei-thclickinthek-thsession,
i
Sequence when Actual Click Sequence has Reverse I(·)denotetheindicatorvariableofanevent,D′ denotethe
Order. set of uniquequery-documentpairs in our session data and
# Clicks OurModel DBN PM AM D=|D′|. Plugging in (16) into (20), using (10)–(17) shows
2 8.1 0.0 0.0 17.70 that P ck | Dk
3 0.6 0.0 0.0 9.4 (cid:0)(cid:8) (cid:9) (cid:8) (cid:9)(cid:1)
n n
4 0.1 0.0 0.0 3.9 ∝ θψd × γδjkp(γ )
Y d YY jk jk
d∈D′ j=1k=1
Asexpectedourmodelpredictionaccuracy forreverseclick ×Yn (1−ηj)βjηjβj′ p(ηj)× Y (1−ρd)κdρκd′d,
sessionsislowsincethereverseclicksessionsareonlyaround j=0 d∈D′
30% ofthemulti-clicksessions. Eventhoughouraccuracies
where we define
are lower than the attractiveness model, our model ranks
actual reverse click sequences in top 5 of all the possible m nˆk
permutationsofclick sequence. Ourmodeloutperformsall ψd =XXI(cid:16)dcki =d(cid:17) (21)
other models in most of the experiments and still achieves k=1i=1
betterrankingofactualclicksequenceevenwhenclickshave m nˆk
been observed in reverse order. δi,j =XXI(cid:16)ckt−1 =i(cid:17) and I(cid:16)ckt =j(cid:17). (22)
k=1t=1
ψ countsthenumberof times documentd occursin {D },
5. CONCLUSIONAND FUTUREWORK d k
δi,j countsthenumberof timeswe observedaclick at posi-
We presented a new multiple click model for modeling how
tion j after havingobserved a click at position i. Notethat
users interact with and click on sponsored search results.
when i< j δi,j counts the in-sequence users clicks, and for
Our model can handle reverse click sequence and compre-
all i>j, δi,j capturesout of sequenceclicks.
hensively outperforms othermodels across a numberof dif-
ferent metrics in extensiveempirical evaluation.
m m
βj =XI(cid:16)nˆk >j(cid:17) and βj′ =XI(cid:16)nˆk =j(cid:17) (23)
Table 7: Overall Reverse Click Sequence Ranking. k=1 k=1
Numberof Clicks Average Rank (24)
2 3.15±3.01
βj countsthenumberof sessions with greaterthanj clicks,
3 3.89±5.46 whileβ′ countsthenumberofsessionswithexactlyj clicks.
j
4 2.67±4.56 It is easy to see that βj =Pj′>jβj′′ for j 6=0.
Research and development in information retrieval,
m SIGIR’08, pages 331–338, 2008.
κ = I d 6=d and I d =d for some i, (25) [8] ArpitaGhosh and Mohammad Mahdian. Externalities
d Xk=1 (cid:16) cknˆk (cid:17) (cid:16) cki (cid:17) in online advertising. In Proceedings of the 17th
international conference on World Wide Web, WWW
m ’08, pages 161–168, 2008.
and κ′d =Xk=1I(cid:16)dcknˆk =d(cid:17). (26) [9] ELayue-rtaraAck.iGngraannkaal,yTsihsoorfstuesnerJboaechhaivmiosr, ianndwwGweriseGaarcyh..
κ′ countsthenumberof sessionswhich endedafterclicking InProceedings of the 27th annual international ACM
d SIGIR conference on Research and development in
onthedocumentd,whileκ isthenumberofsessionswhich
d
information retrieval, SIGIR’04, pages 478–479, 2004.
did not end after a click on document d.
[10] Fan Guo, Chao Liu, AnithaKannan,Tom Minka,
Michael Taylor, Yi-Min Wang,and Christos Faloutsos.
As a consequenceof using theBeta prior, we can estimate
Click chain model in web search. In Proceedings of the
ηj = βj′ +αβjηj′ ++αβηjj+βjη and ρd = κ′dκ+′dκd (27) 1W8tWhWinte’0r9n,aptiaogneasl1c1o–n2fe0r,e2n0c0e9o.n World wide web,
[11] Fan Guo, Chao Liu, and YiMin Wang. Efficient
Furthermore, sinceγi ∼Dir(αγi), we can estimate: multiple-click models in web search. In Proceedings of
γi,j = Pjδ(iδ,ji,+j+αγiα,jγi,j). (28) tS2h0eea0r9Sc.ehcoannddADCatMa MInitneirnnga,tiWonSaDlMCo’n0f9e,repnacgeeson12W4–e1b31,
[12] T. Joachims. Optimizing search engines using
A.1 TimeComplexity
clickthrough data. InProceedings of the ACM
In order to perform the updates (27) and (28) we need to Conference on Knowledge Discovery and Data Mining
compute the quantities defined in (21) – (26). However, all (KDD). ACM, 2002.
these quantities involve simple counts, which can be com- [13] Thorsten Joachims, LauraGranka, Bing Pan, Helene
puted by using one-pass through the click logs. Therefore, Hembrooke,and Geri Gay. Accurately interpreting
we can conclude that inference in our model is extremely clickthrough dataas implicit feedback.In Proceedings
salable. of the 28th annual international ACM SIGIR
conference on Research and development in
B. REFERENCES information retrieval, SIGIR’05, pages 154–161, 2005.
[1] Gagan Aggarwal, Ashish Goel, and Rajeev Motwani. [14] Chao Liu, Fan Guo, and Christos Faloutsos. BBM:
Truthfulauctions for pricing search keywords.In bayesian browsing model from petabyte-scaledata. In
Proceedings of the 7th ACM conference on Electronic Conference on Knowledge Discovery and Data Mining,
commerce, pages 1–7, 2006. KDD’09, pages 537–546, 2009.
[2] AmrAhmed,A.J. Smola, Choon HuiTeo, and [15] Matthew Richardson,Ewa Dominowska, and Robert
S.V. N.Vishwanathan.Fair and balanced: Learning Ragno. Predicting clicks: Estimating theclick-through
to present news stories. InWeb Science and Data rate for new ads. In World Wide Web Conference,
Mining (WSDM),2012. 2007.
[3] Azin Ashkanand Charles L.A.Clarke. Modeling [16] Ramakrishnan Srikant,Sugato Basu, NiWang, and
browsing behavior for click analysis in sponsored DarylPregibon. Userbrowsing models: relevance
search. In Proceedings of the 21st ACM Conference on versusexamination. In Proceedings of the 16th ACM
Information and Knowledge Management, pages SIGKDD international conference on Knowledge
2015–2019, 2012. discovery and data mining,KDD’10, pages 223–232,
2010.
[4] Olivier Chapelle and YaZhang. A dynamicbayesian
network click model for web search ranking.In [17] ChenyanXiong, Taifeng Wang, WenkuiDing, Yidong
Proceedings of the 18th International Conference on Shen,and Tie-Yan Liu.Relational click prediction for
World Wide Web, WWW ’09, pages 1–10, 2009. sponsored search. InProceedings of the fifth ACM
international conference on Web search and data
[5] YeChen and Tak W.Yan.Position-normalized click
mining,WSDM’12, pages 493–502, 2012.
prediction in search advertising. InProceedings of the
18th ACM SIGKDD international conference on [18] DanqingXu,YiqunLiu,Min Zhang, ShaopingMa,
Knowledge discovery and data mining,KDD’12, pages and LiyunRu.Incorporating revisiting behaviors into
795–803, 2012. click models. In Proceedings of the fifth ACM
international conference on Web search and data
[6] Nick Craswell, OnnoZoeter, Michael Taylor, and Bill
mining,WSDM’12, pages 303–312, New York,NY,
Ramsey.An experimentalcomparison of click
USA,2012. ACM.
position-bias models. InProceedings of the
international conference on Web search and web data [19] WanhongXu,Eren Manavoglu, and Erick Cantu-Paz.
mining,WSDM’08, pages 87–94, 2008. Temporal click model for sponsored search. In
Proceedings of the 33rd international ACM SIGIR
[7] Georges E. Dupretand Benjamin Piwowarski. A user
conference on Research and development in
browsing model to predict search engine click data
information retrieval, SIGIR’10, pages 106–113, 2010.
from past observations. In Proceedings of the 31st
annual international ACM SIGIR conference on