Managing Risk of Bidding in Display Advertising

Haifeng Zhang†, Weinan Zhang‡,∗, Yifei Rong♯, Kan Ren‡, Wenxin Li†, Jun Wang♮
†Peking University, ‡Shanghai Jiao Tong University, ♯YOYI Inc., ♮University College London
[email protected], [email protected], [email protected]

ABSTRACT
In this paper, we deal with the uncertainty of bidding for display advertising. Similar to financial market trading, real-time bidding (RTB) based display advertising employs an auction mechanism to automate impression-level media buying, and running a campaign is no different from an investment in acquiring new customers in return for additional converted sales. Thus, how to optimally bid on an ad impression to drive the profit and return-on-investment becomes essential. However, the large randomness of user behaviors and the cost uncertainty caused by the auction competition may result in a significant risk in the campaign performance estimation. In this paper, we explicitly model the uncertainty of user click-through rate estimation and auction competition to capture the risk.

in this paper, currently handles 5 billion ad transactions daily. Moreover, Fikisu DSP claims to process 32 billion ad impressions daily [2]; Turn reports handling 2.5 million per second at peak time [29]. To get a picture of the scale, the New York Stock Exchange trades around 12 billion shares daily [1], while the Shanghai Stock Exchange trades about 14 billion shares daily [4]. It is fair to say that the transaction volume from display advertising has already surpassed that of the financial market.

To compare further, similar to the financial market, ad impression trades are also largely automated by auction mechanisms: while the financial market uses a double auction to create bid and ask quotes [6], RTB (Real-Time Bidding) display advertising adopts the second-price auction to gather bid quotes from advertisers once an impression is being generated [38, 23]. As it is possible now to
We borrow an idea from finance and derive the value at risk for each ad display opportunity. Our formulation results in two risk-aware bidding strategies that penalize risky ad impressions and focus more on the ones with higher expected return and lower risk. The empirical study on real-world data demonstrates the effectiveness of our proposed risk-aware bidding strategies: yielding profit gains of 15.4% in offline experiments and up to 17.5% in an online A/B test on a commercial RTB platform over the widely applied bidding strategies.

Keywords
Risk-aware Bidding Strategy, Value at Risk, Demand-Side Platform, Real-Time Bidding, Display Advertising

1. INTRODUCTION
Display advertising has become a significant battlefield for big data [33].

track user actions resulting from an online campaign, advertising optimization increasingly resembles financial market trading and tends to be driven by the marketing profit and return-on-investment (ROI). That is, there is an explicit and measurable campaign goal of acquiring new users from the campaign in order to obtain additional sales from the acquired users. Thus, how to properly bid on an ad impression to drive the profit and ROI becomes essential to performance-driven campaigns [25, 39].

Bidding strategies are normally built by estimating the utility of each ad impression, which is commonly done by predicting the underlying user's click-through rate (CTR) or conversion rate (CVR) [17]. Existing solutions for predicting CTR range from linear models [13, 21, 15, 17, 26] and gradient boosting decision trees (GBDT) [31] to factorization machines [24]. All of them aim to return a predicted CTR value of the given ad impression. However, we never know whether such a point estimation is confident enough.
As the advertising transactions are aggregated across websites in real time, the display advertising industry has a unique opportunity to understand the internet traffic, user behaviors, and online transactions. In this paper, we have used datasets from a leading DSP (Demand Side Platform) in China, iPinYou, which processes up to 18 billion ad impressions per day [3]. The advertising platform YOYI, which has deployed our proposed algorithms

∗Weinan Zhang is the corresponding author of this paper.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/[email protected].
WSDM 2017, February 06-10, 2017, Cambridge, United Kingdom
© 2017 ACM. ISBN 978-1-4503-4675-7/17/02...$15.00
DOI: http://dx.doi.org/10.1145/3018661.3018701

The true underlying CTR may heavily deviate from the predicted value, given the significant uncertainty of the underlying user behavior. Such utility fluctuation, along with the cost uncertainty caused by the auction competitions [5], results in a significant risk in the campaign profit estimation which should not be ignored.

In this paper, we depart from conventional CTR point estimations and explicitly model the CTR distribution to capture the uncertainty (risk) of the utility measure for each potential ad impression. Our idea is inspired by value at risk in finance: a statistical technique used to measure and quantify the level of financial risk of an investment [19]. The value at risk of an ad display opportunity is generally defined as the lower bound of the ad impression value guaranteed by a specific probability (risk). With the measured CTR prediction and market price risks, we propose two practical risk-aware bidding strategies to handle the uncertainty from both utility estimation and market competition. Our methods are evaluated on two large-scale real-world ad datasets. With campaign profit as the key performance indicator (KPI), we find that our risk-averse bidding strategies, which penalize the bids with highly uncertain CTR (or profit) and reward the confident ones, yield a 15.4% profit gain over a linear bidding strategy with a conventional logistic regression CTR estimator [25] and largely outperform the risk-neutral and risk-seeking bidding.

Furthermore, we have deployed the risk-averse bidding strategies on YOYI's DSP and conducted a 7-day online A/B test. In the live test, our proposed risk-aware bidding strategies bring 17.5% higher campaign profit and 61.5% higher CTR over the conventional ones by effectively saving money on uncertain and low-value opportunities, which

ding strategy is to bid the estimated true value for each ad impression. For performance-driven campaigns, the predefined true value is normally based on the economic value of the user actions, such as a click or a conversion. The expected true value for a specific impression is estimated as the action value multiplied by the action rate [17, 8]. However, the truth-telling bidding strategy is optimal only when the budget and auction volume are not considered. With the campaign budget and lifetime auction volume constraints, the optimal bidding strategy may not be truth-telling. Extending the truth-telling bidding strategy, the authors in [25] proposed the generalized bidding function with a linear relationship to the predicted CTR for each ad impression being auctioned.
verifies the practical effectiveness of our solutions for managing advertising bidding risk.

The rest of this paper is organized as follows. We discuss the related work in Section 2. Our CTR distribution modeling is provided in Section 3. In Section 4, the concept of value at risk and the derived risk-aware bidding strategies are discussed. Offline and online experimental results are provided in Sections 5 and 6, respectively. We finally conclude this paper and discuss our future work in Section 7.

2. RELATED WORK
User Response Prediction. Predicting the probability of a specific user response, e.g., CTR and CVR, is a key function for performance-driven online advertising [13, 21, 15]. The CTR estimation models applied today are mostly linear. Logistic regression is the most widely used model, normally trained by stochastic gradient descent (SGD) [17, 26]. The authors in [21] proposed to use an online learning algorithm called follow-the-regularized-leader (FTRL) to train logistic regression from the streaming data. The model successfully bypasses the learning rate update problem in SGD and works effectively in practice. Bayesian probit regression [13] is another linear model for online learning where the feature weights are modeled with a distribution and the model is learned by updating the weight posterior. Binary naive Bayes [14] is also a popular linear model, which assumes the features are conditionally independent.

Linear models are simple and effective in learning, but may fail to capture the interactions between the assumed (conditionally) independent raw features [13]. By contrast, non-linear models are capable of learning feature interactions in various ways and could potentially improve prediction performance [24, 31]. Gradient boosting decision trees (GBDT) [31, 15] are a straightforward non-linear model to capture feature interactions.

Compared to [25], the authors in [40] proposed a functional optimization framework to directly optimize the bidding function, where the derived function showed that an optimal bidding function could be non-linear w.r.t. predicted CTR. The non-linearity is closely related to the distribution of market price [5]. Extending to the multiple-campaign bid optimization task, the authors in [39] further devised linear and non-linear bidding functions from the functional optimization framework. Extending previous strategies, our work introduces a new factor, i.e., the standard deviation of predicted CTR, to be considered by the bidding function in addition to predicted CTR.

Risk Management and Applications. Risk is a consequence of action taken in spite of uncertainty [22]. The objective of risk management is to assure that uncertainty does not, to some extent, deflect the business from its goals [9].

Modern portfolio theory (MPT) [20] originates from modeling uncertainty of the return of combinations of multiple financial assets. It presents a quantitative method to measure such uncertainty (or risk) and embeds it into the decision making of investment [16]. In MPT, the variance of the return of each asset is modeled as its risk. Then the risk and expected return of a portfolio of invested assets are quantified by the fund allocation, the mean return of the assets and their covariance matrix [20, 16]. MPT utilizes the mean-variance analysis to make an investment portfolio for any tradeoff between the risk and the expected return, or w.r.t. a reference investment such as bank rates [28]. With such advantages, MPT has been adopted almost everywhere in financial investment [11].

Recently, the ideas of risk management have been introduced to information retrieval, such as document ranking in web search [34, 35] and diversification in top-N recommendation [30, 43], to improve the model robustness or to capture users' psychological satisfaction under uncertainty. In the area of recommender systems, bandit solutions [42] have
Moreover, latent factor models, particularly factorization machines (FMs) [24], map each binary feature into a low dimensional continuous space, and the feature interaction is automatically explored via vector inner product.

Real-Time Bidding Strategies. The emergence of ad exchanges for display advertising in 2009 [23] provides an automatic trading mechanism for advertisers to buy media inventory at impression level and determine the acceptable price via second price auction [38].

The authors in [12] proposed an algorithm that made dynamic bidding decisions to achieve an optimal delivery with the budget constraint. In [8], the bid price from each campaign was adjusted by the publisher or the supply-side platform in real time and the goal was to maximize the publisher side profit. Borrowing the idea of the optimal truth-telling bidding in sponsored search [10], a basic bid-

been proposed to model confidence interval to balance exploration and exploitation in a risk-seeking fashion.

Computational advertising is associated with a certain level of deficit risk, particularly for performance-driven campaigns as the goal is to acquire new users and gain more sales from them. The risk comes from the dynamics of the market and the user online behaviors [32]. The authors in [39] proposed to measure campaign-level risk and return in a special case of arbitrage between CPM and CPA. Compared to [39], our work focuses on single campaign optimization, and our risk is modeled from the uncertainty of user response and market competition at impression-level. Generally, our work borrows the concept of value at risk from finance to derive risk-aware bidding strategies intending to reasonably allocate budget between uncertain impressions and confident impressions and achieve a campaign-level profit gain, which is unlike in finance where risk control is only for balancing return and risk at item-level (impression-level in RTB).

Table 1: Notations and descriptions
Notation    Description
x           The features of bid request.
w           The weights of features in CTR estimation function.
y           The true label of user response.
ŷ           The estimated label of user response.
µ           The mean vector of w.
S           The covariance matrix of w.
q_i         The precision (reciprocal of variance) of w_i.
v           The pre-defined value of positive user response.
b           The bid price determined by bidding strategy.
z           The market price determined by second-price auction.
α           The coefficient balancing utility and risk.
R           The utility of advertiser.
p()         The probability density function.
P()         The probability mass function.
σ()         The sigmoid function.

3. CTR DISTRIBUTION MODELING
RTB is generally a two-phase process: 1) CTR estimation based on features of bid request; 2) bid price determination based on estimated CTR. In this section, we explicitly model the CTR distribution in order to deal with the uncertainty of a CTR estimation. Next, in Section 4 we shall propose risk-aware bidding strategies from the inferred CTR distribution and the market price distribution.

In this paper, we don't provide down-to-earth background of RTB.

instance $(x, y)$, the posterior distribution of $w$ becomes

$$p(w \mid x, y) = \frac{p(y \mid x, w)\, p(w)}{\int_{w'} p(y \mid x, w')\, p(w')\, dw'} \tag{4}$$

$$\propto \sigma(w^T x)^{y} \bigl(1 - \sigma(w^T x)\bigr)^{(1-y)} \prod_i \sqrt{\frac{q_{old,i}}{2\pi}}\; e^{-\frac{(w_i - \mu_{old,i})^2 q_{old,i}}{2}},$$

where $\mu_{old,i}$ and $q_{old,i}$ are the prior parameters of the $i$-th dimension of $w$ before observing the data instance $(x, y)$.

Approximation of Posterior. Since the posterior Eq. (4) is computationally complex, we maintain a Laplace approximation [7] to keep it consistent with the prior. There are alternative approximate inferences such as variational inference (VI) [7]. In this paper we adopt Laplace approximation due to its simpler implementation and lower computational cost than VI, which involves extra variational parameters and invokes a time-consuming EM algorithm for training. The Gaussian approximation to the posterior distribution takes the form

$$p(w) = \mathcal{N}(w \mid \mu_{new}, S_{new}), \tag{5}$$

where $\mu_{new}$ is defined by the $w_{MAP}$ which maximizes the logarithmic posterior:

$$\mu_{new} = \arg\max_w \; y \ln \sigma(w^T x) + (1-y) \ln\bigl(1 - \sigma(w^T x)\bigr) - \frac{1}{2} \sum_i q_{old,i} (w_i - \mu_{old,i})^2 + \text{const}, \tag{6}$$
For more details about RTB, we refer to [33]. For clarity, we summarise notations in this paper in Table 1.

3.1 Preliminary: Bayesian Logistic Regression
We propose to use a Bayesian logistic regression to model the CTR distribution due to the following reasons: (i) logistic regression (LR) has been widely deployed as the CTR prediction model in most RTB ad platforms [17, 25] and our model is a natural extension to tackle the uncertainty of a CTR estimation with LR; (ii) we adopt a Bayesian treatment to model uncertainty since it has been well studied by previous works [41, 13] for CTR estimation; (iii) although Bayesian probit regression [41, 13] has the potential to model the uncertainty, its probit activation function has no closed form, and thus it is of low computational cost-effectiveness in RTB.

For readability, in this section, we present a preliminary on Bayesian logistic regression, while for details, we refer to [7]. For a multi-dimensional feature vector $x$ representing the input ad display opportunity, the conventional logistic regression estimates the CTR by:

$$\hat{y} = \sigma(w^T x) = \frac{1}{1 + e^{-w^T x}}, \tag{1}$$

where $\sigma$ is the sigmoid function and $w$ is the weight vector of logistic regression.

whose SGD updating is

$$\mu_{new,i} \leftarrow \mu_{new,i} + \eta \cdot \Bigl( \bigl(y - \sigma(\mu_{new}^T x)\bigr) x_i - q_{old,i} (\mu_{new,i} - \mu_{old,i}) \Bigr) \tag{7}$$

and $S_{new}$ is given by the inverse of the matrix of second derivatives of the negative log likelihood, which satisfies

$$S_{new}^{-1} = -\nabla\nabla \ln p(\mu \mid x, y) = S_{old}^{-1} + \sigma(\mu^T x)\bigl(1 - \sigma(\mu^T x)\bigr)\, x x^T.$$

As we follow [41, 13] to assume the $w_i$ are independent of each other and thus $S$ is diagonal, we have the updating of precision parameters as

$$q_{new,i} = S^{-1}_{new,i,i} = q_{old,i} + \sigma(\mu^T x)\bigl(1 - \sigma(\mu^T x)\bigr)\, x_i^2. \tag{8}$$

3.2 Predicted CTR Distribution
Equipped with a Bayesian logistic regression with a Laplace approximation of the parameter posterior, we are ready to propose a simple yet novel solution for modeling the CTR distribution. Note that the CTR itself, a probability estimate that a click event occurs on an impression with feature $x$, takes the form

$$\hat{y} = P(y = 1 \mid x) = \sigma(w^T x), \quad w_i \sim \mathcal{N}(\mu_i, q_i^{-1}), \tag{9}$$
The likelihood of observing the correct binary click label $y$ given features $x$ and weights $w$ is

$$p(y \mid x, w) = \sigma(w^T x)^{y} \bigl(1 - \sigma(w^T x)\bigr)^{(1-y)}. \tag{2}$$

In the Bayesian version of logistic regression, $w$ is modeled as a random variable with a p.d.f. $p(w)$. Thus the marginal conditional probability $p(y \mid x)$ is

$$p(y \mid x) = \int_w p(y \mid x, w)\, p(w)\, dw. \tag{3}$$

On $w$, we follow [41, 13] to adopt a Gaussian prior $\mathcal{N}(\mu_0, q_0^{-1} I)$, which is a practical setting. After observing a data

where $\hat{y}$ denotes the CTR random variable that generates the binary observation $y$, so our goal is to estimate the distribution of $\hat{y}$. As introduced in Section 3.1, we assume each $w_i$ follows an independent Gaussian, so $\sum_i w_i x_i$ also follows a Gaussian, $\mathcal{N}(\sum_i \mu_i x_i, \sum_i q_i^{-1} x_i)$. Thus, we have

$$\hat{y} = \sigma(a), \quad a \sim \mathcal{N}\Bigl(\sum_i \mu_i x_i, \; \sum_i q_i^{-1} x_i\Bigr). \tag{10}$$

Consider that if random variables $x$ and $y$ satisfy $y = g(x)$ and $g^{-1}$ is monotonic and differentiable, we have [37]

$$p_y(y) = p_x\bigl(g^{-1}(y)\bigr) \left| \frac{\partial g^{-1}(y)}{\partial y} \right|. \tag{11}$$

[Figure 1 plot omitted. Two panels of p(CTR) against CTR on [0, 1]: "CTR Distribution, q=30" with curves for µ = 0, -0.1, -0.2, and "CTR Distribution, µ=-0.1" with curves for q = 50, 30, 10.]
Figure 1: An illustration of the proposed CTR distribution against its parameters µ and q in Eq. (12).

In our case, $\sigma^{-1}(\hat{y}) = \ln \hat{y} - \ln(1 - \hat{y})$ is monotonic and differentiable within $(0, 1)$, so we obtain the closed form of $\hat{y}$'s p.d.f.

$$p_{\hat{y}}(\hat{y}) = \frac{1}{(\hat{y} - \hat{y}^2) \sqrt{2\pi \sum_i q_i^{-1} x_i}} \; e^{-\frac{\left(\sigma^{-1}(\hat{y}) - \sum_i \mu_i x_i\right)^2}{2 \sum_i q_i^{-1} x_i}}, \tag{12}$$

[Figure 2 plot omitted. Three panels: "CTR Distribution" (p(y) against CTR y), "Market Price Distribution" (p(z) against market price z), and "Profit Distribution" (p(vy-z) against profit vy-z), annotated with P(vy-z < 0 | b=84) = 16.5%.]
Figure 2: An example of the CTR, market price and profit distribution when bidding the expected utility value. The profit p.d.f. on 0 is the probability of losing the auction, resulting in a peak.
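The diagonal Laplace update of Eqs. (7) and (8) and the closed-form density of Eq. (12) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`laplace_update`, `ctr_pdf`, `integrate`), the step size `eta` and the number of gradient steps are hypothetical choices made for the example. The variance of `a` is computed as the sum of `x_i**2 / q_i`, which coincides with the sum of `x_i / q_i` written in Eq. (10) when the features are binary.

```python
import math


def sigmoid(a):
    """Logistic function of Eq. (1)."""
    return 1.0 / (1.0 + math.exp(-a))


def laplace_update(mu_old, q_old, x, y, eta=0.01, steps=500):
    """Posterior update for one instance (x, y) with a diagonal covariance.

    mu_old, q_old: prior means and precisions of the weights.
    eta, steps: hypothetical SGD settings (eta must stay below 2 / max(q)).
    """
    mu_new = list(mu_old)
    for _ in range(steps):
        p = sigmoid(sum(m * xi for m, xi in zip(mu_new, x)))
        # Eq. (7): gradient ascent on the logarithmic posterior of Eq. (6).
        mu_new = [m + eta * ((y - p) * xi - q * (m - m0))
                  for m, m0, q, xi in zip(mu_new, mu_old, q_old, x)]
    p = sigmoid(sum(m * xi for m, xi in zip(mu_new, x)))
    # Eq. (8): the precision of every active feature increases.
    q_new = [q + p * (1.0 - p) * xi * xi for q, xi in zip(q_old, x)]
    return mu_new, q_new


def ctr_pdf(y_hat, mu, q, x):
    """Density of y_hat = sigmoid(a) with Gaussian a, as in Eq. (12)."""
    mean = sum(m * xi for m, xi in zip(mu, x))
    var = sum(xi * xi / qi for qi, xi in zip(q, x))
    logit = math.log(y_hat) - math.log(1.0 - y_hat)
    return (math.exp(-(logit - mean) ** 2 / (2.0 * var))
            / ((y_hat - y_hat ** 2) * math.sqrt(2.0 * math.pi * var)))


def integrate(f, n=20000):
    """Midpoint rule on the open interval (0, 1)."""
    h = 1.0 / n
    return sum(f((k + 0.5) * h) for k in range(n)) * h


if __name__ == "__main__":
    # Ten active features with mu_i = -0.1 and q_i = 30, so a ~ N(-1, 1/3).
    x, mu, q = [1.0] * 10, [-0.1] * 10, [30.0] * 10
    print("total mass:", integrate(lambda t: ctr_pdf(t, mu, q, x)))
    print("mean CTR:", integrate(lambda t: t * ctr_pdf(t, mu, q, x)))
    mu2, q2 = laplace_update(mu, q, x, y=1)
    print("after one click:", mu2[0], q2[0])
```

Integrating the density numerically is a cheap sanity check: the total mass should be close to 1, and the mean CTR for this parameter setting should land near the expected CTR the paper reports for its Figure 2 example.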
which provides an explicit CTR p.d.f. To the best of our knowledge, the above proposed solution has not been studied in previous literature [41, 13, 36]. To understand the nature of the model, Figure 1 plots the CTR distribution against its parameters µ and q, where $x = \mathbf{1}_{10}$. As observed, the p.d.f. presents a single-peak shape in [0, 1] which is similar to the shapes of beta distributions.

It is straightforward to see from Figure 1 that µ and q jointly determine the peak location and sharpness of the CTR p.d.f.; specifically, µ has more influence on the peak location while q has more influence on the distribution sharpness. To understand how it models the CTR prediction uncertainty/confidence, from Eq. (8), we see that each time a data instance with feature $x_i$ is observed, the precision $q_i$ will be updated with a higher value, which in turn contributes to a sharper CTR p.d.f. in Eq. (12). Therefore, for an ad impression with frequent (similar) features, the predicted CTR is of low uncertainty, and vice versa.

4. RISK-AWARE BIDDING STRATEGIES
With our CTR distribution model in Eq. (12), we next investigate the conditional distribution of a utility given a specific bid price for an input bid request. By considering a risk-aware utility as an optimization target, we are ready to derive the corresponding risk-aware bidding strategy.

culated as the utility $r$ minus the cost $z$, which is set as the optimization target of performance-driven campaigns [39].

In a general setting, considering that both the estimated CTR $\hat{y}$ and the cost $z$ are stochastic variables, $\hat{y} \sim p_{\hat{y}}(\hat{y})$ and $z \sim p_z(z)$, the bid optimization problem is to find the optimal bid price for the auction.

Let us first consider a simple case without considering the uncertainty of our estimation, where the goal is to maximize the expected profit $R(b)$ by marginalizing out $\hat{y}$ and $z$:

$$b^* = \arg\max_b \mathbb{E}[R(b)] \tag{14}$$

$$= \arg\max_b \int_{\hat{y}} \int_{z=0}^{b} (v \cdot \hat{y} - z)\, p_z(z)\, dz \cdot p_{\hat{y}}(\hat{y})\, d\hat{y}. \tag{15}$$

Take the derivative w.r.t. $b$ and set it to 0:

$$\frac{\partial \mathbb{E}[R(b)]}{\partial b} = \frac{\partial}{\partial b} \int_{z=0}^{b} p_z(z) \int_{\hat{y}} (v \cdot \hat{y} - z)\, p_{\hat{y}}(\hat{y})\, d\hat{y}\, dz \tag{16}$$
Note that alternative CTR distribution models can also be incorporated in our solution framework.

Specifically, we start from the theoretic derivation of a Bayesian truth-telling bidding strategy and provide an analysis of its risk in Section 4.1. Then we will propose two solutions, discussed in Sections 4.2 and 4.3 respectively.

4.1 Analysis: Bayesian Truth-telling Bidding
The utility $r$ of an ad impression could be defined based on the advertiser's value $v$ on a specific user action, e.g., click or conversion. For example, if the value $v$ is on each click, given an impression with CTR $\hat{y}$ distributed as in Eq. (12), the utility $r$ and its p.d.f. of this impression are

$$r = v \cdot \hat{y}, \quad p_r(r) = p_{\hat{y}}(\hat{y}) / v. \tag{13}$$

Moreover, the cost to win the ad impression comes from the highest bid from other competitors, defined as market price $z$ [5]. The profit of winning this impression is then cal-

$$= p_z(b) \int_{\hat{y}} (v \cdot \hat{y} - b)\, p_{\hat{y}}(\hat{y})\, d\hat{y} = 0 \tag{17}$$

$$\Rightarrow b^* = \int_{\hat{y}} v \cdot \hat{y} \cdot p_{\hat{y}}(\hat{y})\, d\hat{y} = v \cdot \mathbb{E}[\hat{y}] = \mathbb{E}[r]. \tag{18}$$

We see that the optimal bid price is the product of the action value and the expected CTR $\mathbb{E}[\hat{y}]$, which is independent of the market price distribution. If we assume $\hat{y}$ is known and fixed, i.e., $p_{\hat{y}}(\hat{y})$ focuses its mass on a single point, then the optimal bid price is $v \cdot \hat{y}$. Note that the optimality of truth-telling bidding holds for a one-shot auction. When considering campaign budget and auction volume, a coefficient $\phi$ is commonly added to Eq. (18) as $\phi \cdot \mathbb{E}[r]$.

Discussion of Risk. The above classic bidding solution is built on maximizing the expectation of the profit $R(b)$, regardless of its uncertainty. However, a potential problem is that there is a chance a bid is won, but $v \cdot \hat{y}$ is less than $z$, in which case $R(b)$ will be negative. We can obtain the probability of such negative profit $P(R(b) < 0)$ as

$$P(R(b) < 0) = \int_0^1 p_{\hat{y}}(\hat{y})\, P(b > z > v \cdot \hat{y})\, d\hat{y}$$

Considering campaign budget and auction volume, a coefficient $\phi$ is commonly added to Eq. (23), which is similar to truth-telling bidding [25]. This also applies to the strategy proposed in Section 4.3.
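The truth-telling result of Eq. (18) and the risk discussed around Eq. (19) can be checked numerically. The sketch below is an illustration under stated assumptions rather than the paper's code: it takes the setting the paper reports for its Figure 2 example (a Gaussian logit with mean -1 and variance 1/3, a log-normal market price with µ = 4 and σ = 0.5, click value 300, expected CTR 0.283), and the function names are hypothetical. Because the integrand of Eq. (15) is linear in the CTR, only the expected CTR enters the expected profit, which has a closed form for a log-normal market price.

```python
import math
import random


def lognormal_cdf(z, mu, sigma):
    """CDF of a log-normal market price."""
    return 0.5 * (1.0 + math.erf((math.log(z) - mu) / (sigma * math.sqrt(2.0))))


def expected_profit(b, expected_utility, mu, sigma):
    """E[R(b)] = integral over z in (0, b) of (v * E[y] - z) * p_z(z)."""
    # Partial expectation of a log-normal variable on (0, b).
    partial_mean = math.exp(mu + 0.5 * sigma ** 2) * lognormal_cdf(
        b * math.exp(-sigma ** 2), mu, sigma)
    return expected_utility * lognormal_cdf(b, mu, sigma) - partial_mean


def negative_profit_probability(b, v, a_mean, a_var, mu, sigma,
                                n=100000, seed=0):
    """Monte Carlo estimate of P(R(b) < 0), i.e. P(v * y < z < b)."""
    rng = random.Random(seed)
    a_std = math.sqrt(a_var)
    losses = 0
    for _ in range(n):
        y = 1.0 / (1.0 + math.exp(-rng.gauss(a_mean, a_std)))
        z = rng.lognormvariate(mu, sigma)
        if v * y < z < b:  # auction won, but the price exceeds the utility
            losses += 1
    return losses / n


if __name__ == "__main__":
    v, expected_ctr = 300.0, 0.283
    best = max(range(1, 301),
               key=lambda b: expected_profit(b, v * expected_ctr, 4.0, 0.5))
    print("profit-maximizing integer bid:", best)
    print("P(R < 0) at bid 84:",
          negative_profit_probability(84.0, v, -1.0, 1.0 / 3.0, 4.0, 0.5))
```

The grid search recovers a bid next to v·E[ŷ], independent of the market price parameters, while the Monte Carlo estimate shows that a sizeable share of auctions still ends with negative profit at that bid. The estimate depends on the sample size and seed, so it need not match the percentage printed in the paper's figure exactly.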
$$= \int_0^1 p_{\hat{y}}(\hat{y}) \int_{v \cdot \hat{y}}^{b} p_z(z)\, dz\, d\hat{y}, \tag{19}$$

which shows that there is a risk of negative profit for any positive bid price. We illustrate our point in Figure 2 using an example. We set the CTR p.d.f. as Eq. (12) with $\sum_i \mu_i x_i = -1$, $\sum_i q_i^{-1} x_i = \frac{1}{3}$; the market price distribution is assumed to be a log-normal p.d.f. with µ = 4, σ = 0.5; the value per click is set as 300. The expected value of CTR is 0.283, and thus the truth-telling bidding in Eq. (18) would give 84. From our simulation with 10,000 samples, we plot the profit distribution given the truth-telling bid 84; with probability 16.7% the profit is negative. From Eq. (19) we see that the down-side risk, i.e., the probability of the CTR being lower than the mean $\hat{y}$, contributes more to the probability of negative profit; the higher the standard deviation (or variance) of $\hat{y}$, the higher the probability that such negative profit occurs in the truth-telling bidding case. Thus, such risk of negative profit should be carefully considered and incorporated into the bidding strategies to help campaigns achieve satisfactory performance with controlled risk.

4.2 Bidding with Value-at-Risk Utility
The utility p.d.f. $p_r(r)$ is of high importance for risk modeling.

The detailed realization method and efficiency analysis of both strategies are provided in the appendix.

4.3 Bidding for Risk Management of Profit
Another way to approach risk-aware bidding strategies is to go back to the analysis of campaign profit $R(b)$, which is one of the key performance indicators (KPIs) in RTB. Mathematically,

$$R(b) = \begin{cases} 0 & b \le z \ \text{(lose)} \\ v \cdot \hat{y} - z & b > z \ \text{(win)} \end{cases}. \tag{24}$$

As both CTR $\hat{y}$ and market price $z$ are modeled as stochastic random variables with p.d.f. $p_{\hat{y}}(\hat{y})$ and $p_z(z)$, $R(b)$ can be naturally regarded as a dependent random variable. Again, using Lemma 1, we define the value-at-risk of profit

$$\tilde{R}(b) = \mathbb{E}[R(b)] - \alpha\, \mathrm{Std}[R(b)], \tag{25}$$

which has a guarantee that for α > 0 the real profit is lower than $\tilde{R}(b)$ with a probability smaller than $1/(1+\alpha^2)$; symmetrically for α < 0.

In this setting, we propose the second risk-aware bidding strategy that generates the bid which yields the maximum value-at-risk of profit:
From Figure 2 we know that due to the uncertainty of utility, bidding with the utility expectation $\mathbb{E}[r]$ will introduce the risk of negative profit, which might be worse than no bid. Let us examine the downside (upside) risk.

Lemma 1 (Cantelli's Inequality). For a random variable $r$ with mean µ and standard deviation σ, the following inequalities hold:

$$P(r < \mu - \alpha\sigma) < \frac{1}{1+\alpha^2}, \quad \alpha > 0, \tag{20}$$

$$P(r > \mu - \alpha\sigma) < \frac{1}{1+\alpha^2}, \quad \alpha < 0. \tag{21}$$

With Lemma 1 we can find a variant of value at risk (VaR) [19, 27] for utility

$$\tilde{r} = \mathbb{E}[r] - \alpha\, \mathrm{Std}[r], \tag{22}$$

which has a guarantee that for α > 0 the real utility is lower than $\tilde{r}$ with a probability smaller than $1/(1+\alpha^2)$; for α < 0 the real utility is higher than $\tilde{r}$ with a probability smaller than $1/(1+\alpha^2)$.

Using VaR $\tilde{r}$ as a new risk-aware utility, we set the truth-telling bid at $\tilde{r}$. Taking Eq. (13) into Eq. (22) gives

$$b_{VaR}(r) = \tilde{r} = v \cdot \bigl(\mathbb{E}[\hat{y}] - \alpha\, \mathrm{Std}[\hat{y}]\bigr), \tag{23}$$

where the p.d.f. of $\hat{y}$ is given in Eq. (12).

$$b_{RMP}(R(b)) = \arg\max_b \; \mathbb{E}[R(b)] - \alpha\, \mathrm{Std}[R(b)]. \tag{26}$$

Through analyzing the properties of $R(b)$, we find that $\mathbb{E}[R(b)]$ and $\mathrm{Std}[R(b)]$ have a non-trivial trade-off relation, which can be balanced by $b$. We solve the optimal $b$ by importing the concept of efficient frontier from finance [16]. Detailed analysis can be found in the appendix.

5. OFFLINE EMPIRICAL STUDY
In this section, we empirically evaluate our proposed risk-aware bidding strategies on real-world data¹.

5.1 Datasets
Two ad log datasets in our empirical study are:

iPinYou is the largest independent programmatic media buying platform in China. The dataset² was released for research after its global competition on RTB algorithms in 2013 [18]. It contains 19.50M ad impressions, 14.89K clicks from 9 campaigns during 10 days in 2013, which involve 16K Chinese Yuan (CNY) expense in total. According to the data publishers, the last three days of data of each campaign are split into test data while the former part into training data. The overall effective cost per click (eCPC) is 1.07 CNY on the training data and 1.13 CNY on the test data,
When α > 0, the bidding strategy is risk-averse; conversely, when α < 0 it is risk-seeking; the traditional truth-telling bidding (Eq. (18)) is now a special case of $b_{VaR}(r)$ with α = 0, called risk-neutral.

Note that we derive the VaR strategy based on Lemma 1 rather than the distribution of $\hat{y}$ (Eq. (12)) directly for the following reasons: (i) Lemma 1 is a general one in the sense that it only needs the mean and standard deviation, so it can be applied to other CTR distribution models, especially those without analytical forms; (ii) α can be simply set over all bid requests to make the computation more efficient. The uniform α can also be regarded as a risk control parameter.

which means the overall market competition did not change dramatically during those 10 days.

YOYI is another mainstream demand-side platform company. This proprietary dataset is mainly used for training the risk-aware models deployed for online A/B testing. Details will be given in Section 6.

Both datasets are in record-per-line format. After ad log joining, each record is formalized as a triple $(x, y, z)$, where $x$ is the high dimensional feature vector for each bid request with the corresponding ad information, $y$ is the user feedback on the ad impression, e.g., the binary click or conversion actions, and $z$ is the market price for that auction, i.e., the lowest price to bid in order to win the auction.

¹ Our experiment is repeatable and the code is given below: https://github.com/pkuzhf/bidding-at-risk-experiment
² Dataset link: http://data.computational-advertising.org

5.2 Experiment Protocol
We followed [39] to set up our evaluation procedure. For each campaign bidding strategy with a predefined budget, the optimal parameters (µ, S) were learned on training data and the hyperparameters (α and φ, discussed later) were tuned on validation data, which was the early half split from the test data. Then we replayed the historic bid records to test the performance on the other half of test data. We did not use cross-validation since our data is sequential.

For each campaign, there was a defined value $v$ for the user action, i.e., a click in our experiment. Following [39], the click value is defined by a proportion of eCPC on training data to imitate the true value set by the advertisers. In this paper, we set the proportion to 100%. Every time the tested bidding agent received a bid request from the ad exchange, it generated a bid with the tested bidding strategy. If the bid price was higher than the historic market price, the agent won the ad impression and paid the market price, and then the historic recorded binary user feedback, i.e., click or not, was observed. If there was a user click, the agent made revenue of the click value. The test ended either when there was no more test bid request or when the campaign budget was exhausted, if applicable.

Note that the offline experiment cannot fully simulate the market competition because there is no observed user feedback and market price for historically lost auctions. Nevertheless, our evaluation protocol keeps the bid requests, displayed ads, and auction environment unchanged.

[Figure 3 plot omitted. Panels "iPinYou 3427" and "iPinYou 3476": Profit against Cost for Risk-averse, Risk-seeking, Risk-neutral and LR.]
Figure 3: Profit vs. cost of the VaR bidding.

[Figure 4 plot omitted. Panels "iPinYou 3427" and "iPinYou 3476": Profit against Cost for Risk-averse, Risk-seeking, Risk-neutral and LR.]
Figure 4: Profit vs. cost of the RMP bidding.

5.4 Evaluation Measures
For a bid optimization task, the major evaluation measures are campaign profit, i.e., the earned total click value minus the cost, and ROI, i.e., the ratio between the profit and the cost. In addition, we also monitor other key metrics, such as cost per thousand impressions (CPM), auction winning rate, CTR and eCPC to gain more insights.

In order to investigate how bidding strategies balance return against risk, we also propose an additional metric:
We try to answer whether, under the same context, the campaign would be able to get more clicks with the budget limitation if it was given a different bidding strategy.

5.3 Compared Bidding Strategies
The compared bidding strategies are as follows.

• LR - Linear Revenue Bidding: the baseline, which is the most widely used bidding function

$$b_{LR} = \phi \cdot v \cdot \hat{y}, \tag{27}$$

where $\hat{y}$ is the predicted CTR with logistic regression as in Eq. (1). φ is the scaling parameter tuned based on the market competitiveness [25]. The overall AUC of this LR estimator on iPinYou dataset is 69%.

• VaR - Value at Risk Bidding: our first risk-aware bidding strategy proposed in Section 4.2, which bids the value at risk, shown in Eq. (23), where α is the hyperparameter to be tuned on validation data.

• RMP - Risk Management of Profit based Bidding: the second risk-aware bidding strategy proposed in

profit - λ cost, which is named as Cost-Penalized Profit (CP-Profit) in this paper. Intuitively, advertisers want to maximize the profit of their performance-driven campaign given the budget, or minimize the advertising cost given the profit, either of which forms a profit/cost tradeoff. Note that this metric was mainly used for model selection, i.e., selecting α and φ. As we will show in Section 5.5 about the profit/cost analysis, CP-Profit is an effective metric to balance the profit and the cost to select an optimal model.

5.5 Profit and Cost Analysis
We analyzed the profit and cost of various bidding strategies with various parameter settings. This would help us understand the properties of the bidding strategies and help us select optimal model hyperparameters.

Non-Budgeted Settings. Without budget constraint, the basic LR bidding strategy is truth-telling, i.e. φ = 1. For the risk-aware bidding strategies we varied the value of α. Figure 3 shows the profit and cost change when setting different α's in VaR Eq. (23), with α > 0 as risk-averse, α < 0 as risk-seeking and α = 0 as risk-neutral.
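To make the roles of α and φ concrete, the three bidding functions (Eqs. (27), (23) and (26)) can be sketched side by side. This is a hedged illustration, not the paper's realization: the paper solves the RMP bid through an efficient-frontier analysis given in its appendix, whereas the sketch below swaps in a plain grid search over Monte Carlo samples. The log-normal market price, the function names and all numeric settings are assumptions made for the example.

```python
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def ctr_moments(a_mean, a_var, n=4000):
    """Mean and std of y = sigmoid(a), a ~ N(a_mean, a_var), by midpoint rule."""
    s = math.sqrt(a_var)
    lo, h = a_mean - 8.0 * s, 16.0 * s / n
    m1 = m2 = 0.0
    for k in range(n):
        a = lo + (k + 0.5) * h
        w = h * math.exp(-(a - a_mean) ** 2 / (2.0 * a_var)) \
            / math.sqrt(2.0 * math.pi * a_var)
        m1 += w * sigmoid(a)
        m2 += w * sigmoid(a) ** 2
    return m1, math.sqrt(max(m2 - m1 * m1, 0.0))


def bid_lr(v, a_mean, phi=1.0):
    """Eq. (27): linear bid on the LR point estimate sigmoid(mu^T x)."""
    return phi * v * sigmoid(a_mean)


def bid_var(v, a_mean, a_var, alpha, phi=1.0):
    """Eq. (23), optionally scaled by phi: bid the value at risk."""
    mean, std = ctr_moments(a_mean, a_var)
    return phi * v * (mean - alpha * std)


def bid_rmp(v, a_mean, a_var, alpha, mu_z, sigma_z, grid, n=10000, seed=0):
    """Eq. (26) by grid search (assumed solver) on sampled (CTR, price) pairs."""
    rng = random.Random(seed)
    s = math.sqrt(a_var)
    samples = [(sigmoid(rng.gauss(a_mean, s)), rng.lognormvariate(mu_z, sigma_z))
               for _ in range(n)]

    def value_at_risk_of_profit(b):
        # Eq. (24): profit is v*y - z when the bid wins, 0 otherwise.
        profits = [v * y - z if b > z else 0.0 for y, z in samples]
        mean = sum(profits) / n
        var = sum((p - mean) ** 2 for p in profits) / n
        return mean - alpha * math.sqrt(var)  # Eq. (25)

    return max(grid, key=value_at_risk_of_profit)


if __name__ == "__main__":
    v, a_mean, a_var = 300.0, -1.0, 1.0 / 3.0
    print("LR bid:", bid_lr(v, a_mean))
    for alpha in (-0.5, 0.0, 0.5):
        print("VaR bid, alpha =", alpha, ":", bid_var(v, a_mean, a_var, alpha))
    grid = range(20, 201, 10)
    for alpha in (0.0, 0.2):
        print("RMP bid, alpha =", alpha, ":",
              bid_rmp(v, a_mean, a_var, alpha, 4.0, 0.5, grid))
```

In this toy setting a positive α lowers both risk-aware bids and a negative α raises the VaR bid, which mirrors the risk-averse, risk-neutral and risk-seeking ordering described above.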
Weclearly Section 4.3, which seeks theoptimal bid to maximize observed that for all tested campaigns the results gener- theexpectedprofitwithitsstandarddeviationasrisk ally formed a single-peak concave shape, showing strong constraint, shown in Eq. (26), where α is the hyper- trade-off between the profit and cost. Risk-averse strate- parameter tobe tunedon validation data. gies intended to yield lower cost than risk-seeking ones, as The two risk-aware bidding strategies rely on the same the risk-averse strategies would always bid lower than the CTR distribution model proposed in Section 3. Theinitial risk-neutral one, and the latter further bid lower than the prior distribution of w was given that (i) µ was set as risk-seeking ones. The performance-driven campaigns pur- 0 thepointestimationgeneratedbylogisticregression;(ii)q sue high profit and low cost. Thus if a bidding strategy 0 was set as constant 1. Our model achieves a comparable yieldshigherprofitandlowercostthananotherone,wesay AUCwithstandardLRmodel. Notethatforthebudgeted theformerstrategy dominates thelatterone. Fromthere- bid optimization tasks in Section 5.7, there would also be sultswesawthatalmostalltherisk-seekingstrategieswere a hyperparameterφ multiplied in VaR (23) and RMP (26) dominated by part of the risk-averse strategies. That is to to betunedto avoid budget under-or over-delivery. say, for almost every risk-seeking setting with α<0, there 1e6 iPinYou 3427 41e6 31e6 6 2 2 5 1 0 0 4 ofit 2 ofit 1 Profit 321 RRiisskk--aseveekrsineg Pr 46 RRRLRiiissskkk---asneveeeukrtsrinaegl Pr 4352 RRRLRiiissskkk---asneveeeukrtsrinaegl 0 Risk-neutral 80.50.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 60.5 0.0 0.5 1.0 1.5 2.0 2.5 LR Cost 1e7 Cost 1e7 1 B(cid:0)(cid:1)(cid:2)(cid:3)(cid:4) Selected Model 2 Figure 6: Model selection. Colored dashed lines stand for 1 0 1 2 3 4 5 6 7 λ=0(blue),0.2(green),0.4(red) respectively. 
always exist one or more risk-averse settings with α > 0 that yield equal or higher profit with lower cost.

Figure 4 illustrates the profit and cost trade-off yielded by RMP Eq. (26) with various risk settings. We also observed that, similarly, the risk-averse strategies tended to yield lower cost than the risk-seeking ones. The risk-neutral strategy acted as a splitting point between the risk-averse and risk-seeking ones.

Figure 5: Selected model performance comparison with different λ's (validation dataset), iPinYou campaign 3427 for RMP strategy with 1/4 budget setting. The models maximizing CP-Profit with λ = 0, 0.2, 0.4 are selected, which are highlighted with red circles. Note that LR and risk-aware models are selected separately and then plotted together.

Budgeted Settings. In practice, advertisers tend to set up campaign budget constraints. With the budget constraint, the basic LR bidding strategy is not necessarily truth-telling, i.e., φ ≠ 1 in Eq. (27) [25]. We performed an analysis on the profit and cost trade-off for different budget constraints. For each campaign, we followed [39] to use 1/2, 1/4, ..., 1/32 of the original total cost in the test data as the budget. For LR we varied the value of φ while for VaR and RMP we varied both φ and α to obtain different profits and costs. As an example, the results of iPinYou campaign 3427 for RMP with 1/4 budget setting are shown in Figure 5. We saw that part of the risk-averse strategies dominated other strategies, which was in accordance with the results without budget constraints.

Distinction with Lowering Bidding. The performance of risk-averse bidding strategies was not due to lowering bidding. As we know, for a truth-telling linear bidding, if we lower the bid price uniformly, it would result in cost decreasing, profit decreasing and ROI increasing. Unlike this, our risk-averse bidding strategies lowered the bid price of uncertain requests and raised the bid price of confident requests, which led to higher profit and higher ROI with invariant cost. It can be verified in Figure 5. For example, the left circled LR model lowered the bid price of the right circled LR model by applying a smaller φ, so it achieved lower cost, lower profit and higher ROI. In the meantime, the risk-averse models above the right circled LR model, which had a positive α and a larger φ, achieved the same cost, higher profit and higher ROI.

5.6 Bidding without Budget Constraints

In order to select different risk-aware models (i.e. the strategies' hyperparameter α) with various risk-return balanced metrics, we used the proposed CP-Profit metric. Specifically, with a fixed value of λ, the contour of a specific CP-Profit is a straight line in the cost-profit plot. The highest CP-Profit is achieved by the tangent point of the contour and the cost-profit points. Figure 6 provides example tangent points with different λ's on the cost-profit plot. For a fixed λ, we then obtained an optimal α that maximizes the CP-Profit.

Figure 7: Selected model performance comparison with different λ's (validation dataset).

Figure 7 plots the performance of the tested bidding strategies against CP-Profit metrics with various λ in validation data. We found that: (i) The risk-aware strategies VaR and RMP outperformed the baseline LR on all studied campaigns with all λ settings. (ii) Specifically, when λ = 0, i.e., the evaluation metric was purely the profit, VaR and RMP outperformed LR with 50.7% and 25.1% improvements. Also note that 89% of the selected models had positive α's, which suggested that when there is a fixed volume of bid requests and no limited budget, the risk control helps advertisers make higher profit by spending less money on opportunities with high risk. (iii) For metrics with λ > 0, the performance gains of VaR and RMP over LR were larger. We observed that the CP-Profit curves of VaR and RMP were convex while the one of LR was a straight line, which meant the risk-aware strategies provided higher ROI when the model selection metrics became more conservative, i.e., with higher λ.

With the model selected on validation data, we compared the strategies on test data. The overall performance across 9 iPinYou campaigns is shown in Figure 8. We observed that: (i) Most risk-aware bidding strategies yielded significantly higher profit than the baseline LR, which demonstrated the effectiveness of incorporating risk modeling in bid optimization. (ii) All VaR strategies yielded significantly higher ROI compared to LR, which meant bidding at risk helps advertisers avoid wasting money on risky opportunities. It turned out that such risky opportunities brought little revenue, so avoiding them led to a lower cost and higher profit, and thus much higher ROI. (iii) RMP strategies stably yielded about 10% more profit than LR while the ROI was lower than LR. This was because RMP sought the bid price which directly maximized the value at risk of profit, which resulted in consistent profit gains. On the other hand, RMP did not explicitly control the bid price like VaR, thus it did not yield lower CPM or cost. As a result, its ROI was not higher than LR. (iv) All the risk-aware bidding strategies brought higher CTR than the baseline LR, which might be counterintuitive. Actually, the proposed strategies allocate the budget from the high-uncertainty cases to low-uncertainty ones, which does not mean the average CTR should get lower.

Figure 8: Overall non-budgeted test performance.

5.7 Bidding with Budget Constraints

Following the previous section, we sought the tangent points which maximized CP-Profit with various λ's. The selected models are highlighted with red circles in Figure 5. There might be overlapped red circles as the CP-Profit with different λ's still selected the same model.

Figure 9: Performance with budget constraint (1/2 budget).

Figure 9 provides the overall performance over 9 iPinYou campaigns with budget constraint (1/2 budget setting). The baseline is LR with λ = 0, i.e., simply selecting the φ yielding the highest profit on the validation dataset. We found that: (i) For small λ settings, VaR and RMP were both better than LR on profit, which verified the claim that the risk-aware bidding strategies successfully yield higher profit via controlling the risk. Specifically, risk-aware bidding strategies performed better on yielding profit for the campaigns with poorer LR estimators. (ii) All proposed methods, i.e., LR with non-zero λ model selection and VaR/RMP with various λ's, yielded higher ROI than the baseline LR with λ = 0. Some strategies (VaR and RMP with λ = 0, 0.2) yielded both higher profit and higher ROI. (iii) All risk-aware bidding strategies provided higher CTR (except for the conservative VaR with λ = 0.4), which meant the proposed strategies filtered out the low-value cases, which were always with high uncertainty. (iv) For CPM and winning rate, VaR reduced the CPM by lowering bids on unconfident cases and raising the bids on confident ones (via tuning φ); RMP did not guarantee any low bid as it just sought the bid yielding the highest value of risk-controlled profit.

6. ONLINE DEPLOYMENT AND A/B TEST

Figure 10: Online results from YOYI DSP.

Online Environment. We have deployed our bidding strategies on YOYI Platform, which is a mainstream DSP in the global RTB market. The total daily bid request volume received by YOYI was about 5 billion. Among the received bid requests, about 200M (4%) met YOYI's bid target rules of cost-per-click (CPC) campaigns, for which YOYI would return a valid bid. Each of our tested strategies was allocated 1.5% of the bid request volume from 10 CPC campaigns, i.e. about 3M bid opportunities daily. For training our models, we collected the impression and click data of 7 consecutive days just before the A/B testing in Jan. 2016, which contained 424M ad impressions, 532K clicks and 47.2K US dollars (USD) expense. The eCPC on the training data is 0.087 USD. A 2% negative instance down sampling was performed by the YOYI system. The advertisers of tested CPC campaigns paid a fixed amount for every user landing action, i.e., valid click. The landing value v was set to be 0.054 USD, which was the average CPC of all campaigns.

Test Setting. We tested two strategies for 7 days in Jan. 2016: (i) The baseline LR strategy. It consists of a logistic regression whose AUC is 88% and a linear bidding function with fixed φ = 0.56 set by YOYI operation staff. (ii) The proposed VaR strategy. The parameter α acquired from the training was a positive value that slightly varied daily, which meant the trained bidding strategy was risk-averse. The parameter φ was set to be greater than that of LR. Thus the two parameters had opposite effects on bid price, which resulted in an average bid price close to that of LR. Strategies were tested with the same bid volume constraint. The bid volume for each strategy was 19.5M. The whole test live volume contained 39M auctions resulting in 3.3M impressions and 7.4K user landings. VaR spent 12% more budget than LR: since φ was learned from training, there was no guarantee of equivalent budget spending for the A/B test.

Result Discussion. The online results are presented in Figure 10. We show the performance over six metrics. From the results, we found that: (i) Our VaR strategy outperformed baseline LR on ROI by about 5%, which demonstrated the efficacy of our risk-aware bidding strategy. (ii) VaR achieved 17.5% higher profit than LR. Although this was partly due to the larger budget spent by VaR, the more important fact was that the VaR strategy could help advertisers spend such additional budget at an equivalent or even higher ROI within the same bid request volume. Note that advertisers usually determine their budget depending on ROI and ROI is always negatively correlated with budget. Thereby, if a strategy allows an additional amount of budget spent at the same ROI, advertisers would like to increase their budget by that amount. (iii) VaR won much fewer auctions than LR. According to our setting, VaR bid higher for confident cases and lower for risky cases. Its low winning rate indicated that the extra confident cases VaR gained were fewer than the risky cases VaR lost. (iv) As for CTR, VaR achieved 64.4% higher CTR than baseline LR, which demonstrated the high efficiency of risk-aware strategies in finding high quality ad display opportunities. (v) VaR got higher CPM than LR. This indicated that our strategy tended to target impressions with higher quality at the cost of higher price. Although the VaR strategy achieved higher CPM, it also achieved higher CTR and the co-effect was reflected in low eCPC and high ROI, which were beneficial.

7. CONCLUSION

In this paper, we presented a solution for modeling the uncertainty of CTR estimation in RTB display advertising. With a Bayesian logistic regression CTR estimator, we obtained a closed form of CTR distribution density. On the basis of the distributions of CTR and the market price, two risk-aware bidding strategies are formulated: the first one (VaR) bids the value at risk of the estimated utilities, while the second one (RMP) seeks the optimal bid that maximizes a lower bound of profit with a controlled risk. Our risk-return analysis and offline experiment demonstrated 15.4% profit gain over a linear optimized bidding strategy. To test the applicability of our risk-aware bidding strategies in a live setting, the bidding strategies have been deployed on an operational platform. 17.5% profit gain over the baseline was observed in a 7-day online A/B test.

For future work, we will analyze the market equilibrium if more advertisers adopt the risk-aware bidding strategies. We also plan to further explore applications from the proposed CTR distribution model and the risk-aware bid optimization framework. For an ad placement with multi-frame dynamic creatives, a portfolio optimization [11] can be performed to select the optimal creatives combination to yield the highest profit with a controlled risk. Moreover, our CTR distribution model has the potential to be dynamically updated to evaluate users' interest in an advertised product after repeated ad displays over time.

Acknowledgement This work is financially supported by National Natural Science Foundation of China (61632017).

8. REFERENCES

[1] Daily NYSE group volume. http://goo.gl/2EflkC. Accessed: 2016-02.
[2] Fikisu mobile DSP. https://goo.gl/M39SjZ. Accessed: 2016-02.
[3] iPinYou statistics. http://goo.gl/Jd4xIV. Accessed: 2016-02.
[4] The Shanghai Stock Exchange data. https://goo.gl/Fl2JxZ. Accessed: 2016-02.
[5] K. Amin, M. Kearns, P. Key, and A. Schwaighofer. Budget optimization for sponsored search: Censored learning in MDPs. UAI, 2012.
[6] M. Avellaneda and S. Stoikov. High-frequency trading in a limit order book. Quantitative Finance, 8(3):217–224, 2008.
[7] C. M. Bishop. Pattern recognition and machine learning. Springer, 2006.
[8] Y. Chen, P. Berkhin, B. Anderson, and N. R. Devanur. Real-time bidding algorithms for performance-based display ad allocation. KDD, 2011.
[9] M. Crouhy, D. Galai, and R. Mark. The essentials of risk management, volume 1. McGraw-Hill New York, 2006.
[10] B. Edelman, M. Ostrovsky, and M. Schwarz. Internet advertising and the generalized second price auction: Selling billions of dollars worth of keywords. NBER, 2005.
[11] E. J. Elton, M. J. Gruber, S. J. Brown, and W. N. Goetzmann. Modern portfolio theory and investment analysis. JWS, 2009.
[12] A. Ghosh, B. I. Rubinstein, S. Vassilvitskii, and M. Zinkevich. Adaptive bidding for display advertising. In WWW, 2009.
[13] T. Graepel, J. Q. Candela, T. Borchert, and R. Herbrich. Web-scale bayesian click-through rate prediction for sponsored search advertising in microsoft's bing search engine. In ICML, pages 13–20, 2010.
[14] D. J. Hand and K. Yu. Idiot's bayes not so stupid after all? International statistical review, 69(3):385–398, 2001.
[15] X. He, J. Pan, O. Jin, T. Xu, B. Liu, T. Xu, Y. Shi, A. Atallah, R. Herbrich, S. Bowers, et al. Practical lessons from predicting clicks on ads at facebook. In ADKDD, 2014.
[16] J. Hull. Risk Management and Financial Institutions, volume 733. Wiley, 2012.
[17] K.-C. Lee, B. Orten, A. Dasdan, and W. Li. Estimating conversion rate in display advertising from past performance data. In KDD, 2012.
[18] H. Liao, L. Peng, Z. Liu, and X. Shen. ipinyou global rtb bidding algorithm competition dataset. In ADKDD, 2014.
[19] T. J. Linsmeier and N. D. Pearson. Value at risk. Financial Analysts Journal, 56(2):47–67, 2000.
[20] H. Markowitz. Portfolio selection. The JF, 1952.
[21] H. B. McMahan et al. Ad click prediction: a view from the trenches. In KDD, 2013.
[22] J. Mun. Modeling risk: Applying Monte Carlo simulation, real options analysis, forecasting, and optimization techniques, volume 347. John Wiley & Sons, 2006.
[23] S. Muthukrishnan. Ad exchanges: Research issues. In WINE, 2009.
[24] R. J. Oentaryo, E. P. Lim, D. J. W. Low, D. Lo, and M. Finegold. Predicting response in mobile advertising with hierarchical importance-aware factorization machine. In WSDM, 2014.
[25] C. Perlich, B. Dalessandro, R. Hook, O. Stitelman, T. Raeder, and F. Provost. Bid optimizing and inventory scoring in targeted online advertising. In KDD, 2012.
[26] M. Richardson, E. Dominowska, and R. Ragno. Predicting clicks: estimating the click-through rate for new ads. In WWW, pages 521–530. ACM, 2007.
[27] R. T. Rockafellar and S. Uryasev. Optimization of conditional value-at-risk. Journal of risk, 2:21–42, 2000.
[28] W. F. Sharpe. The sharpe ratio. Streetwise–the Best of the Journal of Portfolio Management, 1998.
[29] J. Shen, B. Orten, S. C. Geyik, D. Liu, S. Shariat, F. Bian, and A. Dasdan. From 0.5 million to 2.5 million: Efficiently scaling up real-time bidding. In ICDM, 2015.
[30] Y. Shi, X. Zhao, J. Wang, M. Larson, and A. Hanjalic. Adaptive diversification of recommendation results via latent factor portfolio. In SIGIR, 2012.
[31] I. Trofimov, A. Kornetova, and V. Topinskiy. Using boosted trees for click-through rate prediction for sponsored search. In WINE, page 2. ACM, 2012.
[32] J. Wang and B. Chen. Selling futures online advertising slots via option contracts. In WWW, 2012.
[33] J. Wang, W. Zhang, and S. Yuan. Display advertising with real-time bidding (RTB) and behavioural targeting. arXiv preprint arXiv:1610.03013, 2016.
[34] J. Wang and J. Zhu. Portfolio theory of information retrieval. In SIGIR, 2009.
[35] L. Wang, P. N. Bennett, and K. Collins-Thompson. Robust ranking models via risk-sensitive optimization. In SIGIR, 2012.
[36] X. Wang, W. Li, Y. Cui, R. Zhang, and J. Mao. Click-through rate estimation for rare events in online advertising. Online Multimedia Advertising: Techniques and Technologies, 2010.
[37] L. Wasserman. All of statistics: a concise course in statistical inference. Springer Science & Business Media, 2013.
[38] S. Yuan, J. Wang, and X. Zhao. Real-time bidding for online advertising: measurement and analysis. In ADKDD, 2013.
[39] W. Zhang and J. Wang. Statistical arbitrage mining for display advertising. In KDD, 2015.
[40] W. Zhang, S. Yuan, and J. Wang. Optimal real-time bidding for display advertising. In KDD, 2014.
[41] Y. Zhang, D. Wang, G. Wang, W. Chen, Z. Zhang, B. Hu, and L. Zhang. Learning click models via probit bayesian inference. In CIKM, 2010.
[42] X. Zhao, W. Zhang, and J. Wang. Interactive collaborative filtering. In CIKM. ACM, 2013.
[43] H. Zhu, H. Xiong, Y. Ge, and E. Chen. Mobile app recommendations with security and privacy awareness. In KDD, 2014.

9. APPENDIX

VaR Realization. Our VaR strategy relies on the CTR distribution model from Section 3. The parameters $\mu_i$ and $q_i$ are trained offline, while $\sum_i \mu_i x_i$ and $\sum_i q_i^{-1} x_i$ for a given bid request are calculated online, which only takes double the time of a traditional LR model. We then calculate $E[\hat{y}]$ and $\mathrm{Std}[\hat{y}]$ from $\sum_i \mu_i x_i$ and $\sum_i q_i^{-1} x_i$, which is, however, quite costly. Fortunately, we can move this calculation from online to offline by building up a look-up table, with $(\sum_i \mu_i x_i, \sum_i q_i^{-1} x_i)$ as its key and $(E[\hat{y}], \mathrm{Std}[\hat{y}])$ as its value. Since $\sum_i \mu_i x_i$ and $\sum_i q_i^{-1} x_i$ each lie in a particular range, we can discretize the values with enough accuracy and calculate $E[\hat{y}]$ and $\mathrm{Std}[\hat{y}]$ of the discrete $(\sum_i \mu_i x_i, \sum_i q_i^{-1} x_i)$ combinations. For each bid request online, we round $(\sum_i \mu_i x_i, \sum_i q_i^{-1} x_i)$ to get the discrete key and then read $E[\hat{y}]$ and $\mathrm{Std}[\hat{y}]$ from the look-up table in O(1) time.

VaR Efficiency. For offline look-up table building, if we separate $\sum_i \mu_i x_i$ and $\sum_i q_i^{-1} x_i$ into 1000 bins respectively and sample 1000 times from the corresponding $\hat{y}$ distribution, it needs $10^9$ sampling operations in total, which can be done in a few minutes on a modern PC. For online bidding, as discussed above, it has equal computational complexity with logistic regression.

RMP Realization. We solve Eq. (26) using a look-up table too, which takes $(\sum_i \mu_i x_i, \sum_i q_i^{-1} x_i)$ as its key and bid price b as its value. For building the look-up table, we enumerate b offline within a limited range. For each enumerated b, we sample market price z and estimated CTR $\hat{y}$ according to their distributions a sufficient number of times and calculate R(b) for each sample. After that we obtain the objective $E[R(b)] - \alpha\,\mathrm{Std}[R(b)]$ for each b. Finally we choose the optimal b that maximizes the objective and store the solution in the look-up table. When bidding online, we just calculate $\sum_i \mu_i x_i$ and $\sum_i q_i^{-1} x_i$ for the bid request, then find the optimal b from the look-up table in O(1) time.

RMP Efficiency. Compared to the VaR strategy, RMP additionally enumerates b and samples z in offline calculation. In practice, 1000 enumerations are enough as the range of bid price is quite small. Note that, for each key of the look-up table, all the enumerated bid prices b can share the same samples of $\hat{y}$ and z. Thus, the computational complexity of offline calculation for RMP only has a constant difference with the one for VaR, which takes a few minutes as discussed above. When bidding online, the RMP strategy has equal computational complexity with logistic regression, which makes it feasible in real-world business.

Analysis of RMP strategy. To give an insight into solving b in Eq. (26), we analyze the relationship between E[R(b)] and Std[R(b)]. Given a bid b, the expectation of R(b) is formulated as

$$E[R(b)] = v \cdot E[\hat{y}] \cdot z_0 - z_1, \qquad (28)$$

where $z_k$ denotes $\int_0^b z^k p_z(z)\,dz$ for simplicity. And the expected squared profit is

$$E[R(b)^2] = v^2 E[\hat{y}^2]\, z_0 - 2 v E[\hat{y}]\, z_1 + z_2. \qquad (29)$$

Based on $E[R(b)]$ and $E[R(b)^2]$, the variance of profit is

$$\mathrm{Var}[R(b)] = v^2 E[\hat{y}^2]\, z_0 - 2 v E[\hat{y}]\, z_1 + z_2 - \left(v E[\hat{y}]\, z_0 - z_1\right)^2. \qquad (30)$$

We further analyze the change trends of the expectation E[R(b)] and variance Var[R(b)] w.r.t. the bid price b.

$$\frac{\partial E[R(b)]}{\partial b} = p_z(b) \cdot \left(v E[\hat{y}] - b\right), \qquad (31)$$

which is positive when $b < v E[\hat{y}]$ and negative otherwise. We are also interested in the change trend of Var[R(b)] in the range around the truth-telling bid price. We obtain

$$\left.\frac{\partial \mathrm{Var}[R(b)]}{\partial b}\right|_{b = v E[\hat{y}]} = p_z\!\left(v E[\hat{y}]\right) \cdot v^2 \cdot \mathrm{Var}[\hat{y}] \ge 0. \qquad (32)$$

Therefore, the variance increases in a range around the conventional truth-telling bid price $v E[\hat{y}]$, and so does the standard deviation. Figure 11 shows an example of the relationship between E[R(b)] and Std[R(b)], where the parameters of the CTR distribution and market price distribution are the same as in Figure 2.

Figure 11: An example of E[R(b)], Std[R(b)] trend w.r.t. bid price, and the relationship between them.

Based on the reasonable preference of higher E[R(b)] and lower Std[R(b)], we further borrow the concept of efficient frontier from finance [16]. Every point at the efficient frontier corresponds to the optimal bid price defined by Eq. (26) with a particular α. For example, the slopes α of the dashed lines are 0, 1, 2 respectively. The tangent points are of the maximum $E[R(b)] - \alpha\,\mathrm{Std}[R(b)]$, each of which corresponds to a particular bid price, i.e., the solution of Eq. (26).
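The offline RMP computation described in the appendix (enumerate b, sample z and the estimated CTR, then maximize E[R(b)] − α Std[R(b)]) can be illustrated with a minimal sketch for a single look-up-table key. It is not the authors' implementation. The profit is taken as R(b) = (v·ŷ − z) when z < b and 0 otherwise, which is what Eqs. (28) and (29) imply. The Gaussian linear score passed through a sigmoid, the log-normal market price, and every numeric value and function name below are assumptions made only for illustration.

```python
import math
import random
from statistics import fmean, pstdev
from typing import Sequence, Tuple


def sigmoid(t: float) -> float:
    """Logistic function mapping a linear score to a CTR in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))


def profit_stats(bid: float, v: float, ctr_samples: Sequence[float],
                 price_samples: Sequence[float]) -> Tuple[float, float]:
    """Monte Carlo estimate of (E[R(b)], Std[R(b)]).

    R(b) = v * y - z if the auction is won (z < b), else 0.
    """
    profits = [(v * y - z) if z < bid else 0.0
               for y, z in zip(ctr_samples, price_samples)]
    return fmean(profits), pstdev(profits)


def rmp_bid(v: float, alpha: float, ctr_samples: Sequence[float],
            price_samples: Sequence[float], bid_grid: Sequence[float]) -> float:
    """Return the enumerated bid maximising E[R(b)] - alpha * Std[R(b)].

    All bids share the same samples, as the appendix suggests.
    """
    best_bid, best_obj = bid_grid[0], -math.inf
    for b in bid_grid:
        mean, std = profit_stats(b, v, ctr_samples, price_samples)
        obj = mean - alpha * std
        if obj > best_obj:
            best_bid, best_obj = b, obj
    return best_bid


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical table key: mean and variance of the linear score w.x.
    score_mean, score_var = -2.0, 0.64
    ctr = [sigmoid(rng.gauss(score_mean, math.sqrt(score_var)))
           for _ in range(1000)]
    # Hypothetical log-normal market price distribution.
    price = [rng.lognormvariate(2.3, 0.7) for _ in range(1000)]
    grid = [0.5 * k for k in range(81)]  # enumerated bids 0.0 .. 40.0
    for alpha in (0.0, 1.0, 2.0):
        print(alpha, rmp_bid(100.0, alpha, ctr, price, grid))
```

Because every α is evaluated on the same samples and the same bid grid, raising α can never increase the profit standard deviation at the selected bid, which mirrors moving along the efficient frontier toward lower risk. In a full table build, the same routine would be repeated for every discretized key.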