Robust Influence Maximization

Wei Chen (Microsoft Research)
Tian Lin (Tsinghua University)
Zihan Tan (IIIS, Tsinghua University)
Mingfei Zhao (IIIS, Tsinghua University)
Xuren Zhou (The Hong Kong University of Science and Technology)

arXiv:1601.06551v2 [cs.SI] 12 Jun 2016

ABSTRACT

In this paper, we address the important issue of uncertainty in the edge influence probability estimates for the well-studied influence maximization problem, the task of finding $k$ seed nodes in a social network to maximize the influence spread. We propose the problem of robust influence maximization, which maximizes the worst-case ratio between the influence spread of the chosen seed set and that of the optimal seed set, given the uncertainty of the parameter input. We design an algorithm that solves this problem with a solution-dependent bound. We further study uniform sampling and adaptive sampling methods to effectively reduce the uncertainty on parameters and improve the robustness of the influence maximization task. Our empirical results show that parameter uncertainty may greatly affect influence maximization performance, that prior studies that learned influence probabilities could lead to poor performance in robust influence maximization due to relatively large uncertainty in parameter estimates, and that an information cascade based adaptive sampling method may be an effective way to improve the robustness of influence maximization.

Keywords

social networks, influence maximization, robust optimization, information diffusion

1. INTRODUCTION

In social and economic networks, the Influence Maximization problem has been extensively studied over the past decade, due to its wide applications to viral marketing [12, 18], outbreak detection [21], rumor monitoring [6], etc. For example, a company may conduct a promotion campaign in social networks by sending free samples to the initial users (termed seeds), and via the word-of-mouth (WoM) effect, more and more users are influenced through social links to join the campaign and propagate messages of the promotion. This problem was first introduced by Kempe et al. [18] under an algorithmic framework to find the most influential seeds. They propose the independent cascade model and the linear threshold model, which consider the social-psychological factors of information diffusion to simulate such a random process of adoptions.

Since Kempe et al.'s seminal work, extensive research has been done on influence maximization, especially on improving the efficiency of influence maximization in the independent cascade model [10, 9, 15, 4, 27], all of which assume that the ground-truth influence probabilities on edges are exactly known. Separately, a number of studies [25, 26, 14, 24, 23] propose learning methods to extract edge influence probabilities. Due to inherent data limitations, no learning method can recover the exact values of the edge probabilities; what can be achieved are estimates of the true edge probabilities, with confidence intervals indicating that the true values lie within the intervals with high probability. The uncertainty in edge probability estimates, however, may adversely affect the performance of the influence maximization task, but this topic has been left mostly unexplored. The only attempt addressing this question is a recent study in [17], but due to a technical issue as explained in [17], the results achieved by that study are rather limited.

In this paper, we utilize the concept of robust optimization [3] in operations research to address the issue of influence maximization with uncertainty. In particular, we consider that the input to the influence maximization task is no longer an edge influence probability on every edge of a social graph, but instead an interval in which the true probability may lie. Thus the input is actually a parameter space $\Theta$, which is the product of all intervals on all edges. For any seed set $S$, let $\sigma_\theta(S)$ denote the influence spread of $S$ under parameter setting $\theta \in \Theta$. Then we define the robust ratio of $S$ as $g(\Theta, S) = \min_{\theta \in \Theta} \frac{\sigma_\theta(S)}{\sigma_\theta(S^*_\theta)}$, where $S^*_\theta$ is the optimal seed set achieving the maximum influence spread under parameter $\theta$. Intuitively, the robust ratio of $S$ indicates the (multiplicative) gap between its influence spread and the optimal influence spread under the worst-case parameter $\theta \in \Theta$, since we are unsure which $\theta \in \Theta$ is the true probability setting. Our optimization task is then to find a seed set of size $k$ that maximizes the robust ratio under the known parameter space $\Theta$; we call this task Robust Influence Maximization (RIM).

It is clear that when there is no uncertainty on edge probabilities, which means $\Theta$ collapses to the single true parameter $\theta$, RIM degenerates to the classical influence maximization problem. However, when uncertainty exists, solving RIM may be a more difficult task. In this paper, we first propose an algorithm LUGreedy that solves the RIM task with a solution-dependent bound on its performance, which means that one can verify its performance after it selects the seed set (Section 3). We then show that if the input parameter space $\Theta$ is only given and cannot be improved, it is possible that even the best robust ratio in certain graph instances is very small (e.g., $O(\log n / \sqrt{n})$ with $n$ being the number of nodes in the graph). This motivates us to study sampling methods to further tighten the parameter space $\Theta$, and thus improve the robustness of our algorithm (Section 4).

In particular, we study both uniform sampling and adaptive sampling for improving RIM performance. For uniform sampling, we provide theoretical results on the sample complexity for achieving a given robust ratio of the output seed set. For adaptive sampling, we propose an information cascade based sampling heuristic to adaptively bias our sampling effort toward important edges often traversed by information cascades. Through extensive empirical evaluations (Section 5), we show that (a) the robust ratio is sensitive to the width of the confidence interval, and it decreases rapidly when the width of the confidence interval increases; as a result, prior studies that learned edge probabilities may result in a poor robust ratio due to relatively large confidence intervals (and thus high uncertainty); (b) the information cascade based adaptive sampling method performs better than uniform sampling and other baseline sampling methods, and can significantly improve the robustness of the influence maximization task.

In summary, the contributions of our paper include: (a) proposing the problem of robust influence maximization to address the important issue of uncertainty in parameter estimates adversely impacting the influence maximization task; (b) providing the LUGreedy algorithm that guarantees a solution-dependent bound; and (c) studying uniform and adaptive sampling methods to improve robust influence maximization.

Note that proofs of some technical results can be found in the appendix.

1.1 Additional Related Work

Influence maximization has been extensively studied, and we have already pointed out a number of studies closely related to our work in the introduction. For a comprehensive survey, one can refer to the monograph [8]. We discuss a few of the most relevant works in more detail here.

To the best of our knowledge, the study by He and Kempe [17] is the only attempt prior to our work that also tries to address the issue of uncertainty of parameter estimates impacting the influence maximization task. However, besides the similarity in motivation, the technical treatments are quite different. First, their central problem, called influence difference maximization, is to find a seed set of size $k$ that maximizes the additive difference between the two influence spreads of the same seed set using different parameter values. Their purpose is to see how large the influence gap could be due to the uncertainty in the parameter space. However, our goal is still to find the best possible seed set for influence maximization, while considering the adverse effect of the uncertainty, and thus we utilize the robust optimization concept and use the worst-case multiplicative ratio between the influence spread of the chosen seed set and the optimal seed set as our objective function. Second, their influence difference maximization turns out to be hard to approximate to any reasonable ratio, while we provide an actual algorithm for robust influence maximization that both has a theoretical solution-dependent bound and performs reasonably well in experiments. Third, we further consider using sampling methods to improve RIM, which is not discussed in [17].

In the context of robust optimization, Krause et al.'s work on robust submodular optimization [19] is possibly the closest to ours. Our RIM problem can be viewed as a specific instance of the robust submodular optimization studied in [19]. However, due to the generality of the problem scope studied in [19], they show strong hardness results and then have to resort to bi-criteria solutions. Instead, we are working on a particular instance of robust submodular optimization, and their bi-criteria solution may greatly enlarge the selected seed set size, which may not be allowed in our case. Furthermore, they work on a finite set of submodular functions, but in our case the objective function is parametrized with $\theta$ from a continuous parameter space $\Theta$, and it is unclear how their results work for the continuous case.

In a parallel work, He and Kempe study the same subject of robust influence maximization [16], but they follow the bi-criteria approximation approach of [19], and thus in general their results are orthogonal to ours. In particular, they use essentially the same objective function, but they work on a finite set of influence spread functions $\Sigma$, and require finding $k \cdot \ln|\Sigma|$ seeds to achieve a $1 - 1/e$ approximation ratio compared to the optimal seed set of size $k$; when working on a continuous parameter space $\Theta$, they show that it is equivalent to a finite spread function space of size $2^n$, thus requiring $\Theta(kn)$ seeds for a bi-criteria solution, which renders the bi-criteria solution useless. Thus their bi-criteria approach is suitable when the set of possible spread functions $\Sigma$ is small.

Adaptive sampling for improving RIM bears some resemblance to pure exploration bandit research [5], especially to the recently studied combinatorial pure exploration [7]. Both use adaptive sampling and achieve some optimization objective in the end. However, the optimization problem modeled in combinatorial pure exploration [7] does not have a robustness objective. Studying robust optimization together with combinatorial pure exploration could be a potentially interesting topic for future research. Another recent work [20] uses online algorithms to maximize the expected coverage of the union of influenced nodes in multiple rounds based on online feedback, and thus is different from our adaptive sampling objective: we use feedback to adjust adaptive sampling in order to find a seed set nearly maximizing the robust ratio after the sampling is done.

2. MODEL AND PROBLEM DEFINITION

As in [18], the independent cascade (IC) model can be equivalently modeled as a stochastic diffusion process from seed nodes or as reachability from seed nodes in random live-edge graphs. For brevity, we provide the live-edge graph description below. Consider a graph $G = (V, E)$ comprising a set $V$ of nodes and a set $E$ of directed edges, where every edge $e$ is associated with a probability $p_e \in [0, 1]$, and let $n = |V|$ and $m = |E|$. To generate a random live-edge graph, we declare each edge $e$ as live if flipping a biased random coin with probability $p_e$ returns success, and declare $e$ as blocked otherwise (with probability $1 - p_e$). The randomness on all edges is mutually independent. We define the subgraph $L$ consisting of $V$ and the set of live edges as the (random) live-edge graph. Given any set $S \subseteq V$ (referred to as seeds), let $R_L(S) \subseteq V$ denote the set of nodes reachable from $S$ in live-edge graph $L$, i.e., (1) $S \subseteq R_L(S)$, and (2) for a node $v \notin S$, $v \in R_L(S)$ iff there is a path in $L$ directed from some node in $S$ to $v$.

For convenience, we use the parameter vector $\theta = (p_e)_{e \in E}$ to denote the probabilities on all edges. The influence spread function $\sigma_\theta(S)$ is defined as the expected size of the reachable set from $S$, that is,

$$\sigma_\theta(S) := \sum_{L} \Pr_\theta[L] \cdot |R_L(S)|,$$

where $\Pr_\theta[L]$ is the probability of yielding live-edge graph $L$ under vector $\theta$. From [18], we know that the influence spread function is non-negative ($\forall S \subseteq V$, $\sigma_\theta(S) \ge 0$), monotone ($\forall S \subseteq T \subseteq V$, $\sigma_\theta(S) \le \sigma_\theta(T)$), and submodular ($\forall S \subseteq T \subseteq V$, $\forall v \in V$, $\sigma_\theta(S \cup \{v\}) - \sigma_\theta(S) \ge \sigma_\theta(T \cup \{v\}) - \sigma_\theta(T)$).

The well-known problem of Influence Maximization raised in [18] is stated as follows.

Problem 1 (Influence Maximization [18]). Given a graph $G = (V, E)$, parameter vector $\theta = (p_e)_{e \in E}$ and a fixed budget $k$, we are required to find a seed set $S \subseteq V$ of $k$ vertices, such that the influence spread function $\sigma_\theta(S)$ is maximized, that is,

$$S^*_\theta := \operatorname*{argmax}_{S \subseteq V, |S| = k} \sigma_\theta(S).$$

Algorithm 1 Greedy($G$, $k$, $\theta$)
Input: Graph $G$, budget $k$, parameter vector $\theta$
    $S_0 \leftarrow \emptyset$
    for $i = 1, 2, \ldots, k$ do
        $v \leftarrow \operatorname{argmax}_{v \notin S_{i-1}} \{\sigma_\theta(S_{i-1} \cup \{v\}) - \sigma_\theta(S_{i-1})\}$
        $S_i \leftarrow S_{i-1} \cup \{v\}$
    end for
    return $S_k$

It has been shown that the Influence Maximization problem is NP-hard [18]. Since the objective function $\sigma_\theta(S)$ is submodular, we have a $(1 - \frac{1}{e})$ approximation using the standard greedy policy Greedy($G$, $k$, $\theta$) in Algorithm 1 (assuming a value oracle on function $\sigma_\theta(\cdot)$). Let $S^g_\theta$ be the solution of Greedy($G$, $k$, $\theta$). As a convention, we assume that both the optimal seed set $S^*_\theta$ and the greedy seed set $S^g_\theta$ in this paper are implicitly of fixed size $k$.

On the other hand, it is proved by Feige [13] that this approximation ratio cannot be improved for the $k$-max cover problem, which is a special case of the influence maximization problem under the IC model.

However, the knowledge of the probability on edges is usually acquired by learning from real-world data [25, 26, 14, 24, 23], and the obtained estimates always have some inaccuracy compared to the true value. Therefore, it is natural to assume that, from observations of edge $e$, we can obtain the statistically significant neighborhood $[l_e, r_e]$, i.e., the confidence interval in which the true probability $p_e$ lies with high probability. This confidence interval prescribes the uncertainty on the true probability $p_e$ of edge $e$, and such uncertainty on edges may adversely impact the influence maximization task. Motivated by this, we study the problem of robust influence maximization as specified below.

Suppose for every edge $e$, we are given an interval $[l_e, r_e]$ ($0 \le l_e \le r_e \le 1$) indicating the range of the probability, and the ground-truth probability $p_e \in [l_e, r_e]$ of this edge is unknown. Denote $\Theta = \times_{e \in E} [l_e, r_e]$ as the parameter space of network $G$, and $\theta = (p_e)_{e \in E}$ as the latent parameter vector. Specifically, let $\theta^-(\Theta) = (l_e)_{e \in E}$ and $\theta^+(\Theta) = (r_e)_{e \in E}$ be the minimum and maximum parameter vectors, respectively; when the context is clear, we only write $\theta^-$ and $\theta^+$. For a seed set $S \subseteq V$ with $|S| = k$, define its robust ratio under parameter space $\Theta$ as

$$g(\Theta, S) := \min_{\theta \in \Theta} \frac{\sigma_\theta(S)}{\sigma_\theta(S^*_\theta)}, \tag{1}$$

where $S^*_\theta$ is the optimal solution of size $k$ when the probability on every edge is given by $\theta$.

Given $\Theta$ and solution $S$, the robust ratio $g(\Theta, S)$ characterizes the worst-case ratio between the influence spread of $S$ and the underlying optimal one, when the true probability vector $\theta$ is unknown (except knowing that $\theta \in \Theta$). Then, the Robust Influence Maximization (RIM) problem is defined as follows.

Problem 2 (Robust Influence Maximization). Given a graph $G = (V, E)$, parameter space $\Theta = \times_{e \in E} [l_e, r_e]$ and a fixed budget $k$, we are required to find a set $S \subseteq V$ of $k$ vertices, such that the robust ratio $g(\Theta, S)$ is maximized, i.e.,

$$S^*_\Theta := \operatorname*{argmax}_{S \subseteq V, |S| = k} g(\Theta, S) = \operatorname*{argmax}_{S \subseteq V, |S| = k} \min_{\theta \in \Theta} \frac{\sigma_\theta(S)}{\sigma_\theta(S^*_\theta)}.$$

The objective of this problem is to find a seed set $S^*_\Theta$ that has the largest robust ratio, that is, $S^*_\Theta$ should maximize the worst-case ratio between its influence spread and the optimal influence spread, when the true probability vector $\theta$ is unknown. When there is no uncertainty, which means $\Theta$ collapses to the true probability $\theta$, the RIM problem reduces back to the original influence maximization problem.

In RIM, the knowledge of the confidence interval is assumed to be the input. Another interpretation is that we are given an estimate of the probability vector $\hat\theta = (\hat p_e)_{e \in E}$ with a perturbation level $\delta_e$ on each edge $e$, such that the true probability $p_e \in [\hat p_e - \delta_e, \hat p_e + \delta_e] = [l_e, r_e]$, which constitutes the parameter space $\Theta = \times_{e \in E} [l_e, r_e]$. Notice that, in reality, this probability could be obtained via edge sampling, i.e., we make samples on edges and compute the fraction of times that the edge is live. On the other hand, we can also observe information cascades on each edge when collecting the trace of diffusion in the real world, so that the corresponding probability can be learned.

However, when the amount of observed information cascades is small, the best robust ratio $\max_S g(\Theta, S)$ for the given $\Theta$ can be so low that the output of a RIM algorithm does not have a good enough performance guarantee in the worst case. Then a natural question is, given $\Theta$, how to make further samples on edges (e.g., activating the source node $u$ of an edge $e = (u, v)$ and seeing whether the sink node $v$ is activated through edge $e$) so that $\max_S g(\Theta, S)$ can be efficiently improved? To be specific, how to make samples on edges and output $\Theta'$ and $S'$ according to the outcome so that (a) with high probability the true value $\theta$ lies in the output parameter space $\Theta'$, where the randomness is taken according to $\theta$, and (b) $g(\Theta', S')$ is large. This sub-problem is called Sampling for Improving Robust Influence Maximization, and will be addressed in Section 4.

3. ALGORITHM AND ANALYSIS FOR RIM

Consider the problem of RIM, where the parameter space $\Theta = \times_{e \in E} [l_e, r_e]$ is given and we do not know the true probability $\theta \in \Theta$. Let $\theta^- = (l_e)_{e \in E}$ and $\theta^+ = (r_e)_{e \in E}$.

Our first observation is that, when $\Theta$ is a single vector ($l_e = r_e$, $\forall e \in E$), RIM trivially reduces to the classical Influence Maximization problem. Therefore, we still have the following hardness result on RIM [18, 13]:

Theorem 1. The RIM problem is NP-hard, and for any $\varepsilon > 0$, it is NP-hard to find a seed set $S$ with robust ratio at least $1 - \frac{1}{e} + \varepsilon$.

To circumvent the above hardness result, we develop algorithms that achieve a reasonably large robust ratio. When we are not allowed to make new samples on the edges to improve the input interval, it is natural to utilize the greedy algorithm of submodular maximization in [18] (i.e., Algorithm 1) as the subroutine to calculate the solution. In light of this, we first propose the Lower-Upper Greedy Algorithm and the solution-dependent bound for $g(\Theta, S)$, and then discuss $g(\Theta, S)$ in the worst-case scenario.

3.1 Lower-Upper Greedy Algorithm

Given parameter space $\Theta = \times_{e \in E} [l_e, r_e]$ with the minimum and maximum parameter vectors $\theta^- = (l_e)_{e \in E}$ and $\theta^+ = (r_e)_{e \in E}$, our Lower-Upper Greedy algorithm (LUGreedy($G$, $k$, $\Theta$)) is described in Algorithm 2. It outputs the best seed set $S^{LU}_\Theta$ for the minimum parameter vector $\theta^-$ such that

$$S^{LU}_\Theta := \operatorname*{argmax}_{S \in \{S^g_{\theta^-},\, S^g_{\theta^+}\}} \{\sigma_{\theta^-}(S)\}. \tag{2}$$

Algorithm 2 LUGreedy($G$, $k$, $\Theta$)
Input: Graph $G = (V, E)$, budget $k$, parameter space $\Theta = \times_{e \in E} [l_e, r_e]$
    $S^g_{\theta^-} \leftarrow$ Greedy($G$, $k$, $\theta^-$)
    $S^g_{\theta^+} \leftarrow$ Greedy($G$, $k$, $\theta^+$)
    return $\operatorname{argmax}_{S \in \{S^g_{\theta^-},\, S^g_{\theta^+}\}} \{\sigma_{\theta^-}(S)\}$

To evaluate the performance of this output, we first define the gap ratio $\alpha(\Theta) \in [0, 1]$ of the input parameter space to be

$$\alpha(\Theta) := \frac{\sigma_{\theta^-}(S^{LU}_\Theta)}{\sigma_{\theta^+}(S^g_{\theta^+})}. \tag{3}$$

Then, LUGreedy achieves the following result:

Theorem 2 (solution-dependent bound). Given a graph $G$, parameter space $\Theta$ and budget limit $k$, LUGreedy outputs a seed set $S^{LU}_\Theta$ of size $k$ such that

$$g(\Theta, S^{LU}_\Theta) \ge \alpha(\Theta)\left(1 - \frac{1}{e}\right),$$

where $\alpha(\Theta) := \frac{\sigma_{\theta^-}(S^{LU}_\Theta)}{\sigma_{\theta^+}(S^g_{\theta^+})}$.

Proof. For any seed set $S$, $g(\Theta, S) = \min_{\theta \in \Theta} \frac{\sigma_\theta(S)}{\sigma_\theta(S^*_\theta)}$ by definition. Observe that $\sigma_\theta(S)$ is monotone in $\theta$ for any fixed $S$. From the definition of optimal solutions and the greedy algorithm, we get

$$\sigma_\theta(S^*_\theta) \le \sigma_{\theta^+}(S^*_\theta) \le \sigma_{\theta^+}(S^*_{\theta^+}) \le \frac{\sigma_{\theta^+}(S^g_{\theta^+})}{1 - 1/e}.$$

Moreover, this implies that

$$g(\Theta, S) \ge \min_{\theta \in \Theta} \frac{\sigma_\theta(S)}{\sigma_{\theta^+}(S^g_{\theta^+})}\left(1 - \frac{1}{e}\right) = \frac{\sigma_{\theta^-}(S)}{\sigma_{\theta^+}(S^g_{\theta^+})}\left(1 - \frac{1}{e}\right).$$

Using the seed set $S^{LU}_\Theta$ from LUGreedy, it follows immediately that $g(\Theta, S^{LU}_\Theta) \ge \frac{\sigma_{\theta^-}(S^{LU}_\Theta)}{\sigma_{\theta^+}(S^g_{\theta^+})}\left(1 - \frac{1}{e}\right) = \alpha(\Theta)\left(1 - \frac{1}{e}\right)$.

We refer to $\alpha(\Theta)(1 - \frac{1}{e})$ as the solution-dependent bound of $g(\Theta, S^{LU}_\Theta)$ that LUGreedy achieves, because it depends on the solution $S^{LU}_\Theta$. The advantage is that it can be evaluated once we have the solution, and then we know the robust ratio must be at least this lower bound. Note that the bound is good if $\alpha(\Theta)$ is not too small, which in turn indicates that the seed set $S^{LU}_\Theta$ we find has good influence spread $\sigma_\theta(S^{LU}_\Theta)$ under any probability vector $\theta \in \Theta$.

It is worth remarking that the choice of $\alpha(\Theta) = \sigma_{\theta^-}(S^{LU}_\Theta) / \sigma_{\theta^+}(S^g_{\theta^+})$ as a measurement is for the following reasons: (a) Intuitively, $S^g_{\theta^-}$ is expected to be the best possible seed set we can find that maximizes $\sigma_{\theta^-}(\cdot)$; (b) Meanwhile, we consider $S^g_{\theta^+}$ as a potential seed set for the later theoretical analysis (in the proof of Theorem 6), which requires the alignment of the same seed set for the numerator and denominator. Thus, $\alpha(\Theta) \ge \max\{\sigma_{\theta^-}(S^g_{\theta^-}), \sigma_{\theta^-}(S^g_{\theta^+})\} / \sigma_{\theta^+}(S^g_{\theta^+})$. In particular, when $\theta^+$ and $\theta^-$ tend to the same value $\theta$, RIM tends towards classical Influence Maximization, and thus the influence spread $\sigma_\theta(S^{LU}_\Theta)$ can be close to the best possible result $\sigma_\theta(S^g_\theta)$. The approach adopted by LUGreedy is similar to the sandwich approximation used in [22].

The following example shows that for certain problem instances, the gap ratio $\alpha(\Theta)$ of LUGreedy can match the robust ratio $g(\Theta, S^{LU}_\Theta)$, which also matches the best possible robust ratio $\max_{|S| = k} g(\Theta, S)$.

Example 1. Consider a graph $G = (V, E)$ where the set of nodes is equally partitioned into $2k$ subsets $V = \cup_{i=1}^{2k} V_i$ such that every $V_i$ contains $t + 1$ nodes. Let $V_i = \{v_i^j \mid 1 \le j \le t + 1\}$ and set $E = \cup_{i=1}^{2k} E_i$ where $E_i = \{(v_i^1, v_i^j) \mid 2 \le j \le t + 1\}$. That is, every $(V_i, E_i)$ forms a star with $v_i^1$ being the node at the center, and all stars are disconnected from one another. For the parameter space we set the interval on every edge to be $[l, r]$. When LUGreedy selects $k$ nodes, since all $v_i^1$'s have the same (marginal) influence spread, w.l.o.g., suppose that LUGreedy selects $\{v_1^1, v_2^1, \ldots, v_k^1\}$. Then if we set the true probability vector $\theta \in \Theta$ such that $p_e = l$ for every $e \in \cup_{i=1}^{k} E_i$, and $p_e = r$ for every $e \in \cup_{i=k+1}^{2k} E_i$, it is easy to check that $\max_{|S| = k} g(\Theta, S) = g(\Theta, S^{LU}_\Theta) = \alpha(\Theta) = \frac{1 + tl}{1 + tr}$.

The intuition from the above example is that, when there are many alternative choices for the best seed set, and these alternative seed sets do not have much overlap in their influence coverage, the gap ratio $\alpha(\Theta)$ is a good indicator of the best possible robust ratio one can achieve.

In the next subsection, we will show that the best robust ratio could be very bad for the worst possible graph $G$ and parameter space $\Theta$, which motivates us to do further sampling to improve $\Theta$.

3.2 Discussion on the robust ratio

From the theoretical perspective, we show in this part that if we make no assumption or only add loose constraints to the input parameter space $\Theta$, then no algorithm can guarantee good performance for some worst possible graph $G$.

Theorem 3. For RIM,

1. There exists a graph $G = (V, E)$ and parameter space $\Theta = \times_{e \in E} [l_e, r_e]$, such that
$$\max_{|S| = k} g(\Theta, S) = \max_{|S| = k} \min_{\theta \in \Theta} \frac{\sigma_\theta(S)}{\sigma_\theta(S^*_\theta)} = O\left(\frac{k}{n}\right).$$

2. There exists a graph $G = (V, E)$, constant $\delta = \Theta\left(\frac{1}{n}\right)$ and parameter space $\Theta = \times_{e \in E} [l_e, r_e]$ where $r_e - l_e \le \delta$ for every $e \in E$, such that
$$\max_{|S| = k} g(\Theta, S) = O\left(\frac{\log n}{n}\right).$$

3. Consider a random seed set $\tilde S$ of size $k$. There exists a graph $G = (V, E)$, constant $\delta = \Theta\left(\frac{1}{\sqrt{n}}\right)$ and parameter space $\Theta = \times_{e \in E} [l_e, r_e]$ where $r_e - l_e \le \delta$ for every $e \in E$, such that
$$\max_{\Omega} \min_{\theta \in \Theta} \mathbb{E}_{\tilde S \in \Omega}\left[\frac{\sigma_\theta(\tilde S)}{\sigma_\theta(S^*_\theta)}\right] = O\left(\frac{\log n}{\sqrt{n}}\right),$$
where $\Omega$ is any probability distribution over seed sets of size $k$, and $\mathbb{E}_{\tilde S \in \Omega}[\cdot]$ is the expectation over the random set $\tilde S$ taken from the distribution $\Omega$.

In the first case, we allow the input $\Theta$ to be an arbitrary parameter space. It is possible that $\Theta = \times_{e \in E} [0, 1]$ for some graph $G$, which means there is no knowledge at all about edge probabilities. Then any seed set may achieve only an $O\left(\frac{k}{n}\right)$-approximation of the optimal solution in the worst case. Intuitively, a selected seed set $S$ may rarely activate other nodes (i.e., its spread is $O(k)$), while the optimal solution (for the latent $\theta$) may cover almost the whole graph (i.e., $\Omega(n)$).

In the second case, an additional constraint $\|\theta^+ - \theta^-\|_\infty \le \delta$ is assumed on the parameter space, i.e., for every $e \in E$, $r_e - l_e \le \delta$, to see if we could obtain better performance when $\delta$ is small. However, even when $\delta$ is in the order of $O(1/n)$, the robust ratio can be as small as $O(\log n / n)$. The proof is related to the phase transition in the Erdős-Rényi graph for the emergence of the giant component. In particular, if we have a graph $G$ consisting of two disconnected, equal-sized Erdős-Rényi random graphs with edge probabilities close to the critical value for generating a giant connected component, then whenever we select a seed in one component, that component could be just below the threshold, resulting in $O(\log n)$ influence spread, while the other component is just above the threshold, leading to $\Theta(n)$ influence spread. Thus, the worst-case ratio for any one-node seed set is always $O(\log n / n)$. A similar discussion can be found in [17].

In the third case, we allow the algorithm to be randomized, namely the output seed set $\tilde S$ is a random set of size $k$. Even in this case, the robust ratio could be as bad as $O(\log n / \sqrt{n})$.

4. SAMPLING FOR IMPROVING RIM

In the previous section, we proposed the LUGreedy algorithm to check the solution-dependent bound of the robust ratio, and pointed out that the worst-case bound could be small if $\Theta$ is not assumed to be tight enough.

Theorem 3 in the previous subsection points out that the best possible robust ratio $\max_S g(\Theta, S)$ can be so low that the output for RIM cannot provide us with a satisfying seed set in the worst case. Then a natural question is: given the input $\Theta$, can we make efficient samples on edges so that $\Theta$ is narrowed into $\Theta'$ (this means the true $\theta \in \Theta'$ with high probability) and then output a seed set $S'$ that makes $g(\Theta', S')$ large? This problem is called Sampling for Improving RIM.

In this section we study both uniform sampling and adaptive sampling for improving RIM. According to Chernoff's bound, the more samples we make on an edge, the narrower the confidence interval we get that guarantees the true probability to be located within the interval with a desired level of confidence. After sampling to get a narrower parameter space, we can use the LUGreedy algorithm to get the seed set.

4.1 Uniform Sampling

In Sampling for Improving RIM, the goal is to design a sampling and maximization algorithm $\mathcal{A}$ that outputs $\Theta'$ and $S'$ such that with high probability the robust ratio of $S'$ in $\Theta'$ is large. After sampling edges, we can use Chernoff's bound to compute the confidence interval, and the confidence interval can be further narrowed down with more samples. However, the key issue is to connect the width of the confidence interval with the stability of the influence spread. We propose two ideas, exploiting properties of additive and multiplicative confidence intervals respectively, to address this issue, and incorporate them into the Uniform Sampling algorithm (Algorithm 3) with theoretical justification (Theorem 6).

Our first idea is inspired by the following lemma from [11], which builds the connection in the additive form.

Lemma 4 (Lemma 7 in [11]). Given graph $G$ and parameter space $\Theta$ such that $\forall \theta_1, \theta_2 \in \Theta$, $\|\theta_1 - \theta_2\|_\infty \le \delta$, then, $\forall S \subseteq V$,

$$|\sigma_{\theta_1}(S) - \sigma_{\theta_2}(S)| \le mn\delta.$$

We use a tight example (in the order of $|V|$ and $|E|$) to illustrate the connection and give an insight into this lemma as follows. Consider graph $G = (V, E)$ with $|V| = n$ and $|E| = m$ ($m \gg n$). Let $G$ contain two disjoint cycles, each containing exactly $\frac{n}{2}$ nodes and $\frac{n}{2}$ edges. We arbitrarily assign the remaining $m - n$ edges between the two cycles. Then, for every edge $e$ in a cycle, the interval is $l_e = r_e = 1$, and $l_e = 0$, $r_e = \delta$ for those between the two cycles, which constitutes $\Theta = \times_{e \in E} [l_e, r_e]$. Suppose $\delta > 0$ is sufficiently small, and let the budget be $k = 1$. For any single-node set $S$, it is easy to check that for $\theta^- = (l_e)_{e \in E}$, $\sigma_{\theta^-}(S) = \frac{n}{2}$, and for $\theta^+ = (r_e)_{e \in E}$, $\sigma_{\theta^+}(S) \approx \frac{n}{2} + \frac{n}{2}(m - n)\delta$, thus $|\sigma_{\theta^+}(S) - \sigma_{\theta^-}(S)| \approx \frac{1}{2} n (m - n) \delta$ in this case. As a comparison, from Lemma 4, we know that $|\sigma_{\theta^+}(S) - \sigma_{\theta^-}(S)| \le mn\delta$.

Therefore, the above lemma establishes the guidance that we may sample every edge sufficiently many times to shrink the confidence intervals in $\Theta$, and feed LUGreedy with $\Theta$ in the same way as when solving RIM; the performance is then guaranteed by Theorem 2, which matches our intuition that LUGreedy performs well with a satisfactory $\Theta$.

On the other hand, our second idea is to use the multiplicative confidence interval to reduce the fluctuation of the influence spread, so that LUGreedy still applies. The next lemma is crucial to achieve this goal.

Lemma 5. Given graph $G = (V, E)$ and parameter space $\Theta$. If there exists $\lambda \ge 0$ such that $r_e \le (1 + \lambda) l_e$ for every edge $e \in E$, then for any nonempty set $S \subseteq V$,

$$\frac{\sigma_{\theta^+}(S)}{\sigma_{\theta^-}(S)} \le (1 + \lambda)^n, \tag{4}$$

and

$$\max_{|S| = k} \min_{\theta \in \Theta} \frac{\sigma_\theta(S)}{\sigma_\theta(S^*_\theta)} \ge (1 + \lambda)^{-n}. \tag{5}$$

In this lemma, the ratio of influence spreads is bounded based on the relation between $l_e$ and $r_e$ in the multiplicative form.

To unify both ideas mentioned above, we propose the Uniform Sampling for RIM algorithm (US-RIM) in Algorithm 3, and the theoretical result is presented in Theorem 6. Basically, the algorithm samples every edge the same number of times, and uses LUGreedy to obtain the seed set. We set different $t$ and $\delta_e$ for the two ideas. Henceforth, we explicitly refer to the first setting as Uniform Sampling with Additive form (US-RIM-A), and the second one as Uniform Sampling with Multiplicative form (US-RIM-M).

Algorithm 3 US-RIM
Input: Graph $G = (V, E)$, budget $k$, $(\epsilon, \gamma)$
Output: Parameter space $\Theta_{out}$, seed set $S_{out}$
    for all $e \in E$ do
        Sample $e$ for $t$ times, and observe $x_e^1, \ldots, x_e^t$
        $p_e \leftarrow \frac{1}{t} \sum_{i=1}^{t} x_e^i$, and set $\delta_e$ according to Theorem 6
        $r_e \leftarrow \min\{1, p_e + \delta_e\}$, $l_e \leftarrow \max\{0, p_e - \delta_e\}$
    end for
    $\Theta_{out} \leftarrow \times_{e \in E} [l_e, r_e]$
    $S_{out} \leftarrow$ LUGreedy($G$, $k$, $\Theta_{out}$)
    return $(\Theta_{out}, S_{out})$

Theorem 6. Given a graph $G = (V, E)$, budget $k$, and accuracy parameters $\epsilon, \gamma > 0$, let $n = |V|$ and $m = |E|$. Then for any unknown ground-truth parameter vector $\theta = (p_e)_{e \in E}$, Algorithm US-RIM outputs $(\Theta_{out}, S_{out})$ such that

$$g(\Theta_{out}, S_{out}) \ge \left(1 - \frac{1}{e}\right)(1 - \epsilon),$$

with $\Pr[\theta \in \Theta_{out}] \ge 1 - \gamma$, where the randomness is taken according to $\theta$, if we follow either of the two settings:

1. Set $t = \frac{2 m^2 n^2 \ln(2m/\gamma)}{k^2 \epsilon^2}$, and for all $e$, set $\delta_e = \frac{k\epsilon}{mn}$;

2. Assume we have $p'$ such that $0 < p' \le \min_{e \in E} p_e$, set $t = \frac{3 \ln(2m/\gamma)}{p'} \left(\frac{2n}{\ln(1/(1 - \epsilon))} + 1\right)^2$, and for all $e$, set $\delta_e = \frac{1}{n} p_e \log\frac{1}{\gamma}$.

In general, the total number of samples summed over all edges is $O\left(\frac{m^3 n^2 \log(m/\gamma)}{k^2 \epsilon^2}\right)$ for US-RIM-A, and $O\left(\frac{m n^2 \log(m/\gamma)}{p' \epsilon^2}\right)$ for US-RIM-M with an additional constant $p'$, the lower bound on all edge probabilities. The difference is that the former has a higher order of $m$, and the latter requires the knowledge of $p'$ and has an extra dependency on $O(1/p')$. Since the sample complexity for both settings can be calculated in advance, one may compare the values and choose the smaller one when running the uniform sampling algorithm. An intuitive interpretation is that: (1) with high probability ($\ge 1 - \gamma$), the algorithm always outputs a $(1 - \frac{1}{e} - \epsilon)$-approximate solution guaranteed by US-RIM-A; (2) if $p' = \Omega\left(\frac{k^2}{m^2}\right)$ (a loose assumption naturally satisfied in practice), we may choose US-RIM-M to achieve better sample complexity.

4.2 Non-uniform and Adaptive Sampling

In a real network, the importance of edges in an influence diffusion process varies significantly. Some edges may have larger influence probability than others or connect two important nodes in the network. Therefore, in sampling it is crucial to sample edges appropriately. Moreover, we can adapt our sampling strategy dynamically to put more sampling effort on critical edges as we learn the edge probabilities more accurately over time.

For convenience, given graph $G = (V, E)$, we define the observation set $M = \{M_e\}_{e \in E}$ as a collection of sets, where $M_e = \{x_e^1, x_e^2, \cdots, x_e^{t_e}\}$ denotes the observed values of edge $e$ via the first $t_e$ samples on edge $e$. We allow that a parameter space $\Theta_0 \subseteq \times_{e \in E} [0, 1]$ is given, which can be obtained from some initial samples $M_0$ (e.g., by uniformly sampling each edge of the graph a fixed number of times).

The following lemma is used to calculate the confidence interval, and is a combination of the additive and multiplicative Chernoff bounds. We adopt this bound in the experiments since some edges in the graph have large influence probability while others have small ones, and using either the additive or the multiplicative bound alone may not be good enough to obtain a small confidence interval. The following bound is adapted from [1] and is crucial for us in the experiments.

Lemma 7. For each $e \in E$, let $M_e = \{x_e^1, x_e^2, \ldots, x_e^{t_e}\}$ be the samples of $e$ in $M = \{M_e\}_{e \in E}$, and $t_e$ be the number of samples. Given any $\gamma > 0$, let the confidence intervals for all edges be $\Theta = \times_{e \in E} [l_e, r_e]$, such that, for any $e \in E$,

$$l_e = \max\left\{\hat p_e + \frac{c_e^2}{2} - c_e \sqrt{\frac{c_e^2}{4} + \hat p_e},\ 0\right\},$$

$$r_e = \min\left\{\hat p_e + \frac{c_e^2}{2} + c_e \sqrt{\frac{c_e^2}{4} + \hat p_e},\ 1\right\},$$

where $\hat p_e = \frac{\sum_{i=1}^{t_e} x_e^i}{t_e}$ and $c_e = \sqrt{\frac{3}{t_e} \ln\frac{2m}{\gamma}}$. Then, with probability at least $1 - \gamma$, the true probability $\theta = (p_e)_{e \in E}$ satisfies $\theta \in \Theta$.

Our intuition for non-uniform sampling is that the edges along the information cascades of important seeds determine the influence spread, and hence they should be estimated more accurately than other edges not along important information cascade paths. Thus, we use the following Information Cascade Sampling method to select edges. Starting from the seed set $S$, once node $v$ is activated, $v$ will try to activate its out-neighbors. In other words, for every out-edge $e$ of $v$, denote $t_e$ as the number of samples; then $e$ will be sampled once to generate a new observation $x_e^{t_e}$ based on the latent Bernoulli distribution with success probability $p_e$, and $t_e$ will be increased by 1. The process goes on until the end of the information cascade.

We propose the Information Cascade Sampling for RIM (ICS-RIM) algorithm in Algorithm 4, which adopts the information cascade sampling described above to select edges.

Algorithm 4 ICS-RIM($\tau$): Information Cascade Sampling
Input: Graph $G = (V, E)$, budget $k$, initial sample $M_0$, threshold $\kappa$, $\gamma$
Output: Parameter space $\Theta_{out}$, seed set $S_{out}$
    $i \leftarrow 0$
    repeat
        Get $\Theta_i$ based on $M_i$ (see Lemma 7)
        $S^{LU}_{\Theta_i} \leftarrow$ LUGreedy($G$, $k$, $\Theta_i$)
        $M_{i+1} \leftarrow M_i$
        for $j = 1, 2, \ldots, \tau$ do
            Do information cascade with the seed set $S^{LU}_{\Theta_i}$
            During the cascade, once $v \in V$ is activated, sample all out-edges of $v$ and update $M_{i+1}$
        end for
        $i \leftarrow i + 1$
    until $\alpha(\Theta_i) > \kappa$
    $S_{out} \leftarrow S^{LU}_{\Theta_{i-1}}$
    $\Theta_{out} \leftarrow \Theta_{i-1}$
    return $(\Theta_{out}, S_{out})$

Algorithm 4 is an iterative procedure. In the $i$-th iteration, Lemma 7 is used to compute the confidence interval $\Theta_i$ from the observation set $M_i$. Then according to $\Theta_i$, we find the lower-upper greedy set $S^{LU}_{\Theta_i}$ and use information cascades to update the observation set $M_{i+1}$ by absorbing new samples. Since the robust ratio $g(\Theta, S^{LU}_{\Theta_i})$ cannot be calculated efficiently, we calculate $\alpha(\Theta)$ (defined in (3)) instead.

In our algorithm, we use a pre-determined threshold $\kappa$ ($\kappa \in (0, 1)$) as the stopping criterion. Therefore, for $S_{out}$, the robust ratio $g(\Theta, S_{out}) \ge \alpha(\Theta)\left(1 - \frac{1}{e}\right) > \kappa\left(1 - \frac{1}{e}\right)$ is guaranteed by Theorem 2, and the true probability $\theta \in \Theta_{out}$ holds with probability at least $1 - \gamma$ due to Lemma 7.

Compared with the information cascade sampling method, calculating a greedy set is time-consuming. Therefore in Algorithm 4, we call LUGreedy once every $\tau$ rounds of information cascades to reduce the cost.

5. EMPIRICAL EVALUATION

We conduct experiments on two datasets, Flixster$^1$ and NetHEPT$^2$, to verify the robustness of influence maximization and our sampling methods.

5.1 Experiment Setup

5.1.1 Data Description

Flixster. The Flixster dataset is a network of an American social movie discovery service (www.flixster.com). [...] and in these cases we view edge $(u, v)$ as sampled. According to the data reported in [2], on average every edge is sampled 318 times in their learning process. We then use 318 samples on each edge as our initial sample $M_0$.

NetHEPT. The NetHEPT dataset [10] is extensively used in many influence maximization studies. It is an academic collaboration network from the "High Energy Physics-Theory" section of arXiv from 1991 to 2003, where nodes represent the authors and each edge in the network represents one paper co-authored by two nodes. It contains 15233 nodes and 58891 undirected edges (including duplicated edges). We remove those duplicated edges and obtain a directed graph $G = (V, E)$, $|V| = 15233$, $|E| = 62774$ (directed edges). Since the NetHEPT dataset does not contain influence probabilities on edges, we set the probability on edges according to the weighted cascade model [18] as the ground-truth parameter, i.e., $\forall e = (v, u) \in E$, let $x_u$ be the in-degree of $u$ in the edge-duplicated graph and $y_e$ be the number of edges connecting nodes $v$ and $u$; then the true probability is $p_e = 1 - \left(1 - \frac{1}{x_u}\right)^{y_e}$. Following the same baseline as for Flixster, we initially sample each edge 318 times as $M_0$.

5.1.2 Algorithms

We test both the uniform sampling algorithm US-RIM and the adaptive sampling algorithm ICS-RIM, as well as another adaptive algorithm OES-RIM (Out-Edge Sampling) as the baseline (to be described shortly). Each algorithm is given a graph $G$ and the initial observation set $M_0$. Note that the methods used to estimate the parameter space based on sampling results in Algorithm 3 and Algorithm 4 are different. In order to make the comparison meaningful, in this section a common method according to Lemma 7 is used to estimate the parameter space for all three algorithms. In all tests, we set the size of the seed set to $k = 50$. To reduce the running time, we use a faster approximation algorithm
To trans- PMIA (proposed in [9]) to replace the well known greedy form the dataset into a weighted graph, each user is rep- algorithm purposed in [18] in the whole experiment. The resented by a node, and a directed edge from node u to v accuracyrequirementγ =o(1)issettobeγ =m−0.5 where is formed if v rates one movie shortly after u does so on m is the number of edges. the common movie. The dataset is analyzed in [2], and the influence probability are learned by the topic-aware model. US-RIM. The algorithm is slightly modified from Algo- Weusethelearningresultof[2]inourexperiment,whichis rithm3forabettercomparisonofperformance. Themodi- a graph containing 29357 nodes and 212614 directed edges. fiedalgorithmproceedsinaniterativefashion: Ineachiter- There are 10 probabilities on each edge, and each probabil- ation,thealgorithmmakesτ samplesoneachedge,updates 1 ityrepresentstheinfluencefromthesourceusertothesink ΘaccordingtoLemma7andcomputesα(Θ). Thealgorithm on a specific topic. Since most movies belong to at most stops when α(Θ)≥κ=0.8. τ is set to 1000, 1000, 250 for 1 twotopics,weonlyconsider3outof10topicsinourexperi- NetHEPT, Flixster(Topic 8), Flixster(Mixed), respectively ment,andgettwoinducedgraphswhosenumberofedgesare to achieve fine granularity and generate visually difference 23252 and 64934 respectively. For the first graph, probabil- of α(Θ) in our results. itiesoftopic8aredirectlyusedasthegroundtruthparam- eter (termed as Flixster(Topic 8)). For the second graph, ICS-RIM. As stated in Algorithm 4, in each iteration, the we mix the probabilities of Topic 1 and Topic 4 on each algorithmdoτ =5000timesinformationcascadesampling 2 edge evenly to obtain the ground-truth probability (termed based on the seed set from the last iteration, and then it as as Flixster(Mixed)). 
After removing isolated nodes, the updates Θ according to Lemma 7, computes α(Θ) and uses number of nodes in the two graphs are 14473 and 7118 re- LUGreedy algorithm to compute the seed set for the next spectively. round. The algorithm stops when α(Θ)≥κ=0.8. In [2], the probability for every edge (u,v) is learned by rating cascades that reach u and may or may not reach v, OES-RIM. This algorithm acts as a baseline, and it pro- 1http://www.cs.sfu.ca/∼sja25/personal/datasets/ ceeds in the similar way to ICS-RIM. Instead of sampling 2http://research.microsoft.com/en- information cascades starting from the current seed set as us/people/weic/projects.aspx in ICS-RIM, OES-RIM only sample out-edges from the seed set. More specifically, in each iteration, the algorithm sam- ples 5000 times of all out-edges of the seed set from last iteration, for the three graphs respectively, and then it up- dates Θ according to Lemma 7, computes α(Θ) and uses LUGreedy algorithm to compute the seed set for the next round. Note that for OES-RIM, α(Θ) remains small (with the increase of the number of samples) and cannot exceed the threshold κ even the iteration has been processed for a large number of times, therefore we will terminate it when α(Θ) is stable. 5.1.3 α¯ asaUpperBound Theorem 2 shows that α(Θ)(cid:0)1− 1(cid:1) is a lower bound for e the robust ratio g(Θ,SLU). We would also like to find some Θ upperboundofg(Θ,SLU): Iftheupperboundisreasonably Θ close to the lower bound or match in trend of changes, it indicates that α(Θ)(cid:0)1− 1(cid:1) is a reasonable indicator of the e robust ratio achieved by the LUGreedy output SLU. For any Θ θ∈Θ,wedefineα¯(Θ,θ)= σθ(LΘU). Thefollowingshowsthat Figure 1: α(Θ) and α¯(Θ) for different widths of con- σθ(Sθg) fidence interval W. α¯(Θ,θ) is an upper bound for g(Θ,SLU): Θ α¯(Θ,θ)= σθ(SΘLU) ≥ σθ(SΘLU) ≥ min σθ(cid:48)(SΘLU) =g(Θ,SLU). 
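As an illustration only (not the authors' implementation), the upper bound $\bar{\alpha}(\Theta, \theta)$ defined above can be evaluated exactly on a toy instance. The sketch below assumes the independent cascade model and replaces PMIA and Monte Carlo simulation with brute-force enumeration of live-edge graphs, which is feasible only for a handful of edges. The helper names `exact_spread`, `greedy_seed_set` and `alpha_bar`, the four-node graph, and the interval endpoints are all hypothetical. Since every $\theta \in \Theta$ yields a valid upper bound, the sketch evaluates two candidate vectors and keeps the smaller value.

```python
from itertools import product


def exact_spread(edges, probs, seeds):
    """Exact expected independent-cascade spread of `seeds`.

    Enumerates all 2^|E| live-edge graphs, so it is only usable on toy graphs.
    """
    total = 0.0
    for live in product([False, True], repeat=len(edges)):
        # Probability of this particular live-edge outcome.
        weight = 1.0
        for is_live, p in zip(live, probs):
            weight *= p if is_live else 1.0 - p
        if weight == 0.0:
            continue
        # Nodes reachable from the seeds along live edges.
        reached = set(seeds)
        frontier = list(seeds)
        while frontier:
            u = frontier.pop()
            for (a, b), is_live in zip(edges, live):
                if is_live and a == u and b not in reached:
                    reached.add(b)
                    frontier.append(b)
        total += weight * len(reached)
    return total


def greedy_seed_set(nodes, edges, probs, k):
    """Plain greedy: repeatedly add the node giving the largest spread."""
    chosen = []
    for _ in range(k):
        candidates = [v for v in nodes if v not in chosen]
        best = max(candidates,
                   key=lambda v: exact_spread(edges, probs, chosen + [v]))
        chosen.append(best)
    return chosen


def alpha_bar(nodes, edges, theta, seeds, k):
    """sigma_theta(seeds) divided by sigma_theta(greedy seed set for theta)."""
    greedy = greedy_seed_set(nodes, edges, theta, k)
    return exact_spread(edges, theta, seeds) / exact_spread(edges, theta, greedy)


# Hypothetical toy instance: two disjoint edges, each with interval [0.1, 0.9].
nodes = [0, 1, 2, 3]
edges = [(0, 1), (2, 3)]
candidate = [0]  # stands in for the LUGreedy output with k = 1

theta_low_on_candidate = [0.1, 0.9]   # candidate's edge at its lower end
theta_high_on_candidate = [0.9, 0.1]  # candidate's edge at its upper end

bounds = [alpha_bar(nodes, edges, th, candidate, 1)
          for th in (theta_low_on_candidate, theta_high_on_candidate)]
upper_bound = min(bounds)
print(upper_bound)
```

On this toy instance the first vector gives a ratio of $1.1/1.9 \approx 0.579$ and the second gives 1, so the reported upper bound is about 0.579. The vector that puts the candidate's own edge at its lower end and the competing edge at its upper end is the more adversarial of the two.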
The next question is how to find a $\theta = (\theta_e)_{e \in E} \in \Theta$ to make the upper bound $\bar{\alpha}(\Theta, \theta)$ as small as possible. In our experiments, we use the following two heuristics and take their minimum.

The first heuristic borrows the intuition from Example 1, which says that the gap ratio $\alpha(\Theta)$ is close to the robust ratio $g(\Theta, S^{LU}_{\Theta})$ when (a) there are two disjoint seed sets with similar influence spread, (b) their cascade overlap is small, and (c) the reachable edges from one seed set use lower-end parameter values while the reachable edges from the other seed set use upper-end parameter values. Thus in our heuristic, we use the PMIA algorithm to find another seed set $S'$ of $k$ nodes after removing all nodes in $S^{LU}_{\Theta}$. We then run information cascades from both $S^{LU}_{\Theta}$ and $S'$ an equal number of times. Finally, for every edge $e$, if it is sampled more often in the information cascades with seed set $S^{LU}_{\Theta}$ than with $S'$, we set $\theta_e = l_e$; otherwise we set $\theta_e = r_e$. The second heuristic is a variant of the first one, where we run a number of information cascades from $S^{LU}_{\Theta}$, and for any edge $e$ that is sampled in at least 10% of the cascades, we set $\theta_e = l_e$; otherwise we set $\theta_e = r_e$.

Other more sophisticated heuristics are possible, but finding a tighter upper bound for the robust ratio could be a separate research topic, and thus we only use the simple combination of the above two in this paper, which is already indicative. We henceforth use $\bar{\alpha}(\Theta)$ to denote the upper bound found by taking the minimum of the above two heuristics.

5.2 Results

5.2.1 $\alpha(\Theta)$ and $\bar{\alpha}(\Theta)$ with Predetermined Intervals

In the first experiment we explore the relationship between the width of the confidence interval $\Theta = \times_{e \in E} [l_e, r_e]$ and $\alpha(\Theta)$ together with $\bar{\alpha}(\Theta)$. For a given interval width $W$, we set $l_e = \max\{p_e - \frac{W}{2}, 0\}$ and $r_e = \min\{p_e + \frac{W}{2}, 1\}$ for all $e \in E$, where $p_e$ is the ground-truth probability of $e$. Then we calculate $\alpha(\Theta)$ and $\bar{\alpha}(\Theta)$. We vary the width $W$ to see the trend of changes of $\alpha(\Theta)$ and $\bar{\alpha}(\Theta)$. Figure 1 reports the result on the three graphs with seed set size $k = 50$.

First, we observe that as the parameter space $\Theta$ becomes wider, the values of both $\alpha(\Theta)$ and $\bar{\alpha}(\Theta)$ become smaller, which matches our intuition that larger uncertainty results in worse robustness. Second, there is a sharp decrease of $\alpha(\Theta)$ for $W \in [0, 0.1]$ and a much slower decrease afterwards for all three graphs. The decrease of $\bar{\alpha}(\Theta)$ is not as sharp as that of $\alpha(\Theta)$, but it also slows down for $W$ larger than 0.2. The overall trend of $\alpha(\Theta)$ and $\bar{\alpha}(\Theta)$ suggests that the robust ratio may be sensitive to the uncertainty of the parameter space, and only when that uncertainty is reduced to a certain level can we obtain a reasonable guarantee on the robustness of our solution.

As a comparison, we know that the average number of samples on each edge is 318 for the learned probabilities in the Flixster dataset. This corresponds to an average interval width of 0.293 for topic 8 and 0.265 for the mixed topic. At these interval widths, the $\alpha(\Theta)$ values are approximately 0.04 and 0.08 respectively for the two graphs, and the $\bar{\alpha}(\Theta)$ values are approximately 0.12 and 0.2 respectively. This means that, even considering the upper bound $\bar{\alpha}(\Theta)$, the robust ratio is low, and thus the learned probabilities reported in [2] may result in quite poor performance for robust influence maximization.

Of course, our results on $\alpha(\Theta)$ and $\bar{\alpha}(\Theta)$ only concern the robustness of our LUGreedy algorithm, and there could exist a better algorithm with higher robustness at the same uncertainty level. Finding a better RIM algorithm seems to be a difficult task, and we hope that our study could motivate more research in searching for such algorithms. Besides $S^{LU}_{\Theta}$, we also independently test the classical greedy seed set $S^{g}_{\theta}$ for $\theta = (p_e)_{e \in E}$ on the lower parameter vector $\theta^-$ (that is, $\frac{\sigma_{\theta^-}(S^{g}_{\theta})}{\sigma_{\theta^+}(S^{g}_{\theta^+})}$ versus $\alpha(\Theta)$), and the average performance on each data point is 2.45%, 1.05%, 6.11% worse than $S^{LU}_{\Theta}$ for Flixster(Mixed), Flixster(Topic 8) and NetHEPT, respectively. This shows that $S^{LU}_{\Theta}$ outperforms $S^{g}_{\theta}$ in the worst-case scenario, and henceforth we only use $S^{LU}_{\Theta}$ in the following experiments.

5.2.2 Results for Sampling Algorithms

Figure 2: $\alpha(\Theta)$ and $\bar{\alpha}(\Theta)$ for different average number of samples per edge on graph NetHEPT.

Figure 3: $\alpha(\Theta)$ and $\bar{\alpha}(\Theta)$ for different average number of samples per edge on graph Flixster(Topic 8).

Figure 4: $\alpha(\Theta)$ and $\bar{\alpha}(\Theta)$ for different average number of samples per edge on graph Flixster(Mixed).

Figures 2, 3 and 4 report the results of $\alpha = \alpha(\Theta)$ and $\bar{\alpha} = \bar{\alpha}(\Theta)$ for the three tested graphs respectively, as the average number of samples per edge increases. For better presentation, we trim all figures at the point where $\alpha$(US-RIM) $= 0.7$. (For example, in Flixster(Topic 8), US-RIM requires 77318 samples on average for $\alpha$ to reach 0.8, while ICS-RIM only needs 33033, and for OES-RIM $\alpha$ stays at 0.118.) For the sampling algorithms, after the $i$-th iteration, the observation set is updated from $M_{i-1}$ to $M_i$, and the average number of samples per edge in the network is calculated. Markers on each curve in these figures represent the result after one iteration of the corresponding sampling algorithm.

The results on all three graphs are consistent. First, for each pair of $\alpha(\Theta)$ and $\bar{\alpha}(\Theta)$, even though there is still some gap, indicating that either the lower bound or the upper bound may not be tight yet, the trends of $\alpha(\Theta)$ and $\bar{\alpha}(\Theta)$ are consistent: both increase with the number of samples, even with similar slopes at each point; and among different algorithms, the ranking order and relative change are consistent for both $\alpha(\Theta)$ and $\bar{\alpha}(\Theta)$. All this consistency suggests that the gap ratio $\alpha(\Theta)$ could be used as an indicator for the robustness of Algorithm LUGreedy, and that it is reasonable to use $\alpha(\Theta)$ in comparing the performance of different algorithms.

Second, comparing the performance of the three algorithms, we see that both US-RIM and ICS-RIM are helpful in improving the robust ratio of the selected seed set, and ICS-RIM is better than US-RIM, especially when the sample size increases. The baseline algorithm OES-RIM, however, performs significantly worse than the other two, even though it is also an adaptive algorithm like ICS-RIM. The reason is that the lower-upper greedy set $S^{LU}_{\Theta}$ changes little after a certain number of iterations in OES-RIM, and thus only a small number of edges (the out-edges of $S^{LU}_{\Theta}$) are repeatedly sampled. The probabilities on these edges are already estimated very accurately, while other edge probabilities are far from accurate. It is the inaccurate edges that make $\alpha(\Theta)$ and the best robust ratio small. In contrast, ICS-RIM uses information cascades to sample not only edges directly connected to the seed set but also edges that can potentially be reached. This suggests that it is important for a sampling method to balance the sampling between critical edges and other potentially useful edges in order to achieve better robustness in influence maximization.

Overall, the results suggest that the information cascade based sampling method stands out as a competitive choice when we can adaptively sample more edges to achieve better robustness. If adaptive sampling is not possible, predetermined uniform sampling may also perform reasonably well.

6. CONCLUSION

In this paper, we propose the study of robust influence maximization to address the impact on the influence maximization task of the uncertainty in edge probability estimates that inevitably occurs in practice. We propose the LUGreedy algorithm with a proven solution-dependent bound, and further propose sampling methods, in particular an information cascade based adaptive sampling method, to effectively reduce the uncertainty and increase the robustness of the LUGreedy algorithm. The experimental results validate the usefulness of the LUGreedy algorithm and the information cascade based sampling method ICS-RIM. Moreover, the results indicate that robustness may be sensitive to the uncertainty of the parameter space, and learning algorithms may need more data to achieve accurate learning results for robust influence maximization.

Our work opens up a number of research directions. First, it is unclear what the upper bound of the best robust ratio could be, given an actual network and learned parameter space. Answering this question would help us understand whether robust influence maximization is intrinsically difficult for a particular network or whether it is just our algorithm that does not perform well. In the latter case, an important direction is to design better robust influence maximization algorithms. Another direction is how to improve sampling methods and learning methods to achieve more accurate parameter learning, which seems to be crucial for robust influence maximization. In summary, our work indicates a big data challenge for social influence research: the data on social influence analysis is still not big enough, so the uncertainty level in model learning may result in poor performance for influence maximization. We hope that our work could encourage further research to meet this challenge from multiple aspects including data collection, data analysis, and algorithm design.

Acknowledgment

The research of Wei Chen is partially supported by the National Natural Science Foundation of China (Grant No. 61433014).

7. REFERENCES

[1] A. Badanidiyuru, R. Kleinberg, and A. Slivkins. Bandits with knapsacks. In FOCS 2013.
[2] N. Barbieri, F. Bonchi, and G. Manco. Topic-aware social influence propagation models. Knowledge and information systems, 37(3):555–584, 2013.
[3] A. Ben-Tal and A. Nemirovski. Robust optimization–methodology and applications. Mathematical Programming, 92(3):453–480, 2002.
[4] C. Borgs, M. Brautbar, J. T. Chayes, and B. Lucier. Maximizing social influence in nearly optimal time. In SODA 2014.
[5] S. Bubeck, R. Munos, and G. Stoltz. Pure exploration in finitely-armed and continuous-armed bandits. Theoretical Computer Science, 412:1832–1852, 2011.
[6] C. Budak, D. Agrawal, and A. El Abbadi. Limiting the spread of misinformation in social networks. In WWW 2011.
[7] S. Chen, T. Lin, I. King, M. R. Lyu, and W. Chen. Combinatorial pure exploration of multi-armed bandits. In NIPS 2014.
[8] W. Chen, L. V. Lakshmanan, and C. Castillo. Information and influence propagation in social networks. Synthesis Lectures on Data Management, 5(4):1–177, 2013.
[9] W. Chen, C. Wang, and Y. Wang. Scalable influence maximization for prevalent viral marketing in large-scale social networks. In KDD 2010.
[10] W. Chen, Y. Wang, and S. Yang. Efficient influence maximization in social networks. In KDD 2009.
[11] W. Chen, Y. Wang, and Y. Yuan. Combinatorial multi-armed bandit and its extension to probabilistically triggered arms. CoRR, abs/1407.8339, 2014.
[12] P. Domingos and M. Richardson. Mining the network value of customers. In KDD 2001.
[13] U. Feige. A threshold of ln n for approximating set cover. Journal of the ACM (JACM), 45(4):634–652, 1998.
[14] A. Goyal, F. Bonchi, and L. V. Lakshmanan. Learning influence probabilities in social networks. In WSDM 2010.
[15] A. Goyal, W. Lu, and L. V. Lakshmanan. Celf++: optimizing the greedy algorithm for influence maximization in social networks. In WWW 2011.
[16] X. He and D. Kempe. Robust influence maximization. In KDD 2016.
[17] X. He and D. Kempe. Stability of Influence Maximization. ArXiv e-prints, Jan. 2015.
[18] D. Kempe, J. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. In KDD 2003.
[19] A. Krause, H. B. McMahon, C. Guestrin, and A. Gupta. Robust submodular observation selection. JMLR, 9:2761–2801, 2008.
[20] S. Lei, S. Maniu, L. Mo, R. Cheng, and P. Senellart. Online influence maximization. In KDD 2015.
[21] J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. Glance. Cost-effective outbreak detection in networks. In KDD 2007.
[22] W. Lu, W. Chen, and L. V. Lakshmanan. From competition to complementarity: comparative influence diffusion and maximization. In VLDB 2015.
[23] P. Netrapalli and S. Sanghavi. Learning the graph of epidemic cascades. In SIGMETRICS 2012.
[24] M. G. Rodriguez, D. Balduzzi, and B. Schölkopf. Uncovering the temporal dynamics of diffusion networks. In ICML 2011.
[25] K. Saito, R. Nakano, and M. Kimura. Prediction of information diffusion probabilities for independent cascade model. In Knowledge-Based Intelligent Information and Engineering Systems, pages 67–75. Springer, 2008.
[26] J. Tang, J. Sun, C. Wang, and Z. Yang. Social influence analysis in large-scale networks. In KDD 2009.
[27] Y. Tang, X. Xiao, and Y. Shi. Influence maximization: near-optimal time complexity meets practical efficiency. In SIGMOD 2014.

APPENDIX

A. PROOF OF THEOREM 3

Proof. (Case 1): Let $G$ be an $n$-clique and $\Theta = \times_{e \in E} [0,1]$, i.e., for every edge $e$, $l_e = 0$ and $r_e = 1$. For an arbitrary set $S = \{v_1, \cdots, v_k\}$, there exists a valid parameter vector $\theta = (p_e)_{e \in E} \in \Theta$, where $p_e = 0$ for all $e \in E_S = \{e = (u,v) \mid u \in S \text{ or } v \in S\}$ and $p_e = 1$ for all $e \notin E_S$. Then, $\sigma_\theta(S) = k$ and $\sigma_\theta(S^*_\theta) = n - 1$, which implies that $g(\Theta, S) = \min_{\theta \in \Theta} \frac{\sigma_\theta(S)}{\sigma_\theta(S^*_\theta)} \le \frac{k}{n-1}$. For any set $S$ of size