ebook img

Asymptotic Laws for Joint Content Replication and Delivery in Wireless Networks PDF

1.1 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Asymptotic Laws for Joint Content Replication and Delivery in Wireless Networks

1 Asymptotic Laws for Joint Content Replication and Delivery in Wireless Networks S. Gitzenis, Member, IEEE, G. S. Paschos, and L. Tassiulas, Member, IEEE Abstract—We study the scalability of multihop wireless com- as the network (number of nodes) and content (number munications, a major concern in networking, for the case that of files) both scale to infinity. Link capacity is shown users access content replicated across the nodes. In contrast (cid:16)√ (cid:17) to scale from O N , the non-sustainable regime for to the standard paradigm of randomly selected communicating pairs, content replication is efficient for certain regimes of file the wireless network of [2], down to O(1). The latter popularity, cache and network size. Our study begins with the implies that, provided sufficiently broad wireless links, 2 detailed joint content replication and delivery problem, a hard the system performance is unaffected of its size. 1 combinatorial optimization, which is reduced to the simpler 4) The precise conditions relating the content volume vs. 0 replication density problem whose performance is of the same the network size, the cache size and the file popularity 2 orderastheoriginal.AssumingaZipfpopularitylaw,andletting the number of files and network nodes both go to infinity, we distribution are presented for caching to make a differ- n (cid:16)√ (cid:17) identify scaling capacity laws from O N down to O(1). ence in the system performance and scalability. a J The above are summarized in a discussion in Section VII. 5 1 I. INTRODUCTION II. PROBLEMBACKGROUND&RELATEDWORK A novel technology, key for the future Internet, is content- Consider a wireless network of N nodes, where multihop ] I based networking [1]: in this paradigm, data requests are communicationsareusedtoexchangedata:insuchanetwork, N √ placed on content, as opposed to network address, and routes the maximum throughput per node scales as O(1/ N) [2]. s. are formed based on content provision and user interest. This celebrated, albeit pessimistic result essentially states that c Content caching is a well-known technique that exploits the per-user throughput drops to zero as the network expands. [ temporalaswellasspatialvicinityofuserrequeststoimprove The source of this decline lies in the assumed uniform traffic 1 the user-perceived Quality of Service (QoS) and network per- matrix for the generation of throughput(cid:16)√dema(cid:17)nd: the average v formance metrics. Its merits have been widely demonstrated communicating pair distance rises as Θ N . 5 in various networking paradigms, such as Content Delivery Thisresultstimulatedaseriesofattemptstooverrunit,asin 9 0 Networks (CDNs), Publish-Subscribe and Peer-to-Peer (P2P). [3],wherecooperativetransmissionsareemployedtopreserve √ 3 Another major technology for future networks is wireless theuserthroughput,stillhardlyavoidingthe1/ N law.In[4], . networking, as it supports the exemplar of ubiquitous access non-uniform transfer matrices are analyzed, identifying laws 1 0 for mobile users. Unfortunately, long multihop communica- for various types of flows (e.g., asymmetric, multicast, etc.). 2 tions are known not to scale [2] (i.e., the maximum common Even in the case that a multihop wireless network is aided 1 rateforallflowsisinverselyproportionaltotheaveragenum- by infrastructure, to overcome the 1/√N law, quite many base : (cid:16)√ (cid:17) v ber of hops). Considering, moreover, the volatility of wireless stations Ω N are required [5]. Nonetheless, the limitation i communications and the associated performance bottlenecks, X isshowntobeinnaturegeometrical[6],and,thus,organically it becomes quite important to assess the performance benefits linked to Maxwell’s electromagnetic theory. r a of caching and its effect on the system sustainability. Inthiswork(and[7],anearlierversion),wefocusonnodes Aiming to address the performance and sustainability of generating requests on particular content (i.e., files), as op- wireless networks with caching, after reviewing past and posedtospecificdestinations.Thisisamajorshift,asfilesmay relatedworkandtheasymptoticnotationinSectionsIIandIII, be cached in multiple nodes in the network; hence, requests this work makes the following contributions: canbepotentiallyservedfrommultiplelocations,closetotheir 1) It formulates the optimization of content replication origin.Moreover,weassumethestandarduniformdistribution jointlywithroutingforminimizingtherequiredwireless about the request origin, along with a power law regarding linkcapacity(SectionIV),ahardcombinatorialproblem the requested content. The goal is to investigate whether the that looks at the detailed delivery routes and the cache enhancement of caching can turn sustainable a large wireless √ content at each node; this is step-by-step reduced to network, i.e., overcome the O(1/ N) law of [2]. 2) a simple and mathematically tractable problem, whose Cachingisatechniquewell-knowntoimproveperformance scopespansjusttothemacroscopicfrequencyofcontent invariousdomains.Inwirelessnetworks,performancebenefits replication(SectionV).Anefficientsolutionisdesigned, arise from the hop reduction [8]; this work revisits and andisshowntobeorderefficienttotheoptimalsolutions enhances the basic model of [8] in the direction of deter- of both the simplified and the original hard problem. mining the associated scaling laws. In the Publish-Subscribe 3) Using the above solution, in Section VI, we study and paradigm,cachingisemployedtopreserveinformationspatio- analyzetheasymptoticlawsoftherequiredlinkcapacity temporally and shield against link breakages and mobility 2 [9]. In wireless meshes, cooperative caching can improve the same row or column with non-directed links. By keeping performance by means of implementation [10]. the node density fixed and increasing the network size N, we An exploration of the benefit of caching in the scaling of obtain a scaling network similar to [2]. Moreover, to avoid large networks has been conducted in [11], but on a different boundary effects, we consider a toroidal structure as in [19]. perspective than ours. First, [11] uses arbitrary traffic matrix. This grid structure explicitly considers discrete nodes, in Second, the cache contents are input parameters, whereas, contrast to the average cache capacity density of [8]. Unless in this work (and in [8]), replication is a key part of the nodes coordinate transmissions in complex schemes (e.g., [3], optimization. Last, [11] assumes the paradigm of cooperative [11]), it is a reasonable communications scheme: the commu- transmissions of [3], which leads to a delivery strategy differ- nication range is limited due to attenuation and interference; entthanshortestpathrouting.Intotal,thenetworkcanbecome usingafrequencyreusefactorappropriatetothephysicallayer approximately sustainable with the use of an hierarchical tree (or TDMA, or random access at the MAC layer), the network structureoftransmissionsoverarbitrarilylonglinks.However, layerisabstractedtothelattice[12].Last,thissetupessentially itshouldbestressedthattheresultsof[3],[11]dependonthe matches the random communicating pairs of [2]. particular signal attenuation parameters assumed. Nodes (or users located therein) generate requests to access Here, we study the square grid, a well known 2D model files/data, indexed by m∈M(cid:44){1,2,...,M}. Each node n [12] of wireless networks’ topology and connectivity, and isequippedwithacache/buffer,whosecontentsaredenotedby symmetrictrafficaboutitsorigin.Thisapproachislessgeneral thesetB ,asubsetofM.Ifarequestatnodenregardsafile n than[11],aimingtoidentifydefiniteasymptoticlawsandshred m that lies in B , then it is served locally. Due to the limited n light on whether caching can make the system scalable. buffercapacity,mwilloftenbenotavailableinB ,thus,node n To this end, we consider the Zipf law for the non-uniform n will have to request m over the network from some other file requests, as in [8]. Literature provides ample evidence node w that keeps m in its cache. Let, then, W ⊆N be the m [13],[14]thatthefilepopularityintheInternetfollowssucha set of nodes that maintain m in their caches. power law. The Zipf parameter ranges from 0.5 [15] to 3 [16] Let, moreover, K be the storage capacity of nodes’ cache depending on the application: low values are representative in measured in the number of files it can store. This means that routers, intermediate values in proxies and higher values in all M files are of the same (unit) size, placing a constraint on mobile applications [17], [18]—see also references therein. thecardinalityofcachecontents|B |≤K.Thegeneralization n to variable sized files can be still captured in this framework III. ASYMPTOTICNOTATION by splitting each large file into multiple unit segments, and then treating its segments as separate, independent files1. Let us define the notation used in the asymptotic laws that For the problem of replication not to be trivial, it should be follow. Let f and g be real functions. Then, f ∈o(g) if K <M, which implies that each node has to select the files (cid:12) (cid:12) (cid:12)f(x)(cid:12) to buffer in its cache (later, in the asymptotic laws, we shift for any k >0, there exists x◦ s. t. for x≥xo,(cid:12)(cid:12)g(x)(cid:12)(cid:12)≤k. our attention to K (cid:28)M). Moreover, for the network to have sufficient memory to store each file at least once, it should be Although o(g) defines a set of functions, it is customary to writef =o(g)(slightlyabusingnotation),insteadoff ∈o(g). KN ≥M. (1) Moreover, f =O(g), if there exists a k >0 such that f(x) is eventually, in absolute value, less or equal to kg(x), that is Last, let each node n ∈ N generate requests for data at a rateof λ .Althoughthedefinitions ofSectionIV-Acoverthe (cid:12) (cid:12) n there exist k >0,xo >0 s. t. for x≥xo,(cid:12)(cid:12)(cid:12)fg((xx))(cid:12)(cid:12)(cid:12)≤k. greeqnueerastlcraastee,awcerosstsicknotodeths,eλsym=meλtr.icEnaocdhertreaqfufiecs,torfeugnairfdosrma n particular file m ∈ M, depending on the file m’s popularity Using such a k, we can write that f l≤im kg and f l<im k(cid:48)g, if p . In essence, [p ] is a probability distribution, i.e., sets f,g are positive functions and k(cid:48) >k. m m theprobabilityofarequestforagivenfile.Clearly,eachfile’s Iftheinequalitiesontheabovedefinitionsarereversed,i.e., replicationshouldbebasedonitspopularity:e.g.,tominimize |f(x)/g(x)| ≥ k, we get that f = ω(g), or f = Ω(g); using the network traffic, popular files should be stored densely. such a k, f l≥imkg and f l>imk(cid:48)g, for positive f,g and k(cid:48) <k. lim lim In the case that f ≤ g and f ≥ g, we write that f ∼g. A. General Replication-Routing Problem Last, f =Θ(g) if f =Ω(g) and f =O(g). Network traffic is the criterion to decide about the file An important consequence of the above is that f = O(g) replication, and in particular, the lowest link capacity that lim doesnotimplyf < g—e.g.,considerf(x)=2g(x);however, can sustain the request demand. Defining C as the rate of (cid:96) the reverse is true. Moreover, if f l<img, then g−f =Θ(g). traffic carried by link (cid:96), the network is stable, not discarding requests, only if the capacity of link (cid:96) exceeds C . In the (cid:96) IV. BASICDEFINITIONSANDTHEGENERALPROBLEM primary formulation of the problem, we consider the worst link case, i.e., the most loaded link, max C . Next, we also Assume N peers, with N being a square of an integer, (cid:96) (cid:96) consider the average traffic over the links, avg C . In both indexed by n ∈ N (cid:44) {1,2,...,N}, arranged on a square (cid:96) (cid:96) √ √ grid on the plane of size N rows times N columns. Each 1The presented framework allows for independent storage of a file’s node is connected to its four neighbors that lie next to it on segments,dispensingwiththeneedtokeepthemallinthesamecache. 3 Setof Set WorstLink AverageLink AverageLink SymbolDefinition Symbol allowedvalues Cardinality Problem: NodeCapa- NodeCapa- TotalCapa- city(WN) city(AN) city(AT) Node n N N (cid:44)4ν AlternateNodenotation (x,y) {1,...,N21−1}2 N Objective minCWN minCAN minCAT File/Data m M M Cost CWN(cid:44)maxC CAN=CAT(cid:44)avgC (cid:96) (cid:96) Buffer/Cachecontents Bn SeeTableII Function (cid:96) (cid:96) Path(withuseprobability) [r;v1,...,vk] Optimization [Bn],[Rn,m] fSroemtonfoRdoeuntestoRfinle,mm,i (cid:40)(cid:20)rin,m;vin1,m,...,vink,im,n,m(cid:21)(cid:41) Variables withRn,m=(cid:110)(cid:104)rin,m;vin1,m,...,vink,im,n,m(cid:105)(cid:111) RouteRn,m,i probability rin,m (0,1] Constraints Foralln∈N,|Bn|≤K (2) (cid:88)|Bn|≤KN(3) Nodemaintaining onBuffer n∈N fileminitscache wm Wm Wm Contents Foralln∈N,Bn⊆M (4) nNooddeent’hsartesqeurevsetsscolnienmt wm,n [Bn]: n∈∪NBn=M (5) Noodnesthseeirrvreedqubeystnsoodfemwm n Qw,m Qw,m Constraints Forallni,sma,liis:tno,fvaind1,jmac,evnin2t,mn,o.d.e.s,vink,im,n,m (6) toHthoDepescneosruvitniyntgofrfnodomadtena(osmd)eWnm h(ndm,m) [0(cid:2),N1+,∞1(cid:3)) o[nRRno,mut]e:s FoFroarllanll,nm,,mi::B(cid:88)vini,kmri,inn,,mm=(cid:51)1m ((87)) Canonicaldensityofdatam d◦m (cid:8)N1,N4,...,1(cid:9) 1+ν Opt.Value, CWN(cid:63) CAN(cid:63) CAT(cid:63) Logarithmofcan.density νm◦ (cid:44)−log4d◦m {0,1,...,ν} 1+ν Arguments BnWN(cid:63),RWnN,(cid:63)m BnAN(cid:63),RAnN,(cid:63)m BnAT(cid:63),RAnT,(cid:63)m TABLEI TABLEII LISTOFSYMBOLSUSEDINSECTIONIV. LISTOFJOINTREPLICATION/DELIVERYPROBLEMS. setups, the replication problem regards finding the cache Bn PROBLEM1[WORSTLINKNODECAPACITY(WN)]: atallnodesn,thatminimizemaxC oravg C ,respectively. (cid:96) (cid:96) (cid:96) Minimize max C ([B ],[R ]), subject to (2), (4-8). (cid:96) (cid:96) n n,m The individual node capacity constraint of the primary formulation is expressed through |B | ≤ K, for all n ∈ N; PROBLEM2[AVERAGELINKNODECAPACITY(AN)]: n again, we also consider an alternate version of the capacity Minimize avg C ([B ],[R ]), subject to (2), (4-8). (cid:96) (cid:96) n n,m constraint, pertaining to the total capacity over the network: (cid:80) |B | ≤ KN. This corresponds to a relaxed network PROBLEM3[AVERAGELINKTOTALCACHE(AT)]: n n setting where the total buffer capacity NK is to be arbitrarily Minimize avg C ([B ],[R ]), subject to (3), (4-8). (cid:96) (cid:96) n n,m distributed to the nodes. As each file should be stored at least As shown in Table II, we use the (cid:63) notation to denote the once in the network, in all cases, it has to be ∪ B =M. n∈N n optimalvalueoftheobjectivefunction,andaminimizingpair Allowing for receiver driven anycast strategy, the network of the cache contents and routes (as there may exist multiple shouldchooseanodew toservetherequestsofclientnon m,n optimalsolutions).Notethatintheabovedefinitions,wehave m;w shouldbeselectedfromthepossiblymanycandidates m,n omitted from the argument list the parameters [p ], [λ ] and of set W . Therefore, the replication problem is implicitly m n m cache capacity K to contain the notation clutter. linked to a joint delivery problem of finding appropriate paths Itisclearthattheseproblemsareofcombinatorialcomplex- ofadjacentnodesn,v ,v ,...,v fromeachclientntoanode 1 2 k ity, and thus are not amenable to an easy to compute, closed- v ∈W for each file m, that minimize maxC or avg C . k m (cid:96) (cid:96) (cid:96) form solution. However, as we are interested in asymptotic Since there exist multiple routes between a node n and laws, we will use the simpler structure of the latter problems the caches containing m, we should allow splitting the traffic to design straightforward replication strategies, and compute among them, to balance the load among links. Thus, the approximations that are within a constant to the optimal. delivery problem involves finding a set of routes First, observe that, as the WN and AN problems share the Rn,m ={Rn,m,i}={[rin,m;vin1,m,vin2,m,...,vink,m]} same constraints, [BnWN(cid:63)],[RWnN,m(cid:63) ] is valid for AN. Moreover, for any [B ],[R ], it is max C ≥avg C , therefore, n n,m (cid:96) (cid:96) (cid:96) (cid:96) fromeachclientnodevtoaservernodevn,m ∈W .Oneach route R , rn,m denotes the portion ofikrequestsmof node n CWN(cid:63) ≥avgC(cid:96)([BnWN(cid:63)],[RWnN,m(cid:63) ])≥CAN(cid:63). n,m,i i (cid:96) for file m that use path n,vn,m,vn,m,...,vn,m is used by node n. Therefore, it has to bie1(cid:80) ir2n,m =1.ik Second,theATproblemhasrelaxedconstraintsincompari- i i sontoAN,i.e.,any([B ,R ])ofANsatisfiesAT,too,and Given the cache contents [B ] and routes [R ], it is easy n n,m n n,m both problems share a common objective function. Thus, to sum up the traffic C per link l; we do not provide the ex- (cid:96) plicit formula, but later we carry out associated computations. CAN(cid:63) =CAT([BAN(cid:63)],[RAN(cid:63) ])≥CAT(cid:63). n n,m Based on the above, we define three variants of the Combining the above, it follows that replication-delivery problem (Table II), beginning with the primary (and hardest) one, relaxed next to simpler versions. LEMMA1[WNVS.ANVS.AT]: CWN(cid:63) ≥CAN(cid:63) ≥CAT(cid:63). 4 Observe that shortest paths are sufficient in the AN and AT problems. Indeed, consider for some n,m a route PROBLEM4[CONTINUOUSDENSITY(CD)]: Minimize [mrino,rme;hvoin1p,ms ,thvian2n,mp,a.t.h.,[vnin,k,vim,1n,,mv2],.in..,svekt] Rwint,hmmth∈at Bivnkv.olRvees- CCD([dm])(cid:44) √62 (cid:88) (cid:20)√1d −1(cid:21)pm, placing it by [rin,m;v1,v2,...,vk]} (or, if the shorter path is m∈M m already in Rn,m, adding the routing probabilities) reduces the with respect to [dm], subject to: average link load thanks to the new path’s fewer hops: 1) For any m∈M, 1 ≤d ≤1, (cid:80) N m LEMMA 2 [AN-AT SHORTEST PATH OPTIMALITY]: The 2) m∈Mdm ≤K. optimal routes RAnN,m(cid:63) ,RAnT,m(cid:63) consist of shortest paths only. In the CD Problem, the optimization variables are the densities d , which express the fraction of caches containing Astheremayexistmultiplepathsfromasourcentofilem m −1 withthesamehopcount;inthiscase,wearefreetoarbitrarily filem.Intheobjective,dm2−1approximatestheaveragehop distribute traffic among them. Let, therefore, h(n,m) denote countfromarandomnodetoacachecontainingm.Weighted the hop-count of a shortest path between n and m. It is by the probability pm of requests on m, the summation (cid:88) (cid:88) (cid:88) expresses the average link load per request. C(cid:96) = h(n,m)pm. (9) It is clear that any [B ] that satisfies the WN, AN or AT n (cid:96) n∈Nm∈M problems’ constraints yields a distribution [d ] that satisfies m For simplicity, the above assumes a unit request rate per node the constrains of CD. Combining this with Lemma 1, λ=1; it is clear that the link capacities of WN, AN and AT THEOREM4[CDBOUND]: CCD(cid:63) ≤CAT(cid:63) ≤CAN(cid:63) ≤CWN(cid:63). are directly linear to the arrival rate λ. Then, both sides of (9) sumthetotalloadonthenetwork:theLHSexpressesthetotal The 1/N ≤ dm ≤ 1 are an important difference from the load as the sum over all links, while the RHS expresses it as modelof[8],which,asseennext,affectsthesolution,andhas the load generated by each file request of each node. a major impact in the asymptotics. V. DENSITY-BASEDFORMULATIONS B. CD Problem Solution The above formulation takes a microscopic view on the First, let us establish the uniqueness of the solution: preciseroutesandspecificcachecontents.Now,weswitchtoa LEMMA 5 [CD CONVEXITY]: CD is a strictly convex opti- macroscopic view, that considers the frequency of occurrence mization problem, and has a unique optimal solution. of each file in the caches. This leads to an easy-to-solve problem, which permits finding the asymptotic link rate. Thecomputationofthesolutionisrelativelyeasy,usingthe Karush-Kuhn-Tucker (KTT) conditions. Except for the trivial case of M > KN, the summation constraint is satisfied as A. Replication Density and Distance Approximation an equality. Regarding the pair of constraints on each d , m Focusing on the frequency of occurrence of each file m in either one of them can be an equality, or none. This partitions the caches B , we define the replication density d as the n m M into three subsets, one containing files of unit replication fraction of nodes that store file m in the network: density (i.e., stored at every node) M ={m:d =1}, one m dm = N1 (cid:88) 1{m∈Bn}. (10) containingfilesstoredinjustonenode(cid:20)M(cid:20)={m:dm = N1}, and the complementary M =M\(M ∪M ). n∈N (cid:149) (cid:20) (cid:20) If the files are ordered, then the three subsets are ordered: Under shortest path routing, the inverse of replication den- sity dm may be regarded in a fluid approximation as the LEMMA 6 [MONOTONICITY OF SETS M ,M,M ]: Pro- (cid:20) (cid:149) (cid:20) number of peers served by the node that maintains m in its vided that p is non-increasing on m, the optimal solution m cache; hence, it also represents the size of the area served by for the CD problem takes the form of M ={1,2,...,l−1}, (cid:20) a specific location as the source of information m. M = {l,l +1,...,r −1}, and M = {r,r +2,...,M}, (cid:149) (cid:20) NotethatintheATandpreviousproblems,d takesvalues where l and r are integers with 1≤l≤r ≤M +1. m in the set of {1/N,2/N,...,1}, and moreover, that The uniqueness of the optimal solution [dCD∗] implies the m (cid:88) 1 (cid:88) 1 (cid:88) uniqueness of the indices and the three partinitions. We, then, d = |B |≤ K =K. m N n N denote by l(cid:63),r(cid:63), and M(cid:63), M(cid:63), M(cid:63) their optimal values. m∈M n∈N n∈N (cid:20) (cid:149) (cid:20) Given these, the solution dCD(cid:63) takes the form of Consider now densities [dAT(cid:63)] defined from the optimal m cache [BAT(cid:63)] and (10). The fomllowing is a very useful result  1, m∈M(cid:63), (11a) (nsaeteurAalplypenlnedaidxuAsftoorathneewproporofbolfeamllbreassueldtsoonftthheisfiSleecdtieonns)ittiheast: dCmD(cid:63) =  K−l(cid:63)(cid:80)+r(cid:63)1−−1pM32−Nr(cid:63)+1 pm23, m∈M(cid:20)(cid:63)(cid:149), (11b) LEMMA3[ATCAPACITYLOWERBOUND]:  1 , j=l(cid:63) j m∈M(cid:63). (11c) √ (cid:32)(cid:115) (cid:33) N (cid:20) CAT(cid:63) ≥ 2 (cid:88) 1 −1 p . Fig. 1 illustrates such an example solution, depicting the 6 dAT(cid:63) m density dCD(cid:63), indices l(cid:63) and r(cid:63), as well as sets M(cid:63), M(cid:63), M(cid:63) m∈M m m (cid:20) (cid:149) (cid:20) when file popularities follow the Zipf law (see Section VI-A). 5 1.0 where each data is placed is selected as a node with the least 0.8 number of files so far and in case of tie, by scanning the diagonals(step12).Inthisway,wespreadfilestobalancethe 0.6 dCmD(cid:63) M(cid:63) M(cid:63) M(cid:63) load over the links, a key property for the WN problem. 0.4 (cid:20) (cid:149) (cid:20) Algorithm 1 [Cache Data Filling]: 0.2 0.0 1: B(0,0) ←M0 0 l(cid:63) 5 10 r(cid:63) 15 20 2: for k ∈{1,2,...,ν} do m 3: for x=0,1,...,2k−1 do Fig.1. AnexamplecaseofdensitydmandtheM(cid:20)(cid:63),M(cid:149)(cid:63)andM(cid:63)(cid:20)partitions. 4: for y =0,1,...,2k−1 do Solidlineplotsthe∼m−23τ lawofm∈M(cid:63)(cid:149),whenpmfollowsZipf’slaw. 5: B(x,y+2k−1) ←B(x,y) {Replicate cache contents} 6: B(x+2k−1,y) ←B(x,y) 7: B(x+2k−1,y+2k−1) ←B(x,y) C. Discrete Density Formulation 8: while Mk is not empty do Now, we restrict our attention to networks with a number 9: m←argminpm. {Find the file m of highest pm} of nodes equal to a power of 4, N = 4ν, and consider a 10: Mk ←mM∈Mkk\{m}. {Remove it from set Mk} constrained version of the CD Problem of discrete densities: 11: S ← argmin |B(x,y)|. {Find the set of (x,y)∈{0,1,...,2k−1}2 PROBLEM 5 [DISCRETE DENSITY (DD)]: Minimize nodes(x,y)withtheminimumnumberofelements} CDD([dm])(cid:44)CCD([dm]) with respect to [dm], subject to: 12: Select node (x,y) from S as the first node by 1) For any m∈M, dm =4−νm, with νm ∈{0,1,...,ν}. scanning from top left to bottom right the main (cid:80) 2) d ≤K diagonal, then next diagonal, etc with wrap around m∈M m in the 2k×2k submatrix. {See Fig. 2} It is clear that the CD Problem is a relaxed version of the DDProblem,i.e.,any[d ]satisfyingDDconstraints,satisfies m Denote by B◦ = B◦ the cache contents of node (x,y) CD’s constraints as well, hence CCD(cid:63) ≤CDD(cid:63). n (x,y) at the end of the Algorithm 1. Clearly, in B◦ each data m strDaiDgh,tdfoerfiwnaerddotnoacodnissctrruetcetsaent,eifsfincoietnetassyoltuotisoonlvde◦m.S(cid:44)till4,−itνm◦is ilsocraetpeldicaetveedry4ν2m◦νm◦=nNo/dde◦ms tailmonegs itnhethtewnoetawxoisrkn(,awsiitnh Freipgl.ic3a)s. setting it to the largest power less or equal to [dCD(cid:63)] m Next, we study the validity and optimality of Algorithm 1. d◦ (cid:44)max(cid:8)4−i:4−i ≤dCD(cid:63), i∈{0,1,...,ν}(cid:9). (12) m m E. Validity and Optimality for AN and WN Problem Let, then, CDD◦ =CCD◦ be the required link rate of the DD LEMMA8[ALGORITHM1CACHESIZE]: Algorithm 1results and CD problems for the [d◦ ] replication density. Although m in caches Bn filled with at most K files. not optimal, [d◦ ] is quite efficient a solution: m The above result shows that Algorithm 1 constructs a THEOREM7[CANONICALPLACEMENTEFFICIENCY]: solution compatible to the WN’s and AN’s constraints. Let √ CCD(cid:63) ≤CCD◦ <2CCD(cid:63)+ 2. iunscsopnejcuinfyctiaonsettootfheshcoarcthesetsp[Bat◦h]s(s[Ree◦nF,mig].t6oiunseApfoprenddeilxivAer)y: 6 n • In the case that there are multiple wm nodes at the same hop-count from node n, requests of n about m will be directedtothew nodeatthenorthandwestofnoden. m D. Replication Policy Design • If n and wm are on the same row or column of the WepresentareplicationalgorithmthatallocatesthefilesM network, we use the single I-shared shortest path. in the caches [Bn] given the replication densities d◦m. In the • If n and wm are on different rows and columns of the algorithm, we represent each node n by its coordinates (x,y) network,weuseequallythetwoL-shapedshortestpaths. √ with x and y taking values in {0,1,..., N −1}. First, we The above routes are optimal for the AN problem, but partition the data into the sets M0,M1,...,Mν, such that not necessarily for WN. Define CAN◦ (cid:44) CAN([Bn◦],[R◦n,m]), asMussbIiinmgdcnaeoeterndditxa,toiwtnhsaeesudetnaaletiraqtmu(mesetnentpsoofd1oe)fdb◦mMinya=t0hs,seii4.g2e−nν.i,im◦.ntghT×ehto2efinνBlm◦,e(s0e,sat0uoc)b,hbmthefiaeltrerfieipxrmsl.itc1ag×teet1ds CWAN◦i,j(cid:44)(cid:44)C(cid:40)WN1K(,[−B1in◦−],Nj[R(cid:80)◦n,Mkm=−]i)+ja1npd2k/3, iiff KK−−ii−− Njj >=00,. (13) N at all nodes of the network. Then, in the loop of step 2, we Nextresultsestablishtheorderoptimalityoftheconstructed scan all sets M and allocate its elements to the buffers. k solution for the AN and WN problems: First, we repeat the 2k−1×2k−1 first submatrix created so far to the 2k ×2k submatrix (loop of step 4), and then, we THEOREM9[ALGORITHM1OPTIMALITYONAN]: iteratively take out of set Mk files at decreasing popularity, 1 3√ filling one node in the 2k ×2k submatrix (step 8). The node CAN(cid:63) ≤CAN◦ ≤ + 2CAN(cid:63). 2 2 6  1 (2k−1)2k+1 (2k−2)2k+2 ··· 2·2k−1 2·2k (cid:40)(n+1)1−1τ−−(τm+1)1−τ≤Hτ(n)−Hτ(m)≤ n1−τ−1(−mτ+1)1−τ+1, if τ(cid:54)=1,  2k+1 2 (2k−2)2k+2 ··· 3·2k−1 3·2k lnmn++11≤Hτ(n)−Hτ(m)≤lnmn++12, if τ=1.   (15)  2·2k+1 2k+2 3 ··· 4·2k−1 4·2k     As we are interested in the asymptotic scaling of link rates,  ... ... ... . . . ... ...  for notational simplicity, we remove the multiplicative factor   (2k−2)2k+1 (2k−3)2k+2 (2k−4)2k+3 ··· 2k−1 4k  from the objective function of the CD Problem, and refer to   (2k−1)2k+1 (2k−2)2k+2 (2k−3)2k+3 ··· 2·2k−1 2k the resulting quantity as C. Substituting the solution (11) and plugging in the Zipf distribution into (16), it follows that Fig.2. Theorderofprecedenceinthe2k×2kmatrix(Algorithm1,step12). M C (cid:44) (cid:88) (cid:16)dm−21 −1(cid:17)pm =C(cid:149) +C(cid:20)−(cid:88)pm, (16) m∈M j=l 1 A 1 5 1 8 1 5 where (cid:80)M p =O(1) (as it lies always in [0,1]), and j=l m 3 2 3 2 3 2 3 2 (cid:104) (cid:105)3 H (r−1)−H (l−1) 2 1 4 1 B 1 4 1 9 C(cid:149) (cid:44)m(cid:88)∈M(cid:149)√pdmm (=14) 23τ (cid:113)K(cid:149) Hτ(M23τ) , (17) 3 2 3 2 3 2 3 2 C (cid:44) (cid:88) √pm (=14)√N Hτ(M)−Hτ(r−1), (18) (cid:20) d H (M) 1 6 1 5 1 C 1 5 m∈M m τ (cid:20) (K−l+1)N −(M −r+1) 3 2 3 2 3 2 3 2 K (cid:44) . (19) (cid:149) N 1 4 1 7 1 4 1 D Last, for simplicity, we drop the star notation and use dm, l and r to refer to the values of the optimal solution. 3 2 3 2 3 2 3 2 B. Estimation of l and r Fig. 3. An example run of Algorithm 1 for M1 = {1,2,3},M2 = As indices l,r are not provided in a closed form, we derive {4,5},M3 = {6,7,8,9,A,B,C,D,...}: Files {1,...,D} have been approximations in order to find the scaling of C. placed in the buffers; in subsequent steps, buffers will start receiving their 1) Estimation of l: first, l ≤ K +1 (hence l = Θ(1)), as secondfile.Theboxesmarkthe2k×2k submatrices. l−1representsthenumberoffilescachedinallnodes.IfM (cid:149) and M are not empty, using (11b), d <1 is equivalent to l (cid:20) M−r+1 (cid:104) (cid:105) THEOREM 10 [ALGORITHM 1 OPTIMALITY ON WN]: The K−l+1− N <l23τ H23τ(r−1)−H23τ(l−1) . (20) maximum link load CWN◦ is within a multiplicative constant If, moreover, the first set M is not empty, i.e., l > 1, then to the optimal CWN(cid:63) plus an additive term 2+A /4, that (cid:20) 0,0 d =1. This means that if we attempted to decrease index l−1 depends on the distribution [p ], and cache capacity K. m l by 1, this would violate the density constraints, and result in (11b) a number greater than 1 for d : l−1 VI. ASYMPTOTICLAWSFORZIPFPOPULARITY M−r+1 (cid:104) (cid:105) To study the scaling of link rate, we switch from the K−l− ≥(l−1)23τ H2τ(r−1)−H2τ(l−2) . (21) N 3 3 arbitrary popularity to the Zipf law, a distribution known to Thus, provided l > 1, it can be uniquely determined as model well various types of network traffic. the lowest integer that satifisfies (20)-(21), which is unique (Theorem5).Anapproximationforlcanbecomputedtreating A. Zipf Law and Approximations (20)asanapproximateequalitywhenM (cid:54)=∅,orequivalently (cid:149) The Zipf distribution is defined as follows: when l<r (as dl−1 =1 and dl <1): M−r+1 (cid:104) (cid:105) p = 1 m−τ, (14) K−l+1− ∼=l23τ H2τ(r−1)−H2τ(l−1) . (22) m H (M) N 3 3 τ 2) Estimationofr: IfM ∪M isnotempty,d > 1 ⇔ where τ is the power law parameter, indicating the rate of (cid:149) (cid:20) r−1 N popularity decline as m increases, and Hτ(n) (cid:44) (cid:80)nj=1j−τ (K−l+1)N−M+r−1>(r−1)23τ(cid:104)H2τ(r−1)−H2τ(l−1)(cid:105). is the truncated (at n) zeta function evaluated at τ (also 3 3 (23) called the nth τ-order generalized harmonic number). The limit H (cid:44) lim H (n) is the Riemann zeta function, which Again,ifsetM isnotempty,i.e.,r ≤M,thend =N−1. τ n→∞ τ (cid:20) r convergeswhenτ >1.WederiveanapproximationforH (n) Thus, if we attempted increasing index r by one, (11b) would τ by bounding the sum: for n≥m≥0, violate the constraint resulting in a density less than N−1: (cid:104) (cid:105) (cid:90) n(x+1)−τdx≤Hτ(n)−Hτ(m)≤1+(cid:90) n x−τdx,⇒ (K−l+1)N−M+r ≤r23τ H23τ(r)−H23τ(l−1) . (24) m m+1 7 As before, (23) is an approximate equality if l<r, i.e., Note that an approximate solution of (27) is obtained from (K−l+1)N−M+r−1∼=(r−1)23τ(cid:104)H23τ(r−1)−H23τ(l−1)(cid:105). K−(l−1)∼=(l−1)(cid:104)H23τ −H23τ(l−1)(cid:105)(∼=15)32lτ−−13 ⇔ (25) 2τ −3 3) Estimation of l/r: For all l,r, it is N > dl =(cid:0)r−1(cid:1)23τ. l∼=1+ 2τ K. (28) dr−1 l As before, whenever l and r are not equal to the extremes, Next, we study the conditions to have M almost empty: (cid:20) i.e., 1<l<r <M +1, it holds that d /d =N. Thus, l−1 r THEOREM16[M ALMOSTEMPTY]: M −r =o(M) iff l∼=rN−23τ. (26) • for τ <3/2, (cid:20) M l≤im(cid:0)1− 2τ(cid:1)KN, 3 Next,weshowthatitisimpossibletohavefilescachedinall lim nporodoefssuonflethses trheesupltosptuhlaatriftoyllpoawracmaentebreτfoiusngdreianteArptpheannd32ix(Bth)e: •• ffoorr ττ =>33//22,, MlnMMl≤i≤m(cid:20)K(KN−,ˆl+ˆl11−)2(3τ23τ−1)(cid:21)23τN23τ. LEMMA11: If τ ≤ 32, then l→1. where ˆl = ˆl{τ>3,M =∅} from Theorem 15. If the above 2 (cid:20) Weproceedtotheasymptoticbehaviorofthesystemonthe inequalities are strict, then r =M +1 (and thus M =∅). (cid:20) rateC,andindiceslandr,whichgovernpartitingtoM ,M (cid:20) (cid:149) THEOREM17[CAPACITYFORALMOSTEMPTYM ]: and M . The scaling behavior regards the case of the number (cid:16)√ (cid:17) (cid:20) (cid:20) of nodes N and the number of files M increasing to infinity. • If τ <1, C =Θ M . We use ˆl and rˆto refer to the limits of l and r. • If τ =1, C =Θ(cid:16) √M (cid:17). First, a set of basic results establish the upper bound of logM C = O(cid:16)√N(cid:17), the Gupta-Kumar rate [2]. This is intuitive: if • If 1<τ <3/2, C =Θ(cid:16)(cid:0)M3/2−τ(cid:1)(cid:17). 3 replication is ineffective (e.g., due to large number of files), • If τ =3/2, C =Θ log2 M . then the system and its performance essentially reduce to [2]. • If τ >3/2, C =Θ(1). (cid:16)√ (cid:17) LEMMA12[BOUNDONC ]: C =O N . D. Non-empty M (cid:149) (cid:149) (cid:20) (cid:16)√ (cid:17) WhenM isnon-empty,itisC >0.AsCorollary14shows, LEMMA13[BOUNDSONC ]: C =O N . Furthermore, (cid:16)√(cid:20) (cid:17) (cid:20) (cid:20) (cid:20) (cid:16)√ (cid:17) C is O N , i.e., the Gupta-Kumar rate [2]. Thus, we turn lim 1) for τ <1, and r < M, it is C =Θ N , (cid:16)√ (cid:17) 2) for τ >1, it is C =Θ(cid:16)√N(cid:0)r(cid:20)1−τ−M1−τ(cid:1)(cid:17). our attention to identifying the cases that C =o N . (cid:20) If, moreover, r l<imM, then C =Θ(cid:16)√N (cid:17). THEOREM18[ˆlANDrˆFORNON-EMPTYM(cid:20)]: IfM exceeds (cid:20) rτ−1 the condition of Theorem 15, COROLLARY14[BOUNDONC]: C=O(cid:16)√N(cid:17). • ifKN−M =ω(1),thenwediscernthefollowingcases: 3−2τ Next,westarttheasymptoticanalysis,partitioningthespace τ <3/2: l→1, r ∼ (KN −M), (29) 2τ ofM,N parametersaccordingtowhethertheyproducesingly τ =3/2: l→1, rlnr ∼KN −M, (30) replicated files or not, i.e., non-empty or empty M . (cid:20) τ >3/2 and M l≤im(K−β)N : (cid:20) (cid:21) C. Almost Empty M M (cid:20) l→ˆl∼=α K+1−lim , (31) The first case to consider is when the number of nodes N N andfilesM increasetowardsinfinity,andatthesametimeM (cid:20) M (cid:21) remains an almost empty set. We define formally M(cid:20)≈∅ iff(cid:20) r ∼α KN23τ − N1−23τ . (32) |M |=o(M), i.e., the number of its elements is lower order lim than(cid:20)the total files. For this to happen, M should increase at a τ >3/2 and M > (K−β)N : slowpacewithN,sothattheconstraintdm ≥N−1issatisfied l→1, r ∼(cid:20)2τ (KN −M)(cid:21)23τ (33) foralmostall(i.e.,uptoo(M))files.Theextremecaseofthis 3 is to first let N → ∞, and then have M → ∞, i.e., split the where α= 2τ−3, β = 3 . limit of N,M jointly going to infinity to a double limit. 2τ 2τ−3 • if KN −M = O(1), then l → 1,r = Θ(1), with the TostudytheasymptoticsofC,wefirstestimatelandr.The exact value determined by almost empty M implies that M−r =o(M), thus r ∼M . THEOREM15[ˆl(cid:20)FORALMOSTEMPTYM ]: (cid:40)KN−M+r−1>(r−1)23τH23τ(r−1), (34) 1) For τ ≤3/2, l→1. (cid:20) KN−M+r ≤r23τH2τ(r). 3 2) For τ >3/2, l converges to the integer solution of (cid:40) (K−l+1)l−23τ < H23τ −H23τ(l−1), (27) is Nthoetesatmhaet,irn =theΘca(cid:16)s(eKoNf τ−>M3)/223,τ(cid:17)theinasbyomthpt(o3t2ic) laanwd f(o3r3)r. (K−l)(l−1)−23τ ≥ H2τ −H2τ(l−2), Moreover, the approximation of (31) on ˆl can be precisely 3 3 if such exists and is greater than 1, or 1 otherwise. carried out via (27), if we substitute K with K−limM/N. 8 TABLEIII (a) TheCasesofτ <1,τ =1and1<τ <3/2. M M finite N→∞, then M l≤im 3−2τKN Ml>im3−2τ2τKN, M ∼KN M →∞ 2τ and M l<imKN KN−M =ω(1) KN−M=O(1) M empty empty almost empty non-empty non-empty non-empty (cid:20) ˆl 1 1 1 1 1 1 rˆ M +1 M +1 M −o(M) 3−2τ(KN −M) 3−2τ(KN −M) Θ(1) (34) 2τ 2τ (cid:16)√ (cid:17) (cid:16)√ (cid:17) (cid:16)√ (cid:17) (cid:16)√ (cid:17) (cid:16)√ (cid:17) τ <1 Θ(1) Θ M Θ M Θ M Θ M Θ M (cid:16) √ (cid:17) (cid:16) √ (cid:17) (cid:16) √ (cid:17) (cid:16)√ (cid:17) (cid:16)√ (cid:17) C τ =1 Θ(1) Θ M Θ M Θ M Θ M Θ M logM logM logM (cid:16) (cid:17) (cid:16) (cid:17) (cid:16) (cid:17) (cid:16) √ (cid:17) (cid:16)√ (cid:17) 1<τ < 23 Θ(1) Θ M32−τ Θ M32−τ Θ M32−τ Θ (KN−MM)τ−1 Θ M (b) TheCaseofτ =3/2. M M finite N→∞, then MlnM l≤imKN MlnM l>imKN and M l<imKN M ∼KN M →∞ KN −M =ω(1) KN −M =O(1) M empty empty almost empty non-empty non-empty non-empty (cid:20) ˆl 1 1 1 1 1 1 rˆ M +1 M +1 M −o(M) rlnr∼KN −M rlnr∼KN −M Θ(1) (34) (cid:16) (cid:17) (cid:16)(cid:113) (cid:17) (cid:16)√ (cid:17) C Θ(1) Θ(log32 M) Θ(log32 M) Θ log32 r Θ M log32 r Θ M KN−M (c) TheCaseofτ >3/2. N→∞, then M l≤imhN23τ Ml>imhN23τ,and Ml>im(K−β)N M ∼KN M M finite lim lim M →∞ (see Th. 16) M ≤(K−β)N and M < KN KN−M=ω(1) KN−M=O(1) M empty empty almost empty non-empty non-empty non-empty non-empty (cid:20) ˆl Θ(1) (27) Θ(1) (27) Θ(1) (27) ∼=α(cid:2)K+1−limM(cid:3) 1 1 1 N (cid:20) (cid:21) rˆ M +1 M +1 M −o(M) ∼α KN23τ−N1M−23τ ∼(cid:2)23τ(KN−M)(cid:3)23τ ∼(cid:2)23τ(KN−M)(cid:3)23τ Θ(1) (34) (cid:18) √ (cid:19) (cid:16)√ (cid:17) C Θ(1) Θ(1) Θ(1) Θ(1) Θ(1) Θ M Θ M 3(τ−1) (KN−M) 2τ lim THEOREM19[CAPACITYFORM < KN,M (cid:54)=∅]: Problem, we consider the scaling of A0,0; from (13), (cid:16)√ (cid:17) (cid:20) • If τ <1, C =Θ M . 1 (cid:88)M 2 1 • If τ =1, C =Θ(cid:16) √M (cid:17). A0,0 ≤ K pm3 ≤ KH23τ(M). logM k=1 (cid:16) (cid:17) • If 1<τ <3/2, C =Θ M32−τ . Thus, A0,0 diverges only for τ ≤ 3/2, as Mτ−1 when τ < (cid:16) 3 (cid:17) 3/2,orlogM whenτ =3/2.AseasilyverifiedfromTableIII, • If τ =3/2, C =Θ log2 r . A alwaysscalesslowerthanC,therefore,TableIIIpertains 0,0 • If τ >3/2, C =Θ(1). to the scaling of the required rate for the worst link case, too. THEOREM20[CAPACITYFORM ∼KN]: (cid:16)√ (cid:17) F. Discussion on Asymptotic Laws • If τ ≤1, C =Θ M . (cid:16) √ (cid:17) Themainresultoftheasymptoticlawsregardstheminimum • If 1<τ <3/2, C =Θ M . required link rate required to sustain a request rate of λ = 1 (KN−M)τ−1 • If τ =3/2, C =Θ(cid:16)(cid:113) M log32 r(cid:17). from each node. As a preliminary comment, the sustainable KN−M link rates are subject to the information theory and Shannon’s (cid:18) √ (cid:19) capacity law. Thus, a rate C that scales to infinity should be • If τ >3/2, C =Θ M . 3(τ−1) interpreted rather as the inverse of the maximum sustainable (KN−M) 2τ √ request rate λ, e.g., the result of C = Θ( M) for λ = 1 is E. Validation for the WN Problem √ equivalent to C =Θ(1) for λ=Θ(1/ M), as in [2]. Recapitulating on the asymptotic laws, C, CCD(cid:63), and CAN(cid:63) Thepowerlawparameterτ setstwophasetransitionpoints, are of the same order (Theorem 9). To complete with the WN 1 and 3/2, leading to distinct asymptotics: the higher τ, 9 (cid:88) the more uneven the popularity of files, and thus, the more Q =N. (35) w,m advantage in caching (i.e. lower link rate C). As an example, w∈Wm on τ > 3/2 and M ≤ (K −(cid:15))N, with (cid:15) a small constant, It is not hard to see that the best arrangement for a node w C =Θ(1),or,inwords,thewirelessnetworkisasustainable. that serves a cluster of Q nodes on requests for data m, is w,m Nevertheless, such high a τ arises in particular scenarios (i.e., when these nodes lie in a square rhombus centered at w: this mobile applications), as discussed in Section II. (cid:80) minimizes the total hop-count h(n,m). Thus, the Morecommonarethecasesoflowandintermediatevalues 1st node (w itself) has a hop counnt∈hQw=,m0, the next 4 nodes ofτ (trafficinroutersandproxies),whichflattenthepopularity have h = 1, the next 8 nodes have h = 2, etc. as illustrated distribution towards the uniform. Replication becomes less (cid:16)√ (cid:17) in Fig. 4. Therefore, taking into account the nodes of the last effective,endinguptotheΘ M lawforτ <1,asynonym incomplete ring at hop count equal to ρˇ+1, it is oftheGupta-Kumarlaw,ifweassociatethenumberoffilesM Hopsforthenodesoftherhombusathops≤ρˇ to the number of communicating pairs in [2]. When M scales (cid:88) (cid:122) (cid:125)(cid:124) (cid:123) h(n,m)≥1×0+4×1+8×2+···+4ρˇ·ρˇ slower than N, there is an improvement over [2], especially onτ ≥1,which,undertheaboveprism,expressestheGupta- n∈Qw,m Numberofnodesatρˇ+1 ×(ρˇ+1) Kumar law for the flows induced from the replication. (cid:122) (cid:125)(cid:124) (cid:123)(cid:122) (cid:125)(cid:124) (cid:123) +[Q −1−2(ρˇ+1)ρˇ](ρˇ+1) Inanalternateperspective,thejointscalingofM andN can wm 2ρˇ+1 be considered as each new node enriching the network with =2ρˇ(ρˇ+1) +[Q −1−2(ρˇ+1)ρˇ](ρˇ+1), its ‘own’ files. In the case of M ∼δKN being a fraction of 3 w,m the total buffer capacity, the node has spare capacity to cache where ρˇ is the radius of the rhombus. The radius can be other files, too. The value of constant δ determines the size of computed as the integer part ρˇ=(cid:98)ρ(cid:99), where ρ satisfies the set M (Theorem 16), and, consequently, the scaling law (cid:16)√(cid:20) (cid:17) (cid:16) (cid:17) Qw,m =2(ρ+1)ρ+1. ofC:Θ M forτ <1,Θ M32−τ for1<τ < 3,orΘ(1) 2 Indeed, the RHS expresses the number of elements in a for τ > 3. Under this perspective, the improvement for τ ≥1 2 rhombus of radius ρ, when ρ is an integer. Solving for ρ, spotlights the advantage of replication: as M is of the same (cid:112) order with N, the flow model is a fair comparison to [2]. −1+ 2Qw,m−1 ρ= . (36) Last, when the ratio of M over KN approaches 1, hardly 2 does it remain any capacity left for replication: rate C in the Observe that last two columns(cid:16)√of th(cid:17)e tables essentially match the Gupta- 2ρ(ρ+1)2ρ+1≤2ρˇ(ρˇ+1)2ρˇ+1+[Q −1−2(ρˇ+1)ρˇ](ρˇ+1). Kumar rate of Θ M . 3 3 w,m (37) Indeed, the above is an equality when ρ is integer, as the VII. CONCLUSIONS&FUTUREWORK second term of the RHS vanishes. Moreover, it is easy to see Inthiswork,weinvestigatedonthejointdeliveryandrepli- that the LHS is a convex and increasing function of Q , w,m cation problem of flat wireless networks with caching. After (i.e., if we substitute ρ = ρ(Q ) from (36)), whereas w,m formulatingthepreciseproblemonsquarelatticenetworks,we the RHS is piecewise linear on Q (the pieces’ endpoints w,m gradually reduced it to a simple density-based problem; from correspond to integer values of ρ). Comparing the two, the itssolution,wedesignedanefficientorder-wisereplicationand LHS cannot exceed the RHS, as illustrated in Fig. 5. Thus, delivery scheme and derived the scaling laws on the required √ (cid:114) link capacity. As our study reveals, the key parameters in (cid:88) 2ρ+1 2 1 h(n,m)≥2ρ(ρ+1) = (Q −1) Q − , network sustainability are the file popularity parameter τ and 3 3 w,m w,m 2 the relative scaling of the numbers of files vs. network nodes. n∈Qw,m A future extension of this work will include a new dimen- withtheequalitybeingtruewhenρisinteger.Moreover,since (cid:113) √ sion of scaling with node capacity K; that is, in an expanding f(x) = (x − 1) x− 1 − x( x − 1) ≥ 0 for x ≥ 1 (as 2 network, not only new nodes and content are added, but f(1)=0, f(cid:48)(1)>0 and f(cid:48)(cid:48)(x)≥0 for x≥0), it is existing nodes evolve with augmented buffer storage. In such √ aevseentufpo,rrleopwlicvaatliuoensiosfeτxppercotveiddetodtbheatadcvacahnetacgaepoaucsitoyrdKer-swcailsees, (cid:88) h(n,m)≥ 32Qw,m(cid:104)Qw12,m−1(cid:105). (38) n∈Qw,m sufficiently fast with the number of files M. Then, the total number of hops for each m is expressed as √ APPENDIXA (cid:88) (cid:88) (cid:88) (cid:88) 2 (cid:104) 1 (cid:105) ProofofLemma3: UsingLemma2,weuseasinglearbitrary h(n,m)= h(n,m)≥ 3 Qw,m Qw2,m−1 shortest path for each n,m pair to a unique wm,n. From the n∈N √ w∈(cid:34)Wm n∈Qw,m w(cid:35)∈Wm  di.ies.c,utshseioren eoxnisdtmN, dthme sneotdWesminhathsecanredtiwnaolriktytWhamt s=ervNedtmhe, ≥ 32Wm (cid:88) Qw,mWm−1 (cid:115)(cid:88) Qw,mWm−1−1 requests for file m. Let Qw,m be the set of nodes served from √ w∈Wm w∈Wm nodewforrequestsonm,andQw,mbeitscardinality.Clearly, (=35) 2N(cid:104)d−m12 −1(cid:105) 3 10 Noden∈/Qw,m Nodewm Noden∈Qw,m w n SetQw,m (cid:96) Nodes1hopaway n(cid:48) Nodes2hopaway (cid:96)(cid:48) Qw,m Nodes3hopaway Fig.5. TheRHSof(37))(dashedline) N(laosdteisnc4ohmoppleatwearying) asinzdetQhew,LmH:Sknooft(s3i7n)divceartseutshethpeocinlutsstoefr Fthieg.n6o.desThseervsqeduabreycnluosdteerwQ,wwiothf ρ(Qw,m)=1,2,...,where(37)isan the I-shaped and L-shaped routes Fig.4. Aclusterofpeersservedbynodewm fordatamofρˇ=3. equality. fornodesnandn(cid:48),respectively. where the last inequality is Jensen’s inequality applied on Proof of Theorem 7: The first part has already been shown. √ the convex function g(x)=x( x−1) for the average node For the second part, note that dCD(cid:63) <4d◦ . Thus, m m count per w, (cid:80) Q W−1. √ (cid:34) (cid:35) √ (cid:34)(cid:115) (cid:35) w w,m m 2 (cid:88) 1 2 (cid:88) 4 In total, 2CCD(cid:63)= −1 p = −2 p √ 3 (cid:112)dCD(cid:63) m 6 dCD(cid:63) m av(cid:96)eC(cid:96)=21N (cid:88) (cid:88)h(n,m)pm≥ 62 (cid:88)(cid:16)d−m12−1(cid:17)pm. > √m2∈M(cid:88) (cid:34)m1 −2(cid:35)p =mC∈CMD◦− √m2. m∈Mn∈N m∈M 6 (cid:112)d◦ m 6 As this inequality is true for the optimal values of the AT m∈M m problem, the result follows. Proof of Lemma 8: Observe that on each iteration of k, Algorithm1addsdatatoacachewiththeleastitems.Thus,for Proof of Lemma 5: Consider density vectors [d ] and [d(cid:48) ] m m cache n to end up with K+1 or more items, all other n(cid:48) (cid:54)=n both satisfying the constraints, and their convex combination (cid:80) should have at least K items, i.e., |B |≥NK+1. d(cid:48)m(cid:48) =λdm+(1−λ)d(cid:48)m forλ∈(0,1).Function √1x isstrictly This is a contraddiction: each itemn∈mNappnears in Nd◦ = convex, hence CCD([d(cid:48)(cid:48)] < λCCD([d ])+(1−λ)CCD([d(cid:48) ]), m m m m 4ν−νm caches, and from d◦ construction, it is i.e., the objective CCD([d ]) is strictly convex. Moreover, the m m (cid:88) (cid:88) (12) (cid:88) constraints N−1 ≤ dm ≤ 1 are convex, and the constraint |Bn|=N d◦m ≤ N dCmD(cid:63) ≤NK. (cid:80) d ≤ K is affine. Thus, the optimization is strictly n∈N m∈M m∈M m∈M m convex [20, §4.2.1], and has a unique optimal solution. Proof of Theorem 9: Given the shortest path routing and the replication pattern, we can compute the expected number Proof of Lemma 6: Let us first prove that M = (cid:20) of hop counts from a random client to the node serving its {1,2,...,l−1}. Suppose that in the optimal solution dCD(cid:63), m request. Indeed, consider that a node in W that stores m in M contains m but not m , with m < m . Due to the m 2 1 1 2 (cid:20) its cache, and let us compute the summation of hops to the ordering, it is p ≥ p . We then construct a new solution m1 m2 client nodes in Q it serves. Q is a set of nodes laying d(cid:48) byexchangingthedensitiesofm andm ,i.e.d(cid:48) =dCD(cid:63) w,m w,m anmd d(cid:48) = dCD(cid:63), and maintaining t1he densi2ties ofma2ny othme1r on a square of size 2νm◦ ×2νm◦, with wm at its center (Fig. 6). m1 m2 From symmetry, we can perform the summation by counting m: d(cid:48) =dCD(cid:63) for m∈M\{m ,m }. m m 1 2 twice the required hops along one axis. For d◦ <1, The link rate of the two solutions differ by m CCD(cid:63)−C(cid:48) = √62(cid:34)(cid:32)(cid:112)pdmC1D(cid:63) − (cid:112)pdm(cid:48)2 (cid:33)+(pm2 −pm1)(cid:35) (cid:88)h(n,m)=2×[(cid:122)1+.M..o+vem2eνnm◦t−s1o+nt1h(cid:125)e+(cid:124)ve.rt.ic.a+ld(ir2eνctm◦io−n1−1)(cid:123)c]o(cid:122)2lu(cid:125)νm(cid:124)m◦n(cid:123)s √2 m1 (cid:32) m12 (cid:33) n∈Qw,m = 6 (pm1 −pm2) (cid:112)dCD(cid:63) −1 . =23νm◦−1. m1 Given the Nd◦ nodes of W , the average load across all If p = p , d(cid:48) has the same link rate CCD as the dCD(cid:63), m m m1 m2 m m links is computed by summing across m from (9): therefore it is optimal, too. This means that we can replace daCmcDo(cid:63)nwtriatdhicdt(cid:48)miontoacsotnhsetroupcttimaaselcsoonludtioopntiimsaulnsioqluuetio(Lne,mwmhiach5)is. CAN◦ = 21N (cid:88) (cid:88) 23νm◦−11{d◦m<1}pm Consider now the case of p > p . As m ∈/ M , it is m∈Mw∈Wm idimsCmDap1(cid:63)lci=eosndtthr(cid:48)maa2dtiC<ctCi1oD(cid:63).nC−tloCeat(cid:48)hrCleDy,>hfyro0pm.oTthmthhe1usesisa,bdaoCmbvmDo1(cid:63)eu2eitsqmunao1ttiaoonnp1d,tipmmma21l.,(cid:20)>whpimch2 = 121N m1(cid:88)∈M(cid:88)Nd(cid:34)◦m213νm◦−11{(cid:35)d◦m<1}pm1 3√ Similarly,fortheM(cid:20)part,asolutionwithm3 ∈M(cid:20),m4 ∈/ ≤ 4 + 4 (cid:112)d◦ −1 pm = 4 + 4 2CCD◦. M and m3 <m4 leads to a contradiction. m∈M m (cid:20)

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.