
Storage Allocation for Multi-Class Distributed Data Storage Systems

Koosha Pourtahmasi Roshandeh, Moslem Noori, Member, IEEE, Masoud Ardakani, Senior Member, IEEE, and Chintha Tellambura, Fellow, IEEE
Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada
Email: {pourtahm, moslem, ardakani, ct4}@ualberta.ca

arXiv:1701.06506v1 [cs.IT] 23 Jan 2017

Abstract—Distributed storage systems (DSSs) provide a scalable solution for reliably storing massive amounts of data coming from various sources. Heterogeneity of these data sources often means that different data classes (types) exist in a DSS, each needing a different level of quality of service (QoS). As a result, efficient data storage and retrieval processes that satisfy various QoS requirements are needed. This paper studies storage allocation, meaning how data of different classes must be spread over the set of storage nodes of a DSS. More specifically, assuming a probabilistic access to the storage nodes, we aim at maximizing the weighted sum of the probability of successful data recovery of data classes, when for each class a minimum QoS (probability of successful recovery) is guaranteed. Solving this optimization problem for a general setup is intractable. Thus, we find the optimal storage allocation when the data of each class is spread minimally over the storage nodes, i.e., minimal spreading allocation (MSA). Using upper bounds on the performance of the optimal storage allocation, we show that the optimal MSA approaches the optimal performance in many practical cases. Computer simulations are also presented to better illustrate the results.

Index Terms—Distributed storage systems, storage allocation, minimal spreading allocation, multi-class.

I. INTRODUCTION

Distributed storage systems (DSSs) play a vital role in numerous innovative and cost-effective services by providing reliable anytime/anywhere access to immense amounts of data. For this, different types of data are stored with redundancy over a networked set of storage nodes. That is, for storing a data chunk, the DSS server (controller) first encodes it according to an error-correction coding scheme. Then, the encoded data is partitioned into multiple pieces and each piece is stored over a storage node.

When a request to download the data is submitted to the server, it tries to access all or a subset of the storage nodes in order to obtain pieces of the encoded data. If the server receives enough pieces of the encoded data, the original chunk can be successfully recovered to serve the download request. However, due to hardware/software failure or network congestion, the server may not receive enough pieces of the encoded data, resulting in data recovery failure. To quantitatively account for these circumstances, the probability of successful data recovery by the server, denoted by $P_s$, is commonly used in the literature [1]–[3].

The way that the server partitions and stores the coded data over the storage nodes is often referred to as storage or memory allocation [1], [2]. It is known that storage allocation affects different performance measures of a DSS such as $P_s$ [2]–[7], service rate [8]–[10], as well as storage and repair cost [11]. Hence, several studies have been dedicated to improving the performance of DSSs via careful storage allocation. In this work, we focus on studying the effect of storage allocation on the successful data recovery, $P_s$.

Finding the optimal storage allocation to maximize $P_s$ is known to be a quite challenging problem and the optimal storage allocation for a general DSS setup has yet to be found [1]. Nevertheless, several studies have addressed maximum-$P_s$ storage allocation for specific setups and under various assumptions. For instance, optimal quasi-uniform allocation for maximizing $P_s$, where a fixed-size randomly chosen subset of storage nodes is accessed by the server, is studied in [2]. Also, two approximation algorithms are proposed in [3] to maximize $P_s$ in a heterogeneous DSS where storage nodes have different reliabilities. Moreover, for a DSS with heterogeneous nodes in terms of storage capacity, an iterative algorithm has been proposed in [6] to find a $k$-guaranteed allocation¹. In another study [10], it is shown that, considering the effect of successful data recovery on the service rate and assuming exponential waiting time at the storage nodes, the service rate is maximized through uncoded data replication over the storage nodes.

¹An allocation is said to be $k$-guaranteed if the stored data can be recovered by accessing any arbitrary set of $k$ storage nodes.

While the aforementioned studies have shed some light on the optimal storage allocation, they fall short of addressing a practical aspect of DSSs, that is, the heterogeneity of the stored data. To be more specific, a DSS usually stores different classes (types) of data often coming from different sources [12]. We call such a DSS a multi-class DSS, where each class requires its own level of quality of service (QoS). For instance, Amazon S3 allows its customers to choose from three storage classes offering different levels of durability, reliability, and availability [13]. While such asymmetric QoS requirements have to be taken into account in the storage allocation, to the best of our knowledge, they have not been considered in the previous studies. Note that in a multi-class DSS, the storage allocations for different classes are intertwined due to the limit on the available storage space. This makes storage allocation for a multi-class DSS a more challenging problem compared to the storage allocation for a single-class DSS.

In this paper, we study the problem of storage allocation for a multi-class DSS. More specifically, we consider storing $K$ classes of data over a DSS with $N$ storage nodes. To account for possible access failures (e.g., due to network congestion), we adopt the probabilistic access model [1] where each storage node fails to respond to the server's access request with a given probability $q$.
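To make the probabilistic access model concrete, the following Python sketch computes the recovery probability of a single data class by enumerating every set of nodes that may respond. It is only an illustration under stated assumptions: piece sizes are taken to be normalized so that the data is recoverable once the accessed pieces add up to at least one unit, nodes are assumed to fail independently with probability $q$, and the function name `recovery_probability` is hypothetical.

```python
from itertools import combinations


def recovery_probability(pieces, q, tol=1e-12):
    """Exact probability that the accessed pieces add up to at least one unit.

    pieces[n] is the (normalized) amount of coded data stored on node n and
    q is the probability that a node fails to respond, independently of the
    other nodes. Both conventions are assumptions made for this sketch.
    """
    p = 1.0 - q  # probability that a single node responds
    n_nodes = len(pieces)
    total = 0.0
    # Enumerate every possible set of responding nodes (2**n_nodes sets).
    for size in range(n_nodes + 1):
        for responding in combinations(range(n_nodes), size):
            if sum(pieces[n] for n in responding) >= 1.0 - tol:
                total += p**size * (1.0 - p) ** (n_nodes - size)
    return total


# Three nodes, each failing with probability q = 0.2.
replicated = recovery_probability([1.0, 1.0, 0.0], q=0.2)  # two full copies
spread = recovery_probability([0.5, 0.5, 0.5], q=0.2)      # any two nodes suffice
```

The enumeration is exponential in the number of nodes, so a sketch of this kind is only meant for small illustrative setups.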
Assuming a storage budget $T_i$ for the $i$th class of data, we focus on maximizing the weighted sum of the probability of successful recovery of all data classes, where weights reflect the QoS requirement of each class. Further, to guarantee a minimum viable service for each class, a constraint on the minimum probability of successful recovery for each class is introduced in the optimization problem.

The analytical solution of this optimization problem is, however, intractable even if there is only one class in the network, i.e., a single-class DSS. That said, we narrow down our attention to the optimal minimal spreading allocations (MSA) to maximize the considered weighted sum. It is worth mentioning that MSA is the optimal symmetric allocation for maximizing $P_s$ in some single-class DSS setups [1], motivating us to contemplate its performance for the considered multi-class DSSs. Furthermore, MSA is the optimal allocation in terms of maximizing the service rate in a single-class DSS with exponential waiting time at the users [10]. In addition, another study [8] shows that MSA minimizes the expected recovery delay when the storage budget is an integer.

Assuming MSA as the storage allocation, we first formulate finding the optimal MSA to maximize the weighted sum of the probabilities of successful recovery as a non-linear integer optimization. Then, an iterative algorithm with time complexity $O(N)$ is presented to solve this optimization problem. Despite the optimality of this iterative solution, its linear complexity may become a challenge when applied to large-scale DSSs. To address this issue, we also present a suboptimal solution for the considered weighted-sum maximization problem. This suboptimal solution has a worst-case complexity of $O(K)$ and the resulting MSA either matches or slightly underperforms the optimal MSA. In the next step, the performance of the presented (sub)optimal MSA is compared with the upper bound on the weighted sum of the probabilities of successful recovery when no assumption on the format of the allocation policy is made. We analytically prove that the optimal MSA achieves this upper bound for a significant range of $q$, meaning that MSA is indeed the optimal allocation in those ranges.

II. SYSTEM MODEL

In this section, we explain the data storage and server access model in the considered multi-class DSS.

A. Storage model

Here, we consider the storage of $K$ classes of data over a DSS. More specifically, $k$ data blocks from each of the $K$ classes of data are to be stored over a DSS with $N$ equal-capacity storage nodes. For clarity of presentation, let us assume that the storage capacity of each node is also $k$ blocks. Later in Section V, we explain how this assumption can be relaxed to apply our results to more general cases. The storage size for class $i \in \mathcal{K} = \{1,2,\dots,K\}$ is limited by a storage budget $T_i$, which can reflect the QoS of class $i$.

To store the data of class $i \in \mathcal{K}$, its $k$ blocks of data are first encoded by a suitable maximum distance separable (MDS) code [1], [3] to form $n_i$ coded blocks such that $n_i \le T_i$. Having these $n_i$ encoded data blocks, the server then spreads them over the $N$ storage nodes according to a storage allocation policy formally defined later in this section. That said, the server can retrieve the $k$ data blocks of class $i$ if it successfully receives at least $k$ out of the $n_i$ coded blocks from the storage nodes when a download request is submitted. A graphical illustration of the system model is presented in Fig. 1.

Fig. 1. System model.

For simplicity of presentation, we normalize all storage capacities and data sizes with $k$. That is, the normalized capacity of each storage node $n$, denoted by $c_n$, is equal to one for $n \in \mathcal{N} = \{1,2,\dots,N\}$. Further, $0 \le x_{i,n} \le c_n = 1$ denotes the normalized number of encoded blocks from class $i$ that are stored over storage node $n$. With some abuse of notation, we keep using $n_i$ and $T_i$ for the normalized number of coded blocks and storage budget of class $i$. To ensure that the budget limits and the nodes' storage limits are not violated, we have $\sum_{n\in\mathcal{N}} x_{i,n} = n_i \le T_i$ and $\sum_{i\in\mathcal{K}} x_{i,n} \le c_n = 1$. In the following, we use a storage budget vector $\mathbf{T} = (T_1, T_2, \dots, T_K)$ to refer to the storage budgets of all classes.

Now that we have explained the storage model, we formally define the allocation policy.

Definition 1. A storage allocation policy elucidates the way the encoded blocks of the $K$ classes are stored over the $N$ nodes and is identified by a $K$-tuple $\mathcal{A} = (A_1, A_2, \dots, A_K)$ where $A_i = (x_{i,1}, x_{i,2}, \dots, x_{i,N})$ for all $i \in \mathcal{K}$.

Besides determining how coded data blocks are spread over the storage nodes, a storage allocation policy $\mathcal{A}$ also implicitly identifies the rate of the MDS code for each data class.

B. Access model

A probabilistic access model [1] is adopted at the server. That is, the server tries to access all the storage nodes when a request to download the data of a class, say class $i$, is submitted. However, the server's access attempt to a node may fail due to a hardware failure or congestion at the node. Hence, a probability of successful access $p < 1$ is considered. The server is able to recover the data of class $i$ if $\sum_{n\in r} x_{i,n} \ge 1$, where $r$ denotes the set of nodes which have been successfully accessed by the server. Hence, the probability of successful recovery of the data of class $i$ is

$$P_{s,i} = \sum_{r \subseteq \mathcal{N}} p^{|r|}(1-p)^{N-|r|}\, \mathbb{I}\Big[\sum_{n\in r} x_{i,n} \ge 1\Big], \tag{1}$$

where $\mathbb{I}[G]$ is the indicator function, meaning that $\mathbb{I}[G]=1$ if $G$ is true, and $\mathbb{I}[G]=0$ otherwise.

Given an allocation policy $\mathcal{A} = (A_1, A_2, \dots, A_K)$, the $K$-tuple $\mathbf{p}_s = [P_{s,i}]_{i\in\mathcal{K}}$ is a feasible vector of probabilities of successful recovery if the data of the $i$th class can be successfully recovered with probability $P_{s,i}$. We find the following definition useful for our problem definitions in Section III.

Definition 2. The successful recovery region of a multi-class DSS, denoted by $\Theta$, is defined as the union of all feasible vectors $\mathbf{p}_s = [P_{s,i}]_{i\in\mathcal{K}}$ given a storage budget vector $\mathbf{T}$.

III. MOTIVATION AND PROBLEM DEFINITION

The effect of the storage allocation policy on the probability of successful data recovery has been investigated in several prior studies for single-class DSSs. To this end, optimal storage allocation policies that maximize the probability of successful recovery in single-class DSSs have been proposed for various setups. For more details, the interested reader is referred to [1]–[3].

Finding such optimal allocation policies, however, is more challenging for a multi-class DSS, as the allocation policy of each class potentially affects the probability of successful recovery of other classes. Further, in a multi-class DSS, the allocation policy should reflect the reliability needs, measured by the probability of successful recovery, of each of the diverse data classes.

Here, to improve the overall performance of the system while taking into account different classes' service requirements, we consider optimizing a weighted sum of the $P_{s,i}$'s, where a higher weight is assigned to more important data classes. We aim at solving this optimization problem under the constraint that for any class $i$, $P_{s,i} \ge P_{s,i}^{\min}$, to ensure a minimum viable service for it. Denoting the weight of class $i$ by $\alpha_i > 0$, the weighted sum maximization problem is formulated as:

$$\begin{aligned} \underset{\mathcal{A}}{\text{maximize}} \quad & \langle \boldsymbol{\alpha}, \mathbf{p}_s \rangle \\ \text{subject to} \quad & \mathbf{p}_s \in \Theta, \\ & P_{s,i}^{\min} \le P_{s,i} \quad \forall i \in \mathcal{K}, \end{aligned} \tag{P1}$$

where $\boldsymbol{\alpha} = [\alpha_i]_{i\in\mathcal{K}}$ is a vector containing the weights associated with the classes and $\langle\cdot,\cdot\rangle$ is the inner product of the vectors.

The difficulty of solving (P1) lies in finding $\Theta$; otherwise the objective function and the constraint on the minimum $P_{s,i}$ are simple linear functions. An analytical description of $\Theta$, however, is intractable due to the strong interdependency of the allocation policy of one class and the probability of successful recovery of other classes. The following example helps explain such an interdependency for a simple DSS.

Example 1. Consider a DSS with $N = 3$ nodes where $c_n = 1, \forall n \in \mathcal{N}$ and $\mathbf{T} = (\tfrac{3}{2}, \tfrac{5}{4})$. Table I shows four possible allocations for this setup alongside their corresponding probability of successful recovery. As seen from this table, if $\alpha_1 \gg \alpha_2$, then Case 1 and Case 4 are optimal for $p \le \tfrac{1}{2}$ and $\tfrac{1}{2} < p$, respectively, resulting in the maximum $P_{s,1}$. On the other hand, for $\alpha_1 \ll \alpha_2$ or $\alpha_1 = \alpha_2$, only Case 1 results in the optimal solution of (P1). ∎

TABLE I
DIFFERENT ALLOCATIONS AND THEIR CORRESPONDING PROBABILITY OF SUCCESSFUL RECOVERY

Allocation                                                      $P_{s,1}$        $P_{s,2}$
Case 1: $A_1 = (1, 1/2, 0)$, $A_2 = (0, 1/4, 1)$                $p$              $p$
Case 2: $A_1 = (1, 3/8, 1/8)$, $A_2 = (0, 5/8, 5/8)$            $p$              $p^2$
Case 3: $A_1 = (3/4, 2/4, 1/4)$, $A_2 = (1/4, 1/4, 3/4)$        $2p^2 - p^3$     $2p^2 - p^3$
Case 4: $A_1 = (1/2, 1/2, 1/2)$, $A_2 = (5/12, 5/12, 5/12)$     $3p^2 - 2p^3$    $p^3$

Since (P1) cannot be solved in a general setup, here we focus on a special, yet practically important, case of the storage allocation, called minimal spreading allocation (MSA), formally defined in the following.

Definition 3. In a minimal spreading allocation policy, for each class $i$, we have

$$x_{i,n} = \begin{cases} 1 & \forall n \in \mathcal{V}_i \\ 0 & \text{otherwise} \end{cases}$$

where $\mathcal{V}_i \subset \mathcal{N}$ is the set of nodes storing the data of class $i$.

It has been shown that MSA is optimal in terms of expected recovery delay, average service rate and probability of successful recovery for several setups of single-class DSSs [1], [8], [10]. Moreover, by adopting an MSA policy, the data of each class could be stored through replication [1], removing the need for devising custom-tailored storage coding schemes. That said, we focus on finding the optimal MSA for the considered multi-class DSS in the rest of the paper.

From Definition 3, one can see that an MSA is fully identified by $|\mathcal{V}_i|$, denoted by $x_i$ from now on, as $x_{i,n}$ is either 0 or 1.

Notice that when $\sum_{i=1}^{K} \lfloor T_i \rfloor \le N$, the MSA optimization has a trivial solution of $x_i = \lfloor T_i \rfloor, \forall i \in \mathcal{K}$. Hence, we only consider $\sum_{i=1}^{K} \lfloor T_i \rfloor > N$ in the sequel. Furthermore, $P_{s,i} = 1 - q^{x_i}$ where $q = 1 - p$. This is because the recovery of the class $i$ data fails only when all $x_i$ nodes containing its data fail. Hence, since $\sum_{i} \alpha_i P_{s,i} = \sum_i \alpha_i - \sum_i \alpha_i q^{x_i}$, finding the optimal MSA to maximize the weighted sum can be formulated as the following minimization problem:

$$\begin{aligned} \underset{\{x_i\}_{i\in\mathcal{K}}}{\text{minimize}} \quad & \sum_{i=1}^{K} \alpha_i q^{x_i} \\ \text{subject to} \quad & \sum_{i=1}^{K} x_i \le N, \\ & \sum_{i=1}^{K} \lfloor T_i \rfloor > N, \\ & x_i^{\min} \le x_i \le \lfloor T_i \rfloor \quad \forall i \in \mathcal{K}, \end{aligned} \tag{P2}$$

where $x_i \in \mathbb{Z}^+$ and $x_i^{\min} = \lceil \log_q(1 - P_{s,i}^{\min}) \rceil$. Note that in the above optimization problem, the first constraint makes sure that the total available storage capacity is not exceeded. Further, the minimum requirement on the probability of successful recovery and the budget limit of each class are enforced through the last constraint².

²Here, it is assumed that $x_i^{\min} \le T_i$ and $\sum_{i\in\mathcal{K}} x_i^{\min} \le N$, as the problem is infeasible otherwise.

Now, with a change of variable $y_i = x_i - x_i^{\min}$, the optimization problem in (P2) is transformed into the following:

$$\begin{aligned} \underset{\{y_i\}_{i\in\mathcal{K}}}{\text{minimize}} \quad & \sum_{i=1}^{K} \beta_i q^{y_i} \\ \text{subject to} \quad & \sum_{i=1}^{K} y_i \le N', \\ & \sum_{i=1}^{K} \lfloor T_i' \rfloor > N', \\ & 0 \le y_i \le \lfloor T_i' \rfloor \quad \forall i \in \mathcal{K}, \end{aligned} \tag{P3}$$

where $\beta_i = \alpha_i q^{x_i^{\min}}$, $N' = N - \sum_{i\in\mathcal{K}} x_i^{\min}$ and $T_i' = T_i - x_i^{\min}$. Note that (P3) resembles (P2) with $x_i^{\min} = 0, \forall i \in \mathcal{K}$. Hence, for ease of notation and discussion, in the rest of this work, we focus on (P2) where all $x_i^{\min}$'s are set to zero. The optimization problem in (P2) is a non-linear integer optimization problem. In the following section, we present our main results on how the storage allocation problem in (P2) can be tackled.

IV. MAIN RESULTS

A. Exact solution of (P2)

The exact solution of (P2) can be found using an iterative approach where we assign the available storage units one at a time.

Imagine that we want to assign the first unit of storage. It is clear that, in order to minimize $\sum \alpha_i q^{x_i}$, this unit must be assigned to the class with the largest $\alpha_i$; let us call this class $j$. Now, the storage allocation problem can be updated to a new one where the total storage budget is reduced to $N-1$ and class $j$ already has one storage unit assigned to it. Hence, if the optimal solution for class $j$ is $x_j$ in the original problem, in the updated problem it is $x_j - 1$. Also, note that class $j$ contributes to $\sum \alpha_i q^{x_i}$ as $(\alpha_j q)\, q^{x_j - 1}$. Therefore, the updated problem is similar to the original with two minor differences: $N$ is updated to $N-1$ and $\alpha_j$ is updated to $\alpha_j q$. That is, the second unit of storage can now be assigned following similar steps. Repeating this approach, we can find the optimal solutions.

The only other thing that we have to be careful about is that we should not let any data class violate its storage limit. For this, if a class reaches its limit, we set $x_i$ for this class to $\lfloor T_i \rfloor$, remove this class from the problem, and continue. This recursive procedure is continued until all $N$ storage units are assigned. The above procedure is presented as Algorithm 1.

Algorithm 1
procedure $(N, K, \mathbf{T}, \boldsymbol{\alpha})$
    $x_i^{opt} \leftarrow 0, \ \forall i \in \mathcal{K}$
    while $N > 0$ do
        $j \leftarrow \arg\max_i \alpha_i$   (class with the largest current weight)
        $x_j^{opt} \leftarrow x_j^{opt} + 1$
        $\alpha_j \leftarrow q\,\alpha_j$
        $N \leftarrow N - 1$
        if $x_j^{opt} = \lfloor T_j \rfloor$ then
            $\boldsymbol{\alpha} \leftarrow (\alpha_1, \dots, \alpha_{j-1}, \alpha_{j+1}, \dots, \alpha_K)$   (remove class $j$)
        end if
    end while
end procedure

The time complexity of this solution is $O(N)$. This is because we assign the available storage units one at a time. Hence, we end up repeating the assignment process $N$ times. After every assignment, we need to perform one comparison to find the new largest weight. When $N$ is large, this complexity order may not be acceptable. Hence, in the next section, we suggest another approach for solving (P2) that in many cases gives the optimal solution. If not, the solution is close to optimal. The complexity order of the new algorithm is $O(K)$, which can be much smaller than $O(N)$. Moreover, since in some cases the solution of the new approach is in closed form, the relation between the system parameters and the optimal solution is better seen compared to the iterative approach presented above.

B. A low-complexity near-optimal solution of (P2)

The optimization problem (P2) can be directly solved if we remove the budget constraints for data classes (or equivalently set $T_i = N, \forall i \in \mathcal{K}$) and let $x_i \in \mathbb{R}, \forall i \in \mathcal{K}$. The following lemma gives the optimal solution of this relaxed version of (P2).

Lemma 1. Consider $x_i \in \mathbb{R}, \forall i \in \mathcal{K}$. Then,

$$x_i = \frac{N}{K} + \frac{1}{K} \log_q \frac{\prod_{j=1, j\ne i}^{K} \alpha_j}{\alpha_i^{K-1}}, \quad \forall i \in \mathcal{K},$$

minimizes $\sum_{i=1}^{K} \alpha_i q^{x_i}$ subject to $\sum_{i=1}^{K} x_i \le N$.

Proof: Using the arithmetic-mean geometric-mean inequality, we have

$$\sum_{i=1}^{K} q^{x_i + \log_q \alpha_i} \ \ge\ K \sqrt[K]{\prod_{i=1}^{K} q^{x_i + \log_q \alpha_i}} \ =\ K q^{\frac{\sum_{i=1}^{K} x_i + \log_q \prod_{i=1}^{K} \alpha_i}{K}}. \tag{2}$$

Now, notice that the right-hand side of (2) is minimized when $\sum_{i=1}^{K} x_i = N$. Moreover, the arithmetic mean achieves its lower bound when

$$\forall i,j \in \mathcal{K}, \quad q^{x_i + \log_q \alpha_i} = q^{x_j + \log_q \alpha_j} = q^{\frac{N + \log_q \prod_{i=1}^{K} \alpha_i}{K}} = D, \tag{3}$$

where $D$ is a constant. Solving (3) for $x_i$ results in $x_i = \frac{N}{K} + \frac{1}{K} \log_q \frac{\prod_{j=1, j\ne i}^{K} \alpha_j}{\alpha_i^{K-1}}$. ∎

While Lemma 1 provides the optimal real-valued solutions for the relaxed version of (P2) with no individual budget constraints, the actual solutions of (P2) are non-negative integers in the range of the budget constraints of the different data classes.

In the following we show that it is possible to use Lemma 1 as a baseline to form a low-complexity and near-optimal solution for (P2). The overview of our approach is that we use the unconstrained values obtained in Lemma 1 to partition the set of $K$ data classes into three subsets. Based on the cardinality of these subsets, three different cases are identified. In Case 1, we directly propose the optimal solutions of (P2). In Case 2, we provide an iterative approach with worst-case complexity order $O(K)$ for solving (P2). In Case 3, we propose a sub-optimal solution of (P2), again with worst-case complexity $O(K)$. Our numerical results in Section VI show that the performance of this solution is very close to the optimal solution of Section IV-A. Since typically $K \ll N$, the new algorithm is much more efficient than the optimal solution.

Based on the values of $x_i = \frac{N}{K} + \frac{1}{K} \log_q \frac{\prod_{j=1, j\ne i}^{K} \alpha_j}{\alpha_i^{K-1}}$, we partition the set of data classes $\mathcal{K}$ into three distinct subsets $\mathcal{K}_1$, $\mathcal{K}_2$ and $\mathcal{K}_3$ such that $\mathcal{K}_1 = \{i \mid x_i < 0, i \in \mathcal{K}\}$, $\mathcal{K}_2 = \{i \mid 0 \le x_i < \lfloor T_i \rfloor, i \in \mathcal{K}\}$ and $\mathcal{K}_3 = \{i \mid x_i \ge \lfloor T_i \rfloor, i \in \mathcal{K}\}$. Now, three different cases are possible:
1) $|\mathcal{K}_1| = |\mathcal{K}_3| = 0$.
2) $|\mathcal{K}_1| = 0$, $|\mathcal{K}_3| \ne 0$.
3) $|\mathcal{K}_1| \ne 0$.
We address these three cases separately.

1) Case 1 ($|\mathcal{K}_1| = |\mathcal{K}_3| = 0$): In this case, all $x_i$'s obtained using Lemma 1 satisfy the budget constraints of their respective data classes. The only problem is that these $x_i$'s are not necessarily integers. In two steps, we find the optimal integer-valued solutions. First, using the following theorem, we show that the optimal value in class $i$ is either $\lfloor x_i \rfloor$ or $\lceil x_i \rceil$, where $x_i$ is found from Lemma 1. Then, in another theorem, we identify which classes should use the ceiling and which classes the floor of their corresponding $x_i$.

Theorem 2. For $x_i$ resulting from Lemma 1, assume $\exists i \in \mathcal{K}, x_i \notin \mathbb{Z}$. Also, assume $|\mathcal{K}_1| = |\mathcal{K}_3| = 0$. Then, the optimal solution of problem (P2) in class $i$ is either $x_i^{opt} = \lfloor x_i \rfloor$ or $\lfloor x_i \rfloor + 1$, $\forall i \in \mathcal{K}$.

Proof: See Appendix A.

Theorem 3. In Theorem 2, let $x_i = \lfloor x_i \rfloor + e_i, \forall i \in \mathcal{K}$. Without loss of generality, assume $e_1 \ge e_2 \ge \dots \ge e_K$, and $N - \sum_{i=1}^{K} \lfloor x_i \rfloor = M$. Then, the optimal solution of problem (P2) can be obtained as

$$x_i^{opt} = \begin{cases} \lfloor x_i \rfloor + 1 & 1 \le i \le M \\ \lfloor x_i \rfloor & M < i \le K \end{cases}$$

Proof: See Appendix B.

Simply put, Theorem 3 states that the $M$ classes with the largest non-integer part must receive $\lceil x_i \rceil$ and the other classes must receive $\lfloor x_i \rfloor$ storage units. Notice that in this case, we directly derived the optimal solution of (P2).

2) Case 2 ($|\mathcal{K}_1| = 0$, $|\mathcal{K}_3| \ne 0$): Here, $x_i \ge 0, \forall i \in \mathcal{K}$, but data classes that belong to $\mathcal{K}_3$ have $x_i$'s that are not smaller than their corresponding budget limit. The next theorem gives the optimal solution for the data classes in $\mathcal{K}_3$.

Theorem 4. Assume $|\mathcal{K}_1| = 0$. Then, the optimal solution of problem (P2) has the property $x_i^{opt} = \lfloor T_i \rfloor, \forall i \in \mathcal{K}_3$.

Proof: By contradiction and following similar steps to the proof of Theorem 2, one can easily prove this theorem. A detailed proof is therefore omitted.

Using Theorem 4, we have the optimal solution of optimization problem (P2) for the data classes in $\mathcal{K}_3$. Now, we can repeat solving (P2) with the updated values $\mathcal{K}_{new} = \mathcal{K} \setminus \mathcal{K}_3$ and $N_{new} = N - \sum_{i\in\mathcal{K}_3} \lfloor T_i \rfloor$.

The new optimization problem has fewer data classes and may end up being a problem identified by Case 1, 2 or even 3 (discussed below). Nonetheless, as the problem size is smaller, even in the worst case, after a maximum of $K$ rounds, (P2) is solved.

3) Case 3 ($|\mathcal{K}_1| \ne 0$): Here, since $\mathcal{K}_1 \ne \emptyset$, there are some negative values among the $x_i$'s resulting from Lemma 1. Moreover, we may have some values that go beyond the budget constraint of data classes (depending on the cardinality of $\mathcal{K}_3$). The complexity of this case is rooted in the many different scenarios that can happen. To see this complexity, note that if we force $x_i^{opt} = 0$ for a class with $x_i < 0$ and update the $x_i$'s from Lemma 1, some of the classes that used to be in $\mathcal{K}_3$ may no longer be there. Likewise, if we force $x_i = \lfloor T_i \rfloor$ for some of the members of $\mathcal{K}_3$, the elements of $\mathcal{K}_1$ will change. Also, the order in which we choose to set $x_i$'s to zero or $\lfloor T_i \rfloor$ affects the final solution, and our study suggests that the optimal ordering (even if possible to identify) does not follow any easy procedure. Hence, to fulfill our main goal of having a low-complexity solution, we suggest the following approach, which is later tested numerically and performs very close to the optimal solution.

After using Lemma 1 to find the $x_i$'s, we let $x_i^{opt} = 0, i \in \mathcal{K}_1$. Now, we can repeat solving (P2) with the updated value $\mathcal{K}_{new} = \mathcal{K} \setminus \mathcal{K}_1$. Again, note that the new problem has fewer classes, and that it may be a problem in any of the above three cases (most likely in Case 1). Therefore, in Case 3, similar to Case 2, the worst-case complexity is $O(K)$.

Please note that in Cases 1 and 2, we do not deviate from the actual optimal solutions, and only in Case 3 may we lose optimality. Moreover, Case 3 is a rare one because it needs some very low priority classes (classes with very small weight compared to others) such that Lemma 1 gives rise to negative $x_i$'s for them. Even if this is the case, assigning zero storage to those classes should not affect the overall performance much, as they had very small weights compared to others. Hence, we expect this procedure to perform well. This is indeed verified by our numerical results in Section VI.

We have summarized the above approach to sub-optimally solve (P2) in Algorithm 2.

Algorithm 2
procedure $(N, K, \mathbf{T}, \boldsymbol{\alpha})$
    while $\mathcal{K} \ne \emptyset$ do
        $\forall i \in \mathcal{K}, \ x_i \leftarrow \frac{N}{K} + \frac{1}{K} \log_q \frac{\prod_{j=1, j\ne i}^{K} \alpha_j}{\alpha_i^{K-1}}$
        $\mathcal{K}_1 \leftarrow \{i \mid x_i < 0, i \in \mathcal{K}\}$
        $\mathcal{K}_2 \leftarrow \{i \mid 0 \le x_i < \lfloor T_i \rfloor, i \in \mathcal{K}\}$
        $\mathcal{K}_3 \leftarrow \{i \mid x_i \ge \lfloor T_i \rfloor, i \in \mathcal{K}\}$
        if $\mathcal{K}_1 = \emptyset$ and $\mathcal{K}_3 = \emptyset$ then
            $M \leftarrow N - \sum_{i=1}^{K} \lfloor x_i \rfloor$
            $e_i \leftarrow x_i - \lfloor x_i \rfloor$
            for the $M$ greatest values of $e_i$: $x_i^{opt} \leftarrow \lfloor x_i \rfloor + 1$; otherwise $x_i^{opt} \leftarrow \lfloor x_i \rfloor$
            break
        else if $\mathcal{K}_1 = \emptyset$ and $\mathcal{K}_3 \ne \emptyset$ then
            $x_i^{opt} \leftarrow \lfloor T_i \rfloor, \ \forall i \in \mathcal{K}_3$
            $\mathcal{K} \leftarrow \mathcal{K} \setminus \mathcal{K}_3$
            $N \leftarrow N - \sum_{i\in\mathcal{K}_3} \lfloor T_i \rfloor$
        else if $\mathcal{K}_1 \ne \emptyset$ then
            $x_i^{opt} \leftarrow 0, \ \forall i \in \mathcal{K}_1$
            $\mathcal{K} \leftarrow \mathcal{K} \setminus \mathcal{K}_1$
        end if
    end while
end procedure

C. Bounding the performance

Now that (P2) is solved (optimally in Section IV-A or sub-optimally in Section IV-B), it is interesting to study how much we lose by solving for the MSA instead of solving the general optimal storage allocation formulated in (P1). Consider the gap between the weighted sum of the probabilities of successful recovery of the different classes for (P1) and (P2). A small gap indicates that instead of solving the highly complex (P1), one can resort to solving the much more efficient (P2) without much performance loss. In other words, a small gap suggests that MSA is close to optimal for our multi-class setup. Hence, we are interested to see when this gap is small.

As discussed, (P1) is too hard to be solved efficiently. Hence, here we first find an upper bound on the performance of (P1). Clearly, the performance gap between MSA and (P1) is less than or equal to the performance gap between MSA and the upper bound of (P1). Also, wherever MSA performs close to this upper bound, we know that MSA is efficient, and there is no need to solve the highly complex (P1).

A trivial upper bound on the performance of (P1) is obtained by generalizing the results of [1] as

$$\sum_{i\in\mathcal{K}} \sum_{r=0}^{N} \alpha_i \min\Big(\frac{r T_i}{N}, 1\Big) \binom{N}{r} p^{r} (1-p)^{N-r}. \tag{4}$$

As our numerical results show in Section VI, the solution of (P2) gets very close to this upper bound in a wide range of access probability $p$. This means MSA for a multi-class DSS is an efficient solution for those regions.

In order to analytically identify the region of $p$ where MSA is close to optimal, another simple comparison can be done. We compare the solution of (P2) with the ideal case where all classes are retrieved perfectly ($P_{s,i} = 1, \forall i \in \mathcal{K}$), where we consider the case that there are no individual budget constraints on the data classes.

The following theorem identifies the range of access probability $p$ where the performance gap between the optimal MSA and the perfect recovery case (described above) is less than $\epsilon$. This theorem is studied numerically in Section VI.

Theorem 5. Assume $T_i = N, \forall i \in \mathcal{K}$. Then the gap between the weighted sum of the successful recovery probabilities of all data classes in the optimal MSA and perfect recovery is less than $\epsilon$ for

$$p > 1 - \min\left\{ \left( \frac{\epsilon^{K}}{K^{K} \prod_{i=1}^{K} \alpha_i} \right)^{\frac{1}{N}}, \ \left( \frac{\alpha_{\min}^{K-1}}{\prod_{j=1, \alpha_j \ne \alpha_{\min}}^{K} \alpha_j} \right)^{\frac{1}{|N-1|}} \right\} \tag{5}$$

where $\alpha_{\min} = \min_{1\le i\le K} \alpha_i$.

Proof: See Appendix C.

According to the proof of Theorem 5, in the proposed range of access probability $p$, our heuristic algorithm gives the optimal MSA.

In the following, we present a generalization of the considered problem. Recall that we assumed that the normalized capacity of all storage nodes is equal to 1, i.e., $c_n = 1, \forall n \in \mathcal{N}$. The next section discusses situations where $c_n > 1$ for some $n$.

V. NODES WITH STORAGE CAPACITY LARGER THAN ONE

Previously we assumed $c_n = 1$ for all nodes. In other words, before normalization, this is equivalent to saying that each node can store the data of one class and that all nodes have the same capacity. Now, assume that the storage capacities of the nodes are different, and that they can be greater than the size of the data of one class. Normalizing these capacities with the data size of one class, node $n$ can store $c_n \ge 1$ unit(s) of data. Also, since our focus is on MSA, there is no point in considering non-integer $c_n$'s; hence $\forall n \in \mathcal{N}, c_n \in \mathbb{N}$.

To handle such situations, we can think of node $n$ with capacity $c_n > 1$ as a super-node which consists of $c_n$ sub-storage nodes, each having unit capacity. This is depicted in Figure 2. This new model is very similar to what we studied earlier where all nodes had unit capacity. The difference is in the access model. Here, we can think of two practically important cases.

Fig. 2. An equivalent system model for the storage node capacity greater than one.

First, assume that the access to each sub-storage node is independent of all other sub-storage nodes, and the data collector successfully accesses each sub-storage node with probability $p$. This can represent a practical scenario where different files are accessed at different times (e.g., requests are buffered at a storage node to be serviced later). In this case, the new model is equivalent to the MSA studied earlier and the optimal or efficient near-optimal solutions can be found using the previously discussed methods, by letting $N_{new} = \sum_{n=1}^{N} c_n$.

The other interesting case is when all the sub-storage nodes in one super-node are accessed simultaneously by the data collector. This access is successful with probability $p$ and fails with probability $1-p$ for all these sub-storage nodes. This, for example, can represent a situation where a hardware failure has affected a storage node (super-node). For MSA, allocating more than one sub-storage node of a super-node to a specific data class is pointless. Thus, the total number of sub-storage nodes allocated to data class $i$ cannot be more than the number of super-nodes, i.e., $x_i \le N, \forall i \in \mathcal{K}$. Moreover, as data are being assigned to sub-storage nodes, some super-nodes may run out of capacity, putting an even harsher limit on the remaining data classes.

To handle this case, we first solve MSA by letting $N_{new} = \sum_{n=1}^{N} c_n$ and $T_i^{new} = \min\{T_i, |\mathcal{N}|\}, \forall i \in \mathcal{K}$. Assume $x_m^{opt}$ is the greatest among the $x_i^{opt}$'s. Since we need to ensure that the total allocation given to class $m$ is more than that of class $j$ where $j \in \mathcal{K} \setminus \{m\}$, we start by allocating $x_m^{opt} \le N$ sub-storage units from different super-nodes. Since we want to avoid filling super-nodes for as long as possible, we start from the super-node with the largest capacity (super-node 1) and move on. After finishing the allocation of $x_m^{opt}$, we update the available super-nodes since some may have already been filled. Assume these filled super-nodes are denoted by $\mathcal{N}_f$. We repeat solving MSA by letting $\mathcal{N}_{new} = \mathcal{N} \setminus \mathcal{N}_f$, $\mathcal{K}_{new} = \mathcal{K} \setminus \{m\}$. Note that since we allocate one data class in each iteration, this algorithm is $O(K(\sum_{n\in\mathcal{N}} c_n))$.

We have summarized the above procedure in Algorithm 3.

Algorithm 3
procedure $(N, K, \mathbf{T}, \boldsymbol{\alpha}, c_n\text{'s})$
    while $|\mathcal{K}| > 0$ do
        $N \leftarrow \sum_{n\in\mathcal{N}} c_n$
        $T_i \leftarrow \min\{T_i, |\mathcal{N}|\}, \ \forall i \in \mathcal{K}$
        $x_i \leftarrow 0, \ \forall i \in \mathcal{K}$
        while $N > 0$ do
            $j \leftarrow \arg\max_i \alpha_i$   (class with the largest current weight)
            $x_j \leftarrow x_j + 1$
            $\alpha_j \leftarrow q\,\alpha_j$
            $N \leftarrow N - 1$
            if $x_j \ge \lfloor T_j \rfloor$ then
                $\boldsymbol{\alpha} \leftarrow (\alpha_1, \dots, \alpha_{j-1}, \alpha_{j+1}, \dots, \alpha_K)$   (remove class $j$)
            end if
        end while
        $m \leftarrow \arg\max_i x_i$
        $x_m^{opt} \leftarrow x_m$
        $\mathcal{N} \leftarrow \mathcal{N} \setminus \mathcal{N}_f$
        $\mathcal{K} \leftarrow \mathcal{K} \setminus \{m\}$
    end while
end procedure

VI. SIMULATION RESULTS

In this section, simulation results are presented to verify our analysis under different DSS setups.

Figure 3 presents the simulation results for a multi-class DSS where $N = 20$, $T_1 = 20$, $T_2 = 8$ and $T_3 = 4$. Further, the weights for each class are $\alpha_1 = 8$, $\alpha_2 = 5$, $\alpha_3 = 1$. More specifically, the weighted sums of the probabilities of successful recovery of all data classes for the optimal MSA obtained using Algorithm 1 and the near-optimal MSA obtained using Algorithm 2 are compared with the upper bound of (4). As a benchmark, the weighted sum of the probabilities of successful recovery for a random MSA, averaged over 100 realizations, is also depicted in Figure 3. As we see, our proposed solution in Algorithm 2 matches the optimal solution of Algorithm 1 and is in fact optimal. In addition, both solutions significantly outperform random MSA and achieve the upper bound for $p \ge 0.6$. This means that, in this range, MSA is indeed the optimal allocation for (P1).

Fig. 3. Performance of optimal MSA in comparison with average random MSA.

To verify our results in Theorem 5, we present Figure 4 where three classes of data are stored over $N = 15$ nodes. Here, it is assumed that $T_1 = T_2 = T_3 = N$ and $\alpha_1 = 6$, $\alpha_2 = 4$ and $\alpha_3 = 1$. The results of this figure confirm that the gap between the optimal MSA and the actual optimal storage allocation for (P1) approaches 0 and satisfies (5).

Fig. 4. The gap between the general upper bound and the optimal MSA.

Figure 5 depicts the results for a DSS with three classes of data where $N = 25$, $T_1 = 8$, $T_2 = 15$ and $T_3 = 23$. Further, $\alpha_1 = 1$, $\alpha_2 = 5$ and $\alpha_3 = 8$, and it is assumed that $x_i^{\min} = 1$ for $i = 1,2,3$. Again, we observe that the results of Algorithm 2 match those of Algorithm 1, verifying the (near) optimality of Algorithm 2. Further, both solutions outperform random MSA.

VII. CONCLUSION

In this paper, we studied a multi-class DSS where various classes of data, each with a different QoS requirement in terms of successful recovery probability, are to be stored over the storage nodes. For access to the storage nodes, we considered the widely used probabilistic access model. We then formulated the optimization problem for finding the minimal spreading allocation (MSA) that maximizes the weighted sum of the successful recovery probability of all data classes, subject to a guaranteed probability of success for each class.
Through a number of intermediate results, we solved this optimization problem. We argued that the proposed solution is applicable to a wide range of scenarios including, but not limited to, heterogeneous DSSs with storage nodes that have unequal capacities. Next, we studied the gap between the performance of the optimal MSA allocation and some upper bounds on the optimal allocation. This study revealed that in many practical cases, MSA performs very close to the intractable optimal allocation. Finally, simulation results were presented to better illustrate our analysis.

Fig. 5. Performance of the proposed solutions when at least one storage node has been dedicated to each data class.

ACKNOWLEDGEMENT

The authors would like to thank Alberta Innovates Technology Futures (AITF), TELUS Corporation and Natural Sciences and Engineering Research Council of Canada (NSERC) for supporting this work.

APPENDIX A
PROOF OF THEOREM 2

Proof: Assume $x_i = \lfloor x_i \rfloor + e_i$, $\forall i \in \mathcal{K}$, where $0 \leq e_i < 1$. Without loss of generality, assume $\alpha_1 \geq \alpha_2 \geq \ldots \geq \alpha_K$, which implies that $x_1 \geq x_2 \geq \ldots \geq x_K$ (otherwise, by changing the $x_i$'s, one can decrease the objective function).

Now note that we have $\sum_{i=1}^{K} x_i = N$ and $q^{x_i + \log_q \alpha_i} = D$, $\forall i \in \mathcal{K}$, where $D$ is a constant and $D \in \mathbb{R}^+$.

Using contradiction, for the optimal MSA, assume $\exists j \in \mathcal{K}$, $x_j^{\text{opt}} < \lfloor x_j \rfloor$, and let $x_j^{\text{opt}} = \lfloor x_j \rfloor - h_j$, $h_j \geq 1$, which indicates $\exists i \neq j \in \mathcal{K}$, $x_i^{\text{opt}} > \lfloor x_i \rfloor$ (since $\sum_{i=1}^{K} x_i^{\text{opt}} = N$), and let $x_i^{\text{opt}} = \lfloor x_i \rfloor + h_i$, $h_i \geq 1$. Now, consider the new allocation

$$x'_k = \begin{cases} \lfloor x_i \rfloor + h_i - 1, & k = i \\ \lfloor x_j \rfloor - h_j + 1, & k = j \\ x_k^{\text{opt}}, & k \neq i, j. \end{cases}$$

Notice that since we have $\mathcal{K}_3 = \emptyset$, the allocation strategy $x'_k$ is a feasible allocation. Now, we prove that the new allocation outperforms the former one. To this end, it is sufficient to only prove that

$$\begin{aligned}
& q^{\lfloor x_i \rfloor + h_i + \log_q \alpha_i} + q^{\lfloor x_j \rfloor - h_j + \log_q \alpha_j} \geq q^{\lfloor x_i \rfloor + h_i - 1 + \log_q \alpha_i} + q^{\lfloor x_j \rfloor - h_j + 1 + \log_q \alpha_j} \\
\overset{(a)}{\iff}\; & D q^{h_i - e_i} + D q^{-h_j - e_j} \geq D q^{h_i - e_i - 1} + D q^{-h_j - e_j + 1} \qquad (6) \\
\overset{(b)}{\iff}\; & q^{-h_j - e_j} (1 - q) \left(1 - q^{h_i + h_j + e_j - e_i - 1}\right) \geq 0
\end{aligned}$$

where:
(a) follows from the fact that $q^{x_i + \log_q \alpha_i} = D$, $\forall i \in \mathcal{K}$.
(b) follows from the facts that $e_j - e_i > -1$ and $h_i + h_j \geq 2$.

Equation (6) contradicts the optimality of the $x_i^{\text{opt}}$'s and shows that $x_i^{\text{opt}} \geq \lfloor x_i \rfloor$, $\forall i \in \mathcal{K}$. Similarly, it can be shown that $\nexists i \in \mathcal{K}$, $x_i^{\text{opt}} > \lfloor x_i \rfloor + 1$.

APPENDIX B
PROOF OF THEOREM 3

Proof: Note that since we have $\mathcal{K}_3 = \emptyset$, $x_i^{\text{possible}} = \lfloor x_i \rfloor + 1$ is a feasible allocation for $\forall i \in \mathcal{K}$. Using the proof of Theorem 2 and assuming $\exists i, j \in \mathcal{K}$, $e_i \geq e_j$, it is sufficient to show that

$$\begin{aligned}
& \alpha_i q^{\lfloor x_i \rfloor + 1} + \alpha_j q^{\lfloor x_j \rfloor} \leq \alpha_i q^{\lfloor x_i \rfloor} + \alpha_j q^{\lfloor x_j \rfloor + 1} \\
\overset{(a)}{\iff}\; & q^{1 - e_i} + q^{-e_j} \leq q^{-e_i} + q^{1 - e_j} \\
\overset{(b)}{\iff}\; & (1 - q)\left(q^{-e_j} - q^{-e_i}\right) \leq 0
\end{aligned}$$

where:
(a) follows from the fact that $q^{x_i + \log_q \alpha_i} = D$, $\forall i \in \mathcal{K}$.
(b) follows from the fact that $e_i \geq e_j$,
which completes the proof.
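The two proofs above suggest a simple rounding rule: floor the real-valued solution, then give the leftover units to the classes with the largest fractional parts $e_i$. The following Python sketch checks this rule against exhaustive search on one illustrative instance. It assumes that no per-class cap $T_i$ is active and that the real-valued solution is positive for every class; the function names and the parameter values ($\alpha = (8, 5, 1)$, $N = 20$, $q = 0.7$) are hypothetical choices made for illustration only.

```python
import math
from itertools import product


def continuous_solution(alphas, n_nodes, q):
    """Real-valued x_i with alpha_i * q**x_i equal for all i and sum(x_i) = n_nodes."""
    k = len(alphas)
    log_q = [math.log(a) / math.log(q) for a in alphas]
    log_q_d = (n_nodes + sum(log_q)) / k  # this is log_q(D)
    return [log_q_d - lq for lq in log_q]


def round_by_fraction(x, n_nodes):
    """Floor every x_i, then hand leftover units to the largest fractional parts e_i."""
    floors = [math.floor(v) for v in x]
    leftover = n_nodes - sum(floors)
    by_fraction = sorted(range(len(x)), key=lambda i: x[i] - floors[i], reverse=True)
    alloc = list(floors)
    for i in by_fraction[:leftover]:
        alloc[i] += 1
    return alloc


def loss(alphas, alloc, q):
    """Weighted failure probability sum_i alpha_i * q**x_i (smaller is better)."""
    return sum(a * q ** x for a, x in zip(alphas, alloc))


def brute_force(alphas, n_nodes, q):
    """Exhaustive search over integer allocations that use exactly n_nodes units."""
    candidates = (c for c in product(range(n_nodes + 1), repeat=len(alphas))
                  if sum(c) == n_nodes)
    return list(min(candidates, key=lambda c: loss(alphas, c, q)))


# Illustrative instance (assumed values), with q = 1 - p.
alphas, n_nodes, q = [8.0, 5.0, 1.0], 20, 0.7
x = continuous_solution(alphas, n_nodes, q)
rounded = round_by_fraction(x, n_nodes)
best = brute_force(alphas, n_nodes, q)
print(x, rounded, best)
```

On this instance the rounded allocation coincides with the exhaustive-search optimum; exhaustive search is only practical for small $N$ and $K$ and is used here purely as a check.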
APPENDIX C
PROOF OF THEOREM 5

Proof: Using Theorem 1, one can easily show that $x_i^{\text{opt}} \geq 1$, $\forall i \in \mathcal{K}$, if and only if

$$p \geq 1 - \left( \frac{\alpha_{\min}^{K-1}}{\prod_{j=1,\, \alpha_j \neq \alpha_{\min}}^{K} \alpha_j} \right)^{\frac{1}{|N-1|}}.$$

Now, assume $x_i^{\text{opt}} \geq 1$, $\forall i \in \mathcal{K}$. For the performance gap between the perfect recovery (all data classes can be recovered with probability one) and the optimal MSA, we have

$$\sum_{i=1}^{K} \alpha_i - \sum_{i=1}^{K} \alpha_i \left(1 - q^{x_i^{\text{opt}}}\right) = \sum_{i=1}^{K} q^{x_i^{\text{opt}} + \log_q \alpha_i} \geq K q^{\frac{\sum_{i=1}^{K} x_i^{\text{opt}} + \log_q \prod_{i=1}^{K} \alpha_i}{K}} = K q^{\frac{N + \log_q \prod_{i=1}^{K} \alpha_i}{K}} \geq \epsilon,$$

where the first inequality comes from the arithmetic-mean geometric-mean inequality. Solving the last inequality for $p = 1 - q$ gives

$$p \leq 1 - \left( \frac{\epsilon^K}{K^K \prod_{i=1}^{K} \alpha_i} \right)^{\frac{1}{N}},$$

which further implies that the performance gap between the optimal MSA and perfect recovery is less than $\epsilon$ for

$$p > \max\left\{ 1 - \left( \frac{\epsilon^K}{K^K \prod_{i=1}^{K} \alpha_i} \right)^{\frac{1}{N}},\; 1 - \left( \frac{\alpha_{\min}^{K-1}}{\prod_{j=1,\, \alpha_j \neq \alpha_{\min}}^{K} \alpha_j} \right)^{\frac{1}{|N-1|}} \right\}$$

and completes the proof.
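The arithmetic-mean geometric-mean step in the proof of Theorem 5 can be checked numerically. The Python sketch below uses the setup of Figure 4 ($N = 15$, $\alpha = (6, 4, 1)$, $T_i = N$) with $p = 0.5$ picked only for illustration. It builds an integer allocation with the unit-by-unit greedy rule from the inner loop of Algorithm 3 and compares the resulting gap $\sum_i \alpha_i q^{x_i}$ with the lower bound $K \big(q^{N} \prod_i \alpha_i\big)^{1/K}$. The function names are hypothetical, and the greedy routine is a simplified stand-in rather than the authors' implementation.

```python
import math


def greedy_msa(alphas, caps, n_units, q):
    """Give one unit at a time to the class with the largest weight alpha_j * q**x_j."""
    weights = list(alphas)
    alloc = [0] * len(alphas)
    active = list(range(len(alphas)))  # classes that have not reached their cap
    for _ in range(n_units):
        if not active:
            break
        j = max(active, key=lambda i: weights[i])
        alloc[j] += 1
        weights[j] *= q  # the next unit for class j is worth q times less
        if alloc[j] >= caps[j]:
            active.remove(j)
    return alloc


def recovery_gap(alphas, alloc, q):
    """Gap to perfect recovery: sum_i alpha_i * q**x_i."""
    return sum(a * q ** x for a, x in zip(alphas, alloc))


def am_gm_lower_bound(alphas, n_units, q):
    """K * (q**N * prod(alpha_i))**(1/K), valid whenever sum(x_i) = N."""
    k = len(alphas)
    return k * (q ** n_units * math.prod(alphas)) ** (1.0 / k)


# Figure 4 setup; p = 0.5 is an assumed value for illustration.
alphas, n_units, p = [6.0, 4.0, 1.0], 15, 0.5
q = 1.0 - p
alloc = greedy_msa(alphas, caps=[n_units] * len(alphas), n_units=n_units, q=q)
gap = recovery_gap(alphas, alloc, q)
bound = am_gm_lower_bound(alphas, n_units, q)
print(alloc, gap, bound)
```

The bound is met with equality only when all terms $\alpha_i q^{x_i}$ are equal, which an integer allocation generally cannot achieve, so the computed gap sits slightly above the bound.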