Two-Stage Subspace Constrained Precoding in Massive MIMO Cellular Systems An Liu, Member IEEE, and Vincent Lau, Fellow IEEE, Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology Abstract—We propose a subspace constrained precoding limited by the channel coherence time and it may become scheme that exploits the spatial channel correlation structure in much smaller than M as M grows large. Second, to realize massive MIMO cellular systems to fully unleash the tremendous the inter-cell interference mitigation gains in massive MIMO gain provided by massive antenna array with reduced channel usingcoordinatedbeamforming[4]orcooperativeMIMO[5], stateinformation(CSI)signalingoverhead.TheMIMOprecoder 5 at each base station (BS) is partitioned into an inner precoder real-time global CSIT knowledge is required for cooperative 1 andaTransmit(Tx)subspacecontrolmatrix.Theinnerprecoder MIMO and at least partial global CSIT (i.e., each BS needs 0 is adaptive to the local CSI at each BS for spatial multiplexing to know the channels from this BS to all users) is required 2 gain.TheTxsubspacecontrolisadaptivetothechannelstatistics for coordinated beamforming. However, the acquisition of n for inter-cell interference mitigation and Quality of Service (partial)globalCSITisaverychallengingprobleminpractical a (QoS) optimization. Specifically, the Tx subspace control is J formulated as a QoS optimization problem which involves an massive MIMO systems because both the downlink pilot 0 SINR chance constraint where the probability of each user’s training and the uplink CSI feedback overhead will become 2 SINRnotsatisfyingaservicerequirementmustnotexceedagiven unacceptable as the number of antennas M grows large. It is outageprobability.Suchchanceconstraintcannotbehandledby even more difficult to obtain real-time (partial) global CSIT theexistingmethodsduetothetwo-stageprecodingstructure.To ] due to the backhaul signaling latency. T tacklethis,weproposeabi-convexapproximationapproach,which I consists of three key ingredients: random matrix theory, chance In this paper, we propose a two stage subspace constrained . constrained optimization and semidefinite relaxation. Then we precoding for massive MIMO cellular systems. In practice, s c propose an efficient algorithm to find the optimal solution due to local scattering effects [6], [7], the channel vectors [ of the resulting bi-convex approximation problem. Simulations of users are usually concentrated in a subspace with a much 1 sbhaoswelintheas.t the proposed design has significant gain over various smaller dimension than M when M is large. The proposed v two-stage precoding takes advantage of this limited scattering 1 Index Terms—Massive MIMO, Subspace constrained precod- property to achieve intra-cell spatial multiplexing and inter- 2 ing, QoS Guarantees cell interference mitigation without all the aforementioned 7 practical issues. Specifically, the MIMO precoder at each BS 4 0 I. INTRODUCTION is partitioned into an inner precoder and a semi-unitary Tx . Massive MIMO has been regarded as one of the key subspacecontrolmatrix.Theinnerprecoderisadaptivetoreal- 1 technologies in future wireless networks. In massive MIMO timelocalCSITperBSforintra-cellspatialmultiplexinggain. 0 5 cellular systems, each BS is equipped with M (cid:29)1 antennas. TheTxsubspacecontrolmatrixisadaptivetolong-termchan- 1 This large spatial degree of freedom (DoF) of massive MIMO nel statistics to achieve the best tradeoff between direct link v: systemscanbeexploitedtosignificantlyincreasethespectrum diversitygainandcrosslink(inter-cell)interferencemitigation i andenergyefficiency[1].Specifically,therearetwoimportant such that the QoS of the users is maximized. Such subspace X roles for the spatial DoF introduced by the massive MIMO, constrained precoding structure simultaneously resolves the r namely the intra-cell spatial multiplexing and the inter-cell aforementionedpracticalchallenges.Forinstance,theissueof a interference mitigation. However, there are several practical insufficient pilot symbols for real-time local CSI estimation is issues towards achieving the huge performance gain predicted resolvedbecausetheBSonlyneedstoestimatetheCSIwithin by the theoretical analysis in massive MIMO systems. First, the subspace determined by the Tx subspace control, which to realize the spatial multiplexing gains (combating intra- is of a much smaller dimension than M. Furthermore, the Tx cell interference) within each BS using conventional MU- subspacecontrolisadaptivetothelong-termchannelstatistics, MIMO precoding [2], [3], real-time CSIT (i.e, channel state which is insensitive to the backhaul latency. As a result, the information at the BS) is required. In most of the existing proposed subspace constrained precoding fully exploits the works, Time-Division Duplex (TDD) is assumed and channel large number of antennas to simultaneously achieve intra-cell reciprocity can be exploited to obtain CSIT via uplink pilot spatial multiplexing per BS and inter-cell interference mitiga- training. However, in this paper, we are focusing on a more tion without an expensive backhaul signaling requirement. challenging but important Frequency-Division Duplex (FDD) In[8],asimilartwo-stageprecodingstructurewasproposed mode of Massive MIMO. In this case, channel reciprocity for single cell massive MIMO systems, where a simple pre- cannotworkandthechannelestimationhastobeobtainedvia beamforming (with a pre-determined dimension) is used to downlink channel estimation and channel feedback, which is control intra-cell interference and achieve spatial multiplex- practicallyinfeasiblebecausethenumberofindependentpilot ing gain based on block diagonalization (BD). However, symbols available for channel estimation is fundamentally the solution in [8] for single cell systems is fundamentally Proposedtwostageprecoding Twostageprecodingin[8] Systemtopology Multi-cellmassiveMIMOsystem SinglecellmassiveMIMOsystem Rolesoftwo-stageprecoding Intra-cell&Inter-cellinterferencemitigation Intra-cellinterferencemitigation Subspacedimensionpartitioning Dynamic Static Adaptivetoheterogeneouspathloss Yes No andQoSrequirementsexplicitly Designapproach Non-trivialdesignbasedonoptimization SimpledesignbasedonBDmethod Performance Good Moderate Table I: Summary of the differences compared with existing (state-of-the-art) two stage precoding scheme in [8]. different from the proposed solution for multi-cell systems ing variable. We further prove the tightness of the SDR using with heterogeneous path loss and heterogeneous QoS require- the specific structure of the Tx subspace control problem. ments, as summarized in Table I. For example, the BD- Finally, we propose a low complexity iterative algorithm to based pre-beamforming design in [8] does not explicitly take solvethisbi-convexproblemandfindtheTxsubspacecontrol into account the heterogeneous path loss and QoS require- solution by combining the convex optimization method and ments and thus may be far from optimal as shown in the the bisection search method. Simulations show that the pro- simulations. Moreover, the pre-beamforming design in [8] posed design achieves significant performance gain compared is based on a heuristic (BD) method and the dimensions with various state-of-the-art baselines under various backhaul of the pre-beamforming matrices for different user clusters signaling latency. are pre-determined and fixed. In this paper, the design of Notations: The superscripts (·)T and (·)† denote transpose Tx subspace control is formulated as a QoS optimization and Hermitian respectively. For a set S, |S| denotes the problem with individual QoS requirements and the dimen- cardinality of S. The operator diag(a) represents a diagonal sions of Tx subspace control matrices are also included in matrix whose diagonal elements are the elements of vector a. the optimization. As shown in the simulations, in multi-cell For a set of K vectors a ∈ CM,i = 1,...,K and a subset i systems with heterogeneous path loss and QoS requirements, S ⊆{1,...,K}, the notation [a] denote a M ×|S| matrix i∈S it is very important to dynamically allocate the dimensions of whosecolumnsaredrawnfromthevectorsin{a ,i∈S}.The i the Tx subspace control matrices over different user clusters. notation UM×N denotes the set of all M ×N semi-unitary With optimal dynamic subspace dimension partitioning, the matrices. For a matrix A, diag(A) represents a diagonal proposedsolutioncanachieveamuchbetterperformancethan matrix whose diagonal elements are the diagonal elements of the two-stage precoding in [8] with static subspace dimension A. span(A) represents the subspace spanned by the columns partition. However, the optimization based solution in this of a matrix A. (cid:107)A(cid:107) is the spectral radius of A. paper also causes several non-trivial technical challenges. • SINRChanceConstraintunderTwo-StagePrecoding: II. SYSTEMMODEL The QoS optimization problem involves an SINR chance A. Massive MIMO Cellular System constraint where the probability of each user’s SINR not Consider the downlink of a massive MIMO cellular system satisfying a service requirement must not exceed a given withLBSsandK single-antennausersasillustratedinFig.1 outageprobability.SuchSINRchanceconstraintdoesnot forL=2andK =8.EachBShasM antennaswithM much have a closed form expression. There are various Bern- larger than the number of the associated users. The massive stein techniques to derive a safe convex approximation MIMOcellularsystemcanberepresentedbyatopologygraph for the chance constraints of standard forms [9], [10]. as define below. However, due to the two-stage precoding structure, the associated SINR chance constraint does not belong to Definition 1 (System Topology Graph). Define the topology any of the standard forms. graph of the massive MIMO cellular system as a bipartite • CombinatorialNatureofDimensionPartitioning:Due graph GT = {B,U,E}, where B denotes the set of all BS to the combinatorial nature, the optimization of subspace nodes,U denotesthesetofallusernodes,andE isthesetofall dimensionpartitioningisverychallengingandbruteforce edgesbetweentheBSsandusers.Anedge(k,l)∈E between solution has high complexity for large M. BS node l ∈ B and user node k ∈ U represents a wireless link between them. Each edge (k,l) ∈ E is associated with In this paper, we address the above technical challenges a CSI label h ∈ CM, which represents the channel vector using a novel bi-convex approximation approach. The first k,l between BS l and user k. For each user node k, let b denote challenge is addressed using the interference approximation k the serving BS. and SINR chance constraint restriction, where we combine the techniques in random matrix theory and chance constraint An example of a massive MIMO cellular system and the optimization to construct a deterministic and (asymptotically) corresponding topology graph is illustrated in Fig. 1. conservative approximation for the SINR chance constraint. The second challenge is addressed using the semidefinite B. Massive MIMO Channel Model relaxation(SDR),whereweapplytheSDRtechniquetoobtain arelaxedbi-convexproblemwhichdoesnotexplicitlyinvolve The channel from BS l to user k is modeled as hk,l = the combinatorial optimization w.r.t. the dimension partition- Θ1/2hw , ∀k,l, where hw ∈ CM has i.i.d. complex entries k,l k,l k,l more general than that considered in [8]. For example, the n- th cluster U is allowed to contain a single user and the users n in the same cluster can have different path loss. At each time slot, linear precoding is employed at BS l to n supportsimultaneousdownlinktransmissionstotheassociated usersinU .Letn denotetheindexoftheclusterthatcontains n k (cid:8) (cid:0) (cid:1) (cid:9) user k and let B = n: n(cid:54)=n , k,l ∈E denote the set k k n of user clusters that interfere with user k. Then the received signal for user k can be expressed as: (a)AmassiveMIMOcellularsystemwith2BSsand 8users. y = h† (cid:112)P v s +h† (cid:88) (cid:112)P v s k k,bk k k k k,bk k(cid:48) k(cid:48) k(cid:48) k(cid:48)∈Unk\{k} (cid:124) (cid:123)(cid:122) (cid:125) intra-cluster interference (cid:88) + h† V P1/2s +z , (1) k,ln n n n k (b) The corresponding topology graph n(cid:124)∈Bk (cid:123)(cid:122) (cid:125) GT(Θ) = {B,U,E}, where B = {1,2}, inter-cluster interference U ={1,2,3,4,5,6,7,8}. wheres ∼CN (0,1)isthedatasymbolforuserk;v ∈CM k k Figure 1: An example of a massive MIMO cellular system and the with (cid:107)v (cid:107) = 1 is the precoding vector for user k; P is the k k corresponding topology graph. ppsyoremwcbeorodlianlvgloemcctaaotterrdiaxttoaBtuBSsSenrn;k;V;Psnn===d[[ivsallg]]ll∈(cid:0)∈[UUPnn]∈∈CC|(cid:1)UMn∈|×R|iUs|nUt|hne|i×sd|Utahntea| of zero mean and unit variance; and Θk,l ∈ CM×M is a is the power allocation matrnix for the nl-tlh∈Uunser clu+ster; and positive semi-definite spatial correlation matrix between BS z ∼CN (0,1) is the AWGN noise. l and user k. Assume block fading channel where hw is k k,l fixed for a time slot but changes over time slots. As such, the CSI is divided into instantaneous CSI H = {h } and C. Two-stage Subspace Constrained Precoding k,l global statistical information Θ(cid:44){Θ } (spatial correlation k,l The precoder V for the n-th cluster at BS l is assumed n n matrices). We make the following assumption on the spatial to have a two-stage structure V = F G . The Tx sub- n n n correlation matrices. space control variable Fn ∈ UM×Sn is used to mitigate Assumption1(SpatialCorrelationModel). Thespatialchannel the inter-cluster interference (including inter-cell interference) correlation matrices satisfy the following two assumptions. in massive MIMO systems, where S ∈ {|U |,...,M} is n n 1) The number of dominant eigenvalues (i.e., the eigenval- calledthedimensionpartitioningvariableforthen-thcluster. ues that are comparable to the maximum eigenvalue) of For convenience, define F = {F1,...,FN} as the set of Θ ,∀k,l is relatively small compared to the number Tx subspace control variables for all user clusters. Both F k,l, of antennas M. and Sn,∀n are computed at a central node based on the 2) The users can be grouped into N user clusters Un,n= global statistical information Θ. The inner precoder Gn = 1,...,N suchthatallusersinUn areassociatedwiththe [gk]k∈Un ∈CSn×|Un| isusedtorealizetheintra-clusterspatial sameBSdenotedbyl andΘ =L Θ◦,∀k ∈U , multiplexing gain using simple zero-forcing (ZF) precoding. wk,haenredLTkr,(lnΘ>◦)0=isMthenispaathnolormsks,alnblieztewdesekpn,lanBtiaSlnlcnorarnedlautisonenr aSspeitcsifiecfafellcyt,ivfeorchaaunsneerlk. T∈heUnn,,tdheefiZneFh(cid:101)inkn,ner(cid:44)prFe†ncohdki,nlng∈veCctSonr n g with (cid:107)g (cid:107) = 1 is obtained by the projection of the matrix. k k effective channel h(cid:101)k,n on the orthogonal complement of the Many experimental measurements as reported in [11], [12] (cid:18)(cid:104) (cid:105) (cid:19) snheolwcotrhrealtaptiroanctidcuael MtoIlMimOitecdhasncnaettlesriinngdeoerdLhOavSetsrpaantsimaliscshiaonn-. subspace span h(cid:101)k(cid:48),n k(cid:48)∈Un\{k} . Hence, Gn is computed locallyatBSl basedonthelocalreal-timeinstantaneousCSI n The standard channel models such as IMT-advanced channel (cid:104) (cid:105)† model [13] also predicts that MIMO channel can have high H(cid:101)n = h(cid:101)k,n ∈ C|Un|×Sn. Note that in the proposed spatialcorrelationespeciallywhentheangularspreadissmall. two stage precko∈dUinng, only the inner precoder is based on Furthermore,inamassiveMIMOsystemwithalargenumber ZF techniques for analytical tractability. The Tx subspace of antennas assembled within a limited space at the BS, control is not based on ZF techniques but based on the QoS the channels are more likely to be highly correlated due to optimization problem P in Section III. insufficient spacing among the antennas [1], [14]. As a result, The diversity gain of the users in the n-th cluster increases thefirstassumptionhasbeenmadeinmanyexistingworkson with the dimension partitioning variable S . On the other n massive MIMO systems, see, e.g., [8], [14]–[18]. In practice, hand, the interference leakage to other user clusters also we can also cluster users into groups using the user clustering increases with S . As a result, there is a fundamental tradeoff n algorithm in [18] such that the second assumption can be between direct link diversity gain and cross link interference closely achieved [8]. Note that the above channel model is leakage. It is important to dynamically adapt the dimension partitioning variable S based on the spatial channel corre- a simple admission control based on the solution of Problem n lation matrices Θ to optimize the overall performance. In P. For example, whenever a new user arrives, the admission general, the simple BD-based subspace precoder F may control solves Problem P with all users (including the newly n be far from optimal as will shown in the simulations. We arrival user). If Problem P is feasible, this user is admissible. shall formulate the design of the Tx subspace control F and Otherwise, this user is not allowed to access the system. As a subspace dimension partitioning S ,∀n formally in Section result, we can focus on the case when Problem P is feasible. n III. Problem P is a very challenging problem. First, the prob- ability functions in the SINR chance constraint (3) do not III. OPTIMIZATIONFORMULATIONFORTXSUBSPACE have closed form expressions. Thus, one may only resort ANDDIMENSIONPARTITIONING to approximation methods. Due to the two-stage precoding Assume that user k has perfect knowledge of the effective structure, the inner ZF precoder Gn is also a function of sinintegrlfee-riennpcuet-spilnugsl-en-ooiustepupto(wSeISr.OT)hcehnanfnoerlhg†kiv,beknFTnkxgksuabnsdpathcee tihneg TapxpsrouxbismpaactieoncomnterothlovdasrifaobrlethFen“.sAtasndaarrdesfuoltr,mt”hecheaxniscte- control F, subspace dimensions {S }, and instantaneous CSI constraints in [9], [10] cannot be applied here. Second, the n H, the SINR of user k is dimensionpartitioningvariableSn alsoneedstobeoptimized, which is a difficult combinatorial problem. Third, the domain (cid:12) (cid:12)2 SINRk = (cid:80) h†PkF(cid:12)(cid:12)h†kG,bkPFnkGg†kF(cid:12)(cid:12) †h +1. (2) UweMp×rSonpoosfeFannoisvenlobni--ccoonnvveexx.aTpoprtaocxkimleatthioenaabpopvreocahcahl,lewnhgiecsh, n∈Bk k,ln n n n n n k,ln consistsofthreekeyingredients:randommatrixtheory,chance constrained optimization and semidefinite relaxation (SDR). where G is the ZF inner precoder given by a function of n Then we propose an efficient algorithm to find the optimal Fn and H (or more precisely, the effective channels H(cid:101)n). In solution of the resulting bi-convex approximation problem. this paper, we focus on designing Tx subspace control F to optimize the QoS of the users. Specifically, we consider the following QoS optimization problem: IV. THEBI-CONVEXAPPROXIMATIONAPPROACH P : max γ SinceitisdifficulttosolveProblemP directly,wepropose F,{Sn},γ≥γ◦ to finda sub-optimal solutionby solving abi-convex approxi- s.t. Pr{SINR ≥w γ}≥1−(cid:15) ,∀k; (3) k k k mation of P. The proposed bi-convex approximation contains Fn ∈UM×Sn,Sn ∈{|Un|,...,M},∀n, (4) threesteps,whereeachstepsolvesonekeytechnicalchallenge associated with Problem P. where the SINR chance constraint in (3) ensures that the probability of SINR not satisfying a service requirement k wkγ is less than a maximum allowable outage probability A. Interference Approximation Step (cid:15) ∈ (0,1), and w > 0 is the QoS weight for user k. The k k To handle the SINR chance constraint in (3), we need to constraint γ ≥ γ◦ is used to guarantee some minimum QoS find a simple characterization of the distribution of SINR , forallusers.TheconstraintS ≥|U |istoensurethattheZF k n n which is very difficult because the interference term I (cid:44) inner precoding is feasible at each BS. The QoS constraint in (cid:80) h† F G P G†F†h in SINR depends onkthe (3)isespeciallyusefulformediastreamingapplicationswhere n∈Bk k,ln n n n n n k,ln k instantaneous channel vectors h ,n ∈ B of all the cross there is a stringent delay requirement for the media packets k,ln k links of user k. To simplify the characterization of SINR , requested by the users and thus it is desirable to maintain a k the interference approximation step is used to address the fixed data rate (with high probability) for each user. The QoS following challenge. weightsw ’sandthemaximumallowableoutageprobabilities k (cid:15)k’scanbeusedtoprovidedifferentialQoSfordifferentusers. Challenge 1 (Deterministic Approximation of Ik). Find a Note that the optimization of the Tx subspace control deterministic approximation Iˆ of I such that Iˆ does not k k k tvhaeriadbilmeeFnsnioinn pParrotbitlieomninPg ivnacrliuadbeles tShne o(oprtimeqizuaivtiaolnenotlfy,boththe doefptehnedcroonssthleiniknsstoafntuasneeroku.s channel vectors hk,ln,n∈Bk dimension of F ) and the value of F . Both F ’s and S ’s n n n n We resort to the random matrix theory to solve the above areadaptivetothespatialchannelcorrelationmatricesΘ,i.e., challenge. Specifically, we apply the random matrix theory theyarefixedandindependentoftheinstantaneousCSIHfor to derive an asymptotic (and deterministic) upper bound Iˆ k given Θ. of I as M → ∞ and |U | → ∞,∀n. We then use k n Remark2(AdmissionControl). NotethatProblemP maynot Iˆ as an approximation of I for finite but large M and k k always be feasible, i.e., the minimum QoS requirement wkγ◦ |Un|’s. Throughout the paper, the notation M → ∞ refers to cannot be guaranteed for all users even with infinite transmit M →∞ and |U |→∞,∀n such that 0<liminf|U |/M ≤ n n power.Inpractice,anadmissioncontrolcanbeusedtoensure M→∞ limsup|U |/M < ∞. The following assumptions are re- n that there is enough resource in the system to guarantee M→∞ quired in order to derive the asymptotic upper bound of I . the minimum QoS for all users. The admission control is a k challenging problem and a systematic design of admission Assumption 2 (Technical Assumptions for Interference Ap- control is out of the scope of this paper. However, we can use proximation). All spatial correlation matrices Θ ,∀k,l have k,l uniformly bounded spectral norm on M, i.e., Our strategy is as follows. First, we show that conditioned (cid:104) (cid:105) liMm→s∞up1≤sku≤pK(cid:107)Θk,l(cid:107) < ∞, ∀l. (5) o(cid:12)n H(cid:101)−k (cid:44) (cid:12)2h(cid:101)k(cid:48),nk k(cid:48)∈Unk\{k}, the effective channel gain (cid:12)h† F g (cid:12) of user k can be expressed as a quadratic Assumption 2 is satisfied by many MIMO channel model (cid:12) k,bk nk k(cid:12) function of a standard complex Gaussian vector. Then we and it is a standard assumption in the literature, see e.g., [15], applythestandardmethodin[9],[10]toconstructaquadratic [16]. restriction of the chance constraint in (6) conditioned on a Under Assumption 2, we have the following theorem. given H(cid:101)−k. Finally, we use the quadratic restriction of the Theorem 3 (Asymptotic Interference Upper Bound). Under conditionalchanceconstraintof(6)withthe“worstcase”H(cid:101)−k Assumption 2 and a feasible Tx subspace control F (i.e., F asaquadraticrestrictionoftheunconditionalchanceconstraint satisfies the constraints in (3) and (4) for some γ ≥ γ◦), we in (6). Since (6) is an asymptotically safe approximation a.s have I −Iˆ ≤ 0 as M →∞, where of (3), the result will then be an asymptotically quadratic k k Iˆk (cid:44) (cid:88) PnTr(cid:16)F†nΘk,lnFn(cid:17). rreessturlitctiisonsuomfmthareizSedINiRn tchheanfcoellocwoinnsgtratihnetorienm(.3)P.leTahsee rfienfearl n∈Bk to Appendix B for the detailed proof. (cid:80) andP = P isthesumtransmitpoweroftheusers n k(cid:48)∈Un k(cid:48) Theorem4(SolutiontoChallenge2). Theasymptoticallysafe in the n-th cluster. approximate SINR chance constraint in (6) is satisfied if Please refer to Appendix A for the proof. (cid:16) √ (cid:17) (cid:16) (cid:17) Replacing the interference term Ik in SINRk using the 1− σ2kδk Tr Θ◦nkFnkF†nk asymptotic interference upper bound Iˆk in Theorem 3, we ≥γ(cid:80) Tr(cid:0)Θ F F†(cid:1)+c (γ), obtain an asymptotically safe approximation of the SINR n∈Bk k,n n n k (cid:16) (cid:17) chance constraint in (3)1: Tr Θ◦nkFnkF†nk ≥σk2+|Unk|−1, (7) Pr(cid:26)(cid:12)(cid:12)(cid:12)h†k,bkFnkgk(cid:12)(cid:12)(cid:12)2 ≥ wPkkγ (cid:16)Iˆk+1(cid:17)(cid:27)≥1−(cid:15)k. (6) where δk =ln(cid:15)1k, σk = √2δk+√22δk+4, Θ◦nk = λn1kΘ◦nk, λnk B. SINR Chance Constraint Restriction Step is the largest eige(cid:16)nvalue√of Θ(cid:17)◦nk, Θk,n = λwnkkPPnkΘk,ln and c (γ)= wkγ + 1− 2δk (|U |−1). The asymptotically safe approximate SINR chance con- k λnkPk σk nk straint in (6) remains intractable although it appears to be In (7), Tr(cid:16)Θ◦ F F† (cid:17) and (cid:80) Tr(cid:0)Θ F F†(cid:1) can relatively easier to handle than the original counterparts in nk nk nk n∈Bk k,n n n be interpreted as the signal power of the direct link and (3). The restriction step aims to find a quadratic conservative interferenceleakagefromthecrosslinks.Hence,thequadratic approximation of (6). There are some existing methods that restrictionof(6)in(7)capturesthekeytradeoffbetweendirect use the worst-case deterministic constraints to approximate link signal power and cross link interference leakage. various forms of chance constraints [9], [10], [19]. For exam- Remark 5. The asymptotically quadratic restriction in (7) is ple, in the context of transmit beamforming design for MU- also valid under finite M. The asymptotic approach based MIMO downlink with imperfect CSI [10], the Bernstein-type on random matrix theory has been widely used in wireless inequality was used to construct a convex restriction of the communications especially when it is difficult to directly SINR chance constraint that involves a quadratic function of analyze the original system and the validity of such approach thestandardcomplexGaussianvector.Unliketheconventional hasbeenverifiedbynumerousworks.Inthispaper,weutilize transmit beamforming problem, in our problem, the MIMO random matrix theory and obtain (7) as a safe approximation precoder is divided into Tx subspace control F and inner n of the original SINR chance constraint when M →∞. It has precoderG whichchangeatdifferenttimescales.Asaresult, n been shown in [20] that the random matrix theory results are the asymptotically safe approximate SINR chance constraint pretty good approximation even when M is small. Hence, (7) in (6) does not belong to any of the standard forms that have isasafeapproximationoftheoriginalSINRchanceconstraint been studied in the existing works. For example, the inner even under finite M in massive MIMO (as illustrated in Fig. precoding vector g is also a function of the random channel k (cid:12) (cid:12)2 3 in Section VI). vectors and thus (cid:12)h† F g (cid:12) is not a quadratic function (cid:12) k,bk nk k(cid:12) According to Theorem 3 and Theorem 4, (7) is a quadratic of the standard complex Gaussian vector. In our solution, the restriction of the original SINR chance constraint in (3) for restriction step entails finding a solution to the following: large M. Replacing the SINR chance constraint (3) in P with the quadratic constraint in (7), we have the following Challenge2(AsymptoticallyQuadraticRestrictionofSINR asymptotically conservative formulation of Problem P: Chance Constraint (3)). Find a quadratic function f (F) k fromFtoR2 suchthatiff (F)≥0,thentheSINRchance k P : max γ constraint in (3) is asymptotically satisfied as M →∞ 1 F,{Sn},γ≥γ◦ s.t. (7) is satisfied for all k, SIN1ARsycmhapntcoeticcaolnlystrsaaifnetainpp(r3o)xiimssaatitoisnfiemdeaanssMtha→tif∞(6.)issatisfied,thenthe Fn ∈UM×Sn,Sn ∈{|Un|,...,M},∀n. (8) C. Semidefinite Relaxation Step Problem Pˆ is a bi-convex problem, i.e., it is convex w.r.t. By combining the techniques in random matrix theory W (γ) for fixed γ (W), but is not jointly convex. Note that and chance constraint optimization, we have constructed an Problem Pˆ does not involve the combinatorial optimization asymptotically quadratic restriction for the complicated SINR w.r.t. the dimension partitioning variable Sn because the chance constraint in (3). However, we still need to solve the optimal Sn is automatically determined by the rank of the challenge associated with the combinatorial optimization of optimal Wn. To completely solve Challenge 3, we further the dimension partitioning variable Sn and the semi-unitary prove the equivalence between Pˆ and P1 in the following constraint on F . theorem. n Theorem 6 (Equivalence between Pˆ and P ). The optimal 1 Challenge3(DimensionPartitioningandSemi-unitaryCon- solutionW(cid:63),γ(cid:63) ofPˆ satisfiesW(cid:63)−W(cid:63)2 =0,∀n.Moreover, n n straint). Find a bi-convex problem which is equivalent ifP isfeasible,thenγ(cid:63) ≥γ◦andthusW(cid:63),γ(cid:63)isalsooptimal 1 to Problem P1 and does not involve the combinatorial for P1. optimization w.r.t. the dimension partitioning variable S . n Please refer to Appendix C for the proof. In the following, we apply SDR to solve the above chal- lenge. It is easy to see that Problem P1 is equivalent to the D. Optimal Solution of Problem Pˆ following problem In this section, we propose an iterative algorithm named P2 : max γ Algorithm BCA to solve Pˆ by combining the convex opti- W,γ≥γ◦ (cid:16) √ (cid:17) (cid:16) (cid:17) mizationmethod(fortheoptimizationofW)andthebisection s.t. (cid:80)1− σ2kδ(cid:0)k Tr Θ◦n(cid:1)kWnk sloewarcchommpeltehxoidty(faolrgothriethomptitmoizsoatlivoenPˆofwγi)t.hWfiexefidrsγt pbraospeodseona ≥γ Tr Θ W +c (γ),∀k; (9) n∈Bk k,n n k the Lagrange dual method. Then we summarize the overall (cid:16) (cid:17) Wn =Wn2,Tr Θ◦nWn ≥ηn,Wn (cid:23)0,∀n, (10) Algorithm BCA and establish its global convergence. 1) Lagrange dual method for solving Problem Pˆ with fixed where W = {W1,...,WN} with Wn = FnF†n ∈ γ: For fixed γ, problem Pˆ is convex and thus can be solved CM×M,n = 1,...,N, and η = max (cid:0)σ2+|U |−1(cid:1). n k∈Un k n by the standard convex optimization method [21]. However, The constraint in (10) is derived from the constraint (8) in the computation complexity of standard convex optimization P and the second constraint in (7) using the fact that W 1 n method is still quite high as the number of antennas M can be expressed as Wn = FnF†n with Fn ∈ UM×Sn if becomes large. Based on the Lagrange dual method, we and only if W = W2,Tr(W ) = S ,W (cid:23) 0 (Note that (cid:16) (cid:17) n n n n n exploit the specific structure of the problem to propose a ◦ Tr ΘnWn ≥ ηn implies that Tr(Wn) ≥ |Un|). For any low complexity algorithm which can significantly reduce the feasiblesolutionWofP ,wecancalculatethecorresponding computationcomplexityoverthestandardconvexoptimization 2 feasible solution F = {F ,...,F } and S ,∀n of P using method. 1 N n 1 eigenvalue decomposition (EVD) W = F I F†,∀n and The Lagrange function of Problem Pˆ with fixed γ is n n Sn n S =Tr(W ),∀n respectively. n n N Problem P2 is still difficult to solve due to the non- L(µ,ν,W) = γ+(cid:88)νn(cid:16)Tr(cid:16)Θ◦nWn(cid:17)−ηn(cid:17)+ convex constraint W = W2 in (10) and the bi-linear term n n n=1 (cid:80) (cid:0) (cid:1) γ n∈BkTr Θk,nWn in(9).Tomaketheproblemtractable, (cid:88)K (cid:20) (cid:16) ◦ (cid:17) we apply SDR to obtain a convex relaxation of the constraint µ a Tr Θ W − Wn = Wn2 as Wn −Wn2 (cid:23) 0, which is further equivalent k=1 k k nk nk (cid:21) to the following convex constraint (cid:88) (cid:0) (cid:1) γ Tr Θ W −c −c γ , (cid:20) (cid:21) k,n n k,1 k,2 I W M n (cid:23)0. (11) n∈Bk W W n n whereµ=[µ ] ∈RK istheLagrangemultipliervec- k k=1,...,K + Then by replacing the constraint Wn =Wn2 with the relaxed tor associated with the constraint in (12), ν =[νn]n=1,...,N ∈ constraint in (11) and removing the constraint γ ≥ γ◦, we RN is the Lagrange multiplier vector associated with the + obtain a relaxed problem of P2 as constraint in (13), ck,1 = ak(|Unk|−1) and ck,2 = λnwkkPk. Pˆ : maxγ Note that L(µ,ν,W) is a partial Lagrangian since (14) is W,γ kept as a separate constraint. The dual function of Problem Pˆ (cid:16) (cid:17) s.t. a Tr Θ◦ W ≥ with fixed γ is k nk nk γ(cid:80) Tr(cid:0)Θ W (cid:1)+c (γ),∀k (12) J(µ,ν)(cid:44)maxL(µ,ν,W), s.t. (14) is satisfied. (15) n∈Bk k,n n k W (cid:16) (cid:17) ◦ Tr Θ W ≥η ,∀n (13) The corresponding dual problem is n n n (cid:20) (cid:21) I W minJ(µ,ν), s.t.µ≥0,ν ≥0. (16) WM Wn (cid:23)0,Wn (cid:23)0,∀n (14) µ,ν n n where ak =(cid:16)1− √σ2kδk(cid:17). AnN=ote(cid:80)thka∈tUnLµ(µka,kνΘ,W◦nk)−=γ(cid:80)(cid:80)kNn∈=U1nTµrk(AΘnk,Wn+n)ν+nΘc◦n(cid:48),,wUhne=re (cid:8)k : n (cid:54)=n,(cid:0)k,l (cid:1)∈E(cid:9),andc(cid:48) isindependentofW.Then Table II: Algorithm BCA (for solving Problem Pˆ) k n the maximization problem in (15) can be decomposed into N Initialization: Choose γ =γ◦ and γ =γ◦ >γ◦ such min max independent problems as that Problem Pˆ with fixed γ =γ◦ is infeasible. Let i=1. mWanxTr(AnWn), s.t. Wn−Wn2 (cid:23)0,Wn (cid:23)0, (17) Step 1fi:xLedetγγ=(i)γ=(i)γmuisni+2nγgmatxh.eSLoalvgeraPnrgoebldeumalPˆmewthitohd in Section IV-D1. forn=1,...,N.LetA =U D U† betheEVDofA and n n n n n Step 2: If a feasible solution, denoted by W(i), is found let U∗ denote the eigenvectors corresponding to the positive n in Step 1, let γ =γ(i); otherwise, let γ =γ(i). min max eigenvalues of An. Then for fixed µ,ν, it can be shown that Step 3: If |γmax−γmin|≤ε, where ε>0 is Problem (17) has a closed form solution given by the tolerable error, terminate the algorithm. Otherwise, let i=i+1 and return to Step 1. W∗(µ,ν)=U∗U∗†. (18) n n n Since Problem Pˆ with fixed γ is convex, the optimal solution is given by W∗(µ∗,ν∗) = {W∗(µ∗,ν∗):∀n}, where n (µ∗,ν∗)istheoptimalsolutionoftheequivalentdualproblem in (16). The dual function J(µ,ν) is convex and ∇J(µ,ν) = [∂ J(µ,ν),...,∂ J(µ,ν),∂ J(µ,ν),...,∂ J(µ,ν)]T µ1 µK ν1 νN is a subgradient of J(µ,ν) at (µ,ν), where (cid:16) (cid:17) ∂ J(µ,ν) = a Tr Θ◦ W∗ (µ,ν) −c γ µk k nk nk k,2 −γ (cid:88) Tr(cid:0)Θ W∗(µ,ν)(cid:1)−c ,∀k k,n n k,1 n∈Bk (cid:16) (cid:17) ∂ J(µ,ν) = Tr Θ◦W∗(µ,ν) −η ,∀n (19) νn n n n Hence, the standard subgradient based methods such as the Figure2:Illustrationoftherelationshipbetweendifferentproblems. subgradient algorithm in [22] or the Ellipsoid method in [21] can be used to solve the optimal solution (µ∗,ν∗) of the dual problem in (16). E. Summary of the Overall Solution Remark 7. For Problem Pˆ with fixed γ, the Lagrange dual The relationship between problems P, P1, P2 and Pˆ is method is a better choice than other standard convex opti- summarized in Fig. 2 and is elaborated below. mization methods such as interior point method [21] for two • Relationship between P and P1: Problem P1 is ob- reasons: 1) the number of primal variables in this problem is tained by replacing the original SINR chance constraint much larger than the number of dual variables (NM2 + 1 in (3) with its asymptotically quadratic restriction in (7). versus K + N with M (cid:29) K and M (cid:29) N); 2) for Hence, Problem P is an asymptotically conservative 1 fixed Lagrange multipliers µ,ν, the optimal primal variables formulationoftheProblemP,i.e.,thesolutionofP isa 1 W∗(µ,ν) have closed form solution. Hence, although the feasiblesolutiontotheoriginalproblemP forsufficiently n convergence speed of the simple subgradient and Ellipsoid large M. algorithms for the optimization of dual variables is not as fast • Relationship between P and Pˆ: By applying the SDR as the more complicated algorithms, the overall computation technique, we obtain Problem Pˆ as a relaxation of P . 1 time (to reach a given accuracy) of the Lagrange dual method By Theorem 6, such relaxation is tight and thus Pˆ is is much smaller than other standard convex optimization equivalent to P . Hence, Pˆ is also an asymptotically 1 methods. conservative formulation of the Problem P. 2) Summary of the Overall Algorithm for solving Problem Then we summarize the overall solution. First, Algo- Pˆ : Algorithm BCA is summarized in Table II and the global rithm BCA is performed to calculate the optimal solution convergence is established in the following theorem. W(cid:63),γ(cid:63) (up to some tolerable error ε) for Problem Pˆ. Then we calculate the corresponding Tx subspace control F(cid:63) = Theorem 8 (Global Convergence of Algorithm BCA). For fixed tolerable error ε, Algorithm BCA terminates after Iε = {F(cid:63)1,...,F(cid:63)N} using EVD Wn(cid:63) =F(cid:63)nISn(cid:63)F(cid:63)n†,∀n, where Sn(cid:63) = (cid:108)log (cid:16)|γ◦−γ◦|(cid:17)(cid:109) iterations. Let Wε = W(Iε),γε = γ(Iε) Tr(Wn(cid:63)). Finally, we output F(cid:63),{Sn(cid:63)},γ(cid:63) as an approximate 2 ε solution to the original Problem P. Note that F(cid:63),{S(cid:63)},γ(cid:63) denote the solution found by Algorithm BCA with tolerable n is usually a conservative solution in the sense that the actual error ε. Then we have |γε−γ(cid:63)|≤ε. Moreover, as ε→0, we outageprobabilityofeachuserk underF(cid:63),{S(cid:63)},γ(cid:63) ismuch have Wε →W(cid:63) and γε →γ(cid:63). n lower than the maximum allowable outage probability (cid:15) . k Sketch of Proof: Algorithm BCA ensures that γ(cid:63) ∈ Remark 9. In this paper, we focus on the optimization of [tγiomnin,,wγemahx]avaefte|γrmeaaxc−h iγtmerina|ti≤on.|γM◦2−oiγre◦|ovaenrd, aγf(tei)r∈the[γim-itnh,γitmearxa]-. Tfixxedsupboswpaecreccoonntrtoroll{PFk}an(dprsoubblesmpacPe),diwmheincshioinssth{eSnke}yfoofr Hence, we have (cid:12)(cid:12)γ(i)−γ(cid:63)(cid:12)(cid:12) ≤ |γ◦−γ◦|, from which Theorem the two stage precoding design for massive MIMO systems. 2i 8 follows immediately. The results and algorithm in this paper also provides a basis for solving the more complicated joint optimization of Tx Rank deficient OAS estimator: subspace control and power control {P }. For example, we Input: T independent channel samples h(j) ∈ CM,j = k p may design an alternating optimization (AO) algorithm to 1,...,T of a random vector h ∈ CM with covariance matrix p solve the joint optimization problem. In each iteration of the Θ. AO algorithm, Algorithm BCA is used as a component to Step 1: Using Gram–Schmidt process to obtain a semi find the optimal Tx subspace control for fixed power control, unitarymatrixU∈CM×Rwhosecolumnsformanorthogonal and then the optimal power control is obtained for fixed Tx basis for h(j)∈CM,j =1,...,T . p subspace control. Step2:FormT independentchannelsampleswithreduced p dimension h(cid:48)(j) = U†h(j) ∈ CR,j = 1,...,T . Note that p F. MethodstoImprovethePerformanceOvertheConservative h(cid:48)(j)’scanbeviewedasrealizationsofarandomvectorh(cid:48) ∈ Solution CR with covariance matrix U†ΘU. Step 3: Use the OAS estimator in [24] with independent One common drawback of using the worst-case determinis- samples h(cid:48)(j),j =1,...,T to obtain an estimation Θˆ(cid:48)of the tic constraint to approximate chance constraint is that, the ob- p covariance matrix of h(cid:48). tained solution is usually too conservative [10]. In the context Step 4: Output the estimation of the covariance matrix of of the probabilistic SINR constrained beamforming problem, h as: Θˆ =UΘˆ(cid:48)U†. thebisectionrefinementmethodhasbeenproposedtomitigate the conservatism due to the deterministic restriction of the Then the statistical CSI acquisition is summarized as fol- probabilistic SINR constraint [10]. This bisection refinement lows. Within each T time slots, choose T time slots such p method can also be applied in our problem to improve the thatthegapbetweenanytwoselectedtimeslotsislargerthan performance of the conservative solution obtained in Section channel coherent time. Then each BS transmits M dedicated IV-E. However, the bisection refinement method in [10] re- pilot symbols in each of the selected T time slots for the quires solving Problem Pˆ for multiple times, which increases userstoestimatethespatialcorrelationmaptricesusingtherank the computation complexity. To address this problem, we deficient OAS estimator. Finally, user k sends the non-zero proposeasimplenon-iterativerefinementmethodtodetermine eigenvalues and the corresponding eigenvectors of Θ ,∀l to k,l thepropervalueofδk.First,wecalculatethesimpleBD-based the associated BS. subspacecontrolFBD usingbaseline3inSectionVI.Then,we let γBD =ΦˇBD((cid:15) )/w ,∀k, where ΦBD is the empirical CDF k k of the SINR of user k, denoted by SINRBD, under BD-based B. Real-time CSI Acquisition k subspacecontrolFBD,andΦˇBD istheinversefunctionofΦBD. The dimension of the effective channel vector of a user NotethatPr(cid:8)SINRBkD ≥wkγBD(cid:9)≈1−(cid:15)k.Finally,δk ischo- in the n-th user cluster is equal to Sn ( the rank of the n- sen as the largest number that satisfies (7) with F=FBD and th Tx subspace control variable Fn), which is O(1) and is γ = γBD. Simulations show that the non-iterative refinement much less than M. To estimate the effective CSI H(cid:101)n of the method achieves almost the same performance as the much n-th user cluster, the pilot symbols need to be precoded using more complicated bisection refinement method. the n-th Tx subspace control variable Fn since the effective CSI H(cid:101)n is restricted in the subspace spanned by Fn. As a V. IMPLEMENTATIONCONSIDERATIONS result, we can simultaneously transmit N pilot symbols at a time for N user clusters because the Tx subspace control In this section, we discuss various implementation issues, F can also mitigate the inter-cluster interference in the pilot such as how to obtain the statistical and real-time CSI, the transmission stage. The total number of orthogonal pilot sym- signalingoverheadandtheimpactofCSIerror.Inthefollow- ing analysis, we assume that the spatial channel correlation bols required for effective CSI estimation H(cid:101)n,n = 1,...,N is max S = O(1). Hence, by treating F as part of the matrices Θ change every T time slots and the rank of each n n n spatial correlation matrix is R. effective channel, the estimation of the effective CSI H(cid:101)n is similar to the conventional channel estimation problem in a single-cell small-scale MIMO system and the conventional A. Slow Statistical CSI Acquisition CSI signaling method in LTE can be used to estimate the Under typical scenario, the spatial channel correlation ma- effective CSI. trices change after a few seconds and the time slot duration is in the order of millisecond, i.e., T is usually in the order of C. Statistical and Real-time CSI Signaling Overhead a few thousands. For example, in urban areas with a carrier frequency of 2GHz and a user speed of 3km/h, the spatial The CSI signaling overhead includes the pilot symbol channel correlation matrices remain unchanged for about 22 overhead(theaveragenumberoforthogonalpilotsymbolsfor seconds according to the measurement results reported in real-time and statistical CSI estimation) and the uplink CSI [23]. Hence, the statistical CSI acquisition can be done at feedback overhead (the average number of feedback vectors a much slower timescale compared to the real-time CSI with different dimensions). For simplicity, assume that each acquisition. Specifically, we propose a rank deficient oracle BS is associated with K users, each user is interfered by 0 approximating shrinkage (OAS) estimator which is able to L − 1 neighbor BSs, and S = S,∀n. Then the real- 0 n obtain an accurate estimation of the spatial correlation matrix time CSI signaling overhead of the proposed scheme is “S using only T =O(R)(cid:28)M independent channel samples. PS, K CS” per time slot per cell, which means that, the p 0 proposed scheme requires transmitting S independent pilot control optimization. On the other hand, the effective CSI symbols and feed backing K complex channel vectors of acquisition and inner precoding per cluster is similar to that 0 dimensionS pertimeslotpercell.Foreachspatialcorrelation in the single-cell small-scale MIMO system and thus they can matrix, the corresponding user only needs to send R non- be implemented in real-time at each time slot. Note that in zeroeigenvaluesandthecorrespondingeigenvectorstotheBS. practice,itisimpossibletoobtainperfecteffectiveCSIateach Hence, the statistical CSI signaling overhead of the proposed BS due to the channel estimation/feedback error. The impact scheme is “MTp PS, K0L0R CM, K0L0 RR” per time slot of imperfect effective CSI on the performance is similar to T T T per cell, which means that, the proposed scheme requires that in the conventional small-scale MIMO systems with ZF transmitting MTp independent pilot symbols, feed backing precoding.Asaresult,wecanalsoemployMMSEtechniques T K0L0R complex vectors of dimension M (eigenvectors of fortheinnerprecodertomitigatetheeffectsofCSIerrors.The T spatial correlation matrices), and feed backing K0L0 real theoretical analysis of the effect of imperfect effective CSI on T vectors of dimension R (eigenvalues of spatial correlation the two stage subspace constrained precoding is left as an matrices) per time slot per cell. As a comparison, the CSI interesting future work. signalingoverheadoftheconventionalsingle-cellMU-MIMO Remark 10 (Extension to Frequency Selective Fading Chan- precoding scheme is “M PS, K0 CM” per time slot per cell. nels). For clarity, we assume flat fading channel in this paper. Since T (cid:29) Tp = O(R) = O(L0) = O(1), the overall CSI In the wideband systems (such as OFDM) with frequency signaling overhead of the proposed scheme is significantly selectivechannels,thespatialcorrelationmatricesareapproxi- smaller as will be shown in the simulations. matelyidenticalondifferentsubcarriers[25],[26].Asaresult, the same Tx subspace control F can be applied to different D. Backhaul Signaling Overhead subcarriersandtheresultsandalgorithmcanalsobeextended Suppose there are a total number of N links (including to the wideband systems. The extension to wideband systems L depends on the specific QoS requirements. If we require that both direct links and cross links) in the network. The BSs need to send R non-zero eigenvalues and the corresponding the SINR of each user on each subcarrier must be larger than a threshold with high probability, then the proposed solution eigenvectors of every spatial correlation matrix to the central node within each T time slots. Hence, the backhaul signaling can be directly applied. However, if we require that the rate overhead is “NLR CM, NLR R” per time slot (i.e., NLR ofeachuser(ratesummedoverallsubcarriers)mustbelarger T T T complex vectors of dimension M and NLR real numbers per thanathresholdwithhighprobability,thentheextensionmay T be non-trivial. time slot). Since T (cid:29) R is usually true for massive MIMO systems, the backhaul signaling overhead can be greatly reduced compared to the cooperative MIMO or centralized VI. SIMULATIONRESULTS coordinated MIMO where the BSs need to send N complex L A. Simulation Setup channel vectors of dimension M to the central node at each time slot. Consider a massive MIMO cellular system with 19 cells, where a reference cell is surrounded by two tiers of (inter- fering) cells. The inter-site distance is 500m. In each cell, E. Summary of Practical Real-time Implementation Aspects there are K users and 2 uniformly distributed hotspots 0 In the proposed two stage precoding technique, there are (user clusters). Each hotspot contains K users and the other c four components (steps) working at different timescales. K −2K usersareuniformlydistributedwithinthecell.Each 0 c Step 1 (Slow Statistical CSI Acquisition): First, the BSs BSisequippedwithM antennas.ThepathgainsL ’sarefirst k,l need to obtain the spatial channel correlation matrices using generated using the path loss model (“Urban Macro NLOS” the method in Section V-A, and then send them to the central model) in [27]. Then the spatial correlation matrices Θ ’s k,l node. are generated according to the local scattering model in [6], Step 2 (Tx Subspace Control Optimization): With the [7]. Specifically, the correlation between the i,m-th channel obtained statistical CSI from the BSs, the central node calcu- coefficients of h is given by k,l lates the optimal Tx subspace control variables {F } using n ˆ Aaslsgoocriiatthemd wBiCthAthaendn-tshenudsserFcnlustoterB).S ln (which is the BS [Θk,l]i,m = ϕmk,axlψk,l(ϕ)ej(cid:36)li,m(ϕ)dϕ, (20) ϕmin Step 3 (Real-time CSI Acquisition): By treating F as k,l n part of the effective channel, the estimation of the real- where ϕ denotes the AoD, ψk,l(ϕ) denotes the power gain time effective CSI H(cid:101)n is similar to the conventional channel for the path specified by the AoD ϕ and (cid:36)li,m(ϕ) is the estimationprobleminasingle-cellsmall-scaleMIMOsystem. phase difference between the i-th and m-th BS antennas on Step 4 (Inner Precoding): At each time slot, BS l the path specified by ϕ. In the simulations, (cid:36)i,m(ϕ) in (20) n l performsinnerprecodingusingtheobtainedeffectiveCSIH(cid:101)n. is generated using a uniform linear array with the antenna Both the statistical CSI acquisition and the Tx subspace spacing equal to half wavelength, and ψ (ϕ) = ψ ,∀ϕ ∈ k,l k,l (cid:104) (cid:105) control optimization can be done at a much slower timescale ϕmin,ϕmax , where ψ is chosen such that Tr(Θ ) = k,l k,l k,l k,l compared to the time slot duration. Hence, the proposed ML and ϕmin,ϕmax are randomly generated such that 1) solution is not sensitive to the backhaul signaling latency (cid:12) k,l (cid:12)k,l k,l delay and computational delay caused by the Tx subspace (cid:12)(cid:12)ϕmk,alx−ϕmk,iln(cid:12)(cid:12) = π6 , 2) if user k and user k(cid:48) belong to the samehotspot,wehaveϕmin =ϕmin,ϕmax =ϕmax.Weassume k,l k(cid:48),l k,l k(cid:48),l 30 the spatial channel correlation matrices Θ change every 1000 ons20 Probability gap timIne sthloetsb.asic simulation setup, the per BS transmit power is n realizati1000. 94 P0r.o9p5. without0 r.9e6fine. = 00..90742 0.98 0.99 1 Pb =10dBandtheothersystemparametersaresetasK0 =7, atio Minimum SINR satisfaction probabilities mink Pr(SINRk >= gamma) Ktthraecnss=mimi3tu,lpaMotiwone=srs, w4o0fe,tw(cid:15)hkiell=ucshe0ar.sn0g5ae,re∀thkgeiavsneyndstebwmykPp=akraL=mke,bKPtkeb0,rs,∀∀kink;.tthhInee nt spatial correl200 P =r o0b.0a1b6ility gap Prop. + simple refine. basic setup to study the impact of different system parameters ere 0.94 0.95 0.96 0.97 0.98 0.99 1 on the performance. We only evaluate the performance of the diff Minimum SINR satisfaction probabilities mink Pr(SINRk >= gamma) er refWereenccoemcpeallretothaevpoeidrfo“urmnsaynmcemoeftrtihceepdrgoepoefsfeedctsso”l.ution with ounts ov20 P =r o0b.0a1b1ility gap Prop. + bisection refine. C the following baselines. 0 0.94 0.95 0.96 0.97 0.98 0.99 1 • Baseline1(FFR):Fractionalfrequencyreuse(FFR)[28] Minimum SINR satisfaction probabilities mink Pr(SINRk >= gamma) isappliedtosuppresstheinter-cellinterference.Thereare 6subbands,3ofwhichareusedfortheinnerzoneand3 Figure 3: Histograms of the minimum SINR satisfaction probabili- areusedfortheouterzone.Theradiusfortheinnerzone tiesunderthebasicsimulationsetup.Thereddashlineindicatesthe is set as 150m. In each cell, ZF beamforming is used to targetSINRsatisfactionprobabilityandthegreendashlineindicates serve the users on each subband. the mean minimum SINR satisfaction probability. • Baseline 2 (Clustered CoMP): 3 neighbor BSs form a BS cluster and employ cooperative ZF [5] to simultane- ously serve all the users within the BS cluster. non-iterativerefinementmethod,theproposedsolutionbecome • Baseline 3 (JSDM-PGP): This is the two-stage precod- “tighter” (i.e., the “probability gap” is smaller), and with the ingschemeproposedin[8],wheretheper-groupprocess- more complicated bisection refinement method in [10], the ing(PGP)withapproximateBDisemployedateachBS. proposed solution can be even tighter. In the approximate BD, the pre-beamforming matrices F are chosen to satisfy F†U(cid:63) =0, ∀n∈B ,∀k, where U(cid:63)k,l ∈ UM×rk(cid:63),l is a senmik-u,lnnitary matrix cokllecting the CD.iffCeroemntpaSryisstoenmoPfarOamuteatgeersThroughput Performance under dominant eigenvectors of Θ . The number r(cid:63) of dom- k,l (cid:16)k,l (cid:17) InFig.4,weplottheoutagethroughputofdifferentschemes inate eigenvectors is chosen such that λdkB,l rk(cid:63),l+1 ≤ underthebasicsimulationsetup,wheretheoutagethroughput (cid:16) (cid:17) −20dB<λdB r(cid:63) ,whereλdB(m)isthem-thlargest of user k is defined as the maximum data rate that can k,l k,l k,l be achieved with a probability no less than 1 − (cid:15) . For eigenvalue of Θ in dB. Simulations show that such k k,l baseline 2, the 3 cooperative BSs need to exchange CSI and choice of r(cid:63) can achieve a good performance. Both k,l payload data, and thus there is CSI delay when the backhaul ZFinnerprecoderandregularizedzero-forcing(RZF)[2] latency is not zero. When there is CSI delay, the outdated inner precoder will be simulated. CSI is related to the actual CSI by the autoregressive model in [29] with the following parameters: the user speed is 3 B. Verify the SINR Satisfaction Probability km/h; the carrier frequency is 2GHz. For baseline 3, ZF inner First, we verify that the original QoS requirement in (3) is precoder is simulated. It can be seen that the performance of satisfied by the proposed solution under finite M. Note that the proposed scheme with the simple non-iterative refinement theQoSrequirementin(3)meansthattheminimumSINRsat- method is close to that with the more complicated bisection isfaction probability min Pr{SINR ≥γ} (conditioned on a refinement method in [10]. The outage throughput of the k k givenspatialcorrelationmatricesΘ)islargerthansometarget proposed scheme is larger than baseline 1 and baseline 3. value 1−(cid:15) . Fig. 3 shows the histograms of the minimum Although the performance of baseline 2 is promising at zero k SINR satisfaction probabilities over different realizations of backhaul latency, the performance quickly degrades at 10ms Θ under the basic simulation setup (M = 40). To obtain backhaul latency. the histograms, we generated 100 realizations of Θ. Then, In Fig. 5, we change the number of antennas to M = 64. for each realization of Θ, the minimum SINR satisfaction Theothersimulationsetupisthesameasthebasicsimulation probability is numerically evaluated using 10000 randomly setup. Compared to Fig. 4 with M = 40 antennas, the generatedrealizationsofinstantaneousCSIH.Fig.3validates performance of all schemes improves as the number of anten- that the proposed solution indeed achieve a minimum SINR nas increases. However, the performance improvement of the satisfaction probability no less than the required target 95%. proposedschemeandbaseline3islarger.Thisisnotsurprising The “probability gap” in Fig. 3 refers to the difference since the proposed scheme and baseline 3 are specifically between the mean minimum SINR satisfaction probability designed for massive MIMO systems with spatially correlated and the target SINR satisfaction probability. Without using channel. therefinementmethods,theproposedsolutionisconservative, In Fig. 6, we change the number of users in each cell to e.g., the “probability gap” is 0.042. However, with the simple K = 8 and the number of users per cluster to K = 4. 0 c