1 Channel Pre-Inversion and max-SINR Vector Perturbation for Large-Scale Broadcast Channels David A. Karpuk, Member, IEEE, and Peter Moss Abstract—We study channel pre-inversion and vector pertur- A. Background and Related Work bation (VP) schemesfor large-scale broadcast channels,wherein 6 a transmitter has M transmit antennas and is transmitting to We studyalinearfadingchannelconsistingofatransmitter 1 K single-antenna non-cooperating receivers. We provide results with M transmit antennas transmitting data to K single- 0 which predict the capacity of MMSE pre-inversion as K →∞. antenna,non-cooperatingreceivers,whereK M. Thebasic 2 We construct a new VP strategy, max-SINR vector perturbation model we consider is ≤ (MSVP), which maximizes a sharp estimate of the signal-to- p interference-plus-noise ratio. We provide results which predict y=Hs+w (1) e the performance of MSVP and demonstrate that MSVP out- S performsotherVPmethods.Lastly,wecombineMSVPwiththe where s CM is an encoded data vector, H CK M is 9 low-complexitySortedQRPrecodingmethodtoshowthatMSVP the chann∈el matrix, w CK is additive noise,∈and t×he ith 2 ohfasutsherespaottecnltoisaelttooecfhfiacniennetllycadpelaicvietry.datatoaverylargenumber Wcoeoradsisnuamteeocfhyan∈neClsKtatie∈sionbfosermrvaetdiobny(CreScIe)iviseravia=ila1b,le..a.t,tKhe. ] Index Terms—Channel Pre-inversion, MMSE Inverse, Vector T transmitter,inwhichcasethetransmittercanwritetheencoded Perturbation,SINR,RandomMatrixTheory,SortedQRPrecod- .I ing, Broadcast Channels data vector as s = Au where u is the uncoded data vector s and A is a precoding matrix depending on the channel. c [ As was observed in [1], the zero-forcing inverse of H is a I. INTRODUCTION poor choice for A when M = K, as the sum capacity does 2 notscale linearlywiththenumberofusersK.Instead,setting v Successful implementation of next-generation (e.g. 5G) 2 mobile broadband internet will require the delivery of high- A to be a regularizedinverse results in superior performance, 1 scaling the sum capacity linearly with the number of users. volume and high-fidelity data (e.g. streaming video) simulta- 1 However, regularized inversion still suffers from a large gap neouslytoalargenumberofusers.Therapidincreaseofboth 8 to channel capacity when the ratio K/M is close to unity. 0 the number of mobile devices and the volume of data to be The methods of [1] were improved upon by the vector . delivered is putting heavy demands on broadcast networks. 1 perturbation (VP) method of [2], in which a perturbation The algorithms underlying data delivery in such networks 0 vectorisaddedtotheuncodeddatavector.Vectorperturbation 6 must evolve along with the networks themselves, to meet the closes the gap to channelcapacity substantially, but the trans- 1 demands of the ever-increasing number of end users. v: Effectivelydeliveringalargeamountofdatatoalargenum- mitter is now burdened with solving a closest vector problem in an arbitrary lattice. While algorithms such as the sphere i berofuserssimultaneouslyimposestwomajorandseemingly X decoder[3], [4] exist to tackle such problems,the complexity contradictory demands on any system. First, the transmission r offindingthemaximum-likelihood(ML)solutionpreventsVP schememustbescalablewiththenumberofusersK,ormore a from being scalable to a large number of users [5]. Lattice precisely,theencodingoperationmusthavelowcomplexityin reductionalgorithmssuchastheLLLalgorithm[6]havebeen termsofthenumberofusers. Secondly,the system musthave usedin VP systems[7], [8], butforverylargedimensionsthe little to no performance degradation as the number of users LLL algorithm itself can be prohibitively complex. increases. That is, we wish to deliver data at rates close to Traditionally,the perturbationvectoris chosen to minimize channel capacity even as K . →∞ thepowerrenormalizationconstantγ (seeequation(4)below) With the goalofmeetingtheabovetwo demands,we study required at the transmitter [2], [9], [10], [11]. A notable channel pre-inversion [1] and vector perturbation (VP) [2] exception is [12], wherein the perturbation vector is chosen methodsforGaussianbroadcastchannelsasK .Among →∞ to minimize the mean square error (MSE) of the system other results, the main contribution is our max-SINR vector and is shown to have superior performance compared to perturbation (MSVP) scheme, which when combined with a the ‘minimize γ’ approach. However, such ‘minimize MSE’ low-complexity encoding algorithm has the potential to meet schemesseem largelyunstudied,with most authorspreferring the demands of next-generation broadcasting networks. tosettheprecodingmatrixtobethezero-forcinginverseofthe channelmatrix,despitepoorperformanceforsquaresystemsat D.Karpukiswith theDepartment ofMathematics andSystemsAnalysis, AaltoUniversity, Espoo,Finland. P.MossisformerlyofBBCResearch and lowersignal-to-noise-ratio(SNR).TheMSEofthesystemwas Development, London, United Kingdom, and is currently an independent also studied in [13] when VP is used in conjunction with the consultant. emails:david.karpuk@aalto.fi, [email protected]. blockdiagonalizationtechnique[14].VPtechniqueshavealso D. Karpuk is supported by Academy of Finland Postdoctoral Researcher grant268364. been studied in channels where users have multiple antennas, 2 i.e. MU-MIMO channels,in [15], [16], [17], thoughwe focus II. SYSTEMMODEL on the single-antenna receiver case. A. Vector Perturbation Channel Model Consider the M K MIMO channel where the transmitter × has M antennas and is communicating to K M non- B. Summary of Main Contributions ≤ cooperating users, each with a single antenna. The intended InSectionIIwereviewtheregularizedVPsystemmodel.In data u = [u ,...,u ]T is a length K column vector of 1 K SectionIIIwestudythesignal-to-interference-plus-noiseratio information symbols (e.g. QAM symbols) with u intended i (SINR) of the system and derive a useful approximation of for receiver i, normalized so that this quantity.In Section IV we use RandomMatrix Theoryto predicttheSINRandergodiccapacityofregularizedinversion Eu ui 2 =c, c=K/M. (3) | | for large systems with no perturbation. The approximation is TheentriesoftheK M channelmatrixHarei.i.d.circularly shown to be accurate through simulations, and generalizes a × symmetric complex random Gaussian with variance 1/K per theorem by the current authors for square systems (K = M) complexdimension.ThechannelHisassumedtobeknownat given in [18]. thetransmitter,whichcomputesaM K precodingmatrixA In Section V we study VP and construct a scheme, which × andanoffsetperturbationvectorx. Thevectorxisafunction we deem max-SINR vector perturbation (MSVP), which of both H and u and belongs to a scaled integer lattice; the provably maximizes our estimate of the SINR when any precise nature of x will be made clear in the next subsection. regularizedchannelinverseisemployed.Thisschemeisshown The transmitter computes an encoded data vector to outperform the Wiener Filter VP introduced in [12] which itselfimplicitlymaximizesadifferentnotionofSINR.Weuse s=A(u+x)/√γ, where γ =Eu A(u+x) 2/K (4) || || RandomMatrixTheorytoestimatetheperformanceofMSVP to within 0.5-1 dB. is a power renormalization constant. The encoded data then In Section VI we focus on VP for large systems, where we satisfies the power constraint Eu s 2 = K which allows for || || fair comparison when we fix K and vary M. use the sub-ML Sorted QR method of [19] to solve for the perturbation vector. We show that for small K, the resulting The ith receiver observes the ith coordinate yi of the total performance is very close to the performance of the ML length K received vector solution, and is essentially the same as that of the lattice- y=Hs+w =HA(u+x)/√γ+w (5) reduction-aided broadcast precoding of [7], even though the SQR method offers less complexity. Lastly we show that for from which they attempt to decode u . Here w = i large K, MSVP outperforms the zero-forcing VP method of [w ,...,w ]T is a length K column vector of additive noise 1 K [2].Weendthepaperbyprovidingconclusionsanddiscussing whoseentriesarei.i.d.circularlysymmetriccomplexGaussian potential future work. with Ew wi 2 = σ2. We define ρ = 1/σ2, and will often | | measure system performanceas a function of ρ or the system size K. Following convention, we set ρ (dB) = 10log (ρ) 10 C. Notation and usually measure ρ in dB. ThesymbolsZ,R,andCdenotetheintegers,realnumbers, and complex numbers, respectively. Capital boldface letters B. Choosing the Perturbation Vector suchasAdenotematrices,andlowercaseboldfaceletterssuch TheoffsetvectorxischosenfromascaledGaussianinteger as x denote vectors. We write A† for the conjugate transpose lattice τZ[i]K for some τ > 0, and may depend on both the of the complex matrix A, and AT for the (non-conjugate) given channel matrix H and given data vector u. Following transpose. If A is rectangular, its pseudo-inverse is denoted [2],thescalarτ ischosensothatifthecoordinatesofthedata by A+. If A is square, its trace and determinant are denoted vectors u are N-QAM constellation points, then the set by tr(A) and det(A), respectively. The squared Frobenius noArm2o=f Atr(=A(Aai)j)=is denoated2b.yT|h|Ae |i|d2Fe,ntaitnydmisatdriexfinoefdsibzye u+x∈CK | ui ∈N-QAM and x∈τZ[i]K (6) K|| i|s|FdenotedI† . Foranyi,jsq|uiajr|ematrixB=(b ), we define is a tra(cid:8)nslated lattice in CK. In other words, τ is ch(cid:9)osen so K ij a square matrix dg(B)Pof the same size by that the various translates of the set of all u are “spaced out evenly”throughouttheEuclideanspaceCK.Onecancompute b if i=j easily thatforunscaled,standardN-QAMsignalingthe value dg(B)ij = 0ii if i=j (2) of τ is 2√N. For our scaling, we have (cid:26) 6 √c N so that dg(B) has the entries of B on the diagonal and τ =2√N = 6c (7) zvearroiasbleelsXewhaerree.deTnhoeteedxpbeyctEat(iXon)aannddvVaarira(nXce),orfesapercatnivdeolmy. 23(N −1) r N −1 q The Gaussian integers Z[i] are defined to be Z[i] = a + where 2(N 1) is the average per-symbol energy of an bi a,b Z C where i2 = 1. { unscaled3 N-Q−AM constellation. | ∈ }⊂ − 3 Following [2], we assume that the ith receiver has knowl- interferenceandsignaltermsin(9).However,the interference edge of dg(HA)iiτ. The receivers model their observation as terms is dwarfed by the noise term in practice, thus one can √γ usually safely ignore this apparent correlation. Secondly, the u+x u+x y=dg(HA) +(HA dg(HA)) +w (8) correlationbetweenthesetermsatanygivenreceivervanishes √γ − √γ as K , as the effect of any single u on how we choose i →∞ andsincetheithreceiverknows dg(HA)iiτ,theycanreducey the total perturbation vector x becomes insignificant. √γ i We briefly point out that in [2, Equation (25)], the channel modulothelattice dg(HA)iiτZ[i] toremovetheith coordinate √γ modelaftersuccessfulreductionmodulotheappropriatelattice of the offset vector x. We assume that the modulo operation is given by always decodes the offset vector x correctly, when in fact it may not if, for example, the noise vector w is very y=u/√γ+(HA IK)(u+x)/√γ+w (12) − large. However, this assumption allows for clean analysis, is which would result in pervasivein the literature, and furthermoreseems to affectall e ^ VP strategiesin questionapproximatelyequally.So while our SINR= capacityplotswillslightlyoverestimateabsoluteperformance, Kc (13) they remain useful when comparing VP strategies to each Eu( (HA IK)(u+x) 2+ A(u+x) 2σ2) || − || || || other.WerestrictourVPsimulationstoρ 10dBtomitigate asadefinitionoftheSINRforregularizedperturbation.How- ≥ the effect of this potential decoding error. ever, this model overestimates the overall signal strength and therefore the capacity of the scheme, especially at low values III. SIGNAL-TO-INTERFERENCE-PLUS-NOISERATIOOF ofρwheretheentriesofdg(HA)maybesubstantiallysmaller VECTORPERTURBATION than unity. We will return to this point in Section V when In this section we discuss the signal-to-interference-plus- we define our max-SINR vector perturbation strategy and noise ratio (SINR) of VP systems. After providing the basic compare it with the Wiener Filter vector perturbation method definition of the SINR for regularized VP systems, we show of [12], whichselects the perturbationvectorto maximizethe howitdiffersfrompreviouslyconsiderednotionsofSINRfor MSE associated with (13). such systems (as in [1], [2], [12]), briefly explain connections with mean square error (MSE) and capacity, and provide B. Connection to Mean Square Error and Capacity a simple approximations of the SINR and capacity when Theconnectionbetweenthe SINRin equation(10) andthe employing a certain class of precoding matrices. mean square error (MSE) of the system is as follows. Let us fix a data vector u and corresponding offset x. The relevant A. Basic Definition estimate of dg(HA)u at the receivers is uˆ =√γy′ where y′ is as in (9). The resulting MSE for the fixed data vector u is After successful reduction modulo the various lattices dg(√HγA)iiτZ[i], the receivers model the resulting vector y′ MSEu =Ew uˆ dg(HA)u 2 (14) obtained from y by || − || = (HA dg(HA))(u+x) 2+ A(u+x) 2σ2 u u+x || − || || || (15) y =dg(HA) +(HA dg(HA)) + w (9) ′ √γ − √γ so that noise signal interference dg(HA) 2c |{z} SINR= || ||F , MSE=Eu(MSEu) (16) and treat|the i{nzterfer}enc|e as noise {wzhen decodi}ng. Modeling MSE thereceivedsignalasdg(HA)u/√γ accountsforthefactthat Theexpression(9)allowsonetowritetheresultingchannel whenAischosentobedifferentfromthezero-forcinginverse capacity for user i, and the average per-user capacity, for a of H, the diagonal gains of the effective channel matrix HA fixed channel H as need not be unity. Eu dg(HA)iiui 2 From (9) we derive, for a fixed channel H and precoding Ci,H =log2 1+ Eu (HA dg|(HA))(u+|x) 2+γσ2 matrix A, the signal-to-interference-plus-noise (SINR) ratio (cid:18) | − i| (cid:19) K of the system to be 1 CH = Ci,H Eu dg(HA)u 2/γ K i=1 SINR= Eu (HA dg(H||A))(u+x|)| 2/γ+Ew w 2 X (17) || − || || || respectively. As in [1, Equation (32)], we make the mild (10) assumptionthatthesignalandinterferencepowersareapprox- dg(HA) 2c = || ||F imately uniformly distributed across all users. This allows us Eu(||(HA−dg(HA))(u+x)||2+||A(u+x)(|1|21σ)2)mtoeaapspurroexoimf caatepaCcHity≈ElHo(gC2(H1)+bSyINR) andhence the ultimate Herewehaveimplicitlyassumedaslowfadingmodel,wherein thechannelHstaysconstantforalargenumberoftransmitted C :=EH(CH)≈EH(log2(1+SINR)) (18) data vectors u. Note that the perturbation vector x depends Theapproximation(18)isgenerallyagoodnumericalestimate on u, and thus there may be some correlation between the for the vector perturbation strategies under consideration. 4 C. Tikhonov Pre-Inversion where d and T are as in Theorem 1. The above implicitly contains the approximation We will consider precoding matrices of the form A=Hα =H†(αIK +HH†)−1 (19) M[SEu :=||T(u+x)||2 (28) forsome(small)constantα 0,whichistheTikhonovinverse ofthemeansquareerrorforagivendatavectoru(andafixed of the channel matrix H w≥ith regularization parameter α. channel H). Whenα=0,theTikhonovinversereducestothezero-forcing inverse, which we will denote IV. CAPACITY OFMMSE PRE-INVERSION FORLARGE SYSTEMS HZF =H†(HH†)−1 (20) In this section we fix the offset vector x to be x = 0; When we set the regularization parameter α = σ2, we will thatis, we are presentlyonlyconcernedwith the performance referto the correspondinginverseof H as the MMSE inverse, of linear precoding strategies with no vector perturbation. which we will denote by Furthermore, we fix the precoding matrix to be A=H . MMSE The goal of this section is to obtain explicit approximations HMMSE =Hσ2 =H†(σ2IK +HH†)−1 (21) for EH(SINR) and the capacity C for MMSE pre-inversion Theoptimalregularizationparameterαwasfoundin[1]tobe to measure performance of large systems. approximatelyKσ2forsquaresystems.Theapparentdisparity with the above matrix HMMSE is a consequence of how we A. Predicting SINR and Capacity of Large Systems have normalized the channel matrix and the transmit power. Our strategy is to compute lim EH(d) explicitly, and We prefer the given normalization, since the regularization K combinetheresultwith(27)tomeas→u∞relarge-scalesystemper- parameter of interest is now independent of the size of the formanceofMMSEpre-inversion.Thisrequiresthefollowing system. two lemmas,the firstof whichisan elementarysimplification of our expressionsfor the SINR and capacity,and the second D. Estimating SINR and Capacity of which is a technical lemma which essentially validates WhenwesettheprecodingmatrixtobeaTikhonovinverse, replacing d with lim EH(d) in the SINR and capacity theresultingSINR(andthusthecapacityC)canbeestimated approximations. K→∞ by a simple, compact expression. To begin, let Lemma 1: Suppose that A=H and that x=0. Then MMSE \ the approximations SINR and C from (27) are given by d=tr(HH )/K =tr(dg(HH ))/K (22) α α The following theorem is the basis for our approximation of S\INR= 1 d d and C b=−EHlog2(1−d). (29) the SINR. − Theorem 1: For a fixed channel matrix H and Tikhonov where d=tr(HHMMSE)/K.b parameter α, define Proof:Sincex=0andEu ui 2 =c,asimplecalculation ε1 =||dg(HHα)−dIK||2F (23) fgoivreasnyEum||aTtr(ixuB+,xw)|e|2se=e e||aTsi|l|y2F|fcr.|oSmin(c2e7)||tBha||t2F = tr(B†B) ε2 =Eu ||(HHα−dg(HHα))(u+x)||2 \ d2K (cid:0) (HHα dIK)(u+x) 2 (24) SINR= tr(T T) (30) −|| − || † where d is as in (22). Let T be any matrix such(cid:1)that d2K = (31) d2K+(1 2d)tr(HH ) T T=d2I 2dHH +HH H+ H (25) − MMSE † K − α α MMSE α d = (32) Then we can bound the SINR by 1 d − d2Kc d2Kc+ε where T is as in Theorem 1. The statement about C is SINR 1 (26) Eu T(u+x) 2+ε2 ≤ ≤ Eu T(u+x) 2 immediate from (27). || || || || and furthermore, we have Klim K1EH(ε1)=0. ranLdeommmvaar2ia:blLeestsu{cXhKth}a∞Ktt=h1efboelloawisnegqutherneceecoonfdirteiaoln-svahblouledd: Proof: See the Appendi→x.∞ (i) 0 < X < 1 for all K, (ii) 0 < lim E(X ) < 1, and K K While we are unable to prove that lim K1EH(ε2) = 0, (iii) lim Var(X )=0. Then K→∞ K K simulation results suggest that this is the→c∞ase, and that even K →∞ for small K the quantity ε2 is very small relative to the other X lim E(XK) terWmseinnotwheolbotwaienr obuorunadp.proximations of the SINR and the Kl→im∞E(cid:18)1−KXK(cid:19)= 1−K→Kl∞im E(XK). (33) capacity C by ignoring the error terms ε and ε , and setting →∞ 1 2 Proof: See the Appendix. S\INR:= Eu||Td(2uK+cx)||2, C :=EHlog2(1+S\INR()27) cirWTcuheleaocrrlayenmsny2om:wmLseetttarHtiecctbhoeemKmpl×aeixnMrtahnmedoaortmerimxGwoafuhsothssiieasnesnevtcartriieioasnba.leresiw.ii.tdh. b 5 variance 1/K per complex dimension, let H = H (αI + 25 α † K HH†)−1 be the Tikhonov inverse with parameter α>0, and let d=tr(HH )/K as in (22). Then we have 20 α lim EH(d)=d(c,α) (34) 15 K B) →∞ d where R) ( 10 N SI 1+c+cα 1+2c( 1+α)+c2(1+α)2 (H d(c,α):= − − . E 5 2c p (35) ρρ == 05 ddBB ρ = 10 dB Furthermore, if we fix A=HMMSE, then 0 ρ = 15 dB ρ = 20 dB ρ = 25 dB lim EH(S\INR)= d(c,σ2) (36) -5 ρ = 30 dB K 1 d(c,σ2) 5 10 15 20 25 30 35 40 →∞ − Value of K in K× K MIMO System 35 and lim C log (1 d(c,σ2)) (37) 30 K ≥− 2 − →∞ 25 forsystemswhichempbloyMMSEpre-inversionwithnovector B) pertuPrbraotoiof:nS.ee the Appendix. R) (d 20 N Theorem 2 provides the approximations SI 15 (H E d(c,σ2) 10 ρ = 0 dB EH(SINR)≈EMMSE(c,σ2):= 1 d(c,σ2), (38) ρρ == 51 0d BdB C log (1 d(c,σ2)−) (39) 5 ρρρ === 122505 dddBBB ≈− 2 − ρ = 30 dB 0 for large systems which employ MMSE pre-inversion with 5 10 15 20 25 30 35 40 Value of K in 2K × K MIMO System α=σ2 and no vector perturbation. Fig.1. Ontop,EH(SINR)forK×K systems(c=1)employingMMSE pre-inversion, forvarious values ofρ=1/σ2.Onbottom,thesameplotfor B. Simulation Results 2K×K systems. Inthissubsectionwe collectsimulationresultswhichstudy the accuracy and predictive ability of the approximation(38), log (1 d(c,σ2)) in a Taylor series as σ2 0 to obtain for channel pre-inversion with A = HMMSE and no vector −(recal2l tha−t ρ=1/σ2) → perturbation. 1) Approximating SINR: In Fig. 1 we plot EH(SINR) as C &−log2(1−d(c,σ2)) a function of K for systems with M = K (top), and M = = log2(ρ)+log2(1−cc)+O(1) c<1 (40) 2K (bottom), for various values of ρ = 1/σ2. In both plots, (cid:26) 12log2(ρ)+O(1) c=1. the solid curves represent experimentally measured values of The experimentalresults of the previoussubsection show that EH(SINR), and the dashed lines the correspondingvalues of the leading term of the series approximates C quite well for EMMSE(c,σ2). We see thatEMMSE(c,σ2) predictsthe limiting ρ>15dBorso.Ifoneacceptsthat log (1 d(c,σ2))isan valueofEH(SINR)verywell,forallvaluesofρ.Ontheother accuratepredictorofC forlargeK,t−hent2hea−boveshowsthat hand, note that the error introduced by applying the large-K the qualitative performanceof square and non-squaresystems limit to small-K systems may be non-negligible when c=1. is different. 2) ApproximatingCapacity: In Fig. 2 we plot the capacity C =EH(CH)asafunctionofρforK =8(top)andK =64 V. MAX-SINRVECTORPERTURBATION (bottom),forc=1,2/3,1/3,1/6.Thesolidmarkedlinesare In this and all subsequent sections we turn our attention the experimentally measured values of C, the dashed lines towardsschemeswhichemploynon-trivialvectorperturbation, are the values of the approximation −log2(1−d(c,σ2)) of thatis,choosex τZ[i]K accordingtosomealgorithmwhich C from (38), and the solid unmarked lines are the channel ∈ is intendedto optimizesystem performance.In [2] andnearly capacity, computed numerically using the results of [20]. We all subsequent literature, the authors fix the precoding matrix seethattheapproximation(38)ofthecapacityisverygoodin tobeA=H forα 0,andforafixeddatavectoruchoose α allscenarios,exceptforsmallsquaresystemswhenρislarge. ≥ the offset vector x to be x= argmin γ = argmin A(u+x) 2 (41) ′ C. Qualitative Behavior of Capacity of MMSE Pre-Inversion x′ τZ[i]K x′ τZ[i]K || || ∈ ∈ To study the qualitative behavior of the capacity of which is a closest-vector problem in a lattice and hence Tikhonov pre-inversion with A = H , we expand solvable with a sphere decoder. MMSE 6 14 systems with K =M =12 employing 16-QAM modulation. E (C ), M = 8 H H We see a consistent gain of approximately 0.5 dB over the 12 EH(CH), M = 12 E (C ), M = 24 WFVP strategy. H H E (C ), M = 48 z) 10 H H H 5.5 ) (bits/sec/H 68 5 MWpchlaSFaiVVnn PnPTeikl hcoanpoavc itinyversion (CH Hz) 4.5 E 4 c/ se 4 s/ 2 bit ) (H3.5 C 00 5 10 15 20 25 30 E(H 3 ρ = 1/σ2 (dB) 14 E (C ), M = 64 2.5 H H 12 EH(CH), M = 96 E (C ), M = 192 2 H H 10 12 14 16 18 20 E (C ), M = 384 z) 10 H H ρ = 1/σ2 (dB) H c/ se 8 Fig. 3. Capacity of the WFVP strategy of [12] and the MSVP defined by s/ (42),forasystemwithK=M =12. bit ) (H 6 C (H E 4 B. Estimating the Performance of max-SINR Vector Pertur- 2 bation In this subsection we will estimate the performance of 0 0 5 10 15 20 25 30 MSVP when using the regularization parameter α = σ2. ρ = 1/σ2 (dB) The regularization parameter α = σ2 is not known to be Fig. 2. Capacity of MMSE pre-inversion for K = 8 (top) and K = the optimal regularization parameter for the MSVP strategy, 64 (bottom). The marked lines are the experimentally computed capacity and without knowledge of the optimal α choosing α=σ2 is C of MMSE pre-inversion, the dashed lines represent our approximation simply convenient. The main result of this subsection is an −log2(1−d(c,σ2))ofthisquantity,andthesolidlinesthechannelcapacity. approximation of EH(SINR) which can be used to predict system performance to within about 0.5-1 dB, which we A. Max-SINR Vector Perturbation demonstrate through numerous simulations. The results of [9], specifically [9, Lemma 1 and Corollary Ratherthanchoosingxtominimizeγ,weinsteadchoosex 1], estimate the power renormalization constant γ for the to minimizethe meansquareerrorofthesystem. Specifically, precoding matrix A = H ; Jensen’s Inequality can then be for a fixed channel matrix H and a fixed data vector u, we ZF used to estimate the expected SINR for such a ‘zero-forcing’ choose the perturbation vector x according to strategy.HowevertheMSVPstrategy,whichsetsA=H MMSE x= argmin MSEu = argmin T(u+x′) 2 (42) and chooses x according to (42), has qualitatively different x′ τZ[i]K x′ τZ[i]K || || performanceatlowervaluesofρwhencomparedtothe‘zero- ∈ ∈ forcing’ strategy. Thus a new predictor of performance is whereT satisfies T T=d2I 2dHH +HH H+ H † K− α α MMSE α required. and d = tr(HH )/K. Note that this provides a VP strategy α While the estimate of γ for the ‘zero-forcing vector per- for any regularization parameter α 0 whatsoever, not just ≥ turbation’ method of [9] does not provide a useful predictor the MMSE parameter α = σ2. We will refer to this strategy for the SINR of max-SINR vector perturbation, the general as max-SINR vector perturbation, or MSVP. strategy therein remains applicable. In particular, we let We emphasizethatthis isnot the WienerFilter VP strategy (WFVP) of [12], which chooses the offset vector x to min- = hypercube in CK, with side length K imize ||L(u+x′)||2, where L†L = (σ2IK + HH†)−1. An H N (43) argument similar to the proof of our Theorem 1 shows that lim τ = lim 6c =√6c L(u+x) 2 is the denominatorof the alternative expression N→∞ N→∞r N −1 || || (13) of the SINR. Thus WFVP attempts to maximize the and we consider data vectors u chosen from the uniform SINR, but does not account for the diagonal entries of HH distribution on . Heuristically, we are approximating the α K H being less than unity. discreteN-QAMdistributionbytheuniforminputdistribution To demonstrate the improvement offered by our MSVP on the minimal hypercube surrounding the constellation as method over the WFVP strategy of [12], we plot the capacity N . One can check that our energy constraint is C =EH(CH)asdefinedin(18)ofbothschemesinFig.3for prese→rved∞, in other words that for such uniform inputs u, we 7 have We can complete our approximation of EH(SINR) by 1 performing the following series of approximations: Eu u 2 = t 2dt=Kc (44) and hence Eu||ui||2 = cvosli(nHceKt)heZHenKtr|i|es||of the data vector u EH(SINR)≈ dE(cH,(σM[2)S2EK)c | | are assumed i.i.d. d(c,σ2)2Kc H,Roeucralelsftirmomate(1M[6)SEanodf(t2h8e)mtheaatnfsoqruaarefixeerdrocrhoafntnheelsmysatterimx ≤ 6πcK(KK+!)11/KEH(det(T†T)1/K) (50) π K+1 d(c,σ2)2 is given by M[SE=Eu||T(u+x)||2 (45) ≈ π6(KK!)+1/1K (EHKid=e0t(TKi†T(M)M)1!i/)K!β2i 1/K ownhe[r9e,TLeimsmasain1]T,hweoilrlemall1ow. Tuhse btoeloewstimpraotpeoEsiHtio(nM[,SmEo)dealnedd ≈ 6(K!)1/K "PKi=0(cid:0)Ki(cid:1)(MM−−!i)!β1i# thus provide a useful estimate of EH(SINR). where β1 and β2 are as in equPation (cid:0)(49(cid:1)). We now have Proposition 1: Suppose an M×K MIMO system employs EH(SINR) Evp(K,M,σ2) (51) MSVP with a fixed channel matrix H and data vectors u ≈ where uniform on , with any regularization parameter α 0. K Then we havM[HeSE≥ 6πcKK(K+!)11/K det(T†T)1/K ≥(46) Evp(K,M,σ2):= π6(KK!)+1/1K "PKiKi==00(cid:0)KKii(cid:1)((MMMM−−!!ii))!!ββ21ii#1/(K52) P (cid:0) (cid:1) for max-SINRvectorperturbationwith α=σ2. The approxi- where T is as in Theorem 1. mationsin(50)allstemfromreplacingaquantitybyitslarge- Proof: The proof is the same as for [9, Lemma 1], K limit, or from an application of Jensen’s Inequality. We [ wherein MSE is expressed as the second moment of the omit further details in favor of demonstrating the validity of VoronoicellofthelatticegeneratedbyTinCK,andisrelated the approximation through simulations. to the second moment of the unit sphere in R2K. We omit further details. C. Simulation Results ToapplytheaboveresulttoapproximatetheexpectedSINR In this subsection we empirically demonstrate the accuracy of the system, we will need to estimate EH(det(T†T)). To of (51) for MSVP with α = σ2. We fix the signaling that end, we recall a result from Random Matrix Theory. Let alphabet to be a 16-QAM constellation for all experiments. W=KHH† be takenfromthe complexWishartdistribution The solid curves represent experimentally measured values WK(M,IK) [21, Section 2.1.3], and let β be constant with of EH(SINR) or the capacity C, and the dashed curves the respect to H. Then [21, Theorem 2.13] states that resulting approximations. 1) SINR as a function of K: We study (51) in Fig. 4 K K M! EH(det(IK +βW))= βi. (47) for M = K (top) and M = 2K (bottom). We see that the i (M i)! approximationis accurate to within about 1 dB when K 4. Xi=0(cid:18) (cid:19) − ≥ Implicit in our approximation (51) is (38) which essentially Whenα=σ2 wecanrewriteT†Tasaproductofmatricesof replaces the value d by its large-K limit, hence one should theformIK+βWandthenapplytheaboveresulttoestimate expect (51) to also be more accurate for larger K. EH(det(T†T)). Straightforward computation gives 2) Approximating Capacity: In Fig. 5 we plot, for K = 8 andc=1,8/9,and4/5,theergodiccapacityC ofmax-SINR 2 1 1 d 1 1 − vector perturbation as well as the corresponding estimate ob- T†T=d2 IK +(cid:18) −d (cid:19) Kσ2W!(cid:18)IK + Kσ2W(cid:19) tainedbycombining(51)and(18)toobtaintheapproximation C log (1+E (K,M,σ2)).Again,weseethatthisestimate ≈ 2 vp Replacingdwithitslarge-K limitd(c,σ2),againapproximat- predicts the expected capacity well. ing the expectation of a ratio by the ratio of the expectations, and using (47) we arrive at VI. MAX-SINRVECTORPERTURBATION FORLARGE SYSTEMS EH(det(T†T))≈d(c,σ2)2KEEHH((ddeett((IIKK ++ββ12WW)))) sioVne,cbtuotrcpoemrtpuurbtiantgiotnheofofeprtsimlaarlgoeffbseentevfietcstoorvxericnh(a4n2n)emlianyvebre- K K M! βi (48) prohibitivelycomplexforlarge K. In[7] the authorsusedthe =d(c,σ2)2K i=0 i (M−i)! 1 LLL lattice-reduction algorithm to achieve this goal, but for PK (cid:0)K(cid:1) M! βi i=0 i (M i)! 2 verylargeK thisreductionitselfcanbeprohibitivelycomplex. − where P (cid:0) (cid:1) We will use a method with even smaller complexity, which we show approaches the performance of the ML solution for 1 d(c,σ2) 2 1 1 small K, and slightly outperforms LLL-based precoding for β1 = −d(c,σ2) Kσ2, β2 = Kσ2. (49) large K. For all experiments 16-QAM modulation was used. (cid:18) (cid:19) 8 35 fora squareK K matrixT wherex rangesoveran integer ′ ρ = 10 dB × ρ = 15 dB lattice. The SQRP algorithm is a modified Gram-Schmidt 30 ρ = 20 dB procedurewhichdecomposestheK KmatrixTasaproduct ρ = 25 dB ρ = 30 dB × T=QRP (54) 25 R) where Q is K K unitary, R = (rij)1 i,j K is K K SIN 20 upper-righttrian×gular,and P is a K K p≤erm≤utationma×trix, (H to attemptto maximizethe diagonal×entriesr of R, in order E ii 15 as i=K,...,1. Substituting into (53) we obtain argmin y Tx′ 2 =argmin y QRPx′ 2 (55) 10 x′ || − || x′ || − || =argmin y˜ Rz 2 (56) z || − || 5 2 3 4 5 6 7 8 Value of K in K× K MIMO System where y˜ =Q†y and z=Px′. 40 Let us recall the definition of the Babai point zB = ρρ == 1105 ddBB [z1B,...,zKB]T, an estimate of the solution to (56) given 35 ρ = 20 dB recursively by ρ = 25 dB ρ = 30 dB c =y˜ /r , zB =[c ] (57) 30 K K KK K K K NR) c =(y˜ r zB)/r , zB =[c ], (58) SI 25 i i− ij j ii i i E(H j=Xi+1 for i=K 1,...,1 20 − where [] denotes rounding to the nearest element of the · 15 underlying per-coordinate constellation. The final estimate of the ML solution x is obtained by computing P 1zB. The − 10 modifiedSQR algorithmof [19] increases the probabilitythat 2 3 4 5 6 7 8 Value of K in 2K × K MIMO System P 1zB istheMLsolutionto(53).Wereferto[19]forfurther − details. FSiIgN.R4.veOcntotroppe,rEtuHrb(aStiIoNnRw)itfhorαa=Kσ×2,Kforsvyasrtieomus(vca=lue1s)oefmρp=loy1i/nσg2m.aOxn- ToapplythisalgorithmtotheVPprocedure,werewritethe bottom,thesameplotfor2K×K systems. argmin problem (42) as x= argmin T(u+x′) 2 = argmin y Tx′ 2 (59) 11 x′ Z[i]K || || −x′ Z[i]K || − || K = 8, M = 8 ∈ ∈ 10 K = 8, M = 9 where y = Tu is the ‘received’ vector. We then apply the K = 8, M = 12 decomposition (54) and compute the estimate of x, namely 9 Hz) 8 P−1zB,asabove.WerefertotheprocessasSortedQRvector c/ perturbation, or just SQR vector perturbation. e s 7 s/ bit B. Comparison with ML and Lattice-Reduction-AidedBroad- ) (H 6 cast Precoding C (H 5 InthetopplotofFig.6wecompareSQRMSVPforK =8 E 4 and c = 1, 4/5 to ML MSVP wherein (42) is solved using a sphere decoder. We also plot the performance of lattice- 3 reduction-aidedbroadcastprecoding[7] appliedto ourMSVP 2 10 15 20 25 30 method,whichusesamatrixdecompositionofTbasedonthe ρ = 1/σ2 (dB) LLL lattice reduction algorithm [6] and similarly computes a Babai estimate of the optimal perturbation vector. In the Fig. 5. The experimentally measured capacity C of MSVP for systems with K = 8 and M = 8, 9, and 12, versus the approximation log2(1+ bottom plot of Fig. 6 we repeat the experiment for K = 80 Evp(K,M,σ2))ofthisquantity. and c = 1, 4/5, omitting the performance of ML MSVP as using a sphere decoder for such a large system is infeasible. As we see from the plots, the performance degradation of A. Sorted QR max-SINR Vector Perturbation using the SQR method instead of an ML method to find WeemploytheSortedQRPrecoding(SQR)methodof[19], the perturbation vector x is minimal. Surprisingly, the SQR asub-MLalgorithmfordecodingspace-timecodeswhichcan methodoffersamarginalbutconsistentimprovementoverthe be summarized as follows. For our purposes it suffices to LLL method at high values of ρ. This is especially notable consider the problem of computing since, as we discuss further in Section VI-D, computing the SQRmatrixdecompositioncanbedonewithlowercomplexity x=argmin y Tx′ 2 (53) than computing the LLL reduction. x′ || − || 9 10 9 M = 8, ML ZFVP 9 M = 8, SQR 8 MSVP M = 8, LLL plain Tikhonov inversion M = 10, ML channel capacity 8 M = 10, SQR 7 Hz) M = 10, LLL Hz) C) (bits/sec/H 567 C) (bits/sec/H456 E(H 4 E(H3 3 2 2 1 10 15 20 25 30 10 15 20 25 30 ρ = 1/σ2 (dB) ρ = 1/σ2 (dB) 10 9 M = 80, SQR ZFVP M = 80, LLL 9 MSVP M = 100, SQR 8 M = 100, LLL plain Tikhonov inversion channel capacity 8 7 Hz) z) ec/ 7 c/H 6 s e E(C) (bits/HH 56 (C) (bits/sHH45 4 E 3 3 2 2 10 15 20 25 30 1 ρ = 1/σ2 (dB) 10 15 20 25 30 ρ = 1/σ2 (dB) Fig. 6. On top, capacity of MSVP systems with K = 8 and c = 1, 4/5 employingML,SQR,andLLLmethodsforcomputingtheperturbationvector Fig. 7. On top, the value of C = EH(CH) for the MSVP and ZFVP x.Onbottom,thesameplotforK=80,omitting theMLstrategy. strategies with K = M = 256 employing 16-QAM signaling, using the SQRdecompositiontosolvefortheperturbationvector.Onbottom,thesame plotforK=M =1024. C. Comparison with Zero-Forcing Vector Perturbation D. Remarks on Complexity TodemonstratethatSQRMSVPcanbeusedinpractice,we In this subsection we compare MSVP (with α = σ2) with nowbrieflydiscussthecomplexityoftheinvolvedalgorithms. the zero-forcing strategy in which A = H and the offset ZF Solving for the perturbation vector x is the main bottleneck vector x is chosen using the SQR algorithm of Section VI-A to implementing VP systems, as it must be done multiple to minimize γ = A(u+x) 2/K as in [2], [9]. We denote times per channel realization and finding the ML solution || || this strategy ZFVP. Channel pre-inversion with A = H MMSE is notoriously complex. The preprocessing performed on the and no perturbation is also shown as a helpful basis for channel matrix H (e.g. computing the matrix T) must only comparison. be doneonceper channelrealizationand thereforehas less of In Fig. 7 we show the performance of ZFVP and MSVP animpactontotalcomputation.Nevertheless,wediscussboth for K = M = 256 (top) and K = M = 1024 (bottom) aspects below. when the SQR algorithm is employed. When K =M =256, 1) Preprocessing: During SQR MSVP, the preprocessing MSVP offers a steady improvement of approximately 1 dB consistsof two parts,namelycomputingthe K K matrix T × over ZFVP between ρ = 10 dB and 20 dB, though for large asinTheorem1,andthenperformingtheSQRdecomposition valuesofρ ZFVPslightlyoutperformsMSVP.ForK =M = toT.TheformercanbedonewithaCholeskydecomposition, 1024 the performance of ZFVP degrades to the point where the computation of which requires O(K3) operations. The we see that MSVP outperforms it at all values of ρ under latter can be done using a modified Gram-Schmidt algorithm consideration. On the contrary, the performance of MSVP is (see [19]), the complexity of which is easily be seen to be apparently constant with increasing system size. To simplify O(K3), and therefore all preprocessing can be performed in presentationweomittedtheperformanceofSQRWFVPfrom O(K3) operations. If LLL reduction is used instead of the these plots, but the behavior is essentially identical to that SQR method, then between O(K4) and O(K5) operations already depicted in Fig. 3. Specifically, MSVP outperforms are needed [6]. WFVP by approximately0.5-1dB atall valuesof ρ when the 2) Solving for the Perturbation Vector: Solving for the SQR algorithm is employed to find the perturbation vectors. perturbation vector x in (42) using a sphere decoder has 10 complexity which is exponential in the dimension K of the whereΣ hasthesingularvaluess /(s2+α),fori=1,...,K α i i lattice [5]. On the other hand, computing the Babai estimate of H along the diagonal. It follows immediately that the α xB using (57) and (59) requires only O(K2) multiplications. singular values of H+ are (s2+σ2)/s2 for i=1,...,K. MMSE i i The complexity of computing the Babai point is the same, Each singular value s of H gives rise to an eigenvalue λ regardless of whether we use the SQR or LLL method to of the matrix d2I 2dHH +HH H+ H , which is compute the perturbation vector. given by K − α α MMSE α VII. CONCLUSIONS AND FUTURE WORK s2 s2(s2+σ2) λ = d2 2d + (63) With the goal of developing scalable and close-to-capacity − s2+α (s2+α)2 data transmission schemes for next-generation broadcast net- s2 s2 2 s2+σ2 works, we have studied channel pre-inversion and vector = d2 2d + (64) − s2+α s2+α s2 perturbation schemes for a large number K of end users. (cid:18) (cid:19) To that end, we have provided an explicit, sharp estimate of > d2 2d s2 + s2 2 (65) the capacity of MMSE channel pre-inversion as K . − s2+α s2+α → ∞ (cid:18) (cid:19) Furthermore, we have proposed a new max-SINR vector s2 2 perturbationscheme which maximizesa sharp estimate of the = d 0. (66) − s2+α ≥ SINR of the system. Random Matrix Theory was used to (cid:18) (cid:19) estimate the performance of our vector perturbation scheme, and the resulting approximation was shown to be accurate. As all λ are obtained this way, the matrix d2IK 2dHHα+ We demonstrated that MSVP outperforms other VP schemes, HHαH+MMSEHα is positive definite. − such as Wiener Filter VP and zero-forcing VP. Lastly, we Let ε and ε be as in the statementof the theorem.To see 1 2 applied the Sorted QR decomposition method to solve for the bounds for the SINR, note that when A=H we have α the perturbation vector, resulting in a scheme which is low- complexity and close to channel capacity for very large K. SINR The low complexity and good performance suggest that our max-SINR vector perturbationmethod could be implemented = ||dg(HHα)||2Fc in practice in large broadcast networks. Eu( (HHα dg(HHα))(u+x) 2+ Hα(u+x) 2σ2) || − || || || Future work will consist of investigations into using fast- (67) decodable space-time codes [22], [23], [24] at the transmit dg(HH ) 2c = || α ||F end, which could naturally reduce the complexity of the ML Eu( (HHα dIK)(u+x) 2+ Hα(u+x) 2σ2)+ε2 search for the perturbation vector. Furthermore, we plan on || − || || || (68) comparing our method with the Degree-2 Sparse VP method of [25], which offers comparable complexity. We plan to Webeginbyboundingthenumeratoraboveandbelow.Bythe investigatethe performanceof MSVP with imperfectCSI and triangle inequality, we have withcorrelatedchannelcoefficients,particularlywhencorrela- tionoccursbetweenthe transmitantennas.Lastly,preliminary simulation results suggest that the quantity dg(HA)ii/√γ, ||dg(HHα)||2F = ||dg(HHα)−dIK +dIK||2F (69) which must be known by the receivers prior to transmission, dI 2 +ε (70) is nearly constant with respect to H, especially for large ≤ || K||F 1 = d2K+ε (71) systems.Thusitmaybepossibletoreplacethisquantitywitha 1 simpleconstantatthereceiveend,cuttingdownonpreliminary communication overhead significantly. This is an avenue of Toseetheotherinequality,leta ,...,a >0beanypositive 1 K potential future research that deserves investigation. real numbers, and let a = 1 K a be their mean. We K i=1 i have ( K a )2 ( K a2)( K 12) by the Cauchy- i=1 i ≤ i=1 iP i=1 APPENDIX Schwarz Inequality, which is easily seen to be equivalent to [a,...P,a]T 2 [a ,P...,a ]TP2. Letting a = (HH ) Proof of Theorem 1: To prove that the matrix T exists, || || ≤ || 1 K || i α ii and a = d = tr(HH )/K we see that d2K = dI 2 HwhHicαhHm+MuMsStEHfirαst isshopwositthivaet dtheefinmitea.triTxodt2hIaKt e−nd2,dlHetHHα =+ ||dg(HHα)||2F. α || K||F ≤ To complete the bounds on the SINR, define UΣV be a singular value decomposition of the channel, † where the diagonal entries of Σ are s ,...,s . The singular 1 K value decomposition of Hα is M[SEu = (HHα dIK)(u+x) 2+ Hα(u+x) 2σ2 || − || || || H = VΣTU (αI +UΣV VΣTU ) 1 (60) (72) α † K † † − [ Following [12, Section 4], the idea is to rewrite MSEu as the = VΣT(αIK +ΣΣT)−1U† (61) normofasinglevector.Toshortennotation,weletz=u+x [ =:Σα and z′ =U†z. Computing MSEu in terms of the singular = VΣ U (62) value decompositionsnow gives(notingthatmultiplyingby a |α † {z }