Joint Channel-and-Data Estimation for Large-MIMO Systems with Low-Precision ADCs

Chao-Kai Wen, Shi Jin, Kai-Kit Wong, Chang-Jen Wang, and Gang Wu

arXiv:1501.05580v1 [cs.IT], Jan. 2015

C.-K. Wen and C.-J. Wang are with the Institute of Communications Engineering, National Sun Yat-sen University, Kaohsiung, Taiwan (e-mail: [email protected]). S. Jin is with the National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, P. R. China. K. Wong is with the Department of Electronic and Electrical Engineering, University College London, UK. G. Wu is with the National Key Laboratory of Science and Technology on Communications, University of Electronic Science and Technology of China, Chengdu 611731, P. R. China.

Abstract: The use of low precision (e.g., 1-3 bits) analog-to-digital converters (ADCs) in very large multiple-input multiple-output (MIMO) systems is a technique to reduce cost and power consumption. In this context, nevertheless, it has been shown that the training duration is required to be very large just to obtain an acceptable channel state information (CSI) at the receiver. A possible solution for quantized MIMO systems is joint channel-and-data (JCD) estimation. This paper first develops an analytical framework for studying the quantized MIMO system using JCD estimation. In particular, we use the Bayes-optimal inference for the JCD estimation and realize this estimator utilizing a recent technique based on approximate message passing. Large-system analysis based on the replica method is then adopted to derive the asymptotic performances of the JCD estimator. Results from simulations confirm our theoretical findings and reveal that the JCD estimator can provide a significant gain over conventional pilot-only schemes in the quantized MIMO system.

I. INTRODUCTION

Very large multiple-input multiple-output (MIMO) or “massive MIMO” systems [1] are widely considered as a key technology for 5G wireless communications networks [2,3]. Such systems promote the use of a very large number of antennas at the base station (BS) (e.g., hundreds or thousands) to serve a number of user terminals (e.g., tens or hundreds) in the same time-frequency resource. Nonetheless, the high dimensionality greatly increases hardware cost and power consumption. This motivates the study of MIMO systems with very low precision (e.g., 1-3 bits) analog-to-digital converters (ADCs).

Several aspects of low precision ADCs have been studied in the literature for single-input single-output (SISO) channels [4] and more recently MIMO channels [5] and references therein. In this paper, our focus is on signal detection for the quantized MIMO systems where each receiving antenna is equipped with a very low precision ADC. Prior work in this direction covered code-division multiple-access (CDMA) systems [6], massive MIMO systems [5–9], distributed antenna systems (DASs) [10], and compressed sensing [11]. However, most previous work assumed perfect channel state information at the receiver (CSIR). In particular, [8] revealed that in a MIMO system with one-bit ADCs, achieving the same performance as the full CSI case requires a very long training sequence (above 50 times the number of users). Clearly, the assumption of perfect CSIR becomes quite controversial, particularly for quantized MIMO systems. The requirement of a long training sequence motivates us to consider joint channel-and-data (JCD) estimation, in which estimated payload data are utilized to aid channel estimation. A major advantage of JCD estimation is that relatively few pilot symbols are required to achieve equivalent channel and data estimation performance.

Although performance enhancement by using this technique is expected, we are not aware of any study for the quantized MIMO systems using JCD estimation.^1 In this paper, we take the important first step to analyze the achievable performance of the quantized MIMO system using JCD estimation. To this end, we use the Bayes-optimal inference for JCD estimation, as this approach provides the minimal mean-square-error (MSE) with respect to (w.r.t.) the channels and payload data. However, the complexity of carrying out the Bayes-optimal JCD estimator appears prohibitive. To address this issue, we use a variant of belief propagation (BP) to approximate the marginal distributions of each data and channel component. In particular, we modify the bilinear generalized approximate message passing (BiG-AMP) scheme in [15] and adapt it to the quantized MIMO system. We refer to the proposed method as GAMP-based JCD. Furthermore, by applying large-system analysis based on the replica method from statistical physics, we provide the theoretical performances for the Bayes-optimal JCD estimator.^2 Simulations are used to verify the efficiency of the proposed algorithm and the accuracy of our analysis.

^1 In the context of unquantized MIMO systems, several aspects of the JCD estimation have already been widely studied, see e.g., [12–14].

^2 In this paper, the Bayes-optimal JCD estimator is regarded as the theoretical optimal estimator, while the GAMP-based JCD algorithm can be thought of as a practical algorithm to approximate the theoretical optimal estimator.

II. SYSTEM MODEL

We consider a block-fading uplink channel with $K$ transmit antennas and $N$ receive antennas, in which the channel remains constant over $T$ consecutive symbol-intervals (i.e., a block). The received signal $\mathbf{Y} \in \mathbb{C}^{N\times T}$ over the block interval can be written in matrix form as

$$\mathbf{Y} = \frac{1}{\sqrt{K}}\mathbf{H}\mathbf{X} + \mathbf{W} = \mathbf{Z} + \mathbf{W}, \tag{1}$$

where $\mathbf{X} \in \mathbb{C}^{K\times T}$ denotes the transmit symbols in the block, $\mathbf{H} \in \mathbb{C}^{N\times K}$ is the matrix containing the fading coefficients associated to the channels between the transmit antennas and the receive antennas, $\mathbf{W} \in \mathbb{C}^{N\times T}$ represents the additive temporally and spatially white Gaussian noise with zero mean and element-wise variance $\sigma_w^2$, and we define $\mathbf{Z} \triangleq \frac{1}{\sqrt{K}}\mathbf{H}\mathbf{X}$.

In the quantized MIMO system, as shown in Figure 1, each received signal component $Y_{ij}$, $1 \le i \le N$, $1 \le j \le T$, is quantized separately into a finite set of prescribed values by a $B$-bit quantizer $\mathsf{Q}_c$. The resulting quantized signals can be written as

$$\widetilde{\mathbf{Y}} = \mathsf{Q}_c(\mathbf{Z} + \mathbf{W}). \tag{2}$$

Specifically, each complex-valued quantizer $\mathsf{Q}_c(\cdot)$ is defined as $\widetilde{Y}_{ij} = \mathsf{Q}_c(Y_{ij}) \triangleq \mathsf{Q}(\mathrm{Re}\{Y_{ij}\}) + \mathsf{j}\,\mathsf{Q}(\mathrm{Im}\{Y_{ij}\})$, i.e., the real and imaginary parts are quantized separately. The real-valued quantizer $\mathsf{Q}$ maps a real-valued input to one of the $2^B$ bins, which are characterized by the set of $2^B - 1$ thresholds $[r_1, r_2, \ldots, r_{2^B-1}]$, such that $-\infty < r_1 < r_2 < \cdots < r_{2^B-1} < \infty$. For notational consistency, we define $r_0 = -\infty$ and $r_{2^B} = \infty$. The output $\widetilde{Y}$ is assigned a value in $(r_{b-1}, r_b]$ when the quantizer input $Y$ falls in the interval $(r_{b-1}, r_b]$ (namely, the $b$-th bin). For example, the thresholds of a typical uniform quantizer with the quantization step-size $\Delta$ are given by

$$r_b = \left(-2^{B-1} + b\right)\Delta, \quad \text{for } b = 1, \ldots, 2^B - 1, \tag{3}$$

and the quantization output is assigned the value $r_b - \frac{\Delta}{2}$ when the input falls in the $b$-th bin. Figure 1 shows an example of the 3-bit uniform quantizer.

[Figure: receiver chain with an RF front end and separate ADCs for Re(·) and Im(·), producing $\widetilde{\mathbf{Y}} = \mathsf{Q}(\mathbf{Y})$; the inset shows the 3-bit uniform quantizer staircase with thresholds at multiples of $\Delta$ from $-3\Delta$ to $3\Delta$ and output levels $\pm\frac{\Delta}{2}, \pm\frac{3\Delta}{2}, \pm\frac{5\Delta}{2}, \pm\frac{7\Delta}{2}$.]
Fig. 1. The quantized MIMO system.

Since the channel matrix $\mathbf{H}$ needs to be estimated at the receiver, we make the first $T_1$ symbols of the block of $T$ symbols serve as pilot sequences. The remaining $T_2 = T - T_1$ symbols are used for data transmissions. This setting is equivalent to partitioning $\mathbf{X}$ as $\mathbf{X} = [\mathbf{X}_1\ \mathbf{X}_2]$ with $\mathbf{X}_1 \in \mathbb{C}^{K\times T_1}$ and $\mathbf{X}_2 \in \mathbb{C}^{K\times T_2}$. The training and data phases are referred to as t-phase and d-phase, respectively. We assume that the matrix $\mathbf{X}_1$ (or $\mathbf{X}_2$) is composed of independent and identically distributed (i.i.d.) random variables $X_1$ ($X_2$) generated from a known probability distribution $P_{X_1}$ (or $P_{X_2}$), i.e.,

$$P_X(\mathbf{X}) = P_{X_1}(\mathbf{X}_1)\,P_{X_2}(\mathbf{X}_2) \tag{4}$$

with $P_{X_1}(\mathbf{X}_1) = \prod_{i,j} P_{X_1}(X_{1,ij})$ and $P_{X_2}(\mathbf{X}_2) = \prod_{i,j} P_{X_2}(X_{2,ij})$. Since the pilot and data symbols should appear on constellation points uniformly, the ensemble averages of $\{X_{1,ij}\}$ and $\{X_{2,ij}\}$ are assumed to be zero. In addition, we let $\sigma_{x_1}^2$ and $\sigma_{x_2}^2$ be the transmit powers during the t-phase and d-phase, respectively, i.e., $\mathsf{E}\{|X_{1,ij}|^2\} = \sigma_{x_1}^2$ and $\mathsf{E}\{|X_{2,ij}|^2\} = \sigma_{x_2}^2$.

Similarly, we assume that each entry $H_{ij}$ is drawn from the complex Gaussian distribution $\mathcal{N}_{\mathbb{C}}(0, \sigma_h^2)$, where $\sigma_h^2$ indicates the large-scale fading factor.^3 Let $P_H(H_{ij}) \equiv \mathcal{N}_{\mathbb{C}}(0, \sigma_h^2)$. Then we have

$$P_H(\mathbf{H}) = \prod_{i,j} P_H(H_{ij}). \tag{5}$$

^3 For ease of notation, we consider the case where all the transmitters have the same large-scale fading factor, but our results can be easily extended.

III. JCD ESTIMATION

Our focus is on the setting where the receiver knows the distributions of $\mathbf{H}$ and $\mathbf{X}$ but not their realizations. In the conventional pilot-only scheme, the receiver first uses $\widetilde{\mathbf{Y}}_1$ and $\mathbf{X}_1$ to generate an estimate of $\mathbf{H}$ and then uses the estimated channel for estimating the data $\mathbf{X}_2$ from $\widetilde{\mathbf{Y}}_2$ [8]. In contrast to the pilot-only scheme, we consider JCD estimation, where the BS estimates both $\mathbf{H}$ and $\mathbf{X}_2$ from $\widetilde{\mathbf{Y}}$ given $\mathbf{X}_1$. We treat the problem in the framework of Bayesian inference [16].

To this end, we first define the likelihood, which is the distribution of the received signals under (2) conditional on the unknown parameters, as

$$P(\widetilde{\mathbf{Y}} \mid \mathbf{H}, \mathbf{X}) = \prod_{i=1}^{N}\prod_{j=1}^{T} P_{\rm out}\!\left(\widetilde{Y}_{ij} \,\middle|\, Z_{ij}\right), \tag{6}$$

where

$$P_{\rm out}\!\left(\widetilde{Y} \,\middle|\, Z\right) = \left(\frac{1}{\sqrt{\pi\sigma_w^2}} \int_{r_{b-1}}^{r_b} dy\, e^{-\frac{(y - \mathrm{Re}(Z))^2}{\sigma_w^2}}\right) \times \left(\frac{1}{\sqrt{\pi\sigma_w^2}} \int_{r_{b'-1}}^{r_{b'}} dy\, e^{-\frac{(y - \mathrm{Im}(Z))^2}{\sigma_w^2}}\right) \tag{7}$$

as $\mathrm{Re}(\widetilde{Y})$ and $\mathrm{Im}(\widetilde{Y})$ fall in the $b$-th and the $b'$-th bins, respectively, i.e., $\mathrm{Re}(\widetilde{Y}) = r_b$ and $\mathrm{Im}(\widetilde{Y}) = r_{b'}$. Let $\Phi(x) \triangleq \int_{-\infty}^{x} \mathrm{D}z$ with $\mathrm{D}z \triangleq \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}} dz$. Then (7) has the following closed-form expression

$$P_{\rm out}\!\left(\widetilde{Y} \,\middle|\, Z\right) = \Psi_b\big(\mathrm{Re}(Z)\big)\,\Psi_{b'}\big(\mathrm{Im}(Z)\big), \tag{8}$$

where

$$\Psi_b(x) \triangleq \Phi\!\left(\frac{\sqrt{2}\,(r_b - x)}{\sigma_w}\right) - \Phi\!\left(\frac{\sqrt{2}\,(r_{b-1} - x)}{\sigma_w}\right). \tag{9}$$

The prior distributions of $\mathbf{H}$ and $\mathbf{X}$ are given by (5) and (4), respectively. Then the posterior probability can be computed according to Bayes’ rule:

$$P(\mathbf{H}, \mathbf{X} \mid \widetilde{\mathbf{Y}}) = \frac{P(\widetilde{\mathbf{Y}} \mid \mathbf{H}, \mathbf{X})\, P_H(\mathbf{H})\, P_X(\mathbf{X})}{P(\widetilde{\mathbf{Y}})}, \tag{10}$$

where $P(\widetilde{\mathbf{Y}}) = \int_{\mathbf{H}}\int_{\mathbf{X}} d\mathbf{H}\, d\mathbf{X}\, P(\widetilde{\mathbf{Y}} \mid \mathbf{H}, \mathbf{X})\, P_H(\mathbf{H})\, P_X(\mathbf{X})$ is the marginal likelihood.

Given the posterior probability, an estimate for $X_{ij}$ can be obtained by the posterior mean

$$\widehat{X}_{ij} = \int dX_{ij}\, \mathscr{P}(X_{ij})\, X_{ij}, \tag{11}$$

where

$$\mathscr{P}(X_{ij}) = \int_{\mathbf{H}} \int_{\mathbf{X}\setminus X_{ij}} d\mathbf{H}\, d\mathbf{X}\, P(\mathbf{H}, \mathbf{X} \mid \widetilde{\mathbf{Y}}) \tag{12}$$

is the marginal posterior probability of $X_{ij}$. Here, the notation $\int_{\mathbf{X}\setminus X_{ij}} d\mathbf{X}$ denotes the integration over all the variables in $\mathbf{X}$ except for $X_{ij}$. If the MSE of an estimate $\widehat{\mathbf{X}}$ w.r.t. $\mathbf{X}$ is defined as

$$\mathrm{mse}(\mathbf{X}_t \mid \widetilde{\mathbf{Y}}) = \int_{\mathbf{H}}\int_{\mathbf{X}} d\mathbf{H}\, d\mathbf{X}\, P(\mathbf{H}, \mathbf{X} \mid \widetilde{\mathbf{Y}})\, \big\|\widehat{\mathbf{X}}_t - \mathbf{X}_t\big\|_{\rm F}^2, \tag{13}$$

for $t = 1, 2$, then the posterior mean estimator (11) gives the minimum MSE (MMSE) [16]. Notice that given a known pilot matrix $\bar{\mathbf{X}}_1$, i.e., $P_{X_1}(\mathbf{X}_1) = \delta(\mathbf{X}_1 - \bar{\mathbf{X}}_1)$, we can easily obtain $\widehat{X}_{1,ij} = \bar{X}_{1,ij}$ from (11). In this case, we have $\mathrm{mse}(\mathbf{X}_1 \mid \widetilde{\mathbf{Y}}) = 0$.

Similarly, the Bayes estimate of $H_{ij}$ is given by

$$\widehat{H}_{ij} = \int dH_{ij}\, \mathscr{P}(H_{ij})\, H_{ij}, \tag{14}$$

where $\mathscr{P}(H_{ij}) = \int_{\mathbf{H}\setminus H_{ij}}\int_{\mathbf{X}} d\mathbf{H}\, d\mathbf{X}\, P(\mathbf{H}, \mathbf{X} \mid \widetilde{\mathbf{Y}})$ denotes the marginal posterior probability of $H_{ij}$. The estimate $\widehat{\mathbf{H}}$ also minimizes the MSE

$$\mathrm{mse}(\mathbf{H} \mid \widetilde{\mathbf{Y}}) = \int_{\mathbf{H}}\int_{\mathbf{X}} d\mathbf{H}\, d\mathbf{X}\, P(\mathbf{H}, \mathbf{X} \mid \widetilde{\mathbf{Y}})\, \big\|\widehat{\mathbf{H}} - \mathbf{H}\big\|_{\rm F}^2. \tag{15}$$

Hereafter, we will refer to (11) and (14) as the Bayes-optimal estimator.

Although the Bayes-optimal estimator provides the MMSE estimates, direct computations of (11) and (14) are intractable due to the high-dimensional integrals involved in the marginal posteriors $\mathscr{P}(X_{ij})$ and $\mathscr{P}(H_{ij})$. BP [17] provides a practical alternative to approximate these marginal posteriors. In the recent compressed sensing literature, the Bayesian framework in combination with a BP algorithm has given rise to the so-called approximate message passing (AMP) algorithm [18] and the generalized AMP (GAMP) [19]. Applying this development to our context of the MIMO system means that when $\mathbf{H}$ is perfectly known, GAMP can provide a tractable way to approximate the marginal posteriors $\mathscr{P}(X_{ij})$. Remarkably, it has been proved that the approximations become exact in the large-system limit for a dense matrix $\mathbf{H}$ with sub-Gaussian entries. More recently, Parker et al. in [15] applied the same strategy of GAMP to the problem of reconstructing matrices from bilinear noisy observations (i.e., reconstructing $\mathbf{H}$ and $\mathbf{X}$ from $\mathbf{Y}$), which is referred to as bilinear GAMP (BiG-AMP). The BiG-AMP scheme can be applied to tackle the Bayes-optimal JCD estimator, and we can adapt it to be used in the quantized MIMO setting. We call the developed algorithm the GAMP-based JCD algorithm. Due to space limitations, we omit the details of the algorithm development in this paper, while we will show its simulation performances later in Section V.

IV. PERFORMANCE ANALYSIS

Knowing the theoretical lower bound of the estimate is useful to assess any developed algorithm. Therefore, in this section, our objective is to derive analytical results for the average MSEs of $\mathbf{X}_2$ and $\mathbf{H}$ for the Bayes-optimal JCD estimator, i.e., $\mathrm{mse}_{X_t} \triangleq \mathsf{E}\{\mathrm{mse}(\mathbf{X}_t \mid \widetilde{\mathbf{Y}})\}$ and $\mathrm{mse}_H \triangleq \mathsf{E}\{\mathrm{mse}(\mathbf{H} \mid \widetilde{\mathbf{Y}})\}$. Our analysis investigates the high-dimensional regime where $N, K, T \to \infty$ but the ratios $N/K = \alpha$, $T/K = \beta$, $T_t/K = \beta_t$, for $t = 1, 2$, are fixed and finite. For convenience, we simply use $K \to \infty$ (referred to as the large-system limit) to denote this high-dimensional limit. Following the argument of [20,21], it can be shown that $\mathrm{mse}_{X_t}$ and $\mathrm{mse}_H$ are saddle points of the average free entropy

$$\Phi \triangleq \frac{1}{K^2}\, \mathsf{E}_{\widetilde{\mathbf{Y}}}\!\left\{\log P(\widetilde{\mathbf{Y}})\right\}, \tag{16}$$

where $P(\widetilde{\mathbf{Y}})$ denotes the marginal likelihood in (10), namely the partition function. The major difficulty in computing (16) is the expectation of the logarithm of $P(\widetilde{\mathbf{Y}})$, which, nevertheless, can be facilitated by rewriting $\Phi$ as [22]

$$\Phi = \frac{1}{K^2} \lim_{\tau\to 0} \frac{\partial}{\partial\tau} \log \mathsf{E}_{\widetilde{\mathbf{Y}}}\!\left\{P^{\tau}(\widetilde{\mathbf{Y}})\right\}. \tag{17}$$

The expectation operator is moved inside the log-function. We first evaluate $\mathsf{E}_{\widetilde{\mathbf{Y}}}\{P^{\tau}(\widetilde{\mathbf{Y}})\}$ for an integer-valued $\tau$, and then generalize the result to any positive real number $\tau$. This technique is called the replica method, and has been widely adopted in the field of statistical physics [22] and in the information theory literature, e.g., [12,23–29]. Under the assumption of replica symmetry (RS), the following results are obtained.

Proposition 1: As $K \to \infty$, the asymptotic MSEs w.r.t. $\mathbf{X}_t$ and $\mathbf{H}$ are associated with the MSEs for the scalar Gaussian channels:

$$Y_{\tilde{q}_H} = \sqrt{\tilde{q}_H}\, H + W_H, \tag{18}$$

$$Y_{\tilde{q}_{X_t}} = \sqrt{\tilde{q}_{X_t}}\, X_t + W_X, \tag{19}$$

where $W_H, W_X \sim \mathcal{N}_{\mathbb{C}}(0,1)$ are independent of $H \sim P_H$ and $X_t \sim P_{X_t}$. Here, the parameters $\tilde{q}_H$ and $\tilde{q}_{X_t}$ are the solutions to the set of fixed-point equations

$$\tilde{q}_H = \sum_{t=1}^{2} \beta_t\, q_{X_t}\, \chi_t, \tag{20a}$$

$$\tilde{q}_{X_t} = \alpha\, q_H\, \chi_t, \tag{20b}$$

$$q_H = c_H - \mathrm{mse}_H, \tag{20c}$$

$$q_{X_t} = c_{X_t} - \mathrm{mse}_{X_t}, \tag{20d}$$

where $\mathrm{mse}_H = \mathsf{E}\{|H - \mathsf{E}\{H \mid Y_{\tilde{q}_H}\}|^2\}$ and $\mathrm{mse}_{X_t} = \mathsf{E}\{|X_t - \mathsf{E}\{X_t \mid Y_{\tilde{q}_{X_t}}\}|^2\}$ are the asymptotic MSEs w.r.t. $\mathbf{H}$ and $\mathbf{X}_t$, respectively, $c_{X_t} \triangleq \mathsf{E}\{|X_t|^2\} = \sigma_{x_t}^2$, $c_H \triangleq \mathsf{E}\{|H|^2\} = \sigma_h^2$, and

$$\chi_t \triangleq \sum_{b=1}^{2^B} \int \mathrm{D}v\, \frac{\left(\Psi'_b\!\left(\sqrt{q_H q_{X_t}}\, v\right)\right)^2}{\Psi_b\!\left(\sqrt{q_H q_{X_t}}\, v\right)}, \tag{21}$$

with

$$\Psi_b(V_t) \triangleq \Phi\!\left(\frac{\sqrt{2}\, r_b - V_t}{\sqrt{\sigma_w^2 + c_H c_{X_t} - q_H q_{X_t}}}\right) - \Phi\!\left(\frac{\sqrt{2}\, r_{b-1} - V_t}{\sqrt{\sigma_w^2 + c_H c_{X_t} - q_H q_{X_t}}}\right), \tag{22}$$

and

$$\Psi'_b(V_t) \triangleq \frac{\partial \Psi_b(V_t)}{\partial V_t} = \frac{e^{-\frac{(\sqrt{2}\, r_{b-1} - V_t)^2}{2(\sigma_w^2 + c_H c_{X_t} - q_H q_{X_t})}} - e^{-\frac{(\sqrt{2}\, r_b - V_t)^2}{2(\sigma_w^2 + c_H c_{X_t} - q_H q_{X_t})}}}{\sqrt{2\pi\left(\sigma_w^2 + c_H c_{X_t} - q_H q_{X_t}\right)}}. \tag{23}$$

In the t-phase, i.e., $t = 1$, the pilot matrix $\mathbf{X}_1$ is known. Thus, we substitute $\mathrm{mse}_{X_1} = 0$ into the above expressions.

Proof: An outline of the proof is given in the appendix.

The above result reveals that in the large-system limit, the performance of the quantized MIMO system employing the Bayes-optimal JCD estimator can be fully characterized by the equivalent scalar Gaussian channels (18) and (19). For example, the achievable rate under separate decoding (SD) is the mutual information between $Y_{\tilde{q}_{X_t}}$ and $X_t$ for the scalar Gaussian channel (19). Note that in contrast to joint detection and decoding, SD involves joint multiuser detection followed by a bank of independent decoders. Also, the corresponding MSE w.r.t. $\mathbf{H}$ can be evaluated through the scalar Gaussian channel (18). Specifically, if the signal is drawn from a quadrature phase shift keying (QPSK) constellation, the corresponding bit error rate (BER) reads

$$P_e = \mathcal{Q}\!\left(\sqrt{\tilde{q}_{X_2}}\right), \tag{24}$$

where $\mathcal{Q}(x) \triangleq \int_x^{\infty} \mathrm{D}z$ is the Q function, the MSE w.r.t. the payload data $\mathbf{X}_2$ is given by

$$\mathrm{mse}_{X_2} = 1 - \int \mathrm{D}z\, \tanh\!\left(\tilde{q}_{X_2} + \sqrt{\tilde{q}_{X_2}}\, z\right), \tag{25}$$

and the corresponding MSE w.r.t. $\mathbf{H}$ is $\mathrm{mse}_H = \frac{\sigma_h^2}{1 + \sigma_h^2 \tilde{q}_H}$.

If the channel matrix $\mathbf{H}$ is perfectly known, the t-phase is not required, so $\beta_2 = \beta$ and $\beta_1 = 0$. Since there is only one phase in $\mathbf{X}$, we omit the phase index $t$ from all the concerned parameters in this case. Because $\mathbf{H}$ is perfectly known, we set $\mathrm{mse}_H = 0$. Plugging this into (20c), we immediately obtain $q_H = c_H = \sigma_h^2$, which leads to $q_H q_X = c_H q_X$ and $c_H c_X - q_H q_X = c_H(c_X - q_X) = c_H\, \mathrm{mse}_X$. It turns out that the equivalent signal-to-interference-plus-noise ratio (SINR) of the scalar Gaussian channel (19) is given by

$$\tilde{q}_X = \alpha\, c_H \sum_{b=1}^{2^B} \int \mathrm{D}v\, \frac{\left(\Psi'_b\!\left(\sqrt{c_H q_X}\, v\right)\right)^2}{\Psi_b\!\left(\sqrt{c_H q_X}\, v\right)}, \tag{26}$$

and the asymptotic MSE w.r.t. $\mathbf{X}$ is given by $\mathrm{mse}_X = \mathsf{E}\{|X - \mathsf{E}\{X \mid Y_{\tilde{q}_X}\}|^2\}$. If QPSK is used, the MSE in (25) together with $\tilde{q}_X$ in (26) agree with [6, (7) & (8)]. More precisely, in [6], the real-valued system with BPSK signal is considered. In this case, $\sqrt{2}\, r_b$ in our paper should be replaced by $r_b$.

V. NUMERICAL RESULTS

To verify the accuracy of our analytical results, we compare the analytical BER expression (24) with that obtained by simulations (performed by the GAMP-based JCD algorithm) under the quantized MIMO system with QPSK constellation. The simulation results are obtained by averaging over 10,000 channel realizations. The parameters of the system are set as follows: $K = 50$, $N = 200$, $T_1 = 50$, and $T_2 = 450$. The pilot sequences of length $T_1$ are randomly generated. In all the following simulations, we use the typical uniform quantizer with the quantization step-size $\Delta = \sqrt{0.25}$. Note that we do not optimize the quantization step-size but select a good one for general scenarios. We leave the related issue to our future work.

Figure 2 shows the corresponding BER results for the cases of a) perfect CSIR and b) no CSIR. We observe that the analytical BER expression (24) generally predicts well the behavior of the GAMP-based JCD algorithm. For the case with no CSIR, the GAMP-based JCD algorithm does not work as well as predicted by the analytical result at low SNRs. This is likely because the GAMP-based JCD algorithm is only an approximation to the Bayes-optimal JCD estimator. This gap motivates the search for improved estimators in the future. In addition, from Figure 2, we see that the performance degradation due to 3-bit quantization is small. For instance, if we take as target the SNR attained by the unquantized system with perfect CSIR at $\mathrm{BER} = 10^{-3}$, the 3-bit Bayes-optimal JCD estimator only incurs a loss of 1.19 dB. Even with 2-bit quantization, the loss of 2.8 dB remains acceptable.

[Figure: two panels, (a) and (b); BER (log scale, $10^{0}$ to $10^{-6}$) versus SNR (dB, 0 to 15); curves for 1-bit, 2-bit, 3-bit, and unquantized receivers. Annotated SNR values in the plots: 3.79, 4.31, 4.40, 4.98, 5.74, 6.59, 10.93, and 14.20 dB.]
Fig. 2. BER versus $\mathrm{SNR} = 1/\sigma_w^2$ for QPSK constellations. In the results, the JCD estimation scheme is used under the settings with a) perfect CSIR and b) no CSIR. Curves denote analytical results and markers denote Monte-Carlo simulation results achieved by the GAMP-based JCD algorithm.

Comparing Figures 2(a) and 2(b), we see that the loss due to no CSIR is small for the proposed JCD estimator. Therefore, it is of interest to evaluate the improvement due to the JCD estimation. Following the same system parameters as before, Figure 3 compares the BERs under the conventional pilot-only scheme and the proposed JCD estimation scheme. For the pilot-only scheme, we adopt the receiver structure of [8], which employs the least squares (LS) channel estimate for the quantized MIMO system. However, unlike [8], we then employ the GAMP algorithm for payload data detection based on the estimated channel. Therefore, the BERs of the pilot-only scheme shown in Figure 3 are expected to be better than those obtained with the suboptimal criteria of [8]. Even so, as can be seen from Figure 3, the JCD estimation still shows a large improvement over the pilot-only scheme.

[Figure: BER (log scale, $10^{0}$ to $10^{-5}$) versus SNR (dB, 0 to 15); two groups of curves, "Pilot-only" and "JCD", each for 1-bit, 2-bit, and 3-bit receivers.]
Fig. 3. Average BER versus SNR for QPSK constellations. In the results, the GAMP-based JCD algorithm and pilot-only scheme are used. Plots are based on Monte-Carlo simulation results.

Finally, we compare the achievable rates as functions of the antenna ratio $\alpha = N/K$ for the Bayes-optimal JCD estimator and the pilot-only scheme under different quantization precisions in Figure 4. Note that unlike the QPSK signals used in previous simulations, we consider a Gaussian signal, i.e., $X_2 \sim \mathcal{N}_{\mathbb{C}}(0,1)$, in this experiment. We observe that the achievable rates of all the quantization precisions increase with the number of receive antennas, even for the 1-bit receivers. This implies that the use of high-order modulation schemes is also possible in 1-bit MIMO systems, which shares the same view as [5]. This property is quite different from the quantized SISO system, where the achievable rate is always upper bounded by $B$ bits in a $B$-bit receiver [4]. If we fix the achievable rate to 7 bps/Hz, the number of receive antennas for the 3-bit Bayes-optimal JCD estimator is only about $\frac{21.8}{15.4} \approx 1.4$ times that of the unquantized Bayes-optimal JCD estimator (the benchmark receiver), while even with unquantized receivers, the pilot-only scheme requires about $\frac{65.5}{15.4} \approx 4.3$ times the number of receive antennas of the benchmark receiver. The penalty of a 1.4-fold increase in antenna numbers with very low precision ADCs seems to be quite acceptable.

[Figure: two panels, (a) and (b); achievable rate (bps/Hz, 0 to 10) versus $\alpha$ (0 to 100); curves for unquantized, 3-bit, 2-bit, and 1-bit receivers.]
Fig. 4. The achievable rates as functions of $\alpha = N/K$ for a) the Bayes-optimal JCD estimator and b) the pilot-only scheme. $\beta = 10$, $\beta_1 = 1$, $\sigma_w^2 = 10^{-1}$, and $X_{2,ij} \sim \mathcal{N}_{\mathbb{C}}(0,1)$.

VI. CONCLUSION

We have developed a framework for studying the best possible estimation performance of the quantized MIMO system. In particular, we used the Bayes-optimal inference for the JCD estimation and realized this estimation by applying the BiG-AMP technique. Additionally, the asymptotic performances (e.g., MSEs) w.r.t. the channels and the payload data are derived in the large-system limit. A set of Monte-Carlo simulations was conducted to illustrate that our analytical results provide an accurate prediction for the performances of the Bayes-optimal JCD estimator. The numerical results have also revealed that the JCD estimation scheme provides tremendous improvement over the conventional pilot-only scheme.

APPENDIX A: PROOF OF PROPOSITION 1

As stated in Section IV, the MSEs of interest are saddle points of the average free entropy (16). However, direct calculation is very difficult. Thus, we resort to the replica method by computing the replicated partition function $\mathsf{E}_{\widetilde{\mathbf{Y}}}\{P^{\tau}(\widetilde{\mathbf{Y}})\}$ in (17), which with the definition of (10) can be expressed as

$$\mathsf{E}_{\widetilde{\mathbf{Y}}}\!\left\{P^{\tau}(\widetilde{\mathbf{Y}})\right\} = \mathsf{E}_{\mathcal{H},\mathcal{X}}\!\left\{\int d\widetilde{\mathbf{Y}} \prod_{a=0}^{\tau} P_{\rm out}\!\left(\widetilde{\mathbf{Y}} \,\middle|\, \mathbf{Z}^{(a)}\right)\right\}, \tag{27}$$

where we define $\mathbf{Z}^{(a)} \triangleq \mathbf{H}^{(a)}\mathbf{X}^{(a)}/\sqrt{K}$ with $\mathbf{H}^{(a)}$ and $\mathbf{X}^{(a)}$ being the $a$-th replica of $\mathbf{H}$ and $\mathbf{X}$, respectively, and the notations $\mathcal{X} \triangleq \{\mathbf{X}^{(a)}, \forall a\}$ and $\mathcal{H} \triangleq \{\mathbf{H}^{(a)}, \forall a\}$. Here, $(\mathbf{H}^{(a)}, \mathbf{X}^{(a)})$ are random matrices taken from the distribution $(P_H, P_X)$ for $a = 0, 1, \ldots, \tau$. In addition, $\int d\widetilde{\mathbf{Y}}$ denotes the integral w.r.t. a discrete measure because the quantized output $\widetilde{\mathbf{Y}}$ takes values in a finite set. Next, we will focus on calculating the right-hand side of (27), which can be done by applying the techniques in [14,20,21] after additional manipulations.

First, in order to average over $(\mathcal{H}, \mathcal{X})$, we introduce two $(\tau+1)\times(\tau+1)$ matrices $\mathbf{Q}_H = [Q_H^{ab}]$ and $\mathbf{Q}_{X_t} = [Q_{X_t}^{ab}]$ whose elements are defined by $Q_H^{ab} = \mathbf{h}_n^{(b)}(\mathbf{h}_n^{(a)})^{\dagger}/K$ and $Q_{X_t}^{ab} = (\mathbf{x}_j^{(a)})^{\dagger}\mathbf{x}_j^{(b)}/K$. Here, $\mathbf{h}_n^{(a)}$ denotes the $n$th row vector of $\mathbf{H}^{(a)}$, $\mathbf{x}_j^{(a)}$ denotes the $j$th column vector of $\mathbf{X}^{(a)}$ corresponding to phase block $t$, and $\mathcal{T}_t$ for $t = 1, 2$ represents the set of all symbol indices in phase block $t$. The definitions of $\mathbf{Q}_H$ and $\mathbf{Q}_{X_t}$ are equivalent to

$$1 = \int \prod_{n=1}^{N} \prod_{0\le a\le b}^{\tau} \delta\!\left(\mathbf{h}_n^{(b)}(\mathbf{h}_n^{(a)})^{\dagger} - K Q_H^{ab}\right) dQ_H^{ab},$$

$$1 = \int \prod_{t=1}^{2} \prod_{j\in\mathcal{T}_t} \prod_{0\le a\le b}^{\tau} \delta\!\left((\mathbf{x}_j^{(a)})^{\dagger}\mathbf{x}_j^{(b)} - K Q_{X_t}^{ab}\right) dQ_{X_t}^{ab},$$

where $\delta(\cdot)$ denotes Dirac’s delta. Let $\mathcal{Q}_X \triangleq \{\mathbf{Q}_{X_t}, \forall t\}$ and $\mathcal{Z} \triangleq \{\mathbf{Z}^{(a)}, \forall a\}$. Inserting the above into (27) yields

$$\mathsf{E}_{\widetilde{\mathbf{Y}}}\!\left\{P^{\tau}(\widetilde{\mathbf{Y}})\right\} = \int e^{K^2 \mathcal{G}^{(\tau)}}\, d\mu_H^{(\tau)}(\mathbf{Q}_H)\, d\mu_X^{(\tau)}(\mathcal{Q}_X), \tag{28}$$

where

$$\mathcal{G}^{(\tau)}(\mathbf{Q}_Z) \triangleq \frac{1}{K^2} \log \mathsf{E}_{\mathcal{Z}}\!\left\{\int d\widetilde{\mathbf{Y}} \prod_{a=0}^{\tau} P_{\rm out}\!\left(\widetilde{\mathbf{Y}} \,\middle|\, \mathbf{Z}^{(a)}\right)\right\},$$

$$\mu_H^{(\tau)}(\mathbf{Q}_H) \triangleq \mathsf{E}_{\mathcal{H}}\!\left\{\prod_{n,a,b} \delta\!\left(\mathbf{h}_n^{(b)}(\mathbf{h}_n^{(a)})^{\dagger} - K Q_H^{ab}\right)\right\},$$

$$\mu_X^{(\tau)}(\mathcal{Q}_X) \triangleq \mathsf{E}_{\mathcal{X}}\!\left\{\prod_{t,j,a,b} \delta\!\left((\mathbf{x}_j^{(a)})^{\dagger}\mathbf{x}_j^{(b)} - K Q_{X_t}^{ab}\right)\right\}.$$

Then, using the Fourier representation of the $\delta$ function and computing the integrals by the saddle point method, we attain

$$\frac{1}{K^2} \log \mathsf{E}_{\widetilde{\mathbf{Y}}}\!\left\{P^{\tau}(\widetilde{\mathbf{Y}})\right\} = \mathop{\rm Extr}_{\mathbf{Q}_H, \mathcal{Q}_X, \tilde{\mathbf{Q}}_H, \tilde{\mathcal{Q}}_X} \left\{\Phi^{(\tau)}\right\} \tag{29}$$

with

$$\Phi^{(\tau)} \triangleq \frac{1}{K^2} \log \mathsf{E}_{\mathcal{Z}}\!\left\{\prod_{n,t,j\in\mathcal{T}_t} \int dy_{n,j} \prod_{a} P_{\rm out}\!\left(y_{n,j} \,\middle|\, z_{n,j}^{(a)}\right)\right\} + \frac{1}{K^2} \log \mathsf{E}_{\mathcal{H}}\!\left\{\prod_{n} e^{\mathrm{tr}\left(\tilde{\mathbf{Q}}_H \mathbf{H}_n^{\dagger}\mathbf{H}_n\right)}\right\} - \alpha\, \mathrm{tr}\!\left(\tilde{\mathbf{Q}}_H \mathbf{Q}_H\right) + \frac{1}{K^2} \log \mathsf{E}_{\mathcal{X}}\!\left\{\prod_{t} e^{\mathrm{tr}\left(\tilde{\mathbf{Q}}_{X_t} \mathbf{X}_t^{\dagger}\mathbf{X}_t\right)}\right\} - \sum_{t} \beta_t\, \mathrm{tr}\!\left(\tilde{\mathbf{Q}}_{X_t} \mathbf{Q}_{X_t}\right),$$

where $\mathrm{Extr}_x\{f(x)\}$ represents the extreme value of $f(x)$ w.r.t. $x$, $\tilde{\mathbf{Q}}_H = [\tilde{Q}_H^{ab}] \in \mathbb{C}^{(\tau+1)\times(\tau+1)}$ and $\tilde{\mathcal{Q}}_X \triangleq \{\tilde{\mathbf{Q}}_{X_t} = [\tilde{Q}_{X_t}^{ab}] \in \mathbb{C}^{(\tau+1)\times(\tau+1)}, \forall t\}$. According to (17), the average free entropy turns out to be $\Phi = \lim_{\tau\to 0} \frac{\partial}{\partial\tau} \mathrm{Extr}_{\mathbf{Q}_H, \mathcal{Q}_X, \tilde{\mathbf{Q}}_H, \tilde{\mathcal{Q}}_X}\{\Phi^{(\tau)}\}$.

The saddle points of $\Phi^{(\tau)}$ can be obtained by seeking the point of zero gradient w.r.t. $\{\mathbf{Q}_H, \mathbf{Q}_{X_t}, \tilde{\mathbf{Q}}_H, \tilde{\mathbf{Q}}_{X_t}\}$. However, in doing so, it is prohibitive to get explicit expressions for the saddle points. Therefore, we assume that the saddle points follow the RS form [22] as $\mathbf{Q}_H = (c_H - q_H)\mathbf{I} + q_H \mathbf{1}$, $\tilde{\mathbf{Q}}_H = (\tilde{c}_H - \tilde{q}_H)\mathbf{I} + \tilde{q}_H \mathbf{1}$, $\mathbf{Q}_{X_t} = (c_{X_t} - q_{X_t})\mathbf{I} + q_{X_t}\mathbf{1}$, $\tilde{\mathbf{Q}}_{X_t} = (\tilde{c}_{X_t} - \tilde{q}_{X_t})\mathbf{I} + \tilde{q}_{X_t}\mathbf{1}$, where $\mathbf{I}$ denotes the identity matrix and $\mathbf{1}$ denotes the all-one matrix. In addition, the application of the central limit theorem suggests that the $\mathbf{z}_{n,j} \triangleq [z_{n,j}^{(0)}\, z_{n,j}^{(1)} \cdots z_{n,j}^{(\tau)}]^{T}$ are Gaussian random vectors with $(\tau+1)\times(\tau+1)$ covariance matrix $\mathbf{Q}_{Z_t}$. If $j \in \mathcal{T}_t$, then the $(a,b)$th entry of $\mathbf{Q}_{Z_t}$ is given by

$$(z_{n,j}^{(a)})^{*}\, z_{n,j}^{(b)} = Q_H^{ab}\, Q_{X_t}^{ab} \triangleq Q_{Z_t}^{ab}. \tag{30}$$

Therefore, we set $\mathbf{Q}_{Z_t} = (c_H c_{X_t} - q_H q_{X_t})\mathbf{I} + q_H q_{X_t}\mathbf{1}$, which is equivalent to introducing the Gaussian random variable $z_{n,j}^{(a)}$ for $j \in \mathcal{T}_t$ as

$$z_{n,j}^{(a)} = \sqrt{c_H c_{X_t} - q_H q_{X_t}}\; u^{(a)} + \sqrt{q_H q_{X_t}}\; v \tag{31}$$

for $a = 0, 1, \ldots, \tau$, where $u^{(a)}$ and $v$ are independent standard complex Gaussian random variables.

Substituting these RS expressions into $\Phi^{(\tau)}$ leads to the RS expression of $\Phi^{(\tau)}$. With the RS, we only have to determine the parameters $\{c_H, q_H, c_{X_t}, q_{X_t}, \tilde{c}_H, \tilde{q}_H, \tilde{c}_{X_t}, \tilde{q}_{X_t}\}$, which can be obtained by equating the corresponding partial derivatives of $\Phi^{(\tau)}$ to zero. In doing so, as $\tau \to 0$, we get that $\tilde{c}_H = 0$, $\tilde{c}_{X_t} = 0$, $c_H = \mathsf{E}\{|H|^2\}$, $c_{X_t} = \mathsf{E}\{|X_t|^2\}$, and the other parameters $\{q_H, q_{X_t}, \tilde{q}_H, \tilde{q}_{X_t}\}$ are given by (20) in Proposition 1. Let $\mathrm{mse}_H = c_H - q_H$ and $\mathrm{mse}_{X_t} = c_{X_t} - q_{X_t}$. After taking these into account, we obtain the RS expression of the average free entropy as

$$\Phi = \alpha \sum_{t} \beta_t \left(\sum_{b} \int \mathrm{D}v\, \Psi_b(V_t) \log \Psi_b(V_t)\right) - \alpha\, I(H; Z_H \mid \tilde{q}_H) - \sum_{t=1}^{2} \beta_t\, I(X_t; Z_{X_t} \mid \tilde{q}_{X_t}) + \alpha\,(c_H - q_H)\,\tilde{q}_H + \sum_{t=1}^{2} \beta_t\,(c_{X_t} - q_{X_t})\,\tilde{q}_{X_t}, \tag{32}$$

where we have defined $V_t \triangleq \sqrt{q_H q_{X_t}}\, v$, $\Psi_b(V_t)$ is given by (22), and the notation $I(X; Z \mid q)$ is used to denote the mutual information between $X$ and $Z$ over the Gaussian scalar channel $Z = \sqrt{q}\, X + W$ with $W \sim \mathcal{N}_{\mathbb{C}}(0,1)$. Note that the parameters $\{q_H, q_{X_t}, \tilde{q}_H, \tilde{q}_{X_t}\}$ given by (20) are saddle points of (32).

REFERENCES

[1] T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas,” IEEE Trans. Wireless Commun., vol. 9, no. 11, pp. 3590–3600, Nov. 2010.
[2] E. G. Larsson, F. Tufvesson, O. Edfors, and T. L. Marzetta, “Massive MIMO for next generation wireless systems,” IEEE Commun. Mag., vol. 52, no. 2, pp. 186–195, Feb. 2014.
[3] J. G. Andrews, S. Buzzi, W. Choi, S. Hanly, A. Lozano, A. C. K. Soong, and J. Zhang, “What will 5G be?” IEEE J. Sel. Areas Commun., vol. 32, no. 6, pp. 1065–1082, June 2014.
[4] J. Singh, O. Dabeer, and U. Madhow, “On the limits of communication with low-precision analog-to-digital conversion at the receiver,” IEEE Trans. Commun., vol. 57, no. 12, pp. 3629–3639, Dec. 2009.
[5] J. Mo and R. W. Heath Jr, “Capacity analysis of one-bit quantized MIMO systems with transmitter channel state information,” preprint, 2014. [Online]. Available: http://arxiv.org/abs/1410.7353
[6] K. Nakamura and T. Tanaka, “Performance analysis of signal detection using quantized received signals of linear vector channel,” in Proc. Inter. Symp. Inform. Theory and its Applications (ISITA), Auckland, New Zealand, Dec. 2008.
[7] A. Mezghani and J. Nossek, “Belief propagation based MIMO detection operating on quantized channel output,” in Proc. IEEE Int. Symp. Inform. Theory (ISIT), Austin, TX, 13-18 June 2010, pp. 2113–2117.
[8] C. Risi, D. Persson, and E. G. Larsson, “Massive MIMO with 1-bit ADC,” preprint, 2014. [Online]. Available: http://arxiv.org/abs/1404.7736.
[9] S. Wang, Y. Li, and J. Wang, “Multiuser detection in massive spatial modulation MIMO with low-resolution ADCs,” IEEE Trans. Wireless Commun., 2015.
[10] J. Choi, D. J. Love, D. R. Brown III, M. Boutin, “Distributed reception with spatial multiplexing: MIMO systems for the internet of things,” preprint, 2014. [Online]. Available: http://arxiv.org/abs/1409.7850
[11] Y. Xu, Y. Kabashima, and Lenka Zdeborová, “Bayesian signal reconstruction for 1-bit compressed sensing,” J. Stat. Mech., no. 11, p. P11015, 2014.
[12] K. Takeuchi, R. R. Müller, M. Vehkaperä, and T. Tanaka, “On an achievable rate of large Rayleigh block-fading MIMO channels with no CSI,” IEEE Trans. Inf. Theory, vol. 59, no. 10, pp. 6517–6541, Oct. 2013.
[13] J. Ma and L. Ping, “Data-aided channel estimation in large antenna systems,” IEEE Trans. Signal Processing, vol. 62, no. 12, pp. 3111–3124, June 2014.
[14] C.-K. Wen, Y. Wu, K.-K. Wong, R. Schober, and P. Ting, “Performance limits of massive MIMO systems based on Bayes-optimal inference,” in IEEE Int. Conf. Commun. (ICC), London, UK, 2015.
[15] J. T. Parker, P. Schniter, and V. Cevher, “Bilinear generalized approximate message passing,” IEEE Trans. Sig. Proc., vol. 62, no. 22, pp. 5839–5853, Nov. 2014.
[16] H. V. Poor, An Introduction to Signal Detection and Estimation. New York: Springer-Verlag, 1994.
[17] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1 ed., 1988.
[18] D. L. Donoho, A. Maleki, and A. Montanari, “Message passing algorithms for compressed sensing,” Proc. Nat. Acad. Sci., vol. 106, no. 45, pp. 18914–18919, 2009.
[19] S. Rangan, “Generalized approximate message passing for estimation with random linear mixing,” ArXiv e-prints: 1010.5141v2 [cs.IT], 2010.
[20] F. Krzakala, M. Mézard, and L. Zdeborová, “Phase diagram and approximate message passing for blind calibration and dictionary learning,” in Proc. IEEE Int. Symp. Inform. Theory (ISIT), Istanbul, Turkey, July 2013, pp. 659–663.
[21] Y. Kabashima, F. Krzakala, M. Mézard, A. Sakata, and L. Zdeborová, “Phase transitions and sample complexity in Bayes-optimal matrix factorization,” preprint 2014. [Online]. Available: http://arxiv.org/abs/1402.1298.
[22] H. Nishimori, Statistical Physics of Spin Glasses and Information Processing: An Introduction. ser. Number 111 in Int. Series on Monographs on Physics. Oxford U.K.: Oxford Univ. Press, 2001.
[23] T. Tanaka, “A statistical-mechanics approach to large-system analysis of CDMA multiuser detectors,” IEEE Trans. Inf. Theory, vol. 48, no. 11, pp. 2888–2910, Nov. 2002.
[24] A. L. Moustakas, S. H. Simon, and A. M. Sengupta, “MIMO capacity through correlated channels in the presence of correlated interferers and noise: a (not so) large N analysis,” IEEE Trans. Inf. Theory, vol. 49, no. 10, pp. 2545–2561, Oct. 2003.
[25] D. Guo and S. Verdú, “Randomly spread CDMA: asymptotics via statistical physics,” IEEE Trans. Inf. Theory, vol. 51, no. 1, pp. 1982–2010, Jun. 2005.
[26] R. R. Müller, “Channel capacity and minimum probability of error in large dual antenna array systems with binary modulation,” IEEE Trans. Sig. Proc., vol. 51, no. 11, pp. 2821–2828, Nov. 2003.
[27] C.-K. Wen and K.-K. Wong, “Asymptotic analysis of spatially correlated MIMO multiple-access channels with arbitrary signaling inputs for joint and separate decoding,” IEEE Trans. Inf. Theory, vol. 53, no. 1, pp. 252–268, Jan. 2007.
[28] A. Hatabu, K. Takeda, and Y. Kabashima, “Statistical mechanical analysis of the Kronecker channel model for multiple-input multiple output wireless communication,” Phys. Rev. E, vol. 80, pp. 061124(1–12), 2009.
[29] M. A. Girnyk, M. Vehkaperä, L. K. Rasmussen, “Large-system analysis of correlated MIMO channels with arbitrary signaling in the presence of interference,” IEEE Trans. Wireless Commun., vol. 13, no. 4, pp. 1536–1276, Apr. 2014.
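
As an illustration of the uniform quantizer of Section II, the following minimal Python sketch implements the thresholds (3) and the complex-valued quantizer of (2). The function names are hypothetical and not taken from the paper. One assumption is made for the top bin $b = 2^B$, where $r_{2^B} = \infty$: its output level is taken as $(-2^{B-1} + b - \frac{1}{2})\Delta$, which coincides with $r_b - \Delta/2$ for every finite threshold.

```python
import math

def uniform_thresholds(B, delta):
    """Thresholds r_1, ..., r_{2^B - 1} of (3): r_b = (-2^(B-1) + b) * delta."""
    return [(b - 2 ** (B - 1)) * delta for b in range(1, 2 ** B)]

def quantize_real(y, B, delta):
    """Return (b, level) for a real input y.

    Bin b is the interval (r_{b-1}, r_b] with r_0 = -inf and r_{2^B} = +inf.
    The level is the bin midpoint r_b - delta/2; for the top bin this is an
    assumption, obtained by extending (3) to b = 2^B.
    """
    r = uniform_thresholds(B, delta)
    b = 1
    while b <= len(r) and y > r[b - 1]:
        b += 1
    level = (b - 2 ** (B - 1) - 0.5) * delta
    return b, level

def quantize_complex(y, B, delta):
    """Q_c of (2): quantize real and imaginary parts separately."""
    _, re = quantize_real(y.real, B, delta)
    _, im = quantize_real(y.imag, B, delta)
    return complex(re, im)

if __name__ == "__main__":
    # 3-bit example with an illustrative step size of 0.5
    print(uniform_thresholds(3, 0.5))
    print(quantize_complex(complex(0.2, -10.0), 3, 0.5))
```

With $B = 3$ the sketch produces seven thresholds and eight output levels, matching the staircase described for Figure 1; inputs beyond the outermost thresholds saturate at the outermost levels.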
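
The closed-form likelihood (8)-(9) of Section III can be checked numerically with a short sketch such as the one below. It is only an illustration under stated assumptions: the names `Phi`, `bin_probability`, and `p_out` are hypothetical, and $\Phi$ is evaluated through the error function of the Python standard library.

```python
import math

def Phi(x):
    """Standard normal CDF, Phi(x) = integral of Dz from -inf to x."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bin_probability(b, x, thresholds, sigma_w):
    """Psi_b(x) of (9): probability that x plus real noise of variance
    sigma_w^2 / 2 falls in bin b = 1, ..., len(thresholds) + 1."""
    r = [-math.inf] + list(thresholds) + [math.inf]
    s = math.sqrt(2.0) / sigma_w
    return Phi(s * (r[b] - x)) - Phi(s * (r[b - 1] - x))

def p_out(b_re, b_im, z, thresholds, sigma_w):
    """P_out of (8) for a complex noiseless sample z and the bin indices
    of the real and imaginary parts of the quantized output."""
    return (bin_probability(b_re, z.real, thresholds, sigma_w)
            * bin_probability(b_im, z.imag, thresholds, sigma_w))

if __name__ == "__main__":
    thresholds = [(b - 4) * 0.5 for b in range(1, 8)]  # 3-bit, step 0.5
    total = sum(p_out(b1, b2, complex(0.3, -0.8), thresholds, 0.7)
                for b1 in range(1, 9) for b2 in range(1, 9))
    print(total)  # the probabilities over all 64 output pairs sum to about 1
```

Summing $\Psi_b$ over all $2^B$ bins telescopes to $\Phi(\infty) - \Phi(-\infty) = 1$, which gives a quick consistency check on any threshold set.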
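
For the perfect-CSIR case of Section IV, the fixed point defined by (26) and (25) can be explored numerically. The sketch below is one possible way to do so and is not the authors' implementation. It assumes unit-power QPSK ($c_X = 1$), approximates the Gaussian averages $\int \mathrm{D}v$ by a trapezoidal rule on $[-8, 8]$, and uses a plain undamped iteration started from $\mathrm{mse}_X = 1$, whose convergence is not guaranteed in general. All function names are hypothetical.

```python
import math

def Phi(x):
    """Standard normal CDF, written with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def gauss_average(f, half_width=8.0, n=401):
    """Trapezoidal approximation of the Gaussian average of f(v)."""
    h = 2.0 * half_width / (n - 1)
    total = 0.0
    for i in range(n):
        v = -half_width + i * h
        w = 0.5 if i in (0, n - 1) else 1.0
        total += w * f(v) * math.exp(-0.5 * v * v)
    return total * h / math.sqrt(2.0 * math.pi)

def chi(q_x, c_h, c_x, sigma_w2, thresholds):
    """Sum over bins and Gaussian average appearing in (26)."""
    s2 = sigma_w2 + c_h * (c_x - q_x)  # sigma_w^2 + c_H * mse_X
    s = math.sqrt(s2)
    r = [-math.inf] + [math.sqrt(2.0) * t for t in thresholds] + [math.inf]

    def integrand(v):
        V = math.sqrt(c_h * q_x) * v
        acc = 0.0
        for b in range(1, len(r)):
            hi, lo = (r[b] - V) / s, (r[b - 1] - V) / s
            psi = Phi(hi) - Phi(lo)
            # derivative of Psi_b w.r.t. V; only its square is used
            dpsi = (math.exp(-0.5 * lo * lo) - math.exp(-0.5 * hi * hi)) \
                / math.sqrt(2.0 * math.pi * s2)
            if psi > 1e-12:  # skip bins with negligible probability
                acc += dpsi * dpsi / psi
        return acc

    return gauss_average(integrand)

def qpsk_mse(q_tilde):
    """MSE (25) of unit-power QPSK at SINR q_tilde."""
    return 1.0 - gauss_average(
        lambda z: math.tanh(q_tilde + math.sqrt(q_tilde) * z))

def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def perfect_csir_ber(alpha, sigma_w2, thresholds, c_h=1.0, iters=100, tol=1e-9):
    """Iterate (26) and (25), then evaluate the BER (24).
    Returns (ber, q_tilde, mse)."""
    mse, q_tilde = 1.0, 0.0
    for _ in range(iters):
        q_x = max(0.0, 1.0 - mse)
        q_new = alpha * c_h * chi(q_x, c_h, 1.0, sigma_w2, thresholds)
        mse = min(1.0, max(0.0, qpsk_mse(q_new)))
        converged = abs(q_new - q_tilde) < tol
        q_tilde = q_new
        if converged:
            break
    return q_function(math.sqrt(q_tilde)), q_tilde, mse
```

For instance, `perfect_csir_ber(4.0, 1.0, [0.0])` evaluates a 1-bit receiver with $\alpha = 4$ at 0 dB SNR. Passing the seven thresholds of (3) for $B = 3$ gives the 3-bit counterpart, whose predicted BER should be lower because the finer quantizer is a refinement of the 1-bit one.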