Table Of ContentLow-Complexity Near-ML Decoding of Large Non-Orthogonal STBCs
Using PDA
Saif K. Mohammed,A. Chockalingam,and B. SundarRajan
DepartmentofECE,IndianInstituteofScience,Bangalore560012,INDIA
Abstract—Non-orthogonalspace-timeblockcodes(STBC)from that802.11smartWiFi productswith 12 transmitantennas1
cyclicdivisionalgebras (CDA) havinglarge dimensionsareat- at2.5GHzarenowcommerciallyavailable[4](whichestab-
tractivebecausetheycansimultaneouslyachievebothhighspec-
lishesthatissuesrelatedtoplacementofmanyantennasand
tral efficiencies (same spectral efficiency as in V-BLAST for a
9 RF/IFchainscanbesolvedinlargeaperturecommunication
givennumberoftransmitantennas)aswellasfulltransmitdi-
0 versity.Decodingofnon-orthogonalSTBCswithhundredsofdi- terminals like set-top boxes/laptops), large non-orthogonal
0 mensionshasbeenachallenge.Inthispaper,wepresentaprob- STBCs(e.g.,16 16STBCfromCDA)incombinationwith
2 ×
abilistic data association (PDA) based algorithm for decoding large dimension near-ML decoding using PDA can enable
n non-orthogonalSTBCswithlargedimensions. Oursimulation communications at increased spectral efficiencies of the or-
a results show that the proposed PDA-based algorithm achieves
deroftensofbps/Hz(notethatcurrentstandardsachieveonly
J nearSISOAWGNuncodedBERaswellasnear-capacitycoded
BER (within about 5 dB of the theoretical capacity) for large <10bps/Hzusingonlyupto4transmitantennas).
3
non-orthogonal STBCsfromCDA. Westudytheeffect of spa-
1 PDA,originallydevelopedfortargettracking,iswidelyused
tialcorrelationontheBER,andshowthattheperformanceloss
due to spatial correlation can be alleviated by providing more in digital communications[5]-[12]. Particularly, PDA algo-
]
T receive spatial dimensions. Wereport good BER performance rithm is a reduced complexity alternative to the a posteriori
I whena training-basediterative decoding/channelestimation is probability (APP) decoder/detector/equalizer. Near-optimal
s. used(insteadofassumingperfectchannelknowledge)inchan- performancehasbeendemonstratedforPDA-basedmultiuser
c nels with large coherence times. A comparison of the perfor- detection in CDMA systems [5]-[8]. PDA has been used in
[ mancesof thePDAalgorithmandthelikelihoodascent search
(LAS)algorithm(reportedinourrecentwork)isalsopresented. the detection of V-BLAST signals with small numberof di-
1 mensions [10]-[12]. To our knowledge, PDA has not been
v Keywords–Non-orthogonalSTBCs,largedimensions,highspec- reportedfordecodingnon-orthogonalSTBCs with hundreds
9
tralefficiency,low-complexitynear-MLdecoding,probabilisticdata ofdimensionssofar. Ourresultsinthispapercanbesumma-
6
association. rizedasfollows:
8
1
. I. INTRODUCTION • WeadaptthePDAalgorithmfordecodingnon-orthogo-
1 nalSTBCswithlargedimensions. Withi.i.dfadingand
0 Multiple-inputmultiple-output(MIMO)systemsthatemploy perfectCSIR,thealgorithmachievesnear-SISOAWGN
9 non-orthogonalspace-time blockcodes (STBC) from cyclic
uncodedBERandnear-capacitycodedBER(withinabout
0 divisionalgebras(CDA)forarbitrarynumberoftransmitan-
5dBofthetheoreticalcapacity)for12 12STBCfrom
:
v tennas, Nt, are quite attractive because they can simultane- CDA,4-QAM,rate-3/4turbocode,and×18bps/Hz.
i ously provide both full-rate (i.e., N complex symbols per
X t RelaxingtheperfectCSIRassumption,wereportresults
channel use, which is same as in V-BLAST) as well as full •
withatrainingbasediterativePDAdecoding/channeles-
r
a transmitdiversity[1].The2 2Goldencodeisawellknown timation scheme. The iterative scheme is shown to be
×
non-orthogonalSTBCfromCDAfor2transmitantennas[2].
effectivewithlargecoherencetimes.
High spectralefficienciesof the orderof tensof bps/Hz can
Relaxingthei.i.dfadingassumptionbyadoptingaspa-
be achieved using large non-orthogonalSTBCs. For exam- •
tiallycorrelatedMIMOchannelmodel(proposedbyGes-
ple, a 16 16 STBC from CDA has 256 complex symbols
bertetalin[18]),weshowthattheperformancelossdue
×
in it with 512 real dimensions; with 16-QAM and rate-3/4
tospatialcorrelationisalleviatedbyusingmorereceive
turbocode,thissystemoffersahighspectralefficiencyof48
spatialdimensionsforafixedreceiveraperture.
bps/Hz. Decodingofnon-orthogonalSTBCswithsuchlarge
Finally, the performanceofthe PDA algorithmiscom-
dimensions,however,hasbeena challenge. Spheredecoder •
pared with that of the likelihood ascent search (LAS)
anditslow-complexityvariantsareprohibitivelycomplexfor
algorithmwerecentlypresentedin[13]-[15]. ThePDA
decodingsuchSTBCswithhundredsofdimensions.
algorithm is shown to perform better than the LAS al-
Inthispaper,wepresentaprobabilisticdataassociation(PDA) gorithm at low SNRs for higher-order QAM (e.g., 16-
based algorithm for decoding large non-orthogonal STBCs QAM),andinthepresenceofspatialcorrelation.
from CDA. Key attractive features of this algorithm are its
low-complexity and near-ML performance in systems with II. SYSTEMMODEL
largedimensions(e.g.,hundredsofdimensions). Whilecre-
Considera STBC MIMOsystemwithmultipletransmitand
atinghundredsofdimensionsinspacealone(e.g.,V-BLAST)
receiveantennas. An(n,p,k)STBCisrepresentedbyama-
requireshundredsofantennas,useofnon-orthogonalSTBCs
fromCDA can createhundredsof dimensionswith justtens 112 antennas in these products are now used only for beamforming.
of antennas (space) and tens of channel uses (time). Given Single-beam multi-antenna approaches can offer range increase andinter-
ferenceavoidance,butnotspectralefficiencyincrease.
trixXc Cn×p,wherenandpdenotethenumberoftransmit Now,(3)canbewrittenas
∈
antennas and number of time slots, respectively, and k de-
notesthenumberofcomplexdatasymbolssentinoneSTBC y = H x +n . (7)
r r r r
matrix.The(i,j)thentryinX representsthecomplexnum-
c
bertransmittedfromtheithtransmitantennainthejthtime Henceforth,weworkwiththereal-valuedsystemin(7). For
slot.TherateofanSTBCis k.LetN andN =ndenotethe notationalsimplicity,wedropsubscriptsrin(7)andwrite
p r t
number of receive and transmit antennas, respectively. Let
Hc CNr×Nt denote the channel gain matrix, where the y = H′x+n, (8)
∈
(i,j)thentryinH isthecomplexchannelgainfromthejth
transmitantennatoc the ith receiveantenna. We assume that whereH′ = Hr ∈ R2Nrp×2k,y = yr ∈ R2Nrp×1,x = xr ∈
thechannelgainsremainconstantoveroneSTBCmatrixand R2k×1, andn = nr ∈ R2Nrp×1. We assumethatthe channel
coefficientsareknownatthereceiverbutnotatthetransmit-
vary (i.i.d) from one STBC matrix to the other. Assuming
ter. LetA denotetheM-PAMsignalsetfromwhichx (ith
richscattering,wemodeltheentriesofH asi.i.d (0,1). i i
c CN entry of x) takes values, i = 0, ,2k 1. Now, define a
The receivedspace-timesignalmatrix, Yc ∈ CNr×p, can be 2k-dimensionalsignalspaceSto·b·e· theC−artesianproductof
writtenas
A toA . TheMLsolutionisthengivenby
Yc =HcXc+Nc, (1) 0 2k−1
iwtsheenretriNescar∈emCoNdre×lpedisasthi.ei.dnoCiNse m0,aσtr2ix=atNthtγEesre,cweihveerreaEnds dML = adrg∈mSindT(H′)TH′d−2yTH′d, (9)
istheaverageenergyofthetransm(cid:0)ittedsymbols,(cid:1)andγisthe
whosecomplexityisexponentialink.
averagereceivedSNRperreceiveantenna[3],andthe(i,j)th
entryinY isthereceivedsignalattheithreceiveantennain
c A. Full-rateNon-orthogonalSTBCsfromCDA
the jth time-slot. Consider linear dispersion STBCs, where
X canbewrittenintheform[3] We focus on the detection of square (i.e., n=p=Nt), full-
c
rate(i.e.,k=pn=N2),circulant(wheretheweightmatrices
t
k A(i)’s are permutation type), non-orthogonal STBCs from
X = x(i)A(i), (2) c
c c c CDA [1], whose construction for arbitrary number of trans-
Xi=1
mitantennasnisgivenbythematrixinEqn.(9.a)givenatthe
wherexc(i)istheithcomplexdatasymbol,andA(ci) CNt×p bottomofthispage. In(9.a),ωn = ej2nπ,j = √ 1,anddu,v,
∈ −
isitscorrespondingweightmatrix.Thereceivedsignalmodel 0 u,v n 1arethen2datasymbolsfromaQAMalpha-
≤ ≤ −
in(1)canbewritteninanequivalentV-BLASTformas bet.Whenδ =t=1,thecodein(9.a)isinformationlossless
(ILL),andwhenδ = e√5j andt = ej, itis offull-diversity
k
y = x(i)(H a(i))+n = H x +n , (3) andinformationlossless(FD-ILL)[1].Highspectralefficien-
c c c c c c c c cieswithlargencanbeachievedusingthiscodeconstruction.
Xi=1
b e However,sincetheseSTBCsarenon-orthogonal,MLdetec-
whereyc CNrp×1 = vec(Yc), Hc CNrp×Ntp = (I tiongetsincreasinglyimpracticalforlargen.Consequently,a
∈ ∈ ⊗
Hc),ac(i) ∈CNtp×1 =vec(A(ci)),nbc ∈CNrp×1 =vec(Nc), keychallengeinrealizingthebenefitsoftheselargeSTBCsin
x Ck 1 whose ith entry is the data symbol x(i), and practiceisthatofachievingnear-MLperformanceforlargen
c × c
Hc ∈CNrp×k whoseithcolumnisHca(ci),i =1,2, ,k. atlowdecodingcomplexities. TheBERperformanceresults
∈ ··· we reportin Sec. IV show thatthe PDA-baseddecodingal-
Eachelementofx isanM-PAM/M-QAMsymbol. Lety ,
c c
e b gorithmweproposeinthefollowingsectionessentiallymeets
H ,x ,n bedecomposedintorealandimaginarypartsas:
c c c
thischallenge.
e y =y +jy , x =x +jx ,
c I Q c I Q
III. PROPOSED PDA-BASED DECODING
n =n +jn , H =H +jH . (4)
c I Q c I Q
In this section, we present the proposed PDA-based decod-
Further, we define Hr R2Nerp×2k, yr R2Nrp×1, xr ing algorithm for square QAM. The applicability of the al-
R2k×1,andnr R2Nrp∈×1 as ∈ ∈ gorithm to any rectangular QAM is straightforward. In the
∈
H H real-valuedsystemmodelin(8),eachentryofxbelongstoa
Hr =(cid:18) HIQ −HQI (cid:19), yr =[yIT yQT]T, (5) √M-PAMconstellation,whereM isthesizeoftheoriginal
xr =[xTI xTQ]T, nr =[nTI nTQ]T. (6) sthqeuaqre=QloAgM(√coMns)teclloantisotintu.eLnettbbit(is0)o,fbt(ih1)e,i·t·h·e,nbt(irqy−x1) odfenxo.te
2 i
2666 PPPninini===−−−000111.ddd210,,,iiitttiii δPPPninini=−==−−010011dddn.10−,,ii1ωω,inniiωttniiiti δδPPPninini==−−=−001101dddnn.0−−,i12ω,,iin2ωωinn22tiiittii ······.··· δδδPPPninini===−−−000111ddd321,,,.iiiωωωnnn(((nnn−−−111)))iiitttiii 3777. (9.a)
6 . . . . . 7
6 . . . . . 7
6664 PPnini==−−0011ddnn−−21,,iittii PPnini==−−0011ddnn−−32,,iiωωnnii ttii PPnini==−−0011ddnn−−43,,iiωωnn22iittii ······ δPPnini==−−0011ddn0−,i1ω,in(ωn−n(n1−)i1t)iiti 7775
Wecanwritethevalueofeachentryofxasalinearcombi- Gaussian,andhenceyisGaussianconditionedonb(j). Since
i
nationofitsconstituentbitsas thereare2qk 1termsinthedoublesummationin(15),this
−
Gaussian approximationgets increasinglyaccurate for large
q 1
x = − 2jb(j), i=0,1, ,2k 1. (10) Nt(notethatk =Nt2). SinceaGaussiandistributionisfully
i i ··· − characterized by its mean and covariance, we evaluate the
Xj=0
mean and covarianceof y givenb(j) = +1 and b(j) = 1.
i i −
Letb∈{+1,−1}2qk×1,definedas Fornotationalsimplicity,letusdefinepji+ =△ P(b(ij) = +1)
dwebenco=△atentwhhebr(0itt0re)an·x·sma·sbi0(tqt−ed1)bbi(1t0v)e·c·t·obr(1.qD−e1)fi·n·in·gb(2c0k)−=△1[·2·0·2b1(2qk·−·−·112)iq(−T11,1]), wwLaneehdtecrµpaejinji−E+w(=△.r=△)itPdeEeµ(n(bjioy(i+jt|e)bsa(i=jst)h−e=e1x)+.peI1tc)itsaatcnioldenaµorpjit−hearat=△tpojir+E.N(+yo|wpb(iij,j−f)ro==m1−(.115)),,
x = (I⊗c)b, (12) µj+ = h +2k−1 q−1h (2pm+ 1). (16)
i qi+j ql+m l −
where I is the 2k 2k identity matrix. Using (12), we can Xl=0 mX=0
× m=q(i l)+j
rewrite(8)as 6 −
y = H(I c) b+n, (13) Similarly,wecanwriteµji− as
′
⊗
2k 1 q 1
| △={zH } µji− = −hqi+j + − − hql+m(2plm+−1)
whereH ∈ R2Nrp×2qk is the effectivechannelmatrix. Our Xl=0m=qmX(i=0l)+j
goalistoobtainb, anestimateofthebvector. Forthis,we 6 −
= µj+ 2h . (17)
iterativelyupdatethestatisticsofeachbitofb, asdescribed i − qi+j
b
ainndthheafrodlldoewciinsigosnusbasreectmioand,efoornathceerfitanianlnsutamtibsteircsoftoitegreattibo.ns, Nisegxitv,etnheby2Nrp×2NrpcovariancematrixCji ofygivenb(ij)
A. IterativeProcedure b 2k 1 q 1
Cj = E n+ − − h (b(m) 2pm++1)
dTahteesa,lgoonreitfhomr eiascihteoraftitvheeicnonnsattiuturee,ntwbhietsr,ea2rqekpestrafotirsmticedupin- i (cid:26)h Xl=0 mX=0 ql+m l − l i
m=q(i l)+j
eachiteration.Westartthealgorithmbyinitializingtheapri- 6 −
2k 1 q 1
0o,ri·p··ro,b2akb−ili1tieasnadsjP=(b0(ij,)·=··+,q1)−=1.PI(nba(ijn)i=ter−at1i)on=,t0h.5e,s∀taiti=s- hn+Xl=−0 mX−=0 hql+m(b(lm)−2pml ++1)iT(cid:27). (18)
tics of the bits are updatedsequentially, i.e., the orderedse- m=q(i l)+j
6 −
quenceofupdatesinaniterationis b(00),··· ,b(0q−1),······ , Assuming independenceamong the constituentbits, we can
b(0) , b(q−1) . Thestepsinvol(cid:8)vedineachiterationofthe simplifyCj in(18)as
2k 1 ··· 2k 1 i
algo−rithmare−der(cid:9)ivedasfollows.
2k 1 q 1
iTshgeivlieknelbihyoodratioofbitb(ij)inaniteration,denotedbyΛ(ij), Cji = σ2I + − − hql+mhTql+m4pml +(1−pml +). (19)
Xl=0 mX=0
m=q(i l)+j
6 −
P b(j) =+1y
Λ(j) =△ i | Using the above mean and covariance expressions, we can
i PP(cid:0)(cid:0)by(ijb)(j=) =−1+|y1(cid:1)(cid:1) P b(j) =+1 writethedistributionofygivenb(ij) =±1as
= P(cid:0)y|b(ij) = 1(cid:1) P(cid:0)b(ij) = 1(cid:1). (14) P(yb(j) = 1) = e−(y−µji±)T(Cji)−1(y−µji±). (20)
(cid:0) | i − (cid:1) (cid:0) i − (cid:1) | i ± (2π)Nrp|Cji|21
△=β(j) △=α(j)
| {zi } | {zi } Similarly,P(yb(j) = 1)isgivenby
DenotingthetthcolumnofHbyh ,wecanwrite(13)as | i −
t
2k 1 q 1 P(yb(j) = 1) = e−(y−µji−)T(Cji)−1(y−µji−). (21)
y = hqi+jb(ij)+ − − hql+mb(lm)+n, (15) | i − (2π)Nrp|Cji|12
Xl=0 m=qmX(i=0l)+j Using(20)and(21),βj canbewrittenas
6 − i
=△ne P(yb(j) =+1)
| {z } βj = | i
where n R2Nrp×1 is the interference plus noise vector. i P(y|b(ij) =−1)
Tocalceula∈teβi(j),weapproximatethedistributionofntobe =e−((y−µji+)T(Cji)−1(y−µji+)−(y−µji−)T(Cji)−1(y−µji−)). (22)
e
Usingα(j)andβ(j),Λ(j)iscomputedusing(14).Now,using of the old D matrix. Therefore, using the matrix inversion
i i i
the valueof Λi(j), the statistics ofb(ij) isupdatedas follows. lemma,thenewD−1 canbeobtainedfromtheoldD−1as
From(14),andusingP(b(j) = +1y)+P(b(j) = 1y) = 1,
i | i − | D 1h hT D 1
wehave D 1 D 1 − ni+j ni+j − , (28)
− ← − − hT D 1h + 1
Λ(j) ni+j − ni+j η
P(b(j) =+1y) = i , (23)
i | 1+Λ(j) where
i
and η = 4pj+ 1 pj+ 4pj+ 1 pj+ , (29)
i − i − i,old − i,old
(cid:0) (cid:1) (cid:0) (cid:1)
P(b(j) = 1y) = 1 . (24) where pj+ and pj+ are the new i.e., after the update in
i − | 1+Λ(ij) (25) anid (26) ai,nodldold (beforethe(cid:0)update) values, respec-
tivel(cid:1)y. Itcanbe(cid:1)seenthatboththenumeratoranddenomina-
Asanapproximation,droppingtheconditioningony,
tor in the 2nd term on the RHS of (28) can be computedin
Λ(j) O(Nr2p2)complexity.Therefore,thecomputationofthenew
P(b(j) =+1) i , (25) D 1usingtheoldD 1canbedoneinO(N2p2)complexity.
i ≈ 1+Λ(j) − − r
i Computationof (Cj) 1: Using (27) and (19), we can write
and i −
Cj intermsofDas
i
1
P(b(j) = 1) . (26)
i − ≈ 1+Λ(ij) Cji = D−4pji+(1−pji+)hqi+jhTqi+j. (30)
Using the above procedure, we update P(b(j) = +1) and Wecancompute(Cj) 1 fromD 1 atareducedcomplexity
i i − −
P(b(j) = 1)foralli=0, ,2k 1andj =0, ,q 1 usingthematrixinversionlemma,whichstatesthat
i − ··· − ··· −
sequentially. This completes one iteration of the algorithm;
i.e.,eachiterationinvolvesthecomputationofα(j)andequa- (P+QRS)−1 = P−1 P−1Q(R−1+SP−1Q)−1SP−1. (31)
i −
tions (16), (17), (19), (22), (14), (25), and (26) for all i,j.
SubstitutingP =D,Q =h ,R =
The updated valuesof P(b(ij) = +1) and P(b(ij) = −1) in 4pj+(1 pj+2N),rpa×nd2NSrp =2hNTrp×1in(31q)i,+wjege1t×1
(25) and (26) for all i,j are fed back to the next iteration2. − i − i 1×2Nrp qi+j
Thealgorithmterminatesafteracertainnumberofsuchiter-
D 1h hT D 1
ations. Attheendofthelastiteration,harddecisionismade (Cj) 1 = D 1 − qi+j qi+j − , (32)
on the final statistics to obtain the bit estimate b(ij) as +1 if i − − − hTqi+jD−1hqi+j − 4pj+(11 pj+)
Λ(j) 1,and 1otherwise.Incodedsystems,Λ(j)’sarefed i − i
asisof≥tinputst−othedecoder. bi whichcanbecomputedinO(Nr2p2)complexity.
Computationofµji+ andµji−: Computationofβi(j) involves
B. ComplexityReduction thecomputationofµji+ andµji− also. From(17),itisclear
Themostcomputationallyexpensiveoperationincomputing that µji− can be computed from µji+ with a computational
β(j) istheevaluationoftheinverseofthecovariancematrix, overheadofonlyO(Nrp).From(16),itcanbeseenthatcom-
Ciji,ofsize2Nrp×2NrpwhichrequiresO(Nr3p3)complex- pthuitsincgomµpjil+exwitoyucladnrbeqeurierdeuOce(dqNasrfpokll)ocwosm.Dpleefixnitey.veHcotowreuvears,
ity,whichcanbereducedasfollows.DefinematrixDas
2k 1 q 1 2k−1 q−1
D =△ σ2I+Xl=−0 mX−=0hql+mhTql+m4pml +(1−pml +). (27) u =△ Xl=0 mX=0hql+m(cid:0)2pml +−1(cid:1). (33)
Using(16)and(33),wecanwrite
Atthestartofthealgorithm,withpj+ andp(j) initializedto
i i
0.5foralli,j,Dbecomesσ2I+HHT. µj+ = u+2 1 pj+ h . (34)
i − i qi+j
Computation of D 1: We note that when the statistics of (cid:0) (cid:1)
− ucan becomputediterativelyatO(N p)complexityasfol-
b(j) is updated using (25) and (26), the D matrix in (27) r
i lows.Whenthestatisticsofb(j)isupdated,wecanobtainthe
alsochanges. AstraightforwardinversionofthisupdatedD i
newufromtheolduas
matrix would require O(N3p3) complexity. However, we
r
can obtain the D−1 from the previously available D−1 in u u+2 pj+ pj+ h , (35)
O(N2p2)complexityasfollows. Sincethestatisticsofonly ← i − i,old ni+j
r (cid:0) (cid:1)
b(j) is updated, the new D matrix is just a rank one update whose complexity is O(N p). Hence, the computation of
i r
µji+ in(33)andµji− in(34)needsO(Nrp)complexity. The
2Thecomputation ofthestatistics ofacurrent bitinaniteration makes listing of the proposedPDA algorithmis summarizedin the
useofthenewlycomputedstatisticsofitspreviousbits(aspertheordered
sequence ofstatistic updates)inthesameiteration andthestatistics ofits Table-Iinthenextpage.
nextbitsavailablefromthepreviousiteration.
Table-I:ProposedPDA-basedAlgorithmListing 100
LAS, 4x4 STBC
Initialization PDA, 4x4 STBC
Nt x Nt non−orthogonal ILL STBCs LAS, 8x8 STBC
1.pji+ =pji− =0.5, Λ(ij) =1, 10−1 Nt = Nr, 4−QAM PDA, 8x8 STBC
i=0,1, 2k 1,j =0,1, ,q 1. LAS, 16x16 STBC
2.u=0, D−1∀= HHT·+··σ2I−−1. ··· − PSDISAO, 1A6Wx1G6N STBC
` ´
3.num iter:numberofiterations 10−2
4.κ=1; κistheiterationnumber Rate
Statisticsupdateintheκthiteration Error NMuMmSbEe rin oitfi aitle vreactitoonrs f omr L=A 1S0 for PDA BER improves
5.fori=0to2k 1 Bit 10−3 wNitt h= iNncrreasing
−
6.forj=0toq 1
−
Updateofstatisticsofbitb(j)
i 10−4
7.µji+ = u+2`1−pji+´hqi+j
8.µji− = µji+−2hqi+j
9.(Cji)−1 = D−1− hTqi+jDD−−11hhqiq+i+jjh−Tqi4+pjji+D(−11−1pji+) 10−50 2 4 Av6eragFe iRge.ce18ived SNR (1d0B) 12 14 16
10.βij = e−((y−µji+)T(Cji)−1(y−µji+)−(y−µji−)T(Cji)−1(y−µji−)) COMPARISONOFUNCODEDBEROFPDAANDLASALGORITHMSIN
11.pji,+old = pji+, pji,−old = pji− DECODING4×4,8×8,16×16ILLSTBCS.Nt=Nr,4-QAM.#
12.α(j) = pji,+old ITERATIONSm=10FORPDA.MMSEINITIALVECTORFORLAS.
i pj− BERimprovesforincreasingSTBCsizes.With4-QAM,PDAandLAS
i,old
13.Λ(j) =β(j)α(j) algorithmsachievealmostthesameperformance.
i i i
14.pji+ = 1+ΛΛi(ji()j), pji− = 1+Λ1i(j)
UpdateofuandD−1 PDAversusLASperformancewith4-QAM:InFig.1,weplot
15.u ← u+2`pji+−pji,+old´hqi+j theuncodedBERofthePDAalgorithmasafunctionofaver-
16.η = 4pj+ 1 pj+ 4pj+ 1 pj+ agereceivedSNRperrxantenna,γ,indecoding4 4,8 8,
17.D−1 ←i D`−1−−iDh−Tq´i1+−hjqDi+−ji1,hhoTqlqidi+`+jjD+−−η11i,old´ c1h6a×nn1e6lSstTaBteCisnfforormmaCtioDnAawttihtherNecte=ivNerr(aCnSdIR4-)QaAnMd×i..iP.derf×faedc-t
18.end; Endofforloopstartingatline5 ing are assumed. For the same settings, the performanceof
19.if(κ=num iter)gotoline21 theLASalgorithmin[13]-[15]withMMSEinitialvectorare
20.κ=κ+1,gotoline5 alsoplottedforcomparison.FromFig. 1,itisseenthat
21.b(j) =sgn log(Λ(j))
i ` i ´ the BER performance of PDA algorithm improvesand
b i=0,1, ,2k 1, j =0,1, ,q 1 •
22.xi=Pjq∀=−012jˆb(ij)·,··∀i=−0,1,··· ,2k−··1· − ianpcprreoaascehde;se.SgI.S,OperAfoWrmGaNncpeercfloorsmeatoncweitahsinNatbo=utN1rdiBs
23.Tberminate fromSISOAWGNperformanceisachievedat10 3un-
−
codedBERindecoding16 16STBCfromCDAhaving
C. OverallComplexity ×
512realdimensions,andthisillustratestheabilityofthe
WeneedtocomputeHHT atthestartofthealgorithm. This PDAalgorithmtoachieveexcellentperformanceatlow
requiresO(qkN2p2)complexity. So thecomputationofthe complexitiesinlargenon-orthogonalSTBCMIMO.
r
initialD 1inline2requiresO(qkN2p2)+O(N3p3). Based with4-QAM,PDAandLASalgorithmsachievealmost
− r r •
onthecomplexityreductioninSec. III-B, thecomplexityin thesameperformance.
updatingthestatisticsofoneconstituentbit(lines7to17)is
PDAversusLASperformancewith16-QAM:Figure2presents
O(N2p2). So, the complexity for the update of all the 2qk
r an uncoded BER comparison between PDA and LAS algo-
constituentbitsinaniterationisO(qkN2p2). Sincethenum-
r rithms for 16 16 STBC from CDA with N = Nr = 16
t
ber ofiterationsis fixed, the overallcomplexityof the algo- ×
and16-QAMunderperfectCSIR and i.i.d fading. It can be
rithmisO(qkN2p2)+O(N3p3). ForN = N ,sincethere
r r t r seenthatthePDAalgorithmperformsbetteratlowSNRsthan
are k symbols per STBC and q bits per symbol, the overall
the LAS algorithm. For example, with 8 8 and 16 16
complexityperbitisO(p2N2). × ×
t STBCs,atlowSNRs(e.g.,<25dBfor16 16STBC),PDA
×
algorithm performs better by about 1 dB compared to LAS
IV. RESULTS ANDDISCUSSIONS
algorithmat10 2uncodedBER.
−
Inthissection,wepresentthesimulateduncoded/codedBER
Turbo coded BER performance of PDA: Figure 3 shows the
of the PDA algorithm in decoding non-orthogonal STBCs
rate-3/4 turbo coded BER of the PDA algorithm under per-
from CDA3. Number of iterations in the PDA algorithm is
fectCSIRandi.i.dfadingfor12 12ILLSTBCwithN =
settom=10inallthesimulations. × t
N =12 and 4-QAM, which correspondsto a spectral effi-
r
3Our simulation results showed that the performance of FD-ILL (δ = ciencyof18bps/Hz.ThetheoreticalminimumSNRrequired
e√5j,t = e−j)andILL(δ = t = 1)STBCswithPDAdecoding were toachieve18bps/Hzspectralefficiencyona N =N =12
almostthesame.Here,wepresenttheperformanceofILLSTBCs. t r
100
NtxNt non−orthogonal ILL STBCs LAS, 4x4 STBC
Nt = Nr, 16−QAM PDA, 4x4 STBC
Number of iterations m=10 for PDA LAS, 8x8 STBC Perfect CSIR, 18 bps/Hz
10−1 MMSE init. vec. for LAS PLADSA,, 186xx81 S6T SBTCBC 100 Estimated CSIR, Nd = 1, 9 bps/Hz
PDA, 16x16 STBC Estimated CSIR, Nd = 8, 16 bps/Hz
SISO AWGN Min. SNR reqd. to achieve 18 bps/Hz
10−1
10−2
Rate
Bit Error 10−3 Error Rate10−2 142−xQ1A2M IL, Lra SteT−B3C/4, Nturr b=o N Ct o=d 1e2d
Bit 10−3
BER improves
10−4 with increasing Nt=Nr
4.3 dB
10−4
.
10−5
5 10 15 20 25 30 35 40 45
Average Received SNR (dB) 10−5
2 4 6 8 10 12 14
Fig.2 Average Received SNR (dB)
COMPARISONOFUNCODEDBEROFPDAANDLASALGORITHMSIN Fig.3
DECODING4×4,8×8,16×16ILLSTBCS.Nt=Nr,16-QAM.# TURBOCODEDBEROFTHEPDAALGORITHMINDECODING12×12
ITERATIONSm=10FORPDA.MMSEINITIALVECTORFORLAS. ILLSTBCWITHNt=Nr,4-QAM,RATE-3/4TURBOCODE,18
With16-QAM,PDAperformsbetterthanLASatlowSNRs. BPS/HZANDm=10FORi)PERFECTCSIR,ANDii)ESTIMATEDCSIR
USING2ITERATIONSBETWEENPDADECODING/CHANNEL
ESTIMATION.WithperfectCSIR,PDAperformsclosetowithinabout5dB
MIMOchannelwith perfectCSIRand i.i.dfadingis4.3dB fromcapacity.WithestimatedCSIR,performanceapproachestothatwith
(obtainedthroughsimulationoftheergodiccapacityformula
perfectCSIRwithincreasingcoherencetimes.
[3]). From Fig. 3, it is seen that the PDA algorithmis able
toachieveverticalfallincodedBERwithinabout5dBfrom
the theoretical minimum SNR, which is a good nearness to (cid:0)(cid:1)1M Pa(cid:0)(cid:1)itlroix(cid:0)(cid:1)t (cid:0)(cid:1)(cid:0)(cid:1) Pilot
capacityperformance. (cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1) Data STBCs (cid:0)(cid:1)(cid:0)(cid:1)M(cid:0)(cid:1)atr(cid:0)(cid:1)ix(cid:0)(cid:1) Data STBCs
(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1) Nd (cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)
Space (cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1) (cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)
IpteerrfaetcivteCPSIDRAaDsseucmopdtiinogn/CbhyacnonneslidEesrtiinmgataiotnra:inWinegrbealasexdthite- (cid:0)(cid:0)(cid:1)(cid:1) Nt(cid:0)(cid:0)(cid:0)(cid:1)(cid:1)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:0)(cid:0)(cid:1)(cid:1)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:0)(cid:0)(cid:1)(cid:1)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:0)(cid:0)(cid:1)(cid:1)(cid:1)(cid:0)(cid:1) (cid:0)(cid:0)(cid:0)(cid:1)(cid:1)(cid:1)(cid:0)(cid:1)(cid:0)(cid:0)(cid:0)(cid:1)(cid:1)(cid:1)(cid:0)(cid:1)(cid:0)(cid:0)(cid:0)(cid:1)(cid:1)(cid:1)(cid:0)(cid:1)(cid:0)(cid:0)(cid:0)(cid:1)(cid:1)(cid:1)(cid:0)(cid:1)(cid:0)(cid:0)(cid:0)(cid:1)(cid:1)(cid:1)(cid:0)(cid:1)
erativePDAdecoding/channelestimationscheme. Transmis- Nt (cid:0)(cid:1) Nd(cid:2)Nt (cid:0)(cid:1)(cid:0)(cid:1)(cid:0)(cid:1)time
sion is carried out in frames, where one N N pilot ma- 1 Frame
t t
×
trix (for training purposes) followed by N data STBC ma-
d Fig.4
tricesaresentineachframeasshowninFig. 4. One frame
TRANSMISSIONSCHEMEWITHONEPILOTMATRIXFOLLOWEDBYNd
length, T, (taken to be the channel coherence time) is T =
DATASTBCMATRICESINEACHFRAME.
(N +1)N channeluses.Theproposedschemeworksasfol-
d t
lows[16]:i)obtainanMMSEestimateofthechannelmatrix
during the pilot phase, ii) use the estimated channel matrix
todecodethedataSTBCmatricesusingPDAalgorithm,and Fig. 5, we plot the BER of the PDA algorithm in decoding
iii) iterate between channel estimation and PDA decoding 12 12STBC fromCDA withperfectCSIR ini)i.i.d. fad-
×
foracertainnumberoftimes. For12 12STBCfromCDA, ing,andii)correlatedMIMOfadingmodelin[18]. Itisseen
in addition to perfectCSIR performa×nce,Fig. 3 also shows that, comparedto i.i.d fading, there is a loss in diversity or-
theperformancewithCSIRestimatedusingtheproposedit- derinspatialcorrelationforNt = Nr = 12; further,use of
erativedecoding/channelestimationschemeforNd = 1and morerxantennas(Nr = 18,Nt = 12)alleviatesthislossin
N = 8. 2 iterations between decoding and channel esti- performance.Wecandecodeperfectcodes[19],[20]oflarge
d
mationare used. With N = 8 (which correspondsto large dimensionsalsousingtheproposedPDAalgorithm.
d
coherencetimes,i.e.,slowfading)theBERandbps/Hzwith
REFERENCES
estimatedCSIRgetclosertothosewithperfectCSIR.
[1] B.A.Sethuraman,B.S.Rajan,andV.Shashidhar,“Full-diversityhigh-
EffectofSpatialMIMO Correlation: InFigs. 1to 3, weas- ratespace-timeblockcodesfromdivisionalgebras,” IEEETrans.In-
sumedi.i.dfading. Butspatialcorrelationattransmit/receive form.Theory,vol.49,no.10,pp.2596-2616,October2003.
[2] J.-C.Belfiore, G.Rekaya, andE.Viterbo,“Thegoldencode: A2×
antennasand the structure of scattering and propagationen-
2 full-rate space-time code with non-vanishing determinants,” IEEE
vironmentcanaffecttherankstructureoftheMIMOchannel Trans.Inform.Theory,vol.51,no.4,pp.1432-1436,April2005.
resultingindegradedperformance[17]. Werelaxedthei.i.d. [3] H.Jafarkhani, Space-TimeCoding: TheoryandPractice, Cambridge
UniversityPress,2005.
fadingassumptionbyconsideringthecorrelatedMIMOchan- [4] http://www.ruckuswireless.com/technology/beamflex.php
nelmodelin[18],whichtakesintoaccountcarrierfrequency [5] J.Luo,K.Pattipati, P.Willett, andF.Hasegawa, “Near-optimalmul-
tiuserdetectioninsynchronousCDMAusingprobabilisticdataassoci-
(fc),spacingbetweenantennaelements(dt,dr),distancebe- ation,”IEEECommun.Letters,vol.5,no.9,pp.361-363,2001.
tweentxandrxantennas(R),andscatteringenvironment.In [6] Y. Huang and J. Zhang, “Generalized probabilistic data association
multiuserdetector,”Proc.ISIT’2004,June-July2004.
Uncoded, Nr = Nt = 12 (LAS) [20] P.Elia,B.A.Sethuraman,andP.V.Kumar,“Perfectspace-timecodes
Uncoded, Nr = 18, Nt = 12 (LAS) foranynumberofantennas,”IEEETrans.Inform.Theory,vol.53,no.
Uncoded, Nr = Nt = 12 (PDA) 11,pp.3853-3868,November2007.
Uncoded, Nr = 18, Nt = 12 (PDA)
101 Turbo coded, Rate−3/4, Nr = Nt = 12 (LAS)
Turbo coded, Rate−3/4, Nr = 18, Nt = 12 (LAS)
Turbo coded, Rate−3/4, Nr = Nt = 12 (PDA)
100 Turbo coded, Rate−3/4, Nr = 18, Nt = 12 (PDA)
Min. SNR for capacity for 36 bps/Hz (Nr = Nt = 12)
10−1 Min. SNR for capacity for 36 bps/Hz (Nr = 18, Nt = 12)
Rate Uncoded, Nr = Nt = 12 (PDA), i.i.d. (i.e., no correlation)
Error 10−2 12x12 ILL STBC
Bit 16−QAM
10−3
Nd = 72 cm, d = d,
rr t r
10−4 R = 9.4 dB R = 12.6 dB DRθt r === θ5Dr0 t=0 = m9 20, 0 Sd m=e g.3 . 0 , f = 5 G H z.
N N
10−5 S S
10 15 20 25 30 35 40 45 50
Average Received SNR (dB)
Fig.5
EFFECTOFSPATIALCORRELATIONONTHEPERFORMANCEOFPDAIN
DECODING12×12STBCFROMCDA.Nt=12,Nr =12,18,
16-QAM,RATE-3/4TURBOCODE,36BPS/HZ.CORRELATEDCHANNEL
PARAMETERS:fc=5GHZ,R=500M,S=30,Dt=Dr =20M,
θt=θr =90◦,Nrdr =72CM,dt=dr.Correlationdegrades
performance;usingNr >Ntalleviatesthethisperformanceloss.
[7] X.Wang,H.V.Poor,“Iterative(Turbo)softinterferencecancellation
anddecodingforcodedCDMA,”IEEETrans.onCommun.,vol.47,
no.7,pp.1046-1061,July1999.
[8] P. H. Tan and L. K. Rasmussen, “Asymptotically optimal nonlinear
MMSEmultiuserdetectionbasedonmultivariateGaussianapproxima-
tion,”IEEETrans.onCommun.,vol.54,no.8,pp.1427-1438,August
2006.
[9] Y.Yin,Y.Huang,andJ.Zhang,“Turboequalizationusingprobabilistic
data association,” Proc. IEEE GLOBECOM’2004, vol. 4, pp. 2535-
2539,November-December2004.
[10] D.Pham,K.R.Pattipati,P.K.Willet,andJ.Luo,“Ageneralizedproba-
bilisticdataassociationdetectorformultiantennasystems,”IEEECom-
mun.Letters,vol.8,no.4,pp.205-207,2004.
[11] G. Latsoudas and N. D. Sidiropoulos, “A hybrid probabilistic data
association-sphere decoding for multiple-input-multiple-output sys-
tems,”IEEESig.Proc.Letters,vol.12,no.4,pp.309-312,April2005.
[12] Y.Jiaetal,C.M.Vithanage,C.Andrieu,andR.J.Piechocki,“Proba-
bilisticdataassociationforsymboldetectioninMIMOSystems,”Elec-
tronicLetters,vol.42,no.1,5Jan.2006.
[13] K.VishnuVardhan,SaifK.Mohammed,A.Chockalingam,B.Sundar
Rajan,“Alow-complexitydetectorforlargeMIMOsystemsandmul-
ticarrierCDMAsystems,”IEEEJSACSpl.Iss.onMultiuserDetection
forAdv.Commun.SystemsandNetworks,vol.26,no.3,pp.473-485,
April2008.
[14] SaifK.Mohammed,A.Chockalingam,andB.SundarRajan“Alow-
complexitynear-MLperformanceachievingalgorithmforlargeMIMO
detection,”Proc.IEEEISIT’2008,Toronto,Canada,July2008.
[15] SaifK.Mohammed,A.Chockalingam, andB.SundarRajan,“High-
ratespace-time codedlargeMIMOsystems: Low-complexity detec-
tionandperformance,”Proc.IEEEGLOBECOM’2008,NewOrleans,
USA,Dec.2008.
[16] Ahmed Zaki, Saif K. Mohammed, A. Chockalingam, and B. Sun-
dar Rajan, “A training-based iterative detection/channel estima-
tion scheme for large non-orthogonal STBC MIMO systems,” ac-
cepted inIEEEICC’2009, Dresden, Germany, April 2009. Alsosee
arXiv:0809.2446v2[cs.IT]4Oct2008.
[17] D.Shiu,G.J.Foschini,M.J.Gans,andJ.M.Khan,“Fadingcorrelation
anditseffectonthecapacityofmulti-antennasystems,”IEEETrans.
onCommun.,vol.48,pp.502-513,March2000.
[18] D.Gesbert, H.Bo¨lcskei, D.A.Gore,A.J.Paulraj, “OutdoorMIMO
wirelesschannels: Modelsandperformanceprediction,”IEEETrans.
onCommun.,vol.50,pp.1926-1934,December2002.
[19] F.E.Oggier,G.Rekaya,J.-C.Belfiore,andE.Viterbo,“Perfectspace-
time block codes,” IEEE Trans. on Inform. Theory, vol. 52, no. 9,
September2006.