A Constrained Channel Coding Approach to Joint Communication and Channel Estimation

Wenyi Zhang, Satish Vedantam, and Urbashi Mitra
Ming Hsieh Department of Electrical Engineering, University of Southern California
{wenyizha, vedantam, ubli}@usc.edu

Abstract—A joint communication and channel state estimation problem is investigated, in which reliable information transmission over a noisy channel, and high-fidelity estimation of the channel state, are simultaneously sought. The tradeoff between the achievable information rate and the estimation distortion is quantified by formulating the problem as a constrained channel coding problem, and the resulting capacity-distortion function characterizes the fundamental limit of the joint communication and channel estimation problem. The analytical results are illustrated through case studies, and further issues such as multiple cost constraints, channel uncertainty, and capacity per unit distortion are also briefly discussed.

I. INTRODUCTION

In this paper, we consider the problem of joint communication and channel estimation over a channel with a time-varying channel state. We consider a noisy channel with a random channel state that evolves with time in a memoryless fashion and is available to neither the transmitter nor the receiver. The objective is to have the receiver recover both the information transmitted from the transmitter and the state of the channel over which the information was transmitted. The problem setting may prove relevant for situations such as environment monitoring in sensor networks [1], underwater acoustic applications [2], and cognitive radio [3]. A distinct feature of our problem formulation is that both communication and channel estimation are required.

The interplay between information measures and estimation (minimum mean-squared error (MMSE) in particular) has long been investigated; see, e.g., [4] and references therein. Previously, however, estimation served only to facilitate information transmission, rather than being a separate goal. For example, a common strategy in block interference channels [5] is channel estimation via training [6]. The purpose of channel training is only to increase the information rate for communication, and thus the quality of the channel estimate is not traded off against the information rate, as we consider in this paper.

The problem formulation in [7], [8] bears some similarity to the one we consider in that the receiver is interested in both communication and channel estimation. It differs from our work in a critical way: the channel state is assumed non-causally known at the transmitter. In contrast, neither the transmitter nor the receiver knows the channel state in our problem formulation.

Intuitively, there exists a tradeoff between a channel's capability to transfer information and its capability to exhibit state. Increasing randomness in channel inputs increases information transfer while reducing the receiver's ability to estimate the channel. In contrast, deterministic signaling facilitates channel estimation at the expense of zero information transfer. In this paper, we show that the optimal tradeoff can be formulated as a channel coding problem, with the channel input distribution constrained by an average "estimation cost" constraint.

The rest of this paper is organized as follows. Section II introduces the channel model and the capacity-distortion function, and Section III formulates the equivalent constrained channel coding problem. Section IV illustrates the application of the capacity-distortion function through several simple examples. Section V briefly discusses some related issues including multiple cost constraints, channel uncertainty, and capacity per unit distortion. Finally, Section VI concludes the paper.

II. CHANNEL MODEL

We consider the channel model in Figure 1. For a length-n block of channel inputs, a message M is equiprobably selected from {1, ..., ⌈e^{nR}⌉} and is encoded by the encoder, generating the corresponding channel inputs {X_1, ..., X_n}. We provide the following definition.

Definition 1 (Encoder): An encoder is defined by a function f_n : M = {1, ..., ⌈e^{nR}⌉} → X^n, for each n ∈ N.

[Fig. 1. Channel model for joint communication and channel estimation: the message M enters the encoder, which emits inputs X_i; the channel P(y|x,s), driven by the state S_i, emits outputs Y_i; the joint decoder and estimator produces M̂ and Ŝ_i.]

The channel is described by a transition function P(y|x,s), which is the probability distribution of the channel output Y conditioned on the channel input X and the channel state S. Upon receiving the length-n block of channel outputs, the joint decoder and estimator (defined below) declares M̂ ∈ {1, ..., ⌈e^{nR}⌉} as the decoded message, together with a length-n block of estimates of the channel state.

For technical purposes, in this paper we assume that the random channel state evolves with time in a memoryless fashion. We note that this model encompasses the block interference channel model, because we can treat a block as a super-symbol and thus convert a block interference channel into a memoryless channel.

Definition 2 (Joint decoder and estimator): A joint decoder and estimator is defined by a pair of functions, g_n : Y^n → M and h_n : Y^n → S^n, for each n ∈ N.

This definition differs from that of the conventional channel decoder (e.g., [9]) in that it explicitly requires estimation of the channel state S at the receiver. The quality of estimation is measured by the distortion function d : S × S → R_+ ∪ {0}. That is, if Ŝ_i is the ith element of h_n(Y^n), then d(S_i, Ŝ_i) denotes the distortion at time i, i = 1, ..., n. For technical convenience, we assume that d(·,·) is bounded from above, so that there exists a finite T > 0 with d(s, s') ≤ T < ∞ for any s, s' ∈ S. Note that for length-n block coding schemes, the average distortion is given by

    d̄(S^n, Ŝ^n) = (1/n) ∑_{i=1}^n d(S_i, Ŝ_i).    (1)
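To make the model concrete, the following minimal sketch simulates one such memoryless state-dependent channel — the scalar multiplicative channel studied later in Section IV-B — and evaluates the block-average distortion (1) under a simple one-shot state estimator. The i.i.d. inputs and the estimator are illustrative placeholders, not the coding schemes of Definitions 1 and 2.

```python
import numpy as np

# Sketch: the scalar multiplicative channel of Section IV-B, Y_i = S_i * X_i,
# with Hamming distortion. I.i.d. Bernoulli(p) inputs stand in for a codeword.
rng = np.random.default_rng(0)
n, r, p = 100_000, 0.3, 0.6            # block length, P(S=1), P(X=1)

S = (rng.random(n) < r).astype(int)    # memoryless channel state
X = (rng.random(n) < p).astype(int)    # channel inputs
Y = S * X                              # channel law P(y|x,s)

# One-shot state estimate: when X=1, Y reveals S; when X=0, guess the prior
# mode (S_hat = 0, since r <= 1/2).
S_hat = np.where(X == 1, Y, 0)

avg_distortion = np.mean(S != S_hat)   # block-average distortion, cf. (1)
print(avg_distortion, (1 - p) * r)     # empirical vs. analytical (1-p)r
```

The empirical average matches the estimation cost (1 − p)r derived in Section IV-B.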
Finally, we have the following definitions.

Definition 3 (Achievable rate): A nonnegative number R(D) is an achievable rate if there exists a sequence of encoders and corresponding joint decoders and estimators such that (a) the average probability of decoding error

    P_e^{(n)} = (1/⌈e^{nR(D)}⌉) ∑_{m=1}^{⌈e^{nR(D)}⌉} Pr[M̂ ≠ m | M = m]

tends to zero as n → ∞; and (b) the average distortion in channel state estimation satisfies

    limsup_{n→∞} E d̄(S^n, Ŝ^n) ≤ D.    (2)

Definition 4 (Capacity-distortion function): The capacity-distortion function is defined as

    C(D) = sup_{f_n, g_n, h_n} R(D).    (3)

Remark: The reader may want to distinguish between the capacity-distortion function and the rate-distortion function in lossy source coding [9]. The capacity-distortion function is defined with respect to a state-dependent channel, seeking to characterize the fundamental tradeoff between the rate of information transmission and the distortion of state estimation. In contrast, the rate-distortion function is defined with respect to a source distribution, seeking to characterize the fundamental tradeoff between the rate of its lossy description and the achievable distortion due to the description.

III. A CONSTRAINED CHANNEL CODING FORMULATION

In this section, we show that the joint communication and channel estimation problem can be equivalently formulated as a constrained channel coding problem. For this purpose, the following minimum conditional distortion function, defined for each possible realization of the channel input X, will be important:

    d*(x) = inf_{h_0 : X×Y→S} E[d(S, h_0(x, Y))],    (4)

where the expectation is with respect to the channel state S and the channel output Y conditioned upon the channel input X = x, and h_0 : X × Y → S denotes an arbitrary one-shot estimator of S given the channel input and output.
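For discrete alphabets, (4) can be computed exactly: the infimum is attained by the estimator that, for each output y, picks the state estimate with minimum posterior expected distortion. The sketch below is a minimal transcription of this observation; the array layout and the closing check against the Section IV-B channel (where d*(0) = r and d*(1) = 0) are illustrative choices, not from the paper.

```python
import itertools
import numpy as np

def d_star(P_S, P_Y_XS, d, x):
    """Minimum conditional distortion d*(x) of (4), discrete alphabets.

    P_S[s]       : prior distribution of the state S
    P_Y_XS[y,x,s]: channel law P(y|x,s)
    d[s, s_hat]  : distortion function
    """
    total = 0.0
    for y in range(P_Y_XS.shape[0]):
        w = P_S * P_Y_XS[y, x, :]      # joint weight P(s) P(y|x,s) over s
        # optimal one-shot estimate: s_hat with least posterior expected distortion
        total += min(float(w @ d[:, s_hat]) for s_hat in range(len(P_S)))
    return total

# Check on the scalar multiplicative channel of Section IV-B (Y = S*X,
# Hamming distortion, r <= 1/2): expect d*(0) = r and d*(1) = 0.
r = 0.3
P_S = np.array([1 - r, r])
P_Y_XS = np.zeros((2, 2, 2))
for x, s in itertools.product(range(2), range(2)):
    P_Y_XS[x * s, x, s] = 1.0          # deterministic law y = x*s
d = 1.0 - np.eye(2)                    # Hamming distortion
print(d_star(P_S, P_Y_XS, d, 0), d_star(P_S, P_Y_XS, d, 1))   # 0.3, 0.0
```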
The following theorem establishes the constrained channel coding formulation.

Theorem 1: The capacity-distortion function for the channel model in Figure 1 is given by

    C(D) = sup_{P_X ∈ P_D} I(X;Y),    (5)

where

    P_D = { P_X : ∑_{x∈X} P_X(x) d*(x) ≤ D }.    (6)

Remark: Theorem 1 applies to general input/output/state alphabets. If X is a continuous random variable, the summation in (6) should be understood as an integral over X.
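Before turning to the proof, note that for a channel with a small input alphabet, (5)-(6) can be evaluated by direct search over the input distribution. A minimal sketch for the scalar multiplicative channel of Section IV-B, where the input distribution is described by the single parameter p = P(X = 1), I(X;Y) = H_2(pr) − p·H_2(r), and the estimation cost is (1 − p)r; natural logarithms throughout, and the grid size is an arbitrary choice.

```python
import numpy as np

def H2(t):
    """Binary entropy in nats, with 0 log 0 = 0."""
    return 0.0 if t <= 0.0 or t >= 1.0 else -t*np.log(t) - (1-t)*np.log(1-t)

def C_of_D(D, r, grid=10_001):
    """Grid-search evaluation of (5)-(6) for the channel of Section IV-B."""
    best = 0.0
    for p in np.linspace(0.0, 1.0, grid):
        if (1 - p) * r <= D:                         # feasibility, cf. (6)
            best = max(best, H2(p * r) - p * H2(r))  # I(X;Y) at this p
    return best

r = 0.3
print([round(C_of_D(D, r), 4) for D in (0.0, 0.05, 0.15, 0.3)])
```

At D = 0 only p = 1 is feasible and the rate collapses to zero, anticipating the linear behavior (13); Section IV-B derives the closed form.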
In order to prove Theorem 1, we shall employ the following lemmas.

Lemma 1: For any (f_n, g_n, h_n)-sequence that achieves C(D), as n → ∞, the achieved average distortion (2) is (in probability) equal to the average distortion with Ŝ^n replaced by

    Ŝ^n = h_n*(X^n, Y^n),    (7)

where h_n*(X^n, Y^n) denotes the block-n estimator that achieves the minimum average distortion conditioned upon both the block-n channel inputs and outputs.

Proof: For each n, let us replace the estimator h_n by h_n* in (7), with its first argument being the channel inputs X̂^n corresponding to the decoded message M̂. When M̂ = M, the minimum average distortion is achieved by h_n*; when M̂ ≠ M, the increment in the average distortion due to replacing h_n by h_n* is bounded from above because d(·,·) ≤ T < ∞. By Definitions 3 and 4, as n → ∞, the average probability of decoding error P_e^{(n)} → 0. Hence as n → ∞, the minimum average distortion is achieved by h_n*(X̂^n, Y^n), which is further equal to (7), in probability. Q.E.D.

Lemma 1 shows that the joint decoder and estimator can utilize the reliably decoded channel inputs for channel state estimation. The next lemma, Lemma 2, further shows that the length-n block estimator can be decomposed into n one-shot estimators, one for each channel use.

Lemma 2: For any (f_n, g_n, h_n)-sequence that achieves C(D), as n → ∞, the achieved average distortion (2) is (in probability) equal to that achieved by

    Ŝ_i = h_0*(X_i, Y_i),  i = 1, ..., n,    (8)

where h_0*(X_i, Y_i) denotes the one-shot estimator that achieves the minimum expected distortion for S_i conditioned upon both the channel input X_i and output Y_i.

Proof: From Lemma 1, as n → ∞, h_n(Y^n) is in probability equivalent to h_n*(X^n, Y^n). The decomposition (8) then follows because the channel is memoryless. For each fixed n, we have

    P(S^n | X^n, Y^n) = P(X^n, Y^n, S^n) / P(X^n, Y^n)
                      = ∏_{i=1}^n P(S_i, X_i, Y_i) / ∑_{S^n} ∏_{i=1}^n P(S_i, X_i, Y_i)
                      = ∏_{i=1}^n [ P(S_i, X_i, Y_i) / ∑_{S_i} P(Y_i|X_i, S_i) P(S_i) P_X(X_i) ]
                      = ∏_{i=1}^n P(S_i | X_i, Y_i).    (9)

As we take n → ∞, the lemma is established. Q.E.D.

Proof of Theorem 1: From Lemmas 1 and 2, we can rewrite the average distortion constraint (2) as

    limsup_{n→∞} (1/n) ∑_{i=1}^n E d(S_i, Ŝ_i) ≤ D
    ⇒ limsup_{n→∞} (1/n) ∑_{i=1}^n E d(S_i, h_0*(X_i, Y_i)) ≤ D.    (10)

Utilizing (4) and the fact that the channel is memoryless, we can further deduce from (10) that

    E d*(X) ≤ D.    (11)

So now the constraints in Definition 3 reduce to having P_e^{(n)} → 0 as n → ∞, subject to the constraint (11). This is exactly the problem of channel coding with a cost constraint on the input distribution, and Theorem 1 directly follows from standard proofs; see, e.g., [10]. Q.E.D.

Discussion:

(1) The proof of Theorem 1 suggests that the joint decoder and estimator first decode the transmitted message in a "non-coherent" fashion, then utilize the reconstructed channel inputs along with the channel outputs to estimate the channel states. As the coding block length grows large, such a two-stage procedure becomes asymptotically optimal.

(2) For each x ∈ X, d*(x) quantifies its associated minimum distortion. Alternatively, d*(x) can be viewed as the "estimation cost" due to signaling with x. Hence the average distortion constraint in (6) regulates the input distribution such that the signaling is estimation-efficient. We emphasize that d*(x) depends on the channel through the distribution of the channel state S, and thus differs from other usual costs such as symbol energies or time durations.

(3) A key condition that leads to the constrained channel coding formulation is that the channel is memoryless. Due to the memoryless property, we can decompose a block estimator into multiple one-shot estimators, without loss of optimality asymptotically. If the channel state evolves with time in a correlated fashion, then such a decomposition is generally suboptimal.

IV. ILLUSTRATIVE EXAMPLES

In this section, we discuss several simple examples to illustrate the application of Theorem 1.

A. Uniform Estimation Costs

A special case is that d*(x) = d_0 for all x ∈ X. For such channels, the average cost constraint in (6) exhibits a singular behavior. If D < d_0, then the joint communication and channel estimation problem is infeasible; otherwise, P_D consists of all possible input distributions, and thus the capacity-distortion function C(D) is equal to the unconstrained capacity of the channel. One of the simplest channels with uniform estimation costs is the additive channel Y_i = X_i + S_i, for which the receiver, having reliably decoded M, can simply subtract X from Y.

B. A Scalar Multiplicative Channel

Consider the following scalar multiplicative channel

    Y_i = S_i X_i,    (12)

where all the alphabets are binary, X = Y = S = {0,1}, and the multiplication is in the conventional sense for real numbers. The reader may interpret S as the status of an informed jamming source, a fading level, or the status of another transmitter. Setting S to its "effective status" S = 0 shuts down the link between X and Y; otherwise, the link X → Y is essentially noiseless. We take the distortion measure to be the Hamming distance: d(s, ŝ) = 1 if and only if ŝ ≠ s, and zero otherwise.

The tradeoff between communication and channel estimation is straightforward to observe from the nature of the channel: for good estimation of S, we want X = 1 as often as possible, whereas this would reduce the achieved information rate. In this example, we assume that P(S = 1) = r ≤ 1/2. We shall optimize P(X = 1), denoted by p ∈ [0,1]. The channel mutual information is I(X;Y) = H_2(pr) − p·H_2(r), where H_2(·) denotes the binary entropy function H_2(t) = −t log t − (1−t) log(1−t). For x = 0, the optimal one-shot estimator is Ŝ = 0 (note that P(S = 1) = r ≤ 1/2), and the resulting minimum conditional distortion is d*(0) = r. For x = 1, the optimal one-shot estimator is Ŝ = Y = S, leading to d*(1) = 0. Therefore the input distribution should satisfy (1−p)r ≤ D.

After manipulations, we find that the optimal solution is given by

    if D ≥ r − [1 + e^{H_2(r)/r}]^{−1}:  p* = (1/r)·[1 + e^{H_2(r)/r}]^{−1},  and C(D) = H_2(p*r) − p*·H_2(r);
    else:  p* = 1 − D/r,  and C(D) = H_2(r−D) − (1 − D/r)·H_2(r).

From the solution, we observe the following. For relatively large D, the average distortion constraint is not active, and thus the optimal input distribution coincides with that for the unconstrained channel capacity. As the estimation distortion constraint D falls below a threshold, the average distortion constraint becomes active, and the capacity-distortion function C(D) deviates from the unconstrained channel capacity. We can show from the expression of C(D) that, as D → 0,

    C(D) = [log(1−r)/(−r)]·D + o(D).    (13)

Figure 2 depicts C(D) versus D for different values of r; the tradeoff between communication rates and estimation distortions is evidently visible.

[Fig. 2. Capacity-distortion function C(D) versus D for the scalar multiplicative channel, for r = 0.1, 0.2, 0.3, 0.4, 0.5.]
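The closed-form solution above is straightforward to evaluate numerically; the sketch below transcribes the two branches (natural logarithms, H2 as in the earlier sketches) and checks the slope (13) at D → 0.

```python
import numpy as np

def H2(t):
    return 0.0 if t <= 0.0 or t >= 1.0 else -t*np.log(t) - (1-t)*np.log(1-t)

def scalar_mult_CD(D, r):
    """Closed-form C(D) for the scalar multiplicative channel, r <= 1/2."""
    p_unc = (1.0 / r) / (1.0 + np.exp(H2(r) / r))   # unconstrained maximizer
    if D >= r - r * p_unc:
        p = p_unc                                    # constraint inactive
    else:
        p = 1.0 - D / r                              # constraint active: (1-p)r = D
    return H2(p * r) - p * H2(r)

r = 0.3
print([round(scalar_mult_CD(D, r), 4) for D in (0.0, 0.05, 0.15, 0.3)])
eps = 1e-6                                           # slope at D -> 0, cf. (13)
print(scalar_mult_CD(eps, r) / eps, np.log(1 - r) / (-r))
```

The values agree with the grid-search evaluation given after Theorem 1.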
C. A Block Multiplicative Channel

A generalization of the scalar multiplicative channel is the following block multiplicative channel

    Y_i = S_i X_i,    (14)

where X and Y are length-K blocks, so that the super-symbols in the block memoryless channel have alphabets X^K = Y^K = {0,1}^K. The channel state S ∈ S = {0,1} remains fixed within each block, and changes in a memoryless fashion across blocks. We again adopt the Hamming distance as the distortion measure.

For such a channel, there are 2^K possible vectors for an input super-symbol. However, we note that all of them except the all-zero x = 0 are symmetric. This is because they all lead to the same conditional distribution for Y as well as the same minimum conditional distortion d*(x) = 0, ∀x ≠ 0. So from the concavity property of channel mutual information in input distributions, the optimal input distribution should take the following form:

    P_X(0) = 1 − p,  and  P_X(x) = p/(2^K − 1), ∀x ≠ 0.

We can find that the channel mutual information per channel use is

    I(X;Y)/K = (1/K)·{ H_2(pr) + p·[ r log(2^K − 1) − H_2(r) ] },    (15)

and that the average distortion constraint is

    (1−p)r ≤ D,    (16)

the same as that in the scalar multiplicative channel case. After some manipulations, we find that the resulting optimal solution for general K ≥ 1 is

    Case 1 (2^K > 1 + (1−r)^{−1/r}):
        p* = 1,  and C(D) = r·log(2^K − 1)/K.

    Case 2 (2^K ≤ 1 + (1−r)^{−1/r}):
        if D ≥ r − [1 + e^{H_2(r)/r}/(2^K − 1)]^{−1} ≥ 0:  p* = (1/r)·[1 + e^{H_2(r)/r}/(2^K − 1)]^{−1};
        else:  p* = 1 − D/r;
        and C(D) = (1/K)·{ H_2(p*r) + p*·[ r log(2^K − 1) − H_2(r) ] }.

Case 1 arises because, if the channel block length K is sufficiently large such that 2^K > 1 + (1−r)^{−1/r}, then the p* given by Case 2 would be greater than one, which is impossible for a valid probability. In Case 1, we have P_X(0) = 0, and all the nonzero symbols are selected with equal probability 1/(2^K − 1).

In fact, Case 1 kicks in for rather small values of K. In our channel model we have assumed r ∈ [0, 1/2]. For r smaller than 0.175, Case 1 arises for K ≥ 2; and for r larger than 0.175, Case 1 arises for K ≥ 3.

In the scalar multiplicative channel (K = 1), we have noticed that C(D) linearly scales to zero as D → 0; see (13). For K > 1, however, we have

    C(0) = r·log(2^K − 1)/K > 0.    (17)

For comparison, let us consider a suboptimal approach based upon training, which transmits X = 1 in the first channel use of each channel block. The receiver can thus perfectly estimate the channel state S and achieve D = 0. The encoder then can use the remaining (K−1) channel uses in each channel block to encode information, and the resulting achievable rate is

    R(0) = r·log(2^{K−1})/K.    (18)

Comparing C(0) and R(0), we notice that their ratio approaches one as K → ∞, consistent with the intuition that training usually leads to negligible rate loss for channels with long coherence blocks.
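The two cases transcribe directly into code, reusing H2 from the earlier sketches; the comparison against the training rate (18) at D = 0 is an illustrative check.

```python
import numpy as np

def H2(t):
    return 0.0 if t <= 0.0 or t >= 1.0 else -t*np.log(t) - (1-t)*np.log(1-t)

def block_mult_CD(D, r, K):
    """C(D) per channel use for the block multiplicative channel."""
    M = 2**K - 1                                   # number of nonzero super-symbols
    if 2**K > 1 + (1 - r)**(-1.0 / r):             # Case 1: p* = 1 for every D
        return r * np.log(M) / K
    p_unc = (1.0 / r) / (1.0 + np.exp(H2(r) / r) / M)
    p = p_unc if D >= r - r * p_unc else 1.0 - D / r   # Case 2
    return (H2(p * r) + p * (r * np.log(M) - H2(r))) / K

r = 0.3
for K in (1, 2, 3, 5, 10):
    C0 = block_mult_CD(0.0, r, K)                  # cf. (17)
    R0 = r * np.log(2**(K - 1)) / K                # training rate, cf. (18)
    print(K, round(C0, 4), round(R0, 4))           # ratio R0/C0 -> 1 as K grows
```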
V. FURTHER ISSUES

In this section, we briefly discuss a few issues that are related to the capacity-distortion function formulation.

A. Multiple Estimators and Other Cost Constraints

In certain applications, multiple cost constraints may be present. For example, the receiver may be simultaneously interested in two or more different distortion measures, or the transmitter may have an average energy constraint on the channel input, besides the average distortion constraint. The multiple cost constraints can be simultaneously satisfied by replacing the feasible set of input distributions P_D in (6) with the intersection of multiple feasible sets, each corresponding to one cost constraint.

For either single or multiple cost constraints, the capacity-distortion function can be defined following Section II, formulated as a constrained channel coding problem following Section III, and computed with efficient algorithms such as the Blahut-Arimoto algorithm [11], [12] for discrete alphabets.
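A minimal sketch of one standard way to fold the cost constraint into Blahut-Arimoto: maximize the Lagrangian I(X;Y) − λ·E[d*(X)] for a fixed multiplier λ ≥ 0, then sweep λ to trace out (E[d*(X)], I(X;Y)) pairs on the C(D) curve. Here W is P(y|x) with the state already marginalized out; the iteration count and example values are illustrative choices.

```python
import numpy as np

def kl_rows(W, p_y):
    """D( W(.|x) || p_y ) for every input letter x (natural log)."""
    with np.errstate(divide="ignore", invalid="ignore"):
        log_ratio = np.where(W > 0, np.log(W / p_y), 0.0)
    return (W * log_ratio).sum(axis=1)

def ba_constrained(W, cost, lam, iters=2000):
    """Blahut-Arimoto iteration for max_q [ I(X;Y) - lam * E_q[cost(X)] ]."""
    q = np.full(W.shape[0], 1.0 / W.shape[0])
    for _ in range(iters):
        D_x = kl_rows(W, q @ W)
        q = q * np.exp(D_x - lam * cost)
        q /= q.sum()
    D_x = kl_rows(W, q @ W)
    return q, float(q @ D_x), float(q @ cost)   # (input dist, rate, distortion)

# Scalar multiplicative channel, r = 0.3: W(y|x) marginalized over S,
# with estimation costs d*(0) = r and d*(1) = 0.
r = 0.3
W = np.array([[1.0, 0.0],        # x = 0: Y = 0 surely
              [1.0 - r, r]])     # x = 1: Y = S
cost = np.array([r, 0.0])
for lam in (0.0, 0.5, 2.0, 8.0):
    q, rate, dist = ba_constrained(W, cost, lam)
    print(f"lam={lam}: rate={rate:.4f} nats at D={dist:.4f}")
```

For λ = 0 the iteration returns the unconstrained capacity point; large λ pushes E[d*(X)] toward zero, recovering the low-D end of the curve in Section IV-B.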
B. Uncertainty in Channel State Statistics

The constrained channel coding formulation in Section III can also be extended to the case in which the distribution of the channel state S is uncertain. For such a compound channel setting, we assume that the joint channel distribution P_θ(x,s,y) = P(y|x,s)·P_X(x)·P_{S,θ}(s) is parametrized by an unknown parameter θ ∈ Θ, which is induced by the parametrized distribution of S, P_{S,θ}(s). If all the alphabets X, Y, and S are discrete, we can show, following the proof in [13], that the capacity-distortion function of the compound channel is

    sup_{P_X ∈ P_D} inf_{θ∈Θ} I_θ(X;Y),    (19)

where

    P_D = { P_X : ∑_{x∈X} P_X(x)·d*_θ(x) ≤ D, ∀θ ∈ Θ }.    (20)

In I_θ(X;Y) and d*_θ(x), the subscript θ denotes that they are evaluated with respect to P_θ(x,s,y).

C. Capacity Per Unit Distortion

In light of the definition of channel capacity per unit cost for general cost-constrained channels [14], we can analogously define the capacity per unit distortion, and show that it is equal to

    C_d = sup_{P_X} I(X;Y)/E[d*(X)].

The capacity per unit distortion quantifies the maximum efficiency measured by the ratio between the amount of transmitted information and the incurred distortion in channel state estimation.

From [14], if d*(x) = 0 for at least two different input letters, then C_d = ∞; if there exists a unique x_0 ∈ X with d*(x_0) = 0, then C_d is also given by

    C_d = sup_{x∈X, x≠x_0} D(P_{Y|x} ‖ P_{Y|x_0}) / d*(x),    (21)

where D(·‖·) denotes the Kullback-Leibler divergence between two distributions. Here, note that in P_{Y|X} we marginalize over the channel state S.

Given (21), we can then conveniently evaluate C_d for various channels. For example, the scalar multiplicative channel in Section IV-B has C_d = log(1−r)/(−r), matching the slope in (13). In contrast, block multiplicative channels in Section IV-C with K ≥ 2 have C_d = ∞, because all input letters except 0 lead to d*(·) = 0.

VI. CONCLUSIONS

In this paper, we introduce a joint communication and channel estimation problem for state-dependent channels, and characterize its fundamental tradeoff by formulating it as a channel coding problem with the input distribution constrained by an average "estimation cost" constraint. The resulting capacity-distortion function permits a systematic investigation of a channel's capability for communication and state estimation. Future research topics include specializing the general framework to particular channel models in realistic applications, and generalizing the results to multiuser systems and to channels with generally correlated state processes.

ACKNOWLEDGMENT

This work has been supported in part by NSF OCE0520324, the Annenberg Foundation, and the University of Southern California.

REFERENCES

[1] R. Szewczyk, E. Osterweil, J. Polastre, M. Hamilton, A. Mainwaring, and D. Estrin, "Habitat Monitoring with Sensor Networks," Communications of the ACM, vol. 47, no. 6, pp. 34-40, Jun. 2004.
[2] M. Stojanovic, "Recent Advances in High-Speed Underwater Acoustic Communications," IEEE J. Oceanic Eng., vol. 21, no. 2, pp. 125-137, Apr. 1996.
[3] S. Haykin, "Cognitive Radio: Brain-Empowered Wireless Communications," IEEE J. Select. Areas Commun., vol. 23, no. 2, pp. 201-220, Feb. 2005.
[4] D. Guo, S. Shamai (Shitz), and S. Verdú, "Mutual Information and Minimum Mean-Square Error in Gaussian Channels," IEEE Trans. Inform. Theory, vol. 51, no. 4, pp. 1261-1281, Apr. 2005.
[5] R. McEliece and W. Stark, "Channels with Block Interference," IEEE Trans. Inform. Theory, vol. 30, no. 1, pp. 44-53, Jan. 1984.
[6] B. Hassibi and B. Hochwald, "How Much Training is Needed in Multiple-Antenna Wireless Links?" IEEE Trans. Inform. Theory, vol. 49, no. 4, pp. 951-963, Apr. 2003.
[7] A. Sutivong, M. Chiang, T. M. Cover, and Y.-H. Kim, "Channel Capacity and State Estimation for State-Dependent Gaussian Channels," IEEE Trans. Inform. Theory, vol. 51, no. 4, pp. 1486-1496, Apr. 2005.
[8] T. M. Cover, Y.-H. Kim, and A. Sutivong, "Simultaneous Communication of Data and State," [Online] Available at ArXiv, 2007.
[9] T. M. Cover and J. A. Thomas, Elements of Information Theory, John Wiley & Sons, Inc., 1991.
[10] R. G. Gallager, Information Theory and Reliable Communication, Wiley, 1968.
[11] R. E. Blahut, "Computation of Channel Capacity and Rate-Distortion Functions," IEEE Trans. Inform. Theory, vol. 18, no. 4, pp. 460-478, Jul. 1972.
[12] S. Arimoto, "An Algorithm for Calculating the Capacity of an Arbitrary Discrete Memoryless Channel," IEEE Trans. Inform. Theory, vol. 18, no. 1, pp. 14-20, Jan. 1972.
[13] D. Blackwell, L. Breiman, and A. J. Thomasian, "The Capacity of a Class of Channels," The Annals of Mathematical Statistics, vol. 30, no. 4, pp. 1229-1241, Dec. 1959.
[14] S. Verdú, "On Channel Capacity Per Unit Cost," IEEE Trans. Inform. Theory, vol. 36, no. 5, pp. 1019-1030, Sep. 1990.
