Structure of optimal strategies for remote estimation over Gilbert-Elliott channel with feedback Jhelum Chakravorty Aditya Mahajan Electrical and Computer Engineering Electrical and Computer Engineering McGill University, Montreal, Canada McGill University, Montreal, Canada Email: [email protected] Email: [email protected] 7 1 0 2 Abstract—We investigate remote estimation over a Gilbert- that consider idealized channels without packet drops [2]–[9] n Elliotchannelwithfeedback.Weassumethatthechannelstateis andmodelsthatconsiderchannelswithi.i.d.packetdrops[10], a observedbythereceiverandfedbacktothetransmitterwithone [11]. J unitdelay. In addition,thetransmittergets ACK/NACKfeedback 0 for successful/unsuccessful transmission. Using ideas from team The salient features of remote estimation are as follows: 2 theory, we establish the structure of optimal transmission and estimation strategies and identify a dynamic program to deter- (F1) The decisions are made sequentially. ] mine optimal strategies with that structure. We then consider (F2) The reconstruction/estimation at the receiver must be C first-order autoregressive sources where the noise process has O unimodal and symmetric distribution. Using ideas from ma- done with zero-delay. jorizationtheory,weshowthattheoptimaltransmissionstrategy (F3) When a packet does get through, it is received without h. has a threshold structure and the optimal estimation strategy is noise. t Kalman-like. a m I. INTRODUCTION Remote estimation problems may be viewed as a special case of real-time communication [12]–[15]. As in real-time [ A. Motivation and literature overview communication, the key conceptual difficulty is that the data 1 We consider a remote estimation system in which a sen- available at the transmitter and the receiver is increasing with v sor/transmitter observes a first-order Markov process and time. Thus,the domainof the transmissionandthe estimation 3 causally decides which observationsto transmit to a remotely function increases with time. 4 located receiver/estimator. Communication is expensive and 9 Tocircumventthisdifficultyoneneedstoidentifysufficient takes place over a Gilbert-Elliot channel (which is used to 5 statistics for the data at the transmitter and the data at the 0 model channels with burst erasures). The channel has two receiver. In the real-time communication literature, dynamic . states: OFF state and ON state. When the channel is in the 1 teamtheory(ordecentralizedstochasticcontroltheory)isused 0 OFF state, a packettransmittedfromthe sensor to the receiver to identify such sufficient statistics as well as to identify a 7 is dropped. When the channel is in the ON state, a packet dynamic program to determine the optimal transmission and 1 transmittedfromthe sensorto the receiverisreceivedwithout v: error. We assume that the channel state is causally observed estimation strategies. Similar ideas are also used in remote- estimation literature. In addition, feature (F3) allows one to i atthe receiverand isfed backto thetransmitterwith one-unit X further simplify the structure of optimal transmission and delay. Whenever there is a successful reception, the receiver r estimation strategies. In particular, when the source is a first- sends an acknowledgment to the transmitter. The feedback is a order autoregressive process, majorization theory is used to assumed to be noiseless. show that the optimal transmission strategies is characterized At the time instances when the receiver does not receive a by a threshold [5]–[7], [10], [11]. In particular, it is optimal packet (either because the sensor did not transmit or because to transmit when the instantaneous distortion due to not the transmitted packet was dropped), the receiver needs to transmittingisgreaterthanathreshold.Theoptimalthresholds estimate the state of the source process. There is a funda- can be computed either using dynamic programming [5], [6] mental trade-off between communication cost and estimation or using renewal relationships [10], [16]. accuracy. Transmitting all the time minimizes the estimation error but incurs a high communication cost; not transmitting Alloftheexistingliteratureonremote-estimationconsiders at all minimizes the communication cost but incurs a high either channels with no packet drops or channels with i.i.d. estimation error. packet drops. In this paper, we consider packet drop channels Themotivationofremoteestimationcomesfromnetworked withMarkovianmemory.Weidentifysufficientstatisticsatthe controlsystems. Theearliestinstanceoftheproblemwasper- transmitter and the receiver. When the source is a first-order hapsconsideredbyMarschak[1]inthecontextofinformation autoregressiveprocess,weshowthatthreshold-basedstrategies gatheringinorganizations.Inrecentyears,severalvariationsof are optimalbutthe thresholddependson the previousstate of remote estimation has been considered.These include models the channel. B. The communication system 4) The receiver: At time t, the receiver generates an esti- 1) Source model: The source is a first-order time- mateXˆt ∈X ofXt.Thequalityoftheestimateisdetermined homogeneous Markov process {Xt}t≥0, Xt ∈ X. For ease by a distortion function d: X ×X →R≥0. of exposition, in the first part of the paper we assume that C. Information structure and problem formulation X is a finite set. We will later argue that a similar argument It is assumed that the receiver observes the channel state works when X is a general measurable space. The transition causally. Thus, the information available at the receiver1 is probability matrix of the source is denoted by P, i.e., for any x,y ∈X, I2 ={S ,Y }. P :=P(X =y |X =x). t 0:t 0:t xy t+1 t The estimate Xˆ is chosen according to 2) Channel model: The channel is a Gilbert-Elliott chan- t nel [17], [18]. The channel state {St}t≥0 is a binary-valued Xˆt =gt(It2)=gt(S0:t,Y0:t), (3) first-order time-homogeneous Markov process. We use the whereg iscalledtheestimationrule attimet. Thecollection convention that S = 0 denotes that the channel is in the t t g :=(g ,...,g )foralltimeiscalledtheestimationstrategy. OFF state and St = 1 denotes that the channel is in the ON 1 T It is assumed that there is one-step delayed feedback from state. The transition probability matrix of the channel state is thereceivertothetransmitter.2Thus,theinformationavailable denoted by Q, i.e., for r,s∈{0,1}, at the transmitter is Q :=P(S =s|S =r). rs t+1 t I1 ={X ,U ,S ,Y }. The input alphabet X¯ of the channel is X ∪{E}, where E t 0:t 0:t−1 0:t−1 0:t−1 denotes the event that there is no transmission. The channel The transmission decision Ut is chosen according to output alphabet Y is X ∪ {E ,E }, where the symbols E 0 1 0 U =f (I1)=f (X ,U ,S ,Y ), (4) and E are explained below. At time t, the channel input is t t t t 0:t 0:t−1 0:t−1 0:t−1 1 denoted by X¯t and the channel output is denoted by Yt. where ft is called the transmission rule at time t. The collec- The channel is a channel with state. In particular, for any tion f := (f ,...,f ) for all time is called the transmission 1 T realization (x¯0:T,s0:T,y0:T) of (X¯0:T,S0:T,Y0:T), we have strategy. that The collection (f,g) is called a communication strategy. The performance of any communication strategy (f,g) is P(Y =y |X¯ =x¯ ,S =s ) t t 0:t 0:t 0:t 0:t given by =P(Y =y |X¯ =x¯ ,S =s ) (1) t t t t t t T and J(f,g)=E λU +d(X ,Xˆ ) (5) t t t (cid:20)t=0 (cid:21) P(S =s |X¯ =x¯ ,S =s ) X t t 0:t 0:t 0:t−1 0:t−1 wheretheexpectationistakenwithrespecttothejointmeasure =P(St =st |St−1 =st−1)=Qst−1st (2) on all system variables induced by the choice of (f,g). We are interested in the following optimization problem. Note that the channel output Y is a deterministic function t of the input X¯t and the state St. In particular, for any x¯∈X¯ Problem 1 In the model described above, identify a com- and s∈{0,1}, the channel output y is given as follows: munication strategy (f∗,g∗) that minimizes the cost J(f,g) defined in (5). x¯, if x¯∈X and s=1 y =E1, if x¯=E and s=1 II. MAIN RESULTS E0, if s=0 A. Structure of optimal communication strategies This means that ifthere is a transmission (i.e., x¯ ∈ X) and Two-types of structural results are established in the real- the channel is on (i.e., s = 1), then the receiver observes x¯. time communication literature: (i) establishing that part of However, if there is no transmission (i.e., x¯ = E) and the the data at the transmitter is irrelevant and can be dropped channel is on (i.e., s = 1), then the receiver observes E1, if without any loss of optimality; (ii) establishing that the com- the channel is off, then the receiver observes E0. mon information between the transmitter and the receiver 3) The transmitter: Thereis noneed forchannelcodingin can be “compressed” using a belief state. The first structural aremote-estimationsetup.Instead,theroleofthetransmitteris results were first established by Witsenhausen [12] while the todeterminewhichsourcerealizationsneedto betransmitted. second structural results were first established by Walrand Let Ut ∈{0,1} denote the transmitter’s decision. We use the Varaiya [13]. convention that U = 0 denotes that there is no transmission t (i.e., X¯ = E) and U = 1 denotes that there is transmission 1Weusesuperscript1todenotevariablesatthetransmitterandsuperscript2 t 1 (i.e., X¯ =X ). todenotevariables atthereceiver. t t 2Note that feedback requires two bits: the channel state St is binary and Transmission is costly. Each time the transmitter transmits the channel output Yt can be communicated by indicating whether Yt ∈X (i.e., Ut =1), it incurs a cost of λ. ornot(i.e.,transmitting an ACKoraNACK). We establish both types of structural results for remote Theorem 1 In Problem 1, we have that: estimation. First, we show that (X0:t−1,U0:t−1) is irrelevant 1) Structure of optimal strategies: There is no loss of op- at the transmitter (Lemma 1); then, we use the common timality in restricting attention to optimal transmission information approach of [19] and establish a belief-state for and estimation strategies of the form: the common information (S ,Y ) between the transmitter 0:t 0:t and the receiver (Theorem 1). Ut =ft∗(Xt,St−1,Π1t), (10) Xˆ =g∗(Π2). (11) Lemma 1 For any estimation strategy of the form (3), there t t t is no loss of optimalityin restricting attention to transmission 2) Dynamic program: Let ∆(X) denote the space of strategies of the form probability distributions on X. Define value functions V1: {0,1}×∆(X)→R andV2: {0,1}×∆(X)→R Ut =ft(Xt,S0:t−1,Y0:t−1). (6) t t as follows. The proof idea is similar to [14]. We show that {Xt,S0:t−1,Y0:t−1}t≥0 is a controlled Markov process con- VT1+1(s,π1)=0, (12) trolled by {U } . See Section III for proof. t t≥0 and for t∈{T,...,0} Now,following[19], foranytransmission strategyf of the form(6)andanyrealization(s ,y )of(S ,Y ),define 0:T 0:T 0:T 0:T V1(s,π1)= min λπ1(B (ϕ)) ϕ : X →{0,1} as t 1 t ϕ:X→{0,1} n ϕt(x)=ft(x,s0:t−1,y0:t−1), ∀x∈X. +Wt0(π1,ϕ)π1(B0(ϕ))+ Wt1(π1,ϕ,x)π1(x) Furthermore, define conditional probability measures π1 and x∈XB1(ϕ) o t (13) π2 on X as follows: for any x∈X, t V2(s,π2)=min d(x,xˆ)π2(x)+V1 (s,π2P), πt1(x):=Pf(Xt =x|S0:t−1 =s0:t−1,Y0:t−1 =y0:t−1), t xˆ∈Xx∈X t+1 X π2(x):=Pf(X =x|S =s ,Y =y ). (14) t t 0:t 0:t 0:t 0:t where, We call πt1 the pre-transmission belief and π2 the post- Wt0(π1,ϕ)=Qs0Vt2(0,π1)+Qs1Vt2(1,π1|ϕ), transmission belief. Note that when (S0:T,Y0:T) are random W1(π1,ϕ,x)=Q V2(0,π1)+Q V2(1,δ ). variables,thenπ1 andπ2 are alsorandomvariableswhichwe t s0 t s1 t x t t denote by Π1 and Π2. Let Ψ (s,π1) denote the arg min of the right hand side t t t For the ease of notation, for any ϕ: X → {0,1} and i ∈ of (13). Then, the optimal transmission strategy of the {0,1}, define the following: form (10) is given by • Bi(ϕ)={x∈X :ϕ(x)=i}. f∗(·,s,π1)=Ψ (s,π1). • For any probability distribution π on X and any subset t t A of X, π(A) denotes π(x). Furthermore, the optimal estimation strategy of the x∈A • For any probability distribution π on X, ξ =π|ϕ means form (11) is given by that ξ(x)=1 π(Px)/π(B (ϕ)). {ϕ(x)=0} 0 g∗(π2)=argmin d(x,xˆ)π2(x). (15) t Lemma 2 Givenanytransmissionstrategyf oftheform (6): xˆ∈X x∈X X 1) there exists a function F1 such that The proof idea is as follows. Once we restrict attention π1 =F1(π2)=π2P. (7) to transmission strategies of the form (6), the information t+1 t t structure is partial history sharing [19]. Thus, one can use the 2) there exists a function F2 such that commoninformationapproachof[19]andobtainthestructure π2 =F2(π1,ϕ ,y ). (8) of optimal strategies. See Section III for proof. t t t t Remark 1 The first term in (13) is the expected communi- In particular, cation cost, the second term is the expected cost-to-go when δ if y ∈X the transmitter does not transmit, and the third term is the yt t π2 = π1| , if y =E (9) expected cost-to-go when the transmitter transmits. The first t t ϕt t 1 πt1, if yt =E0. ttehremexinpe(c1t4e)dicsotsht-etoe-xgpoe.cted distortion and the second term is Note that in (7), we are treating πt2 as a row-vector and Remark 2 Although the above model and result are stated in (9), δ denotes a Dirac measure centered at y . The yt t forsourceswith finite alphabets,theyextendnaturallytogen- update equations (7) and (8) are standard non-linear filtering eral state spaces (including Euclidean spaces) under standard equations. See Section III for proof. technical assumptions. See [20] for details. B. Optimality of threshold-basedstrategies for autoregressive Now, for any ˘ı1 = (x˘ ,s˘ ,y˘ ) = (x˘ ,s˘,y˘,˘ı1), t+1 t+1 0:t 0:t t+1 t t t source we use the shorthand P(˜ı1 |˜ı1 ,u ) to denote P(I˜1 = t+1 0:t 0:t t+1 In this section, we consider a first-order autoregressive ˘ı1t+1|I˜01:t =˜i10:t,U0:t =u0:t). Then, source {Xt}t≥0, Xt ∈ R, where the initial state X0 = 0 P(˘ı1t+1|i1t,ut)=P(x˘t+1,s˘t,y˘t,˘ı1t|x0:t,s0:t−1,y0:t−1,u0:t) and for t≥0, we have that (=a)P(x˘ ,s˘,y˘,˘ı1|x ,x¯ ,s ,y ,u ) t+1 t t t 0:t 0:t 0:t−1 0:t−1 0:t X =aX +W , (16) t+1 t t (=b)P(x˘ |x )P(y˘|x¯ ,s˘)P(s˘|s )1 where a ∈ R and W ∈ R is distributed according to a t+1 t t t t t t−1 {˘ı1t=˜ı1t} t =P(˘ı1 |˜ı1,u ) (19) symmetric and unimodal distribution with probability density t+1 t t function µ. Furthermore, the per-step distortion is given by where we have added x¯ in the conditioning in (a) because 0:t d(Xt−Xˆt), where d(·) is a even function that is increasing x¯0:t is a deterministic function of (x0:t,u0:t) and (b) follows on R≥0. The rest of the model is the same as before. from the source and the channel models. By marginaliz- For the above model, we can further simplify the result of ing (19), we get that for any˘ı2 =(s˘,y˘,˘ı1), we have t t t t Theorem 1. See Section IV for the proof. P(˘ı2|i1,u )=P(˘ı2|˜ı1,u ) (20) t t t t t t Theorem 2 For a first-order autoregressive source with sym- Now, let c(X ,U ,Xˆ )=λU +d(X ,Xˆ ) denote the per- metric and unimodal disturbance, t t t t t t step cost. Recall that Xˆ =g (I2). Thus, by (20), we get that t t t 1) Structure of optimal estimation strategy: The optimal estimation strategy is given as follows: Xˆ0 = 0, and E[c(X ,U ,Xˆ )|i1,u ]=E[c(X ,U ,Xˆ )|˜ı1,u ]. (21) t t t t t t t t t t for t≥0, Eq.(19)showsthat{I˜1} isacontrolledMarkovprocess t t≥0 Xˆ = aXˆt−1, if Yt ∈{E0,E1} (17) controlled by {Ut}t≥0. Eq. (21) shows that I˜t1 is sufficient t (Yt, if Yt ∈R for performance evaluation. Hence, by Markov decision the- ory [21], there is no loss of optimality in restricting attention 2) Structure of optimal transmission strategy: There exist to transmission strategies of the form (6). threshold functions k : {0,1} → R such that the t ≥0 B. Proof of Lemma 2 following transmission strategy is optimal: Consider 1, if |X −aXˆ |≥k (S ) ft(Xt,St−1,Π1t)=(0, othertwise. t−1 t t−1 πt1+1(xt+1)=P(xt+1|s0:t,y0:t) (18) = P(x |x )P(x |s ,y ) t+1 t t 0:t 0:t Remark 3 As long as the receiver can distinguish between xXt∈X the events E0 (i.e., St =0) and E1 (i.e., Ut =0 and St =1), = Pxtxt+1πt2(xt)=πt2P (22) the structure of the optimal estimator does not depend on the xXt∈X channel state information at the receiver. which is the expression for F1(·). ForF2,we considerthethreecasesseparately.Fory ∈X, Remark 4 It can be shown that under the optimal strategy, t we have Π2 is symmetric and unimodal around Xˆ and, therefore, t t Π1 is symmetric and unimodal around aXˆ . Thus, the π2(x)=P(X =x|s ,y )=1 . (23) t t−1 t t 0:t 0:t {x=yt} transmission and estimation strategies in Theorem 2 depend For y ∈{E ,E }, we have on the pre- and post-transmission beliefs only through their t 0 1 means. πt2(x)=P(Xt =x|s0:t,y0:t) P(X =x,y ,s |s ,y ) Remark 5 Recall that the distortion function is even and = t t t 0:t−1 0:t−1 (24) increasing.Therefore,thecondition|X −aXˆ |≥k (S ) P(yt,st|s0:t−1,y0:t−1) t t−1 t t−1 can bewritten as d(Xt−aXˆt−1)≥k˜t(St−1):=d(kt(St−1)). Now, when yt =E0, we have that Thus, the optimal strategy is to transmit if the per-step P(x ,y ,s |s ,y )=P(y |x ,ϕ (x ),s )Q π1(x ) distortion due to not transmitting is greater than a threshold. t t t 0:t−1 0:t−1 t t t t t st−1st t t III. PROOF OFTHESTRUCTURAL RESULTS (=a) Qst−11πt1(xt), if ϕt(xt)=0 and st =1 (25) (0, otherwise A. Proof of Lemma 1 where (a) is obtained from the channel model. Substitut- Arbitrarilyfixtheestimationstrategygandconsiderthebest ing (25) in (24) and canceling Q 1 from the nu- response strategy at the transmitter. We will show that I˜1 := st−11 {st=1} t meratorandthedenominator,weget(recallthatthisisforthe (Xt,S0:t−1,Y0:t−1) is an information state at the transmitter. case when y =E ), t 0 Given any realization (x ,s ,y ,u ) of the 0:T 0:T 0:T 0:T 1 π1(x) (sxys0t:et,ms0:tv−a1r,iayb0l:te−s1,(uX0:0t:−T1,)S0a:Tnd,Y0˜ı:1tT,U=0:T()x,t,dse0:fit−ne1,yi01t:t−1=). πt2(x)= {ϕπtt1(x(B)=00(}ϕ)t) . (26) Similarly, when yt =E1, we have that IV. PROOF OF OPTIMALITY OF THRESHOLD-BASED STRATEGIESFORAUTOREGRESSIVE SOURCE P(x ,y ,s |s ,y )=P(y |x ,ϕ (x ),s )Q π1(x ) t t t 0:t−1 0:t−1 t t t t t st−1st t t A. A change of variables (=b) Qst−10πt1(xt), if st =0 (27) Defineaprocess{Zt}t≥0 asfollows:Z0 =0andfort≥0, (0, otherwise aZ , if Y ∈{E ,E } t−1 t 0 1 Z = where (b) is obtained from the channel model. Substitut- t (Yt, if Yt ∈X ing (27) in (24) and canceling Q 1 from the nu- st−10 {st=0} Note that Z is a functionof Y . Next, define processes meratorandthedenominator,weget(recallthatthisisforthe t 0:t−1 case when yt =E1), {Et}t≥0, {Et+}t≥0, and {Eˆt}t≥0 as follows: E :=X −aZ , E+ :=X −Z , Eˆ :=Xˆ −Z π2(x)=π1(x). (28) t t t−1 t t t t t t t t The processes {E } and {E+} are related as follows: t t≥0 t t≥0 By combining (23), (26) and (28), we get (9). E =0, E+ =0, and for t≥0 0 0 E , if Y ∈{E ,E } C. Proof of Theorem 1 E+ = t t 0 1 t (0, if Yt ∈X Once we restrict attention to transmission strategies of the and form (6), the information structure is partial history shar- ing[19].Thus,onecanusethecommoninformationapproach Et+1 =aEt++Wt. of [19] and obtain the structure of optimal strategies. Since X −Xˆ = E+ −Eˆ , we have that d(X −Xˆ ) = Following [19], we split the information available at each t t t t t t d(E+−Eˆ ). agent into a “common information” and “local information”. t t It turns out that it is easier to work with the processes Common information is the information available to all deci- {E } , {E+} , and {Eˆ } rather than {X } and sion makers in the future; the remaining data at the decision t t≥0 t t≥0 t t≥0 t t≥0 {Xˆ } . maker is the local information. Thus, at the transmitter, the t t≥0 Next, redefine the pre- and post-transmission beliefs in common information is C1 := {S ,Y } and the local t 0:t−1 0:t−1 terms of the error process. With a slight abuse of notation, information is L1 := X . Similarly, at the receiver, the t t we still denote the (probability density) of the pre- and post- common information is C2 := {S ,Y } and the local t 0:t 0:t transmission beliefs as π1 and π2. In particular, π1 is the informationisL2 :=∅.Whenthetransmittermakesadecision, t t t t conditional pdf of E given (s ,y ) and π2 is the the state (sufficient for input output mapping) of the system t 0:t−1 0:t−1 t conditional pdf of E+ given (s ,y ). is (X ,S ); when the receiver makes a decision, the state t 0:t 0:t t t−1 Let H ∈ {E ,E ,1} denote the event whether the trans- of the system is (X ,S ). By [19, Proposition 1], we get that t 0 1 t t mission was successful or not. In particular, the sufficient statistic Θ1 for the common information at the t transmitter is E , if Y =E 0 t 0 H = E , if Y =E Θ1(x,s)=P(X =x,S =s|S ,Y ), t 1 t 1 t t t−1 0:t−1 0:t−1 1, if Yt ∈R. and the sufficient statistic Θ2t for the common information at We use ht to denote therealization of Ht. Note that Ht is a the receiver is deterministic function of U and S . t t The time-evolutions of π1 and π2 is similar to Lemma 2. Θ2(x,s)=P(X =x,S =s|S ,Y ). t t t t t 0:t 0:t In particular, we have Note thatΘ1 is equivalentto (Π1,S ) and Θ2 is equivalent Lemma 3 Givenanytransmissionstrategyf oftheform(4): t t t−1 t to (Π2t,St). Therefore, by [19, Theorem 2], there is no loss 1) there exists a function F1 such that of optimality in restricting attention to transmission strategies of the form (10) and estimation strategies of the form πt1+1 =F1(πt2). (30) In particular, Xˆ =g (S ,Π2). (29) t t t t π˜2⋆µ, if y ∈{E ,E } Furthermore, the dynamic program of 1 follows from [19, πt1+1 =(µt, if ytt ∈R,0 1 (31) Theorem 3]. Notethattherighthandsideof(14)impliesthatXˆ doesnot where π˜2 given by π˜2(e) := (1/|a|)π2(e/a) is the t t t t dependonS .Thus,insteadof(29),wecanrestrictattentionto conditional probability density of aE+, µ is the prob- t t estimationstrategyof theform(11). Furthermore,theoptimal ability density function of W and ⋆ is the convolution t estimation strategy is given by (15). operation. 2) there exists a function F2 such that Property 2 If π1 is SU(0) and ϕ ∈ F(0), then for any h ∈ {E ,E ,1}, F2(π1,ϕ,h) is SU(0). π2 =F2(π1,ϕ ,h ). (32) 0 1 t t t t Proof: We prove the result for each h ∈ {E ,E ,1} In particular, 0 1 separately.Recalltheupdateofπ1 givenby(33).Forh =E , t 0 δ0, if ht =1 π2 = π1 and hence π2 is SU(0). For ht = E1, π2 = π1|ϕ; πt2 =πt1|ϕt, if ht =E1 (33) SifUϕ(0∈).FFo(0r)h, th=en1,ππ1(2x=)1δ{ϕ,(xw)=hi0c}hisisSSUU((00)).and hence π1 is πt1, if ht =E0. t 0 Property 3 If π2 is SU(0), then F1(π2) is also SU(0). The key difference between Lemmas 2 and 3 (and the reasonthatweworkwiththeerrorprocess{Et}t≥0ratherthan Proof: Recall that F1 is given by (31). The property {X } ) is thatthe functionF2 in (32) dependson h rather follows from the fact that convolution of symmetric and t t≥0 t than y . Consequently, the dynamic program of Theorem 1 is unimodal distributions is symmetric and unimodal. t now given by C. SU majorization and its properties V1 (s,π1)=0, (34) T+1 For any set A, let I denote its indicator function, i.e., A and for t∈{T,...,0} IA(x) is 1 if x∈X, else 0. Let A be a measurable set of finite Lebesgue measure, its V1(s,π1)= min λπ1(B (ϕ)) symmetric rearrangement Aσ is the open interval centered t 1 ϕ:R→{0,1} (35) around origin whose Lebesgue measure is same as A. n +W0(π1,ϕ)π1(B (ϕ))+W1(π1,ϕ)π1(B (ϕ)) Given a function ℓ: R → R, its super-level set at level ρ, t 0 t 1 ρ ∈ R, is {x ∈ R : ℓ(x) > ρ}. The symmetric decreasing Vt2(s,π2)=D(π2)+Vt1+1(s,F1(π2)), o (36) rearrangementℓσ ofℓisasymmetricanddecreasingfunction where, whose level sets are the same as ℓ, i.e., ∞ Wt0(π1,ϕ)=Qs0Vt2(0,π1)+Qs1Vt2(1,π1|ϕ), ℓσ(x)= I{z∈R:ℓ(z)>ρ}σ(x)dρ. Wt1(π1,ϕ)=Qs0Vt2(0,π1)+Qs1Vt2(1,δ0), Z0 Given two probability density functions ξ and π over R, ξ D(π2)=min d(e−eˆ)π2(e)de. majorizes π, which is denoted by ξ (cid:23) π, if for all ρ≥0, eˆ∈R R m Z Again, note that due to the change of variables, the ex- ξσ(x)dx≥ πσ(x)dx. pression for W1 does not depend on the transmitted symbol. t Z|x|≥ρ Z|x|≥ρ Consequently, the expression for V1 is simpler than that in t Given two probability density functions ξ and π over R, ξ Theorem 1. SU majorizes π, which we denote by ξ (cid:23) π, if ξ is SU and a B. Symmetric unimodal distributions and their properties ξ majorizes π. A probability density function π on reals is said to be Now,westatesomepropertiesofSUmajorizationfrom[5]. symmetricandunimodal(SU)aroundc∈Rifforanyx∈R, Property 4 For any ξ (cid:23) π, where ξ is SU(c) and for any a π(c−x) = π(c+x) and π is non-decreasing in the interval prescriptionϕ,letθ ∈F(c)beathreshold-basedprescription (−∞,c] and non-increasing in the interval [c,∞). such that Given c ∈ R, a prescription ϕ: R → {0,1} is called threshold based around c if there exists k ∈R such that ξ(B (θ))=π(B (ϕ)), i∈{0,1}. i i 1, if |e−c|≥k Then, ξ| (cid:23) π| . Consequently, for any h∈{E ,E ,1}, ϕ(e)= θ a ϕ 0 1 (0, if |e−c|<k. F2(ξ,θ,h)(cid:23) F2(π,ϕ,h). a LetF(c)denotethe familyofallthreshold-basedprescription around c. For c = 0, the result follows from [5, Lemma 7 and 8]. The Now, we state some properties of symmetric and unimodal result for general c follows from change of variables. distributions.. Property 5 For any ξ (cid:23) π, F1(ξ)(cid:23) F1(π). m a Property 1 If π is SU(c), then This follows from [5, Lemma 10]. Recall the definition of D(π2) given after (36). c∈argmin d(e−eˆ)π(e)de. eˆ∈RZR Property 6 If ξ (cid:23)a π, then For c = 0, the above property is a special case of [5, Lemma 12]. The result for general c follows from a change D(π)≥D(πσ)≥D(ξσ)=D(ξ). of variables. This follows from [5, Lemma 11]. D. Qualitative properties of the value function and optimal or, equivalently, in terms of the {E } process: t t≥0 strategy 1, if |E |≥k˜ (S ,Π1) Lemma 4 The value functions V1 and V2 of (34)–(36), f (E ,S ,Π1)= t t t−1 t (40) satisfy the following property. t t t t t−1 t (0, otherwise. (P1) Foranyi∈{1,2},s∈{0,1},t∈{0,...,T},andpdfs We prove (40) by induction. Note that π1 = δ which 0 0 ξi and πi such that ξi (cid:23)a πi, we have that Vti(s,ξi)≤ is SU(0). Therefore, by (P2), there exists a threshold-based Vti(s,πi). prescription ϕ0 ∈ F(0) that is optimal. This forms the basis Furthermore, the optimal strategy satisfies the following ofinduction.Nowassumethatuntiltimet−1,allprescriptions properties. For any s∈{0,1} and t∈{0,...,T}: are in F(0). By Properties 2 and 3, Π1 is SU(0). Therefore, t (P2) ifπ1isSU(c),thenthereexistsaprescriptionϕt ∈F(c) by(P2),thereexistsa threshold-basedprescriptionϕt ∈F(0) that is optimal. In general, ϕ depends on π1. that is optimal. This proves the induction step and, hence, by t (P3) if π2 is SU(c), then the optimal estimate Eˆ is c. theprincipleofinduction,threshold-basedprescriptionsofthe t form (40) are optimal for all time. Translating the result back Proof: We proceed by backward induction. V1 (s,π1) T+1 to {Xt}t≥0, we get that threshold-based prescriptions of the trivially satisfies the (P1). This forms the basis of induction. form (39) are optimal. NowassumethatV1 (s,π1)alsosatisfies(P1).Forξ2 (cid:23) π2, t+1 a Observe that Properties 2 and 3 also imply that for all t, we have that Π2 is SU(0). Therefore, by Property 1, the optimal estimate t V2(s,π2)=D(π2)+V1 (s,F1(π2)) Eˆt = 0. Recall that Eˆt = Xˆt −Zt. Thus, Xˆt = Zt. This t t+1 proves the first part of Theorem 2. (a) ≥ D(ξ2)+V1 (s,F1(ξ2)) To prove that there exist optimal transmission strategies t+1 =V2(s,ξ2), (37) where the thresholds do not depend on Π1t, we fix the t estimation strategy to be of the form (17) and consider the where (a) follows from Properties 5 and 6 and the induction problemoffindingthebesttransmissionstrategyatthesensor. hypothesis. Eq. 37 implies that V2 also satisfies (P1). This is a single-agent(centralized)stochastic controlproblem t Now, consider ξ1 (cid:23) π1. Let ϕ be the optimal prescription and the optimal solution is given by the following dynamic a atπ1.Letθ bethethreshold-basedprescriptioncorresponding program: to ϕ as defined in Property 3. By construction, J (e,s)=0 (41) T+1 π1(B (ϕ))=ξ1(B (θ)) and π1(B (ϕ))=ξ1(B (θ)). 0 0 1 1 and for t∈{T,...,0} Moreover, from Property 3 and (37), J (e,s)=min{J0(e,s),J1(e,s)} (42) W0(π1,ϕ)≥W0(ξ1,θ) and W1(π1,ϕ)≥W1(ξ1,θ). t t t t t t t where Combining the above two equations with (35), we get V1(s,π1)=λπ1(B (ϕ))+W0(π1,ϕ)π1(B (ϕ)) Jt0(e,s)=d(e)+Qs0EW[Jt+1(ae+W,0)] t 1 0 +Q E [J (ae+W,1)], (43) +W1(π1,ϕ)π1(B (ϕ)) s1 W t+1 1 J1(e,s)=λ+Q d(e)+Q E [J (ae+W,0)] ≥λξ1(B (θ))+W0(ξ1,θ)ξ1(B (θ)) t s0 s0 W t+1 1 0 +Q E [J (W,1)], (44) +W1(ξ1,θ)ξ1(B (θ)) s1 W t+1 0 ≥V1(s,ξ1) (38) We now use the results of [22] to show that the value t function even and increasing on R (abbreviated to EI). ≥0 where the last inequality follows by minimizing over all θ. Theresultsof[22]relyonstochasticdominance.Giventwo Eq. (38) implies that Vt1 also satisfies (P1). Hence, by the probability density functions ξ and π over R , ξ stochasti- ≥0 principle of induction, (P1) is satisfied for all time. cally dominates π, which we denote by ξ (cid:23) π, if s The argument in (38) also implies (P2). Furthermore, (P3) follows from Property 1. ξ(x)dx≥ π(x)dx, ∀y ∈R . ≥0 E. Proof of Theorem 2 Zx≥y Zx≥y Now, we show that dynamic program (41)–(44) satisfies We first prove a weaker version of the structure of optimal conditions (C1)–(C3) of [22, Theorem 1]. In particular, we transmission strategies. In particular, there exist threshold have: Condition (C1) is satisfied because the per-step cost functions k˜ : {0,1}×∆(R) → R such that the following t ≥0 functions d(e) and λ + Q d(e) are EI. Condition (C2) is s0 transmission strategy is optimal: satisfiedbecausetheprobabilitydensityµofW iseven,which t f (X ,S ,Π1)= 1, if |Xt−aZt−1|≥k˜t(St−1,Π1t) implies that for any e∈R≥0, t t t−1 t (0, otherwise. µ(ae+w)dw = µ(−ae+w)dw. (39) Zw∈R Zw∈R Now,to checkcondition(C3),definefore∈R andy ∈R≥0, V. CONCLUSION In this paper, we studied remote estimation over a Gilbert- ∞ −y M0(y|e)= µ(ae+w)dw+ µ(ae+w)dw Elliotchannelwithfeedback.Weassumethatthechannelstate Zy Z−∞ isobservedbythereceiverandfedbacktothetransmitterwith y one unit delay. In addition, the transmitter gets ACK/NACK =1− µ(ae+w)dw, feedbackforsuccessful/unsuccessfultransmission.Usingideas Z−y ∞ −y from team theory, we establish the structure of optimal M1(y|e)= µ(w)dw+ µ(w)dw. transmission and estimation strategies and identify a dynamic Zy Z−∞ program to determine optimal strategies with that structure. We then consider first-order autoregressive sources where the M1(y|e) does not depend on e and is thus trivially even and noiseprocesshasunimodalandsymmetricdistribution.Using increasing in e. Since µ is even, M0(y|e) is even in e. We show that M0(y|e) is increasing in e for e ∈ R later (see ideas from majorization theory, we show that the optimal ≥0 transmissionstrategyhasathresholdstructureandtheoptimal Lemma 5). estimation strategy is Kalman-like. Sinceconditions(C1)–(C3)of[22,Theorem1]aresatisfied, A natural question is how to determine the optimal thresh- we have that for any s ∈ {0,1}, J (e,s) is even in e and t olds. For finite horizon setup, these can be determined using increasing for e∈R . Now, observe that ≥0 thedynamicprogramof(41)–(44).Forinifinitehorizonsetup, we expectthat the optimalthreshold will not dependon time. J0(e,s)−J1(e,s)=(1−Qs0)d(e)+Qs1EW[Jt+1(ae+W,1)] We believe that it should be possible to evalute the perfor- −λ−Q E [J (W,1)] manceofagenericthresholdbasedstrategyusinganargument s1 W t+1 similartotherenewaltheorybasedargumentpresentedin[16] which is even in e and increasing in e ∈ R . Therefore, for channels without packet drops. ≥0 for any fixed s ∈ {0,1}, the set A of e in which J0(e,s)− t J1(e,s)≤0 is convex and symmetric around the origin, i.e., REFERENCES t asetoftheform[−k (s),k (s)].Thus,thereexistak (·)such t t t [1] J. Marschak, “Towards an economic theory of organization and infor- thatthe actionut =0 isoptimalfore∈[−kt(s),kt(s)]. This, mation,”Decisionprocesses, vol.3,no.1,pp.187–220,1954. proves the structure of the optimal transmission strategy. [2] O. C. Imer and T. Basar, “Optimal estimation with limited measure- ments,” Joint 44the IEEE Conference on Decision and Control and Lemma 5 For any y ∈ R , M0(y|e) is increasing in e, EuropeanControlConference, vol.29,pp.1029–1034,2005. ≥0 [3] M. Rabi, G. Moustakides, and J. Baras, “Adaptive sampling for linear e∈R . ≥0 state estimation,” SIAM Journal on Control and Optimization, vol. 50, no.2,pp.672–702,2012. Proof: To show that M0(y|e) is increasing in e for e ∈ [4] Y.XuandJ.P.Hespanha,“Optimalcommunicationlogicsinnetworked R ,itsufficiestoshowthat1−M0(y|e)= y µ(ae+w)dw controlsystems,”inProceedingsof43rdIEEEConferenceonDecision is≥d0ecreasinginefore∈R .Consideracha−nygeofvariables andControl,vol.4,2004,pp.3527–3532. ≥0 R [5] G. M. Lipsa and N. C. Martins, “Remote state estimation with com- x=ae+w. Then, munication costs for first-order LTI systems,” IEEE Transactions on AutomaticControl,vol.56,no.9,pp.2013–2025,2011. y y−ae [6] A. Nayyar, T. Basar, D. Teneketzis, and V. V. Veeravalli, “Optimal 1−M0(y|e)= µ(ae+w)dw = µ(x)dx (45) strategies for communication and remote estimation with an energy Z−y Z−y−ae harvesting sensor,” IEEE Transactions on Automatic Control, vol. 58, no.9,pp.2246–2260,2013. Taking derivative with respect to e, we get that [7] A. Molin and S. Hirche, “An iterative algorithm for optimal event- triggered estimation,” in4th IFACConference onAnalysis and Design ofHybridSystems (ADHS’12),2012,pp.64–69. ∂M0(y|e) [8] J. Wu, Q. S. Jia, K. H. Johansson, and L. Shi, “Event-based sensor =a[µ(y−ae)−µ(−y−ae)] (46) ∂e datascheduling: Trade-offbetween communication rate andestimation quality,” IEEE Transactions on Automatic Control, vol. 58, no. 4, pp. Now consider the following cases: 1041–1046, April2013. [9] D.Shi,L.Shi,andT.Chen,Event-BasedStateEstimation:AStochastic • Ifa>0andy >ae>0,thentherighthandsideof(46) Perspective. Springer, 2015,vol.41. [10] J. Chakravorty and A. Mahajan, “Remote state estimation with packet equals a[µ(y−ae)−µ(y+ae)], which is positive. drop,”in6thIFACWorkshoponDistributed EstimationandControlin • Ifa>0andae>y >0,thentherighthandsideof(46) Networked Systems,Sep2016. equals a[µ(ae−y)−µ(ae+y)], which is positive. [11] R. Xiaoqiang, W. Junfeng, J. K. Henrik, S. Guodong, and S. Ling, “Infinite horizon optimal transmission power control for remote state • If a < 0 and y > |a|e > 0, then the right hand side estimation overfadingchannels,” arxiv:1604.08680v1 [cs.SY],Apr29 of (46) equals |a|[µ(y −|a|e)−µ(y +|a|e)], which is 2016. positive. [12] H.S.Witsenhausen,“Onthestructureofreal-timesourcecoders,”BSTJ, vol.58,no.6,pp.1437–1451, July-August1979. • If a < 0 and |a|e > y > 0, then the right hand side [13] J. C. Walrand and P. Varaiya, “Optimal causal coding-decoding prob- of (46) equals |a|[µ(|a|e−y)−µ(|a|e+y)], which is lems,”IEEETrans.Inf.Theory,vol.29,no.6,pp.814–820,Nov.1983. positive. [14] D. Teneketzis, “On the structure of optimal real-time encoders and decodersinnoisycommunication,” IEEETrans.Inf.Theory,pp.4017– Thus, in all cases, M0(y|e) is increasing in e, e∈R≥0. 4035,Sep.2006. [15] A.MahajanandD.Teneketzis, “Optimaldesignofsequentialreal-time communication systems,”IEEETrans.Inf.Theory, vol.55,no.11,pp. 5317–5338, Nov.2009. [16] J. Chakravorty and A. Mahajan, “Fundamental limits of remote esti- mation of Markov processes under communication constraints,” IEEE Transactions onAutomaticControl, 2017(toappear). [17] E.N.Gilbert,“Capacityofaburst-noisechannel,”BellSystemTechnical Journal, vol.39,no.5,pp.1253–1265, 1960. [18] E.O.Elliott,“Estimatesoferrorratesforcodesonburst-noisechannels,” BellSystem Technical Journal, vol.42,no.5,pp.1977–1997, 1963. [19] A. Nayyar, A. Mahajan, and D. Teneketzis, “Decentralized stochastic control withpartial historysharing:Acommoninformation approach,” IEEETrans.Autom.Control, vol.58,no.7,pp.1644–1658, jul2013. [20] S. Yuksel, “On optimal causal coding of partially observed Markov sources in single and multiterminal settings,” IEEETrans. Inf. Theory, vol.59,no.1,pp.424–437, 2013. [21] P.R.KumarandP.Varaiya, Stochastic Systems:Estimation,Identifica- tionandAdaptiveControl. UpperSaddleRiver,NJ,USA:Prentice-Hall, Inc.,1986. [22] J. Chakravorty and A. Mahajan, “On evenness and monotonicity of value functions and optimal strategies in markov decision processes,” submitted toOperations ResearchLetters,2016.