


Sharp Bounds on the Entropy of the Poisson Law and Related Quantities

José A. Adell, Alberto Lekuona, and Yaming Yu, Member, IEEE

(José A. Adell and Alberto Lekuona are with the Departamento de Métodos Estadísticos, Facultad de Ciencias, Universidad de Zaragoza, Pedro Cerbuna 12, 50009 Zaragoza, Spain; e-mail: [email protected], [email protected]. Yaming Yu is with the Department of Statistics, University of California, Irvine, CA 92697-1250, USA; e-mail: [email protected].)

Abstract—One of the difficulties in calculating the capacity of certain Poisson channels is that H(λ), the entropy of the Poisson distribution with mean λ, is not available in a simple form. In this work we derive upper and lower bounds for H(λ) that are asymptotically tight and easy to compute. The derivation of such bounds involves only simple probabilistic and analytic tools. This complements the asymptotic expansions of Knessl (1998), Jacquet and Szpankowski (1999), and Flajolet (1999). The same method yields tight bounds on the relative entropy D(n, p) between a binomial and a Poisson, thus refining the work of Harremoës and Ruzankin (2004). Bounds on the entropy of the binomial also follow easily.

Index Terms—asymptotic expansion, binomial distribution, central moments, complete monotonicity, entropy bounds, integral representation, Poisson channel, Poisson distribution

I. INTRODUCTION

Unlike the differential entropy of the Gaussian distribution, the Shannon entropies of many basic discrete distributions, such as the Poisson, the binomial, or the negative binomial, are "not in closed form." In the Poisson case, the lack of a simple analytic expression is seen ([24], [22]) as one of the obstacles to obtaining the capacity of certain Poisson channels ([6], [12], [26], [27]). Computation of the entropy is also a basic problem partly motivated by the maximum entropy characterizations ([28], [25], [14], [29], [18], [30]) of these distributions.

One strategy to make the entropy functions more tractable is to express them as integrals ([10], [20], [24]). See [13], [21], [23] for related integral representations for entropy-like quantities in the context of Poisson channels. For the entropy functions themselves, integral representations have been used to derive asymptotic expansions ([10], [20]). Alternatively, asymptotic expansions can be obtained using analytic depoissonization ([16], [17]), singularity analysis ([11]), or local limit theorems ([9]).

It is obviously desirable to have bounds that accompany asymptotic expansions, for both theoretical analysis and numerical computation. This is especially true for quantities such as the entropy of the Poisson law, which may be used, e.g., in capacity calculations for discrete-time Poisson channels ([24], [22]). Part of this work aims to derive tight bounds on the entropy for fundamental distributions such as the Poisson and the binomial. Our results are expressed in terms of two sequences of lower and upper bounds, and have the following features.

• The sequence of upper bounds and the sequence of lower bounds each gives a full asymptotic expansion for the entropy. In other words, the bounds are asymptotically tight.
• The bounds are derived from familiar quantities such as the central moments of the Poisson, and are in a form simple enough for both theoretical analysis and numerical computation.
• The derivation, which involves only real analysis, is elementary. (Note that the asymptotic expansions of Knessl [20] are also real-analytic.)

Denote by Z₊ = {0, 1, 2, ...} and by N = Z₊ \ {0}. As usual, the Shannon entropy of a discrete random variable X on Z₊ with mass function f_i = Pr(X = i) is defined as

  H(X) = H(f) = -\sum_{i=0}^{\infty} f_i \log f_i,

where we use the natural logarithm and obey the convention 0 log 0 = 0. Throughout we use N_λ to denote a Poisson random variable with mean λ, i.e., the mass function is

  \Pr(N_\lambda = j) = \frac{e^{-\lambda}\lambda^j}{j!}, \quad j \in Z_+.

We write H(λ) = H(N_λ) for simplicity. The best known bound on H(λ) is perhaps

  H(\lambda) \le \frac{1}{2}\log\left(2\pi e\left(\lambda + \frac{1}{12}\right)\right),  (1)

which is obtained by bounding the differential entropy of N_λ + U, where U is an independent random variable uniformly distributed on (0, 1) ([8], Theorem 8.6.5). While (1) is simple in form and reasonably accurate, it lacks a corresponding lower bound, and does not extend easily to capture higher order terms in the expansion of H(λ). As a remedy we shall derive, for each m ≥ 1, a double inequality of the form

  -\sum_{k=m}^{2m} \frac{A(m,k)}{\lambda^k} \le H(\lambda) - \frac{1}{2}\log(2\pi\lambda) - \frac{1}{2} - \sum_{k=1}^{m-1} \frac{b(m,k)}{\lambda^k} \le \sum_{k=m}^{2m-1} \frac{B(m,k)}{\lambda^k},  (2)

where A(m,k), b(m,k), and B(m,k) are explicit constants (with the convention that the empty sum \sum_{k=1}^{0} ≡ 0). In other words, for each m ≥ 1, we give a finite asymptotic expansion in powers of λ⁻¹ with m exact terms and explicit lower and upper bounds of the order of λ⁻ᵐ.
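As a purely illustrative aside (an editorial addition, not part of the paper), the following self-contained Python sketch evaluates H(λ) by direct summation of −∑ p_j log p_j and compares it with the upper bound (1). The function name H_poisson and the truncation rule are our own choices.

```python
import math

def H_poisson(lam, jmax=None):
    """Entropy (in nats) of Poisson(lam), by direct summation of -p_j log p_j."""
    if jmax is None:
        jmax = int(lam + 20 * math.sqrt(lam) + 30)  # crude cutoff; tail is negligible
    h = 0.0
    for j in range(jmax):
        logp = -lam + j * math.log(lam) - math.lgamma(j + 1)  # log p_j
        h -= math.exp(logp) * logp
    return h

lam = 10.0
print(H_poisson(lam))                                       # ~ 2.5614
print(0.5 * math.log(2 * math.pi * math.e * (lam + 1/12)))  # bound (1): ~ 2.5744
```

Even at λ = 10 the gap in (1) is only about 0.013, but, as noted above, (1) comes with no matching lower bound; the double inequality (2) remedies this.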
In Section II we derive (in an equivalent form) the double inequality (2). The key steps are

• an integral representation that relates H(λ) to the simpler quantity E[log(N_λ + 1)];
• bounds on E[log(N_λ + 1)] in terms of polynomials in λ⁻¹, which translate easily to bounds on H(λ).

Note that (2) is only effective for large λ. We also obtain bounds on H(λ) in terms of polynomials in λ, which work for small λ.

Besides H(λ), we also consider bounds on the relative entropy between a binomial and a Poisson, thus obtaining a version of "the law of small numbers" that refines the results of Harremoës and Ruzankin [15]. While these are of theoretical interest, they also lead to new bounds for the entropy of the binomial. As usual, for two random variables X and Y on Z₊ with mass functions f and g respectively, the relative entropy is defined as

  D(X \| Y) = D(f \| g) = \sum_{j=0}^{\infty} f_j \log\frac{f_j}{g_j}.

By convention 0 log(0/0) = 0, and D(f‖g) = ∞ if f assigns mass outside of the support of g. Throughout we let B_{n,p} be a binomial random variable with mass function

  \Pr(B_{n,p} = k) = \binom{n}{k} p^k q^{n-k}, \quad k = 0, 1, \ldots, n,

where q ≡ 1 − p, p ∈ (0, 1) and n ∈ N. We consider

  D(n,p) = D(B_{n,p} \| N_{np}),

i.e., the relative entropy between B_{n,p} and N_{np}, and derive bounds on D(n, p) using similar techniques. Bounds on the entropy of the binomial H(B_{n,p}) are obtained as a corollary. Sections III and IV contain proofs of the main results. We conclude with a short discussion on possible extensions in Section V.

II. MAIN RESULTS

A. Sharp Bounds on H(λ)

We first present a class of double inequalities for H(λ) that is effective for small λ (say λ ≤ 1), but valid for all λ ≥ 0.

Theorem 1: For any λ ≥ 0 and m = 1, 2, ..., we have

  \sum_{k=2}^{2m+1} \frac{c(k)}{k!}\lambda^k \le H(\lambda) + \lambda\log\lambda - \lambda \le \sum_{k=2}^{2m} \frac{c(k)}{k!}\lambda^k,

where

  c(k) = \sum_{j=0}^{k-1} (-1)^{k-1-j}\binom{k-1}{j}\log(j+1), \quad k = 2, 3, \ldots.

For fixed m, the two bounds given by Theorem 1 differ by O(λ^{2m+1}). Hence they are most effective when λ is small. Moreover, inspection of the proof (Section III, Part B) shows that, for 0 ≤ λ ≤ 1, both the upper and lower bounds in Theorem 1 converge to H(λ) + λ log λ − λ as m → ∞.
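A minimal numerical check of Theorem 1 at a small λ (our sketch; the helper names c and H_poisson are ours, and direct summation of the entropy is adequate here):

```python
import math
from math import comb, log, factorial

def c(k):
    # c(k) = sum_{j=0}^{k-1} (-1)^(k-1-j) * C(k-1, j) * log(j+1)
    return sum((-1) ** (k - 1 - j) * comb(k - 1, j) * log(j + 1) for j in range(k))

def H_poisson(lam, jmax=200):
    h = 0.0
    for j in range(jmax):
        logp = -lam + j * log(lam) - math.lgamma(j + 1)
        h -= math.exp(logp) * logp
    return h

lam, m = 0.5, 3
shift = lam - lam * log(lam)          # Theorem 1 bounds H + lam*log(lam) - lam
lower = shift + sum(c(k) / factorial(k) * lam ** k for k in range(2, 2 * m + 2))
upper = shift + sum(c(k) / factorial(k) * lam ** k for k in range(2, 2 * m + 1))
print(lower, H_poisson(lam), upper)   # lower <= H(0.5) <= upper, very tight
```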
In what follows, the kth central moment of the Poisson distribution

  \mu_k(s) \equiv E[(N_s - s)^k]

plays an important role. The first few values of μ_k(s) are μ₀(s) = 1, μ₁(s) = 0, and

  \mu_2(s) = \mu_3(s) = s, \quad \mu_4(s) = 3s^2 + s, \quad \mu_5(s) = 10s^2 + s.

They obey the well-known recursion ([19], p. 162)

  \mu_k(s) = s\sum_{j=0}^{k-2} \binom{k-1}{j}\mu_j(s), \quad k \ge 2,  (3)

from which it is easy to show that, for k ≥ 2, μ_k(s) is a polynomial in s of degree ⌊k/2⌋, where ⌊x⌋ denotes the integer part of x.

In contrast to Theorem 1, Theorem 2 is most effective for large λ.

Theorem 2: For any λ > 0 and m = 1, 2, ..., we have

  -r_m(\lambda) \le H(\lambda) - \frac{1}{2}\log(2\pi\lambda) - \frac{1}{2} - \beta_m(\lambda) \le 0,

where

  \beta_m(\lambda) = \int_\lambda^\infty \sum_{j=3}^{2m+1} \frac{(-1)^{j-1}\mu_j(s)}{j(j-1)s^j}\,ds = \sum_{k=1}^{2m-1} \frac{b(m,k)}{\lambda^k}  (4)

and

  r_m(\lambda) = \int_\lambda^\infty \frac{\mu_{2m+2}(s)}{(2m+1)s^{2m+2}}\,ds = \sum_{k=m}^{2m} \frac{a(m,k)}{\lambda^k}.  (5)

Let us note that the seemingly cumbersome expressions (4) and (5) are actually quite easy to handle. For j ≥ 3, the jth central moment μ_j(s) is a polynomial in s of degree ⌊j/2⌋. This, together with (3), shows that the integrand in (4) is a polynomial in s⁻¹, with powers going from s⁻² to s⁻²ᵐ. Similar statements hold for the integrand in (5). Hence the constants a(m,k) and b(m,k) are obtained after straightforward integration; see Table I for their values for small m. In particular, for m = 2 we have

  -\frac{31}{24\lambda^2} - \frac{33}{20\lambda^3} - \frac{1}{20\lambda^4} \le H(\lambda) - \frac{1}{2}\log(2\pi\lambda) - \frac{1}{2} + \frac{1}{12\lambda} \le \frac{5}{24\lambda^2} + \frac{1}{60\lambda^3}.

TABLE I
VALUES OF a(m,k) AND b(m,k) FOR m = 1, 2, 3, 4.

  a(m,k):
    k |  m=1 |  m=2 |  m=3  |   m=4
    1 |   1  |      |       |
    2 |  1/6 |  3/2 |       |
    3 |      |  5/3 |   5   |
    4 |      | 1/20 | 35/2  |  105/4
    5 |      |      | 17/5  |   210
    6 |      |      | 1/42  | 2275/18
    7 |      |      |       |  167/21
    8 |      |      |       |   1/72

  b(m,k):
    k |  m=1 |  m=2  |   m=3   |    m=4
    1 |  1/6 | -1/12 |  -1/12  |   -1/12
    2 |      |  5/24 |  -1/24  |   -1/24
    3 |      |  1/60 | 103/180 |  -19/360
    4 |      |       |  13/40  |   201/80
    5 |      |       |  1/210  | 12367/2520
    6 |      |       |         |  571/1008
    7 |      |       |         |   1/504

We emphasize that the constants b(m,k), 1 ≤ k ≤ m−1, are exact in the full asymptotic expansion of H(λ), since (5) gives r_m(λ) = O(λ⁻ᵐ). For example, we have b(3,1) = −1/12 and b(3,2) = −1/24 from Table I, and hence

  H(\lambda) = \frac{1}{2}\log(2\pi\lambda) + \frac{1}{2} - \frac{1}{12\lambda} - \frac{1}{24\lambda^2} + O(\lambda^{-3}),

which agrees with the leading terms given by, e.g., Knessl ([20], Theorem 2).

Bounds on H(λ) given by Theorem 2 are illustrated in Fig. 1. As λ increases from 10 to 20, the gap between the upper and lower bounds, r_m(λ), decreases

• from ≈ 0.1 to ≈ 0.05 with m = 1,
• from ≈ 0.017 to ≈ 0.004 with m = 2, and
• from ≈ 0.0068 to ≈ 0.00074 with m = 3.

[Fig. 1. Bounds on H(λ) (left) and differences between upper and lower bounds (right) given by Theorem 2 with m = 1, 2, 3.]

In short, for moderate λ, the bounds are already quite accurate with m as small as 3, and, as expected, the accuracy improves as λ increases.
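Table I can be reproduced mechanically. The following sketch (an editorial illustration assuming SymPy is available; all names are ours) implements recursion (3) and performs the integrations in (4) and (5) for m = 2:

```python
import sympy as sp

s, lam = sp.symbols('s lamda', positive=True)

def poisson_central_moments(kmax):
    # recursion (3): mu_k(s) = s * sum_{j=0}^{k-2} binom(k-1, j) * mu_j(s)
    mu = [sp.Integer(1), sp.Integer(0)]
    for k in range(2, kmax + 1):
        mu.append(sp.expand(s * sum(sp.binomial(k - 1, j) * mu[j]
                                    for j in range(k - 1))))
    return mu

m = 2
mu = poisson_central_moments(2 * m + 2)
beta_integrand = sum((-1) ** (j - 1) * mu[j] / (j * (j - 1) * s ** j)
                     for j in range(3, 2 * m + 2))
r_integrand = mu[2 * m + 2] / ((2 * m + 1) * s ** (2 * m + 2))
print(sp.expand(sp.integrate(beta_integrand, (s, lam, sp.oo))))
# -1/(12*lamda) + 5/(24*lamda**2) + 1/(60*lamda**3)  (the b(2,k) column)
print(sp.expand(sp.integrate(r_integrand, (s, lam, sp.oo))))
# 3/(2*lamda**2) + 5/(3*lamda**3) + 1/(20*lamda**4)  (the a(2,k) column)
```

Substituting the printed coefficients into Theorem 2 gives exactly the m = 2 display above.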
B. Exact Formulae and Sharp Bounds for D(n, p)

While bounds on Poisson convergence are often stated in terms of the total variation distance ([4], [2]), those based on the relative entropy can also be quite effective ([21], [15]). As a discrepancy measure, relative entropy occurs naturally in contexts such as hypothesis testing. In this section we consider D(n, p) ≡ D(B_{n,p} ‖ N_{np}), and study its higher-order asymptotic behavior. Our main result, Theorem 4, may be regarded as a refinement of that of Harremoës and Ruzankin [15]. As a by-product, we also obtain bounds on the entropy of the binomial that parallel Theorem 2.

Analogous to Theorem 1, we have the following exact expansion for D(n, p) as a function of p.

Theorem 3: Fix n ∈ N. For p ∈ [0, 1] we have

  D(n,p) = n(p + q\log q) + \sum_{k=2}^{n} \binom{n}{k}\tilde{c}(k)p^k,

where

  \tilde{c}(k) = \sum_{j=0}^{k-1} (-1)^{k-1-j}\binom{k-1}{j}\log(n-j), \quad k = 2, \ldots, n.

In the following result, the kth central moment of B_{n,p},

  \mu_k(n,p) \equiv E[(B_{n,p} - np)^k],

plays an important role. The first few values of μ_k(n,p) are μ₀(n,p) = 1, μ₁(n,p) = 0, μ₂(n,p) = pqn,

  \mu_3(n,p) = (q-p)pqn, \quad \mu_4(n,p) = 3(pqn)^2 + (1 - 6pq)pqn.

We write q ≡ 1 − p throughout.

Analogous to Theorem 2, Theorem 4 gives bounds on D(n, p) that are effective for large n.

Theorem 4: For n, m ∈ N and p ∈ (0, 1), we have

  0 \le D(n,p) + \frac{p + \log q}{2} - \tilde\beta_m(n,p) \le \tilde{r}_m(n,p),

where

  \tilde\beta_m(n,p) = n\int_q^1 \sum_{j=3}^{2m+1} \frac{(-1)^j \mu_j(n,s)}{j(j-1)(ns)^j}\,ds = \sum_{k=1}^{2m-1} \frac{\tilde{b}(m,k;p)}{n^k}

and

  \tilde{r}_m(n,p) = n\int_q^1 \frac{\mu_{2m+2}(n,s)}{(2m+1)(ns)^{2m+2}}\,ds = \sum_{k=m}^{2m} \frac{\tilde{a}(m,k;p)}{n^k}.

The integrals that define β̃_m(n,p) and r̃_m(n,p) are easy to calculate, because μ_j(n,s) is a polynomial in s. We obtain the coefficients b̃(m,k;p) and ã(m,k;p) after integrating and assembling the results in powers of n⁻¹. For example, we have (m = 1)

  \tilde{b}(1,1;p) = -\frac{\log q}{2} + \frac{q}{3} - \frac{1}{6q} - \frac{1}{6},
  \tilde{a}(1,1;p) = 2\log q - q + \frac{1}{q},
  \tilde{a}(1,2;p) = -4\log q + 2q - \frac{7}{3q} + \frac{1}{6q^2} + \frac{1}{6},

and (m = 2)

  \tilde{b}(2,1;p) = \frac{p^2}{12q},  (6)
  \tilde{b}(2,2;p) = \frac{3}{2}\log q - \frac{1}{2}q + \frac{17}{12q} - \frac{5}{24q^2} - \frac{17}{24},
  \tilde{b}(2,3;p) = -3\log q + \frac{6}{5}q - \frac{5}{2q} + \frac{3}{8q^2} - \frac{1}{60q^3} + \frac{113}{120},
  \tilde{a}(2,2;p) = -9\log q + 3q - \frac{9}{q} + \frac{3}{2q^2} + \frac{9}{2},
  \tilde{a}(2,3;p) = 78\log q - 26q + \frac{83}{q} - \frac{18}{q^2} + \frac{5}{3q^3} - \frac{122}{3},
  \tilde{a}(2,4;p) = -72\log q + 24q - \frac{78}{q} + \frac{18}{q^2} - \frac{31}{15q^3} + \frac{1}{20q^4} + \frac{2281}{60}.
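As a hedged numerical illustration (ours, not from the paper), the next sketch computes D(n, p) by direct summation and compares it with the two leading terms −(p + log q)/2 + p²/(12qn) that (6) yields; the corresponding expansion of D(n, p) is displayed just below. The function name D is our own.

```python
import math
from math import comb, log

def D(n, p):
    """KL divergence D( Binomial(n,p) || Poisson(np) ), in nats."""
    lam = n * p
    total = 0.0
    for k in range(n + 1):
        w = comb(n, k) * p ** k * (1 - p) ** (n - k)
        if w > 0.0:
            # w * ( log w - log Pr(N_lam = k) )
            total += w * (log(w) + lam - k * log(lam) + math.lgamma(k + 1))
    return total

p = 0.1
q = 1 - p
for n in (50, 200, 800):
    approx = -(p + log(q)) / 2 + p ** 2 / (12 * q * n)
    print(n, D(n, p), approx)   # difference shrinks like O(n^-2)
```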
As in Theorem 2, we emphasize that, for each m ≥ 2, the constants b̃(m,k;p), 1 ≤ k ≤ m−1, are exact in the full asymptotic expansion of D(n,p) (for fixed p as n → ∞). In particular, from (6) we get

  D(n,p) = -\frac{p + \log q}{2} + \frac{p^2}{12qn} + O(n^{-2}),

for fixed p ∈ (0, 1). We can also estimate the rate at which D(n, λ/n) decreases to zero, for fixed λ > 0 as n → ∞, which corresponds to the usual binomial-to-Poisson convergence. Indeed, setting m = 1 and p = λ/n, Theorem 4 yields, after routine calculations,

  D(n, \lambda/n) = \frac{\lambda^2}{4n^2} + O(n^{-3}),

which can be further refined by using larger m.

Such results are related to those of Harremoës and Ruzankin [15], who give several bounds on D(n,p) after detailed analyses of inequalities involving Stirling numbers. Theorem 4 may be regarded as a refinement in that, by letting m be arbitrary, it can give a full asymptotic expansion of D(n,p), with computable bounds. Our derivation is also simpler (see Section IV).

Let us denote the entropy of the binomial by H(n,p) = H(B_{n,p}). In the special case p = 1/2, H(n,p) appears as the sum capacity of a noiseless n-user binary adder channel as analyzed by Chang and Weldon [7], who also provide simple bounds on H(n,p). For general p, asymptotic expansions for H(n,p) have been obtained by Jacquet and Szpankowski [17] and Knessl [20] (see also Flajolet [11]). We present sharp bounds on H(n,p) that complement such expansions. As it turns out, due to an elementary identity (see (29) below), bounds on D(n,p) obtained in Theorem 4 translate directly to those on H(n,p), thus simplifying our analysis.

Corollary 1: Let n, m ∈ N and p ∈ (0, 1). Define

  \tilde{H}(n,p) = H(n,p) - \log n! + n\log n - n - \frac{1 + \log(pq)}{2}.

Then

  0 \le -\tilde{H}(n,p) - \sum_{k=1}^{2m-1} \frac{\tilde{b}(m,k;p) + \tilde{b}(m,k;q)}{n^k} \le \sum_{k=m}^{2m} \frac{\tilde{a}(m,k;p) + \tilde{a}(m,k;q)}{n^k},

where ã(m,k;p) and b̃(m,k;p) are defined as in Theorem 4.

Bounds on H(n,p) can also be expressed in terms of log n and n⁻ᵏ, k = 1, 2, ..., via familiar bounds on log n!. For example, taking m = 1 in Corollary 1, and using (see [1], 6.1.42)

  \frac{1}{12n} - \frac{1}{360n^3} < \log n! - n\log n + n - \frac{1}{2}\log(2\pi n) < \frac{1}{12n},

we get

  \frac{C_1}{n} + \frac{C_2}{n^2} + \frac{C_3}{n^3} < H(n,p) - \frac{1}{2}\log(2\pi npq) - \frac{1}{2} < \frac{C_4}{n},  (7)

where

  C_1 = \frac{13}{12} - \frac{3\log(pq)}{2} - \frac{5}{6pq},
  C_2 = -\frac{7}{3} + 4\log(pq) + \frac{8}{3pq} - \frac{1}{6(pq)^2},
  C_3 = -\frac{1}{360},
  C_4 = \frac{1}{12} + \frac{\log(pq)}{2} + \frac{1}{6pq}.

To relate (7) to the Poisson case, i.e., Theorem 2, let us fix λ > 0 and set p = λ/n. Then, as n → ∞, we have H(n,p) → H(λ), and in the limit (7) becomes

  -\frac{5}{6\lambda} - \frac{1}{6\lambda^2} \le H(\lambda) - \frac{1}{2}\log(2\pi\lambda) - \frac{1}{2} \le \frac{1}{6\lambda},

which is precisely Theorem 2 with m = 1. In general (m ≥ 1), Theorem 2 may be seen as a limiting case of Corollary 1.

As before, for each m ≥ 2, the coefficients b̃(m,k;p) + b̃(m,k;q), k = 1, ..., m−1, are exact in the asymptotic expansion of H̃(n,p) (for fixed p as n → ∞). For large n, we may choose larger m to improve the accuracy.
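Both the identity (29) invoked above and the bound (7) can be checked numerically. The following self-contained sketch is our own (helper names pmf_binom, H_binom, and D are ours):

```python
import math
from math import comb, log

def pmf_binom(n, p):
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

def H_binom(n, p):
    return -sum(w * log(w) for w in pmf_binom(n, p) if w > 0.0)

def D(n, p):
    # KL divergence D( Binomial(n,p) || Poisson(np) )
    lam = n * p
    return sum(w * (log(w) + lam - k * log(lam) + math.lgamma(k + 1))
               for k, w in enumerate(pmf_binom(n, p)) if w > 0.0)

n, p = 100, 0.3
q = 1 - p

# identity (29): H(n,p) = log n! - n log n + n - D(n,p) - D(n,q)
print(H_binom(n, p))
print(math.lgamma(n + 1) - n * log(n) + n - D(n, p) - D(n, q))

# bound (7) with m = 1
pq = p * q
C1 = 13 / 12 - 1.5 * log(pq) - 5 / (6 * pq)
C2 = -7 / 3 + 4 * log(pq) + 8 / (3 * pq) - 1 / (6 * pq ** 2)
C3, C4 = -1 / 360, 1 / 12 + 0.5 * log(pq) + 1 / (6 * pq)
mid = H_binom(n, p) - 0.5 * log(2 * math.pi * n * pq) - 0.5
print(C1 / n + C2 / n ** 2 + C3 / n ** 3 < mid < C4 / n)  # expect True
```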
III. PROOFS OF THEOREMS 1 AND 2

A. Preliminaries

Given a function φ : R → R, the mth forward difference of φ is defined recursively by

  \Delta^0\varphi(x) = \varphi(x), \quad \Delta^{m+1}\varphi(x) = \Delta^m\varphi(x+1) - \Delta^m\varphi(x),

for any x ∈ R and m ∈ Z₊. Equivalently, we have

  \Delta^m\varphi(x) = \sum_{j=0}^{m} (-1)^{m-j}\binom{m}{j}\varphi(x+j), \quad x \in R, \ m \in Z_+.  (8)

(This definition extends to functions defined only on Z₊.) As usual, φ⁽ᵐ⁾ stands for the mth derivative of φ. If φ is infinitely differentiable, then

  \Delta^m\varphi(x) = E[\varphi^{(m)}(x + S_m)],  (9)

where S_m = U₁ + ··· + U_m and (U_k)_{k≥1} is a sequence of independent and identically distributed (i.i.d.) random variables having the uniform distribution on [0, 1] (cf. [3], Eqn. (2.7)).

On the other hand, two special properties of the Poisson distribution are

  \lambda E[\varphi(N_\lambda + 1)] = E[N_\lambda\varphi(N_\lambda)], \quad \lambda \ge 0,  (10)

and (see [3], Theorem 2.1, for example)

  \frac{d^m E[\varphi(N_\lambda)]}{d\lambda^m} = E[\Delta^m\varphi(N_\lambda)], \quad \lambda \ge 0, \ m \in Z_+.  (11)

In both (10) and (11), φ : Z₊ → R is an arbitrary function for which the relevant expectations exist. From (11) we have the Taylor formula

  E[\varphi(N_\lambda)] = \sum_{k=0}^{m} \frac{\Delta^k\varphi(0)}{k!}\lambda^k + \frac{1}{m!}\int_0^\lambda (\lambda - u)^m E[\Delta^{m+1}\varphi(N_u)]\,du,  (12)

for any λ ≥ 0 and m ∈ Z₊.

B. Proof of Theorem 1

Let λ ≥ 0. We have

  H(\lambda) = \lambda - \lambda\log\lambda + E[\log N_\lambda!]  (13)

by definition. Applying (12) to the function φ(k) = log k!, k ∈ Z₊, we obtain

  H(\lambda) + \lambda\log\lambda - \lambda = \sum_{k=2}^{m} \frac{\Delta^k\varphi(0)}{k!}\lambda^k + \frac{1}{m!}\int_0^\lambda (\lambda - u)^m E[\Delta^{m+1}\varphi(N_u)]\,du,  (14)

for any m = 2, 3, ..., since Δ⁰φ(0) = Δ¹φ(0) = 0.

On the other hand, letting g(x) = log(x+1), x ≥ 0, and using (8) and (9), we can write

  \Delta^{m+1}\varphi(i) = \Delta^m g(i) = (-1)^{m-1} E\left[\frac{(m-1)!}{(i+1+S_m)^m}\right]  (15)
  = \sum_{j=0}^{m} (-1)^{m-j}\binom{m}{j}\log(i+1+j),

where i ∈ Z₊ and m = 1, 2, .... Therefore,

  \Delta^k\varphi(0) = \sum_{j=0}^{k-1} (-1)^{k-1-j}\binom{k-1}{j}\log(j+1) = c(k),  (16)

for k = 2, 3, .... The conclusion follows from (14) and (16) by noting that, based on (15), the integral in (14) alternates in sign for m = 1, 2, ....

C. Proof of Theorem 2

Some auxiliary results are needed in the proof of Theorem 2. In Lemmas 1 and 2 we denote

  \tilde{H}(\lambda) = H(\lambda) - \frac{1}{2}\log(2\pi\lambda) - \frac{1}{2}, \quad \lambda > 0,  (17)

for convenience.

Lemma 1: We have H̃(λ) = O(λ⁻¹), as λ → ∞.

Proof: See [10] or [20].

Lemma 2 expresses the quantity of interest in terms of an easier quantity, E[log(N_s + 1)].

Lemma 2: For λ > 0 we have

  \tilde{H}(\lambda) = \int_\lambda^\infty \left(\frac{1}{2s} - E\left[\log\frac{N_s + 1}{s}\right]\right) ds.

Proof: Recalling (13) and using (11) with φ(k) = log k!, k ∈ Z₊ and m = 1, we obtain from (17)

  \tilde{H}'(\lambda) = E[\log(N_\lambda + 1)] - \log\lambda - \frac{1}{2\lambda}.

By Lemma 1, H̃(∞) = 0, and the claim follows.
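A rough numerical check of Lemma 2 (our sketch): we truncate the upper limit of the integral at a finite S, so the two printed numbers should agree up to the neglected tail H̃(S) = O(1/S). The function names and the trapezoidal rule are our choices.

```python
import math

def pois_expect(s, f, tol=1e-14):
    """E[f(N_s)] for N_s ~ Poisson(s), truncating the negligible tail."""
    j, p, acc = 0, math.exp(-s), 0.0
    while j < s or p > tol:
        acc += p * f(j)
        j += 1
        p *= s / j
    return acc

def H_tilde(lam):
    H = pois_expect(lam, lambda j: lam - j * math.log(lam) + math.lgamma(j + 1))
    return H - 0.5 * math.log(2 * math.pi * lam) - 0.5

def integrand(s):  # 1/(2s) - E[ log((N_s + 1)/s) ]
    return 0.5 / s - pois_expect(s, lambda j: math.log((j + 1) / s))

lam, S, steps = 5.0, 200.0, 2000  # truncate the integral at s = S
h = (S - lam) / steps
vals = [integrand(lam + i * h) for i in range(steps + 1)]
integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))  # trapezoidal rule
print(H_tilde(lam), integral)  # agree up to the O(1/S) neglected tail
```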
The next result presents sharp bounds on E[log(N_s + 1)].

Proposition 1: For s > 0 and m ∈ N, we have

  0 \le E\left[\log\frac{N_s + 1}{s}\right] - \sum_{k=2}^{2m+1} \frac{(-1)^k\mu_k(s)}{k(k-1)s^k} \le \frac{\mu_{2m+2}(s)}{(2m+1)s^{2m+2}}.

Proof: By (10) we have

  E\left[\log\frac{N_s + 1}{s}\right] = \frac{1}{s}\left(E[N_s\log N_s] - s\log s\right).  (18)

Taking into account that E[N_s] = s, the lower bound follows from (18) and the inequality (upon letting x = N_s)

  x\log x - s\log s \ge (\log s + 1)(x - s) + \sum_{k=2}^{2m+1} \frac{(-1)^k (x-s)^k}{k(k-1)s^{k-1}},  (19)

which holds for x ≥ 0, s > 0.

On the other hand, using the inequality

  \log(1+x) \le \sum_{k=1}^{2m+1} \frac{(-1)^{k-1}}{k}x^k, \quad x > -1,  (20)

with x = (N_s − s + 1)/s, we obtain

  E\left[\log\frac{N_s + 1}{s}\right] \le \sum_{k=1}^{2m+1} \frac{(-1)^{k-1} E[(N_s - s + 1)^k]}{k s^k}.  (21)

Again by (10), we have

  s E[(N_s - s + 1)^k] = E[N_s(N_s - s)^k] = \mu_{k+1}(s) + s\mu_k(s).

Hence the right-hand side of (21) is equal to

  \frac{\mu_{2m+2}(s)}{(2m+1)s^{2m+2}} + \sum_{k=2}^{2m+1} \frac{(-1)^k\mu_k(s)}{k(k-1)s^k},

which proves the upper bound.

Remark. The quantity E[log(N_s + 1)] already appears as Example 2 of Entry 10 in Chapter 3 of Ramanujan's second notebook ([5]). Proposition 1 gives a finite expansion of E[log(N_s + 1)] with an explicit upper bound of the order of s^{−(m+1)}. See [31] for related work on this particular entry of Ramanujan.

Proof of Theorem 2: The claim follows directly from Lemma 2 and Proposition 1, upon noting μ₂(s) = s.

IV. PROOFS OF THEOREMS 3 AND 4 AND COROLLARY 1

A. Preliminaries

Let us recall two special properties of the binomial variable B_{n,p}. For n ∈ N, p ∈ [0, 1], and any function φ : {0, 1, ..., n} → R, we have

  np\,E[\varphi(B_{n-1,p} + 1)] = E[B_{n,p}\varphi(B_{n,p})],  (22)

as well as (see [2], Theorem 1, for example)

  \frac{d^k E[\varphi(B_{n,p})]}{dp^k} = k!\binom{n}{k} E[\Delta^k\varphi(B_{n-k,p})], \quad k = 0, 1, \ldots, n;  (23)

  \frac{d^k E[\varphi(B_{n,p})]}{dp^k} = 0, \quad k = n+1, n+2, \ldots.

From (23) we obtain the Taylor formula

  E[\varphi(B_{n,p})] = \sum_{k=0}^{n} \binom{n}{k} E[\Delta^k\varphi(B_{n-k,t})](p - t)^k,  (24)

where p, t ∈ [0, 1].

B. Proof of Theorem 3

By direct calculation

  D(n,p) = n(p + q\log q) - np\log n + E\left[\log\frac{n!}{(n - B_{n,p})!}\right].  (25)

Using (24) with φ(l) = log(n!/(n−l)!), l = 0, ..., n and t = 0, we see that the expectation in (25) is equal to

  E[\varphi(B_{n,p})] = \sum_{k=0}^{n} \binom{n}{k}\Delta^k\varphi(0)p^k, \quad p \in [0, 1].  (26)

Denote by g(l) = log(n − l), l = 0, 1, ..., n−1. Observe that

  \Delta^k\varphi = \Delta^{k-1}g, \ k \in N, \quad \Delta^0\varphi(0) = 0, \quad \Delta^1\varphi(0) = \log n.

Therefore, we have from (8)

  \Delta^k\varphi(0) = \Delta^{k-1}g(0) = \sum_{j=0}^{k-1} (-1)^{k-1-j}\binom{k-1}{j}\log(n-j) = \tilde{c}(k),  (27)

for k = 2, ..., n. The claim follows from (25), (26), and (27). □
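Theorem 3 is a finite, exact formula, so it can be verified directly. A small sketch (ours; helper names are ours) compares it with term-by-term summation of the relative entropy:

```python
import math
from math import comb, log

def D_direct(n, p):
    lam = n * p
    total = 0.0
    for k in range(n + 1):
        w = comb(n, k) * p ** k * (1 - p) ** (n - k)
        if w > 0.0:
            total += w * (log(w) + lam - k * log(lam) + math.lgamma(k + 1))
    return total

def c_tilde(k, n):
    # as in Theorem 3
    return sum((-1) ** (k - 1 - j) * comb(k - 1, j) * log(n - j) for j in range(k))

def D_theorem3(n, p):
    q = 1 - p
    return n * (p + q * log(q)) + sum(comb(n, k) * c_tilde(k, n) * p ** k
                                      for k in range(2, n + 1))

n, p = 30, 0.2
print(D_theorem3(n, p), D_direct(n, p))  # agree up to floating-point error
```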
C. Proof of Theorem 4

The following integral representation is crucial in the proof of Theorem 4.

Lemma 3: For any p ∈ [0, 1] we have

  D(n,p) = n\int_q^1 E\left[\log\frac{B_{n-1,s} + 1}{ns}\right] ds.

Proof: Differentiating (25) once and applying (23) with k = 1 and φ(l) = log(n!/(n−l)!), we get

  \frac{dD(n,p)}{dp} = n E[\log(n - B_{n-1,p})] - n\log(nq).

Since D(n, 0) = 0, we have

  D(n,p) = n\int_0^p \left(E[\log(n - B_{n-1,t})] - \log(n(1-t))\right) dt.

Note that n − B_{n−1,t} and B_{n−1,1−t} + 1 have the same distribution. The claim follows by a change of variables s = 1 − t.

Proposition 2 gives upper and lower bounds on the key quantity E[log((B_{n−1,s} + 1)/(ns))]. This parallels Proposition 1.

Proposition 2: For any m ∈ N and s ∈ (0, 1) we have

  0 \le E\left[\log\frac{B_{n-1,s} + 1}{ns}\right] - \sum_{k=2}^{2m+1} \frac{(-1)^k\mu_k(n,s)}{k(k-1)(ns)^k} \le \frac{\mu_{2m+2}(n,s)}{(2m+1)(ns)^{2m+2}}.

Proof: By (22) we have

  ns\,E[\log(B_{n-1,s} + 1)] = E[B_{n,s}\log(B_{n,s})].

The lower bound follows from this and (19).

As in the proof of Proposition 1, using (20) with

  x = \frac{B_{n-1,s} - ns + 1}{ns},

we obtain

  E\left[\log\frac{B_{n-1,s} + 1}{ns}\right] \le \sum_{k=1}^{2m+1} \frac{(-1)^{k-1} E[(B_{n-1,s} - ns + 1)^k]}{k(ns)^k}.  (28)

Again by (22), we have

  ns\,E[(B_{n-1,s} - ns + 1)^k] = E[B_{n,s}(B_{n,s} - ns)^k] = \mu_{k+1}(n,s) + ns\,\mu_k(n,s).

This, in conjunction with (28), proves the upper bound.

Proof of Theorem 4: Note that

  n\int_q^1 \frac{\mu_2(n,s)}{2(ns)^2}\,ds = -\frac{p + \log q}{2}.

The claim then follows from Lemma 3 and Proposition 2. □

D. Proof of Corollary 1

From the definitions we have

  H(n,p) = \log n! - n\log n + n - D(n,p) - D(n,q),  (29)

where q ≡ 1 − p as before. Hence, Corollary 1 is an immediate consequence of (29) and Theorem 4. □

V. DISCUSSION

We have obtained asymptotically sharp and readily computable bounds on the Poisson entropy function H(λ). The method also handles the entropy of the binomial, and the relative entropy D(n,p) between the binomial(n,p) and Poisson(np) distributions, yielding full asymptotic expansions with explicit constants. While some results are of theoretical interest, bounds on the entropy are intended to aid channel capacity calculations for discrete-time Poisson channels, for example.

Besides entropy calculations, the method also extends to quantities such as the fractional moments of these familiar distributions. An example is E[√N_λ], which appears as Example 1, Entry 10, in Chapter 3 of Ramanujan's notebook ([5]). Analogous to Lemma 2, a double inequality can be obtained for E[√N_λ] (details omitted). This is of some statistical interest because it gives accurate bounds on the bias associated with the square root transformation, which is variance-stabilizing for the Poisson.

For theoretical considerations, the Taylor formulae, integral representations, and related results can also help to establish interesting monotonicity properties of quantities such as H(λ). For example, it can be shown that H′(λ) is a completely monotonic function of λ, i.e., (−1)^{k−1} H^{(k)}(λ) ≥ 0 for all k ≥ 1. It appears that many entropy-like functions associated with classical distributions are completely monotonic ([32]). One conjecture ([33], Conjecture 1) states that D(n, λ/n), n > λ, is completely monotonic in n for fixed λ > 0. The method of this work may prove useful toward resolving such conjectures.
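To illustrate the square-root remark numerically (our sketch, not a result of the paper): the comparison term −1/(8√λ) below is the standard first-order delta-method approximation to the bias, and the function name is ours.

```python
import math

def e_sqrt_poisson(lam, tol=1e-14):
    """E[sqrt(N_lam)] by direct summation over the Poisson mass function."""
    j, p, acc = 0, math.exp(-lam), 0.0
    while j < lam or p > tol:
        acc += p * math.sqrt(j)
        j += 1
        p *= lam / j
    return acc

for lam in (5.0, 20.0, 80.0):
    bias = e_sqrt_poisson(lam) - math.sqrt(lam)
    print(lam, bias, -1 / (8 * math.sqrt(lam)))  # bias ~ -1/(8 sqrt(lam))
```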
ACKNOWLEDGMENTS

The authors would like to thank the Editor and the referees for their careful reading of the manuscript and for their remarks and suggestions, which greatly improved the final outcome.

This work has been partially supported by research grants MTM2008-06281-C02-01/MTM, DGA E-64, UJA2009/12/07 (Universidad de Jaén and Caja Rural de Jaén), and by FEDER funds.

REFERENCES

[1] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions, Dover, New York, 1964.
[2] J. A. Adell and J. M. Anoz, "Signed binomial approximation of binomial mixtures via differential calculus for linear operators," J. Statist. Plann. Inference, vol. 138, no. 12, pp. 3687–3695, 2008.
[3] J. A. Adell and A. Lekuona, "Sharp estimates in signed Poisson approximation of Poisson mixtures," Bernoulli, vol. 11, no. 1, pp. 47–65, 2005.
[4] A. D. Barbour, L. Holst, and S. Janson, Poisson Approximation, Oxford Studies in Probability, vol. 2, Clarendon Press, Oxford, 1992.
[5] B. C. Berndt, Ramanujan's Notebooks, Part I, Springer-Verlag, New York, 1985.
[6] D. Brady and S. Verdú, "The asymptotic capacity of the direct detection photon channel with a bandwidth constraint," in Proc. 28th Allerton Conf. Communication, Control and Computing, Allerton House, Monticello, IL, Oct. 1990, pp. 691–700.
[7] S. Chang and E. Weldon, "Coding for T-user multiple-access channels," IEEE Trans. Inf. Theory, vol. IT-25, pp. 684–691, 1979.
[8] T. Cover and J. Thomas, Elements of Information Theory, 2nd ed., New York: Wiley, 2006.
[9] A. D'yachkov and P. Vilenkin, "Asymptotics of the Shannon and Rényi entropies for sums of independent random variables," in Proc. ISIT98 (MIT, Cambridge, MA, 1998), p. 376.
[10] R. J. Evans, J. Boersma, N. M. Blachman and A. A. Jagers, "The entropy of a Poisson distribution: problem 87-6," SIAM Review, vol. 30, pp. 314–317, 1988.
[11] P. Flajolet, "Singularity analysis and asymptotics of Bernoulli sums," Theor. Comput. Sci., vol. 275, pp. 371–387, 1999.
[12] M. R. Frey, "Information capacity of the Poisson channel," IEEE Trans. Inf. Theory, vol. 37, no. 2, pp. 244–256, Mar. 1991.
[13] D. Guo, S. Shamai and S. Verdú, "Mutual information and conditional mean estimation in Poisson channels," IEEE Trans. Inf. Theory, vol. 54, no. 5, pp. 1837–1849, May 2008.
[14] P. Harremoës, "Binomial and Poisson distributions as maximum entropy distributions," IEEE Trans. Inf. Theory, vol. 47, no. 5, pp. 2039–2041, July 2001.
[15] P. Harremoës and P. S. Ruzankin, "Rate of convergence to Poisson law in terms of information divergence," IEEE Trans. Inf. Theory, vol. 50, no. 9, pp. 2145–2149, 2004.
[16] P. Jacquet and W. Szpankowski, "Analytical depoissonization and its applications," Theoret. Comput. Sci., vol. 201, pp. 1–62, 1998.
[17] P. Jacquet and W. Szpankowski, "Entropy computations via analytic depoissonization," IEEE Trans. Inf. Theory, vol. 45, pp. 1072–1081, 1999.
[18] O. Johnson, "Log-concavity and the maximum entropy property of the Poisson distribution," Stochastic Processes and their Applications, vol. 117, no. 6, pp. 791–802, Jun. 2007.
[19] N. L. Johnson, A. W. Kemp and S. Kotz, Univariate Discrete Distributions, 3rd ed., Wiley & Sons, Hoboken, NJ, 2005.
[20] C. Knessl, "Integral representations and asymptotic expansions for Shannon and Renyi entropies," Appl. Math. Lett., vol. 11, pp. 69–74, 1998.
[21] I. Kontoyiannis, P. Harremoës, and O. T. Johnson, "Entropy and the law of small numbers," IEEE Trans. Inf. Theory, vol. 51, no. 2, pp. 466–472, Feb. 2005.
[22] A. Lapidoth and S. M. Moser, "On the capacity of the discrete-time Poisson channel," IEEE Trans. Inf. Theory, vol. 55, no. 1, pp. 303–322, 2009.
[23] M. Madiman, O. Johnson and I. Kontoyiannis, "Fisher information, compound Poisson approximation and the Poisson channel," in Proc. IEEE International Symposium on Information Theory, Nice, France, Jun. 2007.
[24] A. Martinez, "Spectral efficiency of optical direct detection," J. Opt. Soc. America B, vol. 24, no. 4, pp. 739–749, Apr. 2007.
[25] P. Mateev, "The entropy of the multinomial distribution," Teor. Verojatnost. i Primenen., vol. 23, no. 1, pp. 196–198, 1978.
[26] S. Shamai (Shitz), "Capacity of a pulse amplitude modulated direct detection photon channel," in Proc. Inst. Elec. Eng., vol. 137, no. 6, pp. 424–430, Dec. 1990, part I (Communications, Speech and Vision).
[27] S. Shamai (Shitz) and A. Lapidoth, "Bounds on the capacity of a spectrally constrained Poisson channel," IEEE Trans. Inf. Theory, vol. 39, no. 1, pp. 19–29, Jan. 1993.
[28] L. A. Shepp and I. Olkin, "Entropy of the sum of independent Bernoulli random variables and of the multinomial distribution," in Contributions to Probability, pp. 201–206, Academic Press, New York, 1981.
[29] F. Topsøe, "Maximum entropy versus minimum risk and applications to some classical discrete distributions," IEEE Trans. Inf. Theory, vol. 48, no. 8, pp. 2368–2376, Aug. 2002.
[30] Y. Yu, "On the maximum entropy properties of the binomial distribution," IEEE Trans. Inf. Theory, vol. 54, pp. 3351–3353, 2008.
[31] Y. Yu, "On an asymptotic series of Ramanujan," Ramanujan J., vol. 20, pp. 179–188, 2009.
[32] Y. Yu, "Complete monotonicity of the entropy in the central limit theorem for gamma and inverse Gaussian distributions," Stat. Prob. Lett., vol. 79, pp. 270–274, 2009.
[33] Y. Yu, "Monotonic convergence in an information-theoretic law of small numbers," IEEE Trans. Inf. Theory, vol. 55, pp. 5412–5422, 2009.

José A. Adell received the B.S. degree in mathematics from Zaragoza University, Spain, in 1980, and the Ph.D. degree in mathematics from the Basque Country University, Spain, in 1983. During 1981–1990, he was Assistant Professor in the Department of Mathematics at the Basque Country University. During 1991–2009, he has been Associate Professor in the Department of Statistics at Zaragoza University.

Alberto Lekuona received the B.S. degree in mathematics from Zaragoza University, Spain, in 1985, and the Ph.D. degree in mathematics from Zaragoza University, in 1996. During 1987–2000, he was Assistant Professor in the Department of Statistics at Zaragoza University, and Associate Professor at the same place in the period 2001–2009.

Yaming Yu (M'08) received the B.S. degree in mathematics from Beijing University, P. R. China, in 1999, and the Ph.D. degree in statistics from Harvard University, in 2005. Since 2005 he has been an Assistant Professor in the Department of Statistics at the University of California, Irvine.
