
Extremal Relations Between Shannon Entropy and $\ell_{\alpha}$-Norm



Yuta Sakai and Ken-ichi Iwata
Department of Information Science, University of Fukui, 3-9-1 Bunkyo, Fukui, Fukui, 910-8507, Japan
E-mail: {ji140117,k-iwata}@u-fukui.ac.jp

arXiv:1601.07678v1 [cs.IT] 28 Jan 2016

Abstract

The paper examines relationships between the Shannon entropy and the $\ell_{\alpha}$-norm for $n$-ary probability vectors, $n \ge 2$. More precisely, we investigate the tight bounds of the $\ell_{\alpha}$-norm with a fixed Shannon entropy, and vice versa. As applications of the results, we derive the tight bounds between the Shannon entropy and several information measures which are determined by the $\ell_{\alpha}$-norm. Moreover, we apply these results to uniformly focusing channels. Then, we show the tight bounds of Gallager's $E_0$ functions with a fixed mutual information under a uniform input distribution.

I. INTRODUCTION

Information measures of random variables are used in several fields. The Shannon entropy [1] is one of the famous measures of uncertainty for a given random variable. In studies of information measures, inequalities for information measures are commonly used in many applications. For instance, Fano's inequality [2] gives the tight upper bound of the conditional Shannon entropy with a fixed error probability. Here, "tight" means that there exists a distribution which attains the bound with equality. Later, the reverse of Fano's inequality, i.e., the tight lower bound of the conditional Shannon entropy with a fixed error probability, was established [3]–[5]. On the other hand, Harremoës and Topsøe [8] derived the exact range between the Shannon entropy and the index of coincidence (or the Simpson index) for all $n$-ary probability vectors, $n \ge 3$. In the above studies, note that the error probability and the index of coincidence are closely related to the $\ell_{\infty}$-norm and the $\ell_{2}$-norm, respectively.
Similarly, several axiomatic definitions of the entropies [9]–[14] are also related to the $\ell_{\alpha}$-norm. Furthermore, the $\ell_{\alpha}$-norm is also related to some diversity indices, such as the index of coincidence.

In this study, we examine extremal relations between the Shannon entropy and the $\ell_{\alpha}$-norm for $n$-ary probability vectors, $n \ge 2$. More precisely, we establish the tight bounds of the $\ell_{\alpha}$-norm with a fixed Shannon entropy in Theorem 1. Similarly, we also derive the tight bounds of the Shannon entropy with a fixed $\ell_{\alpha}$-norm in Theorem 2. Directly extending Theorem 1 to Corollary 1, we can obtain the tight bounds of several information measures which are determined by the $\ell_{\alpha}$-norm with a fixed Shannon entropy, as shown in Table I. In particular, we illustrate the exact feasible regions between the Shannon entropy and the Rényi entropy in Fig. 2 by using (295) and (296). In Section III-B, we consider applications of Corollary 1 for a particular class of discrete memoryless channels, defined in Definition 2, which is called uniformly focusing [15] or uniform from the output [16].

January 29, 2016 DRAFT

II. PRELIMINARIES

A. n-ary probability vectors and its information measures

Let the set of all $n$-ary probability vectors be denoted by
$$ \mathcal{P}_n \triangleq \left\{ (p_1, p_2, \dots, p_n) \in \mathbb{R}^n \,\middle|\, p_j \ge 0 \text{ and } \sum_{i=1}^{n} p_i = 1 \right\} \tag{1} $$
for an integer $n \ge 2$. For $\boldsymbol{p} = (p_1, p_2, \dots, p_n) \in \mathcal{P}_n$, let
$$ p_{[1]} \ge p_{[2]} \ge \dots \ge p_{[n]} \tag{2} $$
denote the components of $\boldsymbol{p}$ in decreasing order, and let
$$ \boldsymbol{p}_{\downarrow} \triangleq (p_{[1]}, p_{[2]}, \dots, p_{[n]}) \tag{3} $$
denote the decreasing rearrangement$^{1}$ of $\boldsymbol{p}$. In particular, we define the following two $n$-ary probability vectors: (i) an $n$-ary deterministic distribution
$$ \boldsymbol{d}_n \triangleq (d_1, d_2, \dots, d_n) \in \mathcal{P}_n \tag{4} $$
is defined by $d_1 = 1$ and $d_i = 0$ for $i \in \{2, 3, \dots, n\}$ and (ii) the $n$-ary equiprobable distribution
$$ \boldsymbol{u}_n \triangleq (u_1, u_2, \dots, u_n) \in \mathcal{P}_n \tag{5} $$
is defined by $u_i = \frac{1}{n}$ for $i \in \{1, 2, \dots, n\}$.
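For an $n$-ary random variable $X \sim \boldsymbol{p} \in \mathcal{P}_n$, we define the Shannon entropy [1] as
$$ H(X) = H(\boldsymbol{p}) \triangleq -\sum_{i=1}^{n} p_i \ln p_i, \tag{6} $$
where $\ln$ denotes the natural logarithm and we assume that $0 \ln 0 = 0$. Moreover, we define the $\ell_{\alpha}$-norm of $\boldsymbol{p} \in \mathcal{P}_n$ as
$$ \|\boldsymbol{p}\|_{\alpha} \triangleq \left( \sum_{i=1}^{n} p_i^{\alpha} \right)^{\frac{1}{\alpha}} \tag{7} $$
for $\alpha \in (0, \infty)$. Note that $\lim_{\alpha \to \infty} \|\boldsymbol{p}\|_{\alpha} = \|\boldsymbol{p}\|_{\infty} \triangleq \max\{p_1, p_2, \dots, p_n\}$ for $\boldsymbol{p} \in \mathcal{P}_n$. In works extending the Shannon entropy, the $\ell_{\alpha}$-norm appears in several information measures. For instance, Rényi [9] generalized the Shannon entropy axiomatically to the Rényi entropy of order $\alpha \in (0,1) \cup (1,\infty)$, defined as
$$ H_{\alpha}(X) = H_{\alpha}(\boldsymbol{p}) \triangleq \frac{\alpha}{1-\alpha} \ln \|\boldsymbol{p}\|_{\alpha} \tag{8} $$
for $X \sim \boldsymbol{p} \in \mathcal{P}_n$. Note that it is usually defined that $H_1(X) \triangleq H(X)$ since $\lim_{\alpha \to 1} H_{\alpha}(X) = H(X)$ by L'Hôpital's rule. The other axiomatic definitions of entropies [10]–[14] can also be expressed by using the $\ell_{\alpha}$-norm, as with (8).

$^{1}$ This rearrangement is denoted by reference to the notation of [7].

In this study, we analyze relations between $H(\boldsymbol{p})$ and $\|\boldsymbol{p}\|_{\alpha}$ to examine relationships between the Shannon entropy and several information measures. Note that $H(\boldsymbol{p})$ and $\|\boldsymbol{p}\|_{\alpha}$ are invariant under any permutation of the indices of $\boldsymbol{p} \in \mathcal{P}_n$; that is,
$$ H(\boldsymbol{p}) = H(\boldsymbol{p}_{\downarrow}) \quad \text{and} \quad \|\boldsymbol{p}\|_{\alpha} = \|\boldsymbol{p}_{\downarrow}\|_{\alpha} \tag{9} $$
for any $\boldsymbol{p} \in \mathcal{P}_n$. Hence, we only consider $\boldsymbol{p}_{\downarrow}$ for $\boldsymbol{p} \in \mathcal{P}_n$ in the analyses of this study. Since $\|\boldsymbol{p}\|_{1} = 1$ for any $\boldsymbol{p} \in \mathcal{P}_n$, we have no interest in the case $\alpha = 1$; hence, we omit the case $\alpha = 1$ in this study. Furthermore, since
\begin{align}
H(\boldsymbol{p}) = \ln n &\iff \|\boldsymbol{p}\|_{\alpha} = n^{\frac{1}{\alpha}-1} \iff \boldsymbol{p} = \boldsymbol{u}_n, \tag{10} \\
H(\boldsymbol{p}) = 0 &\iff \|\boldsymbol{p}\|_{\alpha} = 1 \iff \boldsymbol{p}_{\downarrow} = \boldsymbol{d}_n, \tag{11}
\end{align}
the cases $\boldsymbol{p} = \boldsymbol{u}_n$ and $\boldsymbol{p}_{\downarrow} = \boldsymbol{d}_n$ are trivial; thus, we also omit these cases in the analyses of this study.

B.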
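Properties of two distributions $\boldsymbol{v}_n(\cdot)$ and $\boldsymbol{w}_n(\cdot)$

For a fixed $n \ge 2$, let the $n$-ary distribution $\boldsymbol{v}_n(p) \triangleq (v_1(p), v_2(p), \dots, v_n(p)) \in \mathcal{P}_n$ be defined by
$$ v_i(p) = \begin{cases} 1-(n-1)p & \text{if } i = 1, \\ p & \text{otherwise} \end{cases} \tag{12} $$
for $p \in [0, \frac{1}{n-1}]$, and let the $n$-ary distribution$^{2}$ $\boldsymbol{w}_n(p) \triangleq (w_1(p), w_2(p), \dots, w_n(p)) \in \mathcal{P}_n$ be defined by
$$ w_i(p) = \begin{cases} p & \text{if } 1 \le i \le \lfloor p^{-1} \rfloor, \\ 1 - \lfloor p^{-1} \rfloor p & \text{if } i = \lfloor p^{-1} \rfloor + 1, \\ 0 & \text{otherwise} \end{cases} \tag{13} $$
for $p \in [\frac{1}{n}, 1]$, where $\lfloor \cdot \rfloor$ denotes the floor function. Note that $\boldsymbol{v}_n(p)_{\downarrow} = \boldsymbol{w}_n(p)$ for $p \in [\frac{1}{n}, \frac{1}{n-1}]$. In this subsection, we examine the properties of the Shannon entropies and the $\ell_{\alpha}$-norms for $\boldsymbol{v}_n(\cdot)$ and $\boldsymbol{w}_n(\cdot)$. For simplicity, we define
\begin{align}
H_{\boldsymbol{v}_n}(p) &\triangleq H(\boldsymbol{v}_n(p)) \tag{14} \\
&= -(1-(n-1)p)\ln(1-(n-1)p) - (n-1)p\ln p, \tag{15} \\
H_{\boldsymbol{w}_n}(p) &\triangleq H(\boldsymbol{w}_n(p)) \tag{16} \\
&= -\lfloor p^{-1} \rfloor p \ln p - (1 - \lfloor p^{-1} \rfloor p)\ln(1 - \lfloor p^{-1} \rfloor p). \tag{17}
\end{align}
Then, we first show the monotonicity of $H_{\boldsymbol{v}_n}(p)$ with respect to $p \in [0, \frac{1}{n}]$ in the following lemma.

Lemma 1. $H_{\boldsymbol{v}_n}(p)$ is strictly increasing for $p \in [0, \frac{1}{n}]$.

$^{2}$ The definition of $\boldsymbol{w}_n(\cdot)$ is similar to the definition of [6, Eq. (26)].

Proof of Lemma 1: It is easy to see that
\begin{align}
H_{\boldsymbol{v}_n}(p) &= -\sum_{i=1}^{n} v_i(p) \ln v_i(p) \tag{18} \\
&= -v_1(p) \ln v_1(p) - \sum_{i=2}^{n} v_i(p) \ln v_i(p) \tag{19} \\
&= -(1-(n-1)p)\ln(1-(n-1)p) - \sum_{i=2}^{n} v_i(p) \ln v_i(p) \tag{20} \\
&= -(1-(n-1)p)\ln(1-(n-1)p) - (n-1)p\ln p. \tag{21}
\end{align}
Then, the first-order derivative of $H_{\boldsymbol{v}_n}(p)$ with respect to $p$ is
\begin{align}
\frac{\partial H_{\boldsymbol{v}_n}(p)}{\partial p} &= \frac{\partial}{\partial p} \Big( -(n-1)p\ln p - (1-(n-1)p)\ln(1-(n-1)p) \Big) \tag{22} \\
&= -(n-1) \left( \frac{d}{dp} (p \ln p) \right) - \left( \frac{\partial}{\partial p} \big( (1-(n-1)p)\ln(1-(n-1)p) \big) \right) \tag{23} \\
&= -(n-1) \big( \ln p + 1 \big) + (n-1) \big( \ln(1-(n-1)p) + 1 \big) \tag{24} \\
&= (n-1) \big( \ln(1-(n-1)p) - \ln p \big) \tag{25} \\
&= (n-1) \ln \frac{1-(n-1)p}{p}. \tag{26}
\end{align}
Since $1-(n-1)p > p > 0$ for $p \in (0, \frac{1}{n})$, it follows from (26) that
$$ \frac{\partial H_{\boldsymbol{v}_n}(p)}{\partial p} > 0 \tag{27} $$
for $p \in (0, \frac{1}{n})$. Note that $H_{\boldsymbol{v}_n}(p)$ is continuous for $p \in [0, \frac{1}{n}]$ since $\lim_{p \to \frac{1}{n}} H_{\boldsymbol{v}_n}(p) = H_{\boldsymbol{v}_n}(\frac{1}{n}) = \ln n$ and $\lim_{p \to 0^{+}} H_{\boldsymbol{v}_n}(p) = H_{\boldsymbol{v}_n}(0) = 0$ by the assumption $0 \ln 0 = 0$.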
Therefore, $H_{\boldsymbol{v}_n}(p)$ is strictly increasing for $p \in [0, \frac{1}{n}]$.

Lemma 1 implies the existence of the inverse function of $H_{\boldsymbol{v}_n}(p)$ for $p \in [0, \frac{1}{n}]$. Second, we show the monotonicity of $H_{\boldsymbol{w}_n}(p)$ with respect to $p \in [\frac{1}{n}, 1]$ as follows:

Lemma 2. $H_{\boldsymbol{w}_n}(p)$ is strictly decreasing for $p \in [\frac{1}{n}, 1]$.

Proof of Lemma 2: For an integer $m \in [1, n-1]$, assume that $p \in (\frac{1}{m+1}, \frac{1}{m}]$. Then, note that $\lfloor p^{-1} \rfloor = m$. It is easy to see that
\begin{align}
H_{\boldsymbol{w}_n}(p) &= -\sum_{i=1}^{n} w_i(p) \ln w_i(p) \tag{28} \\
&= -\sum_{i=1}^{m} w_i(p) \ln w_i(p) - w_{m+1}(p) \ln w_{m+1}(p) - \sum_{j=m+2}^{n} w_j(p) \ln w_j(p) \tag{29} \\
&\overset{(a)}{=} -\sum_{i=1}^{m} w_i(p) \ln w_i(p) - w_{m+1}(p) \ln w_{m+1}(p) \tag{30} \\
&= -mp\ln p - w_{m+1}(p) \ln w_{m+1}(p) \tag{31} \\
&= -mp\ln p - (1-mp)\ln(1-mp), \tag{32}
\end{align}
where (a) follows by the assumption $0 \ln 0 = 0$. Then, the first-order derivative of $H_{\boldsymbol{w}_n}(p)$ with respect to $p$ is
\begin{align}
\frac{\partial H_{\boldsymbol{w}_n}(p)}{\partial p} &= \frac{\partial}{\partial p} \Big( -mp\ln p - (1-mp)\ln(1-mp) \Big) \tag{33} \\
&= -m \left( \frac{d}{dp} (p \ln p) \right) - \left( \frac{\partial}{\partial p} \big( (1-mp)\ln(1-mp) \big) \right) \tag{34} \\
&= -m \big( \ln p + 1 \big) + m \big( \ln(1-mp) + 1 \big) \tag{35} \\
&= m \big( \ln(1-mp) - \ln p \big) \tag{36} \\
&= m \ln \frac{1-mp}{p}. \tag{37}
\end{align}
Since $p > 1-mp > 0$ for $p \in (\frac{1}{m+1}, \frac{1}{m})$, it follows from (37) that
$$ \frac{\partial H_{\boldsymbol{w}_n}(p)}{\partial p} < 0 \tag{38} $$
for $p \in (\frac{1}{m+1}, \frac{1}{m})$. On the other hand, we observe that
\begin{align}
\lim_{p \to (\frac{1}{m})^{-}} H_{\boldsymbol{w}_n}(p) &= \lim_{p \to (\frac{1}{m})^{-}} \Big( -\lfloor p^{-1} \rfloor p \ln p - (1 - \lfloor p^{-1} \rfloor p)\ln(1 - \lfloor p^{-1} \rfloor p) \Big) \tag{39} \\
&= \lim_{p \to (\frac{1}{m})^{-}} \Big( -mp\ln p - (1-mp)\ln(1-mp) \Big) \tag{40} \\
&= \ln m - \lim_{p \to (\frac{1}{m})^{-}} \Big( (1-mp)\ln(1-mp) \Big) \tag{41} \\
&= \ln m - \lim_{x \to 0^{+}} \big( x \ln x \big) \tag{42} \\
&= \ln m \tag{43}
\end{align}
for an integer $m \in [1, n-1]$ and
\begin{align}
\lim_{p \to (\frac{1}{m})^{+}} H_{\boldsymbol{w}_n}(p) &= \lim_{p \to (\frac{1}{m})^{+}} \Big( -\lfloor p^{-1} \rfloor p \ln p - (1 - \lfloor p^{-1} \rfloor p)\ln(1 - \lfloor p^{-1} \rfloor p) \Big) \tag{44} \\
&= \lim_{p \to (\frac{1}{m})^{+}} \Big( -(m-1)p\ln p - (1-(m-1)p)\ln(1-(m-1)p) \Big) \tag{45} \\
&= \left( 1 - \frac{1}{m} \right) \ln m - \lim_{p \to (\frac{1}{m})^{+}} \Big( (1-(m-1)p)\ln(1-(m-1)p) \Big) \tag{46} \\
&= \left( 1 - \frac{1}{m} \right) \ln m - \left( -\frac{1}{m} \ln m \right) \tag{47} \\
&= \ln m \tag{48}
\end{align}
for an integer $m \in [2, n]$.
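Note that $H_{\boldsymbol{w}_n}(\frac{1}{m}) = \ln m$ from (43) and the assumption $0 \ln 0 = 0$. Hence, for any integer $m \in [2, n-1]$, we get that
\begin{align}
\lim_{p \to (\frac{1}{n})^{+}} H_{\boldsymbol{w}_n}(p) &= H_{\boldsymbol{w}_n}(\tfrac{1}{n}) = \ln n, \tag{49} \\
\lim_{p \to \frac{1}{m}} H_{\boldsymbol{w}_n}(p) &= H_{\boldsymbol{w}_n}(\tfrac{1}{m}) = \ln m, \tag{50} \\
\lim_{p \to 1^{-}} H_{\boldsymbol{w}_n}(p) &= H_{\boldsymbol{w}_n}(1) = 0, \tag{51}
\end{align}
which imply that $H_{\boldsymbol{w}_n}(p)$ is continuous for $p \in [\frac{1}{n}, 1]$. Therefore, $H_{\boldsymbol{w}_n}(p)$ is strictly decreasing for $p \in [\frac{1}{n}, 1]$.

As with Lemma 1, Lemma 2 also implies the existence of the inverse function of $H_{\boldsymbol{w}_n}(p)$ for $p \in [\frac{1}{n}, 1]$. Since $H_{\boldsymbol{v}_n}(0) = 0$, $H_{\boldsymbol{v}_n}(\frac{1}{n}) = \ln n$, $H_{\boldsymbol{w}_n}(\frac{1}{n}) = \ln n$, and $H_{\boldsymbol{w}_n}(1) = 0$, we can denote the inverse functions of $H_{\boldsymbol{v}_n}(p)$ and $H_{\boldsymbol{w}_n}(p)$ with respect to $p$ as follows: We denote by $H_{\boldsymbol{v}_n}^{-1} : [0, \ln n] \to [0, \frac{1}{n}]$ the inverse function of $H_{\boldsymbol{v}_n}(p)$ for $p \in [0, \frac{1}{n}]$. Moreover, we also denote by $H_{\boldsymbol{w}_n}^{-1} : [0, \ln n] \to [\frac{1}{n}, 1]$ the inverse function of $H_{\boldsymbol{w}_n}(p)$ for $p \in [\frac{1}{n}, 1]$.

Now, we provide the monotonicity of $\|\boldsymbol{v}_n(p)\|_{\alpha}$ with respect to $H_{\boldsymbol{v}_n}(p)$ in the following lemma.

Lemma 3. For any fixed $n \ge 2$ and any fixed $\alpha \in (-\infty, 0) \cup (0, 1) \cup (1, \infty)$, if $p \in [0, \frac{1}{n}]$, the following monotonicity properties hold:
(i) if $\alpha > 1$, then $\|\boldsymbol{v}_n(p)\|_{\alpha}$ is strictly decreasing for $H_{\boldsymbol{v}_n}(p) \in [0, \ln n]$ and
(ii) if $\alpha < 1$, then $\|\boldsymbol{v}_n(p)\|_{\alpha}$ is strictly increasing for $H_{\boldsymbol{v}_n}(p) \in [0, \ln n]$.

Proof of Lemma 3: The proof of Lemma 3 is given in a similar manner to [20, Appendix I]. By the chain rule of differentiation and the inverse function theorem, we have
\begin{align}
\frac{\partial \|\boldsymbol{v}_n(p)\|_{\alpha}}{\partial H_{\boldsymbol{v}_n}(p)} &= \left( \frac{\partial \|\boldsymbol{v}_n(p)\|_{\alpha}}{\partial p} \right) \cdot \left( \frac{\partial p}{\partial H_{\boldsymbol{v}_n}(p)} \right) \tag{52} \\
&= \left( \frac{\partial \|\boldsymbol{v}_n(p)\|_{\alpha}}{\partial p} \right) \cdot \left( \frac{1}{\frac{\partial H(\boldsymbol{v}_n(p))}{\partial p}} \right). \tag{53}
\end{align}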
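Direct calculation shows
\begin{align}
\frac{\partial \|\boldsymbol{v}_n(p)\|_{\alpha}}{\partial p} &= \frac{\partial}{\partial p} \Big( (n-1)p^{\alpha} + (1-(n-1)p)^{\alpha} \Big)^{\frac{1}{\alpha}} \tag{54} \\
&= \frac{1}{\alpha} \Big( (n-1)p^{\alpha} + (1-(n-1)p)^{\alpha} \Big)^{\frac{1}{\alpha}-1} \left( \frac{\partial}{\partial p} \Big( (n-1)p^{\alpha} + (1-(n-1)p)^{\alpha} \Big) \right) \tag{55} \\
&= \frac{1}{\alpha} \Big( (n-1)p^{\alpha} + (1-(n-1)p)^{\alpha} \Big)^{\frac{1}{\alpha}-1} \Big( \alpha (n-1) \big( p^{\alpha-1} - (1-(n-1)p)^{\alpha-1} \big) \Big) \tag{56} \\
&= (n-1) \Big( (n-1)p^{\alpha} + (1-(n-1)p)^{\alpha} \Big)^{\frac{1}{\alpha}-1} \Big( p^{\alpha-1} - (1-(n-1)p)^{\alpha-1} \Big). \tag{57}
\end{align}
Substituting (26) and (57) into (53), we obtain
\begin{align}
\frac{\partial \|\boldsymbol{v}_n(p)\|_{\alpha}}{\partial H_{\boldsymbol{v}_n}(p)} &= (n-1) \Big( (n-1)p^{\alpha} + (1-(n-1)p)^{\alpha} \Big)^{\frac{1}{\alpha}-1} \Big( p^{\alpha-1} - (1-(n-1)p)^{\alpha-1} \Big) \left( \frac{1}{(n-1) \ln \frac{1-(n-1)p}{p}} \right) \tag{58} \\
&= \Big( (n-1)p^{\alpha} + (1-(n-1)p)^{\alpha} \Big)^{\frac{1}{\alpha}-1} \Big( p^{\alpha-1} - (1-(n-1)p)^{\alpha-1} \Big) \frac{1}{\ln \frac{1-(n-1)p}{p}}. \tag{59}
\end{align}
We now define the sign function as
$$ \operatorname{sgn}(x) \triangleq \begin{cases} 1 & \text{if } x > 0, \\ 0 & \text{if } x = 0, \\ -1 & \text{if } x < 0. \end{cases} \tag{60} $$
Since $0 < p < 1-(n-1)p$ for $p \in (0, \frac{1}{n})$, we observe that
\begin{align}
\operatorname{sgn}\left( \Big( (n-1)p^{\alpha} + (1-(n-1)p)^{\alpha} \Big)^{\frac{1}{\alpha}-1} \right) &= 1, \tag{61} \\
\operatorname{sgn}\Big( p^{\alpha-1} - (1-(n-1)p)^{\alpha-1} \Big) &= \begin{cases} 1 & \text{if } \alpha < 1, \\ 0 & \text{if } \alpha = 1, \\ -1 & \text{if } \alpha > 1, \end{cases} \tag{62} \\
\operatorname{sgn}\left( \frac{1}{\ln \frac{1-(n-1)p}{p}} \right) &= 1 \tag{63}
\end{align}
for $p \in (0, \frac{1}{n})$ and $\alpha \in (-\infty, 0) \cup (0, +\infty)$; and therefore, we have
\begin{align}
\operatorname{sgn}\left( \frac{\partial \|\boldsymbol{v}_n(p)\|_{\alpha}}{\partial H_{\boldsymbol{v}_n}(p)} \right) &\overset{(59)}{=} \operatorname{sgn}\left( \Big( (n-1)p^{\alpha} + (1-(n-1)p)^{\alpha} \Big)^{\frac{1}{\alpha}-1} \Big( p^{\alpha-1} - (1-(n-1)p)^{\alpha-1} \Big) \frac{1}{\ln \frac{1-(n-1)p}{p}} \right) \tag{64} \\
&= \operatorname{sgn}\left( \Big( (n-1)p^{\alpha} + (1-(n-1)p)^{\alpha} \Big)^{\frac{1}{\alpha}-1} \right) \cdot \operatorname{sgn}\Big( p^{\alpha-1} - (1-(n-1)p)^{\alpha-1} \Big) \cdot \operatorname{sgn}\left( \frac{1}{\ln \frac{1-(n-1)p}{p}} \right) \tag{65} \\
&= \begin{cases} 1 & \text{if } \alpha < 1, \\ 0 & \text{if } \alpha = 1, \\ -1 & \text{if } \alpha > 1, \end{cases} \tag{66}
\end{align}
for $p \in (0, \frac{1}{n})$ and $\alpha \in (-\infty, 0) \cup (0, +\infty)$, which implies Lemma 3.

It follows from Lemmas 1 and 3 that, for each $\alpha \in (0,1) \cup (1,\infty)$, $\|\boldsymbol{v}_n(p)\|_{\alpha}$ is bijective for $p \in [0, \frac{1}{n}]$. Similarly, we also show the monotonicity of $\|\boldsymbol{w}_n(p)\|_{\alpha}$ with respect to $H_{\boldsymbol{w}_n}(p)$ in the following lemma.

Lemma 4.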
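For any fixed $n \ge 2$ and any fixed $\alpha \in (0,1) \cup (1,\infty)$, if $p \in [\frac{1}{n}, 1]$, the following monotonicity properties hold:
(i) if $\alpha > 1$, then $\|\boldsymbol{w}_n(p)\|_{\alpha}$ is strictly decreasing for $H_{\boldsymbol{w}_n}(p) \in [0, \ln n]$ and
(ii) if $\alpha < 1$, then $\|\boldsymbol{w}_n(p)\|_{\alpha}$ is strictly increasing for $H_{\boldsymbol{w}_n}(p) \in [0, \ln n]$.

Proof of Lemma 4: Since $\boldsymbol{w}_n(p) = \boldsymbol{v}_n(p)_{\downarrow}$ for $p \in [\frac{1}{n}, \frac{1}{n-1}]$, we can obtain immediately from (59) that
$$ \frac{\partial \|\boldsymbol{w}_n(p)\|_{\alpha}}{\partial H_{\boldsymbol{w}_n}(p)} = \Big( (n-1)p^{\alpha} + (1-(n-1)p)^{\alpha} \Big)^{\frac{1}{\alpha}-1} \Big( p^{\alpha-1} - (1-(n-1)p)^{\alpha-1} \Big) \frac{1}{\ln \frac{1-(n-1)p}{p}} \tag{67} $$
for $p \in (\frac{1}{n}, \frac{1}{n-1})$. Since $0 < 1-(n-1)p < p$ for $p \in (\frac{1}{n}, \frac{1}{n-1})$, we observe that
\begin{align}
\operatorname{sgn}\left( \Big( (n-1)p^{\alpha} + (1-(n-1)p)^{\alpha} \Big)^{\frac{1}{\alpha}-1} \right) &= 1, \tag{68} \\
\operatorname{sgn}\Big( p^{\alpha-1} - (1-(n-1)p)^{\alpha-1} \Big) &= \begin{cases} 1 & \text{if } \alpha > 1, \\ 0 & \text{if } \alpha = 1, \\ -1 & \text{if } \alpha < 1, \end{cases} \tag{69} \\
\operatorname{sgn}\left( \frac{1}{\ln \frac{1-(n-1)p}{p}} \right) &= -1 \tag{70}
\end{align}
for $p \in (\frac{1}{n}, \frac{1}{n-1})$ and $\alpha \in (-\infty, 0) \cup (0, +\infty)$; and therefore, we have
$$ \operatorname{sgn}\left( \frac{\partial \|\boldsymbol{w}_n(p)\|_{\alpha}}{\partial H_{\boldsymbol{w}_n}(p)} \right) = \begin{cases} 1 & \text{if } \alpha < 1, \\ 0 & \text{if } \alpha = 1, \\ -1 & \text{if } \alpha > 1, \end{cases} \tag{71} $$
for $p \in (\frac{1}{n}, \frac{1}{n-1})$ and $\alpha \in (-\infty, 0) \cup (0, +\infty)$, as with (66). Hence, for $\alpha \in (-\infty, 0) \cup (0, +\infty)$, we have that
• if $\alpha > 1$, then $\|\boldsymbol{w}_n(p)\|_{\alpha}$ is strictly decreasing for $H_{\boldsymbol{w}_n}(p) \in [\ln(n-1), \ln n]$ and
• if $\alpha < 1$, then $\|\boldsymbol{w}_n(p)\|_{\alpha}$ is strictly increasing for $H_{\boldsymbol{w}_n}(p) \in [\ln(n-1), \ln n]$.
Finally, since $H_{\boldsymbol{w}_m}(p) = H_{\boldsymbol{w}_n}(p)$ and $\|\boldsymbol{w}_m(p)\|_{\alpha} = \|\boldsymbol{w}_n(p)\|_{\alpha}$ for any integer $m \in [2, n-1]$, any $p \in [\frac{1}{m}, \frac{1}{m-1}]$, and any $\alpha \in (0,1) \cup (1,+\infty)$, we can obtain that
• if $\alpha > 1$, then $\|\boldsymbol{w}_n(p)\|_{\alpha}$ is strictly decreasing for $H_{\boldsymbol{w}_n}(p) \in [\ln(m-1), \ln m]$ and
• if $\alpha < 1$, then $\|\boldsymbol{w}_n(p)\|_{\alpha}$ is strictly increasing for $H_{\boldsymbol{w}_n}(p) \in [\ln(m-1), \ln m]$
for any integer $m \in [2, n]$ and any $\alpha \in (0,1) \cup (1,\infty)$. This completes the proof of Lemma 4.

It also follows from Lemmas 2 and 4 that, for each $\alpha \in (0,1) \cup (1,\infty)$, $\|\boldsymbol{w}_n(p)\|_{\alpha}$ is also bijective for $p \in [\frac{1}{n}, 1]$.

III. RESULTS

In Section III-A, we examine the extremal relations between the Shannon entropy and the $\ell_{\alpha}$-norm, as shown in Theorems 1 and 2.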
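As an informal numerical sanity check of Lemmas 1 to 4 (not part of the original analysis), the following Python sketch evaluates the Shannon entropy (6) and the $\ell_{\alpha}$-norm (7) along the two families (12) and (13) on a finite grid and tests the claimed monotonicity. The helper names `entropy`, `norm`, `v`, `w`, `strictly_monotone`, and `check_lemmas`, as well as the grid size, are illustrative assumptions; a grid test can only support the lemmas, not prove them.

```python
import math

def entropy(p):
    """Shannon entropy (6) in nats, using the convention 0 ln 0 = 0."""
    return -sum(x * math.log(x) for x in p if x > 0)

def norm(p, alpha):
    """l_alpha-norm (7) of a probability vector, for alpha > 0."""
    return sum(x ** alpha for x in p if x > 0) ** (1.0 / alpha)

def v(n, p):
    """Distribution v_n(p) of (12); intended for 0 <= p <= 1/(n-1)."""
    return [1 - (n - 1) * p] + [p] * (n - 1)

def w(n, p):
    """Distribution w_n(p) of (13); intended for 1/n <= p <= 1."""
    m = math.floor(1 / p)
    rest = max(0.0, 1 - m * p)          # guard against tiny negative rounding
    return ([p] * m + [rest] + [0.0] * n)[:n]

def strictly_monotone(values, increasing):
    """True if consecutive values strictly increase (or strictly decrease)."""
    pairs = zip(values, values[1:])
    return all((b > a) if increasing else (b < a) for a, b in pairs)

def check_lemmas(n, alpha, steps=200):
    """Grid check of Lemmas 1-4 for a given n >= 2 and alpha > 0, alpha != 1."""
    ps_v = [k / (n * steps) for k in range(steps + 1)]            # grid on [0, 1/n]
    ps_w = [min(1.0, 1 / n + k * (1 - 1 / n) / steps)
            for k in range(steps + 1)]                            # grid on [1/n, 1]
    h_v = [entropy(v(n, p)) for p in ps_v]
    h_w = [entropy(w(n, p)) for p in ps_w]
    norm_v = [norm(v(n, p), alpha) for p in ps_v]
    norm_w = [norm(w(n, p), alpha) for p in ps_w]
    return {
        "lemma1": strictly_monotone(h_v, increasing=True),
        "lemma2": strictly_monotone(h_w, increasing=False),
        # H_v grows with p, so the direction in H equals the direction in p.
        "lemma3": strictly_monotone(norm_v, increasing=(alpha < 1)),
        # H_w shrinks with p, so the direction in H is the reverse of that in p.
        "lemma4": strictly_monotone(norm_w, increasing=(alpha > 1)),
    }

if __name__ == "__main__":
    for alpha in (0.5, 2.0):
        print(alpha, check_lemmas(4, alpha))
```

Because Lemmas 1 and 2 make the entropy monotone in $p$, the sketch tests the norm against $p$ and translates the direction, rather than inverting $H_{\boldsymbol{v}_n}$ or $H_{\boldsymbol{w}_n}$ numerically.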
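Then, we can identify the exact feasible region of
$$ \mathcal{R}_n(\alpha) \triangleq \{ (H(\boldsymbol{p}), \|\boldsymbol{p}\|_{\alpha}) \mid \boldsymbol{p} \in \mathcal{P}_n \} \tag{72} $$
for any $n \ge 2$ and any $\alpha \in (0,1) \cup (1,\infty)$. Extending Theorems 1 and 2 to Corollary 1, we can obtain the tight bounds between the Shannon entropy and several information measures which are determined by the $\ell_{\alpha}$-norm, as shown in Table I. In Section III-B, we apply the results of Section III-A to uniformly focusing channels of Definition 2.

A. Bounds on Shannon entropy and $\ell_{\alpha}$-norm

Let the $\alpha$-logarithm function [19] be denoted by
$$ \ln_{\alpha} x \triangleq \frac{x^{1-\alpha} - 1}{1-\alpha} \tag{73} $$
for $\alpha \neq 1$ and $x > 0$; besides, since $\lim_{\alpha \to 1} \ln_{\alpha} x = \ln x$ by L'Hôpital's rule, it is defined that $\ln_{1} x \triangleq \ln x$. For the $\alpha$-logarithm function, we can see the following lemma.

Lemma 5. For $\alpha < \beta$ and $1 \le x \le y$ $(y \neq 1)$, we observe that
$$ \frac{\ln_{\alpha} x}{\ln_{\alpha} y} \le \frac{\ln_{\beta} x}{\ln_{\beta} y} \tag{74} $$
with equality if and only if $x \in \{1, y\}$.

Proof of Lemma 5: For $1 \le x \le y$ $(y \neq 1)$, we consider the monotonicity of $\frac{\ln_{\alpha} x}{\ln_{\alpha} y}$ with respect to $\alpha$. Direct calculation shows
\begin{align}
\frac{\partial}{\partial \alpha} \left( \frac{\ln_{\alpha} x}{\ln_{\alpha} y} \right) &= \frac{\partial}{\partial \alpha} \left( \frac{x^{1-\alpha} - 1}{y^{1-\alpha} - 1} \right) \tag{75} \\
&= \left( \frac{\partial}{\partial \alpha} (x^{1-\alpha} - 1) \right) \left( \frac{1}{y^{1-\alpha} - 1} \right) + (x^{1-\alpha} - 1) \left( \frac{\partial}{\partial \alpha} \left( \frac{1}{y^{1-\alpha} - 1} \right) \right) \tag{76} \\
&= -\frac{x^{1-\alpha} \ln x}{y^{1-\alpha} - 1} + (x^{1-\alpha} - 1) \left( -\frac{1}{(y^{1-\alpha} - 1)^2} \right) \left( \frac{\partial}{\partial \alpha} (y^{1-\alpha} - 1) \right) \tag{77} \\
&= -\frac{x^{1-\alpha} \ln x}{y^{1-\alpha} - 1} + \frac{y^{1-\alpha} (\ln y)(x^{1-\alpha} - 1)}{(y^{1-\alpha} - 1)^2} \tag{78} \\
&= -\frac{x^{1-\alpha} (\ln x)(y^{1-\alpha} - 1) - y^{1-\alpha} (\ln y)(x^{1-\alpha} - 1)}{(y^{1-\alpha} - 1)^2} \tag{79} \\
&= -\frac{1}{(y^{1-\alpha} - 1)^2} \Big( x^{1-\alpha} (\ln x)(y^{1-\alpha} - 1) - y^{1-\alpha} (\ln y)(x^{1-\alpha} - 1) \Big). \tag{80}
\end{align}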
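Then, we can see that
\begin{align}
\operatorname{sgn}\left( \frac{\partial}{\partial \alpha} \left( \frac{\ln_{\alpha} x}{\ln_{\alpha} y} \right) \right) &\overset{(80)}{=} \operatorname{sgn}\left( -\frac{1}{(y^{1-\alpha} - 1)^2} \Big( x^{1-\alpha} (\ln x)(y^{1-\alpha} - 1) - y^{1-\alpha} (\ln y)(x^{1-\alpha} - 1) \Big) \right) \tag{81} \\
&= \operatorname{sgn}\left( -\frac{1}{(y^{1-\alpha} - 1)^2} \right) \cdot \operatorname{sgn}\Big( x^{1-\alpha} (\ln x)(y^{1-\alpha} - 1) - y^{1-\alpha} (\ln y)(x^{1-\alpha} - 1) \Big) \tag{82} \\
&\overset{(a)}{=} -\operatorname{sgn}\Big( x^{1-\alpha} (\ln x)(y^{1-\alpha} - 1) - y^{1-\alpha} (\ln y)(x^{1-\alpha} - 1) \Big) \tag{83} \\
&\overset{(b)}{=} -\operatorname{sgn}\left( (\ln x) \frac{y^{1-\alpha} - 1}{y^{1-\alpha}} - (\ln y) \frac{x^{1-\alpha} - 1}{x^{1-\alpha}} \right) \tag{84} \\
&= \operatorname{sgn}\Big( (y^{\alpha-1} - 1) \ln x - (x^{\alpha-1} - 1) \ln y \Big) \tag{85} \\
&= \operatorname{sgn}\left( \frac{(y^{\alpha-1} - 1) \ln x^{\alpha-1} - (x^{\alpha-1} - 1) \ln y^{\alpha-1}}{\alpha - 1} \right) \tag{86} \\
&= \operatorname{sgn}\left( \frac{1}{\alpha - 1} \right) \cdot \operatorname{sgn}\Big( (y^{\alpha-1} - 1) \ln x^{\alpha-1} - (x^{\alpha-1} - 1) \ln y^{\alpha-1} \Big) \tag{87} \\
&\overset{(c)}{=} \operatorname{sgn}\left( \frac{1}{\alpha - 1} \right) \cdot \operatorname{sgn}\Big( (b-1) \ln a - (a-1) \ln b \Big) \tag{88}
\end{align}
where
• the equality (a) follows from the fact that
$$ \operatorname{sgn}\left( -\frac{1}{(y^{1-\alpha} - 1)^2} \right) = -1 \tag{89} $$
for $y > 0$ $(y \neq 1)$ and $\alpha \in (-\infty, 1) \cup (1, +\infty)$,
• the equality (b) follows from the fact that $x^{1-\alpha}, y^{1-\alpha} > 0$ for $\alpha \in (-\infty, +\infty)$ and $x, y > 0$, and
• the equality (c) follows by the change of variables: $a = a(x, \alpha) \triangleq x^{\alpha-1}$ and $b = b(y, \alpha) \triangleq y^{\alpha-1}$.
Then, it can be easily seen that
$$ \operatorname{sgn}\left( \frac{1}{\alpha - 1} \right) = \begin{cases} 1 & \text{if } \alpha > 1, \\ -1 & \text{if } \alpha < 1. \end{cases} \tag{90} $$
Thus, to check the sign of $\frac{\partial}{\partial \alpha} \left( \frac{\ln_{\alpha} x}{\ln_{\alpha} y} \right)$, we now examine the function $(b-1) \ln a - (a-1) \ln b$. We readily see that
$$ \Big( (b-1) \ln a - (a-1) \ln b \Big) \Big|_{a=1} = \Big( (b-1) \ln a - (a-1) \ln b \Big) \Big|_{a=b} = 0 \tag{91} $$
for $b > 0$. We calculate the second-order derivative of $(b-1) \ln a - (a-1) \ln b$ with respect to $a$ as follows:
\begin{align}
\frac{\partial^2}{\partial a^2} \Big( (b-1) \ln a - (a-1) \ln b \Big) &= \frac{\partial}{\partial a} \left( \frac{\partial}{\partial a} \Big( (b-1) \ln a - (a-1) \ln b \Big) \right) \tag{92} \\
&= \frac{\partial}{\partial a} \left( (b-1) \left( \frac{d}{da} (\ln a) \right) - \left( \frac{d}{da} (a-1) \right) \ln b \right) \tag{93} \\
&= \frac{\partial}{\partial a} \left( \frac{b-1}{a} - \ln b \right) \tag{94} \\
&= (b-1) \left( \frac{d}{da} \left( \frac{1}{a} \right) \right) \tag{95} \\
&= -\frac{b-1}{a^2}. \tag{96}
\end{align}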
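Hence, we observe that
\begin{align}
\operatorname{sgn}\left( \frac{\partial^2}{\partial a^2} \Big( (b-1) \ln a - (a-1) \ln b \Big) \right) &= \operatorname{sgn}\left( -\frac{b-1}{a^2} \right) \tag{97} \\
&= \begin{cases} 1 & \text{if } 0 < b < 1, \\ 0 & \text{if } b = 1, \\ -1 & \text{if } b > 1 \end{cases} \tag{98}
\end{align}
for $a > 0$, which implies that
• if $b > 1$, then $(b-1) \ln a - (a-1) \ln b$ is strictly concave in $a > 0$ and
• if $0 < b < 1$, then $(b-1) \ln a - (a-1) \ln b$ is strictly convex in $a > 0$.
Therefore, it follows from (91) that
• if $b > 1$, then
$$ \operatorname{sgn}\Big( (b-1) \ln a - (a-1) \ln b \Big) = \begin{cases} 1 & \text{if } 1 < a < b, \\ 0 & \text{if } a = 1 \text{ or } a = b, \\ -1 & \text{if } 0 < a < 1 \text{ or } a > b \end{cases} \tag{99} $$
and
• if $0 < b < 1$, then
$$ \operatorname{sgn}\Big( (b-1) \ln a - (a-1) \ln b \Big) = \begin{cases} 1 & \text{if } 0 < a < b \text{ or } a > 1, \\ 0 & \text{if } a = b \text{ or } a = 1, \\ -1 & \text{if } b < a < 1. \end{cases} \tag{100} $$
Since $a = x^{\alpha-1}$ and $b = y^{\alpha-1}$, note that
• if $\alpha > 1$, then $1 \le a \le b$ $(b \neq 1)$ for $1 \le x \le y$ $(y \neq 1)$ and
• if $\alpha < 1$, then $0 < b \le a \le 1$ $(b \neq 1)$ for $1 \le x \le y$ $(y \neq 1)$.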
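The statement of Lemma 5 can also be probed numerically. The sketch below, under the assumption that a handful of sample orders $\alpha$ and one pair $(x, y)$ are representative, evaluates the ratio $\ln_{\alpha} x / \ln_{\alpha} y$ from (73) and (74) and checks that it grows with $\alpha$. The function names `ln_alpha`, `ratio`, and `ratios_increase` are hypothetical helpers introduced only for this illustration.

```python
import math

def ln_alpha(x, alpha):
    """alpha-logarithm of (73); falls back to ln x at alpha = 1."""
    if alpha == 1:
        return math.log(x)
    return (x ** (1 - alpha) - 1) / (1 - alpha)

def ratio(x, y, alpha):
    """Left-hand side of (74) for a given order alpha (requires y != 1)."""
    return ln_alpha(x, alpha) / ln_alpha(y, alpha)

def ratios_increase(x, y, alphas):
    """True if the ratio strictly increases along the sorted orders."""
    values = [ratio(x, y, a) for a in sorted(alphas)]
    return all(lo < hi for lo, hi in zip(values, values[1:]))

if __name__ == "__main__":
    orders = [-1, 0, 0.5, 1, 2, 3]
    # Strict increase is expected for 1 < x < y.
    print(ratios_increase(2, 5, orders))
    # Equality cases of Lemma 5: x = 1 gives ratio 0, x = y gives ratio 1.
    print([ratio(1, 5, a) for a in orders])
    print([ratio(5, 5, a) for a in orders])
```

For $x = 2$, $y = 5$, and $\alpha = 0$, the ratio reduces to $(x-1)/(y-1) = 1/4$, which gives a quick hand check of the helper.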
