ebook img

Generalized Statistics Framework for Rate Distortion Theory with Bregman Divergences PDF

0.13 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Generalized Statistics Framework for Rate Distortion Theory with Bregman Divergences

Generalized Statistics Framework for Rate Distortion Theory with Bregman Divergences 7 0 R. C. Venkatesan 0 Systems Research Corporation 2 Aundh, Pune 411007, India n [email protected] a J 5 1 Abstract—A variational principlefor the rate distortion (RD) theorywithBregmandivergencesisformulatedwithintheambit x1−q −1 ln (x)= , (4) ] ofthegeneralized(nonextensive)statisticsofTsallis.TheTsallis- q 1−q h BregmanRDlowerboundisestablished.Alternateminimization c and schemes for the generalized Bregman RD (GBRD) theory are e derived.AcomputationalstrategytoimplementtheGBRDmodel 1 m ispresented.TheefficacyoftheGBRDmodelisexemplifiedwith ex = [1+(1−q)x] /(1−q);1+(1−q)x≥0 (5) - the aid of numerical simulations. q t  0;otherwise, a st I. INTRODUCTION respectively. The Tsallis entropy (1), and, the generalized K- . Thegeneralized(nonextensive)statisticsofTsallis[1,2]has Ld (3) may be written as t a recentlybeenthefocusofmuchattentioninstatisticalphysics, m and allied disciplines 1. Nonextensive statistics generalizes S (p(x))=− p(x)qln p(x), (6) q q d- the extensiveBoltzmann-Gibbsstatistics, andhasfoundmuch Xx utility in complexsystemspossessing longrangecorrelations, n and, o fluctuations, ergodicity,chirality and fractal behavior. By def- r(x) I (p(x)kr(x))=− p(x)ln , (7) c inition, the Tsallis entropy is defined in terms of discrete q q p(x) [ variables as Xx respectively. Employing the relation [4] 3 v 1− pq(x) x 18 Sq(x)=− 1Px−q ,where, p(x)=1. (1) lnq(cid:18)y(cid:19)=yq−1(lnqx−lnqy), (8) 2 Xx the generalizedK-Ld (7)may be expressedas the generalized 1 The constant q is referred to as the nonextensivity param- mutual information 0 eter. Given two independent variables x and y, one of the 07 fundamental consequences of nonextensivity is demonstrated Iq X;X˜ =− p(x,x˜)lnq p(p(x˜x˜|x)) / by the pseudo-additivity relation (cid:16) (cid:17) xP,x˜ (cid:16) (cid:17) at = pq−11(x˜)Sq(X)− p(x˜)Sq X|X˜ =x˜ . (9) m Sq(xy)=Sq(x)+Sq(y)+(1−q)Sq(x)Sq(y). (2) Px˜ Xx˜ (cid:16) (cid:17) d- Hqe→re,1(.1T)aaknindg(2th)eimlipmlyittqha→te1xteinns(i1v)easntadtiesvtiocksiinsgrel’cHoovsepreitdala’ss | hSq(X|X{˜z=x˜)ip(x˜) } n Here,h•idenotestheexpectationvalue.Theseminalpaperon o rule,Sq(x)→S(x),theShannonentropy.Thejointlyconvex source coding within the frameworkof nonextensivestatistics c generalizedKullback-Leiblerdivergence(K-Ld)isoftheform : [3] by Landsberg and Vedral [5], has provided the impetus for v a number of investigations into the use of nonextensive in- i X q−1 formation theory within the context of coding problems. The p(x) −1 r I (p(x)kr(x))= p(x)(cid:16)r(x)(cid:17) . (3) works of Yamano [6,7] represent a sample of some of the a q q−1 prominenteffortsinthisregard.Thesourcecodingtheoremin Xx a nonextensive setting has been derived by Yamano [8]. In the limit q → 1, the extensive K-Ld is readily recovered. A statistical physics model for the variational problem Akin to the Tsallis entropy, the generalized K-Ld obeys the encountered in rate distortion (RD) theory [9, 10] and the pseudo-additivity relation [3]. informationbottleneckmethod[11],derivedwithina general- Defining the q-deformed logarithm and the q-deformed ized statistics framework, has been recently established [12]. exponential as [4] A nonextensive Blahut-Arimorto (BA) alternate minimization scheme [13] has been derived. The noteworthy result of the 1A continually updated bibliography of works related to nonextensive statistics maybefoundathttp://tsallis.cat.cbpf.br/biblio.htm. study in [12] is that the nonextensive RD curves possesses a lower threshold for the minimum compression information in A number of Bregman divergences have been tabulated in the distortion-compression plane, as compared to equivalent [15]. The generalized K-Ld in (3), also referred to as the RD curves derived on the basis of the Boltzmann-Gibbs- Csisza´r generalized K-Ld is not a Bregman divergence.Since Shannon framework. As was the case in [12], this paper the seminal work by Naudts [17], Bregman divergences have employsvaluesofqintherange0<q <1.Thispaperutilizes been the object of much research in nonextensive statistics. the statistical physics theory presented in [12] to formulate a Defining φ(p) = −plnq 1/p , the Bregman generalized K- generalized Bregman RD (GBRD) model. The GBRD model (cid:16) (cid:17) Ld is defined as extendspreviousworksbyRose [14]andBanerjee et. al. [15, d d 16] to achieve a principled and practical strategy to evaluate d (p,r)= pi pq−1−rq−1 − (p −r )rq−1 φ (q−1) i i i i i RD functions. iP=1 (cid:16) (cid:17) iP=1 =I (pkr) q,B II. RATIONALEFORTHE GBRD MODEL (10) A. Background information Setting q−1→κ, (10) is consistent with Eq. (35) in [17]. The RD problem in terms of discrete random variables is III. THE GBRD MODEL stated as follows: given a discrete random variable X ∈ Ξ called the source or the codebook, and, another discrete ran- ThisSectionprovidesastrategytojointlyobtaintheoptimal dom variable X˜ ∈Ξ˜ which is a compressed representation of supportX˜s2ofthequantizedcodebookwithcardinality X˜s = X (alsoreferredtoasthequantizedcodebookand/ortherepro- k , and, the conditional probability p(x˜|x) that charac(cid:12)(cid:12)teri(cid:12)(cid:12)zes ductionalphabet),theinformationratedistortionfunctionthat theRD problem.Thisis accomplishedbya jointoptim(cid:12)izat(cid:12)ion istobeobtainedistheminimizationofthemutualinformation of I (X,X˜) over all conditional probabilities p(x˜|x). q≥1 tioTnhoefctrhuex RofDthfeunRcDtionproubsilnegmtihsethBeAnuscmheermicea.lTdehteeramcitnuaa-l X˜sm,p(inx˜|x)LqGBRD = X˜sm,p(inx˜|x)nIq(cid:16)X;X˜(cid:17)+βhdφ(x,x˜)ip(x,x˜)o. |X˜s|=k implementation of the BA scheme is sometimes impractical (11) owing to a lack of knowledge of the optimal support of the The optimal Lagrange multiplier β, hereafter referred to as quantized codebook X˜. Exact analytical solutions exist only the inverse temperature, depends upon the optimal toler- for a few cases consisting of a combination well behaved ance level of the expectation of D =< d (x,x˜) > = φ p(x,x˜) sources and distortion measures. An initial attempt to achieve p(x,x˜)d (x,x˜). φ a practical and tractable solution to the RD problem was xP,x˜ performed by Rose [14], for the case of Euclidean square A. Constraint terms distortion functions. Therein, it was demonstrated that for sources whose support is a bounded set, the RD function Generalized statistics has utilized a number of constraints eitherequalstheShannonlowerbound,or,theoptimalsupport todefineexpectations.TheoriginalTsallis(OT)constraintsof for the quantized codebook is finite thus permitting the use the form hAi = p A [1], were convenient owing to their i i of a numerical procedure called the mapping method. The similarity to thePmi aximum entropy constraints. These were pioneering work of Rose was generalized by Banerjee et. al abandonedbecause of difficultiesencounteredin obtaining an [15, 16] to include a wider class of distortion functions using acceptable form for the partition function, with the exception Bregman divergences.One of the significant features in these of a few specialized cases. works involved the formulation of a Shannon-Bregmanlower The OT constraints were subsequently replaced by the bound. Curado-Tsallis (C-T) constraints hAiC−T = pqA . The Motivated by the recent results [12], this paper provides q i i C-T constraints were later replaced by thePi normalized two means to solve the GBRD problem for sources with Tsallis-Mendes-Plastino (T-M-P) constraints hAiT−M−P = boundedsupport.First,theanalyticalsolutionmaybeobtained pq from a Tsallis-Bregman lower bound (Section IV). Next, ipqAi. The dependence of the expectation value on the for a finite reproduction alphabet, the RD function may be Pi i numericallyobtainedbyacomputationalmethodologyderived norPmialization pdf pqi, renders the T-M-P constraints to from generalized statistics (Section III). be self-referential. PTihis paper, like [12], utilizes a recent methodology [18] to “rescue” the OT constraints, and, has B. Bregman Divergences linked the OT, C-T, and, T-M-P constraints. Definition1(Bregmandivergences):Letφbearealvalued strictly convex function defined on the convex set S ⊆ℜ, the B. The nonextensive variational principle domainofφsuchthatφisdifferentiableonint(S),theinterior Keeping X˜ fixed, taking the variational derivative of (11) of S. The Bregman divergence B : S ×int(S) 7→ ℜ+ is s φ with respect to p(x˜|x) while enforcing p(x˜|x) = 1 with definedas B (z ,z )=φ(z )−φ(z )−(z −z ,∇φ(z )), φ 1 2 1 2 1 2 2 Px˜ where B (z ,z ) = φ(z )−φ(z )−(z −z ,∇φ(z )) is φ 1 2 1 2 1 2 2 the gradient of φ evaluated at z2. 2Calligraphic symbolsindicate sets the normalization Lagrange multiplier λ(x), yields vanishes. The inverse temperature β and the effective inverse temperature β∗ relate as 1/(q−1) p(x˜|x)=p(x˜) (1−q) λ˜(x)+βd (x,x˜) , h q n φ oi β = qℵq(x)β∗ . (19) λ˜(x)= λp((xx)) −p(x)(1−q). h1−β∗(q−1)hdφ(x,x˜)ip(x˜|X=x)i (12) Multiplying the terms in the square brackets in (12) by the C. Support estimation step conditional probability p(x˜|x), and summing over x˜ yields Theprocedurein SectionIII.B.iscarriedoutfora givenβ, q p(x˜) p(x˜|x) q+β d (x,x˜)p(x˜|x)+ correspondingtoasinglepointontheRDcurve.Reproduction q−1 p(x˜) φ alphabetswithoptimalsupportareobtainedbykeepingp(x˜|x) Px˜ (cid:16) (cid:17) Xx˜ fixed, and solving (13) βhdφ(x,x˜)ip(x˜|X=x) +λ˜(x) p(x˜|x)=0. | {z } minLq =min I X;X˜ +βhd (x,x˜)i . Px˜ X˜s GBRD X˜s n q(cid:16) (cid:17) φ p(x,x˜)o (20) Evoking p(x˜|X =x)=1, yields The solution to (20) is the optimal estimate/predictor[15, 16] Px˜ q λ˜(x)= (1−q)ℵq(x)−βhdφ(x,x˜)ip(x˜|X=x). (14) x˜∗ =hxip(x|X˜=x˜) = p x|X˜ =x˜ x. (21) Xx (cid:16) (cid:17) q Here, ℵ (x) = p(x˜) p(x˜|x) . The conditional pdf q p(x˜) Algorithm1depictsthepseudo-codeforthegreedyjointnon- Px˜ (cid:16) (cid:17) p(x˜|x) acquires the form convexoptimization procedurethat constitutes the solution to the GBRD model. p(x˜|x)= p(x˜){1−(q−1)β∗dφ(x,x˜)}1/(q−1), ℑ1/(1−q) ℑRD =qℵq(x)+(q−R1D)βhdφ(x,x˜)ip(x˜|X=x), (15) IV. THETSALLIS-BREGMAN LOWERBOUND β∗ = β , ℑRD Theorem 1 For OT constraints, the nonextensive RD func- where β∗ is the effective inverse temperature. Transforming tion with source X ∼ p(x) and a Bregman divergence d is φ q →2−q∗ inthenumeratorandevoking(5),(15)isexpressed always lower bounded by the Tsallis-Bregman lower bound in the form of a q-deformed exponential defined by p(x˜)exp (−β∗d (x,x˜)) p(x˜|x)= Zq˜∗(x,β∗)φ . (16) RqL(D)=Px˜ pq−11(x˜)Sq(X)+ (22) +sup −βD+hln γ (x)i , Thepartitionfunctionevaluatedateachinstanceofthesource q∗ L,β∗ p(x) β≥0n o distribution is where γ is a unique function satisfying 1 Lβ∗ Z˜(x,β∗)=ℑ(1−q) = p(x˜)exp [−β∗d (x,x˜)] (17) RD q φ Xx˜ p(t)γ (t)exp (−β∗d (t,µ))dt=1;∀µ∈dom(φ). The term {1−(1−q∗)β∗d(x,x˜)} in the numerator of (16) Zdom(φ) L,β∗ q∗ φ (23) is a manifestation of the Tsallis cut-off condition [18]. This implies that solutions of (16) are valid when β∗d(x,x˜) < To highlight the critical dependence of the Tsallis-Bregman 1/(1−q∗). The effective nonextensive RD Helmholtz free en- lower bound upon the type of constraint employed to solve theGBRD variationalprobleminSectionIIIB, andenunciate ergy is certain aspects of q-deformed algebra and calculus, a slightly −1 1 ℑ −1 Fq (β∗)= ln Z˜(x,β∗) = RD weaker bound (see Appendix C, [16]) is stated and proved. RD β∗ D q Ep(x) β∗ (cid:28) q−1 (cid:29)p(x). Theorem 2 The generalized RD function is defined by (18) Solution of (16) may be viewed from two distinct per- R (D)= sup −βD+ p(x)ln γ∗(x)dx , spectives, i.e. the canonical perspective and the parametric q β≥0,γ∗∈Λβ∗(cid:26) Zx q∗ (cid:27) perspective. Owing to the self-referential nature of the ef- (24) fective inverse temperature β∗, the analysis and solution of where Λ is the set of all admissible partition functions. β∗ (16) within the context of the canonical perspective is a Further, for each β ≥0, a necessary and sufficient condition formidable undertaking. For practical applications, the para- for γ∗(x) to attain the supremum in (24)is a probability metric perspective is utilized by evaluating the conditional density p(x˜) related to γ∗(x) as pdf p(x˜|x), employing the nonextensive BA algorithm, for a-priori specified β∗ ∈ [0,∞]. Note that within the context −1 γ∗(x)= p(x˜)exp (−β∗d (x,x˜))dx˜ . (25) ofthe parametricperspective, the self-referentialnatureof β∗ (cid:18)Z q φ (cid:19) x˜ Proof that LqGBRD[p(x˜|x)]= D(q)L(GqB)RD = x p(x)expqN∗((−x,ββ∗∗d)(x,x˜))dx =ko∀x˜∈Bǫ ⊂X˜s, − p(x)p(x˜|x)ln p(x˜) dxdx˜+ ⇒ p(xR)p(x˜)expq∗(−β∗d(x,x˜))dxdx˜ =k p(x˜)dx˜, x,x˜ q(cid:16)p(x˜|x)(cid:17) x˜∈B∈ x N(x,β∗) o x˜∈B∈ +βR x,x˜p(x)p(x˜|x)dφ(x,x˜)dxdx˜ ⇒Rx˜∈B∈Rxp(x)p(x˜|x)dxdx˜=ko x˜∈B∈p(x˜R)dx˜⇒1=ko, (=a) R p(x)p(x˜|x)ln p(x˜|x) dxdx˜+ R R R (30) x,x˜ q∗(cid:16) p(x˜) (cid:17) holds true for all x˜ ∈ Bǫ. Defining γβ∗∗(x) = +(=b)βRRx,x˜pp(x(x)p)p((x˜x˜||xx))lndφ(xe,xx˜p)q∗d(x−dβx˜∗dφ(x,x˜)) dxdx˜+ (cid:0)Rx˜p(x˜)expq(−β∗dφ(x,x˜))dx˜(cid:1)−1, (30) yields x,x˜ q∗(cid:16) Z˜(x,β∗) (cid:17) p(x)γ∗ (x)exp (−β∗d(x,x˜))dx=1;∀x˜∈B . (31) +βR x,x˜p(x)p(x˜|x)dφ(x,x˜)dxdx˜ Zx β∗ q∗ ∈ (=c)−R x,x˜p(x)p(x˜|x)Z˜(q−1)(x,β∗)(β∗dφ(x,x˜)+ Using the relation −lnq(x)=lnq∗ 1/x , (26) becomes ln ZR˜(x,β∗) dxdx˜+β p(x)p(x˜|x)d (x,x˜)dxdx˜ (cid:16) (cid:17) q∗ x,x˜ φ (=d)− x,x˜p(x)(cid:17)p(x˜|x)lnqRZ˜(x,β∗)dxdx˜ LqGBRD[p(x˜)]=Zxp(x)lnq∗γβ∗∗(x)dx. (32) =− Rxp(x)lnqZ˜(x,β∗)dx Thus, γβ∗∗(x) satisfies (25) and attains the supremum in (24) (=e)−R p(x)ln p(x˜)exp (−β∗d (x,x˜))dx˜ dx for a given β and corresponding β∗ x q x˜ q∗ φ =LqR [p(x˜)](cid:0).R (cid:1) GBRD (26) Rq(Dβ)=−βDβ +Z p(x)lnq∗γβ∗∗(x)dx, (33) Here, (a) follows from lnq(x) = −lnq∗ 1/x ;q∗ = 2 − where D is the distortion vxalue at which the supremum in (cid:16) (cid:17) β q, (b) follows from (16) where the partition function in the (24)isattainedforagivenβ.Thus,theTsallis-Bregmanlower continuumisZ˜(x,β∗)= p(x˜)exp (−β∗d (x,x˜))dx˜,(c) x˜ q φ bound is followsfrom(8),(d)folloRwsfrom(15),and,(e)followsfrom (17). Analogous to the mapping method of [14], a mapping R (D )=−βD + p(x)ln γ∗ (x)dx=R (D ). qL β β Z q∗ β∗ q β from the unit interval to the reproduction alphabet is sought x (34) such that the Lebesgue measure over the unit interval results in an optimal p(x˜) over the reproduction alphabet. Thus, for V. NUMERICAL SIMULATIONS each instance of a probabilitymeasure correspondingto p(x˜) The efficacy of the GBRD model is computationally in- on X˜ there exists a unique map x˜ : [0,1] 7→ X˜ that maps vestigatedbydrawinga sampleof1000two-dimensionaldata the Lebesgue measure (µ) to p(x˜) such that for any function points,fromthreesphericalGaussiandistributionswithcenters f defined over x˜, f(x˜)p(x˜)dx˜ = f(x˜(u))dµ(u). (2,3.5),(0,0),(0,2) (the quantized codebook). The priors x˜ u∈[0,1] Thus, (26) yields R R and standard deviations are 0.3,0.4,0.3, and, 0.2,0.5,1.0, respectively. To test effectiveness of the support estimation Lq [x˜]= GBRD step, thequantizedcodebookis shifted from the true meansto =− p(x)ln p(x˜)exp (−β∗d (x,x˜))dx˜ dx, x q x˜ q∗ φ positionsat the edgesof the sphericalGaussian distributions. =−Rxp(x)lnq(cid:0)Ru∈[0,1]expq∗(−β∗dφ(x,x˜(u)))(cid:1)dµ(u) dx.The computational procedure described in Algorithm 1 is R (cid:16)R (2(cid:17)7) repeatedly solved for each value of β∗, till a reproduction To obtain the optimality condition, the q-deformed terms alphabet with optimal support is obtained. This consistently characteristic to generalized statistics are to be operated on coincideswiththetruemean,foranegligibleerror.Thiseffect bythedualJacksonderivativeinsteadoftheusualNewtonian is particularly pronounced, and rapidly achieved, for regions derivative [4]. To obtain expressions analogous to the case of of low and intermediate values of β∗, thus providing implicit Newton-Leibniz calculus, the dual Jackson derivative defined proof of the relation between soft clustering and RD with as Bregman divergences [15]. Fig.1depictstheRDcurvesforextensiveRDwithBregman D(q)f(x)= 1 df;1+(1−q)f(x)6=0 1+(1−q)f(x)dx divergences [15, 16] and the GBRD model, with the con- (28) ⇒D(q)ln Z˜(x,β∗)= 1 dZ˜(x,β∗) stituentdiscretepointsoverlaiduponthem.AEuclideansquare q Z˜(x,β∗) dx distortion (a Bregman divergence) is employed. Each curve is employedto enforcethe optimalitycondition.The optimal- has been generated for values of β ∈ [.1,2.5] (the extensive ity condition for (27) becomes case),andβ∗ ∈[.1,100](theGBRDcases),respectively.Note that for the GBRD cases, the slope of the RD curve is −β D(q)Lq =− p(x)ddx˜ expq∗(−β∗dφ(x,x˜(u))) dx. and not −β∗. Note that all GBRD curves inhabit the non- GBRD Z (cid:0)exp (−β∗d (x,x˜(u)))u(cid:1)dµ achievable(nocompression)regionoftheextensiveRDmodel x u∈[0,1] q∗ φ R (29) with Bregmandivergences.Further, GBRD modelspossessing Evoking Borel isomorphism, and, assuming that the optimal alowernonextensivityparameterq inhabitthenon-achievable supportX˜ contains a non-emptyopen ball B , ǫ>0 implies regions of GBRD models possessing a higher value of q. It s ǫ Algorithm 1 GBRD Model [5] P.T. Landsberg and V. Vedral. Distributions and Channel Capacities in Input GeneralizedStatisticalMechanics. Phys.Lett.A,247,pp.211-217,1998. 1. X ∼p(x) over {x }n ⊂dom(φ)⊆ℜm. [6] T. Yamano. Information Theory based on Nonadditive Information i i=1 Content. Phys.Rev.E,63,pp.046105-046111, 2001. 2. Bregman divergence dφ, |X˜s| = k, effective variational [7] T.Yamano. Generalized SymmetricMutualInformationAppliedforthe parameterβ∗ ∈[0,∞],eachβ∗ isasinglepointontheRD Channel Capacity. Phys. Rev. E., To Appear. Manuscript available at http://arxiv.org/abs/cond-mat/0102322. curve. [8] T.Yamano. SourceCodingTheorembasedonaNonadditiveInformation Output Content. PhysicaA,305,1,pp.190-195,2002. n 1.X˜∗ = {x˜}k ,P∗ = {x˜ |x }k that locally [9] T. Cover and J. Thomas. Elements of Information Theory. John Wiley s j=1 j i j=1 & Sons,NewYork,NY,1991. n oi=1 optimizes (11), (R ,D ) tradeoff at each β∗ [10] T.Berger. RateDistortionTheory. Prentice-Hall,EnglewoodCliffs,NJ, β β 2.Value of R (D) where its slope equals −β = 1971. −qℵq(xq)β∗ . [11]PNro.cTeiesdhibnyg,sFo.fCt.hPee3re7itrha,AWn.nBuaialleAkl.leTrthoenInCfoonrmferaetinocneBoonttCleonmecmkuMniceathtioodn,. [1−β∗(q−1)hdφ(x,x˜)ip(x˜|X=x)] ControlandComputing, pp368-377, 1999. Method [12] R.C.Venkatesan. GeneralizedStatisticsFrameworkforRateDistortion 1.Initialize with some {x˜}k ⊂dom(φ). Theory. Physica A, To Appear, 2007. Preliminary manuscript available j=1 2.Set up outer β∗ loop. athttp://arxiv.org/abs/cond-mat/0611567. [13] R. E. Blahut. Computation of Channel Capacity and Rate Distortion repeat Functions. IEEETrans.onInform.Theory,IT18,pp.460473,1972. Blahut-Arimoto loop (16) for single value of β∗ [14] K. Rose. A Mapping Approach to Rate Distortion Computation and repeat Analysis. IEEETrans.onInform.Theory,40,6pp.19391952,1994. [15] A. Banerjee, S. Merugu, I. Dhillon, and, J. Ghosh. Clustering with for i=1 to n do Bregman Divergences. Journal of Machine Learning Research (JMLR), for j=1 to k do 6,pp1705-1749, 2005. p(x˜ |x )← p(x˜j)expq∗(−β∗dφ(xi,x˜j)) [16] A. Banerjee. Scalable Clustering Algorithms Ph.D. dissertation, The j i Z˜(xi,β∗) University ofTexasatAustin,2005. end for [17] J. Naudts. Continuity of a Class of Entropies and Relative Entropies. end for Rev. Math. Phys., 16, pp. 809-824 , 2004. Manuscript available at http://arxiv.org/abs/cond-mat/0208038. for j=1 to k do [18] G. L. Ferri, S. Martinez, and A. Plastino. Equivalence of the Four p(x˜j)← ni=1p(x˜j|xi)p(xi) VersionsofTsallisStatistics. J.Stat.Phys.,04,pp.04009-04024,2004. end for P Manuscript available athttp://arxiv.org/abs/cond-mat/0503441. until convergence Support Estimation Step (X˜s using (21)) 1 β→∞ for j=1 to k do 0.9 β*→∞ x˜j ← ni=1p(xi|x˜j)xi 0.8 end foPr 0.7 Extensive RD with Bregman Divergences until convergence 0.6 Calculate Dβ and Rβ for the β∗. RD0.5 advance β∗ 0.4 GBRD Model:q=0.75 β∗ =β∗+δβ∗ 0.3 0.2 β→ 0 GBRD Model:q=0.50 β*→ 0 0.1 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 is observed that the GBRD model undergoes compression D and clustering more rapidly than the equivalent extensive RD model with Bregman divergences. A primary cause for such Fig. 1. Rate Distortion Curves for GBRD Model and Extensive RD with BregmanDivergences behavior is the rapid increase in β∗ for marginal increases in β, as depicted in Fig. 2 and obtained from (19). ACKNOWLEDGEMENT 100 90 This work was supportedby RAND-MSR contract CSM-DI 80 & S-QIT-101107-2005.Gratitude is expressed to A. Plastino, GBRD Model: q=0.5 70 S. Abe, and, K. Rose for helpful discussions. 60 *β 50 REFERENCES GBRD Model: q=0.75 40 [1] C. Tsallis. Possible Generalizations of Boltzmann-Gibbs Statistics. J. 30 Stat.Phys.,542,pp479-487,1988. 20 [2] M. Gell-Mann and C. Tsallis (Eds.). Nonextensive Entropy- 10 InterdisciplinaryApplications. OxfordUniversityPress,NewYork,2004. 0 [3] C. Tsallis. Generalized Entropy-Based Criterion for Consistent Testing. 0 0.5 1 1.5 2 2.5 3 3.5 4 β Phys.Rev.E,58,pp1442-1445,1998. [4] E. Borges. A Possible Deformed Algebra and Calculus Inspired in Nonextensive Thermostatistics. Physica A, 340, pp. 95-111, 2004. Fig.2. Curvesforβ v/sβ∗ forGBRDModel Manuscript available athttp://arxiv.org/abs/cond-mat/0304545.

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.