Fluctuation geometry: A counterpart approach of inference geometry

L. Velazquez
Departamento de Física, Universidad Católica del Norte, Av. Angamos 0610, Antofagasta, Chile.

arXiv:1107.2387v2 [math.ST] 15 Jan 2012

Abstract. Starting from an axiomatic perspective, fluctuation geometry is developed as a counterpart approach of inference geometry. This approach is inspired by the existence of a notable analogy between the general theorems of inference theory and the general fluctuation theorems associated with a parametric family of distribution functions dp(I|θ) = ρ(I|θ)dI, which describes the behavior of a set of continuous stochastic variables driven by a set of control parameters θ. In this approach, statistical properties are rephrased as purely geometric notions derived from the Riemannian structure on the manifold M_θ of the stochastic variables I. Consequently, this theory arises as an alternative framework for applying the powerful methods of differential geometry to statistical analysis. Fluctuation geometry has direct implications for statistics and physics. This geometric approach inspires a Riemannian reformulation of Einstein fluctuation theory as well as a geometric redefinition of the information entropy for a continuous distribution.

PACS numbers: 02.50.-r; 02.40.Ky; 05.45.-a; 02.50.Tt

Keywords: Geometrical methods in statistics; Fluctuation theory; Information theory

1. Introduction

Inference theory supports the introduction of a Riemannian distance notion [1]:

ds_F^2 = g_{αβ}(θ) dθ^α dθ^β   (1)

to characterize the statistical distance between two close members of a generic parametric family of distribution functions:

dp(I|θ) = ρ(I|θ) dI.   (2)

Here, the metric tensor g_{αβ}(θ) is provided by the so-called Fisher information matrix [2]. The existence of this type of Riemannian formulation was first suggested by Rao [3]; this framework is hereafter referred to as inference geometry‡. Inference geometry provides very strong tools for proving results about statistical models, simply by considering them as well-defined geometrical objects. As expected, inference theory and its geometry have a direct application in those physical theories with a statistical apparatus, such as statistical mechanics and quantum theories. A curious example is the so-called extreme physical information principle, proposed by Frieden in 1998, which claims that all physical laws can be derived from purely inferential arguments [4]. Inference geometry has been adapted to the mathematical apparatus of quantum mechanics [5, 6, 7, 8]; in fact, modern interpretations of uncertainty relations are inspired by its arguments [9, 10, 11]. Inference geometry has also been successfully employed in statistical mechanics to study phase transitions [12, 13], as well as in the framework of so-called thermodynamic geometry [14].

‡ This approach is referred to as Riemannian geometry on statistical manifolds in the literature. However, the denomination inference geometry is employed here to avoid ambiguity with fluctuation geometry, which is also a Riemannian geometry on a statistical manifold.
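For orientation, the following minimal numerical sketch (added here as an illustration; the univariate Gaussian family and every symbol in the snippet are assumptions of this example, not notation fixed by the paper) estimates the Fisher information matrix that defines the metric (1) for a family dp(I|θ) with θ = (μ, σ), and compares it with the well-known closed form diag(1/σ², 2/σ²):

```python
import numpy as np

# Illustrative sketch: Monte Carlo estimate of the Fisher matrix g_{alpha beta}
# for a univariate Gaussian family (an assumed example, not taken from the paper).
rng = np.random.default_rng(0)
mu, sigma = 1.0, 2.0                     # control parameters theta = (mu, sigma)
I = rng.normal(mu, sigma, 10**6)         # outcomes drawn from rho(I|theta)

# Score components for a single outcome, upsilon_alpha = -d log rho / d theta^alpha
v_mu = -(I - mu) / sigma**2
v_sigma = 1.0 / sigma - (I - mu)**2 / sigma**3

scores = np.stack([v_mu, v_sigma])
g_mc = scores @ scores.T / I.size        # g_{alpha beta} = <upsilon_alpha upsilon_beta>
g_exact = np.diag([1.0 / sigma**2, 2.0 / sigma**2])

print(np.round(g_mc, 3))                 # close to g_exact up to sampling error
print(g_exact)
```

In these coordinates the off-diagonal entries vanish, so the inference distance (1) for this assumed family reduces to ds_F² = dμ²/σ² + 2 dσ²/σ².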
The goal of this paper is to show that the parametric family of distribution functions (2) supports the introduction of an alternative Riemannian distance notion:

ds² = g_{ij}(I|θ) dI^i dI^j,   (3)

which characterizes the statistical distance between two close sets of continuous stochastic variables I and I + dI for fixed values of the control parameters θ. This geometrical approach, hereafter referred to as fluctuation geometry, is inspired by the existence of a notable analogy between the general theorems of inference theory and some general fluctuation theorems recently proposed in the framework of equilibrium classical fluctuation theory [15, 16, 17, 18]. The main consequence derived from this analysis is the possibility of rephrasing the original parametric family of distribution functions (2) in terms of purely geometric notions. This connection enables the direct application of powerful tools of differential geometry to the analysis of their absolute statistical properties§.

§ The tensorial formalism of Riemannian geometry allows one to study the absolute geometric properties of a manifold M using any of its coordinate representations R_I. A relevant example of an absolute property is the curvature of the manifold M, which is manifested in any coordinate representation R_I. In general relativity theory, the curvature of the space-time M_4 is identified with the gravitational interaction. The effects of gravitation are absolute, or irreducible, while the effects associated with inertial forces are reducible. In fact, the existence of inertial forces crucially depends on the reference frame, that is, on the specific coordinate representation of the space-time M_4.

The paper is organized as follows. For the sake of self-consistency, the next section is devoted to discussing the main motivation of the present proposal: the analogy between inference theory and fluctuation theory. Afterwards, an axiomatic formulation of fluctuation geometry is developed. Firstly, the postulates of fluctuation geometry are presented in section 3. Then, their consequences are considered in section 4 to perform a geometric reinterpretation of the statistical description. Section 5 is devoted to discussing two simple application examples of fluctuation geometry. Implications of the present approach for some problems of statistics and physics are analyzed in section 6; as examples, a reconsideration of the information entropy for a continuous distribution and the development of a Riemannian reformulation of Einstein fluctuation theory. Final remarks and open problems are summarized in section 7.

2. Motivation

Let us start from the parametric family of distribution functions (2), which describes the behavior of a set of continuous stochastic variables I driven by a set θ of control parameters. Let us denote by M_θ the statistical manifold constituted by all admissible values of the stochastic variables I that are accessible for a given value θ of the control parameters, which is hereafter assumed to be a simply connected domain. Moreover, let us denote by P the statistical manifold constituted by all admissible values of the control parameters θ (each point θ ∈ P represents a given distribution function). The parametric family of distribution functions (2) can be analyzed from two different perspectives:

• To study the fluctuating behavior of the stochastic variables I ∈ M_θ, which is the main interest of fluctuation theory [17];
• To study the relationship between this fluctuating behavior and the external control described by the parameters θ ∈ P, which is the interest of inference theory [2].

2.1. Fluctuation theory

Let us admit that the probability density ρ(I|θ) is everywhere finite and differentiable, and obeys the following conditions for every point I_b located on the boundary ∂M_θ of the statistical manifold M_θ, I_b ∈ ∂M_θ:

lim_{I→I_b} ρ(I|θ) = lim_{I→I_b} ∂ρ(I|θ)/∂I^i = 0.   (4)
The probability density ρ(I|θ) can be employed to introduce the differential forces η_i(I|θ):

η_i(I|θ) = −∂ log ρ(I|θ)/∂I^i.   (5)

By definition, the differential forces η_i(I|θ) vanish at those points where the probability density ρ(I|θ) exhibits its local maxima or its local minima. The global (local) maxima of the probability density can be regarded as stable (metastable) equilibrium points Ī, which can be obtained from the following stationary and stability conditions:

−∂ log ρ(Ī|θ)/∂I^i = 0,   −∂² log ρ(Ī|θ)/∂I^i ∂I^j ≻ 0,   (6)

where A_{ij} ≻ 0 denotes that the matrix A_{ij} is positive definite. In general, the differential forces η_i(I|θ) characterize the deviation of a given point I ∈ M_θ from these local equilibrium points. Analogously, it is convenient to introduce the response matrix χ_{ij}(I|θ):

χ_{ij}(I|θ) = ∂_j η_i(I|θ),   (7)

where ∂_i A = ∂A/∂I^i, which describes the response of the differential forces η_i(I|θ) under an infinitesimal change of the variable I^j.

Regarded as stochastic quantities, the differential forces η_i = η_i(I|θ) have identically vanishing expectation values:

⟨η_i⟩ = 0,   (8)

and these quantities also obey the fundamental and the associated fluctuation theorems [17]:

⟨η_i δI^j⟩ = δ_i^j,   (9)

⟨χ_{ij}⟩ = ⟨η_i η_j⟩,   (10)

where δ_i^j is the Kronecker delta. The previous theorems are derived from the following identity:

⟨∂_i A(I|θ)⟩ = ⟨η_i(I|θ) A(I|θ)⟩,   (11)

substituting the cases A(I|θ) = 1, I^i and η_i, respectively. Here, A(I) is a differentiable function defined on the continuous variables I, with definite expectation values ⟨∂A(I|θ)/∂I^i⟩, that obeys the following boundary condition:

lim_{I→I_b} A(I) ρ(I|θ) = 0.   (12)

Moreover, equation (11) follows from the integral expression:

∫_{M_θ} [∂υ^j(I|θ)/∂I^j] ρ(I|θ) dI = ∮_{∂M_θ} ρ(I|θ) υ^j(I|θ) · dΣ_j − ∫_{M_θ} υ^j(I|θ) [∂ρ(I|θ)/∂I^j] dI,   (13)

derived from the intrinsic exterior calculus of the statistical manifold M_θ and the imposition of the constraint υ^j(I|θ) ≡ δ^j_i A(I|θ). It is easy to realize that the identity (8) and the associated fluctuation theorem (10) are just the stationary and stability equilibrium conditions (6) written in terms of statistical expectation values, respectively. In particular, the positive definite character of the self-correlation matrix M_{ij}(θ) = ⟨η_i(I|θ) η_j(I|θ)⟩ implies the positive definiteness of the matrix:

⟨∂_i η_j(I|θ)⟩ = ⟨−∂² log ρ(I|θ)/∂I^i ∂I^j⟩ ≻ 0.   (14)

Remarkably, the fundamental fluctuation theorem (9) suggests the statistical complementarity of the variable I^i and its conjugated differential force η_i = η_i(I|θ). Using the Schwarz inequality ⟨δA δB⟩² ≤ ⟨δA²⟩⟨δB²⟩, one obtains the following inequality:

ΔI^i Δη_i ≥ 1,   (15)

where Δx = √⟨δx²⟩ is the statistical uncertainty of the quantity x. Clearly, this last inequality exhibits the same mathematical appearance as Heisenberg's uncertainty relation ΔqΔp ≥ ℏ. Recently, this result was employed to show the existence of uncertainty relations involving conjugated thermodynamic quantities [15, 16].
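The theorems (8)-(10) are easy to verify by direct simulation. The following sketch (an illustrative check assuming a univariate Gaussian density; the model and the sample size are choices of this example, not of the paper) evaluates the differential force η = (I − μ)/σ² and the response χ = 1/σ² analytically and compares sample averages with the predicted identities:

```python
import numpy as np

# Illustrative numerical check of the fluctuation theorems (8)-(10) for an
# assumed univariate Gaussian density rho(I|theta), theta = (mu, sigma).
rng = np.random.default_rng(1)
mu, sigma = 0.5, 1.5
I = rng.normal(mu, sigma, 10**6)

eta = (I - mu) / sigma**2                 # differential force, Eq. (5)
chi = np.full_like(I, 1.0 / sigma**2)     # response matrix (a scalar here), Eq. (7)
dI = I - I.mean()                         # fluctuation delta I

print(eta.mean())                         # ~ 0,                    Eq. (8)
print((eta * dI).mean())                  # ~ 1,                    Eq. (9)
print(chi.mean(), (eta * eta).mean())     # both ~ 1/sigma^2,       Eq. (10)
```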
Equation (15) can be generalized by considering the inverse M^{ij}(θ) of the self-correlation matrix of the differential forces M_{ij}(θ) = ⟨η_i(I|θ) η_j(I|θ)⟩. Denoting by C^{ij}(θ) = ⟨δI^i δI^j⟩ the self-correlation matrix of the stochastic variables I, it is possible to obtain the following matrix inequality:

C^{ij}(θ) − M^{ij}(θ) ⪰ 0.   (16)

This last inequality is directly obtained from the positive definiteness of the self-correlation matrix K^{ij}(θ) = ⟨J^i(I|θ) J^j(I|θ)⟩ of the auxiliary quantities J^i(I|θ) = δI^i − M^{ij}(θ) η_j(I|θ). Accordingly, the self-correlation matrix C^{ij}(θ) of the stochastic variables I is bounded from below by the inverse M^{ij}(θ) of the self-correlation matrix of the differential forces η_i.

2.2. Inference theory

Inference theory can be described as the problem of deciding how well a set of outcomes ℐ = {I^(1), I^(2), ..., I^(m)} obtained from independent measurements fits a proposed distribution function dp(I|θ) [2]. This question is fully equivalent to inferring the values of the control parameters θ from this experimental information. To make inferences about the control parameters, one employs estimators θ̂^α = θ^α(ℐ), that is, functions of the outcomes ℐ ∈ M_θ^m, where M_θ^m = M_θ ⊗ M_θ ⊗ ... ⊗ M_θ (the m-fold external product of the statistical manifold M_θ). The values of these functions are intended to be the best guess for θ^α.

Let us admit that the probability density ρ(I|θ) is everywhere differentiable and finite on the statistical manifold P of the control parameters θ. Let us start by introducing the statistical expectation values ⟨A(ℐ|θ)⟩ as follows:

⟨A(ℐ|θ)⟩ = ∫_{M_θ^m} A(ℐ|θ) ϱ(ℐ|θ) dℐ,   (17)

where dℐ = dI^(1) dI^(2) ... dI^(m) and ϱ(ℐ|θ) is the so-called likelihood function:

ϱ(ℐ|θ) = ∏_{i=1}^{m} ρ(I^(i)|θ).   (18)

Taking the partial derivative ∂_α = ∂/∂θ^α of Eq. (17), one obtains the following mathematical identity:

∂_α ⟨A(ℐ|θ)⟩ − ⟨∂_α A(ℐ|θ)⟩ = −⟨A(ℐ|θ) υ_α(ℐ|θ)⟩,   (19)

where:

υ̂_α = υ_α(ℐ|θ) = −∂ log ϱ(ℐ|θ)/∂θ^α   (20)

are the components of the score vector υ̂ = {υ̂_α}. Substituting A(ℐ|θ) = 1 into Eq. (19), one arrives at the vanishing of the expectation values of the score vector components:

⟨υ_α(ℐ|θ)⟩ = 0.   (21)

Let us now consider an arbitrary unbiased estimator A(ℐ|θ) = θ^α(ℐ) of the parameter θ^α, ⟨θ^α(ℐ)⟩ = θ^α, as well as the score vector component A(ℐ|θ) = υ_β(ℐ|θ). Substituting these quantities into the identity (19), it is possible to obtain the following results:

⟨δθ^α(ℐ) υ_β(ℐ|θ)⟩ = −δ^α_β,   (22)

⟨∂_β υ_α(ℐ|θ)⟩ = ⟨υ_α(ℐ|θ) υ_β(ℐ|θ)⟩.   (23)

It is easy to realize that the identities (21) and (23) can be regarded as the stationary and stability conditions of the known method of maximum likelihood estimators [2]. According to this method, the best values of the parameters θ should maximize the logarithm of the likelihood function ϱ(ℐ|θ) for a given set of outcomes ℐ. Such a requirement leads to the following stationary and stability conditions:

−∂ log ϱ(ℐ|θ̄)/∂θ^α = 0,   −∂² log ϱ(ℐ|θ̄)/∂θ^α ∂θ^β ≻ 0,   (24)

which should be solved to obtain the maximum likelihood estimators θ̂^α_mle = θ̄^α(ℐ). On the other hand, the identity (22) also suggests the statistical complementarity between the estimator θ̂^α = θ^α(ℐ) and its conjugated score vector component υ_α. Using the Schwarz inequality, one obtains the following uncertainty-like inequality:

Δθ̂^α Δυ_α ≥ 1.   (25)
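The identities (21), (22) and the uncertainty-like inequality (25) admit an analogous numerical check. In the sketch below (an assumed model: m Gaussian outcomes with known σ and unknown μ, whose sample mean is an unbiased and efficient estimator; none of these choices come from the paper), the uncertainty product Δθ̂ Δυ saturates the bound in (25):

```python
import numpy as np

# Illustrative check of Eqs. (21), (22) and (25) for an assumed model:
# m Gaussian outcomes with known sigma and unknown mu, estimated by the sample mean.
rng = np.random.default_rng(2)
mu, sigma, m, trials = 0.0, 2.0, 20, 10**5
I = rng.normal(mu, sigma, (trials, m))       # 'trials' independent outcome sets

theta_hat = I.mean(axis=1)                   # unbiased estimator of mu
upsilon = -(I - mu).sum(axis=1) / sigma**2   # score component, Eq. (20)

print(upsilon.mean())                        # ~ 0,   Eq. (21)
print(((theta_hat - mu) * upsilon).mean())   # ~ -1,  Eq. (22)
print(theta_hat.std() * upsilon.std())       # ~ 1,   saturates Eq. (25)
```

Saturation occurs here because the sample mean is an efficient estimator; for a generic unbiased estimator the product exceeds unity.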
The inequality (25) can be improved by introducing the inverse matrix g^{αβ}(θ) of the self-correlation matrix g_{αβ}(θ) = ⟨υ_α(ℐ|θ) υ_β(ℐ|θ)⟩ and the auxiliary quantity X^α = δθ̂^α + g^{αβ} υ_β. Thus, one can compose the positive definite form:

⟨(λ_α X^α)²⟩ = ⟨X^α X^β⟩ λ_α λ_β ≥ 0,   (26)

which leads to the positive semidefiniteness of the matrix:

⟨δθ̂^α δθ̂^β⟩ − g^{αβ}(θ) ⪰ 0.   (27)

This last inequality is the famous Cramér-Rao theorem of inference theory [2, 3], which imposes a lower bound on the efficiency of unbiased estimators θ̂^α. Here, the self-correlation matrix g_{αβ}(θ):

g_{αβ}(θ) = ⟨υ_α(ℐ|θ) υ_β(ℐ|θ)⟩   (28)

is the Fisher information matrix referred to in the introductory section.

2.3. Analogy between inference theory and fluctuation theory

Fluctuation theory and inference theory provide two different but complementary characterizations of a given parametric family of distribution functions dp(I|θ). Formally, these two statistical frameworks can be regarded as dual counterpart approaches because of the great analogy among their main definitions and theorems, as clearly evidenced in Table 1.

Inference theory                          | Fluctuation theory
------------------------------------------|------------------------------------------
υ(ℐ|θ) = −∇_θ log ϱ(ℐ|θ)                  | η(I|θ) = −∇_I log ρ(I|θ)
⟨υ(ℐ|θ)⟩ = 0                              | ⟨η(I|θ)⟩ = 0
⟨υ(ℐ|θ) · δθ̂⟩ = −1_θ                      | ⟨η(I|θ) · δI⟩ = 1_I
⟨∇_θ · υ(ℐ|θ)⟩ = ⟨υ(ℐ|θ) · υ(ℐ|θ)⟩         | ⟨∇_I · η(I|θ)⟩ = ⟨η(I|θ) · η(I|θ)⟩

Table 1. Analogy between inference theory and fluctuation theory.

To simplify the notation, the gradient operators ∇_I → ∂_i and ∇_θ → ∂_α, the dyadic products A·B = A_i B_j e^i·e^j and ξ·ψ = ξ_α ψ_β ǫ^α·ǫ^β, and the Kronecker deltas δ^i_j → 1_I and δ^α_β → 1_θ have been introduced here.

Remarkably, the analogy between fluctuation theory and inference theory is incomplete in regard to their respective geometric features. The parametric family dp(I|θ) is expressed in the representations R_I and R_θ of the statistical manifolds M_θ and P, respectively. Equivalently, the same parametric family can also be rewritten using the representations R_Θ and R_ν of the manifolds M_θ and P, which implies the consideration of the coordinate changes Θ(I): R_I → R_Θ and ν(θ): R_θ → R_ν. Under these parametric changes, the Fisher inference matrix (28) behaves as the components of a second-rank covariant tensor:

g_{γδ}(ν) = (∂θ^α/∂ν^γ)(∂θ^β/∂ν^δ) g_{αβ}(θ).   (29)

The existence of these transformation rules guarantees the invariance of the inference distance notion (1). Thus, the statistical manifold P of the control parameters θ can be endowed with a Riemannian structure. The relevance of the distance notion (1) can be understood by considering the asymptotic expression of the distribution function of the efficient unbiased estimators θ̂_eff(ℐ):

dQ^m(ϑ|θ) = ∫_{M_θ^m} δ[ϑ − θ̂_eff(ℐ)] ϱ(ℐ|θ) dℐ   (30)

when the number of outcomes m is sufficiently large. Since g_{αβ}(θ) ∝ m, one can obtain the following approximation formula:

dQ^m(ϑ|θ) ≃ exp[−(1/2) g_{αβ}(θ) Δϑ^α Δϑ^β] √|g_{αβ}(θ)/2π| dϑ,   (31)

where Δϑ^α = ϑ^α − θ^α. Accordingly, the distance notion (1) provides the distinguishing probability between two close distribution functions of the parametric family dp(I|θ) through the inferential procedure.

The analogy between fluctuation theory and inference theory strongly suggests the existence of a counterpart approach of inference geometry in the framework of fluctuation theory, that is, the existence of a Riemannian distance notion (3) to characterize the statistical separation between close points I and I + dI ∈ M_θ. Unfortunately, the underlying analogy is insufficient to introduce the particular expression of the metric tensor g_{ij}(I|θ).
For example, it is easy to check that the fluctuation theorems (8)-(10) can also be expressed in the new coordinate representation R_Θ [18], as, for example, the associated fluctuation theorem:

⟨∂_i η_j(Θ|θ)⟩ = ⟨η_i(Θ|θ) η_j(Θ|θ)⟩,   (32)

where ∂_i = ∂/∂Θ^i and η_i(Θ|θ):

η_i(Θ|θ) = −∂ log ρ(Θ|θ)/∂Θ^i.   (33)

However, the self-correlation matrix M̃_{ij}(θ) = ⟨η_i(Θ|θ) η_j(Θ|θ)⟩ associated with the new coordinate representation R_Θ is not related by local transformation rules to its counterpart expression M_{ij}(θ) = ⟨η_i(I|θ) η_j(I|θ)⟩ in the old coordinate representation R_I. The self-correlation matrix of the differential forces M_{ij}(θ) = ⟨η_i(I|θ) η_j(I|θ)⟩ is a matrix function defined on the statistical manifold P of the control parameters θ, while the metric tensor g_{ij}(I|θ) is a tensorial entity defined on the statistical manifolds M_θ and P. As expected, the definition of the metric tensor g_{ij}(I|θ) cannot involve integral expressions over the manifold M_θ, as in the case of the inference metric tensor (28).

3. Fluctuation geometry

Fluctuation geometry can be formulated starting from a set of axioms that combine the statistical nature of the manifold M_θ with the notions of differential geometry. These axioms specify the way the metric tensor g_{ij}(I|θ) associated with the parametric family (2) is introduced. This section is devoted to discussing these axioms and their most direct consequences.

3.1. Postulates

Axiom 1. The manifold M_θ of the stochastic variables possesses a Riemannian structure, that is, it is endowed with a metric tensor g_{ij}(I|θ) and a torsionless covariant differentiation D_i that obeys the following constraint:

D_k g_{ij}(I|θ) = 0.   (34)

Definition 1. The Riemannian structure on the statistical manifold M_θ allows one to introduce the invariant volume element as follows:

dμ(I|θ) = √|g_{ij}(I|θ)/2π| dI,   (35)

where |g_{ij}(I|θ)| denotes the absolute value of the metric tensor determinant.

Axiom 2. There exists a differentiable scalar function S(I|θ) defined on the statistical manifold M_θ, hereafter referred to as the information potential, whose knowledge determines the distribution function dp(I|θ) of the stochastic variables I ∈ M_θ as follows:

dp(I|θ) = exp[S(I|θ)] dμ(I|θ).   (36)

Definition 2. Let us consider an arbitrary curve I(t) ∈ M_θ given in parametric form with fixed extreme points I(t_1) = P and I(t_2) = Q. Adopting the notation:

İ^i = dI^i(t)/dt,   (37)

the length Δs of this curve can be expressed as:

Δs = ∫_{t_1}^{t_2} √( g_{ij}[I(t)|θ] İ^i(t) İ^j(t) ) dt.   (38)

Definition 3. The curve I(t) ∈ M_θ is said to exhibit a unitary affine parametrization when its parameter t satisfies the following constraint:

g_{ij}[I(t)|θ] İ^i(t) İ^j(t) = 1.   (39)

Definition 4. A geodesic is the curve I_g(t) with minimal length between two fixed arbitrary points (P,Q) ∈ M_θ. Moreover, the distance D_θ(P,Q) between these two points (P,Q) is given by the length of the associated geodesic I_g(t):

D_θ(P,Q) = ∫_{t_1}^{t_2} √( g_{ij}[I_g(t)|θ] İ_g^i(t) İ_g^j(t) ) dt.   (40)
Definition 5. Let us consider a differentiable curve I(t) ∈ M_θ with a unitary affine parametrization. The information dissipation Φ(t) along the curve I(t) is defined as follows:

Φ(t) = dS[I(t)|θ]/dt.   (41)

Axiom 3. The length Δs of any interval (t_1, t_2) of an arbitrary geodesic I_g(t) ∈ M_θ with a unitary affine parametrization is given by the negative of the variation of its information dissipation ΔΦ(t):

Δs = −ΔΦ(t) = Φ(t_1) − Φ(t_2).   (42)

Axiom 4. If M_θ is not a closed manifold, ∂M_θ ≠ ∅, the probability density ρ(I|θ) associated with the distribution function (36) vanishes, together with its first partial derivatives, at every point of the boundary ∂M_θ of the statistical manifold M_θ.

3.2. Analysis of the axioms and their direct consequences

Axiom 1 postulates the existence of the metric tensor g_{ij}(I|θ) defined on the statistical manifold M_θ. Moreover, this axiom specifies the Riemannian structure of the manifold M_θ starting from the knowledge of the metric tensor g_{ij}(I|θ), e.g. the covariant differentiation D_i and the curvature tensor R_{ijkl}(I|θ). Equation (34) is a strong constraint of Riemannian geometry that determines a natural affine connection Γ^k_{ij} for the covariant differentiation D_i, specifically, the so-called Levi-Civita connection [19]:

Γ^k_{ij}(I|θ) = (1/2) g^{km} ( ∂g_{im}/∂I^j + ∂g_{jm}/∂I^i − ∂g_{ij}/∂I^m ).   (43)

The knowledge of the affine connection Γ^k_{ij} allows the introduction of the curvature tensor R^l_{ijk} = R^l_{ijk}(I|θ) of the manifold M_θ:

R^l_{ijk} = ∂Γ^l_{jk}/∂I^i − ∂Γ^l_{ik}/∂I^j + Γ^l_{im} Γ^m_{jk} − Γ^l_{jm} Γ^m_{ik},   (44)

which is also derived from the knowledge of the metric tensor g_{ij}(I|θ) and its first and second partial derivatives.
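Both geometric objects (43) and (44) follow mechanically from the metric, so they are convenient to evaluate symbolically. The sketch below (a generic illustration with an assumed metric, namely the flat plane in polar coordinates, rather than a metric derived from a distribution of this paper) computes the Levi-Civita connection and all components of the curvature tensor; the connection coefficients are non-trivial while the curvature vanishes identically, in the spirit of the earlier remark that curvature is an absolute property whereas connection-like effects depend on the coordinate representation:

```python
import sympy as sp

# Symbolic evaluation of Eqs. (43)-(44) for an assumed metric (flat plane in
# polar coordinates), chosen only to illustrate the machinery.
r, t = sp.symbols('r t', positive=True)
x = [r, t]
g = sp.Matrix([[1, 0], [0, r**2]])
ginv = g.inv()
n = 2

# Levi-Civita connection Gamma^k_{ij}, Eq. (43)
Gamma = [[[sum(ginv[k, m] * (sp.diff(g[i, m], x[j]) + sp.diff(g[j, m], x[i])
                             - sp.diff(g[i, j], x[m])) for m in range(n)) / 2
           for j in range(n)] for i in range(n)] for k in range(n)]

# Curvature tensor R^l_{ijk}, Eq. (44)
def R(l, i, j, k):
    term = sp.diff(Gamma[l][j][k], x[i]) - sp.diff(Gamma[l][i][k], x[j])
    term += sum(Gamma[l][i][m] * Gamma[m][j][k] - Gamma[l][j][m] * Gamma[m][i][k]
                for m in range(n))
    return sp.simplify(term)

print(sp.simplify(Gamma[0][1][1]), sp.simplify(Gamma[1][0][1]))  # -r and 1/r
print(all(R(l, i, j, k) == 0 for l in range(n) for i in range(n)
          for j in range(n) for k in range(n)))   # True: every component vanishes
```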
Axiom 2 postulates the probabilistic nature of the manifold M_θ, in particular, the existence of the distribution function dp(I|θ) and the information potential S(I|θ). Equivalently, this axiom provides a formal definition of the information potential S(I|θ) when one starts from the knowledge of the parametric family (2). The probability density ρ(I|θ) of the parametric family (2) obeys the transformation rule of a tensorial density:

ρ(Θ|θ) = ρ(I|θ) |∂Θ/∂I|^{-1}   (45)

under a coordinate change Θ(I): R_I → R_Θ of the statistical manifold M_θ. The covariance of the metric tensor g_{ij}(I|θ):

g_{ij}(Θ|θ) = (∂I^m/∂Θ^i)(∂I^n/∂Θ^j) g_{mn}(I|θ)   (46)

implies that the prefactor of the invariant volume element (35) also behaves as a tensorial density:

√|g_{ij}(Θ|θ)/2π| = √|g_{ij}(I|θ)/2π| |∂Θ/∂I|^{-1}.   (47)

Admitting that the metric tensor determinant |g_{ij}(I|θ)| is non-vanishing everywhere, it is possible to introduce the probability weight ω(I|θ):

ω(I|θ) = ρ(I|θ) √|2π g^{ij}(I|θ)|,   (48)

which represents a scalar function defined on the manifold M_θ. Since the statistical manifold M_θ possesses a Riemannian structure, the integration over the usual volume element dI can be replaced by an integration over the invariant volume element dμ(I|θ). This consideration allows one to rephrase the parametric family (2) in the following equivalent representation:

dp(I|θ) = ω(I|θ) dμ(I|θ),   (49)

which explicitly exhibits the covariance of the distribution function under coordinate reparametrizations of the manifold M_θ. The information potential S(I|θ) is defined as the logarithm of the probability weight ω(I|θ):

S(I|θ) = log ω(I|θ),   (50)

which also represents a scalar function defined on the statistical manifold M_θ. As discussed in section 6, the negative of the information potential S(I|θ) can be regarded as a local invariant measure of the information content in the framework of information theory. Additionally, S(I|θ) can be identified with the scalar entropy of a closed system in the framework of classical fluctuation theory [18].

Given the probability density ρ(I|θ), the information potential S(I|θ) depends on the metric tensor g_{ij}(I|θ) of the statistical manifold M_θ. Axiom 1 postulates the existence of this tensor, but its specific definition is still arbitrary. Axiom 3 eliminates this ambiguity. In fact, it establishes a direct connection between the distance notion (3) and the information dissipation (41), or equivalently, between the metric tensor g_{ij}(I|θ) and the information potential S(I|θ).

Theorem 1. The metric tensor g_{ij}(I|θ) can be identified with the negative of the covariant Hessian H_{ij}(I|θ) of the information potential S(I|θ):

g_{ij}(I|θ) = −H_{ij}(I|θ) = −D_i D_j S(I|θ).   (51)

Corollary 1. The information potential S(I|θ) is locally concave everywhere and the metric tensor g_{ij}(I|θ) is positive definite on the statistical manifold M_θ.

Proof. The search for the curve with minimal length (40) between two arbitrary points (P,Q) is a variational problem that leads to the so-called geodesic differential equations [19]:

İ_g^k(t) D_k İ_g^i(t) = Ï_g^i(t) + Γ^i_{mn}[I_g(t)|θ] İ_g^m(t) İ_g^n(t) = 0,   (52)

which describe the geodesics I_g(t) with a unitary affine parametrization. Equations (41) and (42) can be rephrased as follows:

Δs = −ΔΦ(s)  →  dΦ(s)/ds = d²S/ds² = −1.   (53)

Considering the geodesic differential equations (52), the constraint (53) is rewritten as:

d²S/ds² = Ï_g^k ∂S/∂I^k + İ_g^i İ_g^j ∂²S/∂I^i∂I^j = İ_g^i İ_g^j { ∂²S/∂I^i∂I^j − Γ^k_{ij} ∂S/∂I^k },   (54)

which involves the covariant Hessian H_{ij}:

H_{ij} = D_i D_j S = ∂²S/∂I^i∂I^j − Γ^k_{ij} ∂S/∂I^k.   (55)

Combining equations (54)-(55) and the constraint (39), one obtains the following expression:

(g_{ij} + H_{ij}) İ_g^i İ_g^j = 0.   (56)

Its covariant character leads to Eq. (51). Corollary 1, that is, the concave character of the information potential S(I|θ) and the positive definiteness of the metric tensor g_{ij}(I|θ), follows directly from equation (53).

Corollary 2. The metric tensor g_{ij} = g_{ij}(I|θ) can be obtained from the probability density ρ = ρ(I|θ) through the following set of covariant second-order partial differential equations:

g_{ij} = −∂² log ρ/∂I^i∂I^j + Γ^k_{ij} ∂ log ρ/∂I^k + ∂_i Γ^k_{jk} − Γ^k_{ij} Γ^l_{kl}.   (57)

The admissible solutions for the metric tensor g_{ij} should be finite and differentiable everywhere, including on the boundary ∂M_θ of the statistical manifold M_θ.

Proof. Expression (57) is derived from equation (51) by rewriting the information potential as S(I|θ) ≡ log ρ(I|θ) − log √|g_{ij}(I|θ)/2π| and considering the following identity:

Γ^j_{ij}(I|θ) ≡ ∂ log √|g_{ij}(I|θ)/2π| / ∂I^i,   (58)

which is a known property of the Levi-Civita connection (43).

Axiom 4 specifies the asymptotic behavior of the distribution function (36) at any point I_b on the boundary ∂M_θ:

lim_{I→I_b} ρ(I|θ) = lim_{I→I_b} ∂ρ(I|θ)/∂I^i = 0.   (59)

These conditions are necessary to obtain the general fluctuation theorems (8)-(10) reviewed in the previous section. Moreover, this axiom will be employed to analyze the character of the stationary points (maxima and minima) of the information potential S(I|θ).
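Corollary 2 reduces the determination of the metric tensor to the covariant partial differential equation (57). As a minimal consistency check (the example is an assumption of this note; the paper discusses its application examples in section 5), one can verify symbolically that, for a univariate Gaussian density with M_θ = R, the constant metric g = 1/σ² satisfies the one-dimensional form of Eq. (57), for which the Levi-Civita connection (58) vanishes:

```python
import sympy as sp

# Symbolic check (an assumed example): the constant metric g = 1/sigma^2 solves
# the one-dimensional form of Eq. (57) for a univariate Gaussian density.
x = sp.Symbol('I', real=True)            # stochastic variable I on M_theta = R
mu = sp.Symbol('mu', real=True)
sigma = sp.Symbol('sigma', positive=True)

rho = sp.exp(-(x - mu)**2 / (2 * sigma**2)) / sp.sqrt(2 * sp.pi * sigma**2)
g = 1 / sigma**2                         # trial metric, independent of I

# One-dimensional Levi-Civita connection, Eq. (58): it vanishes for constant g
Gamma = sp.diff(sp.log(sp.sqrt(g / (2 * sp.pi))), x)

rhs = (-sp.diff(sp.log(rho), x, 2)
       + Gamma * sp.diff(sp.log(rho), x)
       + sp.diff(Gamma, x)
       - Gamma**2)

print(sp.simplify(g - rhs))              # 0: the trial metric satisfies Eq. (57)
```

This verifies a particular solution under the stated assumptions; it is not a general solver for Eq. (57).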
Remark 1. The boundary conditions (59) are independent of the admissible coordinate representation R_I of the statistical manifold M_θ. Moreover, the probability weight ω(I|θ) vanishes on the boundary ∂M_θ of the manifold M_θ.

Proof. This remark is a direct consequence of the transformation rule of the probability density (45), as well as of the rules associated with its partial derivatives:

∂ρ(Θ|θ)/∂Θ^i = (∂I^j/∂Θ^i) { ∂ρ(I|θ)/∂I^j − ρ(I|θ) ∂/∂I^j log|∂Θ/∂I| } |∂Θ/∂I|^{-1}   (60)

under a coordinate change Θ(I): R_I → R_Θ with a Jacobian |∂Θ/∂I| that is finite and differentiable everywhere. Since the metric tensor determinant |g_{ij}(I|θ)| is non-vanishing everywhere, Axiom 4 directly implies the vanishing of the probability weight ω(I|θ) on the boundary ∂M_θ of the statistical manifold M_θ.
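Finally, the tensorial-density rule (45), on which the proof above relies, can be visualized numerically. In the sketch below (an illustration; the monotonic change of variables Θ = I³ and the Gaussian density are assumptions of this example, not taken from the paper), a histogram of the transformed samples is compared with ρ(I|θ) |∂Θ/∂I|^{-1}:

```python
import numpy as np

# Illustrative check of the tensorial-density rule (45) under Theta(I) = I^3
# (the change of variables and the Gaussian density are assumed for this example).
rng = np.random.default_rng(4)
I = rng.normal(0.0, 1.0, 10**6)          # outcomes in the representation R_I
Theta = I**3                             # outcomes in the representation R_Theta

# Empirical density of Theta on a window away from Theta = 0
counts, edges = np.histogram(Theta, bins=np.linspace(0.3, 3.0, 28))
width = edges[1] - edges[0]
hist = counts / (I.size * width)
centers = 0.5 * (edges[1:] + edges[:-1])

# Density predicted by Eq. (45): rho(Theta|theta) = rho(I|theta) |dTheta/dI|^{-1}
I_back = np.cbrt(centers)                                   # I(Theta)
rho_I = np.exp(-I_back**2 / 2.0) / np.sqrt(2.0 * np.pi)     # rho(I|theta)
rho_Theta = rho_I / (3.0 * I_back**2)                       # transformed density

print(np.round(hist[:5], 3))
print(np.round(rho_Theta[:5], 3))        # agree up to binning and sampling error
```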
