Persistent Entropy for Separating Topological Features from Noise in Vietoris-Rips Complexes ∗

Nieves Atienza¹, Rocio Gonzalez-Diaz¹, Matteo Rucco²

¹ Depto. Mat. Aplic. I, E.T.S.I. Informatica, University of Seville, Spain
email: {natienza,rogodi}@us.es
² School of Science and Tech., Computer Science Division, University of Camerino, Italy
email: [email protected]

January 30, 2017

arXiv:1701.07857v1 [cs.OH]

Abstract

Persistent homology studies the evolution of k-dimensional holes along a nested sequence of simplicial complexes (called a filtration). The set of bars (i.e. intervals) representing birth and death times of k-dimensional holes along such sequence is called the persistence barcode. k-Dimensional holes with short lifetimes are informally considered to be "topological noise", and those with long lifetimes are considered to be "topological features" associated to the filtration. Persistent entropy is defined as the Shannon entropy of the persistence barcode of a given filtration. In this paper we present new important properties of persistent entropy of Čech and Vietoris-Rips filtrations. Among the properties, we put a focus on the stability theorem, which allows persistent entropy to be used for comparing persistence barcodes. Later, we derive a simple method for separating topological noise from features in Vietoris-Rips filtrations.

Keywords: Persistent homology, persistence barcodes, Shannon entropy, Čech and Vietoris-Rips complexes, topological noise, topological feature

∗ Authors are partially supported by Spanish Government under grant MTM2015-67072-P (MINECO/FEDER, UE).

1 Introduction

Figure 1: From left to right: RNA secondary suboptimal structures within different bacteria.

Topology is the branch of mathematics that studies shapes and maps among them. From the algebraic definition of topology a new set of algorithms has been derived.
These algorithms are identified with "computational topology", often referred to as Topological Data Analysis (TDA), and are used for investigating high-dimensional data in a quantitative manner.

Persistent homology appears as a fundamental tool in Topological Data Analysis. It studies the evolution of k-dimensional holes along a sequence $F$ of simplicial complexes. The persistence barcode $B(F)$ of $F$ is the collection of bars (i.e. intervals) representing birth and death times of k-dimensional holes along such sequence. In $B(F)$, k-dimensional holes with short lifetimes are informally considered to be "topological noise", and those with long lifetimes are "topological features" of the given data.

Persistent homology based techniques are nowadays widely used for analyzing high-dimensional data sets, and they are good tools for shaping these data sets and for understanding the meaning of the shapes. Persistent homology reveals the global structure of a data set and is a powerful tool for dealing with high-dimensional data sets without performing dimensionality reduction.

There are several techniques for building a topological space from the data. The main approach is to complete the data to a collection of combinatorial objects, i.e. simplices. A nested collection of simplices forms a simplicial complex. Simplicial complexes can be obtained from graphs and point cloud data (PCD) [1, 2]. For example, PCD can be completed to simplicial complexes by using the Vietoris-Rips filtration, which is a sequence of simplicial complexes built on a metric space, providing in this way a topological structure to an otherwise disconnected set of points. It is widely used in TDA because it encodes useful information about the topology of the underlying metric space. Consider Fig. 1, which represents a collection of RNA secondary suboptimal structures within different bacteria. All the shapes are characterized by several circular substructures, each of which is obtained by linking different nucleotides.
Each substructure encodes functional properties of the bacteria. Mamuye et al. [3] used Vietoris-Rips complexes and persistent homology to certify that there are different species characterized by the same RNA suboptimal secondary structure, and thus that these species are functionally equivalent. The mathematical details of the Vietoris-Rips filtration are given in Section 2 of this paper.

Nevertheless, Vietoris-Rips based analysis depends on the selection of the parameter $\epsilon$. Generally speaking, for different $\epsilon$, different topological features can be observed. For example, in [4], several applications of Vietoris-Rips based analysis to biological problems have been reported, and examples of different $\epsilon$ with different meaning were found. In order to select the best $\epsilon$, a statistical tool known as the "persistence landscape" has been proposed [5]. The landscape is a powerful tool for statistically assessing the global shape of the data over different $\epsilon$. Technically speaking, a landscape is a piecewise linear function that basically maps a point within a persistence diagram (or barcode) to a point in which the x-coordinate is the average parameter value over which the feature exists, and the y-coordinate is the half-life of the feature. Landscape analysis allows topological features to be identified. In Section 6, we present the notion of persistent entropy as an alternative approach to the landscape. The main difference between the landscape and our method is that the former uses the average of $\epsilon$, while the latter works directly on a fixed $\epsilon$. More concretely, persistent entropy (which is the Shannon entropy of the persistence barcode) is a tool formally defined in [6] and used to measure similarities between two persistence barcodes. A precursor of this definition was given in [7] to measure how different the bars of a barcode are in length. In [8], persistent entropy is used for addressing the comparison between discrete piecewise linear functions.
In Section 11, several properties of the persistent entropy of Vietoris-Rips filtrations are presented. For example, the exact formula of the maximum and minimum persistent entropy is given for a persistence barcode, fixing the number of bars and the maximum and minimum length. These results are important later in Section 15 for differentiating topological features from noise.

In general, "very" long living bars (long lifetime) are considered topological features since they are stable to "small" changes in the filtration. In [9] a methodology is presented for deriving confidence sets for persistence diagrams to separate topological noise from topological features. The authors focused on simple, synthetic examples as proof of concept. Their methods have a simple visualization: one only needs to add a band around the diagonal of the persistence diagram. Points in the band are consistent with being noise. The first three methods in that paper were based on the distance function to the data. They started with a sample from a distribution $P$ supported on a topological space $C$. The bottleneck distance was used as a metric on the space of persistence diagrams. The last method in that paper used density estimation. The advantage of the former was that it is more directly connected to the raw data. The advantage of the latter was that it is less fragile; that is, it is more robust to noise and outliers. In Section 15 of this paper, we derive a simple method for separating topological features from noise of a given filtration using the mentioned persistent entropy measurement. Moreover, we claim it is very easy (and fast) to compute, and easy to adapt depending on the application. A preliminary version of this technique was also presented in [10].

2 Background

This section provides a short recapitulation of the basic concepts needed as a basis for the presented method for separating topological noise from features.
Informally, a topological space is a set of points, each equipped with a notion of neighborhood. A simplicial complex is a kind of topological space constructed by the union of k-dimensional simple pieces in such a way that the common intersection of two pieces is a lower-dimensional piece of the same kind. More concretely, $K$ is composed of a set $K_0$ of 0-simplices (also called vertices $V$, which can be thought of as points in $\mathbb{R}^d$) and, for each $k \ge 1$, a set $K_k$ of k-simplices $\sigma = \{v_0, v_1, \ldots, v_k\}$, where $v_i \in V$ for all $i \in \{0, \ldots, k\}$, satisfying that:
• each k-simplex has $k+1$ faces obtained by removing one of its vertices;
• if a simplex is in $K$, then all its faces must be in $K$.
The underlying topological space of $K$ is the union of the geometric realizations of its simplices: points for 0-simplices, line segments for 1-simplices, filled triangles for 2-simplices, filled tetrahedra for 3-simplices and their k-dimensional counterparts for k-simplices. We only consider finite simplicial complexes with finite dimension, i.e., there exists an integer $m$ (called the dimension of $K$) such that $K_k = \emptyset$ for $k > m$ and $K_k$ is a finite set for $0 \le k \le m$.

Two classical examples of simplicial complexes are Čech complexes and Vietoris-Rips complexes (see [11, Chapter III]). Let $V$ be a (finite) PCD in $\mathbb{R}^d$. The Čech complex of $V$ and $r$, denoted by $\check{C}_V(r)$, is the simplicial complex whose simplices are formed as follows. For each subset $S$ of points in $V$, form a closed ball of radius $r$ around each point in $S$, and include $S$ as a simplex of $\check{C}_V(r)$ if there is a common point contained in all of the balls in $S$. This structure satisfies the definition of abstract simplicial complex. The Vietoris-Rips complex, denoted by $VR_V(r)$, is essentially the same as the Čech complex. Instead of checking whether there is a common point contained in the intersection of the $r$-balls around $v$ for all $v$ in $S$, we just check pairs, adding $S$ as a simplex of $VR_V(r)$ if all the balls have pairwise intersections. We have $\check{C}_V(r) \subseteq VR_V(r) \subseteq \check{C}_V(\sqrt{2}\,r)$. See Fig. 2.
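
The pairwise test that defines the Vietoris-Rips complex can be sketched in a few lines. The following Python fragment is only an illustration under our own assumptions: the function name `vietoris_rips`, the brute-force enumeration of subsets and the cut-off `max_dim` are not part of the paper, and the code is meant for very small point clouds. It follows the convention of the text that balls have radius $r$, so two balls intersect exactly when their centers are at distance at most $2r$.

```python
from itertools import combinations
from math import dist


def vietoris_rips(points, r, max_dim=2):
    """Return the simplices of VR_V(r) up to dimension max_dim.

    Simplices are tuples of vertex indices. Two closed balls of
    radius r intersect exactly when their centers are at distance
    at most 2r (illustrative brute-force sketch).
    """
    n = len(points)
    # Pairs of vertices whose r-balls intersect.
    close = {(i, j) for i, j in combinations(range(n), 2)
             if dist(points[i], points[j]) <= 2 * r}
    simplices = [(i,) for i in range(n)]
    for k in range(1, max_dim + 1):
        for cand in combinations(range(n), k + 1):
            # A subset is a simplex iff all of its pairs intersect.
            if all(pair in close for pair in combinations(cand, 2)):
                simplices.append(cand)
    return simplices


# Vertices of an (approximately) equilateral triangle with side 1.
pts = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.8660254)]
print(vietoris_rips(pts, 0.4))   # no pair of balls intersects yet
print(vietoris_rips(pts, 0.55))  # all pairs intersect
```

For $r = 0.55$ the three balls intersect pairwise, so the Vietoris-Rips complex already contains the filled triangle, whereas the Čech complex does not: the three balls only share a common point once $r$ reaches the circumradius $1/\sqrt{3} \approx 0.577$.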
In practice, Vietoris-Rips complexes are more often used since they are easier to compute than Čech complexes.

Homology is an algebraic machinery used for describing topological spaces. The k-Betti number $\beta_k(K)$ represents the rank of the k-dimensional homology group $H_k(K)$ of a given simplicial complex $K$. Informally, $\beta_0(K)$ is the number of connected components of $K$, $\beta_1(K)$ counts the number of tunnels, $\beta_2(K)$ can be thought of as the number of voids of $K$ and, in general, $\beta_k(K)$ can be thought of as the number of k-dimensional holes of $K$. More precisely, homology groups are defined from an algebraic structure called chain complex, composed of a set of groups $\{C_k(K)\}_k$, where each $C_k(K)$ is the group of k-chains generated by all the k-simplices of $K$, and a set of homomorphisms $\{\partial_k : C_k(K) \to C_{k-1}(K)\}_k$, called boundary operators, describing the boundaries of k-chains. A k-chain $a$ such that $\partial_k(a) = 0$ is a k-cycle. It is a k-boundary if there exists a (k+1)-chain $b$ such that $\partial_{k+1}(b) = a$. This way, the k-dimensional homology group $H_k(K)$ is the group of k-cycles modulo the group of k-boundaries. The k-th Betti number $\beta_k(K)$ is the rank of $H_k(K)$. See [12] and [13] for an introduction to algebraic topology.

Figure 2: [11, p. 72] Nine points with pairwise intersections among the disks indicated by straight edges connecting their centers, for a fixed time $\epsilon$. The Čech complex $\check{C}_V(\epsilon)$ fills nine of the ten possible triangles as well as the two tetrahedra. The Vietoris-Rips complex $VR_V(\epsilon)$ fills the ten triangles and the two tetrahedra.

Persistent homology is a method for computing k-dimensional holes of a given topological space at different spatial resolutions. The key idea is as follows.
• First, the space must be represented as a simplicial complex $K$ and a distance function must be defined on the space.
• Second, a filtration of $K$, referred to above as different spatial resolutions, is computed.
More concretely, a filtration $F$ of $K$ is a collection of simplicial complexes $F = \{K(t) \mid t \in \mathbb{R}\}$ of $K$ such that $K(t) \subset K(s)$ for $t < s$ and there exists $t_{max} \in \mathbb{R}$ such that $K(t_{max}) = K$. The filtration time (or filter value) of a simplex $\sigma \in K$ is the smallest $t$ such that $\sigma \in K(t)$. For example, let $V$ be a PCD in $\mathbb{R}^d$ and let $r, r' \in \mathbb{R}$. Then, there are natural inclusions $\check{C}_V(r) \subseteq \check{C}_V(r')$ and $VR_V(r) \subseteq VR_V(r')$ whenever $r \le r'$. The simplicial complexes $\check{C}_V(r)$ together with the inclusion maps define a filtered simplicial complex $\check{C}_V$ called the Čech filtration. Similarly, the simplicial complexes $VR_V(r)$ together with the inclusion maps define a filtered simplicial complex $VR_V$ called the Vietoris-Rips filtration.
• Then, persistent homology describes how the homology of a given simplicial complex $K$ changes along the filtration $F = \{K(t) \mid t \in \mathbb{R}\}$. If the same topological feature (i.e., k-dimensional hole) is detected along a large number of subsets in the filtration, then it is likely to represent a true feature of the underlying space, rather than artifacts of sampling, noise, or particular choice of parameters. More concretely, a bar in the k-dimensional persistence barcode, with endpoints $[t_{start}, t_{end})$, corresponds to a k-dimensional hole that appears at filtration time $t_{start}$ and remains until filtration time $t_{end}$. The set of bars $[t_{start}, t_{end})$ representing birth and death times of homology classes is called the persistence barcode $B(F)$ of the filtration $F$. Analogously, the set of points $(t_{start}, t_{end}) \in \mathbb{R}^2$ is called the persistence diagram $dgm(F)$ of the filtration $F$.

For more details and a more formal description we refer to [11].

Classically, the bottleneck distance (see [11, page 229]) is used to compare the persistence diagrams of two different filtrations.
Concretely, let $dgm(F) = \{a_1, \ldots, a_k\}$ and $dgm(F') = \{a'_1, \ldots, a'_{k'}\}$ be, respectively, the persistence diagrams of two filtrations $F$ and $F'$. Then

$$d_b(dgm(F), dgm(F')) = \inf_{\gamma} \{ \sup_{v} \{ \|v - \gamma(v)\|_\infty \} \}$$

is the bottleneck distance between $dgm(F)$ and $dgm(F')$ where, for points $a = (x, y)$ and $\gamma(a) = (x', y')$ in $\mathbb{R}^2$, $\|a - \gamma(a)\|_\infty = \max\{|x - x'|, |y - y'|\}$ and $\gamma : dgm(F) \to dgm(F')$ is a bijection that can associate a point off the diagonal with another point on or off the diagonal. Here, the diagonal is the set of points $\{(x, x)\} \subset \mathbb{R}^2$.

Remark 3 Since the simplicial complexes considered in this paper are finite, for given filtrations $F$ and $F'$ we have that:
• $dgm(F)$ is a finite set of points in $\mathbb{R}^2$.
• $d_b(dgm(F), dgm(F')) = \min_{\gamma} \{ \max_{a} \{ \|a - \gamma(a)\|_\infty \} \}$.

The following theorem states that low-distortion correspondences between two PCDs, $V$ and $W$, in $\mathbb{R}^d$ give rise to a small bottleneck distance between the persistence diagrams of the Čech filtrations $\check{C}_V$ and $\check{C}_W$, and of the Vietoris-Rips filtrations $VR_V$ and $VR_W$.

Theorem 4 Persistence stability for Čech and Vietoris-Rips complexes [14, Th. 5.2.] Let $V$ and $W$ be two sets of points in $\mathbb{R}^d$. Then, for either $F_V = \check{C}_V$ and $F_W = \check{C}_W$ or $F_V = VR_V$ and $F_W = VR_W$, we have that:

$$d_b(dgm(F_V), dgm(F_W)) \le 2\, d_{GH}(V, W),$$

where $2\, d_{GH}(V, W) = \inf_{c} \{ \sup_{v, v'} |d(v, v') - d(c(v), c(v'))| \}$ for $c : V \to W$ being surjective.

Remark 5 Since the PCDs considered in this paper are finite, then

$$2\, d_{GH}(V, W) = \min_{c} \{ \max_{p, p'} |d(p, p') - d(c(p), c(p'))| \}.$$

6 Persistent entropy

In order to measure how much the construction of a filtration is ordered, a new entropy measure, the so-called persistent entropy, was defined in [6]. A precursor of this definition was given in [7] to measure how different the bars of a barcode were in length. In [8], persistent entropy was used for addressing the comparison between discrete piecewise linear functions.
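
Since persistent entropy is the Shannon entropy of the bar lengths of a barcode, it takes only a few lines to compute. The sketch below is a minimal illustration and not the authors' implementation: the function name `persistent_entropy`, the representation of bars as (birth, death) pairs, the restriction to finite bars and the use of the natural logarithm are our own assumptions.

```python
from math import log


def persistent_entropy(bars):
    """Shannon entropy of the bar lengths of a barcode.

    `bars` is a list of finite (birth, death) pairs with birth < death.
    Uses the natural logarithm, so the result lies in [0, log(n)].
    """
    lengths = [death - birth for birth, death in bars]
    if any(length <= 0 for length in lengths):
        raise ValueError("every bar needs birth < death")
    total = sum(lengths)
    return -sum((length / total) * log(length / total)
                for length in lengths)


equal = [(0.0, 1.0), (0.5, 1.5), (2.0, 3.0)]    # three bars of length 1
skewed = [(0.0, 10.0), (0.0, 0.1), (0.0, 0.1)]  # one long bar, two short
print(persistent_entropy(equal))   # the maximum log(3) for three bars
print(persistent_entropy(skewed))  # noticeably smaller
```

Bars of equal length give the largest possible value for the given number of bars, while a barcode dominated by a single long bar gives a value close to zero.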
Definition 7 Given a filtration $F = \{K(t) \mid t \in \mathbb{R}\}$ and the corresponding persistence diagram $dgm(F) = \{a_i = (x_i, y_i) \mid 1 \le i \le n\}$ (being $x_i < y_i$ for all $i$), let $L = \{\ell_i = y_i - x_i \mid 1 \le i \le n\}$. The persistent entropy $E(F)$ of $F$ is calculated as follows:

$$E(F) = -\sum_{i=1}^{n} p_i \log(p_i) \quad \text{where } p_i = \frac{\ell_i}{S_L},\ \ell_i = y_i - x_i, \text{ and } S_L = \sum_{i=1}^{n} \ell_i.$$

Sometimes, persistent entropy $E(F)$ will also be denoted by $E(L)$.

Note that the maximum persistent entropy would correspond to the situation in which all the bars in the associated persistence barcode are of equal length (i.e., $\ell_i = \ell_j$ for all $1 \le i, j \le n$). Conversely, the value of the persistent entropy decreases as more bars of different lengths are present in the persistence barcode. More concretely, if $dgm(F)$ has $n$ points, the possible values of $E(F)$ lie in the interval $[0, \log(n)]$.

The following result supports the idea that persistent entropy can differentiate long from short bars, as we will see in Section 15.

Theorem 8 [10] Given a filtration $F$ and the corresponding persistence diagram $dgm(F) = \{a_i = (x_i, y_i) \mid 1 \le i \le n\}$, let $L = \{\ell_i = y_i - x_i \mid 1 \le i \le n\}$. For a fixed integer $i$, $1 \le i \le n$, let

$$L' = \{\ell'_1, \ldots, \ell'_i, \ell_{i+1}, \ldots, \ell_n\}$$

where $\ell'_j = \frac{P_i}{e^{E(R_i)}}$ for $1 \le j \le i$, $R_i = \{\ell_{i+1}, \ldots, \ell_n\}$ and $P_i = \sum_{j=i+1}^{n} \ell_j$. Then $E(L) \le E(L')$.

Observe that we can also write $\ell'_j = \prod_{m=i+1}^{n} \ell_m^{\ell_m / P_i}$. This last expression will be very useful in the proof of Th. 17 in Section 15.

Proof. Let us prove that $E(L')$ is the maximum of all the possible persistent entropies associated to barcodes with $n$ bars, such that the list of lengths of the last $n - i$ bars of any of such lists is $R_i$. Let $M = \{x_1, \ldots, x_i, \ell_{i+1}, \ldots, \ell_n\}$ (where $x_j > 0$ for $1 \le j \le i$) be any of such lists. Let $S_x = \sum_{j=1}^{i} x_j$.
Then, the persistent entropy associated to $M$ is:

$$E(M) = -\sum_{j=1}^{i} \frac{x_j}{S_x + P_i} \log\left(\frac{x_j}{S_x + P_i}\right) - \sum_{j=i+1}^{n} \frac{\ell_j}{S_x + P_i} \log\left(\frac{\ell_j}{S_x + P_i}\right)$$

$$= -\sum_{j=1}^{i} \frac{x_j}{S_x + P_i} \log\left(\frac{x_j}{S_x + P_i}\right) + \frac{P_i E(R_i)}{S_x + P_i} - \frac{P_i}{S_x + P_i} \log\left(\frac{P_i}{S_x + P_i}\right).$$

In order to find the maximum of $E(M)$ with respect to the unknown variables $x_k$, $1 \le k \le i$, we compute the partial derivative of $E(M)$ with respect to those variables:

$$\frac{\partial E(M)}{\partial x_k} = \frac{1}{(S_x + P_i)^2} \left( -P_i E(R_i) + P_i \log\left(\frac{P_i}{x_k}\right) + \sum_{j \ne k} x_j \log\left(\frac{x_j}{x_k}\right) \right).$$

Finally, $\left\{ x_k = \frac{P_i}{e^{E(R_i)}} \mid 1 \le k \le i \right\}$ is the solution of $\left\{ \frac{\partial E(M)}{\partial x_k} = 0 \mid 1 \le k \le i \right\}$. □

The following result establishes a relation between bottleneck distance and persistent entropy.

Proposition 9 Let $F$ and $F'$ be two filtrations. For all $\epsilon > 0$, there exists $\delta > 0$ such that if $d_b(dgm(F), dgm(F')) < \delta$ then $|E(F) - E(F')| < \epsilon$.

Proof. The proof is similar to the one given in [8] to demonstrate that persistent entropy associated to piecewise linear functions is stable. Given $\epsilon > 0$, we have to find $\delta > 0$ such that if $d_b(dgm(F), dgm(F')) < \delta$ then $|E(F) - E(F')| < \epsilon$.

First, since $h(x) = -x \log x$ is a continuous function in $[0, 1]$ (redefining $h(0)$ as 0), for $\epsilon' = \frac{\epsilon}{n} > 0$, there exists $\delta' \in (0, 1]$ such that if $|x - x'| \le \delta'$ then $|h(x) - h(x')| \le \epsilon'$. Take $\delta = \frac{S_{L'} \delta'}{4n}$ and suppose $d_b(dgm(F), dgm(F')) < \delta$.

By Remark 3, $dgm(F)$ and $dgm(F')$ are both finite and there exists a bijection $\bar{\gamma} : dgm(F) \to dgm(F')$ such that $d_b(dgm(F), dgm(F')) = \max_a \{ \|a - \bar{\gamma}(a)\|_\infty \}$. Let $dgm(F) = \{a_1, \ldots, a_n\}$ (where some of the $a_i$ can possibly be on the diagonal). Let $a_i = (x_i, y_i)$ and $\bar{\gamma}(a_i) = (x'_i, y'_i)$. Then,

$$\|a_i - \bar{\gamma}(a_i)\|_\infty = \max\{|x_i - x'_i|, |y_i - y'_i|\} \le \delta \quad \text{for all } i.$$
Let $\ell_i = y_i - x_i$ and $\ell'_i = y'_i - x'_i$. Then,

$$|\ell_i - \ell'_i| = |x_i - y_i - (x'_i - y'_i)| \le |x_i - x'_i| + |y_i - y'_i| \le 2\delta \quad \text{for all } i.$$

Besides,

$$|S_L - S_{L'}| = \left| \sum_{i=1}^{n} \ell_i - \sum_{i=1}^{n} \ell'_i \right| \le \sum_{i=1}^{n} |\ell_i - \ell'_i| \le 2\delta n.$$

Without loss of generality, assume $S_L \ge S_{L'}$. Then $S_L \le S_{L'} + 2\delta n$.

Let $p_i = \frac{\ell_i}{S_L}$ and $p'_i = \frac{\ell'_i}{S_{L'}}$. Then

$$p_i - p'_i = \frac{\ell_i}{S_L} - \frac{\ell'_i}{S_{L'}} = \frac{S_{L'} \ell_i - S_L \ell'_i}{S_L S_{L'}} \le \frac{\ell_i - \ell'_i}{S_{L'}} \le \frac{2\delta}{S_{L'}} = \frac{\delta'}{2n} \le \delta';$$

$$p'_i - p_i \le \frac{(S_{L'} + 2\delta n)\ell'_i - S_{L'} \ell_i}{S_L S_{L'}} \le \frac{\ell'_i - \ell_i}{S_{L'}} + \frac{2\delta n \ell'_i}{S_{L'} S_{L'}} \le \frac{2\delta n}{S_{L'}} \left( 1 + \frac{\ell'_i}{S_{L'}} \right) \le \delta'.$$

Therefore,

$$|E(F) - E(F')| = \left| \sum_{i=1}^{n} p_i \log p_i - \sum_{i=1}^{n} p'_i \log p'_i \right| \le \sum_{i=1}^{n} |p_i \log p_i - p'_i \log p'_i| \le \epsilon,$$

which concludes the proof. □

The result above is used now to prove that persistent entropy is a stable measure for Čech and Vietoris-Rips filtrations.

Theorem 10 Persistent entropy stability theorem for Čech and Vietoris-Rips filtrations. Let $V$ and $W$ be two PCDs in $\mathbb{R}^d$. Then, for every $\epsilon > 0$ there exists $\delta > 0$ such that:

If $2\, d_{GH}(V, W) \le \delta$ then $|E(F_V) - E(F_W)| < \epsilon$,

where either $F_V = \check{C}_V$ and $F_W = \check{C}_W$ or $F_V = VR_V$ and $F_W = VR_W$.

Proof. First, by Prop. 9 we have that, given $\epsilon > 0$, there exists $\delta > 0$ such that if $d_b(dgm(F_V), dgm(F_W)) < \delta$ then $|E(F_V) - E(F_W)| < \epsilon$.
Second, by Th.
4 we have that $d_b(dgm(F_V), dgm(F_W)) \le 2\, d_{GH}(V, W)$.
Therefore, if $2\, d_{GH}(V, W) < \delta$ then $|E(F_V) - E(F_W)| < \epsilon$. □

11 Properties of the persistent entropy of Vietoris-Rips filtrations

Since Vietoris-Rips filtrations are widely used in practice, we now focus our effort on the study of properties of the persistent entropy of this special kind of filtration.

The first thing we have to take into account is that, in practice, one will never construct the filtration up to the end and will stop at a certain time $T$. Then, $VR_V = \{VR_V(t) \mid t \le T\}$. To decide when to stop, we use the following result.

Proposition 12 Let $V = \{v_1, \ldots, v_m\}$ be a PCD in $\mathbb{R}^d$. Let

$$T = \frac{\min_i \max_j d(v_i, v_j)}{2}.$$

Then, $\beta_0(VR_V(T)) = 1$ and $\beta_k(VR_V(T)) = 0$ for $k > 0$.

Proof. First, notice that there exists a vertex $v$ such that $\max_j d(v, v_j) = 2T$. That is, $d(v, v_j) \le 2T$ for $1 \le j \le m$. Then, $v$ is connected to $v_j$ by an edge in $VR_V(T)$, for $1 \le j \le m$. In particular, $\beta_0(VR_V(T)) = 1$.
Now, observe that if $\sigma = \{v_0, v_1, \ldots, v_k\}$ is a k-simplex in $VR_V(T)$ and $v \notin \sigma$, then $\sigma \cup \{v\} = \{v, v_0, v_1, \ldots, v_k\}$ is a (k+1)-simplex in $VR_V(T)$ and $\partial_{k+1}(\sigma \cup \{v\}) = \sigma + \partial_k(\sigma) \cup \{v\}$.
Let $c = \sum_{i \in I} \sigma_i$ be a cycle in $C_k(VR_V(T))$. Let $J = \{j \mid j \in I$ and $v$ is not a vertex of $\sigma_j\}$. Let $b = \sum_{j \in J} \sigma_j \cup \{v\}$. Then

$$\partial_{k+1}(b) = \sum_{j \in J} \sigma_j + \partial_k(\sigma_j) \cup \{v\} = \sum_{j \in J} \sigma_j + \sum_{i \in I \setminus J} \sigma_i = c.$$

Therefore, $c$ is a boundary. Then $\beta_k(VR_V(T)) = 0$ for $k > 0$. □

From now on, given a PCD $V = \{v_1, \ldots, v_m\}$ in $\mathbb{R}^d$, we construct

$$VR_V = \{VR_V(t) \mid t \le T\} \quad \text{for } T = \frac{\min_i \max_j d(v_i, v_j)}{2}.$$

By Prop. 12, the biggest bar in the persistence barcode in dimension 0 was born at time $t = 0$ and survives until the end (i.e., time $t = T$), and the smallest bar was born at time $t = 0$ and survives until $t = r = \min_{i,j} d(v_i, v_j)$. Fixing the number of bars in the persistence barcode and the maximum and minimum lengths of the bars, $T$ and $r$, the following result gives the lengths of the remaining bars that yield the minimum persistent entropy.
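
The stopping time $T$ of Prop. 12 is straightforward to evaluate on a small point cloud. The following sketch is only illustrative: the helper names `stopping_time` and `has_apex` are our own, and the code checks only the first step of the proof (the existence of a vertex joined to every other vertex by an edge of $VR_V(T)$), not the vanishing of the higher Betti numbers.

```python
from math import dist


def stopping_time(points):
    """T = (min over i of max over j of d(v_i, v_j)) / 2."""
    return min(max(dist(p, q) for q in points) for p in points) / 2


def has_apex(points, t):
    """True if some vertex is within distance 2t of every vertex,
    i.e. it is joined to all other vertices by an edge of VR_V(t)."""
    return any(all(dist(p, q) <= 2 * t for q in points) for p in points)


# Three collinear points at positions 0, 1 and 3.
pts = [(0.0,), (1.0,), (3.0,)]
T = stopping_time(pts)
print(T)                   # the middle point is at distance <= 2 from all
print(has_apex(pts, T))    # such a vertex exists at time T
print(has_apex(pts, 0.9))  # but not slightly earlier
```

In this example the middle point realizes the minimum, so $2T$ equals its largest distance to another point.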
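This result will be very useful in the next section to detect topological features.

Theorem 13 Let $L = \{\ell_1, \ldots, \ell_n\}$ such that $\ell_1 = T$, $\ell_i \ge \ell_{i+1}$ for $1 \le i < n$ and $\ell_n = r$. Let $M = \{T, \overset{Q}{\ldots}, T, r, \overset{n-Q}{\ldots}, r\}$. Then

$$E(L) \ge E(M) \quad \text{for } Q = \left[ \frac{\alpha n (\alpha - 1 - \log(\alpha))}{(\alpha - 1)^2} \right] \text{ being } \alpha = \frac{r}{T}.$$

Proof. First, fix $n$, $T$ and $r$, and let $p_i = \frac{\ell_i}{S_L}$. Since the entropy is a concave function in

$$\Omega = \left\{ (p_1, p_2, \ldots, p_n) \mid \sum_{i=1}^{n} p_i = 1,\ \frac{1}{(n-1) + \alpha} < p_i < p_1 < \frac{1}{(n-1)\alpha + 1} \right\},$$

being $\alpha = \frac{r}{T}$, the minimum is attained at an extremal point of $\Omega$. Let

$$P = (p_1, \overset{i}{\ldots}, p_1, \alpha p_1, \overset{n-i}{\ldots}, \alpha p_1), \quad \text{with } 1 < i < n,$$

be an extremal point. Since $\sum_{i=1}^{n} p_i = 1$, then $p_1 = \frac{1}{i + \alpha(n - i)}$ and the entropy of $P$ is:

$$E(P) = \log(\alpha n) + \log\left( 1 + \frac{i(1 - \alpha)}{\alpha n} \right) + \log\left(\frac{1}{\alpha}\right) - \frac{\log\left(\frac{1}{\alpha}\right)}{1 + \left(\frac{n}{i} - 1\right)\alpha}.$$

Consider $t = \frac{i}{n} \in (0, 1)$, then:

$$E(P) = E(t) = \log(n) + \log(\alpha(1 - t) + t) + \frac{\alpha(1 - t)\log\left(\frac{1}{\alpha}\right)}{\alpha(1 - t) + t}.$$

The derivative of $E(t)$ is null for:

$$t_0 = \frac{\alpha \log\left(\frac{1}{\alpha}\right) - \alpha(1 - \alpha)}{(1 - \alpha)^2}.$$

So the minimum entropy is attained for

$$Q = [n t_0] = \left[ n\, \frac{\alpha \log\left(\frac{1}{\alpha}\right) - \alpha(1 - \alpha)}{(1 - \alpha)^2} \right].$$

Since $p_1$ corresponds to the bar of length $T$, the barcode with $\ell_1 = T$ and $\ell_n = r$ with minimum entropy is $M = \{T, \overset{Q}{\ldots}, T, r, \overset{n-Q}{\ldots}, r\}$. □

In the following proposition, we establish the maximum entropy we can reach for $n$ bars, fixing the maximum and minimum lengths of the bars.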
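
The statement of Theorem 13 can be checked numerically on a small example. The sketch below is an illustration under our own assumptions and not part of the paper: the helper names are hypothetical, and because the text does not spell out the meaning of the bracket $[\cdot]$, the code does not round $n t_0$ itself. It instead finds the best admissible integer count of long bars by brute force and compares it with the continuous value $n t_0$.

```python
from math import log


def entropy(lengths):
    """Persistent entropy of a list of positive bar lengths."""
    total = sum(lengths)
    return -sum((x / total) * log(x / total) for x in lengths)


def extremal_barcode(n, T, r, q):
    """q bars of length T followed by n - q bars of length r."""
    return [T] * q + [r] * (n - q)


def continuous_minimizer(n, T, r):
    """n * t_0 with t_0 = alpha (alpha - 1 - log alpha) / (alpha - 1)^2."""
    alpha = r / T
    return n * alpha * (alpha - 1 - log(alpha)) / (alpha - 1) ** 2


n, T, r = 10, 1.0, 0.1
# Brute-force search over the admissible counts 1 <= q <= n - 1
# (at least one bar of length T and one bar of length r).
best_q = min(range(1, n),
             key=lambda q: entropy(extremal_barcode(n, T, r, q)))
print(continuous_minimizer(n, T, r), best_q)

# A barcode with the same n, T and r but intermediate lengths.
L = [1.0, 0.8, 0.6, 0.5, 0.4, 0.3, 0.2, 0.2, 0.1, 0.1]
print(entropy(L) >= entropy(extremal_barcode(n, T, r, best_q)))
```

In this example the brute-force optimum lies within one unit of $n t_0$, and the barcode with intermediate lengths has larger entropy than the extremal one, in agreement with the theorem.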
