EMPIRICAL GEODESIC GRAPHS AND CAT(K) METRICS FOR DATA ANALYSIS

KEI KOBAYASHI AND HENRY P WYNN

arXiv:1401.3020v4 [math.ST] 15 Oct 2015

Abstract. A methodology is developed for data analysis based on empirically constructed geodesic metric spaces. For a probability distribution, the length along a path between two points can be defined as the amount of probability mass accumulated along the path. The geodesic, then, is the shortest such path and defines a geodesic metric. Such metrics are transformed in a number of ways to produce parametrised families of geodesic metric spaces, empirical versions of which allow computation of intrinsic means and associated measures of dispersion. These reveal properties of the data, based on geometry, such as those that are difficult to see from the raw Euclidean distances. Examples include clustering and classification. For certain parameter ranges, the spaces become CAT(0) spaces and the intrinsic means are unique. In one case, a minimal spanning tree of a graph based on the data becomes CAT(0). In another, a so-called "metric cone" construction allows extension to CAT(k) spaces. It is shown how to empirically tune the parameters of the metrics, making it possible to apply them to a number of real cases.

1. Introduction

There have been many developments in statistics in which geometry, represented by an extension from Euclidean to more general spaces, has proved fundamental. Thus, reproducing kernel Hilbert spaces are a part of Gaussian process methods, Sobolev spaces are used for , and Besov spaces for wavelets. Differential geometry, Riemannian manifolds, curvature and geodesics are at the core of information geometry, shape analysis and manifold learning.

This paper is concerned with CAT(0) and, more generally, CAT(k) spaces. Although related to Riemannian manifolds (k denotes the Riemannian curvature), these metric spaces have a less rigid structure and are qualitatively different from the spaces mentioned above.
Trees are among the first examples in which CAT(0) properties have been used in statistics and bioinformatics [6]: the set of trees (the tree space) becomes a geodesic metric space (to be defined) when, for each tree, the edge lengths are allocated to entries in a vector of sufficient dimension to capture all the tree structures. Then, the metric is the Euclidean geodesic distance on a simplicial complex called a "fan", which is induced by the tree structure: the geodesics are the shortest piecewise linear paths in the fan. Such spaces have already been shown to be CAT(0) [6] by Gromov's theory [15].

For a random variable $X$ on a metric space $\mathcal{M}$ endowed with a metric $d(\cdot,\cdot)$, the general intrinsic mean is defined by
\[ \mu = \arg\min_{m \in \mathcal{M}} E\{ d(X,m)^2 \}. \]

Keywords and phrases. intrinsic mean, extrinsic mean, CAT(0), curvature, metric cone, cluster analysis, non-parametric analysis.

Figure 1. Function $f(m)$ for data on (a) a hyperboloid (curvature $c = -1$), (b) a plane ($c = 0$) and (c) a sphere ($c = 1$). The bluer represents the smaller value of $f(m)$. Only for the sphere, $f(m)$ has multiple minima. One hundred data points are sampled independently from a "Gaussian mixture," whose density is proportional to $\exp(-d(x,\mu_1)^2/2\sigma^2) + \exp(-d(x,\mu_2)^2/2\sigma^2)$, where $d(\mu_1,\mu_2) = \pi$ and $\sigma = \pi/8$.

The empirical intrinsic mean based on data $x = \{x_1,\ldots,x_n\}$, sometimes called the Fréchet mean, is defined as
\[ \hat{\mu} = \arg\min_{m \in \mathcal{M}} f(m), \]
where
\[ f(m) = \sum_{i=1}^{n} d(x_i,m)^2. \]
For Euclidean space, $\hat{\mu} = \bar{x}$, the sample mean. In general, $f(m)$ is not necessarily convex and the means, $\hat{\mu}$, are not unique. Figure 1 shows that the curvature can affect the property of $f(m)$. In particular, for so-called CAT(0) spaces, which (trivially) include Euclidean spaces, the intrinsic means $\hat{\mu}$ are unique.

Even when the mean is not unique, the function $f(m)$ can yield useful information, for example about clustering.
We can also define second-order quantities:
\[ s_0^2 = \inf_{m \in \mathcal{M}} \frac{1}{n} \sum_{i=1}^{n} d(x_i,m)^2 = \frac{1}{n} \sum_{i=1}^{n} d(x_i,\hat{\mu})^2 \]
and
\[ s_1^2 = \frac{2}{n(n-1)} \sum_{i<j} d(x_i,x_j)^2. \]

A key concept in the study of these issues is that the metrics are global geodesic metrics, that is, metrics based on the shortest path between points measured by integration along a path with respect to a local metric. The interplay between the global and the local will concern us to a considerable extent.

For some intuition, consider a circle with arc length as distance and the uniform measure. The Fréchet intrinsic mean is the whole circle. We can say, informally, that the non-uniqueness arises because the space is not CAT(0). Although the curvature is zero in the Riemannian geometry sense, there are other considerations related to "diameter", which prevent it from having the CAT(0) property [17]. If we use the embedding Euclidean distance rather than the arc distance, we have the extrinsic mean, but again, we obtain the whole perimeter. See [5] for further discussion on the intrinsic and extrinsic means. The empirical case is also problematic. For example, two data points on a diameter, with arc length as distance, give intrinsic means on the perpendicular diameter. Simple geometry shows that the empirical extrinsic mean is the whole circle.

We also cover the more general CAT(k) spaces, giving some new results related to "diameter" as well as conditions for the uniqueness of intrinsic means not requiring the spaces to be CAT(0).

The general form of $f(m)$ depends, here, on three parameters, $\alpha$, $\beta$, $\gamma$, and it can be written in compact form:
\[ f_{\alpha,\beta,\gamma}(m) = \sum_{i=1}^{n} \{ g_\beta(d_\alpha(x_i,m)) \}^{\gamma}, \]
where the function $g_\beta$ and the construction of $d_\alpha$ are given below. Once we have introduced this new class of metrics, a variety of statistics can be generalised: intrinsic mean, variance, clustering (based on local minima of $f(m)$). For classification problems, we can select an appropriate metric by cross-validation.
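The two-point circle example above can be checked numerically. The following is a minimal sketch, assuming the unit circle parametrised by angle and a one-degree grid of candidate means; the function names `arc_distance` and `sum_of_squares` are illustrative and not taken from the paper.

```python
import numpy as np

def arc_distance(a, b):
    """Arc-length distance between angles a and b on the unit circle."""
    diff = np.abs(a - b) % (2 * np.pi)
    return np.minimum(diff, 2 * np.pi - diff)

def chord_distance(a, b):
    """Euclidean (embedding) distance between the same two points."""
    return 2 * np.sin(arc_distance(a, b) / 2)

def sum_of_squares(data, grid, dist):
    """f(m) = sum_i dist(x_i, m)^2 for every candidate m in grid."""
    return np.array([np.sum(dist(data, m) ** 2) for m in grid])

# Two data points at opposite ends of a diameter.
data = np.array([0.0, np.pi])
# Candidate means on a one-degree grid (an arbitrary resolution).
grid = np.linspace(0.0, 2 * np.pi, 360, endpoint=False)

f_intrinsic = sum_of_squares(data, grid, arc_distance)
f_extrinsic = sum_of_squares(data, grid, chord_distance)

# Grid points at which the intrinsic objective attains its minimum.
intrinsic_minimisers = grid[np.isclose(f_intrinsic, f_intrinsic.min())]
```

Under these assumptions the intrinsic objective is minimised at the two ends of the perpendicular diameter, while the extrinsic objective restricted to the circle is constant, so every point of the circle minimises it.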
Theoretically, we have the opportunity to apply well-studied areas of geometry, compared with methods based on selection of a good "loss function." In Table 1, we summarise such generalised statistics.

Table 1. A summary of generalised statistics by introducing $\alpha$, $\beta$ and $\gamma$.

| | Euclidean | Generalised metric |
| Metrics | $d(x,y) = \|x - y\|$ | $d_{\alpha\beta}(x,y) = g_\beta(d_\alpha(x,y))$ |
| Intrinsic mean | $\arg\min_{m \in E_d} \sum_{i=1}^{n} \|x_i - m\|^2$ | $\arg\min_{m \in \mathcal{M}} \sum_{i=1}^{n} g_\beta(d_\alpha(x_i,m))^{\gamma}$ |
| Variance | $\min_{m \in E_d} \frac{1}{n} \sum_{i=1}^{n} \|x_i - m\|^2$ | $\min_{m \in \mathcal{M}} \frac{1}{n} \sum_{i=1}^{n} g_\beta(d_\alpha(x_i,m))^{\gamma}$ |
| Clustering function | $f(m) = \sum_{i=1}^{n} \|x_i - m\|^2$ | $f_{\alpha\beta\gamma}(m) = \sum_{i=1}^{n} g_\beta(d_\alpha(x_i,m))^{\gamma}$ |

There are many ways to transform one metric into another, regardless of whether they are geodesic metrics. A straightforward way is to use a concave function $g$ such that, given a metric $d(\cdot,\cdot)$, the new metric is $d'(\cdot,\cdot) = g(d(\cdot,\cdot))$. This is plausible if we use non-convex $f_{\alpha,\beta,\gamma}$, which are useful, as will be explained, in clustering and classification. Such concave maps are often interpreted as loss functions, but we will consider them in terms of a change of metric, which may lead to selection using geometric concepts. This is particularly true for the construction of $g_\beta$ in this paper.

The basic definition and construction from a geodesic metric space to the special geodesics based on accumulation of density are given in the next section, together with the definition of a CAT(0) space. In Section 3, we first show that means and medians in simple one-dimensional statistics can be placed into our framework. Because geodesics themselves are one-dimensional paths, this should provide some essential motivation. The $d_\alpha$-metric is obtained by a local dilation. Our computational shortcut is to use empirical graphs, whose vertices are data points. We will need, therefore, to define empirical geodesics. We start with a natural geodesic defined via a probability density function in which the distance along a path is the amount of density "accumulated" along that path. Then, an empirical version is defined whenever a density is estimated.
We might have based the paper on kernel density estimates; instead, we have adopted a very geometric construction based on objects such as the Delaunay complex and its skeleton graph.

In Section 4, the $d_\beta$ metric is introduced. It is based on a function derived from a geodesic metric via shrinking, pointwise, to an abstract origin; that is to say, an abstract cone is attached. The smaller the value of $\beta$, the closer to the origin (apex). The geometry of the two-dimensional case drives our understanding of the general case because of the one-dimensional nature of geodesics. We prove that, for finite $\beta$, the embedding cone is CAT(k) with smaller k than the original space. We cover the more general CAT(k) spaces, giving some new results related to "diameter", in Section 5, including conditions for the uniqueness of intrinsic means not requiring the spaces to be CAT(0). Section 6 provides a summary of the effect of changing $\alpha$ and $\beta$. After some discussion on the selection of $\alpha$ and $\beta$ in Section 7, Section 8 covers some examples.

2. Geodesics, intrinsic mean and extrinsic mean

The fundamental object in this paper is a geodesic metric space. This is defined in two stages. First, define a metric space $\mathcal{M} = (X,d)$ with base space $X$ and metric $d(x,x')$. Sometimes, $\mathcal{M}$ will be a Euclidean space $E_d$ of dimension $d$, containing the data points, but it may also be some special object such as a graph or manifold. Second, define the length of a (rectilinear) path between two points $x, x' \in X$ and the geodesic connecting $x$ and $x'$ as the shortest such path. This minimal length defines a metric $d^*(x,x')$, and the space endowed with the geodesic metric is called the geodesic metric space, $\mathcal{M}^* = (X,d^*)$.

The interplay between $\mathcal{M} = (X,d)$ and $\mathcal{M}^* = (X,d^*)$ will be critical for this paper, and, as mentioned, we will have a number of ways of constructing $d^*$. For data points $x_1,\ldots,x_n$ in $X$, the empirical intrinsic (Fréchet) mean is
\[ \mu = \arg\inf_{\mu \in X} \sum_{i=1}^{n} d^*(x_i,\mu)^2. \]

There are occasions when it is useful to represent $\mathcal{M}^*$ as a sub-manifold of a larger space (such as Euclidean space) $\mathcal{M}^+ = (X^+,d^+)$ with its own metric $d^+$. We can then talk about the extrinsic mean:
\[ \mu^+ = \arg\inf_{\mu \in X} \sum_{i=1}^{n} d^+(x_i,\mu)^2. \]
Typically, the extrinsic mean is used as an alternative when the geodesic distance $d^*$ is hard to compute. The difficulty in considering the intrinsic mean in $X^+$ is that it may not lie in the original base space $X$. This leads to a third possibility, which is to project it back to $X$, in some way, as an approximation to the intrinsic mean $\mu$ (which may be hard to compute). We will discuss this again in Section 4.

2.1. CAT(0) spaces. CAT(0) spaces, which correspond to non-positive curvature Riemannian spaces, are important here because their intrinsic means are unique. The CAT(0) property is as follows. Take any three points $\{a,b,c\}$ in a geodesic metric space $X$ and consider the "geodesic triangle" of the points based on the geodesic segments connecting them. Construct a triangle in Euclidean 2-space with vertices $\{a',b',c'\}$, called the comparison triangle, whose Euclidean distances, $\|a'-b'\|$, $\|b'-c'\|$, $\|a'-c'\|$, are the same as the corresponding geodesic distances just described: $d(a,b) = \|a'-b'\|$, etc. On the geodesic triangle, select a point $x$ on the geodesic edge between $b$ and $c$ and find the point $x'$ on the edge $b'c'$ of the Euclidean triangle such that $d(b,x) = \|b'-x'\|$. Then the CAT(0) condition is that for all $a,b,c$ and all choices of $x$:
\[ d(x,a) \le \|x'-a'\|. \]
For a CAT(0) space (i) there is a unique geodesic between any two points, (ii) the space is contractible, in the topological sense, to a point and (iii) the intrinsic mean in terms of the geodesic distance is unique. CAT(k) spaces, a generalization of CAT(0) spaces, are explained in Section 5.

2.2. Geodesic metrics on distributions. Let $X$ be a $d$-dimensional Euclidean random variable absolutely continuous with respect to the Lebesgue measure, with density $f(x)$.
Let $\Gamma = \{z(t), t \in [0,1]\}$ be a parametrised integrable path between two points $x_0 = z(0)$, $x_1 = z(1)$ in $\mathbb{R}^d$, which is rectifiable with respect to the Lebesgue measure. Let
\[ s(t) = \sqrt{\sum_{i=1}^{d} \left( \frac{\partial z_i(t)}{\partial t} \right)^2}, \]
with appropriate modification in the non-differentiable case, be the local element of length along $\Gamma$. The weighted distance along $\Gamma$ is
\[ d_\Gamma(x_0,x_1) = \int_0^1 s(t) f(z(t))\, dt. \tag{1} \]
The geodesic distance is
\[ d(x_0,x_1) = \inf_\Gamma d_\Gamma(x_0,x_1). \]
Here we consider a random variable on Euclidean space, but this can be generalized for Riemannian manifolds and even for singular spaces with a density with respect to a base measure naturally defined by the metric.

From the geodesic distances on distributions we shall follow three main directions:
(1) transform the geodesic metrics in various ways with parameters $\alpha$, $\beta$ to obtain a wide class of metrics,
(2) discover (locally) CAT(0) and CAT(k) spaces for certain ranges of the parameters,
(3) apply empirical versions of the metrics based on an empirical graph whose nodes are the data points.

There is an important distinction between global transformations applied to the whole distance between points and local transformations applied to dilate the distance element.

3. The $d_\alpha$ metric and minimal spanning trees

The general $d_\alpha$ metric is a dilation of the original distance $d$ and is what we have referred to as a local metric. It is obtained by transforming the density in (1). Thus for $\Gamma = \{z(t), t \in [0,1]\}$ between $x_0 = z(0)$ and $x_1 = z(1)$,
\[ d_{\Gamma,\alpha}(x_0,x_1) = \int_0^1 s(t) f^{\alpha}(z(t))\, dt \]
and
\[ d_\alpha(x_0,x_1) = \inf_\Gamma d_{\Gamma,\alpha}(x_0,x_1). \]
Changing $\alpha$ essentially changes the local curvature. Roughly speaking, when $\alpha$ is more negative (positive), the curvature is more negative (positive).

In the next subsection, we look at the one-dimensional case. Although this case is elementary, good intuition is obtained by rewriting the standard version in terms of a geodesic metric.

3.1. One-dimensional means and medians.
Assume that $X$ is a continuous univariate random variable with probability density function $f(x)$ and cumulative distribution function (CDF) $F(x)$. The mean $\mu = E(X)$ achieves $\min_m E\{(X-m)^2\}$. Here we are using the Euclidean distance: $d_E(x,y) = |x-y|$.

The median is defined by $\nu = F^{-1}(1/2)$. On a geometric basis, we can say that $\nu$ achieves $\min_m E\{d_D(m,X)^2\}$, where we use a metric that measures the amount of probability between $x$ and $z$:
\[ d_D(x,z) = |F(x) - F(z)|. \tag{2} \]
Carrying out the calculations:
\[ E_X\{d_D(m,X)^2\} = \int_{-\infty}^{\infty} (F(m) - F(x))^2 f_X(x)\, dx = \frac{1}{3} - F(m)(1 - F(m)), \]
which achieves a minimum of $\frac{1}{12}$ at $F(m) = \frac{1}{2}$, as expected.

Now let us consider the sample version. Let $x_{(1)},\ldots,x_{(n)}$ be the order statistics, which we assume are distinct. One of the first exercises in statistics is to show that $\hat{\mu} = \frac{1}{n}\sum_i x_i = \bar{x}$ minimises $\sum_i (x_i - m)^2$ with respect to $m$.

For the median, first consider using the first of the two approaches with the empirical CDF $\hat{F}(x)$. We obtain various definitions depending on our definition of $\hat{F}$ and $\hat{F}^{-1}$, or just using convention. Using the metric approach, the natural metric is to take
\[ \tilde{d}_1(x,z) = |\hat{F}(x) - \hat{F}(z)|, \]
with the standard definition of $\hat{F}$. Applied to distinct data points $x_{(i)}, x_{(j)}$ this is equal to $|i-j|/n$. For an arbitrary $m$,
\[ \tilde{d}_1(x_{(i)},m) = |\hat{F}(x_{(i)}) - \hat{F}(m)| = \frac{1}{n} |i - i(m)|, \]
where $i(m) = \max\{i : x_{(i)} \le m\}$. Then, $\min_m \sum_{i=1}^{n} (i - i(m))^2$ is achieved at $x_{((n+1)/2)}$ when $n$ is odd and at $x_{((n+2)/2)}$ when $n$ is even.

Another approach for the median would be to take a piecewise linear approximation to $\hat{F}$, which is equivalent to having a density $\hat{f}$ that is proportional to $\frac{1}{x_{(i+1)} - x_{(i)}}$ in the interval $[x_{(i)}, x_{(i+1)})$. Then, the metric is
\[ \tilde{d}_2(x,z) = \int_{\min(x,z)}^{\max(x,z)} \hat{f}(y)\, dy, \]
and $\min_m \sum_{i=1}^{n} \tilde{d}_2(x_i,m)^2$ is achieved at $x_{((n+1)/2)}$ when $n$ is odd and at $\frac{1}{2}\left(x_{(n/2)} + x_{((n+2)/2)}\right)$ when $n$ is even. We can think of this last result in another way.
Consider the points $y_i = \hat{F}(x_{(i)}) = \frac{i}{n}$ as points in $[0,1]$, take the empirical mean of the points and transform it back with the CDF corresponding to $\hat{f}$, namely, the piecewise linear approximation. The idea of weighting intervals should provide intuition when we extend the intervals to edges on a graph, because edges are one-dimensional.

3.2. The $d_\alpha$ metric for graphs. There are a number of options to produce an empirical version of the $d_\alpha$ metric, based on the data. One such option would be to produce a smooth empirical density $f(t)$ followed by numerical integration and optimization to compute the geodesics. We prefer a much simpler method based on a graph constructed from the data. All geodesic computation is then restricted to the graph. We list some candidates (see appendix A.1 for a description of the second and third candidates listed below):
(1) the complete graph with vertices at the data points and all edges,
(2) the edge graph (1-skeleton) of the Delaunay simplicial complex with vertices at the data points,
(3) the Gabriel graph with vertices at the data points.

The discussion below applies to the complete graph or any connected sub-graph. For any such graph, define a version of the $d_\alpha$ distance just for edges,
\[ \tilde{d}_{\alpha,ij} = d_{ij}^{1-\alpha}, \]
where $d_{ij}$ is the Euclidean distance from $x_i$ to $x_j$. This can be explained by making a transformation
\[ ds \to \frac{ds}{d_{ij}}. \]
We refer to this as edge regularization. We then apply $\alpha$ in the usual way to obtain
\[ \frac{ds}{d_{ij}^{\alpha}}. \]
The new "length" of each edge $e_{ij}$ is obtained by integrating this "density" along the edge. In this sense, $d_{ij}$ also plays the role of density estimation. Although we need a regularization $d_{ij}^{-1/p}$ with respect to the dimension $p$ for density estimation [22], we manage the regularization by rescaling the parameter $\alpha$. Note that $\alpha = 1$ gives the unit length and $\alpha = 0$ restores the original length.
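The edge regularization above amounts to raising each Euclidean edge length to the power $1-\alpha$. The following is a minimal sketch for the complete graph, assuming distinct data points stored as rows of a NumPy array; the function name `regularised_edge_lengths` is illustrative, not from the paper.

```python
import numpy as np

def regularised_edge_lengths(points, alpha):
    """Matrix of edge lengths d_ij ** (1 - alpha) on the complete graph.

    Assumes the rows of `points` are distinct; the diagonal is left at zero.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances d_ij.
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    lengths = np.zeros_like(d)
    off_diagonal = ~np.eye(len(points), dtype=bool)
    lengths[off_diagonal] = d[off_diagonal] ** (1.0 - alpha)
    return lengths

# A 3-4-5 right triangle as a small illustrative data set.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
original = regularised_edge_lengths(points, alpha=0.0)   # Euclidean lengths
unit = regularised_edge_lengths(points, alpha=1.0)       # every edge has length 1
dilated = regularised_edge_lengths(points, alpha=-1.0)   # squared lengths
```

With $\alpha = 0$ the hypotenuse keeps length 5, with $\alpha = 1$ every edge has length 1, and with $\alpha = -1$ long edges are penalised relative to short ones.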
Now we consider only the set of edges $E$ of the graph $G(V,E)$ as a metric space with the metric defined by the geodesic:
\[ \tilde{d}_\alpha(x_0,x_1) = \inf_\Gamma \sum_{(i,j) \in \Gamma} \tilde{d}_{\alpha,ij}, \]
where the infimum is taken over all (connected) paths $\Gamma$ between $x_0$ and $x_1$. Note that even if $d_{ij}^{-1/p}$ can estimate the local density well, it does not follow that the metric $d_\alpha$ can be approximated by the metric $\tilde{d}_\alpha$, since the edge lengths $d_{ij}$ on each path $\Gamma$ are not independent. It is suggested that further theoretical work is necessary. Here we will admit $\tilde{d}_\alpha$ as an approximation of $d_\alpha$.

If the graph is not a complete Euclidean graph with weights equal to the Euclidean lengths of the edges, some edges may not be in any edge geodesics between any pair of vertices.

Definition 3.1. For an edge-weighted graph $G$ with weights $\{d_{ij}\}$ on the graph, $G^*$, which is the union of all the edge geodesics between all pairs of vertices, is called the geodesic sub-graph (or geodesic graph) of $G$.

We will see how the geodesic sub-graphs transform as the value of $\alpha$ changes. We make an important general position assumption that the values $\{d_{ij} \mid (i,j) \in E\}$ are distinct, that is, there are no ties. This is an additional general position assumption to that given for the Delaunay cells (see appendix A.1). We order the values using only a single suffix for simplicity: $d_1 < d_2 < \cdots < d_M$, where $M = |E|$. For $\alpha < 1$, this induces the $\tilde{d}_{\alpha,i}$ $(= d_i^{1-\alpha})$ values:
\[ \tilde{d}_{\alpha,1} < \tilde{d}_{\alpha,2} < \cdots < \tilde{d}_{\alpha,M}. \]

Now, consider the geodesics as $\alpha \to -\infty$. Recall that a circuit in a graph is a connected path that begins and ends in some vertex, and an elementary circuit is a circuit that visits a vertex no more than once. Consider an edge $(i,j) \in E$ that has the following property, which we call Q: it is in an elementary circuit $C$ of the graph in which all other edges have smaller values of $d_{ij}$, namely
\[ d_{rs} < d_{ij} \quad \text{for } (r,s) \in C,\ (r,s) \ne (i,j). \]
Then, the path $\Gamma(i,j)$ (within the circuit) from $x_i$ to $x_j$ not containing the edge $(i,j)$ has length smaller than $\tilde{d}_{\alpha,ij}$ when $\alpha$ is sufficiently negative:
\[ \sum_{(r,s) \in \Gamma(i,j)} d_{rs}^{1-\alpha} < d_{ij}^{1-\alpha}. \]
From this argument, we see that for sufficiently large $|\alpha|$ as $\alpha$ approaches $-\infty$, every edge having property Q is removed from the geodesic sub-graph, and we obtain a tree.

Let us summarize this algorithm, which applies to a general edge-weighted graph with distinct edge weights. We refer to this algorithm as the backwards algorithm. It clearly gives a tree.
(1) Let $|E| = M$ and label the edges $e_1,\ldots,e_M$ in increasing order of their weights.
(2) Starting with edge $e_M$, remove $e_M$ if it is in a cycle; otherwise continue to $e_{M-1}$.
(3) (General step) Continue downwards, at each stage removing an edge if it is in a cycle of the remaining subgraph.
(4) Stop if no more edges can be removed using step 3.

There is a natural forwards algorithm that also yields a tree, as follows.
(1) Let $|E| = M$ and label the edges $e_1,\ldots,e_M$ in increasing order of their weights.
(2) Starting with $e_1$, add an edge if adding it does not create a cycle.
(3) (General step) Continue adding an edge at each step provided that the addition does not create a cycle.
(4) Stop if no more edges can be added.

We have the following.

Lemma 3.2. Given a connected edge-weighted graph $G(V,E)$, the backward and forward algorithms yield the same tree, which we call $T^*(G)$.

Proof. Let $T_1$ and $T_2$ be the trees generated by the backward and forward algorithms respectively. Any edge of $G$ not in $T_1$ cannot be in $T_2$, because it is in a circuit of edges with lower weights than itself, by the forward construction. This is sufficient because each tree has the same number of edges. In fact, the tree can be constructed by the simple rule: remove all edges that are in a circuit with "smaller" edges; the order in which the edges are removed is irrelevant. □
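The backward and forward algorithms above can be sketched as follows. This is a minimal illustration, assuming edges are given as `(weight, i, j)` tuples with distinct weights on a connected graph with vertices `0, ..., n-1`. The forward version uses a union-find structure and the backward version a depth-first search; both are implementation choices made here, not prescribed by the paper.

```python
def forward_tree(n, edges):
    """Forward algorithm: add edges by increasing weight unless a cycle forms."""
    parent = list(range(n))

    def find(v):
        # Root of the component containing v, with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = set()
    for w, i, j in sorted(edges):
        ri, rj = find(i), find(j)
        if ri != rj:  # endpoints in different components: no cycle is created
            parent[ri] = rj
            tree.add((w, i, j))
    return tree


def path_exists(edges, src, dst):
    """Depth-first search: is dst reachable from src using the given edges?"""
    adjacency = {}
    for _, i, j in edges:
        adjacency.setdefault(i, []).append(j)
        adjacency.setdefault(j, []).append(i)
    seen, stack = {src}, [src]
    while stack:
        v = stack.pop()
        if v == dst:
            return True
        for u in adjacency.get(v, []):
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return False


def backward_tree(edges):
    """Backward algorithm: remove edges by decreasing weight if they lie in a cycle."""
    remaining = set(edges)
    for edge in sorted(edges, reverse=True):
        _, i, j = edge
        # The edge lies in a cycle iff its endpoints stay connected without it.
        if path_exists(remaining - {edge}, i, j):
            remaining.remove(edge)
    return remaining


# A small hypothetical graph on 5 vertices with distinct weights.
edges = [(1.0, 0, 1), (2.0, 1, 2), (3.0, 0, 2), (4.0, 2, 3),
         (5.0, 1, 3), (6.0, 3, 4), (7.0, 2, 4)]
t_forward = forward_tree(5, edges)
t_backward = backward_tree(edges)
```

On this example both procedures keep the edges with weights 1, 2, 4 and 6, in agreement with Lemma 3.2.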
Although there are several implementations of forward and backward minimal spanning tree algorithms, their importance here is to give intuition for the construction of geodesics as $\alpha \to -\infty$.

We now show that the tree $T^*(G)$ is the minimal spanning tree in a strong sense.

Theorem 3.3. Let $G(V,E)$ be a graph with $|V| = n$ and distinct edge weights $\{d_{ij}, (i,j) \in E\}$. There is a unique spanning tree $T^*(G)$ whose ordered weights $d^*_1 < d^*_2 < \cdots < d^*_{n-1}$ have the following minimal property. If $d'_1 < d'_2 < \cdots < d'_{n-1}$ are the ordered weights of any other spanning tree, then
\[ d^*_i \le d'_i, \quad i = 1,\ldots,n-1, \]
with strict inequality for at least one $i = 1,\ldots,n-1$. Moreover, $T^*(G)$ is given by the forward (or backward) algorithm.

Proof. The proof is obtained by contradiction. Let $T^*$ be the tree constructed by the forward algorithm. Let $T'$ be another spanning tree with ordered weights $d'_1 < d'_2 < \cdots < d'_{n-1}$ and suppose that for some $j$ with $1 \le j \le n-1$, we have $d^*_j > d'_j$. Let the edges of $T^*$ and $T'$ be $e^*_{(1)} < \cdots < e^*_{(n-1)}$ and $e'_{(1)} < \cdots < e'_{(n-1)}$, respectively. Since $d^*_{(j)} > d'_{(j)}$, we must have $e^*_{(j)} > e'_{(j)}$. In the sequence $e_1,\ldots,e_M$, let $e^*_{(j)} = e_r$ and $e'_{(j)} = e_s$; then, we must have $s < r$. By the nature of the forward algorithm, new vertices are used only when an edge is included by the algorithm: all unused edges form circuits and therefore use edges of the tree constructed up to that point. Thus, the subgraph of $G(V,E)$ with edges $e_1,\ldots,e_r$ has exactly $j+1$ vertices; this value is attained on the addition of the last edge $e_r$. However, $T'$, which is a subgraph of $G(V,E)$, attains $j+1$ vertices at $e_s$ with $s < r$, which is a contradiction. □

The ordering property in Theorem 3.3 is known from the theory of stochastic ordering: if the empirical CDFs of the distances $\{d^*_i\}$ and $\{d'_i\}$ are $G^*$ and $G'$, respectively, then $G^* \ge G'$ with strict inequality for at least one value (in fact, over at least one non-empty interval).
Thus, not only $\sum_i d^*_i < \sum_i d'_i$ but also $\sum_i g(d^*_i) < \sum_i g(d'_i)$ for any strictly increasing function $g$.

For sufficiently negative $\alpha$, the tree $T$ itself, that is, the tree as a metric space with metric $d_\alpha$, is a CAT(0) space. We need to extend the metric somewhat so that it applies to the edges, in addition to the nodes. Thus, for any two points $x, x'$ on the tree, define
\[ d_\alpha(x,x') = \inf_{\Gamma(x,x')} \int_{\Gamma(x,x')} w(s)\, ds, \]
where the integral is taken along the (unique) path $\Gamma(x,x')$ on the tree and $w(s) = \frac{1}{d_{ij}^{\alpha}}$ when the line element $ds$ is in edge $e_{ij}$ in $\Gamma(x,x')$.

Theorem 3.4. There is an $\alpha^*$ such that for any $\alpha \le \alpha^*$, the geodesic sub-graph becomes the minimal spanning tree $T^*(G)$ endowed with the $d_\alpha$ metric and, therefore, becomes a CAT(0) space.

We see that for sufficiently negative $\alpha$, every geodesic defined with the $d_\alpha$ metric lies in the tree $T^*$. In fact, although we started with a general connected graph, any graph for which the edges can be mapped into a Euclidean interval gives a CAT(0) tree using this construction.

There are some well-known algorithms and considerable literature on minimal spanning trees. Remarkably, the minimal spanning tree for a complete Euclidean graph is the same as the minimal spanning tree for the Delaunay graph and is therefore a subgraph of the Delaunay graph.

Theorem 3.5. Let $G$ be a complete edge-weighted graph whose weights are distinct and monotonically related to those of the complete Euclidean graph $G_E$ of a point set $X$ in $d$ dimensions. Then, $G$, $G_E$, the Gabriel graph and the Delaunay graph of the point set $X$ all have the same and unique minimal spanning tree.

Versions of this result are known in two dimensions [2], but the authors had some difficulty in finding a concise proof in the literature. We present a proof in appendix A.3.

Thus, any algorithm for finding the minimal spanning tree of a graph will give the minimal spanning tree of the Delaunay graph, without having to find the full Delaunay complex for any Euclidean space into which the points can be embedded.
Further, as seen previously, the minimal spanning tree has the strong minimal property in the sense of Theorem 3.3.

3.3. The double α-chain. We may study the geometry as $\alpha$ increases away from $-\infty$. Following Theorem 3.5, we are interested in two cases: the Euclidean Delaunay graph and the complete Euclidean graph. In both cases, we consider the $d_\alpha$ metric.

Theorem 3.6. Let $G_\alpha$ be an edge-weighted graph with distinct weights $\{d_{ij}^{1-\alpha}\}$ and let $G^*_\alpha$ be its geodesic subgraph; then,
\[ |1-\alpha'| > |1-\alpha| \Rightarrow G^*_{\alpha'} \subseteq G^*_\alpha. \]

Proof. This follows from the consideration of geodesics. An edge $(i,j)$ in $G$ is not in $G^*_\alpha$ if it is not a geodesic. In this case, there is an alternative path $\Gamma$ from $i$ to $j$ such that $d_{ij}^{1-\alpha} > \sum_{(r,s) \in \Gamma} d_{rs}^{1-\alpha}$. However, this inequality is preserved if $\alpha$ is decreased, so that $1-\alpha$ is increased. Thus an edge absent from $G^*_\alpha$ is absent from $G^*_{\alpha'}$. □
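The nesting in Theorem 3.6 can be illustrated numerically on a complete Euclidean graph. The sketch below is written under several assumptions: it covers only the range $\alpha < 1$, it uses a seeded random point set in the unit square, and it identifies the edges of the geodesic sub-graph as those whose regularised length equals the shortest-path distance between their endpoints, computed with the Floyd-Warshall recursion. The function name `geodesic_subgraph` is hypothetical.

```python
import numpy as np

def geodesic_subgraph(points, alpha):
    """Edges (i, j), i < j, of the complete graph that are themselves geodesics
    when every edge has length d_ij ** (1 - alpha)."""
    n = len(points)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    w = np.zeros((n, n))
    off_diagonal = ~np.eye(n, dtype=bool)
    w[off_diagonal] = d[off_diagonal] ** (1.0 - alpha)

    # Floyd-Warshall: allow vertex k as an intermediate stop, one k at a time.
    dist = w.copy()
    for k in range(n):
        dist = np.minimum(dist, dist[:, [k]] + dist[[k], :])

    # An edge is kept iff no alternative path is strictly shorter.
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if dist[i, j] == w[i, j]}

rng = np.random.default_rng(0)
points = rng.random((8, 2))          # 8 points, assumed in general position
alphas = [0.0, -1.0, -3.0, -10.0]    # decreasing alpha, so 1 - alpha increases
subgraphs = [geodesic_subgraph(points, a) for a in alphas]
edge_counts = [len(g) for g in subgraphs]
```

Each sub-graph in `subgraphs` should be contained in the one before it, and `edge_counts` should be non-increasing. How negative $\alpha$ must be before only the $n-1$ edges of the minimal spanning tree remain depends on the point set.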
