Statistical Science 2010, Vol. 25, No. 3, 275–288
DOI: 10.1214/10-STS335
© Institute of Mathematical Statistics, 2010

Connected Spatial Networks over Random Points and a Route-Length Statistic

David J. Aldous and Julian Shun

Abstract. We review mathematically tractable models for connected networks on random points in the plane, emphasizing the class of proximity graphs which deserves to be better known to applied probabilists and statisticians. We introduce and motivate a particular statistic $R$ measuring shortness of routes in a network. We illustrate, via Monte Carlo in part, the trade-off between normalized network length and $R$ in a one-parameter family of proximity graphs. How close this family comes to the optimal trade-off over all possible networks remains an intriguing open question.

The paper is a write-up of a talk developed by the first author during 2007–2009.

Key words and phrases: Proximity graph, random graph, spatial network, geometric graph.

David J. Aldous is Professor, Department of Statistics, University of California, 367 Evans Hall #3860, Berkeley, California 94720, USA (e-mail: [email protected]; URL: www.stat.berkeley.edu/users/aldous). Julian Shun is Graduate Student, Machine Learning Department, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, Pennsylvania 15213, USA (e-mail: [email protected]).

1. INTRODUCTION

The topic called random networks or complex networks has attracted huge attention over the last 20 years. Much of this work focuses on examples such as social networks or WWW links, in which edges are not closely constrained by two-dimensional geometry. In contrast, in a spatial network not only are vertices and edges situated in two-dimensional space, but also it is actual distances, rather than number of edges, that are of interest. To be concrete, we visualize idealized inter-city road networks, and a feature of interest is the (minimum) route length between two given cities. Because we work only in two dimensions, the word spatial may be misleading, but equally the word planar would be misleading because we do not require networks to be planar graphs (if edges cross, then a junction is created).

Our major purpose is to draw the attention of readers from the applied probability and statistics communities to a particular class of spatial network models. Recall that the most studied network model, the random geometric graph [40] reviewed in Section 2.1, does not permit both connectivity and bounded normalized length in the $n \to \infty$ limit. An attractive alternative is the class of proximity graphs, reviewed in Section 2.3, which in the deterministic case have been studied within computational geometry. These graphs are always connected. Proximity graphs on random points have been studied in only a few papers, but are potentially interesting for many purposes other than the specific “short route lengths” topic of this paper (see Section 6.5). One could also imagine constructions which depend on points having specifically the Poisson point process distribution, and one novel such network, which we name the Hammersley network, is described in Section 2.5.

Visualizing idealized road networks, it is natural to take total network length as the “cost” of a network, but what is the corresponding “benefit”? Primarily we are interested in having short route lengths. Choosing an appropriate statistic to measure the latter turns out to be rather subtle, and the (only) technical innovation of this paper is the introduction (Section 3.2) and motivation of a specific statistic $R$ for measuring the effectiveness of a network in providing short routes.

In the theory of spatial networks over random points, it is a challenge to quantify the trade-off between network length [precisely, the normalized length $L$ defined at (2)] and route length efficiency statistics such as $R$. Our particular statistic $R$ is not amenable to explicit calculation even in comparatively tractable models, but in Section 4 we present the results from Monte Carlo simulations. In particular, Figure 7 shows the trade-off for the particular β-skeleton family of proximity graphs.

Given a normalized network length $L$, for any realization of cities there is some network of normalized length $L$ which minimizes $R$. As indicated in Section 5, by general abstract mathematical arguments, there must exist a deterministic function $R_{\mathrm{opt}}(L)$ giving (in the “number of cities $\to \infty$” limit under the random model) the minimum value of $R$ over all possible networks of normalized length $L$. An intriguing open question is as follows:

    how close are the values $R_{\beta\text{-skel}}(L)$ from the β-skeleton proximity graphs to the optimum values $R_{\mathrm{opt}}(L)$?

As discussed in Section 5.3, at first sight it looks easy to design heuristic algorithms for networks which should improve over the β-skeletons, for example, by introducing Steiner points, but in practice we have not succeeded in doing so.

This paper focuses on the random model for city positions because it seems the natural setting for theoretical study. As a complement, in [10] we give empirical data for the values of $(L, R)$ for certain real-world networks (on the 20 largest cities, in each of 10 US States). In [8] we give analytic results and bounds on the trade-off between $L$ and the mathematically more tractable stretch statistic $R_{\max}$ at (4), in both worst-case and random-case settings for city positions. Let us also point out a (perhaps) nonobvious insight discussed in Section 3.3: in designing networks to be efficient in the sense of providing short routes, the main difficulty is providing short routes between city-pairs at a specific distance (2–3 standardized units) apart, rather than between pairs at a large distance apart.

Finally, recall this is a nontechnical account. Our purpose is to elaborate verbally the ideas outlined above; some technical aspects will be pursued elsewhere.

2. MODELS FOR CONNECTED SPATIAL NETWORKS

There are several conceptually different ways of defining networks on random points in the plane. To be concrete, we call the points cities; to be consistent about language, we regard $x_i$ as the position of city $i$ and represent network edges as line segments $(x_i, x_j)$.

First (Sections 2.1–2.3) are schemes which use deterministic rules to define edges for an arbitrary deterministic configuration of cities; then one just applies these rules to a random configuration. Second, one can have random rules for edges in a deterministic configuration (e.g., the probability of an edge between cities $i$ and $j$ is a function of Euclidean distance $d(x_i, x_j)$, as in popular small worlds models [39]), and again apply to a random configuration. Third, and more subtly, one can have constructions that depend on the randomness model for city positions; Section 2.5 provides a novel example.

We work throughout with reference to Euclidean distance $d(x, y)$ on the plane, even though many models could be defined with reference to other metrics (or even when the triangle inequality does not hold, for the MST).

2.1 The Geometric Graph

In Sections 2.1–2.3 we have an arbitrary configuration $\mathbf{x} = \{x_i\}$ of city positions, and a deterministic rule for defining the edge-set $\mathcal{E}$. Usually in graph theory one imagines a finite configuration, but note that everything makes sense for locally finite configurations too. Where helpful, we assume “general position,” so that intercity distances $d(x_i, x_j)$ are all distinct.

For the geometric graph one fixes $0 < c < \infty$ and defines

    $(x_i, x_j) \in \mathcal{E}$ iff $d(x_i, x_j) \le c$.

For the K-neighbor graph one fixes $K \ge 1$ and defines

    $(x_i, x_j) \in \mathcal{E}$ iff $x_i$ is one of the $K$ closest neighbors of $x_j$, or $x_j$ is one of the $K$ closest neighbors of $x_i$.

A moment's thought shows these graphs are in general not connected, so we turn to models which are “by construction” connected. We remark that the connectivity threshold $c_n$ in the finite $n$-vertex model of the random geometric graph has been studied in detail; see Chapter 13 of [40].

2.2 A Nested Sequence of Connected Graphs

The material here and in the next section was developed in graph theory with a view toward algorithmic applications in computational geometry and pattern recognition. The 1992 survey [28] gives the history of the subject and 116 citations. But everything we need is immediate from the (careful choice of) definitions. On our arbitrary configuration $\mathbf{x}$ we can define four graphs whose edge-sets are nested as follows:

$$\text{MST} \subseteq \text{relative n'hood} \subseteq \text{Gabriel} \subseteq \text{Delaunay}. \tag{1}$$

Here are the definitions (for MST and Delaunay, it is easy to check these are equivalent to more familiar definitions). In each case, we write the criterion for an edge $(x_i, x_j)$ to be present:

• Minimum spanning tree (MST) [24]. There does not exist a sequence $i = k_0, k_1, \ldots, k_m = j$ of cities such that
  $$\max\bigl(d(x_{k_0}, x_{k_1}), d(x_{k_1}, x_{k_2}), \ldots, d(x_{k_{m-1}}, x_{k_m})\bigr) < d(x_i, x_j).$$
• Relative neighborhood graph. There does not exist a city $k$ such that
  $$\max\bigl(d(x_i, x_k), d(x_k, x_j)\bigr) < d(x_i, x_j).$$
• Gabriel graph. There does not exist a city inside the disc whose diameter is the line segment from $x_i$ to $x_j$.
• Delaunay triangulation [23]. There exists some disc, with $x_i$ and $x_j$ on its boundary, so that no city is inside the disc.

The inclusions (1) are immediate from these definitions. Because the MST (for a finite configuration) is connected, all these graphs are connected.

Figure 1 illustrates the relative neighborhood and Gabriel graphs. Figures for the MST and the Delaunay triangulation can be found online at http://www.spss.com/research/wilkinson/Applets/edges.html.

FIG. 1. The relative neighborhood graph (left) and Gabriel graph (right) on different realizations of 500 random points.

Constructions such as the relative neighborhood and Gabriel graphs have become known loosely as proximity graphs in [28] and subsequent literature, and we next take the opportunity to turn an implicit definition in the literature into an explicit definition.

2.3 Proximity Graphs

Write $v_-$ and $v_+$ for the points $(-\tfrac12, 0)$ and $(\tfrac12, 0)$. The lune is the intersection of the open discs of radii 1 centered at $v_-$ and $v_+$. So $v_-$ and $v_+$ are not in the lune but are on its boundary. Define a template $A$ to be a subset of $\mathbb{R}^2$ such that:

(i) $A$ is a subset of the lune.
(ii) $A$ contains the open line segment $(v_-, v_+)$.
(iii) $A$ is invariant under the “reflection in the y-axis” map $\mathrm{Reflect}_x(x_1, x_2) = (-x_1, x_2)$ and the “reflection in the x-axis” map $\mathrm{Reflect}_y(x_1, x_2) = (x_1, -x_2)$.
(iv) $A$ is open.

For arbitrary points $x, y$ in $\mathbb{R}^2$, define $A(x, y)$ to be the image of $A$ under the natural transformation (translation, rotation and scaling) that takes $(v_-, v_+)$ to $(x, y)$.

DEFINITION. Given a template $A$ and a locally finite set $\mathcal{V}$ of vertices, the associated proximity graph $G$ has edges defined by, for each $x, y \in \mathcal{V}$,

    $(x, y)$ is an edge of $G$ iff $A(x, y)$ contains no vertex of $\mathcal{V}$.

From the definitions:

• if $A$ is the lune, then $G$ is the relative neighborhood graph;
• if $A$ is the disc centered at the origin with radius 1/2, then $G$ is the Gabriel graph.

But the MST and Delaunay triangulation are not instances of proximity graphs.

Note that replacing $A$ by a subset $A'$ can only introduce extra edges. Because every template is a subset of the lune, every proximity graph contains the relative neighborhood graph, and it follows from (1) that the proximity graph is always connected. The Gabriel graph is planar. But if $A$ is not a superset of the disc centered at the origin with radius 1/2, then $G$ might not be a subgraph of the Delaunay triangulation, and in this case edges may cross, so $G$ is not planar (e.g., if the vertex-set is the four corners of a square, then the diagonals would be edges).

For a given configuration $\mathbf{x}$, there is a collection of proximity graphs indexed by the template $A$, so by choosing a monotone one-parameter family of templates, one gets a monotone one-parameter family of graphs, analogous to the one-parameter family $\mathcal{G}_c$ of geometric graphs. Here is a popular choice [30] in which $\beta = 1$ gives the Gabriel graph and $\beta = 2$ gives the relative neighborhood graph.

DEFINITION (The β-skeleton family). (i) For $0 < \beta < 1$ let $A_\beta$ be the intersection of the two open discs of radius $(2\beta)^{-1}$ passing through $v_-$ and $v_+$.
(ii) For $1 \le \beta \le 2$ let $A_\beta$ be the intersection of the two open discs of radius $\beta/2$ centered at $(\pm(\beta - 1)/2, 0)$.

2.4 Networks Based on Powers of Edge-Lengths

It is not hard to think of other ways to define one-parameter families of networks. Here is one scheme used in, for example, [38]. Fix $1 \le p < \infty$. Given a configuration $\mathbf{x}$, and a route (sequence of vertices) $x_0, x_1, \ldots, x_k$, say, the cost of the route is the sum of $p$th powers of the step lengths. Now say that a pair $(x, y)$ is an edge of the network $\mathcal{G}_p$ if the cheapest route from $x$ to $y$ is the one-step route. As $p$ increases from 1 to $\infty$, these networks decrease from the complete graph to the MST. Moreover, for $p \ge 2$ the network $\mathcal{G}_p$ is a subgraph of the Gabriel graph.

2.5 The Hammersley Network

There is a quite separate recent literature in theoretical probability [26, 27] defining structures such as trees and matchings directly on the infinite Poisson point process. In this spirit, we observe that the Hammersley process studied in [6] can be used to define a new network on the infinite Poisson point process, which we name the Hammersley network. This network is designed to have the feature that each vertex has exactly 4 edges, in directions NE (between North and East), NW, SE and SW. The conceptual difference from the networks in the previous section is that there is not such a simple “local” criterion for whether a potential edge $(x_i, x_j)$ is in the network. And edges cross, creating junctions.

For a picturesque description, imagine one-eyed frogs sitting on an infinitely long, thin log, each being able to see only the part of the log to their left before the next frog. At random times and positions (precisely, as a space–time Poisson point process of rate 1) a fly lands on the log, at which instant the (unique) frog which can see it jumps left to the fly's position and eats it. This defines a continuous time Markov process (the Hammersley process) whose states are the configurations of positions of all the frogs. There is a stationary version of the process in which, at each time, the positions of the frogs form a Poisson (rate 1) point process on the line.

Now consider the space–time trajectories of all the frogs, drawn with time increasing upward on the page. See Figure 2. For each frog, the part of the trajectory between the completions of two successive jumps consists of an upward edge (the frog remains in place as time increases) followed by a leftward edge (the frog jumps left).

FIG. 2. Space–time trajectories in Hammersley's process.

Reinterpreting the time axis as a second space axis, and introducing compass directions, that part of the trajectory becomes a North edge followed by a West edge. Now replace these two edges by a single North-West straight edge. Doing this procedure for each frog and each pair of successive jumps, we obtain a collection of NW paths, that is, a network in which each city (the reinterpreted space–time random points) has an edge to the NW and an edge to the SE. Finally, we repeat the construction with the same realization of the space–time Poisson point process but with frogs jumping rightward instead of leftward. This yields a network on the infinite Poisson point process, which we name the Hammersley network. See Figure 3.

FIG. 3. The Hammersley network on 2500 random points.

REMARKS. (a) To draw the Hammersley network on random points in a finite square, one needs external randomization to give the initial (time 0) frog positions, in fact, two independent randomizations for the leftward and the rightward processes. So to be pedantic, one gets a random network over the given realization of cities. However, one can deduce from the theoretical results in [6] that the external randomization has effect only near the boundary of the square.

(b) The property that each vertex has exactly 4 edges, in directions NE (between North and East), NW, SE and SW, is immediate from the construction. Note, however, that while adjacent NW space–time trajectories in Figure 2 do not cross, the corresponding diagonal roads in the Hammersley network may cross, so it is not a planar graph, though this has only negligible effect on route lengths.

(c) Intuition, confirmed by Figure 7 later, says that the Hammersley network is not very efficient as a road network. It serves to demonstrate that there do exist random networks other than the familiar ones, and provides an instance where imposing deterministic constraints (the four edges, in this case) on a random network makes it much less efficient. How general a phenomenon is this?

2.6 Normalized Length

The notion of normalized network length $L$ is most easily visualized in the setting of an infinite deterministic network which is “regular” in the sense of consisting of a repeated pattern. First choose the unit of length so that cities have an average density of one per unit area. Then define

$$L = \text{average network length per unit area}, \tag{2}$$

$$\bar{\Delta} = \text{average degree (number of incident edges) of cities}. \tag{3}$$

Figure 4 shows the values of $L$ and $\bar{\Delta}$ for some simple “repeated pattern” networks. Though not directly relevant to our study of the random model, we find Figure 4 helpful for two reasons: as intuition for the interpretation of the different numerical values of $L$, and because we can make very loose analogies (Section 6.6) between particular networks on random points and particular deterministic networks.

FIG. 4. Variant square, triangular and hexagonal lattices. Drawn so that the density of cities is the same in each diagram, and ordered by value of $L$.

3. NORMALIZED LENGTH AND ROUTE-LENGTH EFFICIENCY

3.1 The Random Model

For the remainder of the paper we work with “the random model” for city positions. The finite model assumes $n$ random vertices (cities) distributed independently and uniformly in a square of area $n$. The infinite model assumes the Poisson point process of rate 1 (per unit area) in the plane. The quantities $L$, $\bar{\Delta}$ above and $R$ below that we discuss may be interpreted as exact values in the infinite model or as $n \to \infty$ limits in the finite model; see Section 5. We use the word normalized as a reminder of the “density 1” convention: we choose the normalized unit of distance to make cities have average density 1 per unit area. After this normalization, $L$ is the average network length per unit area.

3.2 The Route-Length Efficiency Statistic R

In designing a network, it is natural to regard total length as a “cost”. The corresponding “benefit” is having short routes between cities. Write $\ell(i, j)$ for the route length (length of shortest path) between cities $i$ and $j$ in a given network, and $d(i, j)$ for Euclidean distance between the cities. So $\ell(i, j) \ge d(i, j)$, and we write

$$r(i, j) = \frac{\ell(i, j)}{d(i, j)} - 1$$

so that “$r(i, j) = 0.2$” means that route length is 20% longer than straight line distance. With $n$ cities we get $\binom{n}{2}$ such numbers $r(i, j)$; what is a reasonable way to combine these into a single statistic? Two natural possibilities are as follows:

$$R_{\max} := \max_{j \ne i} r(i, j), \qquad R_{\mathrm{ave}} := \operatorname{ave}_{(i,j)} r(i, j), \tag{4}$$

where $\operatorname{ave}_{(i,j)}$ denotes average over all distinct pairs $(i, j)$. The statistic $R_{\max}$ has been studied in the context of the design of geometric spanner networks [37] where it is called the stretch. However, being an “extremal” statistic, $R_{\max}$ seems unsatisfactory as a descriptor of real world networks; for instance, it seems unreasonable to characterize the UK rail network as inefficient simply because there is no very direct route between Oxford and Cambridge.

The statistic $R_{\mathrm{ave}}$ has a more subtle drawback. Consider a network consisting of:

• the minimum-length connected network (Steiner tree) on given cities;
• and a superimposed sparse collection of randomly oriented lines (a Poisson line process [45]).

See Figure 5. By choosing the density of lines to be sufficiently low, one can make the normalized network length be arbitrarily close to the minimum needed for connectivity. But it is easy to show (see [7] for careful analysis and a stronger result) that one can construct such networks so that $R_{\mathrm{ave}} \to 0$ as $n \to \infty$. Of course no one would build a road network looking like Figure 5 to link cities, because there are many pairs of nearby cities with only very indirect routes between them. The disadvantage of $R_{\mathrm{ave}}$ as a descriptive statistic is that (for large $n$) most city-pairs are far apart, so the fact that a given network has a small value of $R_{\mathrm{ave}}$ says nothing about route lengths between nearby cities.

FIG. 5. Efficient or inefficient? $R_{\mathrm{ave}}$ would judge this network efficient in the $n \to \infty$ limit.

We propose a statistic $R$ which is intermediate between $R_{\mathrm{ave}}$ and $R_{\max}$. First consider (see discussion below for details)

$$\rho(d) := \text{mean value of } r(i, j) \text{ over city-pairs with } d(i, j) = d$$

and then define

$$R := \max_{0 \le d < \infty} \rho(d). \tag{5}$$

In words, $R = 0.2$ means that on every scale of distance, route lengths are on average at most 20% longer than straight line distance.

On an intuitive level, $R$ provides a sensible and interpretable way to compare efficiency of different networks in providing short routes. On a technical level, we see two advantages and one disadvantage of using $R$ instead of $R_{\mathrm{ave}}$.

Advantage 1. Using $R$ to measure efficiency, there is a meaningful $n \to \infty$ limit for the network length/efficiency trade-off [the function $R_{\mathrm{opt}}(L)$ discussed in Section 5], and so, in particular, it makes sense to compare the values of $R$ for networks with different $n$.

Advantage 2. A more realistic model for traffic would posit that volume of traffic between two cities varies as a power-law $d^{-\gamma}$ of distance $d$, so that in calculating $R_{\mathrm{ave}}$ it would be more realistic to weight by $d^{-\gamma}$. This means that the optimal network, when using $R_{\mathrm{ave}}$ as optimality criterion, would depend on $\gamma$. Use of $R$ finesses this issue; the value of $\gamma$ does not affect $R$.

A related issue is that volume of traffic between two cities should depend on their populations. Intuitively, incorporating random population sizes should make the optimal $R$ smaller because the network designer can create shorter routes between larger cities. We see this effect in data [10]; $R$ calculated via population-weighting is typically slightly smaller. But we have not tried theoretical study.

Disadvantage. The statistic $R$ is tailored to the infinite model, in which it makes sense to consider two cities at exactly distance $d$ apart (then the other city positions form a Poisson point process). For finite $n$ we need to discretize. For the empirical data in [10], where $n = 20$, we average over intervals of width 1 unit (recall the unit of distance is taken such that the density of cities is 1 per unit area), that is, for $d = 1, 2, \ldots, 5$, we calculate

$$\tilde{\rho}(d) := \text{mean value of } r(i, j) \text{ over city-pairs with } d - \tfrac12 < d(i, j) < d + \tfrac12, \qquad \tilde{R} := \max_{1 \le d < \infty} \tilde{\rho}(d) \tag{6}$$

and use $\tilde{R}$ as proxy for $R$. For larger $n$ we can use shorter intervals. Thus, there is, in principle, a certain fuzziness to the notion of $R$ for finite networks, and, in particular, it is not clear how to assign a value of $R$ to regular networks such as those in Figure 4. But in practice, for networks we have studied on real-world data and on random points, this is not a problem, as explained next.

3.3 Characteristic Shape of the Function ρ(d)

For the connected networks on random points (excluding the Hammersley network) we are discussing, the function $\rho(d)$ has a characteristic shape (see Figure 6) attaining its maximum between 2 and 3 and slowly decreasing thereafter. We suspect that “this characteristic shape holds for any reasonable model,” but we do not know how to turn that phrase into a precise conjecture. Note that “smoothness near the maximum” implies that any calculated value $\tilde{R}$ at (6) is quite insensitive to the choice of discretization.

FIG. 6. The function $\rho(d)$ for three theoretical networks on random cities. Irregularities are Monte Carlo random variation.

This characteristic shape has a common-sense interpretation. Any efficient network will tend to place roads directly between unusually close city-pairs, implying that $\rho(d)$ should be small for $d < 1$. For large $d$ the presence of multiple alternate routes helps prevent $\rho(d)$ from growing. At distance 2–3 from a typical city $i$ there will be about $\pi 3^2 - \pi 2^2 \approx 16$ other cities $j$. For some of these $j$ there will be cities $k$ near the straight line from $i$ to $j$, so the network designer can create roads from $i$ to $k$ to $j$. The difficulty arises where there is no such intermediate city $k$: including a direct road $(x_i, x_j)$ will increase $L$, but not including it will increase $\rho(d)$ for $2 < d < 3$.

Thus, Figure 6 offers a minor insight into spatial network design: that it is city pairs at normalized distance 2–3 specifically that enforce the constraints on efficient network design.

The characteristic shape (at least, the flatness over $2 \le d \le 5$) is also visible in the real-world data [10].

For the Hammersley network, the graph of $\rho(d)$ is quite different; $\rho(d)$ increases to a maximum of 0.35 around $d = 0.8$ and then decreases more steeply to a value of 0.21 at $d = 5$. This arises from the particular structure (from each city there is one road in each quadrant) resembling the deterministic “diagonal lattice” of Figure 4, in which the route between some nearby pairs will be via two diagonal roads and a junction.

4. LENGTH-EFFICIENCY TRADE-OFF FOR TRACTABLE NETWORKS

Recall that our overall theme is the trade-off between network length and route-length efficiency, and that in this paper we focus on $n \to \infty$ limits in the random model and the particular statistics $L$ and $R$.

The models described in Section 2 are “tractable” in the specific sense that one can find exact analytic formulas for normalized length $L$. Unfortunately $R$ is not amenable to analytic calculation, and we resort to Monte Carlo simulation to obtain values for $R$. Table 1 and Figure 7 show the values of $(L, R)$ in the models. We explain below how the values of $L$ are calculated.

TABLE 1
Statistics of tractable networks on random points

Network                  $L$      $\bar{\Delta}$   $R$
Minimum spanning tree    0.633    2                $\infty$
Relative n'hood          1.02     2.56             0.38
Gabriel                  2        4                0.15
Hammersley               3.25     4                0.35
Delaunay                 3.40     6                0.07

Notes: Integer values are exact. Recall $L$ is normalized length (2), $\bar{\Delta}$ is average degree (3) and $R$ is our route-length statistic (5).

FIG. 7. The normalized network length $L$ and the route-length efficiency statistic $R$ for certain networks on random points. The ◦ show the beta-skeleton family, with RN the relative neighborhood graph and G the Gabriel graph. The remaining markers are special models: the Delaunay triangulation, the network $\mathcal{G}_2$ from Section 2.4 and (♦) the Hammersley network.

Notes on Table 1. (a) Values of $R$ from our simulations with $n = 2500$.

(b) Value of $L$ for MST from Monte Carlo [19]. In principle, one can calculate arbitrarily close bounds [11], but apparently this has never been carried out. Of course, $\bar{\Delta} = 2$ for any tree.

(c) The Gabriel graph and the relative neighborhood graph fit the assumptions of Lemma 1 with $c = \pi/4$ and $c = \frac{2\pi}{3} - \frac{\sqrt{3}}{2}$ (the area of the lune), respectively, and their table entries for $L$ and $\bar{\Delta}$ are obtained from Lemma 1, as are the values for β-skeletons in Figure 7.

(d) For the Hammersley network, every degree equals 4, so $L = 2 \times$ (mean edge-length). It follows from theory [6] that a typical edge, say, NE from $(x, y)$, goes to a city at position $(x + \xi_x, y + \xi_y)$, where $\xi_x$ and $\xi_y$ are independent with Exponential(1) distribution. So mean edge-length equals

$$\int_0^\infty \!\! \int_0^\infty \sqrt{x^2 + y^2}\; e^{-x-y}\, dx\, dy \approx 1.62. \tag{7}$$

(e) For any triangulation, $\bar{\Delta} = 6$ in the infinite model. For the Delaunay triangulation, $L = ES$ where $S$ is the perimeter length of a typical cell, and it is known ([35], page 113) that $ES = \frac{32}{3\pi}$. Note [33] that the Delaunay triangulation is in general not the minimum-length triangulation. Our simulation results in Figure 6 for $\rho(d)$ for the Delaunay triangulation are roughly consistent with a simulation result in [13] saying that $\rho(65) \approx 0.05$.

4.1 A Simple Calculation for Proximity Graphs

Let us give an example of an elementary calculation for proximity graphs over random points.

LEMMA 1. For a proximity graph with template $A$ on the Poisson point process,

$$L = \frac{\pi^{3/2}}{4 c^{3/2}}, \tag{8}$$

$$\bar{\Delta} = \frac{\pi}{c}, \tag{9}$$

where $c = \mathrm{area}(A)$.

PROOF. Take a typical city at position $x_0$. For a city $x$ at distance $s$ the chance that $(x_0, x)$ is an edge equals $\exp(-c s^2)$ and so

$$\text{mean-degree} = \int_0^\infty \exp(-c s^2)\, 2\pi s\, ds, \qquad L = \frac12 \int_0^\infty s \exp(-c s^2)\, 2\pi s\, ds.$$

Evaluating the integrals gives (8) and (9).

One can derive similar integral formulas for other “local” characteristics, for example, mean density of triangles and moments of vertex degree. See [18, 20, 21, 34] for a variety of such generalizations and specializations.

4.2 Other Tractable Networks

We do not know any other ways of defining networks on random points which are both “natural” and are tractable in the sense that one can find exact analytic formulas for $L$. In particular, we know no tractable way of defining networks with deliberate junctions as in Figure 8. Note also that, while it is easy to make ad hoc modifications to the geometric graph to ensure connectivity, these destroy tractability. On the other hand, one can construct “unnatural” networks (see, e.g., [8]) designed to permit calculation of $L$.

FIG. 8. An ad hoc modification of the relative neighborhood graph, introducing junctions.

5. OPTIMAL NETWORKS AND $n \to \infty$ LIMITS

5.1 Tractable Models

As mentioned earlier, the quantities $L$, $\bar{\Delta}$, $R$ we discuss may be interpreted as exact values in the infinite model or as $n \to \infty$ limits in the finite model. To elaborate briefly, in a realization of the finite model ($n$ cities distributed independently and uniformly in a square of area $n$), a network in Table 1 has a normalized length $L_n = n^{-1} \times (\text{network length})$ and an average degree $\bar{\Delta}_n$ which are random variables, but there is convergence (in probability and in expectation)

$$L_n \to L, \qquad \bar{\Delta}_n \to \bar{\Delta} \qquad \text{as } n \to \infty \tag{10}$$

to limit constants definable in terms of the analogous network on the infinite model (rate 1 Poisson point process on the infinite plane). For the proximity graphs or Delaunay triangulation, the network definition applies directly to the infinite model and proof of (10) is straightforward. For the Hammersley network, (10) is implicit in [6], and for the MST detailed arguments can be found in [9, 43].

5.2 Optimal Networks

We now turn to consideration of optimal networks. Given a configuration $\mathbf{x}$ of $n$ cities in the area-$n$ square, and a value of $L$ which is greater than $n^{-1} \times (\text{length of Steiner tree})$, one can define a number

$$R_n(\mathbf{x}, L) = \min \text{ of } \tilde{R} \text{ over all networks on } \mathbf{x} \text{ with normalized length} \le L, \tag{11}$$

where $\tilde{R}$ is the discretized version (6) calculated using intervals of some suitable length $\delta_n$. Applying this to a random configuration $X$ in the finite model gives, for each $L$, a random variable

$$\Lambda_n(L) := R_n(X, L).$$

One intuitively expects convergence to some deterministic limit

$$\Lambda_n(L) \to R_{\mathrm{opt}}(L) \quad \text{say, as } n \to \infty. \tag{12}$$

The analogous result for $R_{\max}$ will be proved carefully in [8], and the same “superadditivity” argument could be used to prove (12). See [43, 44, 47] for general background to such results. The point is that we do not have any explicit description of the optimal [i.e., attaining the minimum in (11)] networks in the finite or infinite models, so it seems very challenging to prove the natural stronger supposition that the finite optimal networks themselves converge (in some appropriate sense) to a unique infinite optimal network for which the value $R = R_{\mathrm{opt}}(L)$ is attained.

5.3 The Curve $R_{\mathrm{opt}}(L)$

Every possible network on the infinite Poisson point process defines a pair $(L, R)$, and the curve $R = R_{\mathrm{opt}}(L)$ can be defined equivalently as the lower boundary of the set of possible values of $(L, R)$. There is no reason to believe that proximity graphs are exactly optimal, and, indeed, Figure 7 shows that the Delaunay triangulation is slightly more efficient than the corresponding β-skeleton. But our attempts to do better by ad hoc constructions (e.g., by introducing degree-3 junctions; see Figure 8 for an example) have been unsuccessful. And, indeed, the fact that the two special models in Figure 7 lie close to the β-skeleton curve lends credence to the idea that this curve is almost optimal. We therefore speculate that the function $R_{\mathrm{opt}}$ looks something like the curve in Figure 9, which we now discuss.

What can we say about $R_{\mathrm{opt}}(L)$? It is a priori nonincreasing. It is known [47] that there exists a Euclidean Steiner tree constant $L_{ST}$ representing the limit normalized Steiner tree length in the random model, and clearly $R_{\mathrm{opt}}(L) = \infty$ for $L < L_{ST}$. The facts

$$R_{\mathrm{opt}}(L) < \infty \text{ for all } L > L_{ST}; \qquad R_{\mathrm{opt}}(L) \to 0 \text{ as } L \to \infty \tag{13}$$

are not trivial to prove rigorously, but follow from the corresponding facts for $R_{\max}$ proved in [8]. But we are unable to prove rigorously that $R_{\mathrm{opt}}(L)$ is strictly decreasing or that it is continuous.
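The β-skeleton definition of Section 2.3 and the statistics of (2), (3) and (6) can be made concrete with a small simulation. The sketch below is not the authors' simulation code: the function names `beta_skeleton_edges` and `network_stats` are hypothetical, the edge test is brute force, only the range $1 \le \beta \le 2$ is covered, and cities are placed uniformly in a finite square of area $n$, so boundary effects are present and the numbers will only roughly resemble Table 1.

```python
import heapq
import math
import random


def beta_skeleton_edges(pts, beta):
    """Edges (i, j), i < j, of the beta-skeleton for 1 <= beta <= 2.

    (x, y) is an edge iff no other city lies in both open discs of radius
    beta*d/2 centred at m -/+ ((beta - 1)/2)(y - x), with m the midpoint and
    d = |y - x|.  beta = 1 is the Gabriel graph, beta = 2 the relative
    neighborhood graph.  Brute force O(n^3): a few hundred points at most.
    """
    if not 1.0 <= beta <= 2.0:
        raise ValueError("this sketch covers 1 <= beta <= 2 only")
    n, edges = len(pts), []
    for i in range(n):
        for j in range(i + 1, n):
            (xi, yi), (xj, yj) = pts[i], pts[j]
            radius = beta * math.dist(pts[i], pts[j]) / 2.0
            mx, my = (xi + xj) / 2.0, (yi + yj) / 2.0
            ox, oy = (beta - 1.0) / 2.0 * (xj - xi), (beta - 1.0) / 2.0 * (yj - yi)
            c1, c2 = (mx - ox, my - oy), (mx + ox, my + oy)
            blocked = any(
                k != i and k != j
                and math.dist(pts[k], c1) < radius
                and math.dist(pts[k], c2) < radius
                for k in range(n)
            )
            if not blocked:
                edges.append((i, j))
    return edges


def _dijkstra(adj, source):
    """Shortest route length from `source` to every city."""
    dist = [math.inf] * len(adj)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist


def network_stats(pts, edges, area, width=1.0):
    """Return (L, mean degree, R_tilde, rho_tilde) as in (2), (3) and (6).

    rho_tilde maps a bin index b to the mean of r(i, j) over pairs whose
    distance lies within width/2 of b*width, for b >= 1.
    """
    n = len(pts)
    adj = [[] for _ in range(n)]
    total = 0.0
    for i, j in edges:
        w = math.dist(pts[i], pts[j])
        adj[i].append((j, w))
        adj[j].append((i, w))
        total += w
    sums, counts = {}, {}
    for i in range(n):
        ell = _dijkstra(adj, i)
        for j in range(i + 1, n):
            d = math.dist(pts[i], pts[j])
            b = int(d / width + 0.5)  # nearest bin centre
            if b >= 1:
                sums[b] = sums.get(b, 0.0) + ell[j] / d - 1.0
                counts[b] = counts.get(b, 0) + 1
    rho = {b: sums[b] / counts[b] for b in sums}
    return total / area, 2.0 * len(edges) / n, max(rho.values()), rho


if __name__ == "__main__":
    rng = random.Random(1)
    n = 100
    side = math.sqrt(n)  # density-1 normalization: area n
    cities = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]
    for beta in (1.0, 2.0):
        L, deg, R, _ = network_stats(cities, beta_skeleton_edges(cities, beta), n)
        print(f"beta={beta}: L={L:.2f}, mean degree={deg:.2f}, R~={R:.2f}")
```

With only 100 cities the far-distance bins contain few pairs and are noisy; restricting the maximum in (6) to small $d$, as done for the empirical data with $d = 1, \ldots, 5$, would be a reasonable variation.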
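The proof of Lemma 1 leaves the two integrals unevaluated. A short worked version is sketched here, writing $\bar{\Delta}$ for the mean degree and using the fact stated in the proof that the region $A(x_0, x)$, of area $c s^2$, is empty of other cities with probability $\exp(-c s^2)$. The substitution $u = c s^2$ and the standard Gaussian moment $\int_0^\infty s^2 e^{-c s^2}\, ds = \sqrt{\pi}/(4 c^{3/2})$ are the only ingredients; the final comment lines check the result against the Gabriel entries of Table 1.

```latex
\begin{align*}
\bar{\Delta}
  &= \int_0^\infty e^{-c s^2}\, 2\pi s \, ds
   = \frac{\pi}{c} \int_0^\infty e^{-u}\, du
   = \frac{\pi}{c}
   \qquad (u = c s^2), \\
L
  &= \frac12 \int_0^\infty s\, e^{-c s^2}\, 2\pi s \, ds
   = \pi \int_0^\infty s^2 e^{-c s^2}\, ds
   = \pi \cdot \frac{\sqrt{\pi}}{4 c^{3/2}}
   = \frac{\pi^{3/2}}{4 c^{3/2}}.
\end{align*}
% Check for the Gabriel graph, c = \pi/4:
%   \bar{\Delta} = \pi / (\pi/4) = 4,
%   L = \pi^{3/2} / (4 (\pi/4)^{3/2}) = \pi^{3/2} \cdot 8 / (4 \pi^{3/2}) = 2.
```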