
Alex Slivkins PDF

133 Pages·2009·1 MB·English


EMBEDDING, DISTANCE ESTIMATION AND OBJECT LOCATION IN NETWORKS

A Dissertation Presented to the Faculty of the Graduate School of Cornell University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

by Aleksandrs Slivkins
August 2006

© 2006 Aleksandrs Slivkins
ALL RIGHTS RESERVED

EMBEDDING, DISTANCE ESTIMATION AND OBJECT LOCATION IN NETWORKS
Aleksandrs Slivkins, Ph.D.
Cornell University 2006

Concurrent with numerous theoretical results on metric embeddings, a growing body of research in the networking community has studied the distance matrix defined by node-to-node latencies in the Internet, resulting in a number of recent approaches that approximately embed this distance matrix into low-dimensional Euclidean space. A fundamental distinction between the theoretical approaches to embeddings and this recent Internet-related work is that the latter operates under the additional constraint that it is only feasible to measure a linear number of node pairs, and typically in a highly structured way. Indeed, the most common framework here is a beacon-based approach: one randomly chooses a small number of nodes ('beacons') in the network, and each node measures its distance to these beacons only. Moreover, beacon-based algorithms are also designed for the more basic problem of triangulation, in which one uses the triangle inequality to infer the distances that have not been measured.

We give algorithms with provable performance guarantees for triangulation and embedding. We show that in addition to multiplicative error in the distances, performance guarantees for beacon-based algorithms typically must include a notion of "slack": a certain fraction of all distances may be arbitrarily distorted. For arbitrary metrics, we give a beacon-based embedding algorithm that achieves constant distortion on a (1−ε)-fraction of distances; this provides some theoretical justification for the success of the recent networking algorithms, and forms an interesting contrast with lower bounds showing that it is not possible to embed all distances with constant distortion.
For doubling metrics (which have been proposed as a reasonable abstraction of Internet latencies), we show that triangulation with a constant number of beacons can achieve multiplicative error 1+δ on a (1−ε)-fraction of distances, for arbitrarily small constants ε, δ.

We extend these results in a number of directions: embeddings with slack that work for all ε at once; distributed algorithms for triangulation and embedding with low overhead on all participating nodes; distributed triangulation with guarantees for all node pairs; node-labeling problems for graphs and metrics; and a systems project on location-aware node selection in a large-scale distributed network.

Biographical Sketch

Aleksandrs Slivkins was born December 1, 1978 in Riga, Latvia (then part of the Soviet Union). He lived in Riga until he finished high school in August 1996. Then he went to the California Institute of Technology; he graduated in June 2000 with a B.S. in Mathematics. From August 2000 to the present Alex has been a graduate student with the Computer Science department of Cornell University. Following his candidacy exam, he received an M.S. in Computer Science in 2004. He expects to graduate with a Ph.D. in August 2006. After Cornell he is going for a one-year postdoc at Brown University. He has accepted a research staff position at Microsoft Research, Silicon Valley Center, starting from July 2007.

Acknowledgements

First of all, I would like to thank my thesis advisor Jon Kleinberg for his invaluable support and mentorship in both research- and career-related issues. I have benefited tremendously from interacting with my other two thesis committee members, Eva Tardos and Emin Gün Sirer. I additionally thank Jon, Eva and Gün for their advice during the job search process.

Throughout my graduate career I have been fortunate to collaborate with many wonderful researchers. It is my pleasure to acknowledge my faculty coauthors (Jon Kleinberg and Gün Sirer from Cornell, Matthew Andrews from Bell Labs, Shuki Bruck from Caltech, and Anupam Gupta from Carnegie Mellon), and thank them for their guidance and collegiality.
Furthermore, I would like to thank Shuki Bruck for being my undergraduate research mentor.

I thank the graduate students in the Cornell Computer Science department for creating a fruitful academic environment. I will try to avoid listing too many names here, but I would like to mention Elliot Anshelevich, Anirban Dasgupta, Ara Hayrapetyan, Martin Pal, Mark Sandler, Zoya Svitkina, and Tom Wexler, as well as several older students: Aaron Archer, David Kempe, Tim Roughgarden, and Chaitanya Swamy.

Finally, I thank my parents and grandparents for supporting and motivating me throughout my entire life.

Table of Contents

1 Introduction
  1.1 Overview of results
    1.1.1 Extensions
    1.1.2 Related work
  1.2 Definitions and theorems: embeddings
  1.3 Definitions and theorems: distributed algorithms
  1.4 Bibliographic notes
2 Background and preliminaries
  2.1 Expander graphs and Probability
  2.2 Metric embeddings
    2.2.1 Relations between different ℓ_p norms
    2.2.2 Embeddings of finite metrics into ℓ_p spaces
    2.2.3 Embeddings into tree metrics
  2.3 Low dimensionality in metrics
    2.3.1 Growth-constrained metrics
    2.3.2 Doubling metrics
    2.3.3 Decomposable metrics
3 Triangulation and Embedding using Small Sets of Beacons
  3.1 Beacon-based triangulation
  3.2 Beacon-based embeddings
  3.3 Beacon-based approaches: further results
    3.3.1 Black-box GNP-style embedding
    3.3.2 Strong triangulation with a constant number of beacons
    3.3.3 Infinite metrics and arbitrary measures
  3.4 Fully distributed approaches
  3.5 Improved embeddings for growth-constrained metrics
  3.6 Lower bounds on embeddings with slack
    3.6.1 General lower-bounding technique
    3.6.2 Lower bounds for contracting embeddings
4 Gracefully Degrading Distortion for Decomposable Metrics
  4.1 Distance scales and scale bundles
  4.2 The embedding algorithm
  4.3 Analysis
  4.4 Analysis: proof of Lemma 4.2
  4.5 Analysis: tools from Probability
  4.6 Analysis: maps f_ij and g_(i,j,0)
  4.7 A Bourgain-style proof of Lemma 4.2 for doubling metrics
  4.8 An extension to arbitrary metrics
5 Network Triangulation via Rings of Neighbors
  5.1 Framework and results
  5.2 Tools: distributed random walks
  5.3 Randomized Rings of Neighbors: Proof of Theorem 5.3
  5.4 Network Triangulation: Proof of Theorem 5.5
6 Location-aware node selection via Rings of Neighbors
  6.1 Meridian: a framework for location-aware node selection
  6.2 Analysis of scalability
    6.2.1 Formal description of the Meridian framework
    6.2.2 Quality of the Meridian rings
    6.2.3 Nearest neighbors and central leaders
    6.2.4 Extensions: exact nearest neighbors
    6.2.5 Extensions: load-balancing
    6.2.6 Fine-tuned versions of the results
  6.3 Full proof of Theorem 6.6 on central leader election
  6.4 Full proof of Theorem 6.7 on exact nearest neighbors
  6.5 Full proof of Theorem 6.9 on load-balancing
    6.5.1 Setup: Meridian rings and the search algorithm
    6.5.2 Setup: randomization and random variables
    6.5.3 The actual proof
7 Distance Estimation and Object Location via Rings of Neighbors
  7.1 The four problems and relevant background
  7.2 A low-stretch routing scheme for doubling metrics
  7.3 Triangulation and distance labeling schemes
  7.4 Low-stretch routing schemes, revisited
    7.4.1 Routing schemes on metrics
  7.5 Searchable small-world networks
    7.5.1 Full proof of Theorem 7.14
    7.5.2 Comparison with Kleinberg's small worlds
    7.5.3 Comparison with the single-link-per-node model
  7.6 Full proof of Theorem 7.12 on routing schemes
8 Conclusions and further directions
Bibliography

List of Tables

1.1 Lower bounds for embeddings with slack
3.1 Lower bounds for embeddings with slack
7.1 Low-stretch routing schemes for doubling graphs
7.2 Low-stretch routing schemes for doubling metrics
7.3 Theorem 7.12: space requirements

List of Figures

3.1 Triangulation in doubling metrics
3.2 Perfect triangulation for dense point sets
6.1 Meridian: multi-resolution rings
6.2 Meridian: closest node discovery
6.3 Meridian: multi-constraint queries
6.4 Meridian: progress at each hop
6.5 Meridian: trade-off between performance and accuracy
6.6 Meridian: the in-degree ratio
7.1 Interconnections between various results in this chapter
7.2 Proof of Theorem 7.1: A routing scheme on doubling metrics

Chapter 1
Introduction

The past decade has seen many significant and elegant results in the theory of metric embeddings (for recent surveys, see [Ind01, Lin02, Mat02a, IM04]). Embedding techniques have been valuable in the design and analysis of algorithms that operate on an underlying metric; many optimization problems become more tractable when the given metric is embedded into one that is structurally simpler.

Meanwhile, an active line of research in the networking community has studied the distance matrix defined by node-to-node latencies in the Internet [FJJ+01, GSG02, GS95, HFP+02, KSB01, VPSV02], resulting in a number of recent approaches that approximately embed this distance matrix into low-dimensional Euclidean space [DCKM04, NZ02, PCW+03, ST03].¹ However, there is a fundamental distinction between this Internet-related work and the large body of theoretical work on embedding, due to the following intrinsic problem: in any analysis of the distance matrix of the Internet, most distances are not available. Measuring all node-to-node distances is simply too expensive; instead, we have a setting where it is generally feasible to measure the distances among only a linear (or near-linear) number of node pairs, and typically in a highly structured way.
Indeed, the most common framework for Internet measurements of this type is a beacon-based approach: one chooses uniformly at random a constant number of nodes ('beacons') in the network, each node measures its distance to all beacons, and one then has access to only these O(n) measurements for the remainder of the algorithm. (For example, the data can be shared among the beacons, which then perform computations on the data locally.)

This inability to measure most distances is the inherent obstacle that stands in the way of applying algorithms developed from the theory of metric embeddings, which assume (and use) access to the full distance matrix. Thus, to obtain insight at a theoretical level into recent Internet measurement studies, we need to consider problems in the following two genres.

(i) What performance guarantees can be achieved by metric embedding algorithms when only a sparse (beacon-based) subset of the distances can be measured?

(ii) At an even more fundamental level, many Internet measurement algorithms are seeking not to embed but simply to reconstruct the unobserved distances with reasonable accuracy (see e.g. [FJJ+01, GSG02, GS95, KSB01]). Can we give provable guarantees for this type of reconstruction task?

Reconstruction via triangulation. Within this framework, we discuss the reconstruction problem (ii) first, as it is a more basic concern. Motivated by the research of Francis et al. on IDMaps [FJJ+01], and

¹ We speak of Internet latencies as defining a "distance matrix" rather than a metric, since the triangle inequality is not always observed; however, one can view the recent networking research as indicating that severe triangle inequality violations are not widespread enough to prevent the matrix of node-to-node latencies from being usefully modeled using notions from metric spaces.
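
The beacon-based measurement scheme described in this introduction, combined with the use of the triangle inequality to infer unmeasured distances, can be illustrated with a small sketch. This is a toy under stated assumptions, not necessarily the algorithm analyzed in the dissertation: the function names (choose_beacons, measure_to_beacons, triangulate) are hypothetical, the metric is taken to be Euclidean distance between random points in the unit square, and the two bounds are simply the standard consequences of the triangle inequality, |d(u,b) − d(v,b)| ≤ d(u,v) ≤ d(u,b) + d(v,b) for every beacon b.

```python
import math
import random


def choose_beacons(nodes, k, rng):
    """Pick k beacons uniformly at random, without replacement."""
    return rng.sample(nodes, k)


def measure_to_beacons(nodes, beacons, dist):
    """Each node measures its distance to every beacon: n * k measurements."""
    return {u: [dist(u, b) for b in beacons] for u in nodes}


def triangulate(label_u, label_v):
    """Bound d(u, v) using only the two nodes' beacon measurements.

    For each beacon b the triangle inequality gives
    |d(u,b) - d(v,b)| <= d(u,v) <= d(u,b) + d(v,b);
    we keep the tightest lower and upper bound over all beacons.
    """
    lower = max(abs(du - dv) for du, dv in zip(label_u, label_v))
    upper = min(du + dv for du, dv in zip(label_u, label_v))
    return lower, upper


# Toy instance (an assumption for illustration): random points in the plane.
rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(50)]
beacons = choose_beacons(points, 5, rng)
labels = measure_to_beacons(points, beacons, math.dist)

u, v = points[0], points[1]
lower, upper = triangulate(labels[u], labels[v])
print(f"true={math.dist(u, v):.3f} lower={lower:.3f} upper={upper:.3f}")
```

In a true metric the interval [lower, upper] always contains the actual distance; how close the two bounds are depends on the metric and on where the beacons fall relative to the pair, which is the kind of question the guarantees with slack address.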

Description:
I am a Senior Researcher at MSR New York City. My research interests are in algorithms and theoretical computer science, spanning machine