“thesis” - 2016/5/5 - 11:59 - page i - #1

ROMA TRE UNIVERSITÀ DEGLI STUDI

Roma Tre University
Doctoral School in Computer Science and Automation
Cycle XXVIII

Doctoral dissertation:
Visual Analytics of Network Routing Through Traceroute Data: Models and Techniques

Author: Marco Di Bartolomeo
Advisors: Prof. Giuseppe Di Battista, Prof. Maurizio Patrignani

Spring 2016

Visual Analytics of Network Routing Through Traceroute Data: Models and Techniques

A thesis presented by Marco Di Bartolomeo
in partial fulfillment of the requirements for the degree of
Doctor of Philosophy in Computer Science and Engineering

Roma Tre University
Department of Engineering
Spring 2016

COMMITTEE:
Prof. Giuseppe Di Battista, Dept. of Engineering, Roma Tre University
Prof. Maurizio Patrignani, Dept. of Engineering, Roma Tre University

REVIEWERS:
Prof. Stephen G. Kobourov, Dept. of Computer Science, University of Arizona
Emden R. Gansner, Google Inc.

“A good decision is based on knowledge and not on numbers.”
(Plato. Laches. 4th century B.C.)

Contents

1 Introduction
2 Preliminary Concepts and Definitions
  2.1 Computer Networks
  2.2 Graph Drawing

I Visualizing Routing from Detail to Overview
3 Visual Analysis of Routing Dynamics and Topology
  3.1 Introduction
  3.2 Reference Scenario
  3.3 Related Work
  3.4 Terminology
  3.5 Analysis of Data
  3.6 User Interface
  3.7 Algorithms
  3.8 User Study
  3.9 Conclusions and Future Work
4 Visualization of Network Metrics as Stacked Charts
  4.1 Introduction
  4.2 Related Work
  4.3 Finding a Baseline via Wiggle Optimization
  4.4 Layer Ordering
  4.5 Labeling of Layers
  4.6 Time Complexity of the Algorithms
  4.7 Experiments
  4.8 Discussions
  4.9 Conclusions and Future Work

II Abstract Representation of Routing
5 Automatic Discovery of High-Impact Routing Events
  5.1 Introduction
  5.2 Related Work
  5.3 The Empathy Relationship
  5.4 Seeking Events: Methodology and Algorithm
  5.5 Experimental Results
  5.6 Conclusions and Future Work
6 Visual Analysis of Routing Events
  6.1 Introduction
  6.2 Reference Scenario
  6.3 Related Work
  6.4 RoutingWatch: A Visual Event Analysis Tool
  6.5 Evaluation
  6.6 Conclusions and Future Work

III Interplay Between Routing and Geography
7 Planarity of Georeferenced Graphs
  7.1 Introduction
  7.2 Problem Definition and Instances Classification
  7.3 Polynomial-Time Algorithm
  7.4 Hardness Results
  7.5 Conclusions and Future Work
8 Heuristics for Visualizing Georeferenced Graphs
  8.1 Introduction
  8.2 Visualizing Networked and Geographic Data
  8.3 Related Work
  8.4 The Retina Layout Algorithm
  8.5 Experimental Evaluation
  8.6 Conclusions and Future Work

Appendices
List of Publications
Bibliography

Chapter 1
Introduction

The Internet has become a fundamental part of our life.
Born as a network for scientific purposes, it has grown in size and services to the point of becoming the backbone of many daily human activities. Studying, shopping, and banking are examples of activities in which the use of the Internet is now well established. The extraordinary diffusion of mobile devices (estimated at 2 billion connected units in 2016 [Int16]) is greatly contributing to the use of online services. Multimedia services have an increasing importance in this framework, and in recent years there has been a trend to offer multimedia products over the Internet to the general public. Telephone, music, and movie streaming are examples of this trend, in which names such as Skype, Netflix, YouTube, and Spotify have proved to be prominent players. This phenomenon is advantageous for several stakeholders. Content providers can exploit a robust, world-wide network to distribute their content, easily reaching old and new customers at a fraction of the cost of building and maintaining traditional, dedicated infrastructures. This has a direct impact on customers, who receive more complete services at lower prices. Also, these services are often more interactive than traditional ones, thanks to the digital nature of the Internet, which enables a personalized user experience. Finally, Internet Service Providers (ISPs) are the intermediaries in this context. These organizations have traditionally developed and hosted the physical networks that run the Internet, and today they see new market opportunities in developing high-performance infrastructures for multimedia Internet services.

Internet Service Providers are faced with the challenging task of developing and maintaining networks that increase in size at a dramatic pace but, at the same time, must support multimedia Internet services by providing acceptable performance. In this scenario, metrics are a fundamental tool. Measuring the performance of a network allows for continuous monitoring, supporting the discovery of faults and the tuning of parameters. ISPs have always used some kind of local monitoring in their networks, for example embedded in routers, which are intermediate devices. However, given the size of modern networks, a local alert raised by a router does not necessarily represent the experience of a user, whose packets traverse long paths in the network. The user could perceive a fault as much worse than the alert suggests, because of multiplicative effects along the path. Or the user might not notice the fault at all, being far from it and experiencing only a negligible effect on the connection. Probe systems are a recent attempt to deal with this problem, and are attracting growing interest. Such a system distributes small devices called probes across the Internet, which are always connected and continuously perform standard network measurements towards selected targets. Common measurements include ping, traceroute, and HTTP queries. The results of the measurements are collected in large repositories, which are available for further analysis. The key feature of probes is that they are installed close to real users, often in their houses, hence simulating the actual user experience through the metrics they collect.

Among the measurements available in a probe system, traceroute is a standard networking tool that records the path followed by data in the network, from the source to the target. It also records the round-trip time between the source and each intermediate node. Like other standard measurements, it is supported by default by any IP-based network, such as the Internet. Unlike other measurements, traceroute data contain intrinsic topological information, since a traceroute essentially represents a path in the network. This means that traceroutes can reveal details of the structure of the network, in addition to its performance.
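As a concrete illustration of the topological content of a traceroute, the following minimal Python sketch represents one measurement as a sequence of hops with round-trip times and derives the network links it implies. The names `Hop`, `Traceroute`, and `path_links` are hypothetical and do not come from the thesis or from any probe system's API. The addresses are taken from the reserved documentation ranges, and the handling of unresponsive hops (breaking the chain rather than guessing a link) is an assumption made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Hop:
    """One node reported along the path (hypothetical structure)."""
    address: Optional[str]   # None when the hop did not reply
    rtt_ms: Optional[float]  # round-trip time from the source, if measured


@dataclass
class Traceroute:
    """A single measurement from a source towards a target."""
    source: str
    target: str
    hops: List[Hop]


def path_links(trace: Traceroute) -> List[Tuple[str, str]]:
    """Return the directed links between consecutive responding nodes.

    A link is emitted only when both endpoints are known, so an
    unresponsive hop interrupts the chain instead of producing a
    link between nodes that may not be adjacent.
    """
    nodes = [trace.source] + [hop.address for hop in trace.hops]
    links = []
    for left, right in zip(nodes, nodes[1:]):
        if left is not None and right is not None:
            links.append((left, right))
    return links


# Illustrative measurement: the second hop did not reply.
example = Traceroute(
    source="192.0.2.1",
    target="203.0.113.9",
    hops=[
        Hop("192.0.2.254", 1.2),
        Hop(None, None),
        Hop("198.51.100.7", 18.5),
        Hop("203.0.113.9", 24.1),
    ],
)

for link in path_links(example):
    print(link)
```

In this sketch the performance information (the `rtt_ms` values) and the topological information (the links) come from the same record, which is the property that distinguishes traceroute from measurements such as ping.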
At any instant, a protocol decides the routing of the network, which is a set of rules that establish which path packets follow to go from a given source to a given destination. In this sense, traceroute is a simple yet effective tool for sampling the status of the routing at a given instant. If several traceroute paths are merged, the result is the traceroute graph, or routing graph, which represents an approximation of the network topology and of the routing on that network at a given instant.

However, the information richness of traceroutes makes them difficult to handle and understand. In fact, most existing tools make only partial use of traceroutes, showing single paths and the corresponding round-trip times, without any attempt to process and emphasize the topological information. This underuse can be explained by some challenges, listed below, that are encountered when processing traceroutes produced by a probe system.

Data Size  A large probe system ensures a fine-grained sampling of the Internet, but